Tag Archives: Compute

Improve the performance of Lambda applications with Amazon CodeGuru Profiler

Post Syndicated from Vishaal Thanawala original https://aws.amazon.com/blogs/devops/improve-performance-of-lambda-applications-amazon-codeguru-profiler/

As businesses expand applications to reach more users and devices, they risk higher latencies that could degrade the  customer experience. Slow applications can frustrate or even turn away customers. Developers therefore increasingly face the challenge of ensuring that their code runs as efficiently as possible, since application performance can make or break businesses.

Amazon CodeGuru Profiler helps developers improve their application speed by analyzing its runtime. CodeGuru Profiler analyzes single and multi-threaded applications and then generates visualizations to help developers understand the latency sources. Afterwards, CodeGuru Profiler provides recommendations to help resolve the root cause.

CodeGuru Profiler recently began providing recommendations for applications written in Python. Additionally, the new automated onboarding process for AWS Lambda functions makes it even easier to use CodeGuru Profiler with serverless applications built on AWS Lambda.

This post highlights these new features by explaining how to set up and utilize CodeGuru Profiler on an AWS Lambda function written in Python.

Prerequisites

This post focuses on improving the performance of an application written with AWS Lambda, so it’s important to understand the Lambda functions that work best with CodeGuru Profiler. You will get the most out of CodeGuru Profiler on long-duration Lambda functions (>10 seconds) or frequently invoked shorter generation Lambda functions (~100 milliseconds). Because CodeGuru Profiler requires five minutes of runtime data before the Lambda container is recycled, very short duration Lambda functions with an execution time of 1-10 milliseconds may not provide sufficient data for CodeGuru Profiler to generate meaningful results.

The automated CodeGuru Profiler onboarding process, which automatically creates the profiling group for you, supports Lambda functions running on Java 8 (Amazon Corretto), Java 11, and Python 3.8 runtimes. Additional runtimes, without the automated onboarding process, are supported and can be found in the Java documentation and the Python documentation.

Getting Started

Let’s quickly demonstrate the new Lambda onboarding process and the new Python recommendations. This example assumes you have already created a Lambda function, so we will just walk through the process of turning on CodeGuru Profiler and viewing results. If you don’t already have a Lambda function created, you can create one by following these set up instructions. If you would like to replicate this example, the code we used can be found on GitHub here.

  1. On the AWS Lambda Console page, open your Lambda function. For this example, we’re using a function with a Python 3.8 runtime.

 This image shows the Lambda console page for the Lambda function we are referencing.

2. Navigate to the Configuration tab, go to the Monitoring and operations tools page, and click Edit on the right side of the page.

This image shows the instructions to open the page to turn on profiling

3. Scroll down to “Amazon CodeGuru Profiler” and click the button next to “Code profiling” to turn it on. After enabling Code profiling, click Save.

This image shows how to turn on CodeGuru Profiler

4. Verify that CodeGuru Profiler has been turned on within the Monitoring and operations tools page

This image shows how to validate Code profiling has been turned on

That’s it! You can now navigate to CodeGuru Profiler within the AWS console and begin viewing results.

Viewing your results

CodeGuru Profiler requires 5 minutes of Lambda runtime data to generate results. After your Lambda function provides this runtime data, which may need multiple runs if your lambda has a short runtime, it will display within the “Profiling group” page in the CodeGuru Profiler console. The profiling group will be given a default name (i.e., aws-lambda-<lambda-function-name>), and it will take approximately 15 minutes after CodeGuru Profiler receives the runtime data before it appears on this page.

This image shows where you can see the profiling group created after turning on CodeGuru Profiler on your Lambda function

After the profile appears, customers can view their profiling results by analyzing the flame graphs. Additionally, after approximately 1 hour, customers will receive their first recommendation set (if applicable). For more information on how to reading the CodeGuru Profiler results, see Investigating performance issues with Amazon CodeGuru Profiler.

The below images show the two flame graphs (CPU Utilization and Latency) generated from profiling the Lambda function. Note that the highlighted horizontal bar (also referred to as a frame) in both images corresponds with one of the three frames that generates a recommendation. We’ll dive into more details on the recommendation in the following sections.

CPU Utilization Flame Graph:

This image shows the CPU Utilization flame graph for the Lambda function

Latency Flame Graph:

This image shows the Latency flame graph for the Lambda function

Here are the three recommendations generated from the above Lambda function:

This image shows the 3 recommendations generated by CodeGuru Profiler

Addressing a recommendation

Let’s dive even further into it an example recommendation. The first recommendation above notices that the Lambda function is spending more than the normal amount of runnable time (6.8% vs <1%) creating AWS SDK service clients. It recommends ensuring that the function doesn’t unnecessarily create AWS SDK service clients, which wastes CPU time.

This image shows the detailed recommendation for 'Recreation of AWS SDK service clients'

Based on the suggested resolution step, we made a quick and easy code change, moving the client creation outside of the lambda-handler function. This ensures that we don’t create unnecessary AWS SDK clients. The code change below shows how we would resolve the issue.

...
s3_client = boto3.client('s3')
cw_client = boto3.client('cloudwatch')

def lambda_handler(event, context):
...

Reviewing your savings

After making each of the three changes recommended above by CodeGuru Profiler, look at the new flame graphs to see how the changes impacted the applications profile. You’ll notice below that we no longer see the previously wide frames for boto3 clients, put_metric_data, or the logger in the S3 API call.

CPU Utilization Flame Graph:

This image shows the CPU Utilization flame graph after making the recommended changes

Latency Flame Graph:

This image shows the Latency flame graph after making the recommended changes

Moreover, we can run the Lambda function for one day (1439 invocations) and see the results in Lambda Insights in order to understand our total savings. After every recommendation was addressed, this Lambda function, with a 128 MB memory and 10 second timeout, decreased in CPU time by 10% and dropped in maximum memory usage and network IO leading to a 30% drop in GB-s. Decreasing GB-s leads to 30% lower cost for the Lambda’s duration bill as explained in the AWS Lambda Pricing.

Latency (before and after CodeGuru Profiler):

The graph below displays the duration change while running the Lambda function for 1 day.

This image shows the duration change while running the Lambda function for 1 day.

Cost (before and after CodeGuru Profiler):

The graph below displays the function cost change while running the lambda function for 1 day.

This image shows the function cost change while running the lambda function for 1 day.

Conclusion

This post showed how developers can easily onboard onto CodeGuru Profiler in order to improve the performance of their serverless applications built on AWS Lambda. Get started with CodeGuru Profiler by visiting the CodeGuru console.

All across Amazon, numerous teams have utilized CodeGuru Profiler’s technology to generate performance optimizations for customers. It has also reduced infrastructure costs, saving millions of dollars annually.

Onboard your Python applications onto CodeGuru Profiler by following the instructions on the documentation. If you’re interested in the Python agent, note that it is open-sourced on GitHub. Also for more demo applications using the Python agent, check out the GitHub repository for additional samples.

About the authors

Headshot of Vishaal

Vishaal is a Product Manager at Amazon Web Services. He enjoys getting to know his customers’ pain points and transforming them into innovative product solutions.

 

 

 

 

 

Headshot of Mirela

Mirela is working as a Software Development Engineer at Amazon Web Services as part of the CodeGuru Profiler team in London. Previously, she has worked for Amazon.com’s teams and enjoys working on products that help customers improve their code and infrastructure.

 

 

 

 

Headshot of Parag

Parag is a Software Development Engineer at Amazon Web Services as part of the CodeGuru Profiler team in Seattle. Previously, he was working for Amazon.com’s catalog and selection team. He enjoys working on large scale distributed systems and products that help customers build scalable, and cost-effective applications.

Run a Spark SQL-based ETL pipeline with Amazon EMR on Amazon EKS

Post Syndicated from Melody Yang original https://aws.amazon.com/blogs/big-data/run-a-spark-sql-based-etl-pipeline-with-amazon-emr-on-amazon-eks/

This blog post has been translated into the following languages:

Increasingly, a business’s success depends on its agility in transforming data into actionable insights, which requires efficient and automated data processes. In the previous post – Build a SQL-based ETL pipeline with Apache Spark on Amazon EKS, we described a common productivity issue in a modern data architecture. To address the challenge, we demonstrated how to utilize a declarative approach as the key enabler to improve efficiency, which resulted in a faster time to value for businesses.

Generally speaking, managing applications declaratively in Kubernetes is a widely adopted best practice. You can use the same approach to build and deploy Spark applications with open-source or in-house build frameworks to achieve the same productivity goal.

For this post, we use the open-source data processing framework Arc, which is abstracted away from Apache Spark, to transform a regular data pipeline to an “extract, transform, and load (ETL) as definition” job. The steps in the data pipeline are simply expressed in a declarative definition (JSON) file with embedded declarative language SQL scripts.

The job definition in an Arc Jupyter notebook looks like the following screenshot.

This representation makes ETL much easier for a wider range of personas: analysts, data scientists, and any SQL authors who can fully express their data workflows without the need to write code in a programming language like Python.

In this post, we explore some key advantages of the latest Amazon EMR deployment option Amazon EMR on Amazon EKS to run Spark applications. We also explain its major difference from the commonly used Spark resource manager YARN, and demonstrate how to schedule a declarative ETL job with EMR on EKS. Building and testing the job on a custom Arc Jupyter kernel is out of scope for this post. You can find more tutorials on the Arc website.

Why choose Amazon EMR on Amazon EKS?

The following are some of the benefits of EMR on EKS:

  • Simplified architecture by unifying workloads – EMR on EKS enables us to run Apache Spark workloads on Amazon Elastic Kubernetes Service (Amazon EKS) without provisioning dedicated EMR clusters. If you have an existing Amazon EKS landscape in your organization, it makes sense to unify analytical workloads with other Kubernetes-based applications on the same Amazon EKS cluster. It improves resource utilization and significantly simplifies your infrastructure management.
  • More resources to share with a smaller JVM footprint – A major difference in this deployment option is the resource manager shift from YARN to Kubernetes and from a Hadoop cluster manager to a generic containerized application orchestrator. As shown in the following diagram, each Spark executor runs as a YARN container (compute and memory unit) in Hadoop. Broadly, YARN creates a JVM in each container requested by Hadoop applications, such as Apache Hive. When you run Spark on Kubernetes, it keeps your JVM footprint minimal, so that the Amazon EKS cluster can accommodate more applications, resulting in more spare resources for your analytical workloads.

  • Efficient resource sharing and cost savings – With the YARN cluster manager, if you want to reuse the same EMR cluster for concurrent Spark jobs to reduce cost, you have to compromise on resource isolation. Additionally, you have to pay for compute resources that aren’t fully utilized, such as a master node, because only Amazon EMR can use these unused compute resources. With EMR on EKS, you can enjoy the optimized resource allocation feature by sharing them across all your applications, which reduces cost.
  • Faster EMR runtime for Apache Spark – One of the key benefits of running Spark with EMR on EKS is the faster EMR runtime for Apache Spark. The runtime is a performance-optimized environment, which is available and turned on by default on Amazon EMR release 5.28.0 and later. In our performance tests using TPC-DS benchmark queries at 3 TB scale, we found EMR runtime for Apache Spark 3.0 provides a 1.7 times performance improvement on average, and up to 8 times improved performance for individual queries over open-source Apache Spark 3.0.0. It means you can run your Apache Spark applications faster and cheaper without requiring any changes to your applications.
  • Minimum setup to support multi-tenancy – While taking advantage of Spark’s Dynamic Resource Allocation, auto scaling in Amazon EKS, high availability with multiple Availability Zones, you can isolate your workloads for multi-tenancy use cases, with a minimum configuration required. Additionally, without any infrastructure setup, you can use an Amazon EKS cluster to run a single application that requires different Apache Spark versions and configurations, for example for development vs test environments.

Cost effectiveness

EMR on EKS pricing is calculated based on the vCPU and memory resources used from the time you start to download your Amazon EMR application image until the Spark pod on Amazon EKS stops, rounded up to the nearest second. The following screenshot is an example of the cost in the us-east-1 Region.

The Amazon EMR uplift price is in addition to the Amazon EKS pricing and any other services used by Amazon EKS, such as EC2 instances and EBS volumes. You pay $0.10 per hour for each Amazon EKS cluster that you use. However, you can use a single Amazon EKS cluster to run multiple applications by taking advantage of Kubernetes namespaces and AWS Identity and Access Management (IAM) security policies.

While the application runs, your resources are allocated and removed automatically by the Amazon EKS auto scaling feature, in order to eliminate over-provisioning or under-utilization of these resources. It enables you to lower costs because you only pay for the resources you use.

To further reduce the running cost for jobs that aren’t time-critical, you can schedule Spark executors on Spot Instances to save up to 90% over On-Demand prices. In order to maintain the resiliency of your Spark cluster, it is recommended to run driver on a reliable On-Demand Instance, because the driver is responsible for requesting new executors to replace failed ones when an unexpected event happens.

Kubernetes comes with a YAML specification called a pod template that can help you to assign Spark driver and executor pods to Spot or On-Demand EC2 instances. You define nodeSelector rules in pod templates, then upload to Amazon Simple Storage Service (Amazon S3). Finally, at the job submission, specify the Spark properties spark.kubernetes.driver.podTemplateFile and spark.kubernetes.executor.podTemplateFile to point to the pod templates in Amazon S3.

For example, the following is the code for executor_pod_template.yaml:

apiVersion: v1
kind: Pod
spec:
  nodeSelector:
    eks.amazonaws.com/capacityType: SPOT

The following is the code for driver_pod_template.yaml:

apiVersion: v1
kind: Pod
spec:
  nodeSelector:
    eks.amazonaws.com/capacityType: ON_DEMAND

The following is the code for the Spark job submission:

aws emr-containers start-job-run \
--virtual-cluster-id ${EMR_EKS_CLUSTER_ID} \
--name spark-pi-pod-template \
--execution-role-arn ${EMR_EKS_ROLE_ARN} \
--release-label emr-5.33.0-latest \
--job-driver '{
    "sparkSubmitJobDriver": {
        "entryPoint": "s3://'${s3DemoBucket}'/someAppCode.py",
        "sparkSubmitParameters": "--conf spark.kubernetes.driver.podTemplateFile=\"s3://'${s3DemoBucket}'/pod_templates/driver_pod_template.yaml\" --conf spark.kubernetes.executor.podTemplateFile=\"s3://'${s3DemoBucket}'/pod_templates/executor_pod_template.yaml\" --conf spark.executor.instances=2 --conf spark.executor.memory=2G --conf spark.executor.cores=2"
	}
    }'

Beginning with Amazon EMR versions 5.33.0 or 6.3.0, Amazon EMR on EKS supports the Amazon S3-based pod template feature. If you’re using an unsupported Amazon EMR version, such as EMR 6.1.0, you can use the pod template feature without Amazon S3 support. Make sure your Spark version is 3.0 or later, and copy the template files into your custom Docker image. The job submit script is changed to the following:

"--conf spark.kubernetes.driver.podTemplateFile=/local/path/to/driver_pod_template.yaml" 
"--conf spark.kubernetes.executor.podTemplateFile=/local/path/to/executor_pod_template.yaml"

Serverless compute option: AWS Fargate

The sample solution runs on an Amazon EKS cluster with AWS Fargate. Fargate is a serverless compute engine for Amazon EKS and Amazon ECS. It makes it easy for you to focus on building applications because it removes the need to provision and manage EC2 instances or managed node groups in EKS. Fargate runs each task or pod in its own kernel, providing its own isolated compute environment. This enables your application to have resource isolation and enhanced security by design.

With Fargate, you don’t need to be an expert in Kubernetes operations. It automatically allocates the right amount of compute, eliminating the need to choose instances and scale cluster capacity, so the Kubernetes Cluster Autoscaler is no longer required to increase your cluster’s compute capacity.

In our instance, each Spark executor or driver is provisioned by a separate Fargate pod, to form a Spark cluster dedicated to an ETL pipeline. You only need to specify and pay for resources required to run the application—no more concerns about complex cluster management, queues, and isolation trade-offs.

Other Amazon EC2 options

In addition to the serverless choice, EMR on EKS can operate on all types of EKS clusters. For example, Amazon EKS managed node groups with Amazon Elastic Compute Cloud (Amazon EC2) On-Demand and Spot Instances.

Previously, we mentioned placing a Spark driver pod on an On-Demand Instance to reduce interruption risk. To further improve your cluster stability, it’s important to understand the Kubernetes high availability and restart policy features. These allow for exciting new possibilities, not only in computational multi-tenancy, but also in the ability of application self-recovery, for example relaunching a Spark driver on Spot or On-Demand instances. For more information and an example, see the GitHub repo.

As of this writing, our test result shows that a Spark driver can’t restart on failure with the EMR on EKS deployment type. So be mindful when designing a Spark application with the minimum downtime need, especially for a time-critical job. We recommend the following:

  • Diversify your Spot requests – Similar to Amazon EMR’s instance fleet, EMR on EKS allows you to deploy a single application across multiple instance types to further enhance availability. With Amazon EC2 Spot best practices, such as capacity rebalancing, you can diversify the Spot request across multiple instance types within each Availability Zone. It limits the impact of Spot interruptions on your workload, if a Spot Instance is reclaimed by Amazon EC2. For more details, see Running cost optimized Spark workloads on Kubernetes using EC2 Spot Instances.
  • Increase resilience – Repeatedly restarting a Spark application compromises your application performance or the length of your jobs, especially for time-sensitive data processes. We recommend the following best practice to increase your application resilience:
    • Ensure your job is stateless so that it can self-recover without waiting for a dependency.
    • If a checkpoint is required, for example in a stateful Spark streaming ETL, make sure your checkpoint storage is decoupled from the Amazon EKS compute resource, which can be detached and attached to your Kubernetes cluster via the persistent volume claims (PVCs), or simply use S3 Cloud Storage.
    • Run the Spark driver on the On-Demand Instance defined by a pod template. It adds an extra layer of resiliency to your Spark application with EMR on EKS.

Security

EMR on EKS inherits the fine-grained security feature IAM roles for service accounts, (IRSA), offered by Amazon EKS. This means your data access control is no longer at the compute instance level, instead it’s granular at the container or pod level and controlled by an IAM role. The token-based authentication approach allows us to use one of the AWS default credential providers WebIdentityTokenCredentialsProvider to exchange the Kubernetes-issued token for IAM role credentials. It makes sure our applications deployed with EMR on EKS can communicate to other AWS services in a secure and private channel, without the need to store a long-lived AWS credential pair as a Kubernetes secret.

To learn more about the implementation details, see the GitHub repo.

Solution overview

In this example, we introduce a quality-aware design with the open-source declarative data processing Arc framework to abstract Spark technology away from you. It makes it easy for you to focus on business outcomes, not technologies.

We walk through the steps to run a predefined Arc ETL job with the EMR on EKS approach. For more information, see the GitHub repo.

The sample solution launches a serverless Amazon EKS cluster, loads TLC green taxi trip records from a public S3 bucket, applies dataset schema, aggregates the data, and outputs to an S3 bucket as Parquet file format. The sample Spark application is defined as a Jupyter notebook green_taxi_load.ipynb powered by Arc, and the metadata information is defined in green_taxi_schema.json.

The following diagram illustrates our solution architecture.

Launch Amazon EKS

Provisioning takes approximately 20 minutes.

To get started, open AWS CloudShell in the us-east-1 Region. Run the following command to provision the new cluster eks-cluster, backed by Fargate. Then build the EMR virtual cluster emr-on-eks-cluster:

curl https://raw.githubusercontent.com/aws-samples/sql-based-etl-on-amazon-eks/main/emr-on-eks/provision.sh | bash

At the end of the provisioning, the shell script automatically creates an EMR virtual cluster by the following command. It registers Amazon EMR with the newly created Amazon EKS cluster. The dedicated namespace on the EKS is called emr.

Deploy the sample ETL job

  1. When provisioning is complete, submit the sample ETL job to EMR on EKS with a serverless virtual cluster called emr-on-eks-cluster:
curl https://raw.githubusercontent.com/aws-samples/sql-based-etl-on-amazon-eks/main/emr-on-eks/submit_arc_job.sh | bash

It runs the following job summit command:

The declarative ETL job can be found on the blogpost’s GitHub repository. This is a screenshot of the job specification:

  1. Check your job progress and auto scaling status:
kubectl get pod -n emr
kubectl get node --label-columns=topology.kubernetes.io/zone
  1. As the job requested 10 executors, it automatically scales the Spark application from 0 to 10 pods (executors) on the EKS cluster. The Spark application automatically removes itself from the EKS when the job is done.

  1. Navigate to your Amazon EMR console to view application logs on the Spark History Server.

  1. You can also check the logs in CloudShell, as long as your Spark Driver is running:
driver_name=$(kubectl get pod -n emr | grep "driver" | awk '{print $1}')
kubectl logs ${driver_name} -n emr -c spark-kubernetes-driver | grep 'event'

Clean up

To clean up the AWS resources you created, run the following code:

curl https://raw.githubusercontent.com/aws-samples/sql-based-etl-on-amazon-eks/main/emr-on-eks/deprovision.sh | bash

Region support

At the time of this writing, Amazon EMR on EKS is available in US East (N. Virginia), US West (Oregon), US West (N. California), US East (Ohio), Canada (Central), Europe (Ireland, Frankfurt, and London), and Asia Pacific (Mumbai, Seoul, Singapore, Sydney, and Tokyo) Regions. If you want to use EMR on EKS in a Region that isn’t available yet, check out the open-source Apache Spark on Amazon EKS alternative on aws-samples GitHub. You can deploy the sample solution to your Region as long as Amazon EKS is available. Migrating a Spark workload on Amazon EKS to the fully managed EMR on EKS is easy and straightforward, with minimum changes required. Because the self-contained Spark application remains the same, only the deployment implementation differs.

Conclusion

This post introduces Amazon EMR on Amazon EKS and provides a walkthrough of a sample solution to demonstrate the “ETL as definition” concept. A declarative data processing framework enables you to build and deploy your Spark workloads with enhanced efficiency. With EMR on EKS, running applications built upon a declarative framework maximizes data process productivity, performance, reliability, and availability at scale. This pattern abstracts Spark technology away from you, and helps you to focus on deliverables that optimize business outcomes.

The built-in optimizations provided by the managed EMR on EKS can help not only data engineers with analytical skills, but also analysts, data scientists, and any SQL authors who can fully express their data workflows declaratively in Spark SQL. You can use this architectural pattern to drive your data ownership shift in your organization, from IT to non-IT stakeholders who have a better understanding of business operations and needs.


About the Authors

Melody Yang is a Senior Analytics Specialist Solution Architect at AWS with expertise in Big Data technologies. She is an experienced analytics leader working with AWS customers to provide best practice guidance and technical advice in order to assist their success in data transformation. Her areas of interests are open-source frameworks and automation, data engineering and DataOps.

 

 

Shiva Achari is a Senior Data Lab Architect at AWS. He helps AWS customers to design and build data and analytics prototypes via the AWS Data Lab engagement. He has over 14 years of experience working with enterprise customers and startups primarily in the Data and Big Data Analytics space.

 

 

Daniel Maldonado is an AWS Solutions Architect, specialist in Microsoft workloads and Big Data technologies, focused on helping customers to migrate their applications and data to AWS. Daniel has over 12 years of experience working with Information Technologies and enjoys helping clients reap the benefits of running their workloads in the cloud.

 

 

Igor Izotov is an AWS enterprise solutions architect, and he works closely with Australia’s largest financial services organizations. Prior to AWS, Igor held solution architecture and engineering roles with tier-1 consultancies and software vendors. Igor is passionate about all things data and modern software engineering. Outside of work, he enjoys writing and performing music, a good audiobook, or a jog, often combining the latter two.

Using AWS CodePipeline for deploying container images to AWS Lambda Functions

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/using-aws-codepipeline-for-deploying-container-images-to-aws-lambda-functions/

AWS Lambda launched support for packaging and deploying functions as container images at re:Invent 2020. In the post working with Lambda layers and extensions in container images, we demonstrated packaging Lambda Functions with layers while using container images. This post will teach you to use AWS CodePipeline to deploy docker images for microservices architecture involving multiple Lambda Functions and a common layer utilized as a separate container image. Lambda functions packaged as container images do not support adding Lambda layers to the function configuration. Alternatively, we can use a container image as a common layer while building other container images along with Lambda Functions shown in this post. Packaging Lambda functions as container images enables familiar tooling and larger deployment limits.

Here are some advantages of using container images for Lambda:

  • Easier dependency management and application building with container
    • Install native operating system packages
    • Install language-compatible dependencies
  • Consistent tool set for containers and Lambda-based applications
    • Utilize the same container registry to store application artifacts (Amazon ECR, Docker Hub)
    • Utilize the same build and pipeline tools to deploy
    • Tools that can inspect Dockerfile work the same
  • Deploy large applications with AWS-provided or third-party images up to 10 GB
    • Include larger application dependencies that previously were impossible

When using container images with Lambda, CodePipeline automatically detects code changes in the source repository in AWS CodeCommit, then passes the artifact to the build server like AWS CodeBuild and pushes the container images to ECR, which is then deployed to Lambda functions.

Architecture diagram

 

DevOps Architecture

Lambda-docker-images-DevOps-Architecture

Application Architecture

lambda-docker-image-microservices-app

In the above architecture diagram, two architectures are combined, namely 1, DevOps Architecture and 2, Microservices Application Architecture. DevOps architecture demonstrates the use of AWS Developer services such as AWS CodeCommit, AWS CodePipeline, AWS CodeBuild along with Amazon Elastic Container Repository (ECR) and AWS CloudFormation. These are used to support Continuous Integration and Continuous Deployment/Delivery (CI/CD) for both infrastructure and application code. Microservices Application architecture demonstrates how various AWS Lambda Functions that are part of microservices utilize container images for application code. This post will focus on performing CI/CD for Lambda functions utilizing container containers. The application code used in here is a simpler version taken from Serverless DataLake Framework (SDLF). For more information, refer to the AWS Samples GitHub repository for SDLF here.

DevOps workflow

There are two CodePipelines: one for building and pushing the common layer docker image to Amazon ECR, and another for building and pushing the docker images for all the Lambda Functions within the microservices architecture to Amazon ECR, as well as deploying the microservices architecture involving Lambda Functions via CloudFormation. Common layer container image functions as a common layer in all other Lambda Function container images, therefore its code is maintained in a separate CodeCommit repository used as a source stage for a CodePipeline. Common layer CodePipeline takes the code from the CodeCommit repository and passes the artifact to a CodeBuild project that builds the container image and pushes it to an Amazon ECR repository. This common layer ECR repository functions as a source in addition to the CodeCommit repository holding the code for all other Lambda Functions and resources involved in the microservices architecture CodePipeline.

Due to all or the majority of the Lambda Functions in the microservices architecture requiring the common layer container image as a layer, any change made to it should invoke the microservices architecture CodePipeline that builds the container images for all Lambda Functions. Moreover, a CodeCommit repository holding the code for every resource in the microservices architecture is another source to that CodePipeline to get invoked. This has two sources, because the container images in the microservices architecture should be built for changes in the common layer container image as well as for the code changes made and pushed to the CodeCommit repository.

Below is the sample dockerfile that uses the common layer container image as a layer:

ARG ECR_COMMON_DATALAKE_REPO_URL
FROM ${ECR_COMMON_DATALAKE_REPO_URL}:latest AS layer
FROM public.ecr.aws/lambda/python:3.8
# Layer Code
WORKDIR /opt
COPY --from=layer /opt/ .
# Function Code
WORKDIR /var/task
COPY src/lambda_function.py .
CMD ["lambda_function.lambda_handler"]

where the argument ECR_COMMON_DATALAKE_REPO_URL should resolve to the ECR url for common layer container image, which is provided to the --build-args along with docker build command. For example:

export ECR_COMMON_DATALAKE_REPO_URL="0123456789.dkr.ecr.us-east-2.amazonaws.com/dev-routing-lambda"
docker build --build-arg ECR_COMMON_DATALAKE_REPO_URL=$ECR_COMMON_DATALAKE_REPO_URL .

Deploying a Sample

  • Step1: Clone the repository Codepipeline-lambda-docker-images to your workstation. If using the zip file, then unzip the file to a local directory.
    • git clone https://github.com/aws-samples/codepipeline-lambda-docker-images.git
  • Step 2: Change the directory to the cloned directory or extracted directory. The local code repository structure should appear as follows:
    • cd codepipeline-lambda-docker-images

code-repository-structure

  • Step 3: Deploy the CloudFormation stack used in the template file CodePipelineTemplate/codepipeline.yaml to your AWS account. This deploys the resources required for DevOps architecture involving AWS CodePipelines for common layer code and microservices architecture code. Deploy CloudFormation stacks using the AWS console by following the documentation here, providing the name for the stack (for example datalake-infra-resources) and passing the parameters while navigating the console. Furthermore, use the AWS CLI to deploy a CloudFormation stack by following the documentation here.
  • Step 4: When the CloudFormation Stack deployment completes, navigate to the AWS CloudFormation console and to the Outputs section of the deployed stack, then note the CodeCommit repository urls. Three CodeCommit repo urls are available in the CloudFormation stack outputs section for each CodeCommit repository. Choose one of them based on the way you want to access it. Refer to the following documentation Setting up for AWS CodeCommit. I will be using the git-remote-codecommit (grc) method throughout this post for CodeCommit access.
  • Step 5: Clone the CodeCommit repositories and add code:
      • Common Layer CodeCommit repository: Take the value of the Output for the key oCommonLayerCodeCommitHttpsGrcRepoUrl from datalake-infra-resources CloudFormation Stack Outputs section which looks like below:

    commonlayercodeoutput

      • Clone the repository:
        • git clone codecommit::us-east-2://dev-CommonLayerCode
      • Change the directory to dev-CommonLayerCode
        • cd dev-CommonLayerCode
      •  Add contents to the cloned repository from the source code downloaded in Step 1. Copy the code from the CommonLayerCode directory and the repo contents should appear as follows:

    common-layer-repository

      • Create the main branch and push to the remote repository
        git checkout -b main
        git add ./
        git commit -m "Initial Commit"
        git push -u origin main
      • Application CodeCommit repository: Take the value of the Output for the key oAppCodeCommitHttpsGrcRepoUrl from datalake-infra-resources CloudFormation Stack Outputs section which looks like below:

    appcodeoutput

      • Clone the repository:
        • git clone codecommit::us-east-2://dev-AppCode
      • Change the directory to dev-CommonLayerCode
        • cd dev-AppCode
      • Add contents to the cloned repository from the source code downloaded in Step 1. Copy the code from the ApplicationCode directory and the repo contents should appear as follows from the root:

    app-layer-repository

    • Create the main branch and push to the remote repository
      git checkout -b main
      git add ./
      git commit -m "Initial Commit"
      git push -u origin main

What happens now?

  • Now the Common Layer CodePipeline goes to the InProgress state and invokes the Common Layer CodeBuild project that builds the docker image and pushes it to the Common Layer Amazon ECR repository. The image tag utilized for the container image is the value resolved for the environment variable available in the AWS CodeBuild project CODEBUILD_RESOLVED_SOURCE_VERSION. This is the CodeCommit git Commit Id in this case.
    For example, if the CommitId in CodeCommit is f1769c87, then the pushed docker image will have this tag along with latest
  • buildspec.yaml files appears as follows:
    version: 0.2
    phases:
      install:
        runtime-versions:
          docker: 19
      pre_build:
        commands:
          - echo Logging in to Amazon ECR...
          - aws --version
          - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
          - REPOSITORY_URI=$ECR_COMMON_DATALAKE_REPO_URL
          - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
          - IMAGE_TAG=${COMMIT_HASH:=latest}
      build:
        commands:
          - echo Build started on `date`
          - echo Building the Docker image...          
          - docker build -t $REPOSITORY_URI:latest .
          - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
      post_build:
        commands:
          - echo Build completed on `date`
          - echo Pushing the Docker images...
          - docker push $REPOSITORY_URI:latest
          - docker push $REPOSITORY_URI:$IMAGE_TAG
  • Now the microservices architecture CodePipeline goes to the InProgress state and invokes all of the application image builder CodeBuild project that builds the docker images and pushes them to the Amazon ECR repository.
    • To improve the performance, every docker image is built in parallel within the codebuild project. The buildspec.yaml executes the build.sh script. This has the logic to build docker images required for each Lambda Function part of the microservices architecture. The docker images used for this sample architecture took approximately 4 to 5 minutes when the docker images were built serially. After switching to parallel building, it took approximately 40 to 50 seconds.
    • buildspec.yaml files appear as follows:
      version: 0.2
      phases:
        install:
          runtime-versions:
            docker: 19
          commands:
            - uname -a
            - set -e
            - chmod +x ./build.sh
            - ./build.sh
      artifacts:
        files:
          - cfn/**/*
        name: builds/$CODEBUILD_BUILD_NUMBER/cfn-artifacts
    • build.sh file appears as follows:
      #!/bin/bash
      set -eu
      set -o pipefail
      
      RESOURCE_PREFIX="${RESOURCE_PREFIX:=stg}"
      AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION:=us-east-1}"
      ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text 2>&1)
      ECR_COMMON_DATALAKE_REPO_URL="${ECR_COMMON_DATALAKE_REPO_URL:=$ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com\/$RESOURCE_PREFIX-common-datalake-library}"
      pids=()
      pids1=()
      
      PROFILE='new-profile'
      aws configure --profile $PROFILE set credential_source EcsContainer
      
      aws --version
      $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      BUILD_TAG=build-$(echo $CODEBUILD_BUILD_ID | awk -F":" '{print $2}')
      IMAGE_TAG=${BUILD_TAG:=COMMIT_HASH:=latest}
      
      cd dockerfiles;
      mkdir ../logs
      function pwait() {
          while [ $(jobs -p | wc -l) -ge $1 ]; do
              sleep 1
          done
      }
      
      function build_dockerfiles() {
          if [ -d $1 ]; then
              directory=$1
              cd $directory
              echo $directory
              echo "---------------------------------------------------------------------------------"
              echo "Start creating docker image for $directory..."
              echo "---------------------------------------------------------------------------------"
                  REPOSITORY_URI=$ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$RESOURCE_PREFIX-$directory
                  docker build --build-arg ECR_COMMON_DATALAKE_REPO_URL=$ECR_COMMON_DATALAKE_REPO_URL . -t $REPOSITORY_URI:latest -t $REPOSITORY_URI:$IMAGE_TAG -t $REPOSITORY_URI:$COMMIT_HASH
                  echo Build completed on `date`
                  echo Pushing the Docker images...
                  docker push $REPOSITORY_URI
              cd ../
              echo "---------------------------------------------------------------------------------"
              echo "End creating docker image for $directory..."
              echo "---------------------------------------------------------------------------------"
          fi
      }
      
      for directory in *; do 
         echo "------Started processing code in $directory directory-----"
         build_dockerfiles $directory 2>&1 1>../logs/$directory-logs.log | tee -a ../logs/$directory-logs.log &
         pids+=($!)
         pwait 20
      done
      
      for pid in "${pids[@]}"; do
        wait "$pid"
      done
      
      cd ../cfn/
      function build_cfnpackages() {
          if [ -d ${directory} ]; then
              directory=$1
              cd $directory
              echo $directory
              echo "---------------------------------------------------------------------------------"
              echo "Start packaging cloudformation package for $directory..."
              echo "---------------------------------------------------------------------------------"
              aws cloudformation package --profile $PROFILE --template-file template.yaml --s3-bucket $S3_BUCKET --output-template-file packaged-template.yaml
              echo "Replace the parameter 'pEcrImageTag' value with the latest built tag"
              echo $(jq --arg Image_Tag "$IMAGE_TAG" '.Parameters |= . + {"pEcrImageTag":$Image_Tag}' parameters.json) > parameters.json
              cat parameters.json
              ls -al
              cd ../
              echo "---------------------------------------------------------------------------------"
              echo "End packaging cloudformation package for $directory..."
              echo "---------------------------------------------------------------------------------"
          fi
      }
      
      for directory in *; do
          echo "------Started processing code in $directory directory-----"
          build_cfnpackages $directory 2>&1 1>../logs/$directory-logs.log | tee -a ../logs/$directory-logs.log &
          pids1+=($!)
          pwait 20
      done
      
      for pid in "${pids1[@]}"; do
        wait "$pid"
      done
      
      cd ../logs/
      ls -al
      for f in *; do
        printf '%s\n' "$f"
        paste /dev/null - < "$f"
      done
      
      cd ../
      

The function build_dockerfiles() loops through each directory within the dockerfiles directory and runs the docker build command in order to build the docker image. The name for the docker image and then the ECR repository is determined by the directory name in which the DockerFile is used from. For example, if the DockerFile directory is routing-lambda and the environment variables take the below values,

ACCOUNT_ID=0123456789
AWS_DEFAULT_REGION=us-east-2
RESOURCE_PREFIX=dev
directory=routing-lambda
REPOSITORY_URI=$ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$RESOURCE_PREFIX-$directory

Then REPOSITORY_URI becomes 0123456789.dkr.ecr.us-east-2.amazonaws.com/dev-routing-lambda
And the docker image is pushed to this resolved REPOSITORY_URI. Similarly, docker images for all other directories are built and pushed to Amazon ECR.

Important Note: The ECR repository names match the directory names where the DockerFiles exist and was already created as part of the CloudFormation template codepipeline.yaml that was deployed in step 3. In order to add more Lambda Functions to the microservices architecture, make sure that the ECR repository name added to the new repository in the codepipeline.yaml template matches the directory name within the AppCode repository dockerfiles directory.

Every docker image is built in parallel in order to save time. Each runs as a separate operating system process and is pushed to the Amazon ECR repository. This also controls the number of processes that could run in parallel by setting a value for the variable pwait within the loop. For example, if pwait 20, then the maximum number of parallel processes is 20 at a given time. The image tag for all docker images used for Lambda Functions is constructed via the CodeBuild BuildId, which is available via environment variable $CODEBUILD_BUILD_ID, in order to ensure that a new image gets a new tag. This is required for CloudFormation to detect changes and update Lambda Functions with the new container image tag.

Once every docker image is built and pushed to Amazon ECR in the CodeBuild project, it builds every CloudFormation package by uploading all local artifacts to Amazon S3 via AWS Cloudformation package CLI command for the templates available in its own directory within the cfn directory. Moreover, it updates every parameters.json file for each directory with the ECR image tag to the parameter value pEcrImageTag. This is required for CloudFormation to detect changes and update the Lambda Function with the new image tag.

After this, the CodeBuild project will output the packaged CloudFormation templates and parameters files as an artifact to AWS CodePipeline so that it can be deployed via AWS CloudFormation in further stages. This is done by first creating a ChangeSet and then deploying it at the next stage.

Testing the microservices architecture

As stated above, the sample application utilized for microservices architecture involving multiple Lambda Functions is a modified version of the Serverless Data Lake Framework. The microservices architecture CodePipeline deployed every AWS resource required to run the SDLF application via AWS CloudFormation stages. As part of SDLF, it also deployed a set of DynamoDB tables required for the applications to run. I utilized the meteorites sample for this, thereby the DynamoDb tables should be added with the necessary data for the application to run for this sample.

Utilize the AWS console to write data to the AWS DynamoDb Table. For more information, refer to this documentation. The sample json files are in the utils/DynamoDbConfig/ directory.

1. Add the record below to the octagon-Pipelines-dev DynamoDB table:

{
"description": "Main Pipeline to Ingest Data",
"ingestion_frequency": "WEEKLY",
"last_execution_date": "2020-03-11",
"last_execution_duration_in_seconds": 4.761,
"last_execution_id": "5445249c-a097-447a-a957-f54f446adfd2",
"last_execution_status": "COMPLETED",
"last_execution_timestamp": "2020-03-11T02:34:23.683Z",
"last_updated_timestamp": "2020-03-11T02:34:23.683Z",
"modules": [
{
"name": "pandas",
"version": "0.24.2"
},
{
"name": "Python",
"version": "3.7"
}
],
"name": "engineering-main-pre-stage",
"owner": "Yuri Gagarin",
"owner_contact": "y.gagarin@",
"status": "ACTIVE",
"tags": [
{
"key": "org",
"value": "VOSTOK"
}
],
"type": "INGESTION",
"version": 127
}

2. Add the record below to the octagon-Pipelines-dev DynamoDB table:

{
"description": "Main Pipeline to Merge Data",
"ingestion_frequency": "WEEKLY",
"last_execution_date": "2020-03-11",
"last_execution_duration_in_seconds": 570.559,
"last_execution_id": "0bb30d20-ace8-4cb2-a9aa-694ad018694f",
"last_execution_status": "COMPLETED",
"last_execution_timestamp": "2020-03-11T02:44:36.069Z",
"last_updated_timestamp": "2020-03-11T02:44:36.069Z",
"modules": [
{
"name": "PySpark",
"version": "1.0"
}
],
"name": "engineering-main-post-stage",
"owner": "Neil Armstrong",
"owner_contact": "n.armstrong@",
"status": "ACTIVE",
"tags": [
{
"key": "org",
"value": "NASA"
}
],
"type": "TRANSFORM",
"version": 4
}

3. Add the record below to the octagon-Datsets-dev DynamoDB table:

{
"classification": "Orange",
"description": "Meteorites Name, Location and Classification",
"frequency": "DAILY",
"max_items_process": 250,
"min_items_process": 1,
"name": "engineering-meteorites",
"owner": "NASA",
"owner_contact": "[email protected]",
"pipeline": "main",
"tags": [
{
"key": "cost",
"value": "meteorites division"
}
],
"transforms": {
"stage_a_transform": "light_transform_blueprint",
"stage_b_transform": "heavy_transform_blueprint"
},
"type": "TRANSACTIONAL",
"version": 1
}

 

If you want to create these samples using AWS CLI, please refer to this documentation.

Record 1:

aws dynamodb put-item --table-name octagon-Pipelines-dev --item '{"description":{"S":"Main Pipeline to Merge Data"},"ingestion_frequency":{"S":"WEEKLY"},"last_execution_date":{"S":"2021-03-16"},"last_execution_duration_in_seconds":{"N":"930.097"},"last_execution_id":{"S":"e23b7dae-8e83-4982-9f97-5784a9831a14"},"last_execution_status":{"S":"COMPLETED"},"last_execution_timestamp":{"S":"2021-03-16T04:31:16.968Z"},"last_updated_timestamp":{"S":"2021-03-16T04:31:16.968Z"},"modules":{"L":[{"M":{"name":{"S":"PySpark"},"version":{"S":"1.0"}}}]},"name":{"S":"engineering-main-post-stage"},"owner":{"S":"Neil Armstrong"},"owner_contact":{"S":"n.armstrong@"},"status":{"S":"ACTIVE"},"tags":{"L":[{"M":{"key":{"S":"org"},"value":{"S":"NASA"}}}]},"type":{"S":"TRANSFORM"},"version":{"N":"8"}}'

Record 2:

aws dynamodb put-item --table-name octagon-Pipelines-dev --item '{"description":{"S":"Main Pipeline to Ingest Data"},"ingestion_frequency":{"S":"WEEKLY"},"last_execution_date":{"S":"2021-03-28"},"last_execution_duration_in_seconds":{"N":"1.75"},"last_execution_id":{"S":"7e0e04e7-b05e-41a6-8ced-829d47866a6a"},"last_execution_status":{"S":"COMPLETED"},"last_execution_timestamp":{"S":"2021-03-28T20:23:06.031Z"},"last_updated_timestamp":{"S":"2021-03-28T20:23:06.031Z"},"modules":{"L":[{"M":{"name":{"S":"pandas"},"version":{"S":"0.24.2"}}},{"M":{"name":{"S":"Python"},"version":{"S":"3.7"}}}]},"name":{"S":"engineering-main-pre-stage"},"owner":{"S":"Yuri Gagarin"},"owner_contact":{"S":"y.gagarin@"},"status":{"S":"ACTIVE"},"tags":{"L":[{"M":{"key":{"S":"org"},"value":{"S":"VOSTOK"}}}]},"type":{"S":"INGESTION"},"version":{"N":"238"}}'

Record 3:

aws dynamodb put-item --table-name octagon-Pipelines-dev --item '{"description":{"S":"Main Pipeline to Ingest Data"},"ingestion_frequency":{"S":"WEEKLY"},"last_execution_date":{"S":"2021-03-28"},"last_execution_duration_in_seconds":{"N":"1.75"},"last_execution_id":{"S":"7e0e04e7-b05e-41a6-8ced-829d47866a6a"},"last_execution_status":{"S":"COMPLETED"},"last_execution_timestamp":{"S":"2021-03-28T20:23:06.031Z"},"last_updated_timestamp":{"S":"2021-03-28T20:23:06.031Z"},"modules":{"L":[{"M":{"name":{"S":"pandas"},"version":{"S":"0.24.2"}}},{"M":{"name":{"S":"Python"},"version":{"S":"3.7"}}}]},"name":{"S":"engineering-main-pre-stage"},"owner":{"S":"Yuri Gagarin"},"owner_contact":{"S":"y.gagarin@"},"status":{"S":"ACTIVE"},"tags":{"L":[{"M":{"key":{"S":"org"},"value":{"S":"VOSTOK"}}}]},"type":{"S":"INGESTION"},"version":{"N":"238"}}'

Now upload the sample json files to the raw s3 bucket. The raw S3 bucket name can be obtained in the output of the common-cloudformation stack deployed as part of the microservices architecture CodePipeline. Navigate to the CloudFormation console in the region where the CodePipeline was deployed and locate the stack with the name common-cloudformation, navigate to the Outputs section, and then note the output bucket name with the key oCentralBucket. Navigate to the Amazon S3 Bucket console and locate the bucket for oCentralBucket, create two path directories named engineering/meteorites, and upload every sample json file to this directory. Meteorites sample json files are available in the utils/meteorites-test-json-files directory of the previously cloned repository. Wait a few minutes and then navigate to the stage bucket noted from the common-cloudformation stack output name oStageBucket. You can see json files converted into csv in pre-stage/engineering/meteorites folder in S3. Wait a few more minutes and then navigate to the post-stage/engineering/meteorites folder in the oStageBucket to see the csv files converted to parquet format.

 

Cleanup

Navigate to the AWS CloudFormation console, note the S3 bucket names from the common-cloudformation stack outputs, and empty the S3 buckets. Refer to Emptying the Bucket for more information.

Delete the CloudFormation stacks in the following order:
1. Common-Cloudformation
2. stagea
3. stageb
4. sdlf-engineering-meteorites
Then delete the infrastructure CloudFormation stack datalake-infra-resources deployed using the codepipeline.yaml template. Refer to the following documentation to delete CloudFormation Stacks: Deleting a stack on the AWS CloudFormation console or Deleting a stack using AWS CLI.

 

Conclusion

This method lets us use CI/CD via CodePipeline, CodeCommit, and CodeBuild, along with other AWS services, to automatically deploy container images to Lambda Functions that are part of the microservices architecture. Furthermore, we can build a common layer that is equivalent to the Lambda layer that could be built independently via its own CodePipeline, and then build the container image and push to Amazon ECR. Then, the common layer container image Amazon ECR functions as a source along with its own CodeCommit repository which holds the code for the microservices architecture CodePipeline. Having two sources for microservices architecture codepipeline lets us build every docker image. This is due to a change made to the common layer docker image that is referred to in other docker images, and another source that holds the code for other microservices including Lambda Function.

 

About the Author

kirankumar.jpeg Kirankumar Chandrashekar is a Sr.DevOps consultant at AWS Professional Services. He focuses on leading customers in architecting DevOps technologies. Kirankumar is passionate about DevOps, Infrastructure as Code, and solving complex customer issues. He enjoys music, as well as cooking and traveling.

 

How to run massively multiplayer games with EC2 Spot using Aurora Serverless

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/how-to-run-massively-multiplayer-games-with-ec2-spot-using-aurora-serverless/

This post is written by Yahav Biran, Principal Solutions Architect, and Pritam Pal, Sr. EC2 Spot Specialist SA

Massively multiplayer online (MMO) game servers must dynamically scale their compute and storage to create a world-scale persistence simulation with millions of dynamic objects, such as complex AR/VR synthetic environments that match real-world fidelity. The Elastic Kubernetes Service (EKS) powered by Amazon EC2 Spot and Aurora Serverless allow customers to create world-scale persistence simulations processed by numerous cost-effective compute chipsets, such as ARM, x86, or Nvidia GPU. It also persists them On-Demand — an automatic scaling configuration of open-source database engines like MySQL or PostgreSQL without managing any database capacity. This post proposes a fully open-sourced option to build, deploy, and manage MMOs on AWS. We use a Python-base game server to demonstrate the MMOs.

Challenge

Increasing competition in the gaming industry has driven game developers to architect cost-effective game servers that can scale up to meet player demands and scale down to meet cost goals. AWS enables MMO developers to scale their game beyond the limits of a single server. The game state (world) can spatially partition across many Regions based on the requested number of sessions or simulations using Amazon EC2 compute power. As the game progresses over many sessions, you must track the simulation’s global state. Amazon Aurora maintains the global game state in memory to manage complex interactions, such as hand-offs across instances and Regions. Amazon Aurora Serverless powers all of these for PostgreSQL or MySQL.

This blog shows how to use a commodity server using an ephemeral disk and decouple the game state from the game server. We store the game state in Aurora for PostgreSQL, but you can also use DynamoDB or KeySpaces for the NoSQL case.

 

Game overview

We use a Minecraft clone to demonstrate a distributed persistence simulation. The game server is python-based deployed on Agones, an open-source multiplayer dedicated game-server platform on Kubernetes. The Kubernetes cluster is powered by EC2 Spot Instances and configured with EC2 instances to auto-scale that expands and shrinks the compute seamlessly upon game-server allocation. We add a git-ops-based continuous delivery system that stores the game-server binaries and config in a git repository and deploys the game in a cluster deploy in one or more Regions to allow global compute scale. The following image is a diagram of the architecture.

The game server persists every object in an Aurora Serverless PostgreSQL-compatible edition. The serverless database configuration aids automatic start-up and scales capacity up or down as per player demand. The world is divided into 32×32 block chunks in the XYZ plane (Y is up). This allows it to be “infinite” (PostgreSQL Bigint type) and eases data management. Only visible chunks must be queried from the database.

The central database table is named “block” and has the columns p, q, x, y, z, w. (p, q) identifies the chunk, (x, y, z) identifies the block position, and (w) identifies the block type. 0 represents an empty block (air).

In the game, the chunks store their blocks in a hash map. An (x, y, z) key maps to a (w) value.

The y positions of blocks are limited to 0 <= y < 256. The upper limit is essentially an artificial limitation that prevents users from building tall structures. Users cannot destroy blocks at y = 0 to avoid falling underneath the world.

Solution overview

Kubernetes allows dedicated game server scaling and orchestration without limiting the compute platform spanning across many Regions and staying closer to the player. For simplicity, we use EKS to reduce operational overhead.

Amazon EC2 runs the compute simulation, which might require different EC2 instance types. These include compute-optimized instances for compute-bound applications benefiting from high-performance processors or accelerated compute (GPU), using hardware accelerators to perform functions like graphics processing or data pattern matching. In addition, the EC2 Auto Scaling  runs the game-server configured to use Amazon EC2 Spot Instances in order to allow up to 90% discount as compared to On-Demand Instance prices. However, Amazon EC2 can interrupt your Spot Instance when the demand for Spot Instances rises, when the supply of Spot Instances decreases, or when the Spot price exceeds your maximum price.

The following two mechanisms minimize the EC2 reclaim compute capacity impact:

  1. Pull interruption notifications and notify the game-server to replicate the session to another game server.
  2. Prioritize compute capacity based on availability.

Two Auto Scaling groups are deployed for method two. The first Auto Scaling group uses latest generation Spot Instances (C5, C5a, C5n, M5, M5a, and M5n) instances, and the second uses all generations x86-based instances (C4 and M4). We configure the cluster-autoscaler that controls the Auto Scaling group size with the Expander priority option in order to favor the latest generation Spot Auto Scaling group. The priority should be a positive value, and the highest value wins. For each priority value, a list of regular expressions should be given. The following example assumes an EKS cluster craft-us-west-2 and two ASGs. The craft-us-west-2-nodegroup-spot5 Auto Scaling group wins the priority. Therefore, new instances will be spawned from the EC2 Spot Auto Scaling group.

.…
- --expander=priority
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/craft-us-west-2
….
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    10:
      - .*craft-us-west-2-nodegroup-spot4*
    50:
      - .*craft-us-west-2-nodegroup-spot5*
---

The complete working spec is available in https://github.com/aws-samples/spotable-game-server/blob/master/specs/cluster_autoscaler.yml.

We propose the following two options to minimize player impact during interruption. The first is based on two-minute interruption notification, and the second on rebalance recommendations.

  1. Notify that that the instance will be shut down within two minutes.
  2. Notify when a Spot Instance is at elevated risk of interruption, this signal can arrive sooner than the two-minute Spot Instance interruption notice.

Choosing between the two depends on the transition you want for the player. Both cases deploy DaemonSet that listens to either notification and notifies every game server running on the EC2 instance. It also prevents new game-servers from running on this instance.

The daemon set pulls from the instance metadata every five seconds denoted by POLL_INTERVAL as follows:

while http_status=$(curl -o /dev/null -w '%{http_code}' -sL ${NOTICE_URL}); [ ${http_status} -ne 200 ]; do
  echo $(date): ${http_status}
  sleep ${POLL_INTERVAL}
done

where NOTICE_URL can be either

NOTICE_URL=”http://169.254.169.254/latest/meta-data/spot/termination-time”

Or, for the second option:

NOTICE_URL=”http://169.254.169.254/latest/meta-data/events/recommendations/rebalance”

The command that notifies all the game-servers about the interruption is:

kubectl drain ${NODE_NAME} --force --ignore-daemonsets --delete-local-data

From that point, every game server that runs on the instance gets notified by the Unix signal SIGTERM.

In our example server.py, we tell the OS to signal the Python process and complete the sig_handler function. The example prompts a message to every connected player regarding the incoming interruption.

def sig_handler(signum, frameframe):
  log('Signal handler called with signal',signum)
  model.send_talk("WARN game server maintenance is pending - your universe is saved")

def main():
    ..
    signal.signal(signal.SIGTERM,sig_handler)

 

Why Agones?

Agones orchestrates game servers via declarative configuration in order to manage groups of ready game-servers to play. It also offers integrated SDK for managing game server lifecycle, health, and configuration. Finally, it runs on Kubernetes, so it is an all-up open-source platform that runs anywhere. The Agones SDK is easily implemented. Furthermore, combining the compute platform AWS and Agones offers the most secure, resilient, scalable, and cost-effective method for running an MMO.

 

In our example, we implemented the /health in agones_health and /allocate in agones_allocate calls. Then, the agones_health() called upon the server init to indicate that it is ready to assign new players.

def agones_allocate(model):
  url="http://localhost:"+agones_port+"/allocate"
  req = urllib2.Request(url)
  req.add_header('Content-Type','application/json')
  req.add_data('')
  r = urllib2.urlopen(req)
  resp=r.getcode()
  log('agones- Response code from agones allocate was:',resp)
  model.send_talk("new player joined - reporting to agones the server is allocated") 

The agones_health() using the native health and relay its health to keep the game-server in a viable state.

def agones_health(model):
  url="http://localhost:"+agones_port+"/health"
  req = urllib2.Request(url)
  req.add_header('Content-Type','application/json')
  req.add_data('')
  while True:
    model.ishealthy()
    r = urllib2.urlopen(req)
    resp=r.getcode()
    log('agones- Response code from agones health was:',resp)
    time.sleep(10)

The main() function forks a new process that reports health. Agones manages the port allocation and maintains the game state, e.g., Allocated, Scheduled, Shutdown, Creating, and Unhealthy.

def main():
    …
    server = Server((host, port), Handler)
    server.model = model
    newpid=os.fork()
    if newpid ==0:
      log('agones-in child process about to call agones_health()')
      agones_health()
      log('agones-in child process called agones_health()')
    else:
      pids = (os.getpid(), newpid)
      log('agones server pid and health pid',pids)
    log('SERV', host, port)
    server.serve_forever()

 

Other ways than Agones on EKS?

Configuring an Agones group of ready game-servers behind a load balancer is difficult. Agones game-servers endpoint must be published so that players’ clients can connect and play. Agones creates game-servers endpoints that are an IP and port pair. The IP is the public IP of the EC2 Instance. The port results from PortPolicy, which generates a non-predictable port number. Hence, make it impossible to use with load balancer such as Amazon Network Load Balancer (NLB).

Suppose you want to use a load balancer to route players via a predictable endpoint. You could use the Kubernetes Deployment construct and configure a Kubernetes Service construct with NLB and no need to implement additional SDK or install any additional components on your EKS cluster.

The following example defines a service craft-svc that creates an NLB that will route TCP connections to Pod targets carrying the selector craft and listening on port 4080.

apiVersion: v1
kind: Service
metadata:
  name: craft-svc
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  selector:
    app: craft
  ports:
    - protocol: TCP
      port: 4080
      targetPort: 4080
  type: LoadBalancer

The game server Deployment set the metadata label for the Service load balancer and the port.

Furthermore, the Kubernetes readinessProbe and livenessProbe offer similar features as the Agones SDK /allocate and /health implemented prior, making the Deployment option parity with the Agones option.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: craft
  name: craft
spec:
…
    metadata:
      labels:
        app: craft
    spec:
      …
        readinessProbe:
          tcpSocket:
            port: 4080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          tcpSocket:
            port: 4080
          initialDelaySeconds: 5
          periodSeconds: 10

Overview of the Database

The compute layer running the game could be reclaimed at any time within minutes, so it is imperative to continuously store the game state as players progress without impacting the experience. Furthermore, it is essential to read the game state quickly from scratch upon session recovery from an unexpected interruption. Therefore, the game requires fast reads and lazy writes. More considerations are consistency and isolation. Developers could handle inconsistency via the game in order to relax hard consistency requirements. As for isolation, in our Craft example players can build a structure without other players seeing it and publish it globally only when they choose.

Choosing the proper database for the MMO depends on the game state structure, e.g., a set of related objects such as our Craft example or a single denormalized table. The former fits the relational model used with RDS or Aurora open-source databases such as MySQL or PostgreSQL. While the latter can be used with Keyspaces, an AWS managed Cassandra, or a key-value store such as DynamoDB. Our example includes two options to store the game state with Aurora Serverless for PostgreSQL or DynamoDB. We chose those because of the ACID support. PostgreSQL offers four isolation levels: dirty read, nonrepeatable read, phantom read, and serialization anomaly. DynamoDB offers two isolation levels: serializable and read-committed. Both databases’ options allow the game developer to implement the best player experience and avoid additional implementation efforts.

Moreover, both engines offer Restful connection methods to the database. Aurora uses Data API. The Data API doesn’t require a persistent DB cluster connection. Instead, it provides a secure HTTP endpoint and integration with AWS SDKs. Game developers can run SQL statements via the endpoint without managing connections. DynamoDB only supports a Restful connection. Finally, Aurora Serverless and DynamoDB scale the compute and storage to reduce operational overhead and pay only for what was played.

 

Conclusion

MMOs are unique because they require infrastructure features similar to other game types like FPS or casual games, and reliable persistence storage for the game-state. Unfortunately, this leads to expensive choices that make monetizing the game difficult. Therefore, we proposed an option with a fun game to help you, the developer, analyze what is best for your MMO. Our option is built upon open-source projects that allow you to build it anywhere, but we also show that AWS offers the most cost-effective and scalable option. We encourage you to read recent announcements about these topics, including several at AWS re:Invent 2021.

 

New – Amazon EC2 M6i Instances Powered by the Latest-Generation Intel Xeon Scalable Processors

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-amazon-ec2-m6i-instances-powered-by-the-latest-generation-intel-xeon-scalable-processors/

Last year, we introduced the sixth generation of EC2 instances powered by AWS-designed Graviton2 processors. We’re now expanding our sixth-generation offerings to include x86-based instances, delivering price/performance benefits for workloads that rely on x86 instructions.

Today, I am happy to announce the availability of the new general purpose Amazon EC2 M6i instances, which offer up to 15% improvement in price/performance versus comparable fifth-generation instances. The new instances are powered by the latest generation Intel Xeon Scalable processors (code-named Ice Lake) with an all-core turbo frequency of 3.5 GHz.

You might have noticed that we’re now using the “i” suffix in the instance type to specify that the instances are using an Intel processor. We already use the suffix “a” for AMD processors (for example, M5a instances) and “g” for Graviton processors (for example, M6g instances).

Compared to M5 instances using an Intel processor, this new instance type provides:

  • A larger instance size (m6i.32xlarge) with 128 vCPUs and 512 GiB of memory that makes it easier and more cost-efficient to consolidate workloads and scale up applications.
  • Up to 15% improvement in compute price/performance.
  • Up to 20% higher memory bandwidth.
  • Up to 40 Gbps for Amazon Elastic Block Store (EBS) and 50 Gbps for networking.
  • Always-on memory encryption.

M6i instances are a good fit for running general-purpose workloads such as web and application servers, containerized applications, microservices, and small data stores. The higher memory bandwidth is especially useful for enterprise applications, such as SAP HANA, and high performance computing (HPC) workloads, such as computational fluid dynamics (CFD).

M6i instances are also SAP-certified. For over eight years SAP customers have been relying on the Amazon EC2 M-family of instances for their mission critical SAP workloads. With M6i instances, customers can achieve up to 15% better price/performance for SAP applications than M5 instances.

M6i instances are available in nine sizes (the m6i.metal size is coming soon):

Name vCPUs Memory
(GiB)
Network Bandwidth
(Gbps)
EBS Throughput
(Gbps)
m6i.large 2 8 Up to 12.5 Up to 10
m6i.xlarge 4 16 Up to 12.5 Up to 10
m6i.2xlarge 8 32 Up to 12.5 Up to 10
m6i.4xlarge 16 64 Up to 12.5 Up to 10
m6i.8xlarge 32 128 12.5 10
m6i.12xlarge 48 192 18.75 15
m6i.16xlarge 64 256 25 20
m6i.24xlarge 96 384 37.5 30
m6i.32xlarge 128 512 50 40

The new instances are built on the AWS Nitro System, which is a collection of building blocks that offloads many of the traditional virtualization functions to dedicated hardware, delivering high performance, high availability, and highly secure cloud instances.

For optimal networking performance on these new instances, upgrade your Elastic Network Adapter (ENA) drivers to version 3. For more information, see this article about how to get maximum network performance on sixth-generation EC2 instances.

M6i instances support Elastic Fabric Adapter (EFA) on the m6i.32xlarge size for workloads that benefit from lower network latency, such as HPC and video processing.

Availability and Pricing
EC2 M6i instances are available today in six AWS Regions: US East (N. Virginia), US West (Oregon), US East (Ohio), Europe (Ireland), Europe (Frankfurt), and Asia Pacific (Singapore). As usual with EC2, you pay for what you use. For more information, see the EC2 pricing page.

Danilo

Build a SQL-based ETL pipeline with Apache Spark on Amazon EKS

Post Syndicated from Melody Yang original https://aws.amazon.com/blogs/big-data/build-a-sql-based-etl-pipeline-with-apache-spark-on-amazon-eks/

Today, the most successful and fastest growing companies are generally data-driven organizations. Taking advantage of data is pivotal to answering many pressing business problems; however, this can prove to be overwhelming and difficult to manage due to data’s increasing diversity, scale, and complexity. One of the most popular technologies that businesses use to overcome these challenges and harness the power of their growing data is Apache Spark.

Apache Spark is an open-source, distributed data processing framework capable of performing analytics on large-scale datasets, enabling businesses to derive insights from all of their data whether it is structured, semi-structured, or unstructured in nature.

You can flexibly deploy Spark applications in multiple ways within your AWS environment, including on the Amazon managed Kubernetes offering Amazon Elastic Kubernetes Service (Amazon EKS). With the release of Spark 2.3, Kubernetes became a new resource scheduler (in addition to YARN, Mesos, and Standalone) to provision and manage Spark workloads. Increasingly, it has become the new standard resource manager for new Spark projects, as we can tell from the popularity of the open-source project. With Spark 3.1, the Spark on Kubernetes project is officially production-ready and Generally Available. More data architecture and patterns are available for businesses to accelerate data-driven transitions. However, for organizations accustomed to SQL-based data management systems and tools, adapting to the modern data practice with Apache Spark may slow down the pace of innovation.

In this post, we address this challenge by using the open-source data processing framework Arc, which subscribes to the SQL-first design principle. Arc abstracts from Apache Spark and container technologies, in order to foster simplicity whilst maximizing efficiency. Arc is used as a publicly available example to prove the ETL architecture. It can be replaced by your own choice of in-house build or other data framework that supports the declarative ETL build and deployment pattern.

Why do we need to build a codeless and declarative data
pipeline?

Data platforms often repeatedly perform extract, transform, and load (ETL) jobs to achieve similar outputs and objectives. This can range from simple data operations, such as standardizing a date column, to performing complex change data capture processes (CDC) to track historical changes of a record. Although the outcomes are highly similar, the productivity and cost can vary heavily if not implemented suitably and efficiently.

A codeless data processing design pattern enables data personas to build reusable and performant ETL pipelines, without having to delve into the complexities of writing verbose Spark code. Writing your ETL pipeline in native Spark may not scale very well for organizations not familiar with maintaining code, especially when business requirements change frequently. The SQL-first approach provides a declarative harness towards building idempotent data pipelines that can be easily scaled and embedded within your continuous integration and continuous delivery (CI/CD) process.

The Arc declarative data framework simplifies ETL implementation in Spark and enables a wider audience of users ranging from business analysts to developers, who already have existing skills in SQL. It further accelerates users’ ability to develop efficient ETL pipelines to deliver higher business value.

For this post, we demonstrate how simple it is to use Arc to facilitate CDC to track incremental data changes from a source system.

Why adopt SQL to build Spark workloads?

When writing Spark applications, developers often opt for an imperative or procedural approach that involves explicitly defining the computational steps and the order of implementation.

A declarative approach involves the author defining the desired target state without describing the control flow. This is determined by the underlying engine or framework, which in this post will be determined by the Spark SQL engine.

Let’s explore a common data use case of masking columns in a table and how we can write our transformation code in these two paradigms.

The following code shows the imperative method (PySpark):

# Step 1 – Define a dataframe with a column to be masked
df1 = spark.sql("select phone_number from customer")

# Step 2 – Define a new dataframe with a new column that has masked
df2 = df1.withColumn("phone_number_masked", regexp_replace("phone_number", "[0-9]", "*"))

# Step 3 – Drop the old column that is unmasked
df3 = df2.drop("phone_number")

The following code shows the declarative method (Spark SQL):

SELECT regexp_replace(phone_number, '[0-9]', '*') AS phone_number_masked 
FROM customer

The imperative approach dictates how to construct a representation of the customer table with a masked phone number column; whereas the declarative approach defines just the “what” or the desired target state, leaving the “how” for the underlying engine to manage.

As a result, the declarative approach is much simpler and yields code that is easier to read. Furthermore, in this context, it takes advantage of SQL—a declarative language and more widely adopted and known tool, which enables you to easily build data pipelines and achieve your analytical objectives quicker.

If the underlying ETL technology changes, the SQL script remains the same as long as business rules remain unchanged. However, with an imperative approach processing data, the code will most likely require a rewrite and regression testing, such as when organizations upgrade Python from version 2 to 3.

Why deploy Spark on Amazon EKS?

Amazon EKS is a fully managed offering that enables you to run containerized applications without needing to install or manage your own Kubernetes control plane or worker nodes. When you deploy Apache Spark on Amazon EKS, applications can inherit the underlying advantages of Kubernetes, improving the overall flexibility, availability, scalability, and security:

  • Optimized resource isolation – Amazon EKS supports Kubernetes namespaces, network policies, and pods priority to provide isolation between workloads. In multi-tenant environments, its optimized resource allocation feature enables different personas such as IT engineers, data scientists, and business analysts to focus their attention towards innovation and delivery. They don’t need to worry about resource segregation and security.
  • Simpler Spark cluster management – Spark applications can interact with the Amazon EKS API to automatically configure and provision Spark clusters based on your Spark submit request. Amazon EKS spins up a number of pods or containers accordingly for your data processing needs. If you turn on the Dynamic Resource Allocation feature in your application, the Spark cluster on Amazon EKS dynamically evolves based on workload. This significantly simplifies the Spark cluster management.
  • Scale Spark applications seamlessly and efficiently – Spark on Amazon EKS follows a pod-centric architecture pattern. This means an isolated cluster of pods on Amazon EKS is dedicated to a single Spark ETL job. You can expand or shrink a Spark cluster per job in a matter of seconds in some cases. To better manage spikes, for example when training a machine learning model over a long period of time, Amazon EKS offers the elastic control through the Cluster Autoscaler at node level and the Horizontal pod Autoscaler at the pod level. Additionally, scaling Spark on Amazon EKS with the AWS Fargate launch type offers you a serverless ETL option with the least operational effort.
  • Improved resiliency and cloud support – Kubernetes was introduced in 2018 as a native Spark resource scheduler. As adoption grew, this project became Generally Available with Spark 3.1 (2021), alongside better cloud support. A major update in this exciting release is the Graceful Executor Decommissioning, which makes Apache Spark more robust to Amazon Elastic Compute Cloud (Amazon EC2) Spot Instance interruption. As of this writing, the feature is only available in Kubernetes and Standalone mode.

Spark on Amazon EKS can use all of these features provided by the fully managed Kubernetes service for more optimal resource allocation, simpler deployment, and improved operational excellence.

Solution overview

This post comes with a ready-to-use blueprint, which automatically provisions the necessary infrastructure and spins up two web interfaces in Amazon EKS to support interactive ETL build and orchestration. Additionally, it enforces the best practice in data DevOps and CI/CD deployment.

 The following diagram illustrates the solution architecture.

The architecture has four main components:

  • Orchestration on Amazon EKS – The solution offers a highly pluggable workflow management layer. In this post, we use Argo Workflows to orchestrate ETL jobs in a declarative way. It’s consistent with the standard deployment method in Amazon EKS. Apache Airflow and other tools are also available to use.
  • Data workload on Amazon EKS – This represents a workspace on the same Amazon EKS cluster, for you to build, test, and run ETL jobs interactively. It’s powered by Jupyter Notebooks with a custom kernel called Arc Jupyter. Its Git integration feature reinforces the best practice in CI/CD deployment operation. This means every notebook created on a Jupyter instance must check in to a Git repository for the standard source and version control. The Git repository should be your single source of truth for the ETL workload. When your Jupyter notebook files (job definition) and SQL scripts land to Git, followed by an Amazon Simple Storage Service (Amazon S3) upload, it runs your ETL automatically or based on a time schedule. The entire deployment process is seamless to prevent any unintentional human mistakes.
  • Security – This layer secures Arc, Jupyter Docker images, and other sensitive information. The IAM roles for service accounts feature (IRSA) on Amazon EKS provides token authorization with fine-grained access control to other AWS services. In this solution, Amazon EKS integrates with Amazon Athena, AWS Glue, and S3 buckets securely, so you don’t need to maintain a long-lived AWS credential for your applications. We also use Amazon CloudWatch for collecting ETL application logs and monitoring Amazon EKS with the container insights
  • Data lake – As an output of the solution, the data destination is an S3 bucket. You should be able to query the data directly in Athena, backed up by a Data Catalog in AWS Glue.

Prerequisites

To run the sample solution on a local machine, you should have the following prerequisites:

  • Python 3.6 or later.
  • The AWS Command Line Interface (AWS CLI) version 1. For Windows, use the MSI installer. For Linux, macOS, or Unix, use the bundled installer.
  • The AWS CLI is configured to communicate with services in your deployment account. Otherwise, either set your profile by EXPORT AWS_PROFILE=<your_aws_profile> or run aws configure to set up your AWS account access.

If you don’t want to install anything on your computer, use AWS CloudShell, a browser-based shell that makes it easy to run scripts with the AWS CLI.

Download the project

Clone the sample code either to your computer or your AWS CloudShell console:

git clone https://github.com/aws-samples/sql-based-etl-on-amazon-eks.git 
cd sql-based-etl-on-amazon-eks

Deploy the infrastructure

The deployment process takes approximately 30 minutes to complete.

Launch the AWS CloudFormation template to deploy the solution. Follow the Customization instructions if you want to make a change or deploy to a different Region.

Region

Launch Template

US East (N. Virginia)

Deploy with the default settings (recommended). If you want to use your own username for the Jupyter login, update the parameter jhubuser. If performing ETL on your own data, update the parameter datalakebucket with your S3 bucket. The bucket must be in the same Region as the deployment Region.

Post-deployment

Run the script to install command tools:

cd spark-on-eks
./deployment/post-deployment.sh

Test the job in Jupyter

To test the job in Jupyter, complete the following steps:

  1. Log in with the details from the preceding script output. Or look it up from the Secrets Manager Console.
  2. For Server Options, select the default server size.

By following the best security practice, the notebook session times out if idle for 30 minutes. You may need to refresh your web browser and log in again.

  1. Open a sample job spark-on-eks/source/example/notebook/scd2-job.ipynb from your notebook instance.
  2. Choose the refresh icon to see the file if needed.
  3. Run each block and observe the result. The job outputs a table to support the Slowly Changing Dimension Type 2 (SCD2) business need.

  1. To demonstrate the best practice in Data DevOps, the JupyterHub is configured to synchronize the latest code from this project GitHub repo. In practice, you must save all changes to a source repository in order to schedule your ETL job to run.
  2. Run the query on the Athena console to see if it’s a SCD type 2 table:
SELECT * FROM default.deltalake_contact_jhub WHERE id=12
  1. (Optional) If it’s your first time running an Athena query, configure your result location to:  s3://sparkoneks-appcode<random_string>/

Submit the job via Argo

To submit the job via Argo, complete the following steps:

  1. Check your connection in CloudShell or your local computer.
kubectl get svc & argo version --short
  1. If you don’t have access to Amazon EKS or the Argo CLI isn’t installed, run the post-deployment script again:
  1. Log in to the Argo Website. It refreshes every 10 minutes (which is configurable).
ARGO_URL=$(aws cloudformation describe-stacks --stack-name SparkOnEKS --query "Stacks[0].Outputs[?OutputKey=='ARGOURL'].OutputValue" --output text)
LOGIN=$(argo auth token)
echo -e "\nArgo website:\n$ARGO_URL\n" && echo -e "Login token:\n$LOGIN\n"
  1. Run the script again to get a new login token if you experience a timeout
  1. Choose the Workflows option icon on the sidebar to view job status.

  1. To demonstrate the job dependency feature in Argo Workflows, we break the previous Jupyter notebook into three files, in our case, three ETL jobs.
  2. Submit the same SCD2 data pipeline with three jobs:
# change the CFN stack name if yours is different
app_code_bucket=$(aws cloudformation describe-stacks --stack-name SparkOnEKS --query "Stacks[0].Outputs[?OutputKey=='CODEBUCKET'].OutputValue" --output text)
argo submit source/example/scd2-job-scheduler.yaml -n spark --watch -p codeBucket=$app_code_bucket
  1. Under the spark namespace, check the job progress and application logs on the Argo website.

  1. Query the table in Athena to see if it has the same outcome as the test in Jupyter earlier:
SELECT * FROM default.contact_snapshot WHERE id=12

The following screenshot shows the query results.

Submit a native Spark job

Previously, we ran the AWS CloudFormation-like ETL job defined in a Jupyter notebook powered by Arc. Now, let’s reuse the Arc Docker image that contains the latest Spark distribution, to submit a native PySpark job that processes around 50GB of data. The application code looks like this:

The job submitter is defined by spark-on-k8s-operator in a declarative manner. It follows the same declarative pattern as other applications deployment processes. As shown in the following code, we use the same command syntax kubectl apply.

  1. Submit the Spark job as a usual application on Amazon EKS:
# get the s3 bucket from CFN output
app_code_bucket=$(aws cloudformation describe-stacks --stack-name SparkOnEKS --query "Stacks[0].Outputs[?OutputKey=='CODEBUCKET'].OutputValue" --output text)

kubectl create -n spark configmap special-config --from-literal=codeBucket=$app_code_bucket
kubectl apply -f source/example/native-spark-job-scheduler.yaml

  1. Check the job status:
kubectl get pod -n spark

# the Spark cluster is running across two AZs.
kubectl get node \
--label-columns=eks.amazonaws.com/capacityType,topology.kubernetes.io/zone

# watch progress on SparkUI, if the job was submitted from local computer
kubectl port-forward word-count-driver 4040:4040 -n spark
# go to `localhost:4040` from your web browser
  1. Test fault tolerance and resiliency.

You can perform self-recovery with a simpler retry mechanism. In Spark, we know the driver is a single point of failure. If a Spark driver dies, the entire application won’t survive. It often requires extra effort to set up a job rerun, in order to provide the fault tolerance capability. However, it’s much simpler in Amazon EKS with only a few lines of retry declaration. It works for both batch and streaming Spark applications.

  1. Simulate a Spot interruption scenario by manually deleting the EC2 instance running the driver:
# find the ec2 host name
kubectl describe pod word-count-driver -n spark

# replace the placeholder 
kubectl delete node <ec2_host_name>

# has the driver come back?
kubectl get pod -n spark

  1. Delete the executor exec-1 when it’s running:
kubectl get pod -n spark
exec_name=$(kubectl get pod -n spark | grep "exec-1" | awk '{print $1}')
kubectl delete -n spark pod $exec_name ––force

# has it come back with a different number suffix? 
kubectl get pod -n spark
  1. Stop the job or rerun with different job configuration:
kubectl delete -f source/example/native-spark-job-scheduler.yaml

# modify the scheduler file and rerun
Kubectl apply -f source/example/native-spark-job-scheduler.yaml

Clean up

To avoid incurring future charges, delete the resources generated if you don’t need the solution anymore.

Run the cleanup script with your CloudFormation stack name. The default name is SparkOnEKS:

cd sql-based-etl-on-amazon-eks/spark-on-eks
./deployment/delete_all.sh <OPTIONAL:stack_name>

On the AWS CloudFormation console, manually delete the remaining resources if needed.

Conclusion

To accelerate data innovation, improve time-to-insight and support business agility by advancing engineering productivity, this post introduces a declarative ETL option driven by an SQL-centric architecture. Managing applications declaratively in Kubernetes is a widely adopted best practice. You can use the same approach to build and deploy Spark applications with an open-source data processing framework or in-house built software to achieve the same productivity goal. Abstracting from Apache Spark and container technologies, you can build a modern data solution on AWS managed services simply and efficiently.

You can flexibly deploy Spark applications in multiple ways within your AWS environment. In this post, we demonstrated how to deploy an ETL pipeline on Amazon EKS. Another option is to leverage the optimized Spark runtime available in Amazon EMR. You can deploy the same solution via Amazon EMR on Amazon EKS. Switching to this deployment option is effortless and straightforward, and doesn’t need an application change or regression test.

Additional reading

For more information about running Apache Spark on Amazon EKS, see the following:


About the Authors

Melody Yang is a Senior Analytics Specialist Solution Architect at AWS with expertise in Big Data technologies. She is an experienced analytics leader working with AWS customers to provide best practice guidance and technical advice in order to assist their success in data transformation. Her areas of interests are open-source frameworks and automation, data engineering and DataOps.

 

 

Avnish Jain is a Specialist Solution Architect in Analytics at AWS with experience designing and implementing scalable, modern data platforms on the cloud for large scale enterprises. He is passionate about helping customers build performant and robust data-driven solutions and realise their data & analytics potential.

 

 

Shiva Achari is a Senior Data Lab Architect at AWS. He helps AWS customers to design and build data and analytics prototypes via the AWS Data Lab engagement. He has over 14 years of experience working with enterprise customers and startups primarily in the Data and Big Data Analytics space.

Increase Amazon Elasticsearch Service performance by upgrading to Graviton2

Post Syndicated from Zachariah Elliott original https://aws.amazon.com/blogs/big-data/increase-amazon-elasticsearch-service-performance-by-upgrading-to-graviton2/

Amazon Elasticsearch Service (Amazon ES) supports multiple instance types based on your use case. In 2021, AWS announced general purpose (M6g), compute optimized (C6g), and memory optimized (R6g, R6gd) instance types for Amazon ES version 7.9 or later powered by AWS Graviton2 processors, which delivers a major leap in capabilities and better price/performance improvement over previous generation instances.

Graviton2 instances are built using custom silicon designed by Amazon. These instances are Amazon-designed hardware and software innovations that enable the delivery of efficient, flexible, and secure cloud services with isolated multi-tenancy, private networking, and fast local storage. You can launch Graviton2 instances via the Amazon ES console, the AWS Command Line Interface (AWS CLI), AWS API, AWS CloudFormation, or the AWS Cloud Development Kit (AWS CDK). You can change your existing Amazon ES instance types to Graviton2 using a blue/green deployment process, which minimizes downtime and maintains the original environment in the event of unsuccessful deployments.

In this post, we review prerequisites and considerations to upgrade your existing Amazon ES instances to Graviton2 with minimal downtime.

Why move to Graviton2?

The following are some of the reasons you should move to Graviton2:

  • You can enjoy up to 38% improvement in indexing throughput compared to the corresponding x86-based counterparts
  • The Graviton2 instance family provides up to 50% reduction in indexing latency, and up to 30% improvement in query performance when compared to the current generation (M5, C5, R5)
  • Amazon ES Graviton2 instances provide up to 44% price/performance improvement over previous generation instances
  • Graviton2 instances include support for all recently launched features like encryption at rest and in flight, role-based access control, cross-cluster search, Auto-Tune, Trace Analytics, Kibana Reporting, and UltraWarm

Solution overview

For this post, let’s consider a use case in which we have an Amazon ES cluster running version 7.4 with three data nodes and two primary nodes.

As a general best practice, we recommend testing the process in a non-production environment followed by validation tests to make sure everything is configured and operating as per your expectations before making changes to the production environment. We also recommend creating a snapshot of your cluster before performing upgrades or modifying the instance type to minimize the risk of data loss.

In this post, we walk you through the following steps:

  1. Upgrade the Amazon ES cluster (if needed):
    1. Determine if the current cluster version meets the minimum required version (7.9 or later) for moving to Graviton2.
    2. Upgrade the Amazon ES domain to the required minimum version.
  2. Modify the instance type of your cluster nodes.
  3. Confirm that your applications work correctly with the upgraded cluster.
  4. Roll back to the previous instance types if compatibility issues are discovered.

Upgrade Amazon ES versions

To take advantage of Graviton2-based Amazon ES instances, your cluster must be running Amazon ES version 7.9 and above and service software R20210331 or later (as of this post). For the latest updates of this information, see Supported instance types in Amazon Elasticsearch Service. For upgrade considerations, compatibilities, and instructions, see Upgrading Elasticsearch.

For our use case, our cluster is running version 7.4. We can confirm the version via the AWS CLI or Amazon ES console, as in the following screenshot.

To upgrade your domain, choose Upgrade domain on the Actions menu. You can then choose what version to upgrade to, or verify your cluster can be upgraded. The upgrade process takes some time depending on the size of your cluster.

If you prefer to use the AWS CLI, you can perform the same steps. To get a list of all valid upgrade targets for a current version using the AWS CLI, use the describe-elasticsearch-domain command.

The following describe-elasticsearch-domain example provides configuration details for a given domain:

aws es describe-elasticsearch-domain \
    --domain-name demo

If the cluster version is less than 7.9, use the upgrade-elasticsearch-domain command to upgrade your domain:

aws es upgrade-elasticsearch-domain \
--domain-name demo
--target-version 7.9

You can track the progress of the Amazon ES domain upgrade using API calls to Amazon ES. For more information, see Why is my Amazon Elasticsearch Service domain upgrade taking so long?

Modify instances

At the time of writing, you can’t mix x86 and Graviton2-based Amazon ES instances with the primary and data nodes. As such, both data nodes and primary nodes are modified at the same time. To modify your nodes, complete the following steps:

  1. On the Amazon ES console, go to the domain you want to upgrade.
  2. Choose Edit domain.

  1. In the Data nodes section, for Instance type, change your data nodes to Graviton 2 instance types. In our case, we upgrade from r5.large.elasticsearch to r6g.large.elasticsearch.

  1. In the Dedicated master nodes section, for Instance type, change your dedicated primary nodes to Graviton 2 instance types. In our case, we upgrade from r5.large.elasticsearch to r6g.large.elasticsearch.

  1. Choose Submit.

The cluster goes into a processing state. During this time, you can monitor the Cluster health tab to see your number of nodes increase. In our case, our cluster has two dedicated primary nodes and three data nodes (five total).

During deployment, Amazon ES performs a blue/green deployment. This ensures any errors encountered during modification can be rolled back. You can continue to use the cluster during this time, however there may be a brief service interruption when the cluster switches to the new dedicated primary nodes. During blue/green deployment, you’re charged for both instance types, and then only the new instance type going forward.

After the modification finishes successfully, you can verify both the primary and data nodes are using Graviton2 instances.

Validate and confirm the application works correctly

You can now validate Amazon ES is performing as expected with your application. You can check the Cluster health tab for metrics related to cluster performance and observe if you’re not seeing the expected performance.

Perform rollback

In the rare scenario in which issues are discovered with the Graviton2-based Amazon ES cluster, such as application compatibility or data issues, you can perform the same steps to change the cluster back to the original node type.

Summary

This post shared a step-by-step guide to migrate your Amazon ES cluster to Graviton2-based nodes, as well as some key considerations when modifying your cluster. We also talked about how to upgrade your cluster to the latest version of Amazon ES to take advantage of Graviton 2, as well as other features such as UltraWarm and cold storage. As always, make sure you fully test compatibility with your application and these newer versions of Amazon ES, and per best practices, always perform upgrades in a lower environment before making these changes in a production environment.

Additional resources

For more information, see the following:


About the Authors

Zachariah Elliott works as a Solutions Architect focusing on EdTech at AWS. He is passionate about helping customers build Well-Architected solutions on AWS. He is also part of the IoT Subject Matter Expert community at AWS and loves helping customers develop unique IoT-based solutions.

 

Pranusha Manchala is a Solutions Architect at AWS who works with education companies. She has worked with many EdTech customers and provided them with architectural guidance for building highly scalable and cost-optimized applications on AWS. She found her interests in machine learning and started to dive deep into this technology. She enjoys cooking, baking, and outdoor activities in her free time.

Frictionless hosting of containerized ASP.NET web apps using Amazon Lightsail

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/frictionless-hosting-of-containerized-asp-net-web-apps-using-amazon-lightsail/

This post is written by Fahad Mustafa, Cloud Application Architect, AWS Professional Services

There are many ways to deploy ASP.NET web apps to AWS. Each with its own use cases and differing pricing models. But what if you have a small website and database that you must deploy rapidly, manage, and scale? What if you want a cost-effective simple monthly plan? In these cases, Amazon Lightsail is a great choice. This post shows you how to take a containerized ASP.NET web application that connects to a PostgreSQL database and deploy it to Lightsail. So that you can get your ASP.NET web app up and running.

Product Overview

Amazon Lightsail is an easy way to get started on AWS. It gives you building blocks to deploy an application or website and provision a database at an affordable, monthly price.

Lightsail is perfect for students, small businesses, and startups to get their website or application up and running in the cloud. By providing a secure, highly available, and managed environment Lightsail does all the heavy lifting like setting up IAM roles and policies.

Lightsail can also run containers! By pointing Lightsail to a public image on Amazon ECR or Docker Hub, or uploading an image from your local machine, you can easily run the container, scale it, monitor it and use a custom domain.

Overview of solution

To deploy an ASP.NET app that connects to a PostgreSQL database, you create a Lightsail container service and PostgreSQL database through the AWS Management Console. Create your app and container image. Push the image to Lightsail and finally create the Lightsail deployment to run the container.

solution diagram

Overview of steps

In this post, you create a sample ASP.NET web app through the .NET CLI. Alternatively, you can use Visual Studio to create the app.

This is the sequence of steps I review in this post:

  • Create a PostgreSQL database
  • Create a Lightsail container service
  • Create an ASP.NET web app
  • Create a Dockerfile and build image
  • Upload the image to Lightsail
  • Deploy and run the image

Prerequisites

For this walkthrough, you should have the following prerequisites:

Walkthrough

Create a PostgreSQL database

In this step, you create a PostgreSQL database through the Lightsail console.

Create the database

  1. Sign in to the Lightsail console.
  2. On the Lightsail home page, choose the Database
  3. Choose Create database.
  4. Choose the Database location by changing the AWS Region and Availability Zone.
  5. Choose the database engine. In this example, select PostgreSQL 12.6.
  6. Optional – Specify login credentials. If not changed, AWS generates a default secure password.
  7. Optional – Specify the master database name. If not changed, AWS will use “dbmaster” as the default.
  8. Choose the database plan. Compare the plan’s memory, CPU, storage, and transfer quota to decide which best fits your needs. The smallest database plan is Free Tier eligible.
  9. Identify your database by giving it a unique name.
  10. Choose Create database.

Creating and configuring the database can take a few minutes. Once ready, the status changes to Available. For more information and options on creating a database in Lightsail, see Creating a database in Amazon Lightsail.

available database

Now you are ready to connect to the database and create a table. To connect, see Connecting to your PostgreSQL database in Amazon Lightsail. This sample uses a database named aspnetlightsaildb and a table named Person that you can create by running the following script using PgAdmin. Note that the Owner value is dbmasteruser. This is the default username AWS generates. If you changed the default, then use the username you specified in step 6.

-- Database: aspnetlightsaildb
CREATE DATABASE aspnetlightsaildb
    WITH 
    OWNER = dbmasteruser
    ENCODING = 'UTF8'
    LC_COLLATE = 'en_US.UTF-8'
    LC_CTYPE = 'en_US.UTF-8'
    TABLESPACE = pg_default
    CONNECTION LIMIT = -1;	
-- Table: public.Person
CREATE TABLE IF NOT EXISTS public."Person"
(
    "Id" integer NOT NULL GENERATED ALWAYS AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
    "Name" text COLLATE pg_catalog."default",
    "DateOfBirth" date,
    "Address" text COLLATE pg_catalog."default",
    CONSTRAINT "Person_pkey" PRIMARY KEY ("Id")
)
TABLESPACE pg_default;
ALTER TABLE public."Person"
    OWNER to dbmasteruser;

Now your database and table is created and you can create a container service.

Create a Lightsail container service

In this step, you create a Lightsail container service that is ready to accept your container images.

Create the container service

  1. Sign in to the Lightsail console.
  2. On the Lightsail home page, choose the Containers Tab
  3. Choose Create container service.
  4. In the Create a container service page, choose Change AWS Region, then choose an AWS Region for your container service.
  5. Choose a capacity for your container service. For more information, see Container service capacity (scale and power).
  6. Skip the Set up your first deployment step as you’ll create the deployment after creating the container image on your dev machine.
  7. Enter a name for your container service. Take note of this name, you’ll need it later to deploy the container to Lightsail.
  8. Click Create container service.

After a few minutes, your container service status changes from Pending to Ready. This indicates you can now deploy images. If this is the first time you created a service, it can take 10–15 minutes for the status to become Ready.

container service

Create an ASP.NET web app

Using the .NET CLI, you’ll create a sample ASP.NET web app. In an empty directory run the following command:

dotnet new webapp --name HelloWorldLightsail

The “webapp” segment of the commands specifies the project template to use. In this case, it’s a default ASP.NET web app. The “name” parameter is the name of the ASP.NET project.

To connect to your PostgreSQL Db from ASP.NET, you must install the “Npgsql” Nuget package. In the root directory of the project run the following command in the terminal:

dotnet add package Npgsql.EntityFrameworkCore.PostgreSQL –-version 5.0.6

Once installed, you create a Model class to represent the data and a DbContext class to connect and query the database.

public class Person
    {
        public int Id { get; set; }

        public string Name { get; set; }

        public DateTime DateOfBirth { get; set; }

        public string Address { get; set; }
    }

public class PostgreSqlContext : DbContext
    {
        public PostgreSqlContext(DbContextOptions<PostgreSqlContext> options) : base(options)
        {
        }

        public DbSet<Person> Person { get; set; }
    }

The next step is to add the connection string to appSettings.json. In the root of the settings file add a new ConnectionStrings property as shown below. The following properties are required:

  • lightsail-endpoint: The database endpoint as shown in the Lightsail console.
  • db-name: The name of the database you want to connect to.
  • db-username: The username as shown in the Lightsail console.
  • db-password: The password as shown in the Lightsail console.
"ConnectionStrings": {
    "AspnetLightsailDb": "Server=<lightsail-endpoint>;Port=5432;Database=<db-name>;User Id=<db-username>;Password=<db-password>;"
  }

Next step is to tell ASP.NET where to find the connection string and which DbContext class to use. This is done by configuring the DbContext in Startup.cs. Under the ConfigureServices method add the following line of code:

  services.AddDbContext<PostgreSqlContext>(options => options.UseNpgsql(Configuration.GetConnectionString("AspnetLightsailDb")));

 

Now, you are ready to perform operations against the database. This is done by performing operations against the Person property of the PostgreSqlContext instance.

For example to fetch all records form the Person table:

public IList<Person> Person { get;set; }

        public async Task OnGetAsync()
        {
            Person = await _context.Person.ToListAsync();
        }

You now have an ASP.NET web application that can query the “Person” table against the PostgreSQL database.

Create a Dockerfile and build image

In order to containerize the web app, you must create a Dockerfile. This file provides instructions to Docker on how to build the container image.

To create a Dockerfile and build image

  1. In the root directory of the project, you created (where the .csproj file lives) create an empty file named “Dockerfile”. Note this file does not have an extension.
  2. Open the file with a text editor or IDE and insert the following:
# https://hub.docker.com/_/microsoft-dotnet
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
WORKDIR /source

# copy csproj and restore as distinct layers
COPY *.csproj .
RUN dotnet restore

# copy everything else and build app
COPY . .
RUN dotnet publish -c release -o /app --no-restore

# final stage/image
FROM mcr.microsoft.com/dotnet/aspnet:5.0
WORKDIR /app
COPY --from=build /app ./
ENTRYPOINT ["dotnet", "HelloWorldLightsail.dll"]
  1. To build the image, open a terminal in the same directory as the Dockerfile. Run the following command to build the image.
    docker build -t helloworldlightsail .
    The “-t” parameter is a human readable tag you give the image to make it easy to identify.
  2. After the command completes, you can verify that the image exists by running
    docker images
    You should see the newly created image.newly created image

Upload the image to Lightsail

In this step, you upload the newly built image to the Lightsail container service that you created earlier.

To upload the image to Lightsail

  1. Ensure you have configured the AWS CLI to access AWS.
  2. In a terminal enter the following command:
    aws lightsail push-container-image --region ap-southeast-2 --service-name aspnet-helloworld --label helloworldlightsail --image helloworldlightsail:latest
    The –-region and –-service-name parameters should match the container service you created through the AWS Management Console. The –-label parameter is a descriptive name you give the image when it’s stored in the container service. This will help you track the different versions of the image. The –-image parameter consists of the image name and tag on your local machine that you want to push to Lightsail. Read more about how to push images to Lightsail.
  3. After the command runs successfully browse your container service in the Lightsail console and click the “Images” tab. You should see the uploaded image.

3. After the command runs successfully browse your container service in the Lightsail console and click the “Images” tab

Deploy and run the image

Now that your image is uploaded to the container service it’s time to create a deployment to run the app.

To create a deployment

  1. Go to the Deployments tab in the Lightsail console.
  2. Click on Create your first deployment.
  3. Enter the Container name.
  4. Click Choose stored image and select the image you uploaded in the previous step.
  5. Click on Add open ports to add a port mapping to the container. This allows Lightsail to forward web traffic to your ASP.NET web app. By default ASP.NET web server will listen to port 80.
  6. Under the Public endpoint section, select the container from the drop-down. This specifies which container Lightsail will forward traffic to since a single deployment can have more than one container.
  7. Click Save and deploy

Your configuration should looks like this. Read more about creating container services deployments in Lightsail.

configuration overview

After the deployment is complete, you can navigate to the Public domain of your container service. You will see your ASP.NET web app in action!

public domain

Conclusion

In this post, I demonstrated how easy it is to create a PostgreSQL DB and deploy an ASP.NET web app to Amazon Lightsail. Going from a container on your dev machine to a publicly accessible, scalable, and secure cloud environment within minutes.

You can now add a custom domain to your web app through the Lightsail console. Additionally, you can increase the scale of your container to keep up with demand based on the useful CPU and memory metrics provided in the console.

If you have more advanced needs for your web app, you have the whole robust ecosystem of AWS at your disposal. You can deploy your ASP.NET web app to Amazon Elastic Container Service (Amazon ECS) or even decide to go completely serverless and utilize AWS Lambda and API Gateway.

Visit the Amazon Lightsail homepage to get started with your next idea and read the docs for more details about container services on Amazon Lightsail.

 

Forwarding emails automatically based on content with Amazon Simple Email Service

Post Syndicated from Murat Balkan original https://aws.amazon.com/blogs/messaging-and-targeting/forwarding-emails-automatically-based-on-content-with-amazon-simple-email-service/

Introduction

Email is one of the most popular channels consumers use to interact with support organizations. In its most basic form, consumers will send their email to a catch-all email address where it is further dispatched to the correct support group. Often, this requires a person to inspect content manually. Some IT organizations even have a dedicated support group that handles triaging the incoming emails before assigning them to specialized support teams. Triaging each email can be challenging, and delays in email routing and support processes can reduce customer satisfaction. By utilizing Amazon Simple Email Service’s deep integration with Amazon S3, AWS Lambda, and other AWS services, the task of categorizing and routing emails is automated. This automation results in increased operational efficiencies and reduced costs.

This blog post shows you how a serverless application will receive emails with Amazon SES and deliver them to an Amazon S3 bucket. The application uses Amazon Comprehend to identify the dominant language from the message body.  It then looks it up in an Amazon DynamoDB table to find the support group’s email address specializing in the email subject. As the last step, it forwards the email via Amazon SES to its destination. Archiving incoming emails to Amazon S3 also enables further processing or auditing.

Architecture

By completing the steps in this post, you will create a system that uses the architecture illustrated in the following image:

Architecture showing how to forward emails by content using Amazon SES

The flow of events starts when a customer sends an email to the generic support email address like [email protected]. This email is listened to by Amazon SES via a recipient rule. As per the rule, incoming messages are written to a specified Amazon S3 bucket with a given prefix.

This bucket and prefix are configured with S3 Events to trigger a Lambda function on object creation events. The Lambda function reads the email object, parses the contents, and sends them to Amazon Comprehend for language detection.

Amazon DynamoDB looks up the detected language code from an Amazon DynamoDB table, which includes the mappings between language codes and support group email addresses for these languages. One support group could answer English emails, while another support group answers French emails. The Lambda function determines the destination address and re-sends the same email address by performing an email forward operation. Suppose the lookup does not return any destination address, or the language was not be detected. In that case, the email is forwarded to a catch-all email address specified during the application deployment.

In this example, Amazon SES hosts the destination email addresses used for forwarding, but this is not a requirement. External email servers will also receive the forwarded emails.

Prerequisites

To use Amazon SES for receiving email messages, you need to verify a domain that you own. Refer to the documentation to verify your domain with Amazon SES console. If you do not have a domain name, you will register one from Amazon Route 53.

Deploying the Sample Application

Clone this GitHub repository to your local machine and install and configure AWS SAM with a test AWS Identity and Access Management (IAM) user.

You will use AWS SAM to deploy the remaining parts of this serverless architecture.

The AWS SAM template creates the following resources:

  • An Amazon DynamoDB mapping table (language-lookup) contains information about language codes and associates them with destination email addresses.
  • An AWS Lambda function (BlogEmailForwarder) that reads the email content parses it, detects the language, looks up the forwarding destination email address, and sends it.
  • An Amazon S3 bucket, which will store the incoming emails.
  • IAM roles and policies.

To start the AWS SAM deployment, navigate to the root directory of the repository you downloaded and where the template.yaml AWS SAM template resides. AWS SAM also requires you to specify an Amazon Simple Storage Service (Amazon S3) bucket to hold the deployment artifacts. If you haven’t already created a bucket for this purpose, create one now. You will refer to the documentation to learn how to create an Amazon S3 bucket. The bucket should have read and write access by an AWS Identity and Access Management (IAM) user.

At the command line, enter the following command to package the application:

sam package --template template.yaml --output-template-file output_template.yaml --s3-bucket BUCKET_NAME_HERE

In the preceding command, replace BUCKET_NAME_HERE with the name of the Amazon S3 bucket that should hold the deployment artifacts.

AWS SAM packages the application and copies it into this Amazon S3 bucket.

When the AWS SAM package command finishes running, enter the following command to deploy the package:

sam deploy --template-file output_template.yaml --stack-name blogstack --capabilities CAPABILITY_IAM --parameter-overrides FromEmailAddress=info@ YOUR_DOMAIN_NAME_HERE CatchAllEmailAddress=catchall@ YOUR_DOMAIN_NAME_HERE

In the preceding command, change the YOUR_DOMAIN_NAME_HERE with the domain name you validated with Amazon SES. This domain also applies to other commands and configurations that will be introduced later.

This example uses “blogstack” as the stack name, you will change this to any other name you want. When you run this command, AWS SAM shows the progress of the deployment.

Configure the Sample Application

Now that you have deployed the application, you will configure it.

Configuring Receipt Rules

To deliver incoming messages to Amazon S3 bucket, you need to create a Rule Set and a Receipt rule under it.

Note: This blog uses Amazon SES console to create the rule sets. To create the rule sets with AWS CloudFormation, refer to the documentation.

  1. Navigate to the Amazon SES console. From the left navigation choose Rule Sets.
  2. Choose Create a Receipt Rule button at the right pane.
  3. Add info@YOUR_DOMAIN_NAME_HERE as the first recipient addresses by entering it into the text box and choosing Add Recipient.

 

 

Choose the Next Step button to move on to the next step.

  1. On the Actions page, select S3 from the Add action drop-down to reveal S3 action’s details. Select the S3 bucket that was created by the AWS SAM template. It is in the format of your_stack_name-inboxbucket-randomstring. You will find the exact name in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console. Set the Object key prefix to info/. This tells Amazon SES to add this prefix to all messages destined to this recipient address. This way, you will re-use the same bucket for different recipients.

Choose the Next Step button to move on to the next step.

In the Rule Details page, give this rule a name at the Rule name field. This example uses the name info-recipient-rule. Leave the rest of the fields with their default values.

Choose the Next Step button to move on to the next step.

  1. Review your settings on the Review page and finalize rule creation by choosing Create Rule

  1. In this example, you will be hosting the destination email addresses in Amazon SES rather than forwarding the messages to an external email server. This way, you will be able to see the forwarded messages in your Amazon S3 bucket under different prefixes. To host the destination email addresses, you need to create different rules under the default rule set. Create three additional rules for catchall@YOUR_DOMAIN_NAME_HERE , english@ YOUR_DOMAIN_NAME_HERE and french@YOUR_DOMAIN_NAME_HERE email addresses by repeating the steps 2 to 5. For Amazon S3 prefixes, use catchall/, english/, and french/ respectively.

 

Configuring Amazon DynamoDB Table

To configure the Amazon DynamoDB table that is used by the sample application

  1. Navigate to Amazon DynamoDB console and reach the tables view. Inspect the table created by the AWS SAM application.

language-lookup table is the table where languages and their support group mappings are kept. You need to create an item for each language, and an item that will hold the default destination email address that will be used in case no language match is found. Amazon Comprehend supports more than 60 different languages. You will visit the documentation for the supported languages and add their language codes to this lookup table to enhance this application.

  1. To start inserting items, choose the language-lookup table to open table overview page.
  2. Select the Items tab and choose the Create item From the dropdown, select Text. Add the following JSON content and choose Save to create your first mapping object. While adding the following object, replace Destination attribute’s value with an email address you own. The email messages will be forwarded to that address.

{

  “language”: “en”,

  “destination”: “english@YOUR_DOMAIN_NAME_HERE”

}

Lastly, create an item for French language support.

{

  “language”: “fr”,

  “destination”: “french@YOUR_DOMAIN_NAME_HERE”

}

Testing

Now that the application is deployed and configured, you will test it.

  1. Use your favorite email client to send the following email to the domain name info@ email address.

Subject: I need help

Body:

Hello, I’d like to return the shoes I bought from your online store. How can I do this?

After the email is sent, navigate to the Amazon S3 console to inspect the contents of the Amazon S3 bucket that is backing the Amazon SES Rule Sets. You will also see the AWS Lambda logs from the Amazon CloudWatch console to confirm that the Lambda function is triggered and run successfully. You should receive an email with the same content at the address you defined for the English language.

  1. Next, send another email with the same content, this time in French language.

Subject: j’ai besoin d’aide

Body:

Bonjour, je souhaite retourner les chaussures que j’ai achetées dans votre boutique en ligne. Comment puis-je faire ceci?

 

Suppose a message is not matched to a language in the lookup table. In that case, the Lambda function will forward it to the catchall email address that you provided during the AWS SAM deployment.

You will inspect the new email objects under english/, french/ and catchall/ prefixes to observe the forwarding behavior.

Continue experimenting with the sample application by sending different email contents to info@ YOUR_DOMAIN_NAME_HERE address or adding other language codes and email address combinations into the mapping table. You will find the available languages and their codes in the documentation. When adding a new language support, don’t forget to associate a new email address and Amazon S3 bucket prefix by defining a new rule.

Cleanup

To clean up the resources you used in your account,

  1. Navigate to the Amazon S3 console and delete the inbox bucket’s contents. You will find the name of this bucket in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console.
  2. Navigate to AWS CloudFormation console and delete the stack named “blogstack”.
  3. After the stack is deleted, remove the domain from Amazon SES. To do this, navigate to the Amazon SES Console and choose Domains from the left navigation. Select the domain you want to remove and choose Remove button to remove it from Amazon SES.
  4. From the Amazon SES Console, navigate to the Rule Sets from the left navigation. On the Active Rule Set section, choose View Active Rule Set button and delete all the rules you have created, by selecting the rule and choosing Action, Delete.
  5. On the Rule Sets page choose Disable Active Rule Set button to disable listening for incoming email messages.
  6. On the Rule Sets page, Inactive Rule Sets section, delete the only rule set, by selecting the rule set and choosing Action, Delete.
  7. Navigate to CloudWatch console and from the left navigation choose Logs, Log groups. Find the log group that belongs to the BlogEmailForwarderFunction resource and delete it by selecting it and choosing Actions, Delete log group(s).
  8. You will also delete the Amazon S3 bucket you used for packaging and deploying the AWS SAM application.

 

Conclusion

This solution shows how to use Amazon SES to classify email messages by the dominant content language and forward them to respective support groups. You will use the same techniques to implement similar scenarios. You will forward emails based on custom key entities, like product codes, or you will remove PII information from emails before forwarding with Amazon Comprehend.

With its native integrations with AWS services, Amazon SES allows you to enhance your email applications with different AWS Cloud capabilities easily.

To learn more about email forwarding with Amazon SES, you will visit documentation and AWS blogs.

Create a serverless feedback collector application using Amazon Pinpoint’s two-way SMS functionality

Post Syndicated from Murat Balkan original https://aws.amazon.com/blogs/messaging-and-targeting/create-a-serverless-feedback-collector-application-by-using-amazon-pinpoints-two-way-sms-functionality/

Introduction

Two-way SMS communication is used by many companies to create interactive engagements with their customers. Traditional SMS notifications are one-way. While this is valid for many different use cases like one-time passwords (OTP) notifications and security notifications or reminders, some other use-cases may benefit from collecting information from the same channel. Two-way SMS allows customers to create this feedback mechanism and enhance business interactions and overall customer experience.

SMS is chosen for its simplicity and availability across different sets of devices. By combining the two-way SMS mechanism with the vast breadth of services Amazon Web Services (AWS) offers, companies can create effective architectures to better interact and serve their customers.

This blog post shows you how a serverless online appointment application can use Amazon Pinpoint’s two-way SMS functionality to collect customer feedback for completed appointments. You will learn how Amazon Pinpoint interacts with other AWS serverless services with its out-of-the-box integrations to create a scalable messaging application.

Architecture

By completing the steps in this post, you can create a system that uses the architecture illustrated in the following image:

The architecture of a feedback collector application that is composed of serverless AWS services

The flow of events starts when a Amazon DynamoDB table item, representing an online appointment, changes its status to COMPLETED. An AWS Lambda function which is subscribed to these changes over DynamoDB Streams detects this change and sends an SMS to the customer by using Amazon Pinpoint API’s sendMessages operation.

Amazon Pinpoint delivers the SMS to the recipient and generates a unique message ID to the AWS Lambda function. The Lambda function then adds this message ID to a DynamoDB table called “message-lookup”. This table is used for tracking different feedback requests sent during a multi-step conversation and associate them with the appointment ids. At this stage, the Lambda function also populates another table “feedbacks” which will hold the feedback responses that will be sent as SMS reply messages.

Each time a recipient replies to an SMS, Amazon Pinpoint publishes this reply event to an Amazon SNS topic which is subscribed by an Amazon SQS queue. Amazon Pinpoint will also add a messageId to this event which allows you to bind it to a sendMessages operation call.

A second AWS Lambda function polls these reply events from the Amazon SQS queue. It checks whether the reply is in the correct format (i.e. a number) and also associated with a previous request. If all conditions are met, the AWS Lambda function checks the ConversationStage attribute’s value from its message-lookup table. According to the current stage and the SMS answer received, AWS Lambda function will determine the next step.

For example, if the feedback score received is less than 5, a follow-up SMS is sent to the user asking if they’ll be happy to receive a call from the customer support team.

All SMS replies from the users are reflected to “feedbacks” table for further analysis.

Deploying the Sample Application

  1. Clone this GitHub repository to your local machine and install and configure AWS SAM with a test AWS IAM user.

You will use AWS SAM to deploy the remaining parts of this serverless architecture.

The AWS SAM template creates the following resources:

    • An Amazon DynamoDB table (appointments) that contains information about appointments, customers and their appointment status.
    • An Amazon DynamoDB table (feedbacks) that holds the received feedbacks from customers.
    • An Amazon DynamoDB table (message-lookup) that holds the Amazon Pinpoint message ids and associate them to appointments to track a multi-step conversation.
    • Two AWS Lambda functions (FeedbackSender and FeedbackReceiver)
    • An Amazon SNS topic that collects state change events from Amazon Pinpoint.
    • An Amazon SQS queue that queues the incoming messages.
    • An Amazon Pinpoint Application with an associated SMS channel.

This architecture consists of two Lambda functions, which are represented as two different apps in the AWS SAM template. These functions are named FeedbackSender and FeedbackReceiver. The FeedbackSender function listens the Amazon DynamoDB Stream associated with the appointments table and sends the SMS message requesting a feedback. Second Lambda function, FeedbackReceiver, polls the Amazon SQS queue and updates the feedbacks table in Amazon DynamoDB. (pinpoint-two-way-sms)

          Note: You’ll incur some costs by deploying this stack into your account.

  1. To start the SAM deployment, navigate to the root directory of the repository you downloaded and where the template.yaml AWS SAM template resides. AWS SAM also requires you to specify an Amazon Simple Storage Service (Amazon S3) bucket to hold the deployment artifacts. If you haven’t already created a bucket for this purpose, create one now. The bucket should have read and write access by an AWS Identity and Access Management (IAM) user.

At the command line, enter the following command to package the application:

sam package --template template.yaml --output-template-file output_template.yaml --s3-bucket BUCKET_NAME_HERE

In the preceding command, replace BUCKET_NAME_HERE with the name of the Amazon S3 bucket that should hold the deployment artifacts.

AWS SAM packages the application and copies it into this Amazon S3 bucket.

When the AWS SAM package command finishes running, enter the following command to deploy the package:

sam deploy --template-file output_template.yaml --stack-name BlogStackPinpoint --capabilities CAPABILITY_IAM

When you run this command, AWS SAM shows the progress of the deployment. When the deployment finishes, navigate to the Amazon Pinpoint console and choose the project named “BlogApplication”. This example uses “BlogStackPinpoint” as the stack name, you can change this to any other name you want.

  1. From the left navigation, choose Settings, SMS and voice. On the SMS and voice settings page, choose the Request phone number button under Number settings

Screenshot of request phone number screen

  1. Choose a target country. Set the Default message type as Transactional, and click on the Request long codes button to buy a long code.

Note: In United States, you can also request a Toll Free Number(TFN)

Screenshot showing long code additio

A long code will be added to the Number settings list.

  1. Choose the newly added number to reach the SMS Settings page and enable the option Enable two-way-SMS. At the Incoming messages destination, select Choose an existing SNS topic, and from the drop down select the Amazon SNS topic that was created by the BlogStackPinpoint stack.

Choose Save to save your SMS settings.

 

Testing the Sample Application

Now that the application is deployed and configured, test it by creating sample records in the Amazon DynamoDB table. Navigate to Amazon DynamoDB console and reach the tables view. Inspect the tables that were created by the AWS SAM application.

Here, appointments table is the table where the appointments and their statuses are kept. It tracks the appointment lifecycle events with items identified by unique ids. In this sample scenario, we are assuming that an appointment application creates a record with ‘CREATED’ status when a new appointment is planned. After the appointment is finished, same application updates the status to ‘COMPLETED’ which will trigger the feedback collection process. Feedback results are collected in the feedbacks table. Amazon Pinpoint message id’s, conversation stage and appointment id’s are kept in the message-lookup table.

  1. To start testing the end-to-end flow, choose the appointments table to open table overview page.
  2. Next, select the Items tab and choose the Create item From the dropdown, select Text. Add the following and choose Save to create your first appointment object. While adding the following object, replace CustomerPhone attribute’s value with a phone number you own. The feedback request messages will be delivered to that number. Note: This number should match the country number for the long code you provisioned.

{

"CustomerName": "Customer A",

"CustomerPhone": "+12345678900",

"AppointmentStatus":"CREATED",

"id": "1"

}

  1. To trigger sending the feedback SMS, you need to set an existing item’s status to “COMPLETED” To do this, select the item and click Edit from the Actions menu.

Replace the item’s current JSON with the following.

{

"AppointmentStatus": "COMPLETED",

"CustomerName": "Customer A",

"CustomerPhone": "+12345678900",

"id": "1"

}

  1. Before choosing the Save button, double check that you have set CustomerPhone attribute’s value to a valid phone number.

After the change, you should receive an SMS message asking for a feedback. Provide a numeric reply of that is less than five to this message. This will trigger a follow up question asking for a consent to receive an in-person callback.

 

During your SMS conversation with the application, inspect the feedbacks table. The feedback you have given over this two-way SMS channel should have been reflected into the table.

If you want to repeat the process, make sure to increment the AppointmentId field for any additional appointment records.

Cleanup

To clean up the resources you used in your account, simply navigate to AWS Cloudformation console and delete the stack named “BlogStackPinpoint”.

After the stack is deleted, you also need to delete the Long code from the Pinpoint Console by choosing the number and pressing Remove phone number button. You can also delete the Amazon S3 bucket you used for packaging and deploying the AWS SAM application.

Conclusion

This architecture shows how Amazon Pinpoint can be used to make two-way SMS communication with your customers. You can implement Two-way SMS functionality in other use cases such as appointment reminders, polls, Q&A services, and more.

To learn more about Pinpoint and it’s two-way SMS mechanism, you can visit the Pinpoint documentation.

 

Supporting AWS Graviton2 and x86 instance types in the same Auto Scaling group

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/supporting-aws-graviton2-and-x86-instance-types-in-the-same-auto-scaling-group/

This post is written by Tyler Lynch, Sr. Solutions Architect – EdTech, and Praneeth Tekula, Technical Account Manager.

As customers seek performance improvements and to cost optimize their workloads, they are evaluating and adopting AWS Graviton2 based instances. This post provides instructions on how to configure your Amazon EC2 Auto Scaling group (ASG) to use both Graviton2 and x86 based Amazon EC2 Instances in the same Auto Scaling group with different AMIs. This allows you to introduce Graviton2 based instances as part of a multiple instance type strategy.

For example, a customer may want to use the same Auto Scaling group definition across multiple Regions, but an instance type might not available in that region yet. Implementing instance and architecture diversity allow those Auto Scaling group definitions to be portable.

Solution Overview

The Amazon EC2 Auto Scaling console currently doesn’t support the selection of multiple launch templates, so I use the AWS Command Line Interface (AWS CLI) throughout this post. First, you create your launch templates that specify AMIs for use on x86 and arm64 based instances. Then you create your Auto Scaling group using a mixed instance policy with instance level overrides to specify the launch template to use for that instance.

Finally, you extend the launch templates to use architecture-specific EC2 user data to download architecture-specific binaries. Putting it all together, here are the high-level steps to follow:

  1. Create the launch templates:
    1. Launch template for x86– Creates a launch template for x86 instances, specifying the AMI but not the instance sizes.
    2. Launch template for arm64– Creates a launch template for arm64 instances, specifying the AMI but not the instance sizes.
  2. Create the Auto Scaling group that references the launch templates in a mixed instance policy override.
  3. Create a sample Node.js application.
  4. Create the architecture-specific user data scripts.
  5. Modify the launch templates to use architecture-specific user data scripts.

Prerequisites

The prerequisites for this solution are as follows:

  • The AWS CLI installed locally. I use AWS CLI version 2 for this post.
    • For AWS CLI v2, you must use 2.1.3+
    • For AWS CLI v1, you must use 1.18.182+
  • The correct AWS Identity and Access Management(IAM) role permissions for your account allowing for the creation and execution of the launch templates, Auto Scaling groups, and launching EC2 instances.
  • A source control service such as AWS CodeCommit or GitHub that your user data script can interact with to git clone the Hello World Node.js application.
  • The source code repository initialized and cloned locally.

Create the Launch Templates

You start with creating the launch template for x86 instances, and then the launch template for arm64 instances. These are simple launch templates where you only specify the AMI for Amazon Linux 2 in US-EAST-1 (architecture dependent). You use the AWS CLI cli-input-json feature to make things more readable and repeatable.

You first must add the lt-x86-cli-input.json file to your local working for reference by the AWS CLI.

  1. In your preferred text editor, add a new file, and copy paste the following JSON into the file.

{
    "LaunchTemplateName": "lt-x86",
    "VersionDescription": "LaunchTemplate for x86 instance types using Amazon Linux 2 x86 AMI in US-EAST-1",
    "LaunchTemplateData": {
        "ImageId": "ami-04bf6dcdc9ab498ca"
    }
}
  1. Save the file in your local working directory and name it lt-x86-cli-input.json.

Now, add the lt-arm64-cli-input.json file into your local working directory.

  1. In a text editor, add a new file, and copy paste the following JSON into the file.

{
    "LaunchTemplateName": "lt-arm64",
    "VersionDescription": "LaunchTemplate for Graviton2 instance types using Amazon Linux 2 Arm64 AMI in US-EAST-1",
    "LaunchTemplateData": {
        "ImageId": "ami-09e7aedfda734b173"
    }
}
  1. Save the file in your local working directory and name it lt-arm64-cli-input.json.

Now that your CLI input files are ready, create your launch templates using the CLI.

From your terminal, run the following commands:


aws ec2 create-launch-template \
            --cli-input-json file://./lt-x86-cli-input.json \
            --region us-east-1

aws ec2 create-launch-template \
            --cli-input-json file://./lt-arm64-cli-input.json \
            --region us-east-1

After you run each command, you should see the command output similar to this:


{
	"LaunchTemplate": {
		"LaunchTemplateId": "lt-07ab8c76f8e021b0c",
		"LaunchTemplateName": "lt-x86",
		"CreateTime": "2020-11-20T16:08:08+00:00",
		"CreatedBy": "arn:aws:sts::111111111111:assumed-role/Admin/myusername",
		"DefaultVersionNumber": 1,
		"LatestVersionNumber": 1
	}
}

{
	"LaunchTemplate": {
		"LaunchTemplateId": "lt-0c65656a2c75c0f76",
		"LaunchTemplateName": "lt-arm64",
		"CreateTime": "2020-11-20T16:08:37+00:00",
		"CreatedBy": "arn:aws:sts::111111111111:assumed-role/Admin/myusername",
		"DefaultVersionNumber": 1,
		"LatestVersionNumber": 1
	}
}

Create the Auto Scaling Group

Moving on to creating your Auto Scaling group, start with creating another JSON file to use the cli-input-json feature. Then, create the Auto Scaling group via the CLI.

I want to call special attention to the LaunchTemplateSpecification under the MixedInstancePolicy Overrides property. This Auto Scaling group is being created with a default launch template, the one you created for arm64 based instances. You override that at the instance level for x86 instances.

Now, add the asg-mixed-arch-cli-input.json file into your local working directory.

  1. In a text editor, add a new file, and copy paste the following JSON into the file.
  2. You need to change the subnet IDs specified in the VPCZoneIdentifier to your own subnet IDs.

{
    "AutoScalingGroupName": "asg-mixed-arch",
    "MixedInstancesPolicy": {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "lt-arm64",
                "Version": "$Default"
            },
            "Overrides": [
                {
                    "InstanceType": "t4g.micro"
                },
                {
                    "InstanceType": "t3.micro",
                    "LaunchTemplateSpecification": {
                        "LaunchTemplateName": "lt-x86",
                        "Version": "$Default"
                    }
                },
                {
                    "InstanceType": "t3a.micro",
                    "LaunchTemplateSpecification": {
                        "LaunchTemplateName": "lt-x86",
                        "Version": "$Default"
                    }
                }
            ]
        }
    },    
    "MinSize": 1,
    "MaxSize": 5,
    "DesiredCapacity": 3,
    "VPCZoneIdentifier": "subnet-e92485b6, subnet-07fe637b44fd23c31, subnet-828622e4, subnet-9bd6a2d6"
}
  1. Save the file in your local working directory and name it asg-mixed-arch-cli-input.json.

Now that your CLI input file is ready, create your Auto Scaling group using the CLI.

  1. From your terminal, run the following command:

aws autoscaling create-auto-scaling-group \
            --cli-input-json file://./asg-mixed-arch-cli-input.json \
            --region us-east-1

After you run the command, there isn’t any immediate output. Describe the Auto Scaling group to review the configuration.

  1. From your terminal, run the following command:

aws autoscaling describe-auto-scaling-groups \
            --auto-scaling-group-names asg-mixed-arch \
            --region us-east-1

Let’s evaluate the output. I removed some of the output for brevity. It shows that you have an Auto Scaling group with a mixed instance policy, which specifies a default launch template named lt-arm64. In the Overrides property, you can see the instances types that you specified and the values that define the lt-x86 launch template to be used for specific instance types (t3.micro, t3a.micro).


{
    "AutoScalingGroups": [
        {
            "AutoScalingGroupName": "asg-mixed-arch",
            "AutoScalingGroupARN": "arn:aws:autoscaling:us-east-1:111111111111:autoScalingGroup:a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1:autoScalingGroupName/asg-mixed-arch",
            "MixedInstancesPolicy": {
                "LaunchTemplate": {
                    "LaunchTemplateSpecification": {
                        "LaunchTemplateId": "lt-0cc7dae79a397d663",
                        "LaunchTemplateName": "lt-arm64",
                        "Version": "$Default"
                    },
                    "Overrides": [
                        {
                            "InstanceType": "t4g.micro"
                        },
                        {
                            "InstanceType": "t3.micro",
                            "LaunchTemplateSpecification": {
                                "LaunchTemplateId": "lt-04b525bfbde0dcebb",
                                "LaunchTemplateName": "lt-x86",
                                "Version": "$Default"
                            }
                        },
                        {
                            "InstanceType": "t3a.micro",
                            "LaunchTemplateSpecification": {
                                "LaunchTemplateId": "lt-04b525bfbde0dcebb",
                                "LaunchTemplateName": "lt-x86",
                                "Version": "$Default"
                            }
                        }
                    ]
                },
                ...
            },
            ...
            "Instances": [
                {
                    "InstanceId": "i-00377a23630a5e107",
                    "InstanceType": "t4g.micro",
                    "AvailabilityZone": "us-east-1b",
                    "LifecycleState": "InService",
                    "HealthStatus": "Healthy",
                    "LaunchTemplate": {
                        "LaunchTemplateId": "lt-0cc7dae79a397d663",
                        "LaunchTemplateName": "lt-arm64",
                        "Version": "1"
                    },
                    "ProtectedFromScaleIn": false
                },
                {
                    "InstanceId": "i-07c2d4f875f1f457e",
                    "InstanceType": "t4g.micro",
                    "AvailabilityZone": "us-east-1a",
                    "LifecycleState": "InService",
                    "HealthStatus": "Healthy",
                    "LaunchTemplate": {
                        "LaunchTemplateId": "lt-0cc7dae79a397d663",
                        "LaunchTemplateName": "lt-arm64",
                        "Version": "1"
                    },
                    "ProtectedFromScaleIn": false
                },
                {
                    "InstanceId": "i-09e61e95cdf705ade",
                    "InstanceType": "t4g.micro",
                    "AvailabilityZone": "us-east-1c",
                    "LifecycleState": "InService",
                    "HealthStatus": "Healthy",
                    "LaunchTemplate": {
                        "LaunchTemplateId": "lt-0cc7dae79a397d663",
                        "LaunchTemplateName": "lt-arm64",
                        "Version": "1"
                    },
                    "ProtectedFromScaleIn": false
                }
            ],
            ...
        }
    ]
}

Create Hello World Node.js App

Now that you have created the launch templates and the Auto Scaling group you are ready to create the “hello world” application that self-reports the processor architecture. You work in the local directory that is cloned from your source repository as specified in the prerequisites. This doesn’t have to be the local working directory where you are creating architecture-specific files.

  1. In a text editor, add a new file with the following Node.js code:

// Hello World sample app.
const http = require('http');

const port = 3000;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end(`Hello World. This processor architecture is ${process.arch}`);
});

server.listen(port, () => {
  console.log(`Server running on processor architecture ${process.arch}`);
});
  1. Save the file in the root of your source repository and name it app.js.
  2. Commit the changes to Git and push the changes to your source repository. See the following commands:

git add .
git commit -m "Adding Node.js sample application."
git push

Create user data scripts

Moving on to your creating architecture-specific user data scripts that will define the version of Node.js and the distribution that matches the processor architecture. It will download and extract the binary and add the binary path to the environment PATH. Then it will clone the Hello World app, and then run that app with the binary of Node.js that was installed.

Now, you must add the ud-x86-cli-input.txt file to your local working directory.

  1. In your text editor, add a new file, and copy paste the following text into the file.
  2. Update the git clone command to use the repo URL where you created the Hello World app previously.
  3. Update the cd command to use the repo name.

sudo yum update -y
sudo yum install git -y
VERSION=v14.15.3
DISTRO=linux-x64
wget https://nodejs.org/dist/$VERSION/node-$VERSION-$DISTRO.tar.xz
sudo mkdir -p /usr/local/lib/nodejs
sudo tar -xJvf node-$VERSION-$DISTRO.tar.xz -C /usr/local/lib/nodejs 
export PATH=/usr/local/lib/nodejs/node-$VERSION-$DISTRO/bin:$PATH
git clone https://github.com/<<githubuser>>/<<repo>>.git
cd <<repo>>
node app.js
  1. Save the file in your local working directory and name it ud-x86-cli-input.txt.

Now, add the ud-arm64-cli-input.txt file into your local working directory.

  1. In a text editor, add a new file, and copy paste the following text into the file.
  2. Update the git clone command to use the repo URL where you created the Hello World app previously.
  3. Update the cd command to use the repo name.

sudo yum update -y
sudo yum install git -y
VERSION=v14.15.3
DISTRO=linux-arm64
wget https://nodejs.org/dist/$VERSION/node-$VERSION-$DISTRO.tar.xz
sudo mkdir -p /usr/local/lib/nodejs
sudo tar -xJvf node-$VERSION-$DISTRO.tar.xz -C /usr/local/lib/nodejs 
export PATH=/usr/local/lib/nodejs/node-$VERSION-$DISTRO/bin:$PATH
git clone https://github.com/<<githubuser>>/<<repo>>.git
cd <<repo>>
node app.js
  1. Save the file in your local working directory and name it ud-arm64-cli-input.txt.

Now that your user data scripts are ready, you need to base64 encode them as the AWS CLI does not perform base64-encoding of the user data for you.

  • On a Linux computer, from your terminal use the base64 command to encode the user data scripts.

base64 ud-x86-cli-input.txt > ud-x86-cli-input-base64.txt
base64 ud-arm64-cli-input.txt > ud-arm64-cli-input-base64.txt
  • On a Windows computer, from your command line use the certutil command to encode the user data. Before you can use this file with the AWS CLI, you must remove the first (BEGIN CERTIFICATE) and last (END CERTIFICATE) lines.

certutil -encode ud-x86-cli-input.txt ud-x86-cli-input-base64.txt
certutil -encode ud-arm64-cli-input.txt ud-arm64-cli-input-base64.txt
notepad ud-x86-cli-input-base64.txt
notepad ud-arm64-cli-input-base64.txt

Modify the Launch Templates

Now, you modify the launch templates to use architecture-specific user data scripts.

Please note that the contents of your ud-x86-cli-input-base64.txt and ud-arm64-cli-input-base64.txt files are different from the samples here because you referenced your own GitHub repository. These base64 encoded user data scripts below will not work as is, they contain placeholder references for the git clone and cd commands.

Next, update the lt-x86-cli-input.json file to include your base64 encoded user data script for x86 based instances.

  1. In your preferred text editor, open the ud-x86-cli-input-base64.txt file.
  2. Open the lt-x86-cli-input.json file, and add in the text from the ud-x86-cli-input-base64.txt file into the UserData property of the LaunchTemplateData object. It should look similar to this:

{
    "LaunchTemplateName": "lt-x86",
    "VersionDescription": "LaunchTemplate for x86 instance types using Amazon Linux 2 x86 AMI in US-EAST-1",
    "LaunchTemplateData": {
        "ImageId": "ami-04bf6dcdc9ab498ca",
        "UserData": "IyEvYmluL2Jhc2gKeXVtIHVwZGF0ZSAteQoKVkVSU0lPTj12MTQuMTUuMwpESVNUUk89bGludXgteDY0CndnZXQgaHR0cHM6Ly9ub2RlanMub3JnL2Rpc3QvJFZFUlNJT04vbm9kZS0kVkVSU0lPTi0kRElTVFJPLnRhci54egpzdWRvIG1rZGlyIC1wIC91c3IvbG9jYWwvbGliL25vZGVqcwpzdWRvIHRhciAteEp2ZiBub2RlLSRWRVJTSU9OLSRESVNUUk8udGFyLnh6IC1DIC91c3IvbG9jYWwvbGliL25vZGVqcyAKZXhwb3J0IFBBVEg9L3Vzci9sb2NhbC9saWIvbm9kZWpzL25vZGUtJFZFUlNJT04tJERJU1RSTy9iaW46JFBBVEgKZ2l0IGNsb25lIGh0dHBzOi8vZ2l0aHViLmNvbS88PGdpdGh1YnVzZXI+Pi88PHJlcG8+Pi5naXQKY2QgPDxyZXBvPj4Kbm9kZSBhcHAuanMK"
    }
}
  1. Save the file.

Next, update the lt-arm64-cli-input.json file to include your base64 encoded user data script for arm64 based instances.

  1. In your text editor, open the ud-arm64-cli-input-base64.txt file.
  2. Open the lt-arm64-cli-input.json file, and add in the text from the ud-arm64-cli-input-base64.txt file into the UserData property of the LaunchTemplateData It should look similar to this:

{
    "LaunchTemplateName": "lt-arm64",
    "VersionDescription": "LaunchTemplate for Graviton2 instance types using Amazon Linux 2 Arm64 AMI in US-EAST-1",
    "LaunchTemplateData": {
        "ImageId": "ami-09e7aedfda734b173",
        "UserData": "IyEvYmluL2Jhc2gKeXVtIHVwZGF0ZSAteQoKVkVSU0lPTj12MTQuMTUuMwpESVNUUk89bGludXgtYXJtNjQKd2dldCBodHRwczovL25vZGVqcy5vcmcvZGlzdC8kVkVSU0lPTi9ub2RlLSRWRVJTSU9OLSRESVNUUk8udGFyLnh6CnN1ZG8gbWtkaXIgLXAgL3Vzci9sb2NhbC9saWIvbm9kZWpzCnN1ZG8gdGFyIC14SnZmIG5vZGUtJFZFUlNJT04tJERJU1RSTy50YXIueHogLUMgL3Vzci9sb2NhbC9saWIvbm9kZWpzIApleHBvcnQgUEFUSD0vdXNyL2xvY2FsL2xpYi9ub2RlanMvbm9kZS0kVkVSU0lPTi0kRElTVFJPL2JpbjokUEFUSApnaXQgY2xvbmUgaHR0cHM6Ly9naXRodWIuY29tLzw8Z2l0aHVidXNlcj4+Lzw8cmVwbz4+LmdpdApjZCA8PHJlcG8+Pgpub2RlIGFwcC5qcwoKCg=="
    }
}
  1. Save the file.

Now, your CLI input files are ready. Next, create a new version of your launch templates and then set the newest version as the default.

From your terminal, run the following commands:


aws ec2 create-launch-template-version \
            --cli-input-json file://./lt-x86-cli-input.json \
            --region us-east-1

aws ec2 create-launch-template-version \
            --cli-input-json file://./lt-arm64-cli-input.json \
            --region us-east-1

aws ec2 modify-launch-template \
            --launch-template-name lt-x86 \
            --default-version 2
			
aws ec2 modify-launch-template \
            --launch-template-name lt-arm64 \
            --default-version 2

After you run each command, you should see the command output similar to this:


{
    "LaunchTemplate": {
        "LaunchTemplateId": "lt-08ff3d03d4cf0038d",
        "LaunchTemplateName": "lt-x86",
        "CreateTime": "1970-01-01T00:00:00+00:00",
        "CreatedBy": "arn:aws:sts::111111111111:assumed-role/Admin/myusername",
        "DefaultVersionNumber": 2,
        "LatestVersionNumber": 2
    }
}

{
    "LaunchTemplate": {
        "LaunchTemplateId": "lt-0c5e1eb862a02f8e0",
        "LaunchTemplateName": "lt-arm64",
        "CreateTime": "1970-01-01T00:00:00+00:00",
        "CreatedBy": "arn:aws:sts::111111111111:assumed-role/Admin/myusername",
        "DefaultVersionNumber": 2,
        "LatestVersionNumber": 2
    }
}

Now, refresh the instances in the Auto Scaling group so that the newest version of the launch template is used.

From your terminal, run the following command:


aws autoscaling start-instance-refresh \
            --auto-scaling-group-name asg-mixed-arch

Verify Instances

The sample Node.js application self reports the process architecture in two ways: when the application is started, and when the application receives a HTTP request on port 3000. Retrieve the last five lines of the instance console output via the AWS CLI.

First, you need to get an instance ID from the autoscaling group.

  1. From your terminal, run the following commands:

aws autoscaling describe-auto-scaling-groups \
            --auto-scaling-group-name asg-mixed-arch \
            --region us-east-1
  1. Evaluate the output. I removed some of the output for brevity. You need to use the InstanceID from the output.

{
    "AutoScalingGroups": [
        {
            "AutoScalingGroupName": "asg-mixed-arch",
            "AutoScalingGroupARN": "arn:aws:autoscaling:us-east-1:111111111111:autoScalingGroup:a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1:autoScalingGroupName/asg-mixed-arch",
            "MixedInstancesPolicy": {
                ...
            },
            ...
            "Instances": [
                {
                    "InstanceId": "i-0eeadb140405cc09b",
                    "InstanceType": "t4g.micro",
                    "AvailabilityZone": "us-east-1a",
                    "LifecycleState": "InService",
                    "HealthStatus": "Healthy",
                    "LaunchTemplate": {
                        "LaunchTemplateId": "lt-0c5e1eb862a02f8e0",
                        "LaunchTemplateName": "lt-arm64",
                        "Version": "2"
                    },
                    "ProtectedFromScaleIn": false
                }
            ],
          ....
        }
    ]
}

Now, retrieve the last five lines of console output from the instance.

From your terminal, run the following command:


aws ec2 get-console-output –instance-id d i-0eeadb140405cc09b \
            --output text | tail -n 5

Evaluate the output, you should see Server running on processor architecture arm64. This confirms that you have successfully utilized an architecture-specific user data script.


[  58.798184] cloud-init[1257]: node-v14.15.3-linux-arm64/share/systemtap/tapset/node.stp
[  58.798293] cloud-init[1257]: node-v14.15.3-linux-arm64/LICENSE
[  58.798402] cloud-init[1257]: Cloning into 'node-helloworld'...
[  58.798510] cloud-init[1257]: Server running on processor architecture arm64
2021-01-14T21:14:32+00:00

Cleaning Up

Delete the Auto Scaling group and use the force-delete option. The force-delete option specifies that the group is to be deleted along with all instances associated with the group, without waiting for all instances to be terminated.


aws autoscaling delete-auto-scaling-group \
            --auto-scaling-group-name asg-mixed-arch --force-delete \
            --region us-east-1

Now, delete your launch templates.


aws ec2 delete-launch-template --launch-template-name lt-x86
aws ec2 delete-launch-template --launch-template-name lt-arm64

Conclusion

You walked through creating and using architecture-specific user data scripts that were processor architecture-specific. This same method could be applied to fleets where you have different configurations needed for different instance types. Variability such as disk sizes, networking configurations, placement groups, and tagging can now be accomplished in the same Auto Scaling group.

Better performance for less: AWS continues to beat Azure on SQL Server price/performance

Post Syndicated from Fred Wurden original https://aws.amazon.com/blogs/compute/sql-server-runs-better-on-aws/

By Fred Wurden, General Manager, AWS Enterprise Engineering (Windows, VMware, RedHat, SAP, Benchmarking)

AWS R5b.8xlarge delivers better performance at lower cost than Azure E64_32s_v4 for a SQL Server workload

In this blog, we will review a recent benchmark that Principled Technologies published on 2/25. The benchmark found that an Amazon Elastic Compute Cloud (Amazon EC2) R5b.8xlarge instance delivered better performance for a SQL Server workload at a lower cost when directly tested against an Azure E64_32s_v4 VM.

Behind the study: Understanding how SQL Server performed better, for a lower cost with an AWS EC2 R5b instance

Principled Technologies tested an online transaction processing (OLTP) workload for SQL Server 2019 on both an R5b instance on Amazon EC2 with Amazon Elastic Block Store (EBS) as storage and Azure E64_32s_v4. This particular Azure VM was chosen as an equivalent to the R5b instance, as both instances have comparable specifications for input/output operations per second (IOPS) performance, use Intel Xeon processors from the same generation (Cascade Lake), and offer the same number of cores (32). For storage, Principled Technologies mirrored storage configurations across the Azure VM and the EC2 instance (which used Amazon Elastic Block Store (EBS)), maxing out the IOPS specs on each while offering a direct comparison between instances.

Test Configurations

Source: Principled Technologies

When benchmarking, Principled Technologies ran a TPC-C-like OLTP workload from HammerDB v3.3 on both instances, testing against new orders per minute (NOPM) performance. NOPM shows the number of new-order transactions completed in one minute as part of a serialized business workload. HammerDB claims that because NOPM is “independent of any particular database implementation [it] is the recommended primary metric to use.”

The results: SQL Server on AWS EC2 R5b delivered 2x performance than the Azure VM and 62% less expensive 

Graphs that show AWS instance outperformed the Azure instance

Source: Principled Technologies

These test results from the Principled Technologies report show the price/performance and performance comparisons. The performance metric is New Orders Per Minute (NOPM); faster is better. The price/performance calculations are based on the cost of on-demand, License Included SQL Server instances and storage to achieve 1,000 NOPM performance, smaller is better.

An EC2 r5b.8xlarge instance powered by an Intel Xeon Scalable processor delivered better SQL Server NOPM performance on the HammerDB benchmark and a lower price per 1,000 NOPM than an Azure E64_32s_v4 VM powered by similar Intel Xeon Scalable processors.

On top of that, AWS’s storage price-performance exceeded Azure’s. The Azure managed disks offered 53 percent more storage than the EBS storage, but the EC2 instance with EBS storage cost 24 percent less than the Azure VM with managed disks. Even by reducing Azure storage by the difference in storage, something customers cannot do, EBS would have cost 13 percent less per storage GB than the Azure managed disks.

Why AWS is the best cloud to run your Windows and SQL Server workloads

To us, these results aren’t surprising. In fact, they’re in line with the success that customers find running Windows on AWS for over 12 years. Customers like Pearson and Expedia have all found better performance and enhanced cost savings by moving their Windows, SQL Server, and .NET workloads to AWS. In fact, RepricerExpress migrated its Windows and SQL Server environments from Azure to AWS to slash outbound bandwidth costs while gaining agility and performance.

Not only do we offer better price-performance for your Windows workloads, but we also offer better ways to run Windows in the cloud. Whether you want to rehost your databases to EC2, move to managed with Amazon Relational Database for SQL Server (RDS), or even modernize to cloud-native databases, AWS stands ready to help you get the most out of the cloud.

 


To learn more on migrating Windows Server or SQL Server, visit Windows on AWS. For more stories about customers who have successfully migrated and modernized SQL Server workloads with AWS, visit our Customer Success page. Contact us to start your migration journey today.

Node.js 14.x runtime now available in AWS Lambda

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/node-js-14-x-runtime-now-available-in-aws-lambda/

You can now develop AWS Lambda functions using the Node.js 14.x runtime. This is the current Long Term Support (LTS) version of Node.js. Start using this new version today by specifying a runtime parameter value of nodejs14.x when creating or updating functions or by using the appropriate managed runtime base image.

Language Updates

Node.js 14 is a stable release and brings several new features, including:

  • Updated V8 engine
  • Diagnostic reporting
  • Updated Node streams

V8 engine updated To V8.1

Node.js 14.x is powered by V8 version 8.1, which is a significant upgrade from the V8 7.4 engine powering the previous Node.js 12.x. This upgrade brings performance enhancements and some notable new features:

  • Nullish Coalescing ?? A logical operator that returns its right-hand side operand when its left-hand side operand is not defined or null.
    const newVersion = null ?? ‘this works great’ ;
    console.log(newVersion);
    // expected output: "this works great"
    
    const nullishTest = 0 ?? 36;
    console.log(nullishTest);
    // expected output: 0 because 0 is not the same as null or undefined

This new operator is useful for debugging and error handling in your Lambda functions when values unexpectedly return null or undefined.

  • Intl.DateTimeFormat – This feature enables numberingSystem and calendar options.
    const newVersion = null ?? ‘this works great’ ;
    console.log(newVersion);
    // expected output: "this works great"
    
    const nullishTest = 0 ?? 36;
    console.log(nullishTest);
    // expected output: 0 because 0 is not the same as null or undefined
  • Intl.DisplayNames – Offers the consistent translation of region, language, and script display names.
    const date = new Date(Date.UTC(2021, 01, 20, 3, 23, 16, 738));
    // Results below assume UTC timezone - your results may vary
    
    // Specify date formatting for language
    console.log(new Intl.DateTimeFormat('en-US').format(date));
    // expected output: "2/20/2021"
  • Optional Chaining ?. – Use this operator to access a property’s value within a chain without needing to validate each reference. This removes the requirement of checking for the existence of a deeply nested property using the && operator or lodash.get:
    const player = {
      name: 'Roxie',
      superpower: {
        value: 'flight',
      }
    };
    
    // Using the && operator
    if (player && player.superpower && player.superpower.value) {
      // do something with player.superpower.value
    }
    
    // Using the ?. operator
    if (player?.superpower?.value) {
      // do something with player.superpower.value
    }
    

Diagnostic reporting

Diagnostic reporting is now a stable feature in Node.js 14. This option allows you to generate a JSON-formatted report on demand or when certain events occur. This helps to diagnose problems such as slow performance, memory leaks, unexpected errors, and more.

The following example generates a report from within a Lambda function, and outputs the results to Amazon Cloudwatch for further inspection.

const report = process.report.getReport();
console.log(typeof report === 'object'); // true

// Similar to process.report.writeReport() output
console.log(JSON.stringify(report, null, 2));

See the official docs on diagnostic reporting in Node.js to learn other ways to use the command.

Updated node streams

The streams APIs has been updated to help remove ambiguity and streamline behaviours across the various parts of Node.js core.

Runtime Updates

To help keep Lambda functions secure, AWS updates Node.js 14 with all minor updates released by the Node.js community when using the zip archive format. For Lambda functions packaged as a container image, pull, rebuild and deploy the latest base image from DockerHub or Amazon ECR Public.

Deprecation schedule

AWS will be deprecating Node.js 10 according to the end of life schedule provided by the community. Node.js 10 reaches end of life on April 30, 2021. After March 30, 2021 you can no longer create a Node.js 10 Lambda function. The ability to update a function will be disabled after May 28, 2021 . More information on can be found in the runtime support policy.

You can migrate Existing Node.js 12 functions to the new runtime by making any necessary changes to code for compatibility with Node.js 14, and changing the function’s runtime configuration to “nodejs14.x”. Lambda functions running on Node.js 14 will have 2 full years of support.

Amazon Linux 2

Node.js 14 managed runtime, like Node.js 12, Java 11, and Python 3.8, is based on an Amazon Linux 2 execution environment. Amazon Linux 2 provides a secure, stable, and high-performance execution environment to develop and run cloud and enterprise applications.

Next steps

Get started building with Node.js 14 today by specifying a runtime parameter value of nodejs14.x when creating your Lambda functions using the zip archive packaging format. You can also build Lambda functions in Node.js 14 by deploying your function code as a container image using the Node.js 14 AWS base image for Lambda. You can read about the Node.js programming model in the AWS Lambda documentation to learn more about writing functions in Node.js 14.

For existing Node.js functions, migrate to the new runtime by changing the function’s runtime configuration to nodejs14.x

Happy coding with Node.js 14!

Running cost optimized Spark workloads on Kubernetes using EC2 Spot Instances

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/running-cost-optimized-spark-workloads-on-kubernetes-using-ec2-spot-instances/

This post is written by Kinnar Sen, Senior Solutions Architect, EC2 Spot 

Apache Spark is an open-source, distributed processing system used for big data workloads. It provides API operations to perform multiple tasks such as streaming, extract transform load (ETL), query, machine learning (ML), and graph processing. Spark supports four different types of cluster managers (Spark standalone, Apache Mesos, Hadoop YARN, and Kubernetes), which are responsible for scheduling and allocation of resources in the cluster. Spark can run with native Kubernetes support since 2018 (Spark 2.3). AWS customers that have already chosen Kubernetes as their container orchestration tool can also choose to run Spark applications in Kubernetes, increasing the effectiveness of their operations and compute resources.

In this post, I illustrate the deployment of scalable, resilient, and cost optimized Spark application using Kubernetes via Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon EC2 Spot Instances. Learn how to save money on big data workloads by implementing this solution.

Overview

Amazon EC2 Spot Instances

Amazon EC2 Spot Instances let you take advantage of unused EC2 capacity in the AWS Cloud. Spot Instances are available at up to a 90% discount compared to On-Demand Instance prices. Capacity pools are a group of EC2 instances that belong to particular instance family, size, and Availability Zone (AZ). If EC2 needs capacity back for On-Demand Instance usage, Spot Instances can be interrupted by EC2 with a two-minute notification. There are many graceful ways to handle the interruption to ensure that the application is well architected for resilience and fault tolerance. This can be automated via the application and/or infrastructure deployments. Spot Instances are ideal for stateless, fault tolerant, loosely coupled and flexible workloads that can handle interruptions.

Amazon Elastic Kubernetes Service

Amazon EKS is a fully managed Kubernetes service that makes it easy for you to run Kubernetes on AWS without needing to install, operate, and maintain your own Kubernetes control plane. It provides a highly available and scalable managed control plane. It also provides managed worker nodes, which let you create, update, or terminate shut down worker nodes for your cluster with a single command. It is a great choice for deploying flexible and fault tolerant containerized applications. Amazon EKS supports creating and managing Amazon EC2 Spot Instances using Amazon EKS-managed node groups following Spot best practices. This enables you to take advantage of the steep savings and scale that Spot Instances provide for interruptible workloads running in your Kubernetes cluster. Using EKS-managed node groups with Spot Instances requires less operational effort compared to using self-managed nodes. In addition to launching Spot Instances in managed node groups, it is possible to specify multiple instance types in EKS managed node groups. You can find more in this blog.

Apache Spark and Kubernetes

When a spark application is submitted to the Kubernetes cluster the following happens:

  • A Spark driver is created.
  • The driver and the run within pods.
  • The Spark driver then requests for executors, which are scheduled to run within pods. The executors are managed by the driver.
  • The application is launched and once it completes, the executor pods are cleaned up. The driver pod persists the logs and remains in a completed state until the pod is cleared by garbage collection or manually removed. The driver in a completed stage does not consume any memory or compute resources.

Spark Deployment on Kubernetes Cluster

When a spark application runs on clusters managed by Kubernetes, the native Kubernetes scheduler is used. It is possible to schedule the driver/executor pods on a subset of available nodes. The applications can be launched either by a vanilla ‘spark submit’, a workflow orchestrator like Apache Airflow or the spark operator. I use vanilla ‘spark submit’ in this blog. is also able to schedule Spark applications on EKS clusters as described in this launch blog, but Amazon EMR on EKS is out of scope for this post.

Cost optimization

For any organization running big data workloads there are three key requirements: scalability, performance, and low cost. As the size of data increases, there is demand for more compute capacity and the total cost of ownership increases. It is critical to optimize the cost of big data applications. Big Data frameworks (in this case, Spark) are distributed to manage and process high volumes of data. These frameworks are designed for failure, can run on machines with different configurations, and are inherently resilient and flexible.

If Spark deploys on Kubernetes, the executor pods can be scheduled on EC2 Spot Instances and driver pods on On-Demand Instances. This reduces the overall cost of deployment – Spot Instances can save up to 90% over On-Demand Instance prices. This also enables faster results by scaling out executors running on Spot Instances. Spot Instances, by design, can be interrupted when EC2 needs the capacity back. If a driver pod is running on a Spot Instance, which is interrupted then the application fails and the application must be re-submitted. To avoid this situation, the driver pod can be scheduled on On-Demand Instances only. This adds a layer of resiliency to the Spark application running on Kubernetes. To cost optimize the deployment, all the executor pods are scheduled on Spot Instances as that’s where the bulk of compute happens. Spark’s inherent resiliency has the driver launch new executors to replace the ones that fail due to Spot interruptions.

There are a couple of key points to note here.

  • The idea is to start with minimum number of nodes for both On-Demand and Spot Instances (one each) and then auto-scale usingCluster Autoscaler and EC2 Auto Scaling  Cluster Autoscaler for AWS provides integration with Auto Scaling groups. If there are not sufficient resources, the driver and executor pods go into pending state. The Cluster Autoscaler detects pods in pending state and scales worker nodes within the identified Auto Scaling group in the cluster using EC2 Auto Scaling.
  • The scaling for On-Demand and Spot nodes is exclusive of one another. So, if multiple applications are launched the driver and executor pods can be scheduled in different node groups independently per the resource requirements. This helps reduce job failures due to lack of resources for the driver, thus adding to the overall resiliency of the system.
  • Using EKS Managed node groups
    • This requires significantly less operational effort compared to using self-managed nodegroup and enables:
      • Auto enforcement of Spot best practices like Capacity Optimized allocation strategy, Capacity Rebalancing and use multiple instances types.
      • Proactive replacement of Spot nodes using rebalance notifications.
      • Managed draining of Spot nodes via re-balance recommendations.
    • The nodes are auto-labeled so that the pods can be scheduled with NodeAffinity.
      • eks.amazonaws.com/capacityType: SPOT
      • eks.amazonaws.com/capacityType: ON_DEMAND

Now that you understand the products and best practices of used in this tutorial, let’s get started.

Tutorial: running Spark in EKS managed node groups with Spot Instances

In this tutorial, I review steps, which help you launch cost optimized and resilient Spark jobs inside Kubernetes clusters running on EKS. I launch a word-count application counting the words from an Amazon Customer Review dataset and write the output to an Amazon S3 folder. To run the Spark workload on Kubernetes, make sure you have eksctl and kubectl installed on your computer or on an AWS Cloud9 environment. You can run this by using an AWS IAM user or role that has the AdministratorAccess policy attached to it, or check the minimum required permissions for using eksctl. The spot node groups in the Amazon EKS cluster can be launched both in a managed or a self-managed way, in this post I use the former. The config files for this tutorial can be found here. The job is finally launched in cluster mode.

Create Amazon S3 Access Policy

First, I must create an Amazon S3 access policy to allow the Spark application to read/write from Amazon S3. Amazon S3 Access is provisioned by attaching the policy by ARN to the node groups. This associates Amazon S3 access to the NodeInstanceRole and, hence, the node groups then have access to Amazon S3. Download the Amazon S3 policy file from here and modify the <<output folder>> to an Amazon S3 bucket you created. Run the following to create the policy. Note the ARN.

aws iam create-policy --policy-name spark-s3-policy --policy-document file://spark-s3.json

Cluster and node groups deployment

Create an EKS cluster using the following command:

eksctl create cluster –name= sparkonk8 --node-private-networking  --without-nodegroup --asg-access –region=<<AWS Region>>

The cluster takes approximately 15 minutes to launch.

Create the nodegroup using the nodeGroup config file. Replace the <<Policy ARN>> string using the ARN string from the previous step.

eksctl create nodegroup -f managedNodeGroups.yml

Scheduling driver/executor pods

The driver and executor pods can be assigned to nodes using affinity. PodTemplates can be used to configure the detail, which is not supported by Spark launch configuration by default. This feature is available from Spark 3.0.0, requiredDuringScheduling node affinity is used to schedule the driver and executor jobs. Sample podTemplates have been uploaded here.

Launching a Spark application

Create a service account. The spark driver pod uses the service account to create and watch executor pods using Kubernetes API server.

kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole='edit'  --serviceaccount=default:spark --namespace=default

Download the Cluster Autoscaler and edit it to add the cluster-name. 

curl -LO https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

Install the Cluster AutoScaler using the following command:

kubectl apply -f cluster-autoscaler-autodiscover.yaml

Get the details of Kubernetes master to get the head URL.

kubectl cluster-info 

command output

Use the following instructions to build the docker image.

Download the application file (script.py) from here and upload into the Amazon S3 bucket created.

Download the pod template files from here. Submit the application.

bin/spark-submit \
--master k8s://<<MASTER URL>> \
--deploy-mode cluster \
--name 'Job Name' \
--conf spark.eventLog.dir=s3a:// <<S3 BUCKET>>/logs \
--conf spark.eventLog.enabled=true \
--conf spark.history.fs.inProgressOptimization.enabled=true \
--conf spark.history.fs.update.interval=5s \
--conf spark.kubernetes.container.image=<<ECR Spark Docker Image>> \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.driver.podTemplateFile='../driver_pod_template.yml' \
--conf spark.kubernetes.executor.podTemplateFile='../executor_pod_template.yml' \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.dynamicAllocation.shuffleTracking.enabled=true \
--conf spark.dynamicAllocation.maxExecutors=100 \
--conf spark.dynamicAllocation.executorAllocationRatio=0.33 \
--conf spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=30 \
--conf spark.dynamicAllocation.executorIdleTimeout=60s \
--conf spark.driver.memory=8g \
--conf spark.kubernetes.driver.request.cores=2 \
--conf spark.kubernetes.driver.limit.cores=4 \
--conf spark.executor.memory=8g \
--conf spark.kubernetes.executor.request.cores=2 \
--conf spark.kubernetes.executor.limit.cores=4 \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.connection.ssl.enabled=false \
--conf spark.hadoop.fs.s3a.fast.upload=true \
s3a://<<S3 BUCKET>>/script.py \
s3a://<<S3 BUCKET>>/output 

A couple of key points to note here

  • podTemplateFile is used here, which enables scheduling of the driver pods to On-Demand Instances and executor pods to Spot Instances.
  • Spark provides a mechanism to allocate dynamically resources dynamically based on workloads. In the latest release of Spark (3.0.0), dynamicAllocation can be used with Kubernetes cluster manager. The executors that do not store, active, shuffled files can be removed to free up the resources. DynamicAllocation works well in tandem with Cluster Autoscaler for resource allocation and optimizes resource for jobs. We are using dynamicAllocation here to enable optimized resource sharing.
  • The application file and output are both in Amazon S3.

Output Files in S3

  • Spark Event logs are redirected to Amazon S3. Spark on Kubernetes creates local temporary files for logs and removes them once the application completes. The logs are redirected to Amazon S3 and Spark History Server can be used to analyze the logs. Note, you can create more instrumentation using tools like Prometheus and Grafana to monitor and manage the cluster.

Spark History Server + Dynamic Allocation

Observations

EC2 Spot Interruptions

The following diagram and log screenshot details from Spark History server showcases the behavior of a Spark application in case of an EC2 Spot interruption.

Four Spark applications launched in parallel in a cluster and one of the Spot nodes was interrupted. A couple of executor pods were terminated shut down in three of the four applications, but due to the resilient nature of Spark new executors were launched and the applications finished almost around the same time.
The Spark Driver identified the shut down executors, which handled the shuffle files and relaunched the tasks running on those executors.
Spark jobs

The Spark Driver identified the shut down executors, which handled the shuffle files and relaunched the tasks running on those executors.

Dynamic Allocation

Dynamic Allocation works with the caveat that it is an experimental feature.

dynamic allocation

Cost Optimization

Cost Optimization is achieved in several different ways from this tutorial.

  • Use of 100% Spot Instances for the Spark executors
  • Use of dynamicAllocation along with cluster autoscaler does make optimized use of resources and hence save cost
  • With the deployment of one driver and executor nodes to begin with and then scaling up on demand reduces the waste of a continuously running cluster

Cluster Autoscaling

Cluster Autoscaling is triggered as it is designed when there are pending (Spark executor) pods.

The Cluster Autoscaler logs can be fetched by:

kubectl logs -f deployment/cluster-autoscaler -n kube-system —tail=10  

Cluster Autoscaler Logs 

Cleanup

If you are trying out the tutorial, run the following steps to make sure that you don’t encounter unwanted costs.

Delete the EKS cluster and the nodegroups with the following command:

eksctl delete cluster --name sparkonk8

Delete the Amazon S3 Access Policy with the following command:

aws iam delete-policy --policy-arn <<POLICY ARN>>

Delete the Amazon S3 Output Bucket with the following command:

aws s3 rb --force s3://<<S3_BUCKET>>

Conclusion

In this blog, I demonstrated how you can run Spark workloads on a Kubernetes Cluster using Spot Instances, achieving scalability, resilience, and cost optimization. To cost optimize your Spark based big data workloads, consider running spark application using Kubernetes and EC2 Spot Instances.

 

 

 

How to monitor Windows and Linux servers and get internal performance metrics

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/how-to-monitor-windows-and-linux-servers-and-get-internal-performance-metrics/

This post was written by Dean Suzuki, Senior Solutions Architect.

Customers who run Windows or Linux instances on AWS frequently ask, “How do I know if my disks are almost full?” or “How do I know if my application is using all the available memory and is paging to disk?” This blog helps answer these questions by walking you through how to set up monitoring to capture these internal performance metrics.

Solution overview

If you open the Amazon EC2 console, select a running Amazon EC2 instance, and select the Monitoring tab  you can see Amazon CloudWatch metrics for that instance. Amazon CloudWatch is an AWS monitoring service. The Monitoring tab (shown in the following image) shows the metrics that can be measured external to the instance (for example, CPU utilization, network bytes in/out). However, to understand what percentage of the disk is being used or what percentage of the memory is being used, these metrics require an internal operating system view of the instance. AWS places an extra safeguard on gathering data inside a customer’s instance so this capability is not enabled by default.

EC2 console showing Monitoring tab

To capture the server’s internal performance metrics, a CloudWatch agent must be installed on the instance. For Windows, the CloudWatch agent can capture any of the Windows performance monitor counters. For Linux, the CloudWatch agent can capture system-level metrics. For more details, please see Metrics Collected by the CloudWatch Agent. The agent can also capture logs from the server. The agent then sends this information to Amazon CloudWatch, where rules can be created to alert on certain conditions (for example, low free disk space) and automated responses can be set up (for example, perform backup to clear transaction logs). Also, dashboards can be created to view the health of your Windows servers.

There are four steps to implement internal monitoring:

  1. Install the CloudWatch agent onto your servers. AWS provides a service called AWS Systems Manager Run Command, which enables you to do this agent installation across all your servers.
  2. Run the CloudWatch agent configuration wizard, which captures what you want to monitor. These items could be performance counters and logs on the server. This configuration is then stored in AWS System Manager Parameter Store
  3. Configure CloudWatch agents to use agent configuration stored in Parameter Store using the Run Command.
  4. Validate that the CloudWatch agents are sending their monitoring data to CloudWatch.

The following image shows the flow of these four steps.

Process to install and configure the CloudWatch agent

In this blog, I walk through these steps so that you can follow along. Note that you are responsible for the cost of running the environment outlined in this blog. So, once you are finished with the steps in the blog, I recommend deleting the resources if you no longer need them. For the cost of running these servers, see Amazon EC2 On-Demand Pricing. For CloudWatch pricing, see Amazon CloudWatch pricing.

If you want a video overview of this process, please see this Monitoring Amazon EC2 Windows Instances using Unified CloudWatch Agent video.

Deploy the CloudWatch agent

The first step is to deploy the Amazon CloudWatch agent. There are multiple ways to deploy the CloudWatch agent (see this documentation on Installing the CloudWatch Agent). In this blog, I walk through how to use the AWS Systems Manager Run Command to deploy the agent. AWS Systems Manager uses the Systems Manager agent, which is installed by default on each AWS instance. This AWS Systems Manager agent must be given the appropriate permissions to connect to AWS Systems Manager, and to write the configuration data to the AWS Systems Manager Parameter Store. These access rights are controlled through the use of IAM roles.

Create two IAM roles

IAM roles are identity objects that you attach IAM policies. IAM policies define what access is allowed to AWS services. You can have users, services, or applications assume the IAM roles and get the assigned rights defined in the permissions policies.

To use System Manager, you typically create two IAM roles. The first role has permissions to write the CloudWatch agent configuration information to System Manager Parameter Store. This role is called CloudWatchAgentAdminRole.

The second role only has permissions to read the CloudWatch agent configuration from the System Manager Parameter Store. This role is called CloudWatchAgentServerRole.

For more details on creating these roles, please see the documentation on Create IAM Roles and Users for Use with the CloudWatch Agent.

Attach the IAM roles to the EC2 instances

Once you create the roles, you attach them to your Amazon EC2 instances. By attaching the IAM roles to the EC2 instances, you provide the processes running on the EC2 instance the permissions defined in the IAM role. In this blog, you create two Amazon EC2 instances. Attach the CloudWatchAgentAdminRole to the first instance that is used to create the CloudWatch agent configuration. Attach CloudWatchAgentServerRole to the second instance and any other instances that you want to monitor. For details on how to attach or assign roles to EC2 instances, please see the documentation on How do I assign an existing IAM role to an EC2 instance?.

Install the CloudWatch agent

Now that you have setup the permissions, you can install the CloudWatch agent onto the servers that you want to monitor. For details on installing the CloudWatch agent using Systems Manager, please see the documentation on Download and Configure the CloudWatch Agent.

Create the CloudWatch agent configuration

Now that you installed the CloudWatch agent on your server, run the CloudAgent configuration wizard to create the agent configuration. For instructions on how to run the CloudWatch Agent configuration wizard, please see this documentation on Create the CloudWatch Agent Configuration File with the Wizard. To establish a command shell on the server, you can use AWS Systems Manager Session Manager to establish a session to the server and then run the CloudWatch agent configuration wizard. If you want to monitor both Linux and Windows servers, you must run the CloudWatch agent configuration on a Linux instance and on a Windows instance to create a configuration file per OS type. The configuration is unique to the OS type.

To run the Agent configuration wizard on Linux instances, run the following command:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

To run the Agent configuration wizard on Windows instances, run the following commands:

cd "C:\Program Files\Amazon\AmazonCloudWatchAgent"

amazon-cloudwatch-agent-config-wizard.exe

Note for Linux instances: do not select to collect the collectd metrics in the agent configuration wizard unless you have collectd installed on your Linux servers. Otherwise, you may encounter an error.

Review the Agent configuration

The CloudWatch agent configuration generated from the wizard is stored in Systems Manager Parameter Store. You can review and modify this configuration if you need to capture extra metrics. To review the agent configuration, perform the following steps:

  1. Go to the console for the System Manager service.
  2. Click Parameter store on the left hand navigation.
  3. You should see the parameter that was created by the CloudWatch agent configuration program. For Linux servers, the configuration is stored in: AmazonCloudWatch-linux and for Windows servers, the configuration is stored in:  AmazonCloudWatch-windows.

System Manager Parameter Store: Parameters created by CloudWatch agent configuration wizard

  1. Click on the parameter’s hyperlink (for example, AmazonCloudWatch-linux) to see all the configuration parameters that you specified in the configuration program.

In the following steps, I walk through an example of modifying the Windows configuration parameter (AmazonCloudWatch-windows) to add an additional metric (“Available Mbytes”) to monitor.

  1. Click the AmazonCloudWatch-windows
  2. In the parameter overview, scroll down to the “metrics” section and under “metrics_collected”, you can see the Windows performance monitor counters that will be gathered by the CloudWatch agent. If you want to add an additional perfmon counter, then you can edit and add the counter here.
  3. Press Edit at the top right of the AmazonCloudWatch-windows Parameter Store page.
  4. Scroll down in the Value section and look for “Memory.”
  5. After the “% Committed Bytes In Use”, put a comma “,” and then press Enter to add a blank line. Then, put on that line “Available Mbytes” The following screenshot demonstrates what this configuration should look like.

AmazonCloudWatch-windows parameter contents and how to add a new metric to monitor

  1. Press Save Changes.

To modify the Linux configuration parameter (AmazonCloudWatch-linux), you perform similar steps except you click on the AmazonCloudWatch-linux parameter. Here is additional documentation on creating the CloudWatch agent configuration and modifying the configuration file.

Start the CloudWatch agent and use the configuration

In this step, start the CloudWatch agent and instruct it to use your agent configuration stored in System Manager Parameter Store.

  1. Open another tab in your web browser and go to System Manager console.
  2. Specify Run Command in the left hand navigation of the System Manager console.
  3. Press Run Command
  4. In the search bar,
    • Select Document name prefix
    • Select Equal
    • Specify AmazonCloudWatch (Note the field is case sensitive)
    • Press enter

System Manager Run Command's command document entry field

  1. Select AmazonCloudWatch-ManageAgent. This is the command that configures the CloudWatch agent.
  2. In the command parameters section,
    • For Action, select Configure
    • For Mode, select ec2
    • For Optional Configuration Source, select ssm
    • For optional configuration location, specify the Parameter Store name. For Windows instances, you would specify AmazonCloudWatch-windows for Windows instances or AmazonCloudWatch-linux for Linux instances. Note the field is case sensitive. This tells the command to read the Parameter Store for the parameter specified here.
    • For optional restart, leave yes
  3. For Targets, choose your target servers that you wish to monitor.
  4. Scroll down and press Run. The Run Command may take a couple minutes to complete. Press the refresh button. The Run Command configures the CloudWatch agent by reading the Parameter Store for the configuration and configure the agent using those settings.

For more details on installing the CloudWatch agent using your agent configuration, please see this Installing the CloudWatch Agent on EC2 Instances Using Your Agent Configuration.

Review the data collected by the CloudWatch agents

In this step, I walk through how to review the data collected by the CloudWatch agents.

  1. In the AWS Management console, go to CloudWatch.
  2. Click Metrics on the left-hand navigation.
  3. You should see a custom namespace for CWAgent. Click on the CWAgent Please note that this might take a couple minutes to appear. Refresh the page periodically until it appears.
  4. Then click the ImageId, Instanceid hyperlinks to see the counters under that section.

CloudWatch Metrics: Showing counters under CWAgent

  1. Review the metrics captured by the CloudWatch agent. Notice the metrics that are only observable from inside the instance (for example, LogicalDisk % Free Space). These types of metrics would not be observable without installing the CloudWatch agent on the instance. From these metrics, you could create a CloudWatch Alarm to alert you if they go beyond a certain threshold. You can also add them to a CloudWatch Dashboard to review. To learn more about the metrics collected by the CloudWatch agent, see the documentation Metrics Collected by the CloudWatch Agent.

Conclusion

In this blog, you learned how to deploy and configure the CloudWatch agent to capture the metrics on either Linux or Windows instances. If you are done with this blog, we recommend deleting the System Manager Parameter Store entry, the CloudWatch data and  then the EC2 instances to avoid further charges. If you would like a video tutorial of this process, please see this Monitoring Amazon EC2 Windows Instances using Unified CloudWatch Agent video.

 

 

Building PHP Lambda functions with Docker container images

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/building-php-lambda-functions-with-docker-container-images/

At re:Invent 2020, AWS announced that you can package and deploy AWS Lambda functions as container images. Packaging AWS Lambda functions as container images brings some notable benefits for developers running custom runtimes, such as PHP. This blog post explains those benefits and shows how to use the new container image support for Lambda functions to build serverless PHP applications.

Overview

Many PHP developers are familiar with building applications as containers to create a portable artifact for easier deployment. Packaging applications as containers helps to maintain consistent PHP versions, package versions, and configurations settings across multiple environments.

The new container image support for Lambda allows you to use familiar container tooling to build your applications. It also allows you to transition your applications into a serverless event-driven model. This brings the benefits of having no infrastructure to manage, automated scalability and a pay-per-use billing.

The advantages of an event-driven model for PHP applications are explained across the blog series “The serverless LAMP stack”. It explores the concepts, methods, and reasons for creating serverless applications with PHP. The architectural patterns and service limits in this blog series apply to functions packaged using both container image and zip archive formats, with some key exceptions:

Zip archive Container image
Maximum package size 250 MB 10 GB
Lambda layers Supported Include in image
Lambda Extensions Supported Include in image

Custom runtimes with container images

For custom runtimes such as PHP, Lambda provides base images containing the required Amazon Linux or Amazon Linux 2 operating system. Extend this to include your own runtime by implementing the Lambda Runtime API in a bootstrap file.

Before container image support for Lambda, a custom runtime is packaged using the .zip format. This required the developer to:

  1. Set up an Amazon Linux environment compatible with the Lambda execution environment.
  2. Install compilation dependencies and compile a version of PHP.
  3. Save the compiled PHP binary together with a bootstrap file and package as a .zip.
  4. Publish the .zip as a runtime layer.
  5. Add the runtime layer to a Lambda function.

Any edits to the custom runtime such as new packages, PHP versions, modules, or dependences require the process to be repeated. This process can be time consuming and prone to error.

Creating a custom PHP runtime using the new container image support for Lambda can simplify changing the runtime environment. Dockerfiles allow you to have a fully scripted, faster, and portable build process without setting up an Amazon Linux environment.

This GitHub repository contains a custom PHP runtime for Lambda functions packaged as a container image. The following Dockerfile uses the base image for Amazon Linux provided by AWS. The instructions perform the following:

  • Install system-wide Linux packages (zip, curl, tar).
  • Download and compile PHP.
  • Download and install composer dependency manager and dependencies.
  • Move PHP binaries, bootstrap, and vendor dependencies into a directory that Lambda can read from.
  • Set the container entrypoint.
#Lambda base image Amazon Linux
FROM public.ecr.aws/lambda/provided as builder 
# Set desired PHP Version
ARG php_version="7.3.6"
RUN yum clean all && \
    yum install -y autoconf \
                bison \
                bzip2-devel \
                gcc \
                gcc-c++ \
                git \
                gzip \
                libcurl-devel \
                libxml2-devel \
                make \
                openssl-devel \
                tar \
                unzip \
                zip

# Download the PHP source, compile, and install both PHP and Composer
RUN curl -sL https://github.com/php/php-src/archive/php-${php_version}.tar.gz | tar -xvz && \
    cd php-src-php-${php_version} && \
    ./buildconf --force && \
    ./configure --prefix=/opt/php-7-bin/ --with-openssl --with-curl --with-zlib --without-pear --enable-bcmath --with-bz2 --enable-mbstring --with-mysqli && \
    make -j 5 && \
    make install && \
    /opt/php-7-bin/bin/php -v && \
    curl -sS https://getcomposer.org/installer | /opt/php-7-bin/bin/php -- --install-dir=/opt/php-7-bin/bin/ --filename=composer

# Prepare runtime files
# RUN mkdir -p /lambda-php-runtime/bin && \
    # cp /opt/php-7-bin/bin/php /lambda-php-runtime/bin/php
COPY runtime/bootstrap /lambda-php-runtime/
RUN chmod 0755 /lambda-php-runtime/bootstrap

# Install Guzzle, prepare vendor files
RUN mkdir /lambda-php-vendor && \
    cd /lambda-php-vendor && \
    /opt/php-7-bin/bin/php /opt/php-7-bin/bin/composer require guzzlehttp/guzzle

###### Create runtime image ######
FROM public.ecr.aws/lambda/provided as runtime
# Layer 1: PHP Binaries
COPY --from=builder /opt/php-7-bin /var/lang
# Layer 2: Runtime Interface Client
COPY --from=builder /lambda-php-runtime /var/runtime
# Layer 3: Vendor
COPY --from=builder /lambda-php-vendor/vendor /opt/vendor

COPY src/ /var/task/

CMD [ "index" ]

To deploy this Lambda function, follow the instructions in the GitHub repository.

All runtime-related instructions are saved in the Dockerfile, which makes the custom runtime simpler to manage, update, and test. You can add additional Linux packages by appending to the yum install command. To install alternative PHP versions, change the php_version argument. Import additional PHP modules by adding to the compile command.

View the complete application in the following file tree:

project/
┣ runtime/
┃ ┗ bootstrap
┣ src/
┃ ┗ index.php
┗ Dockerfile

The Lambda function code is stored in the src directory in a file named index.php. This contains the Lambda function handler “index()”.

A bootstrap file is in the ‘runtime’ directory. This uses the Lambda runtime API to communicate with the Lambda execution environment.

The shebang hash sequence at the beginning of the bootstrap script instructs Lambda to run the file with the PHP executable, set by the Dockerfile.

All environment variables used in the bootstrap are set by the Lambda execution environment when running in the AWS Cloud. When running locally, the Lambda Runtime Interface Emulator (RIE) sets these values.

#!/var/lang/bin/php

Testing locally with the Lambda RIE

Using container image support for Lambda makes it easier for PHP developers to test Lambda functions locally. The previous container image example builds from the Lambda base image provided by AWS. This base image contains the Lambda RIE.

This is a proxy for Lambda’s Runtime and Extensions APIs. It acts as a lightweight web server that converts HTTP requests to JSON events and maintains functional parity with the Lambda Runtime API in the AWS Cloud. This allows developers to test functions locally using familiar tools such as cURL and the Docker CLI.

  1. Build the previous custom runtime image using the Docker build command:
    docker build -t phpmyfuntion .
  2. Run the function locally using the Docker run command, bound to port 9000:
    docker run -p 9000:8080 phpmyfuntion:latest
  3. This command starts up a local endpoint at:
    localhost:9000/2015-03-31/functions/function/invocations
  4. Post an event to this endpoint using a curl command. The Lambda function payload is provided by using the -d flag. This is a valid Json object required by the Runtime Interface Emulator:
    curl "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"queryStringParameters": {"name":"Ben"}}'
  5. A 200 status response is returned:

Building web applications with Bref container images

Bref is an open source runtime Lambda layer for PHP. Using the bref-fpm layer, you can build applications with traditional PHP frameworks such as Symfony and Laravel. Bref’s implementation of the FastCGI protocol returns an HTTP response instead of a JSON response. When using the zip archive format to package Lambda functions, Bref’s custom runtime is provided to the function as a Lambda layer. Functions packaged as container images do not support adding Lambda layers to the function configuration. In addition to runtime layers, Bref also provides a number of Docker images. These images use the Lambda runtime API to form a runtime interface client that communicates with the Lambda execution environment.

The following example shows how to compose a Dockerfile that uses the bref php-74-fpm container image:

# Uses PHP 74-fpm.0, as the base image
FROM bref/php-74-fpm
# download composer for dependency management
RUN curl -s https://getcomposer.org/installer | php
# install bref using composer
RUN php composer.phar require bref/bref
# copy the project files into a Location that the Lambda service can read from
COPY . /var/task
#set the function handler entry point
CMD _HANDLER=index.php /opt/bootstrap
  1. The first line sets the base image to use bref/php-74-fpm.
  2. Composer, a dependency manager for PHP is installed.
  3. Composer’s require command is used to add the bref package to the composer.json file.
  4. The project files are then copied into the /var/task directory, where the function code runs from.
  5. The function handler is set along with Bref’s bootstrap file.

The steps to build and deploy this image to the Amazon Elastic Container Registry are the same for any runtime, and explained in this announcement blog post.

Conclusion

The new container image support for Lambda functions allows developers to package Lambda functions of up to 10 GB in size. Using the container image format and a Dockerfile can make it easier to build and update functions with custom runtimes such as PHP.

Developers can include specific language versions, modules, and package dependencies. The Amazon Linux and Amazon Linux 2 base images give developers a starting point to customize the runtime. With the Lambda Runtime Interface Emulator, it’s simpler for developers to test Lambda functions locally. PHP developers can use existing third-party images, such as bref-fpm, to create web applications in a single Lambda function.

Visit serverlessland.com for more information on building serverless PHP applications.

Optimizing Lambda functions packaged as container images

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/optimizing-lambda-functions-packaged-as-container-images/

AWS Lambda launched support for packaging and deploying functions as container images at re:Invent 2020. In this post you learn how to build container images that reduce image size as well as build, deployment, and update time. Lambda container images have unique characteristics to consider for optimization. This means that the techniques you use to optimize container images for Lambda functions are slightly different from those you use for other environments.

To understand how to optimize container images, it helps to understand how container images are packaged, as well as how the Lambda service retrieves, caches, deploys, and retires container images.

Pre-requisites and assumptions

This post assumes you have access to an IAM user or role in an AWS account and a version of the tar utility on your machine. You must also install Docker and the AWS SAM CLI and start Docker.

Lambda container image packaging

Lambda container images are packaged according to the Open Container Initiative (OCI) Image Format specification. The specification defines how programs build and package individual layers into a single container image. To explore an example of the OCI Image Format, open a terminal and perform the following steps:

  1. Create an AWS SAM application.
    sam init –name container-images
  2. Choose 1 to select an AWS quick start template, then choose 2 to select container image as the packaging format, and finally choose 9 to use the amazon-go1.x-base image.
    Image showing the suggested choices for a sam init command
  3. After the AWS SAM CLI generates the application, enter the following commands to change into the new directory and build the Lambda container image
    cd container-images
    sam build
  4. AWS SAM builds your function and packages it as helloworldfunction:go1.x-v1. Export this container image to a tar archive and extract the filesystem into a new directory to explore the image format.
    docker save helloworldfunction:go1.x-v1 &gt; oci-image.tar
    mkdir -p image
    tar xf oci-image.tar -C image

The image directory contains several subdirectories, a container metadata JSON file, a manifest JSON file, and a repositories JSON file. Each subdirectory represents a single layer, and contains a version file, its own metadata JSON file, and a tar archive of the files that make up the layer.

Image of the result of running the tree command in a terminal window

The manifest.json file contains a single JSON object with the name of the container metadata file, a list of repository tags, and a list of included layers. The list of included layers is ordered according to the build order in your Dockerfile. The metadata JSON file in each subfolder also contains a mapping from each layer to its parent layer or final container.

Your function should have layers similar to the following. A separate layer is created any time files are added to the container image. This includes FROM, RUN, ADD, and COPY statements in your Dockerfile and base image Dockerfiles. Note that the specific layer IDs, layer sizes, number, and composition of layers may change over time.

ID Size Description Your function’s Dockerfile step
5fc256be… 641 MB Amazon Linux
c73e7f67… 320 KB Third-party licenses
de5f5100… 12 KB Lambda entrypoint script
2bd3c722… 7.8 MB AWS Lambda RIE
5d9d381b… 10.0 MB AWS Lambda runtime
cb832ffc… 12 KB Bootstrap link
1fcc74e8… 560 KB Lambda runtime library FROM public.ecr.aws/lambda/go:1
acb8dall… 9.6 MB Function code COPY –from=build-image /go/bin/ /var/task/

Runtimes generate a filesystem image by destructively overlaying each image layer over its parent. This means that any changes to one layer require all child layers to be recreated. In the following example, if you change the layer cb832ffc... then the layers 1fcc74e8… and acb8da111… are also considered “dirty” and must be recreated from the new parent image. This results in a new container image with eight layers, the first five the same as the original image, and the last three newly built, each with new IDs and parents.

Representation of a container image with eight layers, one of which is updated requiring two additional child layers to be updated also.

The layered structure of container images informs several decisions you make when optimizing your container images.

Strategies for optimizing container images

There are four main strategies for optimizing your container images. First, wherever possible, use the AWS-provided base images as a starting point for your container images. Second, use multi-stage builds to avoid adding unnecessary layers and files to your final image. Third, order the operations in your Dockerfile from most stable to most frequently changing. Fourth, if your application uses one or more large layers across all of your functions, store all of your functions in a single repository.

Use AWS-provided base images

If you have experience packaging traditional applications for container runtimes, using AWS-provided base images may seem counterintuitive. The AWS-provided base images are typically larger than other minimal container base images. For example, the AWS-provided base image for the Go runtime public.ecr.aws/lambda/go:1 is 670 MB, while alpine:latest, a popular starting point for building minimal container images, is only 5.58 MB. However, using the AWS-provided base images offers three advantages.

First, the AWS-provided base images are cached pro-actively by the Lambda service. This means that the base image is either nearby in another upstream cache or already in the worker instance cache. Despite being much larger, the deployment time may still be shorter when compared to third-party base images, which may not be cached. For additional details on how the Lambda service caches container images, see the re:Invent 2021 talk Deep dive into AWS Lambda security: Function isolation.

Second, the AWS-provided base images are stable. As the base image is at the bottom layer of the container image, any changes require every other layer to be rebuilt and redeployed. Fewer changes to your base image mean fewer rebuilds and redeployments, which can reduce build cost.

Finally, the AWS-provided base images are built on Amazon Linux and Amazon Linux 2. Depending on your chosen runtime, they may already contain a number of utilities and libraries that your functions may need. This means that you do not need to add them in later, saving you from creating additional layers that can cause more build steps leading to increased costs.

Use multi-stage builds

Multi-stage builds allow you to build your code in larger preliminary images, copy only the artifacts you need into your final container image, and discard the preliminary build steps. This means you can run any arbitrarily large number of commands and add or copy files into the intermediate image, but still only create one additional layer in your container image for the artifact. This reduces both the final size and the attack surface of your container image by excluding build-time dependencies from your runtime image.

AWS SAM CLI generates Dockerfiles that use multi-stage builds.

FROM golang:1.14 as build-image
WORKDIR /go/src
COPY go.mod main.go ./
RUN go build -o ../bin

FROM public.ecr.aws/lambda/go:1
COPY --from=build-image /go/bin/ /var/task/

# Command can be overwritten by providing a different command in the template directly.
CMD ["hello-world"]

This Dockerfile defines a two-stage build. First, it pulls the golang:1.14 container image and names it build-image. Naming intermediate stages is optional, but it makes it easier to refer to previous stages when packaging your final container image. Note that the golang:1.14 image is 810 MB, is not likely to be cached by the Lambda service, and contains a number of build tools that you should not include in your production images. The build-image stage then builds your function and saves it in /go/bin.

The second and final stage begins from the public.ecr.aws/lambda/go:1 base image. This image is 670 MB, but because it is an AWS-provided image, it is more likely to be cached on worker instances. The COPY command copies the contents of /go/bin from the build-image stage into /var/task in the container image, and discards the intermediate stage.

Build from stable to frequently changing

Any time a layer in an image changes, all layers that follow must be rebuilt, repackaged, redeployed, and recached by the Lambda service. In practice, this means that you should make your most frequently occurring changes as late in your Dockerfile as possible.

For example, if you have a stable Lambda function that uses a frequently updated machine learning model to make predictions, add your function to the container image before adding the machine learning model. However, if you have a function that changes frequently but relies on a stable Lambda extension, copy the extension into the image first.

If you put the frequently changing component early in your Dockerfile, all the build steps that follow must be re-run every time that component changes. If one of those actions is costly, for example, compiling a large library or running a complex simulation, these repetitions add unnecessary time and cost to your deployment pipeline.

Use a single repository for functions with large layers

When you create an application with multiple Lambda functions, you either store the container images in a single Amazon ECR repository or in multiple repositories, one for each function. If your application uses one or more large layers across all of your functions, store all of your functions in a single repository.

ECR repositories compare each layer of a container image when it is pushed to avoid uploading and storing duplicates. If each function in your application uses the same large layer, such as a custom runtime or machine learning model, that layer is stored exactly once in a shared repository. If you use a separate repository for each function, that layer is duplicated across repositories and must be uploaded separately to each one. This costs you time and network bandwidth.

Conclusion

Packaging your Lambda functions as container images enables you to use familiar tooling and take advantage of larger deployment limits. In this post you learn how to build container images that reduce image size as well as build, deployment, and update time. You learn some of the unique characteristics of Lambda container images that impact optimization. Finally, you learn how to think differently about image optimization for Lambda functions when compared to packaging traditional applications for container runtimes.

For more information on how to build serverless applications, including source code, blogs, videos, and more, visit the Serverless Land website.

Creating a cross-region Active Directory domain with AWS Launch Wizard for Microsoft Active Directory

Post Syndicated from AWS Admin original https://aws.amazon.com/blogs/compute/creating-a-cross-region-active-directory-domain-with-aws-launch-wizard-for-microsoft-active-directory/

AWS Launch Wizard is a console-based service to quickly and easily size, configure, and deploy third party applications, such as Microsoft SQL Server Always On and HANA based SAP systems, on AWS without the need to identify and provision individual AWS resources. AWS Launch Wizard offers an easy way to deploy enterprise applications and optimize costs. Instead of selecting and configuring separate infrastructure services, you go through a few steps in the AWS Launch Wizard and it deploys a ready-to-use application on your behalf. It reduces the time you need to spend on investigating how to provision, cost and configure your application on AWS.

You can now use AWS Launch Wizard to deploy and configure self-managed Microsoft Windows Server Active Directory Domain Services running on Amazon Elastic Compute Cloud (EC2) instances. With Launch Wizard, you can have fully-functioning, production-ready domain controllers within a few hours—all without having to manually deploy and configure your resources.

You can use AWS Directory Service to run Microsoft Active Directory (AD) as a managed service, without the hassle of managing your own infrastructure. If you need to run your own AD infrastructure, you can use AWS Launch Wizard to simplify the deployment and configuration process.

In this post, I walk through creation of a cross-region Active Directory domain using Launch Wizard. First, I deploy a single Active Directory domain spanning two regions. Then, I configure Active Directory Sites and Services to match the network topology. Finally, I create a user account to verify replication of the Active Directory domain.

Diagram of Resources deployed in this post

Figure 1: Diagram of resources deployed in this post

Prerequisites

  1. You must have a VPC in your home. Additionally, you must have remote regions that have CIDRs that do not overlap with each other. If you need to create VPCs and subnets that do not overlap, please refer here.
  2. Each subnet used must have outbound internet connectivity. Feel free to either use a NAT Gateway or Internet Gateway.
  3. The VPCs must be peered in order to complete the steps in this post. For information on creating a VPC Peering connection between regions, please refer here.
  4. If you choose to deploy your Domain Controllers to a private subnet, you must have an RDP jump / bastion instance setup to allow you to RDP to your instance.

Deploy Your Domain Controllers in the Home Region using Launch Wizard

In this section, I deploy the first set of domain controllers into the us-east-1 the home region using Launch Wizard. I refer to US-East-1 as the home region, and US-West-2 as the remote region.

  1. In the AWS Launch Wizard Console, select Active Directory in the navigation pane on the left.
  2. Select Create deployment.
  3. In the Review Permissions page, select Next.
  4. In the Configure application settings page set the following:
    • General:
      • Deployment name: UsEast1AD
    • Active Directory (AD) installation
      • Installation type: Active Directory on EC2
    • Domain Settings:
      • Number of domain controllers: 2
      • AMI installation type: License-included AMI
    • License-included AMI: ami-################# | Windows_Server-2019-English-Full-Base-202#-##-##
    • Connection type: Create new Active Directory
    • Domain DNS name: corp.example.com
    • Domain NetBIOS Name: CORP
    • Connectivity:
      • Key Pair Name: Choose and exiting Key pair or select and existing one.
      • Virtual Private Cloud (VPC): Select Virtual Private Cloud (VPC)
    • VPC: Select your home region VPC
    • Availability Zone (AZ) and private subnets:
      • Select 2 Availability Zones
      • Choose the proper subnet in each subnet
      • Assign a Controller IP address for each domain controller
    • Remote Desktop Gateway preferences: Disregard for now, this is set up later.
    • Check the I confirm that a public subnet has been set up. Each of the selected private subnets have outbound connectivity enabled check box.
  1. Select Next.
  2. In the Define infrastructure requirements page, set the following inputs.
    • Storage and compute: Based on infrastructure requirements
    • Number of AD users: Up to 5000 users
  3. Select Next.
  4. In the Review and deploy page, review your selections. Then, select Deploy.

Note that it may take up to 2 hours for your domain to be deployed. Once the status has changed to Completed, you can proceed to the next section. In the next section, I prepare Active Directory Sites and Services for the second set of domain controller in my other region.

Configure Active Directory Sites and Services

In this section, I configure the Active Directory Sites and Services topology to match my network topology. This step ensures proper Active Directory replication routing so that domain clients can find the closest domain controller. For more information on Active Directory Sites and Services, please refer here.

Retrieve your Administrator Credentials from Secrets Manager

  1. From the AWS Secrets Manager Console in us-east-1, select the Secret that begins with LaunchWizard-UsEast1AD.
  2. In the middle of the Secret page, select Retrieve secret value.
    1. This will display the username and password key with their values.
    2. You need these credentials when you RDP into one of the domain controllers in the next steps.

Rename the Default First Site

  1. Log in to the one of the domain controllers in us-east-1.
  2. Select Start, type dssite and hit Enter on your keyboard.
  3. The Active Directory Sites and Services MMC should appear.
    1. Expand Sites. There is a site named Default-First-Site-Name.
    2. Right click on Default-First-Site-Name select Rename.
    3. Enter us-east-1 as the name.
  4. Leave the Active Directory Sites and Services MMC open for the next set of steps.

Create a New Site and Subnet Definition for US-West-2

  1. Using the Active Directory Sites and Services MMC from the previous steps, right click on Sites.
  2. Select New Site… and enter the following inputs:
    • Name: us-west-2
    • Select DEFAULTIPSITELINK.
  3.  Select OK.
  4. A pop up will appear telling you there will need to be some additional configuration. Select OK.
  5. Expand Sites and right click on Subnets and select New Subnet.
  6. Enter the following information:
    • Prefix: the CIDR of your us-west-2 VPC. An example would be 1.0.0/24
    • Site: select us-west-2
  7. Select OK.
  8. Leave the Active Directory Sites and Services MMC open for the following set of steps.

Configure Site Replication Settings

Using the Active Directory Sites and Services MMC from the previous steps, expand Sites, Inter-Site Transports, and select IP. You should see an object named DEFAULTIPSITELINK,

  1. Right click on DEFAULTIPSITELINK.
  2. Select Properties. Set or verify the following inputs on the General tab:
  3. Select Apply.
  4. In the DEFAULTIPSITELINK Properties, select the Attribute Editor tab and modify the following:
    • Scroll down and double click on Enter 1 for the Value, then select OK twice.
      • For more information on these settings, please refer here.
  5. Close the Active Directory Sites and Services MMC, as it is no longer needed.

Prepare Your Home Region Domain Controllers Security Group

In this section, I modify the Domain Controllers Security Group in us-east-1. This allows the domain controllers deployed in us-west-2 to communicate with each other.

  1. From the Amazon Elastic Compute Cloud (Amazon EC2) console, select Security Groups under the Network & Security navigation section.
  2. Select the Domain Controllers Security Group that was created with Launch Wizard Active Directory.
  3. Select Edit inbound rules. The Security Group should start with LaunchWizard-UsEast1AD-.
  4. Choose Add rule and enter the following:
    • Type: Select All traffic
    • Protocol: All
    • Port range: All
    • Source: Select Custom
    • Enter the CIDR of your remote VPC. An example would be 1.0.0/24
  5. Select Save rules.

Create a Copy of Your Administrator Secret in Your Remote Region

In this section, I create a Secret in Secrets Manager that contains the Administrator credentials when I created a home region.

  1. Find the Secret that being with LaunchWizard-UsEast1AD from the AWS Secrets Manager Console in us-east-1.
  2. In the middle of the Secret page, select Retrieve secret value.
    • This displays the username and password key with their values. Make note of these keys and values, as we need them for the next steps.
  3. From the AWS Secrets Manager Console, change the region to us-west-2.
  4. Select Store a new secret. Then, enter the following inputs:
    • Select secret type: Other type of secrets
    • Add your first keypair
    • Select Add row to add the second keypair
  5. Select Next, then enter the following inputs.
    • Secret name: UsWest2AD
    • Select Next twice
    • Select Store

Deploy Your Domain Controllers in the Remote Region using Launch Wizard

In this section, I deploy the second set of domain controllers into the us-west-1 region using Launch Wizard.

  1. In the AWS Launch Wizard Console, select Active Directory in the navigation pane on the left.
  2. Select Create deployment.
  3. In the Review Permissions page, select Next.
  4. In the Configure application settings page, set the following inputs.
    • General
      • Deployment name: UsWest2AD
    • Active Directory (AD) installation
      • Installation type: Active Directory on EC2
    • Domain Settings:
      • Number of domain controllers: 2
      • AMI installation type: License-included AMI
      • License-included AMI: ami-################# | Windows_Server-2019-English-Full-Base-202#-##-##
    • Connection type: Add domain controllers to existing Active Directory
    • Domain DNS name: corp.example.com
    • Domain NetBIOS Name: CORP
    • Domain Administrator secret name: Select you secret you created above.
    • Add permission to secret
      • After you verified the Secret you created above has the policy listed. Check the checkbox confirming the secret has the required policy.
    • Domain DNS IP address for resolution: The private IP of either domain controller in your home region
    • Connectivity:
      • Key Pair Name: Choose an existing Key pair
      • Virtual Private Cloud (VPC): Select Virtual Private Cloud (VPC)
    • VPC: Select your home region VPC
    • Availability Zone (AZ) and private subnets:
      • Select 2 Availability Zones
      • Choose the proper subnet in each subnet
      • Assign a Controller IP address for each domain controller
    • Remote Desktop Gateway preferences: disregard for now, as I set this later.
    • Check the I confirm that a public subnet has been set up. Each of the selected private subnets have outbound connectivity enabled check box
  1. In the Define infrastructure requirements page set the following:
    • Storage and compute: Based on infrastructure requirements
    • Number of AD users: Up to 5000 users
  2. In the Review and deploy page, review your selections. Then, select Deploy.

Note that it may take up to 2 hours to deploy domain controllers. Once the status has changed to Completed, proceed to the next section. In this next section, I prepare Active Directory Sites and Services for the second set of domain controller in another region.

Prepare Your Remote Region Domain Controllers Security Group

In this section, I modify the Domain Controllers Security Group in us-west-2. This allows the domain controllers deployed in us-west-2 to communicate with each other.

  1. From the Amazon Elastic Compute Cloud (Amazon EC2) console, select Security Groups under the Network & Security navigation section.
  2. Select the Domain Controllers Security Group that was created by your Launch Wizard Active Directory.
  3. Select Edit inbound rules. The Security Group should start with LaunchWizard-UsWest2AD-EC2ADStackExistingVPC-
  4. Choose Add rule and enter the following:
    • Type: Select All traffic
    • Protocol: All
    • Port range: All
    • Source: Select Custom
    • Enter the CIDR of your remote VPC. An example would be 0.0.0/24
  5. Choose Save rules.

Create an AD User and Verify Replication

In this section, I create a user in one region and verify that it replicated to the other region. I also use AD replication diagnostics tools to verify that replication is working properly.

Create a Test User Account

  1. Log in to one of the domain controllers in us-east-1.
  2. Select Start, type dsa and press Enter on your keyboard. The Active Directory Users and Computers MMC should appear.
  3. Right click on the Users container and select New > User.
  4. Enter the following inputs:
    • First name: John
    • Last name: Doe
    • User logon name: jdoe and select Next
    • Password and Confirm password: Your choice of complex password
    • Uncheck User must change password at next logon
  5. Select Next.
  6. Select Finish.

Verify Test User Account Has Replicated

  1. Log in to the one of the domain controllers in us-west-2.
  2. Select Start and type dsa.
  3. Then, press Enter on your keyboard. The Active Directory Users and Computers MMC should appear.
  4. Select Users. You should see a user object named John Doe.

Note that if the user is not present, it may not have been replicated yet. Replication should not take longer than 60 seconds from when the item was created.

Summary

Congratulations, you have created a cross-region Active Directory! In this post you:

  1. Launched a new Active Directory forest in us-east-1 using AWS Launch Wizard.
  2. Configured Active Directory Sites and Service for a multi-region configuration.
  3. Launched a set of new domain controllers in the us-west-2 region using AWS Launch Wizard.
  4. Created a test user and verified replication.

This post only touches on a couple of features that are available in the AWS Launch Wizard Active Directory deployment. AWS Launch Wizard also automates the creation of a Single Tier PKI infrastructure or trust creation. One of the prime benefits of this solution is the simplicity in deploying a fully functional Active Directory environment in just a few clicks. You no longer need to do the undifferentiated heavy lifting required to deploy Active Directory.  For more information, please refer to AWS Launch Wizard documentation.

New – VPC Reachability Analyzer

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/new-vpc-insights-analyzes-reachability-and-visibility-in-vpcs/

With Amazon Virtual Private Cloud (VPC), you can launch a logically isolated customer-specific virtual network on the AWS Cloud. As customers expand their footprint on the cloud and deploy increasingly complex network architectures, it can take longer to resolve network connectivity issues caused by misconfiguration. Today, we are happy to announce VPC Reachability Analyzer, a network diagnostics tool that troubleshoots reachability between two endpoints in a VPC, or within multiple VPCs.

Ensuring Your Network Configuration is as Intended
You have full control over your virtual network environment, including choosing your own IP address range, creating subnets, and configuring route tables and network gateways. You can also easily customize the network configuration of your VPC. For example, you can create a public subnet for a web server that has access to the Internet with Internet Gateway. Security-sensitive backend systems such as databases and application servers can be placed on private subnets that do not have internet access. You can use multiple layers of security, such as security groups and network access control list (ACL), to control access to entities of each subnet by protocol, IP address, and port number.

You can also combine multiple VPCs via VPC peering or AWS Transit Gateway for region-wide, or global network connections that can route traffic privately. You can also use VPN Gateway to connect your site with your AWS account for secure communication. Many AWS services that reside outside the VPC, such as AWS Lambda, or Amazon S3, support VPC endpoints or AWS PrivateLink as entities inside the VPC and can communicate with those privately.

When you have such rich controls and feature set, it is not unusual to have unintended configuration that could lead to connectivity issues. Today, you can use VPC Reachability Analyzer for analyzing reachability between two endpoints without sending any packets. VPC Reachability analyzer looks at the configuration of all the resources in your VPCs and uses automated reasoning to determine what network flows are feasible. It analyzes all possible paths through your network without having to send any traffic on the wire. To learn more about how these algorithms work checkout this re:Invent talk or read this paper.

How VPC Reachability Analyzer Works
Let’s see how it works. Using VPC Reachability Analyzer is very easy, and you can test it with your current VPC. If you need an isolated VPC for test purposes, you can run the AWS CloudFormation YAML template at the bottom of this article. The template creates a VPC with 1 subnet, 2 security groups and 3 instances as A, B, and C. Instance A and B can communicate with each other, but those instances cannot communicate with instance C because the security group attached to instance C does not allow any incoming traffic.

You see Reachability Analyzer in the left navigation of the VPC Management Console.

Click Reachability Analyzer, and also click Create and analyze path button, then you see new windows where you can specify a path between a source and destination, and start analysis.

You can specify any of the following endpoint types: VPN Gateways, Instances, Network Interfaces, Internet Gateways, VPC Endpoints, VPC Peering Connections, and Transit Gateways for your source and destination of communication. For example, we set instance A for source and the instance B for destination. You can choose to check for connectivity via either the TCP or UDP protocols. Optionally, you can also specify a port number, or source, or destination IP address.

Configuring test path

Finally, click the Create and analyze path button to start the analysis. The analysis can take up to several minutes depending on the size and complexity of your VPCs, but it typically takes a few seconds.

You can now see the analysis result as Reachable. If you click the URL link of analysis id nip-xxxxxxxxxxxxxxxxx, you can see the route hop by hop.

The communication from instance A to instance C is not reachable because the security group attached to instance C does not allow any incoming traffic.

If you click nip-xxxxxxxxxxxxxxxxx for more detail, you can check the Explanations for details.

Result Detail

Here we see the security group that blocked communication. When you click on the security group listed in the upper right corner, you can go directly to the security group editing window to change the security group rules. In this case adding a properly scoped ingress rule will allow the instances to communicate.

Available Today
This feature is available for all AWS commercial Regions except for China (Beijing), and China (Ningxia) regions. More information is available in our technical documentation, and remember that to use this feature your IAM permissions need to be set up as documented here.

– Kame

CloudFormation YAML template for test

---
Description: An AWS VPC configuration with 1 subnet, 2 security groups and 3 instances. When testing ReachabilityAnalyzer, this provides both a path found and path not found scenario.
AWSTemplateFormatVersion: 2010-09-09

Mappings:
  RegionMap:
    us-east-1:
      execution: ami-0915e09cc7ceee3ab
      ecs: ami-08087103f9850bddd

Resources:
  # VPC
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 172.0.0.0/16
      EnableDnsSupport: true
      EnableDnsHostnames: true
      InstanceTenancy: default

  # Subnets
  Subnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 172.0.0.0/20
      MapPublicIpOnLaunch: false

  # SGs
  SecurityGroup1:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow all ingress and egress traffic
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          IpProtocol: "-1" # -1 specifies all protocols

  SecurityGroup2:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow all egress traffic
      VpcId: !Ref VPC

  # Instances
  # Instance A and B should have a path between them since they are both in SecurityGroup 1
  InstanceA:
    Type: AWS::EC2::Instance
    Properties:
      ImageId:
        Fn::FindInMap:
          - RegionMap
          - Ref: AWS::Region
          - execution
      InstanceType: 't3.nano'
      SubnetId:
        Ref: Subnet1
      SecurityGroupIds:
        - Ref: SecurityGroup1

  # Instance A and B should have a path between them since they are both in SecurityGroup 1
  InstanceB:
    Type: AWS::EC2::Instance
    Properties:
      ImageId:
        Fn::FindInMap:
          - RegionMap
          - Ref: AWS::Region
          - execution
      InstanceType: 't3.nano'
      SubnetId:
        Ref: Subnet1
      SecurityGroupIds:
        - Ref: SecurityGroup1

  # This instance should not be reachable from Instance A or B since it is in SecurityGroup 2
  InstanceC:
    Type: AWS::EC2::Instance
    Properties:
      ImageId:
        Fn::FindInMap:
          - RegionMap
          - Ref: AWS::Region
          - execution
      InstanceType: 't3.nano'
      SubnetId:
        Ref: Subnet1
      SecurityGroupIds:
        - Ref: SecurityGroup2