Tag Archives: Amazon EC2 Container Registry

Orchestrate Jenkins Workloads using Dynamic Pod Autoscaling with Amazon EKS

Post Syndicated from Vladimir Toussaint original https://aws.amazon.com/blogs/devops/orchestrate-jenkins-workloads-using-dynamic-pod-autoscaling-with-amazon-eks/

This blog post demonstrates how to leverage Jenkins with Amazon Elastic Kubernetes Service (EKS) by running a Jenkins Manager within an EKS pod. In doing so, we can run Jenkins workloads by allowing Amazon EKS to spawn dynamic Jenkins Agent(s) to perform application and infrastructure deployment. Traditionally, customers set up a Jenkins Manager-Agent architecture that contains a set of manually added nodes with no autoscaling capabilities. Implementing the strategy in this post ensures a robust approach that optimizes performance by provisioning right-sized compute capacity for the work needed to successfully perform the build tasks.

In setting up our Amazon EKS cluster with Jenkins, we’ll utilize eksctl, a simple CLI tool for creating clusters on EKS. Then, we’ll build both the Jenkins Manager and Jenkins Agent images. Afterward, we’ll run a container deployment on our cluster to access the Jenkins application and utilize the dynamic Jenkins Agent pods to run pipelines and jobs.
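
For reference, a minimal eksctl invocation for creating a cluster might look like the following sketch. The cluster name, region, and node settings are illustrative placeholders; use the values defined in the sample repository where they differ.

# Create an EKS cluster with a managed node group (all values are placeholders)
eksctl create cluster \
  --name jenkins-eks-cluster \
  --region us-east-1 \
  --nodes 2 \
  --node-type t3.large

# Confirm that kubectl can reach the new cluster
kubectl get nodes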

Solution Overview

The architecture below illustrates the execution steps.

Solution Architecture diagram
Figure 1. Solution overview diagram

Disclaimer: This Jenkins application is not configured with persistent volume storage. If you require persistence, you must establish and configure this template to fit that requirement.

To accomplish this deployment workflow, we will do the following:

Centralized Shared Services account

  1. Deploy the Amazon EKS Cluster into a Centralized Shared Services Account.
  2. Create the Amazon ECR repositories to store the Jenkins Manager and Jenkins Agent Docker images.
  3. Deploy the Kubernetes manifest file for the Jenkins Manager.

Target Account(s)

  1. Establish a set of AWS Identity and Access Management (IAM) roles with permissions for cross-account access from the Shared Services account into the Target account(s).

Jenkins Application UI

  1. Jenkins Plugins – Install and configure the Kubernetes Plugin and CloudBees AWS Credentials Plugin from Manage Plugins (you will not have to install these manually, since they are packaged and installed as part of the Jenkins image build).
  2. Jenkins Pipeline Example—Fetch the Jenkinsfile to deploy an S3 Bucket with CloudFormation in the Target account using a Jenkins parameterized pipeline.

Prerequisites

The following are the minimum requirements for this solution to work.

Account Prerequisites

  • Shared Services Account: The location of the Amazon EKS Cluster.
  • Target Account: The destination of the CI/CD pipeline deployments.

Build Requirements

Clone the Git Repository

git clone https://github.com/aws-samples/jenkins-cloudformation-deployment-example.git

Security Considerations

This blog provides a high-level overview of the best practices for cross-account deployment and for maintaining isolation between applications. We evaluated the cross-account application deployment permissions and will describe the current state as well as what to avoid. As part of the security best practices, we will maintain isolation among multiple apps deployed in these environments, e.g., Pipeline 1 does not deploy to the Pipeline 2 infrastructure.

Requirement

A Jenkins manager is running as a container in an EC2 compute instance that resides within a Shared AWS account. This Jenkins application represents individual pipelines deploying unique microservices that build and deploy to multiple environments in separate AWS accounts. The cross-account deployment utilizes the target AWS account admin credentials in order to do the deployment.

Given this methodology, it is not good practice to share the account credentials externally. Additionally, the risk of deployment errors should be eliminated, and application isolation should be maintained within the same account.

Note that the deployment steps are being run using AWS CLIs, thus our solution will be focused on AWS CLI usage.

The risk is much lower when utilizing CloudFormation or the CDK to conduct deployments, because the AWS CLI commands executed from the build jobs specify stack names as parameterized inputs, and the probability of a stack-name error is very low. However, it remains inadvisable to utilize the admin credentials of the target account.

Best Practice — Current Approach

We utilized cross-account roles that can restrict unauthorized access across build jobs. This approach relies on the assume-role concept, which enables the requesting role to obtain temporary credentials (from AWS STS) for the target role and execute actions permitted by the target role. This is safer than utilizing hard-coded credentials. The requesting role could be either the inherited EC2 instance role or specific user credentials. However, in our case, we are utilizing the inherited EC2 instance role.
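
As a minimal sketch of that assume-role flow, the build job (or the node instance role it inherits) requests temporary credentials for a hypothetical execution role; the account ID and role name below are placeholders:

# Request temporary credentials for the execution role in the target account
aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/app-dev-role \
  --role-session-name jenkins-build-job

# The response contains AccessKeyId, SecretAccessKey, and SessionToken, which the
# build job exports as environment variables before running deployment commands.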

For ease of understanding, we will refer to the target role as the execution role below.

Cross account roles for Jenkins build jobs
Figure 2. Current approach

  • As per the security best practice of assigning minimum privileges, we must first create an execution role in IAM in each target account that has deployment permissions (either via CloudFormation or via the CLI), e.g., app-dev-role in the Dev account and app-prod-role in the Prod account.
  • For each of those roles, we configure a trust relationship with the parent account ID (the Shared Services account). This enables any role in the Shared Services account (with assume-role permission) to assume the execution role and deploy to the respective hosting infrastructure, e.g., the app-dev-role in the Dev account will be a common execution role that deploys various apps across that infrastructure.
  • Then, we create a local role in the Shared Services account and configure credentials within Jenkins to be utilized by the build jobs. Provide the job with the assume-role permissions and specify the list of ARNs across every account. Alternatively, the inherited EC2 instance role can also be utilized to assume the execution role.

Create Cross-Account IAM Roles

Cross-account IAM roles allow users to securely access AWS resources in a target account while maintaining the observability of that AWS account. The cross-account IAM role includes a trust policy allowing AWS identities in another AWS account to assume the given role. This allows us to create a role in one AWS account that delegates specific permissions to another AWS account.

  • Create an IAM role with a common name in each target account. The role name we’ve created is AWSCloudFormationStackExecutionRole. The role must have permissions to perform CloudFormation actions and any actions regarding the resources that will be created. In our case, we will be creating an S3 Bucket utilizing CloudFormation.
  • This IAM role must also have an established trust relationship with the Shared Services account. In this case, the Jenkins Agent will be granted the ability to assume the role of the particular target account from the Shared Services account (a hedged CLI sketch of creating the role follows the policy document below).
  • In our case, the IAM entity that will assume the AWSCloudFormationStackExecutionRole is the EKS Node Instance Role that is associated with the EKS cluster nodes.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudformation:CreateUploadBucket",
                "cloudformation:ListStacks",
                "cloudformation:CancelUpdateStack",
                "cloudformation:ExecuteChangeSet",
                "cloudformation:ListChangeSets",
                "cloudformation:ListStackResources",
                "cloudformation:DescribeStackResources",
                "cloudformation:DescribeStackResource",
                "cloudformation:CreateChangeSet",
                "cloudformation:DeleteChangeSet",
                "cloudformation:DescribeStacks",
                "cloudformation:ContinueUpdateRollback",
                "cloudformation:DescribeStackEvents",
                "cloudformation:CreateStack",
                "cloudformation:DeleteStack",
                "cloudformation:UpdateStack",
                "cloudformation:DescribeChangeSet",
                "s3:PutBucketPublicAccessBlock",
                "s3:CreateBucket",
                "s3:DeleteBucketPolicy",
                "s3:PutEncryptionConfiguration",
                "s3:PutBucketPolicy",
                "s3:DeleteBucket"
            ],
            "Resource": "*"
        }
    ]
}
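
As a hedged CLI sketch, you could create this role and attach the policy above as follows. The Shared Services account ID is a placeholder, and execution-role-policy.json is assumed to contain the policy document shown above:

# Create the execution role with a trust policy that allows the Shared Services
# account (placeholder ID) to assume it
aws iam create-role \
  --role-name AWSCloudFormationStackExecutionRole \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
      "Action": "sts:AssumeRole"
    }]
  }'

# Attach the deployment permissions shown above as an inline policy
aws iam put-role-policy \
  --role-name AWSCloudFormationStackExecutionRole \
  --policy-name StackExecutionPermissions \
  --policy-document file://execution-role-policy.json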

Build Docker Images

Build the custom Docker images for the Jenkins Manager and the Jenkins Agent, and then push the images to the Amazon ECR repository. Navigate to the docker/ directory, then execute the command with the required parameters: the AWS account ID, repository name, region, and the build folder name (jenkins-manager/ or jenkins-agent/) that resides in the current docker directory. The custom Docker images contain a set of starter package installations.
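
The repository provides a helper script for this step. As a hedged manual sketch of the equivalent Docker and Amazon ECR commands (the account ID, region, and repository name are placeholders):

# Authenticate Docker to the Amazon ECR registry
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 111111111111.dkr.ecr.us-east-1.amazonaws.com

# Build, tag, and push the Jenkins Manager image; repeat for jenkins-agent/
docker build -t jenkins-manager:latest jenkins-manager/
docker tag jenkins-manager:latest 111111111111.dkr.ecr.us-east-1.amazonaws.com/jenkins-manager:latest
docker push 111111111111.dkr.ecr.us-east-1.amazonaws.com/jenkins-manager:latest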

Deploy Jenkins Application

After building both images, navigate to the k8s/ directory, modify the manifest file for the Jenkins image, and then apply the Jenkins manifest.yaml template to set up the Jenkins application. (Note: This Jenkins application is not configured with persistent volume storage. Therefore, you will need to establish and configure this template to fit that requirement.)
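
A minimal sketch of applying the manifest, assuming the jenkins namespace used in this walkthrough and substituting the manifest file name from the repository:

# Create the namespace (if the manifest does not create it) and deploy the Jenkins Manager
kubectl create namespace jenkins
kubectl apply -f <JENKINS-MANIFEST-FILE>.yaml -n jenkins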

# Fetch the Application URL or navigate to the AWS Console for the Load Balancer
kubectl get svc -n jenkins

# Verify that jenkins deployment/pods are up running
kubectl get pods -n jenkins

# Replace with jenkins manager pod name and fetch Jenkins login password
kubectl exec -it pod/<JENKINS-MANAGER-POD-NAME> -n jenkins -- cat /var/jenkins_home/secrets/initialAdminPassword

  • The Kubernetes Plugin and CloudBees AWS Credentials Plugin are installed as part of the Jenkins image build, so no manual installation from Manage Plugins is needed.
  • Navigate: Manage Jenkins → Configure Global Security
  • Set the Crumb Issuer to remove the error pages in order to prevent Cross Site Request Forgery exploits.

Screenshot of Crumb Issuer
Figure 3. Configure Global Security

Configure Jenkins Kubernetes Cloud

  • Navigate: Manage Jenkins → Manage Nodes and Clouds → Configure Clouds
  • Click: Add a new cloud → select Kubernetes from the drop-down menu

Screenshot to configure Cloud on Jenkins
Figure 4a. Jenkins Configure Nodes and Clouds

Note: Before proceeding, please ensure that you can access your Amazon EKS cluster information, whether through the console or the CLI.

  • Enter a Name in the Kubernetes Cloud configuration field.
  • Enter the Kubernetes URL which can be found via AWS Console by navigating to the Amazon EKS service and locating the API server endpoint of the cluster, or run the command kubectl cluster-info.
  • Enter the namespace that will be utilized in the Kubernetes Namespace field. This will determine where the dynamic Kubernetes pods will spawn. In our case, the name of the namespace is jenkins.
  • During the initial setup of the Jenkins Manager on Kubernetes, there is an environment variable JENKINS_URL that automatically utilizes the Load Balancer URL to resolve requests. However, we will resolve our requests locally to the cluster IP address.
    • The format is as follows: https://<service-name>.<namespace>.svc.cluster.local

Configuring Kubernetes cloud for Jenkins
Figure 4b. Configure Kubernetes Cloud

Set AWS Credentials

Security concerns are a key reason why we’re utilizing an IAM role instead of access keys. For any given approach involving IAM, it is the best practice to utilize temporary credentials.

  • You must have the AWS Credentials Binding Plugin installed before this step. Enter the unique ID name as shown in the example below.
  • Enter the IAM Role ARN you created earlier for both the ID and IAM Role to use in the field as shown below.

Setting up credentials on Jenkins
Figure 5. AWS Credentials Binding

Configuring Global credentials
Figure 6. Managed Credentials

Create a pipeline

  • Navigate to the Jenkins main menu and select New Item
  • Create a Pipeline

Screenshot for Pipeline configuration
Figure 7. Create a pipeline

Configure Jenkins Agent

Set up a Kubernetes YAML template after you’ve built the agent image. In this example, we will be using the k8sPodTemplate.yaml file stored in the k8s/ folder.

CloudFormation Execution Scripts

The deploy-stack.sh file accepts four parameters and can conduct several types of CloudFormation stack executions, such as deploy, create-changeset, and execute-changeset. This is also reflected in the stages of the Jenkinsfile pipeline. The delete-stack.sh file accepts two parameters and, when executed, deletes a CloudFormation stack based on the given stack name and region.
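
The exact parameter names and order are defined by the scripts in the repository; a hypothetical invocation, assuming the parameters are a stack name, a parameters file, an execution type, and a region, might look like this:

# Hypothetical usage; consult the scripts in the repository for the actual parameters
./deploy-stack.sh example-stack parameters.json create-changeset us-east-1
./delete-stack.sh example-stack us-east-1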

Jenkinsfile

In this Jenkinsfile, the individual pipeline build jobs will deploy individual microservices. The k8sPodTemplate.yaml is utilized to specify the kubernetes pod details and the inbound-agent that will be utilized to run the pipeline.

Jenkins Pipeline: Execute a pipeline

  • Click Build with Parameters and then select a build action.

Configuring stackname in Jenkins configuration
Figure 8a. Build with Parameters

  • Examine the pipeline stages for the build action you selected. View more details of the stages below, and verify in your AWS account that the CloudFormation stack executed.

Jenkins pipeline dashboard
Figure 8b. Pipeline Stage View

  • The final step is to execute your pipeline and watch the pods spin up dynamically in your terminal. As shown below, the Jenkins Agent pod spawned and then terminated after the work completed. Watch this on your own by executing the following command:
# Watch the pods spawn in the "jenkins" namespace
kubectl get pods -n jenkins -w

CLI output showing Jenkins POD status
Figure 9. Watch Jenkins Agent Pods Spawn

Cleanup

In order to avoid incurring future charges, delete the resources utilized in the walkthrough.

  • Delete the EKS cluster. You can utilize eksctl to delete the cluster (see the sketch after this list).
  • Delete any remaining AWS resources created by EKS such as AWS LoadBalancer, Target Groups, etc.
  • Delete any related IAM entities.
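
A minimal cleanup sketch, assuming the cluster was created with eksctl (the cluster name and region are placeholders):

# Delete the EKS cluster and its associated node groups
eksctl delete cluster --name jenkins-eks-cluster --region us-east-1

# Verify that no load balancers created by Kubernetes services remain
aws elbv2 describe-load-balancers --region us-east-1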

Conclusion

This post walked you through the process of building out Amazon EKS-based infrastructure and integrating Jenkins to orchestrate workloads. We demonstrated how you can utilize this to deploy securely across multiple accounts with dynamic Jenkins agents, and how to align similar use cases to your business. To learn more about Amazon EKS, see our documentation pages or explore our console.

About the Authors

Vladimir P. Toussaint

Vladimir is a DevOps Cloud Architect at Amazon Web Services. He works with GovCloud customers to build solutions and capabilities as they move to the cloud. Prior to Amazon Web Services, Vladimir leveraged container orchestration tools such as Kubernetes to securely manage microservice applications for large enterprises.

Matt Noyce

Matt is a Sr. Cloud Application Architect at Amazon Web Services. He works primarily with health care and life sciences customers to help them architect and build applications, data lakes, and DevOps pipelines that solve their business needs. In his spare time Matt likes to run and hike along with enjoying time with friends and family.

Nikunj Vaidya

Nikunj is a DevOps Tech Leader at Amazon Web Services. He offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality. Prior to AWS, Nikunj has worked in software engineering roles, leading transformation projects, driving releases and improvements in the software quality and customer experience.

Building well-architected serverless applications: Optimizing application costs

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-optimizing-application-costs/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

COST 1. How do you optimize your serverless application costs?

Design, implement, and optimize your application to maximize value. Asynchronous design patterns and performance practices ensure efficient resource use and directly impact the value per business transaction. By optimizing your serverless application performance and its code patterns, you can directly impact the value it provides, while making more efficient use of resources.

Serverless architectures are easier to manage in terms of correct resource allocation compared to traditional architectures. Due to its pay-per-value pricing model and scale based on demand, a serverless approach effectively reduces the capacity planning effort. As covered in the operational excellence and performance pillars, optimizing your serverless application has a direct impact on the value it produces and its cost. For general serverless optimization guidance, see the AWS re:Invent talks, “Optimizing your Serverless applications” Part 1 and Part 2, and “Serverless architectural patterns and best practices”.

Required practice: Minimize external calls and function code initialization

AWS Lambda functions may call other managed services and third-party APIs. Functions may also use application dependencies that may not be suitable for ephemeral environments. Understanding and controlling what your function accesses while it runs can have a direct impact on value provided per invocation.

Review code initialization

I explain the Lambda initialization process with cold and warm starts in “Optimizing application performance – part 1”. Lambda reports the time it takes to initialize application code in Amazon CloudWatch Logs. As Lambda functions are billed by request and duration, you can use this to track costs and performance. Consider reviewing your application code and its dependencies to improve the overall execution time to maximize value.

You can take advantage of Lambda execution environment reuse to make external calls to resources and use the results for subsequent invocations. Use TTL mechanisms inside your function handler code. This ensures that you can prevent additional external calls that incur additional execution time, while preemptively fetching data that isn’t stale.

Review third-party application deployments and permissions

When using Lambda layers or applications provisioned by AWS Serverless Application Repository, be sure to understand any associated charges that these may incur. When deploying functions packaged as container images, understand the charges for storing images in Amazon Elastic Container Registry (ECR).

Ensure that your Lambda function only has access to what its application code needs. Regularly review that your function has a predicted usage pattern so you can factor in the cost of other services, such as Amazon S3 and Amazon DynamoDB.

Required practice: Optimize logging output and its retention

Consider reviewing your application logging level. Ensure that logging output and log retention are appropriately set to your operational needs to prevent unnecessary logging and data retention. This helps you keep the minimum log retention necessary to investigate operational and performance inquiries when needed.

Emit and capture only what is necessary to understand and operate your component as intended.

With Lambda, any standard output statements are sent to CloudWatch Logs. Capture and emit business and operational events that are necessary to help you understand your function, its integration, and its interactions. Use a logging framework and environment variables to dynamically set a logging level. When applicable, sample debugging logs for a percentage of invocations.

In the serverless airline example used in this series, the booking service Lambda functions use Lambda Powertools as a logging framework with output structured as JSON.

Lambda Powertools is added to the Lambda functions as a shared Lambda layer in the AWS Serverless Application Model (AWS SAM) template. The layer ARN is stored in Systems Manager Parameter Store.

Parameters:
  SharedLibsLayer:
    Type: AWS::SSM::Parameter::Value<String>
    Description: Project shared libraries Lambda Layer ARN
Resources:
    ConfirmBooking:
        Type: AWS::Serverless::Function
        Properties:
            FunctionName: !Sub ServerlessAirline-ConfirmBooking-${Stage}
            Handler: confirm.lambda_handler
            CodeUri: src/confirm-booking
            Layers:
                - !Ref SharedLibsLayer
            Runtime: python3.7
…

The LOG_LEVEL and other Powertools settings are configured in the Globals section as Lambda environment variables for all functions.

Globals:
    Function:
        Environment:
            Variables:
                POWERTOOLS_SERVICE_NAME: booking
                POWERTOOLS_METRICS_NAMESPACE: ServerlessAirline
                LOG_LEVEL: INFO 

For Amazon API Gateway, there are two types of logging in CloudWatch: execution logging and access logging. Execution logs contain information that you can use to identify and troubleshoot API errors. API Gateway manages the CloudWatch Logs, creating the log groups and log streams. Access logs contain details about who accessed your API and how they accessed it. You can create your own log group or choose an existing log group that could be managed by API Gateway.

Enable access logs, and selectively review the output format and request fields that might be necessary. For more information, see “Setting up CloudWatch logging for a REST API in API Gateway”.

API Gateway logging

Enable AWS AppSync logging which uses CloudWatch to monitor and debug requests. You can configure two types of logging: request-level and field-level. For more information, see “Monitoring and Logging”.

AWS AppSync logging

Define and set a log retention strategy

Define a log retention strategy to satisfy your operational and business needs. Set log expiration for each CloudWatch log group as they are kept indefinitely by default.
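
For an existing log group, a minimal sketch using the AWS CLI (the log group name and retention period are placeholders):

# Set a 14-day retention policy on an existing Lambda log group
aws logs put-retention-policy \
  --log-group-name /aws/lambda/my-function \
  --retention-in-days 14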

For example, in the booking service AWS SAM template, log groups are explicitly created for each Lambda function with a parameter specifying the retention period.

Parameters:
    LogRetentionInDays:
        Type: Number
        Default: 14
        Description: CloudWatch Logs retention period
Resources:
    ConfirmBookingLogGroup:
        Type: AWS::Logs::LogGroup
        Properties:
            LogGroupName: !Sub "/aws/lambda/${ConfirmBooking}"
            RetentionInDays: !Ref LogRetentionInDays

The Serverless Application Repository application auto-set-log-group-retention can update the retention policy for new and existing CloudWatch log groups to the specified number of days.

For log archival, you can export CloudWatch Logs to S3 and store them in Amazon S3 Glacier for more cost-effective retention. You can use CloudWatch Logs subscriptions for custom processing, analysis, or loading to other systems. Lambda extensions allow you to process, filter, and route logs directly from Lambda to a destination of your choice.

Good practice: Optimize function configuration to reduce cost

Benchmark your function using different memory sizes

For Lambda functions, memory is the capacity unit for controlling the performance and cost of a function. You can configure the amount of memory allocated to a Lambda function, between 128 MB and 10,240 MB. The amount of memory also determines the amount of virtual CPU available to a function. Benchmark your AWS Lambda functions with differing amounts of memory allocated. Adding more memory and proportional CPU may lower the duration and reduce the cost of each invocation.
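
As a minimal sketch of manual benchmarking (the function name and memory value are placeholders), you can adjust the memory setting with the AWS CLI, invoke the function, and compare the billed duration in the REPORT line of the CloudWatch Logs output:

# Change the memory allocation for a function
aws lambda update-function-configuration \
  --function-name my-function \
  --memory-size 1024

# Invoke the function and then review the billed duration in CloudWatch Logs
aws lambda invoke --function-name my-function out.json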

In “Optimizing application performance – part 2”, I cover using AWS Lambda Power Tuning to automate the memory testing process and balance performance and cost.

Best practice: Use cost-aware usage patterns in code

Reduce the time your function runs by reducing job-polling or task coordination. This avoids overpaying for unnecessary compute time.

Decide whether your application can fit an asynchronous pattern

Avoid scenarios where your Lambda functions wait for external activities to complete. I explain the difference between synchronous and asynchronous processing in “Optimizing application performance – part 1”. You can use asynchronous processing to aggregate queues, streams, or events for more efficient processing time per invocation. This reduces wait times and latency from requesting apps and functions.

Long polling or waiting increases the costs of Lambda functions and also reduces overall account concurrency. This can impact the ability of other functions to run.

Consider using other services such as AWS Step Functions to help reduce code and coordinate asynchronous workloads. You can build workflows using state machines with long-polling and failure handling. Step Functions also supports direct service integrations, such as DynamoDB, without having to use Lambda functions.

In the serverless airline example used in this series, Step Functions is used to orchestrate the Booking microservice. The ProcessBooking state machine handles all the necessary steps to create bookings, including payment.

Booking service state machine

To reduce costs and improve performance with CloudWatch, create custom metrics asynchronously. You can use the embedded metrics format to write logs, rather than calling the PutMetricData API. I cover using the embedded metrics format in “Understanding application health” – part 1 and part 2.

For example, once a booking is made, the logs are visible in the CloudWatch console. You can select a log stream and find the custom metric as part of the structured log entry.

Custom metric structured log entry

CloudWatch automatically creates metrics from these structured logs. You can create graphs and alarms based on them. For example, here is a graph based on a BookingSuccessful custom metric.

CloudWatch metrics custom graph

Consider asynchronous invocations and review runaway functions where applicable

Take advantage of Lambda’s event-based model. Lambda functions can be triggered based on events ingested into Amazon Simple Queue Service (SQS) queues, S3 buckets, and Amazon Kinesis Data Streams. AWS manages the polling infrastructure on your behalf with no additional cost. Avoid code that polls third-party software as a service (SaaS) providers. Instead, use Amazon EventBridge to integrate with SaaS providers when possible.

Carefully consider and review recursion, and establish timeouts to prevent runaway functions.

Conclusion

Design, implement, and optimize your application to maximize value. Asynchronous design patterns and performance practices ensure efficient resource use and directly impact the value per business transaction. By optimizing your serverless application performance and its code patterns, you can reduce costs while making more efficient use of resources.

In this post, I cover minimizing external calls and function code initialization. I show how to optimize logging output with the embedded metrics format, and log retention. I recap optimizing function configuration to reduce cost and highlight the benefits of asynchronous event-driven patterns.

This post wraps up the series, building well-architected serverless applications, where I cover the AWS Well-Architected Tool with the Serverless Lens. See the introduction post for links to all the blog posts.

For more serverless learning resources, visit Serverless Land.

 

How to authenticate private container registries using AWS Batch

Post Syndicated from Ben Peven original https://aws.amazon.com/blogs/compute/how-to-authenticate-private-container-registries-using-aws-batch/

This post was contributed by Clayton Thomas, Solutions Architect, AWS WW Public Sector SLG Govtech.

Many AWS Batch users choose to store and consume their AWS Batch job container images on AWS using Amazon Elastic Container Registry (ECR). AWS Batch and Amazon Elastic Container Service (ECS) natively support pulling from Amazon ECR without any extra steps required. For users that choose to store their container images in other container registries or Docker Hub, those images are often not publicly exposed and require authentication to pull. Third-party repositories may throttle the number of requests, which impedes the ability to run workloads, and self-managed repositories require heavy tuning to offer the scale that Amazon ECR provides. This makes Amazon ECR the preferred registry for container images used by AWS Batch workloads.

While Amazon ECS allows you to configure repositoryCredentials in task definitions containing private registry credentials, AWS Batch does not expose this option in AWS Batch job definitions. AWS Batch does not provide the ability to use private registries by default, but you can enable it by configuring the Amazon ECS agent in a few steps.

This post shows how to configure an AWS Batch EC2 compute environment and the Amazon ECS agent to pull your private container images from private container registries. This gives you the flexibility to use your own private and public container registries with AWS Batch.

Overview

The solution uses AWS Secrets Manager to securely store your private container registry credentials, which are retrieved on startup of the AWS Batch compute environment. This ensures that your credentials are securely managed and accessed using IAM roles and are not persisted or stored in AWS Batch job definitions or EC2 user data. The Amazon ECS agent is then configured upon startup to pull these credentials from AWS Secrets Manager. Note that this solution only supports Amazon EC2 based AWS Batch compute environments, thus AWS Fargate cannot use this solution.

Figure 1: High-level diagram showing event flow

  1. AWS Batch uses an Amazon EC2 Compute Environment powered by Amazon ECS. This compute environment uses a custom EC2 Launch Template to configure the Amazon ECS agent to include credentials for pulling images from private registries.
  2. An EC2 User Data script is run upon EC2 instance startup that retrieves registry credentials from AWS Secrets Manager. The EC2 instance authenticates with AWS Secrets Manager using its configured IAM instance profile, which grants temporary IAM credentials.
  3. AWS Batch jobs can be submitted using private images that require authentication with configured credentials.

Prerequisites

For this walkthrough, you should have the following prerequisites:

  1. An AWS account
  2. An Amazon Virtual Private Cloud with private and public subnets. If you do not have a VPC, this tutorial can be followed. The AWS Batch compute environment must have connectivity to the container registry.
  3. A container registry containing a private image. This example uses Docker Hub and assumes you have created a private repository
  4. Registry credentials and/or an access token to authenticate with the container registry or Docker Hub.
  5. A VPC Security Group allowing the AWS Batch compute environment egress connectivity to the container registry.

A CloudFormation template is provided to simplify setting up this example. The CloudFormation template and provided EC2 user data script can be viewed here on GitHub.

The CloudFormation template will create the following resources:

  1. Necessary IAM roles for AWS Batch
  2. AWS Secrets Manager secret containing container registry credentials
  3. AWS Batch managed compute environment and job queue
  4. EC2 launch template with user data script

Click the Launch Stack button to get started.

Launch the CloudFormation stack

After clicking the Launch stack button above, click Next to be presented with the following screen:

Figure 2: CloudFormation stack parameters

Fill in the required parameters as follows:

  1. Stack Name: Give your stack a unique name.
  2. Password: Your container registry password or Docker Hub access token. Note that both the user name and password are masked and will not appear in any CloudFormation logs or output. Additionally, they are securely stored in an AWS Secrets Manager secret created by CloudFormation.
  3. RegistryUrl: If not using Docker Hub, specify the URL of the private container registry.
  4. User name: Your container registry user name.
  5. SecurityGroupIDs: Select your previously created security group to assign to the example Batch compute environment.
  6. SubnetIDs: Select one or more VPC subnet IDs to assign to the example Batch compute environment.

After entering these parameters, you can click through next twice and create the stack, which will take a few minutes to complete. Note that you must acknowledge that the template creates IAM resources on the review page before submitting.

Finally, once the stack deployment finishes, you will be presented with a list of created AWS resources, as shown in Figure 3, if you would like to dig deeper.

Figure 3: CloudFormation created resources

User data script contained within launch template

AWS Batch allows you to customize the compute environment in a variety of ways such as specifying an EC2 key pair, custom AMI, or an EC2 user data script. This is done by specifying an EC2 launch template before creating the Batch compute environment. For more information on Batch launch template support, see here.

Let’s take a closer look at how the Amazon ECS agent is configured upon compute environment startup to use your registry credentials.

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/cloud-config; charset="us-ascii"

packages:
- jq
- aws-cli

runcmd:
- /usr/bin/aws configure set region $(curl http://169.254.169.254/latest/meta-data/placement/region)
- export SECRET_STRING=$(/usr/bin/aws secretsmanager get-secret-value --secret-id your_secrets_manager_secret_id | jq -r '.SecretString')
- export USERNAME=$(echo $SECRET_STRING | jq -r '.username')
- export PASSWORD=$(echo $SECRET_STRING | jq -r '.password')
- export REGISTRY_URL=$(echo $SECRET_STRING | jq -r '.registry_url')
- echo $PASSWORD | docker login --username $USERNAME --password-stdin $REGISTRY_URL
- export AUTH=$(cat ~/.docker/config.json | jq -c .auths)
- echo 'ECS_ENGINE_AUTH_TYPE=dockercfg' >> /etc/ecs/ecs.config
- echo "ECS_ENGINE_AUTH_DATA=$AUTH" >> /etc/ecs/ecs.config

--==MYBOUNDARY==--

This example script uses and installs a few tools including the AWS CLI and the open-source tool jq to retrieve and parse the previously created Secrets Manager secret. These packages are installed using the cloud-config user data type, which is part of the cloud-init packages functionality. If using the provided CloudFormation template, this script will be dynamically rendered to reference the created secret, but note that you must specify the correct Secrets Manager secret id if not using the template.

After performing a Docker login, the generated Auth JSON object is captured and passed to the Amazon ECS agent configuration to be used on AWS Batch jobs that require private images. For an explanation of Amazon ECS agent configuration options including available Amazon ECS engine Auth types, see here. This example script can be extended or customized to fit your needs but must adhere to requirements for Batch launch template user data scripts, including being in MIME multi-part archive format.

It’s worth noting that the AWS CLI automatically obtains temporary IAM credentials from the IAM instance profile that the CloudFormation stack created in order to retrieve the Secrets Manager secret values. This example assumes you created the AWS Secrets Manager secret with the default AWS managed KMS key for Secrets Manager. However, if you choose to encrypt your secret with a customer managed KMS key, make sure to specify kms:Decrypt IAM permissions for the Batch compute environment IAM role.
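
A hedged sketch of granting that permission to the compute environment’s EC2 instance role (the role name, policy name, and key ARN are placeholders):

# Allow the Batch compute environment instance role to decrypt the customer managed key
aws iam put-role-policy \
  --role-name BatchInstanceRole \
  --policy-name AllowSecretKmsDecrypt \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": "kms:Decrypt",
      "Resource": "arn:aws:kms:us-east-1:111111111111:key/<your-key-id>"
    }]
  }'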

Submitting the AWS Batch job

Now let’s try an example Batch job that uses a private container image by creating a Batch job definition and submitting a Batch job:

  1. Open the AWS Batch console
  2. Navigate to the Job Definition page
  3. Click create
  4. Provide a unique Name for the job definition
  5. Select the EC2 platform
  6. Specify your private container image located in the Image field
  7. Click create

Figure 4: Batch job definition

Now you can submit an AWS Batch job that uses this job definition:

  1. Click on the Jobs page
  2. Click Submit New Job
  3. Provide a Name for the job
  4. Select the previously created job definition
  5. Select the Batch Job Queue created by the CloudFormation stack
  6. Click Submit

Figure 5: Submitting a new Batch job

After submitting the AWS Batch job, it will take a few minutes for the AWS Batch compute environment to create resources for scheduling the job. Once that is done, you should see a SUCCEEDED status by viewing the job and filtering by the AWS Batch job queue, as shown in Figure 6.

Figure 6: AWS Batch job succeeded

Cleaning up

To clean up the example resources, click delete for the created CloudFormation stack in the CloudFormation Console.

Conclusion

In this blog, you deployed a customized AWS Batch managed compute environment that was configured to allow pulling private container images in a secure manner. As I’ve shown, AWS Batch gives you the flexibility to use both private and public container registries. I encourage you to continue to explore the many options available natively on AWS for hosting and pulling container images. Amazon ECR or the recently launched Amazon ECR public repositories (for a deeper dive, see this blog announcement) both provide a seamless experience for container workloads running on AWS.

Choosing a Well-Architected CI/CD approach: Open Source on AWS

Post Syndicated from Mikhail Vasilyev original https://aws.amazon.com/blogs/devops/choosing-a-well-architected-ci-cd-approach-open-source-on-aws/

Introduction

When building a CI/CD platform, it is important to make an informed decision regarding every underlying tool. This post explores the criteria for selecting each tool, focusing on a balance between meeting functional and non-functional requirements and maximizing value.

Your first decision: Source code management

Source code is potentially your most valuable asset, and so we start by choosing a source code management tool. These tools normally have high non-functional requirements in order to protect your assets and to ensure they are available to the organization when needed. The requirements usually include demand for high durability, high availability (HA), consistently high throughput, and strong security with role-based access controls.

At the same time, source code management tools normally have many specific functional requirements as well. For example, the ability to provide collaborative code review in the UI, flexible and tunable merge policies including both automated and manual gates (code checks), and out-of-box UI-level integrations with numerous other tools. These kinds of integrations can include enabling monitoring, CI, chats, and agile project management.

Many teams also treat source code management tools as their portal into other CI/CD tools. They make them shareable between teams, and might prefer to stay within one single context and user interface throughout the entire DevOps cycle. Many source code management tools are actually a stack of services that support multiple steps of your CI/CD workflows from within a single UI. This makes them an excellent starting point for building your CI/CD platforms.

The first decision you need to make is whether to go with an open source solution for managing code or with an AWS-managed solution, such as AWS CodeCommit. Open source solutions include (but are not limited to) the following: Gerrit, Gitlab, Gogs, and Phabricator.

Your decision will be influenced by the amount of benefit your team can gain from the flexibility provided through open source, and how well your team can support deploying and managing these solutions. You will also need to consider the infrastructure and management overhead cost.

Engineering teams that have the capacity to develop their own plugins for their CI/CD platforms, or who even contribute directly to open source projects, will often prefer open source solutions for the flexibility they provide. This will be especially true if they are fluent in designing and supporting their own cloud infrastructure. If the team gets more value by trading the flexibility of open source for not having to worry about managing infrastructure (especially if high availability, scalability, durability, and security are more critical), an AWS-managed solution would be a better choice.

Source Code Management Solution

When the choice is made in favor of an open-source code management solution (such as Gitlab), the next decision will be how to architect the deployment. Will the team deploy to a single instance, or design for high availability, durability, and scalability? Teams that want to design Gitlab for HA can use the following guide to proceed: Installing GitLab on Amazon Web Services (AWS).

By adopting AWS services (such as Amazon RDS, Amazon ElastiCache for Redis, and Autoscaling Groups), you can lower the management burden of supporting the underlying infrastructure in this self-managed HA scenario.

High level overview of self-managed HA Gitlab deployment

Your second decision: Continuous Integration engine

When selecting your CI engine, you might be able to benefit from additional features of previously selected solutions. Gitlab provides both source control services and built-in CI tooling, called Gitlab CI. Gitlab Runners are responsible for running CI jobs, and the actual jobs are described as YML files stored in Gitlab’s git repository along with product code. For security and performance reasons, GitLab Runners should run on resources separate from your GitLab instance.

You could manage those resources or you could use one of the AWS services that can support deploying and managing Runners. The use of an on-demand service removes the expense of implementing and managing a capability that is undifferentiated heavy lifting for you. This provides cost optimization and enables operational excellence. You pay for what you use and the service team manages the underlying service.

Continuous Integration engine Solution

In an architecture example (below), Gitlab Runners are deployed in containers running on Amazon EKS. The team has less infrastructure to manage, can start focusing on development faster by not having to implement the capability, and can provision resources in an optimal way for their on-demand needs.

To further optimize costs, you can use EC2 Spot Instances for your EKS nodes. CI jobs are normally compute intensive and limited in run time. The runner jobs can easily be restarted on a different resource with little impact. This makes them tolerant of failure and the use of EC2 Spot instances very appealing. Amazon EKS and Spot Instances are supported out-of-box in Gitlab. As a result there is no integration to develop, only configuration is required.

To support infrastructure as code best practices, Runners are deployed with Helm and are stored and versioned as Helm charts. All of the infrastructure as code information used to implement the CI/CD platform itself is stored in templates such as Terraform.

High level overview of Infrastructure as Code on Gitlab and Gitlab CI

Your third decision: Container Registry

You will be unable to deploy Runners if the container images are not available. As a result, the primary non-functional requirements for your production container registry are likely to include high availability, durability, transparent scalability, and security. At the same time, your functional requirements for a container registry might be lower. It might be sufficient to have a simple UI, and simple APIs supporting basic flows. Customers looking for a managed solution can use Amazon ECR, which is OCI compliant and supports Helm Charts.

Container Registry Solution

For this set of requirements, the flexibility and feature velocity of open source tools does not provide an advantage. Self-supporting high availability and strengthened security could be costly in implementation time and long-term management. Based on [Blog post 1 Diagram 1], an AWS-managed solution provides cost advantages and has no management overhead. In this case, an AWS-managed solution is a better choice for your container registry than an open-source solution hosted on AWS. In this example, Amazon ECR is selected. Customers who prefer to go with open-source container registries might consider solutions like Harbor.

High level overview of Gitlab CI with Amazon ECR

Additional Considerations

Now that the main services for the CI/CD platform are selected, we will take a high level look at additional important considerations. You need to make sure you have observability into both infrastructure and applications, that backup tools and policies are in place, and that security needs are addressed.

There are many mechanisms to strengthen security including the use of security groups. Use IAM for granular permission control. Robust policies can limit the exposure of your resources and control the flow of traffic. Implement policies to prevent your assets leaving your CI environment inappropriately. To protect sensitive data, such as worker secrets, encrypt these assets while in transit and at rest. Select a key management solution to reduce your operational burden and to support these activities such as AWS Key Management Service (AWS KMS). To deliver secure and compliant application changes rapidly while running operations consistently with automation, implement DevSecOps.

Amazon S3 is durable, secure, and highly available by design, making it the preferred choice of many customers for storing EBS-level backups. Amazon S3 satisfies the non-functional requirements for a backup store. It also supports versioning and tiered storage classes, making it cost-effective as well.

Your observability requirements may emphasize versatility and flexibility for application-level monitoring. Using Amazon CloudWatch to monitor your infrastructure and then extending your capabilities through open-source solutions such as Prometheus may be advantageous. You can get many of the benefits of both open-source Prometheus and AWS services with Amazon Managed Service for Prometheus (AMP). For interactive visualization of metrics, many customers choose solutions such as open-source Grafana, available as an AWS service Amazon Managed Service for Grafana (AMG).

CI/CD Platform with Gitlab and AWS

Conclusion

We have covered how making informed decisions can maximize value and synergy between open-source solutions on AWS, such as Gitlab, and AWS-managed services, such as Amazon EKS and Amazon ECR. You can find the right balance of open-source tools and AWS services that will meet your functional and non-functional requirements, and help maximize the value you get from those resources.

Pete Goldberg, Director of Partnerships at GitLab: “When aligning your development process to AWS Well Architected Framework, GitLab allows customers to build and automate processes to achieve Operational Excellence. As a single tool designed to facilitate collaboration across the organization, GitLab simplifies the process to follow the Fully Separated Operating Model where Engineering and Operations come together via automated processes that remove the historical barriers between the groups. This gives organizations the ability to efficiently and rapidly deploy new features and applications that drive the business while providing the risk mitigation and compliance they require. By allowing operations teams to define infrastructure as code in the same tool that the engineering teams are storing application code, and allowing your automation bring those together for your CI/CD workflows companies can move faster while having compliance and controls built-in, providing the entire organization greater transparency. With GitLab’s integrations with different AWS compute options (EC2, Lambda, Fargate, ECS or EKS), customers can choose the best type of compute for the job without sacrificing the controls required to maintain Operational Excellence.”

 

Author bio

Mikhail is a Solutions Architect for RUS-CIS. Mikhail supports customers on their cloud journeys with Well-architected best practices and adoption of DevOps techniques on AWS. Mikhail is a fan of ChatOps, Open Source on AWS and Operational Excellence design principles.

Building well-architected serverless applications: Implementing application workload security – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-implementing-application-workload-security-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

Security question SEC3: How do you implement application security in your workload?

Review and automate security practices at the application code level, and enforce security code review as part of development workflow. By implementing security at the application code level, you can protect against emerging security threats and reduce the attack surface from malicious code, including third-party dependencies.

Required practice: Review security awareness documents frequently

Stay up to date with both AWS and industry security best practices to understand and evolve protection of your workloads. Having a clear understanding of common threats helps you to mitigate them when developing your workloads.

The AWS Security Blog provides security-specific AWS content. The Open Web Application Security Project (OWASP) Top 10 is a guide for security practitioners to understand the most common application attacks and risks. The OWASP Top 10 Serverless Interpretation provides information specific to serverless applications.

Review and subscribe to vulnerability and security bulletins

Regularly review news feeds from multiple sources that are relevant to the technologies used in your workload. Subscribe to notification services to be informed of critical threats in near-real time.

The Common Vulnerabilities and Exposures (CVE) program identifies, defines, and catalogs publicly disclosed cybersecurity vulnerabilities. You can search the CVE list directly, for example “Python”.

CVE Python search

The US National Vulnerability Database (NVD) allows you to search by vulnerability type, severity, and impact. You can also perform advanced searching by vendor name, product name, and version numbers. GitHub also integrates with CVE, which allows for advanced searching within the CVEproject/cvelist repository.

AWS Security Bulletins are a notification system for security and privacy events related to AWS services. Subscribe to the security bulletin RSS feed to keep up to date with AWS security announcements.

The US Cybersecurity and Infrastructure Security Agency (CISA) provides alerts about current security issues, vulnerabilities, and exploits. You can receive email alerts or subscribe to the RSS feed.

AWS Partner Network (APN) member Palo Alto Networks provides the “Serverless architectures Security Top 10” list. This is a security awareness and education guide to use while designing, developing, and testing serverless applications to help minimize security risks.

Good practice: Automatically review a workload’s code dependencies/libraries

Regularly reviewing application and code dependencies is a good industry security practice. This helps detect and prevent non-certified application code, and ensure that third-party application dependencies operate as intended.

Implement security mechanisms to verify application code and dependencies before using them

Combine automated and manual security code reviews to examine application code and its dependencies to ensure they operate as intended. Automated tools can help identify overly complex application code, and common security vulnerability exposures that are already cataloged.

Manual security code reviews, in addition to automated tools, help ensure that application code works as intended. Manual reviews can include business contextual information and integrations that automated tools may not capture.

Before adding any code dependencies to your workload, take time to review and certify each dependency to ensure that you are adding secure code. Use third-party services to review your code dependencies on every commit automatically.

OWASP has a code review guide and dependency check tool that attempt to detect publicly disclosed vulnerabilities within a project’s dependencies. The tool has a command line interface, a Maven plugin, an Ant task, and a Jenkins plugin.
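
As a minimal sketch of running the Dependency-Check command line interface against a project directory (the project name and paths are placeholders, and flags can vary between tool versions):

# Scan a project's dependencies for publicly disclosed vulnerabilities
dependency-check.sh --project my-service --scan ./src --out ./reports --format HTML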

GitHub has a number of security features for hosted repositories to inspect and manage code dependencies.

The dependency graph allows you to explore the packages that your repository depends on. Dependabot alerts show information about dependencies that are known to contain security vulnerabilities. You can choose whether to have pull requests generated automatically to update these dependencies. Code scanning alerts automatically scan code files to detect security vulnerabilities and coding errors.

You can enable these features by navigating to the Settings tab, and selecting Security & analysis.

GitHub configure security and analysis features

Once Dependabot analyzes the repository, you can view the dependencies graph from the Insights tab. In the serverless airline example used in this series, you can view the Loyalty service package.json dependencies.

Serverless airline loyalty dependencies

Dependabot alerts for security vulnerabilities are visible in the Security tab. You can review alerts and see information about how to resolve them.

Dependabot alert

Once Dependabot alerts are enabled for a repository, you can also view the alerts when pushing code to the repository from the terminal.

Dependabot terminal alert

If you enable security updates, Dependabot can automatically create pull requests to update dependencies.

Dependabot pull requests

AWS Partner Network (APN) member Snyk has an integration with AWS Lambda to manage the security of your function code. Snyk determines what code and dependencies are currently deployed for Node.js, Ruby, and Java projects. It tests dependencies against their vulnerability database.

If you build your functions using container images, you can use Amazon Elastic Container Registry’s (ECR) image scanning feature. You can manually scan your images, or scan them on each push to your repository.
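
A minimal sketch using the AWS CLI (the repository name and image tag are placeholders):

# Enable scanning on every push for a repository
aws ecr put-image-scanning-configuration \
  --repository-name my-function-images \
  --image-scanning-configuration scanOnPush=true

# Manually trigger a scan of an existing image and retrieve the findings
aws ecr start-image-scan --repository-name my-function-images --image-id imageTag=latest
aws ecr describe-image-scan-findings --repository-name my-function-images --image-id imageTag=latest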

Elastic Container Registry image scanning example results

Best practice: Validate inbound events

Sanitize inbound events and validate them against a predefined schema. This helps prevent errors and increases your workload’s security posture by catching malformed events or events intentionally crafted to be malicious. The OWASP Input validation cheat sheet includes guidance for providing input validation security functionality in your applications.

Validate incoming HTTP requests against a schema

Implicitly trusting data from clients could lead to malformed data being processed. Use data type validators or web application frameworks to ensure data correctness. These checks should include regular expressions, value ranges, data structures, and data normalization.

You can configure Amazon API Gateway to perform basic validation of an API request before proceeding with the integration request to add another layer of security. This ensures that the HTTP request matches the desired format. Any HTTP request that does not pass validation is rejected, returning a 400 error response to the caller.

The Serverless Security Workshop has a module on API Gateway input validation based on the fictional Wild Rydes unicorn ride-hailing service. The example shows a REST API endpoint where partner companies of Wild Rydes can submit unicorn customizations, such as branded capes, to advertise their company. The API endpoint should ensure that the request body follows specific patterns. These include checking that the ImageURL is a valid URL, and that the ID for Cape is a numeric value.

In API Gateway, a model defines the data structure of a payload, using the JSON schema draft 4. The model ensures that you receive the parameters in the format you expect. You can check them against regular expressions. The CustomizationPost model specifies that the ImageURL and Cape schemas should contain the following valid patterns:

    "imageUrl": {
      "type": "string",
      "title": "The Imageurl Schema",
      "pattern": "^https?:\/\/[[email protected]:%_+.~#?&//=]+$"
    },
    "sock": {
      "type": "string",
      "title": " The Cape Schema ",
      "pattern": "^[0-9]*$"
    },
    …

The model is applied to the /customizations/post method as part of the Method Request. The Request Validator is set to Validate body and the CustomizationPost model is set for the Request Body.
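
The workshop applies these settings in the API Gateway console. If you prefer to script the setup, a rough boto3 sketch is shown below; the REST API ID and the trimmed-down schema are placeholders, not the workshop's exact values.

import json
import boto3

apigw = boto3.client("apigateway")
REST_API_ID = "a1b2c3d4e5"  # placeholder REST API ID

# Register a simplified version of the CustomizationPost model on the API.
apigw.create_model(
    restApiId=REST_API_ID,
    name="CustomizationPost",
    contentType="application/json",
    schema=json.dumps({
        "$schema": "http://json-schema.org/draft-04/schema#",
        "type": "object",
        "properties": {
            "imageUrl": {"type": "string", "pattern": "^https?://.+$"},
            "sock": {"type": "string", "pattern": "^[0-9]*$"},
        },
    }),
)

# Create a request validator that checks request bodies against attached models.
apigw.create_request_validator(
    restApiId=REST_API_ID,
    name="Validate body",
    validateRequestBody=True,
    validateRequestParameters=False,
)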

API Gateway request validator

When testing the POST /customizations API with valid parameters using the following input:

{  
   "name":"Cherry-themed unicorn",
   "imageUrl":"https://en.wikipedia.org/wiki/Cherry#/media/File:Cherry_Stella444.jpg",
   "sock": "1",
   "horn": "2",
   "glasses": "3",
   "cape": "4"
}

The result is a valid response:

{"customUnicornId":<the-id-of-the-customization>}

Testing the POST /customizations API with invalid parameters shows the input validation process in action.

The ImageUrl is not a valid URL:

 {  
    "name":"Cherry-themed unicorn",
    "imageUrl":"htt://en.wikipedia.org/wiki/Cherry#/media/File:Cherry_Stella444.jpg",
    "sock": "1" ,
    "horn": "2" ,
    "glasses": "3",
    "cape": "4"
 }

The Cape parameter is not a number; instead, it contains a SQL injection attempt:

 {  
    "name":"Orange-themed unicorn",
    "imageUrl":"https://en.wikipedia.org/wiki/Orange_(fruit)#/media/File:Orange-Whole-%26-Split.jpg",
    "sock": "1",
    "horn": "2",
    "glasses": "3",
    "cape":"2); INSERT INTO Cape (NAME,PRICE) VALUES ('Bad color', 10000.00"
 }

These return a 400 Bad Request response from API Gateway before invoking the Lambda function:

{"message": "Invalid request body"}

To gain further protection, consider adding an AWS Web Application Firewall (AWS WAF) access control list to your API endpoint. The workshop includes an AWS WAF module to explore three AWS WAF rules:

  • Restrict the maximum size of request body
  • SQL injection condition as part of the request URI
  • Rate-based rule to prevent an overwhelming number of requests

AWS WAF ACL

AWS WAF also includes support for custom responses and request header insertion to improve the user experience and security posture of your applications.

For more API Gateway security information, see the security overview whitepaper.

Also add further input validation logic to your Lambda function code itself. For examples, see “Input Validation for Serverless”.
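
As a minimal illustration of in-function validation (my own sketch, not the referenced article's code), the following Python handler re-checks the same fields that the API Gateway model validates; the field names and patterns mirror the workshop example and are assumptions.

import json
import re

URL_PATTERN = re.compile(r"^https?://\S+$")

def lambda_handler(event, context):
    # Defense in depth: re-validate the body even though API Gateway
    # request validation already rejects malformed payloads.
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"message": "Invalid JSON body"})}

    if not URL_PATTERN.match(str(body.get("imageUrl", ""))):
        return {"statusCode": 400, "body": json.dumps({"message": "imageUrl is not a valid URL"})}
    if not str(body.get("sock", "")).isdigit():
        return {"statusCode": 400, "body": json.dumps({"message": "sock must be numeric"})}

    return {"statusCode": 200, "body": json.dumps({"message": "validated"})}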

Conclusion

Implementing application security in your workload involves reviewing and automating security practices at the application code level. By implementing code security, you can protect against emerging security threats. You can improve your security posture by checking for malicious code, including in third-party dependencies.

In this post, I cover reviewing security awareness documentation such as the CVE database. I show how to use GitHub security features to inspect and manage code dependencies. I then show how to validate inbound events using API Gateway request validation.

This well-architected question will be continued in a future post, where I look at securely storing, auditing, and rotating secrets that are used in your application code.

For more serverless learning resources, visit Serverless Land.

How Banks Can Use AWS to Meet Compliance

Post Syndicated from Jiwan Panjiker original https://aws.amazon.com/blogs/architecture/how-banks-can-use-aws-to-meet-compliance/

Since the 2008 financial crisis, banking supervisory institutions such as the Basel Committee on Banking Supervision (BCBS) have strengthened regulations. There is now increased oversight over the financial services industry. For banks, making the necessary changes to comply with these rules is a challenging, multi-year effort.

Basel IV, a massive update to existing rules, is due for implementation in January 2023. Basel IV standardizes the approach to calculating credit risk, increases the impact of risk-weighted assets (RWAs) and emphasizes data transparency.

Given the complexity of data, modeling, and numerous assumptions that have to be made, compliance under Basel IV implementation will be challenging. Standardization omits nuances unique to your business, which can drive up costs, but violating guidelines will result in steep penalties.

This post will address these challenges by outlining a mechanism that facilitates a healthy, data-driven dialogue between banks and regulators to better achieve compliance objectives. The reference architecture will focus on enabling fast, iterative releases with the help of serverless AWS services.

There are four key actions to take in order to support this mechanism:

  1. Automate data management
  2. Establish a continuous integration/continuous delivery (CI/CD) pipeline
  3. Enable fast, point-in-time audit replays
  4. Set up proactive monitoring and notifications

Automate data management

Due to frequent merger activity, banks typically comprise a web of integrated systems and siloed business units, making it difficult to consolidate data. Under Basel IV guidelines, auditors want banks to provide detailed data in a presentable way.

You can tackle this first challenge by establishing a data pipeline as shown in Figure 1. Take inventory of each data source as it is incorporated into the pipeline. Identify the critical internal and external data sources that will be used to populate the initial landing area. Amazon Simple Storage Service (S3) is a great choice for this.

Figure 1. Data pipeline that cleans, processes, and segments data

Amazon S3 is a highly available, durable service that is a popular data lake solution. S3 offers WORM storage capabilities such as S3 Glacier Vault Lock and S3 Object Lock to protect the integrity of your archived data in accordance with U.S. SEC and FINRA rules.

Basel IV regulations also require banks to use many attributes to develop accurate credit risk models. The attributes can be a mix of datasets such as financial statements, internal balanced scorecards, macro-economic data, and credit ratings. The risk models themselves can also be segmented by portfolio types, industry segments, asset types and much more.

You can split data into different domains and designate data owners with separate S3 buckets. Credit risk model developers, analysts, and data scientists can then use the structure of the S3 buckets to pull together relevant datasets, and store their outputs in designated S3 buckets.

To support fast, automated data retrieval, store object metadata in a highly scalable, and queryable database. You can set up Amazon S3 so that an event can initiate a function to populate Amazon DynamoDB. Developers can use AWS Lambda to write these functions using popular languages like Python.
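
A minimal sketch of such a function is shown below, assuming an S3 ObjectCreated event notification and a DynamoDB table named ObjectMetadata (both names are placeholders).

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("ObjectMetadata")  # assumed table name

def lambda_handler(event, context):
    # Invoked by an S3 ObjectCreated event; record basic object metadata.
    for record in event["Records"]:
        s3_info = record["s3"]
        table.put_item(
            Item={
                "object_key": s3_info["object"]["key"],
                "bucket": s3_info["bucket"]["name"],
                "size_bytes": s3_info["object"].get("size", 0),
                "event_time": record["eventTime"],
            }
        )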

With AWS Glue, you can automate extract, transform, and load (ETL) processes to clean and move data to the different S3 buckets. AWS Glue can also support data operations by automatically cataloging your various data sources.

Taking on a structured approach will simplify data governance and transparency as the business continues to grow and operate.

Establish a CI/CD pipeline

Adopt tools that machine learning teams can use to build a streamlined CI/CD solution as demonstrated in Figure 2.

Figure 2. An end-to-end machine learning development and deployment pipeline

Using tightly integrated AWS services, your teams can minimize time spent managing tools and deployment processes, and instead, focus on tuning the models and analyzing the results.

Amazon SageMaker brings together a powerful set of machine learning capabilities on the AWS Cloud. It helps data scientists and engineers build insightful models. Figure 2 depicts the high-level architecture and shows how Amazon SageMaker Pipelines helps teams orchestrate the automation and deployment processes.

The core of the pipeline uses a set of AWS deployment services so that your teams can collaborate and review effectively. With AWS CodeCommit, your teams can set up a Git-based repository to store and version models for data processing, training, and evaluation. The repository can also store code and configuration files using AWS CloudFormation for deployment. You can use AWS CodePipeline and AWS CodeBuild to create and update a model endpoint based on the approved/reviewed changes.

Any updates detected in the AWS CodeCommit repository initiate a deployment whenever a new model version is added to the Model Registry. Amazon S3 can be used to store generated model artifacts, historical data, and models.

Enable fast, point-in-time audit replays

Figure 3. Containers offer a lightweight, powerful solution to run audits using historical assets

One of the main themes of Basel IV is transparency. Figure 3 illustrates a solution to build trust with regulators by allowing them to verify and understand modeling activity.

A lightweight application is hosted in AWS Fargate and enables auditors to re-run Basel credit risk models under specified conditions. With AWS Fargate, you don’t need to manually manage instances or container orchestration. Configure the CPU or memory specifications at the task level and set guidelines around scalability for your service. Your tasks then scale up and down automatically, based on demand, and will optimize cost efficiency and availability.

Figure 3 shows the following:

  1. The application takes inputs such as date, release version, and model type.
  2. It then queries DynamoDB with this information.
  3. The query will return the data necessary to retrieve model artifacts from previous CI/CD deployments and relevant datasets from historical S3 buckets.
  4. Using this information, it can spin up as many containers as needed to run the model.
  5. It then stores the outputs in a separate S3 bucket.
  6. Auditors will have a detailed trace of all the attributes, assumptions, and data that went into the modeling effort. To streamline this process, the app can also compare the outputs of the historical runs to the recent replay and highlight any significant deviations.

Though internal models will be de-emphasized under Basel IV, banks will continue to run internal models as a benchmark against the broader standards. Schedule AWS Fargate tasks to run these models regularly to capitalize on highly performant compute services while minimizing costs.
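
The sketch below illustrates steps 2–5 of the replay flow with boto3, querying a run index table and launching a Fargate task; every name here (table, cluster, task definition, subnet, output bucket) is a placeholder for your own resources.

import boto3

dynamodb = boto3.resource("dynamodb")
ecs = boto3.client("ecs")

def replay_model(run_date, release_version, model_type):
    # Steps 2-3: look up the artifact and dataset locations for the requested run.
    run_index = dynamodb.Table("ModelRunIndex")  # assumed table name
    item = run_index.get_item(
        Key={"run_date": run_date, "model_type": model_type}
    )["Item"]

    # Step 4: launch a Fargate task that re-runs the model with those inputs.
    ecs.run_task(
        cluster="audit-replay",            # assumed ECS cluster
        launchType="FARGATE",
        taskDefinition="basel-replay",     # assumed task definition
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],  # placeholder subnet
                "assignPublicIp": "DISABLED",
            }
        },
        overrides={
            "containerOverrides": [{
                "name": "replay",  # container name in the task definition
                "environment": [
                    {"name": "MODEL_ARTIFACT_S3_URI", "value": item["artifact_uri"]},
                    {"name": "DATASET_S3_URI", "value": item["dataset_uri"]},
                    {"name": "RELEASE_VERSION", "value": release_version},
                    {"name": "OUTPUT_BUCKET", "value": "audit-replay-outputs"},  # step 5 target
                ],
            }]
        },
    )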

Set up proactive monitoring and notifications

Figure 4. Scheduled jobs can send out notifications using Amazon SNS when certain thresholds are breached

The last principle is based on establishing an early warning system, enabling banks to take a more proactive role in maintaining compliance.

With automated monitoring and notifications, banks will be able to respond quickly to potential concerns. For instance, there can be a daily scheduled job that launches containers and runs the models against the latest data. If any thresholds are breached, alerts can be sent out via SMS or email. Operational teams can be subscribed to certain message topics using Amazon Simple Notification Service (SNS). They can then respond before actual compliance issues emerge.
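
A minimal sketch of the notification step follows, assuming the scheduled job computes a metric and that a compliance-alerts SNS topic already exists (the topic ARN is a placeholder).

import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:compliance-alerts"  # placeholder

def alert_if_breached(metric_name, value, threshold):
    # Publish an alert when a daily model run breaches a configured threshold.
    if value > threshold:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject=f"Threshold breached: {metric_name}",
            Message=f"{metric_name} is {value}, above the threshold of {threshold}.",
        )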

Conclusion

With a Well-Architected approach, AWS helps you control your data, deploy new features, and embrace a serverless approach. This frees you to innovate quickly and address regulatory challenges.

You can iterate with new AWS services and bring machine learning to bear on various streams of data to identify high impact pools of value. You can get a clearer picture of the data to make it easier to identify areas where you can reduce RWAs. Using Amazon S3, you can turn on AWS analytics services such as Amazon QuickSight and Amazon Athena to visualize the data. You’ll be able to fulfill reporting requirements such as those found in regulatory studies like CCAR, DFAST, CECL, and IFRS9.

For more information about establishing a data pipeline, read Lake House Formation Architecture. It is a powerful pattern that combines a few concepts that will help bring your data together cohesively. To set up a robust CI/CD pipeline, explore the AWS Serverless CI/CD Reference Architecture.

Choosing a CI/CD approach: Open Source on AWS, an Iponweb story

Post Syndicated from Mikhail Vasilyev original https://aws.amazon.com/blogs/devops/choosing-a-ci-cd-approach-open-source-on-aws-an-iponweb-story/

Iponweb is a global leader in building programmatic and real-time advertising technology and infrastructure for some of the world’s biggest digital media buyers and sellers. The company develops client-facing products and internal development tools that must be platform agnostic to support operations spanning multiple cloud services.

In this post, we explore how Iponweb applied key considerations when choosing a continuous integration and continuous deployment (CI/CD) solution, what they determined to be the right CI/CD approach for them, and review some considerations that may apply to your own business needs. In the next post, we will dive even deeper into these key considerations.

How did Iponweb decide what they needed?

The first and most important question in designing a Well-Architected approach is: “How do you determine your priorities?” AWS Well-Architected defines the first two best practices to do that as: “evaluate external customer needs” (Iponweb’s clients) and “evaluate internal customer needs” (Iponweb’s team).

Iponweb started with these two considerations while selecting the strategic toolset. After evaluating their customers’ requirements, the next step was to look at the needs of the Iponweb team. Their priorities included the products and features required, the cost, and the ability to build multi-cloud solutions.

Iponweb is dedicated to operating securely with the reliability and performance to support their customers. Solutions had to satisfy their fundamental requirements in these areas to be considered in their evaluation.

Feature set

Iponweb evaluated available options for the CI tool chain and found that, for their needs, GitLab was the clear winner, differentiated by delivering the greatest number of required features at the best price while being platform agnostic.

AWS had the complete set of tools, services, and best practices to support Iponweb’s goal to establish an open-source, self-hosted CI environment using GitLab. Upon completing their thorough evaluation process, Iponweb selected AWS to implement its CI environment.

Cost

Iponweb understood the investment they would be making within their team to leverage and support all the desired features of GitLab. Iponweb evaluated the expertise of its internal teams and factored in ease of integration with supporting services.

They adopted several AWS services that satisfied their undifferentiated needs, which allowed them to remove the operational burden and cost of maintaining their own implementations of various capabilities and features.

Furthermore, the availability of Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances provided the opportunity to further manage costs for their CI resource needs and usage patterns.

Security

Iponweb leveraged their existing security control implementations and integration with AWS to support adopting additional AWS services. AWS was responsible for the security of the cloud, including the underlying AWS services. Iponweb was able to focus on securely and effectively configuring those services and their GitLab implementation. This ensured the security of their open-source, self-hosted CI environment.

When setting priorities for the design of a Well-Architected approach, it’s imperative to “manage benefits and risks,” which emphasizes making informed decisions when adopting open source or any tools. Iponweb achieved their best value solution by applying Well-Architected practices in Operational Excellence, Cost Optimization, and Security pillars by leveraging AWS products and services.

Overview of solution

Continuous integration consists of three key processes, each of which AWS supports:

  • Code stage – Iponweb built the centralized Git repository on the GitLab platform on EC2 servers, providing the UI and API to store and manage the code.
  • Test and build stage – They used GitLab as the application layer to manage build and test flows through GitLab Runners (compute workers for CI jobs). This layer is implemented via GitLab in containers, and is deployed and managed by Amazon Elastic Kubernetes Service (Amazon EKS).
  • Publish stage – Amazon Elastic Container Registry (Amazon ECR) stores the infrastructure containers for the runners and product containers.

The following diagram illustrates this architecture:

At the core of Iponweb’s CI platform architecture is the open-source GitLab Community Edition.

Implementing the solution

CI jobs are either run regularly or triggered by events such as merge requests. The jobs are described as code in YAML files and are stored and versioned along with the product code itself. Runner versions are published into Amazon ECR and launched as Docker containers in Amazon EKS.

Runner code is stored as Helm charts that help Iponweb package up and manage their large-scale Kubernetes deployments. In addition, Amazon EKS has support for Helm and many other plugins for Kubernetes.

Iponweb developers innovate at a very fast pace, and customize Iponweb’s client solutions in rapid iterations. To address uncertain container registry requirements, Iponweb decided to use Amazon ECR. As a managed service, Amazon ECR eliminates concerns about scaling capacity and management. Integration of GitLab with Amazon EKS and Amazon ECR is provided out of the box through a UI and predefined scripts, with no additional overhead to develop and deploy code or plugins.

Iponweb was able to implement the Well-Architected design principle: "stop continuously estimating its capacity needs," enabling them to focus on more strategic development activities. They performed a thorough analysis of each component, looking at the total cost of ownership, including operations and management. In doing so, they implemented the best practice from the Cost Optimization pillar: "How do you evaluate cost when you select services?"

In the Cost Optimization pillar, a key question is "How do you use pricing models to reduce costs?" Iponweb deployed runners in Amazon EKS for precise, granular, and on-demand compute scaling for each CI job. These tasks have short-term capacity needs, so Iponweb benefited from configuring Amazon EKS on Spot Instances, achieving a significant price reduction. The EC2 Spot pricing model is most appropriate for their CI resource needs and usage patterns.

To protect their data at rest, Iponweb followed a best practice from the Security pillar: “Implement secure key management.” They used AWS Key Management Service (AWS KMS) to manage secrets for the runners.

To protect the code and artifacts, and to ensure these valuable assets don’t leave the CI environment inappropriately, Iponweb followed best practices in Infrastructure Protection from the Security pillar question, “How do you protect your networks?” Iponweb scrupulously defined the network protection requirements, limiting their exposure by controlling traffic at all layers, and implementing security groups to prevent inappropriate access into and out of their VPC.

Michael Benuhis, CTO at Iponweb, says:

“Iponweb was able to get the best of open-source software and public cloud services by building the continuous integration platform on Amazon Web Services. Open-source tools provided Iponweb platform agnosticism for serving our diverse customer base, while managed Amazon EKS on EC2 Spot Instances eliminated the operational burden of managing our own Kubernetes infrastructure, and with greater cost efficiency.”

Conclusion

Iponweb has satisfied their current needs and isn’t looking for improvements in the short term. They will stay on the free version of GitLab, satisfied for the moment with what they have achieved. They have custom automations in place to synchronize with GitLab and integrate with their existing tools. They like the features provided by the paid version of GitLab, but there isn’t a business case to support an informed decision to upgrade at this time.

They have achieved their goal of using Amazon EKS and Spot under GitLab CI/CD integrated with their existing systems and satisfying their needs.

Using container image support for AWS Lambda with AWS SAM

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/using-container-image-support-for-aws-lambda-with-aws-sam/

At AWS re:Invent 2020, AWS Lambda released Container Image Support for Lambda functions. This new feature allows developers to package and deploy Lambda functions as container images of up to 10 GB in size. With this release, AWS SAM also added support to manage, build, and deploy Lambda functions using container images.

In this blog post, I walk through building a simple serverless application that uses Lambda functions packaged as container images with AWS SAM. I demonstrate creating a new application and highlight changes to the AWS SAM template specific to container image support. I then cover building the image locally for debugging in addition to eventual deployment. Finally, I show using AWS SAM to handle packaging and deploying Lambda functions from a developer’s machine or a CI/CD pipeline.

Push to invoke lifecycle

The process for creating a Lambda function packaged as a container requires only a few steps. A developer first creates the container image and tags that image with the appropriate label. The image is then uploaded to an Amazon Elastic Container Registry (ECR) repository using docker push.

During the Lambda create or update process, the Lambda service pulls the image from ECR, optimizes the image for use, and deploys the image to the Lambda service. Once this and any other configuration processes are complete, the Lambda function is in Active status and ready to be invoked. The AWS SAM CLI manages most of these steps for you.
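
AWS SAM makes these calls for you later in this walkthrough. If you were scripting the create step yourself, a rough boto3 equivalent might look like the following; the function name, role ARN, and image URI are placeholders.

import boto3

lambda_client = boto3.client("lambda")

lambda_client.create_function(
    FunctionName="demo-app-hello-world",                 # placeholder name
    Role="arn:aws:iam::123456789012:role/lambda-role",   # placeholder role ARN
    PackageType="Image",
    Code={"ImageUri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/demo-app-hello-world:latest"},
)

# Wait until the function state is Active before invoking it.
lambda_client.get_waiter("function_active").wait(FunctionName="demo-app-hello-world")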

Prerequisites

The following tools are required in this walkthrough:

Create the application

Use the terminal and follow these steps to create a serverless application:

  1. Enter sam init.
  2. For Template source, select option one for AWS Quick Start Templates.
  3. For Package type, choose option two for Image.
  4. For Base image, select option one for amazon/nodejs12.x-base.
  5. Name the application demo-app.

Demonstration of sam init

Exploring the application

Open the template.yaml file in the root of the project to see the new options available for container image support. The AWS SAM template has two new values that are required when working with container images. PackageType: Image tells AWS SAM that this function is using container images for packaging.

AWS SAM template

The second set of required data is in the Metadata section that helps AWS SAM manage the container images. When a container is created, a new tag is added to help identify that image. By default, Docker uses the tag, latest. However, AWS SAM passes an explicit tag name to help differentiate between functions. That tag name is a combination of the Lambda function resource name, and the DockerTag value found in the Metadata. Additionally, the DockerContext points to the folder containing the function code and Dockerfile identifies the name of the Dockerfile used in building the container image.

In addition to changes in the template.yaml file, AWS SAM also uses the Docker CLI to build container images. Each Lambda function has a Dockerfile that instructs Docker how to construct the container image for that function. The Dockerfile for the HelloWorldFunction is at hello-world/Dockerfile.

Local development of the application

AWS SAM provides local development support for zip-based and container-based Lambda functions. When using container-based images, as you modify your code, update the local container image using sam build. AWS SAM then calls docker build using the Dockerfile for instructions.

Dockerfile for Lambda function

In the case of the HelloWorldFunction that uses Node.js, the Docker command:

  1. Pulls the latest container base image for nodejs12.x from the Amazon Elastic Container Registry Public.
  2. Copies the app.js code and package.json files to the container image.
  3. Installs the dependencies inside the container image.
  4. Sets the invocation handler.
  5. Creates and tags new version of the local container image.

To build your application locally on your machine, enter:

sam build

The results are:

Results for sam build

Now test the code by locally invoking the HelloWorldFunction using the following command:

sam local invoke HelloWorldFunction

The results are:

Results for sam local invoke

You can also combine these commands and add flags for cached and parallel builds:

sam build --cached --parallel && sam local invoke HelloWorldFunction

Deploying the application

There are two ways to deploy container-based Lambda functions with AWS SAM. The first option is to deploy from AWS SAM using the sam deploy command. The deploy command tags the local container image, uploads it to ECR, and then creates or updates your Lambda function. The second method is the sam package command used in continuous integration and continuous delivery or deployment (CI/CD) pipelines, where the deployment process is separate from the artifact creation process.

AWS SAM package tags and uploads the container image to ECR but does not deploy the application. Instead, it creates a modified version of the template.yaml file with the newly created container image location. This modified template is later used to deploy the serverless application using AWS CloudFormation.

Deploying from AWS SAM with the guided flag

Before you can deploy the application, use the AWS CLI to create a new ECR repository to store the container image for the HelloWorldFunction.

Run the following command from a terminal:

aws ecr create-repository --repository-name demo-app-hello-world \
--image-tag-mutability IMMUTABLE --image-scanning-configuration scanOnPush=true

This command creates a new ECR repository called demo-app-hello-world. The --image-tag-mutability IMMUTABLE option prevents overwriting tags. The --image-scanning-configuration scanOnPush=true enables automated vulnerability scanning whenever a new image is pushed to the repository. The output is:

Amazon ECR creation output

Make a note of the repositoryUri as you need it in the next step.
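
If you script repository creation instead of using the AWS CLI, a boto3 sketch with the same settings might look like this (the repository name matches the example above).

import boto3

ecr = boto3.client("ecr")

response = ecr.create_repository(
    repositoryName="demo-app-hello-world",
    imageTagMutability="IMMUTABLE",
    imageScanningConfiguration={"scanOnPush": True},
)
print(response["repository"]["repositoryUri"])  # needed for sam deploy --guided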

Before you can push your images to this new repository, ensure that you have logged in to the managed Docker service that ECR provides. Update the bracketed tokens with your information and run the following command in the terminal:

aws ecr get-login-password --region <region> | docker login --username AWS \
--password-stdin <account id>.dkr.ecr.<region>.amazonaws.com

You can also install the Amazon ECR credentials helper to help facilitate Docker authentication with Amazon ECR.

After building the application locally and creating a repository for the container image, you can deploy the application. The first time you deploy an application, use the guided version of the sam deploy command and follow these steps:

  1. Type sam deploy --guided, or sam deploy -g.
  2. For Stack Name, enter demo-app.
  3. Choose the same Region that you created the ECR repository in.
  4. Enter the Image Repository for the HelloWorldFunction (this is the repositoryUri of the ECR repository).
  5. For Confirm changes before deploy and Allow SAM CLI IAM role creation, keep the defaults.
  6. For HelloWorldFunction may not have authorization defined, Is this okay? Select Y.
  7. Keep the defaults for the remaining prompts.

Results of sam deploy --guided

AWS SAM uploads the container images to the ECR repo and deploys the application. During this process, you see a changeset along with the status of the deployment. When the deployment is complete, the stack outputs are then displayed. Use the HelloWorldApi endpoint to test your application in production.

Deploy outputs

When you use the guided version, AWS SAM saves the entered data to the samconfig.toml file. For subsequent deployments with the same parameters, use sam deploy. If you want to make a change, use the guided deployment again.

This example demonstrates deploying a serverless application with a single, container-based Lambda function in it. However, most serverless applications contain more than one Lambda function. To work with an application that has more than one Lambda function, follow these steps to add a second Lambda function to your application:

  1. Copy the hello-world directory using the terminal command cp -R hello-world hola-world
  2. Replace the contents of the template.yaml file with the following
    AWSTemplateFormatVersion: '2010-09-09'
    Transform: AWS::Serverless-2016-10-31
    Description: demo app
      
    Globals:
      Function:
        Timeout: 3
    
    Resources:
      HelloWorldFunction:
        Type: AWS::Serverless::Function
        Properties:
          PackageType: Image
          Events:
            HelloWorld:
              Type: Api
              Properties:
                Path: /hello
                Method: get
        Metadata:
          DockerTag: nodejs12.x-v1
          DockerContext: ./hello-world
          Dockerfile: Dockerfile
          
      HolaWorldFunction:
        Type: AWS::Serverless::Function
        Properties:
          PackageType: Image
          Events:
            HolaWorld:
              Type: Api
              Properties:
                Path: /hola
                Method: get
        Metadata:
          DockerTag: nodejs12.x-v1
          DockerContext: ./hola-world
          Dockerfile: Dockerfile
    
    Outputs:
      HelloWorldApi:
        Description: "API Gateway endpoint URL for Prod stage for Hello World function"
        Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/hello/"
      HolaWorldApi:
        Description: "API Gateway endpoint URL for Prod stage for Hola World function"
        Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/hola/"
  3. Replace the contents of hola-world/app.js with the following
    let response;
    exports.lambdaHandler = async(event, context) => {
        try {
            response = {
                'statusCode': 200,
                'body': JSON.stringify({
                    message: 'hola world',
                })
            }
        }
        catch (err) {
            console.log(err);
            return err;
        }
        return response
    };
  4. Create an ECR repository for the HolaWorldFunction
    aws ecr create-repository --repository-name demo-app-hola-world \
    --image-tag-mutability IMMUTABLE --image-scanning-configuration scanOnPush=true
  5. Run the guided deploy to add the second repository:
    sam deploy -g

The AWS SAM guided deploy process allows you to provide the information again but prepopulates the defaults with previous values. Update the following:

  1. Keep the same stack name, Region, and Image Repository for HelloWorldFunction.
  2. Use the new repository for HolaWorldFunction.
  3. For the remaining steps, use the same values from before. For Lambda functions not to have authorization defined, enter Y.

Results of sam deploy --guided

Deploying in a CI/CD pipeline

Companies use continuous integration and continuous delivery (CI/CD) pipelines to automate application deployment. Because the process is automated, using an interactive process like a guided AWS SAM deployment is not possible.

Developers can use the packaging process in AWS SAM to prepare the artifacts for deployment and produce a separate template usable by AWS CloudFormation. The package command is:

sam package --output-template-file packaged-template.yaml \
--image-repository 5555555555.dkr.ecr.us-west-2.amazonaws.com/demo-app

For multiple repositories:

sam package --output-template-file packaged-template.yaml \ 
--image-repositories HelloWorldFunction=5555555555.dkr.ecr.us-west-2.amazonaws.com/demo-app-hello-world \
--image-repositories HolaWorldFunction=5555555555.dkr.ecr.us-west-2.amazonaws.com/demo-app-hola-world

Both cases create a file called packaged-template.yaml. The Lambda functions in this template have an added ImageUri property that points to the ECR repository and the tag of the Lambda function’s container image.

Packaged template

Using sam package to generate a separate CloudFormation template enables developers to separate artifact creation from application deployment. The deployment process can then be placed in an isolated stage allowing for greater customization and observability of the pipeline.

Conclusion

Container image support for Lambda enables larger application artifacts and the ability to use container tooling to manage Lambda images. AWS SAM simplifies application management by bringing these tools into the serverless development workflow.

In this post, you create a container-based serverless application using command lines in the terminal. You create ECR repositories and associate them with functions in the application. You deploy the application from your local machine and package the artifacts for separate deployment in a CI/CD pipeline.

To learn more about serverless and AWS SAM, visit the Sessions with SAM series at s12d.com/sws and find more resources at serverlessland.com.

#ServerlessForEveryone

Working with Lambda layers and extensions in container images

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/working-with-lambda-layers-and-extensions-in-container-images/

In this post, I explain how to use AWS Lambda layers and extensions with Lambda functions packaged and deployed as container images.

Previously, Lambda functions were packaged only as .zip archives. This includes functions created in the AWS Management Console. You can now also package and deploy Lambda functions as container images.

You can use familiar container tooling such as the Docker CLI with a Dockerfile to build, test, and tag images locally. Lambda functions built using container images can be up to 10 GB in size. You push images to an Amazon Elastic Container Registry (ECR) repository, a managed AWS container image registry service. You create your Lambda function, specifying the source code as the ECR image URL from the registry.

Lambda container image support

Lambda functions packaged as container images do not support adding Lambda layers to the function configuration. However, there are a number of solutions to use the functionality of Lambda layers with container images. You take on the responsibility for packaging your preferred runtimes and dependencies as a part of the container image during the build process.

Understanding how Lambda layers and extensions work as .zip archives

If you deploy function code using a .zip archive, you can use Lambda layers as a distribution mechanism for libraries, custom runtimes, and other function dependencies.

When you include one or more layers in a function, during initialization, the contents of each layer are extracted in order to the /opt directory in the function execution environment. Each runtime then looks for libraries in a different location under /opt, depending on the language. You can include up to five layers per function, which count towards the unzipped deployment package size limit of 250 MB. Layers are automatically set as private, but they can be shared with other AWS accounts, or shared publicly.

Lambda Extensions are a way to augment your Lambda functions and are deployed as Lambda layers. You can use Lambda Extensions to integrate functions with your preferred monitoring, observability, security, and governance tools. You can choose from a broad set of tools provided by AWS, AWS Lambda Ready Partners, and AWS Partners, or create your own Lambda Extensions. For more information, see “Introducing AWS Lambda Extensions – In preview.”

Extensions can run in either of two modes, internal and external. An external extension runs as an independent process in the execution environment. They can start before the runtime process, and can continue after the function invocation is fully processed. Internal extensions run as part of the runtime process, in-process with your code.

Lambda searches the /opt/extensions directory and starts initializing any extensions found. Extensions must be executable as binaries or scripts. As the function code directory is read-only, extensions cannot modify function code.

It helps to understand that Lambda layers and extensions are just files copied into specific file paths in the execution environment during the function initialization. The files are read-only in the execution environment.
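
If you want to confirm what ended up under /opt at runtime, a small diagnostic handler like the sketch below (my own example, not part of the Lambda tooling) can list the files it finds.

import os

def lambda_handler(event, context):
    # List everything the image (or layers, for .zip functions) placed under /opt.
    opt_files = []
    for root, _dirs, files in os.walk("/opt"):
        for name in files:
            opt_files.append(os.path.join(root, name))
    print("Files under /opt:", opt_files)
    return {"extensions": [p for p in opt_files if p.startswith("/opt/extensions/")]}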

Understanding container images with Lambda

A container image is a packaged template built from a Dockerfile. The image is assembled or built from commands in the Dockerfile, starting from a parent or base image, or from scratch. Each command then creates a new layer in the image, which is stacked in order on top of the previous layer. Once built from the packaged template, a container image is immutable and read-only.

For Lambda, a container image includes the base operating system, the runtime, any Lambda extensions, your application code, and its dependencies. Lambda provides a set of open-source base images that you can use to build your container image. Lambda uses the image to construct the execution environment during function initialization. You can use the AWS Serverless Application Model (AWS SAM) CLI or native container tools such as the Docker CLI to build and test container images locally.

Using Lambda layers in container images

Container layers are added to a container image, similar to how Lambda layers are added to a .zip archive function.

There are a number of ways to use container image layering to add the functionality of Lambda layers to your Lambda function container images.

Use a container image version of a Lambda layer

A Lambda layer publisher may have a container image format equivalent of a Lambda layer. To maintain the same file path as Lambda layers, the published container images must have the equivalent files located in the /opt directory. An image containing an extension must include the files in the /opt/extensions directory.

An example Lambda function, packaged as a .zip archive, is created with two layers. One layer contains shared libraries, and the other layer is a Lambda extension from an AWS Partner.

aws lambda create-function --region us-east-1 --function-name my-function \  
    --role arn:aws:iam::123456789012:role/lambda-role \
    --layers \
        "arn:aws:lambda:us-east-1:123456789012:layer:shared-lib-layer:1" \
        "arn:aws:lambda:us-east-1:987654321987:extensions-layer:1" \
    …

The corresponding Dockerfile syntax for a function packaged as a container image includes the following lines. These pull the container image versions of the Lambda layers and copy them into the function image. The shared library image is pulled from ECR and the extension image is pulled from Docker Hub.

FROM public.ecr.aws/myrepo/shared-lib-layer:1 AS shared-lib-layer
# Layer code
WORKDIR /opt
COPY --from=shared-lib-layer /opt/ .

FROM aws-partner/extensions-layer:1 as extensions-layer
# Extension  code
WORKDIR /opt/extensions
COPY --from=extensions-layer /opt/extensions/ .

Copy the contents of a Lambda layer into a container image

You can use existing Lambda layers, and copy the contents of the layers into the function container image /opt directory during docker build.

You need to build a Dockerfile that includes the AWS Command Line Interface to copy the layer files from Amazon S3.

The Dockerfile to add two layers into a single image includes the following lines to copy the Lambda layer contents.

FROM alpine:latest

ARG AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION:-"us-east-1"}
ARG AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:-""}
ARG AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:-""}
ENV AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
ENV AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}

RUN apk add aws-cli curl unzip

RUN mkdir -p /opt

RUN curl $(aws lambda get-layer-version-by-arn --arn arn:aws:lambda:us-east-1:1234567890123:layer:shared-lib-layer:1 --query 'Content.Location' --output text) --output layer.zip
RUN unzip layer.zip -d /opt
RUN rm layer.zip

RUN curl $(aws lambda get-layer-version-by-arn --arn arn:aws:lambda:us-east-1:987654321987:extensions-layer:1 --query 'Content.Location' --output text) --output layer.zip
RUN unzip layer.zip -d /opt
RUN rm layer.zip

To run the AWS CLI, specify your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and include the required AWS_DEFAULT_REGION as command-line arguments.

docker build . -t layer-image1:latest \
--build-arg AWS_DEFAULT_REGION=us-east-1 \
--build-arg AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE \
--build-arg AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

This creates a container image containing the existing Lambda layer and extension files. This can be pushed to ECR and used in a function.

Build a container image from a Lambda layer

You can repackage and publish Lambda layer file content as container images. Creating separate container images for different layers allows you to add them to multiple functions, and share them in a similar way as Lambda layers.

You can create a separate container image containing the files from a single layer, or combine the files from multiple layers into a single image. If you create separate container images for layer files, you then add these images into your function image.

There are two ways to manage language code dependencies. You can pre-build the dependencies and copy the files into the container image, or build the dependencies during docker build.

In this example, I migrate an existing Python application, comprising a Lambda function and an extension, from a .zip archive to separate function and extension container images. The extension writes logs to S3.

You can choose how to store images in repositories. You can either push both images to the same ECR repository with different image tags, or push to different repositories. In this example, I use separate ECR repositories.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file.

The existing example extension uses a makefile to install boto3 using pip install with a requirements.txt file. This is migrated to the docker build process. I must add a Python runtime to be able to run pip install as part of the build process. I use python:3.8-alpine as a minimal base image.

I create separate Dockerfiles for the function and extension. The extension Dockerfile contains the following lines.

FROM python:3.8-alpine AS installer
#Layer Code
COPY extensionssrc /opt/
COPY extensionssrc/requirements.txt /opt/
RUN pip install -r /opt/requirements.txt -t /opt/extensions/lib

FROM scratch AS base
WORKDIR /opt/extensions
COPY --from=installer /opt/extensions .

I build, tag, login, and push the extension container image to an existing ECR repository.

docker build -t log-extension-image:latest  .
docker tag log-extension-image:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-image:latest
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-image:latest

The function Dockerfile contains the following lines, which add the files from the previously created extension image to the function image. There is no need to run pip install for the function as it does not require any additional dependencies.

FROM 123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-image:latest AS layer
FROM public.ecr.aws/lambda/python:3.8
# Layer code
WORKDIR /opt
COPY --from=layer /opt/ .
# Function code
WORKDIR /var/task
COPY app.py .
CMD ["app.lambda_handler"]

I build, tag, and push the function container image to a separate existing ECR repository. This creates an immutable image of the Lambda function.

docker build -t log-extension-function:latest  .
docker tag log-extension-function:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-function:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-function:latest

The function requires a unique S3 bucket to store the logs files, which I create in the S3 console. I create a Lambda function from the ECR repository image, and specify the bucket name as a Lambda environment variable.

aws lambda create-function --region us-east-1  --function-name log-extension-function \
--package-type Image --code ImageUri=123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-function:latest \
--role "arn:aws:iam:: 123456789012:role/lambda-role" \
--environment  "Variables": {"S3_BUCKET_NAME": "s3-logs-extension-demo-logextensionsbucket-us-east-1"}

For subsequent extension code changes, I need to update both the extension and function images. If only the function code changes, I need to update the function image. I push the function image as the :latest image to ECR. I then update the function code deployment to use the updated :latest ECR image.

aws lambda update-function-code --function-name log-extension-function --image-uri 123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-function:latest
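
If you automate these updates with the AWS SDK for Python instead of the AWS CLI, the equivalent call is roughly:

import boto3

lambda_client = boto3.client("lambda")

# Point the function at the newly pushed :latest image in ECR.
lambda_client.update_function_code(
    FunctionName="log-extension-function",
    ImageUri="123456789012.dkr.ecr.us-east-1.amazonaws.com/log-extension-function:latest",
)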

Using custom runtimes with container images

With .zip archive functions, custom runtimes are added using Lambda layers. With container images, you no longer need to copy in Lambda layer code for custom runtimes.

You can build your own custom runtime images starting with AWS provided base images for custom runtimes. You can add your preferred runtime, dependencies, and code to these images. To communicate with Lambda, the image must implement the Lambda Runtime API. We provide Lambda runtime interface clients for all supported runtimes, or you can implement your own for additional runtimes.
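
To make the Runtime API concrete, here is a deliberately minimal Python loop against its HTTP endpoints; it is an illustration only, and in practice you would use the AWS-provided runtime interface clients rather than hand-rolling this.

import json
import os
import urllib.request

RUNTIME_API = os.environ["AWS_LAMBDA_RUNTIME_API"]
BASE_URL = f"http://{RUNTIME_API}/2018-06-01/runtime/invocation"

while True:
    # Block until the next invocation event is available.
    with urllib.request.urlopen(f"{BASE_URL}/next") as resp:
        request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.loads(resp.read())

    # Run the handler logic, then post the result back to the Runtime API.
    result = json.dumps({"echo": event}).encode()
    urllib.request.urlopen(
        urllib.request.Request(f"{BASE_URL}/{request_id}/response", data=result, method="POST")
    )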

Running extensions in container images

A Lambda extension running in a function packaged as a container image works in the same way as a .zip archive function. You build a function container image including the extension files, or adding an extension image layer. Lambda looks for any external extensions in the /opt/extensions directory and starts initializing them. Extensions must be executable as binaries or scripts.

Internal extensions modify the Lambda runtime startup behavior using language-specific environment variables, or wrapper scripts. For language-specific environment variables, you can set the following environment variables in your function configuration to augment the runtime command line.

  • JAVA_TOOL_OPTIONS (Java Corretto 8 and 11)
  • NODE_OPTIONS (Node.js 10 and 12)
  • DOTNET_STARTUP_HOOKS (.NET Core 3.1)

An example Lambda environment variable for JAVA_TOOL_OPTIONS:

-javaagent:"/opt/ExampleAgent-0.0.jar"

Wrapper scripts delegate the runtime start-up to a script. The script can inject and alter arguments, set environment variables, or capture metrics, errors, and other diagnostic information. The following runtimes support wrapper scripts: Node.js 10 and 12, Python 3.8, Ruby 2.7, Java 8 and 11, and .NET Core 3.1.

You specify the script by setting the value of the AWS_LAMBDA_EXEC_WRAPPER environment variable as the file system path of an executable binary or script, for example:

/opt/wrapper_script

Conclusion

You can now package and deploy Lambda functions as container images in addition to .zip archives. Lambda functions packaged as container images do not directly support adding Lambda layers to the function configuration as .zip archives do.

In this post, I show a number of solutions to use the functionality of Lambda layers and extensions with container images, including example Dockerfiles.

I show how to migrate an existing Lambda function and extension from a .zip archive to separate function and extension container images. Follow the instructions in the README.md file in the GitHub repository.

For more serverless learning resources, visit https://serverlessland.com.

Creating multi-architecture Docker images to support Graviton2 using AWS CodeBuild and AWS CodePipeline

Post Syndicated from Tyler Lynch original https://aws.amazon.com/blogs/devops/creating-multi-architecture-docker-images-to-support-graviton2-using-aws-codebuild-and-aws-codepipeline/

This post provides a clear path for customers who are evaluating and adopting Graviton2 instance types for performance improvements and cost-optimization.

Graviton2 processors are custom designed by AWS using 64-bit Arm Neoverse N1 cores. They power the T4g*, M6g*, R6g*, and C6g* Amazon Elastic Compute Cloud (Amazon EC2) instance types and offer up to 40% better price performance over the current generation of x86-based instances in a variety of workloads, such as high-performance computing, application servers, media transcoding, in-memory caching, gaming, and more.

More and more customers want to make the move to Graviton2 to take advantage of these performance optimizations while saving money.

During the transition process, a great benefit AWS provides is the ability to perform native builds for each architecture, instead of attempting to cross-compile on homogenous hardware. This has the benefit of decreasing build time as well as reducing complexity and cost to set up.

To see this benefit in action, we look at how to build a CI/CD pipeline using AWS CodePipeline and AWS CodeBuild that can build multi-architecture Docker images in parallel to aid you in evaluating and migrating to Graviton2.

Solution overview

With CodePipeline and CodeBuild, we can automate the creation of architecture-specific Docker images, which can be pushed to Amazon Elastic Container Registry (Amazon ECR). The following diagram illustrates this architecture.

Solution overview architectural diagram

The steps in this process are as follows:

  1. Create a sample Node.js application and associated Dockerfile.
  2. Create the buildspec files that contain the commands that CodeBuild runs.
  3. Create three CodeBuild projects to automate each of the following steps:
    • CodeBuild for x86 – Creates a x86 Docker image and pushes to Amazon ECR.
    • CodeBuild for arm64 – Creates an Arm64 Docker image and pushes to Amazon ECR.
    • CodeBuild for manifest list – Creates a Docker manifest list, annotates the list, and pushes to Amazon ECR.
  4. Automate the orchestration of these projects with CodePipeline.

Prerequisites

The prerequisites for this solution are as follows:

  • The correct AWS Identity and Access Management (IAM) role permissions for your account allowing for the creation of the CodePipeline pipeline, CodeBuild projects, and Amazon ECR repositories
  • An Amazon ECR repository named multi-arch-test
  • A source control service such as AWS CodeCommit or GitHub that CodeBuild and CodePipeline can interact with
  • The source code repository initialized and cloned locally

Creating a sample Node.js application and associated Dockerfile

For this post, we create a sample “Hello World” application that self-reports the processor architecture. We work in the local folder that is cloned from our source repository as specified in the prerequisites.

  1. In your preferred text editor, add a new file with the following Node.js code:

// Hello World sample app.
const http = require('http');

const port = 3000;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end(`Hello World. This processor architecture is ${process.arch}`);
});

server.listen(port, () => {
  console.log(`Server running on processor architecture ${process.arch}`);
});
  2. Save the file in the root of your source repository and name it app.js.
  3. Commit the changes to Git and push the changes to our source repository. See the following code:

git add .
git commit -m "Adding Node.js sample application."
git push

We also need to create a sample Dockerfile that instructs the docker build command how to build the Docker images. We use the default Node.js image tag for version 14.

  1. In a text editor, add a new file with the following code:

# Sample nodejs application
FROM node:14
WORKDIR /usr/src/app
COPY package*.json app.js ./
RUN npm install
EXPOSE 3000
CMD ["node", "app.js"]
  2. Save the file in the root of the source repository and name it Dockerfile. Make sure it is Dockerfile with no extension.
  3. Commit the changes to Git and push the changes to our source repository:

git add .
git commit -m "Adding Dockerfile to host the Node.js sample application."
git push

Creating a build specification file for your application

It’s time to create and add a buildspec file to our source repository. We want to use a single buildspec.yml file for building, tagging, and pushing the Docker images to Amazon ECR for both target native architectures, x86 and Arm64. We use CodeBuild to inject environment variables, some of which need to be changed for each architecture (such as image tag and image architecture).

A buildspec is a collection of build commands and related settings, in YAML format, that CodeBuild uses to run a build. For more information, see Build specification reference for CodeBuild.

The buildspec we add instructs CodeBuild to do the following:

  • install phase – Update the yum package manager
  • pre_build phase – Sign in to Amazon ECR using the IAM role assumed by CodeBuild
  • build phase – Build the Docker image using the Docker CLI and tag the newly created Docker image
  • post_build phase – Push the Docker image to our Amazon ECR repository

We first need to add the buildspec.yml file to our source repository.

  1. In a text editor, add a new file with the following build specification:

version: 0.2
phases:
    install:
        commands:
            - yum update -y
    pre_build:
        commands:
            - echo Logging in to Amazon ECR...
            - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
    build:
        commands:
            - echo Build started on `date`
            - echo Building the Docker image...          
            - docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG .
            - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG      
    post_build:
        commands:
            - echo Build completed on `date`
            - echo Pushing the Docker image...
            - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
  2. Save the file in the root of the repository and name it buildspec.yml.

Because we specify environment variables in the CodeBuild project, we don’t need to hard code any values in the buildspec file.

  3. Commit the changes to Git and push the changes to our source repository:

git add .
git commit -m "Adding CodeBuild buildspec.yml file."
git push

Creating a build specification file for your manifest list creation

Next we create a buildspec file that instructs CodeBuild to create a Docker manifest list, and associate that manifest list with the Docker images that the buildspec file builds.

A manifest list is a list of image layers that is created by specifying one or more (ideally more than one) image names. You can then use it in the same way as an image name in docker pull and docker run commands, for example. For more information, see manifest create.
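
For example, once a manifest list exists in a registry, a client pulls and runs it exactly like a single image, and Docker resolves the member image that matches the host architecture. A minimal sketch, assuming the account ID, Region, and repository name used later in this post (placeholder values):

# Pull by the manifest list tag; Docker selects the image matching the host architecture.
docker pull 111111111111.dkr.ecr.us-east-1.amazonaws.com/multi-arch-test:latest

# Run it the same way you would run any single-architecture image.
docker run --rm 111111111111.dkr.ecr.us-east-1.amazonaws.com/multi-arch-test:latest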

As of this writing, manifest creation is an experimental feature of the Docker command line interface (CLI).

Experimental features provide early access to future product functionality. These features are intended only for testing and feedback because they may change between releases without warning or be removed entirely from a future release. Experimental features must not be used in production environments. For more information, see Experimental features.

When creating the CodeBuild project for manifest list creation, we specify a buildspec file name override as buildspec-manifest.yml. This buildspec instructs CodeBuild to do the following:

  • install phase – Update the yum package manager
  • pre_build phase – Sign in to Amazon ECR using the IAM role assumed by CodeBuild
  • build phase – Perform three actions:
    • Set environment variable to enable Docker experimental features for the CLI
    • Create the Docker manifest list using the Docker CLI
    • Annotate the manifest list to add the architecture-specific Docker image references
  • post_build phase – Push the Docker manifest list to our Amazon ECR repository and use docker manifest inspect to echo out the contents of the manifest list from Amazon ECR

We first need to add the buildspec-manifest.yml file to our source repository.

  1. In a text editor, add a new file with the following build specification:

version: 0.2
# Based on the Docker documentation, must include the DOCKER_CLI_EXPERIMENTAL environment variable
# https://docs.docker.com/engine/reference/commandline/manifest/    

phases:
    install:
        commands:
            - yum update -y
    pre_build:
        commands:
            - echo Logging in to Amazon ECR...
            - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
    build:
        commands:
            - echo Build started on `date`
            - echo Building the Docker manifest...   
            - export DOCKER_CLI_EXPERIMENTAL=enabled       
            - docker manifest create $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-arm64v8 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-amd64    
            - docker manifest annotate --arch arm64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-arm64v8
            - docker manifest annotate --arch amd64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-amd64

    post_build:
        commands:
            - echo Build completed on `date`
            - echo Pushing the Docker image...
            - docker manifest push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME
            - docker manifest inspect $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME
  2. Save the file in the root of the repository and name it buildspec-manifest.yml.
  3. Commit the changes to Git and push the changes to our source repository:

git add .
git commit -m "Adding CodeBuild buildspec-manifest.yml file."
git push

Setting up your CodeBuild projects

Now we have created a single buildspec.yml file for building, tagging, and pushing the Docker images to Amazon ECR for both target native architectures: x86 and Arm64. This file is shared by two of the three CodeBuild projects that we create. We use CodeBuild to inject environment variables, some of which need to be changed for each architecture (such as image tag and image architecture). We also want to use the single Docker file, regardless of the architecture. We also need to ensure any third-party libraries are present and compiled correctly for the target architecture.

For more information about third-party libraries and software versions that have been optimized for Arm, see the Getting started with AWS Graviton GitHub repo.

We use the same environment variable names for the CodeBuild projects, but each project has specific values, as detailed in the following table. You need to modify these values to your numeric AWS account ID, the AWS Region where your Amazon ECR registry endpoint is located, and your Amazon ECR repository name. The instructions for adding the environment variables in the CodeBuild projects are in the following sections.

Environment Variable   x86 Project value   Arm64 Project value   manifest Project value
AWS_DEFAULT_REGION     us-east-1           us-east-1             us-east-1
AWS_ACCOUNT_ID         111111111111        111111111111          111111111111
IMAGE_REPO_NAME        multi-arch-test     multi-arch-test       multi-arch-test
IMAGE_TAG              latest-amd64        latest-arm64v8        latest

The image we use in this post uses architecture-specific tags with the term latest. This is for demonstration purposes only; it’s best to tag the images with an explicit version or another meaningful reference.
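
If you haven’t already created the Amazon ECR repository referenced by IMAGE_REPO_NAME, you can do so from the AWS CLI before continuing. A quick sketch, assuming the repository name and Region from the preceding table (adjust to your own values):

# Look up the numeric account ID to use for AWS_ACCOUNT_ID.
aws sts get-caller-identity --query Account --output text

# Create the repository that the buildspec pushes to.
aws ecr create-repository --repository-name multi-arch-test --region us-east-1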

CodeBuild for x86

We start with creating a new CodeBuild project for x86 on the CodeBuild console.

CodeBuild looks for a file named buildspec.yml by default, unless overridden. For these first two CodeBuild projects, we rely on that default and don’t specify the buildspec name.

  1. On the CodeBuild console, choose Create build project.
  2. For Project name, enter a unique project name for your build project, such as node-x86.
  3. To add tags, add them under Additional Configuration.
  4. Choose a Source provider (for this post, we choose GitHub).
  5. For Environment image, choose Managed image.
  6. Select Amazon Linux 2.
  7. For Runtime(s), choose Standard.
  8. For Image, choose aws/codebuild/amazonlinux2-x86_64-standard:3.0.

This is an x86 build image.

  9. Select Privileged.
  10. For Service role, choose New service role.
  11. Enter a name for the new role (one is created for you), such as CodeBuildServiceRole-nodeproject.

We reuse this same service role for the other CodeBuild projects associated with this project.

  12. Expand Additional configurations and move to the Environment variables section.
  13. Create the following Environment variables:

Name                 Value            Type
AWS_DEFAULT_REGION   us-east-1        Plaintext
AWS_ACCOUNT_ID       111111111111     Plaintext
IMAGE_REPO_NAME      multi-arch-test  Plaintext
IMAGE_TAG            latest-amd64     Plaintext

  14. Choose Create build project.

Attaching the IAM policy

Now that we have created the CodeBuild project, we need to adjust the new service role that was just created and attach an IAM policy so that it can interact with the Amazon ECR API.

  1. On the CodeBuild console, choose the node-x86 project.
  2. Choose the Build details tab.
  3. Under Service role, choose the link that looks like arn:aws:iam::111111111111:role/service-role/CodeBuildServiceRole-nodeproject.

A new browser tab should open.

  4. Choose Attach policies.
  5. In the Search field, enter AmazonEC2ContainerRegistryPowerUser.
  6. Select AmazonEC2ContainerRegistryPowerUser.
  7. Choose Attach policy.
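
If you prefer the AWS CLI, the same policy attachment can be scripted. A sketch, assuming the service role name created above:

# Attach the managed ECR power-user policy to the CodeBuild service role.
aws iam attach-role-policy \
    --role-name CodeBuildServiceRole-nodeproject \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser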

CodeBuild for arm64

Now we move on to creating a new (second) CodeBuild project for Arm64.

  1. On the CodeBuild console, choose Create build project.
  2. For Project name, enter a unique project name, such as node-arm64.
  3. If you want to add tags, add them under Additional Configuration.
  4. Choose a Source provider (for this post, choose GitHub).
  5. For Environment image, choose Managed image.
  6. Select Amazon Linux 2.
  7. For Runtime(s), choose Standard.
  8. For Image, choose aws/codebuild/amazonlinux2-aarch64-standard:2.0.

This is an Arm build image and is different from the image selected in the previous CodeBuild project.

  9. Select Privileged.
  10. For Service role, choose Existing service role.
  11. Choose CodeBuildServiceRole-nodeproject.
  12. Select Allow AWS CodeBuild to modify this service role so it can be used with this build project.
  13. Expand Additional configurations and move to the Environment variables section.
  14. Create the following Environment variables:

Name                 Value            Type
AWS_DEFAULT_REGION   us-east-1        Plaintext
AWS_ACCOUNT_ID       111111111111     Plaintext
IMAGE_REPO_NAME      multi-arch-test  Plaintext
IMAGE_TAG            latest-arm64v8   Plaintext

  15. Choose Create build project.

CodeBuild for manifest list

For the last CodeBuild project, we create a Docker manifest list, associate it with the Docker images that the preceding projects create, and push the manifest list to Amazon ECR. This project uses the buildspec-manifest.yml file created earlier.

  1. On the CodeBuild console, choose Create build project.
  2. For Project name, enter a unique project name for your build project, such as node-manifest.
  3. If you want to add tags, add them under Additional Configuration.
  4. Choose a Source provider (for this post, choose GitHub).
  5. For Environment image, choose Managed image.
  6. Select Amazon Linux 2.
  7. For Runtime(s), choose Standard.
  8. For Image, choose aws/codebuild/amazonlinux2-x86_64-standard:3.0.

This is an x86 build image.

  9. Select Privileged.
  10. For Service role, choose Existing service role.
  11. Choose CodeBuildServiceRole-nodeproject.
  12. Select Allow AWS CodeBuild to modify this service role so it can be used with this build project.
  13. Expand Additional configurations and move to the Environment variables section.
  14. Create the following Environment variables:

Name                 Value            Type
AWS_DEFAULT_REGION   us-east-1        Plaintext
AWS_ACCOUNT_ID       111111111111     Plaintext
IMAGE_REPO_NAME      multi-arch-test  Plaintext
IMAGE_TAG            latest           Plaintext

  15. For Buildspec name – optional, enter buildspec-manifest.yml to override the default.
  16. Choose Create build project.

Setting up CodePipeline

Now we can move on to creating a pipeline to orchestrate the builds and manifest creation.

  1. On the CodePipeline console, choose Create pipeline.
  2. For Pipeline name, enter a unique name for your pipeline, such as node-multi-architecture.
  3. For Service role, choose New service role.
  4. Enter a name for the new role (one is created for you). For this post, we use the generated role name CodePipelineServiceRole-nodeproject.
  5. Select Allow AWS CodePipeline to create a service role so it can be used with this new pipeline.
  6. Choose Next.
  7. Choose a Source provider (for this post, choose GitHub).
  8. If you don’t have any existing Connections to GitHub, select Connect to GitHub and follow the wizard.
  9. Choose your Branch name (for this post, I choose main, but your branch might be different).
  10. For Output artifact format, choose CodePipeline default.
  11. Choose Next.

You should now be on the Add build stage page.

  1. For Build provider, choose AWS CodeBuild.
  2. Verify the Region is your Region of choice (for this post, I use US East (N. Virginia)).
  3. For Project name, choose node-x86.
  4. For Build type, select Single build.
  5. Choose Next.

You should now be on the Add deploy stage page.

  1. Choose Skip deploy stage.

A pop-up appears that reads Your pipeline will not include a deployment stage. Are you sure you want to skip this stage?

  1. Choose Skip.
  2. Choose Create pipeline.

CodePipeline immediately attempts to run a build. You can let it continue without worry if it fails. We are only part of the way done with the setup.

Adding an additional build step

We need to add the additional build step for the Arm CodeBuild project in the Build stage.

  1. On the CodePipeline console, choose the node-multi-architecture pipeline.
  2. Choose Edit to start editing the pipeline stages.

You should now be on the Editing: node-multi-architecture page.

  1. For the Build stage, choose Edit stage.
  2. Choose + Add action.

Editing node-multi-architecture

  1. For Action name, enter Build-arm64.
  2. For Action provider, choose AWS CodeBuild.
  3. Verify your Region is correct.
  4. For Input artifacts, select SourceArtifact.
  5. For Project name, choose node-arm64.
  6. For Build type, select Single build.
  7. Choose Done.
  8. Choose Save.

A pop-up appears that reads Saving your changes cannot be undone. If the pipeline is running when you save your changes, that execution will not complete.

  1. Choose Save.

Updating the first build action name

This step is optional. The CodePipeline wizard doesn’t allow you to enter your Build action name during creation, but you can update the Build stage’s first build action to have consistent naming.

  1. Choose Edit to start editing the pipeline stages.
  2. Choose the Edit icon.
  3. For Action name, enter Build-x86.
  4. Choose Done.
  5. Choose Save.

A pop-up appears that says Saving your changes cannot be undone. If the pipeline is running when you save your changes, that execution will not complete.

  1. Choose Save.

Adding the project

Now we add the CodeBuild project for manifest creation and publishing.

  1. On the CodePipeline console, choose node-multi-architecture pipeline.
  2. Choose Edit to start editing the pipeline stages.
  3. Choose +Add stage below the Build stage.
  4. Set the Stage name to Manifest.
  5. Choose +Add action group.
  6. For Action name, enter Create-manifest.
  7. For Action provider, choose AWS CodeBuild.
  8. Verify your Region is correct.
  9. For Input artifacts, select SourceArtifact.
  10. For Project name, choose node-manifest.
  11. For Build type, select Single build.
  12. Choose Done.
  13. Choose Save.

A pop-up appears that reads Saving your changes cannot be undone. If the pipeline is running when you save your changes, that execution will not complete.

  1. Choose Save.

Testing the pipeline

Now let’s verify everything works as planned.

  1. In the pipeline details page, choose Release change.

This runs the pipeline in stages. The process should take a few minutes to complete. The pipeline should show each stage as Succeeded.

Pipeline visualization

Now we want to inspect the output of the Create-manifest action that runs the CodeBuild project for manifest creation.

  1. Choose Details in the Create-manifest action.

This opens the CodeBuild build details.

  2. Under Build logs, we should see the output from the manifest inspect command we ran as the last step in the buildspec-manifest.yml file. See the following sample log:

[Container] 2020/10/07 16:47:39 Running command docker manifest inspect $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1369,
         "digest": "sha256:238c2762212ff5d7e0b5474f23d500f2f1a9c851cdd3e7ef0f662efac508cd04",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1369,
         "digest": "sha256:0cc9e96921d5565bdf13274e0f356a139a31d10e95de9ad3d5774a31b8871b05",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      }
   ]
}
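
To confirm that the manifest list resolves to the right image at run time, you can pull and run it on any Docker host with access to the repository. A quick check, assuming the account ID and Region used in this post (substitute your own):

# Authenticate to Amazon ECR and pull by the manifest list tag.
$(aws ecr get-login --no-include-email --region us-east-1)
docker pull 111111111111.dkr.ecr.us-east-1.amazonaws.com/multi-arch-test:latest

# Print the architecture the resolved image was built for.
docker run --rm 111111111111.dkr.ecr.us-east-1.amazonaws.com/multi-arch-test:latest node -e "console.log(process.arch)"
# Prints x64 on an x86 host and arm64 on a Graviton (Arm) host.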

Cleaning up

To avoid incurring future charges, clean up the resources created as part of this post.

  1. On the CodePipeline console, choose the pipeline node-multi-architecture.
  2. Choose Delete pipeline.
  3. When prompted, enter delete.
  4. Choose Delete.
  5. On the CodeBuild console, choose the Build project node-x86.
  6. Choose Delete build project.
  7. When prompted, enter delete.
  8. Choose Delete.
  9. Repeat the deletion process for Build projects node-arm64 and node-manifest.

Next we delete the Docker images we created and pushed to Amazon ECR. Be careful to not delete a repository that is being used for other images.

  1. On the Amazon ECR console, choose the repository multi-arch-test.

You should see a list of Docker images.

  1. Select latest, latest-arm64v8, and latest-amd64.
  2. Choose Delete.
  3. When prompted, enter delete.
  4. Choose Delete.
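
Alternatively, the images can be removed with the AWS CLI. A sketch, assuming the repository and tags used in this post:

# Delete the architecture-specific images and the manifest list by tag.
aws ecr batch-delete-image \
    --repository-name multi-arch-test \
    --image-ids imageTag=latest imageTag=latest-arm64v8 imageTag=latest-amd64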

Finally, we remove the IAM roles that we created.

  1. On the IAM console, choose Roles.
  2. In the search box, enter CodePipelineServiceRole-nodeproject.
  3. Select the role and choose Delete role.
  4. When prompted, choose Yes, delete.
  5. Repeat these steps for the role CodeBuildServiceRole-nodeproject.

Conclusion

To summarize, we successfully created a pipeline to create multi-architecture Docker images for both x86 and arm64. We referenced them via annotation in a Docker manifest list and stored them in Amazon ECR. The Docker images were based on a single Docker file that uses environment variables as parameters to allow for Docker file reuse.

About the Authors

Tyler Lynch
Tyler Lynch is a Sr. Solutions Architect focusing on EdTech at AWS.

Alistair McLean
Alistair is a Principal Solutions Architect focused on State and Local Government and K12 customers at AWS.

Reducing Docker image build time on AWS CodeBuild using an external cache

Post Syndicated from Camillo Anania original https://aws.amazon.com/blogs/devops/reducing-docker-image-build-time-on-aws-codebuild-using-an-external-cache/

With the proliferation of containerized solutions to simplify creating, deploying, and running applications, coupled with the use of automation CI/CD pipelines that continuously rebuild, test, and deploy such applications when new changes are committed, it’s important that your CI/CD pipelines run as quickly as possible, enabling you to get early feedback and allowing for faster releases.

AWS CodeBuild supports local caching, which makes it possible to persist intermediate build artifacts, like a Docker layer cache, locally on the build host and reuse them in subsequent runs. The CodeBuild local cache is maintained on the host at best effort, so it’s possible several of your build runs don’t hit the cache as frequently as you would like.

A typical Docker image is built from several intermediate layers that are constructed during the initial image build process on a host. These intermediate layers are reused if found valid in any subsequent image rebuild; doing so speeds up the build process considerably because the Docker engine doesn’t need to rebuild the whole image if the layers in the cache are still valid.

This post shows how to implement a simple, effective, and durable external Docker layer cache for CodeBuild to significantly reduce image build runtime.

Solution overview

The following diagram illustrates the high-level architecture of this solution. We describe implementing each stage in more detail in the following paragraphs.

CodeBuildExternalCacheDiagram

In a modern software engineering approach built around CI/CD practices, whenever specific events happen, such as an application code change is merged, you need to rebuild, test, and eventually deploy the application. Assuming the application is containerized with Docker, the build process entails rebuilding one or multiple Docker images. The environment for this rebuild is on CodeBuild, which is a fully managed build service in the cloud. CodeBuild spins up a new environment to accommodate build requests and runs a sequence of actions defined in its build specification.

Because each CodeBuild instance is an independent environment, build artifacts can’t be persisted in the host indefinitely. The native CodeBuild local caching feature allows you to persist a cache for a limited time so that immediate subsequent builds can benefit from it. Native local caching is performed at best effort and can’t be relied on when multiple builds are triggered at different times. This solution describes using an external persistent cache that you can reuse across builds and is valid at any time.

After the first build of a Docker image is complete, the image is tagged and pushed to Amazon Elastic Container Registry (Amazon ECR). In each subsequent build, the image is pulled from Amazon ECR and the Docker build process is forced to use it as cache for its next build iteration of the image. Finally, the newly produced image is pushed back to Amazon ECR.
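
Stripped down to its essentials, that cache cycle amounts to three Docker commands inside the build. A sketch, with CACHE_IMAGE standing in for your Amazon ECR image URI (a placeholder, not a variable defined elsewhere in this post):

# Pull the previous image if it exists; don't fail the build on the very first run.
docker pull "$CACHE_IMAGE" || true

# Build, telling Docker it may reuse layers from the pulled image.
docker build --cache-from "$CACHE_IMAGE" -t "$CACHE_IMAGE" .

# Push the refreshed image so the next build can use it as cache.
docker push "$CACHE_IMAGE"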

In the following paragraphs, we explain the solution and walk you through an example implementation. The solution rebuilds the publicly available Amazon Linux 2 Standard 3.0 image, which is an optimized image that you can use with CodeBuild.

Creating a policy and service role

The first step is to create an AWS Identity and Access Management (IAM) policy and service role for CodeBuild with the minimum set of permissions to perform the job.

  1. On the IAM console, choose Policies.
  2. Choose Create policy.
  3. Provide the following policy in JSON format:
    CodeBuild Docker Cache Policy:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ecr:GetAuthorizationToken",
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:GetRepositoryPolicy",
                    "ecr:DescribeRepositories",
                    "ecr:ListImages",
                    "ecr:DescribeImages",
                    "ecr:BatchGetImage",
                    "ecr:ListTagsForResource",
                    "ecr:DescribeImageScanFindings",
                    "ecr:InitiateLayerUpload",
                    "ecr:UploadLayerPart",
                    "ecr:CompleteLayerUpload",
                    "ecr:PutImage"
                ],
                "Resource": "*"
            }
        ]
    }
  4. In the Review policy section, enter a name (for example, CodeBuildDockerCachePolicy).
  5. Choose Create policy.
  6. Choose Roles on the navigation pane.
  7. Choose Create role.
  8. Keep AWS service as the type of role and choose CodeBuild from the list of services.
  9. Choose Next.
  10. Search for and add the policy you created.
  11. Review the role and enter a name (for example, CodeBuildDockerCacheRole).
  12. Choose Create role.

Creating an Amazon ECR repository

In this step, we create an Amazon ECR repository to store the built Docker images.

  1. On the Amazon ECR console, choose Create repository.
  2. Enter a name (for example, amazon_linux_codebuild_image).
  3. Choose Create repository.

Configuring a CodeBuild project

You now configure the CodeBuild project that builds the Docker image and configures its cache to speed up the process.

  1. On the CodeBuild console, choose Create build project.
  2. Enter a name (for example, SampleDockerCacheProject).
  3. For Source provider, choose GitHub.
  4. For Repository, select Public repository.
  5. For Repository URL, enter https://github.com/aws/aws-codebuild-docker-images.
    CodeBuildGitHubSourceConfiguration
  6. In the Environment section, for Environment image, select Managed image.
  7. For Operating system, choose Amazon Linux 2.
  8. For Runtime(s), choose Standard.
  9. For Image, enter aws/codebuild/amazonlinux2-x86_64-standard:3.0.
  10. For Image version, choose Always use the latest image for this runtime version.
  11. For Environment type, choose Linux.
  12. For Privileged, select Enable this flag if you want to build Docker images or want your builds to get elevated privileges.
  13. For Service role, select Existing service role.
  14. For Role ARN, enter the ARN for the service role you created (CodeBuildDockerCacheRole).
  15. Select Allow AWS CodeBuild to modify this service role so it can be used with this build project.
    CodeBuildEnvironmentConfiguration
  16. In the Buildspec section, select Insert build commands.
  17. Choose Switch to editor.
  18. Enter the following build specification (substitute account-ID and region).
    version: 0.2

    env:
      variables:
        CONTAINER_REPOSITORY_URL: account-ID.dkr.ecr.region.amazonaws.com/amazon_linux_codebuild_image
        TAG_NAME: latest

    phases:
      install:
        runtime-versions:
          docker: 19
      pre_build:
        commands:
          - $(aws ecr get-login --no-include-email)
          - docker pull $CONTAINER_REPOSITORY_URL:$TAG_NAME || true
      build:
        commands:
          - cd ./al2/x86_64/standard/1.0
          - docker build --cache-from $CONTAINER_REPOSITORY_URL:$TAG_NAME --tag $CONTAINER_REPOSITORY_URL:$TAG_NAME .
      post_build:
        commands:
          - docker push $CONTAINER_REPOSITORY_URL
  19. Choose Create the project.

The provided build specification instructs CodeBuild to do the following:

  • Use the Docker 19 runtime to run the build. The following process doesn’t work reliably with Docker versions lower than 19.
  • Authenticate with Amazon ECR and pull the image you want to rebuild if it exists (on the first run, this image doesn’t exist).
  • Run the image rebuild, forcing Docker to consider as cache the image pulled at the previous step using the --cache-from parameter.
  • When the image rebuild is complete, push it to Amazon ECR.

Testing the solution

The solution is fully configured, so we can proceed to evaluate its behavior.

For the first run, we record a runtime of approximately 39 minutes. The build doesn’t use any cache and the docker pull in the pre-build stage fails to find the image we indicate, as expected (the || true statement at the end of the command line guarantees that the CodeBuild instance doesn’t stop because the docker pull failed).

The second run pulls the previously built image before starting the rebuild and completes in approximately 6 minutes, most of which is spent downloading the image from Amazon ECR (which is almost 5 GB).

We trigger another run after simulating a change halfway through the Dockerfile (addition of an echo command to the statement at line 291 of the Dockerfile). Docker still reuses the layers in the cache until the point of the changed statement and then rebuilds from scratch the remaining layers described in the Dockerfile. The runtime was approximately 31 minutes; the overhead of downloading the whole image first partially offsets the advantages of using it as cache.

It’s relevant to note the image size in this use case is considerably large; on average, projects deal with smaller images that introduce less overhead. Furthermore, the previous run had the built-in CodeBuild feature to cache Docker layers at best effort disabled; enabling it provides further efficiency because the docker pull specified in the pre-build stage doesn’t have to download the image if the one available locally matches the one on Amazon ECR.
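
If you want to combine this approach with the native local layer cache, the project cache setting can be turned on from the console or the AWS CLI. A sketch, assuming the project name used above:

# Enable best-effort local Docker layer caching on the CodeBuild project.
aws codebuild update-project \
    --name SampleDockerCacheProject \
    --cache '{"type": "LOCAL", "modes": ["LOCAL_DOCKER_LAYER_CACHE"]}'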

Cleaning up

When you’re finished testing, you should un-provision the following resources to avoid incurring further charges and keep the account clean from unused resources:

  • The amazon_linux_codebuild_image Amazon ECR repository and its images;
  • The SampleDockerCacheProject CodeBuild project;
  • The CodeBuildDockerCachePolicy policy and the CodeBuildDockerCacheRole role.

Conclusion

In this post, we reviewed a simple and effective solution to implement a durable external cache for Docker on CodeBuild. The solution provides significant improvements in the execution time of the Docker build process on CodeBuild and is general enough to accommodate the majority of use cases, including multi-stage builds.

The approach works in synergy with the built-in CodeBuild feature of caching Docker layers at best effort, and we recommend using it for further improvements. Shorter build processes translate to lower compute costs, and overall determine a shorter development lifecycle for features released faster and at a lower cost.

About the Authors

Camillo Anania is a Global DevOps Consultant with AWS Professional Services, London, UK.

James Jacob is a Global DevOps Consultant with AWS Professional Services, London, UK.

Deploying an ASP.NET Core web application to Amazon ECS using an Azure DevOps pipeline

Post Syndicated from John Formento original https://aws.amazon.com/blogs/devops/deploying-a-asp-net-core-web-application-to-amazon-ecs-using-an-azure-devops-pipeline/

For .NET developers, leveraging Team Foundation Server (TFS) has been the cornerstone for CI/CD over the years. As more and more .NET developers start to deploy onto AWS, they have been asking questions about using the same tools to deploy to the AWS cloud. By configuring a pipeline in Azure DevOps to deploy to the AWS cloud, you can easily use familiar Microsoft development tools to build great applications.

Solution overview

This blog post demonstrates how to create a simple Azure DevOps project, repository, and pipeline to deploy an ASP.NET Core web application to Amazon ECS using Azure DevOps. The following screenshot shows a high-level architecture diagram of the pipeline:

 

Solution Architecture Diagram

In this example, you perform the following steps:

  1. Create an Azure DevOps Project, clone project repo, and push ASP.NET Core web application.
  2. Create a pipeline in Azure DevOps.
  3. Build an Amazon ECS Cluster, Task and Service.
  4. Kick off deployment of the ASP.NET Core web application using the newly created Azure DevOps pipeline.

 

Prerequisites

Ensure you have the following prerequisites set up:

  • An Amazon ECR repository
  • An IAM user with permissions for Amazon ECR and Amazon ECS (the user will need an access key and secret access key)

 

Create an Azure DevOps Project, clone project repo, and push ASP.NET Core web application

Follow these steps to deploy a .NET Core app onto your Amazon ECS cluster using the Azure DevOps (ADO) repository and pipeline:

 

  1. Login to dev.azure.com and navigate to the marketplace.
  2. Go to Visual Studio, search for “AWS”, and add the AWS Tools for Microsoft Visual Studio Team Services.
  3. Create a project in ADO: Provide a project name and choose Create.
  4. On the Project Summary page, choose Project Settings.
  5. In the Project Settings pane, navigate to the Service Connections page.
  6. Choose Create service connection, select AWS, and choose Next.
  7. Input an Access Key ID and Secret Access Key. (You’ll need an IAM user with permissions for Amazon ECR and Amazon ECS in order to deploy via the Azure DevOps pipeline.) Choose Save.
  8. Choose Repos in the left pane, then Clone in Visual Studio under Clone to your computer.
  9. Create a ASP.NET Core web application in Visual Studio, set the location to locally cloned repository, and check Enable Docker support.
  10. Once you’ve created the new project, perform an initial commit and push to the repository in Azure DevOps.

 

Creating a pipeline in Azure DevOps

Now that you have synced the repository, create a pipeline in Azure DevOps.

  1. Go to the pipeline page within Azure DevOps and choose Create Pipeline.
  2. Choose Use the classic editor.
  3. Select Azure Repos Git for the location of your code and select the repository you created earlier.
  4. On the Choose a Template page, select Docker Container and choose Apply.
  5. Remove the Push an image step.
  6. Add an Amazon ECR Push task by choosing the + symbol next to Agent job 1. You can search for “AWS” in the Add tasks pane to filter for all AWS tasks.

 

Now, configure each task:

  1. Choose the Build an image task and ensure that the action is set to Build an image. Additionally, you can modify the Image Name to your standards.
  2. Choose the Push Image task and provide the following
    • Enter a name under Display Name.
    • Select the AWS Credentials that you created in Service Connections.
    • Select the AWS Region.
    • Provide the source image name, which you can find in the setting for the Build an image task.
    • Enter the name of the repository in Amazon ECR to which the image is pushed.
  3. Choose Save and queue.

Build Amazon ECS Cluster, Task, and Service

The goal here is to test up to building the Docker image and ensure it’s pushed to Amazon ECR. Once the Docker image is in Amazon ECR, you can create the Amazon ECS cluster, task definition, and service leveraging the newly created Docker image.

  1. Create an Amazon ECS cluster.
  2. Create an Amazon ECS task definition. When you create the task definition and configure the container, use the Amazon ECR URI for the Docker image that was just pushed to Amazon ECR.
  3. Create an Amazon ECS service.

Go back and edit the pipeline:

  1. Add the last step by choosing the + symbol next to Agent job 1.
  2. Search for “AWS CLI” in the search bar and add the task.
  3. Choose AWS CLI and configure the task.
  4. Enter a name under Display Name, such as Update ECS Service.
  5. Select the AWS Credentials that you created in Service Connections.
  6. Select the AWS Region.
  7. Input the following command, which updates the Amazon ECS service after a new image is pushed to Amazon ECR. Replace <clustername> and <servicename> with your Amazon ECS cluster and service names.
    • Command: ecs
    • Subcommand: update-service
    • Options and parameters: --cluster <clustername> --service <servicename> --force-new-deployment
  8. Now choose the Triggers tab and select Enable continuous integration with the repository you created.
  9. Choose Save and queue.

 

At this point, your build pipeline kicks off and builds a Docker image from the source code in the repository you created, pushes the image to Amazon ECR, and updates the Amazon ECS service with the new image.

You can verify by viewing the build: choose Pipelines in Azure DevOps, select the entry for the latest run, and then choose the icon under the status column. Once it successfully completes, you can log in to the AWS console and view the updated image in Amazon ECR and the updated service in Amazon ECS.

Every time you commit and push your code through Visual Studio, this pipeline kicks off and builds and deploys your application to Amazon ECS.
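
You can also confirm the rollout from the AWS CLI instead of the console. A sketch, with placeholder cluster, service, and repository names:

# Check that the forced deployment has replaced the running tasks.
aws ecs describe-services --cluster <clustername> --services <servicename> \
    --query 'services[0].deployments'

# Confirm the most recently pushed image tags in Amazon ECR.
aws ecr describe-images --repository-name <repositoryname> \
    --query 'sort_by(imageDetails,&imagePushedAt)[-1].imageTags'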

Cleanup

At the end of this example, once you’ve completed all steps and are finished testing, follow these steps to disable or delete resources to avoid incurring costs:

  1. Go to the Amazon ECS console within the AWS Console.
  2. Navigate to the cluster you created, then choose the Tasks tab.
  3. Choose Stop all to turn off the tasks.

Conclusion

This blog post reviewed how to create a CI/CD pipeline in Azure DevOps to deploy a Docker Image to Amazon ECR and container to Amazon ECS. It provided detailed steps on how to set up a basic CI/CD pipeline, leveraging tools with which .NET developers are familiar and the steps needed to integrate with Amazon ECR and Amazon ECS.

I hope this post was informative and has helped you learn the basics of how to integrate Amazon ECR and Amazon ECS with Azure DevOps to create a robust CI/CD pipeline.

About the Authors

John Formento

 

 

John Formento is a Solution Architect at Amazon Web Services. He helps large enterprises achieve their goals by architecting secure and scalable solutions on the AWS Cloud.

New – AWS IoT Greengrass Adds Container Support and Management of Data Streams at the Edge

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-aws-iot-greengrass-adds-docker-support-and-streams-management-at-the-edge/

AWS IoT Greengrass extends cloud capabilities to edge devices, so that they can respond to local events in near real-time, even with intermittent connectivity.

Today, we are adding two features that make it easier to build IoT solutions:

  • Container support to deploy applications using the Greengrass Docker application deployment connector.
  • Collect, process, and export data streams from edge devices and manage the lifecycle of that data with the Stream Manager for AWS IoT Greengrass.

Let’s see how these new features work and how to use them.

Deploying a Container-Based Application to a Greengrass Core Device
You can now run AWS Lambda functions and container-based applications in your AWS IoT Greengrass core device. In this way it is easier to migrate applications from on-premises, or build new applications that include dependencies such as libraries, other binaries, and configuration files, using container images. This provides a consistent deployment environment for your applications that enables portability across development environments and edge locations. You can easily deploy legacy and third-party applications by packaging the code or executables into the container images.

To use this feature, I describe my container-based application using a Docker Compose file. I can reference container images in public or private repositories, such as Amazon Elastic Container Registry (ECR) or Docker Hub. To start, I create a simple web app using Python and Flask that counts the number of times it has been viewed.

from flask import Flask

app = Flask(__name__)

counter = 0

@app.route('/')
def hello():
    global counter
    counter += 1
    return 'Hello World! I have been seen {} times.\n'.format(counter)

My requirements.txt file contains a single dependency, flask.

I build the container image using this Dockerfile and push it to ECR.

FROM python:3.7-alpine
WORKDIR /code
ENV FLASK_APP app.py
ENV FLASK_RUN_HOST 0.0.0.0
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
CMD ["flask", "run"]

Here is the docker-compose.yml file referencing the container image in my ECR repository. Docker Compose files can describe applications using multiple containers, but for this example I am using just one.

version: '3'
services:
  web:
    image: "123412341234.dkr.ecr.us-east-1.amazonaws.com/hello-world-counter:latest"
    ports:
      - "80:5000"

I upload the docker-compose.yml file to an Amazon Simple Storage Service (S3) bucket.
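
A sketch of that upload with the AWS CLI, using a placeholder bucket name:

# Copy the Compose file to S3 so the Greengrass connector can fetch it.
aws s3 cp docker-compose.yml s3://my-greengrass-deployments/docker-compose.yml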

Now I create an AWS IoT Greengrass group using an Amazon Elastic Compute Cloud (EC2) instance as core device. Usually your core device is outside of the AWS cloud, but using an EC2 instance can be a good way to set up and automate a dev & test environment for your deployments at the edge.

When the group is ready, I run an “empty” deployment, just to check that everything is working as expected. After a few seconds, my first deployment has completed and I start adding a connector.

In the connector section of the AWS IoT Greengrass group, I select Add a connector and search for “Docker”. I select Docker Application Deployment and hit Next.

Now I configure the parameters for the connector. I select my docker-compose.yml file on S3. The AWS Identity and Access Management (IAM) role used by the AWS IoT Greengrass group needs permissions to get the file from S3 and to get the authorization token and download the image from ECR. If you use a private repository such as Docker Hub, you can leverage the integration with the AWS Secret Manager to make it easy for your connectors and Lambda functions to use local secrets to interact with services and applications.

I deploy my changes, similarly to what I did before. This time, the new container-based application is installed and started on the AWS IoT Greengrass core device.

To test the web app that I deployed, I open access to the HTTP port on the Security Group of the EC2 instance I am using as core device. When I connect with my browser, I see the Flask app starting to count the visits. My container-based application is running on the AWS IoT Greengrass core device!
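
For example, from a machine that can reach the instance, the check might look like the following (the DNS name is a placeholder):

# Each request increments the counter kept by the Flask app.
curl http://ec2-203-0-113-10.compute-1.amazonaws.com/
# Hello World! I have been seen 1 times.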

You can deploy much more complex applications than what I did in this example. Let’s see that as we go through the other feature released today.

Using the Stream Manager for AWS IoT Greengrass
For common use cases like video processing, image recognition, or high-volume data collection from sensors at the edge, you often need to build your own data stream management capabilities. The new Stream Manager simplifies this process by adding a standardized mechanism to the Greengrass Core SDK that you can use to process data streams from IoT devices, manage local data retention policies based on cache size or data age, and automatically transmit data directly into AWS cloud services such as Amazon Kinesis and AWS IoT Analytics.

The Stream Manager also handles disconnected or intermittent connectivity scenarios by adding configurable prioritization, caching policies, bandwidth utilization, and time-outs on a per-stream basis. In situations where connectivity is unpredictable or bandwidth is constrained, this new functionality enables you to define the behavior of your applications’ data management while disconnected, reconnecting, or connected, allowing you to prioritize important data’s path to the cloud and make efficient use of a connection when it is available. Using this feature, you can focus on your specific application use cases rather than building data retention and connection management functionality.

Let’s see now how the Stream Manager works with a practical use case. For example, my AWS IoT Greengrass core device is receiving lots of data from multiple devices. I want to do two things with the data I am collecting:

  • Upload all raw data with low priority to AWS IoT Analytics, where I use Amazon QuickSight to visualize and understand my data.
  • Aggregate data locally based on time and location of the devices, and send the aggregated data with high priority to a Kinesis Data Stream that is processed by a business application for predictive maintenance.

Using the Stream Manager in the Greengrass Core SDK, I create two local data streams:

  • The first local data stream has a configured low-priority export to IoT Analytics and can use up to 256MB of local disk (yes, it’s a constrained device). You can use memory to store the local data stream if you prefer speed to resilience. When local space is filled up, for example because I lost connectivity to the cloud and I continue to cache locally, I can choose to either reject new data or overwrite the oldest data.
  • The second local data stream is exporting data with high priority to a Kinesis Data Stream and can use up to 128MB of local disk (it’s aggregated data, I need less space for the same amount of time).

 

Here’s how the data flows in this architecture:

  • Sensor data is collected by a Producer Lambda function that is writing to the first local data stream.
  • A second Aggregator Lambda function is reading from the first local data stream, performing the aggregation, and writing its output to the second local data stream.
  • A Reader container-based app (deployed using the Docker application deployment connector) is rendering the aggregated data in real-time for a display panel.
  • The Stream Manager takes care of the ingestion to the cloud, based on the configuration and the policies of the local data streams, so that developers can focus their efforts on the logic on the device.

The use of Lambda functions or container-based apps in the previous architecture is just an example. You can mix and match, or standardize to one or the other, depending on your development best practices.

Available Now
The Docker application deployment connector and the Stream Manager are available with Greengrass version 1.10. The Stream Manager is available in the Greengrass Core SDK for Java and Python. We are adding support for other platforms based on customer feedback.

These new features are independent from each other, but can be used together as in my example. They can simplify the way you build and deploy applications on edge devices, making it easier to process data locally and be integrated with streaming and analytics services in the backend. Let me know what you are going to use these features for!

Danilo

EventBridge Support in Amazon Elastic Container Registry

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/eventbridge-support-in-amazon-elastic-container-registry/

Many of our customers require a secure and private place to store their container images, and that’s why they use our fully managed container registry Amazon Elastic Container Registry. We recently added support for Amazon EventBridge so that you can trigger actions when images are pushed or deleted. These actions can trigger a continuous integration, continuous deployment pipeline when an image is pushed or post a message to your DevOps team Slack channel when an image has been deleted.

This new capability can even enable complicated workflows, for example, customers can use the image push event on a base image to trigger a rebuild of images built on top of that base. In this scenario, a base image might be rebuilt weekly to pick up the latest security patches. A push event from the base image repository can trigger other builds, so that all derivative images are patched, too.

To show you how to go about using this new capability, I thought I’d open up the console and work through an example of how all the pieces fit together.

In the Amazon EventBridge console, I create a new rule, and I enter a unique name and description.

Next, I scroll down to Define pattern and begin to customise the type of event pattern that I want to use. I leave the default Event pattern radio button selected and choose to use a Pre-defined pattern by service. Since Elastic Container Registry is an AWS service, I select AWS as the Service Provider.

In the Service Name section, you can select one of the many different AWS services as the event source. I am going to choose the newest addition to this list Elastic Container Registry (ECR). Lastly, in this section, I select ECR Image Action as the Event type. This ECR Image Action contains both DELETE and PUSH as action types.

Next, I’m asked to configure which event bus I want to use. For this example, I select the AWS default event bus that comes with every AWS account.

Now that I have identified where my events are coming from, I now need to say where I want them to go. We call these targets, and there are plenty of options here. For example, I could send the event to a Lambda Function, a Kinesis stream, or any one of the wide variety of AWS targets.

To keep things simple, I’m going to choose to invoke a Amazon Simple Notification Service (SNS) topic. This topic is called ImageAction, and I have subscribed to this topic so that I receive an email when new messages are received by this topic.
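
The same rule and target can also be defined with the AWS CLI instead of the console. A sketch, using a placeholder account ID and the ImageAction topic name mentioned above:

# Create a rule that matches ECR image push and delete events.
aws events put-rule --name ecr-image-action \
    --event-pattern '{"source": ["aws.ecr"], "detail-type": ["ECR Image Action"]}'

# Send matching events to the ImageAction SNS topic.
aws events put-targets --rule ecr-image-action \
    --targets 'Id=sns-image-action,Arn=arn:aws:sns:us-east-1:123412341234:ImageAction'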

Back over on my laptop, I push a new version of my container to my repository in to Elastic Container Registry.

If I go over to the Elastic Container Registry console, I can see that my Docker Image was successfully pushed, I’m now going to select the image and click the Delete button, which will delete my new image.

This will have sent both a PUSH and a DELETE event through to my SNS topic, which in turn delivers two emails to me as a subscriber to that topic.

 

If I open up Outlook, sure enough, I have two (admittedly not pretty) emails that have both the respective action-type of PUSH and DELETE.

So there you have it, you can now wire up events in Elastic Container Registry and enable exciting and wonderful things to happen as a result. Amazon EventBridge support in Amazon Elastic Container Registry is available in all public AWS Regions and GovCloud (US). Try it now in the Amazon EventBridge console.

Happy Eventing!

Martin

Deploying GitOps with Weave Flux and Amazon EKS

Post Syndicated from Ignacio Riesgo original https://aws.amazon.com/blogs/compute/deploying-gitops-with-weave-flux-and-amazon-eks/

This post is contributed by Jon Jozwiak | Senior Solutions Architect, AWS

 

You have countless options for deploying resources into an Amazon EKS cluster. GitOps—a term coined by Weaveworks—provides some substantial advantages over the alternatives. With only Git as the single, central source for controlling deployment into your cluster, GitOps provides easy version control on a platform your team already knows. Getting started with GitOps is straightforward: create a pull request, merge, and the configuration deploys to the EKS cluster.

Weave Flux makes running GitOps in your EKS cluster fast and easy, as it monitors your configuration in Git and image repositories and automates deployments. Weave Flux follows a pull model, automatically triggering deployments based on changes. This provides better security than most continuous deployment tools, which need permissions to access your cluster. This approach also provides Git with version control over your configuration and enables rollback.

This post walks through implementing Weave Flux and deploying resources to EKS using Git. To simplify the image build pipeline, I use AWS Service Catalog to provide a standardized pipeline. AWS Service Catalog lets you centrally define a portfolio of approved products that AWS users can provision. An AWS CloudFormation template defines each product, which can be version-controlled.

After you deploy the sample resources, I quickly demonstrate the GitOps approach where a new image results in the configuration automatically deploying to EKS. This new image may be a commit of Kubernetes manifests or a commit of Helm release definitions.

The following diagram shows the workflow.

Prerequisites

In GitOps, you manage Docker image builds separately from deployment configuration. For image builds, this example uses AWS CodePipeline and AWS CodeBuild, which provide a managed workflow from GitHub source through to an image landing in Amazon Elastic Container Registry (ECR).

This post assumes that you already have an EKS cluster deployed, including kubectl access. It also assumes that you have a GitHub account.

GitHub setup

First, create a GitHub repository to store the Kubernetes manifests (configuration files) to apply to the cluster.

In GitHub, create a GitHub repository. This repository holds Kubernetes manifests for your deployments. Name the repository k8s-config to align with this post. Leave it as a public repository, check the box for Initialize this repository with a README, and choose Create Repo.

On the GitHub repository page, choose Clone or Download and save the SSH string:

git@github.com:youruser/k8s-config.git

Next, create a GitHub token that allows creating and deleting repositories so AWS Service Catalog can deploy and remove pipelines.

  1. In your GitHub profile, access your token settings.
  2. Choose Generate New Token.
  3. Name your new token CodePipeline Service Catalog, and select the following options:
  • repo scopes (repo:status, repo_deployment, public_repo, and repo:invite)
  • read:org
  • write:public_key and read:public_key
  • write:repo_hook and read:repo_hook
  • read:user and user:email
  • delete_repo

4. Choose Generate Token.

5. Copy and save your access token for future access.

 

Deploy Helm

Helm is a package manager for Kubernetes that allows you to define a chart. Charts are collections of related resources that let you create, version, share, and publish applications. By deploying Helm into your cluster, you make it much easier to deploy Weave Flux and other systems. If you’ve deployed Helm already, skip this section.

First, install the Helm client with the following command:

curl -LO https://git.io/get_helm.sh

chmod 700 get_helm.sh

./get_helm.sh

 

On macOS, you could alternatively enter the following command:

brew install kubernetes-helm

 

Next, set up a service account with cluster role for Tiller, Helm’s server-side component. This allows Tiller to manage resources in your cluster.

kubectl -n kube-system create sa tiller

kubectl create clusterrolebinding tiller-cluster-rule \

--clusterrole=cluster-admin \

--serviceaccount=kube-system:tiller

 

Finally, initialize Helm and verify your version. Tiller takes a few seconds to start.

helm init --service-account tiller --history-max 200

helm version

 

Deploy Weave Flux

With Helm installed, proceed with the Weave Flux installation. Begin by installing the Flux Custom Resource Definition.

kubectl apply -f https://raw.githubusercontent.com/fluxcd/flux/helm-0.10.1/deploy-helm/flux-helm-release-crd.yaml

Now add the Weave Flux Helm repository and proceed with the install. Make sure that you update the git.url to match the GitHub repository that you created earlier.

helm repo add fluxcd https://charts.fluxcd.io

helm upgrade -i flux --set helmOperator.create=true --set helmOperator.createCRD=false --set git.url=git@github.com:YOURUSER/k8s-config --namespace flux fluxcd/flux

 

You can use the following code to verify that you successfully deployed Flux. You should see three pods running:

kubectl get pods -n flux

NAME                                 READY     STATUS    RESTARTS   AGE

flux-5bd7fb6bb6-4sc78                1/1       Running   0          52s

flux-helm-operator-df5746688-84kw8   1/1       Running   0          52s

flux-memcached-6f8c446979-f45wj      1/1       Running   0          52s

 

Flux requires a deploy key to work with the GitHub repository. In this post, Flux generates the SSH key pair itself, but you can also specify a different key pair when deploying. To access the key, download fluxctl, a command line utility that interacts with the Flux API. The following steps work for Linux. For other OS platforms, see Installing fluxctl.

sudo wget -O /usr/local/bin/fluxctl https://github.com/fluxcd/flux/releases/download/1.14.1/fluxctl_linux_amd64

sudo chmod 755 /usr/local/bin/fluxctl

 

Validate that fluxctl installed successfully, then retrieve the public key pair using the following command. Specify the namespace where you deployed Flux.

fluxctl version

fluxctl --k8s-fwd-ns=flux identity

 

Copy the key and add that as a deploy key in your GitHub repository.

  1. In your GitHub repository, choose Settings, Deploy Keys.
  2. Choose Add deploy key and name the key Flux Deploy Key.
  3. Paste the key from fluxctl identity.
  4. Choose Allow Write Access, Add Key.

Now use AWS Service Catalog to set up your image build pipeline.

 

Set up AWS Service Catalog

To allow end users to consume product portfolios, you must associate a portfolio with an IAM principal (or principals): a user, group, or role. For this example, associate your current identity. After you master these basics, there are additional resources to teach you how to set up a multi-region, multi-account catalog.

To retrieve your current identity, use the AWS CLI to get your ARN:

aws sts get-caller-identity

Deploy the product portfolio that contains an image build pipeline service by doing the following:

  1. In the AWS CloudFormation console, launch the CloudFormation stack with the following link:

 

 

2. Choose Next.

3. On the Specify Details page, enter your ARN from get-caller-identity. Also enter an environment tag, which AWS applies to all resources from this portfolio.

4. Choose Next.

5. On the Options page, choose Next.

6. On the Review page, select the check box displayed next to I acknowledge that AWS CloudFormation might create IAM resources.

7. Choose Create. CloudFormation takes a few minutes to create your resources.

 

Deploy the image pipeline

The image pipeline provisions a GitHub repository, Amazon ECR repository, and AWS CodeBuild project. It also uses AWS CodePipeline to build a Docker image.

  1. In the AWS Management Console, go to the AWS Service Catalog products list and choose Pipeline for Docker Images.
  2. Choose Launch Product.
  3. For Name, enter ExamplePipeline, and choose Next.
  4. On the Parameters page, fill in a project name, description, and unique S3 bucket name. The specifics don’t matter, but make a note of the name and S3 bucket for later use.
  5. Fill in your GitHub User and GitHub Token values from earlier. Leave the rest of the fields as the default values.
  6. To clean up your GitHub repository on stack delete, change Delete Repository to true.
  7. Choose Next.
  8. On the TagOptions screen, choose Next.
  9. Choose Next on the Notifications page.
  10. On the Review page, choose Launch.

The launch process takes 1–2 minutes. You can verify that you now have a repository matching your project name (eks-example) in GitHub. You can also look at the pipeline created in the AWS CodePipeline console.

 

Deploying with GitOps

You can now provision workloads into the EKS cluster. With a GitOps approach, you only commit code and Kubernetes resource definitions to GitHub. AWS CodePipeline handles the image builds, and Weave Flux applies the desired state to Kubernetes.

First, create a simple Hello World application in your example pipeline. Clone the GitHub repository that you created in the previous step and substitute your GitHub user below.

git clone git@github.com:youruser/eks-example.git

cd eks-example

Create a base README file and a source directory, then download a simple NGINX configuration (hello.conf), home page (index.html), and Dockerfile.

echo "# eks-example" > README.md

mkdir src

wget -O src/hello.conf https://blog-gitops-eks.s3.amazonaws.com/hello.conf

wget -O src/index.html https://blog-gitops-eks.s3.amazonaws.com/index.html

wget https://blog-gitops-eks.s3.amazonaws.com/Dockerfile

 

Now that you have a simple Hello World app with Dockerfile, commit the changes to kick off the pipeline.

git add .
git commit -am "Initial commit"

[master (root-commit) d69a6ba] Initial commit
 4 files changed, 34 insertions(+)
 create mode 100644 Dockerfile
 create mode 100644 README.md
 create mode 100644 src/hello.conf
 create mode 100644 src/index.html

git push

 

Watch in the AWS CodePipeline console to see the image build in process. This may take a minute to start. When it’s done, look in the ECR console to see the first version of the container image.

To deploy this image and the Hello World application, commit Kubernetes manifests for Flux. Create a namespace, deployment, and service in the Kubernetes Git repository (k8s-config) you created. Make sure that you aren’t in your eks-example repository directory.

cd ..

git clone git@github.com:youruser/k8s-config.git

cd k8s-config

mkdir charts namespaces releases workloads

 

The preceding directory structure helps organize the repository but isn’t necessary. Flux can descend into subdirectories and look for YAML files to apply.

Create a namespace Kubernetes manifest.

cat << EOF > namespaces/eks-example.yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    name: eks-example
  name: eks-example
EOF

Now create a deployment manifest. Make sure that you update this image to point to your repository and image tag. For example, <Account ID>.dkr.ecr.us-east-1.amazonaws.com/eks-example:d69a6bac.

cat << EOF > workloads/eks-example-dep.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eks-example
  namespace: eks-example
  labels:
    app: eks-example
  annotations:
    # Container Image Automated Updates
    flux.weave.works/automated: "true"
    # do not apply this manifest on the cluster
    #flux.weave.works/ignore: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: eks-example
  template:
    metadata:
      labels:
        app: eks-example
    spec:
      containers:
      - name: eks-example
        image: <Your Account>.dkr.ecr.us-east-1.amazonaws.com/eks-example:d69a6bac
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /
            port: http
        readinessProbe:
          httpGet:
            path: /
            port: http
EOF

 

Finally, create a service manifest to create a load balancer.

cat << EOF > workloads/eks-example-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: eks-example
  namespace: eks-example
  labels:
    app: eks-example
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: eks-example
EOF

 

In the preceding code, there are two Kubernetes annotations for Flux. The first, flux.weave.works/automated, tells Flux whether the container image should be automatically updated. This example sets the value to true, enabling updates to your deployment as new images arrive in the registry. This example comments out the second annotation, flux.weave.works/ignore. However, you can use it to tell Flux to ignore the deployment temporarily.
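
You can also toggle automation at runtime with fluxctl rather than editing annotations by hand. The following is a sketch: depending on your fluxctl version the flag is --workload or the older --controller, and the workload name assumes the eks-example deployment defined above.

# Enable automated image updates for the deployment
fluxctl --k8s-fwd-ns=flux automate --workload=eks-example:deployment/eks-example

# Disable automated image updates again
fluxctl --k8s-fwd-ns=flux deautomate --workload=eks-example:deployment/eks-example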

Commit the changes; within a few minutes, Flux automatically deploys them.

git add .
git commit -am "eks-example deployment"
[master 954908c] eks-example deployment
 3 files changed, 64 insertions(+)
 create mode 100644 namespaces/eks-example.yaml
 create mode 100644 workloads/eks-example-dep.yaml
 create mode 100644 workloads/eks-example-svc.yaml

 

Make sure that you push your changes.

git push

Now check the logs of your Flux pod:

kubectl get pods -n flux

Update the pod name below to match the pod in your deployment. Flux polls for changes every five minutes by default. When a sync runs, you should see kubectl apply log messages that create the namespace, service, and deployment.

kubectl logs flux-5bd7fb6bb6-4sc78 -n flux

Find the load balancer address for your service with the following:

kubectl describe service eks-example -n eks-example
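
If you only need the hostname, the following jsonpath query returns it directly once the load balancer has been provisioned:

kubectl get service eks-example -n eks-example -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'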

Now when you connect to the load balancer address in a browser, you can see the Hello World app.

Change the eks-example source code in a small way (such as changing index.html to say Hello World Deployment 2), then commit and push to Git.

After a few minutes, refresh your browser to see the deployed change. You can watch the changes in AWS CodePipeline, in ECR, and through Flux logs. Weave Flux automatically updated your deployment manifests in the k8s-config repository to deploy the new image as it detected it. To back out that change, use a git revert or git reset command.

Finally, you can use the same approach to deploy Helm charts. You can host these charts within the configuration Git repository (k8s-config in this example), or on an external chart repository. In the following example, you use an external chart repository.

In your k8s-config directory, get the latest changes from your repository and then create a Helm release from an external chart.

cd k8s-config

git pull

 

First, create the namespace manifest.

cat << EOF > namespaces/nginx.yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    name: nginx
  name: nginx
EOF

 

Then create the Helm release manifest. HelmRelease is a custom resource provided by the Weave Flux Helm operator.

cat << EOF > releases/nginx.yaml
apiVersion: flux.weave.works/v1beta1
kind: HelmRelease
metadata:
  name: mywebserver
  namespace: nginx
  annotations:
    flux.weave.works/automated: "true"
    flux.weave.works/tag.nginx: semver:~1.16
    flux.weave.works/locked: 'true'
    flux.weave.works/locked_msg: '"Halt updates for now"'
    flux.weave.works/locked_user: User Name <user@example.com>
spec:
  releaseName: mywebserver
  chart:
    repository: https://charts.bitnami.com/bitnami/
    name: nginx
    version: 3.3.2
  values:
    usePassword: true
    image:
      registry: docker.io
      repository: bitnami/nginx
      tag: 1.16.0-debian-9-r46
    service:
      type: LoadBalancer
      port: 80
      nodePorts:
        http: ""
      externalTrafficPolicy: Cluster
    ingress:
      enabled: false
    livenessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 30
      timeoutSeconds: 5
      failureThreshold: 6
    readinessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 5
      timeoutSeconds: 3
      periodSeconds: 5
    metrics:
      enabled: false
EOF

git add . 
git commit -am "Adding NGINX Helm release"
git push

 

There are a few new annotations for Flux above. The flux.weave.works/locked annotation tells Flux to lock the deployment. This is useful if you find a known bad image and must roll back to a previous version. In addition, the flux.weave.works/tag.nginx annotation filters image tags by semantic versioning.
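
As with automation, locking can also be applied at runtime through fluxctl instead of committing annotation changes. This is a sketch against the earlier eks-example deployment; the flag may be --workload or --controller depending on your fluxctl version.

# Pause Flux updates for a workload, for example after finding a bad image
fluxctl --k8s-fwd-ns=flux lock --workload=eks-example:deployment/eks-example

# Resume updates once the issue is resolved
fluxctl --k8s-fwd-ns=flux unlock --workload=eks-example:deployment/eks-example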

Wait up to five minutes for Flux to pull the configuration and verify this deployment as you did in the previous example:

kubectl get pods -n flux

kubectl logs flux-5bd7fb6bb6-4sc78 -n flux

 

kubectl get all -n nginx

 

If this doesn’t deploy, ensure that Helm was initialized as described earlier in this post. A minimal initialization sketch follows the commands below.

kubectl get pods -n kube-system | grep tiller

kubectl get pods -n flux

kubectl logs flux-helm-operator-df5746688-84kw8 -n flux
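
If Tiller is missing entirely, a minimal Helm v2 initialization looks like the following sketch. It assumes that granting the tiller service account cluster-admin is acceptable for a demo cluster; scope the role down for anything long-lived.

# Create a service account for Tiller and bind it to cluster-admin
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole=cluster-admin --serviceaccount=kube-system:tiller

# Install Tiller into the cluster and wait for it to become ready
helm init --service-account tiller --wait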

 

Clean up

Log in as an administrator and follow these steps to clean up your sample deployment.

  1. Delete all images from the Amazon ECR repository.

2. In AWS Service Catalog provisioned products, select the three dots to the left of your ExamplePipeline service and choose Terminate provisioned product. Wait until it completes termination (1–2 minutes).

3. Delete your Amazon S3 artifact bucket.

4. Delete Weave Flux:

helm delete flux --purge

kubectl delete ns flux

kubectl delete crd helmreleases.flux.weave.works

5. Delete the load balancer services:

helm delete mywebserver --purge

kubectl delete ns nginx

kubectl delete svc eks-example -n eks-example

kubectl delete deployment eks-example -n eks-example

kubectl delete ns eks-example

6. Clean up your GitHub repositories:

 – Go to your k8s-config repository in GitHub, choose Settings, scroll to the bottom and choose Delete this repository. If you set delete to false in the pipeline service, you also must delete your eks-example repository.

 – Delete the personal access token that you created.

7. If you provisioned an EKS cluster at the beginning of this post, delete it:

eksctl get cluster

eksctl delete cluster <clustername>

8. In the AWS CloudFormation console, select the DevServiceCatalog stack and choose Actions, Delete Stack.

Conclusion

In this post, I demonstrated how to use a GitOps approach, which allows you to focus on committing code and configuration to Git rather than learning new CI/CD tooling. Git acts as the single source of truth, and Weave Flux pulls changes and ensures that the Kubernetes cluster configuration matches the desired state.

In addition, AWS Service Catalog can be used to create a portfolio of services that enables you to standardize your offerings, such as an image build pipeline based on AWS CodePipeline.

As always, AWS welcomes feedback. Please submit comments or questions below.

Optimizing Amazon ECS task density using awsvpc network mode

Post Syndicated from Ignacio Riesgo original https://aws.amazon.com/blogs/compute/optimizing-amazon-ecs-task-density-using-awsvpc-network-mode/

This post is contributed by Tony Pujals | Senior Developer Advocate, AWS

 

AWS recently increased the number of elastic network interfaces available when you run tasks on Amazon ECS, controlled by an account setting called awsvpcTrunking. If you use the Amazon EC2 launch type and task networking (awsvpc network mode), you can now run 5 to 17 times as many tasks on an instance as you could before.

As more of you embrace microservices architectures, you deploy increasing numbers of smaller tasks. AWS now offers you the option of more efficient packing per instance, potentially resulting in smaller clusters and associated savings.

 

Overview

To manage your own cluster of EC2 instances, use the EC2 launch type. Use task networking to run ECS tasks using the same networking properties as if tasks were distinct EC2 instances.

Task networking offers several benefits. Every task launched with awsvpc network mode has its own attached network interface, a primary private IP address, and an internal DNS hostname. This simplifies container networking and gives you more control over how tasks communicate, both with each other and with other services within their virtual private clouds (VPCs).

Task networking also lets you take advantage of other EC2 networking features like VPC Flow Logs. This feature lets you monitor traffic to and from tasks. It also provides greater security control for containers, allowing you to use security groups and network monitoring tools at a more granular level within tasks. For more information, see Introducing Cloud Native Networking for Amazon ECS Containers.

However, if you run container tasks on EC2 instances with task networking, you can face a networking limit. This might surprise you, particularly when an instance has plenty of free CPU and memory. The limit reflects the number of network interfaces available to support awsvpc network mode per container instance.

 

Raise network interface density limits with trunking

The good news is that AWS raised network interface density limits by implementing a networking feature on ECS called “trunking.” This is a technique for multiplexing data over a shared communication link.

If you’re migrating to microservices using AWS App Mesh, you should optimize network interface density. App Mesh requires awsvpc networking to provide routing control and visibility over an ever-expanding array of running tasks. In this context, increased network interface density might save money.

By opting for network interface trunking, you should see a significant increase in capacity—from 5 to 17 times more than the previous limit. For more information on the new task limits per container instance, see Supported Amazon EC2 Instance Types.

Applications with tasks not hitting CPU or memory limits also benefit from this feature through the more cost-effective “bin packing” of container instances.

 

Trunking is an opt-in feature

AWS chose to make the trunking feature opt-in due to the following factors:

  • Instance registration: While normal instance registration is straightforward with trunking, this feature increases the number of asynchronous instance registration steps that can potentially fail. Any such failures might add extra seconds to launch time.
  • Available IP addresses: The “trunk” belongs to the same subnet in which the instance’s primary network interface originates. This effectively reduces the available IP addresses and potentially the ability to scale out on other EC2 instances sharing the same subnet. The trunk consumes an IP address. With a trunk attached, there are two assigned IP addresses per instance, one for the primary interface and one for the trunk.
  • Differing customer preferences and infrastructure: If you have high CPU or memory workloads, you might not benefit from trunking. Or, you may not want awsvpc networking.

Consequently, AWS leaves it to you to decide if you want to use this feature. AWS might revisit this decision in the future, based on customer feedback. For now, your account roles or users must opt in to the awsvpcTrunking account setting to gain the benefits of increased task density per container instance.

 

Enable trunking

Enable the ECS elastic network interface trunking feature to increase the number of network interfaces that can be attached to supported EC2 container instance types. You must meet the following prerequisites before you can launch a container instance with the increased network interface limits:

  • Your account must have the AWSServiceRoleForECS service-linked role for ECS.
  • You must opt in to the awsvpcTrunking account setting.

 

Make sure that a service-linked role exists for ECS

A service-linked role is a unique type of IAM role linked to an AWS service (such as ECS). This role lets you delegate the permissions necessary to call other AWS services on your behalf. Because ECS is a service that manages resources on your behalf, you need this role to proceed.

In most cases, you won’t have to create a service-linked role. If you created or updated an ECS cluster, ECS likely created the service-linked role for you.

You can confirm that your service-linked role exists using the AWS CLI, as shown in the following code example:

$ aws iam get-role --role-name AWSServiceRoleForECS
{
    "Role": {
        "Path": "/aws-service-role/ecs.amazonaws.com/",
        "RoleName": "AWSServiceRoleForECS",
        "RoleId": "AROAJRUPKI7I2FGUZMJJY",
        "Arn": "arn:aws:iam::226767807331:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS",
        "CreateDate": "2018-11-09T21:27:17Z",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ecs.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        },
        "Description": "Role to enable Amazon ECS to manage your cluster.",
        "MaxSessionDuration": 3600
    }
}

If the service-linked role does not exist, create it manually with the following command:

aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com

For more information, see Using Service-Linked Roles for Amazon ECS.

 

Opt in to the awsvpcTrunking account setting

Your account, IAM user, or role must opt in to the awsvpcTrunking account setting. Select this setting using the AWS CLI or the ECS console. You can opt in for an account by making awsvpcTrunking its default setting, or you can enable the setting for the IAM role associated with the instance profile that your container instances launch with. For instructions, see Account Settings.
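
For reference, a minimal AWS CLI opt-in looks like the following; add --region or --profile options as needed for your environment.

# Make trunking the default for every IAM user and role in the account
aws ecs put-account-setting-default --name awsvpcTrunking --value enabled

# Or opt in only the identity making the call
aws ecs put-account-setting --name awsvpcTrunking --value enabled

# Confirm the effective settings
aws ecs list-account-settings --effective-settings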

 

Other considerations

After completing the prerequisites described in the preceding sections, launch a new container instance with increased network interface limits using one of the supported EC2 instance types.

Keep the following in mind:

  • The feature is available with the latest variant of the ECS-optimized AMI.
  • It only affects creation of new container instances after opting into awsvpcTrunking.
  • It only affects tasks created with awsvpc network mode and EC2 launch type. Tasks created with the AWS Fargate launch type always have a dedicated network interface, no matter how many you launch.

For details, see ENI Trunking Considerations.

 

Summary

If you seek to optimize the usage of your EC2 container instances for clusters that you manage, enable the increased network interface density feature with awsvpcTrunking. By following the steps outlined in this post, you can launch tasks using significantly fewer EC2 instances. This is especially useful if you embrace a microservices architecture, with its increasing numbers of lighter tasks.

Hopefully, you found this post informative and the proposed solution intriguing. As always, AWS welcomes all feedback and comments.

Setting up AWS PrivateLink for Amazon ECS, and Amazon ECR

Post Syndicated from Nathan Peck original https://aws.amazon.com/blogs/compute/setting-up-aws-privatelink-for-amazon-ecs-and-amazon-ecr/

Amazon ECS and Amazon ECR now have support for AWS PrivateLink. AWS PrivateLink is a networking technology designed to enable access to AWS services in a highly available and scalable manner. It keeps all the network traffic within the AWS network. When you create AWS PrivateLink endpoints for ECR and ECS, these service endpoints appear as elastic network interfaces with a private IP address in your VPC.

Before AWS PrivateLink, your Amazon EC2 instances had to use an internet gateway to download Docker images stored in ECR or communicate to the ECS control plane. Instances in a public subnet with a public IP address used the internet gateway directly. Instances in a private subnet used a network address translation (NAT) gateway hosted in a public subnet. The NAT gateway would then use the internet gateway to talk to ECR and ECS.

Now that AWS PrivateLink support has been added, instances in both public and private subnets can use it to get private connectivity to download images from Amazon ECR. Instances can also communicate with the ECS control plane via AWS PrivateLink endpoints without needing an internet gateway or NAT gateway.

 

This networking architecture is considerably simpler. It enables enhanced security by allowing you to deny your private EC2 instances access to anything other than these AWS services. That’s assuming that you want to block all other outbound internet access for those instances. For this to work, you must create some AWS PrivateLink resources:

  • AWS PrivateLink endpoints for ECR. This allows instances in your VPC to communicate with ECR to download image manifests
  • Gateway VPC endpoint for Amazon S3. This allows instances to download the image layers from the underlying private Amazon S3 buckets that host them.
  • AWS PrivateLink endpoints for ECS. These endpoints allow instances to communicate with the telemetry and agent services in the ECS control plane.

This post explains how to create these resources.

Create an AWS PrivateLink interface endpoint for ECR

ECR requires two interface endpoints:

  • com.amazonaws.region.ecr.api
  • com.amazonaws.region.ecr.dkr

In the VPC console, create the interface VPC endpoints for ECR using the endpoint creation wizard. Choose AWS services and select an endpoint. Substitute your AWS Region of choice.

Next, specify the VPC and subnets to which the AWS PrivateLink interface should be added. Make sure that you select the same VPC in which your ECS cluster is running. To be on the safe side, select every Availability Zone and subnet from the list. Each zone has a list of the subnets available. You can select all the subnets in each Availability Zone.

However, depending on your networking needs, you might also choose to only enable the AWS PrivateLink endpoint in your private subnets from each Availability Zone. Let instances running in a public subnet continue to communicate with ECR via the public subnet’s internet gateway.

Next, enable Private DNS Name, which is required for the com.amazonaws.region.ecr.dkr endpoint.

A private hosted zone enables you to access the resources in your VPC using the Amazon ECR default DNS domain names. You don’t need to use the private IPv4 address or the private DNS hostnames provided by Amazon VPC endpoints. The Amazon ECR DNS hostname that the AWS CLI and Amazon ECR SDKs use by default (https://api.ecr.region.amazonaws.com) resolves to your VPC endpoint.

If you enabled a private hosted zone for com.amazonaws.region.ecr.api and you are using an SDK released before January 24, 2019, you must specify the following endpoint when using an SDK or the AWS CLI. Use the following command:

aws --endpoint-url https://api.ecr.region.amazonaws.com ecr describe-repositories

If you don’t enable a private hosted zone, use the following command:

aws --endpoint-url https://VPC_Endpoint_ID.api.ecr.region.vpce.amazonaws.com ecr describe-repositories

If you enabled a private hosted zone and you are using the SDK released on January 24, 2019 or later, use the following command:

aws ecr describe-repositories

Lastly, specify a security group for the interface itself. This is going to control whether each host is able to talk to the interface. The security group should allow inbound connections on port 443 from the instances in your cluster.

You may have a security group that is applied to all the EC2 instances in the cluster, perhaps using an Auto Scaling group. You can create a rule that allows the VPC endpoint to be accessed by any instance in that security group.

Finally, choose Create endpoint. The new endpoint appears in the list.
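
If you prefer the AWS CLI to the console wizard, an equivalent call looks like the following sketch. The VPC, subnet, and security group IDs are placeholders; run the command once for ecr.api and once for ecr.dkr, and adjust the Region in the service name.

aws ec2 create-vpc-endpoint \
  --vpc-endpoint-type Interface \
  --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.us-east-1.ecr.api \
  --subnet-ids subnet-aaaa1111 subnet-bbbb2222 \
  --security-group-ids sg-0123456789abcdef0 \
  --private-dns-enabled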

Add a gateway VPC endpoint for S3

The next step is to create a gateway VPC endpoint for S3. This is necessary because ECR uses S3 to store Docker image layers. When your instances download Docker images from ECR, they must access ECR to get the image manifest and S3 to download the actual image layers.

S3 uses a slightly different endpoint type called a gateway. Be careful about adding an S3 gateway to your VPC if your application is actively using S3. With gateway endpoints, your application’s existing connections to S3 may be briefly interrupted while the gateway is being added. You may have a busy cluster with many active ECS deployments, causing image layer downloads from S3. Or, your application itself may make heavy usage of S3. In that case, it’s best to create a fresh new VPC with an S3 gateway, then migrate your ECS cluster and its containers into that VPC.

To add the S3 gateway endpoint, select com.amazonaws.region.s3 on the list of AWS services and select the VPC hosting your ECS cluster. Gateway endpoints are added to the VPC route table for the subnets. Select each route table associated with the subnet in which the S3 gateway should be.

Instead of using a security group, the gateway endpoint uses an IAM policy document to limit access to the service. This policy is similar to an IAM policy but does not replace the default level of access that your applications have through their IAM role. It just further limits what portions of the service are available via the gateway.

It’s okay to just use the default Full Access policy. Any restrictions you have put on your task IAM roles or other IAM user policies still apply on top of this policy. For information about a minimal access policy, see the Minimum Amazon S3 Bucket Permissions for Amazon ECR.

Choose Create to add this gateway endpoint to your VPC. When you view the route tables in your VPC subnets, you see an S3 gateway that is used whenever ECR Docker image layers are being downloaded from S3.
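
The CLI equivalent for the S3 gateway endpoint is sketched below; the VPC and route table IDs are placeholders.

aws ec2 create-vpc-endpoint \
  --vpc-endpoint-type Gateway \
  --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-aaaa1111 rtb-bbbb2222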

Create an AWS PrivateLink interface endpoint for ECS

In addition to downloading Docker images from ECR, your EC2 instances must also communicate with the ECS control plane to receive orchestration instructions.

ECS requires three endpoints:

  • com.amazonaws.region.ecs-agent
  • com.amazonaws.region.ecs-telemetry
  • com.amazonaws.region.ecs

Create these three interface endpoints in the same way that you created the endpoint for ECR, by adding each endpoint and setting the subnets and security group for the endpoint.

After the endpoints are created and added to your VPC, there is one additional step. Make sure that your ECS agent is upgraded to version 1.25.1 or higher. For more information, see the instructions for upgrading the ECS agent.
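
You can check the running agent version either on the instance itself, through the agent's introspection endpoint, or from your workstation with the ECS API. The cluster name and container instance ID below are placeholders.

# On the container instance
curl -s http://localhost:51678/v1/metadata

# From anywhere with AWS CLI access
aws ecs describe-container-instances --cluster my-cluster --container-instances <container-instance-id> --query 'containerInstances[].versionInfo.agentVersion'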

If you are already running the right version of the ECS agent, restart any ECS agents that are currently running in the VPC. The ECS agent uses a persistent web socket connection to the ECS backend and VPC endpoints do not interrupt existing connections. The agent continues to use its existing connection instead of establishing a new connection through the new endpoint, unless you restart it.

To restart the agent with no disruption to your application containers, you can connect using SSH to each EC2 instance in the cluster and issue the following command:

sudo docker restart ecs-agent

This restarts the ECS agent without stopping any of the other application containers on the host. Your application may be stateless and safe to stop at any time, or you may not have or want SSH access to the underlying hosts. In that case, choose to just reboot each EC2 instance in the cluster one at a time. This restarts the agent on that host while also restarting any service launched tasks on that host on a different host.

Conclusion

In this post, I showed you how to add AWS PrivateLink endpoints to your VPC for ECS and ECR, including an S3 gateway for ECR layer downloads.

The instances in your ECS cluster can communicate directly with the ECS control plane. They should be able to download Docker images directly without needing to make any connections outside of your VPC using an internet gateway or NAT gateway. All container orchestration traffic stays inside the VPC.

If you have questions or suggestions, please comment below.

How to Use Cross-Account ECR Images in AWS CodeBuild for Your Build Environment

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/how-to-use-cross-account-ecr-images-in-aws-codebuild-for-your-build-environment/

AWS CodeBuild now makes it possible for you to access Docker images from any Amazon Elastic Container Registry repository in another account as the build environment. With this feature, AWS CodeBuild allows you to pull any image from a repository to which you have been granted resource-level permissions.

In this blog post, we will show you how to provision a build environment using an image from another AWS account.

Here is a quick overview of the services used in our example:

AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. It provides a fully preconfigured build platform for most popular programming languages and build tools, including Apache Maven, Gradle, and more.

Amazon Elastic Container Registry (Amazon ECR) is a fully managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.

We will use a sample Docker image in an Amazon ECR image repository in AWS account B. The CodeBuild project in AWS account A will pull the images from the Amazon ECR image repository in AWS account B.

Prerequisites:

To get started you need:

·       Two AWS accounts (AWS account A and AWS account B).

·       In AWS account B, an Amazon ECR image repository containing the images that you would like to use for your build environment. If you do not have an image repository and a sample image, see Docker Sample in the AWS CodeBuild User Guide.

·       In AWS account A, an AWS CodeCommit repository with a buildspec.yml file and sample code.

·       Permissions on the Amazon ECR image repository in AWS account B that allow AWS CodeBuild to pull the repository’s Docker image into the build environment. You set up these permissions using the following steps.

To grant CodeBuild permissions to pull the Docker image into the build environment

1.     Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.

2.     Choose the name of the repository you created.

3.     On the Permissions tab, choose Edit JSON policy.

4.     Apply the following policy and save. A CLI alternative follows the policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CodeBuildAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<arn of the service role>"  
      },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
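
If you would rather apply this policy from the command line in AWS account B, save the JSON above to a file and use set-repository-policy; the repository name and file name here are placeholders.

aws ecr set-repository-policy --repository-name my-build-image --policy-text file://codebuild-access-policy.json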

To use an image from account B and set up a build project in account A

1. Open the AWS CodeBuild console at https://console.aws.amazon.com/codesuite/codebuild/home.

2. Choose Create project.

3. In Project configuration, enter a name and description for the build project.

4. In Source, for Source provider, choose the source code provider type. In this example, we use the AWS CodeCommit repository name.

 

5.  For Environment, we will pull the Docker image from AWS account B and use the image to create the build environment to build artifacts. To configure the build environment, choose Custom Image. For Image registry, choose Amazon ECR. For ECR account, choose Other ECR account.

6.  In Amazon ECR repository URI, enter the URI for the image repository from AWS account B and then choose Create build project.

7. Go to the build project you just created, and choose Start build. The build execution will download the source code from the AWS CodeCommit repository and provision the build environment using the image retrieved from the image registry.

Next steps

Now that you have seen how to use cross-account ECR images, you can integrate a build step in AWS CodePipeline and use the build environment to create artifacts and deploy your application. To integrate a build step in your pipeline, see Working with Deployments in AWS CodeDeploy in the AWS CodeDeploy User Guide.

If you have any feedback, please leave it in the Comments section below. If you have questions, please start a thread on the AWS CodeBuild forum or contact AWS Support.

 

Setting up AWS PrivateLink for AWS Fargate, Amazon ECS, and Amazon ECR

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/setting-up-aws-privatelink-for-aws-fargate-amazon-ecs-and-amazon-ecr/

This post is contributed by Nathan Peck – Developer Advocate, Amazon Container Services

AWS Fargate, Amazon ECS, and Amazon ECR now have support for AWS PrivateLink. AWS PrivateLink is a networking technology designed to enable access to AWS services in a highly available and scalable manner, while keeping all the network traffic within the AWS network. When you create AWS PrivateLink endpoints for ECR, ECS, and Fargate, these service endpoints appear as elastic network interfaces with a private IP address in your VPC.

Before AWS PrivateLink, your Amazon EC2 instances had to use an internet gateway to download Docker images stored in ECR or communicate to the ECS control plane. Instances in a public subnet with a public IP address used the internet gateway directly. Instances in a private subnet used a network address translation (NAT) gateway hosted in a public subnet. The NAT gateway would then use the internet gateway to talk to ECR and ECS.

Now that AWS PrivateLink support has been added, instances in both public and private subnets can use it to get private connectivity to download images from Amazon ECR. Instances can also communicate with the ECS control plane via AWS PrivateLink endpoints without needing an internet gateway or NAT gateway.

This networking architecture is considerably simpler. It enables enhanced security by allowing you to deny your private EC2 instances access to anything other than these AWS services. That’s assuming that you want to block all other outbound internet access for those instances. For this to work, you must create some AWS PrivateLink resources:

  • AWS PrivateLink endpoints for ECR. This allows instances in your VPC to communicate with ECR to download image manifests
  • AWS PrivateLink gateway for Amazon S3. This allows instances to download the image layers from the underlying private S3 buckets that host them.
  • AWS PrivateLink endpoints for ECS. These endpoints allow instances to communicate with the telemetry and agent services in the ECS control plane.

This post explains how to create these resources.

Create an AWS PrivateLink interface endpoint for ECR

ECR requires 2 interface endpoints:

  • com.amazonaws.region.ecr.api
  • com.amazonaws.region.ecr.dkr

First, create the interface VPC endpoints for ECR using the endpoint creation wizard in the VPC dashboard separately. Select AWS services and select an endpoint. Substitute your region of choice.

Next, specify the VPC and subnets to which the AWS PrivateLink interface should be added. Make sure that you select the same VPC in which your ECS cluster is running. To be on the safe side, select every Availability Zone and subnet from the list. Each zone has a list of the subnets available. You can select all the subnets in each Availability Zone.

However, depending on your networking needs, you might also choose to only enable the AWS PrivateLink endpoint in your private subnets from each Availability Zone. Let instances running in a public subnet continue to communicate with ECR via the public subnet’s internet gateway.

Next, enable Private DNS Name. You are required to enable Private DNS Name for the com.amazonaws.region.ecr.dkr endpoint.

A private hosted zone enables you to access the resources in your VPC using the Amazon ECR default DNS domain names instead of using private IPv4 address or private DNS hostnames provided by AWS VPC Endpoints. The Amazon ECR DNS hostname that AWS CLI and Amazon ECR SDKs use by default (https://api.ecr.region.amazonaws.com) resolves to your VPC endpoint.

If you enabled a private hosted zone for com.amazonaws.region.ecr.api and you are using an SDK released before January 24, 2019, you must specify the following endpoint when using SDK or AWS CLI. For example:

aws --endpoint-url https://api.ecr.region.amazonaws.com ecr describe-repositories

If you don’t enable a private hosted zone, this would be:

aws --endpoint-url https://VPC_Endpoint_ID.api.ecr.region.vpce.amazonaws.com ecr describe-repositories

If you enabled a private hosted zone and you are using the SDK released on January 24, 2019 or later, this would be:

aws ecr describe-repositories

Lastly, specify a security group for the interface itself. This is going to control whether each host is able to talk to the interface. The security group should allow inbound connections on port 443 from the instances in your cluster.

You may have a security group that is applied to all the EC2 instances in the cluster, perhaps using an Auto Scaling group. You can create a rule that allows the VPC endpoint to be accessed by any instance in that security group.

Finally, choose Create endpoint. The new endpoint appears in the list.

Add an AWS PrivateLink gateway endpoint for S3

The next step is to create a gateway VPC endpoint for S3. This is necessary because ECR uses S3 to store Docker image layers. When your instances download Docker images from ECR, they must access ECR to get the image manifest and S3 to download the actual image layers.

S3 uses a slightly different endpoint type called a gateway. Be careful about adding an S3 gateway to your VPC if your application is actively using S3. With gateway endpoints, your application’s existing connections to S3 may be briefly interrupted while the gateway is being added. You may have a busy cluster with many active ECS deployments, causing image layer downloads from S3. Or, your application itself may make heavy usage of S3. In that case, it’s best to create a fresh new VPC with an S3 gateway, then migrate your ECS cluster and its containers into that VPC.

To add the S3 gateway endpoint, select com.amazonaws.region.s3 on the list of AWS services and select the VPC hosting your ECS cluster. Gateway endpoints are added to the VPC route table for the subnets. Select each route table associated with the subnet in which the S3 gateway should be.

Instead of using a security group, the gateway endpoint uses an IAM policy document to limit access to the service. This policy is similar to an IAM policy but does not replace the default level of access that your applications have through their IAM role. It just further limits what portions of the service are available via the gateway. It’s okay to just use the default Full Access policy. Any restrictions you have put on your task IAM roles or other IAM user policies still apply on top of this policy.

Choose Create to add this gateway endpoint to your VPC. When you view the route tables in your VPC subnets, you see an S3 gateway that is used whenever ECR Docker image layers are being downloaded from S3.

Create an AWS PrivateLink interface endpoint for ECS

In addition to downloading Docker images from ECR, your EC2 instances must also communicate with the ECS control plane to receive orchestration instructions.

ECS requires three endpoints:

  • com.amazonaws.region.ecs-agent
  • com.amazonaws.region.ecs-telemetry
  • com.amazonaws.region.ecs

Create these three interface endpoints in the same way that you created the endpoint for ECR, by adding each endpoint and setting the subnets and security group for the endpoint.

After the endpoints are created and added to your VPC, there is one additional step. Restart any ECS agents that are currently running in the VPC. The ECS agent uses a persistent web socket connection to the ECS backend and VPC endpoints do not interrupt existing connections. The agent continues to use its existing connection instead of establishing a new connection through the new endpoint, unless you restart it.

To restart the agent with no disruption to your application containers, you can connect using SSH to each EC2 instance in the cluster and issue the following command:

sudo docker restart ecs-agent

This restarts the ECS agent without stopping any of the other application containers on the host. Your application may be stateless and safe to stop at any time, or you may not have or want SSH access to the underlying hosts. In that case, choose to just reboot each EC2 instance in the cluster one at a time. This restarts the agent on that host while also restarting any service launched tasks on that host on a different host.

If you are using AWS Fargate, you can issue an UpdateService API call to do a rolling restart of all your containers. Or, manually stop your running containers one by one and let them be automatically replaced. When they restart, they use an ECS agent that is communicating using the new ECS endpoints. The Docker image is downloaded using the ECR endpoint and S3 gateway.
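
With the AWS CLI, that rolling restart looks like the following; the cluster and service names are placeholders.

aws ecs update-service --cluster my-fargate-cluster --service my-service --force-new-deployment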

Conclusion

In this post, I showed you how to add AWS PrivateLink endpoints to your VPC for ECS and ECR, including an S3 gateway for ECR layer downloads. These endpoints work whether you are running your containers on EC2 instances in a self-managed cluster in your VPC, or as Fargate containers running in your VPC.

Your ECS cluster or Fargate tasks communicate directly with the ECS control plane. They should be able to download Docker images directly without needing to make any connections outside of your VPC via an internet gateway or NAT gateway. All container orchestration traffic stays inside the VPC.

If you have questions or suggestions, please comment below.

Migrate Wildfly Cluster to Amazon ECS using Service Discovery

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/migrate-wildfly-cluster-to-ecs-using-service-discovery/

This post is courtesy of Vidya Narasimhan, AWS Solutions Architect

1. Overview

Java Enterprise Edition has been an important server-side platform for over a decade for developing mission-critical & large-scale applications amongst enterprises. High-availability & fault tolerance for such applications is typically achieved through built-in JEE clustering provided by the platform.

JEE clustering represents a group of machines working together to transparently provide enterprise services such as JNDI, EJB, JMS, HTTPSession etc. that enable distribution, discovery, messaging, transaction, caching, replication & component failover.  Implementation of clustering technology varies in JEE platforms provided by different vendors. Many of the clustering implementations involve proprietary communication protocols that use multicast for intra-cluster communications that is not supported in public cloud.

This article is relevant for JEE platforms & other products that use JGroups-based clustering, such as Wildfly. The solution described allows easy migration of applications developed on these platforms using native clustering to Amazon Elastic Container Service (Amazon ECS), a highly scalable, fast container management service that makes it easy to orchestrate, run & scale Docker containers on a cluster. This solution is useful when the business objective is to migrate to the cloud quickly with minimal changes to the application. The approach recommends a lift & shift to AWS, wherein the initial focus is to migrate as-is, with optimizations coming later incrementally.

Whether the JEE application to be migrated is designed as a monolith or micro services, a legacy or green-field deployment, there are multiple reasons why organizations should opt for containerization of their application. This link explains well the benefits of containerization (see section Why Use Containers) https://aws.amazon.com/getting-started/projects/break-monolith-app-microservices-ecs-docker-ec2/module-one/

2. Wildfly Clustering on ECS

From here onwards, this article highlights how to migrate a standard clustered JEE app deployed on Wildfly Application Server to Amazon ECS. Wildfly supports clustering out of the box in two modes, standalone & domain. This article explores how to set up a Wildfly cluster in ECS with multiple Wildfly standalone nodes enabled for HA to form a cluster. The clustering is demonstrated through a web application that replicates session information across the cluster nodes and can withstand a failover without session data loss.

The important components of clustering that require a mention right away are ECS Service Discovery, JGroups & Infinispan.

  • JGroups – Wildfly clustering is enabled by the popular open-source JGroups toolkit. The JGroups subsystem provides group communication support for HA services, using multicast transmission by default. It deals with all aspects of node discovery and provides reliable messaging between the nodes as follows:
    • Node-to-node messaging — By default is based on UDP/multicast that can be extended via TCP/unicast.
    • Node discovery — By default uses multicast ping MPING. Alternatives include TCPPING, S3_PING, JDBC_PING, DNS_PING and others.

This article focusses on DNS_PING for node discovery using TCP protocol.

ECS Service discovery – An Amazon ECS service can optionally be configured to use Amazon ECS Service Discovery. Service discovery uses Amazon Route 53 auto naming API actions to manage DNS entries (A or SRV records) for service tasks, making them discoverable within your VPC. You can specify health check conditions in a service task definition, and Amazon ECS ensures that only healthy service endpoints are returned by a service lookup.

As your services scale up or down in response to load or container health, the Route 53 hosted zone is kept up to date.

Wildfly uses JGroups to discover the cluster nodes via DNS_PING discovery protocol that sends a DNS service endpoint query to the ECS service registry maintained in Route53.

  • Infinispan – Wildfly uses the Infinispan subsystem to provide high-performance, clustered, transactional caching. In a clustered web application, Infinispan handles the replication of application data across the cluster by means of a replicated/distributed cache. Under the hood, it uses a JGroups channel for data transmission within the cluster.

3. Implementation Instructions

Configure Wildfly

  • Modify Wildfly standalone configuration file – Standalone-HA.xml. The HA suffix implies high availability configuration.
  1.  Modify the JGroups subsystem – Add a TCP stack with DNS_PING as the discovery protocol & configure the DNS query endpoint. It is important that the DNS_QUERY value matches the ECS service endpoint that you configure later when creating the ECS service.
  2. Change the JGroup default stack to point to the TCP Stack.                           
  3. Configure a custom Infinispan replicated cache to be used by the web app or use the default cache.      

Build the docker image & store it in Elastic Container Registry (ECR)

  1. Package the JBoss/Wildfly image with JEE application & Wildfly platform on Docker. Create a Dockerfile & include the following:
    1. Install the WildFly distribution & set permissions – This approach requires the latest Wildfly distribution 15.0.0.Final released recently.          
    2. Copy the modified Wildfly standalone-ha.xml to the container.
    3. Deploy the JEE web application. This simple web app is configured as distributable and uses Infinispan to replicate session information across cluster nodes. It displays a page containing the container IP/hostname, Session ID & session data & helps demonstrate session replication.                   
    4. Define a custom entrypoint, entrypoint.sh, to boot Wildfly with the specified bind IP addresses to its interfaces. The script gets the container metadata, extracts the container IP to bind it to Wildfly interfaces. This interface binding is an important step as it enables the application related network communication (web, messaging) between the containers.    
    5. Add the entrypoint.sh script to the image in the Dockerfile.
    6. Build the container & push it to ECR repository. Amazon Elastic Container Registry (ECR) is a fully managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.

The Wildfly configuration files, the Dockerfile & the web app WAR file can be found at the GitHub link https://github.com/vidyann/Wildfly_ECS

Create ECS Service with service discovery

  • Create a Service using service discovery.
    • This link describes the steps to set up an ECS task & service with service discovery: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-service-discovery.html#create-service-discovery-taskdef. Though the example in the link creates a Fargate cluster, you can create an EC2-based cluster as well for this example.
    • While configuring the task choose the network mode as AWSVPC. The task networking features provided by the AWSVPC network mode give Amazon ECS tasks the same networking properties as Amazon EC2 instances. Benefits of task networking can be found here – https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-networking.html
    • Tasks can be flagged as compatible with EC2, Fargate, or both. Here is what the cluster & service look like:
    • When setting up the container details in the task, use 8080 as the port, which is the default Wildfly port. This can be changed through the Wildfly configuration. Enable CloudWatch Logs, which capture the Wildfly logging.
    • While configuring the ECS service, ensure that the service name & namespace combine to form a service endpoint that exactly matches the DNS_QUERY endpoint configured in the Wildfly configuration file. The container security group should allow inbound traffic to port 8080. Here is what the service endpoint looks like:
    • The Route 53 registry created by ECS is shown below. We see two DNS entries corresponding to the DNS endpoint myapp.sampleaws.com.
    • Finally, view the Wildfly logs in the console by choosing a task instance. You can check whether clustering is enabled by looking for a log entry like the one below:

Here we see that a Wildfly cluster was formed with two nodes (matching the two DNS entries in Route 53).
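
You can also confirm node discovery directly by querying the service discovery DNS name from an instance inside the VPC (on Windows, use nslookup instead of dig). The endpoint below is the one configured earlier; expect one A record per running task.

# Run from an instance inside the VPC; the private hosted zone is not resolvable from outside
dig +short myapp.sampleaws.com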

Run the Web App in a browser

  • Spin up a Windows instance in the VPC & open the web app in a browser. Below is a screenshot of the webapp:
  • Open the app in different browsers & tabs & verify the container IP & session ID. Now force a node shutdown by resizing the ECS service to one task. Note that although the container IP in the webapp changes, the session ID does not change, the webapp remains available, and the HTTP session stays alive, demonstrating session replication & failover amongst the cluster nodes.

4. Summary

Our goal here is to migrate enterprise JEE apps to Amazon ECS by tweaking a few configurations while immediately gaining the benefits of containerization & orchestration managed by ECS. By delegating the undifferentiated heavy lifting of container management, orchestration & scaling to ECS, you can focus on improving or re-architecting your application toward a microservices-oriented architecture. Please note that all the deployment procedures in this article can be fully automated via the AWS CI/CD services.