Tag Archives: Containers

Amazon ECS Now Supports EC2 Inf1 Instances

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-ecs-now-supports-ec2-inf1-instances/

As machine learning and deep learning models become more sophisticated, hardware acceleration is increasingly required to deliver fast predictions at high throughput. Today, we’re very happy to announce that AWS customers can now use the Amazon EC2 Inf1 instances on Amazon ECS, for high performance and the lowest prediction cost in the cloud. For a few weeks now, these instances have also been available on Amazon Elastic Kubernetes Service.

A primer on EC2 Inf1 instances
Inf1 instances were launched at AWS re:Invent 2019. They are powered by AWS Inferentia, a custom chip built from the ground up by AWS to accelerate machine learning inference workloads.

Inf1 instances are available in multiple sizes, with 1, 4, or 16 AWS Inferentia chips, with up to 100 Gbps network bandwidth and up to 19 Gbps EBS bandwidth. An AWS Inferentia chip contains four NeuronCores. Each one implements a high-performance systolic array matrix multiply engine, which massively speeds up typical deep learning operations such as convolution and transformers. NeuronCores are also equipped with a large on-chip cache, which helps cut down on external memory accesses, saving I/O time in the process. When several AWS Inferentia chips are available on an Inf1 instance, you can partition a model across them and store it entirely in cache memory. Alternatively, to serve multi-model predictions from a single Inf1 instance, you can partition the NeuronCores of an AWS Inferentia chip across several models.

Compiling Models for EC2 Inf1 Instances
To run machine learning models on Inf1 instances, you need to compile them to a hardware-optimized representation using the AWS Neuron SDK. All tools are readily available on the AWS Deep Learning AMI, and you can also install them on your own instances. You’ll find instructions in the Deep Learning AMI documentation, as well as tutorials for TensorFlow, PyTorch, and Apache MXNet in the AWS Neuron SDK repository.

In the demo below, I will show you how to deploy a Neuron-optimized model on an ECS cluster of Inf1 instances, and how to serve predictions with TensorFlow Serving. The model in question is BERT, a state of the art model for natural language processing tasks. This is a huge model with hundreds of millions of parameters, making it a great candidate for hardware acceleration.

Creating an Amazon ECS Cluster
Creating a cluster is the simplest thing: all it takes is a call to the CreateCluster API.

$ aws ecs create-cluster --cluster-name ecs-inf1-demo

Immediately, I see the new cluster in the console.

New cluster

Several prerequisites are required before we can add instances to this cluster:

  • An AWS Identity and Access Management (IAM) role for ECS instances: if you don’t have one already, you can find instructions in the documentation. Here, my role is named ecsInstanceRole.
  • An Amazon Machine Image (AMI) containing the ECS agent and supporting Inf1 instances. You could build your own, or use the ECS-optimized AMI for Inferentia. In the us-east-1 region, its id is ami-04450f16e0cd20356.
  • A Security Group, opening network ports for TensorFlow Serving (8500 for gRPC, 8501 for HTTP). The identifier for mine is sg-0994f5c7ebbb48270.
  • If you’d like to have ssh access, your Security Group should also open port 22, and you should pass the name of an SSH key pair. Mine is called admin.

We also need to create a small user data file in order to let instances join our cluster. This is achieved by storing the name of the cluster in an environment variable, itself written to the configuration file of the ECS agent.

#!/bin/bash
echo ECS_CLUSTER=ecs-inf1-demo >> /etc/ecs/ecs.config

We’re all set. Let’s add a couple of Inf1 instances with the RunInstances API. To minimize cost, we’ll request Spot Instances.

$ aws ec2 run-instances \
--image-id ami-04450f16e0cd20356 \
--count 2 \
--instance-type inf1.xlarge \
--instance-market-options '{"MarketType":"spot"}' \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=ecs-inf1-demo}]' \
--key-name admin \
--security-group-ids sg-0994f5c7ebbb48270 \
--iam-instance-profile Name=ecsInstanceRole \
--user-data file://user-data.txt

Both instances appear right away in the EC2 console.

Inf1 instances

A couple of minutes later, they’re ready to run tasks on the cluster.

Inf1 instances

Our infrastructure is ready. Now, let’s build a container storing our BERT model.

Building a Container for Inf1 Instances
The Dockerfile is pretty straightforward:

  • Starting from an Amazon Linux 2 image, we open ports 8500 and 8501 for TensorFlow Serving.
  • Then, we add the Neuron SDK repository to the list of repositories, and we install a version of TensorFlow Serving that supports AWS Inferentia.
  • Finally, we copy our BERT model inside the container, and we load it at startup.

Here is the complete file.

FROM amazonlinux:2
EXPOSE 8500 8501
RUN echo $'[neuron] \n\
name=Neuron YUM Repository \n\
baseurl=https://yum.repos.neuron.amazonaws.com \n\
enabled=1' > /etc/yum.repos.d/neuron.repo
RUN rpm --import https://yum.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB
RUN yum install -y tensorflow-model-server-neuron
COPY bert /bert
CMD ["/bin/sh", "-c", "/usr/local/bin/tensorflow_model_server_neuron --port=8500 --rest_api_port=8501 --model_name=bert --model_base_path=/bert/"]

Then, I build and push the container to a repository hosted in Amazon Elastic Container Registry. Business as usual.

$ docker build -t neuron-tensorflow-inference .

$ aws ecr create-repository --repository-name ecs-inf1-demo

$ aws ecr get-login-password | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

$ docker tag neuron-tensorflow-inference 123456789012.dkr.ecr.us-east-1.amazonaws.com/ecs-inf1-demo:latest

$ docker push

Now, we need to create a task definition in order to run this container on our cluster.

Creating a Task Definition for Inf1 Instances
If you don’t have one already, you should first create an execution role, i.e. a role allowing the ECS agent to perform API calls on your behalf. You can find more information in the documentation. Mine is called ecsTaskExecutionRole.

The full task definition is visible below. As you can see, it holds two containers:

  • The BERT container that I built,
  • A sidecar container called neuron-rtd, that allows the BERT container to access NeuronCores present on the Inf1 instance. The AWS_NEURON_VISIBLE_DEVICES environment variable lets you control which ones may be used by the container. You could use it to pin a container on one or several specific NeuronCores.
{
  "family": "ecs-neuron",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "entryPoint": [
        "sh",
        "-c"
      ],
      "portMappings": [
        {
          "hostPort": 8500,
          "protocol": "tcp",
          "containerPort": 8500
        },
        {
          "hostPort": 8501,
          "protocol": "tcp",
          "containerPort": 8501
        },
        {
          "hostPort": 0,
          "protocol": "tcp",
          "containerPort": 80
        }
      ],
      "command": [
        "tensorflow_model_server_neuron --port=8500 --rest_api_port=8501 --model_name=bert --model_base_path=/bert"
      ],
      "cpu": 0,
      "environment": [
        {
          "name": "NEURON_RTD_ADDRESS",
          "value": "unix:/sock/neuron-rtd.sock"
        }
      ],
      "mountPoints": [
        {
          "containerPath": "/sock",
          "sourceVolume": "sock"
        }
      ],
      "memoryReservation": 1000,
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/ecs-inf1-demo:latest",
      "essential": true,
      "name": "bert"
    },
    {
      "entryPoint": [
        "sh",
        "-c"
      ],
      "portMappings": [],
      "command": [
        "neuron-rtd -g unix:/sock/neuron-rtd.sock"
      ],
      "cpu": 0,
      "environment": [
        {
          "name": "AWS_NEURON_VISIBLE_DEVICES",
          "value": "ALL"
        }
      ],
      "mountPoints": [
        {
          "containerPath": "/sock",
          "sourceVolume": "sock"
        }
      ],
      "memoryReservation": 1000,
      "image": "790709498068.dkr.ecr.us-east-1.amazonaws.com/neuron-rtd:latest",
      "essential": true,
      "linuxParameters": { "capabilities": { "add": ["SYS_ADMIN", "IPC_LOCK"] } },
      "name": "neuron-rtd"
    }
  ],
  "volumes": [
    {
      "name": "sock",
      "host": {
        "sourcePath": "/tmp/sock"
      }
    }
  ]
}

Finally, I call the RegisterTaskDefinition API to let the ECS backend know about it.

$ aws ecs register-task-definition --cli-input-json file://inf1-task-definition.json

We’re now ready to run our container, and predict with it.

Running a Container on Inf1 Instances
As this is a prediction service, I want to make sure that it’s always available on the cluster. Instead of simply running a task, I create an ECS Service that will make sure the required number of container copies is running, relaunching them should any failure happen.

$ aws ecs create-service --cluster ecs-inf1-demo \
--service-name bert-inf1 \
--task-definition ecs-neuron:1 \
--desired-count 1

A minute later, I see that both task containers are running on the cluster.

Running containers

Predicting with BERT on ECS and Inf1
The inner workings of BERT are beyond the scope of this post. This particular model expects a sequence of 128 tokens, encoding the words of two sentences we’d like to compare for semantic equivalence.

Here, I’m only interested in measuring prediction latency, so dummy data is fine. I build 100 prediction requests storing a sequence of 128 zeros. Using the IP address of the BERT container, I send them to the TensorFlow Serving endpoint via grpc, and I compute the average prediction time.

Here is the full code.

import numpy as np
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
import time

if __name__ == '__main__':
    channel = grpc.insecure_channel('18.234.61.31:8500')
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'bert'
    i = np.zeros([1, 128], dtype=np.int32)
    request.inputs['input_ids'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))
    request.inputs['input_mask'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))
    request.inputs['segment_ids'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))

    latencies = []
    for i in range(100):
        start = time.time()
        result = stub.Predict(request)
        latencies.append(time.time() - start)
        print("Inference successful: {}".format(i))
    print ("Ran {} inferences successfully. Latency average = {}".format(len(latencies), np.average(latencies)))

For convenience, I’m running this code on an EC2 instance based on the Deep Learning AMI. It comes pre-installed with a Conda environment for TensorFlow and TensorFlow Serving, saving me from installing any dependencies.

$ source activate tensorflow_p36
$ python predict.py

On average, prediction took 56.5ms. As far as BERT goes, this is pretty good!

Ran 100 inferences successfully. Latency average = 0.05647835493087769

Getting Started
You can now deploy Amazon Elastic Compute Cloud (EC2) Inf1 instances on Amazon ECS today in the US East (N. Virginia) and US West (Oregon) regions. As Inf1 deployment progresses, you’ll be able to use them with Amazon ECS in more regions.

Give this a try, and please send us feedback either through your usual AWS Support contacts, on the AWS Forum for Amazon ECS, or on the container roadmap on Github.

– Julien

Logical separation: Moving beyond physical isolation in the cloud computing era

Post Syndicated from Min Hyun original https://aws.amazon.com/blogs/security/logical-separation-moving-beyond-physical-isolation-in-the-cloud-computing-era/

We’re sharing an update to the Logical Separation on AWS: Moving Beyond Physical Isolation in the Era of Cloud Computing whitepaper to help customers benefit from the security and innovation benefits of logical separation in the cloud. This paper discusses using a multi-pronged approach—leveraging identity management, network security, serverless and containers services, host and instance features, logging, and encryption—to build logical security mechanisms that meet and often exceed the security results of physical separation of resources and other on-premises security approaches. Public sector and commercial organizations worldwide can leverage these mechanisms to more confidently migrate sensitive workloads to the cloud without the need for physically dedicated infrastructure.

Amazon Web Services (AWS) addresses the concerns driving physical separation requirements through the logical security capabilities we provide customers and the security controls we have in place to protect customer data. The strength of that isolation combined with the automation and flexibility that the isolation provides is on par with or better than the security controls seen in traditional physically separated environments.

The paper also highlights a U.S. Department of Defense (DoD) use case demonstrating how the AWS logical separation capabilities met the intent behind a DoD requirement for dedicated, physically isolated infrastructure for its most sensitive unclassified workloads.

Download and read the updated whitepaper.

If you have questions or want to learn more, contact your account executive or contact AWS Support. If you have feedback about this post, submit comments in the Comments section below.

Note: The post announcing the original version of the whitepaper can be found here: https://aws.amazon.com/blogs/security/how-aws-meets-a-physical-separation-requirement-with-a-logical-separation-approach/

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Min Hyun

Min is the Global Lead for Growth Strategies at AWS. Her team’s mission is to set the industry bar in thought leadership for security and data privacy assurance in emerging technology, trends, and strategy to advance customers’ journeys to AWS. View her other Security Blog publications here

Author

Tim Anderson

Tim is a Senior Security Advisor with AWS Security where he addresses security, compliance, and privacy needs of customers and industry globally. He also designs solutions, capabilities, and practices to teach and democratize security concepts to meet challenges across the global landscape. Before AWS, Tim spent 16 years managing security and compliance programs for DoD and other federal agencies.

Securing Amazon EKS workloads with Atlassian Bitbucket and Snyk

Post Syndicated from James Bland original https://aws.amazon.com/blogs/devops/securing-amazon-eks-workloads-with-atlassian-bitbucket-and-snyk/

This post was contributed by James Bland, Sr. Partner Solutions Architect, AWS, Jay Yeras, Head of Cloud and Cloud Native Solution Architecture, Snyk, and Venkat Subramanian, Group Product Manager, Bitbucket

 

One of our goals at Atlassian is to make the software delivery and development process easier. This post explains how you can set up a software delivery pipeline using Bitbucket Pipelines and Snyk, a tool that finds and fixes vulnerabilities in open-source dependencies and container images, to deploy secured applications on Amazon Elastic Kubernetes Service (Amazon EKS). By presenting important development information directly on pull requests inside the product, you can proactively diagnose potential issues, shorten test cycles, and improve code quality.

Atlassian Bitbucket Cloud is a Git-based code hosting and collaboration tool, built for professional teams. Bitbucket Pipelines is an integrated CI/CD service that allows you to automatically build, test, and deploy your code. With its best-in-class integrations with Jira, Bitbucket Pipelines allows different personas in an organization to collaborate and get visibility into the deployments. Bitbucket Pipes are small chunks of code that you can drop into your pipeline to make it easier to build powerful, automated CI/CD workflows.

In this post, we go over the following topics:

  • The importance of security as practices shift-left in DevOps
  • How embedding security into pull requests helps developer workflows
  • Deploying an application on Amazon EKS using Bitbucket Pipelines and Snyk

Shift-left on security

Security is usually an afterthought. Developers tend to focus on delivering software first and addressing security issues later when IT Security, Ops, or InfoSec teams discover them. However, research from the 2016 State of DevOps Report shows that you can achieve better outcomes by testing for security earlier in the process within a developer’s workflow. This concept is referred to as shift-left, where left indicates earlier in the process, as illustrated in the following diagram.

There are two main challenges in shifting security left to developers:

  • Developers aren’t security experts – They develop software in the most efficient way they know how, which can mean importing libraries to take care of lower-level details. And sometimes these libraries import other libraries within them, and so on. This makes it almost impossible for a developer, who is not a security expert, to keep track of security.
  • It’s time-consuming – There is no automation. Developers have to run tests to understand what’s happening and then figure out how to fix it. This slows them down and takes them away from their core job: building software.

Time spent on SDLC testing

Enabling security into a developer’s workflow

Code Insights is a new feature in Bitbucket that provides contextual information as part of the pull request interface. It surfaces information relevant to a pull request so issues related to code quality or security vulnerabilities can be viewed and acted upon during the code review process. The following screenshot shows Code Insights on the pull request sidebar.

 

Code insights

In the security space, we’ve partnered with Snyk, McAfee, Synopsys, and Anchore. When you use any of these integrations in your Bitbucket Pipeline, security vulnerabilities are automatically surfaced within your pull request, prompting developers to address them. By bringing the vulnerability information into the pull request interface before the actual deployment, it’s much easier for code reviewers to assess the impact of the vulnerability and provide actionable feedback.

When security issues are fixed as part of a developer’s workflow instead of post-deployment, it means fewer sev1 incidents, which saves developer time and IT resources down the line, and leads to a better user experience for your customers.

 

Securing your Atlassian Workflow with Snyk

To demonstrate how you can easily introduce a few steps to your workflow that improve your security posture, we take advantage of the new Snyk integration to Atlassian’s Code Insights and other Snyk integrations to Bitbucket Cloud, Amazon Elastic Container Registry (Amazon ECR, for more information see Container security with Amazon Elastic Container Registry (ECR): integrate and test), and Amazon EKS (for more information see Kubernetes workload and image scanning. We reference sample code in a publicly available Bitbucket repository. In this repository, you can find resources such as a multi-stage build Dockerfile for a sample Java web application, a sample bitbucket-pipelines.yml configured to perform Snyk scans and push container images to Amazon ECR, and a reference Kubernetes manifest to deploy your application.

Prerequisites

You first need to have a few resources provisioned, such as an Amazon ECR repository and an Amazon EKS cluster. You can quickly create these using the AWS Command Line Interface (AWS CLI) by invoking the create-repository command and following the Getting started with eksctl guide. Next, make sure that you have enabled the new code review experience in your Bitbucket account.

To take a closer look at the bitbucket-pipelines.yml file, see the following code:

script:
 - IMAGE_NAME="petstore"
 - docker build -t $IMAGE_NAME .
 - pipe: snyk/snyk-scan:0.4.3
   variables:
     SNYK_TOKEN: $SNYK_TOKEN
     LANGUAGE: "docker"
     IMAGE_NAME: $IMAGE_NAME
     TARGET_FILE: "Dockerfile"
     CODE_INSIGHTS_RESULTS: "true"
     SEVERITY_THRESHOLD: "high"
     DONT_BREAK_BUILD: "true"
 - pipe: atlassian/aws-ecr-push-image:1.1.2
   variables:
     AWS_ACCESS_KEY_ID: $AWS_ACCESS_KEY_ID
     AWS_SECRET_ACCESS_KEY: $AWS_SECRET_ACCESS_KEY
     AWS_DEFAULT_REGION: "us-west-2"
     IMAGE_NAME: $IMAGE_NAME

In the preceding code, we invoke two Bitbucket Pipes to easily configure our pipeline and complete two critical tasks in just a few lines: scan our container image and push to our private registry. This saves time and allows for reusability across repositories while discovering innovative ways to automate our pipelines thanks to an extensive catalog of integrations.

Snyk pipe for Bitbucket Pipelines

In the following use case, we build a container image from the Dockerfile included in the Bitbucket repository and scan the image using the Snyk pipe. We also invoke the aws-ecr-push-image pipe to securely store our image in a private registry on Amazon ECR. When the pipeline runs, we see results as shown in the following screenshot.

Bitbucket pipeline

If we choose the available report, we can view the detailed results of our Snyk scan. In the following screenshot, we see detailed insights into the content of that report: three high, one medium, and five low-severity vulnerabilities were found in our container image.

container image report

 

Snyk scans of Bitbucket and Amazon ECR repositories

Because we use Snyk’s integration to Amazon ECR and Snyk’s Bitbucket Cloud integration to scan and monitor repositories, we can dive deeper into these results by linking our Dockerfile stored in our Bitbucket repository to the results of our last container image scan. By doing so, we can view recommendations for upgrading our base image, as in the following screenshot.

 

ECR scan recommendations

As a result, we can move past informational insights and onto actionable recommendations. In the preceding screenshot, our current image of jboss/wilfdly:11.0.0.Final contains 76 vulnerabilities. We also see two recommendations: a major upgrade to jboss/wildfly:18.0.1.FINAL, which brings our total vulnerabilities down to 65, and an alternative upgrade, which is less desirable.

 

We can investigate further by drilling down into the report to view additional context on how a potential vulnerability was introduced, and also create a Jira issue to Atlassian Jira Software Cloud. The following screenshot shows a detailed report on the Issues tab.

 

Jira issue

We can also explore the Dependencies tab for a list of all the direct dependencies, transitive dependencies, and the vulnerabilities those may contain. See the following screenshot.

 

dependency vulnerabilities

Snyk scan Amazon EKS configuration

The final step in securing our workflow involves integrating Snyk with Kubernetes and deploying to Amazon EKS and Bitbucket Pipelines. Sample Kubernetes manifest files and a bitbucket-pipeline.yml are available for you to use in the accompanying Bitbucket repository for this post. Our bitbucket-pipeline.yml contains the following step:

script:
 - pipe: atlassian/aws-eks-kubectl-run:1.2.3
   variables:
     AWS_ACCESS_KEY_ID: $AWS_ACCESS_KEY_ID
     AWS_SECRET_ACCESS_KEY: $AWS_SECRET_ACCESS_KEY
     AWS_DEFAULT_REGION: $AWS_DEFAULT_REGION
     CLUSTER_NAME: "my-kube-cluster"
     KUBECTL_COMMAND: "apply"
     RESOURCE_PATH: "java-app.yaml"

In the preceding code, we call the aws-eks-kubectl-run pipe and pass in a few repository variables we previously defined (see the following screenshot).

 

repository variables

For more information about generating the necessary access keys in AWS Identity and Access Management (IAM) to make programmatic requests to the AWS API, see Creating an IAM User in Your AWS Account.

Now that we have provisioned the supporting infrastructure and invoked kubectl apply -f java-app.yaml to deploy our pods using our container images in Amazon ECR, we can monitor our project details and view some initial results. The following screenshot shows that our initial configuration isn’t secure.

secure config scan results

The reason for this is that we didn’t explicitly define a few parameters in our Kubernetes manifest under securityContext. For example, parameters such as readOnlyRootFilesystem, runAsNonRoot, allowPrivilegeEscalation, and capabilities either aren’t defined or are set incorrectly in our template. As a result, we see this in our findings with the FAIL flag. Hovering over these on the Snyk console provides specific insights on how to fix these, for example:

  • Run as non-root – Whether any containers in the workload have securityContext.runAsNonRoot set to false or unset
  • Read-only root file system – Whether any containers in the workload have securityContext.readOnlyFilesystem set to false or unset
  • Drop capabilities – Whether all capabilities are dropped and CAP_SYS_ADMIN isn’t added

 

To save you the trouble of researching this, we provide another sample template, java-app-snyk.yaml, which you can apply against your running pods. The difference in this template is that we have included the following lines to the manifest, which address the three failed findings in our report:

securityContext:
 allowPrivilegeEscalation: false
 readOnlyRootFilesystem: true
 runAsNonRoot: true
 capabilities:
   drop:
     - all

After a subsequent scan, we can validate our changes propagated successfully and our Kubernetes configuration is secure (see the following screenshot).

secure config scan passing results

Conclusion

This post demonstrated how to secure your entire flow proactively with Atlassian Bitbucket Cloud and Snyk. Seamless integrations to Bitbucket Cloud provide you with actionable insights at each step of your development process.

Get started for free with Bitbucket and Snyk and learn more about the Bitbucket-Snyk integration.

 

“The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.”

 

AWS App2Container – A New Containerizing Tool for Java and ASP.NET Applications

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-app2container-a-new-containerizing-tool-for-java-and-asp-net-applications/

Our customers are increasingly developing their new applications with containers and serverless technologies, and are using modern continuous integration and delivery (CI/CD) tools to automate the software delivery life cycle. They also maintain a large number of existing applications that are built and managed manually or using legacy systems. Maintaining these two sets of applications with disparate tooling adds to operational overhead and slows down the pace of delivering new business capabilities. As much as possible, they want to be able to standardize their management tooling and CI/CD processes across both their existing and new applications, and see the option of packaging their existing applications into containers as the first step towards accomplishing that goal.

However, containerizing existing applications requires a long list of manual tasks such as identifying application dependencies, writing dockerfiles, and setting up build and deployment processes for each application. These manual tasks are time consuming, error prone, and can slow down the modernization efforts.

Today, we are launching AWS App2Container, a new command-line tool that helps containerize existing applications that are running on-premises, in Amazon Elastic Compute Cloud (EC2), or in other clouds, without needing any code changes. App2Container discovers applications running on a server, identifies their dependencies, and generates relevant artifacts for seamless deployment to Amazon ECS and Amazon EKS. It also provides integration with AWS CodeBuild and AWS CodeDeploy to enable a repeatable way to build and deploy containerized applications.

AWS App2Container generates the following artifacts for each application component: Application artifacts such as application files/folders, Dockerfiles, container images in Amazon Elastic Container Registry (ECR), ECS Task definitions, Kubernetes deployment YAML, CloudFormation templates to deploy the application to Amazon ECS or EKS, and templates to set up a build/release pipeline in AWS Codepipeline which also leverages AWS CodeBuild and CodeDeploy.

Starting today, you can use App2Container to containerize ASP.NET (.NET 3.5+) web applications running in IIS 7.5+ on Windows, and Java applications running on Linux—standalone JBoss, Apache Tomcat, and generic Java applications such as Spring Boot, IBM WebSphere, Oracle WebLogic, etc.

By modernizing existing applications using containers, you can make them portable, increase development agility, standardize your CI/CD processes, and reduce operational costs. Now let’s see how it works!

AWS App2Container – Getting Started
AWS App2Container requires that the following prerequisites be installed on the server(s) hosting your application: AWS Command Line Interface (CLI) version 1.14 or later, Docker tools, and (in the case of ASP.NET) Powershell 5.0+ for applications running on Windows. Additionally, you need to provide appropriate IAM permissions to App2Container to interact with AWS services.

For example, let’s look how you containerize your existing Java applications. App2Container CLI for Linux is packaged as a tar.gz archive. The file provides users an interactive shell script, install.sh to install the App2Container CLI. Running the script guides users through the install steps and also updates the user’s path to include the App2Container CLI commands.

First, you can begin by running a one-time initialization on the installed server for the App2Container CLI with the init command.

$ sudo app2container init
Workspace directory path for artifacts[default:  /home/ubuntu/app2container/ws]:
AWS Profile (configured using 'aws configure --profile')[default: default]:  
Optional S3 bucket for application artifacts (Optional)[default: none]: 
Report usage metrics to AWS? (Y/N)[default: y]:
Require images to be signed using Docker Content Trust (DCT)? (Y/N)[default: n]:
Configuration saved

This sets up a workspace to store application containerization artifacts (minimum 20GB of disk space available). You can extract them into your Amazon Simple Storage Service (S3) bucket using your AWS profile configured to use AWS services.

Next, you can view Java processes that are running on the application server by using the inventory command. Each Java application process has a unique identifier (for example, java-tomcat-9e8e4799) which is the application ID. You can use this ID to refer to the application with other App2Container CLI commands.

$ sudo app2container inventory
{
    "java-jboss-5bbe0bec": {
        "processId": 27366,
        "cmdline": "java ... /home/ubuntu/wildfly-10.1.0.Final/modules org.jboss.as.standalone -Djboss.home.dir=/home/ubuntu/wildfly-10.1.0.Final -Djboss.server.base.dir=/home/ubuntu/wildfly-10.1.0.Final/standalone ",
        "applicationType": "java-jboss"
    },
    "java-tomcat-9e8e4799": {
        "processId": 2537,
        "cmdline": "/usr/bin/java ... -Dcatalina.home=/home/ubuntu/tomee/apache-tomee-plume-7.1.1 -Djava.io.tmpdir=/home/ubuntu/tomee/apache-tomee-plume-7.1.1/temp org.apache.catalina.startup.Bootstrap start ",
        "applicationType": "java-tomcat"
    }
}

You can also intialize ASP.NET applications on an administrator-run PowerShell session of Windows Servers with IIS version 7.0 or later. Note that Docker tools and container support are available on Windows Server 2016 and later versions. You can select to run all app2container operations on the application server with Docker tools installed or use a worker machine with Docker tools using Amazon ECS-optimized Windows Server AMIs.

PS> app2container inventory
{
    "iis-smarts-51d2dbf8": {
        "siteName": "nopCommerce39",
        "bindings": "http/*:90:",
        "applicationType": "iis"
    }
}

The inventory command displays all IIS websites on the application server that can be containerized. Each IIS website process has a unique identifier (for example, iis-smarts-51d2dbf8) which is the application ID. You can use this ID to refer to the application with other App2Container CLI commands.

You can choose a specific application by referring to its application ID and generate an analysis report for the application by using the analyze command.

$ sudo app2container analyze --application-id java-tomcat-9e8e4799
Created artifacts folder /home/ubuntu/app2container/ws/java-tomcat-9e8e4799
Generated analysis data in /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/analysis.json
Analysis successful for application java-tomcat-9e8e4799
Please examine the same, make appropriate edits and initiate containerization using "app2container containerize --application-id java-tomcat-9e8e4799"

You can use the analysis.json template generated by the application analysis to gather information on the analyzed application that helps identify all system dependencies from the analysisInfo section, and update containerization parameters to customize the container images generated for the application using the containerParameters section.

$ cat java-tomcat-9e8e4799/analysis.json
{
    "a2CTemplateVersion": "1.0",
	"createdTime": "2020-06-24 07:40:5424",
    "containerParameters": {
        "_comment1": "*** EDITABLE: The below section can be edited according to the application requirements. Please see the analyisInfo section below for deetails discoverd regarding the application. ***",
        "imageRepository": "java-tomcat-9e8e4799",
        "imageTag": "latest",
        "containerBaseImage": "ubuntu:18.04",
        "coopProcesses": [ 6446, 6549, 6646]
    },
    "analysisInfo": {
        "_comment2": "*** NON-EDITABLE: Analysis Results ***",
        "processId": 2537
        "appId": "java-tomcat-9e8e4799",
		"userId": "1000",
        "groupId": "1000",
        "cmdline": [...],
        "os": {...},
        "ports": [...]
    }
}

Also, you can run the $ app2container extract --application-id java-tomcat-9e8e4799 command to generate an application archive for the analyzed application. This depends on the analysis.json file generated earlier in the workspace folder for the application,and adheres to any containerization parameter updates specified in there. By using extract command, you can continue the workflow on a worker machine after running the first set of commands on the application server.

Now you can containerize command generated Docker images for the selected application.

$ sudo app2container containerize --application-id java-tomcat-9e8e4799
AWS pre-requisite check succeeded
Docker pre-requisite check succeeded
Extracted container artifacts for application
Entry file generated
Dockerfile generated under /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/Artifacts
Generated dockerfile.update under /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/Artifacts
Generated deployment file at /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/deployment.json
Containerization successful. Generated docker image java-tomcat-9e8e4799
You're all set to test and deploy your container image.

Next Steps:
1. View the container image with \"docker images\" and test the application.
2. When you're ready to deploy to AWS, please edit the deployment file as needed at /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/deployment.json.
3. Generate deployment artifacts using app2container generate app-deployment --application-id java-tomcat-9e8e4799

Using this command, you can view the generated container images using Docker images on the machine where the containerize command is run. You can use the docker run command to launch the container and test application functionality.

Note that in addition to generating container images, the containerize command also generates a deployment.json template file that you can use with the next generate-appdeployment command. You can edit the parameters in the deployment.json template file to change the image repository name to be registered in Amazon ECR, the ECS task definition parameters, or the Kubernetes App name.

$ cat java-tomcat-9e8e4799/deployment.json
{
       "a2CTemplateVersion": "1.0",
       "applicationId": "java-tomcat-9e8e4799",
       "imageName": "java-tomcat-9e8e4799",
       "exposedPorts": [
              {
                     "localPort": 8090,
                     "protocol": "tcp6"
              }
       ],
       "environment": [],
       "ecrParameters": {
              "ecrRepoTag": "latest"
       },
       "ecsParameters": {
              "createEcsArtifacts": true,
              "ecsFamily": "java-tomcat-9e8e4799",
              "cpu": 2,
              "memory": 4096,
              "dockerSecurityOption": "",
              "enableCloudwatchLogging": false,
              "publicApp": true,
              "stackName": "a2c-java-tomcat-9e8e4799-ECS",
              "reuseResources": {
                     "vpcId": "",
                     "cfnStackName": "",
                     "sshKeyPairName": ""
              },
              "gMSAParameters": {
                     "domainSecretsArn": "",
                     "domainDNSName": "",
                     "domainNetBIOSName": "",
                     "createGMSA": false,
                     "gMSAName": ""
              }
       },
       "eksParameters": {
              "createEksArtifacts": false,
              "applicationName": "",
              "stackName": "a2c-java-tomcat-9e8e4799-EKS",
              "reuseResources": {
                     "vpcId": "",
                     "cfnStackName": "",
                     "sshKeyPairName": ""
              }
       }
 }

At this point, the application workspace where the artifacts are generated serves as an iteration sandbox. You can choose to edit the Dockerfile generated here to make changes to their application and use the docker build command to build new container images as needed. You can generate the artifacts needed to deploy the application containers in Amazon EKS by using the generate-deployment command.

$ sudo app2container generate app-deployment --application-id java-tomcat-9e8e4799
AWS pre-requisite check succeeded
Docker pre-requisite check succeeded
Created ECR Repository
Uploaded Cloud Formation resources to S3 Bucket: none
Generated Cloud Formation Master template at: /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/EksDeployment/amazon-eks-master.template.yaml
EKS Cloudformation templates and additional deployment artifacts generated successfully for application java-tomcat-9e8e4799

You're all set to use AWS Cloudformation to manage your application stack.
Next Steps:
1. Edit the cloudformation template as necessary.
2. Create an application stack using the AWS CLI or the AWS Console. AWS CLI command:

       aws cloudformation deploy --template-file /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/EksDeployment/amazon-eks-master.template.yaml --capabilities CAPABILITY_NAMED_IAM --stack-name java-tomcat-9e8e4799

3. Setup a pipeline for your application stack:

       app2container generate pipeline --application-id java-tomcat-9e8e4799

This command works based on the deployment.json template file produced as part of running the containerize command. App2Container will now generate ECS/EKS cloudformation templates as well and an option to deploy those stacks.

The command registers the container image to user specified ECR repository, generates cloudformation template for Amazon ECS and EKS deployments. You can register ECS task definition with Amazon ECS and use kubectl to launch the containerized application on the existing Amazon EKS or self-managed kubernetes cluster using App2Container generated amazon-eks-master.template.deployment.yaml.

Alternatively, you can directly deploy containerized applications by --deploy options into Amazon EKS.

$ sudo app2container generate app-deployment --application-id java-tomcat-9e8e4799 --deploy
AWS pre-requisite check succeeded
Docker pre-requisite check succeeded
Created ECR Repository
Uploaded Cloud Formation resources to S3 Bucket: none
Generated Cloud Formation Master template at: /home/ubuntu/app2container/ws/java-tomcat-9e8e4799/EksDeployment/amazon-eks-master.template.yaml
Initiated Cloudformation stack creation. This may take a few minutes. Please visit the AWS Cloudformation Console to track progress.
Deploying application to EKS

Handling ASP.NET Applications with Windows Authentication
Containerizing ASP.NET applications is almost same process as Java applications, but Windows containers cannot be directly domain joined. They can however still use Active Directory (AD) domain identities to support various authentication scenarios.

App2Container detects if a site is using Windows authentication and accordingly makes the IIS site’s application pool run as the network service identity, and generates the new cloudformation templates for Windows authenticated IIS applications. The creation of gMSA and AD Security group, domain join ECS nodes and making containers use this gMSA are all taken care of by those templates.

Also, it provides two PowerShell scripts as output to the $ app2container containerize command along with an instruction file on how to use it.

The following is an example output:

PS C:\Windows\system32> app2container containerize --application-id iis-SmartStoreNET-a726ba0b
Running AWS pre-requisite check...
Running Docker pre-requisite check...
Container build complete. Please use "docker images" to view the generated container images.
Detected that the Site is using Windows Authentication.
Generating powershell scripts into C:\Users\Admin\AppData\Local\app2container\iis-SmartStoreNET-a726ba0b\Artifacts required to setup Container host with Windows Authentication
Please look at C:\Users\Admin\AppData\Local\app2container\iis-SmartStoreNET-a726ba0b\Artifacts\WindowsAuthSetupInstructions.md for setup instructions on Windows Authentication.
A deployment file has been generated under C:\Users\Admin\AppData\Local\app2container\iis-SmartStoreNET-a726ba0b
Please edit the same as needed and generate deployment artifacts using "app2container generate-deployment"

The first PowerShellscript, DomainJoinAddToSecGroup.ps1, joins the container host and adds it to an Active Directory security group. The second script, CreateCredSpecFile.ps1, creates a Group Managed Service Account (gMSA), grants access to the Active Directory security group, generates the credential spec for this gMSA, and stores it locally on the container host. You can execute these PowerShellscripts on the ECS host. The following is an example usage of the scripts:

PS C:\Windows\system32> .\DomainJoinAddToSecGroup.ps1 -ADDomainName Dominion.com -ADDNSIp 10.0.0.1 -ADSecurityGroup myIISContainerHosts -CreateADSecurityGroup:$true
PS C:\Windows\system32> .\CreateCredSpecFile.ps1 -GMSAName MyGMSAForIIS -CreateGMSA:$true -ADSecurityGroup myIISContainerHosts

Before executing the app2container generate-deployment command, edit the deployment.json file to change the value of dockerSecurityOption to the name of the CredentialSpec file that the CreateCredSpecFile script generated. For example,
"dockerSecurityOption": "credentialspec:file://dominion_mygmsaforiis.json"

Effectively, any access to network resource made by the IIS server inside the container for the site will now use the above gMSA to authenticate. The final step is to authorize this gMSA account on the network resources that the IIS server will access. A common example is authorizing this gMSA inside the SQL Server.

Finally, if the application must connect to a database to be fully functional and you run the container in Amazon ECS, ensure that the application container created from the Docker image generated by the tool has connectivity to the same database. You can refer to this documentation for options on migrating: MS SQL Server from Windows to Linux on AWS, Database Migration Service, and backup and restore your MS SQL Server to Amazon RDS.

Now Available
AWS App2Container is offered free. You only pay for the actual usage of AWS services like Amazon EC2, ECS, EKS, and S3 etc based on their usage. For details, please refer to App2Container FAQs and documentations. Give this a try, and please send us feedback either through your usual AWS Support contacts, on the AWS Forum for ECS, AWS Forum for EKS, or on the container roadmap on Github.

Channy;

How to build a CI/CD pipeline for container vulnerability scanning with Trivy and AWS Security Hub

Post Syndicated from Amrish Thakkar original https://aws.amazon.com/blogs/security/how-to-build-ci-cd-pipeline-container-vulnerability-scanning-trivy-and-aws-security-hub/

In this post, I’ll show you how to build a continuous integration and continuous delivery (CI/CD) pipeline using AWS Developer Tools, as well as Aqua Security‘s open source container vulnerability scanner, Trivy. You’ll build two Docker images, one with vulnerabilities and one without, to learn the capabilities of Trivy and how to send all vulnerability information to AWS Security Hub.

If you’re building modern applications, you might be using containers, or have experimented with them. A container is a standard way to package your application’s code, configurations, and dependencies into a single object. In contrast to virtual machines (VMs), containers virtualize the operating system rather than the server. Thus, the images are orders of magnitude smaller, and they start up much more quickly.

Like VMs, containers need to be scanned for vulnerabilities and patched as appropriate. For VMs running on Amazon Elastic Compute Cloud (Amazon EC2), you can use Amazon Inspector, a managed vulnerability assessment service, and then patch your EC2 instances as needed. For containers, vulnerability management is a little different. Instead of patching, you destroy and redeploy the container.

Many container deployments use Docker. Docker uses Dockerfiles to define the commands you use to build the Docker image that forms the basis of your container. Instead of patching in place, you rewrite your Dockerfile to point to more up-to-date base images, dependencies, or both and to rebuild the Docker image. Trivy lets you know which dependencies in the Docker image are vulnerable, and which version of those dependencies are no longer vulnerable, allowing you to quickly understand what to patch to get back to a secure state.

Solution architecture

 

Figure 1: Solution architecture

Figure 1: Solution architecture

Here’s how the solution works, as shown in Figure 1:

  1. Developers push Dockerfiles and other code to AWS CodeCommit.
  2. AWS CodePipeline automatically starts an AWS CodeBuild build that uses a build specification file to install Trivy, build a Docker image, and scan it during runtime.
  3. AWS CodeBuild pushes the build logs in near real-time to an Amazon CloudWatch Logs group.
  4. Trivy scans for all vulnerabilities and sends them to AWS Security Hub, regardless of severity.
  5. If no critical vulnerabilities are found, the Docker images are deemed to have passed the scan and are pushed to Amazon Elastic Container Registry (ECR), so that they can be deployed.

Note: CodePipeline supports different sources, such as Amazon Simple Storage Service (Amazon S3) or GitHub. If you’re comfortable with those services, feel free to substitute them for this walkthrough of the solution.

To quickly deploy the solution, you’ll use an AWS CloudFormation template to deploy all needed services.

Prerequisites

  1. You must have Security Hub enabled in the AWS Region where you deploy this solution. In the AWS Management Console, go to AWS Security Hub, and select Enable Security Hub.
  2. You must have Aqua Security integration enabled in Security Hub in the Region where you deploy this solution. To do so, go to the AWS Security Hub console and, on the left, select Integrations, search for Aqua Security, and then select Accept Findings.

Setting up

For this stage, you’ll deploy the CloudFormation template and do preliminary setup of the CodeCommit repository.

  1. Download the CloudFormation template from GitHub and create a CloudFormation stack. For more information on how to create a CloudFormation stack, see Getting Started with AWS CloudFormation.
  2. After the CloudFormation stack completes, go to the CloudFormation console and select the Resources tab to see the resources created, as shown in Figure 2.

 

Figure 2: CloudFormation output

Figure 2: CloudFormation output

Setting up the CodeCommit repository

CodeCommit repositories need at least one file to initialize their master branch. Without a file, you can’t use a CodeCommit repository as a source for CodePipeline. To create a sample file, do the following.

  1. Go to the CodeCommit console and, on the left, select Repositories, and then select your CodeCommit repository.
  2. Scroll to the bottom of the page, select the Add File dropdown, and then select Create file.
  3. In the Create a file screen, enter readme into the text body, name the file readme.md, enter your name as Author name and your Email address, and then select Commit changes, as shown in Figure 3.

    Figure 3: Creating a file in CodeCommit

    Figure 3: Creating a file in CodeCommit

Simulate a vulnerable image build

For this stage, you’ll create the necessary files and add them to your CodeCommit repository to start an automated container vulnerability scan.

    1. Download the buildspec.yml file from the GitHub repository.

      Note: In the buildspec.yml code, the values prepended with $ will be populated by the CodeBuild environmental variables you created earlier. Also, the command trivy -f json -o results.json –exit-code 1 will fail your build by forcing Trivy to return an exit code 1 upon finding a critical vulnerability. You can add additional severity levels here to force Trivy to fail your builds and ensure vulnerabilities of lower severity are not published to Amazon ECR.

    2. Download the python code file sechub_parser.py from the GitHub repository. This script parses vulnerability details from the JSON file that Trivy generates, maps the information to the AWS Security Finding Format (ASFF), and then imports it to Security Hub.
    3. Next, download the Dockerfile from the GitHub repository. The code clones a GitHub repository maintained by the Trivy team that has purposely vulnerable packages that generate critical vulnerabilities.
    4. Go back to your CodeCommit repository, select the Add file dropdown menu, and then select Upload file.
    5. In the Upload file screen, select Choose file, select the build specification you just created (buildspec.yml), complete the Commit changes to master section by adding the Author name and Email address, and select Commit changes, as shown in Figure 4.

 

Figure 4: Uploading a file to CodeCommit

Figure 4: Uploading a file to CodeCommit

 

  • To upload your Dockerfile and sechub_parser.py script to CodeCommit, repeat steps 4 and 5 for each of these files.
  • Your pipeline will automatically start in response to every new commit to your repository. To check the status, go back to the pipeline status view of your CodePipeline pipeline.
  • When CodeBuild starts, select Details in the Build stage of the CodePipeline, under BuildAction, to go to the Build section on the CodeBuild console. To see a stream of logs as your build progresses, select Tail logs, as shown in Figure 5.

    Figure 5: CodeBuild Tailed Logs

    Figure 5: CodeBuild Tailed Logs

  • After Trivy has finished scanning your image, CodeBuild will fail due to the critical vulnerabilities found, as shown in Figure 6.

    Note: The command specified in the post-build stage will run even if the CodeBuild build fails. This is by design and allows the sechub_parser.py script to run and send findings to Security Hub.

     

    Figure 6: CodeBuild logs failure

    Figure 6: CodeBuild logs failure

 

You’ll now go to Security Hub to further analyze the findings and create saved searches for future use.

Analyze container vulnerabilities in Security Hub

For this stage, you’ll analyze your container vulnerabilities in Security Hub and use the findings view to locate information within the ASFF.

  1. Go to the Security Hub console and select Integrations in the left-hand navigation pane.
  2. Scroll down to the Aqua Security integration card and select See findings, as shown in Figure 7. This filters to only Aqua Security product findings in the Findings view.

    Figure 7: Aqua Security integration card

    Figure 7: Aqua Security integration card

  3. You should now see critical vulnerabilities from your previous scan in the Findings view, as shown in Figure 8. To see more details of a finding, select the Title of any of the vulnerabilities, and you will see the details in the right side of the Findings view.
    Figure 8: Security Hub Findings pane

    Figure 8: Security Hub Findings pane

    Note: Within the Findings view, you can apply quick filters by checking the box next to certain fields, although you won’t do that for the solution in this post.

  4. To open a new tab to a website about the Common Vulnerabilities and Exposures (CVE) for the finding, select the hyperlink within the Remediation section, as shown in Figure 9.
    Figure 9: Remediation information

    Figure 9: Remediation information

    Note: The fields shown in Figure 9 are dynamically populated by Trivy in its native output, and the CVE website can differ greatly from vulnerability to vulnerability.

  5. To see the JSON of the full ASFF, at the top right of the Findings view, select the hyperlink for Finding ID.
  6. To find information mapped from Trivy, such as the CVE title and what the patched version of the vulnerable package is, scroll down to the Other section, as shown in Figure 10.

    Figure 10: ASFF, other

    Figure 10: ASFF, other

This was a brief demonstration of exploring findings with Security Hub. You can use custom actions to define response and remediation actions, such as sending these findings to a ticketing system or aggregating them in a security information event management (SIEM) tool.

Push a non-vulnerable Dockerfile

Now that you’ve seen Trivy perform correctly with a vulnerable image, you’ll fix the vulnerabilities. For this stage, you’ll modify Dockerfile to remove any vulnerable dependencies.

  1. Open a text editor, paste in the code shown below, and save it as Dockerfile. You can overwrite your previous example if desired.
    
    FROM alpine:3.7
    RUN apk add --no-cache mysql-client
    ENTRYPOINT ["mysql"]
    	

  2. Upload the new Dockerfile to CodeCommit, as shown earlier in this post.

Clean up

To avoid incurring additional charges from running these services, disable Security Hub and delete the CloudFormation stack after you’ve finished evaluating this solution. This will delete all resources created during this post. Deleting the CloudFormation stack will not remove the findings in Security Hub. If you don’t disable Security Hub, you can archive those findings and wait 90 days for them to be fully removed from your Security Hub instance.

Conclusion

In this post, you learned how to create a CI/CD Pipeline with CodeCommit, CodeBuild, and CodePipeline for building and scanning Docker images for critical vulnerabilities. You also used Security Hub to aggregate scan findings and take action on your container vulnerabilities.

You now have the skills to create similar CI/CD pipelines with AWS Developer Tools and to perform vulnerability scans as part of your container build process. Using these reports, you can efficiently identify CVEs and work with your developers to use up-to-date libraries and binaries for Docker images. To further build upon this solution, you can change the Trivy commands in your build specification to fail the builds on different severity levels of found vulnerabilities. As a final suggestion, you can use ECR as a source for a downstream CI/CD pipeline responsible for deploying your newly-scanned images to Amazon Elastic Container Service (Amazon ECS).

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Amrish Thakkar

Amrish is a Senior Solutions Architect at AWS. He holds the AWS Certified Solutions Architect Professional and AWS Certified DevOps Engineer Professional certifications. Amrish is passionate about DevOps, Microservices, Containerization, and Application Security, and devotes personal time into research and advocacy on these topics. Outside of work, he enjoys spending time with family and watching LOTR trilogy frequently.

Field Notes: Optimize your Java application for AWS Lambda with Quarkus

Post Syndicated from Sascha Moellering original https://aws.amazon.com/blogs/architecture/field-notes-optimize-your-java-application-for-aws-lambda-with-quarkus/

This blog post is a continuation of an existing article about optimizing your Java application for Amazon ECS with Quarkus. In this blog post, we examine the benefits of Quarkus in the context of AWS Lambda. Quarkus is a framework that uses the Open Java Development Kit (OpenJDK) with GraalVM and over 50 libraries like RESTEasy, Vertx, Hibernate, and Netty. This blog post shows you an effective approach for implementing a Java-based application and compiling it into a native-image through Quarkus. You can find the demo application code on GitHub.

Getting started

To build and deploy this application, you will need the AWS CLI, the AWS Serverless Application Model (AWS SAM), Git, Maven, OpenJDK 11, and Docker. AWS Cloud9 makes the setup easy. AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code with just a browser. It comes with the AWS tools, Git, and Docker installed.

Create a new AWS Cloud9 EC2 environment based on Amazon Linux. Because the compilation process is very memory intensive, it is recommended to select an instance with at least 8 GiB of RAM (for example, m5.large).

AWS Cloud9 environment

Figure 1 – AWS Cloud9 environment

Launching the AWS Cloud9 environment from the AWS Management Console, you select the instance type. Pick an instance type with at least 8 GiB of RAM.

After creation, you are redirected automatically to your AWS Cloud9 environment’s IDE. You can navigate back to your IDE at any time through the AWS Cloud9 console.

All code blocks in this blog post refer to commands you enter into the terminal provided by the AWS Cloud9 IDE. AWS Cloud9 executes the commands on an underlying EC2 instance. If necessary, you can open a new Terminal in AWS Cloud9 by selecting Window → New Terminal.

Modify the EBS volume of the AWS Cloud9 EC2 instance to at least 20 GB to have enough space for the compilation of your application. Then, reboot the instance using the following command in the AWS Cloud9 IDE terminal, and wait for the AWS Cloud9 IDE to automatically reconnect to your instance.

sudo reboot

To satisfy the OpenJDK 11 requirement, run the following commands in the AWS Cloud9 IDE terminal to install Amazon Corretto 11. Amazon Corretto is a no-cost, multiplatform, production-ready distribution of the OpenJDK.

sudo curl -L -o /etc/yum.repos.d/corretto.repo
https://yum.corretto.aws/corretto.repo
sudo yum install -y java-11-amazon-corretto-devel

You will build this application using Apache Maven. You must install it via the AWS Cloud9 IDE terminal by executing the following code.

sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo \
    -O /etc/yum.repos.d/epel-apache-maven.repo
sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
sudo yum install -y apache-maven

After you clone the demo application code, you can then build the application. Compiling the application to a self-contained JAR is straight forward. Navigate to the aws-quarkus-demo/lambda directory and kick off the Apache Maven build.

git clone https://github.com/aws-samples/aws-quarkus-demo.git
cd aws-quarkus-demo/lambda/
mvn clean install

To compile the application to a native binary, you must add the parameter used by the Apache Maven build to run the necessary steps.

mvn clean install -Dnative-image.docker-build=true

AWS Lambda layers and custom runtimes

Create an AWS Lambda custom runtime from the application. A runtime is a program that runs an AWS Lambda function’s handler method when the function is invoked. Include a runtime in your function’s deployment package as an executable file named bootstrap.

A runtime is responsible for running the function’s setup code, reading the handler name from an environment variable, and reading invocation events from the AWS Lambda runtime API. The runtime passes the event data to the function handler, and posts the response from the handler back to AWS Lambda. Your custom runtime can be a shell script, a script in a language that’s included in Amazon Linux, or a binary executable file that’s compiled in Amazon Linux.

Application architecture

The application architecture is similar to the architecture we described in “Optimize your Java application for Amazon ECS with Quarkus”.

Architecture of the application

Figure 2 – Architecture of the application

The architecture of the application is simple and consists of a few classes that implement a REST-service that stores all information in an Amazon DynamoDB-table. Quarkus offers an extension for Amazon DynamoDB that is based on the AWS SDK for Java V2.

Setting up the infrastructure

After you build the AWS Lambda function (as a regular build or native build), package your application and deploy it. The command sam deploy creates a zip file of your code and dependencies, uploads it to an Amazon S3 bucket, creates an AWS CloudFormation template, and deploys its resources.

The following command guides you through all necessary steps for packaging and deployment.

sam deploy --template-file sam.jvm.yaml \
    --stack-name APIGatewayQuarkusDemo --capabilities

CAPABILITY_IAM --guided

If you want to deploy the native version of the application, you must use a different AWS SAM template.

sam deploy --template-file sam.native.yaml \
    --stack-name APIGatewayQuarkusDemo --capabilities
CAPABILITY_IAM --guided

During deployment, the AWS CloudFormation template creates the AWS Lambda function, an Amazon DynamoDB table, an Amazon API Gateway REST-API, and all necessary IAM roles. The output of the AWS CloudFormation stack is the API Gateway’s DNS record.

aws cloudformation describe-stacks \
  --stack-name APIGatewayQuarkusDemo \
  --query
"Stacks[].Outputs[?OutputKey=='ApiUrl'].OutputValue" \
  --output text

A following code is a typical example output.

https://<your-api-gateway-url>/prod/users

Testing the application

After the resources have been created successfully, you can start testing.

1.      Create a user:

curl -d '{"userName":"jdoe", "firstName":"John", "lastName":"Doe", "age":"35"}' \
    -H "Content-Type: application/json" \
    -X POST https://<your-api-gateway-endpoint>/prod/users

2.      List all the users that you created:

curl https://<your-api-gateway-url>/prod/users

3.      You can get a specific user by userId:

curl -X GET 'https://<your-api-gateway-url>/prod/users/<userId>'

4.      If you want to delete the user that you’ve created recently, send a DELETE request to the specific userId:

curl -X DELETE 'https://<your-api-gateway-url>/prod/users/<userId>'

Performance considerations

Let’s investigate the impact of using a native build in comparison to the regular build of our sample Java application. In this benchmark, we focus on the performance of the application. We want to get the AWS services and architecture out of the equation as much as possible, so we measure the duration of the AWS Lambda function executions of a function including its downstream calls to Amazon DynamoDB. This duration is provided in the Amazon CloudWatch Logs of the function.

The following two charts illustrate 40 Create (POST), Read (GET), and Delete (DELETE) call iterations for a user with the execution durations plotted on the vertical axis. The first graph shows the development of the duration over time observing a single JVM instance. In a second graph, exclusively reports on the performance of the iterations each hitting a fresh JVM.

This is an example application to demonstrate the use of a native build. When you start the optimization of your application make sure to read the best practices for working with AWS Lambda Functions first. Verify the effect of all your optimizations.

The “single cold call” -graph starts with a cold call and each consecutive call hits the same AWS Lambda function container and thus the same JVM. This graph shows both the regular build and the native build (denoted with *) of our application as an AWS Lambda function with 256 MB of memory on Java 11. The native build has been compiled with Quarkus version 1.2.1.

single cold call, followed by warm calls only

Figure 3 – Single cold call, followed by warm calls only

The first executions of the regular build have long durations (the vertical axis has a logarithmic scale) but quickly drop below 100 ms. Still, you can observe an ongoing fluctuation between 10 and 100 ms. For the native build you can observe a consistent execution duration, except for the first call and one outlier in iteration 20. The first call is slow because it is a cold call. The outlier occurs because the Substrate VM still needs garbage collector pauses. Only the first call is slower than the calls to a warmed up regular build.

Let’s dive deeper into the cold calls of the application and their duration. The following chart shows 40 cold calls for both the regular build and the native build.

each of the 40 Create-Read-Delete iterations start with one cold call

Figure 3 – Each of the 40 Create-Read-Delete iterations start with one cold call

You can observe a consistent and predictable duration of the first calls. In this example, the execution of the AWS Lambda function of the native build takes just 0.6–5% of the duration of the regular build.

Tradeoffs

GraalVM assumes that all code is known at build time of the image, which means that no new code will be loaded at runtime. This means that not all applications can be optimized using GraalVM. Read a list of limitations in the GraalVM documentation. If the native image build for your application fails, a fallback image is created that requires a full JVM for execution.

Cleaning up

After you are finished, you can easily destroy all of these resources with a single command to save costs.

aws cloudformation delete-stack --stack-name <your_stack_name>

Also delete your AWS Cloud9 IDE from the AWS Cloud9 console.

Conclusion

In this post, we described how Java applications are compiled to a native image through Quarkus and run using AWS Lambda. Testing the demo application, we’ve seen a performance improvement of more than 95% compared to a regular build. Keep in mind that the potential performance benefits vary depending on your application and its downstream calls to other services. Due to the limitations of GraalVM, your application may not be a candidate for optimization.

We also demonstrated how AWS SAM deploys the native image as an AWS Lambda function with a custom runtime behind an Amazon API Gateway.We hope we’ve given you some ideas on how you can optimize your existing Java application to reduce startup time and memory consumption. Feel free to submit enhancements to the sample application in the source repository.

Field Notes provides hands-on technical guidance from AWS Solutions Architects, consultants, and technical account managers, based on their experiences in the field solving real-world business problems for customers.

Deep dive into Fargate Spot to run your ECS Tasks for up to 70% less

Post Syndicated from Ben Peven original https://aws.amazon.com/blogs/compute/deep-dive-into-fargate-spot-to-run-your-ecs-tasks-for-up-to-70-less/

Author: Pritam Pal, Sr. EC2 Spot Specialist SA

AWS launched AWS Fargate Spot during late 2019 for customers looking for a cost effective way to run containers. This blog dives deep into how to use ECS Fargate Spot and Fargate Tasks to lower the cost of your workloads. I explain existing concepts like Container Stop Timeout, catching SIGTERM for graceful shut down, and introduce you to new Amazon Elastic Container Service (Amazon ECS) concepts like capacity providers and service events.

Product overview

In 2017, AWS launched AWS Fargate for Amazon ECS. Fargate allows you to spend less time managing Amazon EC2 instances and more time building. Fargate Spot is a new purchase option that allows customers to launch Tasks on spare capacity with a steep discount. A Spot task is almost indistinguishable from an On-Demand Task with the following exceptions:

Price (per CPU-Hour and GB-Hour) of a Spot Task is variable, ranging between 50% to 70% off the price of an On-Demand Task, and a Fargate Spot Task may be interrupted (i.e stopped) when AWS needs the capacity back.

Fargate Spot runs on the same principle as Amazon EC2 Spot Instances. Your tasks run on spare capacity in the AWS Cloud. If you request to run your Task on Fargate Spot, your Tasks will run when capacity for Fargate Spot is available. As these Tasks run on spare capacity, you receive a two-minute notification when AWS needs capacity back, just like Spot Instances.

In this blog, I walk through how to launch a Fargate Spot Task using the AWS Management Console and command line interface (CLI), how to handle termination notices, what the Service Task Placement Failure Event looks like and other best practices to make sure you are a Fargate Spot champion.

ECS Fargate concepts

Before we go deep, let’s explore some ECS concepts, which we will use for Fargate Spot in this blog post.

StopTask and stopTimeout

Because Fargate Spot task may be interrupted with two minutes notice, you need to make sure you gracefully exit. For this, you can use concepts like stopTimeout parameters. When StopTask is called on a task, the equivalent of Docker stop is issued to the containers running in the task. If the container handles the SIGTERM value gracefully and exits within 30 seconds from receiving it, no SIGKILL signal is sent. You typically use stopTimeout parameter of the task definition to control this behavior. stopTimeout is time duration (in seconds) to wait before the container is forcefully stopped if it doesn’t exit normally on its own. For Fargate 1.3.0 or later, the max stop timeout value is 120 seconds. If the parameter is not specified, the default value of 30 seconds is used.

Capacity providers

Capacity providers are a new way to manage compute capacity for containers. This tool allows the application to define its requirements for how it uses the capacity. With capacity providers, you can define flexible rules for how containerized workloads run on different types of compute capacity, and manage the scaling of the capacity. Capacity providers improve the availability, scalability, and cost of running tasks and services on ECS. As of now, each cluster can have up to six capacity providers and an optional default capacity provider strategy, which determines how the tasks are spread across the capacity providers. To run your tasks, you can either use the default capacity provider strategy or specify one of your own.

AWS Fargate and AWS Fargate Spot capacity providers do not need to be created. They are available to all accounts, and only need to be associated with a cluster to be available for use. When a new cluster is created using the Amazon ECS console, along with the Networking only cluster template, the FARGATE and FARGATE_SPOT capacity providers are associated with the new cluster automatically.

A cluster may contain a mix of FARGATE, FARGATE_SPOT and Auto Scaling Group (ASG) capacity providers, however at this moment, a capacity provider strategy may only contain either FARGATE or Auto Scaling Group capacity providers, but not both.

Now that I covered ECS Fargate concepts, lets jump into the technical walk through.

Launch ECS Fargate Spot Task using AWS Management Console

1. Open the Amazon ECS console

2. From the navigation bar, select the Region to use

3. In the navigation pane, choose Clusters

4. On the Clusters page, choose Create Cluster

5. Create a Networking only Cluster

ECS create clusterWith this option, you can launch a cluster with a new VPC to use for Fargate tasks. The FARGATE and FARGATE_SPOT capacity providers are automatically associated with the cluster, as shown in the following image.ECS default capacity providers

6. Click on Update Cluster on top right-hand side to set a capacity provider strategy

In this example, I use a combination of FARGATE_SPOT and FARGATE capacity providers. I selected a Weight of 4 for FARGATE_SPOT and 1 for FARGATE. This means that for every five Tasks, four are started on FARGATE_SPOT and one on FARGATE. You can distribute this however you want. More tasks on Fargate Spot means more savings. But, if your workload requires high availability and you are not comfortable with interruptions, start with a ratio that works for you.

ECS update capacity provider strategy

7. Let’s create a Task Definition first. Here are some great Task definitions to start with. Find the Task Definition link of left navigation panel, click Create a New Task Definition, Choose Fargate launch type, scroll down, near the bottom of the page find the Configure via JSON button. Delete the pre-populated JSON entry, copy the sample Fargate WebApp task definition from below and paste. Click Save. Click Create.

{
    "family": "webapp-fargate-task", 
    "networkMode": "awsvpc", 
    "containerDefinitions": [
        {
            "name": "fargate-app", 
            "image": "httpd:2.4", 
            "portMappings": [
                {
                    "containerPort": 80, 
                    "hostPort": 80, 
                    "protocol": "tcp"
                }
            ], 
            "essential": true, 
            "entryPoint": [
                "sh",
		"-c"
            ], 
            "command": [
                "/bin/sh -c \"echo '<html> <head> <title>Amazon ECS Sample App</title> <style>body {margin-top: 40px; background-color: #333;} </style> </head><body> <div style=color:white;text-align:center> <h1>Amazon ECS Sample App</h1> <h2>Congratulations!</h2> <p>Your application is now running on a container in Amazon ECS.</p> </div></body></html>' >  /usr/local/apache2/htdocs/index.html && httpd-foreground\""
            ]
        }
    ], 
    "requiresCompatibilities": [
        "FARGATE"
    ], 
    "cpu": "256", 
    "memory": "512"
}

8. Now our Task definition is ready, we will run the same Task definition. Select the Task definition you just created, click Action, Run Task. Enter the number of Tasks you want to run. In this case I chose 10. After configuring the VPC and security groups, click Run Task.

Task definition

ECS run tasks listThe Run Task command from the last step starts ten Tasks, out of which eight Tasks launch on FARGATE_SPOT and two launch on FARGATE (The ratio I setup is 4:1). You can see the ratio by clicking on any Task, and finding the “Capacity provider” value for that Task under the details tab. Currently, there is no option to view all Tasks with particular “Capacity provider” in the Run Task console.

ECS Fargate Spot Task detailThis Task ran on FARGATE Capacity Provider, whereas the earlier mentioned Task ran on FARGATE_SPOT.ECS Fargate Task detail

 

In next section, I explore how to Launch Fargate Spot Tasks using the AWS CLI.

Launch ECS Fargate Spot Task using AWS CLI

Create a cluster with capacity providers

While creating a new Cluster using CLI, you must specify capacity providers. In the following example, I specified two capacity providers FARGATE and FARGATE_SPOT.

Enter the following code to specify the capacity providers:

aws ecs create-cluster \
    --cluster-name "Fargate-Spot-Deep-Dive" 
    --capacity-providers FARGATE FARGATE_SPOT   

The Output of this command should result in this:

"cluster": {
        "status": "PROVISIONING", 
        "defaultCapacityProviderStrategy": [], 
        "statistics": [], 
        "capacityProviders": [
            "FARGATE", 
            "FARGATE_SPOT"
        ], 
        ... 

Launching a Task

Once the cluster is created, you can launch a Fargate Spot Task by calling RunTask and providing the Spot capacity provider in the –capacity-provider-strategy field. You also need to specify:

  • A task definition
  • Weight options for capacity providers
  • A network configuration like Subnets, Security Groups
  • How many Tasks you want to run

I defined these specifications in the following code:

aws ecs run-task \
 --capacity-provider-strategy capacityProvider=FARGATE,weight=1
 capacityProvider=FARGATE_SPOT,weight=1 \
 --cluster "Fargate-Spot-Deep-Dive" \
 --task-definition task-def-family:revision \
 --network-configuration
"awsvpcConfiguration={subnets=[string,string],securityGroups=[string,string],assignPublicIp=string}"

 \
 -count integer \

If you specify count=10, and weight =1 for both providers, it would start 5 FARGATE_SPOT and 5 FARGATE Tasks.

Creating a service

In the example below, you create a service with FARGATE_SPOT only.

aws ecs create-service \
 --capacity-provider-strategy capacityProvider=FARGATE,weight=1
 capacityProvider=FARGATE_SPOT,weight=1 \
 --cluster "Fargate-Spot-Deep-Dive" \
 --service-name FargateService \
 --task-definition task-def-family:revision \
 --network-configuration
 "awsvpcConfiguration={subnets=[string,string],securityGroups=[string,string],assignPublicIp=string}"

 \
 --desired-count integer \

You can also create a service with a mix of Spot and On-Demand Tasks by calling CreateService and providing both Spot and On-Demand capacity providers in the capacity-provider-strategy field.

The base attribute is an optional field that says there should be at least four On-Demand Tasks (default base is 0, you cannot specify more than one capacity provider with a non-zero base). The weight is another optional field that says for the six remaining Tasks that are not managed by the base attribute, there should be one On-Demand Tasks for every two Spot Tasks.

aws ecs create-service \
--cluster "my-cluster"
--service-name "my-service"
--desired-count 10
--capacity-provider-strategy [
{'capacityProvider':'FARGATE_SPOT', 'weight': 2, 'base': 0},
{'capacityProvider':'FARGATE', 'weight': 1, 'base': 4},
]
...

Add the Fargate and Fargate Spot capacity providers to an existing cluster

aws ecs put-cluster-capacity-providers \
 --cluster "Fargate-Spot-Deep-Dive" \
 --capacity-providers FARGATE FARGATE_SPOT 
existing_capacity_provider1 existing_capacity_provider2 \
 --default-capacity-provider-strategy existing_default_capacity_provider_strategy \

How to handle Fargate Spot termination notices

By design, Fargate Spot is an interruptible service. When tasks using FARGATE_SPOT are stopped due to a Spot interruption, a two-minute warning is sent before a task is stopped.

The warning is sent as a task state change event to Amazon EventBridge and a SIGTERM signal to the running task. When using Fargate Spot as part of a service, the service scheduler will receive the interruption signal and attempt to launch additional Tasks on Fargate Spot if capacity is available.

To ensure that our containers exit gracefully before the Task stops, the following can be configured:

A stopTimeout value of 120 seconds (2 minutes) or less can be specified in the container definition that the task is using. Specifying a stopTimeout value gives us time between the moment the Task state change event is received and the point at which the container is forcefully stopped.

The SIGTERM signal must be received from within the container to perform any cleanup actions.

The following is a snippet of a Task state change event displaying the stopped reason and stop code for a Fargate Spot interruption.

{
  "version": "0",
  "id": "9bcdac79-b31f-4d3d-9410-fbd727c29fab",
  "detail-type": "ECS Task State Change",
  "source": "aws.ecs",
  "account": "111122223333",
  "resources": [
    "arn:aws:ecs:us-east-1:111122223333:task/b99d40b3-5176-4f71-9a52-9dbd6f1cebef"
  ],
  "detail": {
    "clusterArn": "arn:aws:ecs:us-east-1:111122223333:cluster/default",
    "createdAt": "2016-12-06T16:41:05.702Z",
    "desiredStatus": "STOPPED",
    "lastStatus": "RUNNING",
    "stoppedReason": "Your Spot Task was interrupted.",
    "stopCode": "TerminationNotice",
    "taskArn": "arn:aws:ecs:us-east-1:111122223333:task/b99d40b3-5176-4f71-9a52-9dbd6fEXAMPLE",
    ...
  }
}

Example service Task placement failure event

In case FARGATE_SPOT can’t place a Task due to capacity constraints; service Task placement failure events are delivered.

In the following example, the task attempted to use the FARGATE_SPOT capacity provider, but the service scheduler was unable to acquire any Fargate Spot capacity.

{
    "version": "0",
    "id": "ddca6449-b258-46c0-8653-e0e3a6d0468b",
    "detail-type": "ECS Service Action",
    "source": "aws.ecs",
    "account": "111122223333",
    "time": "2019-11-19T19:55:38Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:ecs:us-west-2:111122223333:service/default/servicetest"
    ],
    "detail": {
        "eventType": "ERROR",
        "eventName": "SERVICE_TASK_PLACEMENT_FAILURE",
        "clusterArn": "arn:aws:ecs:us-west-2:111122223333:cluster/default",
        "capacityProviderArns": [
            "arn:aws:ecs:us-west-2:111122223333:capacity-provider/FARGATE_SPOT"
        ],
        "reason": "RESOURCE:FARGATE",
        "createdAt": "2019-11-06T19:09:33.087Z"
    }
}

Amazon EventBridge enables you to automate your AWS services, and respond automatically to system events such as application availability issues or resource changes. Events from AWS services are delivered to EventBridge in near real-time. You can write simple rules to indicate which events are of interest to you and what automated actions to take when an event matches a rule.

More details on how to use Amazon ECS Events can be found here.

Fargate Spot pricing

With AWS Fargate, there are no upfront payments and you only pay for the resources that you use. You pay for the amount of vCPU and memory resources consumed by your containerized applications.

The price for Spot CPU-Hour and GB-Hour is the same across all Availability Zones and Task Configurations. However, the price varies throughout the day. The latest price is available at Fargate Pricing page. Pricing is per second with a 1-minute minimum. Duration is calculated from the time you start to download your container image (Docker pull) until the Task terminates, rounded up to the nearest second.

Fargate Spot best practices

As I wrap up, I want to focus on a few best practices about Fargate Spot.

  • Fargate Spot is great for stateless, fault-tolerant workloads, but don’t rely solely on Spot Tasks for critical workloads, configure a mix of regular Fargate Tasks
  • Applications running on Fargate Spot should be fault-tolerant
  • Handle interruptions gracefully by catching SIGTERM signals

Conclusion

Fargate Spot is a great fit for parallelizable workloads like image rendering, Monte Carlo simulations, and genomic processing. However, customers can also use Fargate Spot for Tasks that run as a part of ECS services such as websites and APIs which require high availability.

ECS Fargate has already made its super easy to run containerized workloads without worrying about the setup and managing infrastructure. Fargate Spot makes it more affordable for your price sensitive workloads. With right mix of FARGATE and FARGATE_SPOT capacity providers you can get the optimal capacity for you tasks in budget.


PritamPritam is Sr. Specialist Solutions Architect in EC2 Spot team. For last 13 years he evangelized DevOps and Cloud adoption across industries and verticals. He likes to deep dive and find solutions to every day problems.

Amazon Elastic Container Service now supports Amazon EFS file systems

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/amazon-ecs-supports-efs/

It has only been five years since Jeff wrote on this blog about the launch of the Amazon Elastic Container Service. I remember reading that post and thinking how exotic and unusual containers sounded. Fast forward just five years, and containers are an everyday part of most developers lives, but whilst customers are increasingly adopting container orchestrators such as ECS, there are still some types of applications that have been hard to move into this containerized world.

Customers building applications that require data persistence or shared storage have faced a challenge since containers are temporary in nature. As containers are scaled in and out dynamically, any local data is lost as containers are terminated. Today we are changing that for ECS by launching support for Amazon Elastic File System (EFS) file systems. Both containers running on ECS and AWS Fargate will be able to use Amazon Elastic File System (EFS).

This new capability will help customers containerize applications that require shared storage such as content management systems, internal DevOps tools, and machine learning frameworks. A whole new set of workloads will now enjoy the benefits containers bring, enabling customers to speed up their deployment process, optimize their infrastructure utilization, and build more resilient systems.

Amazon Elastic File System (EFS) provides a fully managed, highly available, scalable shared file system; it means that you can store your data separately from your compute. It is also a regional service, meaning that the service is storing data within and across 3 Availability Zones for high availability and durability.

Until now, it was possible to get EFS working with ECS if you were running your containers on a cluster of EC2 instances. However, if you wanted to use AWS Fargate as your container data plane then, prior to this announcement, you couldn’t mount an EFS file system. Fargate does not allow you as a customer to gain access to the managed instances inside the Fargate fleet and so you are unable to make the modifications required to the instances to setup EFS.

I’m sure many of our customers will be delighted that they now have a way of connecting EFS file systems easily to ECS, personally I’m ecstatic that we can use this new feature in combination with Fargate, it will be perfect for a little side project that I am currently building and finally give us a way of having persistent storage work in combination with serverless containers.

There is a good reason both ECS and EFS include the word Elastic in their name as both of these services can scale up and down as your application requires. EFS scales on demand from zero to petabytes with no disruptions, growing and shrinking automatically as you add and remove files. With ECS there are options to use either Cluster Auto Scaling or Fargate to ensure that your capacity grows and shrinks to meet demand. For you, our customer, this means that you are only ever paying for the storage and compute that you are actually going to use.

So, enough talking, let’s get to the fun bit and see how we can get a containerized application working with Amazon Elastic File System (EFS).

A Simple Shared File System Example
For this example, I have built some basic infrastructure, so I can show you the before and after effect of adding EFS to a Fargate cluster. Firstly I have created a VPC that spans two Availability Zones. Secondly, I have created an ECS cluster. On the ECS Cluster, I plan to run two containers using Fargate, and this means that I don’t have to set up any EC2 instances as my containers will run on the Fargate fleet that is managed by AWS.

To deploy my application, I create a Task Definition that uses a Docker Image of an Open Source application called Cloud Commander, which is a simple drag and drop, file manager.

In the ECS console, I create a service and use the Task Definition that I created to deploy my application. Once the service is deployed, and the containers are provisioned, I head over to the Application Load Balancer URL which was created as part of my service, and I can see that my application appears to be working. I can drag a file to upload it to the application.

However there is a problem. If I refresh the page, occasionally, the file that I dragged to upload disappears. This happens because I have two containers running my application, and they are both using their local file systems. As I refresh my browser, the load balancer sends me to one of the two containers, and only one of the containers is storing the image on its local volume.

What I need is a shared file system that both the containers can mount and to which they can both write files.

Next, I create a new file system inside the EFS console. In the wizard, I choose the same VPC that I used when I created my ECS cluster and select all of the availability zones that the VPC spans and ask the service to create mount targets in each one. These mount targets will mean that containers that are in different Availability Zones will still be able to connect to the file system.

I select the defaults for all the other options in the wizard. In Step 3, I click the button to Add an access Point. An access point is a way of me giving a particular application access to the file system and gives me incredibly granular control over what data my application is allowed to access. You can add multiple Access points to your EFS file system and provide different applications, different levels of access to the same file system.

The application I am deploying will handle user uploads for my web site, so I will create an EFS Access Point that gives this application full access to the /uploads directory, but nothing else. To do this, I will create an access point with a new User ID (1000) and Group ID (1000), and a home directory of /uploads. The directory will be created with this user and group as the owner with full permissions, giving read permissions to all other users.

Security is the number one priority at AWS and the team have worked hard to ensure ECS integrates with EFS to provide multiple layers of security for protecting EFS filesystems from unauthorized access, including IAM role-based access control, VPC security groups, and encryption of data in transit.

After working through the wizard, my file system is created, and I’m given a File system ID and an Access point ID. I will need these IDs to configure the task definitions in Fargate.

I go back to the Task Definition inside my ECS Cluster and create a new revision of the Task Definition. I scroll down to the Volumes section of the definition and click Add volume.

I can then add my EFS File System details, I select the correct File system ID and also the correct Access point ID that I created earlier.

I have opted to Enable Encryption in transit, but for this example I have not enabled EFS IAM authorization which would be helpful in a larger application with many clients requiring different levels of access for different portions of a filesystem. This feature can simplify management by using IAM authorization, if you want more details on this check out the blog we wrote when the feature launched earlier in the year.

Now that I have updated my task definition, I can update my ECS service to use this new definition. It’s also essential here to make sure that I set the platform version to 1.4.0.

The service deploys my two new containers and decommissions the old two. The new containers will now be using the shared EFS file system, and so my application now works as expected.

If I upload files and then revisit the application, my files will still be there. If my containers are replaced or are scaled up or down, the file system will persist.

Looking to the Future
I am loving the innovation that has been coming out of the containers teams recently, and looking at their public roadmap; they have some really exciting plans for the future. If you have ideas or feature requests, make sure you add your voice to the many customers that are already guiding their roadmap.

The new feature is available in all regions where ECS and EFS are available and comes at no additional cost. So, go check it out in the AWS console and let us know what you think.

Happy Containerizing

— Martin

 

Bottlerocket – Open Source OS for Container Hosting

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/bottlerocket-open-source-os-for-container-hosting/

It is safe to say that our industry has decided that containers are now the chosen way to package and scale applications. Our customers are making great use of Amazon ECS and Amazon Elastic Kubernetes Service, with over 80% of all cloud-based containers running on AWS.

Container-based environments lend themselves to easy scale-out, and customers can run host environments that encompass hundreds or thousands of instances. At this scale, several challenges arise with the host operating system. For example:

Security – Installing extra packages simply to satisfy dependencies can increase the attack surface.

Updates – Traditional package-based update systems and mechanisms are complex and error prone, and can have issues with dependencies.

Overhead – Extra, unnecessary packages consume disk space and compute cycles, and also increase startup time.

Drift – Inconsistent packages and configurations can damage the integrity of a cluster over time.

Introducing Bottlerocket
Today I would like to tell you about Bottlerocket, a new Linux-based open source operating system that we designed and optimized specifically for use as a container host.

Bottlerocket reflects much of what we have learned over the years. It includes only the packages that are needed to make it a great container host, and integrates with existing container orchestrators. It supports Docker image and images that conform to the Open Container Initiative (OCI) image format.

Instead of a package update system, Bottlerocket uses a simple, image-based model that allows for a rapid & complete rollback if necessary. This removes opportunities for conflicts and breakage, and makes it easier for you to apply fleet-wide updates with confidence using orchestrators such as EKS.

In addition to the minimal package set, Bottlerocket uses a file system that is primarily read-only, and that is integrity-checked at boot time via dm-verity. SSH access is discouraged, and is available only as part of a separate admin container that you can enable on an as-needed basis and then use for troubleshooting purposes.

Try it Out
We’re launching a public preview of Bottlerocket today. You can follow the steps in QUICKSTART to set up an EKS cluster, and you can take a look at the GitHub repo. Try it out, report bugs, send pull requests, and let us know what you think!

Jeff;

 

AWS ECS Cluster Auto Scaling is Now Generally Available

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/aws-ecs-cluster-auto-scaling-is-now-generally-available/

Today, we have launched AWS ECS Cluster Auto Scaling. This new capability improves your cluster scaling experience by increasing the speed and reliability of cluster scale-out, giving you control over the amount of spare capacity maintained in your cluster, and automatically managing instance termination on cluster scale-in.

To enable ECS Cluster Auto Scaling, you will need to create a new ECS resource type called a Capacity Provider. A Capacity Provider can be associated with an EC2 Auto Scaling Group (ASG). When you associate an ECS Capacity Provider with an ASG and add the Capacity Provider to an ECS cluster, the cluster can now scale your ASG automatically by using two new features of ECS:

  1. Managed scaling, with an automatically-created scaling policy on your ASG, and a new scaling metric (Capacity Provider Reservation) that the scaling policy uses; and
  2. Managed instance termination protection, which enables container-aware termination of instances in the ASG when scale-in happens.

These new features will give customers greater control of when and how Amazon ECS clusters scale-in and scale-out.

Capacity Provider Reservation
The new metric, called capacity provider reservation, measures the total percentage of cluster resources needed by all ECS workloads in the cluster, including existing workloads, new workloads, and changes in workload size. This metric enables the scaling policy to scale out quicker and more reliably than it could when using CPU or memory reservation metrics. Customers can also use this metric to reserve spare capacity in their clusters. Reserving spare capacity allows customers to run more containers immediately if needed, without waiting for new instances to start.

Managed Instance Termination Protection
With instance termination protection, ECS controls which instances the scaling policy is allowed to terminate on scale-in, to minimize disruptions of running containers. These improvements help customers achieve lower operational costs and higher availability of their container workloads running on ECS.

How This Help Customers
Customers running scalable container workloads on ECS often use metric-based scaling policies to automatically scale their ECS clusters. These scaling policies use generic metrics such as average cluster CPU and memory reservation percentages to determine when the policy should add or remove cluster instances.

Clusters running a single workload, or workloads that scale-out slowly, often work well with such policies. However, customers running multiple workloads in the same cluster, or workloads that scale-out rapidly, are more likely to experience problems with cluster scaling. Ideally, increases in workload size that cannot be accommodated by the current cluster should trigger the policy to scale the cluster out to a larger size.

Because the existing metrics are not container-specific and account only for resources already in use, this may happen slowly or be unreliable. Furthermore, because the scaling policy does not know where containers are running in the cluster, it can unnecessarily terminate containers when scaling in. These issues can reduce the availability of container workloads. Mitigations such as over-provisioning, custom tooling, or manual intervention often impose high operational costs.

Enough Talk, Let’s Scale
To understand these new features more clearly, I think it’s helpful to work through an example.

Amazon ECS Cluster Auto Scaling can be set up and configured using the AWS Management Console, AWS CLI, or Amazon ECS API. I’m going to open up my terminal and create a cluster.

Firstly, I create two files. The first file is called demo-launchconfig.json and defines the instance configuration for the Amazon Elastic Compute Cloud (EC2) instances that will make up my auto scaling group.

{
    "LaunchConfigurationName": "demo-launchconfig",
    "ImageId": "ami-01f07b3fa86406c96",
    "SecurityGroups": [
        "sg-0fa5be8c3749f3aa0"
    ],
    "InstanceType": "t2.micro",
    "BlockDeviceMappings": [
        {
            "DeviceName": "/dev/xvdcz",
            "Ebs": {
                "VolumeSize": 22,
                "VolumeType": "gp2",
                "DeleteOnTermination": true,
                "Encrypted": true
                }
        }
    ],
    "InstanceMonitoring": {
        "Enabled": false
    },
    "IamInstanceProfile": "arn:aws:iam::365489315573:role/ecsInstanceRole",
    "AssociatePublicIpAddress": true
}

The second file is demo-userdata.txt, and it contains the user data that will be added to each EC2 instance. The ECS_CLUSTER name included in the file must be the same as the name of the cluster we are going to create. In my case, the name is demo-news-blog-scale.

#!/bin/bash
echo ECS_CLUSTER=demo-news-blog-scale >> /etc/ecs/ecs.config

Using the create-launch-configuration command, I pass the two files I created as inputs, this will create the launch configuration that I will use in my auto scaling group.

aws autoscaling create-launch-configuration --cli-input-json file://demo-launchconfig.json --user-data file://demo-userdata.txt

Next, I create a file called demo-asgconfig.json and define my requirements.

{
    "LaunchConfigurationName": "demo-launchconfig", 
    "MinSize": 0,
    "MaxSize": 100,
    "DesiredCapacity": 0,
    "DefaultCooldown": 300,
    "AvailabilityZones": [ 
        "ap-southeast-1c" ], 
    "HealthCheckType": "EC2", 
    "HealthCheckGracePeriod": 300, 
    "VPCZoneIdentifier": "subnet-abcd1234", 
    "TerminationPolicies": [ 
        "DEFAULT" 
    ],
    "NewInstancesProtectedFromScaleIn": true, 
    "ServiceLinkedRoleARN": "arn:aws:iam::111122223333:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
} 

I then use the create-auto-scaling-group command to create an auto scaling group called demo-asg using the above file as an input.

aws autoscaling create-auto-scaling-group --auto-scaling-group-name demo-asg --cli-input-json file://demo-asgconfig.json

I am now ready to create a capacity provider. I create a file called demo-capacityprovider.json, importantly, I set the managedTerminationProtection property to ENABLED.

{
    "name": "demo-capacityprovider", "autoScalingGroupProvider": {
    "autoScalingGroupArn": "arn:aws:autoscaling:ap-southeast-1:365489315573:autoScalingGroup:e9c2f0c4-9a4c-428e-b81e-b22411a52954:autoScalingGroupName/demo-ASG",
            "managedScaling": {
                "status": "ENABLED",
                "targetCapacity": 100,
                "minimumScalingStepSize": 1,
                "maximumScalingStepSize": 100
            },
            "managedTerminationProtection": "ENABLED"
    }
}

I then use the new create-capacity-provider command to create a provider using the file as an input.

aws ecs create-capacity-provider --cli-input-json file://demo-capacityprovider.json

Now all the components have been created, I can finally create a cluster. I add the capacity provider and set the default capacity provider for the cluster as demo-capacityprovider.

aws ecs create-cluster --cluster-name demo-news-blog-scale --capacity-providers demo-capacityprovider --default-capacity-provider-strategy<br />capacityProvider=demo-capacityprovider,weight=1

I now need to wait until the cluster has moved into the active state. I use the following command to get details about the cluster.

aws ecs describe-clusters --clusters demo-news-blog-scale --include ATTACHMENTS

Now that my cluster is set up, I can register some tasks. Firstly I will need to create a task definition. Below is a file I. have created called demo-sleep-taskdef.json. All this definition does is define a container that sleeps for infinity.

{
    "family": "demo-sleep-taskdef",
    "containerDefinitions": [
        {
            "name": "sleep",
            "image": "amazonlinux:2",
            "memory": 20,
            "essential": true,
            "command": [
                "sh",
                "-c",
                "sleep infinity"] 
        }],
    "requiresCompatibilities": [
        "EC2"] 
} 

I then register the task definition using the register-task-definition command.

aws ecs register-task-definition --cli-input-json file://demo-sleep-taskdef.json

Finally, I can create my tasks. In this case, I have created 5 tasks based on the demo-sleep-taskdef:1 definition that I just registered.

aws ecs run-task --cluster demo-news-blog-scale --count 5 --task-definition demo-sleep-taskdef:1

Now because instances are not yet available to run the tasks, the tasks go into a provisioning state, which means they are waiting for capacity to become available. The capacity provider I configured will now scale-out the auto scaling group so that instances start up and join the cluster – at which point the tasks get placed on the instances. This gives a true “scale from zero” capability, which did not previously exist.

Things To Know
AWS ECS Cluster Auto Scaling is now available in all regions where Amazon ECS and AWS Auto Scaling are available – check the region table for the latest list.

Happy Scaling!

— Martin

 

New – AWS IoT Greengrass Adds Container Support and Management of Data Streams at the Edge

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-aws-iot-greengrass-adds-docker-support-and-streams-management-at-the-edge/

AWS IoT Greengrass extends cloud capabilities to edge devices, so that they can respond to local events in near real-time, even with intermittent connectivity.

Today, we are adding two features that make it easier to build IoT solutions:

  • Container support to deploy applications using the Greengrass Docker application deployment connector.
  • Collect, process, and export data streams from edge devices and manage the lifecycle of that data with the Stream Manager for AWS IoT Greengrass.

Let’s see how these new features work and how to use them.

Deploying a Container-Based Application to a Greengrass Core Device
You can now run AWS Lambda functions and container-based applications in your AWS IoT Greengrass core device. In this way it is easier to migrate applications from on-premises, or build new applications that include dependencies such as libraries, other binaries, and configuration files, using container images. This provides a consistent deployment environment for your applications that enables portability across development environments and edge locations. You can easily deploy legacy and third-party applications by packaging the code or executables into the container images.

To use this feature, I describe my container-based application using a Docker Compose file. I can reference container images in public or private repositories, such as Amazon Elastic Container Registry (ECR) or Docker Hub. To start, I create a simple web app using Python and Flask that counts the number of times it is visualized.

from flask import Flask

app = Flask(__name__)

counter = 0

@app.route('/')
def hello():
    global counter
    counter += 1
    return 'Hello World! I have been seen {} times.\n'.format(counter)

My requirements.txt file contains a single dependency, flask.

I build the container image using this Dockerfile and push it to ECR.

FROM python:3.7-alpine
WORKDIR /code
ENV FLASK_APP app.py
ENV FLASK_RUN_HOST 0.0.0.0
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
CMD ["flask", "run"]

Here is the docker-compose.yml file referencing the container image in my ECR repository. Docker Compose files can describe applications using multiple containers, but for this example I am using just one.

version: '3'
services:
  web:
    image: "123412341234.dkr.ecr.us-east-1.amazonaws.com/hello-world-counter:latest"
    ports:
      - "80:5000"

I upload the docker-compose.yml file to an Amazon Simple Storage Service (S3) bucket.

Now I create an AWS IoT Greengrass group using an Amazon Elastic Compute Cloud (EC2) instance as core device. Usually your core device is outside of the AWS cloud, but using an EC2 instance can be a good way to set up and automate a dev & test environment for your deployments at the edge.

When the group is ready, I run an “empty” deployment, just to check that everything is working as expected. After a few seconds, my first deployment has completed and I start adding a connector.

In the connector section of the AWS IoT Greengrass group, I select Add a connector and search for “Docker”. I select Docker Application Deployment and hit Next.

Now I configure the parameters for the connector. I select my docker-compose.yml file on S3. The AWS Identity and Access Management (IAM) role used by the AWS IoT Greengrass group needs permissions to get the file from S3 and to get the authorization token and download the image from ECR. If you use a private repository such as Docker Hub, you can leverage the integration with the AWS Secret Manager to make it easy for your connectors and Lambda functions to use local secrets to interact with services and applications.

I deploy my changes, similarly to what I did before. This time, the new container-based application is installed and started on the AWS IoT Greengrass core device.

To test the web app that I deployed, I open access to the HTTP port on the Security Group of the EC2 instance I am using as core device. When I connect with my browser, I see the Flask app starting to count the visits. My container-based application is running on the AWS IoT Greengrass core device!

You can deploy much more complex applications than what I did in this example. Let’s see that as we go through the other feature released today.

Using the Stream Manager for AWS IoT Greengrass
For common use cases like video processing, image recognition, or high-volume data collection from sensors at the edge, you often need to build your own data stream management capabilities. The new Stream Manager simplifies this process by adding a standardized mechanism to the Greengrass Core SDK that you can use to process data streams from IoT devices, manage local data retention policies based on cache size or data age, and automatically transmit data directly into AWS cloud services such as Amazon Kinesis and AWS IoT Analytics.

The Stream Manager also handles disconnected or intermittent connectivity scenarios by adding configurable prioritization, caching policies, bandwidth utilization, and time-outs on a per-stream basis. In situations where connectivity is unpredictable or bandwidth is constrained, this new functionality enables you to define the behavior of your applications’ data management while disconnected, reconnecting, or connected, allowing you to prioritize important data’s path to the cloud and make efficient use of a connection when it is available. Using this feature, you can focus on your specific application use cases rather than building data retention and connection management functionality.

Let’s see now how the Stream Manager works with a practical use case. For example, my AWS IoT Greengrass core device is receiving lots of data from multiple devices. I want to do two things with the data I am collecting:

  • Upload all row data with low priority to AWS IoT Analytics, where I use Amazon QuickSight to visualize and understand my data.
  • Aggregate data locally based on time and location of the devices, and send the aggregated data with high priority to a Kinesis Data Stream that is processed by a business application for predictive maintenance.

Using the Stream Manager in the Greengrass Core SDK, I create two local data streams:

  • The first local data stream has a configured low-priority export to IoT Analytics and can use up to 256MB of local disk (yes, it’s a constrained device). You can use memory to store the local data stream if you prefer speed to resilience. When local space is filled up, for example because I lost connectivity to the cloud and I continue to cache locally, I can choose to either reject new data or overwrite the oldest data.
  • The second local data stream is exporting data with high priority to a Kinesis Data Stream and can use up to 128MB of local disk (it’s aggregated data, I need less space for the same amount of time).

 

Here’s how the data flows in this architecture:

  • Sensor data is collected by a Producer Lambda function that is writing to the first local data stream.
  • A second Aggregator Lambda function is reading from the first local data stream, performing the aggregation, and writing its output to the second local data stream.
  • A Reader container-based app (deployed using the Docker application deployment connector) is rendering the aggregated data in real-time for a display panel.
  • The Stream Manager takes care of the ingestion to the cloud, based on the configuration and the policies of the local data streams, so that developers can focus their efforts on the logic on the device.

The use of Lambda functions or container-based apps in the previous architecture is just an example. You can mix and match, or standardize to one or the other, depending on your development best practices.

Available Now
The Docker application deployment connector and the Stream Manager are available with Greengrass version 1.10. The Stream Manager is available in the Greengrass Core SDK for Java and Python. We are adding support for other platforms based on customer feedback.

These new features are independent from each other, but can be used together as in my example. They can simplify the way you build and deploy applications on edge devices, making it easier to process data locally and be integrated with streaming and analytics services in the backend. Let me know what you are going to use these features for!

Danilo

Improving Containers by Listening to Customers

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/improving-containers/

At AWS, we build our product roadmap based upon feedback from our customers. The following three new features have all come about because customers have asked us to solve specific issues they have face when building and operating sophisticated container-based applications.

Managed Node Groups for Amazon Elastic Kubernetes Service
Our customers have told us that they want to focus on building innovative solutions for their customers, and focus less on the heavy lifting of managing Kubernetes infrastructure.

Amazon Elastic Kubernetes Service already provides you with a standard, highly-available Kubernetes cluster control plane, and now, AWS can also manage the nodes (Amazon Elastic Compute Cloud (EC2) instances) for your Kubernetes cluster. Amazon Elastic Kubernetes Service makes it easy to apply bug fixes and security patches to nodes, and updates them to the latest Kubernetes versions along with the cluster.

The Amazon Elastic Kubernetes Service console and API give you a single place to understand the state of your cluster, you no longer have to jump around different services to see all of the resources that make up your cluster.

You can provision managed nodes today when you create a new Amazon EKS cluster. There is no additional cost to use Amazon EKS managed node groups, you only pay for the Amazon EKS cluster and AWS resources they provision. To find out more check out this blog: Extending the EKS API: Managed Node Groups.

Managing your container Logs with AWS FireLens
Customers building container-based applications told us that they wanted more flexibility when it came to logging; however, they didn’t wish to to install, configure, or troubleshoot logging agents.

AWS FireLens, gives you this flexibility as you can now forward container logs to storage and analytics tools by configuring your task definition in Amazon ECS or AWS Fargate.

This means that developers have their containers send logs to Stdout and then FireLens picks up these logs and forwards them to the destination that has been configured.

FireLens works with the open-source projects Fluent Bit and Fluentd, which means that you can send logs to any destination supported by either of those projects.

There are a lot of configuration options with FireLens, and you can choose to filter logs and even have logs sent to multiple destinations. For more information, you can take a look at the demo I wrote earlier in the week: Announcing Firelens – A New Way to Manage Container Logs.

If’ you would like a deeper understanding of how the technology works and was built, Wesley Pettit goes into even further depth on the Containers Blog in his article: Under the hood: FireLens for Amazon ECS Tasks.

Amazon Elastic Container Registry EventBridge Support
Customers using Amazon Elastic Container Registry have told us they want to be able to start a build process when new container images are pushed to Elastic Container Registry.

We have therefore added Amazon Elastic Container Registry EventBridge support.

Using events that Elastic Container Registry now publishes to EventBridge, you can trigger actions such as starting a pipeline or posting a message to somewhere like Amazon Chime or Slack when your image is successfully pushed.

To learn more about this new feature, check out the following blog post where I give a more detailed explication and demo: EventBridge support in Amazon Elastic Container Registry.

More to come
These 3 new releases add to other great releases we have already had this year such as Savings Plans, Amazon EKS Windows Containers support, and Native Container Image Scanning in Amazon ECR.

We are still listening, and we need your feedback, so if you have a feature request or a pain point with your container applications, please let us know by creating or commenting on issues in our public containers roadmap. Sometime in the future I might one-day writing about a new feature that was inspired by you.

Martin

 

EventBridge Support in Amazon Elastic Container Registry

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/eventbridge-support-in-amazon-elastic-container-registry/

Many of our customers require a secure and private place to store their container images, and that’s why they use our fully managed container registry Amazon Elastic Container Registry. We recently added support for Amazon EventBridge so that you can trigger actions when images are pushed or deleted. These actions can trigger a continuous integration, continuous deployment pipeline when an image is pushed or post a message to your DevOps team Slack channel when an image has been deleted.

This new capability can even enable complicated workflows, for example, customers can use the image push event on a base image to trigger a rebuild of images built on top of that base. In this scenario, a base image might be rebuilt weekly to pick up the latest security patches. A push event from the base image repository can trigger other builds, so that all derivative images are patched, too.

To show you how to go about using this new capability, I thought I’d open up the console and work through an example of how all the pieces fit together.

In the Amazon EventBridge console, I create a new rule, and I enter a unique name and description.

Next, I scroll down to Define pattern and begin to customise the type of event pattern that I want to use. I leave the default Event pattern radio button selected and also that I want to use a Pre-defined patten by service. Since Elastic Container Registry is an AWS service, I select AWS as the Service Provider.

In the Service Name section, you can select one of the many different AWS services as the event source. I am going to choose the newest addition to this list Elastic Container Registry (ECR). Lastly, in this section, I select ECR Image Action as the Event type. This ECR Image Action contains both DELETE and PUSH as action types.

Next, I’m asked to configure which event bus I want to use. For this example, I select the AWS default event bus that comes with every AWS account.

Now that I have identified where my events are coming from, I now need to say where I want them to go. We call these targets, and there are plenty of options here. For example, I could send the event to a Lambda Function, a Kinesis stream, or any one of the wide variety of AWS targets.

To keep things simple, I’m going to choose to invoke a Amazon Simple Notification Service (SNS) topic. This topic is called ImageAction, and I have subscribed to this topic so that I receive an email when new messages are received by this topic.

Back over on my laptop, I push a new version of my container to my repository in to Elastic Container Registry.

If I go over to the Elastic Container Registry console, I can see that my Docker Image was successfully pushed, I’m now going to select the image and click the Delete button, which will delete my new image.

This will have sent both a PUSH and a DELETE event through to my SNS topic which in turn deliver two emails to me as a subscriber to that topic.

 

If I open up Outlook, sure enough, I have two (admittedly not pretty) emails that have both the respective action-type of PUSH and DELETE.

So there you have it, you can now wire up events in Elastic Container Registry and enable exciting and wonderful things to happen as a result. Amazon EventBridge support in Amazon Elastic Container Registry is available in all public AWS Regions and GovCloud (US). Try it now in the Amazon EventBridge console.

Happy Eventing!

Martin

Announcing Firelens – A New Way to Manage Container Logs

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/announcing-firelens-a-new-way-to-manage-container-logs/

Today, the fantastic team that builds our container services at AWS have launched an excellent new tool called AWS FireLens that will make dealing with logs a whole lot easier.

Using FireLens, customers can direct container logs to storage and analytics tools without modifying deployment scripts, manually installing extra software or writing additional code. With a few configuration updates on Amazon ECS or AWS Fargate, you select the destination and optionally define filters to instruct FireLens to send container logs to where they are needed.

FireLens works with either Fluent Bit or Fluentd, which means that you can send logs to any destination supported by either of those open-source projects. We maintain a web page where you can see a list of AWS Partner Network products that have been reviewed by AWS Solution Architects. You can send log data or events to any of these products using FireLens.

I find the simplest way to understand FireLens is to use it, so in the rest of this blog post, I’m going to demonstrate using FireLens with a container in Amazon ECS, forwarding the container logs on to Amazon CloudWatch.

First, I need to configure a task definition, I got an example definition from the Amazon ECS FireLens Examples on GitHub.

I replaced the AWS Identity and Access Management (IAM) roles with my own taskRoleArn and executionRoleArn IAM roles, I also added port mappings so that I could access the NGINX container from a browser.

{
	"family": "firelens-example-cloudwatch",
	"taskRoleArn": "arn:aws:iam::365489000573:role/ecsInstanceRole",
	"executionRoleArn": "arn:aws:iam::365489300073:role/ecsTaskExecutionRole",
	"containerDefinitions": [
		{
			"essential": true,
			"image": "906394416424.dkr.ecr.us-east-1.amazonaws.com/aws-for-fluent-bit:latest",
			"name": "log_router",
			"firelensConfiguration": {
				"type": "fluentbit"
			},
			"logConfiguration": {
				"logDriver": "awslogs",
				"options": {
					"awslogs-group": "firelens-container",
					"awslogs-region": "us-west-2",
					"awslogs-create-group": "true",
					"awslogs-stream-prefix": "firelens"
				}
			},
			"memoryReservation": 50
		 },
		 {
			 "essential": true,
			 "image": "nginx",
			 "name": "app",
			 "portMappings": [
				{
				  "containerPort": 80,
				  "hostPort": 80
				}
			  ],
			 "logConfiguration": {
				 "logDriver":"awsfirelens",
				 "options": {
					"Name": "cloudwatch",
					"region": "us-west-2",
					"log_group_name": "firelens-fluent-bit",
					"auto_create_group": "true",
					"log_stream_prefix": "from-fluent-bit"
				}
			},
			"memoryReservation": 100
		}
	]
}

I saved the task definition to a local folder and then used the AWS Command Line Interface (CLI) to register the task definition.

aws ecs register-task-definition --cli-input-json file://cloudwatch_task_definition.json

I already have an ECS cluster set up, but if you don’t, you can learn how to do that from the ECS documentation. The command below creates a service on my ECS cluster using my newly registered task definition.

aws ecs create-service --cluster demo-cluster --service-name demo-service --task-definition firelens-example-cloudwatch --desired-count 1 --launch-type "EC2"

After logging into the Amazon ECS console and drilling into my service, and my tasks, I find the container definition that exposes an External Link. This IP address is exposed since I asked for the container to map port 80 of the container port to port 80 of the host port inside of the task definition.

If I go to that IP adress in a browser then the NGINX container which I used as my app, serves its default page. The NGINX container logs any requests that it receives to Stdout and so FireLens will now forward these logs on to CloudWatch. I added a little message to the URL so that when I take a look at the logs, I should be able to quickly identify this request from all the others.

I then navigated over to the Amazon CloudWatch console and drilled down into the firelens-fluent-bit log group. If you remember this is the log group name that I set up in the original task definition. Below you will notice I have several logs in my log stream and the last one is the request that I just made in the browser. If you look closely at the log, you will find that “IT WORKS” is passed in as part of the GET request.

So there we have it, I successfully set up FireLens and had it forward my container logs on to CloudWatch I could, of course, have chosen a different destination, for example, a third-party provider like Datadog or an AWS destination like Amazon Kinesis Data Firehose.

If you want to try FireLens, it is available today in all regions that support Amazon ECS, and AWS Fargate.

Happy Logging!

Learn about AWS Services & Solutions – September AWS Online Tech Talks

Post Syndicated from Jenny Hang original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-september-aws-online-tech-talks/

Learn about AWS Services & Solutions – September AWS Online Tech Talks

AWS Tech Talks

Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

 

Compute:

September 23, 2019 | 11:00 AM – 12:00 PM PTBuild Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.

September 26, 2019 | 1:00 PM – 2:00 PM PTSelf-Hosted WordPress: It’s Easier Than You Think – Learn how you can easily build a fault-tolerant WordPress site using Amazon Lightsail.

October 3, 2019 | 11:00 AM – 12:00 PM PTLower Costs by Right Sizing Your Instance with Amazon EC2 T3 General Purpose Burstable Instances – Get an overview of T3 instances, understand what workloads are ideal for them, and understand how the T3 credit system works so that you can lower your EC2 instance costs today.

 

Containers:

September 26, 2019 | 11:00 AM – 12:00 PM PTDevelop a Web App Using Amazon ECS and AWS Cloud Development Kit (CDK) – Learn how to build your first app using CDK and AWS container services.

 

Data Lakes & Analytics:

September 26, 2019 | 9:00 AM – 10:00 AM PTBest Practices for Provisioning Amazon MSK Clusters and Using Popular Apache Kafka-Compatible Tooling – Learn best practices on running Apache Kafka production workloads at a lower cost on Amazon MSK.

 

Databases:

September 25, 2019 | 1:00 PM – 2:00 PM PTWhat’s New in Amazon DocumentDB (with MongoDB compatibility) – Learn what’s new in Amazon DocumentDB, a fully managed MongoDB compatible database service designed from the ground up to be fast, scalable, and highly available.

October 3, 2019 | 9:00 AM – 10:00 AM PTBest Practices for Enterprise-Class Security, High-Availability, and Scalability with Amazon ElastiCache – Learn about new enterprise-friendly Amazon ElastiCache enhancements like customer managed key and online scaling up or down to make your critical workloads more secure, scalable and available.

 

DevOps:

October 1, 2019 | 9:00 AM – 10:00 AM PT – CI/CD for Containers: A Way Forward for Your DevOps Pipeline – Learn how to build CI/CD pipelines using AWS services to get the most out of the agility afforded by containers.

 

Enterprise & Hybrid:

September 24, 2019 | 1:00 PM – 2:30 PM PT Virtual Workshop: How to Monitor and Manage Your AWS Costs – Learn how to visualize and manage your AWS cost and usage in this virtual hands-on workshop.

October 2, 2019 | 1:00 PM – 2:00 PM PT – Accelerate Cloud Adoption and Reduce Operational Risk with AWS Managed Services – Learn how AMS accelerates your migration to AWS, reduces your operating costs, improves security and compliance, and enables you to focus on your differentiating business priorities.

 

IoT:

September 25, 2019 | 9:00 AM – 10:00 AM PTComplex Monitoring for Industrial with AWS IoT Data Services – Learn how to solve your complex event monitoring challenges with AWS IoT Data Services.

 

Machine Learning:

September 23, 2019 | 9:00 AM – 10:00 AM PTTraining Machine Learning Models Faster – Learn how to train machine learning models quickly and with a single click using Amazon SageMaker.

September 30, 2019 | 11:00 AM – 12:00 PM PTUsing Containers for Deep Learning Workflows – Learn how containers can help address challenges in deploying deep learning environments.

October 3, 2019 | 1:00 PM – 2:30 PM PTVirtual Workshop: Getting Hands-On with Machine Learning and Ready to Race in the AWS DeepRacer League – Join DeClercq Wentzel, Senior Product Manager for AWS DeepRacer, for a presentation on the basics of machine learning and how to build a reinforcement learning model that you can use to join the AWS DeepRacer League.

 

AWS Marketplace:

September 30, 2019 | 9:00 AM – 10:00 AM PTAdvancing Software Procurement in a Containerized World – Learn how to deploy applications faster with third-party container products.

 

Migration:

September 24, 2019 | 11:00 AM – 12:00 PM PTApplication Migrations Using AWS Server Migration Service (SMS) – Learn how to use AWS Server Migration Service (SMS) for automating application migration and scheduling continuous replication, from your on-premises data centers or Microsoft Azure to AWS.

 

Networking & Content Delivery:

September 25, 2019 | 11:00 AM – 12:00 PM PTBuilding Highly Available and Performant Applications using AWS Global Accelerator – Learn how to build highly available and performant architectures for your applications with AWS Global Accelerator, now with source IP preservation.

September 30, 2019 | 1:00 PM – 2:00 PM PTAWS Office Hours: Amazon CloudFront – Just getting started with Amazon CloudFront and [email protected]? Get answers directly from our experts during AWS Office Hours.

 

Robotics:

October 1, 2019 | 11:00 AM – 12:00 PM PTRobots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.

 

Security, Identity & Compliance:

October 1, 2019 | 1:00 PM – 2:00 PM PTDeep Dive on Running Active Directory on AWS – Learn how to deploy Active Directory on AWS and start migrating your windows workloads.

 

Serverless:

October 2, 2019 | 9:00 AM – 10:00 AM PTDeep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.

 

Storage:

September 24, 2019 | 9:00 AM – 10:00 AM PTOptimize Your Amazon S3 Data Lake with S3 Storage Classes and Management Tools – Learn how to use the Amazon S3 Storage Classes and management tools to better manage your data lake at scale and to optimize storage costs and resources.

October 2, 2019 | 11:00 AM – 12:00 PM PTThe Great Migration to Cloud Storage: Choosing the Right Storage Solution for Your Workload – Learn more about AWS storage services and identify which service is the right fit for your business.

 

 

Sharing automated blueprints for Amazon ECS continuous delivery using AWS Service Catalog

Post Syndicated from Ignacio Riesgo original https://aws.amazon.com/blogs/compute/sharing-automated-blueprints-for-amazon-ecs-continuous-delivery-using-aws-service-catalog/

This post is contributed by Mahmoud ElZayet | Specialist SA – Dev Tech, AWS

 

Modern application development processes enable organizations to improve speed and quality continually. In this innovative culture, small, autonomous teams own the entire application life cycle. While such nimble, autonomous teams speed product delivery, they can also impose costs on compliance, quality assurance, and code deployment infrastructures.

Standardized tooling and application release code helps share best practices across teams, reduce duplicated code, speed on-boarding, create consistent governance, and prevent resource over-provisioning.

 

Overview

In this post, I show you how to use AWS Service Catalog to provide standardized and automated deployment blueprints. This helps accelerate and improve your product teams’ application release workflows on Amazon ECS. Follow my instructions to create a sample blueprint that your product teams can use to release containerized applications on ECS. You can also apply the blueprint concept to other technologies, such as serverless or Amazon EC2–based deployments.

The sample templates and scripts provided here are for demonstration purposes and should not be used “as-is” in your production environment. After you become familiar with these resources, create customized versions for your production environment, taking account of in-house tools and team skills, as well as all applicable standards and restrictions.

 

Prerequisites

To use this solution, you need the following resources:

 

Sample scenario

Example Corp. has various product teams that develop applications and services on AWS. Example Corp. teams have expressed interest in deploying their containerized applications managed by AWS Fargate on ECS. As part of Example Corp’s central tooling team, you want to enable teams to quickly release their applications on Fargate. However, you also make sure that they comply with all best practices and governance requirements.

For convenience, I also assume that you have supplied product teams working on the same domain, application, or project with a shared AWS account for service deployment. Using this account, they all deploy to the same ECS cluster.

In this scenario, you can author and provide these teams with a shared deployment blueprint on ECS Fargate. Using AWS Service Catalog, you can share the blueprint with teams as follows:

  1. Every time that a product team wants to release a new containerized application on ECS, they retrieve a new AWS Service Catalog ECS blueprint product. This enables them to obtain the required infrastructure, permissions, and tools. As a prerequisite, the ECS blueprint requires building blocks such as a git repository or an AWS CodeBuild project. Again, you can acquire those blocks through another AWS Service Catalog product.
  2. The product team completes the ECS blueprint’s required parameters, such as the desired number of ECS tasks and application name. As an administrator, you can constrain the value of some parameters such as the VPC and the cluster name. For more information, see AWS Service Catalog Template Constraints.
  3. The ECS blueprint product deploys all the required ECS resources, configured according to best practices. You can also use the AWS Cloud Development Kit (CDK) to maintain and provision pre-defined constructs for your infrastructure.
  4. A standardized CI/CD pipeline also generates, enabling your product teams to publish their application to ECS automatically. Ideally, this pipeline should have all stages, practices, security checks, and standards required for application release. Product teams must still author application code, create a Dockerfile, build specifications, run automated tests and deployment scripts, and complete other tasks required for application release.
  5. The ECS blueprint can be continually updated based on organization-wide feedback and to support new use cases. Your product team can always access the latest version through AWS Service Catalog. I recommend retaining multiple, customizable blueprints for various technologies.

 

For simplicity’s sake, my explanation envisions your environment as consisting of one AWS account. In practice, you can use IAM controls to segregate teams’ access to each other’s resources, even when they share an account. However, I recommend having at least two AWS accounts, one for testing and one for production purposes.

To see an example framework that helps deploy your AWS Service Catalog products to multiple accounts, see AWS Deployment Framework (ADF). This framework can also help you create cross-account pipelines that cater to different product teams’ needs, even when these teams deploy to the same technology stack.

To set up shared deployment blueprints for your production teams, follow the steps outlined in the following sections.

 

Set up the environment

In this section, I explain how to create a central ECS cluster in the appropriate VPC where teams can deploy their containers. I provide an AWS CloudFormation template to help you set up these resources. This template also creates an IAM role to be used by AWS Service Catalog later.

To run the CloudFormation template:

1. Use a git client to clone the following GitHub repository to a local directory. This will be the directory where you will run all the subsequent AWS CLI commands.

2. Using the AWS CLI, run the following commands. Replace <Application_Name> with a lowercase string with no spaces representing the application or microservice that your product team plans to release—for example, myapp.

aws cloudformation create-stack --stack-name "fargate-blueprint-prereqs" --template-body file://environment-setup.yaml --capabilities CAPABILITY_NAMED_IAM --parameters ParameterKey=ApplicationName,ParameterValue=<Application_Name>

3. Keep running the following command until the output reads CREATE_COMPLETE:

aws cloudformation describe-stacks --stack-name "fargate-blueprint-prereqs" --query Stacks[0].StackStatus

4. In case of error, use the describe-events CLI command or review error details on the console.

5. When the stack creation reads CREATE_COMPLETE, run the following command, and make a note of the output values in an editor of your choice. You need this information for a later step:

aws cloudformation describe-stacks  --stack-name fargate-blueprint-prereqs --query Stacks[0].Outputs

6. Run the following commands to copy those CloudFormation templates to Amazon S3. Replace <Template_Bucket_Name> with the template bucket output value you just copied into your editor of choice:

aws s3 cp core-build-tools.yml s3://<Template_Bucket_Name>/core-build-tools.yml

aws s3 cp ecs-fargate-deployment-blueprint.yml s3://<Template_Bucket_Name>/ecs-fargate-deployment-blueprint.yml

Create AWS Service Catalog products

In this section, I show you how to create two AWS Service Catalog products for teams to use in publishing their containerized app:

  1. Core Build Tools
  2. ECS Fargate Deployment Blueprint

To create an AWS Service Catalog portfolio that includes these products:

1. Using the AWS CLI, run the following command, replacing <Application_Name>
with the application name you defined earlier and replacing <Template_Bucket_Name>
with the template bucket output value you copied into your editor of choice:

aws cloudformation create-stack --stack-name "fargate-blueprint-catalog-products" --template-body file://catalog-products.yaml --parameters ParameterKey=ApplicationName,ParameterValue=<Application_Name> ParameterKey=TemplateBucketName,ParameterValue=<Template_Bucket_Name>

2. After a few minutes, check the stack creation completion. Run the following command until the output reads CREATE_COMPLETE:

aws cloudformation describe-stacks --stack-name "fargate-blueprint-catalog-products" --query Stacks[0].StackStatus

3. In case of error, use the describe-events CLI command or check error details in the console.

Your AWS Service Catalog configuration should now be ready.

 

Test product teams experience

In this section, I show you how to use IAM roles to impersonate a product team member and simulate their first experience of containerized application deployment.

 

Assume team role

To assume the role that you created during the environment setup step

1.     In the Management console, follow the instructions in Switching a Role.

  • For Account, enter the account ID used in the sample solution. To learn more about how to find an AWS account ID, see Your AWS Account ID and Its Alias.
  • For Role, enter <Application_Name>-product-team-role, where <Application_Name> is the same application name you defined in Environment Setup section.
  • (Optional) For Display name, enter a custom session value.

You are now logged in as a member of the product team.

 

Provision core build product

Next, provision the core build tools for your blueprint:

  1. In the Service Catalog console, you should now see the two products created earlier listed under Products.
  2. Select the first product, Core Build Tools.
  3. Choose LAUNCH PRODUCT.
  4. Name the product something such as <Application_Name>-build-tools, replacing <Application_Name> with the name previously defined for your application.
  5. Provide the same application name you defined previously.
  6. Leave the ContainerBuild parameter default setting as yes, as you are building a container requiring a container repository and its associated permissions.
  7. Choose NEXT three times, then choose LAUNCH.
  8. Under Events, watch the Status property. Keep refreshing until the status reads Succeeded. In case of failure, choose the URL value next to the key CloudformationStackARN. This choice takes you to the CloudFormation console, where you can find more information on the errors.

Now you have the following build tools created along with the required permissions:

  • AWS CodeCommit repository to store your code
  • CodeBuild project to build your container image and test your application code
  • Amazon ECR repository to store your container images
  • Amazon S3 bucket to store your build and release artifacts

 

Provision ECS Fargate deployment blueprint

In the Service Catalog console, follow the same steps to deploy the blueprint for ECS deployment. Here are the product provisioning details:

  • Product Name: <Application_Name>-fargate-blueprint.
  • Provisioned Product Name: <Application_Name>-ecs-fargate-blueprint.
  • For the parameters Subnet1, Subnet2, VpcId, enter the output values you copied earlier into your editor of choice in the Setup Environment section.
  • For other parameters, enter the following:
    • ApplicationName: The same application name you defined previously.
    • ClusterName: Enter the value example-corp-ecs-cluster, which is the name chosen in the template for the central cluster.
  • Leave the DesiredCount and LaunchType parameters to their default values.

After the blueprint product creation completes, you should have an ECS service with a sample task definition for your product team. The build tools created earlier include the permissions required for deploying to the ECS service. Also, a CI/CD pipeline has been created to guide your product teams as they publish their application to the ECS service. Ideally, this pipeline should have all stages, practices, security checks, and standards required for application release.

Product teams still have to author application code, create a Dockerfile, build specifications, run automated tests and deployment scripts, and perform other tasks required for application release. The blueprint product can provide wiki links to reference examples for these steps, or access to pre-provisioned sample pipelines.

 

Test your pipeline

Now, upload a sample app to test your pipeline:

  1. Log in with the product team role.
  2. In the CodeCommit console, select the repository with the application name that you defined in the environment setup section.
  3. Scroll down, choose Add file, Create file.
  4. Paste the following in the page editor, which is a script to build the container image and push it to the ECR repository:
version: 0.2
phases:
  pre_build:
    commands:
      - $(aws ecr get-login --no-include-email)
      - TAG="$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | head -c 8)"
      - IMAGE_URI="${REPOSITORY_URI}:${TAG}"
  build:
    commands:
      - docker build --tag "$IMAGE_URI" .
  post_build:
    commands:
      - docker push "$IMAGE_URI"      
      - printf '[{"name":"%s","imageUri":"%s"}]' "$APPLICATION_NAME" "$IMAGE_URI" > images.json
artifacts:
  files: 
    - images.json
    - '**/*'

5. For File name, enter buildspec.yml.

6. For Author name and Email address, enter your name and your preferred email address for the commit. Although optional, the addition of a commit message is a good practice.

7. Choose Commit changes.

8. Repeat the same steps for the Dockerfile. The sample Dockerfile creates a straightforward PHP application. Typically, you add your application content to that image.

File name: Dockerfile

File content:

FROM ubuntu:12.04

# Install dependencies
RUN apt-get update -y
RUN apt-get install -y git curl apache2 php5 libapache2-mod-php5 php5-mcrypt php5-mysql

# Configure apache
RUN a2enmod rewrite
RUN chown -R www-data:www-data /var/www
ENV APACHE_RUN_USER www-data
ENV APACHE_RUN_GROUP www-data
ENV APACHE_LOG_DIR /var/log/apache2

EXPOSE 80

CMD ["/usr/sbin/apache2", "-D",  "FOREGROUND"]

Your pipeline should now be ready to run successfully. Although you can list all current pipelines in the Region, you can only describe and modify pipelines that have a prefix matching your application name. To confirm:

  1. In the AWS CodePipeline console, select the pipeline <Application_Name>-ecs-fargate-pipeline.
  2. The pipeline should now be running.

Because you performed two commits to the repository from the console, you must wait for the second run to complete before successful deployment to ECS Fargate.

 

Clean up

To clean up the environment, run the following commands in the AWS CLI, replacing <Application_Name>
with your application name, <Account_Id> with your AWS Account ID with no hyphens and <Template_Bucket_Name>
with the template bucket output value you copied into your editor of choice:

aws ecr delete-repository --repository-name <Application_Name> --force

aws s3 rm s3://<Application_Name>-artifactbucket-<Account_Id> --recursive

aws s3 rm s3://<Template_Bucket_Name> --recursive

 

To remove the AWS Service Catalog products:

  1. Log in with the Product team role
  2. In the console, follow the instructions at Deleting Provisioned Products.
  3. Delete the AWS Service Catalog products in reverse order, starting with the blueprint product.

Run the following commands to delete the administrative resources:

aws cloudformation delete-stack --stack-name fargate-blueprint-catalog-products

aws cloudformation delete-stack --stack-name fargate-blueprint-prereqs

Conclusion

In this post, I showed you how to design and build ECS Fargate deployment blueprints. I explained how these accelerate and standardize the release of containerized applications on AWS. Your product teams can keep getting the latest standards and coded best practices through those automated blueprints.

As always, AWS welcomes feedback. Please submit comments or questions below.

Deploying GitOps with Weave Flux and Amazon EKS

Post Syndicated from Ignacio Riesgo original https://aws.amazon.com/blogs/compute/deploying-gitops-with-weave-flux-and-amazon-eks/

This post is contributed by Jon Jozwiak | Senior Solutions Architect, AWS

 

You have countless options for deploying resources into an Amazon EKS cluster. GitOps—a term coined by Weaveworks—provides some substantial advantages over the alternatives. With only Git as the single, central source for controlling deployment into your cluster, GitOps provides easy version control on a platform your team already knows. Getting started with GitOps is straightforward: create a pull request, merge, and the configuration deploys to the EKS cluster.

Weave Flux makes running GitOps in your EKS cluster fast and easy, as it monitors your configuration in Git and image repositories and automates deployments. Weave Flux follows a pull model, automatically triggering deployments based on changes. This provides better security than most continuous deployment tools, which need permissions to access your cluster. This approach also provides Git with version control over your configuration and enables rollback.

This post walks through implementing Weave Flux and deploying resources to EKS using Git. To simplify the image build pipeline, I use AWS Service Catalog to provide a standardized pipeline. AWS Service Catalog lets you centrally define a portfolio of approved products that AWS users can provision. An AWS CloudFormation template defines each product, which can be version-controlled.

After you deploy the sample resources, I quickly demonstrate the GitOps approach where a new image results in the configuration automatically deploying to EKS. This new image may be a commit of Kubernetes manifests or a commit of Helm release definitions.

The following diagram shows the workflow.

Prerequisites

In GitOps, you manage Docker image builds separately from deployment configuration. For image builds, this example uses AWS CodePipeline and AWS CodeBuild, which provide a managed workflow from GitHub source through to an image landing in Amazon Elastic Container Registry (ECR).

This post assumes that you already have an EKS cluster deployed, including kubectl access. It also assumes that you have a GitHub account.

GitHub setup

First, create a GitHub repository to store the Kubernetes manifests (configuration files) to apply to the cluster.

In GitHub, create a GitHub repository. This repository holds Kubernetes manifests for your deployments. Name the repository k8s-config to align with this post. Leave it as a public repository, check the box for Initialize this repository with a README, and choose Create Repo.

On the GitHub repository page, choose Clone or Download and save the SSH string:

[email protected]:youruser/k8s-config.git

Next, create a GitHub token that allows creating and deleting repositories so AWS Service Catalog can deploy and remove pipelines.

  1. In your GitHub profile, access your token settings.
  2. Choose Generate New Token.
  3. Name your new token CodePipeline Service Catalog, and select the following options:
  • repo scopes (repo:status, repo_deployment, public_repo, and repo:invite)
  • read:org
  • write:public_key and read:public_key
  • write:repo_hook and read:repo_hook
  • read:user and user:email
  • delete_repo

4 . Choose Generate Token.

5. Copy and save your access token for future access.

 

Deploy Helm

Helm is a package manager for Kubernetes that allows you to define a chart. Charts are collections of related resources that let you create, version, share, and publish applications. By deploying Helm into your cluster, you make it much easier to deploy Weave Flux and other systems. If you’ve deployed Helm already, skip this section.

First, install the Helm client with the following command:

curl -LO https://git.io/get_helm.sh

chmod 700 get_helm.sh

./get_helm.sh

 

On macOS, you could alternatively enter the following command:

brew install kubernetes-helm

 

Next, set up a service account with cluster role for Tiller, Helm’s server-side component. This allows Tiller to manage resources in your cluster.

kubectl -n kube-system create sa tiller

kubectl create clusterrolebinding tiller-cluster-rule \

--clusterrole=cluster-admin \

--serviceaccount=kube-system:tiller

 

Finally, initialize Helm and verify your version. Tiller takes a few seconds to start.

helm init --service-account tiller --history-max 200

helm version

 

Deploy Weave Flux

With Helm installed, proceed with the Weave Flux installation. Begin by installing the Flux Custom Resource Definition.

kubectl apply -f https://raw.githubusercontent.com/fluxcd/flux/helm-0.10.1/deploy-helm/flux-helm-release-crd.yaml

Now add the Weave Flux Helm repository and proceed with the install. Make sure that you update the git.url to match the GitHub repository that you created earlier.

helm repo add fluxcd https://charts.fluxcd.io

helm upgrade -i flux --set helmOperator.create=true --set helmOperator.createCRD=false --set [email protected]:YOURUSER/k8s-config --namespace flux fluxcd/flux

 

You can use the following code to verify that you successfully deployed Flux. You should see three pods running:

kubectl get pods -n flux

NAME                                 READY     STATUS    RESTARTS   AGE

flux-5bd7fb6bb6-4sc78                1/1       Running   0          52s

flux-helm-operator-df5746688-84kw8   1/1       Running   0          52s

flux-memcached-6f8c446979-f45wj      1/1       Running   0          52s

 

Flux requires a deploy key to work with the GitHub repository. In this post, Flux generates the SSH key pair itself, but you can also specify a different key pair when deploying. To access the key, download fluxctl, a command line utility that interacts with the Flux API. The following steps work for Linux. For other OS platforms, see Installing fluxctl.

sudo wget -O /usr/local/bin/fluxctl https://github.com/fluxcd/flux/releases/download/1.14.1/fluxctl_linux_amd64

sudo chmod 755 /usr/local/bin/fluxctl

 

Validate that fluxctl installed successfully, then retrieve the public key pair using the following command. Specify the namespace where you deployed Flux.

fluxctl version

fluxctl --k8s-fwd-ns=flux identity

 

Copy the key and add that as a deploy key in your GitHub repository.

  1. In your GitHub repository, choose Settings, Deploy Keys.
  2. Choose Add deploy key and name the key Flux Deploy Key.
  3. Paste the key from fluxctl identity.
  4. Choose Allow Write Access, Add Key.

Now use AWS Service Catalog to set up your image build pipeline.

 

Set up AWS Service Catalog

To allow end users to consume product portfolios, you must associate a portfolio with an IAM principal (or principals): a user, group, or role. For this example, associate your current identity. After you master these basics, there are additional resources to teach you how to set up a multi-region, multi-account catalog.

To retrieve your current identity, use the AWS CLI to get your ARN:

aws sts get-caller-identity

Deploy the product portfolio that contains an image build pipeline service by doing the following:

  1. In the AWS CloudFormation console, launch the CloudFormation stack with the following link:

 

 

2. Choose Next.

3. On the Specify Details page, enter your ARN from get-caller-identity. Also enter an environment tag, which AWS applies to all resources from this portfolio.

4. Choose Next.

5. On the Options page, choose Next.

6. On the Review page, select the check box displayed next to I acknowledge that AWS CloudFormation might create IAM resources.

7. Choose Create. CloudFormation takes a few minutes to create your resources.

 

Deploy the image pipeline

The image pipeline provisions a GitHub repository, Amazon ECR repository, and AWS CodeBuild project. It also uses AWS CodePipeline to build a Docker image.

  1. In the AWS Management Console, go to the AWS Service Catalog products list and choose Pipeline for Docker Images.
  2. Choose Launch Product.
  3. For Name, enter ExamplePipeline, and choose Next.
  4. On the Parameters page, fill in a project name, description, and unique S3 bucket name. The specifics don’t matter, but make a note of the name and S3 bucket for later use.
  5. Fill in your GitHub User and GitHub Token values from earlier. Leave the rest of the fields as the default values.
  6. To clean up your GitHub repository on stack delete, change Delete Repository to true.
  7. Choose Next.
  8. On the TagOptions screen, choose Next.
  9. Choose Next on the Notifications page.
  10. On the Review page, choose Launch.

The launch process takes 1–2 minutes. You can verify that you now have a repository matching your project name (eks-example) in GitHub. You can also look at the pipeline created in the AWS CodePipeline console.

 

Deploying with GitOps

You can now provision workloads into the EKS cluster. With a GitOps approach, you only commit code and Kubernetes resource definitions to GitHub. AWS CodePipeline handles the image builds, and Weave Flux applies the desired state to Kubernetes.

First, create a simple Hello World application in your example pipeline. Clone the GitHub repository that you created in the previous step and substitute your GitHub user below.

git clone [email protected]:youruser/eks-example.git

cd eks-example

Create a base README file, a source directory, and download a simple NGINX configuration (hello.conf), home page (index.html), and Dockerfile.

echo "# eks-example" > README.md

mkdir src

wget -O src/hello.conf https://blog-gitops-eks.s3.amazonaws.com/hello.conf

wget -O src/index.html https://blog-gitops-eks.s3.amazonaws.com/index.html

wget https://blog-gitops-eks.s3.amazonaws.com/Dockerfile

 

Now that you have a simple Hello World app with Dockerfile, commit the changes to kick off the pipeline.

git add .

git commit -am "Initial commit"

[master (root-commit) d69a6ba] Initial commit

4 files changed, 34 insertions(+)

create mode 100644 Dockerfile

create mode 100644 README.md

create mode 100644 src/hello.conf

create mode 100644 src/index.html

git push

 

Watch in the AWS CodePipeline console to see the image build in process. This may take a minute to start. When it’s done, look in the ECR console to see the first version of the container image.

To deploy this image and the Hello World application, commit Kubernetes manifests for Flux. Create a namespace, deployment, and service in the Kubernetes Git repository (k8s-config) you created. Make sure that you aren’t in your eks-example repository directory.

cd ..

git clone [email protected]:youruser/k8s-config.git

cd k8s-config

mkdir charts namespaces releases workloads

 

The preceding directory structure helps organize the repository but isn’t necessary. Flux can descend into subdirectories and look for YAML files to apply.

Create a namespace Kubernetes manifest.

cat << EOF > namespaces/eks-example.yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    name: eks-example
  name: eks-example
EOF

Now create a deployment manifest. Make sure that you update this image to point to your repository and image tag. For example, <Account ID>.dkr.ecr.us-east-1.amazonaws.com/eks-example:d69a6bac.

cat << EOF > workloads/eks-example-dep.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eks-example
  namespace: eks-example
  labels:
    app: eks-example
  annotations:
    # Container Image Automated Updates
    flux.weave.works/automated: "true"
    # do not apply this manifest on the cluster
    #flux.weave.works/ignore: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: eks-example
  template:
    metadata:
      labels:
        app: eks-example
    spec:
      containers:
      - name: eks-example
        image: <Your Account>.dkr.ecr.us-east-1.amazonaws.com/eks-example:d69a6bac
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /
            port: http
        readinessProbe:
          httpGet:
            path: /
            port: http
EOF

 

Finally, create a service manifest to create a load balancer.

cat << EOF > workloads/eks-example-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: eks-example
  namespace: eks-example
  labels:
    app: eks-example
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: eks-example
EOF

 

In the preceding code, there are two Kubernetes annotations for Flux. The first, flux.weave.works/automated, tells Flux whether the container image should be automatically updated. This example sets the value to true, enabling updates to your deployment as new images arrive in the registry. This example comments out the second annotation, flux.weave.works/ignore. However, you can use it to tell Flux to ignore the deployment temporarily.

Commit the changes, and in a few minutes, it automatically deploys.

git add .
git commit -am "eks-example deployment"
[master 954908c] eks-example deployment
 3 files changed, 64 insertions(+)
 create mode 100644 namespaces/eks-example.yaml
 create mode 100644 workloads/eks-example-dep.yaml
 create mode 100644 workloads/eks-example-svc.yaml

 

Make sure that you push your changes.

git push

Now check the logs of your Flux pod:

kubectl get pods -n flux

Update the name below to reflect the name of the pod in your deployment. This sample pulls every five minutes for changes. When it triggers, you should see kubectl apply log messages to create the namespace, service, and deployment.

kubectl logs flux-5bd7fb6bb6-4sc78 -n flux

Find the load balancer input for your service with the following:

kubectl describe service eks-example -n eks-example

Now when you connect to the load balancer address in a browser, you can see the Hello World app.

Change the eks-example source code in a small way (such as changing index.html to say Hello World Deployment 2), then commit and push to Git.

After a few minutes, refresh your browser to see the deployed change. You can watch the changes in AWS CodePipeline, in ECR, and through Flux logs. Weave Flux automatically updated your deployment manifests in the k8s-config repository to deploy the new image as it detected it. To back out that change, use a git revert or git reset command.

Finally, you can use the same approach to deploy Helm charts. You can host these charts within the configuration Git repository (k8s-config in this example), or on an external chart repository. In the following example, you use an external chart repository.

In your k8s-config directory, get the latest changes from your repository and then create a Helm release from an external chart.

cd k8s-config

git pull

 

First, create the namespace manifest.

cat << EOF > namespaces/nginx.yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    name: nginx
  name: nginx
EOF

 

Then create the Helm release manifest. This is a custom resource definition provided by Weave Flux.

cat << EOF > releases/nginx.yaml
apiVersion: flux.weave.works/v1beta1
kind: HelmRelease
metadata:
  name: mywebserver
  namespace: nginx
  annotations:
    flux.weave.works/automated: "true"
    flux.weave.works/tag.nginx: semver:~1.16
    flux.weave.works/locked: 'true'
    flux.weave.works/locked_msg: '"Halt updates for now"'
    flux.weave.works/locked_user: User Name <[email protected]>
spec:
  releaseName: mywebserver
  chart:
    repository: https://charts.bitnami.com/bitnami/
    name: nginx
    version: 3.3.2
  values:
    usePassword: true
    image:
      registry: docker.io
      repository: bitnami/nginx
      tag: 1.16.0-debian-9-r46
    service:
      type: LoadBalancer
      port: 80
      nodePorts:
        http: ""
      externalTrafficPolicy: Cluster
    ingress:
      enabled: false
    livenessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 30
      timeoutSeconds: 5
      failureThreshold: 6
    readinessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 5
      timeoutSeconds: 3
      periodSeconds: 5
    metrics:
      enabled: false
EOF

git add . 
git commit -am "Adding NGINX Helm release"
git push

 

There are a few new annotations for Flux above. The flux.weave.works/locked annotation tells Flux to lock the deployment. This is useful if you find a known bad image and must roll back to a previous version. In addition, the flux.weave.works/tag.nginx annotation filters image tags by semantic versioning.

Wait up to five minutes for Flux to pull the configuration and verify this deployment as you did in the previous example:

kubectl get pods -n flux

kubectl logs flux-5bd7fb6bb6-4sc78 -n flux

 

kubectl get all -n nginx

 

If this doesn’t deploy, ensure Helm initialized as described earlier in this post.

kubectl get pods -n kube-system | grep tiller

kubectl get pods -n flux

kubectl logs flux-helm-operator-df5746688-84kw8 -n flux

 

Clean up

Log in as an administrator and follow these steps to clean up your sample deployment.

  1. Delete all images from the Amazon ECR repository.

2. In AWS Service Catalog provisioned products, select the three dots to the left of your ExamplePipeline service and choose Terminate provisioned product. Wait until it completes termination (1–2 minutes).

3. Delete your Amazon S3 artifact bucket.

4. Delete Weave Flux:

helm delete flux --purge

kubectl delete ns flux

kubectl delete crd helmreleases.flux.weave.works

5. Delete the load balancer services:

helm delete mywebserver --purge

kubectl delete ns nginx

kubectl delete svc eks-example -n eks-example

kubectl delete deployment eks-example -n eks-example

kubectl delete ns eks-example

6. Clean up your GitHub repositories:

 – Go to your k8s-config repository in GitHub, choose Settings, scroll to the bottom and choose Delete this repository. If you set delete to false in the pipeline service, you also must delete your eks-example repository.

 – Delete the personal access token that you created.

7.     If you provisioned an EKS cluster at the beginning of this post, delete it:

eksctl get cluster

eksctl delete cluster <clustername>

8.     In the AWS CloudFormation console, select the DevServiceCatalog stack, and choose the Actions, Delete Stack.

Conclusion

In this post, I demonstrated how to use a GitOps approach, which allows you to focus on committing code and configuration to Git rather than learning new CI/CD tooling. Git acts as the single source of truth, and Weave Flux pulls changes and ensures that the Kubernetes cluster configuration matches the desired state.

In addition, AWS Service Catalog can be used to create a portfolio of services that enables you to standardize your offerings, such as an image build pipeline based on AWS CodePipeline.

As always, AWS welcomes feedback. Please submit comments or questions below.

Improve Productivity and Reduce Overhead Expenses with Red Hat OpenShift Dedicated on AWS

Post Syndicated from Ryan Niksch original https://aws.amazon.com/blogs/architecture/improve-productivity-and-reduce-overhead-expenses-with-red-hat-openshift-dedicated-on-aws/

Red Hat OpenShift on AWS helps you develop, deploy, and manage container-based applications across on-premises and cloud environments. A recent case study from Cathay Pacific Airways proved that the use of the Red Hat OpenShift application platform can significantly improve developer productivity and reduce operational overhead by automating infrastructure, application deployment, and scaling. In this post, I explore how the architectural implementation and customization options of Red Hat OpenShift dedicated on AWS can cater to a variety of customer needs.

Red Hat OpenShift is a turn key solution providing  a container runtime, Kubernetes orchestration, container image repositories, pipeline, build process, monitoring, logging, role-based access control, granular policy-based control, and abstractions to simplify functions. Deploying a single turnkey solution, instead of building and integrating a collection of independent solutions or services, allows you to invest more time and effort in building meaningful applications for your business.

In the past, Red Hat OpenShift deployed on Amazon EC2 using an automated provisioning process with an open source solution, like the Red Hat OpenShift on AWS Quick Start. The Red Hat OpenShift Quick Start is an infrastructure as code solution which accelerates customer provisioning of Red Hat OpenShift on AWS. The OpenShift Quick Start adheres to the reference architecture to deploy Red Hat OpenShift on AWS in a resilient, scalable, well-architected manner. This reference architecture sees the control plane as a collection of load balanced master nodes for traffic routing, session state, scheduling, and monitoring. It also contains the application nodes where the customer’s containerized workloads run. This solution allowed customers to get up and running within three hours; however, it did not reduce management overhead because customers were required to monitor and maintain the infrastructure of the Red Hat OpenShift cluster.

Red Hat and AWS listened to customer feedback and created Red Hat OpenShift dedicated, a fully managed OpenShift implementation running exclusively on AWS. This implementation monitors the layers and functions, scales the layers to cater to consumption needs, and addresses operational concerns.

Customers now have access to a platform that helps manage control planes for business-critical solutions, like their developer and operational platforms.

Red Hat OpenShift Dedicated Infrastructure on AWS

You can purchase Red Hat OpenShift dedicated through the Red Hat account team. Red Hat OpenShift dedicated comes in two varieties: the Standard edition and the Cloud Choice edition (bring your own cloud).

redhar-openshit options

Figure 1: Red Hat OpenShift architecture illustrating master and infrastructure nodes spread over three Availability Zones and placed behind elastic load balancers.

Red Hat OpenShift dedicated adheres to the reference architecture defined by AWS and Red Hat. Master and infrastructure layers are spread across three AWS availability zones providing resilience within the OpenShift solution, as well as the underlying infrastructure.

Red Hat OpenShift Dedicated Standard Edition

In the Red Hat OpenShift dedicated standard edition, Red Hat deploys the OpenShift cluster into an AWS account owned and managed by Red Hat. Red Hat provides an aggregated bill for the OpenShift subscription fees, management fees, and AWS billing. This edition is ideal for customers who want everything to be managed for them. The Red Hat site reliability engineering  team (SRE) will monitor and manage healing, scaling, and patching of the cluster.

Red Hat OpenShift Cloud Choice Edition

The cloud choice edition allows customers to create their own AWS account, and then have the Red Hat OpenShift dedicated infrastructure provisioned into their existing account. The Red Hat SRE team provisions the Red Hat OpenShift cluster into the customer owned AWS account and manages the solution via IAM roles.

Figure 2: Red Hat OpenShift Cloud Choice IAM role separation

Red Hat provides billing for the Red Hat OpenShift Cloud Choice subscription and management fees, and AWS provides billing for the AWS resources. Keeping the Red Hat OpenShift infrastructure within your AWS account allows better cost controls.

Red Hat OpenShift Cloud Choice provides visibility into the resources running in your account; which is desirable if you have regulatory and auditing concerns. You can inspect, monitor, and audit resources within the AWS account — taking advantage of the rich AWS service set (AWS CloudTrail, AWS config, AWS CloudWatch, and AWS cost explorer).

You can also take advantage of cost management solutions like AWS organizations and consolidated billing. Customers with multiple business units using AWS can combine the usage across their accounts to share the volume pricing discounts resulting in cost savings for projects, departments, and companies.

Red Hat OpenShift Cloud Choice dedicated cannot be deployed into an account currently hosting other applications and resources. In order to maintain separation of control with the managed service, Red Hat OpenShift Cloud Choice dedicated requires an AWS account dedicated to the managed Red Hat OpenShift solution.

You can take advantage of cost reductions of up to 70% using Reserved Instances, which match the pervasive running instances. This is ideal for the master and infrastructure nodes of the Red Hat OpenShift solutions running in your account. The reference architecture for Red Hat OpenShift on AWS recommends spanning  nodes over three availability zones, which translates to three master instances. The master and infrastructure nodes scale differently; so, there will be three additional instances for the infrastructure nodes. Purchasing reserved instances to offset the costs of the master nodes and the infrastructure nodes can free up funds for your next project.

Interactions

DevOps teams using either edition of Red Hat OpenShift dedicated have a rich console experience providing control over networking between application workloads, storage, and monitoring. Granular drill down consoles enable operations teams to focus on what is most critical to their organization.

Each interface is controlled through granular role-based access control. Teams have visibility of high-level cluster overviews where they are able to see visualizations of the overall health of the cluster; and they have access to more granular overviews of views of hosts, nodes, and containers. Application owners, key stake holders, and operations teams have access to a customizable dashboard displaying the running state. Teams can drill down to the underlying nodes, and further into the PODs and containers, should they wish to explore the status or overall health of the containerized micro services. The cluster-wide event stream provides the same drill down experience to logging events.

The drill down console menu options are illustrated in the screenshots below:

In summary, the partnership of Red Hat and AWS created a fully managed solution which directly answers customer feedback requests for a fully managed application platform running on the availability, scalability, and cost benefits of AWS. The solution allows visibility and control whenever and wherever you need it.

About the author

Ryan Niksch

Ryan Niksch is a Partner Solutions Architect focusing on application platforms, hybrid application solutions, and modernization. Ryan has worn many hats in his life and has a passion for tinkering and a desire to leave everything he touches a little better than when he found it.

Optimizing Amazon ECS task density using awsvpc network mode

Post Syndicated from Ignacio Riesgo original https://aws.amazon.com/blogs/compute/optimizing-amazon-ecs-task-density-using-awsvpc-network-mode/

This post is contributed by Tony Pujals | Senior Developer Advocate, AWS

 

AWS recently increased the number of elastic network interfaces available when you run tasks on Amazon ECS. Use the account setting called awsvpcTrunking. If you use the Amazon EC2 launch type and task networking (awsvpc network mode), you can now run more tasks on an instance—5 to 17 times as many—as you did before.

As more of you embrace microservices architectures, you deploy increasing numbers of smaller tasks. AWS now offers you the option of more efficient packing per instance, potentially resulting in smaller clusters and associated savings.

 

Overview

To manage your own cluster of EC2 instances, use the EC2 launch type. Use task networking to run ECS tasks using the same networking properties as if tasks were distinct EC2 instances.

Task networking offers several benefits. Every task launched with awsvpc network mode has its own attached network interface, a primary private IP address, and an internal DNS hostname. This simplifies container networking and gives you more control over how tasks communicate, both with each other and with other services within their virtual private clouds (VPCs).

Task networking also lets you take advantage of other EC2 networking features like VPC Flow Logs. This feature lets you monitor traffic to and from tasks. It also provides greater security control for containers, allowing you to use security groups and network monitoring tools at a more granular level within tasks. For more information, see Introducing Cloud Native Networking for Amazon ECS Containers.

However, if you run container tasks on EC2 instances with task networking, you can face a networking limit. This might surprise you, particularly when an instance has plenty of free CPU and memory. The limit reflects the number of network interfaces available to support awsvpc network mode per container instance.

 

Raise network interface density limits with trunking

The good news is that AWS raised network interface density limits by implementing a networking feature on ECS called “trunking.” This is a technique for multiplexing data over a shared communication link.

If you’re migrating to microservices using AWS App Mesh, you should optimize network interface density. App Mesh requires awsvpc networking to provide routing control and visibility over an ever-expanding array of running tasks. In this context, increased network interface density might save money.

By opting for network interface trunking, you should see a significant increase in capacity—from 5 to 17 times more than the previous limit. For more information on the new task limits per container instance, see Supported Amazon EC2 Instance Types.

Applications with tasks not hitting CPU or memory limits also benefit from this feature through the more cost-effective “bin packing” of container instances.

 

Trunking is an opt-in feature

AWS chose to make the trunking feature opt-in due to the following factors:

  • Instance registration: While normal instance registration is straightforward with trunking, this feature increases the number of asynchronous instance registration steps that can potentially fail. Any such failures might add extra seconds to launch time.
  • Available IP addresses: The “trunk” belongs to the same subnet in which the instance’s primary network interface originates. This effectively reduces the available IP addresses and potentially the ability to scale out on other EC2 instances sharing the same subnet. The trunk consumes an IP address. With a trunk attached, there are two assigned IP addresses per instance, one for the primary interface and one for the trunk.
  • Differing customer preferences and infrastructure: If you have high CPU or memory workloads, you might not benefit from trunking. Or, you may not want awsvpc networking.

Consequently, AWS leaves it to you to decide if you want to use this feature. AWS might revisit this decision in the future, based on customer feedback. For now, your account roles or users must opt in to the awsvpcTrunking account setting to gain the benefits of increased task density per container instance.

 

Enable trunking

Enable the ECS elastic network interface trunking feature to increase the number of network interfaces that can be attached to supported EC2 container instance types. You must meet the following prerequisites before you can launch a container instance with the increased network interface limits:

  • Your account must have the AWSServiceRoleForECS service-linked role for ECS.
  • You must opt into the awsvpcTrunking  account setting.

 

Make sure that a service-linked role exists for ECS

A service-linked role is a unique type of IAM role linked to an AWS service (such as ECS). This role lets you delegate the permissions necessary to call other AWS services on your behalf. Because ECS is a service that manages resources on your behalf, you need this role to proceed.

In most cases, you won’t have to create a service-linked role. If you created or updated an ECS cluster, ECS likely created the service-linked role for you.

You can confirm that your service-linked role exists using the AWS CLI, as shown in the following code example:

$ aws iam get-role --role-name AWSServiceRoleForECS
{
    "Role": {
        "Path": "/aws-service-role/ecs.amazonaws.com/",
        "RoleName": "AWSServiceRoleForECS",
        "RoleId": "AROAJRUPKI7I2FGUZMJJY",
        "Arn": "arn:aws:iam::226767807331:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS",
        "CreateDate": "2018-11-09T21:27:17Z",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ecs.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        },
        "Description": "Role to enable Amazon ECS to manage your cluster.",
        "MaxSessionDuration": 3600
    }
}

If the service-linked role does not exist, create it manually with the following command:

aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com

For more information, see Using Service-Linked Roles for Amazon ECS.

 

Opt in to the awsvpcTrunking account setting

Your account, IAM user, or role must opt in to the awsvpcTrunking account setting. Select this setting using the AWS CLI or the ECS console. You can opt in for an account by making awsvpcTrunking  its default setting. Or, you can enable this setting for the role associated with the instance profile with which the instance launches. For instructions, see Account Settings.

 

Other considerations

After completing the prerequisites described in the preceding sections, launch a new container instance with increased network interface limits using one of the supported EC2 instance types.

Keep the following in mind:

  • It’s available with the latest variant of the ECS-optimized AMI.
  • It only affects creation of new container instances after opting into awsvpcTrunking.
  • It only affects tasks created with awsvpc network mode and EC2 launch type. Tasks created with the AWS Fargate launch type always have a dedicated network interface, no matter how many you launch.

For details, see ENI Trunking Considerations.

 

Summary

If you seek to optimize the usage of your EC2 container instances for clusters that you manage, enable the increased network interface density feature with awsvpcTrunking. By following the steps outlined in this post, you can launch tasks using significantly fewer EC2 instances. This is especially useful if you embrace a microservices architecture, with its increasing numbers of lighter tasks.

Hopefully, you found this post informative and the proposed solution intriguing. As always, AWS welcomes all feedback or comment.

Using AWS App Mesh with Fargate

Post Syndicated from Ignacio Riesgo original https://aws.amazon.com/blogs/compute/using-aws-app-mesh-with-fargate/

This post is contributed by Tony Pujals | Senior Developer Advocate, AWS

 

AWS App Mesh is a service mesh, which provides a framework to control and monitor services spanning multiple AWS compute environments. My previous post provided a walkthrough to get you started. In it, I showed deploying a simple microservice application to Amazon ECS and configuring App Mesh to provide traffic control and observability.

In this post, I show more advanced techniques using AWS Fargate as an ECS launch type. I show you how to deploy a specific version of the colorteller service from the previous post. Finally, I move on and explore distributing traffic across other environments, such as Amazon EC2 and Amazon EKS.

I simplified this example for clarity, but in the real world, creating a service mesh that bridges different compute environments becomes useful. Fargate is a compute service for AWS that helps you run containerized tasks using the primitives (the tasks and services) of an ECS application. This lets you work without needing to directly configure and manage EC2 instances.

 

Solution overview

This post assumes that you already have a containerized application running on ECS, but want to shift your workloads to use Fargate.

You deploy a new version of the colorteller service with Fargate, and then begin shifting traffic to it. If all goes well, then you continue to shift more traffic to the new version until it serves 100% of all requests. Use the labels “blue” to represent the original version and “green” to represent the new version. The following diagram shows programmer model of the Color App.

You want to begin shifting traffic over from version 1 (represented by colorteller-blue in the following diagram) over to version 2 (represented by colorteller-green).

In App Mesh, every version of a service is ultimately backed by actual running code somewhere, in this case ECS/Fargate tasks. Each service has its own virtual node representation in the mesh that provides this conduit.

The following diagram shows the App Mesh configuration of the Color App.

 

 

After shifting the traffic, you must physically deploy the application to a compute environment. In this demo, colorteller-blue runs on ECS using the EC2 launch type and colorteller-green runs on ECS using the Fargate launch type. The goal is to test with a portion of traffic going to colorteller-green, ultimately increasing to 100% of traffic going to the new green version.

 

AWS compute model of the Color App.

Prerequisites

Before following along, set up the resources and deploy the Color App as described in the previous walkthrough.

 

Deploy the Fargate app

To get started after you complete your Color App, configure it so that your traffic goes to colorteller-blue for now. The blue color represents version 1 of your colorteller service.

Log into the App Mesh console and navigate to Virtual routers for the mesh. Configure the HTTP route to send 100% of traffic to the colorteller-blue virtual node.

The following screenshot shows routes in the App Mesh console.

Test the service and confirm in AWS X-Ray that the traffic flows through the colorteller-blue as expected with no errors.

The following screenshot shows racing the colorgateway virtual node.

 

Deploy the new colorteller to Fargate

With your original app in place, deploy the send version on Fargate and begin slowly increasing the traffic that it handles rather than the original. The app colorteller-green represents version 2 of the colorteller service. Initially, only send 30% of your traffic to it.

If your monitoring indicates a healthy service, then increase it to 60%, then finally to 100%. In the real world, you might choose more granular increases with automated rollout (and rollback if issues arise), but this demonstration keeps things simple.

You pushed the gateway and colorteller images to ECR (see Deploy Images) in the previous post, and then launched ECS tasks with these images. For this post, launch an ECS task using the Fargate launch type with the same colorteller and envoy images. This sets up the running envoy container as a sidecar for the colorteller container.

You don’t have to manually configure the EC2 instances in a Fargate launch type. Fargate automatically colocates the sidecar on the same physical instance and lifecycle as the primary application container.

To begin deploying the Fargate instance and diverting traffic to it, follow these steps.

 

Step 1: Update the mesh configuration

You can download updated AWS CloudFormation templates located in the repo under walkthroughs/fargate.

This updated mesh configuration adds a new virtual node (colorteller-green-vn). It updates the virtual router (colorteller-vr) for the colorteller virtual service so that it distributes traffic between the blue and green virtual nodes at a 2:1 ratio. That is, the green node receives one-third of the traffic.

$ ./appmesh-colorapp.sh
...
Waiting for changeset to be created..
Waiting for stack create/update to complete
...
Successfully created/updated stack - DEMO-appmesh-colorapp
$

Step 2: Deploy the green task to Fargate

The fargate-colorteller.sh script creates parameterized template definitions before deploying the fargate-colorteller.yaml CloudFormation template. The change to launch a colorteller task as a Fargate task is in fargate-colorteller-task-def.json.

$ ./fargate-colorteller.sh
...

Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - DEMO-fargate-colorteller
$

 

Verify the Fargate deployment

The ColorApp endpoint is one of the CloudFormation template’s outputs. You can view it in the stack output in the AWS CloudFormation console, or fetch it with the AWS CLI:

$ colorapp=$(aws cloudformation describe-stacks --stack-name=$ENVIRONMENT_NAME-ecs-colorapp --query="Stacks[0
].Outputs[?OutputKey=='ColorAppEndpoint'].OutputValue" --output=text); echo $colorapp> ].Outputs[?OutputKey=='ColorAppEndpoint'].OutputValue" --output=text); echo $colorapp
http://DEMO-Publi-YGZIJQXL5U7S-471987363.us-west-2.elb.amazonaws.com

Assign the endpoint to the colorapp environment variable so you can use it for a few curl requests:

$ curl $colorapp/color
{"color":"blue", "stats": {"blue":1}}
$

The 2:1 weight of blue to green provides predictable results. Clear the histogram and run it a few times until you get a green result:

$ curl $colorapp/color/clear
cleared

$ for ((n=0;n<200;n++)); do echo "$n: $(curl -s $colorapp/color)"; done

0: {"color":"blue", "stats": {"blue":1}}
1: {"color":"green", "stats": {"blue":0.5,"green":0.5}}
2: {"color":"blue", "stats": {"blue":0.67,"green":0.33}}
3: {"color":"green", "stats": {"blue":0.5,"green":0.5}}
4: {"color":"blue", "stats": {"blue":0.6,"green":0.4}}
5: {"color":"gre
en", "stats": {"blue":0.5,"green":0.5}}
6: {"color":"blue", "stats": {"blue":0.57,"green":0.43}}
7: {"color":"blue", "stats": {"blue":0.63,"green":0.38}}
8: {"color":"green", "stats": {"blue":0.56,"green":0.44}}
...
199: {"color":"blue", "stats": {"blue":0.66,"green":0.34}}

This reflects the expected result for a 2:1 ratio. Check everything on your AWS X-Ray console.

The following screenshot shows the X-Ray console map after the initial testing.

The results look good: 100% success, no errors.

You can now increase the rollout of the new (green) version of your service running on Fargate.

Using AWS CloudFormation to manage your stacks lets you keep your configuration under version control and simplifies the process of deploying resources. AWS CloudFormation also gives you the option to update the virtual route in appmesh-colorapp.yaml and deploy the updated mesh configuration by running appmesh-colorapp.sh.

For this post, use the App Mesh console to make the change. Choose Virtual routers for appmesh-mesh, and edit the colorteller-route. Update the HTTP route so colorteller-blue-vn handles 33.3% of the traffic and colorteller-green-vn now handles 66.7%.

Run your simple verification test again:

$ curl $colorapp/color/clear
cleared
fargate $ for ((n=0;n<200;n++)); do echo "$n: $(curl -s $colorapp/color)"; done
0: {"color":"green", "stats": {"green":1}}
1: {"color":"blue", "stats": {"blue":0.5,"green":0.5}}
2: {"color":"green", "stats": {"blue":0.33,"green":0.67}}
3: {"color":"green", "stats": {"blue":0.25,"green":0.75}}
4: {"color":"green", "stats": {"blue":0.2,"green":0.8}}
5: {"color":"green", "stats": {"blue":0.17,"green":0.83}}
6: {"color":"blue", "stats": {"blue":0.29,"green":0.71}}
7: {"color":"green", "stats": {"blue":0.25,"green":0.75}}
...
199: {"color":"green", "stats": {"blue":0.32,"green":0.68}}
$

If your results look good, double-check your result in the X-Ray console.

Finally, shift 100% of your traffic over to the new colorteller version using the same App Mesh console. This time, modify the mesh configuration template and redeploy it:

appmesh-colorteller.yaml
  ColorTellerRoute:
    Type: AWS::AppMesh::Route
    DependsOn:
      - ColorTellerVirtualRouter
      - ColorTellerGreenVirtualNode
    Properties:
      MeshName: !Ref AppMeshMeshName
      VirtualRouterName: colorteller-vr
      RouteName: colorteller-route
      Spec:
        HttpRoute:
          Action:
            WeightedTargets:
              - VirtualNode: colorteller-green-vn
                Weight: 1
          Match:
            Prefix: "/"
$ ./appmesh-colorapp.sh
...
Waiting for changeset to be created..
Waiting for stack create/update to complete
...
Successfully created/updated stack - DEMO-appmesh-colorapp
$

Again, repeat your verification process in both the CLI and X-Ray to confirm that the new version of your service is running successfully.

 

Conclusion

In this walkthrough, I showed you how to roll out an update from version 1 (blue) of the colorteller service to version 2 (green). I demonstrated that App Mesh supports a mesh spanning ECS services that you ran as EC2 tasks and as Fargate tasks.

In my next walkthrough, I will demonstrate that App Mesh handles even uncontainerized services launched directly on EC2 instances. It provides a uniform and powerful way to control and monitor your distributed microservice applications on AWS.

If you have any questions or feedback, feel free to comment below.