Tag Archives: Amazon Elastic Container Service

AWS Fargate Price Reduction – Up to 50%

Post Syndicated from Nathan Peck original https://aws.amazon.com/blogs/compute/aws-fargate-price-reduction-up-to-50/

AWS Fargate is a compute engine that uses containers as its fundamental compute primitive. AWS Fargate runs your application containers for you on demand. You no longer need to provision a pool of instances or manage a Docker daemon or orchestration agent. Because the infrastructure that runs your containers is invisible, you don’t have to worry about whether you have provisioned enough instances to run your containerized workload. You also don’t have to worry about whether you’re using those instances efficiently to avoid paying for resources that you don’t use. You no longer need to do undifferentiated heavy lifting to maintain the infrastructure that runs your containers. AWS Fargate automatically updates and patches underlying resources to keep you safe from vulnerabilities in the underlying operating system and software. AWS Fargate uses an on-demand pricing model that charges per vCPU and per GB of memory reserved per second, with a 1-minute minimum.

At re:Invent 2018 we announced Firecracker, an open source virtualization technology that is purpose-built for creating and managing secure, multi-tenant containers and functions-based services. Firecracker enables you to deploy workloads in lightweight virtual machines called microVMs, which start up quickly and run code with less overhead. Innovations such as these allow us to improve the efficiency of Fargate and pass the cost savings on to customers.

Effective January 7th, 2019, Fargate pricing per vCPU per second is being reduced by 20%, and pricing per GB of memory per second is being reduced by 65%. Depending on the ratio of CPU to memory that you're allocating for your containers, you could see an overall price reduction of anywhere from 35% to 50%.

The following table shows the price reduction for each built-in launch configuration.

vCPU | GB Memory | Effective Price Cut
0.25 | 0.5 | -35.00%
0.25 | 1 | -42.50%
0.25 | 2 | -50.00%
0.5 | 1 | -35.00%
0.5 | 2 | -42.50%
0.5 | 3 | -47.00%
0.5 | 4 | -50.00%
1 | 2 | -35.00%
1 | 3 | -39.30%
1 | 4 | -42.50%
1 | 5 | -45.00%
1 | 6 | -47.00%
1 | 7 | -48.60%
1 | 8 | -50.00%
2 | 4 | -35.00%
2 | 5 | -37.30%
2 | 6 | -39.30%
2 | 7 | -41.00%
2 | 8 | -42.50%
2 | 9 | -43.80%
2 | 10 | -45.00%
2 | 11 | -46.10%
2 | 12 | -47.00%
2 | 13 | -47.90%
2 | 14 | -48.60%
2 | 15 | -49.30%
2 | 16 | -50.00%
4 | 8 | -35.00%
4 | 9 | -36.20%
4 | 10 | -37.30%
4 | 11 | -38.30%
4 | 12 | -39.30%
4 | 13 | -40.20%
4 | 14 | -41.00%
4 | 15 | -41.80%
4 | 16 | -42.50%
4 | 17 | -43.20%
4 | 18 | -43.80%
4 | 19 | -44.40%
4 | 20 | -45.00%
4 | 21 | -45.50%
4 | 22 | -46.10%
4 | 23 | -46.50%
4 | 24 | -47.00%
4 | 25 | -47.40%
4 | 26 | -47.90%
4 | 27 | -48.30%
4 | 28 | -48.60%
4 | 29 | -49.00%
4 | 30 | -49.30%
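
To see where these percentages come from, you can reproduce any row of the table. Here's a quick sketch for the first row, assuming the illustrative us-east-1 hourly rates from around the time of the announcement (check the Fargate pricing page for your Region's current rates):

# Old rates: $0.0506 per vCPU-hour and $0.0127 per GB-hour
# New rates: $0.04048 per vCPU-hour (-20%) and $0.004445 per GB-hour (-65%)
old=$(echo "0.25*0.0506 + 0.5*0.0127" | bc -l)     # 0.019000 USD/hour for 0.25 vCPU, 0.5 GB
new=$(echo "0.25*0.04048 + 0.5*0.004445" | bc -l)  # 0.012343 USD/hour at the new rates
echo "(1 - $new/$old) * 100" | bc -l               # ~35.04, matching the -35.00% row

Because memory gets the deeper cut, configurations with more GB of memory per vCPU land closer to the 50% end of the range.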

Many engineering organizations such as Turner Broadcasting System, Veritone, and Catalytic have already been using AWS Fargate to achieve significant infrastructure cost savings for batch jobs, cron jobs, and other on-and-off workloads. Running a cluster of instances at all times to run your containers constantly incurs cost, but AWS Fargate stops charging when your containers stop.

With these new price reductions, AWS Fargate also enables significant savings for containerized web servers, API services, and background queue consumers run by organizations like KPMG, CBS, and Product Hunt. If your application is currently running on large EC2 instances that peak at 10-20% CPU utilization, consider migrating to containers in AWS Fargate. Containers give you more granularity to provision the exact amount of CPU and memory that your application needs. You no longer pay for instance resources that your application doesn’t use. If a sudden spike of traffic causes your application to require more resources you still have the ability to rapidly scale your application out by adding more containers, or scale your application up by launching larger containers.

AWS Fargate lets you focus on building your containerized application without worrying about the infrastructure. This encompasses not just the infrastructure capacity provisioning, monitoring, and maintenance but also the infrastructure price. Implementing Firecracker in AWS Fargate is just part of our journey to keep making AWS Fargate faster, more powerful, and more efficient. Running your containers in AWS Fargate allows you to benefit from these improvements without any manual intervention required on your part.

AWS Fargate has achieved SOC, PCI, HIPAA BAA, ISO, MTCS, C5, and ENS High compliance certification, and has a 99.99% SLA. You can get started with AWS Fargate in 13 AWS Regions around the world.

Amazon ECS Task Placement

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/amazon-ecs-task-placement/

Intro

Amazon Elastic Container Service (Amazon ECS) is a highly scalable, high-performance container orchestration service that allows you to easily run and scale containerized applications on AWS. This post covers how Amazon ECS runs containers in a cluster. Topics include why AWS built the task placement engine, the different strategies and constraints available to decide where and how containers are run, and things to consider when picking placement strategies.

If you are not familiar with the relationship between ECS and Amazon EC2 or its components, see the Building Blocks of Amazon ECS post.

Task Placement

When a task is launched in a cluster, a decision has to be made to choose which container instance should run that task. Conversely, when scaling down a service, a decision has to be made to choose the specific task to be terminated.

By default, ECS uses the following placement strategies:

  • When you run tasks with the RunTask API action, tasks are placed randomly in a cluster.
  • When you launch tasks as part of a service with the CreateService API action, the service scheduler spreads the tasks across the Availability Zones (and the instances within the zones) in a cluster.

Before December 2016, tasks could only be placed using the default placement strategies. Achieving custom task placement meant making the decision yourself, such as writing your own scheduler and calling the StartTask API action, and when you manually constrained the placement of your grouping of containers, you could only place based on CPU, memory, and ports. Additionally, while creating your own scheduler can be powerful, there's a tradeoff with complexity.

AWS built the task placement engine, which removes the need for you to build, run, and manage your own scheduling and placement services. There are several new features that provide you with more control over how applications run across clusters through custom attributes.

You can think of this flow as a funnel with filters for your instances. Constraints must be obeyed. If an instance doesn’t fit, it isn’t used. Strategies are then used to sort the rest of the instances by preference to determine which are the “best.”

Every instantiation of your task runs through every step of this funnel: calling run-task with a count of n is effectively calling run-task n times (create-service works the same way).

The funnel has three stages: cluster constraints, placement constraints, and placement strategies.

Example

Here’s how to use these placement features. In this example, you use the AWS CLI run-task command. For the last couple of filters, I show how to use them with placement flags, but you can just as easily include them in your task definition file instead. This can all be done in the console as well. Start with the cluster shown earlier:

aws ecs run-task --task-definition nouvelleApp \
--placement-constraints type="memberOf",expression="attribute:ecs.instance-type == t2.small" \
--placement-strategy type="binpack",field="memory" \
--count 8

Cluster constraints

In the first step, eliminate all the instances that don't have the required resources, based on what you defined in the JSON task definition or in the overrides that you provided to RunTask.

Not enough CPU? Not enough memory? A port is needed, but it is already in use on that instance? Then the instance is eliminated from the set of valid candidates.

aws ecs run-task --task-definition nouvelleApp

Placement constraints

In the second step, keep only the instances that satisfy the attribute or task group constraints. Yes, this means that you can indicate which instances to use for a task (for example, to make sure that CPU-intensive jobs are scheduled on the right instance type, or in a specific Availability Zone).

You can also create any custom tags of your choosing. The green tasks on the green instances, the blue tasks on the blue instances! You can also use the Cluster Query Language to write expressions to check for multiple attributes. In the next section, I cover how to write and use the attributes and expressions.

--placement-constraints type="memberOf",expression="attribute:ecs.instance-type == t2.micro"

Placement strategies

In the third step, sort the remaining instances using one of the following supported task placement strategies:

  • random
  • binpack
  • spread

By default, tasks are randomly placed with RunTask or spread across Availability Zones with CreateService. Spread is typically used to achieve high availability by making sure that multiple copies of a task are scheduled across multiple instances based on attributes such as Availability Zones.

Conversely, binpack places tasks together to be as cost-efficient as possible. Later in this post, you’ll see how these placement strategies work, as well as how to chain them together and why you may want to do so.

--placement-strategy type="binpack",field="memory"

Task copies

This isn’t part of the filter, but instead, the count flag is used to indicate how many copies (n) of a given task to run. Effectively, it tells ECS to re-run this workflow n times. By default, the count is set to 1, so run-task is executed one time. For services, the desired-count flag is used.

--count 8

Attributes, task groups, and expressions

For task placement, you can use instance fields, such as attributes, as well as task groups. These can be used in expressions for task placement constraints, or instance fields can be used standalone for task placement strategies. Here’s a quick overview of attributes, task groups, and expressions before you go any further.

Instance: Fields

Because these fields are evaluated against instances during task placement, the instance: prefix is optional. A field name or attribute can be written either of the following ways:

instance:<field>
<field>

Field names

The currently supported field names are as follows:

ec2InstanceId
agentConnected

Attributes

There are also instance attributes, which are prefixed with attribute:. Again, instance: is optional:

attribute:<attribute-name>

Built-in attributes

The following are some of the provided attributes:

ecs.ami-id
ecs.availability-zone
ecs.instance-type
ecs.os-type
ecs.subnet-id
ecs.vpc-id

Custom attributes

Well, what if you don’t see an attribute that you want? This is where custom attributes come in handy! Want to differentiate between test and prod? What about blue versus green?

aws ecs put-attributes \
--attributes name=color,value=blue,targetId=<your-container-instance-arn>

Task groups

In addition to placing tasks based on attributes, you can use task groups. Every task is assigned a group ID that you can reference in placement. For both tasks and services, a default ID is given, or you can choose your own. Perhaps you want to run version 2 of a service only on instances that are already running version 1.

task:group

Expressions

Alright, so you have some attributes and task groups… now what? Well, AWS created the Cluster Query Language to make it easy to create expressions for task placement constraints. These attributes and task groups are used with the available comparison operators, which may look familiar if you’ve used Boolean operators before. Some of these operators can be written in multiple ways, such as “!” or “not”.

For instance, to create an expression using a single attribute to select only t2.micro instances, use the ecs.instance-type attribute and the string equality comparator as follows:

attribute:ecs.instance-type == t2.micro

For t2.micro and t2.nano instances, you have a few options. You could use the same syntax as earlier with the or comparator:

attribute:ecs.instance-type == t2.micro or attribute:ecs.instance-type == t2.nano

Another way is to use the in comparator with an argument list:

attribute:ecs.instance-type in [t2.micro, t2.nano]

To include all t2 instances, use a wildcard and the pattern match operator instead of listing out each one:

attribute:ecs.instance-type =~ t2.*

Task group comparisons work the same way. The following snippet selects any instance upon which the task group “database” is running:

task:group == database

To select only instances that are not running the task group "database," negate the expression:

not(task:group == database)

You can use these expressions to filter your instances:

aws ecs list-container-instances \
--filter "attribute:ecs.instance-type != t2.micro"
aws ecs list-container-instances \
--filter "attribute:color == blue"
aws ecs list-container-instances \
--filter "task:group == database"

These expressions and attributes, respectively, are also used for task placement constraints and strategies, which I cover in the next few sections.

Constraints

Now look at placement constraints. When determining task placement, there may be certain EC2 instances to include or exclude from running containers. For example, you may want to place tasks only on GPU-enabled instance types.

Task placement constraints let you define where your containers should run across your cluster. ECS currently supports two types of placement constraints: distinctInstance and memberOf. By default, ECS spreads tasks across Availability Zones and instances.

  "placementConstraints": [ 
      { 
         "expression": "string",
         "type": "string"
      }
   ],

Distinct Instance

The distinctInstance constraint makes it possible to ensure that every container is started on a unique instance in your cluster. The distinctInstance constraint never places multiple copies of a task on a single instance, even if you request more running tasks than available instances.

For example, if you place five copies of a task, each placement filters out the instances that are already running a copy of the task.

aws ecs run-task --task-definition nouvelleApp \
--count 5 --placement-constraints type="distinctInstance"

Member of

The memberOf constraint describes a set of instances on which your tasks should run. It works with anything that you can define as an attribute, as well as with task groups, and it takes an expression written in the Cluster Query Language.

For example, if you have a small application and just want it to run on t2.micro instances:

aws ecs run-task --task-definition nouvelleApp \
--count 5 \
--placement-constraints type="memberOf",expression="attribute:ecs.instance-type == t2.micro"

You can create expressions using the Cluster Query Language to check for multiple attributes. Here’s how you can weed out all instances in the us-west-2c Availability Zone as well as instances that aren’t of type t2.nano or t2.micro:

aws ecs run-task --task-definition nouvelleApp \
--count 5 \
--placement-constraints type="memberOf",expression="attribute:ecs.availability-zone != us-west-2c and (attribute:ecs.instance-type == t2.nano or attribute:ecs.instance-type == t2.micro)"

Member of affinity

You can also use constraints to place all tasks with the same task group on the same instance (affinity):

aws ecs run-task --task-definition nouvelleApp \
--count 5 --group webserver \
--placement-constraints type=memberOf,expression="task:group == webserver"

Or you can ensure that instances never have more than one task in the same group (anti-affinity):

aws ecs run-task --task-definition nouvelleApp \
--count 5 --group webserver \
--placement-constraints type=memberOf,expression="not(task:group == webserver)"

Strategies

Now look at placement strategies. Placement strategies are used to identify an instance that meets a specific strategy. ECS supports three task placement strategies:

  • random
  • binpack
  • spread

Random is how RunTask places tasks by default and is fairly straightforward (it doesn’t require further parameters). The two other strategies, binpack and spread, take opposite actions. Binpack places tasks on as few instances as possible, helping to optimize resource utilization, while spread places tasks evenly across your cluster to help maximize availability. By default, ECS uses spread with the ecs.availability-zone attribute to place tasks.

   "placementStrategy": [ 
      { 
         "field": "string",
         "type": "string"
      }
   ],

Random

Random places tasks on instances at random. This still honors the other constraints that you specified, implicitly or explicitly. Specifically, it still makes sure that tasks are scheduled on instances with enough resources to run them.

aws ecs run-task --task-definition nouvelleApp \
--count 5 \
--placement-strategy type="random"

Bin packing

The binpack strategy tries to fit your workloads in as few instances as possible. It gets its name from the bin packing problem, where the goal is to fit objects of various sizes in the smallest number of bins. It is well suited to minimizing the number of instances in your cluster, perhaps for cost savings, and lends itself well to automatic scaling of elastic workloads, because instances that are left unused can be shut down.

When you use the binpack strategy, you must also indicate whether you are trying to make optimal use of your instances' CPU or memory. This is done by passing an extra field parameter, which tells the task placement engine which resource to use to evaluate how "full" your "bins" are. It then chooses the instance with the least available CPU or memory (depending on which you pick). If multiple instances tie for the least amount remaining, it chooses one at random.

aws ecs run-task --task-definition nouvelleApp \
--count 8 --placement-strategy type="binpack",field="cpu"

aws ecs run-task --task-definition nouvelleApp \
--count 8 --placement-strategy type="binpack",field="memory"

Spread

The spread strategy, contrary to the binpack strategy, tries to put your tasks on as many different instances as possible. It is typically used to achieve high availability and mitigate risks, by making sure that you don’t put all your task-eggs in the same instance-baskets. Spread across Availability Zones, therefore, is the default placement strategy used for services.

When using the spread strategy, you must also indicate a field parameter. It is used to indicate the "bins" that you are considering. The accepted values are instanceId (or its synonym host) to balance tasks across all instances, or attribute key-value pairs such as attribute:ecs.availability-zone to balance tasks across zones. There are several AWS attributes that start with the "ecs" prefix, but you can be creative and create your own attributes.

aws ecs run-task --task-definition nouvelleApp \
--count 8 \
--placement-strategy type="spread",field="attribute:ecs.availability-zone"

Chaining placement strategies

Now that you’ve seen how to use task placement strategies, you can also chain multiple task placement strategies with their respective attributes together. You can have up to five strategy rules per service. Perhaps you want to spread tasks across Availability Zones and binpack:

aws ecs run-task --task-definition nouvelleApp \
--count 8 \
--placement-strategy type="spread",field="attribute:ecs.availability-zone" type="binpack",field="memory"

Use cases

Here are some use cases for task placement so you can see how they can be solved by combining attributes, expressions, constraints, and strategies.

Task creation

Mariya is fairly new to using containers and especially container orchestrators. She wants to try ECS and has a simple application that she first wants to get running on a single node. (Solution: Use the RunTask API.)

aws ecs run-task --task-definition nouvelleApp

Scaling

After trying this, Mariya wants to scale her application to run 10 containers across any available nodes in her cluster. (Solution: This means she needs to run a task using either random or spread placement strategies.)

aws ecs run-task --task-definition nouvelleApp \
--count 10 \
--placement-strategy type="random"

Availability

Mariya then realizes that if she wants her tasks to automatically restart themselves if they fail, or if she wants more than 10 instantiations of her task running, she needs to create a service. (Solution: Create a service.)

aws ecs create-service --service-name nouvelleApp \
--task-definition nouvelleApp \
--desired-count 300 --placement-strategy type="random"

Christopher wants to achieve high availability by distributing his tasks amongst all the instances in his cluster so he minimizes impact if any one host goes down. (Solution: To do this he uses spread placement over host name.)

aws ecs run-task --task-definition nouvelleApp \
--count 9 \
--placement-strategy type="spread",field="host"

Ming-ya wants to run a monitoring container on each instance in her cluster. To help her do this, she creates a service with a high desired count and a distinctInstance placement constraint. The ECS service scheduler ensures that each instance in the cluster runs this task (up to the desired count).

aws ecs create-service --service-name monitoring \
--task-definition monitor \
--desired-count 500 \
--placement-constraints type="distinctInstance"

Availability and Task Groups

Alex wants to run a fleet of webservers. For performance reasons, they want each webserver to have local access to a caching process that was written by another team. They define their webserver as one task and the caching server as a second task. When they launch the webserver task, they use a placement constraint so that the tasks are only placed on instances that are already hosting the cache task. (Solution: Use placement constraints with a task group.)

aws ecs run-task --task-definition cache \
--group caching --count 9 \
--placement-constraints type="distinctInstance"

aws ecs run-task --task-definition webserver \
--count 9 \
--placement-constraints type="distinctInstance" type="memberOf",expression="task:group == caching"

Availability and resource optimization

Jake wants to achieve high availability, but he has a limited budget and needs to optimize all the resources he uses. (Solution: Take a balanced approach of spreading over Availability Zones and binpacking on memory within a zone.)

aws ecs run-task --task-definition nouvelleApp \
--count 9 \
--placement-strategy type="spread",field="attribute:ecs.availability-zone" type="binpack",field="memory"

Instance type selection

Aditya has a GPU workload that he wants to run in containers on ECS. He needs to ensure that only GPU-enabled instances are used for this workload. (Solution: Create a service with a memberOf placement constraint matching g2, p2, or whatever other GPU-enabled instance types are in the cluster.)

aws ecs create-service --service-name workload \
--task-definition GPU --desired-count 30 \
--placement-constraints type="memberOf",expression="attribute:ecs.instance-type =~ g2* or attribute:ecs.instance-type =~ p2*"

Conclusion

You’ve now looked at task placement at a high level, as well as:

  • Attributes, task groups, and expressions
  • Constraints
  • Strategies
  • Example use cases

To dive deeper into any of these aspects, check out Task Placement. Also, feel free to ask any questions!

@tiffanyfayj

 

Getting started with the AWS Cloud Development Kit for Amazon ECS

Post Syndicated from Nathan Peck original https://aws.amazon.com/blogs/compute/getting-started-with-the-aws-cloud-development-kit-for-amazon-ecs/

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to define cloud infrastructure in code and provision it through AWS CloudFormation. The AWS CDK integrates fully with AWS services and offers a higher-level object-oriented abstraction to define AWS resources imperatively.

Using the AWS CDK library of infrastructure constructs, you can easily encapsulate AWS best practices in your infrastructure definition and share it without worrying about boilerplate logic. The AWS CDK improves the end-to-end development experience because you get to use the power of modern programming languages to define your AWS infrastructure in a predictable and efficient manner. The AWS CDK is currently available for TypeScript, JavaScript, Java, and .NET.

The AWS CDK now includes constructs for ECS resources, allowing you to deploy a fully functioning containerized application environment on AWS with just a few lines of simple, readable code. Here’s how it works.

Install the AWS CDK

The first step is to install the AWS CDK on your development machine:

mkdir greeter-cdk
cd greeter-cdk
npm init -y
npm install @aws-cdk/cdk
npm install -g aws-cdk

Next, write some JavaScript code that imports the AWS CDK library and uses it to define a skeleton that you can place all your resources in:

index.js

const cdk = require('@aws-cdk/cdk');

class GreetingStack extends cdk.Stack {
  constructor(parent, id, props) {
    super(parent, id, props);
  }
}

class GreetingApp extends cdk.App {
  constructor(argv) {
    super(argv);
    new GreetingStack(this, 'greeting-stack');
  }
}

new GreetingApp().run();

Next, write a small configuration file telling the AWS CDK CLI that index.js is the code that defines your application stack:

cdk.json

{
  "app": "node index.js"
}

Now if you run cdk ls -l, you can see that the AWS CDK has found your stack, and has automatically interpolated some details about it from your development machine's environment, such as your AWS account ID and default Region:

$ cdk ls -l
- name: greeting-stack
  environment:
    name: 209640446841/us-east-1
    account: '209640446841'
    region: us-east-1

Add ECS constructs

It’s time to add some ECS constructs to your stack. To do this, first install the ECS construct library. Also, install a couple of other constructs to help you set up resources linked to your containers:

npm install @aws-cdk/aws-ecs
npm install @aws-cdk/aws-ec2
npm install @aws-cdk/aws-elasticloadbalancingv2

Now it’s time to use the ECS constructs to set up your application environment:

const cdk = require('@aws-cdk/cdk');
const ecs = require('@aws-cdk/aws-ecs');
const ec2 = require('@aws-cdk/aws-ec2');
// Used by the load balancer constructs added later in this post
const elbv2 = require('@aws-cdk/aws-elasticloadbalancingv2');

class GreetingStack extends cdk.Stack {
  constructor(parent, id, props) {
    super(parent, id, props);

    const vpc = new ec2.VpcNetwork(this, 'GreetingVpc', { maxAZs: 2 });

    // Create an ECS cluster
    const cluster = new ecs.Cluster(this, 'Cluster', { vpc });

    // Add capacity to it
    cluster.addDefaultAutoScalingGroupCapacity({
      instanceType: new ec2.InstanceType('t3.xlarge'),
      instanceCount: 3
    });
  }
}

With just three calls, you can create a VPC to hold all your application resources, and an ECS cluster with three t3.xlarge instances. All it takes is one command to tell the AWS CDK to automatically deploy this stack on your account:

cdk deploy

Behind the scenes, the AWS CDK synthesizes your JavaScript calls into a CloudFormation template. It asks CloudFormation to deploy the resources described in the synthesized template. You can see a live log of each resource that is being created and what the status is. As you can see from the numbers on the left side of the message stream, those three simple commands added to the AWS CDK stack automatically expanded into 32 lower-level, primitive resources to be created on your AWS account.
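
If you want to inspect that synthesized template yourself before asking CloudFormation to deploy it, the CDK CLI will print it for you:

# Print the CloudFormation template synthesized from index.js, without deploying
cdk synth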

After the AWS CDK deployment finishes, you have a fresh ECS cluster ready to run your services. Next, you deploy a simple microservices stack onto this cluster: a greeting server whose frontend greeter service fetches a random greeting and name from two backend services. There are two tiers to this application, the frontend and the backend. The network looks like the following diagram:

There are two load balancers: one allows anyone on the internet to talk to your greeter service; the other is internal and designed to allow the greeter service to talk to the greeting and name services.

In total, you need to add five more high-level constructs to your AWS CDK application: two load balancers and three services.

Add a new ECS service

Adding a new ECS service to the application stack is easy. Define a task definition, add a container to it, and tell the AWS CDK to turn the task definition into a service:

// Name service
const nameTaskDefinition = new ecs.Ec2TaskDefinition(this, 'name-task-definition', {});

const nameContainer = nameTaskDefinition.addContainer('name', {
  image: ecs.ContainerImage.fromDockerHub('nathanpeck/name'),
  memoryLimitMiB: 128
});

nameContainer.addPortMappings({
  containerPort: 3000
});

const nameService = new ecs.Ec2Service(this, 'name-service', {
  cluster: cluster,
  desiredCount: 2,
  taskDefinition: nameTaskDefinition
});

// Greeting service
const greetingTaskDefinition = new ecs.Ec2TaskDefinition(this, 'greeting-task-definition', {});

const greetingContainer = greetingTaskDefinition.addContainer('greeting', {
  image: ecs.ContainerImage.fromDockerHub('nathanpeck/greeting'),
  memoryLimitMiB: 128
});

greetingContainer.addPortMappings({
  containerPort: 3000
});

const greetingService = new ecs.Ec2Service(this, 'greeting-service', {
  cluster: cluster,
  desiredCount: 2,
  taskDefinition: greetingTaskDefinition
});

Just like that, you’ve defined two different ECS services that run by loading a public image from Docker Hub. The next step is to create a load balancer and add the services to it:

// Internal load balancer for the backend services
const internalLB = new elbv2.ApplicationLoadBalancer(this, 'internal', {
  vpc: vpc,
  internetFacing: false
});

const internalListener = internalLB.addListener('PublicListener', { port: 80, open: true });

internalListener.addTargetGroups('default', {
  targetGroups: [new elbv2.ApplicationTargetGroup(this, 'default', {
    vpc: vpc,
    protocol: 'HTTP',
    port: 80
  })]
});

internalListener.addTargets('name', {
  port: 80,
  pathPattern: '/name*',
  priority: 1,
  targets: [nameService]
});

internalListener.addTargets('greeting', {
  port: 80,
  pathPattern: '/greeting*',
  priority: 2,
  targets: [greetingService]
});

For this configuration, the code defines a single load balancer with a single listener on port 80, but adds two different services behind it. If the path of the request looks like /name, it sends the request to your name service. If it looks like /greeting, it sends the request to the greeting service.

Finally, add the frontend greeter service, which constructs a random greeting phrase by fetching a random name from the name service and a random greeting from the greeting service. To do this, configure the greeter service to know how to make requests to the other two backend services:

// Greeter service
const greeterTaskDefinition = new ecs.Ec2TaskDefinition(this, 'greeter-task-definition', {});

const greeterContainer = greeterTaskDefinition.addContainer('greeter', {
  image: ecs.ContainerImage.fromDockerHub('nathanpeck/greeter'),
  memoryLimitMiB: 128,
  environment: {
    GREETING_URL: 'http://' + internalLB.dnsName + '/greeting',
    NAME_URL: 'http://' + internalLB.dnsName + '/name'
  }
});

greeterContainer.addPortMappings({
  containerPort: 3000
});

const greeterService = new ecs.Ec2Service(this, 'greeter-service', {
  cluster: cluster,
  desiredCount: 2,
  taskDefinition: greeterTaskDefinition
});

The AWS CDK has a powerful capability to resolve expressions that you enter in your JavaScript and turn them into a CloudFormation template that resolves the correct values.

In this example, you create a reference to the DNS name of the load balancer, and indicate that you want to assign the following:

  • Environment variable = 'NAME_URL'
  • Value = 'http://' + internalLB.dnsName + '/name'

If you run cdk synth, you can see that the AWS CDK generates a CloudFormation template that dynamically inserts the proper DNS name of the load balancer at deployment time:

Type: 'AWS::ECS::TaskDefinition'
Properties:
  ContainerDefinitions:
    - Environment:
        - Name: GREETING_URL
          Value:
            'Fn::Join':
              - ''
              - - 'http://'
                - 'Fn::GetAtt':
                    - internal505AC855
                    - DNSName
                - /greeting
        - Name: NAME_URL
          Value:
            'Fn::Join':
              - ''
              - - 'http://'
                - 'Fn::GetAtt':
                    - internal505AC855
                    - DNSName
                - /name

One final thing to add to your AWS CDK stack is an output. This gives you the DNS name of your service so you can send traffic to it:

new cdk.Output(this, 'ExternalDNS', { value: externalLB.dnsName });

Now see what is added when you deploy. Type the following command to see a preview of new or modified resources without actually doing the deployment:

cdk diff

The list of new resources being added looks good, so run cdk deploy again. Again, the AWS CDK synthesizes the CloudFormation template and initializes its deployment on your AWS account. This time, however, it creates a total of 66 resources, and gives you a URL output where the application is hosted:

To verify that your application is up and accepting traffic at that URL, load the ExternalDNS URL in your browser. The web application talks to the two other backend services to get a greeting and a name.

Conclusion

If you’d like to try deploying this microservice stack yourself or using it as the basis for building your own AWS CDK stack, you can find the full AWS CDK example code on GitHub. Be sure to check out the AWS CDK documentation and the official AWS CDK construct for Amazon ECS on NPM.

AWS Cloud Map: Easily create and maintain custom maps of your applications

Post Syndicated from Abby Fuller original https://aws.amazon.com/blogs/aws/aws-cloud-map-easily-create-and-maintain-custom-maps-of-your-applications/

Companies are increasingly building their applications as microservices (many separate services that each do a single job). Microservices often allow companies to iterate and deploy more quickly. Many of these microservice-based modern applications are built using various types of cloud resources and deployed on dynamically changing infrastructure. Previously, you had to use configuration files to manage the location of your application resources. However, dependencies in a microservices-based application can quickly become too complex to easily manage through configuration files. Additionally, many applications are built using containers that scale dynamically, reacting to changes in traffic load. That increases your application's responsiveness, but poses a new class of problem: now your application components need to discover and connect to upstream services at runtime. This problem of connectivity in dynamically changing infrastructures and microservices is commonly addressed by service discovery.

Introducing AWS Cloud Map

 

AWS Cloud Map keeps track of all your application components, their locations, attributes, and health status. Now your applications can simply query AWS Cloud Map using the AWS SDK, API calls, or even DNS to discover the locations of their dependencies. That allows your applications to scale dynamically and connect to upstream services directly, increasing their responsiveness.

When you register your web services and cloud resources in AWS Cloud Map, you can describe them using custom attributes, such as deployment stage and version. Your applications then can make discovery calls specifying the required deployment stage and version. AWS Cloud Map will return the locations of resources that match the supplied parameters. It simplifies your deployments and reduces the operational complexity for your applications.

Integrated health checking for IP-based resources registered with AWS Cloud Map automatically stops routing traffic to unhealthy endpoints. Additionally, you have APIs to describe the health status of your services, so that you can learn about potential issues with your infrastructure. That increases the resilience of your applications.
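
For example, once you have a service with registered instances (which you create in the next section), a single call returns the health of each instance. The %service_id% placeholder follows the same convention as the other commands in this post:

aws servicediscovery get-instances-health-status --service-id %service_id%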

AWS Cloud Map in Action
Getting started with AWS Cloud Map is easy. You can use the AWS console or CLI to create a namespace, such as myapp.com. For this example, I'll use the CLI. Let's create a namespace:

aws servicediscovery create-public-dns-namespace --name myapp.com

At this point, you'll need to decide whether you want your applications to discover resources only via the AWS SDK and API calls, or if you need optional discovery via DNS. When you enable DNS discovery for a namespace, you'll need to provide IP addresses for all the resources that you register. If you plan to register other cloud resources, such as DynamoDB tables by ARN or the URLs of APIs deployed on Amazon API Gateway, you need to select the API discovery mode.
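
If you decide that you only need discovery via the AWS SDK and API calls, you can create an HTTP namespace instead of a public DNS one:

aws servicediscovery create-http-namespace --name myapp.com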

Once your namespace is created, it's time to create services. A service represents an application component, such as users, auth, or payment, and can comprise many dynamically changing resources. You can specify a friendly name for your service, then select the DNS discovery and health checking options. You can create a service like this:

aws servicediscovery create-service --name frontend --namespace-id %namespace_id%

After you create a service, you can register service instances with custom attributes:

aws servicediscovery register-instance --service-id %service_id% --instance-id %id% \
--attributes AWS_INSTANCE_IPV4=54.20.10.1,stage=beta,version=1.0,active=yes

aws servicediscovery register-instance --service-id %service_id% --instance-id %id% \
--attributes AWS_INSTANCE_IPV4=54.20.10.2,stage=beta,version=2.0,active=no

Now, your applications can make API calls to discover the service instances, optionally providing query parameters to filter the results:

aws servicediscovery discover-instances --namespace-name myapp.com --service-name frontend --query-parameters version=1.0,active=yes
-->
{
    "Instances": [
        {
            "InstanceId": "1",
            "NamespaceName": "myapp.com",
            "ServiceName": "frontend",
            "HealthStatus": "HEALTHY",
            "Attributes": {
                "version": "1.0",
                "active": "yes",
                "stage": "beta",
                "AWS_INSTANCE_IPV4": "54.20.10.1"
            }
        }
    ]
}

And that’s it! Amazon Elastic Container Service (ECS) and AWS Fargate are tightly integrated with AWS Cloud Map. When you create your service and enable service discovery, all the task instances are automatically registered in AWS Cloud Map on scale-up and deregistered on scale-down. ECS also ensures that only healthy task instances are returned on discovery calls by continuously publishing up-to-date health information to AWS Cloud Map.
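
Under the hood, this integration is driven by the service registries configuration on your ECS service. A minimal sketch of what that looks like from the CLI, where the cluster, task definition, and registry ARN are placeholders for your own values (the registry ARN is the ARN of the AWS Cloud Map service created with aws servicediscovery create-service):

aws ecs create-service \
--cluster my-cluster \
--service-name frontend \
--task-definition frontend:1 \
--desired-count 2 \
--service-registries "registryArn=arn:aws:servicediscovery:us-east-1:123456789012:service/srv-example"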

For Amazon Elastic Container Service for Kubernetes (EKS), you can automatically publish the external IPs of the services running in EKS in AWS Cloud Map. To do this, we’ve released an update to an open source project, ExternalDNS, to make Kubernetes resources discoverable via AWS Cloud Map. You can find out more details about Kubernetes External DNS here.

 

Now Generally Available
You can start building your applications with AWS Cloud Map and enjoy the integration with Amazon ECS and EKS, rich and secure API query interface, ubiquitous DNS name resolution and integrated health checking support today. Want to try it out? Head to https://console.aws.amazon.com/cloudmap/home.  To test out the integration with ECS, head to https://console.aws.amazon.com/ecs/home and enable Service Discovery to get started.

Use AWS CodeDeploy to Implement Blue/Green Deployments for AWS Fargate and Amazon ECS

Post Syndicated from Curtis Rissi original https://aws.amazon.com/blogs/devops/use-aws-codedeploy-to-implement-blue-green-deployments-for-aws-fargate-and-amazon-ecs/

We are pleased to announce support for blue/green deployments for services hosted using AWS Fargate and Amazon Elastic Container Service (Amazon ECS).

In AWS CodeDeploy, blue/green deployments help you minimize downtime during application updates. They allow you to launch a new version of your application alongside the old version and test the new version before you reroute traffic to it. You can also monitor the deployment process and, if there is an issue, quickly roll back.

With this new capability, you can create a new service in AWS Fargate or Amazon ECS  that uses CodeDeploy to manage the deployments, testing, and traffic cutover for you. When you make updates to your service, CodeDeploy triggers a deployment. This deployment, in coordination with Amazon ECS, deploys the new version of your service to the green target group, updates the listeners on your load balancer to allow you to test this new version, and performs the cutover if the health checks pass.

In this post, I show you how to configure blue/green deployments for AWS Fargate and Amazon ECS using AWS CodeDeploy. For information about how to automate this end-to-end using a continuous delivery pipeline in AWS CodePipeline and Amazon ECR, read Build a Continuous Delivery Pipeline for Your Container Images with Amazon ECR as Source.

Let’s dive in!

Prerequisites

To follow along, you must have these resources in place:

  • A Docker image repository that contains an image you have built from your Dockerfile and application source. This walkthrough uses Amazon ECR. For more information, see Creating a Repository and Pushing an Image in the Amazon Elastic Container Registry User Guide.
  • An Amazon ECS cluster. You can use the default cluster created for you when you first use Amazon ECS or, on the Clusters page of the Amazon ECS console, you can choose a Networking only cluster. For more information, see Creating a Cluster in the Amazon Elastic Container Service User Guide.

Note: The image repository and cluster must be created in the same AWS Region.

Set up IAM service roles

Because you will be using AWS CodeDeploy to handle the deployments of your application to Amazon ECS, AWS CodeDeploy needs permissions to call Amazon ECS APIs, modify your load balancers, invoke Lambda functions, and describe CloudWatch alarms. Before you create an Amazon ECS service that uses the blue/green deployment type, you must create the AWS CodeDeploy IAM role (ecsCodeDeployRole). For instructions, see Amazon ECS CodeDeploy IAM Role in the Amazon ECS Developer Guide.
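
If you'd rather script this step than use the console, a rough CLI sketch of those instructions looks like the following. The role name matches the guide, and AWSCodeDeployRoleForECS is the managed policy granting the permissions described above:

# Let CodeDeploy assume the role, then attach the managed policy for ECS deployments
aws iam create-role --role-name ecsCodeDeployRole \
--assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"codedeploy.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name ecsCodeDeployRole \
--policy-arn arn:aws:iam::aws:policy/AWSCodeDeployRoleForECS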

Create an Application Load Balancer

To allow AWS CodeDeploy and Amazon ECS to control the flow of traffic to multiple versions of your Amazon ECS service, you must create an Application Load Balancer.

Follow the steps in Creating an Application Load Balancer and make the following modifications (a CLI sketch of the equivalent setup appears after this list):

  1. For step 6a in the Define Your Load Balancer section, name your load balancer sample-website-alb.
  2. For step 2 in the Configure Security Groups section:
    1. For Security group name, enter sample-website-sg.
    2. Add an additional rule to allow TCP port 8080 from anywhere (0.0.0.0/0).
  3. In the Configure Routing section:
    1. For Name, enter sample-website-tg-1.
    2. For Target type, choose to register your targets with an IP address.
  4. Skip the steps in the Create a Security Group Rule for Your Container Instances section.
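
For reference, here is a hedged CLI sketch of the same load balancer setup; the subnet, security group, and VPC IDs are placeholders for your own values:

aws elbv2 create-load-balancer --name sample-website-alb \
--subnets subnet-aaaa1111 subnet-bbbb2222 \
--security-groups sg-0123456789abcdef0
aws elbv2 create-target-group --name sample-website-tg-1 \
--protocol HTTP --port 80 --target-type ip \
--vpc-id vpc-0123456789abcdef0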

Create an Amazon ECS task definition

Create an Amazon ECS task definition that references the Docker image hosted in your image repository. For the sake of this walkthrough, we use the Fargate launch type and the following task definition.

{
  "executionRoleArn": "arn:aws:iam::account_ID:role/ecsTaskExecutionRole",
  "containerDefinitions": [{
    "name": "sample-website",
    "image": "<YOUR ECR REPOSITORY URI>",
    "essential": true,
    "portMappings": [{
      "hostPort": 80,
      "protocol": "tcp",
      "containerPort": 80
    }]
  }],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "family": "sample-website"
}

Note: Be sure to change the value for “image” to the Amazon ECR repository URI for the image you created and uploaded to Amazon ECR in Prerequisites.
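
If you prefer the CLI to the console for this step, save the JSON above to a file and register it (the file name here is just an example):

aws ecs register-task-definition --cli-input-json file://task-def.json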

Creating an Amazon ECS service with blue/green deployments

Now that you have completed the prerequisites and setup steps, you are ready to create an Amazon ECS service with blue/green deployment support from AWS CodeDeploy.

Create an Amazon ECS service

  1. Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.
  2. From the list of clusters, choose the Amazon ECS cluster you created to run your tasks.
  3. On the Services tab, choose Create.

This opens the Configure service wizard. From here you are able to configure everything required to deploy, run, and update your application using AWS Fargate and AWS CodeDeploy.

  1. Under Configure service:
    1. For the Launch type, choose FARGATE.
    2. For Task Definition, choose the sample-website task definition that you created earlier.
    3. Choose the cluster where you want to run your applications tasks.
    4. For Service Name, enter Sample-Website.
    5. For Number of tasks, specify the number of tasks that you want your service to run.
  2. Under Deployments:
    1. For Deployment type, choose Blue/green deployment (powered by AWS CodeDeploy). This creates a CodeDeploy application and deployment group using the default settings. You can see and edit these settings in the CodeDeploy console later.
    2. For the service role, choose the CodeDeploy service role you created earlier.
  3. Choose Next step.
  4. Under VPC and security groups:
    1. From Subnets, choose the subnets that you want to use for your service.
    2. For Security groups, choose Edit.
      1. For Assigned security groups, choose Select existing security group.
      2. Under Existing security groups, choose the sample-website-sg group that you created earlier.
      3. Choose Save.
  5. Under Load Balancing:
    1. Choose Application Load Balancer.
    2. For Load balancer name, choose sample-website-alb.
  6. Under Container to load balance:
    1. Choose Add to load balancer.
    2. For Production listener port, choose 80:HTTP from the first drop-down list.
    3. For Test listener port, in Enter a listener port, enter 8080.
  7. Under Additional configuration:
    1. For Target group 1 name, choose sample-website-tg-1.
    2. For Target group 2 name, enter sample-website-tg-2.
  8. Under Service discovery (optional), clear Enable service discovery integration, and then choose Next step.
  9. Do not configure Auto Scaling. Choose Next step.
  10. Review your service for accuracy, and then choose Create service.
  11. If everything is created successfully, choose View service.

You should now see your newly created service, with at least one task running.

When you choose the Events tab, you should see that Amazon ECS has deployed the tasks to your sample-website-tg-1 target group. When you refresh, you should see your service reach a steady state.

In the AWS CodeDeploy console, you will see that the Amazon ECS Configure service wizard has created a CodeDeploy application for you. Click into the application to see other details, including the deployment group that was created for you.

If you click the deployment group name, you can view other details about your deployment.  Under Deployment type, you’ll see Blue/green. Under Deployment configuration, you’ll see CodeDeployDefault.ECSAllAtOnce. This indicates that after the health checks are passed, CodeDeploy updates the listeners on the Application Load Balancer to send 100% of the traffic over to the green environment.

Under Load Balancing, you can see details about your target groups and your production and test listener ARNs.

Let’s apply an update to your service to see the CodeDeploy deployment in action.

Trigger a CodeDeploy blue/green deployment

Create a revised task definition

To test the deployment, create a revision to your task definition for your application.

  1. Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.
  2. From the navigation pane, choose Task Definitions.
  3. Choose your sample-website task definition, and then choose Create new revision.
  4. Under Tags:
    1. In Add key, enter Name.
    2. In Add value, enter Sample Website.
  5. Choose Create.

Update ECS service

You now need to update your Amazon ECS service to use the latest revision of your task definition.

  1. Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.
  2. Choose the Amazon ECS cluster where you’ve deployed your Amazon ECS service.
  3. Select the check box next to your sample-website service.
  4. Choose Update to open the Update Service wizard.
    1. Under Configure service, for Task Definition, choose 2 (latest) from the Revision drop-down list.
  5. Choose Next step.
  6. Skip Configure deployments. Choose Next step.
  7. Skip Configure network. Choose Next step.
  8. Skip Set Auto Scaling (optional). Choose Next step.
  9. Review the changes, and then choose Update Service.
  10. Choose View Service.

You are now taken to the Deployments tab of your service, where you can see details about your blue/green deployment.

You can click the deployment ID to go to the details view for the CodeDeploy deployment.

From there, you can see the deployment's status:

You can also see the progress of the traffic shifting:

If you notice issues, you can stop and roll back the deployment. This shifts traffic back to the original (blue) task set and stops the deployment.
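
You can do this from the CodeDeploy console, or from the CLI with a command like the following, where the deployment ID is a placeholder for your own:

aws deploy stop-deployment --deployment-id d-EXAMPLE11 --auto-rollback-enabled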

By default, CodeDeploy waits one hour after a successful deployment before it terminates the original task set. You can use the AWS CodeDeploy console to shorten this interval. After the task set is terminated, CodeDeploy marks the deployment complete.
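
That wait time is also configurable from the CLI. A sketch, assuming the application and deployment group names that the Configure service wizard generated for this walkthrough's cluster and service (yours may differ, so check the CodeDeploy console first):

aws deploy update-deployment-group \
--application-name AppECS-default-Sample-Website \
--current-deployment-group-name DgpECS-default-Sample-Website \
--blue-green-deployment-configuration "terminateBlueInstancesOnDeploymentSuccess={action=TERMINATE,terminationWaitTimeInMinutes=5}"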

Conclusion

In this post, I showed you how to create an AWS Fargate-based Amazon ECS service with blue/green deployments powered by AWS CodeDeploy. I showed you how to configure the required and prerequisite components, such as an Application Load Balancer and associated target groups, all from the AWS Management Console. I hope that the information in this post helps you get started implementing this for your own applications!

Build a Continuous Delivery Pipeline for Your Container Images with Amazon ECR as Source

Post Syndicated from Daniele Stroppa original https://aws.amazon.com/blogs/devops/build-a-continuous-delivery-pipeline-for-your-container-images-with-amazon-ecr-as-source/

Today, we are launching support for Amazon Elastic Container Registry (Amazon ECR) as a source provider in AWS CodePipeline. You can now initiate an AWS CodePipeline pipeline update by uploading a new image to Amazon ECR. This makes it easier to set up a continuous delivery pipeline and use the AWS Developer Tools for CI/CD.

You can use Amazon ECR as a source if you’re implementing a blue/green deployment with AWS CodeDeploy from the AWS CodePipeline console. For more information about using the Amazon Elastic Container Service (Amazon ECS) console to implement a blue/green deployment without CodePipeline, see Implement Blue/Green Deployments for AWS Fargate and Amazon ECS Powered by AWS CodeDeploy.

This post shows you how to create a complete, end-to-end continuous deployment (CD) pipeline with Amazon ECR and AWS CodePipeline. It walks you through setting up a pipeline to build your images when the upstream base image is updated.

Prerequisites

To follow along, you must have these resources in place:

  • A source control repository with your base image Dockerfile and a Docker image repository to store your image. In this walkthrough, we use a simple Dockerfile for the base image:
    FROM alpine:3.8

    RUN apk update

    RUN apk add nodejs
  • A source control repository with your application Dockerfile and source code and a Docker image repository to store your image. For the application Dockerfile, we use our base image and then add our application code:
    FROM 012345678910.dkr.ecr.us-east-1.amazonaws.com/base-image

    ENV PORT=80

    EXPOSE $PORT

    COPY app.js /app/

    CMD ["node", "/app/app.js"]

This walkthrough uses AWS CodeCommit for the source control repositories and Amazon ECR  for the Docker image repositories. For more information, see Create an AWS CodeCommit Repository in the AWS CodeCommit User Guide and Creating a Repository in the Amazon Elastic Container Registry User Guide.

Note: The source control repositories and image repositories must be created in the same AWS Region.

Set up IAM service roles

In this walkthrough, you use AWS CodeBuild and AWS CodePipeline to build your Docker images and push them to Amazon ECR. Both services use Identity and Access Management (IAM) service roles to make calls to Amazon ECR API operations. The service roles must have a policy that provides permissions to make these Amazon ECR calls. The following procedure helps you attach the required permissions to the CodeBuild service role.

To create the CodeBuild service role

  1. Follow these steps to use the IAM console to create a CodeBuild service role.
  2. On step 10, make sure to also add the AmazonEC2ContainerRegistryPowerUser policy to your role (or attach it from the CLI, as shown below).
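
The CLI equivalent of that step, assuming your service role is named CodeBuildServiceRole:

aws iam attach-role-policy --role-name CodeBuildServiceRole \
--policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser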

Create a build specification file for your base image

A build specification file (or build spec) is a collection of build commands and related settings, in YAML format, that AWS CodeBuild uses to run a build. Add a buildspec.yml file to your source code repository to tell CodeBuild how to build your base image. The example build specification used here does the following:

  • Pre-build stage:
    • Sign in to Amazon ECR.
    • Set the repository URI to your ECR image and add an image tag with the first seven characters of the Git commit ID of the source.
  • Build stage:
    • Build the Docker image and tag the image with latest and the Git commit ID.
  • Post-build stage:
    • Push the image with both tags to your Amazon ECR repository.
version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws --version
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=012345678910.dkr.ecr.us-east-1.amazonaws.com/base-image
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=${COMMIT_HASH:=latest}
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $REPOSITORY_URI:latest .
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push $REPOSITORY_URI:latest
      - docker push $REPOSITORY_URI:$IMAGE_TAG

To add a buildspec.yml file to your source repository

  1. Open a text editor and then copy and paste the build specification above into a new file.
  2. Replace the REPOSITORY_URI value (012345678910.dkr.ecr.us-east-1.amazonaws.com/base-image) with your Amazon ECR repository URI (without any image tag) for your Docker image. Replace base-image with the name for your base Docker image.
  3. Commit and push your buildspec.yml file to your source repository.
    git add .
    git commit -m "Adding build specification."
    git push

Create a build specification file for your application

Add a buildspec.yml file to your source code repository to tell CodeBuild how to build your source code and your application image. The example build specification used here does the following:

  • Pre-build stage:
    • Sign in to Amazon ECR.
    • Set the repository URI to your ECR image and add an image tag derived from the CodeBuild build ID.
  • Build stage:
    • Build the Docker image and tag the image with latest and the build ID tag.
  • Post-build stage:
    • Push the image with both tags to your ECR repository.
version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws --version
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=012345678910.dkr.ecr.us-east-1.amazonaws.com/hello-world
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=build-$(echo $CODEBUILD_BUILD_ID | awk -F":" '{print $2}')
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $REPOSITORY_URI:latest .
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push $REPOSITORY_URI:latest
      - docker push $REPOSITORY_URI:$IMAGE_TAG
artifacts:
  files:
    - imageDetail.json

To add a buildspec.yml file to your source repository

  1. Open a text editor and then copy and paste the build specification above into a new file.
  2. Replace the REPOSITORY_URI value (012345678910.dkr.ecr.us-east-1.amazonaws.com/hello-world) with your Amazon ECR repository URI (without any image tag) for your Docker image. Replace hello-world with the container name in your service’s task definition that references your Docker image.
  3. Commit and push your buildspec.yml file to your source repository.
    git add .
    git commit -m "Adding build specification."
    git push

Create a continuous deployment pipeline for your base image

Use the AWS CodePipeline wizard to create your pipeline stages:

  1. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
  2. On the Welcome page, choose Create pipeline.
    If this is your first time using AWS CodePipeline, an introductory page appears instead of Welcome. Choose Get Started Now.
  3. On the Step 1: Name page, for Pipeline name, type the name for your pipeline and choose Next step. For this walkthrough, the pipeline name is base-image.
  4. On the Step 2: Source page, for Source provider, choose AWS CodeCommit.
    1. For Repository name, choose the name of the AWS CodeCommit repository to use as the source location for your pipeline.
    2. For Branch name, choose the branch to use, and then choose Next step.
  5. On the Step 3: Build page, choose AWS CodeBuild, and then choose Create project.
    1. For Project name, choose a unique name for your build project. For this walkthrough, the project name is base-image.
    2. For Operating system, choose Ubuntu.
    3. For Runtime, choose Docker.
    4. For Version, choose aws/codebuild/docker:17.09.0.
    5. For Service role, choose Existing service role, choose the CodeBuild service role you created earlier, and then clear the Allow AWS CodeBuild to modify this service role so it can be used with this build project check box.
    6. Choose Continue to CodePipeline.
    7. Choose Next.
  6. On the Step 4: Deploy page, choose Skip and acknowledge the pop-up warning.
  7. On the Step 5: Review page, review your pipeline configuration, and then choose Create pipeline.

Base image pipeline

Create a continuous deployment pipeline for your application image

The execution of the application image pipeline is triggered by changes to the application source code and changes to the upstream base image. You first create a pipeline, and then edit it to add a second source stage.

    1. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
    2. On the Welcome page, choose Create pipeline.
    3. On the Step 1: Name page, for Pipeline name, type the name for your pipeline, and then choose Next step. For this walkthrough, the pipeline name is hello-world.
    4. For Service role, choose Existing service role, and then choose the CodePipeline service role you modified earlier.
    5. On the Step 2: Source page, for Source provider, choose Amazon ECR.
      1. For Repository name, choose the name of the Amazon ECR repository to use as the source location for your pipeline. For this walkthrough, the repository name is base-image.

Amazon ECR source configuration

  1. On the Step 3: Build page, choose AWS CodeBuild, and then choose Create project.
    1. For Project name, choose a unique name for your build project. For this walkthrough, the project name is hello-world.
    2. For Operating system, choose Ubuntu.
    3. For Runtime, choose Docker.
    4. For Version, choose aws/codebuild/docker:17.09.0.
    5. For Service role, choose Existing service role, choose the CodeBuild service role you created earlier, and then clear the Allow AWS CodeBuild to modify this service role so it can be used with this build project check box.
    6. Choose Continue to CodePipeline.
    7. Choose Next.
  2. On the Step 4: Deploy page, choose Skip and acknowledge the pop-up warning.
  3. On the Step 5: Review page, review your pipeline configuration, and then choose Create pipeline.

The pipeline fails at first because it is missing the application source code. Next, you edit the pipeline to add a second action to the source stage.

  1. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
  2. On the Welcome page, choose your pipeline from the list. For this walkthrough, the pipeline name is hello-world.
  3. On the pipeline page, choose Edit.
  4. On the Editing: hello-world page, in Edit: Source, choose Edit stage.
  5. Choose the existing source action, and choose the edit icon.
    1. Change Output artifacts to BaseImage, and then choose Save.
  6. Choose Add action, and then enter a name for the action (for example, Code).
    1. For Action provider, choose AWS CodeCommit.
    2. For Repository name, choose the name of the AWS CodeCommit repository for your application source code.
    3. For Branch name, choose the branch.
    4. For Output artifacts, specify SourceArtifact, and then choose Save.
  7. On the Editing: hello-world page, choose Save and acknowledge the pop-up warning.

Application image pipeline
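
If you inspect the edited pipeline with aws codepipeline get-pipeline, the source stage should now contain two actions. A rough sketch of what that stage looks like follows; the repository and branch names are illustrative:

{
    "name": "Source",
    "actions": [
        {
            "name": "Source",
            "actionTypeId": { "category": "Source", "owner": "AWS", "provider": "ECR", "version": "1" },
            "configuration": { "RepositoryName": "base-image", "ImageTag": "latest" },
            "outputArtifacts": [ { "name": "BaseImage" } ]
        },
        {
            "name": "Code",
            "actionTypeId": { "category": "Source", "owner": "AWS", "provider": "CodeCommit", "version": "1" },
            "configuration": { "RepositoryName": "hello-world", "BranchName": "master" },
            "outputArtifacts": [ { "name": "SourceArtifact" } ]
        }
    ]
}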

Test your end-to-end pipeline

Your pipeline now has everything needed to run an end-to-end, native AWS continuous deployment. Test its functionality by pushing a code change to your base image repository.

  1. Make a change to your configured source repository, and then commit and push the change.
  2. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
  3. Choose your pipeline from the list.
  4. Watch the pipeline progress through its stages. As the base image is built and pushed to Amazon ECR, see how the second pipeline is triggered, too. When the execution of your pipeline is complete, your application image is pushed to Amazon ECR, and you are now ready to deploy your application. For more information about continuously deploying your application, see Create a Pipeline with an Amazon ECR Source and ECS-to-CodeDeploy Deployment in the AWS CodePipeline User Guide.

Conclusion

In this post, we showed you how to create a complete, end-to-end continuous deployment (CD) pipeline with Amazon ECR and AWS CodePipeline. You saw how to initiate an AWS CodePipeline pipeline update by uploading a new image to Amazon ECR. Support for Amazon ECR in AWS CodePipeline makes it easier to set up a continuous delivery pipeline and use the AWS Developer Tools for CI/CD.

Scanning Docker Images for Vulnerabilities using Clair, Amazon ECS, ECR, and AWS CodePipeline

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/scanning-docker-images-for-vulnerabilities-using-clair-amazon-ecs-ecr-aws-codepipeline/

Post by Vikrama Adethyaa, Solution Architect and Tiffany Jernigan, Developer Advocate

 

Containers are an increasingly important way for you to package and deploy your applications. They are lightweight and provide a consistent, portable software environment for applications to easily run and scale anywhere.

A container is launched from a container image, an executable package that includes everything needed to run an application: the application code, configuration files, runtime (for example, Java, Python, etc.), libraries, and environment variables.

A container image is built up from a series of layers. For a Docker image, each layer in the image represents an instruction in the image’s Dockerfile. A parent image is the image on which your image is built. It refers to the contents of the FROM directive in the Dockerfile. Most Dockerfiles start from a parent image, and often the parent image was downloaded from a public registry.
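
For example, here is a minimal sketch of a Dockerfile that builds on a public parent image; the image, package, and file names are illustrative:

cat > Dockerfile << 'EOF'
# The FROM directive names the parent image, pulled from a public registry
FROM amazonlinux:2

# Each instruction below adds a new layer on top of the parent image
RUN yum install -y httpd
COPY index.html /var/www/html/
CMD ["httpd", "-DFOREGROUND"]
EOF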

It is incredibly difficult and time-consuming to manually track all the files, packages, libraries, and so on, included in an image along with the vulnerabilities that they may possess. Having a security breach is one of the costliest things an organization can endure. It takes years to build up a reputation and only seconds to tear it down.

One way to prevent breaches is to regularly scan your images and compare the dependencies to a known list of common vulnerabilities and exposures (CVEs). Public CVE lists contain an identification number, description, and at least one public reference for known cybersecurity vulnerabilities. The automatic detection of vulnerabilities helps increase awareness and best security practices across developer and operations teams. It encourages action to patch and address the vulnerabilities.

This post walks you through the process of setting up an automated vulnerability scanning pipeline. You use AWS CodePipeline to scan your container images for known security vulnerabilities and deploy the container only if the vulnerabilities are within the defined threshold.

This solution uses CoreOS Clair for static analysis of vulnerabilities in container images. Clair is an API-driven analysis engine that inspects containers layer-by-layer for known security flaws. Clair scans each container layer and provides a notification of vulnerabilities that may be a threat, based on the CVE database and similar data feeds from Red Hat, Ubuntu, and Debian.

Deploying Clair

Here’s how to install Clair on AWS. The following diagram shows the high-level architecture of Clair.

Clair uses PostgreSQL, so use Aurora PostgreSQL to host the Clair database. You deploy Clair as an ECS service with the Fargate launch type behind an Application Load Balancer. The Clair container is deployed in a private subnet behind the Application Load Balancer that is hosted in the public subnets. The private subnets must have a route to the internet using the NAT gateway, as Clair fetches the latest vulnerability information from multiple online sources.

Prerequisites

Ensure that the following are installed or configured on your workstation before you deploy Clair:

  • Docker
  • Git
  • AWS CLI
  • AWS CLI configured with your access key ID and secret access key, and us-east-1 as the default region

Download the AWS CloudFormation template for deploying Clair

To help you quickly deploy Clair on AWS and set up CodePipeline with automatic vulnerability detection, use AWS CloudFormation templates that can be downloaded from the aws-codepipeline-docker-vulnerability-scan GitHub repository. The repository also includes a simple, containerized NGINX website for testing your pipeline.

# Clone the GitHub repository
git clone https://github.com/aws-samples/aws-codepipeline-docker-vulnerability-scan.git

cd aws-codepipeline-docker-vulnerability-scan

VPC requirements

We recommend a VPC with the following specification for deploying CoreOS Clair:

  • Two public subnets
  • Two private subnets
  • NAT gateways to allow internet access for services in private subnets

You can create such a VPC using the AWS CloudFormation template networking-template.yaml that is included in the sample code you cloned from GitHub.

# Create the VPC
aws cloudformation create-stack \
--stack-name coreos-clair-vpc-stack \
--template-body file://networking-template.yaml

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name coreos-clair-vpc-stack

# Get stack outputs
aws cloudformation describe-stacks \
--stack-name coreos-clair-vpc-stack \
--query 'Stacks[].Outputs[]'

Build the Clair Docker image

First, create an Amazon Elastic Container Registry (Amazon ECR) repository to host your Clair Docker image. Then, build the Clair Docker image on your workstation and push it to the ECR repository that you created.

# Create the ECR repository
# Note the URI and ARN of the ECR Repository
aws ecr create-repository --repository-name coreos-clair

# Build the Docker image
docker build -t <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/coreos-clair:latest ./coreos-clair

# Push the Docker image to ECR
aws ecr get-login --no-include-email | bash
docker push <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/coreos-clair:latest

Deploy Clair using AWS CloudFormation

Now that the Clair Docker image has been built and pushed to ECR, deploy Clair as an ECS service with the Fargate launch type. The following AWS CloudFormation stack creates an ECS cluster named clair-demo-cluster and deploys the Clair service.

# Create the AWS CloudFormation stack
# <ECRRepositoryUri> - CoreOS Clair ECR repository URI without an image tag
# Example - <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/coreos-clair

aws cloudformation create-stack \
--stack-name coreos-clair-stack \
--template-body file://coreos-clair/clair-template.yaml \
--capabilities CAPABILITY_IAM \
--parameters \
ParameterKey="VpcId",ParameterValue="<VpcId>" \
ParameterKey="PublicSubnets",ParameterValue=\"<PublicSubnet01-ID>,<PublicSubnet02-ID>\" \
ParameterKey="PrivateSubnets",ParameterValue=\"<PrivateSubnet01-ID>,<PrivateSubnet02-ID>\" \
ParameterKey="ECRRepositoryUri",ParameterValue="<ECRRepositoryUri>"

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name coreos-clair-stack

# Get stack outputs
# Note the ClairAlbDnsName
aws cloudformation describe-stacks \
--stack-name coreos-clair-stack \
--query 'Stacks[].Outputs[]'

Deploying the sample website

Deploy a simple static website running on NGINX as a container. An AWS CloudFormation template is included in the sample code that you cloned from GitHub.

Create a CodeCommit repository for the NGINX website

You create an AWS CodeCommit repository to host the sample NGINX website code. This repository is the source of the pipeline that you create later. Before you proceed with the following steps, make sure that SSH authentication to CodeCommit is set up.

# Create the CodeCommit repository
# Note the cloneUrlSsh value
aws codecommit create-repository --repository-name my-nginx-website
 
# Clone the empty CodeCommit repository
cd ../
git clone <cloneUrlSsh>

# Copy the contents of nginx-website to my-nginx-website
cp -R aws-codepipeline-docker-vulnerability-scan/nginx-website/ my-nginx-website/

# Commit the changes
cd my-nginx-website/
git add *
git commit -m "Initial commit"
git push

Build the NGINX Docker image

Create an ECR repository to host your NGINX website Docker image. Build the image on your workstation using the file Dockerfile-amznlinux, where Amazon Linux is the parent image. After the image is built, push it to the ECR repository that you created.

# Create an ECR repository
# Note the URI and ARN of the ECR repository
aws ecr create-repository --repository-name nginx-website

# Build the Docker image
docker build -f Dockerfile-amznlinux -t <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website:latest .

# Push the Docker image to ECR
docker push <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website:latest

Deploy the NGINX website using AWS CloudFormation

Now deploy the NGINX website. The following stack deploys the NGINX website onto the same ECS cluster (clair-demo-cluster) as Clair.

# Create the AWS CloudFormation stack
# <ECRRepositoryUri> - Nginx-Website ECR Repository URI without Image tag
# Example: <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website

cd ../aws-codepipeline-docker-vulnerability-scan/

aws cloudformation create-stack \
--stack-name nginx-website-stack \
--template-body file://nginx-website/nginx-website-template.yaml \
--capabilities CAPABILITY_IAM \
--parameters \
ParameterKey="VpcId",ParameterValue="<VpcId>" \
ParameterKey="PublicSubnets",ParameterValue=\"<PublicSubnet01-ID>,<PublicSubnet02-ID>\" \
ParameterKey="PrivateSubnets",ParameterValue=\"<PrivateSubnet01-ID>,<PrivateSubnet02-ID>\" \
ParameterKey="ECRRepositoryUri",ParameterValue="<ECRRepositoryUri>"

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name nginx-website-stack

# Get stack outputs
aws cloudformation describe-stacks \
--stack-name nginx-website-stack \
--query 'Stacks[].Outputs[]'

Note the AWS CloudFormation stack outputs. The stack output contains the Application Load Balancer URL for the NGINX website and the ECS service name of the NGINX website. You need the ECS service name for the pipeline.

Building the pipeline

In this section, you build a pipeline to automate vulnerability scanning for the nginx-website Docker image builds. Every time a code change is made, the Docker image is rebuilt and scanned for vulnerabilities. Only if the vulnerabilities are within the defined threshold is the container deployed onto ECS. For more information, see Tutorial: Continuous Deployment with AWS CodePipeline.

The sample code includes an AWS CloudFormation template to create the pipeline. The buildspec.yml file is used by AWS CodeBuild to build the nginx-website Docker image and scan the image using Clair.

CodeBuild build spec

A build spec is a collection of build commands and related settings, in YAML format, that AWS CodeBuild uses to run a build. You can include a build spec in the root directory of your application source code, or you can define a build spec when you create a build project.

In this sample app, you include the build spec in the root directory of your sample application source code. The buildspec.yml file is located in the /aws-codepipeline-docker-vulnerability-scan/nginx-website folder.

Use Klar, a simple tool for analyzing images stored in a private or public Docker registry for security vulnerabilities using Clair. Klar serves as a client that coordinates the image checks between ECR and Clair.

In the buildspec.yml file, you set the variable CLAIR_OUTPUT=Critical. CLAIR_OUTPUT defines the severity level threshold. Vulnerabilities with severity levels higher than or equal to this threshold are reported. The supported levels are:

  • Unknown
  • Negligible
  • Low
  • Medium
  • High
  • Critical
  • Defcon1

You can configure Klar to your requirements by setting the variables as defined in https://github.com/optiopay/klar.
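
For illustration, a standalone scan from a workstation might look like the following sketch; the Clair endpoint is a placeholder, and the variable names follow the Klar README:

# Placeholder endpoint; use the ClairAlbDnsName from your stack outputs
export CLAIR_ADDR=http://<ClairAlbDnsName>
export CLAIR_OUTPUT=High     # report High, Critical, and Defcon1 findings
export CLAIR_THRESHOLD=10    # exit non-zero if more than 10 such findings

# ECR credentials: the user is AWS, the password comes from get-login
export DOCKER_USER=AWS
export DOCKER_PASSWORD=$(aws ecr get-login --region us-east-1 --no-include-email | cut -d' ' -f6)

./klar <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website:latest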

# Set the following variables as CodeBuild project environment variables
# ECR_REPOSITORY_URI
# CLAIR_URL

version: 0.2
phases:
  pre_build:
    commands:
      - echo Fetching ECR Login
      - ECR_LOGIN=$(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - echo Logging in to Amazon ECR...
      - $ECR_LOGIN
      - IMAGE_TAG=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - echo Downloading Clair client Klar-2.1.1
      - wget https://github.com/optiopay/klar/releases/download/v2.1.1/klar-2.1.1-linux-amd64
      - mv ./klar-2.1.1-linux-amd64 ./klar
      - chmod +x ./klar
      - PASSWORD=`echo $ECR_LOGIN | cut -d' ' -f6`
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $ECR_REPOSITORY_URI:latest .
      - docker tag $ECR_REPOSITORY_URI:latest $ECR_REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - bash -c "if [ \"$CODEBUILD_BUILD_SUCCEEDING\" == \"0\" ]; then exit 1; fi"
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push $ECR_REPOSITORY_URI:latest
      - docker push $ECR_REPOSITORY_URI:$IMAGE_TAG
      - echo Running Clair scan on the Docker Image
      - DOCKER_USER=AWS DOCKER_PASSWORD=${PASSWORD} CLAIR_ADDR=$CLAIR_URL CLAIR_OUTPUT=Critical ./klar $ECR_REPOSITORY_URI
      - echo Writing image definitions file...
      - printf '[{"name":"MyWebsite","imageUri":"%s"}]' $ECR_REPOSITORY_URI:$IMAGE_TAG > imagedefinitions.json
artifacts:
  files: imagedefinitions.json

The build spec does the following:

Pre-build stage:

  • Log in to ECR.
  • Download the Clair client Klar.

Build stage:

  • Build the Docker image and tag it as latest and with the Git commit ID.

Post-build stage:

  • Push the image to your ECR repository with both tags.
  • Trigger Klar to scan the image that you pushed to ECR for security vulnerabilities using Clair.
  • Write a file called imagedefinitions.json in the build root that has your Amazon ECS service’s container name and the image and tag. The deployment stage of your CD pipeline uses this information to create a new revision of your service’s task definition. It then updates the service to use the new task definition. The imagedefinitions.json file is required for the AWS CodeDeploy ECS job worker.
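
For reference, the generated imagedefinitions.json is a small JSON array. With the build spec above, it would look something like the following; the account ID and tag are illustrative:

[
    {
        "name": "MyWebsite",
        "imageUri": "012345678910.dkr.ecr.us-east-1.amazonaws.com/nginx-website:abc1234"
    }
]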

Deploy the pipeline

Deploy the pipeline using the AWS CloudFormation template provided with the sample code. The following template creates the CodeBuild project, CodePipeline pipeline, Amazon CloudWatch Events rule, and necessary IAM permissions.

# Deploy the pipeline
 
# Replace the following variables 
# WebsiteECRRepositoryARN – NGINX website ECR repository ARN
# WebsiteECRRepositoryURI – NGINX website ECR repository URI
# ClairAlbDnsName - Output variable from coreos-clair-stack
# EcsServiceName – Output variable from nginx-website-stack

aws cloudformation create-stack \
--stack-name nginx-website-codepipeline-stack \
--template-body file://clair-codepipeline-template.yaml \
--capabilities CAPABILITY_IAM \
--disable-rollback \
--parameters \
ParameterKey="EcrRepositoryArn",ParameterValue="<WebsiteECRRepositoryARN>" \
ParameterKey="EcrRepositoryUri",ParameterValue="<WebsiteECRRepositoryURI>" \
ParameterKey="ClairAlbDnsName",ParameterValue="<ClairAlbDnsName>" \
ParameterKey="EcsServiceName",ParameterValue="<WebsiteECSServiceName>"

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name nginx-website-codepipeline-stack

The pipeline is triggered after the AWS CloudFormation stack creation is complete. You can log in to the AWS Management Console to monitor the status of the pipeline. The vulnerability scan information is available in CloudWatch Logs.
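
For example, assuming CodeBuild's default log group naming of /aws/codebuild/<project-name> and a project named nginx-website, you could pull the Klar output with:

# The project name is an assumption; substitute your CodeBuild project name
aws logs filter-log-events \
--log-group-name /aws/codebuild/nginx-website \
--filter-pattern "vulnerabilities" \
--query 'events[].message' \
--output text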

You can also modify the CLAIR_OUTPUT value from Critical to High in the buildspec.yml file in the /aws-codepipeline-docker-vulnerability-scan/nginx-website folder and then check the status of the build.

Summary

I’ve described how to deploy Clair on AWS and set up a release pipeline for the automated vulnerability scanning of container images. The Clair instance can be used as a centralized Docker image vulnerability scanner and used by other CodeBuild projects. To meet your organization’s security requirements, define your vulnerability threshold in Klar by setting the variables, as defined in https://github.com/optiopay/klar.

Re-affirming Long-Term Support for Java in Amazon Linux

Post Syndicated from Deepak Singh original https://aws.amazon.com/blogs/compute/re-affirming-long-term-support-for-java-in-amazon-linux/

In light of Oracle’s recent announcement indicating an end to free long-term support for OpenJDK after January 2019, we re-affirm that the OpenJDK 8 and OpenJDK 11 Java runtimes in Amazon Linux 2 will continue to receive free long-term support from Amazon until at least June 30, 2023. We are collaborating and contributing in the OpenJDK community to provide our customers with a free long-term supported Java runtime.

In addition, Amazon Linux AMI 2018.03, the last major release of Amazon Linux AMI, will receive support for the OpenJDK 8 runtime at least until June 30, 2020, to facilitate migration to Amazon Linux 2. Java runtimes provided by AWS services such as AWS Lambda, Amazon EMR, and AWS Elastic Beanstalk will also use the AWS-supported OpenJDK builds.

Amazon Linux users will not need to make any changes to get support for OpenJDK 8. OpenJDK 11 will be made available through the Amazon Linux 2 repositories at a future date. The Amazon Linux OpenJDK support posture will also apply to the on-premises virtual machine images and Docker base image of Amazon Linux 2.

Amazon Linux 2 provides a secure, stable, and high-performance execution environment. Amazon Linux AMI and Amazon Linux 2 include a Java runtime based on OpenJDK 8 and are available in all public AWS regions at no additional cost beyond the pricing for Amazon EC2 instance usage.

Amazon ECS and Docker volume drivers, part 2: Amazon EFS

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/amazon-ecs-and-docker-volume-drivers-amazon-efs/

← Introduction and Part 1: Amazon EBS

 

Post by: Tiffany Jernigan and Jeremy Cowan

Introduction

This is the second post in a series showing how to use Docker volumes with Amazon ECS. If you are unfamiliar with Docker volumes or REX-Ray, or want to know how to use a volume plugin with ECS and Amazon Elastic Block Store (Amazon EBS), see Part 1.

In this post, you use the REX-Ray EFS plugin with Amazon Elastic File System (Amazon EFS) to persist and share data among multiple ECS tasks. To help you get started, we have created an AWS CloudFormation template that builds a two-instance ECS cluster across two Availability Zones.

The template bootstraps the REX-Ray EFS plugin onto each node. Each instance has the REX-Ray EFS plugin installed, is assigned an IAM role with an inline policy with permissions for REX-Ray to issue the necessary AWS API calls, and a security group to open port 2049 for EFS. The template also creates a Network Load Balancer that is used to expose an ECS service to the internet.

Set up the environment

First, create a folder to hold all the files for this walkthrough, and change into it. Next, set the full path of the EC2 key pair that you need later to connect to your instance using SSH.

#example path /Users/tiffany/.aws/ec2-keypair.pem
export KeyPairPath=<your-keypair>

Step 1: Instantiate the CloudFormation template

Next, create a CloudFormation stack with the following S3 template:
rexray-demo-efs.yaml

#Derive the key pair name from the path (assumes a path depth like the example above; adjust the cut field if yours differs)
KeyPairName=$(echo $KeyPairPath | cut -d / -f5 | sed 's/.pem//')
Region=$(aws configure get region) #You can also replace this
CloudFormationStack=$(aws cloudformation create-stack \
--region $Region \
--stack-name rexray-demo-efs \
--capabilities CAPABILITY_NAMED_IAM \
--template-url http://s3.amazonaws.com/ecs-refarch-volume-plugins/rexray-demo-efs.yaml \
--parameters ParameterKey=KeyName,ParameterValue=$KeyPairName \
| jq -r .StackId)

The ECS container instances are bootstrapped with a user data script that installs the rexray/efs Docker plugin using:

docker plugin install rexray/efs REXRAY_PREEMPT=true \
EFS_REGION=${AWS::Region} \
EFS_SECURITYGROUPS=${EFSSecurityGroup} \
--grant-all-permissions

Step 2: Export output parameters as environment variables

This shell script exports the output parameters from the CloudFormation template. With the following command, import them as OS environment variables. Later, you use these variables to create task and service definitions.

cat > get-outputs.sh << 'EOF'
#!/bin/bash
function usage {
  echo "usage: source <(./get-outputs.sh  )"
  echo "stack name or ID must be provided or exported as the CloudFormationStack environment variable"
  echo "region must be provided or set with aws configure"
}

function main {
    #Get stack
    if [ -z "$1" ]; then
        if [ -z "$CloudFormationStack" ]; then
            echo "please provide stack name or ID"
            usage
            exit 1
        fi
    else
        CloudFormationStack="$1"
    fi
    #Get region
    if [ -z "$2" ]; then
        region=$(aws configure get region)
        if [ -z $region ]; then
            echo "please provide region"
            usage
            exit 1
        fi
    else
        region="$2"
    fi
    
    echo "#Region: $region"
    echo "#Stack: $CloudFormationStack"
    echo "#---"
    
    echo "#Checking if stack exists..."
    aws cloudformation wait stack-exists \
    --region $region \
    --stack-name $CloudFormationStack
    
    echo "#Checking if stack creation is complete..."
    aws cloudformation wait stack-create-complete \
    --region $region \
    --stack-name $CloudFormationStack
     
    echo "#Getting output keys and values..."
    echo "#---"
    aws cloudformation describe-stacks \
    --region $region \
    --stack-name $CloudFormationStack \
    --query 'Stacks[].Outputs[].[OutputKey, OutputValue]' \
    --output text | awk '{print "export", $1"="$2}'
}
main "[email protected]"
EOF
#Add executable permissions
chmod +x get-outputs.sh

Now run the script:

./get-outputs.sh && source <(./get-outputs.sh)

Step 3: Create a task definition

In this step, you create a task definition for an Apache web service, Space, which is an example website using Apache2 on Ubuntu. The scheduler and the REX-Ray EFS plugin ensure that each copy of the task establishes a connection with EFS.

cat > space-taskdef-efs.json << EOF 
{
    "containerDefinitions": [
        {
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "${CWLogGroupName}",
                    "awslogs-region": "${AWSRegion}",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "portMappings": [
               {
                    "containerPort": 80,
                    "protocol": "tcp"
                }
            ],
            "mountPoints": [
                {
                    "containerPath": "/var/www/",
                    "sourceVolume": "rexray-efs-vol"
                }
            ],
            "image": "tiffanyfay/space:apache",
            "essential": true,
            "name": "space"
        }
    ],
    "memory": "512",
    "family": "rexray-efs",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "512",
    "volumes": [
        {
            "name": "rexray-efs-vol",
            "dockerVolumeConfiguration": {
                "autoprovision": true,
                "scope": "shared",
                "driver": "rexray/efs"
            }
        }
    ]
}
EOF

Because autoprovision is set to true, the Docker volume driver, rexray/efs, creates a new file system for you. And because scope is shared, the file system can be used across multiple tasks.
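
If you connect to one of the container instances over SSH after the first task starts, you can verify what the driver created. This is an optional sanity check, not a required step:

# List volumes created by the rexray/efs plugin
docker volume ls --filter driver=rexray/efs

# Inspect the shared volume that backs the tasks
docker volume inspect rexray-efs-vol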

Register the task definition and extract the task definition ARN from the result:

TaskDefinitionArn=$(aws ecs register-task-definition \
--region $AWSRegion \
--cli-input-json 'file://space-taskdef-efs.json' \
| jq -r .taskDefinition.taskDefinitionArn)

Step 4: Create a service definition

In this step, you create a service definition for the rexray-efs task definition. An ECS service is a long-running task that is monitored by the service scheduler. If the task dies or becomes unhealthy, the scheduler automatically attempts to restart the task.

The web service is fronted by a Network Load Balancer that is configured to forward traffic on port 80 to the tasks registered with a specific target group. The desired count is the desired number of task copies to run. The minimum and maximum healthy percent parameters inform the scheduler to run exactly the desired number of copies of this task at a time: unless a task has been stopped, the scheduler does not try starting a new one.

cat > space-svcdef-efs.json << EOF 
{
    "cluster": "${ECSClusterName}",
    "serviceName": "space-svc",
    "taskDefinition": "${TaskDefinitionArn}",
    "loadBalancers": [
        {
            "targetGroupArn": "${WebTargetGroupArn}",
            "containerName": "space",
            "containerPort": 80
        }
    ],
    "desiredCount": 4,
    "launchType": "EC2",
    "healthCheckGracePeriodSeconds": 60, 
    "deploymentConfiguration": {
        "maximumPercent": 100,
        "minimumHealthyPercent": 0
    },
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": [
                "${SubnetIds}"
            ],
            "securityGroups": [
                "${EFSSecurityGroupId}",
                "${InstanceSecurityGroupId}"
            ]
        }
    }
}
EOF

Create the Apache service:

SvcDefinitionArn=$(aws ecs create-service \
--region $AWSRegion \
--cli-input-json file://space-svcdef-efs.json \
| jq -r .service.serviceArn)

Wait for the service to be up, with the last status of its tasks reported as RUNNING, using either the CLI or the console:

aws ecs wait services-stable \
--region $AWSRegion \
--cluster $ECSClusterName \
--services $SvcDefinitionArn

Next, look at your file system and see two mount points—one for each Availability Zone:

FileSystemId=$(aws efs describe-file-systems \
--region $AWSRegion \
--query 'FileSystems[?Name==`/rexray-efs-vol`].FileSystemId' \
--output text)
aws efs describe-mount-targets \
--region $AWSRegion \
--file-system-id $FileSystemId 

Step 5: View the webpage

Now, open a browser and paste NLBDNSName as the URL.

echo $NLBDNSName

If you refresh the page, you can see that the task ID and EC2 instance ID change as the traffic is being load balanced.

Get the DNS info for an instance so that you can connect to it using SSH and modify index.shtml:

InstanceDns=$(aws ec2 describe-instances \
--region $AWSRegion \
--filter Name="tag:aws:cloudformation:stack-id",Values="$CloudFormationStack" \
--query 'Reservations[1].Instances[].PublicDnsName' \
--output text)
ssh -i $KeyPairPath ec2-user@$InstanceDns

Now, get one of the Docker container IDs and use docker exec to change the image being displayed:

ContainerId=$(docker ps --filter volume="rexray-efs-vol" \
--format "{{.ID}}" --latest)
docker exec -it $ContainerId sed -i "s/ecsship/cruiser/" /var/www/index.shtml

To see the update, refresh the load balancer webpage.

Step 6: Clean up

To clean up the resources that you created in this post, take the following steps.

Delete the mount targets and file system.

FileSystemId=$(aws efs describe-file-systems \
--region $AWSRegion \
--query 'FileSystems[?Name==`/rexray-efs-vol`].FileSystemId' \
--output text)
MountTargetIds=($(aws efs describe-mount-targets \
--region $AWSRegion \
--file-system-id $FileSystemId \
--query 'MountTargets[].MountTargetId' --output text))
aws efs delete-mount-target --region $AWSRegion \
--mount-target-id ${MountTargetIds[2]}
aws efs delete-mount-target --region $AWSRegion \
--mount-target-id ${MountTargetIds[1]}
aws efs delete-file-system --region $AWSRegion \
--file-system-id $FileSystemId 

Delete the service.

aws ecs update-service \
--region $AWSRegion \
--cluster $ECSClusterName \
--service $SvcDefinitionArn \
--desired-count 0
aws ecs delete-service \
--region $AWSRegion \
--cluster $ECSClusterName \
--service $SvcDefinitionArn

Delete the CloudFormation template. This removes the rest of the environment that was pre-created for this exercise.

aws cloudformation delete-stack --region $AWSRegion \
--stack-name $CloudFormationStack

Summary

Congratulations on getting your service up and running with Docker volume plugins and EFS!

You created a CloudFormation stack that includes two instances running the REX-Ray EFS plugin across two subnets, a Network Load Balancer, and an ECS cluster. You also created a task definition and a service that use the plugin to create an elastic file system.

We look forward to hearing about how you use Docker Volume Plugins with ECS.

Tiffany and Jeremy

Amazon ECS and Docker volume drivers, part 1: Amazon EBS

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/amazon-ecs-and-docker-volume-drivers-amazon-ebs/

→ Part 2: Amazon EFS

 

Post by: Jeremy Cowan, Ronnie Eichler, and Tiffany Jernigan

Introduction

Containers are emerging as the default compute primitive for building cloud-native applications. They facilitate the adoption of continuous delivery and help increase infrastructure utilization.

However, deploying stateful applications as containers has been challenging because containers have short life spans, get redeployed frequently, are scaled up and down dynamically, and often share the same host with other containers. All of these factors make it challenging for you to appropriately align the lifecycles of storage volumes and containers.

Before Docker volume driver support was added to Amazon ECS, you had to manage storage volumes manually using custom tooling, such as bash scripts, Lambda functions, or manual configuration of Docker volumes. Now you can take full advantage of the Docker plugin ecosystem by using popular plugins such as REX-Ray or Portworx.

ECS support for Docker volumes means that you can now deploy stateful and storage-intensive use cases. These include:

  • Machine learning and data processing workloads
  • Applications such as GitLab or Jenkins that share a filesystem across multiple tasks
  • Databases such as Cassandra or RocksDB
  • Streaming tools such as Kafka
  • Additional scratch space added to containers that process large workloads and are storage-intensive

To support this broad array of use cases, ECS offers you the flexibility to configure the lifecycle of the Docker volume. For example, you can specify whether it is a scratch space volume specific to a single instantiation of a task, or a persistent volume that persists beyond the lifecycle of a unique instantiation of the task. You can also choose to use a Docker volume that you’ve created before launching your task.

In addition to managing the Docker volume configuration and lifecycle, the ECS scheduler is now plugin-aware. ECS takes the availability of the requested driver into account in its placement decisions, so that tasks that require a certain driver are only placed on container instances that have the driver installed.
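
For example, installed plugins are advertised as container instance attributes. Assuming the ecs.capability.docker-plugin.<plugin-name> naming convention, you can list the instances that advertise the rexray/ebs driver:

aws ecs list-attributes \
--cluster <cluster-name> \
--target-type container-instance \
--attribute-name ecs.capability.docker-plugin.rexray/ebs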

Docker and Docker volumes

Docker volumes are a way to persist data outside of the lifecycle of a container. Containers themselves are made up of multiple immutable layers of storage with an ephemeral layer, which is read/write. If your application writes files to the ephemeral layer, these changes are lost when the container stops.

Volumes are managed outside of the container lifecycle—stopping or removing the container does not remove the volume. Docker also supports volume drivers that allow you to use volumes as an abstraction between containers and persistent storage such as Amazon EBS or Amazon EFS. By default, Docker provides a driver called ‘local’ that provides local storage volumes to containers. With Docker plugins, you can now add volume drivers to provision and manage EBS and EFS storage, such as REX-Ray, Portworx, and NetShare.
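
As a quick sketch outside of ECS, assuming the rexray/ebs plugin is already installed on the host, you could create and mount an EBS-backed volume directly with Docker; the volume name is illustrative:

# Create a 5 GiB gp2 EBS-backed volume through the plugin
docker volume create --driver rexray/ebs \
--opt size=5 --opt volumetype=gp2 my-ebs-vol

# Data written to /data persists beyond this container's lifetime
docker run -it --rm -v my-ebs-vol:/data amazonlinux:2 \
bash -c 'echo hello > /data/test.txt && cat /data/test.txt'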

To deploy a stateful application such as Cassandra, MongoDB, Zookeeper, or Kafka, you likely need high-performance persistent storage like EBS. Docker volumes allow you to present an EBS volume to your application as a Docker volume.

There are other applications such as Jenkins and GitLab, where multiple copies of the application need access to the same data. With volume drivers and EFS, you can present EFS as a shared volume to multiple instances of your container so that you can scale your application yet still retain and persist shared data on EFS.

Another overlooked use case involves applications that need scratch space. When you define a task in ECS and your application writes to the filesystem inside of the container (not on a Docker volume), the task consumes space on the underlying EC2 instance that is shared by all other running tasks. This can lead to issues of ‘noisy neighbors’ if a task were to write a bunch of data to /tmp on its local filesystem.

Now with Docker volume support in ECS, you can map an EBS volume to /tmp (or whatever scratch space directory you prefer). You can ensure good performance while limiting the size of the underlying EBS volume using arguments in your ECS task to the volume driver, as sketched below.
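
Here is a sketch of that scratch-space mapping in a task definition; the names and size are illustrative, and scope is set to task so that the volume lives and dies with the task. The first fragment belongs in a container definition, the second at the top level of the task definition:

"mountPoints": [
    { "containerPath": "/tmp", "sourceVolume": "scratch-vol" }
]

"volumes": [
    {
        "name": "scratch-vol",
        "dockerVolumeConfiguration": {
            "scope": "task",
            "driver": "rexray/ebs",
            "driverOpts": { "volumetype": "gp2", "size": "20" }
        }
    }
]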

What is REX-Ray?

REX-Ray is just one example of a Docker volume driver plugin that provides an abstraction between Docker volumes and the underlying storage. Built on top of the libStorage framework, REX-Ray’s simplified architecture consists of a single binary. It runs as a stateless service on every host, using a configuration file to orchestrate multiple storage platforms. REX-Ray supports multiple storage backends. For this post, we focus on EBS as a storage backend. Part two of this series focuses on EFS.

Using a plugin such as REX-Ray, your Docker container is able to persist data outside of the lifespan of a running container. You don’t have to worry about the underlying storage. Instead, you simply reference a Docker volume in your task definition and let REX-Ray provide the abstraction. While this post is specific to REX-Ray, ECS is designed to be open and pass through the volume driver arguments from your task definition to Docker. You can use any volume driver (such as Portworx) that is supported by Docker.

Putting it all together

Before you can get started using Docker volumes with ECS, there are a few things you need to do.

First, you need a suitable volume driver plugin, such as REX-Ray, to provide an abstraction between the Docker volume and the underlying storage, for example, EBS or EFS. Docker designed volumes and the associated driver mechanism to be pluggable to support a variety of storage backends. Although we’ve chosen to highlight REX-Ray for this post, there are several others to choose from, including Portworx and NetShare.

Because the volume plugin interacts with the AWS storage services on your behalf, an IAM role has to be assigned to the ECS container instances. This allows REX-Ray to issue the appropriate AWS API calls and perform actions such as attaching and detaching EBS volumes, and so on.

Using REX-Ray with Amazon EBS

To help you get started, we’ve created an AWS CloudFormation template that builds a two-node ECS cluster.  The template bootstraps the rexray/ebs volume driver onto each node and assigns them an IAM role with an inline policy that allows them to call the API actions that REX-Ray needs.  The template also creates a Network Load Balancer, which is used to expose an ECS service to the internet.

Finally, you create a task definition for a stateful service, MySQL, that uses the rexray/ebs driver. Observe how the volume where MySQL stores its data is moved when the MySQL task is scheduled on another instance in the cluster.

Set up the environment

Here’s how to set up the environment for this walkthrough.

Step 1: Instantiate the AWS CloudFormation template

aws cloudformation create-stack --stack-name rexray-demo \
--capabilities CAPABILITY_NAMED_IAM \
--template-url http://s3.amazonaws.com/ecs-refarch-volume-plugins/rexray-demo.json \
--parameters ParameterKey=KeyName,ParameterValue=<keypair-name>

The ECS container instances are bootstrapped using the following script, which is given as user data in rexray-demo.json.

#open file descriptor for stderr
exec 2>>/var/log/ecs/ecs-agent-install.log
set -x
#verify that the agent is running
until curl -s http://localhost:51678/v1/metadata
do
	sleep 1
done
#install the Docker volume plugin
docker plugin install rexray/ebs REXRAY_PREEMPT=true EBS_REGION=<AWS_REGION> --grant-all-permissions
#restart the ECS agent
stop ecs 
start ecs

Step 2: Export output parameters as environment variables

This shell script exports the output parameters from the CloudFormation template and imports them as OS environment variables.  You use these variables later to create task and service definitions.

cat > get-outputs.sh << 'EOF'
#!/bin/bash
function usage {
  echo "usage: source <(./get-outputs.sh <stackname-or-stackid> <region>)"
  echo "stack name or ID must be provided or exported as the CloudFormationStack environment variable"
  echo "region must be provided or set with aws configure"
}

function main {
    #Get stack
    if [ -z "$1" ]; then
        if [ -z "$CloudFormationStack" ]; then
            echo "please provide stack name or ID"
            usage
            exit 1
        fi
    else
        CloudFormationStack="$1"
    fi
    #Get region
    if [ -z "$2" ]; then
        region=$(aws configure get region)
        if [ -z $region ]; then
            echo "please provide region"
            usage
            exit 1
        fi
    else
        region="$2"
    fi
    
    echo "#Region: $region"
    echo "#Stack: $CloudFormationStack"
    echo "#---"
    
    echo "#Checking if stack exists..."
    aws cloudformation wait stack-exists \
    --region $region \
    --stack-name $CloudFormationStack
    
    echo "#Checking if stack creation is complete..."
    aws cloudformation wait stack-create-complete \
    --region $region \
    --stack-name $CloudFormationStack
     
    echo "#Getting output keys and values..."
    echo "#---"
    aws cloudformation describe-stacks \
    --region $region \
    --stack-name $CloudFormationStack \
    --query 'Stacks[].Outputs[].[OutputKey, OutputValue]' \
    --output text | awk '{print "export", $1"="$2}'
}
main "[email protected]"
EOF

#Add executable permissions
chmod +x get-outputs.sh

Export the output parameters. The region parameter is only needed if your Region configuration is not us-west-2, as defined in the CloudFormation template.

./get-outputs.sh && source <(./get-outputs.sh)

Step 3: Create the task definition

In this step, you create a task definition for MySQL. MySQL is considered a stateful service because the data stored in the database has to persist beyond the life of the task.

When the MySQL task is restarted on another instance in the cluster, the scheduler and the rexray/ebs plugin ensure that the task is launched on an instance that can re-establish a connection to the EBS volume where the database is stored.

The placement constraint in the task definition informs the ECS service scheduler to launch the task in a specific Availability Zone: the Availability Zone where the EBS volume was originally created. Such a constraint is necessary because instances cannot attach volumes in a different Availability Zone.

cat > mysql-taskdef.json << EOF 
{
    "containerDefinitions": [
        {
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "${CWLogGroupName}",
                    "awslogs-region": "${AWSRegion}",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "portMappings": [
                {
                    "containerPort": 3306,
                    "protocol": "tcp"
                }
            ],
            "environment": [
                {
                    "name": "MYSQL_ROOT_PASSWORD",
                    "value": "my-secret-pw"
                }
            ],
            "mountPoints": [
                {
                    "containerPath": "/var/lib/mysql",
                    "sourceVolume": "rexray-vol"
                }
            ],
            "image": "mysql",
            "essential": true,
            "name": "mysql"
        }
    ],
    "placementConstraints": [
        {
            "type": "memberOf",
            "expression": "attribute:ecs.availability-zone==${AvailabilityZone}"
        }
    ],
    "memory": "512",
    "family": "mysql",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "512",
    "volumes": [
        {
            "name": "rexray-vol",
            "dockerVolumeConfiguration": {
                "autoprovision": true,
                "scope": "shared",
                "driver": "rexray/ebs",
                "driverOpts": {
                    "volumetype": "gp2",
                    "size": "5"
                }
            }
        }
    ]
}
EOF

Docker volume support adds several new parameters to the ECS task definition. These include the volume type, scope, drivers, and Docker options and labels. A volume can either be scoped to a single, specific task, or it can be shared among multiple tasks.

When a volume is scoped to a task, it is not meant to be shared across different running tasks.  In contrast, a shared volume is for use cases where the volume lifecycle is independent of the ECS task. The volume can be used by different tasks concurrently or at different times. It is primarily intended for use cases such as single-task applications where the volume persists after the task dies and is re-used when the task starts again. Another use case is when multiple tasks on the same EC2 container instance access the volume concurrently.

The autoprovision parameter is used to specify whether ECS manages the lifecycle of the volume.  When this is set to true, ECS automatically provisions the volume for you, which is what you are doing in the above example.  When it’s set to false, ECS assumes that the volume already exists.  For this example, you could instead set autoprovision to false and run the following command to create a volume:

aws ec2 create-volume --size 1 --volume-type gp2 \
--availability-zone $AvailabilityZone \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=rexray-vol}]'

The driver options are used to configure the type of EBS storage used (for example, gp2, standard, or io1), the size of the volume to provision, IOPS, and encryption. The specific options vary depending on the volume plugin that you are using.

Register the task definition and extract the task definition ARN from the result:

TaskDefinitionArn=$(aws ecs register-task-definition \
--cli-input-json 'file://mysql-taskdef.json' \
| jq -r .taskDefinition.taskDefinitionArn)

Step 4: Create a service definition

In this step, you create a service definition for MySQL. An ECS service is a long-running task that is monitored by the service scheduler. If the task dies or becomes unhealthy, the scheduler automatically attempts to restart the task.

The MySQL service is fronted by a Network Load Balancer that is configured to forward traffic on port 3306 to the tasks registered with a specific target group. The desired count is the desired number of task copies to run. The minimum and maximum healthy percent parameters inform the scheduler to run exactly the desired number of copies of this task at a time: unless a task has been stopped, the scheduler does not try starting a new one.

cat > mysql-svcdef.json << EOF 
{
    "cluster": "${ECSClusterName}",
    "serviceName": "mysql-svc",
    "taskDefinition": "${TaskDefinitionArn}",
    "loadBalancers": [
        {
            "targetGroupArn": "${MySQLTargetGroupArn}",
            "containerName": "mysql",
            "containerPort": 3306
        }
    ],
    "desiredCount": 1,
    "launchType": "EC2",
    "healthCheckGracePeriodSeconds": 60, 
    "deploymentConfiguration": {
        "maximumPercent": 100,
        "minimumHealthyPercent": 0
    },
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": [
                "${SubnetId}"
            ],
            "securityGroups": [
                "${SecurityGroupId}"
            ],
            "assignPublicIp": "DISABLED"
        }
    }
}
EOF

Create the MySQL service:

SvcDefinitionArn=$(aws ecs create-service \
--cli-input-json file://mysql-svcdef.json \
| jq -r .service.serviceArn)

Step 5: Connect to the MySQL service

After the service is running, configure a MySQL client, such as MySQL Workbench, to connect to the service:

  1. For Connection Name, type “rexray-demo”.
  2. For Hostname, copy and paste the DNS name of the Network Load Balancer.
  3. For Password, type the default password found in the mysql-taskdef.json file.
  4. Choose Test Connection, Close.
  5. Under MySQL Connections, open the rexray-demo connection.

MySQL Workbench

In the Query window, paste the following:

CREATE DATABASE rexraydb;
USE rexraydb;
CREATE TABLE pets (name VARCHAR(20), breed VARCHAR(20));
SHOW TABLES;
DESCRIBE pets;
INSERT INTO pets VALUES ('Fluffy', 'Poodle');
SELECT * FROM pets;

You can execute each line separately by placing the cursor on a line and clicking the execute statement button.

Execute MySQL commands

Step 6: Drain the instance

Now that you have a MySQL database server running in a container and persisting its data, make sure that it survives a container replacement.

Docker containers by their nature are designed to be ephemeral. If you upgrade the underlying host operating system, you must drain the tasks off of the instance and let them be re-scheduled onto another ECS host. Below, I show the behavior of persisting the MySQL instance’s data to an EBS volume and allowing the task to be re-scheduled.

The following script identifies the instance that is currently running the task and puts it in a draining state.  This forces the task to be rescheduled onto the other EC2 container instance in the cluster.

cat > drain-instance.sh << 'EOF'

echo "Region [$AWSRegion]"
echo "Cluster [$ECSClusterName]"
echo "Task Definition [$TaskDefinitionArn]"

TaskArns=$(aws ecs list-tasks --region $AWSRegion \
--cluster $ECSClusterName --query taskArns --output text)
echo "Task ARNs [$TaskArns]"

#Filter tasks to the MySQL task definition and return their container instance ARNs
ContainerInstanceArns=$(aws ecs describe-tasks \
--region $AWSRegion --cluster $ECSClusterName \
--tasks $TaskArns \
--query 'tasks[?taskDefinitionArn==`'$TaskDefinitionArn'`].containerInstanceArn' \
--output text)
echo "Container Instance ARNs [$ContainerInstanceArns]"

echo "DRAINING Instances"
aws ecs update-container-instances-state --region $AWSRegion \
--cluster $ECSClusterName --container-instances $ContainerInstanceArns \
--status "DRAINING"

EOF

In the ECS console, if you click on the cluster and then the tab for the cluster’s tasks, you see the container instance ID for the MySQL task:

Clicking the link of the container instance ID takes you to another page that shows the EC2 instance ID of the instance where the MySQL task is running:

Now run the script:

chmod +x drain-instance.sh
./drain-instance.sh

When you run the script, the tasks on the draining instance are stopped. Because you have an ECS service definition for MySQL, ECS launches new tasks on other ECS instances in the cluster that meet the placement constraints. In this example, you placed a constraint on the Availability Zone of the EBS volume as it’s not possible to detach and re-attach volumes across Availability Zones. Because the volume already exists, REX-Ray attaches the existing volume to the new task. When MySQL starts, it sees this as its data volume and you have access to the recently stored data.
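
To confirm that the replacement task is running on the other instance, you can list the cluster's running tasks:

aws ecs list-tasks --region $AWSRegion \
--cluster $ECSClusterName --desired-status RUNNING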

Step 7: Re-connect to the MySQL service

After you see that a new task has been provisioned on the ECS cluster, you can return to MySQL Workbench and attempt to run the following query:

USE rexraydb;
SELECT * FROM pets;

You may get an error message stating “The MySQL server has gone away.” This usually means that the new ECS task has not completed starting or hasn’t been registered yet as a healthy target behind the Network Load Balancer. If you wait a little longer and try again, you should see the same results in the query grid as before.

This environment is meant as a demonstration of how to use Docker volume plugins with ECS to support persistent workloads. For an actual production implementation, I recommend scoping the VPC and security groups to only allow network access from trusted resources. This post creates a MySQL server that is accessible from the internet. In addition, you should implement your own strong MySQL root password, among other things.

To clean up this demo, take the following steps.

Delete the service.

aws ecs update-service --cluster $ECSClusterName \
--service $SvcDefinitionArn \
--desired-count 0
aws ecs delete-service --cluster $ECSClusterName \
--service $SvcDefinitionArn

Delete the volume.

Even though you deleted the task and the service, you still need to clean up the EBS volume that you created. You created this volume and referenced it in the ECS task definition. ECS passed this information along to Docker running on the host, which in turn handed it to REX-Ray (your volume driver), which knew how to attach the EBS volume and map it to the container.

The easiest way to delete this volume is from the EC2 console. In the list of volumes, you should see a volume named rexray-vol that is unattached (state=available). Delete this volume as it is no longer needed.

 

REX-Ray Volume

Otherwise, you can run the following command, which grabs the volume ID and deletes it:

rexrayVolumeID=$(aws ec2 describe-volumes --filter Name="tag:Name",Values=rexray-vol \
--query "Volumes[].VolumeId" --output text)
aws ec2 delete-volume --volume-id $rexrayVolumeID

Delete the CloudFormation template.

Lastly, delete the CloudFormation template. This removes the rest of the environment that was pre-created for this exercise.

aws cloudformation delete-stack --stack-name rexray-demo

Summary

While it was possible to use Docker volume plugins with ECS previously, doing so required you to create volumes out of band, that is, outside of ECS, and create placement constraints to restrict where tasks could be run. With native support for Docker volumes, volumes can now be provisioned simply by adding a handful of parameters to an ECS task definition.

Moreover, the ECS scheduler is now volume-plugin aware. Instances that have a volume driver installed are automatically annotated with attributes that tell the scheduler where to place tasks that use a particular driver. Together, these features help you run stateful, storage-intensive applications such as databases, machine learning and data processing applications, streaming applications like Kafka, and applications that need additional scratch space. We look forward to hearing about the use cases that this new feature enables.

– Jeremy, Ronnie, and Tiffany

Automating rollback of failed Amazon ECS deployments

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/automating-rollback-of-failed-amazon-ecs-deployments/

Contributed by Vinay Nadig, Associate Solutions Architect, AWS.

With more and more organizations moving toward Agile development, it’s not uncommon to deploy code to production multiple times a day. With the increased speed of deployments, it’s imperative to have a mechanism in place where you can detect errors and roll back problematic deployments early. In this blog post, we look at a solution that automates the process of monitoring Amazon Elastic Container Service (Amazon ECS) deployments and rolling back the deployment if the container health checks fail repeatedly.

The normal flow for a service deployment on Amazon ECS is to create a new task definition revision and update an Amazon ECS service with the new task definition. Based on the values of minimumHealthyPercent and maximumPercent, Amazon ECS replaces existing containers in batches to complete the deployment. After the deployment is complete, you typically monitor the service health for errors and decide whether to roll back the deployment.
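As a concrete illustration, the batch sizes used during a rolling deployment are controlled by the service’s deployment configuration. The following is a minimal boto3 sketch; the cluster, service, and task definition names are placeholders:

import boto3

ecs = boto3.client("ecs", region_name="us-west-2")

# Roll out a new task definition revision; Amazon ECS replaces tasks in
# batches sized by minimumHealthyPercent and maximumPercent.
ecs.update_service(
    cluster="<cluster-name>",             # placeholder
    service="<service-name>",             # placeholder
    taskDefinition="<family>:<revision>", # placeholder
    deploymentConfiguration={
        "minimumHealthyPercent": 50,  # keep at least half the desired tasks running
        "maximumPercent": 200,        # allow up to double during the rollout
    },
)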

In March 2018, AWS announced support for native Docker health checks on Amazon ECS. Amazon ECS also supports Application Load Balancer health checks for services that are integrated with a load balancer. Leveraging these two features, we can build a solution that automatically rolls back Amazon ECS deployments if health checks fail.

Solution overview

The solution consists of the following components:

• An Amazon CloudWatch Events rule that listens for the UpdateService API call on an Amazon ECS cluster

• An AWS Lambda function that listens for the Amazon ECS events generated from the cluster after the service update

• A Lambda function that calculates the failure percentage based on the events in the Amazon ECS event stream

• A Lambda function that triggers rollback of the deployment if there are high error rates in the events

• An AWS Step Functions state machine to orchestrate the entire flow

 

The following diagram shows the solution’s components and workflow.

Assumptions

The following assumptions are important to understand before you implement the solution:

• The solution assumes that with every revision of the task definition, you use a new Docker tag instead of using the default “latest” tag. As a best practice, we advise that you do every release with a different Docker image tag and a revision of the task definition.

• If there are continuous health check failures even after the deployment is automatically rolled back using this setup, another rollback is triggered due to the health check failures. This might introduce a runaway deployment rollback loop. Make sure that you use the solution where you know that a one-step rollback will bring the Amazon ECS service into a stable state.

• This blog post assumes deployment to the US West (Oregon) us-west-2 Region. If you want to deploy the solution to other Regions, you need to make minor modifications to the Lambda code.

• The Amazon ECS cluster launches in a new VPC. Make sure that your VPC service limit allows for a new VPC.

Prerequisites

You need the following permissions in AWS Identity and Access Management (IAM) to implement the solution:

• Create IAM roles

• Create an Amazon ECS cluster

• Create a CloudWatch Events rule

• Create Lambda functions

• Create a Step Functions state machine

Creating the Amazon ECS cluster

First, we create an Amazon ECS cluster using the AWS Management Console.

1. Sign in to the AWS Management Console and open the Amazon ECS console.

2. For Step 1: Select cluster template, choose EC2 Linux + Networking and then choose Next step.

3. For Step 2: Configure cluster, under Configure cluster, enter the Amazon ECS cluster name as AutoRollbackTestCluster.

 

4. Under Instance configuration, for EC2 instance type, choose t2.micro.

5. Keep the default values for the rest of the settings and choose Create.

 

This provisions an Amazon ECS cluster with a single Amazon ECS container instance.

Creating the task definition

Next, we create a new task definition using the Nginx Alpine image.

1. On the Amazon ECS console, choose Task Definitions in the navigation pane and then choose Create new Task Definition.

2. For Step 1: Select launch type compatibility, choose EC2 and then choose Next step.

3. For Task Definition Name, enter Web-Service-Definition.

4. Under Container Definitions, choose Add container.

5.  On the Add container pane, under Standard, enter Web-Service-Container for Container name.

6.  For Image, enter nginx:alpine. This pulls the nginx:alpine Docker image from Docker Hub.

7.  For Memory Limits (MiB), choose Hard limit and enter 128.

8.  Under Advanced container configuration, enter the following information for Healthcheck:

• Command:

CMD-SHELL, wget http://localhost/ && rm index.html || exit 1

• Interval: 10

• Timeout: 30

• Start period: 10

• Retries: 2

9.  Keep the default values for the rest of the settings on this pane and choose Add.

10. Choose Create.
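If you script your task definitions instead of using the console, the fields above map to the healthCheck block of the container definition. The following is a minimal boto3 sketch of the equivalent registration, using only the values entered in this walkthrough:

import boto3

ecs = boto3.client("ecs", region_name="us-west-2")

ecs.register_task_definition(
    family="Web-Service-Definition",
    containerDefinitions=[
        {
            "name": "Web-Service-Container",
            "image": "nginx:alpine",
            "memory": 128,  # hard limit, in MiB
            "healthCheck": {
                # Same command, interval, timeout, start period, and retries
                # as entered in the console above.
                "command": ["CMD-SHELL", "wget http://localhost/ && rm index.html || exit 1"],
                "interval": 10,
                "timeout": 30,
                "retries": 2,
                "startPeriod": 10,
            },
        }
    ],
)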

Creating the Amazon ECS service

Next, we create an Amazon ECS service that uses this task definition.

1.  On the Amazon ECS console, choose Clusters in the navigation pane and then choose AutoRollbackTestCluster.

2.  On the Services view, choose Create.

3.  For Step 1: Configure service, use the following settings:

• Launch type: EC2.

• Task Definition Family: Web-Service-Definition. This automatically selects the latest revision of the task definition.

• Cluster: AutoRollbackTestCluster.

• Service name: Nginx-Web-Service.

• Number of tasks: 3.

4.  Keep the default values for the rest of the settings and choose Next Step.

5.  For Step 2: Configure network, keep the default value for Load balancer type and choose Next Step.

6. For Step 3: Set Auto Scaling (optional), keep the default value for Service Auto Scaling and choose Next Step.

7. For Step 4: Review, review the settings and choose Create Service.

After creating the service, you should have three tasks running in the cluster. You can verify this on the Tasks view in the service, as shown in the following image.

Implementing the solution

With the Amazon ECS cluster set up, we can move on to implementing the solution.

Creating the IAM role

First, we create an IAM role for reading the event stream of the Amazon ECS service and rolling back any faulty deployments.

 

1.  Open the IAM console and choose Policies in the navigation pane.

2.  Choose Create policy.

3.  On the Visual editor view, for Service, choose EC2 Container Service.

4.  For Actions, under Access Level, select DescribeServices under Read and UpdateService under Write.

5.  Choose Review policy.

6.  For Name, enter ECSRollbackPolicy.

7.  For Description, enter an appropriate description.

8.  Choose Create policy.

Creating the Lambda service role

Next, we create a Lambda service role that uses the previously created IAM policy. The Lambda function to roll back faulty deployments uses this role.

 

1.  On the IAM console, choose Roles in the navigation pane and then choose Create role.

2.  For the type of trusted entity, choose AWS service.

3.  For the service that will use this role, choose Lambda.

4.  Choose Next: Permissions.

5.  Under Attach permissions policies, select the ECSRollbackPolicy policy that you created.

6. Choose Next: Review.

7.  For Role name, enter ECSRollbackLambdaRole and choose Create role.

Creating the Lambda function for the Step Functions workflow and Amazon ECS event stream

The next step is to create the Lambda function that will collect Amazon ECS events from the Amazon ECS event stream. This Lambda function will be part of the Step Functions state machine.

 

1.  Open the Lambda console and choose Create function.

2.  For Name, enter ECSEventCollector.

3.  For Runtime, choose Python 3.6.

4.  For Existing role, choose the ECSRollbackLambdaRole IAM role that you created.

5. Choose Create function.

6.  On the Configuration view, under Function code, enter the following code.

import time
import boto3
from datetime import datetime

ecs = boto3.client('ecs', region_name='us-west-2')


def lambda_handler(event, context):
    # The state machine input is the UpdateService event; pull out the
    # service, the cluster, and the time at which the deployment started.
    service_name = event['detail']['requestParameters']['service']
    cluster_name = event['detail']['requestParameters']['cluster']
    _update_time = event['detail']['eventTime']
    _update_time = datetime.strptime(_update_time, "%Y-%m-%dT%H:%M:%SZ")
    # "%s" converts to a Unix timestamp (supported on Linux, which is
    # what Lambda runs on).
    start_time = _update_time.strftime("%s")
    seconds_from_start = time.time() - int(start_time)
    event.update({'seconds_from_start': seconds_from_start})

    # Keep only the service events generated after the deployment started;
    # the state machine passes them on for failure analysis.
    _services = ecs.describe_services(
        cluster=cluster_name, services=[service_name])
    service = _services['services'][0]
    service_events = service['events']
    events_since_update = [e for e in service_events if int(
        (e['createdAt']).strftime("%s")) > int(start_time)]
    # Drop the datetime objects so the result stays JSON-serializable.
    for e in events_since_update:
        e.pop('createdAt')
    event.update({"events": events_since_update})
    return event

 

7. Under Basic Settings, set Timeout to 30 seconds.

 

8.  Choose Save.
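The input to this function is the UpdateService API call event recorded by CloudTrail and delivered through CloudWatch Events. For clarity, here is a trimmed, hypothetical sketch of that input showing only the fields the function reads; real events carry many more fields:

# Hypothetical, trimmed UpdateService event as passed to the state machine.
# Only the keys that ECSEventCollector reads are shown.
sample_event = {
    "detail": {
        "eventTime": "2019-01-07T10:15:00Z",
        "requestParameters": {
            "cluster": "AutoRollbackTestCluster",
            "service": "Nginx-Web-Service",
        },
    }
}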

Creating the Lambda function to calculate failure percentage

Next, we create a Lambda function that calculates the failure percentage based on the number of failed container health checks derived from the event stream.

 

1.     On the Lambda console, choose Create function.

2.     For Name, enter ECSFailureCalculator.

3.     For Runtime, choose Python 3.6.

4.     For Existing role, choose the ECSRollbackLambdaRole IAM role that you created.

5.     Choose Create function.

6.     On the Configuration view, under Function code, enter the following code.

 

import re

# Application Load Balancer health check failure messages.
lb_hc_regex = re.compile(r"\(service (.*)?\) \(instance (i-[a-z0-9]{7,17})\) \(port ([0-9]{4,5})\) is unhealthy in \(target-group (.*)?\) due to \((.*)?: \[(.*)\]\)")
# Docker (container-level) health check failure messages.
docker_hc_regex = re.compile(r"\(service (.*)?\) \(task ([a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12})\) failed container health checks\.")
# Messages announcing how many tasks the service scheduler started.
task_registration_formats = [r"\(service (.*)?\) has started ([0-9]{1,9}) tasks: (.*)\."]


def lambda_handler(event, context):
    cluster_name = event['detail']['requestParameters']['cluster']
    service_name = event['detail']['requestParameters']['service']

    messages = [m['message'] for m in event['events']]
    failures = get_failure_messages(messages)
    registrations = get_registration_messages(messages)
    failure_percentage = get_failure_percentage(failures, registrations)
    print("Failure Percentage = {}".format(failure_percentage))
    return {"failure_percentage": failure_percentage, "service_name": service_name, "cluster_name": cluster_name}


def get_failure_percentage(failures, registrations):
    no_of_failures = len(failures)
    # Each registration match carries the task count as its second group.
    no_of_registrations = sum([float(x[0][1]) for x in registrations])
    return no_of_failures / no_of_registrations * 100 if no_of_registrations > 0 else 0


def get_failure_messages(messages):
    failures = []
    for message in messages:
        if lb_hc_regex.findall(message):
            failures.append(lb_hc_regex.findall(message))
        if docker_hc_regex.findall(message):
            failures.append(docker_hc_regex.findall(message))
    return failures


def get_registration_messages(messages):
    registrations = []
    for message in messages:
        for registration_format in task_registration_formats:
            if re.findall(registration_format, message):
                registrations.append(re.findall(registration_format, message))
    return registrations

7.     Under Basic Settings, set Timeout to 30 seconds.

8.     Choose Save.
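To sanity-check the regular expressions locally, you can feed the handler a synthetic event. The following is a minimal sketch that assumes the code above is saved as a module named failure_calculator.py (a hypothetical file name); one Docker health check failure against three started tasks yields a failure percentage of about 33%:

from failure_calculator import lambda_handler  # hypothetical module name

event = {
    "detail": {"requestParameters": {"cluster": "AutoRollbackTestCluster",
                                     "service": "Nginx-Web-Service"}},
    "events": [
        # One Docker health check failure (the task ID is made up).
        {"message": "(service Nginx-Web-Service) (task 12345678-1234-1234-1234-123456789012) failed container health checks."},
        # One registration event reporting three started tasks.
        {"message": "(service Nginx-Web-Service) has started 3 tasks: (task 12345678-1234-1234-1234-123456789012)."},
    ],
}

result = lambda_handler(event, None)
print(result["failure_percentage"])  # roughly 33.3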

Creating the Lambda function to roll back a deployment

Next, we create a Lambda function to roll back an Amazon ECS deployment.

 

1.     On the Lambda console, choose Create function.

2.     For Name, enter ECSRollbackfunction.

3.     For Runtime, choose Python 3.6.

4.     For Existing role, choose the ECSRollbackLambdaRole IAM role that you created.

5.     Choose Create function.

6.     On the Configuration view, under Function code, enter the following code.

 

import boto3

ecs = boto3.client('ecs', region_name='us-west-2')

def lambda_handler(event, context):
    service_name = event['service_name']
    cluster_name = event['cluster_name']

    # Look up the task definition the service is currently running,
    # then point the service back at the previous revision.
    _services = ecs.describe_services(cluster=cluster_name, services=[service_name])
    task_definition = _services['services'][0]['taskDefinition']
    previous_task_definition = get_previous_task_definition(task_definition)

    ecs.update_service(cluster=cluster_name, service=service_name, taskDefinition=previous_task_definition)
    print("Rollback Complete")
    return {"Rollback": True}

def get_previous_task_definition(task_definition):
    # Task definition ARNs end in ":<revision>"; decrement it by one.
    previous_version_number = str(int(task_definition.split(':')[-1])-1)
    previous_task_definition = ':'.join(task_definition.split(':')[:-1]) + ':' + previous_version_number
    return previous_task_definition


7.     Under Basic Settings, set Timeout to 30 seconds.

8.     Choose Save.
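The rollback logic relies on the convention that task definition ARNs end in a numeric revision, which increases by one with every new registration. Here is a quick illustration of get_previous_task_definition, assuming the code above is importable as rollback_function.py (a hypothetical file name) and using a placeholder account ID:

from rollback_function import get_previous_task_definition  # hypothetical module name

arn = "arn:aws:ecs:us-west-2:123456789012:task-definition/Web-Service-Definition:5"
print(get_previous_task_definition(arn))
# arn:aws:ecs:us-west-2:123456789012:task-definition/Web-Service-Definition:4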

Creating the Step Functions state machine

Next, we create a Step Functions state machine that performs the following steps:

 

1.     Collect events of a specified service for a specified duration from the event stream of the Amazon ECS cluster.

2.     Calculate the percentage of failures after the deployment.

3.     If the failure percentage is greater than a specified threshold, roll back the service to the previous task definition.

 

To create the state machine:

1.     Open the Step Functions console and choose Create state machine.

2.     For Name, enter ECSAutoRollback.

3.     For IAM role, keep the default selection of Create a role for me and select the check box. This creates a new IAM role with the permissions necessary to execute the state machine.

Note
If you have already created a Step Functions state machine, IAM Role is populated.

4.     For State machine definition, enter the following code, replacing the Amazon Resource Name (ARN) placeholders with the ARNs of the three Lambda functions that you created.

{
    "StartAt": "VerifyClusterAndService",
    "States":
    {
        "VerifyClusterAndService":
        {
            "Type": "Choice",
            "Choices": [
            {
                "And": [
                {
                    "Variable": "$.detail.requestParameters.cluster",
                    "StringEquals": "AutoRollbackTestCluster"
                },
                {
                    "Variable": "$.detail.requestParameters.service",
                    "StringEquals": "Nginx-Web-Service"
                }],
                "Next": "GetTasksStatus"
            },
            {
                "Not":
                {
                    "And": [
                    {
                        "Variable": "$.detail.requestParameters.cluster",
                        "StringEquals": "AutoRollbackTestCluster"
                    },
                    {
                        "Variable": "$.detail.requestParameters.service",
                        "StringEquals": "Nginx-Web-Service"
                    }]
                },
                "Next": "EndState"
            }]
        },
        "GetTasksStatus":
        {
            "Type": "Task",
            "Resource": "<ARN-of-ECSEventCollector-Lambda-Function>",
            "Next": "WaitForInterval"
        },
        "WaitForInterval":
        {
            "Type": "Wait",
            "Seconds": 5,
            "Next": "IntervalCheck"
        },
        "IntervalCheck":
        {
            "Type": "Choice",
            "Choices": [
            {
                "Variable": "$.seconds_from_start",
                "NumericGreaterThan": 300,
                "Next": "FailureCalculator"
            },
            {
                "Variable": "$.seconds_from_start",
                "NumericLessThan": 300,
                "Next": "GetTasksStatus"
            }]
        },
        "FailureCalculator":
        {
            "Type": "Task",
            "Resource": "<ARN-of-ECSFailureCalculator-Lambda-Function-here>",
            "Next": "RollbackDecider"
        },
        "RollbackDecider":
        {
            "Type": "Choice",
            "Choices": [
            {
                "Variable": "$.failure_percentage",
                "NumericGreaterThan": 10,
                "Next": "RollBackDeployment"
            },
            {
                "Variable": "$.failure_percentage",
                "NumericLessThan": 10,
                "Next": "EndState"
            }]
        },
        "RollBackDeployment":
        {
            "Type": "Task",
            "Resource": "<ARN-of-ECSRollbackFunction-Lambda-Function-here>",
            "Next": "EndState"
        },
        "EndState":
        {
            "Type": "Succeed"
        }
    }
}

5.     Choose Create state machine.

 

Now we have a mechanism that rolls back a deployment to a specific Amazon ECS service if the error percentage after the deployment exceeds a configurable threshold.

(Optional) Monitoring and rolling back all services in the Amazon ECS cluster

Step Functions hard-codes the Amazon ECS service name in the state machine so that you monitor only a specific service in the cluster. The following image shows these lines in the state machine’s definition.

If you want to monitor all services and automatically roll back any Amazon ECS deployment in the cluster based on failures, modify the state machine definition to verify only the cluster name, not the service name. To do this, remove the service-name check in the definition, as shown in the following image.

The following code verifies only the cluster name. It monitors any Amazon ECS service and performs a rollback if there are errors.

{
    "StartAt": "VerifyClusterAndService",
    "States":
    {
        "VerifyClusterAndService":
        {
            "Type": "Choice",
            "Choices": [
            {
                "Variable": "$.detail.requestParameters.cluster",
                "StringEquals": "AutoRollbackTestCluster",
                "Next": "GetTasksStatus"
            },
            {
                "Not":
                {
                    "Variable": "$.detail.requestParameters.cluster",
                    "StringEquals": "AutoRollbackTestCluster"
                },
                "Next": "EndState"
            }]
        },
        "GetTasksStatus":
        {
            "Type": "Task",
            "Resource": "<ARN-of-ECSEventCollector-Lambda-Function>",
            "Next": "WaitForInterval"
        },
        "WaitForInterval":
        {
            "Type": "Wait",
            "Seconds": 5,
            "Next": "IntervalCheck"
        },
        "IntervalCheck":
        {
            "Type": "Choice",
            "Choices": [
            {
                "Variable": "$.seconds_from_start",
                "NumericGreaterThan": 300,
                "Next": "FailureCalculator"
            },
            {
                "Variable": "$.seconds_from_start",
                "NumericLessThan": 300,
                "Next": "GetTasksStatus"
            }]
        },
        "FailureCalculator":
        {
            "Type": "Task",
            "Resource": "<ARN-of-ECSFailureCalculator-Lambda-Function-here>",
            "Next": "RollbackDecider"
        },
        "RollbackDecider":
        {
            "Type": "Choice",
            "Choices": [
            {
                "Variable": "$.failure_percentage",
                "NumericGreaterThan": 10,
                "Next": "RollBackDeployment"
            },
            {
                "Variable": "$.failure_percentage",
                "NumericLessThan": 10,
                "Next": "EndState"
            }]
        },
        "RollBackDeployment":
        {
            "Type": "Task",
            "Resource": "<ARN-of-ECSRollbackFunction-Lambda-Function-here>",
            "Next": "EndState"
        },
        "EndState":
        {
            "Type": "Succeed"
        }
    }
}

 

Configuring the state machine to execute automatically upon Amazon ECS deployment

Next, we configure a trigger for the state machine so that its execution automatically starts when there is an Amazon ECS deployment. We use Amazon CloudWatch to configure the trigger.

 

1.     Open the CloudWatch console and choose Rules in the navigation pane.

2.     Choose Create rule and use the following settings:

• Event Source

  • Service Name: EC2 Container Service (ECS)

  • Event Type: AWS API Call via CloudTrail

  • Operation: choose Specific operation(s) and enter UpdateService

• Targets

  • Step Functions state machine

  • State machine: ECSAutoRollback

3.     Choose Configure details.

4.     For Name, enter ECSServiceUpdateRule.

5.     For Description, enter an appropriate description.

6.     For State, make sure that Enabled is selected.

7.     Choose Create rule.

 

Setting up the CloudWatch trigger is the last step in linking the Amazon ECS UpdateService events to the Step Functions state machine that we set up. With this step complete, we can move on to testing the solution.
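If you prefer to script this trigger, the same rule can be created with boto3. The following is a minimal sketch; the state machine ARN and the IAM role that allows CloudWatch Events to start the execution (states:StartExecution) are placeholders you must supply:

import json
import boto3

events = boto3.client("events", region_name="us-west-2")

# Match ECS UpdateService API calls recorded by CloudTrail.
events.put_rule(
    Name="ECSServiceUpdateRule",
    EventPattern=json.dumps({
        "source": ["aws.ecs"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {"eventSource": ["ecs.amazonaws.com"],
                   "eventName": ["UpdateService"]},
    }),
    State="ENABLED",
)

# Point the rule at the state machine (placeholder ARNs).
events.put_targets(
    Rule="ECSServiceUpdateRule",
    Targets=[{
        "Id": "1",
        "Arn": "arn:aws:states:us-west-2:<account-id>:stateMachine:ECSAutoRollback",
        "RoleArn": "arn:aws:iam::<account-id>:role/<events-to-stepfunctions-role>",
    }],
)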

Testing the solution

Let’s update the task definition and force a failure of the container health checks so that we can confirm that the deployment rollback occurs as expected.

 

To test the solution:

 

1.     Open the Amazon ECS console and choose Task Definitions in the navigation pane.

2.     Select the check box next to Web-Service-Definition and choose Create new revision.

3.     Under Container Definitions, choose Web-Service-Container.

4.     On the Edit container pane, under Healthcheck, update Command to

CMD-SHELL, wget http://localhost/does-not-exist.html && rm index.html || exit 1 

and choose Update.

5.     Choose Create. This creates the task definition revision.

6.     Open the Nginx-Web-Service page of the Amazon ECS console and choose Update.

7.     For Task Definition, select the latest revision.

8.    Keep the default values for the rest of the settings by choosing Next Step until you reach Review.

9.     Choose Update Service. This creates a new Amazon ECS deployment.

This service update triggers the CloudWatch rule, which in turn triggers the state machine. The state machine collects the Amazon ECS events for 300 seconds. If the percentage of errors due to health check failures is more than 10%, the deployment is automatically rolled back. You can verify this on the Step Functions console. On the Executions view, you should see a new execution triggered by the deployment, as shown in the following image.

Choose the execution to see the workflow in progress. After the workflow is complete, you can check the outcome of the workflow by choosing EndState in Visual Workflow. The output should show {“Rollback”: true}.

You can also verify in the service details that the service has been updated with the previous version of the task definition.

Conclusion

With this solution, you can detect issues with Amazon ECS deployments early and automate failure responses. You can also integrate the solution into your existing systems by triggering an Amazon SNS notification to send email or SMS instead of rolling back the deployment automatically. Though this post uses the EC2 launch type, you can follow similar steps to get automatic rollback for AWS Fargate.

If you want to customize the duration for monitoring your deployments before deciding to roll back, or the error percentage threshold beyond which a rollback should be triggered, modify the values highlighted in the following image of the state machine definition.

Measuring service chargeback in Amazon ECS

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/measuring-service-chargeback-in-amazon-ecs/

Contributed by Subhrangshu Kumar Sarkar, Sr. Technical Account Manager, and Shiva Kumar Subramanian, Sr. Technical Account Manager

Amazon Elastic Container Service (ECS) users have been asking us for a way to allocate cost to the deployed services in a shared Amazon ECS cluster. This blog post can help customers think through different techniques to allocate costs incurred by running Amazon ECS services to owners who include specific teams or individual users. The post dives in to one technique that gives customers a granular way to allocate costs to Amazon ECS service owners.

Amazon ECS pricing models

Amazon ECS has two pricing models.  In the Amazon EC2 launch type model, you pay for the AWS resources (e.g., Amazon EC2 instances or Amazon EBS volumes) that you create to store and run your application. Right now, it’s difficult to calculate the aggregate cost of an Amazon ECS service that consists of multiple tasks. In the AWS Fargate launch type model, you pay for vCPU and memory resources that your containerized application requests. Although the user knows the cost that the tasks incur, there is no out-of-box way to associate that cost to a service.

Possible solutions

There are two possible solutions to this problem.

A. Billing based on the usage of container instances in a partitioned cluster.

One solution for service chargeback is to associate specific container instances with respective teams or customers. Then use task placement constraints to restrict the services that they deploy to only those container instances. The following image shows how this solution works.

Here, user A is allowed to deploy services only on the blue container instances, and user B only on the green ones. Both users can be charged based on the AWS resources they use, for example, the EC2 instances, the Application Load Balancer, and so on.

This solution is useful when you don’t want to host services from different teams or users on the same set of container instances. However, even though the Amazon ECS cluster is shared, end users are still charged for the Amazon EC2 instances and other AWS resources that they use rather than for the exact vCPU and memory resources that their services consume. The disadvantage to this approach is that you could provision excess capacity for your users and end up wasting resources. You also need to use placement constraints in all of your task definitions.

B. Billing based on resource usage at the task level.

Another solution could be to develop a mechanism to let the Amazon ECS cluster owners calculate the aggregate cost of an Amazon ECS service that consists of multiple tasks. The solution would have a metering mechanism and a chargeback measurement. When deployed for Amazon EC2 launch type tasks, the metering mechanism tracks the vCPU and memory that Amazon ECS reserves in the tasks’ lifetime. Then, with the chargeback measurement, the cluster owner can associate a cost with these tasks based on the cost incurred by the container instances that they’re running on. The following image shows how this solution works.

Here, unlike the previous solution, both users can use all the container instances of the ECS cluster.

With this solution, customers can start using a shared Amazon ECS cluster to deploy their tasks on any of the container instances. After the solution has been deployed, the cost for a service can be calculated at any point in time, using the cluster and the service name as input parameters.

With Fargate tasks, the vCPU and memory usage details are already available in vCPU-hours and GB-hours, respectively. The chargeback measurement in the solution aggregates the CPU and memory reservation of all the tasks that ever ran as part of a service. It associates a cost with this aggregated CPU and memory reservation by multiplying it by Fargate’s per-vCPU-per-hour and per-GB-per-hour prices, respectively.

This solution has the following considerations:

  • Amazon EC2 pricing: For the base price of the container instance, we’re considering the On-Demand price.
  • Platform costs: Common costs for the cluster (the Amazon EBS volume that the containers are launched from, Amazon ECR, etc.) are treated as the platform cost for all of the services running on the cluster.
  • Networking cost: When you’re using bridge or host networking, there is no mechanism to divide costs among different tasks that are launched on the container instance.
  • Elastic Load Balancing or Application Load Balancer costs: If services sit behind multiple target groups of an Application Load Balancer, there is no direct way of dividing costs per target group.

Solution components

The solution has two components: a metering mechanism and a chargeback measurement.

The metering mechanism consists of the following parts:

  • Amazon CloudWatch Events rule
  • AWS Lambda function
  • Amazon DynamoDB table

The chargeback measurement consists of the following parts:

  • Python script
  • AWS Price List Service API

Metering mechanism

The following image shows the architecture of the solution’s metering mechanism.

The metering mechanism is deployed and works as follows.

  1. The user creates a CloudWatch Events rule to trigger a Lambda function on an Amazon ECS task state change event. Typically, a task state change event is generated with a call to the StartTask, RunTask, or StopTask API operations, or when an Amazon ECS service scheduler starts or stops a task.
  2. The user creates a DynamoDB table, which the Lambda function can update.
  3. Every time the Lambda function is invoked, it updates the DynamoDB table with details of the Amazon ECS task.
With the first run of the metering mechanism, it takes stock of all running Amazon ECS tasks across all services across all clusters. This data resides in DynamoDB from then on, and the solution’s chargeback measurement uses it.

Chargeback measurement

The following image shows the architecture of the chargeback measurement.

When you need to find the cost associated with a service, run the ecs-chargeback Python script with the cluster and service names as parameters. This script performs the following actions.

  1. Find all the tasks that have ever run or are currently running as part of the service.
  2. For each task, calculate the up time.
  3. For each task, find the container instance type (for Amazon EC2 type tasks).
  4. Find what percentage of the host’s compute or memory resources the task has reserved. If there is no task-level CPU reservation for Amazon EC2 launch type tasks, a CPU reservation of 128 CPU shares (0.125 vCPUs) is assumed. In Amazon EC2 launch type tasks, you have to specify memory reservation at the task or container level during creation of the task definition.
  5. Associate that percentage with a cost.
  6. (Optional) Use the following parameters:
    • Duration: By default, the script shows the service cost for its complete uptime. You can use the duration parameter to get the cost for a particular month, the month to date, or the last n days.
    • Weight: This parameter is a weighted fraction that you can use to disproportionately divide the instance cost between vCPU and memory. By default, this value is 0.5.

The vCPU and memory costs are calculated using the following formulas:

  • Task vCPU cost = (task vCPU reservation / total vCPUs in the instance) * (per-second cost of the instance) * (vCPU/memory weight) * task run time in seconds
  • Task memory cost = (task memory reservation / total memory in the instance) * (per-second cost of the instance) * (1 - vCPU/memory weight) * task run time in seconds
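Expressed in code, the two formulas look like the following sketch. The inputs are illustrative assumptions (a made-up instance size and On-Demand price), not output of the actual ecs-chargeback script:

def task_cost(task_vcpu, task_mem_gb, instance_vcpus, instance_mem_gb,
              instance_price_per_hour, runtime_seconds, weight=0.5):
    """Split an instance's cost between a task's vCPU and memory reservations."""
    price_per_second = instance_price_per_hour / 3600.0
    vcpu_cost = (task_vcpu / instance_vcpus) * price_per_second * weight * runtime_seconds
    mem_cost = (task_mem_gb / instance_mem_gb) * price_per_second * (1 - weight) * runtime_seconds
    return vcpu_cost + mem_cost

# Illustrative only: a 0.25 vCPU / 0.5 GB task on a 2 vCPU / 8 GB instance
# priced at a made-up 0.10 USD/hour, running for one day.
print(task_cost(0.25, 0.5, 2, 8, 0.10, 86400))  # about 0.225 USD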

Solution deployment and cost measurement

Here are the steps to deploy the solution in your AWS account and then calculate the service chargeback.

Metering mechanics

1. Create a DynamoDB table named ECSTaskStatus to capture details of an ECS task state change CloudWatch event.

Primary partition key: taskArn. Type: string.

Provision RCUs or WCUs depending on your Amazon ECS usage.

For the rest, keep the default values.

aws dynamodb create-table --table-name ECSTaskStatus \
--attribute-definitions AttributeName=taskArn,AttributeType=S \
--key-schema AttributeName=taskArn,KeyType=HASH \
--provisioned-throughput ReadCapacityUnits=10,WriteCapacityUnits=20

2. Create an IAM policy named LambdaECSTaskStatusPolicy that allows the Lambda function to make the following API calls. Create a local copy of the policy document LambdaECSTaskStatusPolicy.JSON from GitHub.

  • ecs: DescribeContainerInstances
  • dynamodb: BatchGetItem, BatchWriteItem, PutItem, GetItem, and UpdateItem
  • logs: CreateLogGroup, CreateLogStream, and PutLogEvents

aws iam create-policy --policy-name LambdaECSTaskStatusPolicy \
--policy-document file://LambdaECSTaskStatusPolicy.JSON

3. Create an IAM role named LambdaECSTaskStatusRole and attach the policy to the role. Replace <Policy ARN> with the Amazon Resource Name (ARN) of the IAM policy.

aws iam create-role --role-name LambdaECSTaskStatusRole \
--assume-role-policy-document \
'{ "Version": "2012-10-17", "Statement": { "Effect": "Allow", "Principal": {"Service": "lambda.amazonaws.com"}, "Action": "sts:AssumeRole"}}'

aws iam attach-role-policy --policy-arn <Policy ARN> --role-name LambdaECSTaskStatusRole

4. Create a Lambda function named ecsTaskStatus that PUTs or UPDATEs the Amazon ECS task details to the ECSTaskStatus DynamoDB table. This function has the following details:

  • Runtime: Python 3.6
  • Memory setting: 128 MB
  • Timeout: 3 seconds
  • Execution role: LambdaECSTaskStatusRole
  • Code: ecsTaskStatus.py. Use the inline code editor on the Lambda console to author the function.

 

5. Create a CloudWatch Events rule for Amazon ECS task state change events and configure the Lambda function as the target. The function puts or updates items in the ECSTaskStatus DynamoDB table with every Amazon ECS task’s details.

a.     Create the CloudWatch Events rule.

aws events put-rule --name ECSTaskStatusRule \
--event-pattern '{"source": ["aws.ecs"], "detail-type": ["ECS Task State Change"], "detail": {"lastStatus": ["RUNNING", "STOPPED"]}}'

b.     Add the Lambda function as a target to the CloudWatch Events rule. Replace <Lambda ARN> with the ARN of the Lambda function that you created in step 4.

aws events put-targets --rule ECSTaskStatusRule --targets "Id"="1","Arn"="<Lambda ARN>"

c.     Add permissions for CloudWatch Events to invoke Lambda. Replace <CW Events Rule ARN> with the ARN of the CloudWatch Events rule that you created in step 5a.

aws lambda add-permission --function-name ecsTaskStatus \
--action 'lambda:InvokeFunction' --statement-id "LambdaAddPermission" \
--principal events.amazonaws.com --source-arn <CW Events Rule ARN>

The solution invokes the Lambda function only when an Amazon ECS task state change event occurs. Therefore, when the solution is deployed, no event is raised for current running tasks, and task details aren’t populated into the DynamoDB table. If you want to meter current running tasks, you can run the script ecsTaskStatus-FirstRun.py after creation of the DynamoDB table. This populates all running tasks’ details into the DynamoDB table. The script is idempotent.

python ecsTaskStatus-FirstRun.py --region eu-west-1

Chargeback measurement

To find the cost for running a service, run the Python script ecs-chargeback, which has the following usage and arguments.

./ecs-chargeback -h
usage: ecs-chargeback [-h] --region REGION --cluster CLUSTER --service SERVICE
                      [--weight WEIGHT] [-v]
                      [--month MONTH | --days DAYS | --hours HOURS]

optional arguments:
  -h, --help            show this help message and exit
  --region REGION, -r REGION
                        AWS Region in which Amazon ECS service is running.
  --cluster CLUSTER, -c CLUSTER
                        ClusterARN in which Amazon ECS service is running.
  --service SERVICE, -s SERVICE
                        Name of the AWS ECS service for which cost has to be
                        calculated.
  --weight WEIGHT, -w WEIGHT
                        Floating point value that defines CPU:Memory Cost
                        Ratio to be used for dividing EC2 pricing
  -v, --verbose
  --month MONTH, -M MONTH
                        Show charges for a service for a particular month
  --days DAYS, -D DAYS  Show charges for a service for last N days
  --hours HOURS, -H HOURS
                        Show charges for a service for last N hours

 

To calculate the cost that a service incurs with Amazon EC2 launch type tasks, run the script as follows.

./ecs-chargeback -r eu-west-1 -c ecs-chargeback -s nginxsvc

The following is sample output of running this script.

# ECS Region  : eu-west-1, ECS Service Name: nginxsvc
# ECS Cluster : arn:aws:ecs:eu-west-1:675410410211:cluster/ecs-chargeback
#
# Amazon ECS Service Cost           : 26.547270 USD
#             (Launch Type : EC2)
#         EC2 vCPU Usage Cost       : 21.237816 USD
#         EC2 Memory Usage Cost     : 5.309454 USD

To get the chargeback for Fargate launch type tasks, run the script as follows.

./ecs-chargeback -r eu-west-1 -c ecs-chargeback -s fargatesvc

The following is sample output of this script.


# ECS Region  : eu-west-1, ECS Service Name: fargatesvc
# ECS Cluster : arn:aws:ecs:eu-west-1:675410410211:cluster/ecs-chargeback
#
# Amazon ECS Service Cost           : 118.653359 USD
#             (Launch Type : FARGATE)
#         Fargate vCPU Usage Cost   : 78.998157 USD
#         Fargate Memory Usage Cost : 39.655201 USD

Conclusion

This solution can help Amazon ECS users track and allocate costs for their deployed workloads. It might also help them save some costs by letting them share an Amazon ECS cluster among multiple users or teams. We welcome your comments and questions below. Please reach out to us if you would like to contribute to the solution.

Introducing private registry authentication support for AWS Fargate

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/introducing-private-registry-authentication-support-for-aws-fargate/

Private registry authentication support for Amazon Elastic Container Service (Amazon ECS) is now available with the AWS Fargate launch type! Now, in addition to Amazon Elastic Container Registry (Amazon ECR), you can use any private registry or repository of your choice for both EC2 and Fargate launch types.

For ECS to pull from a private repository, it needs a secret in AWS Secrets Manager with your registry credentials, an ECS task execution IAM role in AWS Identity and Access Management (IAM) with a policy granting access to the secret, and a task definition that references both the secret ARN and the task execution IAM role ARN.

Diagram of ECS Private Registry Authentication Architecture

Here’s how to use ECS with a private repository on Docker Hub via the AWS Management Console.

Registry

If you don’t already have a private repository (or account), you can create a free repo now. To follow along, run the following commands in a terminal to pull an image, get the image ID, and push it to your new repository:

docker pull tiffanyfay/space
docker images tiffanyfay/space --format {{.ID}}
docker tag <image-id> <your-username/repository-name>:latest
docker login
docker push <your-username/repository-name>

Secrets Manager

In the Secrets Manager console, store a new secret with your Docker Hub credentials, which is used to access your private repository.

By default, Secrets Manager creates an encryption key, DefaultEncryptionKey, on your behalf. You can instead use an existing key or add a new one with AWS Key Management Service (AWS KMS), if you would prefer.

Choose Other type of secrets and add secret keys and values for username and password.

Next, create a name, such as dockerhub, and description for your secret.

Because the keys correspond to your Docker Hub credentials, leave rotation disabled.

On the next page, you can review your settings and store your secret. Open your new secret to see the details. Write down the Secret ARN value and keep it handy, as it is used in the next step and later, in your task definition.

IAM

Now that you have a secret, you need to provide Fargate permissions to read it. This is done via a task execution IAM role.

In the IAM console, choose Policies, Create policy. Grant read access to the secretsmanager:GetSecretValue action, with your secret’s ARN as the resource.

Name your policy dockerhubsecret.

If you chose to use your own encryption key, you also need to create a policy with kms:Decrypt permissions for KMS.
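Scripted, the same policy can be created with boto3. The following is a minimal sketch; the secret ARN is the placeholder you recorded earlier:

import json
import boto3

iam = boto3.client("iam")

# Allow the task execution role to read only the Docker Hub secret.
iam.create_policy(
    PolicyName="dockerhubsecret",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "secretsmanager:GetSecretValue",
            "Resource": "<secret-ARN>",  # placeholder: your secret's ARN
        }],
    }),
)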

Next, choose Roles to create an IAM role, which is used as your task execution role. Choose AWS service, Elastic Container Service, and then Elastic Container Service Task.

Search for your dockerhubsecret policy and attach it to the role.

Lastly, give the role a name, such as ecsTaskExecutionRoleDockerHub, and create it. Copy the role ARN value. Depending on how you create your task definition, you may need it.

ECS

While the mechanism to authenticate private registries is supported on both EC2 and Fargate launch types, for this example we will be launching a task on Fargate.

Before you can create a task, you need an ECS cluster, VPC, and subnets. If you don’t already have them, in the ECS console, choose Clusters, Get Started. Keep track of the cluster name, VPC ID, and subnet IDs, as you use them soon.

It’s time to create your task definition, which is used to create your task (a grouping of up to ten containers that run on the same host). This is where you need your Secrets Manager ARN and IAM role name.

Choose Task Definitions, Create new Task Definition, and select the Fargate launch type. You can then configure your task definition via the wizard or scroll down, choose Configure via JSON, and paste the following task definition after replacing the fields in angle brackets. This task definition also works with the EC2 launch type.

{
    "family": "space-td",
    "containerDefinitions": [
        {
            "name": "space",
            "image": "<your-username/repository-name>",
            "portMappings": [
                {
                    "protocol": "tcp",
                    "containerPort": 80
                }
            ],
            "cpu": 0,
            "repositoryCredentials": {
                "credentialsParameter": "<secret-ARN>"
            }
        }
    ],
    "memory": "512",
    "cpu": "256",
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "networkMode": "awsvpc",
    "executionRoleArn": "<execution-role-ARN>"
}

If you use the wizard, give your task a name, such as space-td, and specify your task execution IAM role (ecsTaskExecutionRoleDockerHub), a task size of 0.5 GB of memory, and 0.25 vCPU.

Next, choose Container Definitions, Add container. Give the container a name, specify your image <your-username/repository-name>, check the box for private registry authentication, and add your Secrets Manager ARN and container port 80. Choose Add.

After you create your task definition, choose Actions, Run Task. Specify the Fargate launch type, your cluster, cluster VPC, subnets, and a security group with inbound permissions for your container ports (the default one provides access to port 80). Enable auto-assigning a public IP address.
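The console steps above map to a single RunTask API call. A minimal boto3 sketch; the cluster name, subnet ID, and security group ID are placeholders for the values you noted earlier:

import boto3

ecs = boto3.client("ecs")

ecs.run_task(
    cluster="<your-cluster-name>",                  # placeholder
    taskDefinition="space-td",
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["<subnet-id>"],                 # placeholder
            "securityGroups": ["<security-group-id>"],  # placeholder; must allow port 80
            "assignPublicIp": "ENABLED",
        }
    },
)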

Open the task from its ID to see the details:

When the Last status field is RUNNING, under Network, copy the public IP address and paste it in a browser.

If you pushed tiffanyfay/space to your repository, you should see the following:

I hope this post has helped you. If you have any questions, feel free to reach out!

-tiffany

Special thanks to Yuling Zhou, Deepak Dayama, Derek Petersen, Varun Iyer, Adnan Khan and several others for their insights in this blog.

tiffany jernigan

@tiffanyfayj
Tiffany is a developer advocate at Amazon for containers on AWS. Previously she worked at Docker and Intel in software engineering and as a hardware engineer after graduating from Georgia Tech in Electrical Engineering. In the majority of her free time she dabbles in photography and spends time with family and friends. You can find her on twitter/ig as tiffanyfayj.

Compute Abstractions on AWS: A Visual Story

Post Syndicated from Massimo Re Ferre original https://aws.amazon.com/blogs/architecture/compute-abstractions-on-aws-a-visual-story/

When I joined AWS last year, I wanted to find a way to explain, in the easiest way possible, all the options it offers to users from a compute perspective. There are many ways to peel this onion, but I want to share a “visual story” that I have created.

I define the compute domain as “anything that has CPU and Memory capacity that allows you to run an arbitrary piece of code written in a specific programming language.” Your mileage may vary in how you define it, but this is broad enough that it should cover a lot of different interpretations.

A key part of my story is around the introduction of different levels of compute abstractions this industry has witnessed in the last 20 or so years.

Separation of duties

The start of my story is a line. In a cloud environment, this line defines the perimeter between the consumer role and the provider role. In the cloud, there are things that AWS will do and things that the consumer will do. The perimeter of these responsibilities varies depending on the services you opt to use. If you want to understand more about this concept, read the AWS Shared Responsibility Model documentation.

The different abstraction levels

The reason why the line above is oblique is because it needs to intercept different compute abstraction levels. If you think about what happened in the last 20 years of IT, we have seen a surge of different compute abstractions that changed the way people consume CPU and Memory resources. It all started with physical (x86) servers back in the 80s, and then we have seen the industry adding abstraction layers over the years (for example, hypervisors, containers, functions).

The higher you go in the abstraction levels, the more the cloud provider can add value and can offload the consumer from non-strategic activities. A lot of these activities tend to be “undifferentiated heavy lifting.” We define this as something that AWS customers have to do but that don’t necessarily differentiate them from their competitors (because those activities are table-stakes in that particular industry).

What we found is that supporting millions of customers on AWS requires a certain degree of flexibility in the services we offer because there are many different patterns, use cases, and requirements to satisfy. Giving our customers choices is something AWS always strives for.

A couple of final notes before we dig deeper. The way this story builds up through the blog post is aligned to the progression of the launch dates of the various services, with a few noted exceptions. Also, the services mentioned are all generally available and production-grade. For full transparency, the integration among some of them may still be work-in-progress, which I’ll call out explicitly as we go.

The instance (or virtual machine) abstraction

This is the very first abstraction we introduced on AWS back in 2006. Amazon Elastic Compute Cloud (Amazon EC2) is the service that allows AWS customers to launch instances in the cloud. When customers intercept us at this level, they retain responsibility of the guest operating system and above (middleware, applications, etc.) and their lifecycle. AWS has the responsibility for managing the hardware and the hypervisor including their lifecycle.

At the very same level of the stack there is also Amazon Lightsail, which “is the easiest way to get started with AWS for developers, small businesses, students, and other users who need a simple virtual private server (VPS) solution. Lightsail provides developers compute, storage, and networking capacity and capabilities to deploy and manage websites and web applications in the cloud.”

And this is how these two services appear in our story:

The container abstraction

With the rise of microservices, a new abstraction took the industry by storm in the last few years: containers. Containers are not a new technology, but the rise of Docker a few years ago democratized access. You can think of a container as a self-contained environment with soft boundaries that includes both your own application as well as the software dependencies to run it. Whereas an instance (or VM) virtualizes a piece of hardware so that you can run dedicated operating systems, a container technology virtualizes an operating system so that you can run separated applications with different (and often incompatible) software dependencies.

And now the tricky part. Modern containers-based solutions are usually implemented in two main logical pieces:

  • A containers control plane that is responsible for exposing the API and interfaces to define, deploy, and lifecycle containers. This is also sometimes referred to as the container orchestration layer.
  • A containers data plane that is responsible for providing capacity (as in CPU/Memory/Network/Storage) so that those containers can actually run and connect to a network. From a practical perspective this is typically a Linux host or less often a Windows host where the containers get started and wired to the network.

Arguably, in a specific compute abstraction discussion, the data plane is key, but it is as important to understand what’s happening for the control plane piece.

In 2014, Amazon launched a production-grade containers control plane called Amazon Elastic Container Service (ECS), which “is a highly scalable, high performance container management service that supports Docker … Amazon ECS eliminates the need for you to install, operate, and scale your own cluster management infrastructure.”

In 2017, Amazon also announced the intention to release a new service called Amazon Elastic Container Service for Kubernetes (EKS) based on Kubernetes, a successful open source containers control plane technology. Amazon EKS was made generally available in early June 2018.

Just like for ECS, the aim for this service is to free AWS customers from having to manage a containers control plane. In the past, AWS customers would spin up EC2 instances and deploy/manage their own Kubernetes masters (masters is the name of the Kubernetes hosts running the control plane) on top of an EC2 abstraction. However, we believe many AWS customers will leave to AWS the burden of managing this layer by either consuming ECS or EKS, depending on their use cases. A comparison between ECS and EKS is beyond the scope of this blog post.

You may have noticed that what we have discussed so far is about the containers control plane. What about the containers data plane? This is typically a fleet of EC2 instances managed by the customer. In this particular setup, the containers control plane is managed by AWS while the containers data plane is managed by the customer. One could argue that, with ECS and EKS, we have raised the abstraction level for the control plane, but we have not yet really raised the abstraction level for the data plane, as the data plane is still composed of regular EC2 instances that the customer has responsibility for.

There is more on that later on but, for now, this is how the containers control plane and the containers data plane services appear:

The function abstraction

At re:Invent 2014, AWS introduced another abstraction layer: AWS Lambda. Lambda is an execution environment that allows an AWS customer to run a single function. So instead of having to manage and run a full-blown OS instance to run your code, or having to track all software dependencies in a user-built container to run your code, Lambda allows you to upload your code and let AWS figure out how to run it at scale.

What makes Lambda so special is its event-driven model. Not only can you invoke Lambda directly (for example, via the Amazon API Gateway), but you can trigger a Lambda function upon an event in another AWS service (for example, an upload to Amazon S3 or a change in an Amazon DynamoDB table).

The key point about Lambda is that you don’t have to manage the infrastructure underneath the function you are running. No need to track the status of the physical hosts, no need to track the capacity of the fleet, no need to patch the OS where the function will be running. In a nutshell, no need to spend time and money on the undifferentiated heavy lifting.

And this is how the Lambda service appears:

The bare metal abstraction

Also known as the “no abstraction.”

As recently as re:Invent 2017, we announced (the preview of) the Amazon EC2 bare metal instances. We made this service generally available to the public in May 2018.

This announcement is part of Amazon’s strategy to provide choice to our customers. In this case, we are giving customers direct access to hardware. To quote from Jeff Barr’s post:

“…. (AWS customers) wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.”

This is how the bare metal Amazon EC2 i3.metal instance appears:

As a side note, and also as alluded to by Jeff, i3.metal is the foundational EC2 instance type on top of which VMware created their own VMware Cloud on AWS service. We are now offering the ability to any AWS user to provision bare metal instances. This doesn’t necessarily mean you can load your hypervisor of choice out of the box, but you can certainly do things you wouldn’t be able to do with a traditional EC2 instance (note: this was just a Saturday afternoon hack).

More seriously, a question I get often asked is whether users could install ESXi on i3.metal on their own. Today this cannot be done, but I’d be interested in hearing your use case for this.

The full container abstraction (for lack of a better term)

Now that we covered all the abstractions, it is time to go back and see if there are other optimizations we can provide for AWS customers. When we discussed the container abstraction, we called out that while there are two different fully managed containers control planes (ECS and EKS), there wasn’t a managed option for the data plane.

Some customers were (and still are) happy about being in full control of said instances. Others have been very vocal that they wanted to get out of the (undifferentiated heavy-lifting) business of managing the lifecycle of that piece of infrastructure.

Enter AWS Fargate, a production-grade service that provides compute capacity to AWS containers control planes. Practically speaking, Fargate is making the containers data plane fall into the “Provider space” responsibility. This means the compute unit exposed to the user is the container abstraction, while AWS will manage transparently the data plane abstractions underneath.

This is how the Fargate service appears:

Now ECS has two “launch types”: one called “EC2” (where your tasks get deployed on a customer-managed fleet of EC2 instances), and the other one called “Fargate” (where your tasks get deployed on an AWS-managed fleet of EC2 instances).

For EKS, the strategy will be very similar, but as of this writing it is not yet available. If you’re interested in some of the exploration being done to make this happen, this is a good read.

Conclusions

We covered the spectrum of abstraction levels available on AWS and how AWS customers can intercept them depending on their use cases and where they sit on their cloud maturity journey. Customers with a “lift & shift” approach may be more inclined to consume services on the left-hand side of the slide, whereas customers with a more mature cloud-native approach may be more interested in consuming services on the right-hand side of the slide.

In general, customers tend to use higher-level services to get out of the business of managing non-differentiating activities. For example, I recently talked to a customer interested in using Fargate. The trigger there was the fact that Fargate is ISO, PCI, SOC, and HIPAA compliant, which was a huge time and money saver for them: it’s easier to point to an AWS compliance document during an audit than to architect, configure, and document a DIY container data plane for compliance.

As a recap, here’s our visual story with all the abstractions available:

I hope you found it useful. Any feedback is greatly appreciated.

About the author

Massimo is a Principal Solutions Architect at AWS. For about 25 years, he specialized in the x86 ecosystem, starting with operating systems and virtualization technologies, and lately he has been head down learning about cloud and how application architectures are evolving in that space. Massimo has a blog at www.it20.info and his Twitter handle is @mreferre.

Hosting ASP.NET Core applications in Amazon ECS using AWS Fargate

Post Syndicated from Sundar Narasiman original https://aws.amazon.com/blogs/compute/hosting-asp-net-core-applications-in-amazon-ecs-using-aws-fargate/

There is an increasing amount of customer interest in hosting microservices-based applications using Amazon Elastic Container Service (ECS), largely due to the benefits offered by AWS Fargate.

AWS Fargate is a compute engine for containers that allows you to run containers without needing to provision, manage, or scale any Amazon EC2 compute infrastructure. Fargate works with Amazon ECS and can run microservices developed in many programming languages or application frameworks, including Java, .NET Core, Python, Node.js, Go, and Ruby on Rails. Nowadays, enterprises building microservices applications with .NET are adopting .NET Core because of its cross-platform support (the ability to run on Linux).

In this post, I cover how to host a cross-platform ASP.NET Core application using AWS Fargate.

Reference architecture

A good reference architecture for AWS Fargate application deployment should cover the VPC, Subnets, Load Balancer, Internet Gateway, Elastic Network Interface (ENI), AWS Fargate Task, Network ACLs, and Security Groups. The architectural choices for VPC Networking, Load Balancing, and Container Networking are also important.

There are a couple of networking approaches for deploying containers in Amazon ECS:

  • Deploy containers in the public VPC Subnet with direct Internet access
  • Deploy containers in the private VPC Subnet without direct Internet access

Because the ASP.NET Core application is going to serve traffic from the Internet, we will deploy containers in the Public VPC Subnet with direct Internet access.

When it comes to sending traffic to containers through the Load Balancer, the following options are available:

  • A public Load Balancer that accepts traffic from the Internet and routes it to the container through the AWS Fargate Task’s Elastic Network Interface (ENI).
  • A private, Internal Load Balancer that only accepts traffic from other containers in the cluster

Because the ASP.NET Core application container lives in the web tier, go with a public Load Balancer. The public Load Balancer accepts traffic from the Internet and routes it to the container through the AWS Fargate Task’s Elastic Network Interface (ENI).

Based on these considerations, the reference architecture for deploying to AWS Fargate should look like this diagram:

This solution deploys containers in a public Subnet (inside a VPC). The AWS Fargate Task and the two containers are hosted with direct access to the internet. They are also accessible to clients through the public Load Balancer.

Walkthrough

To implement this architecture, we will do the following:

  1. Containerize the ASP.NET core application.
  2. Configure the reverse-proxy server.
  3. Containerize the NGINX reverse-proxy server.
  4. Create the Docker Compose file.
  5. Push container images to Amazon ECR.
  6. Create the ECS cluster.
  7. Create an Application Load Balancer.
  8. Create an AWS Fargate Task definition.
  9. Create the Amazon ECS service.

Code examples

The code examples, Dockerfile definition, Docker Compose file, and ECS task definition for this solution are available in the amazon-ecs-fargate-aspnetcore GitHub repository.

Pre-requisites

The development environment needs the following prerequisites:

  • macOS (latest version), Windows 10 with the latest updates, or Ubuntu 16.04 or higher
  • .NET Core 2.0 or higher
  • Docker (latest version)
  • AWS CLI
  • Amazon ECS CLI

Containerize the ASP.NET Core application

The first step in this journey is to containerize the ASP.NET Core application.

If you are using Visual Studio 2017 or later with the latest updates in Windows, you can add container support to the solution. Open the context (right-click) menu for the existing project and add Docker support.

If you are developing in Linux or Mac OS, you must explicitly add a Dockerfile.

The Dockerfile definition should look like the following, irrespective of the operating system used for development.

FROM microsoft/aspnetcore:2.0
WORKDIR /mymvcweb
COPY bin/Release/netcoreapp2.0/publish . 
ENV ASPNETCORE_URLS http://+:5000
EXPOSE 5000
ENTRYPOINT ["dotnet", "mymvcweb.dll"]

This Dockerfile definition creates an application container based on the microsoft/aspnetcore:2.0 base image. It copies the published application from the bin/Release/netcoreapp2.0/publish folder into the working directory, configures Kestrel to listen on port 5000, and starts the application to serve web traffic.
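
Because this Dockerfile copies the already-published output, publish the application before building the image. A minimal sequence (assuming the project folder is mymvcweb) might be:

cd mymvcweb
dotnet publish -c Release
docker build -t mymvcweb .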

By default, ASP.NET Core uses Kestrel as the web server. Kestrel is a lightweight HTTP server and is great for serving dynamic content from ASP.NET Core. However, for capabilities such as serving static content, caching, response compression, and SSL termination, a dedicated reverse-proxy server like NGINX is required.

Configure the reverse-proxy server

NGINX can act as both an HTTP server and a reverse-proxy server. NGINX is widely adopted because of its asynchronous, event-driven architecture, which allows it to serve thousands of concurrent requests with a low memory footprint.

In this solution, deploy an NGINX (reverse-proxy server) container in front of the application (ASP.NET Core) container defined in the AWS Fargate Task.

The reverse-proxy configuration file nginx.conf should be defined as follows:

worker_processes 4;
 
events { worker_connections 1024; }
 
http {
    sendfile on;
 
    upstream app_servers {
        server 127.0.0.1:5000;
    }
 
    server {
        listen 80;
 
        location / {
            proxy_pass         http://app_servers;
            proxy_redirect     off;
            proxy_set_header   Host $host;
            proxy_set_header   X-Real-IP $remote_addr;
            proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header   X-Forwarded-Host $server_name;
        }
    }
}

The NGINX container is set to listen on port 80 and is configured to forward requests to the application container listening on port 5000. The upstream app_servers value in the nginx.conf file must be set to mymvcweb:5000 in the local development environment, where Docker Compose resolves the service name.

Containerize the NGINX reverse-proxy server

To containerize the NGINX reverse-proxy server, create a Dockerfile definition like the following:

FROM nginx
COPY nginx.conf /etc/nginx/nginx.conf

Create the Docker Compose file

Next, use Docker Compose to define these two containers as one microservice in the local development environment. The Docker Compose file should look like the following:

version: '2'
services:
  mymvcweb:
    build:
      context: ./mymvcweb
      dockerfile: Dockerfile
    expose:
      - "5000"
  reverseproxy:
    build:
      context: ./reverseproxy
      dockerfile: Dockerfile
    ports:
      - "80:80"
    links:
      - mymvcweb

These two containers can be built and tested by issuing the following docker-compose commands:

docker-compose build
docker-compose up

Open http://localhost:80 in the browser; it should render the default view of index.cshtml. Whenever there is a change to the application code or container definition, clean the docker-compose cache so that the latest changes take effect. To do this, run the following commands:

docker-compose stop
docker-compose rm
docker rmi <container image id>

Push container images to Amazon ECR

Next, push the container images from the local environment to Amazon Elastic Container Registry (ECR) so that the container images are available in Amazon ECR before the creation of the AWS Fargate cluster.

Before you deploy this application to ECS, the upstream app_servers value in the nginx.conf file must be set to 127.0.0.1:5000. This enables communication with the upstream application container listening on port 5000, because containers in the same Fargate task share a network namespace and can reach each other over localhost.

The first step to push the container images to ECR is to fetch the docker login command with the required security tokens. Run the following command:

aws ecr get-login --no-include-email --region us-east-1

It returns a docker login command with a security token. Copy the command and run it.
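
If the two ECR repositories do not exist yet, create them first (the repository names here match the image tags used below):

aws ecr create-repository --repository-name mymvcweb --region us-east-1
aws ecr create-repository --repository-name reverseproxy --region us-east-1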

The second step is to tag the local container image with the remote ECR repository. Run the following command:

docker tag aspnetcorefargate_mymvcweb:latest <yourawsaccountnumber>.dkr.ecr.us-east-1.amazonaws.com/mymvcweb:latest

The third step is to push the tagged image to the remote ECR registry. Run the following command:

docker push <yourawsaccountnumber>.dkr.ecr.us-east-1.amazonaws.com/mymvcweb:latest

Repeat the above steps for the NGINX container as well. Now the container images are available in ECR.

Create the Amazon ECS cluster

The Amazon ECS cluster is a logical grouping for AWS Fargate and Amazon ECS tasks. The cluster acts as the administrative boundary for running each application.

In the AWS Management Console, navigate to Create Cluster and select Networking only.

Because we’re going to create the Amazon ECS service with AWS Fargate as the launch type, the Amazon ECS cluster is purely a logical boundary. With the Fargate launch type, you don’t create ECS instances when creating the cluster; you only need the required networking constructs, such as the VPC and Subnets.

Name the cluster and select Creation of new VPC for this cluster.

Leave the rest of the fields as their default values. You now have a VPC with two public subnets.
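
If you prefer the AWS CLI, the cluster itself (a logical resource when the launch type is Fargate) can be created with a single call; note that this does not create the VPC and subnets, which the console wizard handles for you:

aws ecs create-cluster --cluster-name aspcorefargatecluster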

Create an Application Load Balancer

Next, create an Application Load Balancer, as defined in the reference architecture. The Application Load Balancer is required to load balance across multiple AWS Fargate tasks.

In the EC2 console, navigate to Create Load Balancer. Name the Load Balancer aspnetcorefargatealb.

For Scheme, select internet-facing. For IP address type, choose ipv4. The Load Balancer listens on port 80 (HTTP). The Load Balancer’s Security Group should also allow traffic on port 80 (HTTP) from the internet.

While configuring the routing for the Load Balancer, for Target type, choose ip. For Protocol, choose HTTP. For Path, enter / (forward slash).

For more information, see Creating an Application Load Balancer.
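
For reference, a hedged CLI sketch of the same Load Balancer setup (the subnet, security group, and VPC IDs are placeholders) looks like this; the target group must use target-type ip for Fargate tasks:

aws elbv2 create-load-balancer --name aspnetcorefargatealb \
    --scheme internet-facing --type application \
    --subnets subnet-aaaa1111 subnet-bbbb2222 \
    --security-groups sg-0123abcd

aws elbv2 create-target-group --name aspnetcorefargatetg \
    --protocol HTTP --port 80 --vpc-id vpc-0123abcd \
    --target-type ip --health-check-path /

aws elbv2 create-listener --load-balancer-arn <load-balancer-arn> \
    --protocol HTTP --port 80 \
    --default-actions Type=forward,TargetGroupArn=<target-group-arn>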

Create an AWS Fargate Task definition

The AWS Fargate Task definition is an important resource; it acts as a blueprint for the AWS Fargate Task. The Task definition defines parameters such as:

  • Container image URL
  • CPU
  • Memory
  • IAM execution role
  • Host port
  • Container port
  • Log configurations
  • Container networking mode
  • Task type
  • Mount point
  • Volume

A Fargate Task is a running instance of a Task definition. Each Task represents a microservice. Tasks can be managed and independently scaled using the AWS Fargate service, which is explained in the upcoming sections.

In the console, choose Task Definitions, Create new Task Definition. For more information, see Creating a Task Definition.

Use the following AWS Fargate Task definition, which is based on the reference architecture defined for this walkthrough. Replace <awsaccount> with your own account ID.

{
  "executionRoleArn": "arn:aws:iam::<awsaccount>:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/aspnetcorefargatetask",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "entryPoint": null,
      "portMappings": [
        {
          "hostPort": 80,
          "protocol": "tcp",
          "containerPort": 80
        }
      ],
      "command": null,
      "linuxParameters": null,
      "cpu": 0,
      "environment": [],
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": null,
      "dockerSecurityOptions": null,
      "memory": null,
      "memoryReservation": 1024,
      "volumesFrom": [],
      "image": "<awsaccount>.dkr.ecr.us-east-1.amazonaws.com/reverseproxy: latest",
      "disableNetworking": null,
      "healthCheck": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "privileged": null,
      "name": "reverseproxy"
    },
    {
      "dnsSearchDomains": null,
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/aspnetcorefargatetask",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "entryPoint": null,
      "portMappings": [
        {
          "hostPort": 5000,
          "protocol": "tcp",
          "containerPort": 5000
        }
      ],
      "command": null,
      "linuxParameters": null,
      "cpu": 0,
      "environment": [],
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": null,
      "dockerSecurityOptions": null,
      "memory": null,
      "memoryReservation": 1024,
      "volumesFrom": [],
      "image": "<awsaccount>.dkr.ecr.us-east-1.amazonaws.com/mymvcweb:latest",
      "disableNetworking": null,
      "healthCheck": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "privileged": null,
      "name": "mymvcweb"
    }
  ],
  "placementConstraints": [],
  "memory": "2048",
  "taskRoleArn": "arn:aws:iam::<awsaccount>:role/aspnetecstaskroles",
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "taskDefinitionArn": "arn:aws:ecs:us-east-1:<awsaccount>:task-definition/aspnetcorefargatetask:1",
  "family": "aspnetcorefargatetask",
  "requiresAttributes": [
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.execution-role-ecr-pull"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.task-eni"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.ecr-auth"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.task-iam-role"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.execution-role-awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
    }
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "revision": 1,
  "status": "ACTIVE",
  "volumes": []
}

The above Task definition contains two containers: the ASP.NET Core application and the NGINX reverse-proxy server. Currently, awsvpc is the only networking mode supported for AWS Fargate Tasks. When an AWS Fargate Task is launched, the ECS container network plugin assigns a dedicated Elastic Network Interface (ENI) to the Task. This ENI does not share the global default network namespace with ECS instances.

You also specify the subnets in which the tasks are placed, and the security groups you specify are applied to the Task’s ENI. This enables communication between two AWS Fargate Tasks, or with other resources within the VPC. Because of the awsvpc network mode, calls from AWS Fargate Tasks do not go through the default Docker bridge network.
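
If you prefer to register this task definition from the CLI rather than the console, save the JSON to a file (the file name is arbitrary), remove the read-only fields that the console adds (taskDefinitionArn, revision, status, requiresAttributes, and compatibilities), and run:

aws ecs register-task-definition --cli-input-json file://aspnetcorefargatetask.json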

Create the Amazon ECS service

An Amazon ECS service runs and maintains a specified number of AWS Fargate tasks; the desired state of the application is defined through the service. For more information, see Create a service.

In the console, choose Task Definitions and select the task definition that you just created.

On the Task Definition [name] page, select the revision of the task definition from which to create your service.

Review the task definition, and choose Actions, Create Service. For Launch type, choose FARGATE. Enter values for the rest of the fields:

  • Platform version: LATEST
  • Cluster: aspcorefargatecluster (or the cluster name you chose)
  • Service name: aspcorefargatesvc (or another name of your choice)
  • Number of tasks: 2
  • Minimum healthy percent: 50
  • Maximum percent: 200

On the Configure networking page, select the required VPC and subnets required for running the tasks.

Register the Application Load Balancer (ALB) that you created. The ECS scheduler has built-in intelligence, which makes it seamless to work with Application Load Balancer (ALB).

Then, configure Service Auto Scaling. Even though this is an optional feature, I recommend enabling service-level scaling. It addresses the key tenets of how a microservice should behave at runtime. For more information, see (Optional) Configuring Your Service to Use Service Auto Scaling.

I’m defining the minimum number of tasks as 2, the desired count as 2, and the maximum as 3.

Complete the Amazon ECS Service creation.
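
As a rough CLI equivalent of these console steps (the subnet, security group, and target group ARN values are placeholders), the service could be created as follows:

aws ecs create-service --cluster aspcorefargatecluster \
    --service-name aspcorefargatesvc \
    --task-definition aspnetcorefargatetask:1 \
    --desired-count 2 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-aaaa1111,subnet-bbbb2222],securityGroups=[sg-0123abcd],assignPublicIp=ENABLED}" \
    --load-balancers targetGroupArn=<target-group-arn>,containerName=reverseproxy,containerPort=80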

When the Amazon ECS Task gets placed, the ECS scheduler registers the Task as a target for the Load Balancer.

When the Task is healthy and passes the Load Balancer health checks, it is reflected in the healthy host count.

Access the DNS ‘A’ record of the Load Balancer in the browser. The ASP.NET core application should render successfully.

Conclusion

In this post, we took an existing ASP.NET Core application, containerized it, and hosted it in Amazon ECS as a microservice using the AWS Fargate compute engine. AWS Fargate gives you a way to run containers directly, without managing any EC2 instances, while still giving you full control over how the task is defined, including task networking and resources.

If you have questions or suggestions, please comment below.

Sundararajan Narasiman is an AWS Partner Solutions Architect.

Refreshing an Amazon ECS Container Instance Cluster With a New AMI

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/refreshing-an-amazon-ecs-container-instance-cluster-with-a-new-ami/

This post contributed by Subhrangshu Kumar Sarkar, Sr. Technical Account Manager at AWS

The Amazon ECS–optimized Amazon Machine Image (AMI) comes prepackaged with the Amazon Elastic Container Service (ECS) container agent, Docker, and the ecs-init service. When updates to these components are released, try to integrate them as quickly as possible. Doing so helps you maintain a safe, secure, and reliable environment for running your containers.

Each release of the ECS–optimized AMI includes bug fixes and feature updates. AWS recommends refreshing your container instance fleet with the latest AMI whenever possible, rather than trying to patch instances in place. Periodic replacement of your ECS instances aligns with the immutable infrastructure paradigm, which is less prone to human error and less susceptible to configuration drift because the infrastructure is managed through code.

In this post, I show you how to manually refresh the container instances in an active ECS cluster with new container instances built from a newly released AMI. You also see how to refresh the ECS instance fleet when it is part of an Auto Scaling group, and when it is not.

Solution Overview

The following flow chart shows the strategy to be used in refreshing the cluster.

Prerequisites

  • An AWS account with enough EC2 headroom to launch as many additional instances as your ECS cluster’s container instance count, on top of the EC2 instances that you already have, during the refresh period. For example, if you have a total of 10 t2.medium instances in an AWS Region where an ECS cluster with four container instances is running, you should be able to launch four more t2.medium instances. Your instance count comes down to 10 again after your old instances are deregistered and terminated at the end of the refresh period.
  • An existing ECS cluster (preferably with one or more container instances built with an old AMI), with or without a service running on it.
  • A Linux system with the AWS CLI and jq installed, so that you can try the programmatic method of refreshing the cluster. You can SSH into an EC2 instance if you do not have local access to a Linux system.
  • An IAM user with permissions to view ECS resources, deregister and terminate the ECS instances, revise a task definition, and update a service.
  • A specified AWS Region. In this post, the cluster is in us-east-1 and that is the region for all AWS CLI commands mentioned.

Use the following steps to test if you have all the resources and permissions to proceed.

Using the AWS CLI

Run the following command:

# aws ecs list-clusters

Sample output:

{
    "clusterArns": [
        "arn:aws:ecs:us-east-1:012345678910:cluster/workshop-app-cluster"
    ]
}

Choose the cluster to refresh. In my case, the cluster name is workshop-app-cluster, with a service named “workshop-service” running on this cluster.

# aws ecs describe-clusters --clusters <cluster name>

Sample output:

{
    "clusters": [
    {
        "status": "ACTIVE",
        "statistics": [],
        "clusterName": "workshop-app-cluster",
        "registeredContainerInstancesCount": 7,
        "pendingTasksCount": 0,
        "runningTasksCount": 3,
        "activeServicesCount": 1,
        "clusterArn": "arn:aws:ecs:us-east-1:012345678910:cluster/workshop-app-cluster"
    }
    ],
    "failures": []
}

Using the AWS Console

  1. Open the Amazon ECS console.
  2. On the clusters page, select the cluster to refresh.

You should be able to see the details of the services, tasks, and the container instance on the respective tabs.

1. Retrieve the latest ECS–optimized AMI metadata

Previously, to make sure that you were using the latest ECS–optimized AMI, you had to either consult the ECS documentation or subscribe to the ECS AMI Amazon SNS topic.

Now, you can query the AWS Systems Manager Parameter Store API to get the latest AMI version ID or a list of available AMI IDs and their corresponding Docker runtime and ECS agent versions. You can query the Parameter Store API using the AWS CLI or any of the AWS SDKs. In fact, you can now use a Systems Manager parameter in AWS CloudFormation to launch EC2 instances with the latest ECS-optimized AMI.

Run the following command:

aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux/recommended --query "Parameters[].Value" --output text | jq .

Sample output:

{
    "schema_version": 1,
    "image_name": "amzn-ami-2017.09.l-amazon-ecs-optimized",
    "image_id": "ami-aff65ad2",
    "os": "Amazon Linux",
    "ecs_runtime_version": "Docker version 17.12.1-ce",
    "ecs_agent_version": "1.17.3"
}

The image_id is the image ID for the latest ECS–optimized AMI in the Region in which you are operating.
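
If you are scripting the refresh, the image ID can be captured into a shell variable with the same command and jq (a small convenience sketch):

IMAGE_ID=$(aws ssm get-parameters \
    --names /aws/service/ecs/optimized-ami/amazon-linux/recommended \
    --query "Parameters[].Value" --output text | jq -r .image_id)
echo $IMAGE_ID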

Note: At the time of publication, querying Parameter Store is not possible through the console.

2. Find all outdated container instances

Use the following steps to find all container instances not built with the latest ECS–optimized AMI, which should be refreshed.

Using the AWS CLI

Run the following command on your ECS cluster with the image_id value that you got from the ssm get-parameters command:

aws ecs list-container-instances --cluster <cluster name> --filter "attribute:ecs.ami-id != <image_id>"

Sample output:

{
    "containerInstanceArns": [
    "arn:aws:ecs:us-east-1:012345678910:container-instance/2db66342-5f69-4782-89a3-f9b707f979ab",
    "arn:aws:ecs:us-east-1:012345678910:container-instance/4649d3ab-7f44-40a2-affb-670637c86aad"
    ]
}

Now, find the corresponding EC2 instance IDs for these container instances. The IDs are then used to find the corresponding Auto Scaling group from which to detach the instances.

aws ecs list-container-instances --cluster <cluster name> --filter "attribute:ecs.ami-id != <image_id>"| \
jq -c '.containerInstanceArns[]' | \
xargs aws ecs describe-container-instances --cluster <cluster name> --container-instances | \
jq '[.containerInstances[]|{(.containerInstanceArn) : .ec2InstanceId}]'

Sample output:

[
    {
        "arn:aws:ecs:us-east-1:012345678910:container-instance/2db66342-5f69-4782-89a3-f9b707f979ab": "i-08e8cfc073db135a9"
    },
    {
        "arn:aws:ecs:us-east-1:012345678910:container-instance/4649d3ab-7f44-40a2-affb-670637c86aad": "i-02dd87a0b28e8575b"
    }
]

An ECS container instance is an EC2 instance that is running the ECS container agent and has been registered into a cluster. In the above sample output:

  • 2db66342-5f69-4782-89a3-f9b707f979ab is the container instance ID
  • i-08e8cfc073db135a9 is an EC2 instance ID

Using the AWS Console

  1. In the ECS console, choose Clusters, select the cluster, and choose ECS Instances.
  2. Select Filter by attributes and choose ecs.ami-id as the attribute on which to filter.
  3. Select an AMI ID that is not the same as the latest AMI ID, in this case ami-aff65ad2.

For all resulting ECS instances, the container instance ID and the EC2 instance IDs are both visible.

3. List the instances that are part of an Auto Scaling group

If your cluster was created with the console first-run experience after November 24, 2015, then the Auto Scaling group associated with the AWS CloudFormation stack created for your cluster can be scaled up or down to add or remove container instances. You can perform this scaling operation from within the ECS console.

Use the following steps to list the outdated ECS instances that are part of an Auto Scaling group.

Using the AWS CLI

Run the following command:

aws autoscaling describe-auto-scaling-instances --instance-ids <instance id #1> <instance id #2>

Sample output:

{
    "AutoScalingInstances": [
    {
        "ProtectedFromScaleIn": false,
        "AvailabilityZone": "us-east-1b",
        "InstanceId": "i-02dd87a0b28e8575b",
        "AutoScalingGroupName": "EC2ContainerService-workshop-app-cluster-EcsInstanceAsg-1IVVUK4CR81X1",
        "HealthStatus": "HEALTHY",
        "LifecycleState": "InService"
    },
    {
        "ProtectedFromScaleIn": false,
        "AvailabilityZone": "us-east-1a",
        "InstanceId": "i-08e8cfc073db135a9",
        "AutoScalingGroupName": "EC2ContainerService-workshop-app-cluster-EcsInstanceAsg-1IVVUK4CR81X1",
        "HealthStatus": "HEALTHY",
        "LifecycleState": "InService"
    }
    ]
}

The response shows that the instances are part of the EC2ContainerService-workshop-app-cluster-EcsInstanceAsg-1IVVUK4CR81X1 Auto Scaling group.

Using the AWS Console

If the ECS cluster was created from the console, you likely have an associated CloudFormation stack. By default, the stack name is EC2ContainerService-cluster_name.

  1. In the CloudFormation console, select the cluster, choose Outputs, and note the corresponding stack for your cluster.
  2. In the EC2 console, choose Auto Scaling groups.
  3. Select the group and check that the EC2 instance IDs for the ECS instance are registered.

4. Create a new Auto Scaling group

If the container instances are not part of any Auto Scaling group, create a new group from one of the existing container instances and then add all other container instances to it. A launch configuration is automatically created for the new Auto Scaling group.

Using the AWS CLI

Run the following command to create an Auto Scaling group using the EC2 instance ID for an existing container instance:

aws autoscaling create-auto-scaling-group --auto-scaling-group-name <auto-scaling-group-name> --instance-id <instance-id> --min-size 0 --max-size 3

Keep the min-size parameter at 0 and set max-size to a value greater than the number of instances that you are going to add to this Auto Scaling group.

At this point, your Auto Scaling group does not contain any instances, and it only has the subnet and Availability Zone of the instance from which you created it, not those of the other old instances. To add all the old instances (including the one from which the Auto Scaling group was created) to this Auto Scaling group, find the subnets and Availability Zones to which they are attached.

Run the following commands:

aws ec2 describe-instances --instance-ids <instance-id> --query "Reservations[].Instances[].NetworkInterfaces[].SubnetId" --output text

aws ec2 describe-instances --instance-ids <instance-id> --query "Reservations[].Instances[].Placement.AvailabilityZone" --output text

After you have all the Availability Zones and subnets to be added to the Auto Scaling group, run the following command to update the Auto Scaling group:

aws autoscaling update-auto-scaling-group --vpc-zone-identifier <subnet-1>,<subnet-2> --auto-scaling-group-name <auto-scaling-group-name> --availability-zones <availability-zone1> <availability-zone2>

You are now ready to add all the old instances to this Auto Scaling group. Run the following command:

aws autoscaling attach-instances --instance-ids <instance-id 1> <instance-id 2> --auto-scaling-group-name <auto-scaling-group-name>

Now, all existing container instances are part of an Auto Scaling group, which is attached to a launch configuration capable of launching instances with the old AMI.

When you attach instances, Auto Scaling increases the desired capacity of the group by the number of instances being attached.

Using the AWS Console

To create an Auto Scaling group from an existing container instance, do the following steps:

  1. In the ECS console, on the EC2 Instances tab, open the EC2 instance ID for the container instance.
  2. Select the instance and choose Actions, Instance Settings, and Attach to Auto Scaling Group.
  3. On the Attach to Auto Scaling Group page, select a new Auto Scaling group, enter a name for the group, and then choose Attach.

The new Auto Scaling group is created using a new launch configuration with the same name that you specified for the Auto Scaling group. The launch configuration gets its settings (for example, security group and IAM role) from the instance that you attached. The Auto Scaling group also gets settings (for example, Availability Zone and subnet) from the instance that you attached, and has a desired capacity and maximum size of 1.

Now that you have an Auto Scaling group and launch configuration ready, set the max value for the Auto Scaling group to the total number of existing container instances in the ECS cluster.

To add other container instances of the ECS cluster to this Auto Scaling group:

  1. On the navigation pane, under Auto Scaling, choose Auto Scaling Groups, select the new Auto Scaling group, and choose Edit.
  2. Add subnets for other instances to the Subnet(s) section and save the configuration.
  3. For each of the other container instances of the cluster, open the EC2 instance ID, select the instance, and then choose Actions, Instance Settings, and Attach to Auto Scaling Group.
  4. On the Attach to Auto Scaling Group page, select an existing Auto Scaling group, select the Auto Scaling group that you just created, and then choose Attach.
  5. If the instance doesn’t meet the criteria (for example, if it’s not in the same Availability Zone as the Auto Scaling group), you get an error message with the details. Choose Close and try again with an instance that meets the criteria.

5. Create a new launch configuration

Create a new launch configuration for the Auto Scaling group. This launch configuration should be able to launch instances with the new ECS–optimized AMI. It should also put the user data in the instances to allow them to join the ECS cluster when they are created.

Using the AWS CLI

First, run the following command to get the launch configuration for the Auto Scaling group:

aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names <Auto Scaling group name> --query AutoScalingGroups[].LaunchConfigurationName --output text

Sample output:

EC2ContainerService-workshop-app-cluster-EcsInstanceLc-1LEL4X28KY4X

Now, create a new launch configuration, called New-AMI-launch, from this existing launch configuration, using the new image ID. Substitute the existing launch configuration name for launch-configuration-name and the image ID corresponding to the new AMI for image_id:

aws autoscaling describe-launch-configurations --launch-configuration-name \
<launch-configuration-name> --query "LaunchConfigurations[0]" | \
jq 'del(.LaunchConfigurationARN)' | jq 'del(.CreatedTime)' | \
jq 'del(.KernelId)' | jq 'del(.RamdiskId)' | \
jq '. += {"LaunchConfigurationName": "New-AMI-launch"}' | \
jq '. += {"ImageId": "<image_id>"}' > new-launch-config.json

aws autoscaling create-launch-configuration --cli-input-json file://new-launch-config.json

At this point, the New-AMI-launch launch configuration is ready. Update the Auto Scaling group with the new launch configuration:

aws autoscaling update-auto-scaling-group --auto-scaling-group-name <auto-scaling-group-name> --launch-configuration-name New-AMI-launch

To add block devices to the launch configuration, you can always override the block device mapping for the new launch configuration.

Using the AWS Console

  1. On the Auto Scaling groups page, choose Details in the bottom pane and note the launch configuration for your Auto Scaling group.
  2. On the Launch configurations page, select the launch configuration and choose Copy launch configuration.
  3. On the AMI details page, choose Edit AMI.
  4. In the search box, enter the latest AMI image ID (in this case, ami-aff65ad2) and choose Select.
  5. On the Configure details page, enter a new name for the launch configuration.
  6. Keep everything else the same and choose Create.
  7. On the Auto Scaling groups page, choose Edit.
  8. Select the newly created launch configuration and choose Save.

6. Detach the old ECS instances from the Auto Scaling group

Now that you have a new launch configuration with the Auto Scaling group, detach the old instances from the group.

For every old instance detached, add a new instance through the new launch configuration. This keeps the desired count for the Auto Scaling group unchanged.

Using the AWS CLI

Run the following command:

aws autoscaling detach-instances --instance-ids <instance id #1> <instance id #2> --auto-scaling-group-name <auto-scaling-group-name> --no-should-decrement-desired-capacity

When this is done, the following command should show a blank result:

aws autoscaling describe-auto-scaling-instances --instance-ids <instance id #1> <instance id #2>

The following command should show the new ECS instances, for every old instance detached from the Auto Scaling group:

aws ecs list-container-instances --cluster <cluster name>

The old container instances have been detached from the Auto Scaling group but they are still registered in the ECS cluster.

Using the AWS Console

  1. On the Auto Scaling groups page, select the group.
  2. On the instance tab, select the old container instances.
  3. In the bottom pane, choose Actions, Detach.
  4. In the Detach Instances dialog box, select the check box for Add new instances to the Auto Scaling group to balance the load and choose Detach instances.

7. Revise the task definition and update the service

Now revise the task definition in use to impose a placement constraint, so that subsequent tasks spawned from this task definition are hosted only on ECS instances built with the new AMI.

Using the AWS CLI

Run the following command to get the task definition for the service running on the cluster:

aws ecs describe-services --cluster <cluster name> \
--services <service arn> \
--query "services[].deployments[].["taskDefinition"]" --output text

Sample output:

arn:aws:ecs:us-east-1:012345678910:task-definition/workshop-task:9

Here, workshop-task is the family and 9 is the revision. Now, update the task definition with the constraint. Use the built-in attribute, ecs.ami-id, to impose the constraint. Replace the image_id value in the following command with the value found by querying Parameter Store.

aws ecs describe-task-definition --task-definition <task definition family:revision> --query taskDefinition | \
jq '. + {placementConstraints: [{"expression": "attribute:ecs.ami-id == <image_id>", "type": "memberOf"}]}' | \
jq 'del(.status)'| jq 'del(.revision)' | jq 'del(.requiresAttributes)' | \
jq '. + {containerDefinitions:[.containerDefinitions[] + {"memory":256, "memoryReservation": 128}]}'| \
jq 'del(.compatibilities)' | jq 'del(.taskDefinitionArn)' > new-task-def.json

Even if your original container definition doesn’t have a memory or memoryReservation key, you must provide one of those values while updating the task definition. For this post, I have used the task-level memory allocation value (256) and an arbitrary value (128) for those keys, respectively.

aws ecs register-task-definition --cli-input-json file://new-task-def.json

You should now have a new revised version of the task definition. In this example, it’s workshop-task:10.

8. Update the service with the revised task definition

Use the following steps to add the revised task definition to the service.

Using the AWS CLI

Run the following command to update the service with the revised task definition:

aws ecs update-service --cluster <cluster name> --service <service name> --task-definition <task definition family:revised version>

After the service is updated with the revised task definition, the new tasks constituting the service should come up on the new ECS instances, thanks to the constraint in the new task definition.

Run the following command against the old container instances until there are no task ARNs in the output:

aws ecs list-tasks --cluster <cluster name> --container-instance <container-instance id #1> --container-instance <container-instance id #2>

Using the AWS Console

  1. In the ECS console, on the Task definitions page, select your task definition and choose Create new revision.
  2. On the Create new revision of task definition page, choose Add constraint.
  3. For Expression, add attribute:ecs.ami-id == <AMI ID for new ECS optimized AMI> and choose Create. You see a new revision of the task definition being created. In this case, workshop-task:10 got created.
  4. To update the service, on the Clusters page, select the service corresponding to the revised task definition.
  5. On the Configure service page, for Task definition, select the appropriate task definition version and choose Next step.
  6. Keep the remaining default values. On the Review page, choose Update service.

On the service page, on the Events tab, you see events corresponding to the old tasks getting stopped and new tasks getting started on the new ECS instances.

Wait until no tasks are running on the old ECS instances and you see all tasks starting on the new ECS instances.

9. Deregister and terminate the old ECS instances

Using the AWS CLI

For each of the old container instances, run the following command:

aws ecs deregister-container-instance --cluster <cluster name> --container-instance <container instance id> --query containerInstance.ec2InstanceId

Sample output:

"i-02dd87a0b28e8575b"

Record the EC2 instance ID and then terminate the instance:

aws ec2 terminate-instances --instance-ids <instance-id>

Using the AWS Console

  1. In the ECS console, choose Clusters, ECS instances.
  2. Note the EC2 instance ID displayed in the EC2 Instance column and keep the instance detail page open.
  3. Open the container instance ID for the ECS instance to deregister.
  4. On the container instance page, choose Deregister.

After the container instance is deregistered, terminate the instance from the instance detail page.

At this point, your ECS cluster has been refreshed with the EC2 instances built with the new ECS–optimized AMI.

Conclusion

In this post, I demonstrated how to refresh the container instances in an active ECS cluster with instances built from a newly released ECS–optimized AMI. You can either use the AWS Management Console or programmatically refresh your ECS cluster in some quick steps.

AWS Fargate is a service that’s designed to remove the need to do these types of operations by running and managing all the EC2 infrastructure necessary to support your containers for you. With Fargate, your containers are always started with the latest ECS agent and Docker version.

I welcome your comments and questions below.

Machine Learning with AWS Fargate and AWS CodePipeline at Corteva Agriscience

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/machine-learning-with-aws-fargate-and-aws-codepipeline-at-corteva-agriscience/

This post contributed by Duke Takle and Kevin Hayes at Corteva Agriscience

At Corteva Agriscience, the agricultural division of DowDuPont, our purpose is to enrich the lives of those who produce and those who consume, ensuring progress for generations to come. As a global business, we support a network of research stations to improve agricultural productivity around the world.

As analytical technology advances, the volume of data, as well as the speed at which it must be processed, keeps growing, and meeting the needs of our scientists poses unique challenges. Corteva Cloud Engineering teams are responsible for collaborating with and enabling software developers, data scientists, and others. Their work allows Corteva research and development to become the most efficient innovation machine in the agricultural industry.

Recently, our Systems and Innovations for Breeding and Seed Products organization approached the Cloud Engineering team with the challenge of how to deploy a novel machine learning (ML) algorithm for scoring genetic markers. The solution would require supporting labs across six continents in a process that is run daily. This algorithm replaces time-intensive manual scoring of genotypic assays with a robust, automated solution. When examining the solution space for this challenge, the main requirements for our solution were global deployability, application uptime, and scalability.

Before implementing this algorithm on AWS, ML autoscoring was done as a proof of concept using pre-production instances on premises, and it still required several technicians to process assays by hand. After implementing it on AWS, we have enabled those technicians to be better used in other areas, such as technology development.

Solutions Considered

A RESTful web service seemed to be an obvious way to solve the problem presented. AWS has several patterns that could implement a RESTful web service, such as Amazon API Gateway, AWS Lambda, Amazon EC2, AWS Auto Scaling, Amazon Elastic Container Service (ECS) using the EC2 launch type, and AWS Fargate.

At the time the project came into our backlog, we had just heard of Fargate. Fargate does have a few limitations (scratch storage, CPU, and memory), none of which were a problem for us. EC2, Auto Scaling, and ECS with the EC2 launch type were ruled out because they would have introduced unneeded complexity, mostly around managing the EC2 instances needed to run either the application or the container for the solution.

When the project came into our group, there had been a substantial proof-of-concept done with a Docker container. While we are strong API Gateway and Lambda proponents, there is no need to duplicate processes or services that AWS provides. We also knew that we needed to be able to move fast. We wanted to put the power in the hands of our developers to focus on building out the solution. Additionally, we needed something that could scale across our organization and provide some rationalization in how we approach these problems. AWS services, such as Fargate, AWS CodePipeline, and AWS CloudFormation, made that possible.

Solution Overview

Our group prefers using existing AWS services to bring a complete project to the production environment.

CI/CD Pipeline

A complete discussion of the CI/CD pipeline for the project is beyond the scope of this post. However, in broad strokes, the pipeline is:

  1. Compile some C++ code wrapped in Python, create a Python wheel, and publish it to an artifact store.
  2. Create a Docker image with that wheel installed and publish it to ECR.
  3. Deploy and test the new image to our test environment.
  4. Deploy the new image to the production environment.
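
To make steps 1 and 2 concrete, here is a hedged sketch of what the build stage might run (the image name, account number, and region are placeholders, not our production values):

# Step 1: build the Python wheel (setup.py compiles the C++ extension)
python setup.py bdist_wheel

# Step 2: bake the wheel into a Docker image and publish it to ECR
docker build -t ml-scoring:latest .
$(aws ecr get-login --no-include-email --region us-east-1)
docker tag ml-scoring:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/ml-scoring:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/ml-scoring:latest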

Solution

As mentioned earlier, the application is a Docker container deployed with the Fargate launch type. It uses an Aurora PostgreSQL DB instance for the backend data. The application itself is only needed internally so the Application Load Balancer is created with the scheme set to “internal” and deployed into our private application subnets.

Our environments are all constructed with CloudFormation templates. Each environment is constructed in a separate AWS account and connected back to a central utility account. The infrastructure stacks export a number of useful bits like the VPC, subnets, IAM roles, security groups, etc. This scheme allows us to move projects through the several accounts without changing the CloudFormation templates, just the parameters that are fed into them.

For this solution, we use an existing VPC, set of subnets, IAM role, and ACM certificate in the us-east-1 Region. The solution CloudFormation stack describes and manages the following resources:

AWS::ECS::Cluster*
AWS::EC2::SecurityGroup
AWS::EC2::SecurityGroupIngress
AWS::Logs::LogGroup
AWS::ECS::TaskDefinition*
AWS::ElasticLoadBalancingV2::LoadBalancer
AWS::ElasticLoadBalancingV2::TargetGroup
AWS::ElasticLoadBalancingV2::Listener
AWS::ECS::Service*
AWS::ApplicationAutoScaling::ScalableTarget
AWS::ApplicationAutoScaling::ScalingPolicy
AWS::ElasticLoadBalancingV2::ListenerRule

A complete discussion of all the resources for the solution is beyond the scope of this post. However, we can explore the resource definitions of the components specific to Fargate. The following three simple segments of CloudFormation are all that is needed to create a Fargate stack: an ECS cluster, task definition, and service. More complete examples of the CloudFormation templates are linked at the end of this post, with stack creation instructions.

AWS::ECS::Cluster:

"ECSCluster": {
    "Type":"AWS::ECS::Cluster",
    "Properties" : {
        "ClusterName" : { "Ref": "clusterName" }
    }
}

The ECS Cluster resource is a simple grouping for the other ECS resources to be created. The cluster created in this stack holds the tasks and service that implement the actual solution. Finally, in the AWS Management Console, the cluster is the entry point to find info about your ECS resources.

AWS::ECS::TaskDefinition

"fargateDemoTaskDefinition": {
    "Type": "AWS::ECS::TaskDefinition",
    "Properties": {
        "ContainerDefinitions": [
            {
                "Essential": "true",
                "Image": { "Ref": "taskImage" },
                "LogConfiguration": {
                    "LogDriver": "awslogs",
                    "Options": {
                        "awslogs-group": {
                            "Ref": "cloudwatchLogsGroup"
                        },
                        "awslogs-region": {
                            "Ref": "AWS::Region"
                        },
                        "awslogs-stream-prefix": "fargate-demo-app"
                    }
                },
                "Name": "fargate-demo-app",
                "PortMappings": [
                    {
                        "ContainerPort": 80
                    }
                ]
            }
        ],
        "ExecutionRoleArn": {"Fn::ImportValue": "fargateDemoRoleArnV1"},
        "Family": {
            "Fn::Join": [
                "",
                [ { "Ref": "AWS::StackName" }, "-fargate-demo-app" ]
            ]
        },
        "NetworkMode": "awsvpc",
        "RequiresCompatibilities" : [ "FARGATE" ],
        "TaskRoleArn": {"Fn::ImportValue": "fargateDemoRoleArnV1"},
        "Cpu": { "Ref": "cpuAllocation" },
        "Memory": { "Ref": "memoryAllocation" }
    }
}

The ECS Task Definition is where we specify and configure the container. Interesting things to note are the CPU and memory configuration items. It is important to note the valid combinations for CPU/memory settings, as shown in the following table.

CPU Memory
0.25 vCPU 0.5 GB, 1 GB, and 2 GB
0.5 vCPU Min. 1 GB and Max. 4 GB, in 1-GB increments
1 vCPU Min. 2 GB and Max. 8 GB, in 1-GB increments
2 vCPU Min. 4 GB and Max. 16 GB, in 1-GB increments
4 vCPU Min. 8 GB and Max. 30 GB, in 1-GB increments

AWS::ECS::Service

"fargateDemoService": {
     "Type": "AWS::ECS::Service",
     "DependsOn": [
         "fargateDemoALBListener"
     ],
     "Properties": {
         "Cluster": { "Ref": "ECSCluster" },
         "DesiredCount": { "Ref": "minimumCount" },
         "LaunchType": "FARGATE",
         "LoadBalancers": [
             {
                 "ContainerName": "fargate-demo-app",
                 "ContainerPort": "80",
                 "TargetGroupArn": { "Ref": "fargateDemoTargetGroup" }
             }
         ],
         "NetworkConfiguration":{
             "AwsvpcConfiguration":{
                 "SecurityGroups": [
                     { "Ref":"fargateDemoSecuityGroup" }
                 ],
                 "Subnets":[
                    {"Fn::ImportValue": "privateSubnetOneV1"},
                    {"Fn::ImportValue": "privateSubnetTwoV1"},
                    {"Fn::ImportValue": "privateSubnetThreeV1"}
                 ]
             }
         },
         "TaskDefinition": { "Ref":"fargateDemoTaskDefinition" }
     }
}

The ECS Service resource is how we can configure where and how many instances of tasks are executed to solve our problem. In this case, we see that there are at least minimumCount instances of the task running in any of three private subnets in our VPC.
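
With the three segments above assembled into a template, the stack can be created from the CLI (the template file name and parameter values below are placeholders):

aws cloudformation create-stack --stack-name fargate-demo \
    --template-body file://fargate-demo.json \
    --parameters ParameterKey=clusterName,ParameterValue=fargate-demo \
        ParameterKey=taskImage,ParameterValue=123456789012.dkr.ecr.us-east-1.amazonaws.com/fargate-demo:latest \
        ParameterKey=cpuAllocation,ParameterValue=512 \
        ParameterKey=memoryAllocation,ParameterValue=1024 \
        ParameterKey=minimumCount,ParameterValue=2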

Conclusion

Deploying this algorithm on AWS using containers and Fargate allowed us to start running the application at scale with low support overhead. This has resulted in faster turnaround time with fewer staff and a concomitant reduction in cost.

“We are very excited with the deployment of Polaris, the autoscoring of the marker lab genotyping data using AWS technologies. This key technology deployment has enhanced performance, scalability, and efficiency of our global labs to deliver over 1.4 Billion data points annually to our key customers in Plant Breeding and Integrated Operations.”

Sandra Milach, Director of Systems and Innovations for Breeding and Seed Products.

We are distributing this solution to all our worldwide laboratories to harmonize data quality and speed. We hope this enables an increase in the velocity of genetic gain and higher crop yields for farmers around the world.

You can learn more about the work we do at Corteva at www.corteva.com.

Try it yourself:

The snippets above are instructive but not complete. We have published two repositories on GitHub that you can explore to see how we built this solution:

Note: the components in these repos do not include our production code, but they show you how this works using Amazon ECS and AWS Fargate.

Maintaining Transport Layer Security all the way to your container part 2: Using AWS Certificate Manager Private Certificate Authority

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/maintaining-transport-layer-security-all-the-way-to-your-container-part-2-using-aws-certificate-manager-private-certificate-authority/

This post contributed by AWS Senior Cloud Infrastructure Architect Anabell St Vincent and AWS Solutions Architect Alex Kimber.

The previous post, Maintaining Transport Layer Security All the Way to Your Container, covered how the layer 4 Network Load Balancer can be used to maintain Transport Layer Security (TLS) all the way from the client to running containers.

In this post, we discuss the various options available for ensuring that certificates can be securely and reliably made available to containers. By simplifying the process of distributing or generating certificates and other secrets, it’s easier for you to build inherently secure architectures without compromising scalability.

There are several ways to achieve this:

1. Storing the certificate and private key in the Docker image

Certificates and keys can be included in the Docker image and made available to the container at runtime. This approach makes the deployment of containers with certificates and keys simple and easy.

However, there are some drawbacks. First, the certificates and keys need to be created, stored securely, and then included in the Docker image. There are some manual or additional automation steps required to securely create, retrieve, and include them for every new revision of the Docker image.

The following example Docker file creates an NGINX container that has the certificate and the key included:

FROM nginx:alpine

# Copy in secret materials
RUN mkdir -p /root/certs/nginxdemotls.com
COPY nginxdemotls.com.key /root/certs/nginxdemotls.com/nginxdemotls.com.key
COPY nginxdemotls.com.crt /root/certs/nginxdemotls.com/nginxdemotls.com.crt
RUN chmod 400 /root/certs/nginxdemotls.com/nginxdemotls.com.key

# Copy in nginx configuration files
COPY nginx.conf /etc/nginx/nginx.conf
COPY nginxdemo.conf /etc/nginx/conf.d
COPY nginxdemotls.conf /etc/nginx/conf.d

# Create folders to hold web content and copy in HTML files.
RUN mkdir -p /var/www/nginxdemo.com
RUN mkdir -p /var/www/nginxdemotls.com

COPY index.html /var/www/nginxdemo.com/index.html
COPY indextls.html /var/www/nginxdemotls.com/index.html

From a security perspective, this approach has additional drawbacks. Because certificates and private keys are bundled with the Docker images, anyone with access to a Docker image can also retrieve the certificate and private key.

The other drawback is that updated certificates are not picked up automatically: the Docker image must be re-created to include them, and running containers must either be restarted with the new image or have their certificates updated in place.

2. Storing the certificates in AWS Systems Manager Parameter Store and Amazon S3

The post Managing Secrets for Amazon ECS Applications Using Parameter Store and IAM Roles for Tasks explains how you can use Systems Manager Parameter Store to store secrets. Some customers use Parameter Store to keep their secrets for simpler retrieval, as well as fine-grained access control. Parameter Store allows for securing data using AWS Key Management Service (AWS KMS) for the encryption. Each encryption key created in KMS can be accessed and controlled using AWS Identity and Access Management (IAM) roles in addition to key policy functionality within KMS. This approach allows for resource-level permissions to each item that is stored in Parameter Store, based on the KMS key used for the encryption.

Certificates can be stored in Parameter Store using the SecureString type, with KMS used for encryption. With this approach, you can make an API call to retrieve the certificate when the container is deployed. As mentioned earlier, access to the certificate can be based on the role used to retrieve it. The advantage of this approach is that the certificate can be replaced: the next time the container is deployed, it picks up the new certificate and there is no need to update the Docker image.

Currently, there is a limit of 4,096 characters per value stored in Parameter Store. This may not be sufficient for some types of certificates; for example, some x509 certificates include the chain and so can exceed the 4,096-character limit. To avoid this limitation, Amazon S3 can be used together with Parameter Store: store the certificate in S3, encrypted with KMS, and store the private key or password in Parameter Store.

With this approach, there is no limitation on certificate length and the private key remains secured with KMS. However, it does involve some additional complexity in setting up the process of creating the certificates, storing them in S3, and then storing the password or private keys in Parameter Store. That is in addition to securing, trusting, and auditing the system handling the private keys and certificates.
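As a rough sketch of this pattern (the parameter name, bucket, and file paths here are all hypothetical), a container entrypoint could retrieve both pieces with the AWS CLI:

#!/usr/bin/env bash
# Fetch the private key from Parameter Store, decrypting it with the KMS key.
aws ssm get-parameter \
    --name /tls/nginxdemotls.com/private-key \
    --with-decryption \
    --query Parameter.Value \
    --output text > /root/certs/nginxdemotls.com.key
chmod 400 /root/certs/nginxdemotls.com.key

# Fetch the larger certificate (including its chain) from S3.
aws s3 cp s3://my-certificate-bucket/nginxdemotls.com.crt /root/certs/nginxdemotls.com.crt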

3. Storing the certificates in AWS Secrets Manager

AWS Secrets Manager offers a number of features to allow you to store and manage credentials, keys, and other secret materials. This eliminates the need to store these materials with the application code and instead allows them to be referenced on demand. By centralizing the management of secret materials, this single service can handle fine-grained access control through granular IAM policies, as well as revocation and rotation, all through API calls.

All materials stored in the AWS Secrets Manager are encrypted with the customer’s choice of KMS key. The post AWS Secrets Manager: Store, Distribute, and Rotate Credentials Securely shows how AWS Secrets Manager can be used to store RDS database credentials. However, the same process can apply to TLS certificates and keys.

Secrets currently have a limit of 4,096 characters. This approach may be unsuitable for some x509 certificates that include the chain and can exceed this limit. This limit applies to the sum of all key-value pairs within a single secret, so certificates and keys may need to be stored in separate secrets.

After the secure material is in place, it can be retrieved by the container instance at runtime via the AWS Command Line Interface (AWS CLI) or directly from within the application code. All that’s required is for the container task role to have the requisite permissions in IAM to read the secrets.
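For example, a container entrypoint could pull the material at startup with the AWS CLI. The secret names here are hypothetical, and the task role needs secretsmanager:GetSecretValue on them:

# Retrieve the certificate and key from their respective secrets.
aws secretsmanager get-secret-value \
    --secret-id tls/nginxdemotls.com/certificate \
    --query SecretString --output text > /root/certs/nginxdemotls.com.crt
aws secretsmanager get-secret-value \
    --secret-id tls/nginxdemotls.com/private-key \
    --query SecretString --output text > /root/certs/nginxdemotls.com.key
chmod 400 /root/certs/nginxdemotls.com.key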

With the exception of rotating RDS credentials, AWS Secrets Manager requires the user to provide Lambda function code, which is called on a configurable schedule to manage the rotation. This rotation logic would need to handle generating new keys and certificates and redeploying the containers.

4. Using self-signed certificates, generated as the Docker container is created

The advantage of this approach is that it allows the use of TLS communications without any of the complexity of distributing certificates or private keys. However, this approach does require implicit trust of the server. Some applications may generate warnings that there is no acceptable root of trust.
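As a minimal sketch of this approach (the subject name is a placeholder), the container entrypoint could generate a throwaway certificate with OpenSSL before starting the server:

# Generate a self-signed certificate and key valid for 30 days.
openssl req -x509 -nodes -newkey rsa:2048 -days 30 \
    -keyout /root/certs/selfsigned.key \
    -out /root/certs/selfsigned.crt \
    -subj "/CN=nginxdemotls.com"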

5. Building and managing a private certificate authority

A private certificate authority (CA) can offer greater security and flexibility than the solutions outlined earlier. Typically, a private CA solution would manage the following for each ‘Common name’:

  • A private key
  • A certificate, created with the private key
  • Lists of certificates issued and those that have been revoked
  • Policies for managing certificates, for example which services have the right to make a request for a new certificate
  • Audit logs to track the lifecycle of certificates, in particular to ensure timely renewal where necessary

It is possible for an organization to build and maintain its own certificate-issuing platform. This approach requires implementing a platform that is highly available and secure, and these types of systems add to the overall overhead of maintaining infrastructure from a security, availability, scalability, and maintenance perspective. Some customers have also implemented Lambda functions to achieve the same functionality when it comes to issuing private certificates.

While it’s possible to build a private CA for internal services, there are some challenges to be aware of. Any solution should provide a number of features that are key to ensuring appropriate management of the certificates throughout their lifecycle.

For instance, the solution must support the creation, tracking, distribution, renewal, and revocation of certificates. All of these operations must be provided with the requisite security and authentication controls to ensure that certificates are distributed appropriately.

Scalability is another consideration. As applications become increasingly stateless and elastic, it’s conceivable that certificates may be required for every new container instance or wildcard certificates created to support an environment. Whatever CA solution is implemented must be ready to accommodate such a load while also providing high availability.

These types of approaches have drawbacks from various perspectives:

  • Custom code can be hard to maintain
  • Additional security measures must be implemented
  • Certificate renewal and revocation mechanisms also must be implemented
  • The platform must be maintained and kept up-to-date from a patching perspective while maintaining high availability

6. Using the new ACM Private CA to issue private certificates

ACM Private CA offers a secure, managed infrastructure to support the issuance and revocation of private digital certificates. It supports RSA and ECDSA key types for CA keys used for the creation of new certificates, as well as certificate revocation lists (CRLs) to inform clients when a certificate should no longer be trusted. Currently, ACM Private CA does not offer root CA support.

The following screenshot shows a subordinate certificate that is available for use:

The private key for any private CA that you create with ACM Private CA is created and stored in a FIPS 140-2 Level 3 Hardware Security Module (HSM) managed by AWS. The ACM Private CA is also integrated with AWS CloudTrail, which allows you to record the audit trail of API calls made using the AWS Management Console, AWS CLI, and AWS SDKs.

Setting up ACM Private CA requires a root CA. This can be used to sign a certificate signing request (CSR) for the new subordinate CA, which is then imported into ACM Private CA. After this is complete, it's possible for containers within your platform to generate their own key pairs at runtime using OpenSSL. They can then use the private key to make their own CSR and ultimately receive their own certificate.

More specifically, the container would complete the following steps at runtime:

  1. Add OpenSSL to the Docker image (if it is not already included).
  2. Generate a key pair (a cryptographically related private and public key).
  3. Use that private key to make a CSR.
  4. Call the ACM Private CA API or CLI issue-certificate operation, which issues a certificate based on the CSR.
  5. Call the ACM Private CA API or CLI get-certificate operation, which returns an issued certificate.

The following diagram shows these steps:
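In shell terms, the same flow might look like the following sketch, where the CA ARN and common name are placeholder values for your own:

# Steps 2-3: generate a key pair and use the private key to create a CSR.
openssl req -new -newkey rsa:2048 -nodes \
    -keyout service.key -out service.csr \
    -subj "/CN=myservice.internal"

# Step 4: ask ACM Private CA to issue a certificate from the CSR.
CERT_ARN=$(aws acm-pca issue-certificate \
    --certificate-authority-arn "$CA_ARN" \
    --csr fileb://service.csr \
    --signing-algorithm SHA256WITHRSA \
    --validity Value=7,Type=DAYS \
    --query CertificateArn --output text)

# Step 5: retrieve the issued certificate (it can take a few seconds to be ready).
aws acm-pca get-certificate \
    --certificate-authority-arn "$CA_ARN" \
    --certificate-arn "$CERT_ARN" \
    --query Certificate --output text > service.crt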

The authorization to successfully request a certificate is controlled via IAM policies, which can be attached via a role to the Amazon ECS task. Containers require the ‘Allow’ effect for at least the acm-pca:IssueCertificate and acm-pca:GetCertificate actions. The following is a sample IAM policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": "acm-pca:*",
            "Resource": "arn:aws:acm-pca:us-east-1:1234567890:certificate-authority/2c4ccba1-215e-418a-a654-aaaaaaaa"
        }
    ]
}

For additional security, it is possible to store the certificate and keys in a temporary volume mounted in memory through the ‘tmpfs’ parameter. With this option enabled, the secure material is never written to the filesystem of the host machine.

Note: This feature is not currently available for containers run on AWS Fargate.
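As a rough sketch of where this sits (the names and size are illustrative, and this applies only to the EC2 launch type), the tmpfs mount is declared under the container definition’s linuxParameters:

# Register a task definition whose container mounts an in-memory
# filesystem at the directory holding the secure material.
aws ecs register-task-definition \
    --family nginx-tls \
    --container-definitions '[{
        "name": "nginx",
        "image": "nginx:alpine",
        "memory": 512,
        "linuxParameters": {
            "tmpfs": [{"containerPath": "/root/certs", "size": 64}]
        }
    }]'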

The task now has the necessary materials and starts up. Clients should be able to establish the trust hierarchy from the server, through ACM Private CA to the root or intermediate CA.

One consideration to be aware of is that ACM Private CA currently has a limit of 50,000 certificates for each CA in each Region. If the requirement is for each short-lived container to have a separate certificate, this limit could be reached.

Summary

The approaches outlined in this post describe the available options for ensuring that the generation, storage, and distribution of sensitive material is done efficiently and securely, in a way that supports the ephemeral, automatic scaling capabilities of container-based architectures. ACM Private CA provides a single interface to manage public and now private certificates, as well as seamless integration with other AWS services.

If you have questions or suggestions, please comment below.

Building, deploying, and operating containerized applications with AWS Fargate

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/building-deploying-and-operating-containerized-applications-with-aws-fargate/

This post was contributed by Jason Umiker, AWS Solutions Architect.

Whether it’s helping facilitate a journey to microservices or deploying existing tools more easily and repeatably, many customers are moving toward containerized infrastructure and workflows. AWS provides many of the services and mechanisms to help you with that.

In this post, I show you how to use Amazon ECS and AWS Fargate, as well as AWS CodeBuild and AWS CodePipeline, for an end-to-end CI/CD container solution.

What is Amazon ECS?

Amazon Elastic Container Service (ECS) helps schedule and orchestrate containers across a fleet of servers. It works by installing an agent on each container host that takes instructions from the ECS control plane and relays them to the local Docker daemon. ECS makes this easy by providing an optimized Amazon Machine Image (AMI) that launches automatically through the ECS console or CLI, and that you can also use to launch container hosts yourself.

It is up to you to choose the appropriate instance types, sizes, and quantity for your cluster fleet. You should have the capacity to deploy and scale workloads as well as to spread them across enough failure domains for high availability. Features like Auto Scaling groups help with that.

Also, while AWS provides Amazon Linux and Windows AMIs pre-configured for ECS, you are responsible for ongoing maintenance of the OS, which includes patching and security. Items that require regular patching or updating in this model are the OS, Docker, the ECS agent, and of course the contents of the container images.

Two of the key ECS concepts are Tasks and Services. A task is one or more containers that are to be scheduled together by ECS. A service is like an Auto Scaling group for tasks. It defines the quantity of tasks to run across the cluster and where they should be running (for example, across multiple Availability Zones), automatically associates them with a load balancer, and scales horizontally based on metrics that you define, like CPU or memory utilization.

What is Fargate?

AWS Fargate is a new compute engine for Amazon ECS that runs containers without requiring you to deploy or manage the underlying Amazon EC2 instances. With Fargate, you specify an image to deploy and the amount of CPU and memory it requires. Fargate handles the updating and securing of the underlying Linux OS, Docker daemon, and ECS agent as well as all the infrastructure capacity management and scaling.

How to use Fargate?

Fargate is exposed as a launch type for ECS. It uses an ECS task and service definition that is similar to the traditional EC2 launch type, with a few minor differences. It is easy to move tasks and services back and forth between launch types. The differences include:

  • Using the awsvpc network mode
  • Specifying the CPU and memory requirements for the task in the definition

The best way to learn how to use Fargate is to walk through the process and see it in action.

Walkthrough: Deploying a service with Fargate in the console

At the time of publication, Fargate for ECS is available in the N. Virginia, Ohio, Oregon, and Ireland AWS regions. This walkthrough works in any AWS region where Fargate is available.

If you’d prefer to use a CloudFormation template, this one covers Steps 1-4. After launching the template, you can skip ahead to Explore the running service after Step 4.

Step 1 – Create an ECS cluster

An ECS cluster is a logical construct for running groups of containers known as tasks. Clusters can also be used to segregate different environments or teams from each other. With the traditional EC2 launch type, specific EC2 instances are associated with and managed by each ECS cluster, but this is transparent to the customer with Fargate.

  1. Open the ECS console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
  2. Choose Clusters, Create Cluster.
  3. Choose Networking only, Next step.
  4. For Cluster name, enter “Fargate”. If you don’t already have a VPC to use, select the Create VPC check box and accept the defaults as well. Choose Create.
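If you prefer to script this step, the equivalent AWS CLI call (assuming the VPC already exists) is a one-liner:

# Create an empty ECS cluster; Fargate needs no container instances.
aws ecs create-cluster --cluster-name Fargate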

Step 2 – Create a task definition, CloudWatch log group, and task execution role

A task is a collection of one or more containers that is the smallest deployable unit of your application. A task definition is a JSON document that serves as the blueprint for ECS to know how to deploy and run your tasks.

The console makes it easier to create this definition by exposing all the parameters graphically. In addition, the console creates two dependencies:

  • The Amazon CloudWatch log group to store the aggregated logs from the task
  • The task execution IAM role that gives Fargate the permissions to run the task

  1. In the left navigation pane, choose Task Definitions, Create new task definition.
  2. Under Select launch type compatibility, choose FARGATE, Next step.
  3. For Task Definition Name, enter NGINX.
  4. If you had an IAM role for your task, you would enter it in Task Role but you don’t need one for this example.
  5. The Network Mode is automatically set to awsvpc for Fargate.
  6. Under Task size, for Task memory, choose 0.5 GB. For Task CPU, enter 0.25.
  7. Choose Add container.
  8. For Container name, enter NGINX.
  9. For Image, enter nginx:1.13.9-alpine.
  10. For Port mappings, enter 80 for Container port.
  11. Choose Add, Create.
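For reference, a sketch of the same task definition via the AWS CLI follows; the execution role ARN is a placeholder for the role the console creates on your behalf:

aws ecs register-task-definition \
    --family NGINX \
    --requires-compatibilities FARGATE \
    --network-mode awsvpc \
    --cpu 256 --memory 512 \
    --execution-role-arn arn:aws:iam::123456789012:role/ecsTaskExecutionRole \
    --container-definitions '[{
        "name": "NGINX",
        "image": "nginx:1.13.9-alpine",
        "portMappings": [{"containerPort": 80, "protocol": "tcp"}]
    }]'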

Step 3 – Create an Application Load Balancer

Sending incoming traffic through a load balancer is often a key piece of making an application both scalable and highly available. It can balance the traffic between multiple tasks, as well as ensure that traffic is only sent to healthy tasks. You can have the service manage the addition or removal of tasks from an Application Load Balancer as they come and go, but that must be specified when the service is created, so the load balancer is a dependency that you create first.

  1. Open the EC2 console.
  2. In the left navigation pane, choose Load Balancers, Create Load Balancer.
  3. Under Application Load Balancer, choose Create.
  4. For Name, put NGINX.
  5. Choose the appropriate VPC (10.0.0.0/16 if you let ECS create it for you).
  6. For Availability Zones, select both and choose Next: Configure Security Settings.
  7. Choose Next: Configure Security Groups.
  8. For Assign a security group, choose Create a new security group. Choose Next: Configure Routing.
  9. For Name, enter NGINX. For Target type, choose ip.
  10. Choose Next: Register Targets, Next: Review, Create.
  11. Select the new load balancer and note its DNS name (this is the public address for the service).
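The same resources can be created from the CLI; the subnet, security group, and VPC IDs below are placeholders:

# Create the load balancer, an ip-type target group, and a listener tying them together.
aws elbv2 create-load-balancer --name NGINX \
    --subnets subnet-aaaa subnet-bbbb --security-groups sg-cccc
aws elbv2 create-target-group --name NGINX \
    --protocol HTTP --port 80 --target-type ip --vpc-id vpc-dddd
aws elbv2 create-listener --load-balancer-arn <alb-arn> \
    --protocol HTTP --port 80 \
    --default-actions Type=forward,TargetGroupArn=<tg-arn>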

Step 4 – Create an ECS service using Fargate

A service in ECS using Fargate serves a similar purpose to an Auto Scaling group in EC2. It ensures that the needed number of tasks are running both for scaling as well as spreading the tasks over multiple Availability Zones for high availability. A service creates and destroys tasks as part of its role and can optionally add or remove them from an Application Load Balancer as targets as it does so.

  1. Open the ECS console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
  2. In the left navigation pane, choose Task Definitions.
  3. Select the NGINX task definition that you created and choose Actions, Create Service.
  4. For Launch Type, select Fargate.
  5. For Service name, enter NGINX.
  6. For Number of tasks, enter 1.
  7. Choose Next step.
  8. Under Subnets, choose both of the options.
  9. For Load balancer type, choose Application Load Balancer. It should then default to the NGINX load balancer that you created earlier.
  10. Choose Add to load balancer.
  11. For Target group name, choose NGINX.
  12. Under DNS records for service discovery, for TTL, enter 60.
  13. Choose Next step, Next step, and Create Service.
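A CLI sketch of the same service creation, with placeholder network and target group values:

aws ecs create-service --cluster Fargate --service-name NGINX \
    --task-definition NGINX --desired-count 1 --launch-type FARGATE \
    --network-configuration 'awsvpcConfiguration={subnets=[subnet-aaaa,subnet-bbbb],securityGroups=[sg-cccc],assignPublicIp=ENABLED}' \
    --load-balancers targetGroupArn=<tg-arn>,containerName=NGINX,containerPort=80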

Explore the running service

At this point, you have a running NGINX service using Fargate. You can now explore what you have running and how it works. You can also ask it to scale up to two tasks across two Availability Zones in the console.

Go into the service and see details about the associated load balancer, tasks, events, metrics, and logs.

Scale the service from one task to multiple tasks:

  • Choose Update.
  • For Number of tasks, enter 2.
  • Choose Next step, Next step, Next step then Update Service.
  • Watch the event that is logged and the new additional task both appear.
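The same scaling action is a single CLI call:

# Scale the NGINX service from one task to two.
aws ecs update-service --cluster Fargate --service NGINX --desired-count 2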

On the service Details tab, open the NGINX Target Group Name link and see the registered IP address targets spread across the two Availability Zones.

Go to the DNS name for the Application Load Balancer in your browser and see the default NGINX page. Get the value from the Load Balancers dashboard in the EC2 console.

Walkthrough: Adding a CI/CD pipeline to your service

Now, I’m going to show you how to set up a CI/CD pipeline around this service. It watches a GitHub repo for changes and rebuilds the container with CodeBuild based on the buildspec.yml file and Dockerfile in the repo. If that build is successful, it then updates your Fargate service to deploy the new image.

If you’d prefer to use a CloudFormation template, this one creates the dependencies (the CodeBuild project and IAM roles) so that the console pre-fills them when you create the pipeline in the steps below.

Step 1 – Create an ECR repository for the rebuilt container image

An ECR repository is a place to store your container images in a secure and reliable manner. Scaling and self-healing of Fargate tasks require these images to always be available to pull when needed, which makes the registry an important part of a container platform.

  1. Open the ECS console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
  2. In the left navigation pane, under Amazon ECR, choose Repositories, Get started.
  3. For Repository name, enter nginx (ECR repository names must be lowercase) and choose Next step.

Step 2 – Fork the nginx-codebuild example into your own GitHub account

I have created an example project that takes the Dockerfile and config files for the official NGINX Docker Hub image and adds a buildspec.yml file to tell CodeBuild how to build the container and push it to your new ECR registry on completion. You can fork it into your own GitHub account for this CI/CD demo.

  1. Go to https://github.com/jasonumiker/nginx-codebuild.
  2. In the upper right corner, choose Fork.

Step 3 – Create the pipeline and associated IAM roles

You have two complementary AWS services for building a CI/CD pipeline for your containers. CodeBuild executes the build jobs and CodePipeline kicks off those builds when it notices that the source GitHub or CodeCommit repo changes. If successful, CodePipeline then deploys the new container image to Fargate.

The CodePipeline console can create the associated CodeBuild project, in addition to other dependencies such as the required IAM roles.

  1. Open the CodePipeline console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
  2. Choose Get started.
  3. For Pipeline name, enter NGINX and choose Next step.
  4. For Source provider, choose GitHub.
  5. Choose Connect to GitHub and log in.
    • For Repository, choose your forked nginx-codebuild repo. For Branch, enter master. Choose Next step.
  6. For Build provider, choose AWS CodeBuild.
  7. Select Create a new build project.
  8. For Project name, enter NGINX.
  9. For Operating system, choose Ubuntu. For Runtime, choose Docker. For Version, select the latest version.
  10. Expand Advanced and set the following environment variables:
    • AWS_ACCOUNT_ID with a value of the account number
    • IMAGE_REPO_NAME with a value of nginx (or whatever ECR repository name you used)
  11. Choose Save build project, Next step.
  12. For Deployment provider, choose Amazon ECS.
  13. For Cluster name, enter Fargate.
  14. For Service name, choose NGINX.
  15. For Image filename, enter images.json.
  16. Choose Next step.
  17. Choose Create role, Allow, Next step, and then choose Create pipeline.
  18. Open the IAM console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
  19. In the left navigation pane, choose Roles.
  20. Choose the code-build-nginx-service-role that was just created and choose Attach policy.
  21. For Policy type, choose AmazonEC2ContainerRegistryPowerUser and choose Attach policy.

Step 4 – Start the pipeline

You now have CodePipeline watching the GitHub repo for changes. It kicks off a CodeBuild build job on a change and, if the build is successful, creates a new deployment of the Fargate service with the new image.

Make a change to the source repo (even just adding a new dummy file) and then commit it and push it to master on your GitHub fork. This automatically kicks off the pipeline to build and deploy the change.
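If you don’t have a change handy, an empty commit works just as well:

# Any push to master triggers the pipeline, even an empty commit.
git commit --allow-empty -m "Trigger pipeline"
git push origin master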

Conclusion

As you’ve seen, Fargate is fast and easy to set up, integrates well with the rest of the AWS platform, and saves you from much of the heavy lifting of running containers reliably at scale.

While it is useful to go through creating things in the console to understand them better, we suggest automating them with infrastructure-as-code patterns, such as AWS CloudFormation, to ensure that they are repeatable and that any changes can be managed. There are some example templates in this post to help you get started.

In addition, adding unit and integration testing, blue/green deployments, or manual approval gates to CodePipeline is often a good idea before deploying patterns like this to production.

Setting Up an Envoy Front Proxy on Amazon ECS

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/setting-up-an-envoy-front-proxy-on-amazon-ecs/

This post was contributed by Nare Hayrapetyan, Sr. Software Engineer.

Many customers are excited about new microservices management tools and technologies like service mesh. Specifically, they ask how to get started using Envoy on AWS. In this post, I walk through setting up an Envoy reverse proxy on Amazon Elastic Container Service (Amazon ECS). This example is based on the Envoy front proxy sandbox provided in the Envoy documentation.

The Envoy front proxy acts as a reverse proxy: it accepts incoming requests and routes them to ECS service tasks, which can themselves run an Envoy sidecar. The sidecar then redirects the request to the service on the local host.

The reverse proxy provides the following features:

  • Terminates TLS.
  • Supports both HTTP/1.1 and HTTP/2.
  • Supports full HTTP L7 routing.
  • Talks to ECS services via an Envoy sidecar, which means that the reverse proxy and the service hosts are operated in the same way and emit the same statistics, because both are running Envoy.

To get started, create the following task definitions:

  • One contains an envoy image acting as a front proxy
  • One contains an envoy image acting as a service sidecar and an image for the service task itself

The envoy images are the same. The only difference is the configuration provided to the envoy process that defines how the proxy acts.

Service + Envoy Sidecar

Create a simple service that returns the hostname that it’s running on. This allows you to test the Envoy load balancing capabilities.

Create the following:

  • A simple service running on Amazon ECS
  • A script to start the service
  • A Dockerfile that copies the service code and starts it
  • The envoy configuration
  • A Dockerfile to download the envoy image and start it with the provided configuration
  • Images from both Dockerfiles, pushed to a registry so that they can be accessed by ECS
  • An ECS task definition that points to the envoy and service images

First, write the service code:

#service.py
from flask import Flask
import socket

app = Flask(__name__)

@app.route('/service')
def hello():
    return ('Hello from behind Envoy! hostname: {}\n'.format(socket.gethostname()))
    
if __name__ == "__main__":
    app.run(host='127.0.0.1', port=8080, debug=True)

Next, create a short script that starts the service:

#start_service.sh

#!/usr/bin/env bash
python3 /code/service.py

To package and start the service, create a Dockerfile:

FROM alpine:latest
RUN apk update && apk add python3 bash
RUN python3 --version && pip3 --version
RUN pip3 install -q Flask==0.11.1 requests==2.18.4
RUN mkdir /code
ADD ./service.py /code
ADD ./start_service.sh /usr/local/bin/start_service.sh
RUN chmod u+x /usr/local/bin/start_service.sh
ENTRYPOINT /usr/local/bin/start_service.sh
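To make the image available to ECS, build it and push it to an ECR repository. The following is a sketch with a placeholder account ID; note that aws ecr get-login is the AWS CLI v1 login helper, and newer CLI versions use aws ecr get-login-password instead:

# Create the repository, log Docker in to ECR, then build and push the image.
aws ecr create-repository --repository-name service
$(aws ecr get-login --no-include-email --region us-east-1)
docker build -t <accountId>.dkr.ecr.us-east-1.amazonaws.com/service .
docker push <accountId>.dkr.ecr.us-east-1.amazonaws.com/service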

Here’s the configuration for Envoy as a service sidecar. Use the awsvpc networking mode to allow Envoy to access the service on the local host (see the task definition below).

#service-envoy.yaml
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: service
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/service"
                route:
                  cluster: local_service
          http_filters:
          - name: envoy.router
            config: {}
  clusters:
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: round_robin
    hosts:
    - socket_address:
        address: 127.0.0.1
        port_value: 8080
admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8081

The envoy process is configured to listen on port 80 and redirect to localhost:8080 (127.0.0.1:8080), which is the address on which the service is listening. The service can be accessed on the local host because the task is running in awsvpc mode, which allows containers in the task to communicate with each other via local host.

Create the envoy sidecar image that has access to the envoy configuration.

FROM envoyproxy/envoy:latest

COPY service-envoy.yaml /etc/envoy/service-envoy.yaml
CMD /usr/local/bin/envoy -c /etc/envoy/service-envoy.yaml

And here’s the task definition containing the service and envoy images.

{
  "containerDefinitions": [
    {
      "image": "<accountId>.dkr.ecr.us-east-1.amazonaws.com/service",
      "name": "envoy-service"
    },
    {
      "portMappings": [
        {
          "hostPort": 80,
          "protocol": "tcp",
          "containerPort": 80
        },
        {
          "hostPort": 8081,
          "protocol": "tcp",
          "containerPort": 8081
        }
      ],
      "image": "<accountId>.dkr.ecr.us-east-1.amazonaws.com/envoy",
      "name": "envoy"
    }
  ],
  "networkMode": "awsvpc"
}
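This JSON is an excerpt; after adding the remaining required fields (such as family), you could register it with:

# Register the combined service + sidecar task definition.
aws ecs register-task-definition --cli-input-json file://service-envoy-taskdef.json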

The task definition is used to launch an Amazon ECS service. To test load balancing in Envoy, scale the service to a couple of tasks. Also, add ECS service discovery to the service so that the front proxy can discover the service endpoints.

Envoy Front Proxy

For the front proxy setup, you need a Dockerfile with an envoy image and front proxy envoy configuration. Similar to the task definition earlier, you create a Docker image from the Dockerfile and push it to a repository to be accessed by the ECS task definition.

FROM envoyproxy/envoy:latest
COPY front-envoy.yaml /etc/envoy/front-envoy.yaml
CMD /usr/local/bin/envoy -c /etc/envoy/front-envoy.yaml

Front proxy envoy configuration:

#front-envoy.yaml
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/service"
                route:
                  cluster: testservice
          http_filters:
          - name: envoy.router
            config: {}
  clusters:
  - name: testservice
    connect_timeout: 0.25s
    type: logical_dns
    lb_policy: round_robin
    http2_protocol_options: {}
    hosts:
    - socket_address:
        # this is the ecs service discovery endpoint for our service 
        address: testservice.ecs
        port_value: 80         
admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8001

The main difference between the two envoy configurations is the set of hosts in the cluster. The front proxy envoy uses ECS service discovery, set up when the service was created, to discover the service endpoints.
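If you set up service discovery from the CLI rather than the console, the calls look roughly like this (the VPC and namespace IDs are placeholders):

# Create the private DNS namespace "ecs", then a service "testservice" in it,
# so registered tasks resolve as testservice.ecs.
aws servicediscovery create-private-dns-namespace --name ecs --vpc vpc-dddd
aws servicediscovery create-service --name testservice \
    --dns-config 'NamespaceId=ns-xxxx,DnsRecords=[{Type=A,TTL=60}]'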

After you push the front proxy image to ECR and create an ECS task definition, launch both services (using the front proxy and the service task definitions) in the same VPC. Now the calls to the front proxy are redirected to one of the service envoys discovered by ECS service discovery.

Now test Envoy’s load balancing capabilities:

$ curl (front-proxy-private-ip):80/service
Hello from behind Envoy! hostname: 6ae1c4ff6b5d

$ curl (front-proxy-private-ip):80/service
Hello from behind Envoy! hostname: 6203f60d9d5c

Conclusion

Now you should be all set! As you can see, getting started with Envoy and ECS can be simple and straightforward. I’m excited to see how you can use these technologies to build next-gen applications!

– Nare