Tag Archives: AWS CodeBuild

Implementing GitFlow Using AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy

Post Syndicated from Ashish Gore original https://aws.amazon.com/blogs/devops/implementing-gitflow-using-aws-codepipeline-aws-codecommit-aws-codebuild-and-aws-codedeploy/

This blog post shows how AWS customers who use a GitFlow branching model can model their merge and release process by using AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy. This post provides a framework, AWS CloudFormation templates, and AWS CLI commands.

Before we begin, we want to point out that GitFlow isn’t something that we practice at Amazon because it is incompatible with the way we think about CI/CD. Continuous integration means that every developer is regularly merging changes back to master (at least once per day). As we’ll explain later, GitFlow involves creating multiple levels of branching off of master where changes to feature branches are only periodically merged all the way back to master to trigger a release. Continuous delivery requires the capability to get every change into production quickly, safely, and sustainably. Research by groups such as DORA has shown that teams that practice CI/CD get features to customers more quickly, are able to recover from issues more quickly, experience fewer failed deployments, and have higher employee satisfaction.

Despite our differing view, we recognize that our customers have requirements that might make branching models like GitFlow attractive (or even mandatory). For this reason, we want to provide information that helps them use our tools to automate merge and release tasks and get as close to CI/CD as possible. With that disclaimer out of the way, let’s dive in!

When Linus Torvalds introduced Git version control in 2005, it really changed the way developers thought about branching and merging. Before Git, these tasks were scary and mostly avoided. As the tools became more mature, branching and merging became both cheap and simple. They are now part of the daily development workflow. In 2010, Vincent Driessen introduced GitFlow, which became an extremely popular branch and release management model. It introduced the concept of a develop branch as the mainline integration and the well-known master branch, which is always kept in a production-ready state. Both master and develop are permanent branches, but GitFlow also recommends short-lived feature, hotfix, and release branches, like so:

GitFlow guidelines:

  • Use develop as a continuous integration branch.
  • Use feature branches to work on multiple features.
  • Use release branches to work on a particular release (multiple features).
  • Use hotfix branches off of master to push a hotfix.
  • Merge to master after every release.
  • Master contains production-ready code.

Now that you have some background, let’s take a look at how we can implement this model using services that are part of AWS Developer Tools: AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy. In this post, we assume you are familiar with these AWS services. If you aren’t, see the links in the Reference section before you begin. We also assume that you have installed and configured the AWS CLI.

Throughout the post, we use the popular GitFlow tool. It’s written on top of Git and automates the process of branch creation and merging. The tool follows the GitFlow branching model guidelines. You don’t have to use this tool. You can use Git commands instead.
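If you prefer plain Git, the feature workflow used later in this post maps roughly to the following commands (a sketch; the feature/feature-x branch name is a placeholder, and git flow adds a few conveniences on top of this):

$ git checkout -b feature/feature-x develop      # git flow feature start feature-x

# ... commit work on the feature branch ...

$ git checkout develop
$ git merge --no-ff feature/feature-x            # git flow feature finish feature-x
$ git branch -d feature/feature-x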

For simplicity, production-like pipelines that have approval or testing stages have been omitted, but they can easily fit into this model. Also, in an ideal production scenario, you would keep Dev and Prod accounts separate.

AWS Developer Tools and GitFlow

Let’s take a look at how we can model GitFlow with AWS CodePipeline. The idea is to create a pipeline per branch. Each pipeline has a lifecycle that is tied to the branch. When a new, short-lived branch is created, we create the pipeline and required resources. After the short-lived branch is merged into develop, we clean up the pipeline and resources to avoid recurring costs.

The following would be permanent and would have the same lifetime as the master and develop branches:

  • AWS CodeCommit master/develop branch
  • AWS CodeBuild project across all branches
  • AWS CodeDeploy application across all branches
  • AWS CloudFormation stack (EC2 instance) for master (prod) and develop (stage)

The following would be temporary and would have the same lifetime as the short-lived branches:

  • AWS CodeCommit feature/hotfix/release branch
  • AWS CodePipeline per branch
  • AWS CodeDeploy deployment group per branch
  • AWS CloudFormation stack (EC2 instance) per branch

Here’s how it would look:

Basic guidelines (assuming EC2/on-premises):

  • Each branch has an AWS CodePipeline.
  • AWS CodePipeline is configured with AWS CodeCommit as the source provider, AWS CodeBuild as the build provider, and AWS CodeDeploy as the deployment provider.
  • AWS CodeBuild is configured with AWS CodePipeline as the source.
  • Each AWS CodePipeline has an AWS CodeDeploy deployment group that uses the Name tag to deploy (see the sketch after this list).
  • A single Amazon S3 bucket is used as the artifact store, but you can choose to keep separate buckets based on repo.
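For example, the per-branch deployment group from the fourth guideline could be created with a command like the following (a sketch; the application name, service role ARN, and tag value are placeholders for your own resources):

$ aws deploy create-deployment-group \
--application-name <codedeploy-application-name> \
--deployment-group-name feature-x \
--deployment-config-name CodeDeployDefault.OneAtATime \
--ec2-tag-filters Key=Name,Value=feature-x,Type=KEY_AND_VALUE \
--service-role-arn <codedeploy-service-role-arn>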

 

Step 1: Use the following AWS CloudFormation templates to set up the required roles and environment for master and develop, including the commit repo, VPC, EC2 instance, CodeBuild, CodeDeploy, and CodePipeline.

$ aws cloudformation create-stack --stack-name GitFlowEnv \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-devops-workshop-environment-setup.template \
--capabilities CAPABILITY_IAM 

$ aws cloudformation create-stack --stack-name GitFlowCiCd \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy.template \
--capabilities CAPABILITY_IAM \
--parameters ParameterKey=MainBranchName,ParameterValue=master ParameterKey=DevBranchName,ParameterValue=develop 

Here is how the pipelines should appear in the CodePipeline console:

Step 2: Push the contents to the AWS CodeCommit repo.

Download https://s3.amazonaws.com/gitflowawsdevopsblogpost/WebAppRepo.zip. Unzip the file, clone the repo, and then commit and push the contents to CodeCommit – WebAppRepo.

Step 3: Run git flow init in the repo to initialize the branches.

$ git flow init

Assume you need to start working on a new feature and create a branch.

$ git flow feature start <branch>

Step 4: Update the stack to create another pipeline for feature-x branch.

$ aws cloudformation update-stack --stack-name GitFlowCiCd \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy-update.template \
--capabilities CAPABILITY_IAM \
--parameters ParameterKey=MainBranchName,ParameterValue=master ParameterKey=DevBranchName,ParameterValue=develop ParameterKey=FeatureBranchName,ParameterValue=feature-x

When you’re done, you should see the feature-x branch in the CodePipeline console. It’s ready to build and deploy. To test, make a change to the branch and view the pipeline in action.

After you have confirmed the branch works as expected, use the finish command to merge changes into the develop branch.

$ git flow feature finish <feature>

After the changes are merged, update the AWS CloudFormation stack to remove the branch. This will help you avoid charges for resources you no longer need.

$ aws cloudformation update-stack --stack-name GitFlowCiCd \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy.template \
--capabilities CAPABILITY_IAM \
--parameters ParameterKey=MainBranchName,ParameterValue=master ParameterKey=DevBranchName,ParameterValue=develop

The steps for the release and hotfix branches are the same.

End result: Pipelines and deployment groups

You should end up with pipelines that look like this.

Next steps

If you take the CLI commands and wrap them in your own custom bash script, you can use GitFlow and the script to quickly set up and tear down pipelines and resources for short-lived branches. This helps you avoid being charged for resources you no longer need. Alternatively, you can write a scheduled Lambda function that, based on creation date, deletes the short-lived pipelines on a regular basis.
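For example, a cleanup script could start by listing the short-lived pipelines and their creation dates, and then feed the stale ones into a stack update or delete-pipeline call (a sketch; the feature- prefix is an assumption about your branch naming convention):

$ aws codepipeline list-pipelines \
--query "pipelines[?starts_with(name, 'feature-')].[name, created]" \
--output table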

Summary

In this blog post, we showed how AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy can be used to model GitFlow. We hope you can use the information in this post to improve your CI/CD strategy, specifically to get your developers working in feature/release/hotfix branches and to provide them with an environment where they can collaborate, test, and deploy changes quickly.

References

Improve Build Performance and Save Time Using Local Caching in AWS CodeBuild

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/improve-build-performance-and-save-time-using-local-caching-in-aws-codebuild/

AWS CodeBuild now supports local caching, which makes it possible for you to persist intermediate build artifacts locally on the build host so that they are available for reuse in subsequent build runs.

Your build project can use one of two types of caching: Amazon S3 or local. In this blog post, we will discuss how to use the local caching feature.

Local caching stores a cache on a build host. The cache is available to that build host only for a limited time and until another build is complete. For example, when you are dealing with large Java projects, compilation might take a long time. You can speed up subsequent builds by using local caching. This is a good option for large intermediate build artifacts because the cache is immediately available on the build host.

Local caching increases build performance for:

  • Projects with a large, monolithic source code repository.
  • Projects that generate and reuse many intermediate build artifacts.
  • Projects that build large Docker images.
  • Projects with many source dependencies.

To use local caching

1. Open the AWS CodeBuild console at https://console.aws.amazon.com/codesuite/codebuild/home.

2. Choose Create project.

3. In Project configuration, enter a name and description for the build project.

4. In Source, for Source provider, choose the source code provider type. In this example, we use an AWS CodeCommit repository name.

5. For Environment image, choose Managed image or Custom image, as appropriate. For environment type, choose Linux or Windows Server. Specify a runtime, runtime version, and service role for your project.

6. Configure the buildspec file for your project.

7. In Artifacts, expand Additional Configuration. For Cache type, choose Local, as shown here.

Local caching supports the following caching modes:

Source cache mode caches Git metadata for primary and secondary sources. After the cache is created, subsequent builds pull only the change between commits. This mode is a good choice for projects with a clean working directory and a source that is a large Git repository. If you choose this option and your project does not use a Git repository (GitHub, GitHub Enterprise, or Bitbucket), the option is ignored. No changes are required in the buildspec file.

Docker layer cache mode caches existing Docker layers. This mode is a good choice for projects that build or pull large Docker images. It can prevent the performance issues caused by pulling large Docker images down from the network.

Note

  • You can use a Docker layer cache in the Linux environment only.
  • The privileged flag must be set so that your project has the required Docker permissions.
  • You should consider the security implications before you use a Docker layer cache.

Custom cache mode caches directories you specify in the buildspec file. This mode is a good choice if your build scenario is not suited to one of the other two local cache modes. If you use a custom cache:

  • Only directories can be specified for caching. You cannot specify individual files.
  • Symlinks are used to reference cached directories.
  • Cached directories are linked to your build before it downloads its project sources. Cached items are overridden if a source item has the same name. Directories are specified using cache paths in the buildspec file.

To use source cache mode

In the build project configuration, under Artifacts, expand Additional Configuration. For Cache type, choose Local. Select Source cache, as shown here.

To use Docker layer cache mode

In the build project configuration, under Artifacts, expand Additional Configuration. For Cache type, choose Local. Select Docker layer cache, as shown here.

Under Privileged, select Enable this flag if you want to build Docker images or want your builds to get elevated privileges. This grants elevated privileges to the Docker process running on the build host.

To use custom cache mode

In your buildspec file, specify the cache path, as shown here.

In the build project configuration, under Artifacts, expand Additional Configuration. For Cache type, choose Local. Select Custom cache, as shown here.


version: 0.2
phases:
  pre_build:
    commands:
      - echo "Enter pre_build commands"
  build:
    commands:
      - echo "Enter build commands"
      
cache:
  paths:
    - '/root/.m2/**/*'
    - '/root/.npm/**/*'
    - 'build/**/*'
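The cache configuration can also be set outside the console. Here is a minimal AWS CLI sketch, assuming an existing build project named my-build-project; enable only the cache modes your project actually needs:

aws codebuild update-project \
--name my-build-project \
--cache '{"type": "LOCAL", "modes": ["LOCAL_SOURCE_CACHE", "LOCAL_DOCKER_LAYER_CACHE", "LOCAL_CUSTOM_CACHE"]}'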

Conclusion

We hope you find the information in this post helpful. If you have feedback, please leave it in the Comments section below. If you have questions, start a new thread on the AWS CodeBuild forum or contact AWS Support.

 

 

 

 

 

Validating AWS CodeCommit Pull Requests with AWS CodeBuild and AWS Lambda

Post Syndicated from Chris Barclay original https://aws.amazon.com/blogs/devops/validating-aws-codecommit-pull-requests-with-aws-codebuild-and-aws-lambda/

Thanks to Jose Ferraris and Flynn Bundy for this great post about how to validate AWS CodeCommit pull requests with AWS CodeBuild and AWS Lambda. Both are DevOps Consultants from the AWS Professional Services’ EMEA team.

You can help ensure a high level of code quality and avoid merging code that does not integrate with previous changes by testing proposed code changes in pull requests before they are allowed to be merged. In this blog post, we’ll show you how to set up this kind of validation using AWS CodeCommit, AWS CodeBuild, and AWS Lambda. In addition, we’ll show you how to set up a pipeline to automatically build your tested, approved, and merged code changes using AWS CodePipeline.

When we talk with customers and partners, we find that they are in different stages in the adoption of DevOps methodologies such as Continuous Integration and Continuous Deployment (CI/CD). However, one of the main requirements we see is a strong emphasis on automation of delivering resources in a safe, secure, and repeatable manner. One of the fundamental principles of CI/CD is aimed at keeping everyone on the team in sync about changes happening in the codebase. With this in mind, it’s important to fail fast and fail early within a CI/CD workflow to ensure that potential issues are caught before making their way into production.

To do this, we can use services such as AWS CodeBuild for running our tests, along with AWS CodeCommit to store our source code. One of the ways we can “fail fast” is to validate pull requests with tests to see how they will integrate with the current master branch of a repository when first opened in AWS CodeCommit. By running our tests against the proposed changes prior to merging them into the master branch, we can ensure a high level of quality early on, catch any potential issues, and boost the confidence of the developer in relation to their changes. In this way, you can start validating your pull requests in AWS CodeCommit by utilizing AWS Lambda and AWS CodeBuild to automatically trigger builds and tests of your development branches.

We can also use services such as AWS CodePipeline for visualizing and creating our pipeline, and automatically building and deploying merged code that has met the validation bar for pull requests.

The following diagram shows the workflow of a pull request. The AWS CodeCommit repository contains two branches, the master branch that contains approved code, and the development branch, where changes to the code are developed. In this workflow, a pull request is created with the new code in the development branch, which the developer wants to merge into the master branch. The creation of the pull request is an event detected by AWS CloudWatch. This event will start two separate actions:
• It triggers an AWS Lambda function that will post an automated comment to the pull request that indicates a build to test the changes is about to begin.
• It also triggers an AWS CodeBuild project that will build and validate those changes.

When the build completes, AWS CloudWatch detects that event. Another AWS Lambda function posts an automated comment to the pull request with the results of the build and a link to the build logs. Based on this automated testing, the developer who opened the pull request can update the code to address any build failures, and then update the pull request with those changes. Those updates will be built, and the build results are then posted to the pull request as a comment.
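The CloudWatch Events rule that detects pull request activity can be sketched with the AWS CLI as follows (the rule name is a placeholder, you would still register your Lambda function and CodeBuild project as targets, and the event names assume the standard CodeCommit pull request events):

aws events put-rule \
--name pull-request-validation-trigger \
--event-pattern '{
  "source": ["aws.codecommit"],
  "detail-type": ["CodeCommit Pull Request State Change"],
  "detail": {
    "event": ["pullRequestCreated", "pullRequestSourceBranchUpdated"]
  }
}'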

Let’s show how this works in a specific example project. This project has its own set of tasks defined in the build specification file that will execute and validate this specific pull request. The buildspec.yml for our example AWS CloudFormation template contains the following code:

version: 0.2

phases:
  install:
    commands:
      - pip install cfn-lint
  build:
    commands:
      - cfn-lint --template ./template.yaml --regions $AWS_REGION
      - aws cloudformation validate-template --template-body file://$(pwd)/template.yaml
artifacts:
  files:
    - '*'

In this example, we install cfn-lint, which performs various checks against our template, and we also run the AWS CloudFormation validate-template command via the AWS CLI.

Once the code included in the pull request has been built, AWS CloudWatch detects the build complete event and passes along the outcome to a Lambda function that will update the specific commit with a comment that notifies the users of the results. It also includes a link to build logs in AWS CodeBuild. This process repeats any time the pull request is updated. For example, if an initial pull request was opened but failed the set of tests associated with the project, the developer might fix the code and make an update to the currently opened pull request. This will in turn trigger the function to run again and update the comments section with the test results.
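Under the hood, posting that comment is a single CodeCommit API call. Here is a sketch of what the Lambda function effectively does, shown with the AWS CLI and placeholder identifiers:

aws codecommit post-comment-for-pull-request \
--pull-request-id <pull-request-id> \
--repository-name <repository-name> \
--before-commit-id <destination-commit-id> \
--after-commit-id <source-commit-id> \
--content "Build FAILED for commit <source-commit-id>. Logs: <link-to-codebuild-logs>"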

Testing and validating pull requests before they can be merged into production code is a common approach and a best practice when working with CI/CD. Once the pull request is approved and merged into the production branch, it is also a good CI/CD practice to automatically build, test, and deploy that code. This is why we’ve structured this into two different AWS CloudFormation stacks (both can be found in our GitHub repository). One contains a base layer template that contains the resources you would only need to create once, in this case the AWS Lambda functions that test and update pull requests. The second stack includes an example of a CI/CD pipeline defined in AWS CloudFormation that imports the resources from the base layer stack.

We start by creating our base layer, which creates the Lambda functions and sets up AWS IAM roles that the functions will use to interact with the various AWS services. Once this stack is in place, we can add one or more pipeline stacks which import some of the values from the base layer. The pipeline will automatically build any changes merged into the master branch of the repository. Once any pipeline stack is complete, we have an AWS CodeCommit repository, AWS CodeBuild project, and an AWS CodePipeline pipeline set up and ready for deployment.

We can now push some code into our repository on the master branch to trigger a run-through of our pipeline.

In this example we will use the following AWS CloudFormation template. This template creates a single Amazon S3 bucket. This template will be the artifact that we push through our CI/CD pipeline and deploy to our stages.

AWSTemplateFormatVersion: '2010-09-09'
Description: 'A sample CloudFormation template that we can use to validate in our pipeline'
Resources:
  S3Bucket:
    Type: 'AWS::S3::Bucket'

Once this code is tested and approved in a pull request, it will be merged into the production branch as part of the pull request approve and merge process. This will automatically start our pipeline in AWS CodePipeline, and will run through to the stages defined for it. For example:

Now we can make some changes to our code base in the development branch and open a pull request. First, edit the file to make a typo in our CloudFormation template so we can test the validation.

AWSTemplateFormatVersion: '2010-09-09'
Metadata: 
  License: Apache-2.0
Description: 'A sample CloudFormation template that we can use to validate in our pipeline'
Resources:
  S3Bucket:
    Type: 'AWS::S3::Bucket1'

Notice that we changed the S3 bucket to be AWS::S3::Bucket1. This doesn’t exist, so cfn-lint will return a failure when it attempts to validate the template.

Now push this change into our development branch in the AWS CodeCommit repository and open the pull request against the production (master) branch.

From there, navigate to the comments section of the pull request. You should see a status update that the pull request is currently building.

Once the build is complete, you should see feedback on the outcome of the build and its results given to us as a comment.

Choose the Logs link to view details about the failure. We can see that we were able to catch an error related to linting rules failing.

We can remedy this and update our pull request with the updated code. Upon doing so, we can see another build has been kicked off by looking at the comments of the pull request. Once this has been completed we can confirm that our pull request has been validated as desired and our tests have passed.

Once this pull request is approved and merged to master, this will start our pipeline in AWS CodePipeline, which will take this code change through the specified stages.

 

How to Use Cross-Account ECR Images in AWS CodeBuild for Your Build Environment

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/how-to-use-cross-account-ecr-images-in-aws-codebuild-for-your-build-environment/

AWS CodeBuild now makes it possible for you to access Docker images from any Amazon Elastic Container Registry repository in another account as the build environment. With this feature, AWS CodeBuild allows you to pull any image from a repository to which you have been granted resource-level permissions.

In this blog post, we will show you how to provision a build environment using an image from another AWS account.

Here is a quick overview of the services used in our example:

AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. It provides a fully preconfigured build platform for most popular programming languages and build tools, including Apache Maven, Gradle, and more.

Amazon Elastic Container Registry (Amazon ECR) is a fully managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.

We will use a sample Docker image in an Amazon ECR image repository in AWS account B. The CodeBuild project in AWS account A will pull the images from the Amazon ECR image repository in AWS account B.

Prerequisites:

To get started you need:

  • Two AWS accounts (AWS account A and AWS account B).
  • In AWS account B, an Amazon ECR image repository that contains the images you would like to use for your build environment. If you do not have an image repository and a sample image, see Docker Sample in the AWS CodeBuild User Guide.
  • In AWS account A, an AWS CodeCommit repository with a buildspec.yml file and sample code.
  • Permissions on the Amazon ECR image repository in AWS account B that allow AWS CodeBuild to pull the repository’s Docker image into the build environment. Use the following steps to grant them.

To grant CodeBuild permissions to pull the Docker image into the build environment

1.     Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.

2.     Choose the name of the repository you created.

3.     On the Permissions tab, choose Edit JSON policy.

4.     Apply the following policy and save.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CodeBuildAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<arn of the service role>"  
      },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
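If you prefer the CLI over the console, the same repository policy can be applied from AWS account B with a command like the following (a sketch; the repository and policy file names are placeholders):

aws ecr set-repository-policy \
--repository-name <repository-name> \
--policy-text file://codebuild-access-policy.json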

To use an image from account B and set up a build project in account A

1. Open the AWS CodeBuild console at https://console.aws.amazon.com/codesuite/codebuild/home.

2. Choose Create project.

3. In Project configuration, enter a name and description for the build project.

4. In Source, for Source provider, choose the source code provider type. In this example, we use the AWS CodeCommit repository name.

 

5.  For Environment, we will pull the Docker image from AWS account B and use the image to create the build environment to build artifacts. To configure the build environment, choose Custom Image. For Image registry, choose Amazon ECR. For ECR account, choose Other ECR account.

6.  In Amazon ECR repository URI, enter the URI for the image repository from AWS account B and then choose Create build project.

7. Go to the build project you just created, and choose Start build. The build execution will download the source code from the AWS CodeCommit repository and provision the build environment using the image retrieved from the image registry.

Next steps

Now that you have seen how to use cross-account ECR images, you can integrate a build step in AWS CodePipeline and use the build environment to create artifacts and deploy your application. To integrate a build step in your pipeline, see Working with Deployments in AWS CodeDeploy in the AWS CodeDeploy User Guide.

If you have any feedback, please leave it in the Comments section below. If you have questions, please start a thread on the AWS CodeBuild forum or contact AWS Support.

 

How to Use Docker Images from a Private Registry for Your Build Environment

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/how-to-use-docker-images-from-a-private-registry-in-aws-codebuild-for-your-build-environment/

AWS CodeBuild now supports using a Docker image that is stored in a private registry as your runtime environment. Previously, the service supported the use of Docker images from public Docker Hub or Amazon ECR only.

In this blog post, we will show you how to use a Docker image from a private registry to create the AWS CodeBuild runtime environment. The credentials for the private registry are stored in AWS Secrets Manager.

Here is an overview of the services used in our example:

AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. It provides a fully preconfigured build platform for most popular programming languages and build tools, including Apache Maven, Gradle, and more.

Docker Hub repositories allow you to share container images with your team, customers, or the Docker community at large.

AWS Secrets Manager protects secrets required to access your applications, services, and IT resources. The service makes it possible for you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle.

 

Prerequisites:

To get started you need:

  • A private repository or account.
  • A Secrets Manager secret that stores your Docker Hub credentials. The credentials are used to access your private repository.
  • An AWS account to create an AWS CodeBuild project.
  • A service role created in IAM that grants access to your Secrets Manager secret.
  • An AWS CodeCommit repository set up in your AWS account with a buildspec.yml file and sample code.

Create a private registry

If you do not have a private registry, follow the steps in the documentation on the Docker website. Alternatively, you can execute the following commands in a terminal to pull an image, get its ID, and push it to a new repository.

docker pull amazonlinux

docker images amazonlinux --format {{.ID}}

docker tag image-id your-username/repository-name:latest

docker login

docker push your-username/repository-name

 

Create a basic secret in AWS Secrets Manager

In AWS Secrets Manager, a basic secret is one with a minimum of metadata and a single encrypted secret value. The one version that’s stored in the secret is automatically labeled AWSCURRENT.

To create a basic secret

1.     Open the AWS Secrets Manager console at https://console.aws.amazon.com/secretsmanager/.

2.     Choose Store a new secret.

3.     In the Select a secret type section, specify the kind of secret that you want to create by choosing Other type of secrets, and then enter a user name and password to access your private registry.

4.     In Secret key/value, create one key-value pair for your Docker Hub user name and one key-value pair for your Docker Hub password.

5.     For Secret name, enter a name, such as dockerhub. You can enter an optional description to help you remember that this is a secret for Docker Hub.

6.     Leave Disable automatic rotation selected because the keys correspond to your Docker Hub credentials.

7.     Review your settings, and then choose Store secret.
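If you prefer the CLI, here is a sketch of creating the same secret. The username and password key names are what CodeBuild expects for registry credentials; treat the exact names as something to verify against the current documentation:

aws secretsmanager create-secret \
--name dockerhub \
--description "Docker Hub credentials for AWS CodeBuild" \
--secret-string '{"username":"<your-docker-hub-username>","password":"<your-docker-hub-password>"}'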

 

 

Create an AWS CodeBuild project to pull Docker images from a private registry

Note

If your private registry is in your VPC, it must have public internet access. AWS CodeBuild cannot pull an image from a private IP address in a VPC.

To use a Docker image from a private registry in your AWS CodeBuild project

1.     Open the AWS CodeBuild console at https://console.aws.amazon.com/codesuite/codebuild/home.

2.     Choose Create project.

3.     In Project configuration, for Project name, enter a name and description for the build project.

4.     In Source, for Source provider, choose the source code provider type. In this example, we are using the name of an AWS CodeCommit repository.

 

5. We will pull the Docker image from a private registry and use the image to create the build environment to build artifacts. To configure the build environment, in Environment, choose Custom image. For Environment type, choose Linux or Windows. For Custom image type, choose Other location, and then enter the image location and the ARN or name of your Secrets Manager credentials.

6. Go to the build project you just created, and choose Start build. The build execution will download the source code from the AWS CodeCommit repository and provision the build environment using the image retrieved from the registry.
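The same environment configuration can also be applied with the AWS CLI. Here is a minimal sketch, assuming an existing project named my-private-registry-project and the Secrets Manager secret ARN from the previous section:

aws codebuild update-project \
--name my-private-registry-project \
--environment '{
  "type": "LINUX_CONTAINER",
  "computeType": "BUILD_GENERAL1_SMALL",
  "image": "<your-username>/<repository-name>:latest",
  "imagePullCredentialsType": "SERVICE_ROLE",
  "registryCredential": {
    "credential": "<arn-of-your-secrets-manager-secret>",
    "credentialProvider": "SECRETS_MANAGER"
  }
}'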

Conclusion

Using the above guidelines, you can now provision a build environment using Docker images from a private registry.

Next steps:

Now that you have seen how to use Docker images to provision build environments from a private registry, you can integrate a build step in AWS CodePipeline and use the build environment to create artifacts and deploy your application. To integrate a build step in your pipeline, see Working with Deployments in AWS CodeDeploy in the AWS CodeDeploy User Guide.

If you have feedback, please leave it in the Comments section below. If you have questions, please start a thread on the AWS CodeBuild forum or contact AWS Support.

 

 

 

 

 

Build a Continuous Delivery Pipeline for Your Container Images with Amazon ECR as Source

Post Syndicated from Daniele Stroppa original https://aws.amazon.com/blogs/devops/build-a-continuous-delivery-pipeline-for-your-container-images-with-amazon-ecr-as-source/

Today, we are launching support for Amazon Elastic Container Registry (Amazon ECR) as a source provider in AWS CodePipeline. You can now initiate an AWS CodePipeline pipeline update by uploading a new image to Amazon ECR. This makes it easier to set up a continuous delivery pipeline and use the AWS Developer Tools for CI/CD.

You can use Amazon ECR as a source if you’re implementing a blue/green deployment with AWS CodeDeploy from the AWS CodePipeline console. For more information about using the Amazon Elastic Container Service (Amazon ECS) console to implement a blue/green deployment without CodePipeline, see Implement Blue/Green Deployments for AWS Fargate and Amazon ECS Powered by AWS CodeDeploy.

This post shows you how to create a complete, end-to-end continuous deployment (CD) pipeline with Amazon ECR and AWS CodePipeline. It walks you through setting up a pipeline to build your images when the upstream base image is updated.

Prerequisites

To follow along, you must have these resources in place:

  • A source control repository with your base image Dockerfile and a Docker image repository to store your image. In this walkthrough, we use a simple Dockerfile for the base image:
    FROM alpine:3.8

    RUN apk update

    RUN apk add nodejs
  • A source control repository with your application Dockerfile and source code and a Docker image repository to store your image. For the application Dockerfile, we use our base image and then add our application code:
    FROM 012345678910.dkr.ecr.us-east-1.amazonaws.com/base-image

    ENV PORT=80

    EXPOSE $PORT

    COPY app.js /app/

    CMD ["node", "/app/app.js"]

This walkthrough uses AWS CodeCommit for the source control repositories and Amazon ECR for the Docker image repositories. For more information, see Create an AWS CodeCommit Repository in the AWS CodeCommit User Guide and Creating a Repository in the Amazon Elastic Container Registry User Guide.

Note: The source control repositories and image repositories must be created in the same AWS Region.

Set up IAM service roles

In this walkthrough, you use AWS CodeBuild and AWS CodePipeline to build your Docker images and push them to Amazon ECR. Both services use AWS Identity and Access Management (IAM) service roles to make calls to Amazon ECR API operations. The service roles must have a policy that provides permissions to make these Amazon ECR calls. The following procedure helps you attach the required permissions to the CodeBuild service role.

To create the CodeBuild service role

  1. Follow these steps to use the IAM console to create a CodeBuild service role.
  2. On step 10, make sure to also add the AmazonEC2ContainerRegistryPowerUser policy to your role.

CodeBuild service role policies
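If you manage roles from the command line instead, attaching the managed policy to an existing CodeBuild service role might look like this (the role name is a placeholder):

aws iam attach-role-policy \
--role-name <codebuild-service-role-name> \
--policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser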

Create a build specification file for your base image

A build specification file (or build spec) is a collection of build commands and related settings, in YAML format, that AWS CodeBuild uses to run a build. Add a buildspec.yml file to your source code repository to tell CodeBuild how to build your base image. The example build specification used here does the following:

  • Pre-build stage:
    • Sign in to Amazon ECR.
    • Set the repository URI to your ECR image and add an image tag with the first seven characters of the Git commit ID of the source.
  • Build stage:
    • Build the Docker image and tag the image with latest and the Git commit ID.
  • Post-build stage:
    • Push the image with both tags to your Amazon ECR repository.
version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws --version
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=012345678910.dkr.ecr.us-east-1.amazonaws.com/base-image
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=${COMMIT_HASH:=latest}
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $REPOSITORY_URI:latest .
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push $REPOSITORY_URI:latest
      - docker push $REPOSITORY_URI:$IMAGE_TAG

To add a buildspec.yml file to your source repository

  1. Open a text editor and then copy and paste the build specification above into a new file.
  2. Replace the REPOSITORY_URI value (012345678910.dkr.ecr.us-east-1.amazonaws.com/base-image) with your Amazon ECR repository URI (without any image tag) for your Docker image. Replace base-image with the name for your base Docker image.
  3. Commit and push your buildspec.yml file to your source repository.
    git add .
    git commit -m "Adding build specification."
    git push

Create a build specification file for your application

Add a buildspec.yml file to your source code repository to tell CodeBuild how to build your source code and your application image. The example build specification used here does the following:

  • Pre-build stage:
    • Sign in to Amazon ECR.
    • Set the repository URI to your ECR image and add an image tag derived from the CodeBuild build ID.
  • Build stage:
    • Build the Docker image and tag the image with latest and the build-specific tag.
  • Post-build stage:
    • Push the image with both tags to your ECR repository.
version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws --version
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=012345678910.dkr.ecr.us-east-1.amazonaws.com/hello-world
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=build-$(echo $CODEBUILD_BUILD_ID | awk -F":" '{print $2}')
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $REPOSITORY_URI:latest .
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push $REPOSITORY_URI:latest
      - docker push $REPOSITORY_URI:$IMAGE_TAG
artifacts:
  files:
    - imageDetail.json

To add a buildspec.yml file to your source repository

  1. Open a text editor and then copy and paste the build specification above into a new file.
  2. Replace the REPOSITORY_URI value (012345678910.dkr.ecr.us-east-1.amazonaws.com/hello-world) with your Amazon ECR repository URI (without any image tag) for your Docker image. Replace hello-world with the container name in your service’s task definition that references your Docker image.
  3. Commit and push your buildspec.yml file to your source repository.
    git add .
    git commit -m "Adding build specification."
    git push

Create a continuous deployment pipeline for your base image

Use the AWS CodePipeline wizard to create your pipeline stages:

  1. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
  2. On the Welcome page, choose Create pipeline.
    If this is your first time using AWS CodePipeline, an introductory page appears instead of Welcome. Choose Get Started Now.
  3. On the Step 1: Name page, for Pipeline name, type the name for your pipeline and choose Next step. For this walkthrough, the pipeline name is base-image.
  4. On the Step 2: Source page, for Source provider, choose AWS CodeCommit.
    1. For Repository name, choose the name of the AWS CodeCommit repository to use as the source location for your pipeline.
    2. For Branch name, choose the branch to use, and then choose Next step.
  5. On the Step 3: Build page, choose AWS CodeBuild, and then choose Create project.
    1. For Project name, choose a unique name for your build project. For this walkthrough, the project name is base-image.
    2. For Operating system, choose Ubuntu.
    3. For Runtime, choose Docker.
    4. For Version, choose aws/codebuild/docker:17.09.0.
    5. For Service role, choose Existing service role, choose the CodeBuild service role you’ve created earlier, and then clear the Allow AWS CodeBuild to modify this service role so it can be used with this build project box.
    6. Choose Continue to CodePipeline.
    7. Choose Next.
  6. On the Step 4: Deploy page, choose Skip and acknowledge the pop-up warning.
  7. On the Step 5: Review page, review your pipeline configuration, and then choose Create pipeline.

Base image pipeline

Create a continuous deployment pipeline for your application image

The execution of the application image pipeline is triggered by changes to the application source code and changes to the upstream base image. You first create a pipeline, and then edit it to add a second source stage.

    1. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
    2. On the Welcome page, choose Create pipeline.
    3. On the Step 1: Name page, for Pipeline name, type the name for your pipeline, and then choose Next step. For this walkthrough, the pipeline name is hello-world.
    4. For Service role, choose Existing service role, and then choose the CodePipeline service role you modified earlier.
    5. On the Step 2: Source page, for Source provider, choose Amazon ECR.
      1. For Repository name, choose the name of the Amazon ECR repository to use as the source location for your pipeline. For this walkthrough, the repository name is base-image.

Amazon ECR source configuration

  1. On the Step 3: Build page, choose AWS CodeBuild, and then choose Create project.
    1. For Project name, choose a unique name for your build project. For this walkthrough, the project name is hello-world.
    2. For Operating system, choose Ubuntu.
    3. For Runtime, choose Docker.
    4. For Version, choose aws/codebuild/docker:17.09.0.
    5. For Service role, choose Existing service role, choose the CodeBuild service role you’ve created earlier, and then clear the Allow AWS CodeBuild to modify this service role so it can be used with this build project box.
    6. Choose Continue to CodePipeline.
    7. Choose Next.
  2. On the Step 4: Deploy page, choose Skip and acknowledge the pop-up warning.
  3. On the Step 5: Review page, review your pipeline configuration, and then choose Create pipeline.

The pipeline will fail, because it is missing the application source code. Next, you edit the pipeline to add an additional action to the source stage.

  1. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
  2. On the Welcome page, choose your pipeline from the list. For this walkthrough, the pipeline name is hello-world.
  3. On the pipeline page, choose Edit.
  4. On the Editing: hello-world page, in Edit: Source, choose Edit stage.
  5. Choose the existing source action, and choose the edit icon.
    1. Change Output artifacts to BaseImage, and then choose Save.
  6. Choose Add action, and then enter a name for the action (for example, Code).
    1. For Action provider, choose AWS CodeCommit.
    2. For Repository name, choose the name of the AWS CodeCommit repository for your application source code.
    3. For Branch name, choose the branch.
    4. For Output artifacts, specify SourceArtifact, and then choose Save.
  7. On the Editing: hello-world page, choose Save and acknowledge the pop-up warning.

Application image pipeline

Test your end-to-end pipeline

Your pipeline should now have everything it needs to run an end-to-end, native AWS continuous deployment. Test its functionality by pushing a code change to your base image repository.

  1. Make a change to your configured source repository, and then commit and push the change.
  2. Open the AWS CodePipeline console at https://console.aws.amazon.com/codepipeline/.
  3. Choose your pipeline from the list.
  4. Watch the pipeline progress through its stages. As the base image is built and pushed to Amazon ECR, see how the second pipeline is triggered, too. When the execution of your pipeline is complete, your application image is pushed to Amazon ECR, and you are now ready to deploy your application. For more information about continuously deploying your application, see Create a Pipeline with an Amazon ECR Source and ECS-to-CodeDeploy Deployment in the AWS CodePipeline User Guide.

Conclusion

In this post, we showed you how to create a complete, end-to-end continuous deployment (CD) pipeline with Amazon ECR and AWS CodePipeline. You saw how to initiate an AWS CodePipeline pipeline update by uploading a new image to Amazon ECR. Support for Amazon ECR in AWS CodePipeline makes it easier to set up a continuous delivery pipeline and use the AWS Developer Tools for CI/CD.

Scanning Docker Images for Vulnerabilities using Clair, Amazon ECS, ECR, and AWS CodePipeline

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/scanning-docker-images-for-vulnerabilities-using-clair-amazon-ecs-ecr-aws-codepipeline/

Post by Vikrama Adethyaa, Solution Architect and Tiffany Jernigan, Developer Advocate

 

Containers are an increasingly important way for you to package and deploy your applications. They are lightweight and provide a consistent, portable software environment for applications to easily run and scale anywhere.

A container is launched from a container image, an executable package that includes everything needed to run an application: the application code, configuration files, runtime (for example, Java, Python, etc.), libraries, and environment variables.

A container image is built up from a series of layers. For a Docker image, each layer in the image represents an instruction in the image’s Dockerfile. A parent image is the image on which your image is built. It refers to the contents of the FROM directive in the Dockerfile. Most Dockerfiles start from a parent image, and often the parent image was downloaded from a public registry.

It is incredibly difficult and time-consuming to manually track all the files, packages, libraries, and so on, included in an image along with the vulnerabilities that they may possess. Having a security breach is one of the costliest things an organization can endure. It takes years to build up a reputation and only seconds to tear it down.

One way to prevent breaches is to regularly scan your images and compare the dependencies to a known list of common vulnerabilities and exposures (CVEs). Public CVE lists contain an identification number, description, and at least one public reference for known cybersecurity vulnerabilities. The automatic detection of vulnerabilities helps increase awareness and best security practices across developer and operations teams. It encourages action to patch and address the vulnerabilities.

This post walks you through the process of setting up an automated vulnerability scanning pipeline. You use AWS CodePipeline to scan your container images for known security vulnerabilities and deploy the container only if the vulnerabilities are within the defined threshold.

This solution uses CoreOS Clair for static analysis of vulnerabilities in container images. Clair is an API-driven analysis engine that inspects containers layer-by-layer for known security flaws. Clair scans each container layer and provides a notification of vulnerabilities that may be a threat, based on the CVE database and similar data feeds from Red Hat, Ubuntu, and Debian.

Deploying Clair

Here’s how to install Clair on AWS. The following diagram shows the high-level architecture of Clair.

Clair uses PostgreSQL, so use Aurora PostgreSQL to host the Clair database. You deploy Clair as an ECS service with the Fargate launch type behind an Application Load Balancer. The Clair container is deployed in a private subnet behind the Application Load Balancer that is hosted in the public subnets. The private subnets must have a route to the internet using the NAT gateway, as Clair fetches the latest vulnerability information from multiple online sources.

Prerequisites

Ensure that the following are installed or configured on your workstation before you deploy Clair:

  • Docker
  • Git
  • AWS CLI installed
  • AWS CLI is configured with your access key ID and secret access key, and the default region as us-east-1

Download the AWS CloudFormation template for deploying Clair

To help you quickly deploy Clair on AWS and set up CodePipeline with automatic vulnerability detection, use AWS CloudFormation templates that can be downloaded from the aws-codepipeline-docker-vulnerability-scan GitHub repository. The repository also includes a simple, containerized NGINX website for testing your pipeline.

# Clone the GitHub repository
git clone https://github.com/aws-samples/aws-codepipeline-docker-vulnerability-scan.git

cd aws-codepipeline-docker-vulnerability-scan

VPC requirements

We recommend a VPC with the following specification for deploying CoreOS Clair:

  • Two public subnets
  • Two private subnets
  • NAT gateways to allow internet access for services in private subnets

You can create such a VPC using the AWS CloudFormation template networking-template.yaml that is included in the sample code you cloned from GitHub.

# Create the VPC
aws cloudformation create-stack \
--stack-name coreos-clair-vpc-stack \
--template-body file://networking-template.yaml

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name coreos-clair-vpc-stack

# Get stack outputs
aws cloudformation describe-stacks \
--stack-name coreos-clair-vpc-stack \
--query 'Stacks[].Outputs[]'

Build the Clair Docker image

First, create an Amazon Elastic Container Registry (Amazon ECR) repository to host your Clair Docker image. Then, build the Clair Docker image on your workstation and push it to the ECR repository that you created.

# Create the ECR repository
# Note the URI and ARN of the ECR Repository
aws ecr create-repository --repository-name coreos-clair

# Build the Docker image
docker build -t <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/coreos-clair:latest ./coreos-clair

# Push the Docker image to ECR
aws ecr get-login --no-include-email | bash
docker push <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/coreos-clair:latest

Deploy Clair using AWS CloudFormation

Now that the Clair Docker image has been built and pushed to ECR, deploy Clair as an ECS service with the Fargate launch type. The following AWS CloudFormation stack creates an ECS cluster named clair-demo-cluster and deploys the Clair service.

# Create the AWS CloudFormation stack
# <ECRRepositoryUri> - CoreOS Clair ECR repository URI without an image tag
# Example - <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/coreos-clair

aws cloudformation create-stack \
--stack-name coreos-clair-stack \
--template-body file://coreos-clair/clair-template.yaml \
--capabilities CAPABILITY_IAM \
--parameters \
ParameterKey="VpcId",ParameterValue="<VpcId>" \
ParameterKey="PublicSubnets",ParameterValue=\"<PublicSubnet01-ID>,<PublicSubnet02-ID>\" \
ParameterKey="PrivateSubnets",ParameterValue=\"<PrivateSubnet01-ID>,<PrivateSubnet02-ID>\" \
ParameterKey="ECRRepositoryUri",ParameterValue="<ECRRepositoryUri>"

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name coreos-clair-stack

# Get stack outputs
# Note the ClairAlbDnsName
aws cloudformation describe-stacks \
--stack-name coreos-clair-stack \
--query 'Stacks[].Outputs[]'

Deploying the sample website

Deploy a simple static website running on NGINX as a container. An AWS CloudFormation template is included in the sample code that you cloned from GitHub.

Create a CodeCommit repository for the NGINX website

You create an AWS CodeCommit repository to host the sample NGINX website code. This repository is the source of the pipeline that you create later. Before you proceed with the following steps, ensure SSH authentication to CodeCommit.

# Create the CodeCommit repository
# Note the cloneUrlSsh value
aws codecommit create-repository --repository-name my-nginx-website
 
# Clone the empty CodeCommit repository
cd ../
git clone <cloneUrlSsh>

# Copy the contents of nginx-website to my-nginx-website
cp -R aws-codepipeline-docker-vulnerability-scan/nginx-website/ my-nginx-website/

# Commit the changes
cd my-nginx-website/
git add *
git commit -m "Initial commit"
git push

Build the NGINX Docker image

Create an ECR repository to host your NGINX website Docker image. Build the image on your workstation using the file Dockerfile-amznlinux, where Amazon Linux is the parent image. After the image is built, push it to the ECR repository that you created.

# Create an ECR repository
# Note the URI and ARN of the ECR repository
aws ecr create-repository --repository-name nginx-website

# Build the Docker image
docker build -f Dockerfile-amznlinux -t <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website:latest .

# Push the Docker image to ECR
docker push <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website:latest

Deploy the NGINX website using AWS CloudFormation

Now deploy the NGINX website. The following stack deploys the NGINX website onto the same ECS cluster (clair-demo-cluster) as Clair.

# Create the AWS CloudFormation stack
# <ECRRepositoryUri> - Nginx-Website ECR Repository URI without Image tag
# Example: <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website

cd ../aws-codepipeline-docker-vulnerability-scan/

aws cloudformation create-stack \
--stack-name nginx-website-stack \
--template-body file://nginx-website/nginx-website-template.yaml \
--capabilities CAPABILITY_IAM \
--parameters \
ParameterKey="VpcId",ParameterValue="<VpcId>" \
ParameterKey="PublicSubnets",ParameterValue=\"<PublicSubnet01-ID>,<PublicSubnet02-ID>\" \
ParameterKey="PrivateSubnets",ParameterValue=\"<PrivateSubnet01-ID>,<PrivateSubnet02-ID>\" \
ParameterKey="ECRRepositoryUri",ParameterValue="<ECRRepositoryUri>"

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name nginx-website-stack

# Get stack outputs
aws cloudformation describe-stacks \
--stack-name nginx-website-stack \
--query 'Stacks[].Outputs[]'

Note the AWS CloudFormation stack outputs. The stack output contains the Application Load Balancer URL for the NGINX website and the ECS service name of the NGINX website. You need the ECS service name for the pipeline.

Building the pipeline

In this section, you build a pipeline to automate vulnerability scanning for the nginx-website Docker image builds. Every time a code change is made, the Docker image is rebuilt and scanned for vulnerabilities. Only if the vulnerabilities are within the defined threshold is the container deployed onto ECS. For more information, see Tutorial: Continuous Deployment with AWS CodePipeline.

The sample code includes an AWS CloudFormation template to create the pipeline. The buildspec.yml file is used by AWS CodeBuild to build the nginx-website Docker image and scan the image using Clair.

CodeBuild build spec

A build spec is a collection of build commands and related settings, in YAML format, that AWS CodeBuild uses to run a build. You can include a build spec in the root directory of your application source code, or you can define a build spec when you create a build project.

In this sample app, you include the build spec in the root directory of your sample application source code. The buildspec.yml file is located in the /aws-codepipeline-docker-vulnerability-scan/nginx-website folder.

Use Klar, a simple tool to analyze images stored in a private or public Docker registry for security vulnerabilities using Clair. Klar serves as a client which coordinates the image checks between ECR and Clair.

In the buildspec.yml file, you set the variable CLAIR_OUTPUT=Critical. CLAIR_OUTPUT defines the severity level threshold. Vulnerabilities with severity levels higher than or equal to this threshold are reported. The supported levels are:

  • Unknown
  • Negligible
  • Low
  • Medium
  • High
  • Critical
  • Defcon1

You can configure Klar to your requirements by setting the variables as defined in https://github.com/optiopay/klar.
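Before wiring Klar into the pipeline, you can exercise it manually from your workstation against an image that is already in ECR. Here is a sketch, using the same get-login approach as the build spec below and placeholder account, image, and ALB values:

# Log in to ECR and extract the registry password for Klar
ECR_LOGIN=$(aws ecr get-login --region us-east-1 --no-include-email)
PASSWORD=$(echo $ECR_LOGIN | cut -d' ' -f6)

# Scan the image and report vulnerabilities of severity High or above
DOCKER_USER=AWS DOCKER_PASSWORD=$PASSWORD CLAIR_ADDR=http://<ClairAlbDnsName> CLAIR_OUTPUT=High ./klar <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/nginx-website:latest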

# Set the following variables as CodeBuild project environment variables
# ECR_REPOSITORY_URI
# CLAIR_URL

version: 0.2
phases:
  pre_build:
    commands:
      - echo Fetching ECR Login
      - ECR_LOGIN=$(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - echo Logging in to Amazon ECR...
      - $ECR_LOGIN
      - IMAGE_TAG=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - echo Downloading Clair client Klar-2.1.1
      - wget https://github.com/optiopay/klar/releases/download/v2.1.1/klar-2.1.1-linux-amd64
      - mv ./klar-2.1.1-linux-amd64 ./klar
      - chmod +x ./klar
      - PASSWORD=`echo $ECR_LOGIN | cut -d' ' -f6`
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $ECR_REPOSITORY_URI:latest .
      - docker tag $ECR_REPOSITORY_URI:latest $ECR_REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - bash -c "if [ \"$CODEBUILD_BUILD_SUCCEEDING\" == \"0\" ]; then exit 1; fi"
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push $ECR_REPOSITORY_URI:latest
      - docker push $ECR_REPOSITORY_URI:$IMAGE_TAG
      - echo Running Clair scan on the Docker Image
      - DOCKER_USER=AWS DOCKER_PASSWORD=${PASSWORD} CLAIR_ADDR=$CLAIR_URL CLAIR_OUTPUT=Critical ./klar $ECR_REPOSITORY_URI
      - echo Writing image definitions file...
      - printf '[{"name":"MyWebsite","imageUri":"%s"}]' $ECR_REPOSITORY_URI:$IMAGE_TAG > imagedefinitions.json
artifacts:
  files: imagedefinitions.json

The build spec does the following:

Pre-build stage:

  • Log in to ECR.
  • Download the Clair client Klar.

Build stage:

  • Build the Docker image and tag it as latest and with the Git commit ID.

Post-build stage:

  • Push the image to your ECR repository with both tags.
  • Trigger Klar to scan the image that you pushed to ECR for security vulnerabilities using Clair.
  • Write a file called imagedefinitions.json in the build root that has your Amazon ECS service’s container name and the image and tag. The deployment stage of your CD pipeline uses this information to create a new revision of your service’s task definition. It then updates the service to use the new task definition. The imagedefinitions.json file is required for the AWS CodeDeploy ECS job worker.

Deploy the pipeline

Deploy the pipeline using the AWS CloudFormation template provided with the sample code. The following template creates the CodeBuild project, CodePipeline pipeline, Amazon CloudWatch Events rule, and necessary IAM permissions.

# Deploy the pipeline
 
# Replace the following variables 
# WebsiteECRRepositoryARN – NGINX website ECR repository ARN
# WebsiteECRRepositoryURI – NGINX website ECR repository URI
# ClairAlbDnsName - Output variable from coreos-clair-stack
# EcsServiceName – Output variable from nginx-website-stack

aws cloudformation create-stack \
--stack-name nginx-website-codepipeline-stack \
--template-body file://clair-codepipeline-template.yaml \
--capabilities CAPABILITY_IAM \
--disable-rollback \
--parameters \
ParameterKey="EcrRepositoryArn",ParameterValue="<WebsiteECRRepositoryARN>" \
ParameterKey="EcrRepositoryUri",ParameterValue="<WebsiteECRRepositoryURI>" \
ParameterKey="ClairAlbDnsName",ParameterValue="<ClairAlbDnsName>" \
ParameterKey="EcsServiceName",ParameterValue="<WebsiteECSServiceName>"

# Verify that stack creation is complete
aws cloudformation wait stack-create-complete \
--stack-name nginx-website-codepipeline-stack

The pipeline is triggered after the AWS CloudFormation stack creation is complete. You can log in to the AWS Management Console to monitor the status of the pipeline. The vulnerability scan information is available in CloudWatch Logs.
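The Klar output is written to the CodeBuild project's log group, which by default is named /aws/codebuild/<project-name>. The commands below are a sketch; the project and stream names are placeholders that depend on what the pipeline stack created.

# Find the most recent log stream for the build project
aws logs describe-log-streams \
--log-group-name /aws/codebuild/<codebuild-project-name> \
--order-by LastEventTime --descending --max-items 1

# Read the scan output from that stream
aws logs get-log-events \
--log-group-name /aws/codebuild/<codebuild-project-name> \
--log-stream-name <log-stream-name>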

You can also modify the CLAIR_OUTPUT value from Critical to High in the buildspec.yml file in the /cores-clair-ecs-cicd/nginx-website-repo folder and then check the status of the build.

Summary

I’ve described how to deploy Clair on AWS and set up a release pipeline for the automated vulnerability scanning of container images. The Clair instance can be used as a centralized Docker image vulnerability scanner and used by other CodeBuild projects. To meet your organization’s security requirements, define your vulnerability threshold in Klar by setting the variables, as defined in https://github.com/optiopay/klar.

How to Run Headless Front-End Tests with AWS Cloud9 and AWS CodeBuild

Post Syndicated from Eric Z. Beard original https://aws.amazon.com/blogs/devops/how-to-run-headless-front-end-tests-with-aws-cloud9-and-aws-codebuild/

Automated testing is a critical component of a well-designed software development lifecycle. When you test front-end applications, you often use a browser in combination with testing frameworks. A headless browser is one that runs without a graphical interface, which makes it suitable for servers that do not normally run visual applications. In this blog post, I will show you how to configure AWS Cloud9 and AWS CodeBuild to support testing an Angular application with the headless version of Chrome. AWS Cloud9 has deep integration with services such as AWS Lambda, and the environment is easily accessible anywhere, from any internet-connected device.

AWS Cloud9

By default, Cloud9 runs on an Amazon EC2 instance that is managed for you. You can also run it on any Linux machine that is accessible through SSH.

First, create a Cloud9 environment.

  1. Sign in to the AWS Management Console, scroll down to Developer Tools, and choose Cloud9.
  2. On the following page, choose Create Environment.
  3. Enter a name for your environment and then choose Next Step.
  4. On the following page, leave the defaults for the time being and click Next Step.
  5. On the following page, choose Create Environment.

It might take a few minutes for your environment to initialize. Behind the scenes, an EC2 instance is created for you in the region you have currently selected in the console. In the environment, press Alt-T to bring up a bash terminal tab. For the remaining steps in this post, you will enter commands into this tab.

There is a lot to take in if this is your first time using Cloud9. If you need help getting set up or want to learn more, see the Cloud9 User Guide.

Install and configure Angular

The first thing we will do in our new environment is to install and configure an Angular application.

  1. Upgrade Node to the latest version supported by AWS Lambda. (At the time of this writing, that’s 8.10.)
    nvm install 8.10
  2. Install the Angular CLI using npm, the Node Package Manager. Install it as a global package with the -g option so that it is available to run from anywhere in your environment.
    npm install -g @angular/cli
  3. Use the Angular CLI to create an Angular application.
    ng new my-app
    cd my-app/
  4. Run the application to make sure everything is working as expected. To preview a running application in Cloud9, the app must run on a specific port. With Angular, you must disable the default host header check.
    ng serve --port 8080 --host localhost --disable-host-check

     

    On the toolbar, next to Run, choose Preview and then choose Preview Running Application. You should see something like this:

  5. Press Ctrl-C to stop serving and then, in the my-app directory, try to test your application.
    ng test --watch=false

    That obviously doesn’t work the way you would expect it to on a regular workstation. The testing framework can’t find Chrome because we are running on a headless EC2 instance. To start addressing the problem, first install a package called Puppeteer as a development dependency in your application.

    I’d like to give credit here to Alex Bainter, a software developer who wrote a comprehensive blog post about replacing PhantomJS with headless Chromium and Karma. His post was extremely helpful to me when I had to figure this out for the first time.

  6. Install Puppeteer and its dependencies.
    npm i -D puppeteer
    npm i -D @angular-devkit/build-angular
  7. You can get a good look at the missing Chrome libraries by running the ldd command on the binary that comes with Puppeteer.
    cd node_modules/puppeteer/.local-chromium/linux-564778/chrome-linux/

    (By the time you read this post, the version number in that path will probably be different. Look in the puppeteer/.local-chromium directory to see what it is for your installation.)

    ldd chrome | grep not

    You should see output that looks like this:

    libXcursor.so.1 => not found
    libXdamage.so.1 => not found
    libXfixes.so.3 => not found
    libcups.so.2 => not found
    libXss.so.1 => not found
    libXrandr.so.2 => not found
    libpangocairo-1.0.so.0 => not found
    libpango-1.0.so.0 => not found
    libcairo.so.2 => not found
    libatk-1.0.so.0 => not found
    libatk-bridge-2.0.so.0 => not found
    libgtk-3.so.0 => not found
    libgdk-3.so.0 => not found
    libgdk_pixbuf-2.0.so.0 => not found

 

Install headless Chrome

Now comes the tricky part. Installing headless Chrome on an Amazon Linux EC2 instance is no simple task. One strategy is to install the various dependencies by compiling from source, but the chain of dependencies for Chrome, which includes gtk+ and glib, soon gets out of hand. I found another blogger who solved the problem by borrowing from the CentOS and Fedora package repositories. Thanks to Yuanyi for this part of the solution.

  1. Install yum packages to cover basic dependencies.
    sudo yum install -y libXcursor libXdamage libcups libXss libXrandr \
        cups-libs dbus-glib libXinerama cairo cairo-gobject pango
  2. Borrow packages from CentOS and Fedora.
    sudo rpm -ivh --nodeps http://mirror.centos.org/centos/7/os/x86_64/Packages/atk-2.22.0-3.el7.x86_64.rpm
    sudo rpm -ivh --nodeps http://mirror.centos.org/centos/7/os/x86_64/Packages/at-spi2-atk-2.22.0-2.el7.x86_64.rpm
    sudo rpm -ivh --nodeps http://mirror.centos.org/centos/7/os/x86_64/Packages/at-spi2-core-2.22.0-1.el7.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/g/GConf2-3.2.6-7.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libXScrnSaver-1.2.2-6.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libxkbcommon-0.3.1-1.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libwayland-client-1.2.0-3.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libwayland-cursor-1.2.0-3.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/g/gtk3-3.10.4-1.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/16/Fedora/x86_64/os/Packages/gdk-pixbuf2-2.24.0-1.fc16.x86_64.rpm
  3. Edit src/karma.conf.js to require Puppeteer and set the CHROME_BIN environment variable. Here is the full content of that file after the changes.
    const puppeteer = require("puppeteer");
    process.env.CHROME_BIN = puppeteer.executablePath();

    module.exports = function (config) {
        config.set({
            basePath: '',
            frameworks: ['jasmine', '@angular-devkit/build-angular'],
            plugins: [
                require('karma-jasmine'),
                require('karma-chrome-launcher'),
                require('karma-jasmine-html-reporter'),
                require('karma-coverage-istanbul-reporter'),
                require('@angular-devkit/build-angular/plugins/karma')
            ],
            client: {
                clearContext: false // leave Jasmine Spec Runner output visible in browser
            },
            coverageIstanbulReporter: {
                reports: ['html', 'lcovonly'],
                fixWebpackSourcePaths: true
            },
            angularCli: {
                environment: 'dev'
            },
            reporters: ['progress', 'kjhtml'],
            port: 8080,
            colors: true,
            logLevel: config.LOG_INFO,
            autoWatch: true,
            browsers: ['ChromeHeadlessNoSandbox'],
            customLaunchers: {
                ChromeHeadlessNoSandbox: {
                    base: 'ChromeHeadless',
                    flags: ['--no-sandbox']
                }
            },
            singleRun: false
        });
    };
  4. Make a small adjustment to your test specification in src/app/app.component.spec.ts so that it is checking for the title in the test called "should render title in a h1 tag". Run ng test again.
    ng test --watch=false

If you see that green SUCCESS indicator, then you have done it! You installed Angular and created an application, installed Puppeteer, and by filling in the missing libraries for Chrome, you made it possible to run headless Chrome tests in Cloud9!

AWS CodeBuild

The next piece of the puzzle is your CI/CD pipeline. When a developer checks in new code, you want to test that code with a continuous integration tool like AWS CodeBuild. With CodeBuild, the problem related to headless Chrome is slightly different than it was with Cloud9, because the default build environment for Node apps is an Ubuntu image. You still need to install Chromium and its dependencies, but Ubuntu packages make it easier.

  1. Navigate to the CodeBuild console and create a new build project. Give it a name and configure the source repository. You will need to store your code for this exercise with one of the providers listed later so that CodeBuild knows where to find it when you start a build. Since you are already logged in to the AWS console, AWS CodeCommit is a good option, but you could also choose Amazon S3, Bitbucket, or GitHub.
  2. Configure the build environment. For Operating system, choose Ubuntu. For Runtime, choose Node.js. You can specify your own container image for the build, but the buildspec.yml described in step 3 works out of the box with the default image.
  3. For the build specification, provide the following buildspec.yml file in the root directory of the source code repository.
    
    version: 0.1
    phases:
      install:
        commands:
    
          # Install the Angular CLI
          - npm install -g @angular/cli
    
          # Install puppeteer as a dev dependency
          - npm i -D puppeteer
          - npm i -D @angular-devkit/build-angular
    
          # Print out missing libs
          - echo "Missing Libs" && ldd ./node_modules/puppeteer/.local-chromium/linux-549031/chrome-linux/chrome | grep not || true
    
          # Upgrade apt
          - apt-get upgrade
    
          # Update libs
          - apt-get update
    
          # Install apt-transport-https
          - apt-get install -y apt-transport-https
    
          # Use apt to install the Chrome dependencies
          - apt-get install -y libxcursor1
          - apt-get install -y libgtk-3-dev
          - apt-get install -y libxss1
          - apt-get install -y libasound2
          - apt-get install -y libnspr4
          - apt-get install -y libnss3
          - apt-get install -y libx11-xcb1
    
          # Print out missing libs
          - echo "Missing Libs" && ldd ./node_modules/puppeteer/.local-chromium/linux-549031/chrome-linux/chrome | grep not || true
    
          # Install project dependencies
          - npm install
    
      pre_build:
        commands:
          - echo "Nothing to pre_build"
    
      build:
        commands:
    
          - printenv 
    
          # Build the project
          - ng build
    
          # Run headless Chrome tests
          - ng test --watch=false
          - printenv
    
      post_build:
        commands:
    
          - printenv
    
          # Deploy the project to S3
    
          - if [ "${CODEBUILD_BUILD_SUCCEEDING}" = "1" ]; then aws s3 sync --delete dist/ "s3://${BUCKET_NAME}"; else echo "Skipping aws sync"; fi
    
    artifacts:
      files:
        - src/*
    
    

    Feel free to remove those ldd and printenv statements, but it is worth taking a look at the output to get a better understanding of what is going on with the build.

  4. Specify the location for artifacts. The following step isn’t required, but it makes it possible to incorporate the build project into AWS CodePipeline.
  5. Expand Advanced Settings and configure an environment variable for the website bucket name.
  6. Configure the buckets. CodeBuild can't write to the S3 buckets unless you give the service explicit permissions to do so. This is one of the most common causes of build failures for projects that involve S3. Choose Continue and Save to create the build project, then navigate to the IAM console, find the CodeBuild service role that was just created for you, and attach the following policy as an inline policy to give the role access to those buckets.
    
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": [
                    "arn:aws:s3:::YOUR_BUCKET_FOR_ARTIFACTS",
                    "arn:aws:s3:::YOUR_BUCKET_FOR_ARTIFACTS/*"
                ]
            },
            {
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": [
                    "arn:aws:s3:::YOUR_BUCKET_FOR_THE_WEBSITE",
                    "arn:aws:s3:::YOUR_BUCKET_FOR_THE_WEBSITE/*"
                ]
            }
        ]
    }
    
    
  7. You should now be able to start the build and see that the compiled website has been copied to your S3 bucket after the build is complete.

 

Alternative Cloud9 installation using SSH and Ubuntu

You can run the Cloud9 IDE from a Linux machine that you create, rather than letting Cloud9 provision it for you. Create a Cloud9 environment and choose Connect and run in remote server. For more information about this type of setup, see Creating an SSH Environment in the AWS Cloud9 User Guide.

After you have configured the environment, the work you have to do is much simpler than on the Amazon Linux instance, because there are Ubuntu packages that install the required dependencies. Follow the instructions earlier in this post until you get to the “Install headless Chrome” section. Issue this command:

sudo apt install -y libxcursor1 libgtk-3-dev libxss1 libasound2 libnspr4 libnss3

You don’t need to borrow from any of the CentOS or Fedora repositories.

Make changes to karma.conf.js as described earlier and you should then be ready to test your application.

 

Summary

You are now able to run headless integration tests using Cloud9 by installing Puppeteer and filling in the required Chrome dependencies. You can also extend this to the container image used to test your application with CodeBuild. Automated testing is vital to a trustworthy DevOps pipeline, and Cloud9 opens up new possibilities for developers of all types, including front-end developers.

Happy coding! –EZB

Machine Learning with AWS Fargate and AWS CodePipeline at Corteva Agriscience

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/machine-learning-with-aws-fargate-and-aws-codepipeline-at-corteva-agriscience/

This post contributed by Duke Takle and Kevin Hayes at Corteva Agriscience

At Corteva Agriscience, the agricultural division of DowDuPont, our purpose is to enrich the lives of those who produce and those who consume, ensuring progress for generations to come. As a global business, we support a network of research stations to improve agricultural productivity around the world.

As analytical technology advances, the volume of data, as well as the speed at which it must be processed, grows, and meeting the needs of our scientists poses unique challenges. Corteva Cloud Engineering teams are responsible for collaborating with and enabling software developers, data scientists, and others. Their work allows Corteva research and development to become the most efficient innovation machine in the agricultural industry.

Recently, our Systems and Innovations for Breeding and Seed Products organization approached the Cloud Engineering team with the challenge of how to deploy a novel machine learning (ML) algorithm for scoring genetic markers. The solution would require supporting labs across six continents in a process that is run daily. This algorithm replaces time-intensive manual scoring of genotypic assays with a robust, automated solution. When examining the solution space for this challenge, the main requirements for our solution were global deployability, application uptime, and scalability.

Before implementing this algorithm on AWS, ML autoscoring was done as a proof of concept using pre-production instances on premises, which required several technicians to continue to process assays by hand. After implementing it on AWS, we have enabled those technicians to be better used in other areas, such as technology development.

Solutions Considered

A RESTful web service seemed to be an obvious way to solve the problem presented. AWS has several patterns that could implement a RESTful web service, such as Amazon API Gateway, AWS Lambda, Amazon EC2, AWS Auto Scaling, Amazon Elastic Container Service (ECS) using the EC2 launch type, and AWS Fargate.

At the time the project came into our backlog, we had just heard of Fargate. Fargate does have a few limitations (scratch storage, CPU, and memory), none of which were a problem. So EC2, Auto Scaling, and ECS with the EC2 launch type were ruled out because they would have introduced unneeded complexity. The unneeded complexity is mostly around management of EC2 instances to either run the application or the container needed for the solution.

When the project came into our group, there had been a substantial proof-of-concept done with a Docker container. While we are strong API Gateway and Lambda proponents, there is no need to duplicate processes or services that AWS provides. We also knew that we needed to be able to move fast. We wanted to put the power in the hands of our developers to focus on building out the solution. Additionally, we needed something that could scale across our organization and provide some rationalization in how we approach these problems. AWS services, such as Fargate, AWS CodePipeline, and AWS CloudFormation, made that possible.

Solution Overview

Our group prefers using existing AWS services to bring a complete project to the production environment.

CI/CD Pipeline

A complete discussion of the CI/CD pipeline for the project is beyond the scope of this post. However, in broad strokes, the pipeline is:

  1. Compile some C++ code wrapped in Python, create a Python wheel, and publish it to an artifact store.
  2. Create a Docker image with that wheel installed and publish it to ECR.
  3. Deploy and test the new image to our test environment.
  4. Deploy the new image to the production environment.

Solution

As mentioned earlier, the application is a Docker container deployed with the Fargate launch type. It uses an Aurora PostgreSQL DB instance for the backend data. The application itself is only needed internally so the Application Load Balancer is created with the scheme set to “internal” and deployed into our private application subnets.

Our environments are all constructed with CloudFormation templates. Each environment is constructed in a separate AWS account and connected back to a central utility account. The infrastructure stacks export a number of useful bits like the VPC, subnets, IAM roles, security groups, etc. This scheme allows us to move projects through the several accounts without changing the CloudFormation templates, just the parameters that are fed into them.

For this solution, we use an existing VPC, set of subnets, IAM role, and ACM certificate in the us-east-1 Region. The solution CloudFormation stack describes and manages the following resources:

AWS::ECS::Cluster*
AWS::EC2::SecurityGroup
AWS::EC2::SecurityGroupIngress
AWS::Logs::LogGroup
AWS::ECS::TaskDefinition*
AWS::ElasticLoadBalancingV2::LoadBalancer
AWS::ElasticLoadBalancingV2::TargetGroup
AWS::ElasticLoadBalancingV2::Listener
AWS::ECS::Service*
AWS::ApplicationAutoScaling::ScalableTarget
AWS::ApplicationAutoScaling::ScalingPolicy
AWS::ElasticLoadBalancingV2::ListenerRule

A complete discussion of all the resources for the solution is beyond the scope of this post. However, we can explore the resource definitions of the components specific to Fargate. The following three simple segments of CloudFormation are all that is needed to create a Fargate stack: an ECS cluster, task definition, and service. More complete examples of the CloudFormation templates are linked at the end of this post, with stack creation instructions.

AWS::ECS::Cluster:

"ECSCluster": {
    "Type":"AWS::ECS::Cluster",
    "Properties" : {
        "ClusterName" : { "Ref": "clusterName" }
    }
}

The ECS Cluster resource is a simple grouping for the other ECS resources to be created. The cluster created in this stack holds the tasks and service that implement the actual solution. Finally, in the AWS Management Console, the cluster is the entry point to find info about your ECS resources.

AWS::ECS::TaskDefinition

"fargateDemoTaskDefinition": {
    "Type": "AWS::ECS::TaskDefinition",
    "Properties": {
        "ContainerDefinitions": [
            {
                "Essential": "true",
                "Image": { "Ref": "taskImage" },
                "LogConfiguration": {
                    "LogDriver": "awslogs",
                    "Options": {
                        "awslogs-group": {
                            "Ref": "cloudwatchLogsGroup"
                        },
                        "awslogs-region": {
                            "Ref": "AWS::Region"
                        },
                        "awslogs-stream-prefix": "fargate-demo-app"
                    }
                },
                "Name": "fargate-demo-app",
                "PortMappings": [
                    {
                        "ContainerPort": 80
                    }
                ]
            }
        ],
        "ExecutionRoleArn": {"Fn::ImportValue": "fargateDemoRoleArnV1"},
        "Family": {
            "Fn::Join": [
                "",
                [ { "Ref": "AWS::StackName" }, "-fargate-demo-app" ]
            ]
        },
        "NetworkMode": "awsvpc",
        "RequiresCompatibilities" : [ "FARGATE" ],
        "TaskRoleArn": {"Fn::ImportValue": "fargateDemoRoleArnV1"},
        "Cpu": { "Ref": "cpuAllocation" },
        "Memory": { "Ref": "memoryAllocation" }
    }
}

The ECS Task Definition is where we specify and configure the container. Interesting things to note are the CPU and memory configuration items. It is important to note the valid combinations for CPU/memory settings, as shown in the following table.

CPU         Memory
0.25 vCPU   0.5 GB, 1 GB, and 2 GB
0.5 vCPU    Min. 1 GB and Max. 4 GB, in 1-GB increments
1 vCPU      Min. 2 GB and Max. 8 GB, in 1-GB increments
2 vCPU      Min. 4 GB and Max. 16 GB, in 1-GB increments
4 vCPU      Min. 8 GB and Max. 30 GB, in 1-GB increments

AWS::ECS::Service

"fargateDemoService": {
     "Type": "AWS::ECS::Service",
     "DependsOn": [
         "fargateDemoALBListener"
     ],
     "Properties": {
         "Cluster": { "Ref": "ECSCluster" },
         "DesiredCount": { "Ref": "minimumCount" },
         "LaunchType": "FARGATE",
         "LoadBalancers": [
             {
                 "ContainerName": "fargate-demo-app",
                 "ContainerPort": "80",
                 "TargetGroupArn": { "Ref": "fargateDemoTargetGroup" }
             }
         ],
         "NetworkConfiguration":{
             "AwsvpcConfiguration":{
                 "SecurityGroups": [
                     { "Ref":"fargateDemoSecuityGroup" }
                 ],
                 "Subnets":[
                    {"Fn::ImportValue": "privateSubnetOneV1"},
                    {"Fn::ImportValue": "privateSubnetTwoV1"},
                    {"Fn::ImportValue": "privateSubnetThreeV1"}
                 ]
             }
         },
         "TaskDefinition": { "Ref":"fargateDemoTaskDefinition" }
     }
}

The ECS Service resource is how we can configure where and how many instances of tasks are executed to solve our problem. In this case, we see that there are at least minimumCount instances of the task running in any of three private subnets in our VPC.
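For reference, creating a stack from a template built around these snippets might look like the following AWS CLI call. The parameter keys match the Ref names used above; the template file name and values are placeholders, and a full template may require additional parameters.

aws cloudformation create-stack \
--stack-name fargate-demo-stack \
--template-body file://fargate-demo-template.json \
--parameters \
ParameterKey="clusterName",ParameterValue="fargate-demo-cluster" \
ParameterKey="taskImage",ParameterValue="<aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/<repo>:<tag>" \
ParameterKey="cpuAllocation",ParameterValue="256" \
ParameterKey="memoryAllocation",ParameterValue="512" \
ParameterKey="minimumCount",ParameterValue="1"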

Conclusion

Deploying this algorithm on AWS using containers and Fargate allowed us to start running the application at scale with low support overhead. This has resulted in faster turnaround time with fewer staff and a concomitant reduction in cost.

“We are very excited with the deployment of Polaris, the autoscoring of the marker lab genotyping data using AWS technologies. This key technology deployment has enhanced performance, scalability, and efficiency of our global labs to deliver over 1.4 Billion data points annually to our key customers in Plant Breeding and Integrated Operations.”

Sandra Milach, Director of Systems and Innovations for Breeding and Seed Products.

We are distributing this solution to all our worldwide laboratories to harmonize data quality and speed. We hope this enables an increase in the velocity of genetic gain to increase yields of crops for farmers around the world.

You can learn more about the work we do at Corteva at www.corteva.com.

Try it yourself:

The snippets above are instructive but not complete. We have published two repositories on GitHub that you can explore to see how we built this solution:

Note: the components in these repos do not include our production code, but they show you how this works using Amazon ECS and AWS Fargate.

Spring 2018 AWS SOC Reports are Now Available with 11 Services Added in Scope

Post Syndicated from Chris Gile original https://aws.amazon.com/blogs/security/spring-2018-aws-soc-reports-are-now-available-with-11-services-added-in-scope/

Since our last System and Organization Control (SOC) audit, our service and compliance teams have been working to increase the number of AWS Services in scope prioritized based on customer requests. Today, we’re happy to report 11 services are newly SOC compliant, which is a 21 percent increase in the last six months.

With the addition of the following 11 new services, you can now select from a total of 62 SOC-compliant services. To see the full list, go to our Services in Scope by Compliance Program page:

• Amazon Athena
• Amazon QuickSight
• Amazon WorkDocs
• AWS Batch
• AWS CodeBuild
• AWS Config
• AWS OpsWorks Stacks
• AWS Snowball
• AWS Snowball Edge
• AWS Snowmobile
• AWS X-Ray

Our latest SOC 1, 2, and 3 reports covering the period from October 1, 2017 to March 31, 2018 are now available. The SOC 1 and 2 reports are available on-demand through AWS Artifact by logging into the AWS Management Console. The SOC 3 report can be downloaded here.

Finally, prospective customers can read our SOC 1 and 2 reports by reaching out to AWS Compliance.

Want more AWS Security news? Follow us on Twitter.

Announcing Local Build Support for AWS CodeBuild

Post Syndicated from Karthik Thirugnanasambandam original https://aws.amazon.com/blogs/devops/announcing-local-build-support-for-aws-codebuild/

Today, we’re excited to announce local build support in AWS CodeBuild.

AWS CodeBuild is a fully managed build service. There are no servers to provision and scale, or software to install, configure, and operate. You just specify the location of your source code, choose your build settings, and CodeBuild runs build scripts for compiling, testing, and packaging your code.

In this blog post, I’ll show you how to set up CodeBuild locally to build and test a sample Java application.

By building an application on a local machine you can:

  • Test the integrity and contents of a buildspec file locally.
  • Test and build an application locally before committing.
  • Identify and fix errors quickly from your local development environment.

Prerequisites

In this post, I am using AWS Cloud9 IDE as my development environment.

If you would like to use AWS Cloud9 as your IDE, follow the express setup steps in the AWS Cloud9 User Guide.

The AWS Cloud9 IDE comes with Docker and Git already installed. If you are going to use your laptop or desktop machine as your development environment, install Docker and Git before you start.

Steps to build CodeBuild image locally

Run git clone https://github.com/aws/aws-codebuild-docker-images.git to download this repository to your local machine.

$ git clone https://github.com/aws/aws-codebuild-docker-images.git

Let's build a local CodeBuild image for a JDK 8 environment. The Dockerfile for JDK 8 is present in /aws-codebuild-docker-images/ubuntu/java/openjdk-8.

Edit the Dockerfile to remove the last line ENTRYPOINT ["dockerd-entrypoint.sh"] and save the file.

Run cd ubuntu/java/openjdk-8 to change the directory in your local workspace.

Run docker build -t aws/codebuild/java:openjdk-8 . to build the Docker image locally. This command will take a few minutes to complete.

$ cd aws-codebuild-docker-images
$ cd ubuntu/java/openjdk-8
$ docker build -t aws/codebuild/java:openjdk-8 .

Steps to set up the CodeBuild local agent

Run the following Docker pull command to download the local CodeBuild agent.

$ docker pull amazon/aws-codebuild-local:latest --disable-content-trust=false

Now you have the local agent image on your machine and can run a local build.

Run the following git command to download a sample Java project.

$ git clone https://github.com/karthiksambandam/sample-web-app.git

Steps to use the local agent to build a sample project

Let’s build the sample Java project using the local agent.

Execute the following Docker command to run the local agent and build the sample web app repository you cloned earlier.

$ docker run -it -v /var/run/docker.sock:/var/run/docker.sock -e "IMAGE_NAME=aws/codebuild/java:openjdk-8" -e "ARTIFACTS=/home/ec2-user/environment/artifacts" -e "SOURCE=/home/ec2-user/environment/sample-web-app" amazon/aws-codebuild-local

Note: We need to provide three environment variables, namely IMAGE_NAME, SOURCE, and ARTIFACTS.

IMAGE_NAME: The name of your build environment image.

SOURCE: The absolute path to your source code directory.

ARTIFACTS: The absolute path to your artifact output folder.

When you run the sample project, you get a runtime error that says the YAML file does not exist. This is because a buildspec.yml file is not included in the sample web project. AWS CodeBuild requires a buildspec.yml to run a build. For more information about buildspec.yml, see Build Spec Example in the AWS CodeBuild User Guide.

Let’s add a buildspec.yml file with the following content to the sample-web-app folder and then rebuild the project.

version: 0.2

phases:
  build:
    commands:
      - echo Build started on `date`
      - mvn install

artifacts:
  files:
    - target/javawebdemo.war

$ docker run -it -v /var/run/docker.sock:/var/run/docker.sock -e "IMAGE_NAME=aws/codebuild/java:openjdk-8" -e "ARTIFACTS=/home/ec2-user/environment/artifacts" -e "SOURCE=/home/ec2-user/environment/sample-web-app" amazon/aws-codebuild-local

This time your build should be successful. Upon successful execution, look in the /artifacts folder for the final built artifacts.zip file to validate.
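If you want to confirm what the local agent produced, you can inspect the output from the terminal. This is just a quick check; the exact nesting under the artifacts folder can vary.

# List the artifact output folder created by the local agent
ls -lR /home/ec2-user/environment/artifacts

# Peek inside the generated archive (path is an example)
unzip -l /home/ec2-user/environment/artifacts/artifacts.zip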

Conclusion

In this blog post, I showed you how to quickly set up the CodeBuild local agent to build projects right from your local desktop machine or laptop. As you see, local builds can improve developer productivity by helping you identify and fix errors quickly.

I hope you found this post useful. Feel free to leave your feedback or suggestions in the comments.

CI/CD with Data: Enabling Data Portability in a Software Delivery Pipeline with AWS Developer Tools, Kubernetes, and Portworx

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/cicd-with-data-enabling-data-portability-in-a-software-delivery-pipeline-with-aws-developer-tools-kubernetes-and-portworx/

This post is written by Eric Han – Vice President of Product Management Portworx and Asif Khan – Solutions Architect

Data is the soul of an application. As containers make it easier to package and deploy applications faster, testing plays an even more important role in the reliable delivery of software. Given that all applications have data, development teams want a way to reliably control, move, and test using real application data or, at times, obfuscated data.

For many teams, moving application data through a CI/CD pipeline, while honoring compliance and maintaining separation of concerns, has been a manual task that doesn't scale. At best, it is limited to a few applications, and is not portable across environments. The goal should be to make running and testing stateful containers (think databases and message buses, where operations are tracked) as easy as running and testing stateless ones (such as web front ends, where they often are not).

Why is state important in testing scenarios? One reason is that many bugs manifest only when code is tested against real data. For example, we might simply want to test a database schema upgrade but a small synthetic dataset does not exercise the critical, finer corner cases in complex business logic. If we want true end-to-end testing, we need to be able to easily manage our data or state.

In this blog post, we define a CI/CD pipeline reference architecture that can automate data movement between applications. We also provide the steps to follow to configure the CI/CD pipeline.

 

Stateful Pipelines: Need for Portable Volumes

As part of continuous integration, testing, and deployment, a team may need to reproduce a bug found in production against a staging setup. Here, the hosting environment is composed of a cluster with Kubernetes as the scheduler and Portworx for persistent volumes. The testing workflow is then automated by AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild.

Portworx offers Kubernetes storage that can be used to make persistent volumes portable between AWS environments and pipelines. The addition of Portworx to the AWS Developer Tools continuous deployment for Kubernetes reference architecture adds persistent storage and storage orchestration to a Kubernetes cluster. The example uses MongoDB as the demonstration of a stateful application. In practice, the workflow applies to any containerized application such as Cassandra, MySQL, Kafka, and Elasticsearch.

Using the reference architecture, a developer calls CodePipeline to trigger a snapshot of the running production MongoDB database. Portworx then creates a block-based, writable snapshot of the MongoDB volume. Meanwhile, the production MongoDB database continues serving end users and is uninterrupted.

Without the Portworx integrations, a manual process would require an application-level backup of the database instance that is outside of the CI/CD process. For larger databases, this could take hours and impact production. The use of block-based snapshots follows best practices for resilient and non-disruptive backups.

As part of the workflow, CodePipeline deploys a new MongoDB instance for staging onto the Kubernetes cluster and mounts the second Portworx volume that has the data from production. CodePipeline triggers the snapshot of a Portworx volume through an AWS Lambda function, as shown here.

 

 

 

AWS Developer Tools with Kubernetes: Integrated Workflow with Portworx

In the following workflow, a developer is testing changes to a containerized application that calls on MongoDB. The tests are performed against a staging instance of MongoDB. The same workflow applies if changes were on the server side. The original production deployment is scheduled as a Kubernetes deployment object and uses Portworx as the storage for the persistent volume.

The continuous deployment pipeline runs as follows:

  • Developers integrate bug fix changes into a main development branch that gets merged into a CodeCommit master branch.
  • Amazon CloudWatch triggers the pipeline when code is merged into a master branch of an AWS CodeCommit repository.
  • AWS CodePipeline sends the new revision to AWS CodeBuild, which builds a Docker container image with the build ID.
  • AWS CodeBuild pushes the new Docker container image tagged with the build ID to an Amazon ECR registry.
  • Kubernetes downloads the new container (for the database client) from Amazon ECR and deploys the application (as a pod) and staging MongoDB instance (as a deployment object).
  • AWS CodePipeline, through a Lambda function, calls Portworx to snapshot the production MongoDB and deploy a staging instance of MongoDB.
  • Portworx provides a snapshot of the production instance as the persistent storage for the staging MongoDB.
    • The MongoDB instance mounts the snapshot.

At this point, the staging setup mimics a production environment. Teams can run integration and full end-to-end tests, using partner tooling, without impacting production workloads. The full pipeline is shown here.

 

Summary

This reference architecture showcases how development teams can easily move data between production and staging for the purposes of testing. Instead of taking application-specific manual steps, all operations in this CodePipeline architecture are automated and tracked as part of the CI/CD process.

This integrated experience is part of making stateful containers as easy as stateless. With AWS CodePipeline for CI/CD process, developers can easily deploy stateful containers onto a Kubernetes cluster with Portworx storage and automate data movement within their process.

The reference architecture and code are available on GitHub:

● Reference architecture: https://github.com/portworx/aws-kube-codesuite
● Lambda function source code for Portworx additions: https://github.com/portworx/aws-kube-codesuite/blob/master/src/kube-lambda.py

For more information about persistent storage for containers, visit the Portworx website. For more information about Code Pipeline, see the AWS CodePipeline User Guide.

Secure Build with AWS CodeBuild and LayeredInsight

Post Syndicated from Asif Khan original https://aws.amazon.com/blogs/devops/secure-build-with-aws-codebuild-and-layeredinsight/

This post is written by Asif Awan, Chief Technology Officer of Layered Insight; Subin Mathew, Software Development Manager for AWS CodeBuild; and Asif Khan, Solutions Architect.

Enterprises adopt containers because they recognize the benefits: speed, agility, portability, and high compute density. They understand how accelerating application delivery and deployment pipelines makes it possible to rapidly slipstream new features to customers. Although the benefits are indisputable, this acceleration raises concerns about security and corporate compliance with software governance. In this blog post, I provide a solution that shows how Layered Insight, the pioneer and global leader in container-native application protection, can be used with seamless application build and delivery pipelines like those available in AWS CodeBuild to address these concerns.

Layered Insight solutions

Layered Insight enables organizations to unify DevOps and SecOps by providing complete visibility and control of containerized applications. Using the industry’s first embedded security approach, Layered Insight solves the challenges of container performance and protection by providing accurate insight into container images, adaptive analysis of running containers, and automated enforcement of container behavior.

 

AWS CodeBuild

AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers. CodeBuild scales continuously and processes multiple builds concurrently, so your builds are not left waiting in a queue. You can get started quickly by using prepackaged build environments, or you can create custom build environments that use your own build tools.

 

Problem Definition

Security and compliance concerns span the lifecycle of application containers. Common concerns include:

Visibility into the container images. You need to verify the software composition information of the container image to determine whether known vulnerabilities associated with any of the software packages and libraries are included in the container image.

Governance of container images is critical because only certain open source packages/libraries, of specific versions, should be included in the container images. You need mechanisms for blacklisting all container images that include a certain version of a software package/library, or for allowing only open source software that comes with a specific type of license (such as Apache, MIT, GPL, and so on). You need to be able to address challenges such as:

·       Defining the process for image compliance policies at the enterprise, department, and group levels.

·       Preventing the images that fail the compliance checks from being deployed in critical environments, such as staging, pre-prod, and production.

Visibility into running container instances is critical, including:

·       CPU and memory utilization.

·       Security of the build environment.

·       All activities (system, network, storage, and application layer) of the application code running in each container instance.

Protection of running container instances that is:

·       Zero-touch to the developers (not an SDK-based approach).

·       Zero touch to the DevOps team and doesn’t limit the portability of the containerized application.

·       This protection must retain the option to switch to a different container stack or orchestration layer, or even to a different Container as a Service (CaaS).

·       And it must be a fully automated solution to SecOps, so that the SecOps team doesn’t have to manually analyze and define detailed blacklist and whitelist policies.

 

Solution Details

In AWS CodeCommit, we have three projects:
●     “Democode” is a simple Java application, with one buildspec to build the app into a Docker container (run by the build-demo-image CodeBuild project), and another to instrument that container (the instrument-image CodeBuild project). The resulting container is stored in the ECR repo javatest as javatest:20180415-layered. This instrumented container is running in the AWS Fargate cluster demo-java-app and can be seen in the Layered Insight runtime console as the javatest application in us-east-1.
●     aws-codebuild-docker-images is a clone of the official aws-codebuild-docker-images repo on GitHub. This CodeCommit project is used by the build-python-builder CodeBuild project to build the Python 3.3.6 CodeBuild image, which is stored in the codebuild-python ECR repo. We then manually instructed the Layered Insight console to instrument the image.
●     scan-java-image contains just a buildspec.yml file. This file is used by the scan-java-image CodeBuild project to instruct Layered Assessment to perform a vulnerability scan of the javatest container image built previously, and then run the scan results through a compliance policy that states there should be no medium vulnerabilities. This build fails, but in this case that is a success: the scan completes successfully, but compliance fails because medium-level issues are found in the scan.

This build is performed using the instrumented version of the Python 3.3.6 CodeBuild image, so the activity of the processes running within the build are recorded each time within the LI console.

Build container image

Create or use a CodeCommit project with your application. To build this image and store it in Amazon Elastic Container Registry (Amazon ECR), add a buildspec file to the project, create a CodeBuild project, and build the container image.

Scan container image

Once the image is built, create a new buildspec, in the same project or a new one, that looks similar to the following (update the ECR URL as necessary):

version: 0.2
phases:
  pre_build:
    commands:
      - echo Pulling down LI Scan API client scripts
      - git clone https://github.com/LayeredInsight/scan-api-example-python.git
      - echo Setting up LI Scan API client
      - cd scan-api-example-python
      - pip install layint_scan_api
      - pip install -r requirements.txt
  build:
    commands:
      - echo Scanning container started on `date`
      - IMAGEID=$(./li_add_image --name <aws-region>.amazonaws.com/javatest:20180415)
      - ./li_wait_for_scan -v --imageid $IMAGEID
      - ./li_run_image_compliance -v --imageid $IMAGEID --policyid PB15260f1acb6b2aa5b597e9d22feffb538256a01fbb4e5a95

Add the buildspec file to the git repo, push it, and then create a CodeBuild project using the instrumented Python 3.3.6 CodeBuild image at <aws-region>.amazonaws.com/codebuild-python:3.3.6-layered. Set the following environment variables in the CodeBuild project:
●     LI_APPLICATIONNAME – name of the build to display
●     LI_LOCATION – location of the build project to display
●     LI_API_KEY – ApiKey:<key-name>:<api-key>
●     LI_API_HOST – location of the Layered Insight API service
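If you prefer the CLI to the console for this step, the variables can also be set by updating the build project. Treat this as a sketch: update-project replaces the entire environment block, so include your image and compute type, and all values shown are placeholders.

aws codebuild update-project \
--name scan-java-image \
--environment '{
  "type": "LINUX_CONTAINER",
  "computeType": "BUILD_GENERAL1_SMALL",
  "image": "<aws-region>.amazonaws.com/codebuild-python:3.3.6-layered",
  "environmentVariables": [
    {"name": "LI_APPLICATIONNAME", "value": "scan-java-image"},
    {"name": "LI_LOCATION", "value": "us-east-1"},
    {"name": "LI_API_KEY", "value": "ApiKey:<key-name>:<api-key>"},
    {"name": "LI_API_HOST", "value": "<li-api-host>"}
  ]
}'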

Instrument container image

Next, to instrument the new container image:

  1. In the Layered Insight runtime console, ensure that the ECR registry and credentials are defined (click the Setup icon and the ‘+’ sign on the top right of the screen to add a new container registry). Note the name given to the registry in the console, as this needs to be referenced in the li_add_image command in the script below.
  2. Next, add a new buildspec (with a new name) to the CodeCommit project, such as the one shown below. This code will download the Layered Insight runtime client, and use it to instruct the Layered Insight service to instrument the image that was just built:
    version: 0.2
    phases:
      pre_build:
        commands:
          - echo Pulling down LI API Runtime client scripts
          - git clone https://github.com/LayeredInsight/runtime-api-example-python
          - echo Setting up LI API client
          - cd runtime-api-example-python
          - pip install layint-runtime-api
          - pip install -r requirements.txt
      build:
        commands:
          - echo Instrumentation started on `date`
          - ./li_add_image --registry "Javatest ECR" --name IMAGE_NAME:TAG --description "IMAGE DESCRIPTION" --policy "Default Policy" --instrument --wait --verbose
  3. Commit and push the new buildspec file.
  4. Going back to CodeBuild, create a new project, with the same CodeCommit repo, but this time select the new buildspec file. Use a Python 3.3.6 builder – either the AWS or LI Instrumented version.
  5. Click Continue
  6. Click Save
  7. Run the build, again on the master branch.
  8. If everything runs successfully, a new image should appear in the ECR registry with a -layered suffix. This is the instrumented image.
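To confirm this from the command line instead of the console, you can list the image tags in the repository. The repository name here is an assumption based on the earlier steps.

# Look for the tag with the -layered suffix
aws ecr list-images --repository-name javatest --query 'imageIds[].imageTag'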

Run instrumented container image

When the instrumented container is run, whether in ECS, Fargate, or elsewhere, it logs data back to the Layered Insight runtime console. Its appearance in the console can be modified by setting the LI_APPLICATIONNAME and LI_LOCATION environment variables when running the container.

Conclusion

In this post, we provided the steps needed to embed governance and runtime security in your build pipelines running on AWS CodeBuild using Layered Insight.

 

 

 

Implement continuous integration and delivery of serverless AWS Glue ETL applications using AWS Developer Tools

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-serverless-aws-glue-etl-applications-using-aws-developer-tools/

AWS Glue is an increasingly popular way to develop serverless ETL (extract, transform, and load) applications for big data and data lake workloads. Organizations that transform their ETL applications to cloud-based, serverless ETL architectures need a seamless, end-to-end continuous integration and continuous delivery (CI/CD) pipeline: from source code, to build, to deployment, to product delivery. Having a good CI/CD pipeline can help your organization discover bugs before they reach production and deliver updates more frequently. It can also help developers write quality code and automate the ETL job release management process, mitigate risk, and more.

AWS Glue is a fully managed data catalog and ETL service. It simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, and job scheduling. AWS Glue crawls your data sources and constructs a data catalog using pre-built classifiers for popular data formats and data types, including CSV, Apache Parquet, JSON, and more.

When you are developing ETL applications using AWS Glue, you might come across some of the following CI/CD challenges:

  • Iterative development with unit tests
  • Continuous integration and build
  • Pushing the ETL pipeline to a test environment
  • Pushing the ETL pipeline to a production environment
  • Testing ETL applications using real data (live test)
  • Exploring and validating data

In this post, I walk you through a solution that implements a CI/CD pipeline for serverless AWS Glue ETL applications supported by AWS Developer Tools (including AWS CodePipeline, AWS CodeCommit, and AWS CodeBuild) and AWS CloudFormation.

Solution overview

The following diagram shows the pipeline workflow:

This solution uses AWS CodePipeline, which lets you orchestrate and automate the test and deploy stages for ETL application source code. The solution consists of a pipeline that contains the following stages:

1.) Source Control: In this stage, the AWS Glue ETL job source code and the AWS CloudFormation template file for deploying the ETL jobs are both committed to version control. I chose to use AWS CodeCommit for version control.

To get the ETL job source code and AWS CloudFormation template, download the gluedemoetl.zip file. This solution is developed based on a previous post, Build a Data Lake Foundation with AWS Glue and Amazon S3.

2.) LiveTest: In this stage, all resources—including AWS Glue crawlers, jobs, S3 buckets, roles, and other resources that are required for the solution—are provisioned, deployed, live tested, and cleaned up.

The LiveTest stage includes the following actions:

  • Deploy: In this action, all the resources that are required for this solution (crawlers, jobs, buckets, roles, and so on) are provisioned and deployed using an AWS CloudFormation template.
  • AutomatedLiveTest: In this action, all the AWS Glue crawlers and jobs are executed and data exploration and validation tests are performed. These validation tests include, but are not limited to, record counts in both raw tables and transformed tables in the data lake and any other business validations. I used AWS CodeBuild for this action.
  • LiveTestApproval: This action is included for the cases in which a pipeline administrator approval is required to deploy/promote the ETL applications to the next stage. The pipeline pauses in this action until an administrator manually approves the release.
  • LiveTestCleanup: In this action, all the LiveTest stage resources, including test crawlers, jobs, roles, and so on, are deleted using the AWS CloudFormation template. This action helps minimize cost by ensuring that the test resources exist only for the duration of the AutomatedLiveTest and LiveTestApproval actions.

3.) DeployToProduction: In this stage, all the resources are deployed using the AWS CloudFormation template to the production environment.

Try it out

This code pipeline takes approximately 20 minutes to complete the LiveTest test stage (up to the LiveTest approval stage, in which manual approval is required).

To get started with this solution, choose Launch Stack:

This creates the CI/CD pipeline with all of its stages, as described earlier. It performs an initial commit of the sample AWS Glue ETL job source code to trigger the first release change.

In the AWS CloudFormation console, choose Create. After the template finishes creating resources, you see the pipeline name on the stack Outputs tab.

After that, open the CodePipeline console and select the newly created pipeline. Initially, your pipeline’s CodeCommit stage shows that the source action failed.

Allow a few minutes for your new pipeline to detect the initial commit applied by the CloudFormation stack creation. As soon as the commit is detected, your pipeline starts. You will see the successful stage completion status as soon as the CodeCommit source stage runs.
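You can also poll the pipeline from the AWS CLI instead of the console. The pipeline name comes from the stack Outputs tab mentioned earlier, so the value below is a placeholder.

# Show the status of each stage in the pipeline
aws codepipeline get-pipeline-state --name <pipeline-name> \
--query 'stageStates[].{stage:stageName,status:latestExecution.status}'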

In the CodeCommit console, choose Code in the navigation pane to view the solution files.

Next, you can watch the pipeline move through the Deploy and AutomatedLiveTest actions of the LiveTest stage until it reaches the LiveTestApproval action.

At this point, if you check the AWS CloudFormation console, you can see that a new template has been deployed as part of the LiveTest deploy action.

At this point, make sure that the AWS Glue crawlers and the AWS Glue job ran successfully. Also check whether the corresponding databases and external tables have been created in the AWS Glue Data Catalog. Then validate the data using Amazon Athena, as shown following.

Open the AWS Glue console, and choose Databases in the navigation pane. You will see the following databases in the Data Catalog:
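
If you prefer to confirm this from code rather than the console, a quick boto3 sketch can list the Data Catalog databases and their tables (pagination is omitted for brevity):

import boto3

glue = boto3.client('glue')

# Print every database in the Data Catalog together with its tables.
for database in glue.get_databases()['DatabaseList']:
    print(database['Name'])
    for table in glue.get_tables(DatabaseName=database['Name'])['TableList']:
        print('  ' + table['Name'])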

Open the Amazon Athena console, and run the following queries. Verify that the record counts are matching.

SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data";
SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake";

The following shows the raw data:

The following shows the transformed data:
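
If you want the same record-count comparison to run as a script instead of in the Athena console, the sketch below shows one way to do it with boto3 (error handling is omitted). The S3 output location is a placeholder; Athena needs a bucket you own for query results.

import time
import boto3

athena = boto3.client('athena')

# Placeholder; Athena requires an S3 location where it can write query results.
OUTPUT_LOCATION = 's3://your-athena-results-bucket/gluedemo/'

def run_count(query):
    # Start the query, poll until it reaches a terminal state, and return the count.
    execution_id = athena.start_query_execution(
        QueryString=query,
        ResultConfiguration={'OutputLocation': OUTPUT_LOCATION}
    )['QueryExecutionId']
    while True:
        state = athena.get_query_execution(QueryExecutionId=execution_id)['QueryExecution']['Status']['State']
        if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
            break
        time.sleep(2)
    rows = athena.get_query_results(QueryExecutionId=execution_id)['ResultSet']['Rows']
    # Row 0 is the column header; row 1 holds the count value.
    return int(rows[1]['Data'][0]['VarCharValue'])

raw_count = run_count('SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data"')
lake_count = run_count('SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake"')
assert raw_count == lake_count, 'Record counts do not match'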

The pipeline pauses at this action until the release is approved. After validating the data, manually approve the revision on the LiveTestApproval action on the CodePipeline console.

Add comments as needed, and choose Approve.

The LiveTestApproval action now appears as Approved on the console.

After the revision is approved, the pipeline proceeds to use the AWS CloudFormation template to destroy the resources that were deployed in the LiveTest deploy action. This helps reduce cost and ensures a clean test environment on every deployment.

Production deployment is the final stage. In this stage, all the resources—AWS Glue crawlers, AWS Glue jobs, Amazon S3 buckets, roles, and so on—are provisioned and deployed to the production environment using the AWS CloudFormation template.

After successfully running the whole pipeline, feel free to experiment with it by changing the source code stored in AWS CodeCommit. For example, if you modify the AWS Glue ETL job to generate an error, the AutomatedLiveTest action should fail. Or if you change the AWS CloudFormation template so that stack creation fails, the LiveTest Deploy action should fail. The objective of the pipeline is to ensure that every change deployed to production works as expected.

Conclusion

In this post, you learned how to implement CI/CD for serverless AWS Glue ETL solutions at scale using AWS developer tools such as AWS CodePipeline and AWS CodeBuild. Implementing such solutions can help you accelerate ETL development and testing at your organization.

If you have questions or suggestions, please comment below.

Additional Reading

If you found this post useful, be sure to check out Implement Continuous Integration and Delivery of Apache Spark Applications using AWS and Build a Data Lake Foundation with AWS Glue and Amazon S3.

About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable big data, machine learning, artificial intelligence, and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to technologies such as advanced edge computing and machine learning at the edge. In his spare time, he enjoys spending time with his family.

 
Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improve the value of their solutions when using AWS.


Performing Unit Testing in an AWS CodeStar Project

Post Syndicated from Jerry Mathen Jacob original https://aws.amazon.com/blogs/devops/performing-unit-testing-in-an-aws-codestar-project/

In this blog post, I will show how you can perform unit testing as a part of your AWS CodeStar project. AWS CodeStar helps you quickly develop, build, and deploy applications on AWS. With AWS CodeStar, you can set up your continuous delivery (CD) toolchain and manage your software development from one place.

Because unit testing tests individual units of application code, it is helpful for quickly identifying and isolating issues. As a part of an automated CI/CD process, it can also be used to prevent bad code from being deployed into production.

Many of the AWS CodeStar project templates come preconfigured with a unit testing framework so that you can start deploying your code with more confidence. The unit testing is configured to run in the provided build stage so that, if the unit tests do not pass, the code is not deployed. For a list of AWS CodeStar project templates that include unit testing, see AWS CodeStar Project Templates in the AWS CodeStar User Guide.

The scenario

As a big fan of superhero movies, I decided to list my favorites and ask my friends to vote on theirs by using a web service endpoint I created. The example I use is a Python web service running on AWS Lambda, with AWS CodeCommit as the code repository. CodeCommit is a fully managed source control system that hosts Git repositories and works with all Git-based tools.

Here’s how you can create the web service endpoint:

Sign in to the AWS CodeStar console. Choose Start a project, which will take you to the list of project templates.

create project

For code edits, I chose AWS Cloud9, a cloud-based integrated development environment (IDE) that you use to write, run, and debug code.

choose cloud9

Here are the other tasks required by my scenario:

  • Create a database table where the votes can be stored and retrieved as needed.
  • Update the logic in the Lambda function that was created for posting and getting the votes.
  • Update the unit tests (of course!) to verify that the logic works as expected.

For a database table, I’ve chosen Amazon DynamoDB, which offers a fast and flexible NoSQL database.

Getting set up on AWS Cloud9

From the AWS CodeStar console, go to the AWS Cloud9 console, which should take you to your project code. Open a terminal at the top-level folder, where you will set up your environment and required libraries.

Use the following command to set the PYTHONPATH environment variable on the terminal.

export PYTHONPATH=/home/ec2-user/environment/vote-your-movie

You should now be able to use the following command to execute the unit tests in your project.

python -m unittest discover vote-your-movie/tests

cloud9 setup

Start coding

Now that you have set up your local environment and have a copy of your code, add a DynamoDB table to the project by defining it through a template file. Open template.yml, which is the Serverless Application Model (SAM) template file. This template extends AWS CloudFormation to provide a simplified way of defining the Amazon API Gateway APIs, AWS Lambda functions, and Amazon DynamoDB tables required by your serverless application.

AWSTemplateFormatVersion: 2010-09-09
Transform:
- AWS::Serverless-2016-10-31
- AWS::CodeStar

Parameters:
  ProjectId:
    Type: String
    Description: CodeStar projectId used to associate new resources to team members

Resources:
  # The DB table to store the votes.
  MovieVoteTable:
    Type: AWS::Serverless::SimpleTable
    Properties:
      PrimaryKey:
        # Name of the "Candidate" is the partition key of the table.
        Name: Candidate
        Type: String
  # Creating a new lambda function for retrieving and storing votes.
  MovieVoteLambda:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: python3.6
      Environment:
        # Setting environment variables for your lambda function.
        Variables:
          TABLE_NAME: !Ref "MovieVoteTable"
          TABLE_REGION: !Ref "AWS::Region"
      Role:
        Fn::ImportValue:
          !Join ['-', [!Ref 'ProjectId', !Ref 'AWS::Region', 'LambdaTrustRole']]
      Events:
        GetEvent:
          Type: Api
          Properties:
            Path: /
            Method: get
        PostEvent:
          Type: Api
          Properties:
            Path: /
            Method: post
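
Before committing changes to template.yml, you can optionally sanity-check its syntax with the CloudFormation ValidateTemplate API. Here is a minimal sketch using boto3 (which we install in the next step); it assumes your AWS credentials and region are configured and that you run it from the project root.

import boto3

cfn = boto3.client('cloudformation')

# ValidateTemplate catches template syntax errors before the pipeline ever runs.
with open('template.yml') as template_file:
    cfn.validate_template(TemplateBody=template_file.read())
print('template.yml passed basic validation')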

We’ll use Python’s boto3 library to connect to AWS services, and Python’s mock library to mock AWS service calls in our unit tests. Use the following command to install both libraries:

pip install --upgrade boto3 mock -t .

install dependencies

Add these libraries to buildspec.yml, the YAML file that CodeBuild uses to run the build.

version: 0.2

phases:
  install:
    commands:

      # Upgrade the AWS CLI and install the boto3 and mock libraries used by the build and tests
      - pip install --upgrade awscli boto3 mock

  pre_build:
    commands:

      # Discover and run unit tests in the 'tests' directory. For more information, see <https://docs.python.org/3/library/unittest.html#test-discovery>
      - python -m unittest discover tests

  build:
    commands:

      # Package the serverless application by using the AWS CloudFormation package command
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml

artifacts:
  type: zip
  files:
    - template-export.yml

Open index.py, where we can write the simple voting logic for our Lambda function.

import json
import datetime
import boto3
import os

table_name = os.environ['TABLE_NAME']
table_region = os.environ['TABLE_REGION']

VOTES_TABLE = boto3.resource('dynamodb', region_name=table_region).Table(table_name)
CANDIDATES = {"A": "Black Panther", "B": "Captain America: Civil War", "C": "Guardians of the Galaxy", "D": "Thor: Ragnarok"}

def handler(event, context):
    if event['httpMethod'] == 'GET':
        resp = VOTES_TABLE.scan()
        return {'statusCode': 200,
                'body': json.dumps({item['Candidate']: int(item['Votes']) for item in resp['Items']}),
                'headers': {'Content-Type': 'application/json'}}

    elif event['httpMethod'] == 'POST':
        try:
            body = json.loads(event['body'])
        except:
            return {'statusCode': 400,
                    'body': 'Invalid input! Expecting a JSON.',
                    'headers': {'Content-Type': 'application/json'}}
        if 'candidate' not in body:
            return {'statusCode': 400,
                    'body': 'Missing "candidate" in request.',
                    'headers': {'Content-Type': 'application/json'}}
        if body['candidate'] not in CANDIDATES.keys():
            return {'statusCode': 400,
                    'body': 'You must vote for one of the following candidates - {}.'.format(get_allowed_candidates()),
                    'headers': {'Content-Type': 'application/json'}}

        resp = VOTES_TABLE.update_item(
            Key={'Candidate': CANDIDATES.get(body['candidate'])},
            UpdateExpression='ADD Votes :incr',
            ExpressionAttributeValues={':incr': 1},
            ReturnValues='ALL_NEW'
        )
        return {'statusCode': 200,
                'body': "{} now has {} votes".format(CANDIDATES.get(body['candidate']), resp['Attributes']['Votes']),
                'headers': {'Content-Type': 'application/json'}}

def get_allowed_candidates():
    l = []
    for key in CANDIDATES:
        l.append("'{}' for '{}'".format(key, CANDIDATES.get(key)))
    return ", ".join(l)

Our code takes the HTTP request as an event. If it is an HTTP GET request, the handler returns the vote results from the table. If it is an HTTP POST request, it records a vote for the chosen candidate. We also validate the input in the POST request to filter out malformed or malicious requests, so that only valid votes are stored in the table.

In the example code provided, we use a CANDIDATES variable to store our candidates, but you can store the candidates in a JSON file and use Python’s json library instead.
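
Before wiring this into the unit tests, a quick local smoke test can confirm that the handler behaves as expected. The following sketch is not part of the project files; it replaces the module-level table handle with a MagicMock so that no real DynamoDB call is made:

import os
# The environment variables must be set before index.py is imported.
os.environ['TABLE_NAME'] = 'MockTable'
os.environ['TABLE_REGION'] = 'us-east-1'

from mock import MagicMock
import index

# Stand in for the DynamoDB table so that update_item returns a canned response.
index.VOTES_TABLE = MagicMock()
index.VOTES_TABLE.update_item.return_value = {'Attributes': {'Candidate': 'Black Panther', 'Votes': 4}}

event = {'httpMethod': 'POST', 'body': '{"candidate": "A"}'}
print(index.handler(event, {}))  # expect statusCode 200 and 'Black Panther now has 4 votes'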

Let’s update the tests now. Under the tests folder, open test_handler.py and modify it to verify the logic.

import os
# Some mock environment variables that would be used by the mock for DynamoDB
os.environ['TABLE_NAME'] = "MockHelloWorldTable"
os.environ['TABLE_REGION'] = "us-east-1"

# The library containing our logic.
import index

# Boto3's core library
import botocore
# For handling JSON.
import json
# Unit test library
import unittest
## Getting StringIO based on your setup.
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
## Python mock library
from mock import patch, call
from decimal import Decimal

@patch('botocore.client.BaseClient._make_api_call')
class TestCandidateVotes(unittest.TestCase):

    ## Test the HTTP GET request flow. 
    ## We expect to get back a successful response with results of votes from the table (mocked).
    def test_get_votes(self, boto_mock):
        # Input event to our method to test.
        expected_event = {'httpMethod': 'GET'}
        # The mocked values in our DynamoDB table.
        items_in_db = [{'Candidate': 'Black Panther', 'Votes': Decimal('3')},
                        {'Candidate': 'Captain America: Civil War', 'Votes': Decimal('8')},
                        {'Candidate': 'Guardians of the Galaxy', 'Votes': Decimal('8')},
                        {'Candidate': "Thor: Ragnarok", 'Votes': Decimal('1')}
                    ]
        # The mocked DynamoDB response.
        expected_ddb_response = {'Items': items_in_db}
        # The mocked response we expect back by calling DynamoDB through boto.
        response_body = botocore.response.StreamingBody(StringIO(str(expected_ddb_response)),
                                                        len(str(expected_ddb_response)))
        # Setting the expected value in the mock.
        boto_mock.side_effect = [expected_ddb_response]
        # Expecting that there would be a call to DynamoDB Scan function during execution with these parameters.
        expected_calls = [call('Scan', {'TableName': os.environ['TABLE_NAME']})]

        # Call the function to test.
        result = index.handler(expected_event, {})

        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 200

        result_body = json.loads(result.get('body'))
        # Verifying that the results match to that from the table.
        assert len(result_body) == len(items_in_db)
        for i in range(len(result_body)):
            assert result_body.get(items_in_db[i].get("Candidate")) == int(items_in_db[i].get("Votes"))

        assert boto_mock.call_count == 1
        boto_mock.assert_has_calls(expected_calls)

    ## Test the HTTP POST request flow that places a vote for a selected candidate.
    ## We expect to get back a successful response with a confirmation message.
    def test_place_valid_candidate_vote(self, boto_mock):
        # Input event to our method to test.
        expected_event = {'httpMethod': 'POST', 'body': "{\"candidate\": \"D\"}"}
        # The mocked response in our DynamoDB table.
        expected_ddb_response = {'Attributes': {'Candidate': "Thor: Ragnarok", 'Votes': Decimal('2')}}
        # The mocked response we expect back by calling DynamoDB through boto.
        response_body = botocore.response.StreamingBody(StringIO(str(expected_ddb_response)),
                                                        len(str(expected_ddb_response)))
        # Setting the expected value in the mock.
        boto_mock.side_effect = [expected_ddb_response]
        # Expecting that there would be a call to DynamoDB UpdateItem function during execution with these parameters.
        expected_calls = [call('UpdateItem', {
                                                'TableName': os.environ['TABLE_NAME'], 
                                                'Key': {'Candidate': 'Thor: Ragnarok'},
                                                'UpdateExpression': 'ADD Votes :incr',
                                                'ExpressionAttributeValues': {':incr': 1},
                                                'ReturnValues': 'ALL_NEW'
                                            })]
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 200

        assert result.get('body') == "{} now has {} votes".format(
            expected_ddb_response['Attributes']['Candidate'], 
            expected_ddb_response['Attributes']['Votes'])

        assert boto_mock.call_count == 1
        boto_mock.assert_has_calls(expected_calls)

    ## Test the HTTP POST request flow that places a vote for a nonexistent candidate.
    ## We expect to get back a failed (400) response with an appropriate error message.
    def test_place_invalid_candidate_vote(self, boto_mock):
        # Input event to our method to test.
        # The valid IDs for the candidates are A, B, C, and D
        expected_event = {'httpMethod': 'POST', 'body': "{\"candidate\": \"E\"}"}
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 400
        assert result.get('body') == 'You must vote for one of the following candidates - {}.'.format(index.get_allowed_candidates())

    ## Test the HTTP POST request flow that places a vote for a selected candidate but associated with an invalid key in the POST body.
    ## We expect to get back a failed (400) response with an appropriate error message.
    def test_place_invalid_data_vote(self, boto_mock):
        # Input event to our method to test.
        # "name" is not the expected input key.
        expected_event = {'httpMethod': 'POST', 'body': "{\"name\": \"D\"}"}
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 400
        assert result.get('body') == 'Missing "candidate" in request.'

    ## Test the HTTP POST request flow that places a vote for a selected candidate but not as a JSON string which the body of the request expects.
    ## We expect to get back a failed (400) response with an appropriate error message.
    def test_place_malformed_json_vote(self, boto_mock):
        # Input event to our method to test.
        # "body" receives a string rather than a JSON string.
        expected_event = {'httpMethod': 'POST', 'body': "Thor: Ragnarok"}
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 400
        assert result.get('body') == 'Invalid input! Expecting a JSON.'

if __name__ == '__main__':
    unittest.main()

I am keeping the code samples well commented so that it’s clear what each unit test accomplishes. The tests cover both the success conditions and the failure paths handled in the logic.

In my unit tests, I use the patch decorator (@patch) from the mock library. @patch mocks the function you want to intercept (in this case, the botocore library’s _make_api_call function in the BaseClient class), so the tests never make real AWS calls.

Before we commit our changes, let’s run the tests locally. On the terminal, run the tests again. If all the unit tests pass, you should see a result like this:

You:~/environment $ python -m unittest discover vote-your-movie/tests
.....
----------------------------------------------------------------------
Ran 5 tests in 0.003s

OK
You:~/environment $

Upload to AWS

Now that the tests have passed, it’s time to commit and push the code to the source repository!

Add your changes

From the terminal, go to the project’s folder and use the following command to verify the changes you are about to push.

git status

To add the modified files only, use the following command:

git add -u

Commit your changes

To commit the changes (with a message), use the following command:

git commit -m "Logic and tests for the voting webservice."

Push your changes to AWS CodeCommit

To push your committed changes to CodeCommit, use the following command:

git push

In the AWS CodeStar console, you can see your changes flowing through the pipeline and being deployed. The console also links to this project’s build runs, so you can see your tests running on AWS CodeBuild. The latest link in the Build Runs table takes you to the logs.

unit tests at codebuild

After the deployment is complete, AWS CodeStar should now display the AWS Lambda function and DynamoDB table created and synced with this project. The Project link in the AWS CodeStar project’s navigation bar displays the AWS resources linked to this project.

codestar resources

Because this is a new database table, there should be no data in it. So, let’s put in some votes. You can download Postman to test your application endpoint for POST and GET calls. The endpoint you want to test is the URL displayed under Application endpoints in the AWS CodeStar console.

Now let’s open Postman and create some votes through POST requests. In this example, a valid vote has a value of A, B, C, or D.

Here’s what a successful POST request looks like:

POST success

Here’s what it looks like if I use some value other than A, B, C, or D:


POST Fail

Now I am going to use a GET request to fetch the results of the votes from the database.

GET success
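
If you would rather script these calls than click through Postman, here is a minimal sketch using Python’s requests library (assuming it is installed, and that ENDPOINT_URL holds the application endpoint shown in the AWS CodeStar console):

import requests

# Placeholder; copy the value from Application endpoints in the AWS CodeStar console.
ENDPOINT_URL = 'https://your-api-id.execute-api.us-east-1.amazonaws.com/Prod'

# Cast a vote for candidate D (Thor: Ragnarok).
response = requests.post(ENDPOINT_URL, json={'candidate': 'D'})
print(response.status_code, response.text)

# Fetch the current vote counts.
response = requests.get(ENDPOINT_URL)
print(response.status_code, response.json())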

And that’s it! You have now created a simple voting web service using AWS Lambda, Amazon API Gateway, and DynamoDB and used unit tests to verify your logic so that you ship good code.
Happy coding!