Tag Archives: AWS CodePipeline

Deploy data lake ETL jobs using CDK Pipelines

Post Syndicated from Ravi Itha original https://aws.amazon.com/blogs/devops/deploying-data-lake-etl-jobs-using-cdk-pipelines/

Many organizations are building data lakes on AWS, which provides the most secure, scalable, comprehensive, and cost-effective portfolio of services. Like any application development project, a data lake must answer a fundamental question: “What is the DevOps strategy?” Defining a DevOps strategy for a data lake requires extensive planning and multiple teams. This typically requires multiple development and test cycles before maturing enough to support a data lake in a production environment. If an organization doesn’t have the right people, resources, and processes in place, this can quickly become daunting.

What if your data engineering team uses basic building blocks to encapsulate data lake infrastructure and data processing jobs? This is where CDK Pipelines brings the full benefit of infrastructure as code (IaC). CDK Pipelines is a high-level construct library within the AWS Cloud Development Kit (AWS CDK) that makes it easy to set up a continuous deployment pipeline for your AWS CDK applications. The AWS CDK provides essential automation for your release pipelines so that your development and operations team remain agile and focus on developing and delivering applications on the data lake.

In this post, we discuss a centralized deployment solution utilizing CDK Pipelines for data lakes. This implements a DevOps-driven data lake that delivers benefits such as continuous delivery of data lake infrastructure, data processing, and analytical jobs through a configuration-driven multi-account deployment strategy. Let’s dive in!

Data lakes on AWS

A data lake is a centralized repository where you can store all of your structured and unstructured data at any scale. You can store your data as is, without having to structure it first, and run different types of analytics, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better decisions. To further explore data lakes, refer to What is a data lake?

We design a data lake with the following elements:

  • Secure data storage
  • Data cataloging in a central repository
  • Data movement
  • Data analysis

The following figure represents our data lake.

Data Lake on AWS

We use three Amazon Simple Storage Service (Amazon S3) buckets:

  • raw – Stores the input data in its original format
  • conformed – Stores the data that meets the data lake quality requirements
  • purpose-built – Stores the data that is ready for consumption by applications or data lake consumers

The data lake has a producer where we ingest data into the raw bucket at periodic intervals. We utilize the following tools:

  • AWS Glue to process and analyze the data
  • AWS Glue Data Catalog to persist metadata in a central repository
  • AWS Lambda and AWS Step Functions to schedule and orchestrate AWS Glue extract, transform, and load (ETL) jobs
  • Amazon Athena for interactive queries and analysis

Finally, we engage various AWS services for logging, monitoring, security, authentication, authorization, alerting, and notification.

A common data lake practice is to have multiple environments such as dev, test, and production. Applying the IaC principle for data lakes brings the benefit of consistent and repeatable runs across multiple environments, self-documenting infrastructure, and greater flexibility with resource management. The AWS CDK offers high-level constructs for use with all of our data lake resources. This simplifies usage and streamlines implementation.

Before exploring the implementation, let’s gain further scope of how we utilize our data lake.

The solution

Our goal is to implement a CI/CD solution that automates the provisioning of data lake infrastructure resources and deploys ETL jobs interactively. We accomplish this as follows: 1) applying the separation of concerns (SoC) design principle to data lake infrastructure and ETL jobs via dedicated source code repositories, 2) using a centralized deployment model built on CDK Pipelines, and 3) making the ETL pipelines AWS CDK enabled from the start.

Data lake infrastructure

Our data lake infrastructure provisioning includes Amazon S3 buckets, S3 bucket policies, AWS Key Management Service (KMS) encryption keys, Amazon Virtual Private Cloud (Amazon VPC), subnets, route tables, security groups, VPC endpoints, and secrets in AWS Secrets Manager. The following diagram illustrates this.

Data Lake Infrastructure
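
To give a flavor of how this infrastructure is expressed as code, the following minimal AWS CDK (Python) sketch shows how an encrypted raw-zone bucket might be defined. The construct IDs and property values are illustrative assumptions; the actual starter kit organizes these resources into its own stacks.

from aws_cdk import Stack, RemovalPolicy
from aws_cdk import aws_kms as kms, aws_s3 as s3
from constructs import Construct

class DataLakeStorageStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Customer-managed KMS key used to encrypt data at rest
        raw_key = kms.Key(self, "RawBucketKey", enable_key_rotation=True)

        # Raw zone bucket: KMS-encrypted, versioned, and closed to public access
        s3.Bucket(
            self, "RawBucket",
            encryption=s3.BucketEncryption.KMS,
            encryption_key=raw_key,
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            versioned=True,
            removal_policy=RemovalPolicy.RETAIN,
        )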

Data lake ETL jobs

For our ETL jobs, we process New York City TLC Trip Record Data. The following figure displays our ETL process, wherein we run two ETL jobs within a Step Functions state machine.

AWS Glue ETL Jobs

Here are a few important details:

  1. A file server uploads files to the S3 raw bucket of the data lake. The file server is a data producer and source for the data lake. We assume that the data is pushed to the raw bucket.
  2. Amazon S3 triggers an event notification to the Lambda function.
  3. The function inserts an item in the Amazon DynamoDB table in order to track the file processing state. The first state written indicates that the AWS Step Functions state machine has started.
  4. The function starts the state machine.
  5. The state machine runs an AWS Glue job (Apache Spark).
  6. The job processes input data from the raw zone to the data lake conformed zone. The job also converts CSV input data to Parquet formatted data.
  7. The job updates the Data Catalog table with the metadata of the conformed Parquet file.
  8. A second AWS Glue job (Apache Spark) processes the input data from the conformed zone to the purpose-built zone of the data lake.
  9. The job fetches ETL transformation rules from the Amazon S3 code bucket and transforms the input data.
  10. The job stores the result in Parquet format in the purpose-built zone.
  11. The job updates the Data Catalog table with the metadata of the purpose-built Parquet file.
  12. The job updates the DynamoDB table and updates the job status to completed.
  13. An Amazon Simple Notification Service (Amazon SNS) notification is sent to subscribers that states the job is complete.
  14. Data engineers or analysts can now analyze data via Athena.

Data formats, AWS Glue jobs, ETL transformation logic, data cataloging, auditing, notification, orchestration, and data analysis are discussed in more detail in the AWS CDK Pipelines for Data Lake ETL Deployment GitHub repository, which is introduced in a subsequent section.
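
As an illustration of the orchestration described above, the following AWS CDK (Python) sketch chains the two AWS Glue jobs inside a Step Functions state machine. The job names and construct IDs are placeholders, not the actual names used in the repository.

from aws_cdk import Stack, Duration
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks
from constructs import Construct

class EtlOrchestrationStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Run the raw-to-conformed Glue job and wait for it to finish
        raw_to_conformed = tasks.GlueStartJobRun(
            self, "RawToConformed",
            glue_job_name="raw-to-conformed-job",
            integration_pattern=sfn.IntegrationPattern.RUN_JOB,
        )

        # Then run the conformed-to-purpose-built Glue job
        conformed_to_purpose_built = tasks.GlueStartJobRun(
            self, "ConformedToPurposeBuilt",
            glue_job_name="conformed-to-purpose-built-job",
            integration_pattern=sfn.IntegrationPattern.RUN_JOB,
        )

        sfn.StateMachine(
            self, "EtlStateMachine",
            definition=raw_to_conformed.next(conformed_to_purpose_built),
            timeout=Duration.hours(2),
        )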

Centralized deployment

Now that we have data lake infrastructure and ETL jobs ready, let’s define our deployment model. This model is based on the following design principles:

  • A dedicated AWS account to run CDK pipelines.
  • One or more AWS accounts into which the data lake is deployed.
  • The data lake infrastructure has a dedicated source code repository. Typically, data lake infrastructure is a one-time deployment and rarely evolves. Therefore, a dedicated code repository provides a landing zone for your data lake.
  • Each ETL job has a dedicated source code repository. Each ETL job may have unique AWS service, orchestration, and configuration requirements. Therefore, a dedicated source code repository will help you more flexibly build, deploy, and maintain ETL jobs.

We organize our source code repo into three branches: dev (main), test, and prod. In the deployment account, we manage three separate CDK Pipelines, each sourced from a dedicated branch. We choose a branch-based software development method to demonstrate the strategy in more complex scenarios, where integration testing and validation layers require human intervention and, because of that manual nature, may not be immediately followed by a corresponding release or deployment. This approach lets changes propagate through environments without blocking independent development priorities. We accomplish it by isolating resources across environments in the central deployment account, which allows each environment to be managed independently and avoids cross-contamination during each pipeline’s self-mutating updates. The following diagram illustrates this method.

Centralized deployment

 

Note: This centralized deployment strategy can be adopted for trunk-based software development with minimal solution modification.

Deploying data lake ETL jobs

The following figure illustrates how we utilize CDK Pipelines to deploy data lake infrastructure and ETL jobs from a central deployment account. This model follows standard nomenclature from the AWS CDK. Each repository represents a cloud infrastructure code definition. This includes the pipelines construct definition. Pipelines have one or more actions, such as cloning the source code (source action) and synthesizing the stack into an AWS CloudFormation template (synth action). Each pipeline has one or more stages, such as testing and deploying. In an AWS CDK app context, the pipelines construct is a stack like any other stack. Therefore, when the AWS CDK app is deployed, a new pipeline is created in AWS CodePipeline.
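
For illustration, a minimal CDK Pipelines definition in Python might look like the following. The repository name, branch, and commands are placeholder assumptions; the starter kits described later wire this up with their own source actions and configuration.

from aws_cdk import Stack
from aws_cdk import pipelines
from constructs import Construct

class DataLakeInfraPipelineStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        pipeline = pipelines.CodePipeline(
            self, "DataLakeInfraPipeline",
            synth=pipelines.ShellStep(
                "Synth",
                # Source action: pull the dedicated infrastructure repository branch
                # (uses a GitHub token stored in Secrets Manager, default secret name: github-token)
                input=pipelines.CodePipelineSource.git_hub("my-org/data-lake-infra", "dev"),
                # Synth action: synthesize the CDK app into CloudFormation templates
                commands=["pip install -r requirements.txt", "npx cdk synth"],
            ),
        )

        # Deploy stage: add a Stage containing the data lake infrastructure stacks,
        # targeting the desired account and Region, for example:
        # pipeline.add_stage(DataLakeInfraStage(self, "Dev", env=dev_env))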

This model provides significant DevOps flexibility: as a developer with an understanding of the AWS CDK APIs, you can harness the power and scalability of AWS services such as CodePipeline, AWS CodeBuild, and AWS CloudFormation.

Deploying data lake ETL jobs using CDK Pipelines

Here are a few important details:

  1. The DevOps administrator checks in the code to the repository.
  2. The DevOps administrator (with elevated access) facilitates a one-time manual deployment on a target environment. Elevated access includes administrative privileges on the central deployment account and target AWS environments.
  3. CodePipeline listens for commit events on the source code repositories. This reflects the self-mutating nature of CDK Pipelines: each pipeline is configured to work with, and can update itself according to, the provided definition.
  4. Code changes made to the main repo branch are automatically deployed to the data lake dev environment.
  5. Code changes to the repo test branch are automatically deployed to the test environment.
  6. Code changes to the repo prod branch are automatically deployed to the prod environment.

CDK Pipelines starter kits for data lakes

Want to get going quickly with CDK Pipelines for your data lake? Start by cloning our two GitHub repositories. Here is a summary:

AWS CDK Pipelines for Data Lake Infrastructure Deployment

This repository contains the following reusable resources:

  • CDK Application
  • CDK Pipelines stack
  • CDK Pipelines deploy stage
  • Amazon VPC stack
  • Amazon S3 stack

It also contains the following automation scripts:

  • AWS environments configuration
  • Deployment account bootstrapping
  • Target account bootstrapping
  • Account secrets configuration (e.g., GitHub access tokens)

AWS CDK Pipelines for Data Lake ETL Deployment

This repository contains the following reusable resources:

  • CDK Application
  • CDK Pipelines stack
  • CDK Pipelines deploy stage
  • Amazon DynamoDB stack
  • AWS Glue stack
  • AWS Step Functions stack

It also contains the following:

  • AWS Lambda scripts
  • AWS Glue scripts
  • AWS Step Functions State machine script

Advantages

This section summarizes some of the advantages offered by this solution.

Scalable and centralized deployment model

We utilize a scalable and centralized deployment model to deliver end-to-end automation. This allows DevOps and data engineers to use the single responsibility principle while maintaining precise control over the deployment strategy and code quality. The model can readily be expanded to more accounts, and the pipelines are responsive to custom controls within each environment, such as a production approval layer.

Configuration-driven deployment

Configuration in the source code and AWS Secrets Manager allows deployments to utilize targeted values that are declared globally in a single location. This provides consistent management of global configurations and dependencies such as resource names, AWS account IDs, Regions, and VPC CIDR ranges. Similarly, CDK Pipelines exports CloudFormation stack outputs for later consumption by other resources.
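
As a simplified illustration, such a global configuration could be expressed as a Python mapping consumed by the CDK app. The account IDs, CIDR ranges, and key names below are placeholders; the starter kits use their own configuration scripts and parameter names.

# Hypothetical global configuration consumed by the CDK app
ENVIRONMENTS = {
    "dev": {
        "account_id": "111111111111",
        "region": "us-east-1",
        "vpc_cidr": "10.10.0.0/16",
        "resource_prefix": "datalake-dev",
    },
    "test": {
        "account_id": "222222222222",
        "region": "us-east-1",
        "vpc_cidr": "10.20.0.0/16",
        "resource_prefix": "datalake-test",
    },
    "prod": {
        "account_id": "333333333333",
        "region": "us-east-1",
        "vpc_cidr": "10.30.0.0/16",
        "resource_prefix": "datalake-prod",
    },
}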

Repeatable and consistent deployment of new ETL jobs

Continuous integration and continuous delivery (CI/CD) pipelines allow teams to deploy to production more frequently. Code changes can be safely and securely propagated through environments and released for deployment. This allows rapid iteration on data processing jobs, and these jobs can be changed in isolation from pipeline changes, resulting in reliable workflows.

Cleaning up

You can delete the resources provisioned by the starter kits by running the cdk destroy command with the AWS CDK Toolkit. For detailed instructions, refer to the Clean up sections in the starter kit README files.

Conclusion

In this post, we showed how to utilize CDK Pipelines to deploy infrastructure and data processing ETL jobs of your data lake in dev, test, and production AWS environments. We provided two GitHub repositories for you to test and realize the full benefits of this solution firsthand. We encourage you to fork the repositories, bring your ETL scripts, bootstrap your accounts, configure account parameters, and continuously deliver your data lake ETL jobs.

Let’s stay in touch via the GitHub repositories: AWS CDK Pipelines for Data Lake Infrastructure Deployment and AWS CDK Pipelines for Data Lake ETL Deployment.


About the authors

Ravi Itha

Ravi Itha is a Sr. Data Architect at AWS. He works with customers to design and implement Data Lakes, Analytics, and Microservices on AWS. He is an open-source committer and has published more than a dozen solutions using AWS CDK, AWS Glue, AWS Lambda, AWS Step Functions, Amazon ECS, Amazon MQ, Amazon SQS, Amazon Kinesis Data Streams, and Amazon Kinesis Data Analytics for Apache Flink. His solutions can be found at his GitHub handle. Outside of work, he is passionate about books, cooking, movies, and yoga.

 

 

Isaiah Grant

Isaiah Grant is a Cloud Consultant at 2nd Watch. His primary function is to design architectures and build cloud-based applications and services. He leads customer engagements and helps customers with enterprise cloud adoptions. In his free time, he is engaged in local community initiatives and enjoys being outdoors with his family.

 

 

 

 

Zahid Ali

Zahid Ali is a Data Architect at AWS. He helps customers design, develop, and implement data warehouse and Data Lake solutions on AWS. Outside of work he enjoys playing tennis, spending time outdoors, and traveling.

 

Secure and analyse your Terraform code using AWS CodeCommit, AWS CodePipeline, AWS CodeBuild and tfsec

Post Syndicated from César Prieto Ballester original https://aws.amazon.com/blogs/devops/secure-and-analyse-your-terraform-code-using-aws-codecommit-aws-codepipeline-aws-codebuild-and-tfsec/

Introduction

More and more customers are using Infrastructure-as-Code (IaC) to design and implement their infrastructure on AWS. This is why it is essential to have pipelines with Continuous Integration/Continuous Deployment (CI/CD) for infrastructure deployment. HashiCorp Terraform is one of the popular IaC tools for customers on AWS.

In this blog, I will guide you through building a CI/CD pipeline on AWS to analyze and identify possible configuration issues in your Terraform code templates. This helps mitigate security risks within your infrastructure deployment pipelines as part of CI/CD. To do this, we utilize AWS tools and the open-source tfsec tool, a static analysis security scanner for Terraform code that includes more than 90 preconfigured checks and the ability to add custom checks.

Solutions Overview

The architecture goes through a CI/CD pipeline created on AWS using AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, and Amazon ECR.

Our demo has two separate pipelines:

  1. CI/CD Pipeline to build and push our custom Docker image to Amazon ECR
  2. CI/CD Pipeline where our tfsec analysis is executed and Terraform provisions infrastructure

The tfsec checks and Terraform commands run through a buildspec file defined within an AWS CodeBuild action. This action calculates how many potential security risks currently exist within our Terraform templates and displays that count in our manual approval step for verification.

Architecture diagram
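
To sketch how the tfsec scan stage could be wired up with the AWS CDK in Python, the snippet below defines a CodeBuild project that uses the custom image from Amazon ECR and an inline buildspec that runs tfsec. It is illustrative only: the construct IDs and commands are assumptions, and the sample repository defines its own buildspec and pipeline actions.

from aws_cdk import Stack
from aws_cdk import aws_codebuild as codebuild, aws_ecr as ecr
from constructs import Construct

class TfsecScanStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Custom image containing tfsec and Terraform, built by the first pipeline
        image_repo = ecr.Repository.from_repository_name(
            self, "TfsecImage", "awsome-terraform-example-container"
        )

        codebuild.PipelineProject(
            self, "TfsecScan",
            environment=codebuild.BuildEnvironment(
                build_image=codebuild.LinuxBuildImage.from_ecr_repository(image_repo),
            ),
            build_spec=codebuild.BuildSpec.from_object({
                "version": "0.2",
                "phases": {
                    "build": {
                        "commands": [
                            # Run tfsec against the checked-out Terraform code and keep the output
                            "tfsec . | tee tfsec-output.txt || true",
                        ],
                    },
                },
                "artifacts": {"files": ["tfsec-output.txt"]},
            }),
        )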

Provisioning the infrastructure

We have created an AWS Cloud Development Kit (AWS CDK) app, written in Python and hosted in a Git repository, which you can use to deploy the two main pipelines that manage this scenario. For a list of the deployment prerequisites, see the README.md file.

Clone the repo in your local machine. Then, bootstrap and deploy the CDK stack:

git clone https://github.com/aws-samples/aws-cdk-tfsec
cd aws-cdk-tfsec
pip install -r requirements.txt
cdk bootstrap aws://account_id/eu-west-1
cdk deploy --all

The infrastructure creation takes around 5-10 minutes due to the creation of the AWS CodePipelines and the referenced repositories. Once the CDK has deployed the infrastructure, clone the two new AWS CodeCommit repos that have already been created and push the example code: first for the custom Docker image, and later for your Terraform code, like this:

git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/awsome-terraform-example-container
cd awsome-terraform-example-container
git checkout -b main
cp repos/docker_image/* .
git add .
git commit -am "First commit"
git push origin main

Once the Docker image is built and pushed to Amazon ECR, proceed with the Terraform repo. Check the pipeline progress on the AWS CodePipeline console.

Screenshot of CI/CD Pipeline to build Docker Image

git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/awsome-terraform-example
cd awsome-terraform-example
git checkout -b main
cp -aR repos/terraform_code/* .
git add .
git commit -am "First commit"
git push origin main

The Terraform provisioning AWS CodePipeline has the following aspect:

Screenshot of CodePipeline to run security and orchestrate IaC

The pipeline has three main stages:

  • Source – AWS CodeCommit stores the Terraform infrastructure repository; every time we push code to the main branch, the AWS CodePipeline is triggered.
  • tfsec analysis – AWS CodeBuild uses a buildspec file to execute the tfsec checks configured in it.

Screenshot showing tfsec analysis

The output shows the potential security issues that tfsec detected in our Terraform code and links to the corresponding checks defined by tfsec (see the tfsec documentation for the full list of checks). After the tfsec execution, a manual approval action lets us decide whether to proceed to the next steps or reject and stop the AWS CodePipeline execution.

The URL for review links to our tfsec output.

Screenshot of tfsec output

 

  • Terraform plan and Terraform apply – These stages generate and apply the infrastructure plan. After the terraform plan command and before terraform apply, a manual approval action is set up to decide whether to apply the changes.

After going through all of the stages, our Terraform infrastructure should be created.

Clean up

After completing your demo, feel free to delete your stack using the AWS CDK CLI:

cdk destroy --all

Conclusion

At AWS, security is our top priority. This post demonstrates how to build a CI/CD pipeline by using AWS Services to automate and secure your infrastructure as code via Terraform and tfsec.

Learn more about tfsec through the official documentation: https://tfsec.dev/

About the authors

 

César Prieto Ballester is a DevOps Consultant at Amazon Web Services. He enjoys automating everything and building infrastructure using code. Apart from work, he plays electric guitar and loves riding his mountain bike.

 

 

 

Bruno Bardelli is a Senior DevOps Consultant at Amazon Web Services. He loves to build applications and in his free time plays video games, practices aikido, and goes on walks with his dog.

Blue/Green deployment with AWS Developer tools on Amazon EC2 using Amazon EFS to host application source code

Post Syndicated from Rakesh Singh original https://aws.amazon.com/blogs/devops/blue-green-deployment-with-aws-developer-tools-on-amazon-ec2-using-amazon-efs-to-host-application-source-code/

Many organizations building modern applications require a shared and persistent storage layer for hosting and deploying data-intensive enterprise applications, such as content management systems, media and entertainment, distributed applications like machine learning training, etc. These applications demand a centralized file share that scales to petabytes without disrupting running applications and remains concurrently accessible from potentially thousands of Amazon EC2 instances.

Simultaneously, customers want to automate the end-to-end deployment workflow and leverage continuous methodologies utilizing AWS developer tools services for performing a blue/green deployment with zero downtime. A blue/green deployment is a deployment strategy wherein you create two separate, but identical environments. One environment (blue) is running the current application version, and one environment (green) is running the new application version. The blue/green deployment strategy increases application availability by generally isolating the two application environments and ensuring that spinning up a parallel green environment won’t affect the blue environment resources. This isolation reduces deployment risk by simplifying the rollback process if a deployment fails.

Amazon Elastic File System (Amazon EFS) provides a simple, scalable, and fully-managed elastic NFS file system for use with AWS Cloud services and on-premises resources. It scales on demand, thereby eliminating the need to provision and manage capacity in order to accommodate growth. Utilize Amazon EFS to create a shared directory that stores and serves code and content for numerous applications. Your application can treat a mounted Amazon EFS volume like local storage. This means you don’t have to deploy your application code every time the environment scales up to multiple instances to distribute load.

In this blog post, I will guide you through an automated process to deploy a sample web application on Amazon EC2 instances utilizing Amazon EFS mount to host application source code, and utilizing a blue/green deployment with AWS code suite services in order to deploy the application source code with no downtime.

How this solution works

This blog post includes a CloudFormation template to provision all of the resources needed for this solution. The CloudFormation stack deploys a Hello World application on Amazon Linux 2 EC2 Instances running behind an Application Load Balancer and utilizes an Amazon EFS mount point to store the application content. The AWS CodePipeline project utilizes AWS CodeCommit for version control, AWS CodeBuild for installing dependencies and creating artifacts, and AWS CodeDeploy to conduct the deployment on EC2 instances running in an Amazon EC2 Auto Scaling group.

Figure 1 below illustrates our solution architecture.

Sample solution architecture

Figure 1: Sample solution architecture

The event flow in Figure 1 is as follows:

  1. A developer commits code changes from their local repo to the CodeCommit repository. The commit triggers CodePipeline execution.
  2. CodeBuild execution begins to compile source code, install dependencies, run custom commands, and create deployment artifact as per the instructions in the Build specification reference file.
  3. During the build phase, CodeBuild copies the source-code artifact to Amazon EFS file system and maintains two different directories for current (green) and new (blue) deployments.
  4. After successfully completing the build step, CodeDeploy deployment kicks in to conduct a Blue/Green deployment to a new Auto Scaling Group.
  5. During the deployment phase, CodeDeploy mounts the EFS file system on new EC2 instances as per the CodeDeploy AppSpec file reference and conducts other deployment activities.
  6. After a successful deployment, a Lambda function is invoked to store a deployment environment parameter in Systems Manager Parameter Store. The parameter stores the current EFS mount name that the application utilizes.
  7. The AWS Lambda function updates the parameter value during every successful deployment with the current EFS location.

Prerequisites

For this walkthrough, the following are required:

Deploy the solution

Once you’ve assembled the prerequisites, download or clone the GitHub repo and store the files on your local machine. Utilize the commands below to clone the repo:

mkdir -p ~/blue-green-sample/
cd ~/blue-green-sample/
git clone https://github.com/aws-samples/blue-green-deployment-pipeline-for-efs

Once completed, utilize the following steps to deploy the solution in your AWS account:

  1. Create a private Amazon Simple Storage Service (Amazon S3) bucket by using this documentation
    AWS S3 console view when creating a bucket

    Figure 2: AWS S3 console view when creating a bucket

     

  2. Upload the cloned or downloaded GitHub repo files to the root of the S3 bucket. The S3 bucket object structure should look similar to Figure 3:
    AWS S3 bucket object structure after you upload the Github repo content

    Figure 3: AWS S3 bucket object structure

     

  3. Go to the S3 bucket and select the template name solution-stack-template.yml, and then copy the object URL.
  4. Open the CloudFormation console. Choose the appropriate AWS Region, and then choose Create Stack. Select With new resources.
  5. Select Amazon S3 URL as the template source, paste the object URL that you copied in Step 3, and then choose Next.
  6. On the Specify stack details page, enter a name for the stack and provide the following input parameter. Modify the default values for other parameters in order to customize the solution for your environment. You can leave everything as default for this walkthrough.
  • ArtifactBucket – The name of the S3 bucket that you created in the first step of the solution deployment. This is a mandatory parameter with no default value.
Defining the stack name and input parameters for the CloudFormation stack

Figure 4: Defining the stack name and input parameters for the CloudFormation stack

  7. Choose Next.
  8. On the Options page, keep the default values and then choose Next.
  9. On the Review page, confirm the details, acknowledge that CloudFormation might create IAM resources with custom names, and then choose Create Stack.
  10. Once the stack creation is marked as CREATE_COMPLETE, the following resources are created:
  • A virtual private cloud (VPC) configured with two public and two private subnets.
  • NAT Gateway, an EIP address, and an Internet Gateway.
  • Route tables for private and public subnets.
  • Auto Scaling Group with a single EC2 Instance.
  • Application Load Balancer and a Target Group.
  • Three security groups—one each for ALB, web servers, and EFS file system.
  • Amazon EFS file system with a mount target for each Availability Zone.
  • CodePipeline project with CodeCommit repository, CodeBuild, and CodeDeploy resources.
  • SSM parameter to store the environment current deployment status.
  • Lambda function to update the SSM parameter for every successful pipeline execution.
  • Required IAM Roles and policies.

      Note: It may take anywhere from 10-20 minutes to complete the stack creation.

Test the solution

Now that the solution stack is deployed, follow the steps below to test the solution:

  1. Validate CodePipeline execution status

After successfully creating the CloudFormation stack, a CodePipeline execution automatically triggers to deploy the default application code version from the CodeCommit repository.

  • In the AWS console, choose Services and then CloudFormation. Select your stack name. On the stack Outputs tab, look for the CodePipelineURL key and click on the URL.
  • Validate that all steps have successfully completed. For a successful CodePipeline execution, you should see something like Figure 5. Wait for the execution to complete in case it is still in progress.
CodePipeline console showing execution status of all stages

Figure 5: CodePipeline console showing execution status of all stages

 

  2. Validate the Website URL

After completing the pipeline execution, hit the website URL on a browser to check if it’s working.

  • On the stack Outputs tab, look for the WebsiteURL key and click on the URL.
  • For a successful deployment, it should open a default page similar to Figure 6.
Sample “Hello World” application (Green deployment)

Figure 6: Sample “Hello World” application (Green deployment)

 

  3. Validate the EFS share

After the website deployed successfully, we will get into the application server and validate the EFS mount point and the application source code directory.

  • Open the Amazon EC2 console, and then choose Instances in the left navigation pane.
  • Select the instance named bg-sample and choose Connect.
  • For Connection method, choose Session Manager, and then choose Connect.

After the connection is made, run the following bash commands to validate the EFS mount and the deployed content. Figure 7 shows a sample output from running the bash commands.

sudo df -h | grep efs
ls -la /efs/green
ls -la /var/www/
Sample output from the bash command (Green deployment)

Figure 7: Sample output from the bash command (Green deployment)

 

  4. Deploy a new revision of the application code

After verifying the application status and the deployed code on the EFS share, commit some changes to the CodeCommit repository in order to trigger a new deployment.

  • On the stack Outputs tab, look for the CodeCommitURL key and click on the corresponding URL.
  • Open the application HTML file in the repository and choose Edit.
  • Uncomment line 9 and comment line 10, so that the new lines look like those below after the changes:
background-color: #0188cc; 
#background-color: #90ee90;
  • Add Author name, Email address, and then choose Commit changes.

After you commit the code, the CodePipeline triggers and executes Source, Build, Deploy, and Lambda stages. Once the execution completes, hit the Website URL and you should see a new page like Figure 8.

New Application version (Blue deployment)

Figure 8: New Application version (Blue deployment)

 

On the EFS side, the application directory on the new EC2 instance now points to /efs/blue as shown in Figure 9.

Sample output from the bash command (Blue deployment)

Figure 9: Sample output from the bash command (Blue deployment)

Solution review

Let’s review the pipeline stages details and what happens during the Blue/Green deployment:

1) Build stage

For this sample application, the CodeBuild project is configured to mount the EFS file system and utilize the buildspec.yml file present in the source code root directory to run the build. Following is the sample build spec utilized in this solution:

version: 0.2
phases:
  install:
    runtime-versions:
      php: latest   
  build:
    commands:
      - current_deployment=$(aws ssm get-parameter --name $SSM_PARAMETER --query "Parameter.Value" --region $REGION --output text)
      - echo $current_deployment
      - echo $SSM_PARAMETER
      - echo $EFS_ID $REGION
      - if [[ "$current_deployment" == "null" ]]; then echo "this is the first GREEN deployment for this project" ; dir='/efs/green' ; fi
      - if [[ "$current_deployment" == "green" ]]; then dir='/efs/blue' ; else dir='/efs/green' ; fi
      - if [ ! -d $dir ]; then  mkdir $dir >/dev/null 2>&1 ; fi
      - echo $dir
      - rsync -ar $CODEBUILD_SRC_DIR/ $dir/
artifacts:
  files:
      - '**/*'

During the build job, the following activities occur:

  • Installs the latest PHP runtime version.
  • Reads the SSM parameter value in order to know the current deployment and decide which directory to utilize. The SSM parameter value flips between green and blue for every successful deployment.
  • Synchronizes the latest source code to the EFS mount point.
  • Creates artifacts to be utilized in subsequent stages.

Note: Utilize the default buildspec.yml as a reference and customize it further as per your requirement. See this link for more examples.

2) Deploy Stage

The solution utilizes the CodeDeploy blue/green deployment type for EC2/On-premises. The deployment environment is configured to provision a new EC2 Auto Scaling group for every deployment in order to deploy the new application revision. CodeDeploy creates the new Auto Scaling group by copying the current one. See this link for more details on blue/green deployment configuration with CodeDeploy. During each deployment event, CodeDeploy utilizes the appspec.yml file to run the deployment steps as per the defined lifecycle hooks. Following is the sample AppSpec file utilized in this solution.

version: 0.0
os: linux
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies
      timeout: 180
      runas: root
  AfterInstall:
    - location: scripts/app_deployment
      timeout: 180
      runas: root
  BeforeAllowTraffic:
    - location: scripts/check_app_status
      timeout: 180
      runas: root

Note: The scripts mentioned in the AppSpec file are available in the scripts directory of the CodeCommit repository. Utilize these sample scripts as a reference and modify as per your requirement.

For this sample, the following steps are conducted during a deployment:

  • BeforeInstall:
    • Installs required packages on the EC2 instance.
    • Mounts the EFS file system.
    • Creates a symbolic link to point the apache home directory /var/www/html to the appropriate EFS mount point. It also ensures that the new application version deploys to a different EFS directory without affecting the current running application.
  • AfterInstall:
    • Stops apache web server.
    • Fetches current EFS directory name from Systems Manager.
    • Runs some clean up commands.
    • Restarts apache web server.
  • BeforeAllowTraffic:
    • Checks application status if running fine.
    • Exits the deployment with an error if the app returns a non-200 HTTP status code.

3) Lambda Stage

After completing the deploy stage, CodePipeline triggers a Lambda function in order to update the SSM parameter value with the updated EFS directory name. This parameter value alternates between “blue” and “green” to help CodePipeline identify the right EFS file system path during the next deployment.
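
A minimal sketch of such a function, using boto3, might look like the following. The parameter name is a placeholder, and the actual function in the sample repository may read values from the pipeline invocation instead of computing them.

import boto3

ssm = boto3.client("ssm")
PARAMETER_NAME = "/blue-green-sample/current-deployment"  # placeholder parameter name

def lambda_handler(event, context):
    # Read which EFS directory the previous (now replaced) deployment used
    current = ssm.get_parameter(Name=PARAMETER_NAME)["Parameter"]["Value"]

    # Alternate the value so the next CodeBuild run targets the other EFS directory
    next_value = "blue" if current == "green" else "green"
    ssm.put_parameter(Name=PARAMETER_NAME, Value=next_value, Type="String", Overwrite=True)
    return {"previous": current, "current": next_value}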

CodeDeploy Blue/Green deployment

Let’s review the sequence of events flow during the CodeDeploy deployment:

  1. CodeDeploy creates a new Auto Scaling group by copying the original one.
  2. Provisions a replacement EC2 instance in the new Auto Scaling Group.
  3. Conducts the deployment on the new instance as per the instructions in the appspec.yml file.
  4. Sets up health checks and redirects traffic to the new instance.
  5. Terminates the original instance along with the Auto Scaling Group.
  6. After completing the deployment, it should appear as shown in Figure 10.
AWS CodeDeploy console view of a Blue/Green CodeDeploy deployment on EC2

Figure 10: AWS console view of a Blue/Green CodeDeploy deployment on EC2

Troubleshooting

To troubleshoot any service-related issues, see the following links:

More information

Now that you have tested the solution, here are some additional points worth noting:

  • The sample template and code utilized in this blog can work in any AWS region and are mainly intended for demonstration purposes. Utilize the sample as a reference and modify it further as per your requirement.
  • This solution works with single account, Region, and VPC combination.
  • For this sample, we have utilized AWS CodeCommit as version control, but you can also utilize any other source supported by AWS CodePipeline, like Bitbucket, GitHub, or GitHub Enterprise Server.

Clean up

Follow these steps to delete the components and avoid any future incurring charges:

  1. Open the AWS CloudFormation console.
  2. On the Stacks page in the CloudFormation console, select the stack that you created for this blog post. The stack must be currently running.
  3. In the stack details pane, choose Delete.
  4. Select Delete stack when prompted.
  5. Empty and delete the S3 bucket created during deployment step 1.

Conclusion

In this blog post, you learned how to set up a complete CI/CD pipeline for conducting a blue/green deployment on EC2 instances utilizing an Amazon EFS file share as the mount point to host application source code. The EFS share is the central location hosting your application content, and it helps reduce your overall deployment time by eliminating the need to deploy a new revision to every EC2 instance's local storage. It also helps preserve any dynamically generated content when the life of an EC2 instance ends.

Author bio

Rakesh Singh

Rakesh is a Senior Technical Account Manager at Amazon. He loves automation and enjoys working directly with customers to solve complex technical issues and provide architectural guidance. Outside of work, he enjoys playing soccer, singing karaoke, and watching thriller movies.

Implementing Multi-Region Disaster Recovery Using Event-Driven Architecture

Post Syndicated from Vaibhav Shah original https://aws.amazon.com/blogs/architecture/implementing-multi-region-disaster-recovery-using-event-driven-architecture/

In this blog post, we share a reference architecture that uses a multi-Region active/passive strategy to implement a hot standby strategy for disaster recovery (DR).

We highlight the benefits of performing DR failover using an event-driven, serverless architecture, which provides high reliability, one of the pillars of the AWS Well-Architected Framework.

With the multi-Region active/passive strategy, your workloads operate in primary and secondary Regions with full capacity. The main traffic flows through the primary Region, while the secondary Region acts as a recovery Region in case of a disaster event. This makes your infrastructure more resilient and highly available and allows business continuity with minimal impact on production workloads. This blog post aligns with the Disaster Recovery Series that explains various DR strategies that you can implement based on your goals for recovery time objectives (RTO), recovery point objectives (RPO), and cost.

DR Strategies

Figure 1. DR strategies

Keeping RTO and RPO low

DR allows you to recover from various unforeseen failures that may make a Region unusable, including human errors causing misconfiguration, technical failures, natural disasters, etc. DR also mitigates the impact of disaster events and improves resiliency, which keeps Service Level Agreements high with minimum impact on business continuity.

As shown in Figure 2, the multi-Region active/passive strategy switches request traffic from the primary to the secondary Region by updating DNS records through Amazon Route 53 routing policies. This keeps RTO and RPO low.

DR implementation architecture on multi-Region active-passive workloads

Figure 2. DR implementation architecture on multi-Region active/passive workloads

Deploying your multi-Region workload with AWS CodePipeline

In the multi-Region active/passive strategy, your workload is deployed at full capacity in the primary and secondary AWS Regions using AWS CloudFormation. With AWS CodePipeline, one deploy stage within the pipeline deploys the stack to the primary Region (Figure 3). After that, the same stack is copied to the secondary Region.

The workloads in the primary and secondary Regions will be treated as two different environments. However, they will run the same version of your application for consistency and availability in event of a failure.

Deploying new versions to Lambda using CodePipeline and CloudFormation in two Regions

Figure 3. Deploying new versions to Lambda using CodePipeline and CloudFormation in two Regions

Fail over with event-driven serverless architecture

The event-driven serverless architecture performs failover by updating the weights of the Route 53 records, which shifts the traffic flow from the primary to the secondary Region. The operation specifies the source Region from which the failover is happening and the destination Region.

For a given application, there will be two Route 53 records with the same name. The two records will point at two different endpoints for the application deployed in two different Regions.

The record pointing at the endpoint in the primary Region uses a weighted routing policy with a weight of 100, which means all request traffic is served by the endpoint in the primary Region. Similarly, the second record has a weight of 0 and points to the endpoint in the secondary Region, which means none of the request traffic reaches that endpoint. This process can be repeated for multiple API endpoints. Information about the applications, such as DNS records, endpoints, hosted zone IDs, Regions, and weights, is stored in a DynamoDB table.

To initiate a failover, Amazon API Gateway calls an AWS Lambda function that scans each item in the Amazon DynamoDB table. The function then updates the weighted policy of the Route 53 records and the weight attribute in the DynamoDB table.
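
The following boto3 sketch illustrates what such a failover function could look like. The table name, key schema, and attribute names are assumptions for illustration, and the record type and TTL would depend on how your endpoints are published.

import boto3

dynamodb = boto3.resource("dynamodb")
route53 = boto3.client("route53")
table = dynamodb.Table("dr-failover-config")  # placeholder table name

def lambda_handler(event, context):
    # The event names the Region to fail over to, e.g. {"target_region": "us-west-2"}
    target_region = event["target_region"]

    # Minimal scan; a production function would handle pagination and errors
    for item in table.scan()["Items"]:
        # Records for the target Region get all traffic; the others get none
        weight = 100 if item["region"] == target_region else 0
        route53.change_resource_record_sets(
            HostedZoneId=item["hosted_zone_id"],
            ChangeBatch={
                "Changes": [{
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": item["record_name"],
                        "Type": "CNAME",
                        "SetIdentifier": item["region"],
                        "Weight": weight,
                        "TTL": 60,
                        "ResourceRecords": [{"Value": item["endpoint"]}],
                    },
                }]
            },
        )
        # Keep the DynamoDB item in sync with the Route 53 weight
        table.update_item(
            Key={"record_name": item["record_name"], "region": item["region"]},
            UpdateExpression="SET weight = :w",
            ExpressionAttributeValues={":w": weight},
        )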

The API Gateway, Lambda function, and global DynamoDB table are all deployed in both the primary and secondary Regions.

Ensuring workload availability after a disaster

In the event of disaster, the data in the affected Region must be available in the recovery Region. In this section, we talk about fail over of databases like DynamoDB tables and Amazon Relational Database Service (Amazon RDS) databases.

Global DynamoDB tables

With global tables, you have replicas of the same table in two different Regions; any changes made to the table in the primary Region are replicated to the secondary Region and vice versa. Once failover occurs, the request traffic moves through the recovery Region and connects to its databases, meaning your data and workloads are still working and available.
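
For illustration, a global table like the failover configuration table used in the sketch above could be declared with the AWS CDK in Python as follows; the table name, keys, and replica Region are placeholder assumptions.

from aws_cdk import Stack, RemovalPolicy
from aws_cdk import aws_dynamodb as dynamodb
from constructs import Construct

class FailoverConfigTableStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Global table: writes in either Region are replicated to the other
        dynamodb.Table(
            self, "DrFailoverConfig",
            table_name="dr-failover-config",
            partition_key=dynamodb.Attribute(name="record_name", type=dynamodb.AttributeType.STRING),
            sort_key=dynamodb.Attribute(name="region", type=dynamodb.AttributeType.STRING),
            billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
            replication_regions=["us-west-2"],  # placeholder secondary Region
            removal_policy=RemovalPolicy.DESTROY,
        )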

Amazon RDS database

Amazon RDS does not offer the same failover features that DynamoDB tables do. However, Amazon Aurora does offer a replication feature to read replicas in other Regions.

To use this feature, you’ll create an RDS database in the primary Region and enable backup replication in the configuration. In case of a DR event, you can choose to restore the replicated backup on the Amazon RDS instance in the destination Region.

Approach

Initiating DR

The DR process can be initiated manually or automatically based on certain metrics like status checks, error rates, etc. If the established thresholds are reached for these metrics, it signifies the workloads in the primary Region are failing.

You can initiate the DR process automatically by invoking an API call that can initiate backend automation. This allows you to measure how resilient and reliable your workload is and how quickly you can switch traffic to another Region if a real disaster happens.

Failover

The failover process is initiated by API Gateway invoking a Lambda function, as shown on the left side of Figure 2. The Lambda function then performs the failover by updating the weights of the Route 53 records and the corresponding weight attributes in the DynamoDB table. Similar steps can be performed for failing over database endpoints.

Once failover is complete, you’ll want to monitor traffic using Amazon CloudWatch. The List the available CloudWatch metrics for your instances user guide provides common metrics for you to monitor for Amazon Elastic Compute Cloud (Amazon EC2). The Amazon ECS CloudWatch metrics user guide provides common metrics to monitor for Amazon Elastic Container Service (Amazon ECS).

Failback

Once failover is successful and you’ve proven that traffic is being successfully routed to the new Region, you’ll failback to the primary Region. Similar metrics can be monitored in the secondary Region like you did in the Failover section.

Testing and results

Regularly test your DR process and evaluate the results and metrics. Based on the success and misses, you can make nearly continuous improvements in the DR process.

The critical applications within your organization will likely change over time, so it is important to evaluate which applications are mission critical and require an active/passive DR strategy. If an application is no longer mission critical, another DR strategy may be more appropriate.

Conclusion

The multi-Region active/passive strategy is one way to implement DR for applications hosted on AWS. It can fail over several applications in a short period of time by using serverless capabilities.

By using this strategy, your applications will be highly available and resilient to Regional failures. It supports business continuity, limits revenue losses, and keeps the impact on performance efficiency, one of the pillars of the AWS Well-Architected Framework, to a minimum for your customers. For critical applications, you can significantly reduce DR time by trading higher costs for lower RTO and RPO.

Related information

Use the Snyk CLI to scan Python packages using AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild

Post Syndicated from BK Das original https://aws.amazon.com/blogs/devops/snyk-cli-scan-python-codecommit-codepipeline-codebuild/

One of the primary advantages of working in the cloud is achieving agility in product development. You can adopt practices like continuous integration and continuous delivery (CI/CD) and GitOps to increase your ability to release code at quicker iterations. Development models like these demand agility from security teams as well. This means your security team has to provide the tooling and visibility to developers for them to fix security vulnerabilities as quickly as possible.

Vulnerabilities in cloud-native applications can be roughly classified into infrastructure misconfigurations and application vulnerabilities. In this post, we focus on enabling developers to scan vulnerable data around Python open-source packages using the Snyk Command Line Interface (CLI).

The world of package dependencies

Traditionally, code scanning is performed by the security team; they either ship the code to the scanning instance, or in some cases ship it to the vendor for vulnerability scanning. After the vendor finishes the scan, the results are provided to the security team and forwarded to the developer. The end-to-end process of organizing the repositories, sending the code to security team for scanning, getting results back, and remediating them is counterproductive to the agility of working in the cloud.

Let’s take an example of package A, which uses package B and C. To scan package A, you scan package B and C as well. Similar to package A having dependencies on B and C, packages B and C can have their individual dependencies too. So the dependencies for each package get complex and cumbersome to scan over time. The ideal method is to scan all the dependencies in one go, without having manual intervention to understand the dependencies between packages.

Building on the foundation of GitOps and Gitflow

GitOps was introduced in 2017 by Weaveworks as a DevOps model to implement continuous deployment for cloud-native applications. It focuses on the developer's ability to ship code faster. Because security is a non-negotiable piece of any application, this solution includes security as part of the deployment process. We define the Snyk scanner as declarative and immutable AWS Cloud Development Kit (AWS CDK) code, which instructs new Python code committed to the repository to be scanned.

Another continuous delivery practice that we base this solution on is Gitflow. Gitflow is a strict branching model that enables project release by enforcing a framework for managing Git projects. As a brief introduction to Gitflow: typically you have a main branch, which holds the code sent to production, and a development branch where new code is committed. After the code in the development branch passes all tests, it's merged into the main branch, thereby becoming the code in production. In this solution, we aim to provide this scanning capability in all your branches, providing security observability through your entire Gitflow.

AWS services used in this solution

We use the following AWS services as part of this solution:

  • AWS CDK – The AWS CDK is an open-source software development framework to define your cloud application resources using familiar programming languages. In this solution, we use Python to write our AWS CDK code.
  • AWS CodeBuild – CodeBuild is a fully managed build service in the cloud. CodeBuild compiles your source code, runs unit tests, and produces artifacts that are ready to deploy. CodeBuild eliminates the need to provision, manage, and scale your own build servers.
  • AWS CodeCommit – CodeCommit is a fully managed source control service that hosts secure Git-based repositories. It makes it easy for teams to collaborate on code in a secure and highly scalable ecosystem. CodeCommit eliminates the need to operate your own source control system or worry about scaling its infrastructure. You can use CodeCommit to securely store anything from source code to binaries, and it works seamlessly with your existing Git tools.
  • AWS CodePipeline – CodePipeline is a continuous delivery service you can use to model, visualize, and automate the steps required to release your software. You can quickly model and configure the different stages of a software release process. CodePipeline automates the steps required to release your software changes continuously.
  • Amazon EventBridge – EventBridge rules deliver a near-real-time stream of system events that describe changes in AWS resources. With simple rules that you can quickly set up, you can match events and route them to one or more target functions or streams.
  • AWS Systems Manager Parameter Store – Parameter Store, a capability of AWS Systems Manager, provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, Amazon Machine Image (AMI) IDs, and license codes as parameter values.

Prerequisites

Before you get started, make sure you have the following prerequisites:

  • An AWS account (use a Region that supports CodeCommit, CodeBuild, Parameter Store, and CodePipeline)
  • A Snyk account
  • An existing CodeCommit repository you want to test on

Architecture overview

After you complete the steps in this post, you will have a working pipeline that scans your Python code for open-source vulnerabilities.

We use the Snyk CLI, which is available to customers on all plans, including the Free Tier, and provides the ability to programmatically scan repositories for vulnerabilities in open-source dependencies as well as base image recommendations for container images. The following reference architecture represents a general workflow of how Snyk performs the scan in an automated manner. The design uses DevSecOps principles of automation, event-driven triggers, and keeping humans out of the loop for its run.

As developers keep working on their code, they continue to commit their code to the CodeCommit repository. Upon each commit, a CodeCommit API call is generated, which is then captured using the EventBridge rule. You can customize this event rule for a specific event or feature branch you want to trigger the pipeline for.
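
As an illustration, the commit-triggered rule could be expressed with the AWS CDK in Python as shown below. The ARNs, construct IDs, and branch are placeholders; the actual construct in the sample repository may wire this differently.

from aws_cdk import Stack
from aws_cdk import aws_codecommit as codecommit
from aws_cdk import aws_codepipeline as codepipeline
from aws_cdk import aws_events_targets as targets
from constructs import Construct

class SnykScanTriggerExample(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # References to the existing repository and scanning pipeline (placeholder ARNs)
        repo = codecommit.Repository.from_repository_arn(
            self, "ScannedRepo",
            "arn:aws:codecommit:us-east-1:111111111111:snyk-repo",
        )
        pipeline = codepipeline.Pipeline.from_pipeline_arn(
            self, "ScanPipeline",
            "arn:aws:codepipeline:us-east-1:111111111111:snyk-scan-pipeline",
        )

        # EventBridge rule: start the pipeline on every commit to the monitored branch
        repo.on_commit(
            "OnCommitToMain",
            branches=["main"],
            target=targets.CodePipeline(pipeline),
        )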

When the developer commits code to the specified branch, that EventBridge event rule triggers a CodePipeline pipeline. This pipeline has a build stage using CodeBuild. This stage interacts with the Snyk CLI, and uses the token stored in Parameter Store. The Snyk CLI uses this token as authentication and starts scanning the latest code committed to the repository. When the scan is complete, you can review the results on the Snyk console.

This code is built for Python pip packages. You can edit the buildspec.yml to incorporate any other language that Snyk supports.

The following diagram illustrates our architecture.

snyk architecture codepipeline

Code overview

The code in this post is written using the AWS CDK in Python. If you’re not familiar with the AWS CDK, we recommend reading Getting started with AWS CDK before you customize and deploy the code.

Repository URL: https://github.com/aws-samples/aws-cdk-codecommit-snyk

This AWS CDK construct uses the Snyk CLI within the CodeBuild job in the pipeline to scan the Python packages for open-source package vulnerabilities. The construct uses CodePipeline to create a two-stage pipeline: one source, and one build (the Snyk scan stage). The construct takes the input of the CodeCommit repository you want to scan, the Snyk organization ID, and Snyk auth token.

Resources deployed

This solution deploys the following resources:

For the deployment, we use the AWS CDK construct in the codebase cdk_snyk_construct/cdk_snyk_construct_stack.py in the AWS CDK stack cdk-snyk-stack. The construct requires the following parameters:

  • ARN of the CodeCommit repo you want to scan
  • Name of the repository branch you want to be monitored
  • Parameter Store name of the Snyk organization ID
  • Parameter Store name for the Snyk auth token

Set up the organization ID and auth token before deploying the stack. Because these are confidential and sensitive values, you should create them through a separate stack or a manual process. In this solution, the parameters are stored as the SecureString parameter type and encrypted using the AWS managed KMS key.
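
For example, the two parameters could be created ahead of time with a short boto3 script such as the one below. The parameter names are placeholders, and SecureString values are encrypted with the AWS managed KMS key unless you specify your own key.

import boto3

ssm = boto3.client("ssm")

# Parameter names are placeholders; pass the same names to the CDK stack later.
for name, value in [
    ("/snyk/org-id", "<your-snyk-org-id>"),
    ("/snyk/auth-token", "<your-snyk-auth-token>"),
]:
    ssm.put_parameter(
        Name=name,
        Value=value,
        Type="SecureString",  # encrypted with the AWS managed KMS key by default
        Overwrite=True,
    )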

You create the organization ID and auth token on the Snyk console. On the Settings page, choose General in the navigation pane to add these parameters.

snyk settings console

 

You can retrieve the names of the parameters on the Systems Manager console by navigating to Parameter Store and finding the name on the Overview tab.

SSM Parameter Store

Create a requirements.txt file in the CodeCommit repository

We now create a repository in CodeCommit to store the code. For simplicity, we primarily store the requirements.txt file in our repository. In Python, a requirements file stores the packages that are used. Having clearly defined packages and versions makes it easier for development, especially in virtual environments.

For more information on the requirements file in Python, see Requirement Specifiers.

To create a CodeCommit repository, run the following AWS Command Line Interface (AWS CLI) command in your AWS accounts:

aws codecommit create-repository --repository-name snyk-repo \
--repository-description "Repository for Snyk to scan Python packages"

Now let’s create a branch called main in the repository using the following command:

aws codecommit create-branch --repository-name snyk-repo \
--branch-name main

After you create the repository, commit a file named requirements.txt with the following content. The following packages are pinned to particular versions that have known vulnerabilities. This file is our hypothetical vulnerable set of packages that has been committed into your development code.

PyYAML==5.3.1
Pillow==7.1.2
pylint==2.5.3
urllib3==1.25.8

 

For instructions on committing files in CodeCommit, see Connect to an AWS CodeCommit repository.

When you store the Snyk auth token and organization ID in Parameter Store, note the parameter names—you need to pass them as parameters during the deployment step.

Now clone the CDK code from the GitHub repository with the command below:

git clone https://github.com/aws-samples/aws-cdk-codecommit-snyk.git

After the cloning is complete you should see a directory named aws-cdk-codecommit-snyk on your machine.

When you’re ready to deploy, enter the aws-cdk-codecommit-snyk directory, and run the following command with the appropriate values:

cdk deploy cdk-snyk-stack \
--parameters RepoName=<name-of-codecommit-repo> \
--parameters RepoBranch=<branch-to-be-scanned>  \
--parameters SnykOrgId=<value> \
--parameters SnykAuthToken=<value>

After the stack deployment is complete, you can see a new pipeline in your AWS account, which is configured to be triggered every time a commit occurs on the main branch.

You can view the results of the scan on the Snyk console. After the pipeline runs, log in to snyk.io and you should see a project named as per your repository (see the following screenshot).

snyk dashboard

 

Choose the repo name to get a detailed view of the vulnerabilities found. Depending on what packages you put in your requirements.txt, your report will differ from the following screenshot.

snyk-vuln-details

 

To fix the vulnerability identified, you can change the version of these packages in the requirements.txt file. The edited requirements file should look like the following:

PyYAML==5.4
Pillow==8.2.0
pylint==2.6.1
urllib3==1.25.9

After you update the requirements.txt file in your repository, push your changes back to the CodeCommit repository you created earlier on the main branch. The push starts the pipeline again.

After the commit to the targeted branch, the vulnerabilities are no longer reported on the Snyk dashboard because the updated package versions no longer contain them.

Clean up

To avoid accruing further cost for the resources deployed in this solution, run cdk destroy to remove all the AWS resources you deployed through CDK.

As the CodeCommit repository was created using AWS CLI, the following command deletes the CodeCommit repository:

aws codecommit delete-repository --repository-name snyk-repo

Conclusion

In this post, we provided a solution that enables developers to self-remediate vulnerabilities in their code by monitoring it with Snyk. This solution provides observability, agility, and security for your Python application by following DevOps principles.

A similar architecture has been used at NFL to shift the security of their code to the left. According to the shift-left design principle, security should be moved closer to the developers so that security issues are identified and remediated earlier in the development cycle. With this architecture, NFL made the total process, from committing code on the branch to remediation, 15 times faster than their previous code scanning setup.

Here’s what NFL has to say about their experience:

“NFL used Snyk to scan Python packages for a service launch. Traditionally it would have taken 10 days to scan the packages through our existing process but with Snyk we were able to follow DevSecOps principles and get the scans completed, and reviewed within matter of days. This simplified our time to market while maintaining visibility into our security posture.” – Joe Steinke (Director, Data Solution Architect)

Enforcing AWS CloudFormation scanning in CI/CD Pipelines at scale using Trend Micro Cloud One Conformity

Post Syndicated from Chris Dorrington original https://aws.amazon.com/blogs/devops/cloudformation-scanning-cicd-pipeline-cloud-conformity/

Integrating AWS CloudFormation template scanning into CI/CD pipelines is a great way to catch security infringements before application deployment. However, implementing and enforcing this in a multi team, multi account environment can present some challenges, especially when the scanning tools used require external API access.

This blog will discuss those challenges and offer a solution using Trend Micro Cloud One Conformity (formerly Cloud Conformity) as the worked example. An end-to-end sample solution with detailed install steps accompanies this blog and can be found on GitHub.

We will explore the following topics in detail:

  • When to detect security vulnerabilities
    • Where can template scanning be enforced?
  • Managing API Keys for accessing third party APIs
    • How can keys be obtained and distributed between teams?
    • How easy is it to rotate keys with multiple teams relying upon them?
  • Viewing the results easily
    • How do teams easily view the results of any scan performed?
  • Solution maintainability
    • How can a fix or update be rolled out?
    • How easy is it to change scanner provider? (i.e. from Cloud Conformity to an in-house tool)
  • Enforcing the template validation
    • How to prevent teams from circumventing the checks?
  • Managing exceptions to the rules
    • How can the teams proceed with deployment if there is a valid reason for a check to fail?

 

When to detect security vulnerabilities

During the DevOps life-cycle, there are multiple opportunities to test cloud applications for security best practice violations. The shift-left approach is to move testing as far left in the life-cycle as possible, to catch bugs as early as possible. It is much easier and less costly to fix an issue on a local developer machine than it is to patch it in production.

Diagram showing Shift-left approach

Figure 1 – depicting the stages that an app will pass through before being deployed into an AWS account

At the very left of the cycle is where developers perform the traditional software testing responsibilities (such as unit tests). With cloud applications, there is also a responsibility at this stage to ensure there are no AWS security, configuration, or compliance vulnerabilities. Developers and subsequent peer reviewers can check the code by eye, but in this way it is hard to catch every piece of bad code or misconfigured resource.

For example, you might define an AWS Lambda function with an access policy that makes it accessible to the world, which can be hard to spot during coding or peer review. Once deployed, the potential security risk is live. Without proper monitoring, such misconfigurations can go undetected, with potentially dire consequences if exploited by a bad actor.

There are a number of tools and SaaS offerings on the market which can scan AWS CloudFormation templates and detect infringements against security best practices, such as Stelligent’s cfn_nag, AWS CloudFormation Guard, and Trend Micro Cloud One Conformity. These can all be run from the command line on a developer’s machine, inside the IDE or during a git commit hook. These options are discussed in detail in Using Shift-Left to Find Vulnerabilities Before Deployment with Trend Micro Template Scanner.

Whilst this is the furthest left that testing can be moved, it is hard to enforce it this early in the development process. Mandating that scan commands be integrated into git commit hooks or IDE tools can significantly increase commit time and quickly become frustrating for the developer. Because developers are responsible for creating these hooks or installing IDE extensions, you cannot guarantee that a template scan is performed before deployment; the developer could easily turn off the scans or never install the tools in the first place.

Another consideration for very-left testing of templates is that when applications are written using the AWS CDK or AWS Serverless Application Model (SAM), the actual AWS CloudFormation template that is submitted to AWS isn't available in source control; it's created during the build or package stage. Developers have to run a command such as cdk synth or sam package to obtain the final AWS CloudFormation templates, so moving template scanning that far to the left simply isn't possible in these situations.

If we now look at the far right of Figure 1, when an application has been deployed, real-time monitoring of the account can pick up security issues very quickly. Conformity performs excellently in this area by providing central visibility and real-time monitoring of your cloud infrastructure from a single dashboard. Accounts are checked against over 400 best practices, which allows you to find and remediate non-compliant resources. This real-time alerting is fast – you can be assured of an email stating non-compliance in no time at all! However, remediation does take time: following the correct process, a fix to code needs to go through the CI/CD pipeline again before a patch is deployed. Relying on account scanning only at the far right is therefore sub-optimal.

The best place to scan templates is at the most left of the enforceable part of the process – inside the CI/CD pipeline. Conformity provides their Template Scanner API for this exact purpose. Templates can be submitted to the API, and the same Conformity checks that are being performed in real time on the account are run against the submitted AWS CloudFormation template. When integrated programmatically into a build, failing checks can prevent a deployment from occurring.

Whilst it may seem a simple task to incorporate the Template Scanner API call into a CI/CD pipeline, there are many considerations for doing this successfully in an enterprise environment. The remainder of this blog will address each consideration in detail, and the accompanying GitHub repo provides a working sample solution to use as a base in your own organization.

 

View failing checks as AWS CodeBuild test reports

Treating failing Conformity checks the same as unit test failures within the build will make the process feel natural to the developers. A failing unit test will break the build, and so will a failing Conformity check.

AWS CodeBuild provides test reporting for common unit test frameworks, such as NUnit, JUnit, and Cucumber. This lets developers see at a glance which tests failed within their builds, allowing for quicker remediation than trawling through test log files. The same principle can be applied to failing Conformity checks, so developers can quickly see which checks failed rather than digging through AWS CodeBuild logs. However, the AWS CodeBuild test reporting feature doesn't natively support the JSON schema that the Conformity Template Scanner API returns, so you need custom code to turn the Conformity response into a usable format. Later in this blog we explore how the conversion occurs.

Cloud conformity failed checks displayed as CodeBuild Reports

Figure 2 – Cloud Conformity failed checks appearing as failed test cases in AWS CodeBuild reports

Enterprise speed bumps

Teams wishing to use template scanning as part of their AWS CodePipeline currently need to create an AWS CodeBuild project that calls the external API and then runs the custom translation code. If placed inside a buildspec file, this can easily become bloated with many lines of code, and maintainability issues arise as copies of the same buildspec file are distributed across teams and accounts. Additionally, third-party APIs such as Conformity are often authorized by an API key. In some enterprises, not all teams have access to the Conformity console, further compounding the problem of API key management.

Below are some factors to consider when implementing template scanning in the enterprise:

  • How can keys be obtained and distributed between teams?
  • How easy is it to rotate keys when multiple teams rely upon them?
  • How can a fix or update be rolled out?
  • How easy is it to change scanner provider? (i.e. from Cloud Conformity to an in-house tool)

Overcome scaling issues, use a centralized Validation API

An approach to overcoming these issues is to create a single AWS Lambda function, fronted by Amazon API Gateway, within your organization that calls the Template Scanner API and transforms the results into a format usable by AWS CodeBuild reports. A good place to host this API is within the Cloud Ops team account or a similar shared services account. This way, you only need to issue one API key (stored in AWS Secrets Manager), and it isn't available for viewing by any developers. Maintainability of the code performing the Template Scanner API calls is also very easy, because it resides in one location only. Key rotation is now simple (only one key in one location requires an update) and can be automated through AWS Secrets Manager.
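
The working implementation is in the accompanying GitHub repository; purely as an illustration of the shape of such a wrapper, a stripped-down Lambda handler might look like the following, where the environment variable names, secret name, and Conformity request payload are assumptions rather than the sample solution's actual values:

import json
import os
import urllib.request

import boto3

secrets = boto3.client("secretsmanager")

def handler(event, context):
    # Fetch the single, centrally managed Conformity API key from AWS Secrets Manager.
    api_key = secrets.get_secret_value(
        SecretId=os.environ["CONFORMITY_API_KEY_SECRET"]
    )["SecretString"]

    # The CloudFormation template to scan arrives in the request body.
    template_contents = json.loads(event["body"])["template"]

    # Call the Conformity Template Scanner API (payload shape simplified for illustration).
    request = urllib.request.Request(
        os.environ["CONFORMITY_SCANNER_URL"],
        data=json.dumps({"data": {"attributes": {"type": "cloudformation-template",
                                                 "contents": template_contents}}}).encode(),
        headers={"Authorization": "ApiKey " + api_key,
                 "Content-Type": "application/vnd.api+json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        checks = json.loads(response.read())

    # In the real solution the checks are converted to Cucumber JSON before being returned;
    # that conversion is sketched later in this post.
    return {"statusCode": 200, "body": json.dumps(checks)}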

The following diagram illustrates a typical setup of a multi-account, multi-dev team scenario in which a team’s AWS CodePipeline uses a centralized Validation API to call Conformity’s Template Scanner.

architecture diagram central api for cloud conformity template scanning

Figure 3 – Example of an AWS CodePipeline utilizing a centralized Validation API to call Conformity’s Template Scanner

 

Providing a wrapper API around the Conformity Template Scanner API encapsulates the code required to create the CodeBuild reports. Enabling template scanning within a team's CI/CD pipeline now requires only a small piece of code in the CodeBuild buildspec file, which performs the following three actions (a rough sketch follows the list):

  1. Post the AWS CloudFormation templates to the centralized Validation API
  2. Write the results to file (which are already in a format readable by CodeBuild test reports)
  3. Stop the build if it detects failed checks within the results
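
As a sketch only (the real snippet lives in the repository's buildspec file, and the API URL, request fields, and response format here are assumptions), those three actions could be reduced to a short script that the buildspec runs after cdk synth or sam package has produced the templates:

import glob
import json
import os
import sys
import urllib.request

VALIDATION_API_URL = "https://<api-id>.execute-api.<region>.amazonaws.com/prod/validate"  # placeholder

os.makedirs("reports", exist_ok=True)
failed_checks = 0

for template_path in glob.glob("cdk.out/*.template.json"):
    with open(template_path) as template_file:
        payload = json.dumps({"filename": os.path.basename(template_path),
                              "template": template_file.read()}).encode()

    # 1. Post the synthesized CloudFormation template to the centralized Validation API.
    request = urllib.request.Request(VALIDATION_API_URL, data=payload,
                                     headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(request) as response:
        report = json.loads(response.read())  # assumed to already be Cucumber JSON

    # 2. Write the results to a file that the CodeBuild report group picks up.
    with open(os.path.join("reports", os.path.basename(template_path) + ".cucumber.json"), "w") as out:
        json.dump(report, out)

    # 3. Count failing steps so the build can be stopped.
    failed_checks += sum(
        1
        for feature in report
        for scenario in feature.get("elements", [])
        for step in scenario.get("steps", [])
        if step.get("result", {}).get("status") == "failed"
    )

sys.exit(1 if failed_checks else 0)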

The centralized Validation API in the shared services account can be hosted with a private API in Amazon API Gateway, fronted by a VPC endpoint. Using a private API denies any public access but does allow access from any internal address allowed by the VPC endpoint security group and endpoint policy. The developer teams can run their AWS CodeBuild validation phase within a VPC, thereby giving it access to the VPC endpoint.

A working example of the code required, along with an AWS CodeBuild buildspec file, is provided in the GitHub repository.

 

Converting 3rd party tool results to CodeBuild Report format

With a centralized API, there is now only one place where the conversion code needs to reside (as opposed to copies embedded in each team's CodePipeline). AWS CodeBuild reports are primarily designed for test framework outputs and displaying test case results. In our case, we want to display Conformity checks, which are not unit test case results. The accompanying GitHub repository provides the code to convert Conformity Template Scanner API results, but we also discuss the mappings between the formats so that bespoke conversions for other third-party tools, such as cfn_nag, can be created if required.

AWS CodeBuild provides out-of-the-box compatibility for common unit test frameworks, such as NUnit, JUnit, and Cucumber. Of the supported formats, Cucumber JSON is the easiest to read and manipulate, due to native JSON support in languages such as Python (all the other formats being XML).

Figure 4 depicts where the Cucumber JSON fields will appear in the AWS CodeBuild reports page and Figure 5 below shows a valid Cucumber snippet, with relevant fields highlighted in yellow.

CodeBuild Reports page with fields highlighted that correspond to cucumber JSON fields

Figure 4 – AWS CodeBuild report test case field mappings utilized by Cucumber JSON

 

 

Cucumber JSON snippet showing CodeBuild Report field mappings

Figure 5 – Cucumber JSON with mappings to AWS CodeBuild report table

 

Note that in Figure 5 there are additional fields (e.g. id, description) that are required to make the file valid Cucumber JSON, even though this data is not displayed on the CodeBuild Reports page. However, raw reports are still available as AWS CodeBuild artifacts, so it is useful to populate these fields with data that can aid deeper troubleshooting.
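
Purely to illustrate those mappings (the repository's actual conversion code is referenced below, and the Conformity attribute names used here are assumptions), a single failed check could be translated into a Cucumber feature along these lines:

def conformity_check_to_cucumber_feature(check):
    """Map one Conformity Template Scanner check onto the Cucumber JSON fields that
    AWS CodeBuild reports display; the extra fields keep the document valid Cucumber JSON."""
    attributes = check["attributes"]
    status = "failed" if attributes["status"] == "FAILURE" else "passed"
    return {
        "id": check["id"],
        "uri": attributes.get("resource", "unknown-resource"),
        "keyword": "Feature",
        "name": attributes["rule-title"],            # surfaces as the feature/report group name
        "description": attributes.get("resolution-page", ""),
        "elements": [{
            "id": check["id"],
            "keyword": "Scenario",
            "type": "scenario",
            "name": attributes.get("message", ""),   # surfaces as the test case name
            "steps": [{
                "keyword": "Then ",
                "name": attributes.get("message", ""),
                "result": {"status": status, "duration": 0},  # drives pass/fail in the report
            }],
        }],
    }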

Conversion code for Conformity results is provided in the accompanying GitHub repo, within the file app.py, from line 376 onwards.

 

Making the validation phase mandatory in AWS CodePipeline

The shift-left philosophy states that we should shift testing as far as possible to the left. The furthest left would be before any CI/CD pipeline is triggered. Developers could, and should, have the ability to perform template validation from their own machines. However, as discussed earlier, this is rarely enforceable; a scan during a pipeline deployment is the only true way to know that templates have been validated. But how can we mandate this and truly secure the validation phase against circumvention?

Preventing updates to deployed CI/CD pipelines

Using a centralized API means that the code calling the Template Scanner API is accessible only to the Cloud Ops team, and not to the developer teams. However, the code that calls this wrapper API has to reside within the developer teams' CI/CD pipelines so that it can stop the build if failures are found. With CI/CD pipelines defined as AWS CloudFormation, and without any preventative measures in place, a team could disable the phase and deploy code without any checks performed.

Fortunately, there are a number of approaches to prevent this from happening, and to enforce the validation phase. We shall now look at one of them from the AWS CloudFormation Best Practices.

IAM to control access

Use AWS IAM to control access to the stacks that define the pipeline, and then also to the AWS CodePipeline/AWS CodeBuild resources within them.

IAM policies can generically restrict a team from updating a CI/CD pipeline provided to them if a naming convention is used in the stacks that create them. By using a naming convention coupled with the wildcard “*”, these policies can be applied to a role even before any pipelines have been deployed.

For example, let's assume the pipeline depicted in Figure 6 is defined and deployed in AWS CloudFormation as follows:

  • Stack name is “cicd-pipeline-team-X”
  • AWS CodePipeline resource within the stack has logical name with prefix “CodePipelineCICD”
  • AWS CodeBuild Project for validation phase is prefixed with “CodeBuildValidateProject”

Creating an IAM policy with the following statements and attaching it to the developer teams' IAM role prevents them from modifying the resources mentioned above. The AWS CloudFormation stack and resource names match the wildcards in the statements, which deny the user any update actions.

Example IAM policy highlighting how to deny updates to stacks and pipeline resources

Figure 6 – Example of how an IAM policy can restrict updates to AWS CloudFormation stacks and deployed resources
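
The screenshot above shows the policy used in the sample solution. As a hand-written sketch only, following the same naming conventions but with an illustrative action list, role name, and policy name, such a policy could be attached with boto3 like this:

import json

import boto3

iam = boto3.client("iam")

deny_pipeline_updates = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Deny changes to the CloudFormation stacks that define the team pipelines.
            "Effect": "Deny",
            "Action": ["cloudformation:UpdateStack", "cloudformation:DeleteStack"],
            "Resource": "arn:aws:cloudformation:*:*:stack/cicd-pipeline-team-*/*",
        },
        {
            # Deny direct edits to the pipeline and to the validation CodeBuild project.
            "Effect": "Deny",
            "Action": ["codepipeline:UpdatePipeline", "codepipeline:DeletePipeline",
                       "codebuild:UpdateProject", "codebuild:DeleteProject"],
            "Resource": ["arn:aws:codepipeline:*:*:CodePipelineCICD*",
                         "arn:aws:codebuild:*:*:project/CodeBuildValidateProject*"],
        },
    ],
}

iam.put_role_policy(
    RoleName="developer-team-role",          # placeholder role name
    PolicyName="deny-cicd-pipeline-updates",
    PolicyDocument=json.dumps(deny_pipeline_updates),
)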

 

Preventing valid failing checks from being a bottleneck

When centralizing anything, and forcing developers to use tooling or features such as template scanners, it is imperative that it (or the team owning it) does not become a bottleneck and slow the developers down. This is just as true for our centralized API solution.

It is sometimes the case that a developer team has a valid reason for a template to yield a failing check. For instance, Conformity reports a HIGH severity alert if a load balancer does not have an HTTPS listener. If a team is migrating an older application that only works on port 80 and not 443, the team may be able to obtain an exception from their cyber security team. It would not be desirable to turn off the rule completely in the real-time scanning of the account, because for other deployments this HIGH severity alert could be perfectly valid. The team now faces an issue: the validation phase of their pipeline will fail, preventing them from deploying their application, even though they have cyber approval to fail this one check.

When enforcing template scanning on a team, it is imperative that it does not become a bottleneck. Functionality and workflows must accompany such a pipeline feature to allow for quick resolution.

Screenshot of Trend Micro Cloud One Conformity rule from their website

Figure 7 – Screenshot of a Conformity rule from their website

Therefore, the centralized Validation API must provide a way to allow exceptions on a case-by-case basis. Any exception should be tied to a unique combination of AWS account number, file name, and rule ID, which ensures that an exception is only valid for the specific instance of the violation and not for any other. This can be achieved by extending the centralized API with a set of endpoints for exception requests and approvals, which can then be integrated into existing or new tooling and workflows to provide a self-service method for teams to request exceptions. Cyber security teams should be able to quickly approve or deny the requests.

The exception request/approve functionality can be implemented by extending the centralized private API to provide an /exceptions endpoint, and using DynamoDB as a data store. During a build and template validation, failed checks returned from Conformity are looked up in the DynamoDB table to see if an approved exception is available; if one is, the check is returned as an exempted check rather than an actual failing check. The build can then continue and deploy to the AWS account.
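
A minimal sketch of that lookup follows; the table name, key schema, and check fields are illustrative rather than the sample solution's exact schema:

import boto3

dynamodb = boto3.resource("dynamodb")
exceptions_table = dynamodb.Table("template-scanner-exceptions")  # placeholder table name

def is_exempted(account_id, filename, rule_id):
    """Return True if an approved exception exists for this exact violation instance."""
    item = exceptions_table.get_item(
        Key={"exception_id": account_id + "#" + filename + "#" + rule_id}
    ).get("Item")
    return bool(item) and item.get("status") == "APPROVED"

def split_failed_checks(account_id, filename, failed_checks):
    # Checks with an approved exception are reported as exempted rather than failed.
    exempted = [c for c in failed_checks if is_exempted(account_id, filename, c["rule_id"])]
    still_failing = [c for c in failed_checks if c not in exempted]
    return still_failing, exempted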

Figures 8 and 9 depict the /exceptions endpoints that are provided as part of the sample solution in the accompanying GitHub repository.

screenshot of API gateway for centralized template scanner api

Figure 8 – Screenshot of API Gateway depicting the endpoints available as part of the accompanying solution

 

The /exceptions endpoint methods provide the following functionality:

Table containing HTTP verbs for exceptions endpoint

Figure 9 – HTTP verbs implementing exception functionality

Important note regarding endpoint authorization: whilst the “validate” private endpoint may be left with no auth so that any call from within a VPC is accepted, the same is not true for the “exception” approval endpoint. It would be prudent to use the AWS IAM authentication available in API Gateway to restrict approvals on this endpoint to certain users only (i.e. the cyber and cloud ops teams).

With the ability to raise and approve exception requests, the mandatory scanning phase of the developer teams’ pipelines is no longer a bottleneck.

 

Conclusion

Enforcing template validation in multi-developer-team, multi-account environments can present challenges when using third-party APIs, such as the Conformity Template Scanner, at scale. We have talked through each hurdle that can arise, and described how creating a centralized Validation API and exception approval process can overcome those obstacles and keep the teams deploying without unwarranted speed bumps.

By shifting left and integrating scanning into the pipeline process, the cyber team and developers can be sure that no offending code is deployed into an account, whether it was written in the AWS CDK, AWS SAM, or AWS CloudFormation.

Additionally, we discussed in depth how to use CodeBuild reports to display the vulnerabilities found, helping developers quickly identify where attention is required to remediate them.

Getting started

This blog has described the real-life challenges and the theory in detail. A complete sample of the described centralized Validation API is available in the accompanying GitHub repo, along with a sample CodePipeline for easy testing. Step-by-step instructions are provided for you to deploy it, and to enhance it for use in your own organization. Figure 10 depicts the sample solution available in GitHub.

https://github.com/aws-samples/aws-cloudformation-template-scanning-with-cloud-conformity

NOTE: Remember to tear down any stacks after experimenting with the provided solution, to ensure ongoing costs are not charged to your AWS account. Notes on how to do this are included inside the repo Readme.

 

example codepipeline architecture provided by the accompanying github solution

Figure 10 depicts the solution available for use in the accompanying GitHub repository

 

Find out more

Other blog posts are available that cover aspects when dealing with template scanning in AWS:

For more information on Trend Micro Cloud One Conformity, use the links below.

Trend Micro AWS Partner Network joint image

Avatar for Chris Dorrington

Chris Dorrington

Chris Dorrington is a Senior Cloud Architect with AWS Professional Services in Perth, Western Australia. Chris loves working closely with AWS customers to help them achieve amazing outcomes. He has over 25 years of software development experience and a passion for serverless technologies and all things DevOps.

 

How Banks Can Use AWS to Meet Compliance

Post Syndicated from Jiwan Panjiker original https://aws.amazon.com/blogs/architecture/how-banks-can-use-aws-to-meet-compliance/

Since the 2008 financial crisis, banking supervisory institutions such as the Basel Committee on Banking Supervision (BCBS) have strengthened regulations. There is now increased oversight over the financial services industry. For banks, making the necessary changes to comply with these rules is a challenging, multi-year effort.

Basel IV, a massive update to existing rules, is due for implementation in January 2023. Basel IV standardizes the approach to calculating credit risk, increases the impact of risk-weighted assets (RWAs) and emphasizes data transparency.

Given the complexity of data, modeling, and numerous assumptions that have to be made, compliance under Basel IV implementation will be challenging. Standardization omits nuances unique to your business, which can drive up costs, but violating guidelines will result in steep penalties.

This post will address these challenges by outlining a mechanism that facilitates a healthy, data-driven dialogue between banks and regulators to better achieve compliance objectives. The reference architecture will focus on enabling fast, iterative releases with the help of serverless AWS services.

There are four key actions to take in order to support this mechanism:

  1. Automate data management
  2. Establish a continuous integration/continuous delivery (CI/CD) pipeline
  3. Enable fast, point-in-time audit replays
  4. Set up proactive monitoring and notifications

Automate data management

Due to frequent merger activity, banks are typically comprised of a web of integrated systems and siloed business units, making it difficult to consolidate data. Under Basel IV guidelines, auditors want banks to provide detailed data in a presentable way.

You can tackle this first challenge by establishing a data pipeline as shown in Figure 1. Take inventory of each data source as it is incorporated into the pipeline. Identify the critical internal and external data sources that will be used to populate the initial landing area. Amazon Simple Storage Service (S3) is a great choice for this.

Figure 1. Data pipeline that cleans, processes, and segments data

Amazon S3 is a highly available, durable service that is a popular data lake solution. S3 offers write-once-read-many (WORM) storage capabilities such as S3 Glacier Vault Lock and S3 Object Lock to protect the integrity of your archived data in accordance with U.S. SEC and FINRA rules.

Basel IV regulations also require banks to use many attributes to develop accurate credit risk models. The attributes can be a mix of datasets such as financial statements, internal balanced scorecards, macro-economic data, and credit ratings. The risk models themselves can also be segmented by portfolio types, industry segments, asset types and much more.

You can split data into different domains and designate data owners by using separate S3 buckets. Credit risk model developers, analysts, and data scientists can then use the structure of the S3 buckets to pull together relevant datasets and store their outputs in S3 buckets.

To support fast, automated data retrieval, store object metadata in a highly scalable and queryable database. You can set up Amazon S3 so that an event initiates a function that populates Amazon DynamoDB. Developers can use AWS Lambda to write these functions in popular languages such as Python.
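
A minimal sketch of such a function is shown below; the table name and attribute names are illustrative:

import boto3

dynamodb = boto3.resource("dynamodb")
metadata_table = dynamodb.Table("data-lake-object-metadata")  # placeholder table name

def handler(event, context):
    # Triggered by an S3 ObjectCreated event; records object metadata for fast retrieval later.
    for record in event["Records"]:
        s3_details = record["s3"]
        metadata_table.put_item(Item={
            "object_key": s3_details["object"]["key"],
            "bucket": s3_details["bucket"]["name"],
            "size_bytes": s3_details["object"]["size"],
            "event_time": record["eventTime"],
        })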

With AWS Glue, you can automate extract, transform, and load (ETL) processes to clean and move data to the different S3 buckets. AWS Glue can also support data operations by automatically cataloging your various data sources.

Taking on a structured approach will simplify data governance and transparency as the business continues to grow and operate.

Establish a CI/CD pipeline

Adopt tools that machine learning teams can use to build a streamlined CI/CD solution as demonstrated in Figure 2.

Figure 2. An end-to-end machine learning development and deployment pipeline

Using tightly integrated AWS services, your teams can minimize time spent managing tools and deployment processes, and instead, focus on tuning the models and analyzing the results.

Amazon SageMaker brings together a powerful set of machine learning capabilities on the AWS Cloud. It helps data scientists and engineers build insightful models. Figure 2 depicts the high-level architecture and shows how Amazon SageMaker Pipelines helps teams orchestrate the automation and deployment processes.

The core of the pipeline uses a set of AWS deployment services so that your teams can collaborate and review effectively. With AWS CodeCommit, your teams can set up a git-based repository to store and version models for data processing, training, and evaluation. The repository can also store code and AWS CloudFormation configuration files for deployment. You can use AWS CodePipeline and AWS CodeBuild to create and update a model endpoint based on the approved/reviewed changes.

Any updates detected in the AWS CodeCommit repository initiate a deployment whenever a new model version is added to the Model Registry. Amazon S3 can be used to store generated model artifacts, historical data, and models.

Enable fast, point-in-time audit replays

Figure 3. Containers offer a lightweight, powerful solution to run audits using historical assets

One of the main themes of Basel IV is transparency. Figure 3 illustrates a solution to build trust with regulators by allowing them to verify and understand modeling activity.

A lightweight application is hosted in AWS Fargate and enables auditors to re-run Basel credit risk models under specified conditions. With AWS Fargate, you don’t need to manually manage instances or container orchestration. Configure the CPU or memory specifications at the task level and set guidelines around scalability for your service. Your tasks then scale up and down automatically, based on demand, and will optimize cost efficiency and availability.

Figure 3 shows the following steps; a rough sketch of the replay logic follows the list:

  1. The application takes inputs such as date, release version, and model type.
  2. It then queries DynamoDB with this information.
  3. The query will return the data necessary to retrieve model artifacts from previous CI/CD deployments and relevant datasets from historical S3 buckets.
  4. Using this information, it can spin up as many containers as needed to run the model.
  5. It then stores the outputs in a separate S3 bucket.
  6. Auditors will have a detailed trace of all the attributes, assumptions, and data that went into the modeling effort. To streamline this process, the app can also compare the outputs of the historical runs to the recent replay and highlight any significant deviations.
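
Purely as an illustration of steps 2 through 5, the replay application's core could look something like the following; the table name, key schema, cluster, task definition, and environment variables are all assumptions:

import boto3

dynamodb = boto3.resource("dynamodb")
ecs = boto3.client("ecs")
runs_table = dynamodb.Table("model-run-metadata")  # placeholder table name

def replay(run_date, release_version, model_type):
    # Steps 2-3: look up the model artifact and dataset locations for the requested run.
    run = runs_table.get_item(
        Key={"run_date": run_date, "model_key": model_type + "#" + release_version}
    )["Item"]

    # Steps 4-5: launch a Fargate task that reruns the model and writes outputs to S3.
    ecs.run_task(
        cluster="audit-replay-cluster",             # placeholder cluster name
        launchType="FARGATE",
        taskDefinition="basel-credit-risk-model",   # placeholder task definition
        networkConfiguration={"awsvpcConfiguration": {"subnets": ["subnet-placeholder"]}},
        overrides={"containerOverrides": [{
            "name": "model",
            "environment": [
                {"name": "MODEL_ARTIFACT_S3_URI", "value": run["model_artifact_s3_uri"]},
                {"name": "DATASET_S3_URI", "value": run["dataset_s3_uri"]},
                {"name": "OUTPUT_S3_URI", "value": "s3://audit-replay-outputs/" + run_date + "/"},
            ],
        }]},
    )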

Though internal models will be de-emphasized under Basel IV, banks will continue to run internal models as a benchmark against the broader standards. Schedule AWS Fargate tasks to run these models regularly to capitalize on highly performant compute services while minimizing costs.

Set up proactive monitoring and notifications

Figure 4. Scheduled jobs can send out notifications using Amazon SNS when certain thresholds are breached

The last principle is based around establishing an early warning system, enabling banks to take on a more proactive role in maintaining compliance.

With automated monitoring and notifications, banks can respond quickly to potential concerns. For instance, a daily scheduled job can launch containers and run the models against the latest data. If any thresholds are breached, alerts can be sent out via SMS or email. Operational teams can subscribe to the relevant message topics using Amazon Simple Notification Service (Amazon SNS) and respond before actual compliance issues emerge.
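
As a sketch of the alerting piece only, with a placeholder topic and a simplified threshold check:

import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:rwa-threshold-alerts"  # placeholder topic

def check_thresholds(model_outputs, rwa_limit):
    # Runs after the daily scheduled model execution; notifies subscribers of any breaches.
    breaches = [result for result in model_outputs if result["rwa"] > rwa_limit]
    if breaches:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject="Compliance threshold breached",
            Message=str(len(breaches)) + " portfolio(s) exceeded the RWA limit of " + str(rwa_limit),
        )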

Conclusion

With a Well-Architected approach, AWS helps you control your data, deploy new features, and embrace a serverless approach. This frees you to innovate quickly and address regulatory challenges.

You can iterate with new AWS services and bring machine learning to bear on various streams of data to identify high impact pools of value. You can get a clearer picture of the data to make it easier to identify areas where you can reduce RWAs. Using Amazon S3, you can turn on AWS analytics services such as Amazon QuickSight and Amazon Athena to visualize the data. You’ll be able to fulfill reporting requirements such as those found in regulatory studies like CCAR, DFAST, CECL, and IFRS9.

For more information about establishing a data pipeline, read Lake House Formation Architecture. It is a powerful pattern that combines a few concepts that will help bring your data together cohesively. To set up a robust CI/CD pipeline, explore the AWS Serverless CI/CD Reference Architecture.

Continuous Compliance Workflow for Infrastructure as Code: Part 2

Post Syndicated from DAMODAR SHENVI WAGLE original https://aws.amazon.com/blogs/devops/continuous-compliance-workflow-for-infrastructure-as-code-part-2/

In the first post of this series, we introduced a continuous compliance workflow in which an enterprise security and compliance team can release guardrails in a continuous integration, continuous deployment (CI/CD) fashion in your organization.

In this post, we focus on the technical implementation of the continuous compliance workflow. We demonstrate how to use AWS Developer Tools to create a CI/CD pipeline that releases guardrails for Terraform application workloads.

We use the Terraform-Compliance framework to define the guardrails. Terraform-Compliance is a lightweight, security and compliance-focused test framework for Terraform to enable the negative testing capability for your infrastructure as code (IaC).

With this compliance framework, we can ensure that the implemented Terraform code follows security standards and your own custom standards. Currently, HashiCorp provides Sentinel (a policy-as-code framework) for its enterprise products. AWS has AWS CloudFormation Guard, an open-source policy-as-code evaluation tool for AWS CloudFormation templates. Terraform-Compliance allows us to build similar functionality for Terraform, and is open source.

This post is from the perspective of a security and compliance engineer, and assumes that the engineer is familiar with the practices of IaC, CI/CD, behavior-driven development (BDD), and negative testing.

Solution overview

You start by building the necessary resources as listed in the workload (application development team) account:

  • An AWS CodeCommit repository for the Terraform workload
  • A CI/CD pipeline built using AWS CodePipeline to deploy the workload
  • A cross-account AWS Identity and Access Management (IAM) role that gives the security and compliance account the permissions to pull the Terraform workload from the workload account repository for testing their guardrails in observation mode

Next, we build the resources in the security and compliance account:

  • A CodeCommit repository to hold the security and compliance standards (guardrails)
  • A CI/CD pipeline built using CodePipeline to release new guardrails
  • A cross-account role that gives the workload account the permissions to pull the activated guardrails from the main branch of the security and compliance account repository.

The following diagram shows our solution architecture.

solution architecture diagram

The architecture has two workflows: security and compliance (Steps 1–4) and application delivery (Steps 5–7).

  1. When a new security and compliance guardrail is introduced into the develop branch of the compliance repository, it triggers the security and compliance pipeline.
  2. The pipeline pulls the Terraform workload.
  3. The pipeline tests this compliance check guardrail against the Terraform workload in the workload account repository.
  4. If the workload is compliant, the guardrail is automatically merged into the main branch. This activates the guardrail by making it available for all Terraform application workload pipelines to consume. By doing this, we make sure that we don’t break the Terraform application deployment pipeline by introducing new guardrails. It also provides the security and compliance team visibility into the resources in the application workload that are noncompliant. The security and compliance team can then reach out to the application delivery team and suggest appropriate remediation before the new standards are activated. If the compliance check fails, the automatic merge to the main branch is stopped. The security and compliance team has an option to force merge the guardrail into the main branch if it’s deemed critical and they need to activate it immediately.
  5. The Terraform deployment pipeline in the workload account always pulls the latest security and compliance checks from the main branch of the compliance repository.
  6. Checks are run against the Terraform workload to ensure that it meets the organization’s security and compliance standards.
  7. Only secure and compliant workloads are deployed by the pipeline. If the workload is noncompliant, the security and compliance checks fail and break the pipeline, forcing the application delivery team to remediate the issue and check the code in again.

Prerequisites

Before proceeding any further, you need to identify and designate two AWS accounts required for the solution to work:

  • Security and Compliance – In which you create a CodeCommit repository to hold compliance standards that are written based on Terraform-Compliance framework. You also create a CI/CD pipeline to release new compliance guardrails.
  • Workload – In which the Terraform workload resides. The pipeline to deploy the Terraform workload enforces the compliance guardrails prior to the deployment.

You also need to create two AWS account profiles in ~/.aws/credentials, one for the security and compliance account and one for the workload account, if you don't already have them. These profiles need sufficient permissions to run an AWS Cloud Development Kit (AWS CDK) stack. They should be your private profiles and only be used during the course of this use case, so it is fine to use admin privileges. Don't share the profile details, especially if they have admin privileges. I recommend removing the profiles when you're finished with this walkthrough. For more information about creating an AWS account profile, see Configuring the AWS CLI.

In addition, you need to generate a cucumber-sandwich.jar file by following the steps in the cucumber-sandwich GitHub repo. The JAR file is needed to generate pretty HTML compliance reports. The security and compliance team can use these reports to make sure that the standards are met.

To implement our solution, we complete the following high-level steps:

  1. Create the security and compliance account stack.
  2. Create the workload account stack.
  3. Test the compliance workflow.

Create the security and compliance account stack

We create the following resources in the security and compliance account:

  • A CodeCommit repo to hold the security and compliance guardrails
  • A CI/CD pipeline to roll out the Terraform compliance guardrails
  • An IAM role that trusts the application workload account and allows it to pull compliance guardrails from its CodeCommit repo

In this section, we set up the properties for the pipeline and cross-account role stacks, and run the deployment scripts.

Set up properties for the pipeline stack

Clone the GitHub repo aws-continuous-compliance-for-terraform and navigate to the folder security-and-compliance-account/stacks. This contains the folder pipeline_stack/, which holds the code and properties for creating the pipeline stack.

The folder has a JSON file cdk-stack-param.json, which has the parameter TERRAFORM_APPLICATION_WORKLOADS, which represents the list of application workloads that the security and compliance pipeline pulls and runs tests against to make sure that the workloads are compliant. In the workload list, you have the following parameters:

  • GIT_REPO_URL – The HTTPS URL of the CodeCommit repository in the workload account against which the security and compliance check pipeline runs compliance guardrails.
  • CROSS_ACCOUNT_ROLE_ARN – The ARN for the cross-account role we create in the next section. This role gives the security and compliance account permissions to pull Terraform code from the workload account.

For CROSS_ACCOUNT_ROLE_ARN, replace <workload-account-id> with the account ID of your designated AWS workload account. For GIT_REPO_URL, replace <region> with the AWS Region where the repository resides.

security and compliance pipeline stack parameters

Set up properties for the cross-account role stack

In the cloned GitHub repo aws-continuous-compliance-for-terraform from the previous step, navigate to the folder security-and-compliance-account/stacks. This contains the folder cross_account_role_stack/, which holds the code and properties for creating the cross-account role.

The folder has a JSON file cdk-stack-param.json, which has the parameter TERRAFORM_APPLICATION_WORKLOAD_ACCOUNTS, which represents the list of Terraform workload accounts that intend to integrate with the security and compliance account for running compliance checks. All these accounts are trusted by the security and compliance account and given permissions to pull compliance guardrails. Replace <workload-account-id> with the account ID for your designated AWS workload account.

security and compliance cross account role stack parameters

Run the deployment script

Run deploy.sh by passing the name of the AWS security and compliance account profile you created earlier. The script uses the AWS CDK CLI to bootstrap and deploy the two stacks we discussed. See the following code:

cd aws-continuous-compliance-for-terraform/security-and-compliance-account/
./deploy.sh "<AWS-COMPLIANCE-ACCOUNT-PROFILE-NAME>"

You should now see three stacks in the security and compliance account:

  • CDKToolkit – AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an Amazon Simple Storage Service (Amazon S3) bucket needed to hold deployment assets such as an AWS CloudFormation template and AWS Lambda code package.
  • cf-CrossAccountRoles – This stack creates the cross-account IAM role.
  • cf-SecurityAndCompliancePipeline – This stack creates the pipeline. On the Outputs tab of the stack, you can find the CodeCommit source repo URL from the key OutSourceRepoHttpUrl. Record the URL to use later.

security and compliance stack

Create a workload account stack

We create the following resources in the workload account:

  • A CodeCommit repo to hold the Terraform workload to be deployed
  • A CI/CD pipeline to deploy the Terraform workload
  • An IAM role that trusts the security and compliance account and allows it to pull Terraform code from its CodeCommit repo for testing

We follow similar steps as in the previous section to set up the properties for the pipeline stack and cross-account role stack, and then run the deployment script.

Set up properties for the pipeline stack

In the already cloned repo, navigate to the folder workload-account/stacks. This contains the folder pipeline_stack/, which holds the code and properties for creating the pipeline stack.

The folder has a JSON file cdk-stack-param.json, which has the parameter COMPLIANCE_CODE, which provides details on where to pull the compliance guardrails from. The pipeline pulls and runs compliance checks prior to deployment, to make sure that application workload is compliant. You have the following parameters:

  • GIT_REPO_URL – The HTTPS URL of the CodeCommit repository in the security and compliance account, which contains the compliance guardrails that the pipeline in the workload account pulls to carry out compliance checks.
  • CROSS_ACCOUNT_ROLE_ARN – The ARN for the cross-account role we created in the previous step in the security and compliance account. This role gives the workload account permissions to pull the Terraform compliance code from its respective security and compliance account.

For CROSS_ACCOUNT_ROLE_ARN, replace <compliance-account-id> with the account ID of your designated AWS security and compliance account. For GIT_REPO_URL, replace <region> with the AWS Region where the repository resides.

workload pipeline stack config

Set up the properties for cross-account role stack

In the already cloned repo, navigate to folder workload-account/stacks. This contains the folder cross_account_role_stack/, which holds the code and properties for creating the cross-account role stack.

The folder has a JSON file cdk-stack-param.json, which has the parameter COMPLIANCE_ACCOUNT, which represents the security and compliance account that intends to integrate with the workload account for running compliance checks. This account is trusted by the workload account and given permissions to pull compliance guardrails. Replace <compliance-account-id> with the account ID for your designated AWS security and compliance account.

workload cross account role stack config

Run the deployment script

Run deploy.sh by passing the name of the AWS workload account profile you created earlier. The script uses the AWS CDK CLI to bootstrap and deploy the two stacks we discussed. See the following code:

cd aws-continuous-compliance-for-terraform/workload-account/
./deploy.sh "<AWS-WORKLOAD-ACCOUNT-PROFILE-NAME>"

You should now see three stacks in the workload account:

  • CDKToolkit – AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an S3 bucket needed to hold deployment assets such as a CloudFormation template and Lambda code package.
  • cf-CrossAccountRoles – This stack creates the cross-account IAM role.
  • cf-TerraformWorkloadPipeline – This stack creates the pipeline. On the Outputs tab of the stack, you can find the CodeCommit source repo URL from the key OutSourceRepoHttpUrl. Record the URL to use later.

workload pipeline stack

Test the compliance workflow

In this section, we walk through the following steps to test our workflow:

  1. Push the application workload code into its repo.
  2. Push the security and compliance code into its repo and run its pipeline to release the compliance guardrails.
  3. Run the application workload pipeline to exercise the compliance guardrails.
  4. Review the generated reports.

Push the application workload code into its repo

Clone the empty CodeCommit repo from workload account. You can find the URL from the variable OutSourceRepoHttpUrl on the Outputs tab of the cf-TerraformWorkloadPipeline stack we deployed in the previous section.

  1. Create a new branch main and copy the workload code into it.
  2. Copy the cucumber-sandwich.jar file you generated in the prerequisites section into a new folder /lib.
  3. Create a directory called reports with an empty file named dummy. The reports directory is where the Terraform-Compliance framework creates compliance reports.
  4. Push the code to the remote origin.

See the following sample script:

git checkout -b main
# Copy the code from git repo location
# Create reports directory and a dummy file.
mkdir reports
touch reports/dummy
git add .
git commit -m "Initial commit"
git push origin main

The folder structure of workload code repo should match the structure shown in the following screenshot.

workload code folder structure

The first commit triggers the pipeline-workload-main pipeline, which fails in the stage RunComplianceCheck due to the security and compliance repo not being present (which we add in the next section).

Push the security and compliance code into its repo and run its pipeline

Clone the empty CodeCommit repo from the security and compliance account. You can find the URL from the variable OutSourceRepoHttpUrl on the Outputs tab of the cf-SecurityAndCompliancePipeline stack we deployed in the previous section.

  1. Create a new local branch main and push it (with an empty commit) to the remote origin so that the main branch exists in the remote origin. Skipping this step leads to a failure in the code merge step of the pipeline due to the absence of the main branch.
  2. Create a new branch develop and copy the security and compliance code into it. This is required because the security and compliance pipeline is configured to be triggered from the develop branch for the purposes of this post.
  3. Copy the cucumber-sandwich.jar file you generated in the prerequisites section into a new folder /lib.

See the following sample script:

cd security-and-compliance-code
git checkout -b main
git add .
git commit --allow-empty -m "initial commit"
git push origin main
git checkout -b develop main
# Here copy the code from git repo location
# You also copy cucumber-sandwich.jar into a new folder /lib
git add .
git commit -m "Initial commit"
git push origin develop

The folder structure of security and compliance code repo should match the structure shown in the following screenshot.

security and compliance code folder structure

The code push to the develop branch of the security-and-compliance-code repo triggers the security and compliance pipeline. The pipeline pulls the code from the workload account repo, then runs the compliance guardrails against the Terraform workload to make sure that the workload is compliant. If the workload is compliant, the pipeline merges the compliance guardrails into the main branch. If the workload fails the compliance test, the pipeline fails. The following screenshot shows a sample run of the pipeline.

security and compliance pipeline

Run the application workload pipeline to exercise the compliance guardrails

After we set up the security and compliance repo and the pipeline runs successfully, the workload pipeline is ready to proceed (see the following screenshot of its progress).

workload pipeline

The service delivery teams are now subject to the security and compliance guardrails (the RunComplianceCheck stage), and their pipeline breaks if any resource is noncompliant.

Review the generated reports

CodeBuild supports viewing reports generated in Cucumber JSON format. In our workflow, we generate reports in Cucumber JSON and BDD XML formats, and we use this capability of CodeBuild to generate and view HTML reports. Our implementation also generates a report directly in HTML using the cucumber-sandwich library.

The following screenshot is a snippet of the script compliance-check.sh, which implements report generation.

compliance check script

The bug noted in the screenshot is in the radish-bdd library that Terraform-Compliance uses for the cucumber JSON format report generation. For more information, you can review the defect logged against radish-bdd for this issue.

After the script generates the reports, CodeBuild needs to be configured to access them to generate HTML reports. The following screenshot shows a snippet from buildspec-compliance-check.yml, which shows how the reports section is set up for report generation:

buildspec compliance check

For more details on how to set up buildspec file for CodeBuild to generate reports, see Create a test report.

CodeBuild displays the compliance run reports as shown in the following screenshot.

code build cucumber report

We can also view a trending graph for multiple runs.

code build cucumber report

The other report generated by the workflow is the pretty HTML report generated by the cucumber-sandwich library.

code build cucumber report

The reports are available for download from the S3 bucket <OutPipelineBucketName>/pipeline-security-an/report_App/<zip file>.

The cucumber-sandwich-generated report marks scenarios with skipped tests as failed scenarios. This is the only noticeable difference between the CodeBuild-generated and the cucumber-sandwich-generated HTML reports.

Clean up

To remove all the resources from the workload account, complete the following steps in order:

  1. Go to the folder where you cloned the workload code and edit buildspec-workload-deploy.yml:
    • Comment line 44 (- ./workload-deploy.sh).
    • Uncomment line 45 (- ./workload-deploy.sh --destroy).
    • Commit and push the code change to the remote repo. The workload pipeline is triggered, which cleans up the workload.
  2. Delete the CloudFormation stack cf-CrossAccountRoles. This step removes the cross-account role from the workload account, which gives permission to the security and compliance account to pull the Terraform workload.
  3. Go to the CloudFormation stack cf-TerraformWorkloadPipeline and note the OutPipelineBucketName and OutStateFileBucketName on the Outputs tab. Empty the two buckets and then delete the stack. This removes pipeline resources from workload account.
  4. Go to the CDKToolkit stack and note the BucketName on the Outputs tab. Empty that bucket and then delete the stack.

To remove all the resources from the security and compliance account, complete the following steps in order:

  1. Delete the CloudFormation stack cf-CrossAccountRoles. This step removes the cross-account role from the security and compliance account, which gives permission to the workload account to pull the compliance code.
  2. Go to CloudFormation stack cf-SecurityAndCompliancePipeline and note the OutPipelineBucketName on the Outputs tab. Empty that bucket and then delete the stack. This removes pipeline resources from the security and compliance account.
  3. Go to the CDKToolkit stack and note the BucketName on the Outputs tab. Empty that bucket and then delete the stack.

Security considerations

Cross-account IAM roles are very powerful and need to be handled carefully. For this post, we strictly limited the cross-account IAM roles to specific CodeCommit permissions, which ensures that they can perform only those actions.

Conclusion

In this post in our two-part series, we implemented a continuous compliance workflow using CodePipeline and the open-source Terraform-Compliance framework. The Terraform-Compliance framework allows you to build guardrails for securing Terraform applications deployed on AWS.

We also showed how you can use AWS developer tools to seamlessly integrate security and compliance guardrails into an application release cycle and catch noncompliant AWS resources before getting deployed into AWS.

Try implementing the solution in your enterprise as shown in this post, and leave your thoughts and questions in the comments.

About the authors

sumit mishra

 

Sumit Mishra is a Senior DevOps Architect at AWS Professional Services. His areas of expertise include IaC, pipeline security, CI/CD, and automation.

 

 

 

Damodar Shenvi Wagle

 

Damodar Shenvi Wagle is a Cloud Application Architect at AWS Professional Services. His areas of expertise include architecting serverless solutions, CI/CD and automation.

Building a CI/CD pipeline to update an AWS CloudFormation StackSets

Post Syndicated from Karim Afifi original https://aws.amazon.com/blogs/devops/building-a-ci-cd-pipeline-to-update-an-aws-cloudformation-stacksets/

AWS CloudFormation StackSets extend the functionality of CloudFormation stacks by enabling you to create, update, or delete one or more stacks across multiple accounts. As a developer working in a large enterprise or for a group that supports multiple AWS accounts, you may often find yourself challenged with updating AWS CloudFormation StackSets. If you're building a CI/CD pipeline to automate the process of updating CloudFormation stacks, you can do so natively: AWS CodePipeline can initiate a workflow that builds and tests a stack, and then pushes it to production. The workflow can either create or manipulate an existing stack; however, working with AWS CloudFormation StackSets is not a supported action at the time of this writing.

You can update an existing CloudFormation stack using one of two methods:

  • Directly updating the stack – AWS immediately deploys the changes that you submit. You can use this method when you want to quickly deploy your updates.
  • Running change sets – You can preview the changes AWS CloudFormation will make to the stack, and decide whether to proceed with the changes.

You have several options when building a CI/CD pipeline to automate creating or updating a stack. You can create or update a stack, delete a stack, create or replace a change set, or run a change set. Creating or updating a CloudFormation StackSet, however, is not a supported action.

The following screenshot shows the existing actions supported by CodePipeline against AWS CloudFormation on the CodePipeline console.

CodePipeline console

This post explains how to use CodePipeline to update an existing CloudFormation StackSet. For this post, we update the StackSet’s parameters. Parameters enable you to input custom values to your template each time you create or update a stack.

Overview of solution

To implement this solution, we walk you through the following high-level steps:

  1. Update a parameter for a StackSet by passing a parameter key and its associated value via an AWS CodeCommit repository.
  2. Create an AWS CodeBuild project.
  3. Build a CI/CD pipeline.
  4. Run your pipeline and monitor its status.

After completing all the steps in this post, you will have a fully functional CI/CD pipeline that updates the CloudFormation StackSet parameters. The pipeline starts automatically after you push the intended changes to the CodeCommit repository.

The following diagram illustrates the solution architecture.

Solution Architecture

The solution workflow is as follows:

  1. Developers integrate changes into a main branch hosted within a CodeCommit repository.
  2. CodePipeline polls the source code repository and triggers the pipeline to run when a new version is detected.
  3. CodePipeline runs a build of the new revision in CodeBuild.
  4. CodeBuild runs the commands in the buildspec.yml file, which include the change against the StackSet. (To update all the stack instances associated with this StackSet, do not specify DeploymentTargets or Regions in the buildspec.yml file.)
  5. Verify that the changes were applied successfully.

Prerequisites

To complete this tutorial, you should have the following prerequisites:

Retrieving your StackSet parameters

Your first step is to verify that you have a StackSet in the AWS account you intend to use. If not, create one before proceeding. For this post, we use an existing StackSet called StackSet-Test.

  1. Sign in to your AWS account.
  2. On the CloudFormation console, choose StackSets.
  3. Choose your StackSet.

StackSet

For this post, we modify the value of the parameter with the key KMSId.

  1. On the Parameters tab, note the value of the key KMSId.

Parameters

Creating a CodeCommit repository

To create your repository, complete the following steps:

  1. On the CodeCommit console, choose Repositories.
  2. Choose Create repository.

Repositories name

  1. For Repository name, enter a name (for example, Demo-Repo).
  2. Choose Create.

Repositories Description

  1. Choose Create file to populate the repository with the following artifacts.

Create file

A buildspec.yml file informs CodeBuild of all the actions that should be taken during a build run for our application. We divide the build run into separate predefined phases for logical organization, and list the commands that run on the provisioned build server performing a build job.

  1. Enter the following code in the code editor:

YAML

version: 0.2
phases:
  pre_build:
    commands:
      - aws cloudformation update-stack-set --stack-set-name StackSet-Test --use-previous-template --parameters ParameterKey=KMSId,ParameterValue=newCustomValue

The preceding AWS CloudFormation command updates the StackSet named StackSet-Test, setting the value of the parameter key KMSId to newCustomValue.
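
If you prefer an SDK over the AWS CLI in your build, the following is a minimal boto3 sketch of the same update. The polling loop is included because a StackSet update is asynchronous; the call returns before the stack instances finish updating:

Python

import time
import boto3

cfn = boto3.client("cloudformation")

# Same update as the CLI command above; omitting accounts and Regions
# (DeploymentTargets) updates every stack instance in the StackSet.
op_id = cfn.update_stack_set(
    StackSetName="StackSet-Test",
    UsePreviousTemplate=True,
    Parameters=[{"ParameterKey": "KMSId", "ParameterValue": "newCustomValue"}],
)["OperationId"]

# Poll the StackSet operation until it reaches a terminal state
while True:
    status = cfn.describe_stack_set_operation(
        StackSetName="StackSet-Test", OperationId=op_id
    )["StackSetOperation"]["Status"]
    if status in ("SUCCEEDED", "FAILED", "STOPPED"):
        print("StackSet operation finished with status:", status)
        break
    time.sleep(15)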

  1. Name the file buildspec.yml.
  2. Provide an author name and email address.
  3. Choose Commit changes.

Creating a CodeBuild project

To create your CodeBuild project, complete the following steps:

  1. On the CodeBuild console, choose Build projects.
  2. Choose Create build project.

create build project

  1. For Project name, enter your project name (for example, Demo-Build).
  2. For Description, enter an optional description.

project name

  1. For Source provider, choose AWS CodeCommit.
  2. For Repository, choose the CodeCommit repository you created in the previous step.
  3. For Reference type, keep the default selection, Branch.
  4. For Branch, choose master.

Source configuration

To set up the CodeBuild environment, we use a managed image based on Amazon Linux 2.

  1. For Environment Image, select Managed image.
  2. For Operating system, choose Amazon Linux 2.
  3. For Runtime(s), choose Standard.
  4. For Image, choose amazonlinux2-aarch64-standard:1.0.
  5. For Image version, choose Always use the latest for this runtime version.

Environment

  1. For Service role¸ select New service role.
  2. For Role name, enter your service role name.

Service Role

  1. Choose Create build project.

Creating a CodePipeline pipeline

To create your pipeline, complete the following steps:

  1. On the CodePipeline console, choose Pipelines.
  2. Choose Create pipeline.

Code Pipeline

  1. For Pipeline name, enter a name for the pipeline (for example, DemoPipeline).
  2. For Service role, select New service role.
  3. For Role name, enter your service role name.

Pipeline name

  1. Choose Next.
  2. For Source provider, choose AWS CodeCommit.
  3. For Repository name, choose the repository you created.
  4. For Branch name, choose master.

Source Configurations

  1. Choose Next.
  2. For Build provider, choose AWS CodeBuild.
  3. For Region, choose your Region.
  4. For Project name, choose the build project you created.

CodeBuild

  1. Choose Next.
  2. Choose Skip deploy stage.
  3. When prompted to confirm, choose Skip.
  4. Choose Create pipeline.

The pipeline is now created successfully.

Running and monitoring your pipeline

We use the pipeline to release changes. By default, a pipeline starts automatically when it’s created and any time a change is made in a source repository. You can also manually run the most recent revision through your pipeline, as in the following steps:

  1. On the CodePipeline console, choose the pipeline you created.
  2. On the pipeline details page, choose Release change.

The following screenshot shows the status of the run from the pipeline.

Release change
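
You can also start the same release programmatically instead of using the console. The following is a minimal boto3 sketch, assuming the pipeline name DemoPipeline used earlier in this post:

Python

import boto3

codepipeline = boto3.client("codepipeline")

# Equivalent to choosing Release change on the CodePipeline console
response = codepipeline.start_pipeline_execution(name="DemoPipeline")
print("Started execution:", response["pipelineExecutionId"])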

  1. Under Build, choose Details to view build logs, phase details, reports, environment variables, and build details.

Build details

  1. Choose the Build logs tab to view the logs generated as a result of the build in more detail.

The following screenshot shows that we ran the AWS CloudFormation command that was provided in the buildspec.yml file. It also shows that all phases of the build process are successfully complete.

 

Phase Details

The StackSet parameter KMSId has been updated successfully with the new value newCustomValue as a result of running the pipeline. We used KMSId for demonstration purposes; any other parameter that is part of your StackSet could have been updated the same way.
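
If you want to confirm the new value without opening the console, one option is to read the StackSet parameters back with boto3. The following minimal sketch uses the StackSet name from this post:

Python

import boto3

cfn = boto3.client("cloudformation")

# Read the StackSet parameters back to confirm the update
stack_set = cfn.describe_stack_set(StackSetName="StackSet-Test")["StackSet"]
for param in stack_set["Parameters"]:
    print(param["ParameterKey"], "=", param["ParameterValue"])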

Cleaning up

You may delete the resources that you created during this post:

  • AWS CloudFormation StackSet.
  • AWS CodeCommit repository.
  • AWS CodeBuild project.
  • AWS CodePipeline pipeline.

Conclusion

In this post, we explored how to use CodePipeline, CodeBuild, and CodeCommit to update an existing CloudFormation StackSet. Happy coding!

About the author

Karim Afifi is a Solutions Architect Leader with Amazon Web Services. He is part of the Global Life Sciences Solution Architecture team. He is based out of New York and enjoys helping customers throughout their journey to innovation.

 

Building an end-to-end Kubernetes-based DevSecOps software factory on AWS

Post Syndicated from Srinivas Manepalli original https://aws.amazon.com/blogs/devops/building-an-end-to-end-kubernetes-based-devsecops-software-factory-on-aws/

DevSecOps software factory implementation can significantly vary depending on the application, infrastructure, architecture, and the services and tools used. In a previous post, I provided an end-to-end DevSecOps pipeline for a three-tier web application deployed with AWS Elastic Beanstalk. The pipeline used cloud-native services along with a few open-source security tools. This solution is similar, but instead uses a containers-based approach with additional security analysis stages. It defines a software factory using Kubernetes along with necessary AWS Cloud-native services and open-source third-party tools. Code is provided in the GitHub repo to build this DevSecOps software factory, including the integration code for third-party scanning tools.

DevOps is a combination of cultural philosophies, practices, and tools that combine software development with information technology operations. These combined practices enable companies to deliver new application features and improved services to customers at a higher velocity. DevSecOps takes this a step further by integrating and automating the enforcement of preventive, detective, and responsive security controls into the pipeline.

In a DevSecOps factory, security needs to be addressed from two aspects: security of the software factory, and security in the software factory. In this architecture, we use AWS services to address the security of the software factory, and use third-party tools along with AWS services to address the security in the software factory. This AWS DevSecOps reference architecture covers DevSecOps practices and security vulnerability scanning stages including secret analysis, SCA (Software Composition Analysis), SAST (Static Application Security Testing), DAST (Dynamic Application Security Testing), RASP (Runtime Application Self-Protection), and aggregation of vulnerability findings into a single pane of glass.

The focus of this post is on application vulnerability scanning. Vulnerability scanning of underlying infrastructure such as the Amazon Elastic Kubernetes Service (Amazon EKS) cluster and network is outside the scope of this post. For information about infrastructure-level security planning, refer to Amazon GuardDuty, Amazon Inspector, and AWS Shield.

You can deploy this pipeline in either the AWS GovCloud (US) Region or standard AWS Regions. All listed AWS services are authorized for FedRAMP High and DoD SRG IL4/IL5.

Security and compliance

Thoroughly implementing security and compliance in the public sector and other highly regulated workloads is very important for achieving an ATO (Authority to Operate) and continuously maintaining an ATO (c-ATO). DevSecOps shifts security left in the process, integrating it at each stage of the software factory, which can make achieving an ATO a continuous and faster process. With DevSecOps, an organization can deliver secure and compliant application changes rapidly while running operations consistently with automation.

Security and compliance are shared responsibilities between AWS and the customer. Depending on the compliance requirements (such as FedRAMP or DoD SRG), a DevSecOps software factory needs to implement certain security controls. AWS provides tools and services to implement most of these controls. For example, to address NIST 800-53 security control families such as access control, you can use AWS Identity and Access Management (IAM) roles and Amazon Simple Storage Service (Amazon S3) bucket policies. To address auditing and accountability, you can use AWS CloudTrail and Amazon CloudWatch. To address configuration management, you can use AWS Config rules and AWS Systems Manager. Similarly, to address risk assessment, you can use vulnerability scanning tools from AWS.

The following table is the high-level mapping of the NIST 800-53 security control families and AWS services that are used in this DevSecOps reference architecture. This list only includes the services that are defined in the AWS CloudFormation template, which provides pipeline as code in this solution. You can use additional AWS services and tools or other environmental specific services and tools to address these and the remaining security control families on a more granular level.

# NIST 800-53 Security Control Family – Rev 5 AWS Services Used (In this DevSecOps Pipeline)
1 AC – Access Control

AWS IAM, Amazon S3, and Amazon CloudWatch are used.

AWS::IAM::ManagedPolicy
AWS::IAM::Role
AWS::S3::BucketPolicy
AWS::CloudWatch::Alarm

2 AU – Audit and Accountability

AWS CloudTrail, Amazon S3, Amazon SNS, and Amazon CloudWatch are used.

AWS::CloudTrail::Trail
AWS::Events::Rule
AWS::CloudWatch::LogGroup
AWS::CloudWatch::Alarm
AWS::SNS::Topic

3 CM – Configuration Management

AWS Systems Manager, Amazon S3, and AWS Config are used.

AWS::SSM::Parameter
AWS::S3::Bucket
AWS::Config::ConfigRule

4 CP – Contingency Planning

AWS CodeCommit and Amazon S3 are used.

AWS::CodeCommit::Repository
AWS::S3::Bucket

5 IA – Identification and Authentication

AWS IAM is used.

AWS::IAM::User
AWS::IAM::Role

6 RA – Risk Assessment

AWS Config, AWS CloudTrail, AWS Security Hub, and third party scanning tools are used.

AWS::Config::ConfigRule
AWS::CloudTrail::Trail
AWS::SecurityHub::Hub
Vulnerability Scanning Tools (AWS/AWS Partner/3rd party)

7 CA – Assessment, Authorization, and Monitoring

AWS CloudTrail, Amazon CloudWatch, and AWS Config are used.

AWS::CloudTrail::Trail
AWS::CloudWatch::LogGroup
AWS::CloudWatch::Alarm
AWS::Config::ConfigRule

8 SC – System and Communications Protection

AWS KMS and AWS Systems Manager are used.

AWS::KMS::Key
AWS::SSM::Parameter
SSL/TLS communication

9 SI – System and Information Integrity

AWS Security Hub, and third party scanning tools are used.

AWS::SecurityHub::Hub
Vulnerability Scanning Tools (AWS/AWS Partner/3rd party)

10 AT – Awareness and Training N/A
11 SA – System and Services Acquisition N/A
12 IR – Incident Response Not implemented, but services like AWS Lambda, and Amazon CloudWatch Events can be used.
13 MA – Maintenance N/A
14 MP – Media Protection N/A
15 PS – Personnel Security N/A
16 PE – Physical and Environmental Protection N/A
17 PL – Planning N/A
18 PM – Program Management N/A
19 PT – PII Processing and Transparency N/A
20 SR – Supply Chain Risk Management N/A

Services and tools

In this section, we discuss the various AWS services and third-party tools used in this solution.

CI/CD services

For continuous integration and continuous delivery (CI/CD) in this reference architecture, we use the following AWS services:

  • AWS CodeBuild – A fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy.
  • AWS CodeCommit – A fully managed source control service that hosts secure Git-based repositories.
  • AWS CodeDeploy – A fully managed deployment service that automates software deployments to a variety of compute services such as Amazon Elastic Compute Cloud (Amazon EC2), AWS Fargate, AWS Lambda, and your on-premises servers.
  • AWS CodePipeline – A fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.
  • AWS Lambda – A service that lets you run code without provisioning or managing servers. You pay only for the compute time you consume.
  • Amazon Simple Notification Service – Amazon SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.
  • Amazon S3 – Amazon S3 is storage for the internet. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web.
  • AWS Systems Manager Parameter Store – Parameter Store provides secure, hierarchical storage for configuration data management and secrets management.

Continuous testing tools

The following are open-source scanning tools that are integrated in the pipeline for the purpose of this post, but you could integrate other tools that meet your specific requirements. You can use the static code review tool Amazon CodeGuru for static analysis, but at the time of this writing, it’s not yet available in AWS GovCloud and currently supports Java and Python.

  • Anchore (SCA and SAST) – Anchore Engine is an open-source software system that provides a centralized service for analyzing container images, scanning for security vulnerabilities, and enforcing deployment policies.
  • Amazon Elastic Container Registry image scanning – Amazon ECR image scanning helps in identifying software vulnerabilities in your container images. Amazon ECR uses the Common Vulnerabilities and Exposures (CVEs) database from the open-source Clair project and provides a list of scan findings.
  • Git-Secrets (Secrets Scanning) – Prevents you from committing sensitive information to Git repositories. It is an open-source tool from AWS Labs.
  • OWASP ZAP (DAST) – Helps you automatically find security vulnerabilities in your web applications while you’re developing and testing your applications.
  • Snyk (SCA and SAST) – Snyk is an open-source security platform designed to help software-driven businesses enhance developer security.
  • Sysdig Falco (RASP) – Falco is an open source cloud-native runtime security project that detects unexpected application behavior and alerts on threats at runtime. It is the first runtime security project to join CNCF as an incubation-level project.

You can integrate additional security stages like IAST (Interactive Application Security Testing) into the pipeline to get code insights while the application is running. You can use AWS partner tools like Contrast Security, Synopsys, and WhiteSource to integrate IAST scanning into the pipeline. Malware scanning tools, and image signing tools can also be integrated into the pipeline for additional security.

Continuous logging and monitoring services

The following are AWS services for continuous logging and monitoring used in this reference architecture:

Auditing and governance services

The following are AWS auditing and governance services used in this reference architecture:

  • AWS CloudTrail – Enables governance, compliance, operational auditing, and risk auditing of your AWS account.
  • AWS Config – Allows you to assess, audit, and evaluate the configurations of your AWS resources.
  • AWS Identity and Access Management – Enables you to manage access to AWS services and resources securely. With IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources.

Operations services

The following are the AWS operations services used in this reference architecture:

  • AWS CloudFormation – Gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles, by treating infrastructure as code.
  • Amazon ECR – A fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts anywhere.
  • Amazon EKS – A managed service that you can use to run Kubernetes on AWS without needing to install, operate, and maintain your own Kubernetes control plane or nodes. Amazon EKS runs up-to-date versions of the open-source Kubernetes software, so you can use all of the existing plugins and tooling from the Kubernetes community.
  • AWS Security Hub – Gives you a comprehensive view of your security alerts and security posture across your AWS accounts. This post uses Security Hub to aggregate all the vulnerability findings as a single pane of glass.
  • AWS Systems Manager Parameter Store – Provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, Amazon Machine Image (AMI) IDs, and license codes as parameter values.

Pipeline architecture

The following diagram shows the architecture of the solution. We use AWS CloudFormation to describe the pipeline as code.

Containers devsecops pipeline architecture

Kubernetes DevSecOps Pipeline Architecture

The main steps are as follows:

    1. When a user commits the code to CodeCommit repository, a CloudWatch event is generated, which triggers CodePipeline to orchestrate the events.
    2. CodeBuild packages the build and uploads the artifacts to an S3 bucket.
    3. CodeBuild scans the code with git-secrets. If there is any sensitive information in the code such as AWS access keys or secrets keys, CodeBuild fails the build.
    4. CodeBuild creates the container image and performs SCA and SAST by scanning the image with Snyk or Anchore. In the provided CloudFormation template, you can pick one of these tools during the deployment. Please note, CodeBuild is fully enabled for a “bring your own tool” approach.
      • (4a) If there are any vulnerabilities, CodeBuild invokes the Lambda function. The function parses the results into AWS Security Finding Format (ASFF) and posts them to Security Hub. Security Hub helps aggregate and view all the vulnerability findings in one place as a single pane of glass. The Lambda function also uploads the scanning results to an S3 bucket. (An illustrative sketch of this ASFF import step appears after this list.)
      • (4b) If there are no vulnerabilities, CodeBuild pushes the container image to Amazon ECR and triggers another scan using built-in Amazon ECR scanning.
    5. CodeBuild retrieves the scanning results.
      • (5a) If there are any vulnerabilities, CodeBuild invokes the Lambda function again and posts the findings to Security Hub. The Lambda function also uploads the scan results to an S3 bucket.
      • (5b) If there are no vulnerabilities, CodeBuild deploys the container image to an Amazon EKS staging environment.
    6. After the deployment succeeds, CodeBuild triggers the DAST scanning with the OWASP ZAP tool (again, this is fully enabled for a “bring your own tool” approach).
      • (6a) If there are any vulnerabilities, CodeBuild invokes the Lambda function, which parses the results into ASFF and posts it to Security Hub. The function also uploads the scan results to an S3 bucket (similar to step 4a).
    7. If there are no vulnerabilities, the approval stage is triggered, and an email is sent to the approver for action via Amazon SNS.
    8. After approval, CodeBuild deploys the code to the production Amazon EKS environment.
    9. During the pipeline run, CloudWatch Events captures the build state changes and sends email notifications to subscribed users through Amazon SNS.
    10. CloudTrail tracks the API calls and sends notifications on critical events on the pipeline and CodeBuild projects, such as UpdatePipeline, DeletePipeline, CreateProject, and DeleteProject, for auditing purposes.
    11. AWS Config tracks all the configuration changes of AWS services. The following AWS Config rules are added in this pipeline as security best practices:
      1. CODEBUILD_PROJECT_ENVVAR_AWSCRED_CHECK – Checks whether the project contains environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The rule is NON_COMPLIANT when the project environment variables contain plaintext credentials. This rule ensures that sensitive information isn’t stored in the CodeBuild project environment variables.
      2. CLOUD_TRAIL_LOG_FILE_VALIDATION_ENABLED – Checks whether CloudTrail creates a signed digest file with logs. AWS recommends that the file validation be enabled on all trails. The rule is noncompliant if the validation is not enabled. This rule ensures that pipeline resources such as the CodeBuild project aren’t altered to bypass critical vulnerability checks.
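
The Lambda parsing code referenced in steps 4a, 5a, and 6a ships with the solution in the GitHub repo. As an illustration only, and not the solution's actual function, the following boto3 sketch shows the general shape of importing a finding into Security Hub in ASFF; the account ID, Region, generator ID, and finding fields are placeholders:

Python

import datetime
import boto3

securityhub = boto3.client("securityhub")
ACCOUNT = "111122223333"  # placeholder account ID
REGION = "us-east-1"      # placeholder Region; adjust the partition below
                          # (for example, arn:aws-us-gov) for AWS GovCloud (US)

def post_finding(title, description, normalized_severity, resource_id):
    # Build a minimal ASFF finding and import it into Security Hub
    now = datetime.datetime.utcnow().isoformat() + "Z"
    finding = {
        "SchemaVersion": "2018-10-08",
        "Id": f"{resource_id}/{title}",
        "ProductArn": f"arn:aws:securityhub:{REGION}:{ACCOUNT}:product/{ACCOUNT}/default",
        "GeneratorId": "devsecops-pipeline-scan",
        "AwsAccountId": ACCOUNT,
        "Types": ["Software and Configuration Checks/Vulnerabilities"],
        "CreatedAt": now,
        "UpdatedAt": now,
        "Severity": {"Normalized": normalized_severity},
        "Title": title,
        "Description": description,
        "Resources": [{"Type": "Container", "Id": resource_id}],
    }
    response = securityhub.batch_import_findings(Findings=[finding])
    print("Imported:", response["SuccessCount"], "Failed:", response["FailedCount"])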

Security of the pipeline is implemented using IAM roles and S3 bucket policies to restrict access to pipeline resources. Pipeline data at rest and in transit is protected using encryption and SSL secure transport. We use Parameter Store to store sensitive information such as API tokens and passwords. To be fully compliant with frameworks such as FedRAMP, additional controls, such as multi-factor authentication (MFA), may be required.
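
For example, a build step or Lambda function can read such a secret at run time instead of hard-coding it. The following minimal boto3 sketch assumes a hypothetical parameter name for the Snyk API token:

Python

import boto3

ssm = boto3.client("ssm")

# Retrieve a SecureString parameter (hypothetical name) and decrypt it at run time
token = ssm.get_parameter(
    Name="/devsecops/snyk-api-token", WithDecryption=True
)["Parameter"]["Value"]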

Security in the pipeline is implemented by performing the Secret Analysis, SCA, SAST, DAST, and RASP security checks. Applicable AWS services provide encryption at rest and in transit by default. You can enable additional controls on top of these wherever required.

In the next section, I explain how to deploy and run the pipeline CloudFormation template used for this example. As a best practice, we recommend using linting tools like cfn-nag and cfn-guard to scan CloudFormation templates for security vulnerabilities. Refer to the provided service links to learn more about each of the services in the pipeline.

Prerequisites

Before getting started, make sure you have the following prerequisites:

  • An EKS cluster environment with your application deployed. In this post, we use PHP WordPress as a sample application, but you can use any other application.
  • Sysdig Falco installed on an EKS cluster. Sysdig Falco captures events on the EKS cluster and sends those events to CloudWatch using AWS FireLens. For implementation instructions, see Implementing Runtime security in Amazon EKS using CNCF Falco. This step is required only if you need to implement RASP in the software factory.
  • A CodeCommit repo with your application code and a Dockerfile. For more information, see Create an AWS CodeCommit repository.
  • An Amazon ECR repo to store container images and scan for vulnerabilities. Enable vulnerability scanning on image push in Amazon ECR. You can enable or disable the automatic scanning on image push via the Amazon ECR console.
  • The provided buildspec-*.yml files for git-secrets, Anchore, Snyk, Amazon ECR, OWASP ZAP, and your Kubernetes deployment .yml files uploaded to the root of the application code repository. Please update the Kubernetes (kubectl) commands in the buildspec files as needed.
  • A Snyk API key if you use Snyk as a SAST tool.
  • The Lambda function uploaded to an S3 bucket. We use this function to parse the scan reports and post the results to Security Hub.
  • An OWASP ZAP URL and generated API key for dynamic web scanning.
  • An application web URL to run the DAST testing.
  • An email address to receive approval notifications for deployment, pipeline change notifications, and CloudTrail events.
  • AWS Config and Security Hub services enabled. For instructions, see Managing the Configuration Recorder and Enabling Security Hub manually, respectively.

Deploying the pipeline

To deploy the pipeline, complete the following steps:

  1. Download the CloudFormation template and pipeline code from the GitHub repo.
  2. Sign in to your AWS account if you have not done so already.
  3. On the CloudFormation console, choose Create Stack.
  4. Choose the CloudFormation pipeline template.
  5. Choose Next.
  6. Under Code, provide the following information:
    1. Code details, such as repository name and the branch to trigger the pipeline.
    2. The Amazon ECR container image repository name.
  7. Under SAST, provide the following information:
    1. Choose the SAST tool (Anchore or Snyk) for code analysis.
    2. If you select Snyk, provide an API key for Snyk.
  8. Under DAST, choose the DAST tool (OWASP ZAP) for dynamic testing and enter the API token, DAST tool URL, and the application URL to run the scan.
  9. Under Lambda functions, enter the Lambda function S3 bucket name, filename, and the handler name.
  10. For STG EKS cluster, enter the staging EKS cluster name.
  11. For PRD EKS cluster, enter the production EKS cluster name to which this pipeline deploys the container image.
  12. Under General, enter the email addresses to receive notifications for approvals and pipeline status changes.
  13. Choose Next.
  14. Complete the stack.
  15. After the pipeline is deployed, confirm the subscription by choosing the provided link in the email to receive notifications.

Pipeline CloudFormation Parameters

The provided CloudFormation template in this post is formatted for AWS GovCloud. If you’re setting this up in a standard Region, you have to adjust the partition name in the CloudFormation template. For example, change ARN values from arn:aws-us-gov to arn:aws.

Running the pipeline

To trigger the pipeline, commit changes to your application repository files. That generates a CloudWatch event and triggers the pipeline. CodeBuild scans the code and if there are any vulnerabilities, it invokes the Lambda function to parse and post the results to Security Hub.

When posting the vulnerability finding information to Security Hub, we need to provide a vulnerability severity level. Based on the provided severity value, Security Hub assigns the label as follows (a small code sketch of this mapping appears after the list). Adjust the severity levels in your code based on your organization’s requirements.

  • 0 – INFORMATIONAL
  • 1–39 – LOW
  • 40–69 – MEDIUM
  • 70–89 – HIGH
  • 90–100 – CRITICAL
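
The following minimal helper illustrates that mapping when you build the ASFF Severity field; the thresholds mirror the list above, and the function name is illustrative rather than part of the provided solution:

Python

def severity_label(normalized: int) -> str:
    """Map a 0-100 normalized score to a Security Hub severity label."""
    if normalized == 0:
        return "INFORMATIONAL"
    if normalized <= 39:
        return "LOW"
    if normalized <= 69:
        return "MEDIUM"
    if normalized <= 89:
        return "HIGH"
    return "CRITICAL"

# Example: a scanner score of 75 maps to HIGH
severity = {"Normalized": 75, "Label": severity_label(75)}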

The following screenshot shows the progression of your pipeline.


DevSecOps Kubernetes CI/CD Pipeline

 

Secrets analysis scanning

In this architecture, after the pipeline is initiated, CodeBuild triggers the Secret Analysis stage using git-secrets and the buildspec-gitsecrets.yml file. Git-Secrets looks for any sensitive information such as AWS access keys and secret access keys. Git-Secrets allows you to add custom strings to look for in your analysis. CodeBuild uses the provided buildspec-gitsecrets.yml file during the build stage.

SCA and SAST scanning

In this architecture, CodeBuild triggers the SCA and SAST scanning using Anchore, Snyk, and Amazon ECR. In this solution, we use the open-source versions of Anchore and Snyk. Amazon ECR uses open-source Clair under the hood, which comes with Amazon ECR for no additional cost. As mentioned earlier, you can choose Anchore or Snyk to do the initial image scanning.

Scanning with Anchore

If you choose Anchore as a SAST tool during the deployment, the build stage uses the buildspec-anchore.yml file to scan the container image. If there are any vulnerabilities, it fails the build and triggers the Lambda function to post those findings to Security Hub. If there are no vulnerabilities, it proceeds to the next stage.


Anchore Lambda Code Snippet

Scanning with Snyk

If you choose Snyk as a SAST tool during the deployment, the build stage uses the buildspec-snyk.yml file to scan the container image. If there are any vulnerabilities, it fails the build and triggers the Lambda function to post those findings to Security Hub. If there are no vulnerabilities, it proceeds to the next stage.


Snyk Lambda Code Snippet

Scanning with Amazon ECR

If there are no vulnerabilities from Anchore or Snyk scanning, the image is pushed to Amazon ECR, and the Amazon ECR scan is triggered automatically. Amazon ECR lists the vulnerability findings on the Amazon ECR console. To provide a single pane of glass view of all the vulnerability findings and for easy administration, we retrieve those findings and post them to Security Hub. If there are no vulnerabilities, the image is deployed to the EKS staging cluster and the next stage (DAST scanning) is triggered.


ECR Lambda Code Snippet
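
The screenshot above shows the solution's own Lambda code. As a generic illustration only, the following boto3 sketch shows one way to retrieve the Amazon ECR scan findings before forwarding them to Security Hub; the repository name and image tag are placeholders:

Python

import boto3

ecr = boto3.client("ecr")

# Fetch the automatic scan results for the image that was just pushed
findings = ecr.describe_image_scan_findings(
    repositoryName="my-app",            # placeholder repository name
    imageId={"imageTag": "latest"},     # placeholder image tag
)["imageScanFindings"]["findings"]

for finding in findings:
    print(finding["severity"], finding["name"])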

 

DAST scanning with OWASP ZAP

In this architecture, CodeBuild triggers DAST scanning using the DAST tool OWASP ZAP.

After deployment is successful, CodeBuild initiates the DAST scanning. When scanning is complete, if there are any vulnerabilities, it invokes the Lambda function, similar to SAST analysis. The function parses and posts the results to Security Hub. The following is the code snippet of the Lambda function.


Zap Lambda Code Snippet

The following screenshot shows the results in Security Hub. The highlighted section shows the vulnerability findings from various scanning stages.


Vulnerability Findings in Security Hub

We can drill down to individual resource IDs to get the list of vulnerability findings. For example, if we drill down to the resource ID of SASTBuildProject*, we can review all the findings from that resource ID.


SAST Vulnerabilities in Security Hub

 

If there are no vulnerabilities in the DAST scan, the pipeline proceeds to the manual approval stage and an email is sent to the approver. The approver can review and approve or reject the deployment. If approved, the pipeline moves to the next stage and deploys the application to the production EKS cluster.

Aggregation of vulnerability findings in Security Hub provides opportunities to automate the remediation. For example, based on the vulnerability finding, you can trigger a Lambda function to take the needed remediation action. This also reduces the burden on operations and security teams because they can now address the vulnerabilities from a single pane of glass instead of logging into multiple tool dashboards.

Along with Security Hub, you can send vulnerability findings to issue tracking systems such as Jira or to AWS Systems Manager OpsCenter, or you can automatically create an incident management ticket. This is outside the scope of this post, but is one of the possibilities you can consider when implementing DevSecOps software factories.

RASP scanning

Sysdig Falco is an open-source runtime security tool. Based on the configured rules, Falco can detect suspicious activity and alert on any behavior that involves making Linux system calls. You can use Falco rules to address security controls like NIST SP 800-53. Falco agents on each EKS node continuously scan the containers running in pods and emit events to STDOUT. These events can then be sent to CloudWatch or any third-party log aggregator to send alerts and respond. For more information, see Implementing Runtime security in Amazon EKS using CNCF Falco. You can also use Lambda to trigger and automatically remediate certain security events.

The following screenshot shows Falco events on the CloudWatch console. The highlighted text describes the Falco event that was triggered based on the default Falco rules on the EKS cluster. You can add additional custom rules to meet your security control requirements. You can also trigger responsive actions from these CloudWatch events using services like Lambda.


Falco alerts in CloudWatch

Cleanup

This section provides instructions to clean up the DevSecOps pipeline setup:

  1. Delete the EKS cluster.
  2. Delete the S3 bucket.
  3. Delete the CodeCommit repo.
  4. Delete the Amazon ECR repo.
  5. Disable Security Hub.
  6. Disable AWS Config.
  7. Delete the pipeline CloudFormation stack.

Conclusion

In this post, I presented an end-to-end Kubernetes-based DevSecOps software factory on AWS with continuous testing, continuous logging and monitoring, auditing and governance, and operations. I demonstrated how to integrate various open-source scanning tools, such as Git-Secrets, Anchore, Snyk, OWASP ZAP, and Sysdig Falco for Secret Analysis, SCA, SAST, DAST, and RASP analysis, respectively. To reduce operations overhead, I explained how to aggregate and manage vulnerability findings in Security Hub as a single pane of glass. This post also talked about how to implement security of the pipeline and in the pipeline using AWS Cloud-native services. Finally, I provided the DevSecOps software factory as code using AWS CloudFormation.

To get started with DevSecOps on AWS, see AWS DevOps and the DevOps blog.

Srinivas Manepalli is a DevSecOps Solutions Architect in the U.S. Fed SI SA team at Amazon Web Services (AWS). He is passionate about helping customers, building and architecting DevSecOps and highly available software systems. Outside of work, he enjoys spending time with family, nature and good food.

Choosing a CI/CD approach: AWS Services with BigHat Biosciences

Post Syndicated from Mike Apted original https://aws.amazon.com/blogs/devops/choosing-ci-cd-aws-services-bighat-biosciences/

Founded in 2019, BigHat Biosciences’ mission is to improve human health by reimagining antibody discovery and engineering to create better antibodies faster. Their integrated computational + experimental approach speeds up antibody design and discovery by combining high-speed molecular characterization with machine learning technologies to guide the search for better antibodies. They apply these design capabilities to develop new generations of safer and more effective treatments for patients suffering from today’s most challenging diseases. Their platform, from wet lab robots to cloud-based data and logistics plane, is woven together with rapidly changing BigHat-proprietary software. BigHat uses continuous integration and continuous deployment (CI/CD) throughout their data engineering workflows and when training and evaluating their machine learning (ML) models.

 

BigHat Biosciences Logo

 

In a previous post, we discussed the key considerations when choosing a CI/CD approach. In this post, we explore BigHat’s decisions and motivations in adopting managed AWS CI/CD services. You may find that your organization has commonalities with BigHat and some of their insights may apply to you. Throughout the post, considerations are informed and choices are guided by the best practices in the AWS Well-Architected Framework.

How did BigHat decide what they needed?

Making decisions on appropriate (CI/CD) solutions requires understanding the characteristics of your organization, the environment you operate in, and your current priorities and goals.

“As a company designing therapeutics for patients rather than software, the role of technology at BigHat is to enable a radically better approach to develop therapeutic molecules,” says Eddie Abrams, VP of Engineering at BigHat. “We need to automate as much as possible. We need the speed, agility, reliability and reproducibility of fully automated infrastructure to enable our company to solve complex problems with maximum scientific rigor while integrating best in class data analysis. Our engineering-first approach supports that.”

BigHat possesses a unique insight to an unsolved problem. As an early stage startup, their core focus is optimizing the fully integrated platform that they built from the ground-up to guide the design for better molecules. They respond to feedback from partners and learn from their own internal experimentation. With each iteration, the quality of what they’re creating improves, and they gain greater insight and improved models to support the next iteration. More than anything, they need to be able to iterate rapidly. They don’t need any additional complexity that would distract from their mission. They need uncomplicated and enabling solutions.

They also have to take into consideration the regulatory requirements that apply to them as a company, the data they work with and its security requirements; and the market segment they compete in. Although they don’t control these factors, they can control how they respond to them, and they want to be able to respond quickly. It’s not only speed that matters in designing for security and compliance, but also visibility and track-ability. These often overlooked and critical considerations are instrumental in choosing a CI/CD strategy and platform.

“The ability to learn faster than your competitors may be the only sustainable competitive advantage,” says Cindy Alvarez in her book Lean Customer Development.

The tighter the feedback loop, the easier it is to make a change. Rapid iteration allows BigHat to easily build upon what works, and make adjustments as they identify avenues that won’t lead to success.

Feature set

CI/CD is applicable to more than just the traditional use case. It doesn’t have to be software delivered in a classic fashion. In the case of BigHat, they apply CI/CD in their data engineering workflows and in training their ML models. BigHat uses automated solutions in all aspects of their workflow. Automation further supports taking what they have created internally and enabling advances in antibody design and development for safer, more effective treatments of conditions.

“We see a broadening of the notion of what can come under CI/CD,” says Abrams. “We use automated solutions wherever possible including robotics to perform scaled assays. The goal in tightening the loop is to improve precision and speed, and reduce latency and lag time.”

BigHat reached the conclusion that they would adopt managed service offerings wherever possible, including in their CI/CD tooling and other automation initiatives.

“The phrase ‘undifferentiated heavy lifting’ has always resonated,” says Abrams. “Building, scaling, and operating core software and infrastructure are hard problems, but solving them isn’t itself a differentiating advantage for a therapeutics company. But whether we can automate that infrastructure, and how we can use that infrastructure at scale on a rock solid control plane to provide our custom solutions iteratively, reliably and efficiently absolutely does give us an edge. We need an end-to-end, complete infrastructure solution that doesn’t force us to integrate a patchwork of solutions ourselves. AWS provides exactly what we need in this regard.”

Reducing risk

Startups can be full of risk, with the upside being potential future reward. They face risk in finding the right problem, in finding a solution to that problem, and in finding a viable customer base to buy that solution.

A key priority for early stage startups is removing risk from as many areas of the business as possible. Any steps an early stage startup can take to remove risk without commensurately limiting reward makes them more viable. The more risk a startup can drive out of their hypothesis the more likely their success, in part because they’re more attractive to customers, employees, and investors alike. The more likely their product solves their problem, the more willing a customer is to give it a chance. Likewise, the more attractive they are to investors when compared to alternative startups with greater risk in reaching their next major milestone.

Adoption of managed services for CI/CD accomplishes this goal in several ways. The most important advantage remains speed. The core functionality required can be stood up very quickly, as it’s an existing service. Customers have a large body of reference examples and documentation available to demonstrate how to use that service. They also insulate teams from the need to configure and then operate the underlying infrastructure. The team remains focused on their differentiation and their core value proposition.

“We are automated right up to the organizational level and because of this, running those services ourselves represents operational risk,” says Abrams. “The largest day-to-day infrastructure risk to us is having the business stalled while something is not working. Do I want to operate these services, and focus my staff on that? There is no guarantee I can just throw more compute at a self-managed software service I’m running and make it scale effectively. There is no guarantee that if one datacenter is having a network or electrical problem that I can simply switch to another datacenter. I prefer AWS manages those scale and uptime problems.”

Embracing an opinionated model

BigHat is a startup with a singular focus on using ML to reduce the time and difficulty of designing antibodies and other therapeutic proteins. By adopting managed services, they have removed the burden of implementing and maintaining CI/CD systems.

Accepting the opinionated guardrails of the managed service approach allows, and to a degree reinforces, the focus on what makes a startup unique. Rather than being focused on performance tuning, making decisions on what OS version to use, or which of the myriad optional puzzle pieces to put together, they can use a well-integrated set of tools built to work with each other in a defined fashion.

The opinionated model means best practices are baked into the toolchain. Instead of hiring for specialized administration skills they’re hiring for specialized biotech skills.

“The only degrees of freedom I care about are the ones that improve our technologies and reduce the time, cost, and risk of bringing a therapeutic to market,” says Abrams. “We focus on exactly where we can gain operational advantages by simply adopting managed services that already embrace the Well-Architected Framework. If we had to tackle all of these engineering needs with limited resources, we would be spending into a solved problem. Before AWS, startups just didn’t do these sorts of things very well. Offloading this effort to a trusted partner is pretty liberating.”

Beyond the reduction in operational concerns, BigHat can also expect continuous improvement of that service over time to be delivered automatically by the provider. For their use case they will likely derive more benefit for less cost over time without any investment required.

Overview of solution

BigHat uses the following key services:

 

BigHat Reference Architecture

Security

Managed services are supported, owned, and operated by the provider. This allows BigHat to leave concerns like patching and security of the underlying infrastructure and services to the provider. BigHat continues to maintain ownership in the shared responsibility model, but their scope of concern is significantly narrowed. The surface area they're responsible for is reduced, helping to minimize risk. Choosing a partner with best-in-class observability, tracking, compliance, and auditing tools is critical to any company that manages sensitive data.

Cost advantages

A startup must also make strategic decisions about where to deploy the capital they have raised from their investors. The vendor managed services bring a model focused on consumption, and allow the startup to make decisions about where they want to spend. This is often referred to as an operational expense (OpEx) model, in other words “pay as you go”, like a utility. This is in contrast to a large upfront investment in both time and capital to build these tools. The lack of need for extensive engineering efforts to stand up these tools, and continued investment to evolve them, acts as a form of capital expenditure (CapEx) avoidance. Startups can allocate their capital where it matters most for them.

“This is corporate-level changing stuff,” says Abrams. “We engage in a weekly leadership review of cost budgets. Operationally I can set the spending knob where I want it monthly, weekly or even daily, and avoid the risks involved in traditional capacity provisioning.”

The right tool for the right time

A key consideration for BigHat was the ability to extend the provider managed tools, where needed, to incorporate extended functionality from the ecosystem. This allows for additional functionality that isn’t covered by the core managed services, while maintaining a focus on their product development versus operating these tools.

Startups must also ask themselves what they need now, versus what they need in the future. As their needs change and grow, they can augment, extend, and replace the tools they have chosen to meet the new requirements. Starting with a vendor-managed service is not a one-way door; it’s an opportunity to defer investment in building and operating these capabilities yourself until that investment is justified. The time to value in using managed services initially doesn’t leave a startup with a sunk cost that limits future options.

“You have to think about the degree you want to adopt a hybrid model for the services you run. Today we aren’t running any software or services that require us to run our own compute instances. It’s very rare we run into something that is hard to do using just the services AWS already provides. Where our needs are not met, we can communicate them to AWS and we can choose to wait for them on their roadmap, which we have done in several cases, or we can elect to do it ourselves,” says Abrams. “This freedom to tweak and expand our service model at will is incomparably liberating.”

Conclusion

BigHat Biosciences was able to make an informed decision by considering the priorities of the business at this stage of its lifecycle. They adopted and embraced opinionated and service provider-managed tooling, which allowed them to inherit a largely best practice set of technology and practices, de-risk their operations, and focus on product velocity and customer feedback. This maintains future flexibility, which delivers significantly more value to the business in its current stage.

“We believe that the underlying engineering, the underlying automation story, is an advantage that applies to every aspect of what we do for our customers,” says Abrams. “By taking those advantages into every aspect of the business, we deliver on operations in a way that provides a competitive advantage a lot of other companies miss by not thinking about it this way.”

About the authors

Mike is a Principal Solutions Architect with the Startup Team at Amazon Web Services. He is a former founder, current mentor, and enjoys helping startups live their best cloud life.


Sean is a Senior Startup Solutions Architect at AWS. Before AWS, he was Director of Scientific Computing at the Howard Hughes Medical Institute.

Choosing a Well-Architected CI/CD approach: Open-source software and AWS Services

Post Syndicated from Brian Carlson original https://aws.amazon.com/blogs/devops/choosing-well-architected-ci-cd-open-source-software-aws-services/

This series of posts discusses making informed decisions when choosing to implement open-source tools on AWS services, adopt managed AWS services to satisfy the same needs, or use a combination of both.

We look at key considerations for evaluating open-source software and AWS services using the perspectives of a startup company and a mature company as examples. You can use these two different points of view to compare to your own organization. To make this investigation easier we will use Continuous Integration (CI) and Continuous Delivery (CD) capabilities as the target of our investigation.

Startup Company rocket and Mature Company rocket

In two related posts, we follow two AWS customers, Iponweb and BigHat Biosciences, as they share their CI/CD journeys, their perspectives, the decisions they made, and why. To end the series, we explore an example reference architecture showing the benefits AWS provides regardless of your emphasis on open-source tools or managed AWS services.

Why CI/CD?

Until your creations are in the hands of your customers, investment in development has provided no return. The faster valuable changes enter production, the greater positive impact you can have on your customer. In today’s highly competitive world, the ability to frequently and consistently deliver value is a competitive advantage. The Operational Excellence (OE) pillar of the AWS Well-Architected Framework recognizes this impact and focuses on the capabilities of CI/CD in two dedicated sections.

The concepts in CI/CD originate from software engineering but apply equally to any form of content. The goal is to support development, integration, testing, deployment, and delivery to production. For example, making changes to an application, updating your machine learning (ML) models, changing your multimedia assets, or referring to the AWS Well-Architected Framework.

Adopting CI/CD and the best practices from the Operational Excellence pillar can help you address risks in your environment, and limit errors from manual processes. More importantly, they help free your teams from the related manual processes, so they can focus on satisfying customer needs, differentiating your organization, and accelerating the flow of valuable changes into production.

A red question mark sits on a field of chaotically arranged black question marks.

How do you decide what you need?

The first question in the Operational Excellence pillar is about understanding needs and making informed decisions. To help you frame your own decision-making process, we explore key considerations from the perspective of a fictional startup company and a fictional mature company. In our two related posts, we explore these same considerations with Iponweb and BigHat.

The key considerations include:

  • Functional requirements – Providing specific features and capabilities that deliver value to your customers.
  • Non-functional requirements – Enabling the safe, effective, and efficient delivery of the functional requirements. Non-functional requirements include security, reliability, performance, and cost requirements.
    • Without security, you can’t earn customer trust. If your customers can’t trust you, you won’t have customers.
    • Without reliability you aren’t available to serve your customers. If you can’t serve your customers, you won’t have customers.
    • Performance is focused on timely and efficient delivery of value, not delivering as fast as possible.
    • Cost is focused on optimizing the value received for the resources spent (for example, money, time, or effort), not minimizing expense.
  • Operational requirements – Enabling you to effectively and efficiently support, maintain, sustain, and improve the delivery of value to your customers. When you “Design with Ops in Mind,” you’re enabling effective and efficient support for your business outcomes.

These non-feature-related key considerations are why Operational Excellence, Security, Reliability, Performance Efficiency, and Cost Optimization are the five pillars of the AWS Well-Architected Framework.

The startup company

Any startup begins as a small team of inspired people working together to realize the unique solution they believe solves an unsolved problem.

For our fictional small team, everyone knows each other personally and all speak frequently. We share processes and procedures in discussions, and everyone knows what needs to be done. Our team members bring their expertise and dedicate it, and the majority of their work time, to delivering our solution. The results of our efforts inform changes we make to support our next iteration.

However, our manual activities are error-prone and inconsistencies exist in the way we do them. Performing these tasks takes time away from delivering our solution. When errors occur, they have the potential to disrupt everyone’s progress.

We have capital available to make some investments. We would prefer to bring in more team members who can contribute directly to developing our solution. We need to iterate faster if we are to achieve a broadly viable product in time to qualify for our next round of funding. We need to decide what investments to make.

  • Goals – Reach the next milestone and secure funding to continue development
  • Needs – Reduce or eliminate the manual processes and associated errors
  • Priority – Rapid iteration
  • CI/CD emphasis – Baseline CI/CD capabilities and non-functional requirements are emphasized over a rich feature set

The mature company

Our second fictional company is a large and mature organization operating in a mature market segment. We’re focused on consistent, quality customer experiences to serve and retain our customers.

Our size limits the personal relationships between our service and development teams. The process to make requests, and the interfaces between teams and their systems, are well documented and understood.

However, the systems we have implemented over time, as needs were identified and addressed, aren’t well documented. Our existing tool chain includes some in-house scripting and both supported and unsupported versions of open-source tools. There are limited opportunities for us to acquire new customers.

When conditions change and new features are desired, we want to be able to rapidly implement and deploy those features as fast as possible. If we can differentiate our services, however briefly, we may be able to win customers away from our competitors. Our other path to improved profitability is to evolve our processes, maximizing integration and efficiencies, and capturing cost reductions.

  • Goals – Differentiate ourselves in the marketplace with desired new features
  • Needs – Address the risks of poorly documented systems and unsupported software
  • Priority – Evolve efficiency
  • CI/CD emphasis – Rich feature set and integrations are emphasized over improving the existing non-functional capabilities

Open-source tools on AWS vs. AWS services

The choice of open-source tools or AWS service is not binary. You can select the combination of solutions that provides the greatest value. You can implement open-source tools for their specific benefits where they outweigh the costs and operational burden, using underlying AWS services like Amazon Elastic Compute Cloud (Amazon EC2) to host them. You can then use AWS managed services, like AWS CodeBuild, for the undifferentiated features you need, without additional cost or operational burden.

A group of people sit around a table discussing the pieces of a puzzle and their ideas.

Feature Set

Our fictional organizations both want to accelerate the flow of beneficial changes into production and are evaluating CI/CD alternatives to support that outcome. Our startup company wants a working solution—basic capabilities, author/code, build, and deploy, so that they can focus on development. Our mature company is seeking every advantage—a rich feature set, extensive opportunities for customization, integration capabilities, and fine-grained control.

Open-source tools

Open-source tools often excel at meeting functional requirements. When new functionality, a capability, or an integration is desired, any developer can implement it for themselves and then contribute the code back to the project. As the user community for an open-source project expands, the number of use cases and identified features grows, and so does the number of potential solutions and potential contributors. Developers use these tools to support their efforts and implement new features that provide value to them.

However, features may first appear in unsupported or development versions and only later be added to the supported feature set. Non-functional requirements take time and are less appealing because they don’t typically bring immediate value to the product. Non-functional capabilities may lag behind the feature set.

Consider the following:

  • Open-source tools may have more features and existing integrations to other tools
  • The pace of feature set delivery may be extremely rapid
  • The features delivered are those desired and created by the active members of the community
  • You are free to implement the features your company desires
  • There is no commitment to long-term support for the project or any given feature
  • You can implement open-source tools on multiple cloud providers or on premises
  • If the project is abandoned, you’re responsible for maintaining your implementation

AWS services

AWS services are driven by customer needs. Services and features are supported by dedicated teams. These customer-obsessed teams focus on all customer needs, with security being their top priority. Both functional and non-functional requirements are addressed with an emphasis on enabling customer outcomes while minimizing the effort they expend to achieve them.

Consider the following:

  • The pace of delivery of feature sets is consistent
  • The feature roadmap is driven by customer need and customer requests
  • The AWS service team is dedicated to support of the service
  • AWS services are available on the AWS Cloud and on premises through AWS Outposts


Cost Optimization

Why are we discussing cost after the feature set? Security and reliability are fundamentally more important. Leadership naturally gravitates to the operational excellence best practice of evaluating trade-offs. Having looked at the potential benefits of the feature set, the next question is typically, “What is this going to cost?” Leadership defines the priorities and allocates the necessary resources (capital, time, effort). We review cost optimization second so that leadership can compare the expected benefits of CI/CD investments against investments in other efforts and make an informed decision.

Our organizations are both cost conscious. Our startup is working with finite capital and time. In contrast, our mature company can plan to make investments over time and budget for the needed capital. Early investment in a robust and feature-rich CI/CD tool chain could provide significant advantages towards the startup’s long-term success, but if the startup fails early, the value of that investment will never be realized. The mature company can afford to realize the value of their investment over time and can make targeted investments to address specific short-term needs.

Open-source tools

Open-source software doesn’t have to be purchased, but there are costs to adopt. Open-source tools require appropriate skills in order to be implemented, and to perform management and maintenance activities. Those skills must be gained through dedicated training of team members, team member self-study, or by hiring new team members with the existing skills. The availability of skilled practitioners of open-source tools varies with how popular a tool is and how long it has had an active community. Loss of skilled team members includes the loss of their institutional knowledge and intimacy with the implementation. Skills must be maintained with changes to the tools and as team members join or leave. Time is required from skilled team members to support management and maintenance activities. If commercial support for the tool is desired, it may be available through third-parties at an additional cost.

The time to value of an open-source implementation includes the time to implement and configure the resources and software. Additional value may be realized through investment of time configuring or implementing desired integrations and capabilities. There may be existing community-supported integrations or capabilities that reduce the level of effort to achieve these.

Consider the following:

  • No cost to acquire the software.
  • The availability of skilled practitioners of open-source tools may be lower. Cost (capital and time) to acquire, establish, or maintain skill sets may be higher.
  • There is an ongoing cost to maintain the team member skills necessary to support the open-source tools.
  • There is an ongoing cost of time for team members to perform management and maintenance activities.
  • Additional commercial support for open-source tools may be available at additional cost.
  • Time to value includes implementation and configuration of resources and the open-source software. There may be more predefined community integrations.

AWS services

AWS services are provided pay-as-you-go with no required upfront costs. As of August 2020, more than 400,000 individuals hold active AWS Certifications, a number that grew more than 85% between August 2019 and August 2020.

Time to value for AWS services is extremely short and limited to the time to instantiate or configure the service for your use. Additional value may be realized through the investment of time configuring or implementing desired integrations. Predefined integrations for AWS services are added as part of the service development roadmap. However, there may be fewer existing integrations to reduce your level of effort.

Consider the following:

  • No cost to acquire the software; AWS services are pay-as-you-go for use.
  • AWS skill sets are broadly available. Cost (capital and time) to acquire, establish, or maintain skill sets may be lower.
  • AWS services are fully managed, and service teams are responsible for the operation of the services.
  • Time to value is limited to the time to instantiate or configure the service. There may be fewer predefined integrations.
  • Additional support for AWS services is available through AWS Support. Cost for support varies based on level of support and your AWS utilization.

Open-source tools on AWS services

Running open-source tools on AWS services doesn’t change these cost considerations. Migrating off either solution is similarly undifferentiated: in either case, you have to invest time in replacing the integrations and customizations you wish to maintain.


Security

Both organizations are concerned about reputation and customer trust. They both want to act to protect their information systems and are focusing on confidentiality and integrity of data. They both take security very seriously. Our startup wants to be secure by default and wants to trust the vendor to address vulnerabilities within the service. Our mature company has dedicated resources that focus on security, and the company practices defense in depth across internal organizations.

The startup and the mature company both want to know whether a choice is safe, secure, and can validate the security of their choice. They also want to understand their responsibilities and the shared responsibility model that applies.

Open-source tools

Open-source tools are the product of the contributors and may contain flaws or vulnerabilities. The entire community has access to the code to test and validate. There are frequently many eyes evaluating the security of the tools. A company or individual may perform a validation for themselves. However, there may be limited guidance on secure configurations. Controls in the implementer’s environment may reduce potential risk.

Consider the following:

  • You’re responsible for the security of the open-source software you implement
  • You control the security of your data within your open-source implementation
  • You can validate the security of the code and act as desired

AWS services

AWS service teams make security their highest priority and are able to respond rapidly when flaws are identified. There is robust guidance provided to support configuring AWS services securely.

Consider the following:

  • AWS is responsible for the security of the cloud and the underlying services
  • You are responsible for the security of your data in the cloud and how you configure AWS services
  • You must rely on the AWS service team to validate the security of the code

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation and the configuration of the AWS services it consumes. AWS is responsible for the security of the AWS Cloud and the managed AWS services.


Reliability

Everyone wants reliable capabilities. What varies between companies is their appetite for risk, and how much they can tolerate the impact of non-availability. The startup emphasized the need for their systems to be available to support their rapid iterations. The mature company is operating with some existing reliability risks, including unsupported open-source tools and in-house scripts.

The startup and the mature company both want to understand the expected reliability of a choice, meaning what percentage of the time it is expected to be available. They both want to know if a choice is designed for high availability and will remain available even if a portion of the systems fails or is in a degraded state. They both want to understand the durability of their data, how to perform backups of their data, and how to perform recovery in the event of a failure.

Both companies need to determine an acceptable outage duration, commonly referred to as the Recovery Time Objective (RTO), and how much elapsed time of transactions (including committed changes) it is acceptable to lose, commonly referred to as the Recovery Point Objective (RPO). For example, an RTO of 4 hours means the capability must be restored within 4 hours of a failure, and an RPO of 15 minutes means losing at most the last 15 minutes of committed changes is acceptable. They need to evaluate whether they can achieve their RTO and RPO objectives with each of the choices they are considering.

Open-source tools

Open-source reliability is dependent upon the effectiveness of the company’s implementation, the underlying resources supporting the implementation, and the reliability of the open-source software. Open-source tools are the product of the contributors and may or may not incorporate high availability features. Depending on the implementation and tool, there may be a requirement for downtime for specific management or maintenance activities. The ability to support RTO and RPO depends on the teams supporting the company system, the implementation, and the mechanisms implemented for backup and recovery.

Consider the following:

  • You are responsible for implementing your open-source software to satisfy your reliability needs and high availability needs
  • Open-source tools may have downtime requirements to support specific management or maintenance activities
  • You are responsible for defining, implementing, and testing the backup and recovery mechanisms and procedures
  • You are responsible for the satisfaction of your RTO and RPO in the event of a failure of your open-source system

AWS services

AWS services are designed to support customer availability needs. As managed services, the service teams are responsible for maintaining the health of the services.

Consider the following:

  • AWS services are designed for high availability, and the service teams are responsible for the health of the services
  • You are responsible for defining and testing backup and recovery procedures for your data and configurations
  • You are responsible for validating that your RTO and RPO can be achieved with the services you select

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation (including data durability, backup, and recovery) and the configuration of the AWS services it consumes. AWS is responsible for the health of the AWS Cloud and the managed services.


Performance

What defines timely and efficient delivery of value varies between our two companies. Each wants results delivered before an engineer sits idle waiting for them. The startup iterates rapidly based on the results of each prior iteration. There is limited other activity for our startup engineer to perform while they wait for actionable results. Our mature company is more likely to have an outstanding backlog of improvements that can be acted upon while a change moves through the pipeline.

Open-source tools

Open-source performance is defined by the resources upon which it is deployed. Open-source tools that can scale out can dynamically improve their performance when resource constrained. Performance can also be improved by scaling up, which is required when performance is constrained by resources and scaling out isn’t supported. The performance of open-source tools may be constrained by characteristics of how they were implemented in code or the libraries they use. If this is the case, the code is available for community or implementer-created improvements to address the limitation.

Consider the following:

  • You are responsible for managing the performance of your open-source tools
  • The performance of open-source tools may be constrained by the resources they are deployed on; their system, resource, and software configuration; and the code and libraries within the tools themselves

AWS services

AWS services are designed to be highly scalable. CodeCommit has a highly scalable architecture, and CodeBuild scales up and down dynamically to meet your build volume. CodePipeline allows you to run actions in parallel in order to increase your workflow speeds.

Consider the following:

  • AWS services are fully managed, and service teams are responsible for the performance of the services.
  • AWS services are designed to scale automatically.
  • Your configuration of the services you consume can affect the performance of those services.
  • AWS service quotas exist to prevent unexpected costs. You can request changes to service quotas that may affect performance and costs.

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation (including the selection and configuration of the AWS Cloud resources) and the configuration of the AWS services it consumes. AWS is responsible for the performance of the AWS Cloud and the managed AWS services.


Operations

Our startup company wants to limit its operations burden as much as possible in order to focus on development efforts. Our mature company has an established and robust operations capability. In both cases, they perform the management and maintenance activities necessary to support their needs.

Open-source tools

Open-source tools are supported by their volunteer communities. That support is voluntary, without any obligation or commitment from the users. If either company adopts open-source tools, they’re responsible for the management and maintenance of the system. If they want additional support with an obligation and commitment to support their implementation, third parties may provide commercial support at additional cost.

Consider the following:

  • You are responsible for supporting your implementation.
  • The open-source community may provide volunteer support for the software.
  • There is no commitment to support the software by the open-source community.
  • There may be less documentation, or accepted best practices, available to support open-source tools.
  • Early adoption of open-source tools, or the use of development builds, includes the chance of encountering unidentified edge cases and unanticipated issues.
  • The complexity of an implementation and its integrations may increase the difficulty to support open-source tools. The time to identify contributing factors may be extended by the complexity during an incident. Maintaining a set of skilled team members with deep understanding of your implementation may help mitigate this risk.
  • You may be able to acquire commercial support through a third party.

AWS services

AWS services are committed to providing long-term support for their customers.

Consider the following:

  • There is long-term commitment from AWS to support the service
  • As a managed service, the service team maintains current documentation
  • Additional levels of support are available through AWS Support
  • Support for AWS is available through partners and third parties

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations. The company is responsible for operating the open-source tools (for example, software configuration changes, updates, patching, and responding to faults). AWS is responsible for the operation of the AWS Cloud and the managed AWS services.

Conclusion

In this post, we discussed how to make informed decisions when choosing to implement open-source tools on AWS services, adopt managed AWS services, or use a combination of both. To do so, you must examine your organization and evaluate the benefits and risks.


Examine your organization

You can make an informed decision about the capabilities you adopt. The insight you need can be gained by examining your organization to identify your goals, needs, and priorities, and discovering what your current emphasis is. Ask the following questions:

  • What is your organization trying to accomplish and why?
  • How large is your organization and how is it structured?
  • How are roles and responsibilities distributed across teams?
  • How well defined and understood are your processes and procedures?
  • How do you manage development, testing, delivery, and deployment today?
  • What are the major challenges your organization faces?
  • What are the challenges you face managing development?
  • What problems are you trying to solve with CI/CD tools?
  • What do you want to achieve with CI/CD tools?

Evaluate benefits and risk

Armed with that knowledge, the next step is to explore the trade-offs between open-source options and managed AWS services. Then evaluate the benefits and risks in terms of the key considerations:

  • Features
  • Cost
  • Security
  • Reliability
  • Performance
  • Operations

When asked “What is the correct answer?” the answer should never be left at “It depends.” Change the question to “What is our use case and what are our needs?” and the answer will emerge from there.

Make an informed decision

A Well-Architected solution can include open-source tools, AWS Services, or any combination of both! A Well-Architected choice is an informed decision that evaluates trade-offs, balances benefits and risks, satisfies your requirements, and most importantly supports the achievement of your business outcomes.

Read the other posts in this series and take this journey with BigHat Biosciences and Iponweb as they share their perspectives, the decisions they made, and why.

Resources

Want to learn more? Check out the following CI/CD and developer tools on AWS:

Continuous integration (CI)
Continuous delivery (CD)
AWS Developer Tools

For more information about the AWS Well-Architected Framework, refer to the following whitepapers:

AWS Well-Architected Framework
AWS Well-Architected Operational Excellence pillar
AWS Well-Architected Security pillar
AWS Well-Architected Reliability pillar
AWS Well-Architected Performance Efficiency pillar
AWS Well-Architected Cost Optimization pillar


Author bio

Brian Carlson is the global Operational Excellence lead for the AWS Well-Architected program. Formerly the technical lead for an international network, Brian works with customers and partners researching the operations best practices with the greatest positive impact and produces guidance to help you achieve your goals.

 

Integrate GitHub monorepo with AWS CodePipeline to run project-specific CI/CD pipelines

Post Syndicated from Vivek Kumar original https://aws.amazon.com/blogs/devops/integrate-github-monorepo-with-aws-codepipeline-to-run-project-specific-ci-cd-pipelines/

AWS CodePipeline is a continuous delivery service that enables you to model, visualize, and automate the steps required to release your software. With CodePipeline, you model the full release process for building your code, deploying to pre-production environments, testing your application, and releasing it to production. CodePipeline then builds, tests, and deploys your application according to the defined workflow either in manual mode or automatically every time a code change occurs. A lot of organizations use GitHub as their source code repository. Some organizations choose to embed multiple applications or services in a single GitHub repository separated by folders. This method of organizing your source code in a repository is called a monorepo.

This post demonstrates how to customize GitHub events that invoke a monorepo service-specific pipeline by reading the GitHub event payload using AWS Lambda.

 

Solution overview

With the default setup in CodePipeline, a release pipeline is invoked whenever a change in the source code repository is detected. When using GitHub as the source for a pipeline, CodePipeline uses a webhook to detect changes in a remote branch and starts the pipeline. With a monorepo-style project in GitHub, CodePipeline receives an event at the repository level no matter which folder you change. If you have a continuous integration and continuous deployment (CI/CD) pipeline for each of the applications and services in the repository, every pipeline detects a change in any of the folders every time. The following diagram illustrates this scenario.

 

GitHub monorepo folder structure

 

This post demonstrates how to customize GitHub events that invoke a monorepo service-specific pipeline by reading the GitHub event payload using Lambda. This solution has the following benefits:

  • Add customizations to start pipelines based on external factors – You can use custom code to evaluate whether a pipeline should be triggered. This allows for further customization beyond polling a source repository or relying on a push event. For example, you can create custom logic to automatically reschedule deployments on holidays to the next available workday.
  • Have multiple pipelines with a single source – You can trigger selected pipelines when multiple pipelines are listening to a single GitHub repository. This lets you group small and highly related but independently shipped artifacts such as small microservices without creating thousands of GitHub repos.
  • Avoid reacting to unimportant files – You can avoid triggering a pipeline when changing files that don’t affect the application functionality (such as documentation, readme, PDF, and .gitignore files).

In this post, we’re not debating the advantages or disadvantages of a monorepo versus a single repo, or when to create monorepos or single repos for each application or project.

 

Sample architecture

This post focuses on controlling running pipelines in CodePipeline. CodePipeline can have multiple stages like test, approval, and deploy. Our sample architecture considers a simple pipeline with two stages: source and build.

 

Github monorepo - CodePipeline Sample Architecture

This solution is made up of following parts:

  • An Amazon API Gateway endpoint (3) is backed by a Lambda function (5) to receive and authenticate GitHub webhook push events (2)
  • The same function evaluates incoming GitHub push events and starts the pipeline on a match
  • An Amazon Simple Storage Service (Amazon S3) bucket (4) stores the CodePipeline-specific configuration files
  • The pipeline contains a build stage with AWS CodeBuild

 

Normally, after you create a CI/CD pipeline, it automatically triggers a pipeline to release the latest version of your source code. From then on, every time you make a change in your source code, the pipeline is triggered. You can also manually run the last revision through a pipeline by choosing Release change on the CodePipeline console. This architecture uses the manual mode to run the pipeline. GitHub push events and branch changes are evaluated by the Lambda function to avoid commits that change unimportant files from starting the pipeline.

 

Creating an API Gateway endpoint

We need a single API Gateway endpoint backed by a Lambda function with the responsibility of authenticating and validating incoming requests from GitHub. You can authenticate requests using HMAC security or GitHub Apps. API Gateway only needs one POST method to consume GitHub push events, as shown in the following screenshot.

 

Creating an API Gateway endpoint

 

Creating the Lambda function

This Lambda function is responsible for authenticating and evaluating the GitHub events. As part of the evaluation process, the function can parse through the GitHub events payload, determine which files are changed, added, or deleted, and perform the appropriate action:

  • Start a single pipeline, depending on which folder is changed in GitHub
  • Start multiple pipelines
  • Ignore the changes if non-relevant files are changed

You can store the project configuration details in Amazon S3. Lambda can read this configuration to decide what needs to be done when a particular folder is matched from a GitHub event. The following code is an example configuration:

{
    "GitHubRepo": "SampleRepo",
    "GitHubBranch": "main",
    "ChangeMatchExpressions": "ProjectA/.*",
    "IgnoreFiles": "*.pdf;*.md",
    "CodePipelineName": "ProjectA - CodePipeline"
}

For more complex use cases, you can store the configuration file in Amazon DynamoDB.
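Before looking at the full handler, the matching logic itself is straightforward. The following is a minimal sketch (not part of the sample code below) showing how a handler could load the configuration shown above from S3 and decide whether a push is relevant; the bucket and key names and the helper functions are hypothetical:

import json
import re
from fnmatch import fnmatch

import boto3

s3 = boto3.client('s3')

def load_project_config(bucket, key):
    # Read the JSON configuration object (shown above) from S3
    obj = s3.get_object(Bucket=bucket, Key=key)
    return json.loads(obj['Body'].read())

def should_start_pipeline(config, changed_files):
    # Drop files that match any IgnoreFiles pattern (for example *.pdf or *.md)
    ignore_patterns = config['IgnoreFiles'].split(';')
    relevant = [f for f in changed_files
                if not any(fnmatch(f, pattern) for pattern in ignore_patterns)]
    # Start the pipeline only if a remaining file matches the project expression
    return any(re.match(config['ChangeMatchExpressions'], f) for f in relevant)

If should_start_pipeline returns True, the CodePipelineName value from the configuration identifies the pipeline to start.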

The following is the sample Lambda function code in Python 3.7 using Boto3:

import json

import boto3

cpclient = None

def codepipeline_client():
    # Create the CodePipeline client once and reuse it across invocations
    global cpclient
    if not cpclient:
        cpclient = boto3.client('codepipeline')
    return cpclient

def start_code_pipeline(pipelineName):
    client = codepipeline_client()
    client.start_pipeline_execution(name=pipelineName)
    return True

def lambda_handler(event, context):
    # Look at the files modified by the first commit in the push event
    modifiedFiles = event["commits"][0]["modified"]
    folderName = ""
    for filePath in modifiedFiles:
        # Extract the top-level folder name from the full path
        folderName = filePath[:filePath.find("/")]
        break

    # Start the pipeline for the matched project folder
    if len(folderName) > 0:
        # The CodePipeline name is foldername-job.
        # We can read the configuration from S3 as well.
        start_code_pipeline(folderName + '-job')

    return {
        'statusCode': 200,
        'body': json.dumps('Modified project in repo:' + folderName)
    }
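If you authenticate the webhook with HMAC security, the same function should verify the X-Hub-Signature-256 header that GitHub computes over the raw request body before evaluating the payload. The following is a minimal sketch; reading the shared secret from an environment variable is an assumption, and in practice you might store it in AWS Secrets Manager:

import hashlib
import hmac
import os

def is_valid_signature(request_body, signature_header):
    # GitHub sends 'sha256=<hex digest>' computed over the raw request body
    secret = os.environ['GITHUB_WEBHOOK_SECRET'].encode('utf-8')
    expected = 'sha256=' + hmac.new(secret, request_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking timing information
    return hmac.compare_digest(expected, signature_header)

In the handler, this function would be called with the raw request body bytes and the X-Hub-Signature-256 header before any further processing.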
   

Creating a GitHub webhook

GitHub provides webhooks to allow external services to be notified on certain events. For this use case, we create a webhook for a push event. This generates a POST request to the URL (API Gateway URL) specified for any files committed and pushed to the repository. The following screenshot shows our webhook configuration.

Creating a GitHub webhook

Conclusion

In our sample architecture, two pipelines monitor the same GitHub source code repository. A Lambda function decides which pipeline to run based on the GitHub events. The same function can have logic to ignore unimportant files, for example any readme or PDF files.

Using API Gateway, Lambda, and Amazon S3 in combination serves as a general example to introduce custom logic to invoke pipelines. You can expand this solution for increasingly complex processing logic.

 

About the Author

Vivek Kumar

Vivek is a Solutions Architect at AWS based out of New York. He works with customers providing technical assistance and architectural guidance on various AWS services. He brings more than 23 years of experience in software engineering and architecture roles for various large-scale enterprises.

 

 

Gaurav Sharma

Gaurav is a Solutions Architect at AWS. He works with digital native business customers providing architectural guidance on AWS services.

Nitin Aggarwal

Nitin is a Solutions Architect at AWS. He works with digital native business customers providing architectural guidance on AWS services.

Integrating AWS Device Farm with your CI/CD pipeline to run cross-browser Selenium tests

Post Syndicated from Mahesh Biradar original https://aws.amazon.com/blogs/devops/integrating-aws-device-farm-with-ci-cd-pipeline-to-run-cross-browser-selenium-tests/

Continuously building, testing, and deploying your web application helps you release new features sooner and with fewer bugs. In this blog, you will create a continuous integration and continuous delivery (CI/CD) pipeline for a web app using AWS CodeStar services and AWS Device Farm’s desktop browser testing service.  AWS CodeStar is a suite of services that help you quickly develop and build your web apps on AWS.

AWS Device Farm’s desktop browser testing service helps developers improve the quality of their web apps by executing their Selenium tests on different desktop browsers hosted in AWS. For each test executed on the service, Device Farm generates action logs, web driver logs, and video recordings to help you quickly identify issues with your app. The service offers pay-as-you-go pricing, so you only pay for the time your tests are running on the browsers, with no upfront commitments or additional costs. Furthermore, Device Farm provides a default concurrency of 50 Selenium sessions, so you can run your tests in parallel and speed up the execution of your test suites.

After you’ve completed the steps in this blog, you’ll have a working pipeline that will build your web app on every code commit and test it on different versions of desktop browsers hosted in AWS.

Solution overview

In this solution, you use AWS CodeStar to create a sample web application and a CI/CD pipeline. The following diagram illustrates our solution architecture.

Figure 1: Deployment pipeline architecture

Prerequisites

Before you deploy the solution, complete the following prerequisite steps:

  1. On the AWS Management Console, search for AWS CodeStar.
  2. Choose Getting Started.
  3. Choose Create Project.
  4. Choose a sample project.

For this post, we choose a Python web app deployed on Amazon Elastic Compute Cloud (Amazon EC2) servers.

  1. For Templates, select Python (Django).
  2. Choose Next.
Figure 2: Choose a template in CodeStar

  1. For Project name, enter Demo CICD Selenium.
  2. Leave the remaining settings at their default.

You can choose from a list of existing key pairs. If you don’t have a key pair, you can create one.

  1. Choose Next.
  2. Choose Create project.
Figure 3: Verify the project configuration

AWS CodeStar creates project resources for you as listed in the following table.

Service resource | Created
AWS CodePipeline project | demo-cicd-selen-Pipeline
AWS CodeCommit repository | Demo-CICD-Selenium
AWS CodeBuild project | demo-cicd-selen
AWS CodeDeploy application | demo-cicd-selen
AWS CodeDeploy deployment group | demo-cicd-selen-Env
Amazon EC2 server | Tag: Environment = demo-cicd-selen-WebApp
IAM role | AWSCodeStarServiceRole, CodeStarWorker*

Your EC2 instance needs access to AWS Device Farm in order to run the Selenium test scripts. You can use service roles to achieve that.

  1. Attach the AWSDeviceFarmFullAccess policy to the IAM role CodeStarWorker-demo-cicd-selen-WebApp (a scripted alternative is shown after this step).
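If you prefer to script this step instead of using the IAM console, a boto3 call along the following lines should work; the role name is the one created by AWS CodeStar in the table above, and AWSDeviceFarmFullAccess is the AWS managed policy:

import boto3

iam = boto3.client('iam')

# Allow the CodeStar worker role on the EC2 instance to call AWS Device Farm
iam.attach_role_policy(
    RoleName='CodeStarWorker-demo-cicd-selen-WebApp',
    PolicyArn='arn:aws:iam::aws:policy/AWSDeviceFarmFullAccess'
)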

You’re now ready to create an AWS Cloud9 environment.

Check the Pipelines tab of your AWS CodeStar project; it should show success.

  1. On the IDE tab, under Cloud9 environments, choose Create environment.
  2. For Environment name, enter Demo-CICD-Selenium.
  3. Choose Create environment.
  4. Wait for the environment to be complete and then choose Open IDE.
  5. In the IDE, follow the instructions to set up your Git and make sure it’s up to date with your repo.

The following screenshot shows the verification that the environment is set up.

Figure 4: Verify AWS Cloud9 setup

You can now verify the deployment.

  1. On the Amazon EC2 console, choose Instances.
  2. Choose demo-cicd-selen-WebApp.
  3. Locate its public IP and choose open address.
Figure 5: Locating IP address of the instance

 

A webpage should open. If it doesn’t, check your VPN and firewall settings.

Now that you have a working pipeline, let’s move on to creating a Device Farm project for browser testing.

Creating a Device Farm project

To create your browser testing project, complete the following steps:

  1. On the Device Farm console, choose Desktop browser testing project.
  2. Choose Create a new project.
  3. For Project name, enter Demo cicd selenium.
  4. Choose Create project.
Figure 6: Creating AWS Device Farm project

 

Note down the project ARN; we use this in the Selenium script for the remote web driver.

Figure 7: Project ARN for AWS Device Farm project

 

Testing the solution

This solution uses the following script to run browser testing. We call this script in the validate service lifecycle hook of the CodeDeploy project.

  1. Open your AWS Cloud9 IDE (which you made as a prerequisite).
  2. Create a folder tests under the project root directory.
  3. Add the sample Selenium script browser-test-sel.py under the tests folder with the following content (replace <your project ARN> with the Device Farm project ARN you noted earlier, and <sample_url> with the URL of your web application, which you located in the prerequisite steps):
import boto3
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

devicefarm_client = boto3.client("devicefarm", region_name="us-west-2")
testgrid_url_response = devicefarm_client.create_test_grid_url(
    projectArn="arn:aws:devicefarm:us-west-2:<your project ARN>",
    expiresInSeconds=300)

driver = webdriver.Remote(testgrid_url_response["url"],
                          webdriver.DesiredCapabilities.FIREFOX)
try:
    driver.implicitly_wait(30)
    driver.maximize_window()
    driver.get("<sample_url>")
    if driver.find_element_by_id("Layer_1"):
        print("graphics generated in full screen")
    assert driver.find_element_by_id("Layer_1")
    driver.set_window_position(0, 0)
    driver.set_window_size(1000, 400)
    driver.get("<sample_url>")
    tower = driver.find_element_by_id("tower1")
    if tower.is_displayed():
        print("graphics generated after resizing")
    else:
        print("graphics not generated at this window size")
       # this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate
except Exception as e:
    print(e)
finally:
    driver.quit()

This script launches, in Firefox, the website you created in the prerequisite steps using the AWS CodeStar template and verifies that a graphic element is loaded.

The following screenshot shows a full-screen application with graphics loaded.

Figure 8: Full screen application with graphics loaded

 

The following screenshot shows the resized window with no graphics loaded.

Figure 9: Resized window application with No graphics loaded

  1. Create the file validate_service in the scripts folder under the root directory with the following content:
#!/bin/bash
if [ "$DEPLOYMENT_GROUP_NAME" == "demo-cicd-selen-Env" ]
then
cd /home/ec2-user
source environment/bin/activate
python tests/browser-test-sel.py
fi

This script is one of the CodeDeploy lifecycle scripts; based on the exit status of the browser testing script from the preceding step, it determines whether the deployment, and therefore the pipeline, stops or continues.

  1. Modify the file appspec.yml under the root directory to add the tests files and the ValidateService hook. The file should look like the following:
version: 0.0
os: linux
files:
 - source: /ec2django/
   destination: /home/ec2-user/ec2django
 - source: /helloworld/
   destination: /home/ec2-user/helloworld
 - source: /manage.py
   destination: /home/ec2-user
 - source: /supervisord.conf
   destination: /home/ec2-user
 - source: /requirements.txt
   destination: /home/ec2-user
 - source: /requirements/
   destination: /home/ec2-user/requirements
 - source: /tests/
   destination: /home/ec2-user/tests

permissions:
  - object: /home/ec2-user/manage.py
    owner: ec2-user
    mode: 644
    type:
      - file
  - object: /home/ec2-user/supervisord.conf
    owner: ec2-user
    mode: 644
    type:
      - file
hooks:
  AfterInstall:
    - location: scripts/install_dependencies
      timeout: 300
      runas: root
    - location: scripts/codestar_remote_access
      timeout: 300
      runas: root
    - location: scripts/start_server
      timeout: 300
      runas: root

  ApplicationStop:
    - location: scripts/stop_server
      timeout: 300
      runas: root

  ValidateService:
    - location: scripts/validate_service
      timeout: 600
      runas: root

This file is used by the AWS CodeDeploy service to perform the deployment and validation steps.

  1. Modify the artifacts section in the buildspec.yml file. The section should look like the following:
artifacts:
  files:
    - 'template.yml'
    - 'ec2django/**/*'
    - 'helloworld/**/*'
    - 'scripts/**/*'
    - 'tests/**/*'
    - 'appspec.yml'
    - 'manage.py'
    - 'requirements/**/*'
    - 'requirements.txt'
    - 'supervisord.conf'
    - 'template-configuration.json'

This file is used by the AWS CodeBuild service to package the code.

  1. Modify the file common.txt in the requirements folder under the root directory. The file should look like the following:
# dependencies common to all environments
Django==2.1.15
selenium==3.141.0
boto3>=1.10.44
pytest

  1. Save all the changes. Your folder structure should look like the following:
├── README.md
├── appspec.yml*
├── buildspec.yml*
├── db.sqlite3
├── ec2django
├── helloworld
├── manage.py
├── requirements
│   ├── common.txt*
│   ├── dev.txt
│   ├── prod.txt
├── requirements.txt
├── scripts
│   ├── codestar_remote_access
│   ├── install_dependencies
│   ├── start_server
│   ├── stop_server
│   └── validate_service**
├── supervisord.conf
├── template-configuration.json
├── template.yml
├── tests
│   └── browser-test-sel.py**

**newly added files
*modified file

Running the tests

The Device Farm desktop browsing project is now integrated with your pipeline. All you need to do now is commit the code, and CodePipeline takes care of the rest.

  1. On the Cloud9 terminal, go to the project root directory.
  2. Run git add . to stage all changed files for commit.
  3. Run git commit -m "<commit message>" to commit the changes.
  4. Run git push to push the changes to the repository; this should trigger the pipeline.
  5. Check the Pipelines tab of your AWS CodeStar project; it should show success.
  6. On the AWS Device Farm console, choose Desktop browser testing projects.
  7. Choose your project, Demo cicd selenium.

You can verify the running of your Selenium test cases using the recorded run steps shown on the Device Farm console, the video of the run, and the logs, all of which can be downloaded on the Device Farm console, or using the AWS SDK and AWS Command Line Interface (AWS CLI).
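As a sketch of the SDK route, the following boto3 snippet lists the most recent Selenium sessions for the project; the project ARN placeholder is the value you noted when creating the Device Farm project:

import boto3

# Device Farm desktop browser testing is hosted in us-west-2
devicefarm = boto3.client('devicefarm', region_name='us-west-2')

response = devicefarm.list_test_grid_sessions(
    projectArn='arn:aws:devicefarm:us-west-2:<your project ARN>',
    maxResult=5)

# Print the ARN and status (ACTIVE, CLOSED, or ERRORED) of each session
for session in response['testGridSessions']:
    print(session['arn'], session['status'])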

The following screenshot shows the project run details on the console.

Figure 10: Viewing AWS Device Farm project run details

 

To test the Selenium script locally, you can run the following commands.

1. Create a Python virtual environment for your Django project. This virtual environment allows you to isolate this project and install any packages you need without affecting the system Python installation. At the terminal, go to project root directory and type the following command:

$ python3 -m venv ./venv

2. Activate the virtual environment:

$ source ./venv/bin/activate

3. Install development Python dependencies for this project:

$ pip install -r requirements/dev.txt

4. Run Selenium Script:

$ python tests/browser-test-sel.py

Testing the failure scenario (optional)

To test the failure scenario, you can modify the sample script browser-test-sel.py at the else statement.

The following code shows the lines to change:

else:
    print("graphics not generated at this window size")
    # this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate

The following is the updated code:

else:
    exit(1)
    # this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate

Commit the change, and the pipeline should fail and stop the deployment.

Conclusion

Integrating Device Farm with CI/CD pipelines allows you to control deployment based on browser testing results. Failing the Selenium test on validation failures can stop and roll back the deployment, and successful testing allows the pipeline to continue and deploy the solution to the final stage. Device Farm offers a one-stop solution for testing your native and web applications on desktop browsers and real mobile devices.

Mahesh Biradar is a Solutions Architect at AWS. He is a DevOps enthusiast and enjoys helping customers implement cost-effective architectures that scale.

Using AWS DevOps Tools to model and provision AWS Glue workflows

Post Syndicated from Nuatu Tseggai original https://aws.amazon.com/blogs/devops/provision-codepipeline-glue-workflows/

This post provides a step-by-step guide on how to model and provision AWS Glue workflows utilizing a DevOps principle known as infrastructure as code (IaC) that emphasizes the use of templates, source control, and automation. The cloud resources in this solution are defined within AWS CloudFormation templates and provisioned with automation features provided by AWS CodePipeline and AWS CodeBuild. These AWS DevOps tools are flexible, interchangeable, and well suited for automating the deployment of AWS Glue workflows into different environments such as dev, test, and production, which typically reside in separate AWS accounts and Regions.

AWS Glue workflows allow you to manage dependencies between multiple components that interoperate within an end-to-end ETL data pipeline by grouping together a set of related jobs, crawlers, and triggers into one logical run unit. Many customers using AWS Glue workflows start by defining the pipeline using the AWS Management Console and then move on to monitoring and troubleshooting using either the console, AWS APIs, or the AWS Command Line Interface (AWS CLI).
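For example, once a workflow exists, a few lines of boto3 are enough to check its most recent runs programmatically. The following is a minimal sketch that assumes the workflow name Covid_19 used later in this post:

import boto3

glue = boto3.client('glue')

# Fetch the most recent runs of the workflow
response = glue.get_workflow_runs(Name='Covid_19', MaxResults=5)

for run in response['Runs']:
    stats = run['Statistics']
    print(run['WorkflowRunId'], run['Status'],
          'succeeded:', stats['SucceededActions'],
          'failed:', stats['FailedActions'])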

Solution overview

The solution uses COVID-19 datasets. For more information on these datasets, see the public data lake for analysis of COVID-19 data, which contains a centralized repository of freely available and up-to-date curated datasets made available by the AWS Data Lake team.

Because the primary focus of this solution showcases how to model and provision AWS Glue workflows using AWS CloudFormation and CodePipeline, we don’t spend much time describing intricate transform capabilities that can be performed in AWS Glue jobs. As shown in the Python scripts, the business logic is optimized for readability and extensibility so you can easily home in on the functions that aggregate data based on monthly and quarterly time periods.

The ETL pipeline reads the source COVID-19 datasets directly and writes only the aggregated data to your S3 bucket.
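To give a sense of what those aggregation functions do, the following is a simplified, hypothetical PySpark sketch of a monthly roll-up; the input path, column names, and output prefix are illustrative, and this is not the repository’s actual job code:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("covid19-aggregates").getOrCreate()

# Read a curated world cases/deaths dataset (illustrative location)
df = spark.read.json("s3://covid19-lake/rearc-covid-19-world-cases-deaths-testing/json/")

# Roll the daily figures up to monthly totals
monthly = (df
    .withColumn("month", F.date_trunc("month", F.to_date("date")))
    .groupBy("month")
    .agg(F.sum("cases").alias("total_cases"),
         F.sum("deaths").alias("total_deaths")))

# Write only the aggregated output to your bucket
monthly.write.mode("overwrite").parquet(
    "s3://<your-S3-bucket-name>/covid19/world-cases-deaths-aggregates/monthly/")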

The solution exposes the datasets in the following tables:

Table Name | Description | Dataset location | Provider
countrycode | Lookup table for country codes | s3://covid19-lake/static-datasets/csv/countrycode/ | Rearc
countypopulation | Lookup table for the population of each county | s3://covid19-lake/static-datasets/csv/CountyPopulation/ | Rearc
state_abv | Lookup table for US state abbreviations | s3://covid19-lake/static-datasets/json/state-abv/ | Rearc
rearc_covid_19_nyt_data_in_usa_us_counties | Data on COVID-19 cases at US county level | s3://covid19-lake/rearc-covid-19-nyt-data-in-usa/csv/us-counties/ | Rearc
rearc_covid_19_nyt_data_in_usa_us_states | Data on COVID-19 cases at US state level | s3://covid19-lake/rearc-covid-19-nyt-data-in-usa/csv/us-states/ | Rearc
rearc_covid_19_testing_data_states_daily | Data on COVID-19 cases at US state level | s3://covid19-lake/rearc-covid-19-testing-data/csv/states_daily/ | Rearc
rearc_covid_19_testing_data_us_daily | US total test daily trend | s3://covid19-lake/rearc-covid-19-testing-data/csv/us_daily/ | Rearc
rearc_covid_19_testing_data_us_total_latest | US total tests | s3://covid19-lake/rearc-covid-19-testing-data/csv/us-total-latest/ | Rearc
rearc_covid_19_world_cases_deaths_testing | World total tests | s3://covid19-lake/rearc-covid-19-world-cases-deaths-testing/ | Rearc
rearc_usa_hospital_beds | Hospital beds and their utilization in the US | s3://covid19-lake/rearc-usa-hospital-beds/ | Rearc
world_cases_deaths_aggregates | Monthly and quarterly aggregate of the world | s3://<your-S3-bucket-name>/covid19/world-cases-deaths-aggregates/ | Aggregate

Prerequisites

This post assumes you have the following:

  • Access to an AWS account
  • The AWS CLI (optional)
  • Permissions to create a CloudFormation stack
  • Permissions to create AWS resources, such as AWS Identity and Access Management (IAM) roles, Amazon Simple Storage Service (Amazon S3) buckets, and various other resources
  • General familiarity with AWS Glue resources (triggers, crawlers, and jobs)

Architecture

The CloudFormation template glue-workflow-stack.yml defines all the AWS Glue resources shown in the following diagram.

Figure: AWS Glue workflow architecture diagram

Modeling the AWS Glue workflow using AWS CloudFormation

Let’s start by exploring the template used to model the AWS Glue workflow: glue-workflow-stack.yml

We focus on two resources in the following snippet:

  • AWS::Glue::Workflow
  • AWS::Glue::Trigger

From a logical perspective, a workflow contains one or more triggers that are responsible for invoking crawlers and jobs. Building a workflow starts with defining the crawlers and jobs as resources within the template and then associating it with triggers.

Defining the workflow

This is where the definition of the workflow starts. In the following snippet, we specify the type as AWS::Glue::Workflow and the property Name as a reference to the parameter GlueWorkflowName.

Parameters:
  GlueWorkflowName:
    Type: String
    Description: Glue workflow that tracks all triggers, jobs, crawlers as a single entity
    Default: Covid_19

Resources:
  Covid19Workflow:
    Type: AWS::Glue::Workflow
    Properties: 
      Description: Glue workflow that tracks specified triggers, jobs, and crawlers as a single entity
      Name: !Ref GlueWorkflowName

Defining the triggers

This is where we define each trigger and associate it with the workflow. In the following snippet, we specify the property WorkflowName on each trigger as a reference to the logical ID Covid19Workflow.

These triggers allow us to create a chain of dependent jobs and crawlers as specified by the properties Actions and Predicate.

The trigger t_Start utilizes a type of SCHEDULED, which means that it starts at a defined time (in our case, one time a day at 8:00 AM UTC). Every time it runs, it starts the job with the logical ID Covid19WorkflowStarted.

The trigger t_GroupA utilizes a type of CONDITIONAL, which means that it starts when the resources specified within the property Predicate have reached a specific state (when the list of Conditions specified equals SUCCEEDED). Every time t_GroupA runs, it starts the crawlers with the logical ID’s CountyPopulation and Countrycode, per the Actions property containing a list of actions.

  TriggerJobCovid19WorkflowStart:
    Type: AWS::Glue::Trigger
    Properties:
      Name: t_Start
      Type: SCHEDULED
      Schedule: cron(0 8 * * ? *) # Runs once a day at 8 AM UTC
      StartOnCreation: true
      WorkflowName: !Ref GlueWorkflowName
      Actions:
        - JobName: !Ref Covid19WorkflowStarted

  TriggerCrawlersGroupA:
    Type: AWS::Glue::Trigger
    Properties:
      Name: t_GroupA
      Type: CONDITIONAL
      StartOnCreation: true
      WorkflowName: !Ref GlueWorkflowName
      Actions:
        - CrawlerName: !Ref CountyPopulation
        - CrawlerName: !Ref Countrycode
      Predicate:
        Conditions:
          - JobName: !Ref Covid19WorkflowStarted
            LogicalOperator: EQUALS
            State: SUCCEEDED

Provisioning the AWS Glue workflow using CodePipeline

Now let’s explore the template used to provision the CodePipeline resources: codepipeline-stack.yml

This template defines an S3 bucket that is used as the source action for the pipeline. Any time source code is uploaded to a specified bucket, AWS CloudTrail logs the event, which is detected by an Amazon CloudWatch Events rule configured to start running the pipeline in CodePipeline. The pipeline orchestrates CodeBuild to get the source code and provision the workflow.

For more information on any of the available source actions that you can use with CodePipeline, such as Amazon S3, AWS CodeCommit, Amazon Elastic Container Registry (Amazon ECR), GitHub, GitHub Enterprise Server, GitHub Enterprise Cloud, or Bitbucket, see Start a pipeline execution in CodePipeline.

We start by deploying the stack that sets up the CodePipeline resources. This stack can be deployed in any Region where CodePipeline and AWS Glue are available. For more information, see AWS Regional Services.

Cloning the GitHub repo

Clone the GitHub repo with the following command:

$ git clone https://github.com/aws-samples/provision-codepipeline-glue-workflows.git

Deploying the CodePipeline stack

Deploy the CodePipeline stack with the following command:

$ aws cloudformation deploy \
--stack-name codepipeline-covid19 \
--template-file cloudformation/codepipeline-stack.yml \
--capabilities CAPABILITY_NAMED_IAM \
--no-fail-on-empty-changeset \
--region <AWS_REGION>

When the deployment is complete, you can view the pipeline that was provisioned on the CodePipeline console.

Figure: CodePipeline console

The preceding screenshot shows that the pipeline failed. This is because we haven’t uploaded the source code yet.

In the following steps, we zip and upload the source code, which triggers another (successful) run of the pipeline.

Zipping the source code

Zip the source code containing Glue scripts, CloudFormation templates, and Buildspecs file with the following command:

$ zip -r source.zip . -x images/\* *.history* *.git* *.DS_Store*

You can omit *.DS_Store* from the preceding command if you are not a Mac user.

Uploading the source code

Upload the source code with the following command:

$ aws s3 cp source.zip s3://covid19-codepipeline-source-<AWS_ACCOUNT_ID>-<AWS_REGION>

Make sure to provide your account ID and Region in the preceding command. For example, if your AWS account ID is 111111111111 and you’re using Region us-west-2, use the following command:

$ aws s3 cp source.zip s3://covid19-codepipeline-source-111111111111-us-west-2

Now that the source code has been uploaded, view the pipeline again to see it in action.

Figure: CodePipeline console displaying stage “Deploy” in-progress

Choose Details within the Deploy stage to see the build logs.

Figure: CodeBuild console displaying build logs

To modify any of the commands that run within the Deploy stage, feel free to modify: deploy-glue-workflow-stack.yml

Try uploading the source code a few more times. Each time it’s uploaded, CodePipeline starts and runs another deploy of the workflow stack. If nothing has changed in the source code, AWS CloudFormation automatically determines that the stack is already up to date. If something has changed in the source code, AWS CloudFormation automatically determines that the stack needs to be updated and proceeds to run the change set.

Viewing the provisioned workflow, triggers, jobs, and crawlers

To view your workflows on the AWS Glue console, in the navigation pane, under ETL, choose Workflows.

Figure: Navigate to Workflows

To view your triggers, in the navigation pane, under ETL, choose Triggers.

Figure: Navigate to Triggers

To view your crawlers, under Data Catalog, choose Crawlers.

Figure: Navigate to Crawlers

To view your jobs, under ETL, choose Jobs.

Figure: Navigate to Jobs

Running the workflow

The workflow runs automatically at 8:00 AM UTC. To start the workflow manually, you can use either the AWS CLI or the AWS Glue console.

To start the workflow with the AWS CLI, enter the following command:

$ aws glue start-workflow-run --name Covid_19 --region <AWS_REGION>

To start the workflow on the AWS Glue console, on the Workflows page, select your workflow and choose Run on the Actions menu.

Figure: AWS Glue console start workflow run

To view the run details of the workflow, choose the workflow on the AWS Glue console and choose View run details on the History tab.

Figure: View run details

The following screenshot shows a visual representation of the workflow as a graph with your run details.

Figure: AWS Glue console displaying details of successful workflow run

Cleaning up

To avoid additional charges, delete the stack created by the CloudFormation template and the contents of the buckets you created.

1. Delete the contents of the covid19-dataset bucket with the following command:

$ aws s3 rm s3://covid19-dataset-<AWS_ACCOUNT_ID>-<AWS_REGION> --recursive

2. Delete your workflow stack with the following command:

$ aws cloudformation delete-stack --stack-name glue-covid19 --region <AWS_REGION>

To delete the contents of the covid19-codepipeline-source bucket, it’s simplest to use the Amazon S3 console because it makes it easy to delete multiple versions of the object at once.

3. Navigate to the S3 bucket named covid19-codepipeline-source-<AWS_ACCOUNT_ID>-<AWS_REGION>.

4. Choose List versions.

5. Select all the files to delete.

6. Choose Delete and follow the prompts to permanently delete all the objects.

Figure: AWS S3 console delete all object versions

7. Delete the contents of the covid19-codepipeline-artifacts bucket:

$ aws s3 rm s3://covid19-codepipeline-artifacts-<AWS_ACCOUNT_ID>-<AWS_REGION> --recursive

8. Delete the contents of the covid19-cloudtrail-logs bucket:

$ aws s3 rm s3://covid19-cloudtrail-logs-<AWS_ACCOUNT_ID>-<AWS_REGION> --recursive

9. Delete the pipeline stack:

$ aws cloudformation delete-stack --stack-name codepipeline-covid19 --region <AWS_REGION>

Conclusion

In this post, we stepped through how to use AWS DevOps tooling to model and provision an AWS Glue workflow that orchestrates an end-to-end ETL pipeline on a real-world dataset.

You can download the source code and templates from this GitHub repository and adapt them as you see fit for your data pipeline use cases. Feel free to leave comments letting us know about the architectures you build for your environment. To learn more about building ETL pipelines with AWS Glue, see the AWS Glue Developer Guide and the AWS Data Analytics learning path.

About the Authors

Nuatu Tseggai

Nuatu Tseggai is a Cloud Infrastructure Architect at Amazon Web Services. He enjoys working with customers to design and build event-driven distributed systems that span multiple services.

Suvojit Dasgupta

Suvojit Dasgupta is a Sr. Customer Data Architect at Amazon Web Services. He works with customers to design and build complex data solutions on AWS.

Building end-to-end AWS DevSecOps CI/CD pipeline with open source SCA, SAST and DAST tools

Post Syndicated from Srinivas Manepalli original https://aws.amazon.com/blogs/devops/building-end-to-end-aws-devsecops-ci-cd-pipeline-with-open-source-sca-sast-and-dast-tools/

DevOps is a combination of cultural philosophies, practices, and tools that combine software development with information technology operations. These combined practices enable companies to deliver new application features and improved services to customers at a higher velocity. DevSecOps takes this a step further, integrating security into DevOps. With DevSecOps, you can deliver secure and compliant application changes rapidly while running operations consistently with automation.

Having a complete DevSecOps pipeline is critical to building a successful software factory, which includes continuous integration (CI), continuous delivery and deployment (CD), continuous testing, continuous logging and monitoring, auditing and governance, and operations. Identifying the vulnerabilities during the initial stages of the software development process can significantly help reduce the overall cost of developing application changes, but doing it in an automated fashion can accelerate the delivery of these changes as well.

To identify security vulnerabilities at various stages, organizations can integrate various tools and services (cloud and third-party) into their DevSecOps pipelines. Integrating various tools and aggregating the vulnerability findings can be a challenge to do from scratch. AWS has the services and tools necessary to accelerate this objective and provides the flexibility to build DevSecOps pipelines with easy integrations of AWS cloud native and third-party tools. AWS also provides services to aggregate security findings.

In this post, we provide a DevSecOps pipeline reference architecture on AWS that covers the aforementioned practices, including SCA (Software Composition Analysis), SAST (Static Application Security Testing), DAST (Dynamic Application Security Testing), and aggregation of vulnerability findings into a single pane of glass. Additionally, this post addresses the concepts of security of the pipeline and security in the pipeline.

You can deploy this pipeline in either the AWS GovCloud (US) Region or standard AWS Regions. As of this writing, all listed AWS services are available in AWS GovCloud (US) and authorized for FedRAMP High workloads within the Region, with the exception of AWS CodePipeline and AWS Security Hub, which are available in the Region and currently under JAB review, expected to be authorized for FedRAMP High shortly.

Services and tools

In this section, we discuss the various AWS services and third-party tools used in this solution.

CI/CD services

For CI/CD, we use the following AWS services:

  • AWS CodeBuild – A fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy.
  • AWS CodeCommit – A fully managed source control service that hosts secure Git-based repositories.
  • AWS CodeDeploy – A fully managed deployment service that automates software deployments to a variety of compute services such as Amazon Elastic Compute Cloud (Amazon EC2), AWS Fargate, AWS Lambda, and your on-premises servers.
  • AWS CodePipeline – A fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.
  • AWS Lambda – A service that lets you run code without provisioning or managing servers. You pay only for the compute time you consume.
  • Amazon Simple Notification Service – Amazon SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.
  • Amazon Simple Storage Service – Amazon S3 is storage for the internet. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web.
  • AWS Systems Manager Parameter Store – Provides secure, hierarchical storage for configuration data management and secrets management.

Continuous testing tools

The following open-source scanning tools are integrated into the pipeline for the purposes of this post, but you could integrate other tools that meet your specific requirements. You could also use Amazon CodeGuru for static code analysis; at the time of this writing, it's available in preview, supports only Java and Python, and isn't yet available in GovCloud.

  • OWASP Dependency-Check – A Software Composition Analysis (SCA) tool that attempts to detect publicly disclosed vulnerabilities contained within a project’s dependencies.
  • SonarQube (SAST) – Catches bugs and vulnerabilities in your app, with thousands of automated Static Code Analysis rules.
  • PHPStan (SAST) – Focuses on finding errors in your code without actually running it. It catches whole classes of bugs even before you write tests for the code.
  • OWASP Zap (DAST) – Helps you automatically find security vulnerabilities in your web applications while you’re developing and testing your applications.

Continuous logging and monitoring services

The following are AWS services for continuous logging and monitoring:

Auditing and governance services

The following are AWS auditing and governance services:

  • AWS CloudTrail – Enables governance, compliance, operational auditing, and risk auditing of your AWS account.
  • AWS Identity and Access Management – Enables you to manage access to AWS services and resources securely. With IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources.
  • AWS Config – Allows you to assess, audit, and evaluate the configurations of your AWS resources.

Operations services

The following are AWS operations services:

  • AWS Security Hub – Gives you a comprehensive view of your security alerts and security posture across your AWS accounts. This post uses Security Hub to aggregate all the vulnerability findings as a single pane of glass.
  • AWS CloudFormation – Gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles, by treating infrastructure as code.
  • AWS Systems Manager Parameter Store – Provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, Amazon Machine Image (AMI) IDs, and license codes as parameter values.
  • AWS Elastic Beanstalk – An easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS. This post uses Elastic Beanstalk to deploy LAMP stack with WordPress and Amazon Aurora MySQL. Although we use Elastic Beanstalk for this post, you could configure the pipeline to deploy to various other environments on AWS or elsewhere as needed.

Pipeline architecture

The following diagram shows the architecture of the solution.

AWS DevSecOps CICD pipeline architecture

AWS DevSecOps CICD pipeline architecture

 

The main steps are as follows:

  1. When a user commits the code to a CodeCommit repository, a CloudWatch event is generated, which triggers CodePipeline.
  2. CodeBuild packages the build and uploads the artifacts to an S3 bucket. CodeBuild retrieves the authentication information (for example, scanning tool tokens) from Parameter Store to initiate the scanning. As a best practice, use an artifact repository such as AWS CodeArtifact to store the artifacts instead of Amazon S3; for simplicity in this post, we continue to use S3.
  3. CodeBuild scans the code with an SCA tool (OWASP Dependency-Check) and SAST tool (SonarQube or PHPStan; in the provided CloudFormation template, you can pick one of these tools during the deployment, but CodeBuild is fully enabled for a bring your own tool approach).
  4. If there are any vulnerabilities either from the SCA analysis or the SAST analysis, CodeBuild invokes the Lambda function. The function parses the results into AWS Security Finding Format (ASFF) and posts them to Security Hub. Security Hub helps aggregate and view all the vulnerability findings in one place as a single pane of glass. The Lambda function also uploads the scanning results to an S3 bucket. (A minimal ASFF posting sketch follows this list.)
  5. If there are no vulnerabilities, CodeDeploy deploys the code to the staging Elastic Beanstalk environment.
  6. After the deployment succeeds, CodeBuild triggers the DAST scanning with the OWASP ZAP tool (again, this is fully enabled for a bring your own tool approach).
  7. If there are any vulnerabilities, CodeBuild invokes the Lambda function, which parses the results into ASFF and posts it to Security Hub. The function also uploads the scanning results to an S3 bucket (similar to step 4).
  8. If there are no vulnerabilities, the approval stage is triggered, and an email is sent to the approver for action.
  9. After approval, CodeDeploy deploys the code to the production Elastic Beanstalk environment.
  10. During the pipeline run, CloudWatch Events captures the build state changes and sends email notifications to subscribed users through Amazon SNS.
  11. CloudTrail tracks the API calls and sends notifications on critical pipeline and CodeBuild project events, such as UpdatePipeline, DeletePipeline, CreateProject, and DeleteProject, for auditing purposes.
  12. AWS Config tracks all the configuration changes of AWS services. The following AWS Config rules are added in this pipeline as security best practices:
    • CODEBUILD_PROJECT_ENVVAR_AWSCRED_CHECK – Checks whether the project contains the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The rule is NON_COMPLIANT when the project environment variables contain plaintext credentials.
    • CLOUD_TRAIL_LOG_FILE_VALIDATION_ENABLED – Checks whether CloudTrail creates a signed digest file with logs. AWS recommends that file validation be enabled on all trails. The rule is NON_COMPLIANT if validation is not enabled.
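
The Lambda function's parsing code appears later in this post as screenshots. As a rough illustration of steps 4 and 7 only, the following is a minimal sketch of posting a single ASFF finding to Security Hub with boto3; the account ID, Region, identifiers, and titles are placeholders, not the post's actual Lambda code.

# Minimal sketch: post one ASFF finding to Security Hub with boto3.
# Account ID, Region, and identifiers below are placeholders.
import datetime
import boto3

region = "us-east-1"
account_id = "111111111111"
now = datetime.datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")

finding = {
    "SchemaVersion": "2018-10-08",
    "Id": "dependency-check/example-finding-id",
    "ProductArn": f"arn:aws:securityhub:{region}:{account_id}:product/{account_id}/default",
    "GeneratorId": "owasp-dependency-check",
    "AwsAccountId": account_id,
    "Types": ["Software and Configuration Checks/Vulnerabilities/CVE"],
    "CreatedAt": now,
    "UpdatedAt": now,
    "Severity": {"Normalized": 70},
    "Title": "Example vulnerability reported by the SCA scan",
    "Description": "Parsed from the scanning tool report in the Lambda function.",
    "Resources": [{"Type": "Other", "Id": "example-codebuild-project"}],
}

securityhub = boto3.client("securityhub", region_name=region)
response = securityhub.batch_import_findings(Findings=[finding])
print(response["SuccessCount"], "finding(s) imported")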

Security of the pipeline is implemented by using IAM roles and S3 bucket policies to restrict access to pipeline resources. Pipeline data at rest and in transit is protected using encryption and SSL secure transport. We use Parameter Store to store sensitive information such as API tokens and passwords. To be fully compliant with frameworks such as FedRAMP, other controls may be required, such as multi-factor authentication (MFA).
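
For example, a build or Lambda step can read a scanning tool token from Parameter Store at run time instead of hard coding it. A minimal sketch follows; the parameter name is a placeholder.

# Minimal sketch: read a SecureString parameter (for example, a SAST tool token)
# from Parameter Store. The parameter name is a placeholder.
import boto3

ssm = boto3.client("ssm")
token = ssm.get_parameter(Name="/devsecops/sonarqube/token", WithDecryption=True)["Parameter"]["Value"]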

Security in the pipeline is implemented by performing the SCA, SAST and DAST security checks. Alternatively, the pipeline can utilize IAST (Interactive Application Security Testing) techniques that would combine SAST and DAST stages.

As a best practice, encryption should be enabled for the code and artifacts, whether at rest or in transit.

In the next section, we explain how to deploy and run the pipeline CloudFormation template used for this example. Refer to the provided service links to learn more about each of the services in the pipeline. If utilizing CloudFormation templates to deploy infrastructure using pipelines, we recommend using linting tools like cfn-nag to scan CloudFormation templates for security vulnerabilities.

Prerequisites

Before getting started, make sure you have the following prerequisites:

Deploying the pipeline

To deploy the pipeline, first download the CloudFormation template and pipeline code from the GitHub repo, then complete the following steps:

  1. Log in to your AWS account if you have not done so already.
  2. On the CloudFormation console, choose Create Stack.
  3. Choose the CloudFormation pipeline template.
  4. Choose Next.
  5. Provide the stack parameters:
    • Under Code, provide code details, such as repository name and the branch to trigger the pipeline.
    • Under SAST, choose the SAST tool (SonarQube or PHPStan) for code analysis, enter the API token and the SAST tool URL. You can skip SonarQube details if using PHPStan as the SAST tool.
    • Under DAST, choose the DAST tool (OWASP Zap) for dynamic testing and enter the API token, DAST tool URL, and the application URL to run the scan.
    • Under Lambda functions, enter the Lambda function S3 bucket name, filename, and the handler name.
    • Under STG Elastic Beanstalk Environment and PRD Elastic Beanstalk Environment, enter the Elastic Beanstalk environment and application details for staging and production to which this pipeline deploys the application code.
    • Under General, enter the email addresses to receive notifications for approvals and pipeline status changes.

CloudFormation deployment - Passing parameter values

CloudFormation deployment - Passing parameter values

CloudFormation template deployment

After the pipeline is deployed, confirm the subscription by choosing the provided link in the email to receive the notifications.

The provided CloudFormation template in this post is formatted for AWS GovCloud. If you’re setting this up in a standard Region, you have to adjust the partition name in the CloudFormation template. For example, change ARN values from arn:aws-us-gov to arn:aws.

Running the pipeline

To trigger the pipeline, commit changes to your application repository files. That generates a CloudWatch event and triggers the pipeline. CodeBuild scans the code and if there are any vulnerabilities, it invokes the Lambda function to parse and post the results to Security Hub.

When posting the vulnerability finding information to Security Hub, we need to provide a vulnerability severity level. Based on the provided severity value, Security Hub assigns the label as follows; a minimal mapping sketch appears after the list. Adjust the severity levels in your code based on your organization's requirements.

  • 0 – INFORMATIONAL
  • 1–39 – LOW
  • 40–69 – MEDIUM
  • 70–89 – HIGH
  • 90–100 – CRITICAL
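
A minimal mapping sketch based on the thresholds above; the function name is ours, not taken from the post's Lambda code.

# Minimal sketch: map a 0-100 normalized severity to a Security Hub label,
# following the thresholds listed above.
def severity_label(normalized_severity: int) -> str:
    if normalized_severity == 0:
        return "INFORMATIONAL"
    if normalized_severity <= 39:
        return "LOW"
    if normalized_severity <= 69:
        return "MEDIUM"
    if normalized_severity <= 89:
        return "HIGH"
    return "CRITICAL"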

The following screenshot shows the progression of your pipeline.

CodePipeline stages

CodePipeline stages

SCA and SAST scanning

In our architecture, CodeBuild triggers the SCA and SAST scanning in parallel. In this section, we discuss scanning with OWASP Dependency-Check, SonarQube, and PHPStan.

Scanning with OWASP Dependency-Check (SCA)

The following is the code snippet from the Lambda function, where the SCA analysis results are parsed and posted to Security Hub. Based on the results, the equivalent Security Hub severity level (normalized_severity) is assigned.

Lambda code snippet for OWASP Dependency-check

Lambda code snippet for OWASP Dependency-check
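
The snippet appears as a screenshot in the original post. As a rough, hedged sketch of the same idea, the following parses a Dependency-Check JSON report and derives a normalized severity from the CVSS score; the report path and field names reflect the Dependency-Check JSON layout as we understand it and may need adjusting for your tool version.

# Rough sketch: parse an OWASP Dependency-Check JSON report and derive a
# Security Hub normalized severity (0-100) from the CVSS base score.
# Field names reflect the Dependency-Check JSON layout as we understand it.
import json

with open("dependency-check-report.json") as f:
    report = json.load(f)

for dependency in report.get("dependencies", []):
    for vuln in dependency.get("vulnerabilities", []):
        base_score = vuln.get("cvssv3", {}).get("baseScore") or vuln.get("cvssv2", {}).get("score", 0)
        normalized_severity = int(float(base_score) * 10)  # CVSS 0-10 mapped to ASFF 0-100
        print(dependency.get("fileName"), vuln.get("name"), normalized_severity)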

You can see the results in Security Hub, as in the following screenshot.

SecurityHub report from OWASP Dependency-check scanning

SecurityHub report from OWASP Dependency-check scanning

Scanning with SonarQube (SAST)

The following is the code snippet from the Lambda function, where the SonarQube code analysis results are parsed and posted to Security Hub. Based on SonarQube results, the equivalent Security Hub severity level (normalized_severity) is assigned.

Lambda code snippet for SonarQube

Lambda code snippet for SonarQube

The following screenshot shows the results in Security Hub.

SecurityHub report from SonarQube scanning

SecurityHub report from SonarQube scanning

Scanning with PHPStan (SAST)

The following is the code snippet from the Lambda function, where the PHPStan code analysis results are parsed and posted to Security Hub.

Lambda code snippet for PHPStan

Lambda code snippet for PHPStan

The following screenshot shows the results in Security Hub.

SecurityHub report from PHPStan scanning

SecurityHub report from PHPStan scanning

DAST scanning

In our architecture, CodeBuild triggers the DAST scanning with the DAST tool (OWASP Zap).

If there are no vulnerabilities in the SAST scan, the pipeline proceeds to the manual approval stage and an email is sent to the approver. The approver can review and approve or reject the deployment. If approved, the pipeline moves to the next stage and deploys the application to the provided Elastic Beanstalk environment.

Scanning with OWASP Zap

After deployment is successful, CodeBuild initiates the DAST scanning. When scanning is complete, if there are any vulnerabilities, it invokes the Lambda function similar to SAST analysis. The function parses and posts the results to Security Hub. The following is the code snippet of the Lambda function.

Lambda code snippet for OWASP-Zap

Lambda code snippet for OWASP-Zap

The following screenshot shows the results in Security Hub.

SecurityHub report from OWASP-Zap scanning

SecurityHub report from OWASP-Zap scanning

Aggregation of vulnerability findings in Security Hub provides opportunities to automate the remediation. For example, based on the vulnerability finding, you can trigger a Lambda function to take the needed remediation action. This also reduces the burden on operations and security teams because they can now address the vulnerabilities from a single pane of glass instead of logging into multiple tool dashboards.
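
For example, an EventBridge rule on Security Hub findings can invoke a Lambda function. The following is a minimal, illustrative handler that only inspects high-severity findings; the event shape follows the "Security Hub Findings - Imported" event, and any actual remediation logic is an assumption left to you.

# Minimal, illustrative sketch: a Lambda handler invoked by an EventBridge rule
# on "Security Hub Findings - Imported" events. It only inspects the findings;
# actual remediation logic is left as an exercise.
def lambda_handler(event, context):
    findings = event.get("detail", {}).get("findings", [])
    for finding in findings:
        severity = finding.get("Severity", {}).get("Label", "UNKNOWN")
        if severity in ("HIGH", "CRITICAL"):
            print(f"Would remediate: {finding.get('Title')} ({severity})")
    return {"processed": len(findings)}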

Conclusion

In this post, I presented a DevSecOps pipeline that includes CI/CD, continuous testing, continuous logging and monitoring, auditing and governance, and operations. I demonstrated how to integrate various open-source scanning tools, such as SonarQube, PHPStan, and OWASP Zap for SAST and DAST analysis. I explained how to aggregate vulnerability findings in Security Hub as a single pane of glass. This post also talked about how to implement security of the pipeline and in the pipeline using AWS cloud native services. Finally, I provided the DevSecOps pipeline as code using AWS CloudFormation. For additional information on AWS DevOps services and to get started, see AWS DevOps and DevOps Blog.

 

Srinivas Manepalli is a DevSecOps Solutions Architect in the U.S. Fed SI SA team at Amazon Web Services (AWS). He is passionate about helping customers, building and architecting DevSecOps and highly available software systems. Outside of work, he enjoys spending time with family, nature and good food.

Automate thousands of mainframe tests on AWS with the Micro Focus Enterprise Suite

Post Syndicated from Kevin Yung original https://aws.amazon.com/blogs/devops/automate-mainframe-tests-on-aws-with-micro-focus/

Micro Focus, an AWS Advanced Technology Partner, is a global infrastructure software company with 40 years of experience in delivering and supporting enterprise software.

We have seen mainframe customers encounter scalability constraints: they can't support their development and test workforce at the scale required to meet business requirements. These constraints can lead to delays, fewer product or feature releases, and an inability to respond to market requirements. Furthermore, limits in capacity and scale often affect the quality of the changes deployed and are linked to unplanned or unexpected downtime in products or services.

The conventional approach to address these constraints is to scale up, meaning to increase MIPS/MSU capacity of the mainframe hardware available for development and testing. The cost of this approach, however, is excessively high, and to ensure time to market, you may reject this approach at the expense of quality and functionality. If you’re wrestling with these challenges, this post is written specifically for you.

To accompany this post, we developed an AWS prescriptive guidance (APG) pattern for developer instances and CI/CD pipelines: Mainframe Modernization: DevOps on AWS with Micro Focus.

Overview of solution

In the APG, we introduce DevOps automation and AWS CI/CD architecture to support mainframe application development. Our solution enables you to embrace both Test Driven Development (TDD) and Behavior Driven Development (BDD). Mainframe developers and testers can automate the tests in CI/CD pipelines so they’re repeatable and scalable. To speed up automated mainframe application tests, the solution uses team pipelines to run functional and integration tests frequently, and uses systems test pipelines to run comprehensive regression tests on demand. For more information about the pipelines, see Mainframe Modernization: DevOps on AWS with Micro Focus.

In this post, we focus on how to automate and scale mainframe application tests in AWS. We show you how to use AWS services and Micro Focus products to automate mainframe application tests with best practices. The solution can scale your mainframe application CI/CD pipeline to run thousands of tests in AWS within minutes, and you only pay a fraction of your current on-premises cost.

The following diagram illustrates the solution architecture.

Mainframe DevOps On AWS Architecture Overview, on the left is the conventional mainframe development environment, and on the right are the CI/CD pipelines for mainframe tests in AWS

Figure: Mainframe DevOps On AWS Architecture Overview

 

Best practices

Before we get into the details of the solution, let’s recap the following mainframe application testing best practices:

  • Create a “test first” culture by writing tests for mainframe application code changes
  • Automate preparing and running tests in the CI/CD pipelines
  • Provide fast and quality feedback to project management throughout the SDLC
  • Assess and increase test coverage
  • Scale your test’s capacity and speed in line with your project schedule and requirements

Automated smoke test

In this architecture, mainframe developers can automate running functional smoke tests for new changes. This testing phase typically “smokes out” regression of core and critical business functions. You can achieve these tests using tools such as py3270 with x3270 or Robot Framework Mainframe 3270 Library.

The following code shows a feature test written in Behave and a test step using py3270:

# home_loan_calculator.feature
Feature: calculate home loan monthly repayment
  the bankdemo application provides a monthly home loan repayment calculator
  The user needs to input the home loan amount, interest rate, and loan maturity in months into the transaction.
  The user is provided the monthly home loan repayment amount as output

  Scenario Outline: As a customer I want to calculate my monthly home loan repayment via a transaction
      Given home loan amount is <amount>, interest rate is <interest rate> and maturity date is <maturity date in months> months 
       When the transaction is submitted to the home loan calculator
       Then it shall show the monthly repayment of <monthly repayment>

    Examples: Homeloan
      | amount  | interest rate | maturity date in months | monthly repayment |
      | 1000000 | 3.29          | 300                     | $4894.31          |

 

# home_loan_calculator_steps.py
import sys, os
from py3270 import Emulator
from behave import *

@given("home loan amount is {amount}, interest rate is {rate} and maturity date is {maturity_date} months")
def step_impl(context, amount, rate, maturity_date):
    context.home_loan_amount = amount
    context.interest_rate = rate
    context.maturity_date_in_months = maturity_date

@when("the transaction is submitted to the home loan calculator")
def step_impl(context):
    # Setup connection parameters
    tn3270_host = os.getenv('TN3270_HOST')
    tn3270_port = os.getenv('TN3270_PORT')
	# Setup TN3270 connection
    em = Emulator(visible=False, timeout=120)
    em.connect(tn3270_host + ':' + tn3270_port)
    em.wait_for_field()
	# Screen login
    em.fill_field(10, 44, 'b0001', 5)
    em.send_enter()
	# Input screen fields for home loan calculator
    em.wait_for_field()
    em.fill_field(8, 46, context.home_loan_amount, 7)
    em.fill_field(10, 46, context.interest_rate, 7)
    em.fill_field(12, 46, context.maturity_date_in_months, 7)
    em.send_enter()
    em.wait_for_field()    

    # collect monthly repayment output from screen
    context.monthly_repayment = em.string_get(14, 46, 9)
    em.terminate()

@then("it shall show the monthly repayment of {amount}")
def step_impl(context, amount):
    print("expected amount is " + amount.strip() + ", and the result from screen is " + context.monthly_repayment.strip())
    assert amount.strip() == context.monthly_repayment.strip()

To run this functional test in Micro Focus Enterprise Test Server (ETS), we use AWS CodeBuild.

We first need to build an Enterprise Test Server Docker image and push it to an Amazon Elastic Container Registry (Amazon ECR) registry. For instructions, see Using Enterprise Test Server with Docker.

Next, we create a CodeBuild project and use the Enterprise Test Server Docker image in its configuration.

The following is an example AWS CloudFormation code snippet of a CodeBuild project that uses Windows Container and Enterprise Test Server:

  BddTestBankDemoStage:
    Type: AWS::CodeBuild::Project
    Properties:
      Name: !Sub '${AWS::StackName}BddTestBankDemo'
      LogsConfig:
        CloudWatchLogs:
          Status: ENABLED
      Artifacts:
        Type: CODEPIPELINE
        EncryptionDisabled: true
      Environment:
        ComputeType: BUILD_GENERAL1_LARGE
        Image: !Sub "${EnterpriseTestServerDockerImage}:latest"
        ImagePullCredentialsType: SERVICE_ROLE
        Type: WINDOWS_SERVER_2019_CONTAINER
      ServiceRole: !Ref CodeBuildRole
      Source:
        Type: CODEPIPELINE
        BuildSpec: bdd-test-bankdemo-buildspec.yaml

In the CodeBuild project, we need to create a buildspec to orchestrate the commands for preparing the Micro Focus Enterprise Test Server CICS environment and issue the test command. In the buildspec, we define the location for CodeBuild to look for test reports and upload them into the CodeBuild report group. The following buildspec code uses custom scripts DeployES.ps1 and StartAndWait.ps1 to start your CICS region, and runs Python Behave BDD tests:

version: 0.2
phases:
  build:
    commands:
      - |
        # Run Command to start Enterprise Test Server
        CD C:\
        .\DeployES.ps1
        .\StartAndWait.ps1

        py -m pip install behave

        Write-Host "waiting for server to be ready ..."
        do {
          Write-Host "..."
          sleep 3  
        } until(Test-NetConnection 127.0.0.1 -Port 9270 | ? { $_.TcpTestSucceeded } )

        CD C:\tests\features
        MD C:\tests\reports
        $Env:Path += ";c:\wc3270"

        $address=(Get-NetIPAddress -AddressFamily Ipv4 | where { $_.IPAddress -Match "172\.*" })
        $Env:TN3270_HOST = $address.IPAddress
        $Env:TN3270_PORT = "9270"
        
        behave.exe --color --junit --junit-directory C:\tests\reports
reports:
  bankdemo-bdd-test-report:
    files: 
      - '**/*'
    base-directory: "C:\\tests\\reports"

In the smoke test, the team may run both unit tests and functional tests. Ideally, these tests run in parallel to speed up the pipeline. In AWS CodePipeline, we can set up a stage to run multiple steps in parallel. In our example, the pipeline runs both BDD tests and Robot Framework (RPA) tests.

The following CloudFormation code snippet runs two different tests. You use the same RunOrder value to indicate the actions run in parallel.

#...
        - Name: Tests
          Actions:
            - Name: RunBDDTest
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName: !Ref BddTestBankDemoStage
                PrimarySource: Config
              InputArtifacts:
                - Name: DemoBin
                - Name: Config
              RunOrder: 1
            - Name: RunRbTest
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName : !Ref RpaTestBankDemoStage
                PrimarySource: Config
              InputArtifacts:
                - Name: DemoBin
                - Name: Config
              RunOrder: 1  
#...

The following screenshot shows the example actions on the CodePipeline console that use the preceding code.

Screenshot of CodePipeline parallel execution tests using the same run order value

Figure – Screenshot of CodePipeline parallel execution tests

Both BDD and RPA tests produce JUnit format reports, which CodeBuild can ingest and show on the CodeBuild console. This is a great way for project management and business users to track the quality trend of an application. The following screenshot shows the CodeBuild report generated from the BDD tests.

CodeBuild report generated from the BDD tests showing 100% pass rate

Figure – CodeBuild report generated from the BDD tests

Automated regression tests

After you test the changes in the project team pipeline, you can automatically promote them to another stream with other team members’ changes for further testing. The scope of this testing stream is significantly more comprehensive, with a greater number and wider range of tests and higher volume of test data. The changes promoted to this stream by each team member are tested in this environment at the end of each day throughout the life of the project. This provides a high-quality delivery to production, with new code and changes to existing code tested together with hundreds or thousands of tests.

In enterprise architecture, it’s commonplace to see an application client consuming web services APIs exposed from a mainframe CICS application. One approach to do regression tests for mainframe applications is to use Micro Focus Verastream Host Integrator (VHI) to record and capture 3270 data stream processing and encapsulate these 3270 data streams as business functions, which in turn are packaged as web services. When these web services are available, they can be consumed by a test automation product, which in our environment is Micro Focus UFT One. This uses the Verastream server as the orchestration engine that translates the web service requests into 3270 data streams that integrate with the mainframe CICS application. The application is deployed in Micro Focus Enterprise Test Server.

The following diagram shows the end-to-end testing components.

Regression test end-to-end components using ECS containers for Enterprise Test Server, Verastream Host Integrator, and UFT One; all integration points use Network Load Balancers

Figure – Regression Test Infrastructure end-to-end Setup

To ensure we have the coverage required for large mainframe applications, we sometimes need to run thousands of tests against very large, production-sized volumes of test data. We want the tests to run as fast as possible and complete as soon as possible to reduce AWS costs: we only pay for the infrastructure while the test environment is provisioned and the tests are running.

Therefore, the design of the test environment needs to scale out. The batch feature in CodeBuild allows you to run tests in batches and in parallel rather than serially. Furthermore, our solution needs to minimize interference between batches so that a failure in one batch doesn't affect another running in parallel. The following diagram depicts the high-level design, with each batch build running in its own independent infrastructure. Each infrastructure is launched as part of test preparation, and then torn down in the post-test phase.

Regression Tests in CodeBuild project set up to use batch mode, three batches running in independent infrastructure with containers

Figure – Regression Tests in CodeBuild project set up to use batch mode

Building and deploying regression test components

Following the design of the parallel regression test environment, let's look at how we build each component and how they are deployed. The following steps to build our regression tests use a working-backward approach, starting from deployment in Enterprise Test Server:

  1. Create a batch build in CodeBuild.
  2. Deploy to Enterprise Test Server.
  3. Deploy the VHI model.
  4. Deploy UFT One Tests.
  5. Integrate UFT One into CodeBuild and CodePipeline and test the application.

Creating a batch build in CodeBuild

We update two components to enable a batch build. First, in the CodePipeline CloudFormation resource, we set BatchEnabled to be true for the test stage. The UFT One test preparation stage uses the CloudFormation template to create the test infrastructure. The following code is an example of the AWS CloudFormation snippet with batch build enabled:

#...
        - Name: SystemsTest
          Actions:
            - Name: Uft-Tests
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName : !Ref UftTestBankDemoProject
                PrimarySource: Config
                BatchEnabled: true
                CombineArtifacts: true
              InputArtifacts:
                - Name: Config
                - Name: DemoSrc
              OutputArtifacts:
                - Name: TestReport                
              RunOrder: 1
#...

Second, in the buildspec configuration of the test stage, we provide a build matrix setting. We use the custom environment variable TEST_BATCH_NUMBER to indicate which set of tests runs in each batch. See the following code:

version: 0.2
batch:
  fast-fail: true
  build-matrix:
    static:
      ignore-failure: false
    dynamic:
      env:
        variables:
          TEST_BATCH_NUMBER:
            - 1
            - 2
            - 3 
phases:
  pre_build:
    commands:
#...

After setting up the batch build, CodeBuild creates multiple batches when the build starts. The following screenshot shows the batches on the CodeBuild console.

Regression tests CodeBuild project ran in batch mode, three batches ran in parallel successfully

Figure – Regression tests CodeBuild project ran in batch mode

Deploying to Enterprise Test Server

ETS is the transaction engine that processes all the online (and batch) requests that are initiated through external clients, such as 3270 terminals, web services, and WebSphere MQ. This engine provides support for various mainframe subsystems, such as CICS, IMS TM, and JES, as well as code-level support for COBOL and PL/I. The following screenshot shows the Enterprise Test Server administration page.

Enterprise Server Administrator window showing configuration for CICS

Figure – Enterprise Server Administrator window

In this mainframe application testing use case, the regression tests are CICS transactions, initiated from 3270 requests (encapsulated in a web service). For more information about Enterprise Test Server, see the Enterprise Test Server and Micro Focus websites.

In the regression pipeline, after the stage of mainframe artifact compiling, we bake in the artifact into an ETS Docker container and upload the image to an Amazon ECR repository. This way, we have an immutable artifact for all the tests.

During each batch's test preparation stage, a CloudFormation stack is deployed to create an Amazon ECS service on Windows EC2. The stack uses a Network Load Balancer as the integration point for VHI.

The following code is an example of the CloudFormation snippet to create an Amazon ECS service using an Enterprise Test Server Docker image:

#...
  EtsService:
    DependsOn:
    - EtsTaskDefinition
    - EtsContainerSecurityGroup
    - EtsLoadBalancerListener
    Properties:
      Cluster: !Ref 'WindowsEcsClusterArn'
      DesiredCount: 1
      LoadBalancers:
        -
          ContainerName: !Sub "ets-${AWS::StackName}"
          ContainerPort: 9270
          TargetGroupArn: !Ref EtsPort9270TargetGroup
      HealthCheckGracePeriodSeconds: 300          
      TaskDefinition: !Ref 'EtsTaskDefinition'
    Type: "AWS::ECS::Service"

  EtsTaskDefinition:
    Properties:
      ContainerDefinitions:
        -
          Image: !Sub "${AWS::AccountId}.dkr.ecr.us-east-1.amazonaws.com/systems-test/ets:latest"
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'SystemsTestLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: ets
          Name: !Sub "ets-${AWS::StackName}"
          Cpu: 4096
          Memory: 8192
          PortMappings:
            -
              ContainerPort: 9270
          EntryPoint:
          - "powershell.exe"
          Command: 
          - '-F'
          - .\StartAndWait.ps1
          - 'bankdemo'
          - C:\bankdemo\
          - 'wait'
      Family: systems-test-ets
    Type: "AWS::ECS::TaskDefinition"
#...

Deploying the VHI model

In this architecture, the VHI is a bridge between mainframe and clients.

We use the VHI designer to capture the 3270 data streams and encapsulate the relevant data streams into a business function. We can then deliver this function as a web service that can be consumed by a test management solution, such as Micro Focus UFT One.

The following screenshot shows the setup for getCheckingDetails in VHI. Along with this procedure, you can also see other procedures (for example, calcCostLoan) that are generated as web services. The properties associated with this procedure are available on this screen, allowing you to define the mapping of fields between the associated 3270 screens and the exposed web service.

example of VHI designer to capture the 3270 data streams and encapsulate the relevant data streams into a business function getCheckingDetails

Figure – Setup for getCheckingDetails in VHI

The following screenshot shows the editor for this procedure, which you open by choosing Procedure Editor. This screen presents the 3270 screens involved in the business function that is generated as a web service.

VHI designer Procedure Editor shows the procedure

Figure – VHI designer Procedure Editor shows the procedure

After you define the required functional web services in VHI designer, the resultant model is saved and deployed into a VHI Docker image. We use this image and the associated model (from VHI designer) in the pipeline outlined in this post.

For more information about VHI, see the VHI website.

The pipeline contains two steps to deploy a VHI service. First, it installs and sets up the VHI models into a VHI Docker image, and it’s pushed into Amazon ECR. Second, a CloudFormation stack is deployed to create an Amazon ECS Fargate service, which uses the latest built Docker image. In AWS CloudFormation, the VHI ECS task definition defines an environment variable for the ETS Network Load Balancer’s DNS name. Therefore, the VHI can bootstrap and point to an ETS service. In the VHI stack, it uses a Network Load Balancer as an integration point for UFT One test integration.

The following code is an example of a ECS Task Definition CloudFormation snippet that creates a VHI service in Amazon ECS Fargate and integrates it with an ETS server:

#...
  VhiTaskDefinition:
    DependsOn:
    - EtsService
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: systems-test-vhi
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      ExecutionRoleArn: !Ref FargateEcsTaskExecutionRoleArn
      Cpu: 2048
      Memory: 4096
      ContainerDefinitions:
        - Cpu: 2048
          Name: !Sub "vhi-${AWS::StackName}"
          Memory: 4096
          Environment:
            - Name: esHostName 
              Value: !GetAtt EtsInternalLoadBalancer.DNSName
            - Name: esPort
              Value: 9270
          Image: !Sub "${AWS::AccountId}.dkr.ecr.us-east-1.amazonaws.com/systems-test/vhi:latest"
          PortMappings:
            - ContainerPort: 9680
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'SystemsTestLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: vhi

#...

Deploying UFT One Tests

UFT One is a test client that uses each of the web services created by the VHI designer to orchestrate running each of the associated business functions. Parameter data is supplied to each function, and validations are configured against the data returned. Multiple test suites are configured with different business functions with the associated data.

The following screenshot shows the test suite API_Bankdemo3, which is used in this regression test process.

the screenshot shows the test suite API_Bankdemo3 in UFT One test setup console, the API setup for getCheckingDetails

Figure – API_Bankdemo3 in UFT One Test Editor Console

For more information, see the UFT One website.

Integrating UFT One and testing the application

The last step is to integrate UFT One into CodeBuild and CodePipeline to test our mainframe application. First, we set up CodeBuild to use a UFT One container. The Docker image is available in Docker Hub. Then we author our buildspec. The buildspec has the following three phases:

  • Setting up a UFT One license and deploying the test infrastructure
  • Starting the UFT One test suite to run regression tests
  • Tearing down the test infrastructure after tests are complete

The following code is an example of a buildspec snippet in the pre_build stage. The snippet shows the command to activate the UFT One license:

version: 0.2
batch: 
# . . .
phases:
  pre_build:
    commands:
      - |
        # Activate License
        $process = Start-Process -NoNewWindow -RedirectStandardOutput LicenseInstall.log -Wait -File 'C:\Program Files (x86)\Micro Focus\Unified Functional Testing\bin\HP.UFT.LicenseInstall.exe' -ArgumentList @('concurrent', 10600, 1, ${env:AUTOPASS_LICENSE_SERVER})        
        Get-Content -Path LicenseInstall.log
        if (Select-String -Path LicenseInstall.log -Pattern 'The installation was successful.' -Quiet) {
          Write-Host 'Licensed Successfully'
        } else {
          Write-Host 'License Failed'
          exit 1
        }
#...

The following command in the buildspec deploys the test infrastructure using the AWS Command Line Interface (AWS CLI):

aws cloudformation deploy --stack-name $stack_name `
--template-file cicd-pipeline/systems-test-pipeline/systems-test-service.yaml `
--parameter-overrides EcsCluster=$cluster_arn `
--capabilities CAPABILITY_IAM

Because ETS and VHI are both deployed with a load balancer, the build detects when the load balancers become healthy before starting the tests. The following AWS CLI commands detect the load balancer’s target group health:

$vhi_health_state = (aws elbv2 describe-target-health --target-group-arn $vhi_target_group_arn --query 'TargetHealthDescriptions[0].TargetHealth.State' --output text)
$ets_health_state = (aws elbv2 describe-target-health --target-group-arn $ets_target_group_arn --query 'TargetHealthDescriptions[0].TargetHealth.State' --output text)          
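
If you orchestrate this health check outside the buildspec, an equivalent polling loop in Python might look like the following sketch; the target group ARNs are placeholders.

# Rough sketch: wait until both target groups report a healthy target before
# starting the UFT One tests. Target group ARNs are placeholders, and a real
# implementation should add a timeout rather than loop forever.
import time
import boto3

elbv2 = boto3.client("elbv2")

def first_target_state(target_group_arn):
    descriptions = elbv2.describe_target_health(TargetGroupArn=target_group_arn)["TargetHealthDescriptions"]
    return descriptions[0]["TargetHealth"]["State"] if descriptions else "unused"

vhi_target_group_arn = "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/vhi/EXAMPLE"
ets_target_group_arn = "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/ets/EXAMPLE"

while not all(first_target_state(arn) == "healthy" for arn in (vhi_target_group_arn, ets_target_group_arn)):
    time.sleep(15)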

When the targets are healthy, the build moves into the build stage, and it uses the UFT One command line to start the tests. See the following code:

$process = Start-Process -Wait  -NoNewWindow -RedirectStandardOutput UFTBatchRunnerCMD.log `
-FilePath "C:\Program Files (x86)\Micro Focus\Unified Functional Testing\bin\UFTBatchRunnerCMD.exe" `
-ArgumentList @("-source", "${env:CODEBUILD_SRC_DIR_DemoSrc}\bankdemo\tests\API_Bankdemo\API_Bankdemo${env:TEST_BATCH_NUMBER}")

The next release of Micro Focus UFT One (November or December 2020) will provide an exit status to indicate a test’s success or failure.

When the tests are complete, the post_build stage tears down the test infrastructure. The following AWS CLI command tears down the CloudFormation stack:


#...
  post_build:
    finally:
      - |
        Write-Host "Clean up ETS, VHI Stack"
        #...
        aws cloudformation delete-stack --stack-name $stack_name
        aws cloudformation wait stack-delete-complete --stack-name $stack_name
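
If you script the teardown from Python instead of the buildspec, the boto3 equivalent is a delete call plus the stack-delete waiter; the stack name below is a placeholder for whatever name you deployed with.

# Sketch: tear down the per-batch test stack from Python.
import boto3

cloudformation = boto3.client("cloudformation")
stack_name = "systems-test-service-batch-1"  # placeholder; use the name you deployed with

cloudformation.delete_stack(StackName=stack_name)
cloudformation.get_waiter("stack_delete_complete").wait(StackName=stack_name)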

At the end of the build, the buildspec is set up to upload UFT One test reports as an artifact into Amazon Simple Storage Service (Amazon S3). The following screenshot is the example of a test report in HTML format generated by UFT One in CodeBuild and CodePipeline.

UFT One HTML report shows regression test results and test details

Figure – UFT One HTML report

A new release of Micro Focus UFT One will provide test report formats supported by CodeBuild test report groups.

Conclusion

In this post, we introduced the solution to use Micro Focus Enterprise Suite, Micro Focus UFT One, Micro Focus VHI, AWS developer tools, and Amazon ECS containers to automate provisioning and running mainframe application tests in AWS at scale.

The on-demand model allows you to create the same test capacity infrastructure in minutes at a fraction of your current on-premises mainframe cost. It also significantly increases your testing and delivery capacity to increase quality and reduce production downtime.

A demo of the solution is available on the AWS Partner Micro Focus website: AWS Mainframe CI/CD Enterprise Solution. If you're interested in modernizing your mainframe applications, visit Micro Focus and contact AWS mainframe business development at [email protected].

References

Micro Focus

 

Peter Woods

Peter Woods

Peter has been with Micro Focus for almost 30 years, in a variety of roles and geographies including Technical Support, Channel Sales, Product Management, Strategic Alliances Management and Pre-Sales, primarily based in Europe but for the last four years in Australia and New Zealand. In his current role as Pre-Sales Manager, Peter is charged with driving and supporting sales activity within the Application Modernization and Connectivity team, based in Melbourne.

Leo Ervin

Leo Ervin

Leo Ervin is a Senior Solutions Architect working with Micro Focus Enterprise Solutions on the ANZ team. After completing a mathematics degree, Leo started as a PL/1 programmer with a local insurance company. The next step in Leo's career involved consulting work in PL/1 and COBOL before he joined a start-up company as a technical director and partner. This company became the first distributor of Micro Focus software in the ANZ region in 1986. Leo's involvement with Micro Focus technology has continued from this distributorship through to today, with his current focus on cloud strategies for both DevOps and re-platform implementations.

Kevin Yung

Kevin Yung

Kevin is a Senior Modernization Architect in AWS Professional Services Global Mainframe and Midrange Modernization (GM3) team. Kevin currently is focusing on leading and delivering mainframe and midrange applications modernization for large enterprise customers.

Creating multi-architecture Docker images to support Graviton2 using AWS CodeBuild and AWS CodePipeline

Post Syndicated from Tyler Lynch original https://aws.amazon.com/blogs/devops/creating-multi-architecture-docker-images-to-support-graviton2-using-aws-codebuild-and-aws-codepipeline/

This post provides a clear path for customers who are evaluating and adopting Graviton2 instance types for performance improvements and cost-optimization.

Graviton2 processors are custom designed by AWS using 64-bit Arm Neoverse N1 cores. They power the T4g*, M6g*, R6g*, and C6g* Amazon Elastic Compute Cloud (Amazon EC2) instance types and offer up to 40% better price performance over the current generation of x86-based instances in a variety of workloads, such as high-performance computing, application servers, media transcoding, in-memory caching, gaming, and more.

More and more customers want to make the move to Graviton2 to take advantage of these performance optimizations while saving money.

During the transition process, a great benefit AWS provides is the ability to perform native builds for each architecture, instead of attempting to cross-compile on homogenous hardware. This has the benefit of decreasing build time as well as reducing complexity and cost to set up.

To see this benefit in action, we look at how to build a CI/CD pipeline using AWS CodePipeline and AWS CodeBuild that can build multi-architecture Docker images in parallel to aid you in evaluating and migrating to Graviton2.

Solution overview

With CodePipeline and CodeBuild, we can automate the creation of architecture-specific Docker images, which can be pushed to Amazon Elastic Container Registry (Amazon ECR). The following diagram illustrates this architecture.

Solution overview architectural diagram

The steps in this process are as follows:

  1. Create a sample Node.js application and associated Dockerfile.
  2. Create the buildspec files that contain the commands that CodeBuild runs.
  3. Create three CodeBuild projects to automate each of the following steps:
    • CodeBuild for x86 – Creates an x86 Docker image and pushes it to Amazon ECR.
    • CodeBuild for arm64 – Creates an Arm64 Docker image and pushes it to Amazon ECR.
    • CodeBuild for manifest list – Creates a Docker manifest list, annotates the list, and pushes to Amazon ECR.
  4. Automate the orchestration of these projects with CodePipeline.

Prerequisites

The prerequisites for this solution are as follows:

  • The correct AWS Identity and Access Management (IAM) role permissions for your account allowing for the creation of the CodePipeline pipeline, CodeBuild projects, and Amazon ECR repositories
  • An Amazon ECR repository named multi-arch-test
  • A source control service such as AWS CodeCommit or GitHub that CodeBuild and CodePipeline can interact with
  • The source code repository initialized and cloned locally

Creating a sample Node.js application and associated Dockerfile

For this post, we create a sample “Hello World” application that self-reports the processor architecture. We work in the local folder that is cloned from our source repository as specified in the prerequisites.

  1. In your preferred text editor, add a new file with the following Node.js code:

// Hello World sample app.
const http = require('http');

const port = 3000;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end(`Hello World. This processor architecture is ${process.arch}`);
});

server.listen(port, () => {
  console.log(`Server running on processor architecture ${process.arch}`);
});
  2. Save the file in the root of your source repository and name it app.js.
  3. Commit the changes to Git and push the changes to our source repository. See the following code:

git add .
git commit -m "Adding Node.js sample application."
git push

We also need to create a sample Dockerfile that instructs the docker build command how to build the Docker images. We use the default Node.js image tag for version 14.

  1. In a text editor, add a new file with the following code:

# Sample nodejs application
FROM node:14
WORKDIR /usr/src/app
COPY package*.json app.js ./
RUN npm install
EXPOSE 3000
CMD ["node", "app.js"]
  2. Save the file in the root of the source repository and name it Dockerfile. Make sure it is Dockerfile with no extension.
  3. Commit the changes to Git and push the changes to our source repository:

git add .
git commit -m "Adding Dockerfile to host the Node.js sample application."
git push

Creating a build specification file for your application

It’s time to create and add a buildspec file to our source repository. We want to use a single buildspec.yml file for building, tagging, and pushing the Docker images to Amazon ECR for both target native architectures, x86 and Arm64. We use CodeBuild to inject environment variables, some of which need to be changed for each architecture (such as image tag and image architecture).

A buildspec is a collection of build commands and related settings, in YAML format, that CodeBuild uses to run a build. For more information, see Build specification reference for CodeBuild.

The buildspec we add instructs CodeBuild to do the following:

  • install phase – Update the yum package manager
  • pre_build phase – Sign in to Amazon ECR using the IAM role assumed by CodeBuild
  • build phase – Build the Docker image using the Docker CLI and tag the newly created Docker image
  • post_build phase – Push the Docker image to our Amazon ECR repository

We first need to add the buildspec.yml file to our source repository.

  1. In a text editor, add a new file with the following build specification:

version: 0.2
phases:
    install:
        commands:
            - yum update -y
    pre_build:
        commands:
            - echo Logging in to Amazon ECR...
            - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
    build:
        commands:
            - echo Build started on `date`
            - echo Building the Docker image...          
            - docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG .
            - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG      
    post_build:
        commands:
            - echo Build completed on `date`
            - echo Pushing the Docker image...
            - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
  2. Save the file in the root of the repository and name it buildspec.yml.

Because we specify environment variables in the CodeBuild project, we don’t need to hard code any values in the buildspec file.

  3. Commit the changes to Git and push the changes to our source repository:

git add .
git commit -m "Adding CodeBuild buildspec.yml file."
git push

Creating a build specification file for your manifest list creation

Next we create a buildspec file that instructs CodeBuild to create a Docker manifest list, and associate that manifest list with the Docker images that the buildspec file builds.

A manifest list is a list of image layers that is created by specifying one or more (ideally more than one) image names. You can then use it in the same way as an image name in docker pull and docker run commands, for example. For more information, see manifest create.

As of this writing, manifest creation is an experimental feature of the Docker command line interface (CLI).

Experimental features provide early access to future product functionality. These features are intended only for testing and feedback because they may change between releases without warning or be removed entirely from a future release. Experimental features must not be used in production environments. For more information, see Experimental features.

When creating the CodeBuild project for manifest list creation, we specify a buildspec file name override as buildspec-manifest.yml. This buildspec instructs CodeBuild to do the following:

  • install phase – Update the yum package manager
  • pre_build phase – Sign in to Amazon ECR using the IAM role assumed by CodeBuild
  • build phase – Perform three actions:
    • Set environment variable to enable Docker experimental features for the CLI
    • Create the Docker manifest list using the Docker CLI
    • Annotate the manifest list to add the architecture-specific Docker image references
  • post_build phase – Push the Docker image to our Amazon ECR repository and use docker manifest inspect to echo out the contents of the manifest list from Amazon ECR

We first need to add the buildspec-manifest.yml file to our source repository.

  1. In a text editor, add a new file with the following build specification:

version: 0.2
# Based on the Docker documentation, must include the DOCKER_CLI_EXPERIMENTAL environment variable
# https://docs.docker.com/engine/reference/commandline/manifest/    

phases:
    install:
        commands:
            - yum update -y
    pre_build:
        commands:
            - echo Logging in to Amazon ECR...
            - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
    build:
        commands:
            - echo Build started on `date`
            - echo Building the Docker manifest...   
            - export DOCKER_CLI_EXPERIMENTAL=enabled       
            - docker manifest create $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-arm64v8 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-amd64    
            - docker manifest annotate --arch arm64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-arm64v8
            - docker manifest annotate --arch amd64 $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:latest-amd64

    post_build:
        commands:
            - echo Build completed on `date`
            - echo Pushing the Docker image...
            - docker manifest push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME
            - docker manifest inspect $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME
  1. Save the file in the root of the repository and name it buildspec-manifest.yml.
  2. Commit the changes to Git and push the changes to our source repository:

git add .
git commit -m "Adding CodeBuild buildspec-manifest.yml file."
git push

Setting up your CodeBuild projects

Now we have created a single buildspec.yml file for building, tagging, and pushing the Docker images to Amazon ECR for both target native architectures: x86 and Arm64. This file is shared by two of the three CodeBuild projects that we create. We use CodeBuild to inject environment variables, some of which differ for each architecture (such as the image tag and image architecture). We also use a single Dockerfile regardless of architecture, and we need to ensure that any third-party libraries are present and compiled correctly for the target architecture.

For more information about third-party libraries and software versions that have been optimized for Arm, see the Getting started with AWS Graviton GitHub repo.

We use the same environment variable names for the CodeBuild projects, but each project has specific values, as detailed in the following table. You need to modify these values to your numeric AWS account ID, the AWS Region where your Amazon ECR registry endpoint is located, and your Amazon ECR repository name. The instructions for adding the environment variables in the CodeBuild projects are in the following sections.

Environment Variable    x86 Project values    Arm64 Project values    manifest Project values
AWS_DEFAULT_REGION      us-east-1             us-east-1               us-east-1
AWS_ACCOUNT_ID          111111111111          111111111111            111111111111
IMAGE_REPO_NAME         multi-arch-test       multi-arch-test         multi-arch-test
IMAGE_TAG               latest-amd64          latest-arm64v8          latest

The image we use in this post uses architecture-specific tags with the term latest. This is for demonstration purposes only; it’s best to tag the images with an explicit version or another meaningful reference.

CodeBuild for x86

We start with creating a new CodeBuild project for x86 on the CodeBuild console.

CodeBuild looks for a file named buildspec.yml by default, unless overridden. For these first two CodeBuild projects, we rely on that default and don’t specify the buildspec name.

  1. On the CodeBuild console, choose Create build project.
  2. For Project name, enter a unique project name for your build project, such as node-x86.
  3. To add tags, add them under Additional Configuration.
  4. Choose a Source provider (for this post, we choose GitHub).
  5. For Environment image, choose Managed image.
  6. Select Amazon Linux 2.
  7. For Runtime(s), choose Standard.
  8. For Image, choose aws/codebuild/amazonlinux2-x86_64-standard:3.0.

This is an x86 build image.

  1. Select Privileged.
  2. For Service role, choose New service role.
  3. Enter a name for the new role (one is created for you), such as CodeBuildServiceRole-nodeproject.

We reuse this same service role for the other CodeBuild projects associated with this project.

  1. Expand Additional configurations and go to the Environment variables section.
  2. Create the following Environment variables:
Name                  Value              Type
AWS_DEFAULT_REGION    us-east-1          Plaintext
AWS_ACCOUNT_ID        111111111111       Plaintext
IMAGE_REPO_NAME       multi-arch-test    Plaintext
IMAGE_TAG             latest-amd64       Plaintext
  1. Choose Create build project.

Attaching the IAM policy

Now that we have created the CodeBuild project, we need to adjust the new service role that was just created and attach an IAM policy so that it can interact with the Amazon ECR API.

  1. On the CodeBuild console, choose the node-x86 project.
  2. Choose the Build details tab.
  3. Under Service role, choose the link that looks like arn:aws:iam::111111111111:role/service-role/CodeBuildServiceRole-nodeproject.

A new browser tab should open.

  1. Choose Attach policies.
  2. In the Search field, enter AmazonEC2ContainerRegistryPowerUser.
  3. Select AmazonEC2ContainerRegistryPowerUser.
  4. Choose Attach policy.
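
If you prefer to script this step, an equivalent AWS CLI command attaches the same managed policy to the generated role (adjust the role name if CodeBuild generated a different one for you):

aws iam attach-role-policy \
    --role-name CodeBuildServiceRole-nodeproject \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser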

CodeBuild for arm64

Now we move on to creating a new (second) CodeBuild project for Arm64.

  1. On the CodeBuild console, choose Create build project.
  2. For Project name, enter a unique project name, such as node-arm64.
  3. If you want to add tags, add them under Additional Configuration.
  4. Choose a Source provider (for this post, choose GitHub).
  5. For Environment image, choose Managed image.
  6. Select Amazon Linux 2.
  7. For Runtime(s), choose Standard.
  8. For Image, choose aws/codebuild/amazonlinux2-aarch64-standard:2.0.

This is an Arm build image and is different from the image selected in the previous CodeBuild project.

  1. Select Privileged.
  2. For Service role, choose Existing service role.
  3. Choose CodeBuildServiceRole-nodeproject.
  4. Select Allow AWS CodeBuild to modify this service role so it can be used with this build project.
  5. Expand Additional configurations and go to the Environment variables section.
  6. Create the following Environment variables:
Name                  Value              Type
AWS_DEFAULT_REGION    us-east-1          Plaintext
AWS_ACCOUNT_ID        111111111111       Plaintext
IMAGE_REPO_NAME       multi-arch-test    Plaintext
IMAGE_TAG             latest-arm64v8     Plaintext
  1. Choose Create build project.

CodeBuild for manifest list

For the last CodeBuild project, we create a Docker manifest list, associating that manifest list with the Docker images that the preceding projects create, and pushing the manifest list to ECR. This project uses the buildspec-manifest.yml file created earlier.

  1. On the CodeBuild console, choose Create build project.
  2. For Project name, enter a unique project name for your build project, such as node-manifest.
  3. If you want to add tags, add them under Additional Configuration.
  4. Choose a Source provider (for this post, choose GitHub).
  5. For Environment image, choose Managed image.
  6. Select Amazon Linux 2.
  7. For Runtime(s), choose Standard.
  8. For Image, choose aws/codebuild/amazonlinux2-x86_64-standard:3.0.

This is an x86 build image.

  1. Select Privileged.
  2. For Service role, choose Existing service role.
  3. Choose CodeBuildServiceRole-nodeproject.
  4. Select Allow AWS CodeBuild to modify this service role so it can be used with this build project.
  5. Expand Additional configurations and go to the Environment variables section.
  6. Create the following Environment variables:
Name                  Value              Type
AWS_DEFAULT_REGION    us-east-1          Plaintext
AWS_ACCOUNT_ID        111111111111       Plaintext
IMAGE_REPO_NAME       multi-arch-test    Plaintext
IMAGE_TAG             latest             Plaintext
  1. For Buildspec name – optional, enter buildspec-manifest.yml to override the default.
  2. Choose Create build project.

Setting up CodePipeline

Now we can move on to creating a pipeline to orchestrate the builds and manifest creation.

  1. On the CodePipeline console, choose Create pipeline.
  2. For Pipeline name, enter a unique name for your pipeline, such as node-multi-architecture.
  3. For Service role, choose New service role.
  4. Enter a name for the new role (one is created for you). For this post, we use the generated role name CodePipelineServiceRole-nodeproject.
  5. Select Allow AWS CodePipeline to create a service role so it can be used with this new pipeline.
  6. Choose Next.
  7. Choose a Source provider (for this post, choose GitHub).
  8. If you don’t have any existing Connections to GitHub, select Connect to GitHub and follow the wizard.
  9. Choose your Branch name (for this post, I choose main, but your branch might be different).
  10. For Output artifact format, choose CodePipeline default.
  11. Choose Next.

You should now be on the Add build stage page.

  1. For Build provider, choose AWS CodeBuild.
  2. Verify the Region is your Region of choice (for this post, I use US East (N. Virginia)).
  3. For Project name, choose node-x86.
  4. For Build type, select Single build.
  5. Choose Next.

You should now be on the Add deploy stage page.

  1. Choose Skip deploy stage.

A pop-up appears that reads Your pipeline will not include a deployment stage. Are you sure you want to skip this stage?

  1. Choose Skip.
  2. Choose Create pipeline.

CodePipeline immediately attempts to run a build. If it fails at this point, don't worry; we are only partway through the setup.
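
If you want to check the run from the command line instead of the console, you can query the pipeline's current state (the pipeline name matches what we entered above):

aws codepipeline get-pipeline-state --name node-multi-architecture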

Adding an additional build step

We need to add the additional build step for the Arm CodeBuild project in the Build stage.

  1. On the CodePipeline console, choose the node-multi-architecture pipeline.
  2. Choose Edit to start editing the pipeline stages.

You should now be on the Editing: node-multi-architecture page.

  1. For the Build stage, choose Edit stage.
  2. Choose + Add action.

Editing node-multi-architecture

  1. For Action name, enter Build-arm64.
  2. For Action provider, choose AWS CodeBuild.
  3. Verify your Region is correct.
  4. For Input artifacts, select SourceArtifact.
  5. For Project name, choose node-arm64.
  6. For Build type, select Single build.
  7. Choose Done.
  8. Choose Save.

A pop-up appears that reads Saving your changes cannot be undone. If the pipeline is running when you save your changes, that execution will not complete.

  1. Choose Save.

Updating the first build action name

This step is optional. The CodePipeline wizard doesn’t allow you to enter your Build action name during creation, but you can update the Build stage’s first build action to have consistent naming.

  1. Choose Edit to start editing the pipeline stages.
  2. Choose the Edit icon.
  3. For Action name, enter Build-x86.
  4. Choose Done.
  5. Choose Save.

A pop-up appears that says Saving your changes cannot be undone. If the pipeline is running when you save your changes, that execution will not complete.

  1. Choose Save.

Adding the project

Now we add the CodeBuild project for manifest creation and publishing.

  1. On the CodePipeline console, choose the node-multi-architecture pipeline.
  2. Choose Edit to start editing the pipeline stages.
  3. Choose + Add stage below the Build stage.
  4. For Stage name, enter Manifest.
  5. Choose +Add action group.
  6. For Action name, enter Create-manifest.
  7. For Action provider, choose AWS CodeBuild.
  8. Verify your Region is correct.
  9. For Input artifacts, select SourceArtifact.
  10. For Project name, choose node-manifest.
  11. For Build type, select Single build.
  12. Choose Done.
  13. Choose Save.

A pop-up appears that reads Saving your changes cannot be undone. If the pipeline is running when you save your changes, that execution will not complete.

  1. Choose Save.

Testing the pipeline

Now let’s verify everything works as planned.

  1. In the pipeline details page, choose Release change.

This runs the pipeline in stages. The process should take a few minutes to complete. The pipeline should show each stage as Succeeded.
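
You can also trigger the same run from the AWS CLI; for example:

aws codepipeline start-pipeline-execution --name node-multi-architecture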

Pipeline visualization

Now we want to inspect the output of the Create-manifest action that runs the CodeBuild project for manifest creation.

  1. Choose Details in the Create-manifest action.

This opens the CodeBuild build details page.

  1. Under Build logs, we should see the output from the manifest inspect command that we ran as the last step in the buildspec-manifest.yml file. See the following sample log:

[Container] 2020/10/07 16:47:39 Running command docker manifest inspect $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1369,
         "digest": "sha256:238c2762212ff5d7e0b5474f23d500f2f1a9c851cdd3e7ef0f662efac508cd04",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1369,
         "digest": "sha256:0cc9e96921d5565bdf13274e0f356a139a31d10e95de9ad3d5774a31b8871b05",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      }
   ]
}
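
As an additional check, you can pull the shared tag on hosts of each architecture and confirm that Docker resolved the matching image. This is a minimal sketch using the example registry values from this post; it assumes you have already authenticated to Amazon ECR on that host (for example, with aws ecr get-login-password piped to docker login):

docker pull 111111111111.dkr.ecr.us-east-1.amazonaws.com/multi-arch-test:latest
# Prints amd64 on an x86 host and arm64 on an Arm host
docker inspect --format '{{.Architecture}}' 111111111111.dkr.ecr.us-east-1.amazonaws.com/multi-arch-test:latest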

Cleaning up

To avoid incurring future charges, clean up the resources created as part of this post.

  1. On the CodePipeline console, choose the pipeline node-multi-architecture.
  2. Choose Delete pipeline.
  3. When prompted, enter delete.
  4. Choose Delete.
  5. On the CodeBuild console, choose the Build project node-x86.
  6. Choose Delete build project.
  7. When prompted, enter delete.
  8. Choose Delete.
  9. Repeat the deletion process for Build projects node-arm64 and node-manifest.

Next, we delete the Docker images we created and pushed to Amazon ECR. Be careful not to delete a repository that is being used for other images.

  1. On the Amazon ECR console, choose the repository multi-arch-test.

You should see a list of Docker images.

  1. Select latest, latest-arm64v8, and latest-amd64.
  2. Choose Delete.
  3. When prompted, enter delete.
  4. Choose Delete.
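
You can also delete the three image tags with a single AWS CLI call:

aws ecr batch-delete-image \
    --repository-name multi-arch-test \
    --image-ids imageTag=latest imageTag=latest-arm64v8 imageTag=latest-amd64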

Finally, we remove the IAM roles that we created.

  1. On the IAM console, choose Roles.
  2. In the search box, enter CodePipelineServiceRole-nodeproject.
  3. Select the role and choose Delete role.
  4. When prompted, choose Yes, delete.
  5. Repeat these steps for the role CodeBuildServiceRole-nodeproject.

Conclusion

To summarize, we successfully created a pipeline to create multi-architecture Docker images for both x86 and arm64. We referenced them via annotation in a Docker manifest list and stored them in Amazon ECR. The Docker images were based on a single Dockerfile that uses environment variables as parameters to allow for Dockerfile reuse.


About the Authors

 


Tyler Lynch
Tyler Lynch is a Sr. Solutions Architect focusing on EdTech at AWS.

 

 

 


Alistair McLean

Alistair is a Principal Solutions Architect focused on State and Local Government and K12 customers at AWS.

 

 

Automating deployments to Raspberry Pi devices using AWS CodePipeline

Post Syndicated from Ahmed ElHaw original https://aws.amazon.com/blogs/devops/automating-deployments-to-raspberry-pi-devices-using-aws-codepipeline/

Managing application deployments on a Raspberry Pi can be cumbersome, especially in headless mode and at scale, when the devices are placed outdoors and out of reach, such as in home automation projects, in the yard (for motion detection), or on the roof (as a humidity and temperature sensor). In these use cases, you have to connect remotely via secure shell (SSH) to administer the device.

Physically connecting a monitor, keyboard, and mouse every time you need access quickly becomes impractical. Alternatively, you can connect via SSH over your home network, provided your client workstation is on the same private network.

In this post, we discuss using Raspberry Pi as a headless server with minimal-to-zero direct interaction by using AWS CodePipeline. We examine two use cases:

  • Managing and automating operational tasks of the Raspberry Pi, running Raspbian OS or any other Linux distribution. For more information about this configuration, see Manage Raspberry Pi devices using AWS Systems Manager.
  • Automating deployments to one or more Raspberry Pi devices in headless mode (in which you don’t use a monitor or keyboard to run your device). If you use headless mode but still need to do some wireless setup, you can enable wireless networking and SSH when creating an image.

Solution overview


We use CodePipeline to manage continuous integration and deployment to Raspberry Pi running Ubuntu Server 18 for ARM. As of this writing, CodeDeploy agents are supported on Windows OS, Red Hat, and Ubuntu.

For this use case, we use the image ubuntu-18.04.4-preinstalled-server-arm64+raspi3.img.

To close the loop, you edit your code or commit new revisions from your PC or Amazon Elastic Compute Cloud (Amazon EC2) to trigger the pipeline to deploy to Pi. The following diagram illustrates the architecture of our automated pipeline.

 

Solution Overview architectural diagram

Setting up a Raspberry Pi device

To set up the CodeDeploy agent on a Raspberry Pi device, the device should be running Ubuntu Server 18 for ARM, which is supported by both the Raspberry Pi processor architecture and the CodeDeploy agent, and it should be connected to the internet. You need a keyboard and a monitor for the initial setup.

Follow these instructions for your initial setup:

  1. Download the Ubuntu image.

Pick the image based on your Raspberry Pi model. For this use case, we use Raspberry Pi 4 with Ubuntu 18.04.4 LTS.

  1. Burn the Ubuntu image to your microSD using a disk imager software (or other reliable tool). For instructions, see Create an Ubuntu Image for a Raspberry Pi on Windows.
  2. Configure WiFi on the Ubuntu server.

After booting from the newly flashed microSD, you can configure the OS.

  1. To enable DHCP, add the following YAML to /etc/netplan/wireless.yaml (create the file if it doesn’t exist):
network:
  version: 2
  wifis:
    wlan0:
      dhcp4: yes
      dhcp6: no
      access-points:
        "<your network ESSID>":
          password: "<your wifi password>"

Replace the variables <your network ESSID> and <your wifi password> with your wireless network SSID and password, respectively.

  1. Test the netplan configuration by entering the following command:
ubuntu@ubuntu:~$ sudo netplan try
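
If netplan try accepts the configuration, one way to apply it and confirm that the Raspberry Pi received an address and can reach the internet is the following:

ubuntu@ubuntu:~$ sudo netplan apply
ubuntu@ubuntu:~$ ip addr show wlan0
ubuntu@ubuntu:~$ ping -c 3 amazon.com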

Installing CodeDeploy and registering Raspberry Pi as an on-premises instance

When the Raspberry Pi is connected to the internet, you’re ready to install the AWS Command Line Interface (AWS CLI) and the CodeDeploy agent to manage automated deployments through CodeDeploy.

To register an on-premises instance, you must use an AWS Identity and Access Management (IAM) identity to authenticate your requests. You can choose from the following options for the IAM identity and registration method you use:

  • An IAM user ARN. This is best for registering a single on-premises instance.
  • An IAM role to authenticate requests with periodically refreshed temporary credentials generated with the AWS Security Token Service (AWS STS). This is best for registering a large number of on-premises instances.

For this post, we use the first option and create an IAM user and register a single Raspberry Pi. You can use this procedure for a handful of devices. Make sure you limit the privileges of the IAM user to what you need to achieve; a scoped-down IAM policy is given in the documentation instructions. For more information, see Use the register command (IAM user ARN) to register an on-premises instance.
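
If you don’t already have an IAM user for the device, a minimal sketch of creating one from your workstation follows. The user name Rpi matches the ARN used in the register command below; attach the scoped-down on-premises policy from the documentation to this user before registering the device:

aws iam create-user --user-name Rpi
# Note the AccessKeyId and SecretAccessKey in the output; you enter them in aws configure on the device
aws iam create-access-key --user-name Rpi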

  1. Install the AWS CLI on Raspberry Pi with the following code:
ubuntu@ubuntu:~$ sudo apt install awscli
  1. Configure the AWS CLI and enter your newly created IAM access key, secret access key, and Region (for example, eu-west-1):
ubuntu@ubuntu:~$ sudo aws configure
AWS Access Key ID [None]: <IAM Access Key>
AWS Secret Access Key [None]: <Secret Access Key>
Default region name [None]: <AWS Region>
Default output format [None]: Leave default, press Enter.
  1. Now that the AWS CLI running on the Raspberry Pi has access to CodeDeploy API operations, you can register the device as an on-premises instance:
ubuntu@ubuntu:~$ sudo aws deploy register --instance-name rpi4UbuntuServer --iam-user-arn arn:aws:iam::<AWS_ACCOUNT_ID>:user/Rpi --tags Key=Name,Value=Rpi4 --region eu-west-1
Registering the on-premises instance... DONE
Adding tags to the on-premises instance... DONE

Tags allow you to assign metadata to your AWS resources. Each tag is a simple label consisting of a customer-defined key and an optional value that can make it easier to manage, search for, and filter resources by purpose, owner, environment, or other criteria.

When working with on-premises instances with CodeDeploy, tags are mandatory to select the instances for deployment. For this post, we tag the first device with Key=Name,Value=Rpi4. Generally speaking, it’s good practice to use tags on all applicable resources.

You should see something like the following screenshot on the CodeDeploy console.

CodeDeploy console

Or from the CLI, you should see the following output:

ubuntu@ubuntu:~$ sudo aws deploy list-on-premises-instances
{
    "instanceNames": [
        "rpi4UbuntuServer"
    ]
}
  1. Install the CodeDeploy agent:
ubuntu@ubuntu:~$ sudo aws deploy install --override-config --config-file /etc/codedeploy-agent/conf/codedeploy.onpremises.yml --region eu-west-1

If the preceding command fails due to dependencies, you can get the CodeDeploy package and install it manually:

ubuntu@ubuntu:~$ sudo apt-get install ruby
ubuntu@ubuntu:~$ sudo wget https://aws-codedeploy-us-west-2.s3.amazonaws.com/latest/install
--2020-03-28 18:58:15--  https://aws-codedeploy-us-west-2.s3.amazonaws.com/latest/install
Resolving aws-codedeploy-us-west-2.s3.amazonaws.com (aws-codedeploy-us-west-2.s3.amazonaws.com)... 52.218.249.82
Connecting to aws-codedeploy-us-west-2.s3.amazonaws.com (aws-codedeploy-us-west-2.s3.amazonaws.com)|52.218.249.82|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13819 (13K) []
Saving to: ‘install’
install 100%[====================================================================>]  13.50K  --.-KB/s    in 0.003s 
2020-03-28 18:58:16 (3.81 MB/s) - ‘install’ saved [13819/13819]
ubuntu@ubuntu:~$ sudo chmod +x ./install
ubuntu@ubuntu:~$ sudo ./install auto

 Check the service status with the following code:

ubuntu@ubuntu:~$ sudo service codedeploy-agent status
codedeploy-agent.service - LSB: AWS CodeDeploy Host Agent
   Loaded: loaded (/etc/init.d/codedeploy-agent; generated)
   Active: active (running) since Sat 2020-08-15 14:18:22 +03; 17s ago
     Docs: man:systemd-sysv-generator(8)
    Tasks: 3 (limit: 4441)
   CGroup: /system.slice/codedeploy-agent.service
           └─4243 codedeploy-agent: master 4243

Start the service (if not started automatically):

ubuntu@ubuntu:~$ sudo service codedeploy-agent start

Congratulations! Now that the CodeDeploy agent is installed and the Raspberry Pi is registered as an on-premises instance, CodeDeploy can deploy your application build to the device.

Creating your source stage

You’re now ready to create your source stage.

  1. On the CodeCommit console, under Source, choose Repositories.
  2. Choose Create repository.

For instructions on connecting your repository from your local workstation, see Setup for HTTPS users using Git credentials.

CodeCommit repo

  1. In the root directory of the repository, include an AppSpec file for an EC2/On-Premises deployment. The file must be named appspec.yml for a YAML-based file, and the file name is case-sensitive.

AppSpec file

The following example code is from the appspec.yml file:

version: 0.0
os: linux
files:
  - source: /
    destination: /home/ubuntu/AQI/
hooks:
  BeforeInstall:
    - location: scripts/testGPIO.sh
      timeout: 60
      runas: root
  AfterInstall:
    - location: scripts/testSensors.sh
      timeout: 300
      runas: root
  ApplicationStart:
    - location: startpublishdht11toshadow.sh
    - location: startpublishnovatoshadow.sh
      timeout: 300
      runas: root

The files section defines the files to copy from the repository to the destination path on the Raspberry Pi.

The hooks section runs one time per deployment to an instance. If an event hook isn’t present, no operation runs for that event. This section is required only if you’re running scripts as part of the deployment. It’s useful to implement some basic testing before and after installation of your application revisions. For more information about hooks, see AppSpec ‘hooks’ section for an EC2/On-Premises deployment.
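
The hook scripts referenced above are ordinary shell scripts stored in the repository and run on the device by the CodeDeploy agent. As an illustration only (your actual testGPIO.sh and testSensors.sh will contain checks specific to your project), a minimal BeforeInstall script could look like the following:

#!/bin/bash
# scripts/testGPIO.sh (illustrative sketch): stop the deployment early if the
# GPIO character device is not present on this Raspberry Pi.
set -e
if [ ! -e /dev/gpiochip0 ]; then
    echo "GPIO device not found" >&2
    exit 1
fi
echo "GPIO device present"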

Creating your deploy stage

To create your deploy stage, complete the following steps:

  1. On the CodeDeploy console, choose Applications.
  2. Create your application and deployment group.
    1. For Deployment type, select In-place.

Deployment group

  1. For Environment configuration, select On-premises instances.
  2. Add the tags you registered the instance with in the previous step (for this post, we add the key-value pair Name=Rpi4).

on-premises tags
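
If you prefer to create the application and deployment group from the AWS CLI instead of the console, a rough sketch follows; the application name, deployment group name, and service role ARN are placeholders you would replace with your own values:

aws deploy create-application \
    --application-name RaspberryPiApp \
    --compute-platform Server

aws deploy create-deployment-group \
    --application-name RaspberryPiApp \
    --deployment-group-name RaspberryPiDeploymentGroup \
    --service-role-arn arn:aws:iam::<AWS_ACCOUNT_ID>:role/CodeDeployServiceRole \
    --on-premises-instance-tag-filters Key=Name,Value=Rpi4,Type=KEY_AND_VALUE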

Creating your pipeline

You’re now ready to create your pipeline.

  1. On the CodePipeline console, choose Pipelines.
  2. Choose Create pipeline.
  3. For Pipeline name, enter a descriptive name.
  4. For Service role, select New service role.
  5. For Role name, enter your service role name.
  6. Leave the advanced settings at their default.
  7. Choose Next.

 

  Pipeline settings

  1. For Source provider, choose AWS CodeCommit.
  2. For Repository name, choose the repository you created earlier.
  3. For Branch name, enter your repository branch name.
  4. For Change detection options, select Amazon CloudWatch Events.
  5. Choose Next.

Source stage

 

As an optional step, you can add a build stage, depending on whether your application is built with an interpreted language like Python or a compiled one like .NET C#. CodeBuild creates a fully managed build server on your behalf that runs the build commands using the buildspec.yml in the source code root directory.

 

  1. For Deploy provider, choose AWS CodeDeploy.
  2. For Region, choose your Region.
  3. For Application name, choose your application.
  4. For Deployment group, choose your deployment group.
  5. Choose Next.

Deploy stage

  1. Review your settings and create your pipeline.

Cleaning up

If you no longer plan to deploy to your Raspberry Pi and want to remove the CodeDeploy agent from your device, clean up with the following steps.

Uninstalling the agent

Automatically uninstall the CodeDeploy agent and remove the configuration file from an on-premises instance with the following code:

ubuntu@ubuntu:~$ sudo aws deploy uninstall
(Reading database ... 238749 files and directories currently installed.)
Removing codedeploy-agent (1.0-1.1597) ...
Processing triggers for systemd (237-3ubuntu10.39) ...
Processing triggers for ureadahead (0.100.0-21) ...
Uninstalling the AWS CodeDeploy Agent... DONE
Deleting the on-premises instance configuration... DONE

The uninstall command does the following:

  1. Stops the running CodeDeploy agent on the on-premises instance.
  2. Uninstalls the CodeDeploy agent from the on-premises instance.
  3. Removes the configuration file from the on-premises instance. (For Ubuntu Server and RHEL, this is /etc/codedeploy-agent/conf/codedeploy.onpremises.yml. For Windows Server, this is C:\ProgramData\Amazon\CodeDeploy\conf.onpremises.yml.)

De-registering the on-premises instance

This step is only supported using the AWS CLI. To de-register your instance, enter the following code:

ubuntu@ubuntu:~$ sudo aws deploy deregister --instance-name rpi4UbuntuServer --region eu-west-1
Retrieving on-premises instance information... DONE
IamUserArn: arn:aws:iam::XXXXXXXXXXXX:user/Rpi
Tags: Key=Name,Value=Rpi4
Removing tags from the on-premises instance... DONE
Deregistering the on-premises instance... DONE
Deleting the IAM user policies... DONE
Deleting the IAM user access keys... DONE
Deleting the IAM user (Rpi)... DONE

Optionally, delete your application from CodeDeploy, your repository from CodeCommit, and your pipeline from CodePipeline on the respective service consoles.

Conclusion

You’re now ready to automate your deployments to your Raspberry Pi or any supported on-premises operating system. Automated deployments and source code version control free up more time for developing your applications. Continuous deployment helps with the automation and version tracking of your scripts and applications deployed on the device.

For more information about IoT projects created using a Raspberry Pi, see my Air Pollution demo and Kid Monitor demo.

About the author

Ahmed ElHaw is a Sr. Solutions Architect at Amazon Web Services (AWS) with background in telecom, web development and design, and is passionate about spatial computing and AWS serverless technologies. He enjoys providing technical guidance to customers, helping them architect and build solutions that make the best use of AWS. Outside of work he enjoys spending time with his kids and playing video games.

Event-driven architecture for using third-party Git repositories as source for AWS CodePipeline

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/event-driven-architecture-for-using-third-party-git-repositories-as-source-for-aws-codepipeline/

In the post Using Custom Source Actions in AWS CodePipeline for Increased Visibility for Third-Party Source Control, we demonstrated using custom actions in AWS CodePipeline and a worker that periodically polls for jobs and processes further to get the artifact from the Git repository. In this post, we discuss using an event-driven architecture to trigger an AWS CodePipeline pipeline that has a third-party Git repository within the source stage that is part of a custom action.

Instead of using a worker to periodically poll for available jobs across all pipelines, we can define a custom source action on a particular pipeline to trigger an Amazon CloudWatch Events rule when the webhook on CodePipeline receives an event and puts it into an In Progress state. This works exactly like how CodePipeline works with natively supported Git repositories like AWS CodeCommit or GitHub as a source.

Solution architecture

The following diagram shows how you can use an event-driven architecture with a custom source stage that is associated with a third-party Git repository that isn’t supported by CodePipeline natively. For our use case, we use GitLab, but you can use any Git repository that supports Git webhooks.


The architecture includes the following steps:

1. A user commits code to a Git repository.

2. The commit invokes a Git webhook.

3. This invokes a CodePipeline webhook.

4. The CodePipeline source stage is put into In Progress status.

5. The source stage action triggers a CloudWatch Events rule that indicates the stage started.

6. The CloudWatch event triggers an AWS Lambda function.

7. The function polls for the job details of the custom action.

8. The function also triggers AWS CodeBuild and passes all the job-related information.

9. CodeBuild gets the public SSH key stored in AWS Secrets Manager (or user name and password, if using HTTPS Git access).

10. CodeBuild clones the repository for a particular branch.

11. CodeBuild zips and uploads the archive to the CodePipeline artifact store Amazon Simple Storage Service (Amazon S3) bucket.

12. A Lambda function sends a success message to the CodePipeline source stage so it can proceed to the next stage.

Similarly, with the same setup, if you choose Release change for a pipeline that has the custom source stage, a CloudWatch event is triggered, which invokes the Lambda function, and the same process repeats until the artifact is retrieved from the Git repository.

Solution overview

To set up the solution, you complete the following steps:

1. Create an SSH key pair for authenticating to the Git repository.

2. Publish the key to Secrets Manager.

3. Launch the AWS CloudFormation stack to provision resources.

4. Deploy a sample CodePipeline and test the custom action type.

5. Retrieve the webhook URL.

6. Create a webhook and add the webhook URL.

Creating an SSH key pair
You first create an SSH key pair to use for authenticating to the Git repository using ssh-keygen on your terminal. See the following code:

ssh-keygen -t rsa -b 4096 -C "[email protected]"

Follow the prompts from ssh-keygen and give the key a name, for example codepipeline_git_rsa. This creates two new files in the current directory: codepipeline_git_rsa and codepipeline_git_rsa.pub.

Make a note of the contents of codepipeline_git_rsa.pub and add it as an authorized key for your Git user. For instructions, see Adding an SSH key to your GitLab account.

Publishing the key
Publish this key to Secrets Manager using the AWS Command Line Interface (AWS CLI):

export SecretsManagerArn=$(aws secretsmanager create-secret --name codepipeline_git \
--secret-string file://codepipeline_git_rsa --query ARN --output text)

Make a note of the ARN, which is required later.

Alternatively, you can create the secret on the Secrets Manager console.

Make sure the private key codepipeline_git is stored in the secret exactly as it appears in the file, with its line breaks intact.
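
One way to confirm that the key was stored intact is to read the secret back and compare it with the local file:

# The output should match the contents of codepipeline_git exactly, including line breaks
aws secretsmanager get-secret-value \
    --secret-id codepipeline_git \
    --query SecretString \
    --output text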

Launching the CloudFormation stack

Clone the Git repository aws-codepipeline-third-party-git-repositories, which contains the AWS CloudFormation templates and AWS Lambda function code, using the following command:

git clone https://github.com/aws-samples/aws-codepipeline-third-party-git-repositories.git .

The cloned repository should contain the following files:

cfn/
|--sample_pipeline_custom.yaml
`--third_party_git_custom_action.yaml
lambda/
`--lambda_function.py

Launch the CloudFormation stack using the template third_party_git_custom_action.yaml from the cfn directory. The main resources created by this stack are:

1. CodePipeline Custom Action Type. ResourceType: AWS::CodePipeline::CustomActionType
2. Lambda Function. ResourceType: AWS::Lambda::Function
3. CodeBuild Project. ResourceType: AWS::CodeBuild::Project
4. Lambda Execution Role. ResourceType: AWS::IAM::Role
5. CodeBuild Service Role. ResourceType: AWS::IAM::Role

These resources implement the logic for connecting to the Git repository, which for this post is GitLab.

Upload the Lambda function code to any S3 bucket in the same Region where the stack is being deployed. To create a new S3 bucket, use the following code (make sure to provide a unique name):

export ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export S3_BUCKET_NAME=codepipeline-git-custom-action-${ACCOUNT_ID} 
aws s3 mb s3://${S3_BUCKET_NAME} --region us-east-1

Then zip the contents of the function and upload to the S3 bucket (substitute the appropriate bucket name):

export ZIP_FILE_NAME="codepipeline_git.zip"
zip -jr ${ZIP_FILE_NAME} ./lambda/lambda_function.py && \
aws s3 cp codepipeline_git.zip \
s3://${S3_BUCKET_NAME}/${ZIP_FILE_NAME}

If you don’t have a VPC and subnets that Lambda and CodeBuild require, you can create those by launching the following CloudFormation stack.

Run the following AWS CLI command to deploy the third-party Git source solution stack:

export vpcId="vpc-123456789"
export subnetId1="subnet-12345"
export subnetId2="subnet-54321"
export GIT_SOURCE_STACK_NAME="thirdparty-codepipeline-git-source"
aws cloudformation create-stack \
--stack-name ${GIT_SOURCE_STACK_NAME} \
--template-body file://$(pwd)/cfn/third_party_git_custom_action.yaml \
--parameters ParameterKey=SourceActionVersion,ParameterValue=1 \
ParameterKey=SourceActionProvider,ParameterValue=CustomSourceForGit \
ParameterKey=GitPullLambdaSubnet,ParameterValue=${subnetId1}\\,${subnetId2} \
ParameterKey=GitPullLambdaVpc,ParameterValue=${vpcId} \
ParameterKey=LambdaCodeS3Bucket,ParameterValue=${S3_BUCKET_NAME} \
ParameterKey=LambdaCodeS3Key,ParameterValue=${ZIP_FILE_NAME} \
--capabilities CAPABILITY_IAM

Alternatively, launch the stack by choosing Launch Stack:

cloudformation-launch-stack.png

For more information about the VPC requirements for Lambda and CodeBuild, see Internet and service access for VPC-connected functions and Use AWS CodeBuild with Amazon Virtual Private Cloud, respectively.

A custom source action type is now available on the account where you deployed the stack. You can check this on the CodePipeline console by attempting to create a new pipeline. You can see your source type listed under Source provider.


Testing the pipeline

We now deploy a sample pipeline and test the custom action type using the template sample_pipeline_custom.yaml from the cfn directory. You can run the following AWS CLI command to deploy the CloudFormation stack:

Note: Set the SSH_URL environment variable to a GitLab repository URL that you have access to, or create a new GitLab project and repository. The example URL "[email protected]:kirankumar15/test.git" is for illustration purposes only.

export SSH_URL="[email protected]:kirankumar15/test.git"
export SAMPLE_STACK_NAME="third-party-codepipeline-git-source-test"
aws cloudformation create-stack \
--stack-name ${SAMPLE_STACK_NAME} \
--template-body file://$(pwd)/cfn/sample_pipeline_custom.yaml \
--parameters ParameterKey=Branch,ParameterValue=master \
ParameterKey=GitUrl,ParameterValue=${SSH_URL} \
ParameterKey=SourceActionVersion,ParameterValue=1 \
ParameterKey=SourceActionProvider,ParameterValue=CustomSourceForGit \
ParameterKey=CodePipelineName,ParameterValue=sampleCodePipeline \
ParameterKey=SecretsManagerArnForSSHPrivateKey,ParameterValue=${SecretsManagerArn} \
ParameterKey=GitWebHookIpAddress,ParameterValue=34.74.90.64/28 \
--capabilities CAPABILITY_IAM

Alternatively, choose Launch stack:

cloudformation-launch-stack.png

Retrieving the webhook URL
When the stack creation is complete, retrieve the CodePipeline webhook URL from the stack outputs. Use the following AWS CLI command:

aws cloudformation describe-stacks \
--stack-name ${SAMPLE_STACK_NAME} \
--output text \
--query "Stacks[].Outputs[?OutputKey=='CodePipelineWebHookUrl'].OutputValue"

For more information about stack outputs, see Outputs.

Creating a webhook

You can use an existing GitLab repository or create a new one. To add the webhook, complete the following steps:

1. Navigate to the Webhooks Settings section on the GitLab console for the repository that you want to have as a source for CodePipeline.

2. For URL, enter the CodePipeline webhook URL you retrieved in the previous step.

3. Select Push events and optionally enter a branch name.

4. Select Enable SSL verification.

5. Choose Add webhook.


For more information about webhooks, see Webhooks.

We’re now ready to test the solution.

Testing the solution

To test the solution, we make changes to the branch that we passed as the branch parameter in the GitLab repository. This should trigger the pipeline. On the CodePipeline console, you can see the Git Commit ID on the source stage of the pipeline when it succeeds.

Note: Use a GitLab repository URL that you have access to, or create a new GitLab repository, and make sure it contains a buildspec.yml for the AWS CodeBuild project to run in the Build stage. The example URL "[email protected]:kirankumar15/test.git" is for illustration purposes only.

Enter the following code to clone your repository:

git clone [email protected]:kirankumar15/test.git .

Add a sample file named sample_text_file.txt to the repository, then commit and push it:

echo "adding a sample file" >> sample_text_file.txt
git add ./
git commit -m "added sample_test_file.txt to the repository"
git push -u origin master

The pipeline should show the status In Progress.


After a few minutes, the status changes to Succeeded and you see the Git commit message on the source stage.


You can also view the Git Commit message by choosing the execution ID of the pipeline, navigating to the timeline section, and choosing the source action. You should see the Commit message and Commit ID that correlates with the Git repository.


Troubleshooting

If the pipeline fails, check the logs of the Lambda function named GitLab-CodePipeline-Source-${ACCOUNT_ID}. For instructions on checking logs, see Accessing Amazon CloudWatch logs for AWS Lambda.

If the Lambda logs contain a CodeBuild build ID, check the CodeBuild run logs for that build ID for errors. For instructions, see View detailed build information.
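
For example, with AWS CLI version 2 you can tail the function’s log group directly (Lambda log groups follow the /aws/lambda/<function-name> naming convention):

aws logs tail /aws/lambda/GitLab-CodePipeline-Source-${ACCOUNT_ID} --follow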

Cleaning up

Delete the CloudFormation stacks that you created. You can use the following AWS CLI commands:

aws cloudformation delete-stack --stack-name ${SAMPLE_STACK_NAME}

aws cloudformation delete-stack --stack-name ${GIT_SOURCE_STACK_NAME}

Alternatively, delete the stack on the AWS CloudFormation console.

Additionally, empty and delete the S3 buckets: the artifact store bucket created by the ${SAMPLE_STACK_NAME} stack, and the bucket you created earlier for the Lambda function code. For the latter, you can use the following AWS CLI command:

aws s3 rb s3://${S3_BUCKET_NAME} --force

You can also delete and empty the bucket on the Amazon S3 console.

Conclusion

You can use the architecture in this post for any Git repository that supports webhooks. This solution also works if the repository is reachable only from on premises, and if the endpoints can be accessed from a VPC. This event-driven architecture works just like using any natively supported source for CodePipeline.

 

About the Author

Kirankumar Chandrashekar is a DevOps consultant at AWS Professional Services. He focuses on leading customers in architecting DevOps technologies. Kirankumar is passionate about Infrastructure as Code and DevOps. He enjoys music, as well as cooking and travelling.