Tag Archives: Continuous Integration

Our Journey to Continuous Delivery at Grab (Part 2)

Post Syndicated from Grab Tech original https://engineering.grab.com/our-journey-to-continuous-delivery-at-grab-part2

In the first part of this blog post, you’ve read about the improvements made to our build and staging deployment process, and how plenty of manual tasks routinely taken by engineers have been automated with Conveyor: an in-house continuous delivery solution.

This new post begins with the introduction of the hermeticity principle for our deployments, and how it improves the confidence with promoting changes to production. Changes sent to production via Conveyor’s deployment pipelines are then described in detail.

Overview of Grab delivery process
Overview of Grab delivery process

Finally, looking back at the engineering efficiency improvements around velocity and reliability over the last 2 years, we answer the big question – was the investment on a custom continuous delivery solution like Conveyor the right decision for Grab?

Improving Confidence in our Production Deployments with Hermeticity

The term deployment hermeticity is borrowed from build systems. A build system is called hermetic if builds always produce the same artefacts regardless of changes in the environment they run on. Similarly, we call our deployments hermetic if they always result in the same deployed artefacts regardless of the environment’s change or the number of times they are executed.

The behaviour of a service is rarely controlled by a single variable. The application that makes up your service is an important driver of its behaviour, but its configuration is an important contributor, for example. The behaviour for traditional microservices at Grab is dictated mainly by 3 versioned artefacts: application code, static and dynamic configuration.

Conveyor has been integrated with the systems that operate changes in each of these parameters. By tracking all 3 parameters at every deployment, Conveyor can reproducibly deploy microservices with similar behaviour: its deployments are hermetic.

Building upon this property, Conveyor can ensure that all deployments made to production have been tested before with the same combination of parameters. This is valuable to us:

  • An outcome of staging deployments for a specific set of parameters is a good predictor of outcomes in production deployments for the same set of parameters and thus it makes testing in staging more relevant.
  • Rollbacks are hermetic; we never rollback to a combination of parameters that has not been used previously.

In the past, incidents had resulted from an application rollback not compatible with the current dynamic configuration version; this was aggravating since rollbacks are expected to be a safe recovery mechanism. The introduction of hermetic deployments has largely eliminated this category of problems.

Hermeticity is maintained by registering the deployment parameters as artefacts after each successfully completed pipeline. Users must then select one of the registered deployment metadata to promote to production.

At this point, you might be wondering: why not use a single pipeline that includes both staging and production deployments? This was indeed how it started, with a single pipeline spanning multiple environments. However, engineers soon complained about it.

The most obvious reason for the complaint was that less than 20% of changes deployed in staging will make their way to production. This meant that engineers would have toil associated with each completed staging deployment since the pipeline must be manually cancelled rather than continued to production.

The other reason is that this multi-environment pipeline approach reduced flexibility when promoting changes to production. There are different ways to apply changes to a cluster. For example, lengthy pipelines that refresh instances can be used to deploy any combination of changes, while there are quicker pipelines restricted to dynamic configuration changes (such as feature flags rollouts). Regardless of the order in which the changes are made and how they are applied, Conveyor tracks the change.

Eventually, engineers promote a deployment artefact to production. However they do not need to apply changes in the same sequence with which were applied to staging. Furthermore, to prevent erroneous actions, Conveyor presents only changes that can be applied with the requested pipeline (and sometimes, no changes are available). Not being forced into a specific method of deploying changes is one of added benefits of hermetic deployments.

Returning to Our Journey Towards Engineering Efficiency

If you can recall, the first part of this blog post series ended with a description of staging deployment. Our deployment to production starts with a verification that we uphold our hermeticity principle, as explained above.

Our production deployment pipelines can run for several hours for large clusters with rolling releases (few run for days), so we start by acquiring locks to ensure there are no concurrent deployments for any given cluster.

Before making any changes to the environment, we automatically generate release notes, giving engineers a chance to abort if the wrong set of changes are sent to production.

The pipeline next waits for a deployment slot. Early on, engineers adopted deployment windows that coincide with office hours, such that if anything goes wrong, there is always someone on hand to help. Prior to the introduction of Conveyor, however, engineers would manually ask a Slack bot for approval. This interaction is now automated, and the only remaining action left is for the engineer to approve that the deployment can proceed via a single click, in line with our hands-off deployment principle.

When the canary is in production, Conveyor automates monitoring it. This process is similar to the one already discussed in the first part of this blog post: Engineers can configure a set of alerts that Conveyor will keep track of. As soon as any one of the alerts is triggered, Conveyor automatically rolls back the service.

If no alert is raised for the duration of the monitoring period, Conveyor waits again for a deployment slot. It then publishes the release notes for that deployment and completes the deployments for the cluster. After the lock is released and the deployment registered, the pipeline finally comes to its successful completion.

Benefits of Our Journey Towards Engineering Efficiency

All these improvements made over the last 2 years have reduced the effort spent by engineers on deployment while also reducing the failure rate of our deployments.

If you are an engineer working on DevOps in your organisation, you know how hard it can be to measure the impact you made on your organisation. To estimate the time saved by our pipelines, we can model the activities that were previously done manually with a rudimentary weighted graph. In this graph, each edge carries a probability of the activity being performed (100% when unspecified), while each vertex carries the time taken for that activity.

Focusing on our regular staging deployments only, such a graph would look like this:

The overall amount of effort automated by the staging pipelines () is represented in the graph above. It can be converted into the equation below:

This equation shows that for each staging deployment, around 16 minutes of work have been saved. Similarly, for regular production deployments, we find that 67 minutes of work were saved for each deployment:

Moreover, efficiency was not the only benefit brought by the use of deployment pipelines for our traditional microservices. Surprisingly perhaps, the rate of failures related to production changes is progressively reducing while the amount of production changes that were made with Conveyor increased across the organisation (starting at 1.5% of failures per deployments, and finishing at 0.3% on average over the last 3 months for the period of data collected):

Keep Calm and Automate

Since the first draft for this post was written, we’ve made many more improvements to our pipelines. We’ve begun automating Database Migrations; we’ve extended our set of hermetic variables to Amazon Machine Image (AMI) updates; and we’re working towards supporting container deployments.

Through automation, all of Conveyor’s deployment pipelines have contributed to save more than 5,000 man-days of efforts in 2020 alone, across all supported teams. That’s around 20 man-years worth of effort, which is around 3 times the capacity of the team working on the project! Investments in our automation pipelines have more than paid for themselves, and the gains go up every year as more workflows are automated and more teams are onboarded.

If Conveyor has saved efforts for engineering teams, has it then helped to improve velocity? I had opened the first part of this blog post with figures on the deployment funnel for microservice teams at Grab, towards the end of 2018. So where do the figures stand today for these teams?

In the span of 2 years, the average number of build and staging deployment performed each day has not varied much. However, in the last 3 months of 2020, engineers have sent twice more changes to production than they did for the same period in 2018.

Perhaps the biggest recognition received by the team working on the project, was from Grab’s engineers themselves. In the 2020 internal NPS survey for engineering experience at Grab, Conveyor received the highest score of any tools (built in-house or not).


All these improvements in efficiency for our engineers would never have been possible without the hard work of all team members involved in the project, past and present: Tanun Chalermsinsuwan, Aufar Gilbran, Deepak Ramakrishnaiah, Repon Kumar Roy (Kowshik), Su Han, Voislav Dimitrijevikj, Stanley Goh, Htet Aung Shine, Evan Sebastian, Qijia Wang, Oscar Ng, Jacob Sunny, Subhodip Mandal and many others who have contributed and collaborated with them.


Join Us

Grab is the leading superapp platform in Southeast Asia, providing everyday services that matter to consumers. More than just a ride-hailing and food delivery app, Grab offers a wide range of on-demand services in the region, including mobility, food, package and grocery delivery services, mobile payments, and financial services across 428 cities in eight countries.

Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!

Integrate GitHub monorepo with AWS CodePipeline to run project-specific CI/CD pipelines

Post Syndicated from Vivek Kumar original https://aws.amazon.com/blogs/devops/integrate-github-monorepo-with-aws-codepipeline-to-run-project-specific-ci-cd-pipelines/

AWS CodePipeline is a continuous delivery service that enables you to model, visualize, and automate the steps required to release your software. With CodePipeline, you model the full release process for building your code, deploying to pre-production environments, testing your application, and releasing it to production. CodePipeline then builds, tests, and deploys your application according to the defined workflow either in manual mode or automatically every time a code change occurs. A lot of organizations use GitHub as their source code repository. Some organizations choose to embed multiple applications or services in a single GitHub repository separated by folders. This method of organizing your source code in a repository is called a monorepo.

This post demonstrates how to customize GitHub events that invoke a monorepo service-specific pipeline by reading the GitHub event payload using AWS Lambda.

 

Solution overview

With the default setup in CodePipeline, a release pipeline is invoked whenever a change in the source code repository is detected. When using GitHub as the source for a pipeline, CodePipeline uses a webhook to detect changes in a remote branch and starts the pipeline. When using a monorepo style project with GitHub, it doesn’t matter which folder in the repository you change the code, CodePipeline gets an event at the repository level. If you have a continuous integration and continuous deployment (CI/CD) pipeline for each of the applications and services in a repository, all pipelines detect the change in any of the folders every time. The following diagram illustrates this scenario.

 

GitHub monorepo folder structure

 

This post demonstrates how to customize GitHub events that invoke a monorepo service-specific pipeline by reading the GitHub event payload using Lambda. This solution has the following benefits:

  • Add customizations to start pipelines based on external factors – You can use custom code to evaluate whether a pipeline should be triggered. This allows for further customization beyond polling a source repository or relying on a push event. For example, you can create custom logic to automatically reschedule deployments on holidays to the next available workday.
  • Have multiple pipelines with a single source – You can trigger selected pipelines when multiple pipelines are listening to a single GitHub repository. This lets you group small and highly related but independently shipped artifacts such as small microservices without creating thousands of GitHub repos.
  • Avoid reacting to unimportant files – You can avoid triggering a pipeline when changing files that don’t affect the application functionality (such as documentation, readme, PDF, and .gitignore files).

In this post, we’re not debating the advantages or disadvantages of a monorepo versus a single repo, or when to create monorepos or single repos for each application or project.

 

Sample architecture

This post focuses on controlling running pipelines in CodePipeline. CodePipeline can have multiple stages like test, approval, and deploy. Our sample architecture considers a simple pipeline with two stages: source and build.

 

Github monorepo - CodePipeline Sample Architecture

This solution is made up of following parts:

  • An Amazon API Gateway endpoint (3) is backed by a Lambda function (5) to receive and authenticate GitHub webhook push events (2)
  • The same function evaluates incoming GitHub push events and starts the pipeline on a match
  • An Amazon Simple Storage Service (Amazon S3) bucket (4) stores the CodePipeline-specific configuration files
  • The pipeline contains a build stage with AWS CodeBuild

 

Normally, after you create a CI/CD pipeline, it automatically triggers a pipeline to release the latest version of your source code. From then on, every time you make a change in your source code, the pipeline is triggered. You can also manually run the last revision through a pipeline by choosing Release change on the CodePipeline console. This architecture uses the manual mode to run the pipeline. GitHub push events and branch changes are evaluated by the Lambda function to avoid commits that change unimportant files from starting the pipeline.

 

Creating an API Gateway endpoint

We need a single API Gateway endpoint backed by a Lambda function with the responsibility of authenticating and validating incoming requests from GitHub. You can authenticate requests using HMAC security or GitHub Apps. API Gateway only needs one POST method to consume GitHub push events, as shown in the following screenshot.

 

Creating an API Gateway endpoint

 

Creating the Lambda function

This Lambda function is responsible for authenticating and evaluating the GitHub events. As part of the evaluation process, the function can parse through the GitHub events payload, determine which files are changed, added, or deleted, and perform the appropriate action:

  • Start a single pipeline, depending on which folder is changed in GitHub
  • Start multiple pipelines
  • Ignore the changes if non-relevant files are changed

You can store the project configuration details in Amazon S3. Lambda can read this configuration to decide what needs to be done when a particular folder is matched from a GitHub event. The following code is an example configuration:

{

    "GitHubRepo": "SampleRepo",

    "GitHubBranch": "main",

    "ChangeMatchExpressions": "ProjectA/.*",

    "IgnoreFiles": "*.pdf;*.md",

    "CodePipelineName": "ProjectA - CodePipeline"

}

For more complex use cases, you can store the configuration file in Amazon DynamoDB.

The following is the sample Lambda function code in Python 3.7 using Boto3:

def lambda_handler(event, context):

    import json
    modifiedFiles = event["commits"][0]["modified"]
    #full path
    for filePath in modifiedFiles:
        # Extract folder name
        folderName = (filePath[:filePath.find("/")])
        break

    #start the pipeline
    if len(folderName)>0:
        # Codepipeline name is foldername-job. 
        # We can read the configuration from S3 as well. 
        returnCode = start_code_pipeline(folderName + '-job')

    return {
        'statusCode': 200,
        'body': json.dumps('Modified project in repo:' + folderName)
    }
    

def start_code_pipeline(pipelineName):
    client = codepipeline_client()
    response = client.start_pipeline_execution(name=pipelineName)
    return True

cpclient = None
def codepipeline_client():
    import boto3
    global cpclient
    if not cpclient:
        cpclient = boto3.client('codepipeline')
    return cpclient
   

Creating a GitHub webhook

GitHub provides webhooks to allow external services to be notified on certain events. For this use case, we create a webhook for a push event. This generates a POST request to the URL (API Gateway URL) specified for any files committed and pushed to the repository. The following screenshot shows our webhook configuration.

Creating a GitHub webhook2

Conclusion

In our sample architecture, two pipelines monitor the same GitHub source code repository. A Lambda function decides which pipeline to run based on the GitHub events. The same function can have logic to ignore unimportant files, for example any readme or PDF files.

Using API Gateway, Lambda, and Amazon S3 in combination serves as a general example to introduce custom logic to invoke pipelines. You can expand this solution for increasingly complex processing logic.

 

About the Author

Vivek Kumar

Vivek is a Solutions Architect at AWS based out of New York. He works with customers providing technical assistance and architectural guidance on various AWS services. He brings more than 23 years of experience in software engineering and architecture roles for various large-scale enterprises.

 

 

Gaurav-Sharma

Gaurav is a Solutions Architect at AWS. He works with digital native business customers providing architectural guidance on AWS services.

 

 

 

Nitin-Aggarwal

Nitin is a Solutions Architect at AWS. He works with digital native business customers providing architectural guidance on AWS services.

 

 

 

 

Integrating AWS Device Farm with your CI/CD pipeline to run cross-browser Selenium tests

Post Syndicated from Mahesh Biradar original https://aws.amazon.com/blogs/devops/integrating-aws-device-farm-with-ci-cd-pipeline-to-run-cross-browser-selenium-tests/

Continuously building, testing, and deploying your web application helps you release new features sooner and with fewer bugs. In this blog, you will create a continuous integration and continuous delivery (CI/CD) pipeline for a web app using AWS CodeStar services and AWS Device Farm’s desktop browser testing service.  AWS CodeStar is a suite of services that help you quickly develop and build your web apps on AWS.

AWS Device Farm’s desktop browser testing service helps developers improve the quality of their web apps by executing their Selenium tests on different desktop browsers hosted in AWS. For each test executed on the service, Device Farm generates action logs, web driver logs, video recordings, to help you quickly identify issues with your app. The service offers pay-as-you-go pricing so you only pay for the time your tests are executing on the browsers with no upfront commitments or additional costs. Furthermore, Device Farm provides a default concurrency of 50 Selenium sessions so you can run your tests in parallel and speed up the execution of your test suites.

After you’ve completed the steps in this blog, you’ll have a working pipeline that will build your web app on every code commit and test it on different versions of desktop browsers hosted in AWS.

Solution overview

In this solution, you use AWS CodeStar to create a sample web application and a CI/CD pipeline. The following diagram illustrates our solution architecture.

Figure 1: Deployment pipeline architecture

Figure 1: Deployment pipeline architecture

Prerequisites

Before you deploy the solution, complete the following prerequisite steps:

  1. On the AWS Management Console, search for AWS CodeStar.
  2. Choose Getting Started.
  3. Choose Create Project.
  4. Choose a sample project.

For this post, we choose a Python web app deployed on Amazon Elastic Compute Cloud (Amazon EC2) servers.

  1. For Templates, select Python (Django).
  2. Choose Next.
Figure 2: Choose a template in CodeStar

Figure 2: Choose a template in CodeStar

  1. For Project name, enter Demo CICD Selenium.
  2. Leave the remaining settings at their default.

You can choose from a list of existing key pairs. If you don’t have a key pair, you can create one.

  1. Choose Next.
  2. Choose Create project.
 Figure 3: Verify the project configuration

Figure 3: Verify the project configuration

AWS CodeStar creates project resources for you as listed in the following table.

Service Resource Created
AWS CodePipeline project demo-cicd-selen-Pipeline
AWS CodeCommit repository Demo-CICD-Selenium
AWS CodeBuild project demo-cicd-selen
AWS CodeDeploy application demo-cicd-selen
AWS CodeDeploy deployment group demo-cicd-selen-Env
Amazon EC2 server Tag: Environment = demo-cicd-selen-WebApp
IAM role AWSCodeStarServiceRole, CodeStarWorker*

Your EC2 instance needs to have access to run AWS Device Farm to run the Selenium test scripts. You can use service roles to achieve that.

  1. Attach policy AWSDeviceFarmFullAccess to the IAM role CodeStarWorker-demo-cicd-selen-WebApp.

You’re now ready to create an AWS Cloud9 environment.

Check the Pipelines tab of your AWS CodeStar project; it should show success.

  1. On the IDE tab, under Cloud 9 environments, choose Create environment.
  2. For Environment name, enter Demo-CICD-Selenium.
  3. Choose Create environment.
  4. Wait for the environment to be complete and then choose Open IDE.
  5. In the IDE, follow the instructions to set up your Git and make sure it’s up to date with your repo.

The following screenshot shows the verification that the environment is set up.

Figure 4: Verify AWS Cloud9 setup

Figure 4: Verify AWS Cloud9 setup

You can now verify the deployment.

  1. On the Amazon EC2 console, choose Instances.
  2. Choose demo-cicd-selen-WebApp.
  3. Locate its public IP and choose open address.
Figure 5: Locating IP address of the instance

Figure 5: Locating IP address of the instance

 

A webpage should open. If it doesn’t, check your VPN and firewall settings.

Now that you have a working pipeline, let’s move on to creating a Device Farm project for browser testing.

Creating a Device Farm project

To create your browser testing project, complete the following steps:

  1. On the Device Farm console, choose Desktop browser testing project.
  2. Choose Create a new project.
  3. For Project name, enter Demo cicd selenium.
  4. Choose Create project.
Figure 6: Creating AWS Device Farm project

Figure 6: Creating AWS Device Farm project

 

Note down the project ARN; we use this in the Selenium script for the remote web driver.

Figure 7: Project ARN for AWS Device Farm project

Figure 7: Project ARN for AWS Device Farm project

 

Testing the solution

This solution uses the following script to run browser testing. We call this script in the validate service lifecycle hook of the CodeDeploy project.

  1. Open your AWS Cloud9 IDE (which you made as a prerequisite).
  2. Create a folder tests under the project root directory.
  3. Add the sample Selenium script browser-test-sel.py under tests folder with the following content (replace <sample_url> with the url of your web application refer pre-requisite step 18):
import boto3
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

devicefarm_client = boto3.client("devicefarm", region_name="us-west-2")
testgrid_url_response = devicefarm_client.create_test_grid_url(
    projectArn="arn:aws:devicefarm:us-west-2:<your project ARN>",
    expiresInSeconds=300)

driver = webdriver.Remote(testgrid_url_response["url"],
                          webdriver.DesiredCapabilities.FIREFOX)
try:
    driver.implicitly_wait(30)
    driver.maximize_window()
    driver.get("<sample_url>")
    if driver.find_element_by_id("Layer_1"):
        print("graphics generated in full screen")
    assert driver.find_element_by_id("Layer_1")
    driver.set_window_position(0, 0) and driver.set_window_size(1000, 400)
    driver.get("<sample_url>")
    tower = driver.find_element_by_id("tower1")
    if tower.is_displayed():
        print("graphics generated after resizing")
    else:
        print("graphics not generated at this window size")
       # this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate
except Exception as e:
    print(e)
finally:
    driver.quit()

This script launches a website (in Firefox) created by you in the prerequisite step using the AWS CodeStar template and verifies if a graphic element is loaded.

The following screenshot shows a full-screen application with graphics loaded.

Figure 8: Full screen application with graphics loaded

Figure 8: Full screen application with graphics loaded

 

The following screenshot shows the resized window with no graphics loaded.

Figure 9: Resized window application with No graphics loaded

Figure 9: Resized window application with No graphics loaded

  1. Create the file validate_service in the scripts folder under the root directory with the following content:
#!/bin/bash
if [ "$DEPLOYMENT_GROUP_NAME" == "demo-cicd-selen-Env" ]
then
cd /home/ec2-user
source environment/bin/activate
python tests/browser-test-sel.py
fi

This script is part of CodeDeploy scripts and determines whether to stop the pipeline or continue based on the output from the browser testing script from the preceding step.

  1. Modify the file appspec.yml under the root directory, add the tests files and ValidateService hook , the file should look like following:
version: 0.0
os: linux
files:
 - source: /ec2django/
   destination: /home/ec2-user/ec2django
 - source: /helloworld/
   destination: /home/ec2-user/helloworld
 - source: /manage.py
   destination: /home/ec2-user
 - source: /supervisord.conf
   destination: /home/ec2-user
 - source: /requirements.txt
   destination: /home/ec2-user
 - source: /requirements/
   destination: /home/ec2-user/requirements
 - source: /tests/
   destination: /home/ec2-user/tests

permissions:
  - object: /home/ec2-user/manage.py
    owner: ec2-user
    mode: 644
    type:
      - file
  - object: /home/ec2-user/supervisord.conf
    owner: ec2-user
    mode: 644
    type:
      - file
hooks:
  AfterInstall:
    - location: scripts/install_dependencies
      timeout: 300
      runas: root
    - location: scripts/codestar_remote_access
      timeout: 300
      runas: root
    - location: scripts/start_server
      timeout: 300
      runas: root

  ApplicationStop:
    - location: scripts/stop_server
      timeout: 300
      runas: root

  ValidateService:
    - location: scripts/validate_service
      timeout: 600
      runas: root

This file is used by AWS CodeDeploy service to perform the deployment and validation steps.

  1. Modify the artifacts section in the buildspec.yml file. The section should look like the following:
artifacts:
  files:
    - 'template.yml'
    - 'ec2django/**/*'
    - 'helloworld/**/*'
    - 'scripts/**/*'
    - 'tests/**/*'
    - 'appspec.yml'
    - 'manage.py'
    - 'requirements/**/*'
    - 'requirements.txt'
    - 'supervisord.conf'
    - 'template-configuration.json'

This file is used by AWS CodeBuild service to package the code

  1. Modify the file Common.txt in the requirements folder under the root directory, the file should look like the following:
# dependencies common to all environments 
Django=2.1.15 
selenium==3.141.0 
boto3 >= 1.10.44
pytest

  1. Save All the changes, your folder structure should look like the following:
── README.md
├── appspec.yml*
├── buildspec.yml*
├── db.sqlite3
├── ec2django
├── helloworld
├── manage.py
├── requirements
│   ├── common.txt*
│   ├── dev.txt
│   ├── prod.txt
├── requirements.txt
├── scripts
│   ├── codestar_remote_access
│   ├── install_dependencies
│   ├── start_server
│   ├── stop_server
│   └── validate_service**
├── supervisord.conf
├── template-configuration.json
├── template.yml
├── tests
│   └── browser-test-sel.py**

**newly added files
*modified file

Running the tests

The Device Farm desktop browsing project is now integrated with your pipeline. All you need to do now is commit the code, and CodePipeline takes care of the rest.

  1. On Cloud9 terminal, go to project root directory.
  2. Run git add . to stage all changed files for commit.
  3. Run git commit -m “<commit message>” to commit the changes.
  4. Run git push to push the changes to repository, this should trigger the Pipeline.
  5. Check the Pipelines tab of your AWS CodeStar project; it should show success.
  6. Go to AWS Device Farm console and click on Desktop browser testing projects.
  7. Click on your Project Demo cicd selenium.

You can verify the running of your Selenium test cases using the recorded run steps shown on the Device Farm console, the video of the run, and the logs, all of which can be downloaded on the Device Farm console, or using the AWS SDK and AWS Command Line Interface (AWS CLI).

The following screenshot shows the project run details on the console.

Figure 10: Viewing AWS Device Farm project run details

Figure 10: Viewing AWS Device Farm project run details

 

To Test the Selenium script locally you can run the following commands.

1. Create a Python virtual environment for your Django project. This virtual environment allows you to isolate this project and install any packages you need without affecting the system Python installation. At the terminal, go to project root directory and type the following command:

$ python3 -m venv ./venv

2. Activate the virtual environment:

$ source ./venv/bin/activate

3. Install development Python dependencies for this project:

$ pip install -r requirements/dev.txt

4. Run Selenium Script:

$ python tests/browser-test-sel.py

Testing the failure scenario (optional)

To test the failure scenario, you can modify the sample script browser-test-sel.py at the else statement.

The following code shows the lines to change:

else:
print("graphics was not generated at this form size")
# this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate

The following is the updated code:

else:
exit(1)
# this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate

Commit the change, and the pipeline should fail and stop the deployment.

Conclusion

Integrating Device Farm with CI/CD pipelines allows you to control deployment based on browser testing results. Failing the Selenium test on validation failures can stop and roll back the deployment, and a successful testing can continue the pipeline to deploy the solution to the final stage. Device Farm offers a one-stop solution for testing your native and web applications on desktop browsers and real mobile devices.

Mahesh Biradar is a Solutions Architect at AWS. He is a DevOps enthusiast and enjoys helping customers implement cost-effective architectures that scale..

Automate thousands of mainframe tests on AWS with the Micro Focus Enterprise Suite

Post Syndicated from Kevin Yung original https://aws.amazon.com/blogs/devops/automate-mainframe-tests-on-aws-with-micro-focus/

Micro Focus – AWS Advanced Technology Parnter, they are a global infrastructure software company with 40 years of experience in delivering and supporting enterprise software.

We have seen mainframe customers often encounter scalability constraints, and they can’t support their development and test workforce to the scale required to support business requirements. These constraints can lead to delays, reduce product or feature releases, and make them unable to respond to market requirements. Furthermore, limits in capacity and scale often affect the quality of changes deployed, and are linked to unplanned or unexpected downtime in products or services.

The conventional approach to address these constraints is to scale up, meaning to increase MIPS/MSU capacity of the mainframe hardware available for development and testing. The cost of this approach, however, is excessively high, and to ensure time to market, you may reject this approach at the expense of quality and functionality. If you’re wrestling with these challenges, this post is written specifically for you.

To accompany this post, we developed an AWS prescriptive guidance (APG) pattern for developer instances and CI/CD pipelines: Mainframe Modernization: DevOps on AWS with Micro Focus.

Overview of solution

In the APG, we introduce DevOps automation and AWS CI/CD architecture to support mainframe application development. Our solution enables you to embrace both Test Driven Development (TDD) and Behavior Driven Development (BDD). Mainframe developers and testers can automate the tests in CI/CD pipelines so they’re repeatable and scalable. To speed up automated mainframe application tests, the solution uses team pipelines to run functional and integration tests frequently, and uses systems test pipelines to run comprehensive regression tests on demand. For more information about the pipelines, see Mainframe Modernization: DevOps on AWS with Micro Focus.

In this post, we focus on how to automate and scale mainframe application tests in AWS. We show you how to use AWS services and Micro Focus products to automate mainframe application tests with best practices. The solution can scale your mainframe application CI/CD pipeline to run thousands of tests in AWS within minutes, and you only pay a fraction of your current on-premises cost.

The following diagram illustrates the solution architecture.

Mainframe DevOps On AWS Architecture Overview, on the left is the conventional mainframe development environment, on the left is the CI/CD pipelines for mainframe tests in AWS

Figure: Mainframe DevOps On AWS Architecture Overview

 

Best practices

Before we get into the details of the solution, let’s recap the following mainframe application testing best practices:

  • Create a “test first” culture by writing tests for mainframe application code changes
  • Automate preparing and running tests in the CI/CD pipelines
  • Provide fast and quality feedback to project management throughout the SDLC
  • Assess and increase test coverage
  • Scale your test’s capacity and speed in line with your project schedule and requirements

Automated smoke test

In this architecture, mainframe developers can automate running functional smoke tests for new changes. This testing phase typically “smokes out” regression of core and critical business functions. You can achieve these tests using tools such as py3270 with x3270 or Robot Framework Mainframe 3270 Library.

The following code shows a feature test written in Behave and test step using py3270:

# home_loan_calculator.feature
Feature: calculate home loan monthly repayment
  the bankdemo application provides a monthly home loan repayment caculator 
  User need to input into transaction of home loan amount, interest rate and how many years of the loan maturity.
  User will be provided an output of home loan monthly repayment amount

  Scenario Outline: As a customer I want to calculate my monthly home loan repayment via a transaction
      Given home loan amount is <amount>, interest rate is <interest rate> and maturity date is <maturity date in months> months 
       When the transaction is submitted to the home loan calculator
       Then it shall show the monthly repayment of <monthly repayment>

    Examples: Homeloan
      | amount  | interest rate | maturity date in months | monthly repayment |
      | 1000000 | 3.29          | 300                     | $4894.31          |

 

# home_loan_calculator_steps.py
import sys, os
from py3270 import Emulator
from behave import *

@given("home loan amount is {amount}, interest rate is {rate} and maturity date is {maturity_date} months")
def step_impl(context, amount, rate, maturity_date):
    context.home_loan_amount = amount
    context.interest_rate = rate
    context.maturity_date_in_months = maturity_date

@when("the transaction is submitted to the home loan calculator")
def step_impl(context):
    # Setup connection parameters
    tn3270_host = os.getenv('TN3270_HOST')
    tn3270_port = os.getenv('TN3270_PORT')
	# Setup TN3270 connection
    em = Emulator(visible=False, timeout=120)
    em.connect(tn3270_host + ':' + tn3270_port)
    em.wait_for_field()
	# Screen login
    em.fill_field(10, 44, 'b0001', 5)
    em.send_enter()
	# Input screen fields for home loan calculator
    em.wait_for_field()
    em.fill_field(8, 46, context.home_loan_amount, 7)
    em.fill_field(10, 46, context.interest_rate, 7)
    em.fill_field(12, 46, context.maturity_date_in_months, 7)
    em.send_enter()
    em.wait_for_field()    

    # collect monthly replayment output from screen
    context.monthly_repayment = em.string_get(14, 46, 9)
    em.terminate()

@then("it shall show the monthly repayment of {amount}")
def step_impl(context, amount):
    print("expected amount is " + amount.strip() + ", and the result from screen is " + context.monthly_repayment.strip())
assert amount.strip() == context.monthly_repayment.strip()

To run this functional test in Micro Focus Enterprise Test Server (ETS), we use AWS CodeBuild.

We first need to build an Enterprise Test Server Docker image and push it to an Amazon Elastic Container Registry (Amazon ECR) registry. For instructions, see Using Enterprise Test Server with Docker.

Next, we create a CodeBuild project and uses the Enterprise Test Server Docker image in its configuration.

The following is an example AWS CloudFormation code snippet of a CodeBuild project that uses Windows Container and Enterprise Test Server:

  BddTestBankDemoStage:
    Type: AWS::CodeBuild::Project
    Properties:
      Name: !Sub '${AWS::StackName}BddTestBankDemo'
      LogsConfig:
        CloudWatchLogs:
          Status: ENABLED
      Artifacts:
        Type: CODEPIPELINE
        EncryptionDisabled: true
      Environment:
        ComputeType: BUILD_GENERAL1_LARGE
        Image: !Sub "${EnterpriseTestServerDockerImage}:latest"
        ImagePullCredentialsType: SERVICE_ROLE
        Type: WINDOWS_SERVER_2019_CONTAINER
      ServiceRole: !Ref CodeBuildRole
      Source:
        Type: CODEPIPELINE
        BuildSpec: bdd-test-bankdemo-buildspec.yaml

In the CodeBuild project, we need to create a buildspec to orchestrate the commands for preparing the Micro Focus Enterprise Test Server CICS environment and issue the test command. In the buildspec, we define the location for CodeBuild to look for test reports and upload them into the CodeBuild report group. The following buildspec code uses custom scripts DeployES.ps1 and StartAndWait.ps1 to start your CICS region, and runs Python Behave BDD tests:

version: 0.2
phases:
  build:
    commands:
      - |
        # Run Command to start Enterprise Test Server
        CD C:\
        .\DeployES.ps1
        .\StartAndWait.ps1

        py -m pip install behave

        Write-Host "waiting for server to be ready ..."
        do {
          Write-Host "..."
          sleep 3  
        } until(Test-NetConnection 127.0.0.1 -Port 9270 | ? { $_.TcpTestSucceeded } )

        CD C:\tests\features
        MD C:\tests\reports
        $Env:Path += ";c:\wc3270"

        $address=(Get-NetIPAddress -AddressFamily Ipv4 | where { $_.IPAddress -Match "172\.*" })
        $Env:TN3270_HOST = $address.IPAddress
        $Env:TN3270_PORT = "9270"
        
        behave.exe --color --junit --junit-directory C:\tests\reports
reports:
  bankdemo-bdd-test-report:
    files: 
      - '**/*'
    base-directory: "C:\\tests\\reports"

In the smoke test, the team may run both unit tests and functional tests. Ideally, these tests are better to run in parallel to speed up the pipeline. In AWS CodePipeline, we can set up a stage to run multiple steps in parallel. In our example, the pipeline runs both BDD tests and Robot Framework (RPA) tests.

The following CloudFormation code snippet runs two different tests. You use the same RunOrder value to indicate the actions run in parallel.

#...
        - Name: Tests
          Actions:
            - Name: RunBDDTest
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName: !Ref BddTestBankDemoStage
                PrimarySource: Config
              InputArtifacts:
                - Name: DemoBin
                - Name: Config
              RunOrder: 1
            - Name: RunRbTest
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName : !Ref RpaTestBankDemoStage
                PrimarySource: Config
              InputArtifacts:
                - Name: DemoBin
                - Name: Config
              RunOrder: 1  
#...

The following screenshot shows the example actions on the CodePipeline console that use the preceding code.

Screenshot of CodePipeine parallel execution tests using a same run order value

Figure – Screenshot of CodePipeine parallel execution tests

Both DBB and RPA tests produce jUnit format reports, which CodeBuild can ingest and show on the CodeBuild console. This is a great way for project management and business users to track the quality trend of an application. The following screenshot shows the CodeBuild report generated from the BDD tests.

CodeBuild report generated from the BDD tests showing 100% pass rate

Figure – CodeBuild report generated from the BDD tests

Automated regression tests

After you test the changes in the project team pipeline, you can automatically promote them to another stream with other team members’ changes for further testing. The scope of this testing stream is significantly more comprehensive, with a greater number and wider range of tests and higher volume of test data. The changes promoted to this stream by each team member are tested in this environment at the end of each day throughout the life of the project. This provides a high-quality delivery to production, with new code and changes to existing code tested together with hundreds or thousands of tests.

In enterprise architecture, it’s commonplace to see an application client consuming web services APIs exposed from a mainframe CICS application. One approach to do regression tests for mainframe applications is to use Micro Focus Verastream Host Integrator (VHI) to record and capture 3270 data stream processing and encapsulate these 3270 data streams as business functions, which in turn are packaged as web services. When these web services are available, they can be consumed by a test automation product, which in our environment is Micro Focus UFT One. This uses the Verastream server as the orchestration engine that translates the web service requests into 3270 data streams that integrate with the mainframe CICS application. The application is deployed in Micro Focus Enterprise Test Server.

The following diagram shows the end-to-end testing components.

Regression Test the end-to-end testing components using ECS Container for Exterprise Test Server, Verastream Host Integrator and UFT One Container, all integration points are using Elastic Network Load Balancer

Figure – Regression Test Infrastructure end-to-end Setup

To ensure we have the coverage required for large mainframe applications, we sometimes need to run thousands of tests against very large production volumes of test data. We want the tests to run faster and complete as soon as possible so we reduce AWS costs—we only pay for the infrastructure when consuming resources for the life of the test environment when provisioning and running tests.

Therefore, the design of the test environment needs to scale out. The batch feature in CodeBuild allows you to run tests in batches and in parallel rather than serially. Furthermore, our solution needs to minimize interference between batches, a failure in one batch doesn’t affect another running in parallel. The following diagram depicts the high-level design, with each batch build running in its own independent infrastructure. Each infrastructure is launched as part of test preparation, and then torn down in the post-test phase.

Regression Tests in CodeBuoild Project setup to use batch mode, three batches running in independent infrastructure with containers

Figure – Regression Tests in CodeBuoild Project setup to use batch mode

Building and deploying regression test components

Following the design of the parallel regression test environment, let’s look at how we build each component and how they are deployed. The followings steps to build our regression tests use a working backward approach, starting from deployment in the Enterprise Test Server:

  1. Create a batch build in CodeBuild.
  2. Deploy to Enterprise Test Server.
  3. Deploy the VHI model.
  4. Deploy UFT One Tests.
  5. Integrate UFT One into CodeBuild and CodePipeline and test the application.

Creating a batch build in CodeBuild

We update two components to enable a batch build. First, in the CodePipeline CloudFormation resource, we set BatchEnabled to be true for the test stage. The UFT One test preparation stage uses the CloudFormation template to create the test infrastructure. The following code is an example of the AWS CloudFormation snippet with batch build enabled:

#...
        - Name: SystemsTest
          Actions:
            - Name: Uft-Tests
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName : !Ref UftTestBankDemoProject
                PrimarySource: Config
                BatchEnabled: true
                CombineArtifacts: true
              InputArtifacts:
                - Name: Config
                - Name: DemoSrc
              OutputArtifacts:
                - Name: TestReport                
              RunOrder: 1
#...

Second, in the buildspec configuration of the test stage, we provide a build matrix setting. We use the custom environment variable TEST_BATCH_NUMBER to indicate which set of tests runs in each batch. See the following code:

version: 0.2
batch:
  fast-fail: true
  build-matrix:
    static:
      ignore-failure: false
    dynamic:
      env:
        variables:
          TEST_BATCH_NUMBER:
            - 1
            - 2
            - 3 
phases:
  pre_build:
commands:
#...

After setting up the batch build, CodeBuild creates multiple batches when the build starts. The following screenshot shows the batches on the CodeBuild console.

Regression tests Codebuild project ran in batch mode, three batches ran in prallel successfully

Figure – Regression tests Codebuild project ran in batch mode

Deploying to Enterprise Test Server

ETS is the transaction engine that processes all the online (and batch) requests that are initiated through external clients, such as 3270 terminals, web services, and websphere MQ. This engine provides support for various mainframe subsystems, such as CICS, IMS TM and JES, as well as code-level support for COBOL and PL/I. The following screenshot shows the Enterprise Test Server administration page.

Enterprise Server Administrator window showing configuration for CICS

Figure – Enterprise Server Administrator window

In this mainframe application testing use case, the regression tests are CICS transactions, initiated from 3270 requests (encapsulated in a web service). For more information about Enterprise Test Server, see the Enterprise Test Server and Micro Focus websites.

In the regression pipeline, after the stage of mainframe artifact compiling, we bake in the artifact into an ETS Docker container and upload the image to an Amazon ECR repository. This way, we have an immutable artifact for all the tests.

During each batch’s test preparation stage, a CloudFormation stack is deployed to create an Amazon ECS service on Windows EC2. The stack uses a Network Load Balancer as an integration point for the VHI’s integration.

The following code is an example of the CloudFormation snippet to create an Amazon ECS service using an Enterprise Test Server Docker image:

#...
  EtsService:
    DependsOn:
    - EtsTaskDefinition
    - EtsContainerSecurityGroup
    - EtsLoadBalancerListener
    Properties:
      Cluster: !Ref 'WindowsEcsClusterArn'
      DesiredCount: 1
      LoadBalancers:
        -
          ContainerName: !Sub "ets-${AWS::StackName}"
          ContainerPort: 9270
          TargetGroupArn: !Ref EtsPort9270TargetGroup
      HealthCheckGracePeriodSeconds: 300          
      TaskDefinition: !Ref 'EtsTaskDefinition'
    Type: "AWS::ECS::Service"

  EtsTaskDefinition:
    Properties:
      ContainerDefinitions:
        -
          Image: !Sub "${AWS::AccountId}.dkr.ecr.us-east-1.amazonaws.com/systems-test/ets:latest"
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'SystemsTestLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: ets
          Name: !Sub "ets-${AWS::StackName}"
          cpu: 4096
          memory: 8192
          PortMappings:
            -
              ContainerPort: 9270
          EntryPoint:
          - "powershell.exe"
          Command: 
          - '-F'
          - .\StartAndWait.ps1
          - 'bankdemo'
          - C:\bankdemo\
          - 'wait'
      Family: systems-test-ets
    Type: "AWS::ECS::TaskDefinition"
#...

Deploying the VHI model

In this architecture, the VHI is a bridge between mainframe and clients.

We use the VHI designer to capture the 3270 data streams and encapsulate the relevant data streams into a business function. We can then deliver this function as a web service that can be consumed by a test management solution, such as Micro Focus UFT One.

The following screenshot shows the setup for getCheckingDetails in VHI. Along with this procedure we can also see other procedures (eg calcCostLoan) defined that get generated as a web service. The properties associated with this procedure are available on this screen to allow for the defining of the mapping of the fields between the associated 3270 screens and exposed web service.

example of VHI designer to capture the 3270 data streams and encapsulate the relevant data streams into a business function getCheckingDetails

Figure – Setup for getCheckingDetails in VHI

The following screenshot shows the editor for this procedure and is initiated by the selection of the Procedure Editor. This screen presents the 3270 screens that are involved in the business function that will be generated as a web service.

VHI designer Procedure Editor shows the procedure

Figure – VHI designer Procedure Editor shows the procedure

After you define the required functional web services in VHI designer, the resultant model is saved and deployed into a VHI Docker image. We use this image and the associated model (from VHI designer) in the pipeline outlined in this post.

For more information about VHI, see the VHI website.

The pipeline contains two steps to deploy a VHI service. First, it installs and sets up the VHI models into a VHI Docker image, and it’s pushed into Amazon ECR. Second, a CloudFormation stack is deployed to create an Amazon ECS Fargate service, which uses the latest built Docker image. In AWS CloudFormation, the VHI ECS task definition defines an environment variable for the ETS Network Load Balancer’s DNS name. Therefore, the VHI can bootstrap and point to an ETS service. In the VHI stack, it uses a Network Load Balancer as an integration point for UFT One test integration.

The following code is an example of a ECS Task Definition CloudFormation snippet that creates a VHI service in Amazon ECS Fargate and integrates it with an ETS server:

#...
  VhiTaskDefinition:
    DependsOn:
    - EtsService
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: systems-test-vhi
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      ExecutionRoleArn: !Ref FargateEcsTaskExecutionRoleArn
      Cpu: 2048
      Memory: 4096
      ContainerDefinitions:
        - Cpu: 2048
          Name: !Sub "vhi-${AWS::StackName}"
          Memory: 4096
          Environment:
            - Name: esHostName 
              Value: !GetAtt EtsInternalLoadBalancer.DNSName
            - Name: esPort
              Value: 9270
          Image: !Ref "${AWS::AccountId}.dkr.ecr.us-east-1.amazonaws.com/systems-test/vhi:latest"
          PortMappings:
            - ContainerPort: 9680
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'SystemsTestLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: vhi

#...

Deploying UFT One Tests

UFT One is a test client that uses each of the web services created by the VHI designer to orchestrate running each of the associated business functions. Parameter data is supplied to each function, and validations are configured against the data returned. Multiple test suites are configured with different business functions with the associated data.

The following screenshot shows the test suite API_Bankdemo3, which is used in this regression test process.

the screenshot shows the test suite API_Bankdemo3 in UFT One test setup console, the API setup for getCheckingDetails

Figure – API_Bankdemo3 in UFT One Test Editor Console

For more information, see the UFT One website.

Integrating UFT One and testing the application

The last step is to integrate UFT One into CodeBuild and CodePipeline to test our mainframe application. First, we set up CodeBuild to use a UFT One container. The Docker image is available in Docker Hub. Then we author our buildspec. The buildspec has the following three phrases:

  • Setting up a UFT One license and deploying the test infrastructure
  • Starting the UFT One test suite to run regression tests
  • Tearing down the test infrastructure after tests are complete

The following code is an example of a buildspec snippet in the pre_build stage. The snippet shows the command to activate the UFT One license:

version: 0.2
batch: 
# . . .
phases:
  pre_build:
    commands:
      - |
        # Activate License
        $process = Start-Process -NoNewWindow -RedirectStandardOutput LicenseInstall.log -Wait -File 'C:\Program Files (x86)\Micro Focus\Unified Functional Testing\bin\HP.UFT.LicenseInstall.exe' -ArgumentList @('concurrent', 10600, 1, ${env:AUTOPASS_LICENSE_SERVER})        
        Get-Content -Path LicenseInstall.log
        if (Select-String -Path LicenseInstall.log -Pattern 'The installation was successful.' -Quiet) {
          Write-Host 'Licensed Successfully'
        } else {
          Write-Host 'License Failed'
          exit 1
        }
#...

The following command in the buildspec deploys the test infrastructure using the AWS Command Line Interface (AWS CLI)

aws cloudformation deploy --stack-name $stack_name `
--template-file cicd-pipeline/systems-test-pipeline/systems-test-service.yaml `
--parameter-overrides EcsCluster=$cluster_arn `
--capabilities CAPABILITY_IAM

Because ETS and VHI are both deployed with a load balancer, the build detects when the load balancers become healthy before starting the tests. The following AWS CLI commands detect the load balancer’s target group health:

$vhi_health_state = (aws elbv2 describe-target-health --target-group-arn $vhi_target_group_arn --query 'TargetHealthDescriptions[0].TargetHealth.State' --output text)
$ets_health_state = (aws elbv2 describe-target-health --target-group-arn $ets_target_group_arn --query 'TargetHealthDescriptions[0].TargetHealth.State' --output text)          

When the targets are healthy, the build moves into the build stage, and it uses the UFT One command line to start the tests. See the following code:

$process = Start-Process -Wait  -NoNewWindow -RedirectStandardOutput UFTBatchRunnerCMD.log `
-FilePath "C:\Program Files (x86)\Micro Focus\Unified Functional Testing\bin\UFTBatchRunnerCMD.exe" `
-ArgumentList @("-source", "${env:CODEBUILD_SRC_DIR_DemoSrc}\bankdemo\tests\API_Bankdemo\API_Bankdemo${env:TEST_BATCH_NUMBER}")

The next release of Micro Focus UFT One (November or December 2020) will provide an exit status to indicate a test’s success or failure.

When the tests are complete, the post_build stage tears down the test infrastructure. The following AWS CLI command tears down the CloudFormation stack:


#...
	post_build:
	  finally:
	  	- |
		  Write-Host "Clean up ETS, VHI Stack"
		  #...
		  aws cloudformation delete-stack --stack-name $stack_name
          aws cloudformation wait stack-delete-complete --stack-name $stack_name

At the end of the build, the buildspec is set up to upload UFT One test reports as an artifact into Amazon Simple Storage Service (Amazon S3). The following screenshot is the example of a test report in HTML format generated by UFT One in CodeBuild and CodePipeline.

UFT One HTML report shows regression testresult and test detals

Figure – UFT One HTML report

A new release of Micro Focus UFT One will provide test report formats supported by CodeBuild test report groups.

Conclusion

In this post, we introduced the solution to use Micro Focus Enterprise Suite, Micro Focus UFT One, Micro Focus VHI, AWS developer tools, and Amazon ECS containers to automate provisioning and running mainframe application tests in AWS at scale.

The on-demand model allows you to create the same test capacity infrastructure in minutes at a fraction of your current on-premises mainframe cost. It also significantly increases your testing and delivery capacity to increase quality and reduce production downtime.

A demo of the solution is available in AWS Partner Micro Focus website AWS Mainframe CI/CD Enterprise Solution. If you’re interested in modernizing your mainframe applications, please visit Micro Focus and contact AWS mainframe business development at [email protected].

References

Micro Focus

 

Peter Woods

Peter Woods

Peter has been with Micro Focus for almost 30 years, in a variety of roles and geographies including Technical Support, Channel Sales, Product Management, Strategic Alliances Management and Pre-Sales, primarily based in Europe but for the last four years in Australia and New Zealand. In his current role as Pre-Sales Manager, Peter is charged with driving and supporting sales activity within the Application Modernization and Connectivity team, based in Melbourne.

Leo Ervin

Leo Ervin

Leo Ervin is a Senior Solutions Architect working with Micro Focus Enterprise Solutions working with the ANZ team. After completing a Mathematics degree Leo started as a PL/1 programming with a local insurance company. The next step in Leo’s career involved consulting work in PL/1 and COBOL before he joined a start-up company as a technical director and partner. This company became the first distributor of Micro Focus software in the ANZ region in 1986. Leo’s involvement with Micro Focus technology has continued from this distributorship through to today with his current focus on cloud strategies for both DevOps and re-platform implementations.

Kevin Yung

Kevin Yung

Kevin is a Senior Modernization Architect in AWS Professional Services Global Mainframe and Midrange Modernization (GM3) team. Kevin currently is focusing on leading and delivering mainframe and midrange applications modernization for large enterprise customers.

Our Journey to Continuous Delivery at Grab (Part 1)

Post Syndicated from Grab Tech original https://engineering.grab.com/our-journey-to-continuous-delivery-at-grab

This blog post is a two-part presentation of the effort that went into improving the continuous delivery processes for backend services at Grab in the past two years. In the first part, we take stock of where we started two years ago and describe the software and tools we created while introducing some of the integrations we’ve done to automate our software delivery in our staging environment.


Continuous Delivery is the principle of delivering software often, every day.

As a backend engineer at Grab, nothing matters more than the ability to innovate quickly and safely. Around the end of 2018, Grab’s transportation and deliveries backend architecture consisted of roughly 270 services (the majority being microservices). The deployment process was lengthy, required careful inputs and clear communication. The care needed to push changes in production and the risk associated with manual operations led to the introduction of a Slack bot to coordinate deployments. The bot ensures that deployments occur only during off-peak and within work hours:

Overview of the Grab Delivery Process
Overview of the Grab Delivery Process

Once the build was completed, engineers who desired to deploy their software to the Staging environment would copy release versions from the build logs, and paste them in a Jenkins job’s parameter. Tests needed to be manually triggered from another dedicated Jenkins job.

Prior to production deployments, engineers would generate their release notes via a script and update them manually in a wiki document. Deployments would be scheduled through interactions with a Slack bot that controls release notes and deployment windows. Production deployments were made once again by pasting the correct parameters into two dedicated Jenkins jobs, one for the canary (a.k.a. one-box) deployment and the other for the full deployment, spread one hour apart. During the monitoring phase, engineers would continuously observe metrics reported on our dashboards.

In spite of the fragmented process and risky manual operations impacting our velocity and stability, around 614 builds were running each business day and changes were deployed on our staging environment at an average rate of 300 new code releases per business day, while production changes averaged a rate of 28 new code releases per business day.

Our Deployment Funnel, Towards the End of 2018
Our Deployment Funnel, Towards the End of 2018

These figures meant that, on average, it took 10 business days between each service update in production, and only 10% of the staging deployments were eventually promoted to production.

Automating Continuous Deployments at Grab

With an increased focus on Engineering efficiency, in 2018 we started an internal initiative to address frictions in deployments that became known as Conveyor. To build Conveyor with a small team of engineers, we had to rely on an already mature platform which exhibited properties that are desirable to us to achieve our mission.

Hands-off deployments

Deployments should be an afterthought. Engineers should be as removed from the process as possible, and whenever possible, decisions should be taken early, during the code review process. The machine will do the heavy lifting, and only when it can’t decide for itself, should the engineer be involved. Notifications can be leveraged to ensure that engineers are only informed when something goes wrong and a human decision is required.

Hands-off Deployment Principle
Hands-off Deployment Principle

Confidence in Deployments

Grab’s focus on gathering internal Engineering NPS feedback helped us collect valuable metrics. One of the metrics we cared about was our engineers’ confidence in their production deployments. A team’s entire deployment process to production could last for more than a day and may extend up to a week for teams with large infrastructures running critical services. The possibility of losing progress in deployments when individual steps may last for hours is detrimental to the improvement of Engineering efficiency in the organisation. The deployment automation platform is the bedrock of that confidence. If the platform itself fails regularly or does provide a path of upgrade that is transparent to end-users, any features built on top of it would suffer from these downtimes and ultimately erode confidence in deployments.

Tailored To Most But Extensible For The Few

Our backend engineering teams are working on diverse stacks, and so are their deployment processes. Right from the start, we wanted our product to benefit the largest population of engineers that had adopted the same process, so as to maximize returns on our investments. To ease adoption, we decided to tailor a deployment pipeline such that:

  1. It would model the exact sequence of manual processes followed by this population of engineers.
  2. Switching to use that pipeline should require as little work as possible by service teams.

However, in cases where this model would not fit a team’s specific process, our deployment platform should be open and extensible and support new customizations even when they are not originally supported by the product’s ecosystem.

Cloud-Agnosticity

While we were going to target a specific process and team, to ensure that our solution would stand the test of time, we needed to ensure that our solution would support the variety of environments currently used in production. This variety was also likely to increase, and we wanted a platform that would mature together with the rest of our ecosystem.

Overview Of Conveyor

Setting Sail With Spinnaker

Conveyor is based on Spinnaker, an open-source, multi-cloud continuous delivery platform. We’ve chosen Spinnaker over other platforms because it is a mature deployment platform with no single point of failure, supports complex workflows (referred to as pipelines in Spinnaker), and already supports a large array of cloud providers. Since Spinnaker is open-source and extensible, it allowed us to add the features we needed for the specificity of our ecosystem.

To further ease adoption within our organization, we built a tailored  user interface and created our own domain-specific language (DSL) to manage its pipelines as code.

Outline of Conveyor's Architecture
Outline of Conveyor’s Architecture

Onboarding To A Simpler Interface

Spinnaker comes with its own interface, it has all the features an engineer would want from an advanced continuous delivery system. However, Spinnaker interface is vastly different from Jenkins and makes for a steep learning curve.

To reduce our barrier to adoption, we decided early on to create a simple interface for our users. In this interface, deployment pipelines take the center stage of our application. Pipelines are objects managed by Spinnaker, they model the different steps in the workflow of each deployment. Each pipeline is made up of stages that can be assembled like lego-bricks to form the final pipeline. An instance of a pipeline is called an execution.

Conveyor dashboard. Sensitive information like authors and service names are redacted.
Conveyor Dashboard

With this interface, each engineer can focus on what matters to them immediately: the pipeline they have started, or those started by other teammates working on the same services as they are. Conveyor also provides a search bar (on the top) and filters (on the left) that work in concert to explore all pipelines executed at Grab.

We adopted a consistent set of colours to model all information in our interface:

  • blue: represent stages that are currently running;
  • red: stages that have failed or important information;
  • yellow: stages that require human interaction;
  • and finally, in green: stages that were successfully completed.

Conveyor also provides a task and notifications area, where all stages requiring human intervention are listed in one location. Manual interactions are often no more than just YES or NO questions:

Conveyor tasks. Sensitive information like author/service names is redacted.
Conveyor Tasks

Finally, in addition to supporting automated deployments, we greatly simplified the start of manual deployments. Instead of being required to copy/paste information, each parameter can be selected on the interface from a set of predefined items, sorted chronologically, and presented with contextual information to help engineers in their decision.

Several parameters are required for our deployments and their values are selected from the UI to ensure correctness.

Simplified manual deployments
Simplified Manual Deployments

Ease Of Adoption With Our Pipeline-As-Code DSL

Ease of adoption for the team is not simply about the learning curve of the new tools. We needed to make it easy for teams to configure their services to deploy with Conveyor. Since we focused on automating tasks that were already performed manually, we needed only to configure the layer that would enable the integration.

We set on creating a pipeline-as-code implementation when none were widely being developed in the Spinnaker community. It’s interesting to see that two years on, this idea has grown in parallel in the community, with the birth of other pipeline-as-code implementations. Our pipeline-as-code is referred to as the Pipeline DSL, and its configuration is located inside each team’s repository. Artificer is the name of our Pipeline DSL interpreter and it runs with every change inside our monorepository:

Artificer: Our Pipeline DSL
Artificer: Our Pipeline DSL

Pipelines are being updated at every commit if necessary.

Creating a conveyor.jsonnet file inside with the service’s directory of our monorepository with the few lines below is all that’s required for Artificer to do its work and get the benefits of automation provided by Conveyor’s pipeline:

local default = import 'default.libsonnet';
[
 {
 name: "service-name",
 group: [
 "group-name",
 ]
 }
]

Sample minimal conveyor.jsonnet configuration to onboard services.

In this file, engineers simply specify the name of their service and the group that a user should belong to, to have deployment rights for the service.

Once the build is completed, teams can log in to Conveyor and start manual deployments of their services with our pipelines. Three pipelines are provided by default: the integration pipeline used for tests and developments, the staging pipeline used for pre-production tests, and the production pipeline for production deployment.

Thanks to the simplicity of this minimal configuration file, we were able to generate these configuration files for all existing services of our monorepository. This resulted in the automatic onboarding of a large number of teams and was a major contributing factor to the adoption of Conveyor throughout our organisation.

Our Journey To Engineering Efficiency (for backend services)

The sections below relate some of the improvements in engineering efficiency we’ve delivered since Conveyor’s inception. They were not made precisely in this order but for readability, they have been mapped to each step of the software development lifecycle.

Automate Deployments at Build Time

Continuous Integration Job
Continuous Integration Job

Continuous delivery begins with a pushed code commit in our trunk-based development flow. Whenever a developer pushes changes onto their development branch or onto the trunk, a continuous integration job is triggered on Jenkins. The products of this job (binaries, docker images, etc) are all uploaded into our artefact repositories. We’ve made two additions to our continuous integration process.

The first modification happens at the step “Upload & Register artefacts”. At this step, each artefact created is now registered in Conveyor with its associated metadata. When and if an engineer needs to trigger a deployment manually, Conveyor can display the list of versions to choose from, eliminating the need for error-prone manual inputs:

 Staging
Staging

Each selectable version shows contextual information: title, author, version and link to the code change where it originated. During registration, the commit time is also recorded and used to order entries chronologically in the interface. To ensure this integration is not a single point of failure for deployments, manual input is still available optionally.

The second modification implements one of the essential feature continuous delivery: your deployments should happen often, automatically. Engineers are now given the possibility to start automatic deployments once continuous integration has successfully completed, by simply modifying their project’s continuous integration settings:

 "AfterBuild": [
  {
      "AutoDeploy": {
      "OnDiff": false,
      "OnLand": true
    }
    "TYPE": "conveyor"
  }
 ],

Sample settings needed to trigger auto-deployments. ‘Diff’ refers to code review submissions, and ‘Land’ refers to merged code changes.

Staging Pipeline

Before deploying a new artefact to a service in production, changes are validated on the staging environment. During the staging deployment, we verify that canary (one-box) deployments and full deployments with automated smoke and functional tests suites.

Staging Pipeline
Staging Pipeline

We start by acquiring a deployment lock for this service and this environment. This prevents another deployment of the same service on the same environment to happen concurrently, other deployments will be waiting in a FIFO queue until the lock is released.

The stage “Compute Changeset” ensures that the deployment is not a rollback. It verifies that the new version deployed does not correspond to a rollback by comparing the ancestry of the commits provided during the artefact registration at build time: since we automate deployments after the build process has completed, cases of rollback may occur when two changes are created in quick succession and the latest build completes earlier than the older one.

After the stage “Deploy Canary” has completed, smoke test run. There are three kinds of tests executed at different stages of the pipeline: smoke, functional and security tests. Smoke tests directly reach the canary instance’s endpoint, by-passing load-balancers. If the smoke tests fail,  the canary is immediately rolled back and this deployment is terminated.

All tests are generated from the same builds as the artefact being tested and their versions must match during testing. To ensure that the right version of the test run and distinguish between the different kind of tests to perform, we provide additional metadata that will be passed by Conveyor to the tests system, known internally as Gandalf:

local default = import 'default.libsonnet';
[
  {
    name: "service-name",
    group: [
    "group-name",
    ],
    gandalf_smoke_tests: [
    {
        path: "repo.internal/path/to/my/smoke/tests"
      }
      ],
      gandalf_functional_tests: [
      {
        path: "repo.internal/path/to/my/functional/tests"
      }
      gandalf_security_tests: [
      {
        path: "repo.internal/path/to/my/security/tests"
      }
      ]
    }
]

Sample conveyor.jsonnet configuration with integration tests added.

Additionally, in parallel to the execution of the smoke tests, the canary is also being monitored from the moment its deployment has completed and for a predetermined duration. We leverage our integration with Datadog to allow engineers to select the alerts to monitor. If an alert is triggered during the monitoring period, and while the tests are executed, the canary is again rolled back, and the pipeline is terminated. Engineers can specify the alerts by adding them to the conveyor.jsonnet configuration file together with the monitoring duration:

local default = import 'default.libsonnet';
[
 {
   name: "service-name",
   group: [
   "group-name",
   ],
    gandalf_smoke_tests: [
    {
      path: "repo.internal/path/to/my/smoke/tests"
   }
   ],
   gandalf_functional_tests: [
   {
   path: "repo.internal/path/to/my/functional/tests"
  }
     gandalf_security_tests: [
     {
     path: "repo.internal/path/to/my/security/tests"
     }
     ],
     monitor: {
     stg: {
     duration_seconds: 300,
     alarms: [
     {
   type: "datadog",
   alert_id: 12345678
   },
   {
   type: "datadog",
   alert_id: 23456789
      }
      ]
      }
    }
  }
]

Sample conveyor.jsonnet configuration with alerts in staging added.

When the smoke tests and monitor pass and the deployment of new artefacts is completed, the pipeline execution triggers functional and security tests. Unlike smoke tests, functional & security tests run only after that step, as they communicate with the cluster through load-balancers, impersonating other services.

Before releasing the lock, release notes are generated to inform engineers of the delta of changes between the version they just released and the one currently running in production. Once the lock is released, the stage “Check Policies” verifies that the parameters and variable of the deployment obeys a specific set of criteria, for example: if its service metadata is up-to-date in our service inventory, or if the base image used during deployment is sufficiently recent.

Here’s how the policy stage, the engine, and the providers interact with each other:

Check Policy Stage
Check Policy Stage

In Spinnaker, each event of a pipeline’s execution updates the pipeline’s state in the database. The current state of the pipeline can be fetched by its API as a single JSON document, describing all information related to its execution: including its parameters, the contextual information related to each stage or even the response from the various interfacing components. The role of our “Policy Check” stage is to query this JSON representation of the pipeline, to extract and transform the variables which are forwarded to our policy engine for validation. Our policy engine gathers judgements passed by different policy providers. If the validation by the policy engine fails, the deployment is not rolled back this time; however, promotion to production is not possible and the pipeline is immediately terminated.

The journey through staging deployment finally ends with the stage “Register Deployment”. This stage registers that a successful deployment was made in our staging environment as an artefact. Similarly to the policy check above, certain parameters of the deployment are picked up and consolidated into this document. We use this kind of artefact as proof for upcoming production deployment.

Continuing Our Journey to Engineering Efficiency

With the advancements made in continuous integration and deployment to staging, Conveyor has reduced the efforts needed by our engineers to just three clicks in its interface, when automated deployment is used. Even when the deployment is triggered manually, Conveyor gives the assurance that the parameters selected are valid and it does away with copy/pasting and human interactions across heterogeneous tools.

In the sequel to this blog post, we’ll dive into the improvements that we’ve made to our production deployments and introduce a crucial concept that led to the creation of our proof for successful staging deployment. Finally, we’ll cover the impact that Conveyor had on the continuous delivery of our backend services, by comparing our deployment velocity when we started two years ago versus where we are today.


All these improvements in efficiency for our engineers would never have been possible without the hard work of all team members involved in the project, past and present: Evan Sebastian, Tanun Chalermsinsuwan, Aufar Gilbran, Deepak Ramakrishnaiah, Repon Kumar Roy (Kowshik), Su Han, Voislav Dimitrijevikj, Qijia Wang, Oscar Ng, Jacob Sunny, Subhodip Mandal, and many others who have contributed and collaborated with them.


Join us

Grab is more than just the leading ride-hailing and mobile payments platform in Southeast Asia. We use data and technology to improve everything from transportation to payments and financial services across a region of more than 620 million people. We aspire to unlock the true potential of Southeast Asia and look for like-minded individuals to join us on this ride.

If you share our vision of driving South East Asia forward, apply to join our team today.

Cross-account and cross-region deployment using GitHub actions and AWS CDK

Post Syndicated from DAMODAR SHENVI WAGLE original https://aws.amazon.com/blogs/devops/cross-account-and-cross-region-deployment-using-github-actions-and-aws-cdk/

GitHub Actions is a feature on GitHub’s popular development platform that helps you automate your software development workflows in the same place you store code and collaborate on pull requests and issues. You can write individual tasks called actions, and combine them to create a custom workflow. Workflows are custom automated processes that you can set up in your repository to build, test, package, release, or deploy any code project on GitHub.

A cross-account deployment strategy is a CI/CD pattern or model in AWS. In this pattern, you have a designated AWS account called tools, where all CI/CD pipelines reside. Deployment is carried out by these pipelines across other AWS accounts, which may correspond to dev, staging, or prod. For more information about a cross-account strategy in reference to CI/CD pipelines on AWS, see Building a Secure Cross-Account Continuous Delivery Pipeline.

In this post, we show you how to use GitHub Actions to deploy an AWS Lambda-based API to an AWS account and Region using the cross-account deployment strategy.

Using GitHub Actions may have associated costs in addition to the cost associated with the AWS resources you create. For more information, see About billing for GitHub Actions.

Prerequisites

Before proceeding any further, you need to identify and designate two AWS accounts required for the solution to work:

  • Tools – Where you create an AWS Identity and Access Management (IAM) user for GitHub Actions to use to carry out deployment.
  • Target – Where deployment occurs. You can call this as your dev/stage/prod environment.

You also need to create two AWS account profiles in ~/.aws/credentials for the tools and target accounts, if you don’t already have them. These profiles need to have sufficient permissions to run an AWS Cloud Development Kit (AWS CDK) stack. They should be your private profiles and only be used during the course of this use case. So, it should be fine if you want to use admin privileges. Don’t share the profile details, especially if it has admin privileges. I recommend removing the profile when you’re finished with this walkthrough. For more information about creating an AWS account profile, see Configuring the AWS CLI.

Solution overview

You start by building the necessary resources in the tools account (an IAM user with permissions to assume a specific IAM role from the target account to carry out deployment). For simplicity, we refer to this IAM role as the cross-account role, as specified in the architecture diagram.

You also create the cross-account role in the target account that trusts the IAM user in the tools account and provides the required permissions for AWS CDK to bootstrap and initiate creating an AWS CloudFormation deployment stack in the target account. GitHub Actions uses the tools account IAM user credentials to the assume the cross-account role to carry out deployment.

In addition, you create an AWS CloudFormation execution role in the target account, which AWS CloudFormation service assumes in the target account. This role has permissions to create your API resources, such as a Lambda function and Amazon API Gateway, in the target account. This role is passed to AWS CloudFormation service via AWS CDK.

You then configure your tools account IAM user credentials in your Git secrets and define the GitHub Actions workflow, which triggers upon pushing code to a specific branch of the repo. The workflow then assumes the cross-account role and initiates deployment.

The following diagram illustrates the solution architecture and shows AWS resources across the tools and target accounts.

Architecture diagram

Creating an IAM user

You start by creating an IAM user called git-action-deployment-user in the tools account. The user needs to have only programmatic access.

  1. Clone the GitHub repo aws-cross-account-cicd-git-actions-prereq and navigate to folder tools-account. Here you find the JSON parameter file src/cdk-stack-param.json, which contains the parameter CROSS_ACCOUNT_ROLE_ARN, which represents the ARN for the cross-account role we create in the next step in the target account. In the ARN, replace <target-account-id> with the actual account ID for your designated AWS target account.                                             Replace <target-account-id> with designated AWS account id
  2. Run deploy.sh by passing the name of the tools AWS account profile you created earlier. The script compiles the code, builds a package, and uses the AWS CDK CLI to bootstrap and deploy the stack. See the following code:
cd aws-cross-account-cicd-git-actions-prereq/tools-account/
./deploy.sh "<AWS-TOOLS-ACCOUNT-PROFILE-NAME>"

You should now see two stacks in the tools account: CDKToolkit and cf-GitActionDeploymentUserStack. AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an Amazon Simple Storage Service (Amazon S3) bucket needed to hold deployment assets such as a CloudFormation template and Lambda code package. cf-GitActionDeploymentUserStack creates the IAM user with permission to assume git-action-cross-account-role (which you create in the next step). On the Outputs tab of the stack, you can find the user access key and the AWS Secrets Manager ARN that holds the user secret. To retrieve the secret, you need to go to Secrets Manager. Record the secret to use later.

Stack that creates IAM user with its secret stored in secrets manager

Creating a cross-account IAM role

In this step, you create two IAM roles in the target account: git-action-cross-account-role and git-action-cf-execution-role.

git-action-cross-account-role provides required deployment-specific permissions to the IAM user you created in the last step. The IAM user in the tools account can assume this role and perform the following tasks:

  • Upload deployment assets such as the CloudFormation template and Lambda code package to a designated S3 bucket via AWS CDK
  • Create a CloudFormation stack that deploys API Gateway and Lambda using AWS CDK

AWS CDK passes git-action-cf-execution-role to AWS CloudFormation to create, update, and delete the CloudFormation stack. It has permissions to create API Gateway and Lambda resources in the target account.

To deploy these two roles using AWS CDK, complete the following steps:

  1. In the already cloned repo from the previous step, navigate to the folder target-account. This folder contains the JSON parameter file cdk-stack-param.json, which contains the parameter TOOLS_ACCOUNT_USER_ARN, which represents the ARN for the IAM user you previously created in the tools account. In the ARN, replace <tools-account-id> with the actual account ID for your designated AWS tools account.                                             Replace <tools-account-id> with designated AWS account id
  2. Run deploy.sh by passing the name of the target AWS account profile you created earlier. The script compiles the code, builds the package, and uses the AWS CDK CLI to bootstrap and deploy the stack. See the following code:
cd ../target-account/
./deploy.sh "<AWS-TARGET-ACCOUNT-PROFILE-NAME>"

You should now see two stacks in your target account: CDKToolkit and cf-CrossAccountRolesStack. AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an S3 bucket to hold deployment assets such as the CloudFormation template and Lambda code package. The cf-CrossAccountRolesStack creates the two IAM roles we discussed at the beginning of this step. The IAM role git-action-cross-account-role now has the IAM user added to its trust policy. On the Outputs tab of the stack, you can find these roles’ ARNs. Record these ARNs as you conclude this step.

Stack that creates IAM roles to carry out cross account deployment

Configuring secrets

One of the GitHub actions we use is aws-actions/[email protected]. This action configures AWS credentials and Region environment variables for use in the GitHub Actions workflow. The AWS CDK CLI detects the environment variables to determine the credentials and Region to use for deployment.

For our cross-account deployment use case, aws-actions/[email protected] takes three pieces of sensitive information besides the Region: AWS_ACCESS_KEY_ID, AWS_ACCESS_KEY_SECRET, and CROSS_ACCOUNT_ROLE_TO_ASSUME. Secrets are recommended for storing sensitive pieces of information in the GitHub repo. It keeps the information in an encrypted format. For more information about referencing secrets in the workflow, see Creating and storing encrypted secrets.

Before we continue, you need your own empty GitHub repo to complete this step. Use an existing repo if you have one, or create a new repo. You configure secrets in this repo. In the next section, you check in the code provided by the post to deploy a Lambda-based API CDK stack into this repo.

  1. On the GitHub console, navigate to your repo settings and choose the Secrets tab.
  2. Add a new secret with name as TOOLS_ACCOUNT_ACCESS_KEY_ID.
  3. Copy the access key ID from the output OutGitActionDeploymentUserAccessKey of the stack GitActionDeploymentUserStack in tools account.
  4. Enter the ID in the Value field.                                                                                                                                                                Create secret
  5. Repeat this step to add two more secrets:
    • TOOLS_ACCOUNT_SECRET_ACCESS_KEY (value retrieved from the AWS Secrets Manager in tools account)
    • CROSS_ACCOUNT_ROLE (value copied from the output OutCrossAccountRoleArn of the stack cf-CrossAccountRolesStack in target account)

You should now have three secrets as shown below.

All required git secrets

Deploying with GitHub Actions

As the final step, first clone your empty repo where you set up your secrets. Download and copy the code from the GitHub repo into your empty repo. The folder structure of your repo should mimic the folder structure of source repo. See the following screenshot.

Folder structure of the Lambda API code

We can take a detailed look at the code base. First and foremost, we use Typescript to deploy our Lambda API, so we need an AWS CDK app and AWS CDK stack. The app is defined in app.ts under the repo root folder location. The stack definition is located under the stack-specific folder src/git-action-demo-api-stack. The Lambda code is located under the Lambda-specific folder src/git-action-demo-api-stack/lambda/ git-action-demo-lambda.

We also have a deployment script deploy.sh, which compiles the app and Lambda code, packages the Lambda code into a .zip file, bootstraps the app by copying the assets to an S3 bucket, and deploys the stack. To deploy the stack, AWS CDK has to pass CFN_EXECUTION_ROLE to AWS CloudFormation; this role is configured in src/params/cdk-stack-param.json. Replace <target-account-id> with your own designated AWS target account ID.

Update cdk-stack-param.json in git-actions-cross-account-cicd repo with TARGET account id

Finally, we define the Git Actions workflow under the .github/workflows/ folder per the specifications defined by GitHub Actions. GitHub Actions automatically identifies the workflow in this location and triggers it if conditions match. Our workflow .yml file is named in the format cicd-workflow-<region>.yml, where <region> in the file name identifies the deployment Region in the target account. In our use case, we use us-east-1 and us-west-2, which is also defined as an environment variable in the workflow.

The GitHub Actions workflow has a standard hierarchy. The workflow is a collection of jobs, which are collections of one or more steps. Each job runs on a virtual machine called a runner, which can either be GitHub-hosted or self-hosted. We use the GitHub-hosted runner ubuntu-latest because it works well for our use case. For more information about GitHub-hosted runners, see Virtual environments for GitHub-hosted runners. For more information about the software preinstalled on GitHub-hosted runners, see Software installed on GitHub-hosted runners.

The workflow also has a trigger condition specified at the top. You can schedule the trigger based on the cron settings or trigger it upon code pushed to a specific branch in the repo. See the following code:

name: Lambda API CICD Workflow
# This workflow is triggered on pushes to the repository branch master.
on:
  push:
    branches:
      - master

# Initializes environment variables for the workflow
env:
  REGION: us-east-1 # Deployment Region

jobs:
  deploy:
    name: Build And Deploy
    # This job runs on Linux
    runs-on: ubuntu-latest
    steps:
      # Checkout code from git repo branch configured above, under folder $GITHUB_WORKSPACE.
      - name: Checkout
        uses: actions/[email protected]
      # Sets up AWS profile.
      - name: Configure AWS credentials
        uses: aws-actions/[email protected]
        with:
          aws-access-key-id: ${{ secrets.TOOLS_ACCOUNT_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.TOOLS_ACCOUNT_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.REGION }}
          role-to-assume: ${{ secrets.CROSS_ACCOUNT_ROLE }}
          role-duration-seconds: 1200
          role-session-name: GitActionDeploymentSession
      # Installs CDK and other prerequisites
      - name: Prerequisite Installation
        run: |
          sudo npm install -g [email protected]
          cdk --version
          aws s3 ls
      # Build and Deploy CDK application
      - name: Build & Deploy
        run: |
          cd $GITHUB_WORKSPACE
          ls -a
          chmod 700 deploy.sh
          ./deploy.sh

For more information about triggering workflows, see Triggering a workflow with events.

We have configured a single job workflow for our use case that runs on ubuntu-latest and is triggered upon a code push to the master branch. When you create an empty repo, master branch becomes the default branch. The workflow has four steps:

  1. Check out the code from the repo, for which we use a standard Git action actions/[email protected]. The code is checked out into a folder defined by the variable $GITHUB_WORKSPACE, so it becomes the root location of our code.
  2. Configure AWS credentials using aws-actions/[email protected]. This action is configured as explained in the previous section.
  3. Install your prerequisites. In our use case, the only prerequisite we need is AWS CDK. Upon installing AWS CDK, we can do a quick test using the AWS Command Line Interface (AWS CLI) command aws s3 ls. If cross-account access was successfully established in the previous step of the workflow, this command should return a list of buckets in the target account.
  4. Navigate to root location of the code $GITHUB_WORKSPACE and run the deploy.sh script.

You can check in the code into the master branch of your repo. This should trigger the workflow, which you can monitor on the Actions tab of your repo. The commit message you provide is displayed for the respective run of the workflow.

Workflow for region us-east-1 Workflow for region us-west-2

You can choose the workflow link and monitor the log for each individual step of the workflow.

Git action workflow steps

In the target account, you should now see the CloudFormation stack cf-GitActionDemoApiStack in us-east-1 and us-west-2.

Lambda API stack in us-east-1 Lambda API stack in us-west-2

The API resource URL DocUploadRestApiResourceUrl is located on the Outputs tab of the stack. You can invoke your API by choosing this URL on the browser.

API Invocation Output

Clean up

To remove all the resources from the target and tools accounts, complete the following steps in their given order:

  1. Delete the CloudFormation stack cf-GitActionDemoApiStack from the target account. This step removes the Lambda and API Gateway resources and their associated IAM roles.
  2. Delete the CloudFormation stack cf-CrossAccountRolesStack from the target account. This removes the cross-account role and CloudFormation execution role you created.
  3. Go to the CDKToolkit stack in the target account and note the BucketName on the Output tab. Empty that bucket and then delete the stack.
  4. Delete the CloudFormation stack cf-GitActionDeploymentUserStack from tools account. This removes cross-account-deploy-user IAM user.
  5. Go to the CDKToolkit stack in the tools account and note the BucketName on the Output tab. Empty that bucket and then delete the stack.

Security considerations

Cross-account IAM roles are very powerful and need to be handled carefully. For this post, we strictly limited the cross-account IAM role to specific Amazon S3 and CloudFormation permissions. This makes sure that the cross-account role can only do those things. The actual creation of Lambda, API Gateway, and Amazon DynamoDB resources happens via the AWS CloudFormation IAM role, which AWS  CloudFormation assumes in the target AWS account.

Make sure that you use secrets to store your sensitive workflow configurations, as specified in the section Configuring secrets.

Conclusion

In this post we showed how you can leverage GitHub’s popular software development platform to securely deploy to AWS accounts and Regions using GitHub actions and AWS CDK.

Build your own GitHub Actions CI/CD workflow as shown in this post.

About the author

 

Damodar Shenvi Wagle is a Cloud Application Architect at AWS Professional Services. His areas of expertise include architecting serverless solutions, ci/cd and automation.

Multi-branch CodePipeline strategy with event-driven architecture

Post Syndicated from Henrique Bueno original https://aws.amazon.com/blogs/devops/multi-branch-codepipeline-strategy-with-event-driven-architecture/

Henrique Bueno, DevOps Consultant, Professional Services

This blog post presents a solution for automated pipelines creation in AWS CodePipeline when a new branch is created in an AWS CodeCommit repository. A use case for this solution is when a GitFlow approach using CodePipeline is required. The strategy presented here is used by AWS customers to enable the use of GitFlow using only AWS tools.

CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline automates the build, test, and deploy phases of your release process every time there is a code change, based on the release model you define.

CodeCommit is a fully managed source control service that hosts secure Git-based repositories. It makes it easy for teams to collaborate on code in a secure and highly scalable ecosystem.

GitFlow is a branching model designed around the project release. This provides a robust framework for managing larger projects. Gitflow is ideally suited for projects that have a scheduled release cycle.

Applicability

When using CodePipeline to orchestrate pipelines and CodeCommit as a code source, in addition to setting a repository, you must also set which branch will trigger the pipeline. This configuration works perfectly for the trunk-based strategy, in which you have only one main branch and all the developers interact with this single branch. However, when you need to work with a multi-branching strategy like GitFlow, the requirement to set a pipeline for each branch brings additional challenges.

It’s important to note that trunk-based is, by far, the best strategy for taking full advantage of a DevOps approach; this is the branching strategy that AWS recommends to its customers. On the other hand, many customers like to work with multiple branches and believe it justifies the effort and complexity in dealing with branching merges. This solution is for these customers.

Solution Overview

One of the great benefits of working with Infrastructure as Code is the ability to create multiple identical environments through a single template. This example uses AWS CloudFormation templates to provision pipelines and other necessary resources, as shown in the following diagram.

Multi-branch CodePipeline strategy with event-driven architecture

Multi-branch CodePipeline strategy with event-driven architecture

The template is hosted in an Amazon S3 bucket. An AWS Lambda function deploys a new AWS CloudFormation stack based on this template. This Lambda function is trigged for an Amazon CloudWatch Events rule that looks for events at the CodePipeline repository.

The CreatePipeline Events rule

The AWS CloudFormation snippet that creates the Events rule follows. The Events rule monitors create and delete branches events in all repositories, triggering the CreatePipeline Lambda function.

#----------------------------------------------------------------------#
# EventRule to trigger LambdaPipeline lambda
#----------------------------------------------------------------------#
  CreatePipelineRule:
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      EventPattern:
        source:
          - aws.codecommit
        detail-type:
          - 'CodeCommit Repository State Change'
        detail:
          event:
              - referenceDeleted
              - referenceCreated
          referenceType:
            - branch
      State: ENABLED
      Targets: 
      - Arn: !GetAtt CreatePipeline.Arn
        Id: CreatePipeline

 

The CreatePipeline Lambda function

The Lambda function receives the event details, parses the variables, and executes the appropriate actions. If the event is referenceCreated, then the stack is created; otherwise the stack is deleted. The stack name created or deleted is the junction of the repository name plus the new branch name. This is a very simple function.

#----------------------------------------------------------------------#
# Lambda for Stack Creation
#----------------------------------------------------------------------#
import boto3
def lambda_handler(event, context):
    Region = event['region']
    Account = event['account']
    RepositoryName = event['detail']['repositoryName']
    NewBranch = event['detail']['referenceName']
    Event = event['detail']['event']
    if NewBranch == "master":
        quit()
    if Event == "referenceCreated":
        cf_client = boto3.client('cloudformation')
        cf_client.create_stack(
            StackName=f'Pipeline-{RepositoryName}-{NewBranch}',
            TemplateURL=f'https://s3.amazonaws.com/{Account}-templates/TemplatePipeline.yaml',
            Parameters=[
                {
                    'ParameterKey': 'RepositoryName',
                    'ParameterValue': RepositoryName,
                    'UsePreviousValue': False
                },
                {
                    'ParameterKey': 'BranchName',
                    'ParameterValue': NewBranch,
                    'UsePreviousValue': False
                }
            ],
            OnFailure='ROLLBACK',
            Capabilities=['CAPABILITY_NAMED_IAM']
        )
    else:
        cf_client = boto3.client('cloudformation')
        cf_client.delete_stack(
            StackName=f'Pipeline-{RepositoryName}-{NewBranch}'
        )

The logic for creating only the CI or the CI+CD is on the AWS CloudFormation template. The Conditions section of AWS CloudFormation analyzes the new branch name.

Conditions: 
    BranchMaster: !Equals [ !Ref BranchName, "master" ]
    BranchDevelop: !Equals [ !Ref BranchName, "develop"]
    Setup: !Equals [ !Ref Setup, true ]
  • If the new branch is named master, then a stack will be created containing CI+CD pipelines, with deploy stages in the homologation and production environments.
  • If the new branch is named develop, then a stack will be created containing CI+CD pipelines, with a deploy stage in the Dev environment.
  • If the new branch has any other name, then the stack will be created with only a CI pipeline.

Since the purpose of this blog post is to present only a sample of automated pipelines creation, the pipelines used here are for examples only: they don’t deploy to any environment.

 

Applicability

This event-driven strategy permits pipelines to be created or deleted along with the branches. Since the entire environment is created using Infrastructure as Code and the template is the same for all pipelines, there is no possibility of different configuration issues between environments and pipeline stages.

A GitFlow simulation could resemble that shown in the following diagram:

GitFlow Diagram

GitFlow Diagram

  1. First, a CodeCommit repository is created, along with the master branch and its respective pipeline (CI+CD).
  2. The developer creates a branch called develop based on the master branch. The pipeline (CI+CD at Dev) is automatically created.
  3. The developer creates a feature-branch called feature-a based on the develop branch. The CI pipeline for this branch is automatically created.
  4. The developer creates a Pull Request from the feature-a branch to the develop branch. As soon as the Pull Request is accepted and merged and the feature-a branch is deleted, its pipeline is automatically deleted.

The same process can be followed for the release branch and hotfix branch. Once the branch is created, a new pipeline is created for it which follows its branch lifecycle.

Pipelines Stacks

 

Implementation

Before you start, make sure that the AWS CLI is installed and configured on your machine by following these steps:

  1. Clone the repository.
  2. Create the prerequisites stack.
  3. Copy the AWS CloudFormation template to Amazon S3.
  4. Copy the seed.zip file to the Amazon S3 bucket.
  5. Create the first repository and its pipeline.
  6. Create the develop branch.
  7. Create the first feature branch.
  8. Create the first Pull Request.
  9. Execute the Pull Request approval.
  10. Cleanup.

 

1. Clone the repository

Clone the repository with the sample code.

The main files are:

  • Setup.yaml: an AWS CloudFormation template for creating pipeline prerequisites.
  • TemplatePipeline.yaml: an AWS CloudFormation template for pipeline creation.
  • seed/buildspec/CIAction.yaml: a configuration file for an AWS CodeBuild project at the CI stage.
  • seed/buildspec/CDAction.yaml: a configuration file for a CodeBuild project at the CD stage.
# Command to clone the repository
git clone https://github.com/aws-samples/aws-codepipeline-multi-branch-strategy.git
cd aws-codepipeline-multi-branch-strategy

 

2. Create the prerequisite stack

The Setup stack creates the resources that are prerequisites for pipeline creation, as shown in the following chart.

These resources are created only once and they fit all the pipelines created in this example.

Resources

# Command to create Setup stack
aws cloudformation deploy --stack-name Setup-Pipeline \
--template-file Setup.yaml --region us-east-1 --capabilities CAPABILITY_NAMED_IAM

 

3. Copy the AWS CloudFormation template to Amazon S3

For the Lambda function to deploy a new pipeline stack, it needs to get the AWS CloudFormation template from somewhere. To enable it to do so, you need to save the template inside the Amazon S3 bucket that you just created at the Setup stack.

# Command that copy Template to S3 Bucket
aws s3 cp TemplatePipeline.yaml s3://"$(aws sts get-caller-identity --query Account --output text)"-templates/ --acl private

 

4. Copy the seed.zip file to the Amazon S3 bucket

CodeCommit permits you to populate a repository at the moment of its creation as a first commit. The content of this first commit can be saved in a .zip file in an Amazon S3 bucket. Use this CodeCommit option to populate your repository with BuildSpec files for CodeBuild.

# Command to create zip file with the Buildspec folder content.
zip -r seed.zip buildspec

# Command that copy seed.zip file to S3 Bucket.
aws s3 cp seed.zip s3://"$(aws sts get-caller-identity --query Account --output text)"-templates/ --acl private

 

5. Create the first repository and its pipeline

Now that the Setup stack is created and the seed file is stored in an Amazon S3 bucket, create the first CodeCommit repository. Every time that you want to create a new repository, execute the command below to create a new stack.

# Command to create the stack with the CodeCommit repository,
# CodeBuild Projects and the Pipeline for the master branch.
# Note: Change "myapp" by the name you want.

RepoName="myapp"
aws cloudformation deploy --stack-name Repo-$RepoName --template-file TemplatePipeline.yaml \
--parameter-overrides RepositoryName=$RepoName Setup=true \
--region us-east-1 --capabilities CAPABILITY_NAMED_IAM

When the stack is created, in addition to the CodeCommit repository, the CodeBuild projects and the master branch pipeline are also created. By default, a CodeCommit repository is created empty, with no branch. When the repository is populated with the seed.zip file, the master branch is created.

Resources

Access the CodeCommit repository to see the seed files at the master branch. Access the CodePipeline console to see that there’s a new pipeline with the name as the repository. This pipeline contains the CI+CD stages (homolog and prod).

 

6. Create the develop branch

To simulate a real development scenario, create a new branch called develop based on the master branch. In the GitFlow concept these two (master and develop) branches are fixed and never deleted.

When this new branch is created, the Events rule identifies that there’s a change on this repository and triggers the CreatePipeline Lambda function to create a new pipeline for this branch. Access the CodePipeline console to see that there’s a new pipeline with the name of the repository plus the branch name. This pipeline contains the CI+CD stages (Dev).

# Configure Git Credentials using AWS CLI Credential Helper
mkdir myapp
cd myapp
git config --global credential.helper '!aws codecommit credential-helper [email protected]'
git config --global credential.UseHttpPath true

# Clone the CodeCommit repository
# You can get the URL in the CodeCommit Console
git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/myapp .

# Create the develop branch
# For more details: https://docs.aws.amazon.com/codecommit/latest/userguide/how-to-create-branch.html
git checkout -b develop
git push origin develop

 

7. Create the first feature branch

Now that there are two main and fixed branches (master and develop), you can create a feature branch. In the GitFlow concept, feature branches have a short lifetime and are frequently merged to the develop branch. This type of branch only exists during the development period. When the feature development is finished, it is merged to the develop branch and the feature branch is deleted.

# Create the feature-branch branch
# make sure that you are at develop branch
git checkout -b feature-abc
git push origin feature-abc

Just as with the develop branch, when this new branch is created, the Events rule triggers the CreatePipeline Lambda function to create a new pipeline for this branch. Access the CodePipeline console to see that there’s a new pipeline with the name of the repository plus the branch name. This pipeline contains only the CI stage, without a CD stage.

Pipelines-2

 

8. Create the first Pull Request

It’s possible to simulate the end of a feature development, when the branch is ready to be merged with the develop branch. Keep in mind that the feature branch has a short lifecycle:  when the merge is done, the feature branch is deleted along with its pipeline.

# Create the Pull Request
aws codecommit create-pull-request --title "My Pull Request" \
--description "Please review these changes by Tuesday" \
--targets repositoryName=$RepoName,sourceReference=feature-abc,destinationReference=develop \
--region us-east-1

 

9. Execute the Pull Request approval

To merge the feature branch to the develop branch, the Pull Request needs to be approved. In a real scenario, a teammate should do a peer review before approval.

# Accept the Pull Request
# You can get the Pull-Request-ID in the json output of the create-pull-request command
aws codecommit merge-pull-request-by-fast-forward --pull-request-id <PULL_REQUEST_ID_FROM_PREVIOUS_COMMAND> \
--repository-name $RepoName --region us-east-1

# Delete the feature-branch
aws codecommit delete-branch --repository-name $RepoName --branch-name feature-abc --region us-east-1

The new code is integrated with the develop branch. If there’s a conflict, it needs to be solved. After that, the featurebranch is deleted together with its pipeline. The Event Rule triggers the CreatePipeline Lambda function to delete the pipeline for its branch. Access the CodePipeline console to see that the pipeline for the feature branch is deleted.

Pipelines

 

10. Cleanup

To remove the resources created as part of this blog post, follow these steps:

Delete Pipeline Stacks

# Delete all the pipeline Stacks created by CreatePipeline Lambda
Pipelines=$(aws cloudformation list-stacks --stack-status-filter --region us-east-1 --query 'StackSummaries[? StackStatus==`CREATE_COMPLETE` && starts_with(StackName, `Pipeline`) == `true`].[StackName]' --output text)
while read -r Pipeline rest; do aws cloudformation delete-stack --stack-name $Pipeline --region us-east-1 ; done <<< $Pipelines

Delete Repository Stacks

# Delete all the Repository stacks Stacks created by Step 5.
Repos=$(aws cloudformation list-stacks --stack-status-filter --region us-east-1 --query 'StackSummaries[? StackStatus==`CREATE_COMPLETE` && starts_with(StackName, `Repo-`) == `true`].[StackName]' --output text)
while read -r Repo rest; do aws cloudformation delete-stack --stack-name $Repo --region us-east-1 ; done <<< $Repos

Delete Setup-Pipeline Stack

# Cleaning Bucket before Stack deletion 
aws s3 rm s3://"$(aws sts get-caller-identity --query Account --output text)"-templates --recursive

# Delete Setup Stack
aws cloudformation delete-stack --stack-name Setup-Pipeline --region us-east-1 

 

Conclusion

This blog post discussed how you can work with event-driven strategy and Infrastructure as Code to implement a multi-branch pipeline flow using CodePipeline. It demonstrated how an Events rule and Lambda function can be used to fully orchestrate the creation and deletion of pipelines.

 

About the Author

Henrique Bueno

Henrique Bueno

Henrique is a DevOps Consultant in the Brazilian Professional Services Team at Amazon Web Services. He has helped AWS customers design, build and deploy cloud native applications by following the Twelve-Factor App methodology.

Setting up a CI/CD pipeline by integrating Jenkins with AWS CodeBuild and AWS CodeDeploy

Post Syndicated from Noha Ghazal original https://aws.amazon.com/blogs/devops/setting-up-a-ci-cd-pipeline-by-integrating-jenkins-with-aws-codebuild-and-aws-codedeploy/

In this post, I explain how to use the Jenkins open-source automation server to deploy AWS CodeBuild artifacts with AWS CodeDeploy, creating a functioning CI/CD pipeline. When properly implemented, the CI/CD pipeline is triggered by code changes pushed to your GitHub repo, automatically fed into CodeBuild, then the output is deployed on CodeDeploy.

Solution overview

The functioning pipeline creates a fully managed build service that compiles your source code. It then produces code artifacts that can be used by CodeDeploy to deploy to your production environment automatically.

The deployment workflow starts by placing the application code on the GitHub repository. To automate this scenario, I added source code management to the Jenkins project under the Source Code section. I chose the GitHub option, which by design clones a copy from the GitHub repo content in the Jenkins local workspace directory.

In the second step of my automation procedure, I enabled a trigger for the Jenkins server using an “Poll SCM” option. This option makes Jenkins check the configured repository for any new commits/code changes with a specified frequency. In this testing scenario, I configured the trigger to perform every two minutes. The automated Jenkins deployment process works as follows:

  1. Jenkins checks for any new changes on GitHub every two minutes.
  2. Change determination:
    1. If Jenkins finds no changes, Jenkins exits the procedure.
    2. If it does find changes, Jenkins clones all the files from the GitHub repository to the Jenkins server workspace directory.
  3. The File Operation plugin deletes all the files cloned from GitHub. This keeps the Jenkins workspace directory clean.
  4. The AWS CodeBuild plugin zips the files and sends them to a predefined Amazon S3 bucket location then initiates the CodeBuild project, which obtains the code from the S3 bucket. The project then creates the output artifact zip file, and stores that file again on the S3 bucket.
  5. The HTTP Request plugin downloads the CodeBuild output artifacts from the S3 bucket.
    I edited the S3 bucket policy to allow access from the Jenkins server IP address. See the following example policy:

    {
      "Version": "2012-10-17",
      "Id": "S3PolicyId1",
      "Statement": [
        {
          "Sid": "IPAllow",
          "Effect": "Allow",
          "Principal": "*",
          "Action": "s3:*",
          "Resource": "arn:aws:s3:::examplebucket/*",
          "Condition": {
             "IpAddress": {"aws:SourceIp": "x.x.x.x/x"},  <--- IP of the Jenkins server
          } 
        } 
      ]
    }
    
    

    This policy enables the HTTP request plugin to access the S3 bucket. This plugin doesn’t use the IAM instance profile or the AWS access keys (access key ID and secret access key).

  6. The output artifact is a compressed ZIP file. The CodeDeploy plugin by design requires the files to be unzipped to zip them and send them over to the S3 bucket for the CodeDeploy deployment. For that, I used the File Operation plugin to perform the following:
    1. Unzip the CodeBuild zipped artifact output in the Jenkins root workspace directory. At this point, the workspace directory should include the original zip file downloaded from the S3 bucket from Step 5 and the files extracted from this archive.
    2. Delete the original .zip file, and leave only the source bundle contents for the deployment.
  7. The CodeDeploy plugin selects and zips all workspace directory files. This plugin uses the CodeDeploy application name, deployment group name, and deployment configurations that you configured to initiate a new CodeDeploy deployment. The CodeDeploy plugin then uploads the newly zipped file according to the S3 bucket location provided to CodeDeploy as a source code for its new deployment operation.

Walkthrough

In this post, I walk you through the following steps:

  • Creating resources to build the infrastructure, including the Jenkins server, CodeBuild project, and CodeDeploy application.
  • Accessing and unlocking the Jenkins server.
  • Creating a project and configuring the CodeDeploy Jenkins plugin.
  • Testing the whole CI/CD pipeline.

Create the resources

In this section, I show you how to launch an AWS CloudFormation template, a tool that creates the following resources:

  • Amazon S3 bucket—Stores the GitHub repository files and the CodeBuild artifact application file that CodeDeploy uses.
  • IAM S3 bucket policy—Allows the Jenkins server access to the S3 bucket.
  • JenkinsRole—An IAM role and instance profile for the Amazon EC2 instance for use as a Jenkins server. This role allows Jenkins on the EC2 instance to access the S3 bucket to write files and access to create CodeDeploy deployments.
  • CodeDeploy application and CodeDeploy deployment group.
  • CodeDeploy service role—An IAM role to enable CodeDeploy to read the tags applied to the instances or the EC2 Auto Scaling group names associated with the instances.
  • CodeDeployRole—An IAM role and instance profile for the EC2 instances of CodeDeploy. This role has permissions to write files to the S3 bucket created by this template and to create deployments in CodeDeploy.
  • CodeBuildRole—An IAM role to be used by CodeBuild to access the S3 bucket and create the build projects.
  • Jenkins server—An EC2 instance running Jenkins.
  • CodeBuild project—This is configured with the S3 bucket and S3 artifact.
  • Auto Scaling group—Contains EC2 instances running Apache and the CodeDeploy agent fronted by an Elastic Load Balancer.
  • Auto Scaling launch configurations—For use by the Auto Scaling group.
  • Security groups—For the Jenkins server, the load balancer, and the CodeDeploy EC2 instances.

 

  1. To create the CloudFormation stack (for example in the AWS Frankfurt Region) click the below link:
    .

    .
  2. Choose Next and provide the following values on the Specify Details page:
    • For Stack name, name your stack as you prefer.
    • For CodedeployInstanceCount, choose the default of t2.medium.
      To check the supported instance types by AWS Region, see Supported Regions.
    • For InstanceCount, keep the default of 3, to launch three EC2 instances for CodeDeploy.
    • For JenkinsInstanceType, keep the default of t2.medium.
    • For KeyName, choose an existing EC2 key pair in your AWS account. Use this to connect by using SSH to the Jenkins server and the CodeDeploy EC2 instances. Make sure that you have access to the private key of this key pair.
    • For PublicSubnet1, choose a public subnet from which the load balancer, Jenkins server, and CodeDeploy web servers launch.
    • For PublicSubnet2, choose a public subnet from which the load balancers and CodeDeploy web servers launch.
    • For VpcId, choose the VPC for the public subnets you used in PublicSubnet1 and PublicSubnet2.
    • For YourIPRange, enter the CIDR block of the network from which you connect to the Jenkins server using HTTP and SSH. If your local machine has a static public IP address, go to https://www.whatismyip.com/ to find your IP address, and then enter your IP address followed by /32. If you don’t have a static IP address (or aren’t sure if you have one), enter 0.0.0.0/0. Then, any address can reach your Jenkins server.
      .
  3. Choose Next.
  4. On the Review page, select the I acknowledge that this template might cause AWS CloudFormation to create IAM resources check box.
  5. Choose Create and wait for the CloudFormation stack status to change to CREATE_COMPLETE. This takes approximately 6–10 minutes.
  6. Check the resulting values on the Outputs tab. You need them later.
    .
  7. Browse to the ELBDNSName value from the Outputs tab, verifying that you can see the Sample page. You should see a congratulatory message.
  8. Your Jenkins server should be ready to deploy.

Access and unlock your Jenkins server

In this section, I discuss how to access, unlock, and customize your Jenkins server.

  1. Copy the JenkinsServerDNSName value from the Outputs tab of the CloudFormation stack, and paste it into your browser.
  2. To unlock the Jenkins server, SSH to the server using the IP address and key pair, following the instructions from Unlocking Jenkins.
  3. Use the root user to Cat the log file (/var/log/jenkins/jenkins.log) and copy the automatically generated alphanumeric password (between the two sets of asterisks). Then, use the password to unlock your Jenkins server, as shown in the following screenshots.
    .
  4. On the Customize Jenkins page, choose Install suggested plugins.

  5. Wait until Jenkins installs all the suggested plugins. When the process completes, you should see the check marks alongside all of the installed plugins.
    .
    .
  6. On the Create First Admin User page, enter a user name, password, full name, and email address of the Jenkins user.
  7. Choose Save and continue, Save and finish, and Start using Jenkins.
    .
    After you install all the needed Jenkins plugins along with their required dependencies, the Jenkins server restarts. This step should take about two minutes. After Jenkins restarts, refresh the page. Your Jenkins server should be ready to use.

Create a project and configure the CodeDeploy Jenkins plugin

Now, to create our project in Jenkins we need to configure the required Jenkins plugin.

  1. Sign in to Jenkins with the user name and password that you created earlier and click on Manage Jenkins then Manage Plugins.
  2. From the Available tab search for and select the below plugins then choose Install without restart:
    .
    AWS CodeDeploy
    AWS CodeBuild
    Http Request
    File Operations
    .
  3. Select the Restart Jenkins when installation is complete and no jobs are running.
    Jenkins will take couple of minutes to download the plugins along with their dependencies then will restart.
  4. Login then choose New Item, Freestyle project.
  5. Enter a name for the project (for example, CodeDeployApp), and choose OK.
    .

    .
  6. On the project configuration page, under Source Code Management, choose Git. For Repository URL, enter the URL of your GitHub repository.
    .

    .
  7. For Build Triggers, select the Poll SCM check box. In the Schedule, for testing enter H/2 * * * *. This entry tells Jenkins to poll GitHub every two minutes for updates.
    .

    .
  8. Under Build Environment, select the Delete workspace before build starts check box. Each Jenkins project has a dedicated workspace directory. This option allows you to wipe out your workspace directory with each new Jenkins build, to keep it clean.
    .

    .
  9. Under Build Actions, add a Build Step, and AWS CodeBuild. On the AWS Configurations, choose Manually specify access and secret keys and provide the keys.
    .
    .
  10. From the CloudFormation stack Outputs tab, copy the AWS CodeBuild project name (myProjectName) and paste it in the Project Name field. Also, set the Region that you are using and choose Use Jenkins source.
    It is a best practice is to store AWS credentials for CodeBuild in the native Jenkins credential store. For more information, see the Jenkins AWS CodeBuild Plugin wiki.
    .
    .
  11. To make sure that all files cloned from the GitHub repository are deleted choose Add build step and select File Operation plugin, then click Add and select File Delete. Under File Delete operation in the Include File Pattern, type an asterisk.
    .
    .
  12. Under Build, configure the following:
    1. Choose Add a Build step.
    2. Choose HTTP Request.
    3. Copy the S3 bucket name from the CloudFormation stack Outputs tab and paste it after (http://s3-eu-central-1.amazonaws.com/) along with the name of the zip file codebuild-artifact.zip as the value for HTTP Plugin URL.
      Example: (http://s3-eu-central-1.amazonaws.com/mybucketname/codebuild-artifact.zip)
    4. For Ignore SSL errors?, choose Yes.
      .

      .
  13. Under HTTP Request, choose Advanced and leave the default values for Authorization, Headers, and Body. Under Response, for Output response to file, enter the codebuild-artifact.zip file name.
    .

    .
  14. Add the two build steps for the File Operations plugin, in the following order:
    1. Unzip action: This build step unzips the codebuild-artifact.zip file and places the contents in the root workspace directory.
    2. File Delete action: This build step deletes the codebuild-artifact.zip file, leaving only the source bundle contents for deployment.
      .
      .
  15. On the Post-build Actions, choose Add post-build actions and select the Deploy an application to AWS CodeDeploy check box.
  16. Enter the following values from the Outputs tab of your CloudFormation stack and leave the other settings at their default (blank):
    • For AWS CodeDeploy Application Name, enter the value of CodeDeployApplicationName.
    • For AWS CodeDeploy Deployment Group, enter the value of CodeDeployDeploymentGroup.
    • For AWS CodeDeploy Deployment Config, enter CodeDeployDefault.OneAtATime.
    • For AWS Region, choose the Region where you created the CodeDeploy environment.
    • For S3 Bucket, enter the value of S3BucketName.
      The CodeDeploy plugin uses the Include Files option to filter the files based on specific file names existing in your current Jenkins deployment workspace directory. The plugin zips specified files into one file. It then sends them to the location specified in the S3 Bucket parameter for CodeDeploy to download and use in the new deployment.
      .
      As shown below, in the optional Include Files field, I used (**) so all files in the workspace directory get zipped.
      .
      .
  17. Choose Deploy Revision. This option registers the newly created revision to your CodeDeploy application and gets it ready for deployment.
  18. Select the Wait for deployment to finish? check box. This option allows you to view the CodeDeploy deployments logs and events on your Jenkins server console output.
    .
    .
    Now that you have created a project, you are ready to test deployment.

Testing the whole CI/CD pipeline

To test the whole solution, put an application on your GitHub repository. You can download the sample from here.

The following screenshot shows an application tree containing the application source files, including text and binary files, executables, and packages:

In this example, the application files are the templates directory, test_app.py file, and web.py file.

The appspec.yml file is the main application specification file telling CodeDeploy how to deploy your application. Jenkins uses the AppSpec file to manage each deployment as a series of lifecycle event “hooks”, as defined in the file. For information about how to create a well-formed AppSpec file, see AWS CodeDeploy AppSpec File Reference.

The buildspec.yml file is a collection of build commands and related settings, in YAML format, that CodeBuild uses to run a build. You can include a build spec as part of the source code, or you can define a build spec when you create a build project. For more information, see How AWS CodeBuild Works.

The scripts folder contains the scripts that you would like to run during the CodeDeploy LifecycleHooks execution with respect to your application requirements. For more information, see Plan a Revision for AWS CodeDeploy.

To test this solution, perform the following steps:

  1. Unzip the application files and send them to your GitHub repository, run the following git commands from the path where you placed your sample application:
    $ git add -A
    
    $ git commit -m 'Your first application'
    
    $ git push
  2. On the Jenkins server dashboard, wait for two minutes until the previously set project trigger starts working. After the trigger starts working, you should see a new build taking place.
    .

    .
  3. In the Jenkins server Console Output page, check the build events and review the steps performed by each Jenkins plugin. You can also review the CodeDeploy deployment in detail, as shown in the following screenshot:
    .

On completion, Jenkins should report that you have successfully deployed a web application. You can also use your ELBDNSName value to confirm that the deployed application is running successfully.

.

.Conclusion

In this post, I outlined how you can use a Jenkins open-source automation server to deploy CodeBuild artifacts with CodeDeploy. I showed you how to construct a functioning CI/CD pipeline with these tools. I walked you through how to build the deployment infrastructure and automatically deploy application version changes from GitHub to your production environment.

Hopefully, you have found this post informative and the proposed solution useful. As always, AWS welcomes all feedback or comment.

About the Author

.

 

Noha Ghazal is a Cloud Support Engineer at Amazon Web Services. She is is a subject matter expert for AWS CodeDeploy. In her role, she enjoys supporting customers with their CodeDeploy and other DevOps configurations. Outside of work she enjoys drawing portraits, fishing and playing video games.

 

 

EC2 Fleet – Manage Thousands of On-Demand and Spot Instances with One Request

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/ec2-fleet-manage-thousands-of-on-demand-and-spot-instances-with-one-request/

EC2 Spot Fleets are really cool. You can launch a fleet of Spot Instances that spans EC2 instance types and Availability Zones without having to write custom code to discover capacity or monitor prices. You can set the target capacity (the size of the fleet) in units that are meaningful to your application and have Spot Fleet create and then maintain the fleet on your behalf. Our customers are creating Spot Fleets of all sizes. For example, one financial service customer runs Monte Carlo simulations across 10 different EC2 instance types. They routinely make requests for hundreds of thousands of vCPUs and count on Spot Fleet to give them access to massive amounts of capacity at the best possible price.

EC2 Fleet
Today we are extending and generalizing the set-it-and-forget-it model that we pioneered in Spot Fleet with EC2 Fleet, a new building block that gives you the ability to create fleets that are composed of a combination of EC2 On-Demand, Reserved, and Spot Instances with a single API call. You tell us what you need, capacity and instance-wise, and we’ll handle all the heavy lifting. We will launch, manage, monitor and scale instances as needed, without the need for scaffolding code.

You can specify the capacity of your fleet in terms of instances, vCPUs, or application-oriented units, and also indicate how much of the capacity should be fulfilled by Spot Instances. The application-oriented units allow you to specify the relative power of each EC2 instance type in a way that directly maps to the needs of your application. All three capacity specification options (instances, vCPUs, and application-oriented units) are known as weights.

I think you’ll find a number ways this feature makes managing a fleet of instances easier, and believe that you will also appreciate the team’s near-term feature roadmap of interest (more on that in a bit).

Using EC2 Fleet
There are a number of ways that you can use this feature, whether you’re running a stateless web service, a big data cluster or a continuous integration pipeline. Today I’m going to describe how you can use EC2 Fleet for genomic processing, but this is similar to workloads like risk analysis, log processing or image rendering. Modern DNA sequencers can produce multiple terabytes of raw data each day, to process that data into meaningful information in a timely fashion you need lots of processing power. I’ll be showing you how to deploy a “grid” of worker nodes that can quickly crunch through secondary analysis tasks in parallel.

Projects in genomics can use the elasticity EC2 provides to experiment and try out new pipelines on hundreds or even thousands of servers. With EC2 you can access as many cores as you need and only pay for what you use. Prior to today, you would need to use the RunInstances API or an Auto Scaling group for the On-Demand & Reserved Instance portion of your grid. To get the best price performance you’d also create and manage a Spot Fleet or multiple Spot Auto Scaling groups with different instance types if you wanted to add Spot Instances to turbo-boost your secondary analysis. Finally, to automate scaling decisions across multiple APIs and Auto Scaling groups you would need to write Lambda functions that periodically assess your grid’s progress & backlog, as well as current Spot prices – modifying your Auto Scaling Groups and Spot Fleets accordingly.

You can now replace all of this with a single EC2 Fleet, analyzing genomes at scale for as little as $1 per analysis. In my grid, each step in in the pipeline requires 1 vCPU and 4 GiB of memory, a perfect match for M4 and M5 instances with 4 GiB of memory per vCPU. I will create a fleet using M4 and M5 instances with weights that correspond to the number of vCPUs on each instance:

  • m4.16xlarge – 64 vCPUs, weight = 64
  • m5.24xlarge – 96 vCPUs, weight = 96

This is expressed in a template that looks like this:

"Overrides": [
{
  "InstanceType": "m4.16xlarge",
  "WeightedCapacity": 64,
},
{
  "InstanceType": "m5.24xlarge",
  "WeightedCapacity": 96,
},
]

By default, EC2 Fleet will select the most cost effective combination of instance types and Availability Zones (both specified in the template) using the current prices for the Spot Instances and public prices for the On-Demand Instances (if you specify instances for which you have matching RIs, your discounts will apply). The default mode takes weights into account to get the instances that have the lowest price per unit. So for my grid, fleet will find the instance that offers the lowest price per vCPU.

Now I can request capacity in terms of vCPUs, knowing EC2 Fleet will select the lowest cost option using only the instance types I’ve defined as acceptable. Also, I can specify how many vCPUs I want to launch using On-Demand or Reserved Instance capacity and how many vCPUs should be launched using Spot Instance capacity:

"TargetCapacitySpecification": {
	"TotalTargetCapacity": 2880,
	"OnDemandTargetCapacity": 960,
	"SpotTargetCapacity": 1920,
	"DefaultTargetCapacityType": "Spot"
}

The above means that I want a total of 2880 vCPUs, with 960 vCPUs fulfilled using On-Demand and 1920 using Spot. The On-Demand price per vCPU is lower for m5.24xlarge than the On-Demand price per vCPU for m4.16xlarge, so EC2 Fleet will launch 10 m5.24xlarge instances to fulfill 960 vCPUs. Based on current Spot pricing (again, on a per-vCPU basis), EC2 Fleet will choose to launch 30 m4.16xlarge instances or 20 m5.24xlarges, delivering 1920 vCPUs either way.

Putting it all together, I have a single file (fl1.json) that describes my fleet:

    "LaunchTemplateConfigs": [
        {
            "LaunchTemplateSpecification": {
                "LaunchTemplateId": "lt-0e8c754449b27161c",
                "Version": "1"
            }
        "Overrides": [
        {
          "InstanceType": "m4.16xlarge",
          "WeightedCapacity": 64,
        },
        {
          "InstanceType": "m5.24xlarge",
          "WeightedCapacity": 96,
        },
      ]
        }
    ],
    "TargetCapacitySpecification": {
        "TotalTargetCapacity": 2880,
        "OnDemandTargetCapacity": 960,
        "SpotTargetCapacity": 1920,
        "DefaultTargetCapacityType": "Spot"
    }
}

I can launch my fleet with a single command:

$ aws ec2 create-fleet --cli-input-json file://home/ec2-user/fl1.json
{
    "FleetId":"fleet-838cf4e5-fded-4f68-acb5-8c47ee1b248a"
}

My entire fleet is created within seconds and was built using 10 m5.24xlarge On-Demand Instances and 30 m4.16xlarge Spot Instances, since the current Spot price was 1.5¢ per vCPU for m4.16xlarge and 1.6¢ per vCPU for m5.24xlarge.

Now lets imagine my grid has crunched through its backlog and no longer needs the additional Spot Instances. I can then modify the size of my fleet by changing the target capacity in my fleet specification, like this:

{         
    "TotalTargetCapacity": 960,
}

Since 960 was equal to the amount of On-Demand vCPUs I had requested, when I describe my fleet I will see all of my capacity being delivered using On-Demand capacity:

"TargetCapacitySpecification": {
	"TotalTargetCapacity": 960,
	"OnDemandTargetCapacity": 960,
	"SpotTargetCapacity": 0,
	"DefaultTargetCapacityType": "Spot"
}

When I no longer need my fleet I can delete it and terminate the instances in it like this:

$ aws ec2 delete-fleets --fleet-id fleet-838cf4e5-fded-4f68-acb5-8c47ee1b248a \
  --terminate-instances   
{
    "UnsuccessfulFleetDletetions": [],
    "SuccessfulFleetDeletions": [
        {
            "CurrentFleetState": "deleted_terminating",
            "PreviousFleetState": "active",
            "FleetId": "fleet-838cf4e5-fded-4f68-acb5-8c47ee1b248a"
        }
    ]
}

Earlier I described how RI discounts apply when EC2 Fleet launches instances for which you have matching RIs, so you might be wondering how else RI customers benefit from EC2 Fleet. Let’s say that I own regional RIs for M4 instances. In my EC2 Fleet I would remove m5.24xlarge and specify m4.10xlarge and m4.16xlarge. Then when EC2 Fleet creates the grid, it will quickly find M4 capacity across the sizes and AZs I’ve specified, and my RI discounts apply automatically to this usage.

In the Works
We plan to connect EC2 Fleet and EC2 Auto Scaling groups. This will let you create a single fleet that mixed instance types and Spot, Reserved and On-Demand, while also taking advantage of EC2 Auto Scaling features such as health checks and lifecycle hooks. This integration will also bring EC2 Fleet functionality to services such as Amazon ECS, Amazon EKS, and AWS Batch that build on and make use of EC2 Auto Scaling for fleet management.

Available Now
You can create and make use of EC2 Fleets today in all public AWS Regions!

Jeff;

CI/CD with Data: Enabling Data Portability in a Software Delivery Pipeline with AWS Developer Tools, Kubernetes, and Portworx

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/cicd-with-data-enabling-data-portability-in-a-software-delivery-pipeline-with-aws-developer-tools-kubernetes-and-portworx/

This post is written by Eric Han – Vice President of Product Management Portworx and Asif Khan – Solutions Architect

Data is the soul of an application. As containers make it easier to package and deploy applications faster, testing plays an even more important role in the reliable delivery of software. Given that all applications have data, development teams want a way to reliably control, move, and test using real application data or, at times, obfuscated data.

For many teams, moving application data through a CI/CD pipeline, while honoring compliance and maintaining separation of concerns, has been a manual task that doesn’t scale. At best, it is limited to a few applications, and is not portable across environments. The goal should be to make running and testing stateful containers (think databases and message buses where operations are tracked) as easy as with stateless (such as with web front ends where they are often not).

Why is state important in testing scenarios? One reason is that many bugs manifest only when code is tested against real data. For example, we might simply want to test a database schema upgrade but a small synthetic dataset does not exercise the critical, finer corner cases in complex business logic. If we want true end-to-end testing, we need to be able to easily manage our data or state.

In this blog post, we define a CI/CD pipeline reference architecture that can automate data movement between applications. We also provide the steps to follow to configure the CI/CD pipeline.

 

Stateful Pipelines: Need for Portable Volumes

As part of continuous integration, testing, and deployment, a team may need to reproduce a bug found in production against a staging setup. Here, the hosting environment is comprised of a cluster with Kubernetes as the scheduler and Portworx for persistent volumes. The testing workflow is then automated by AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild.

Portworx offers Kubernetes storage that can be used to make persistent volumes portable between AWS environments and pipelines. The addition of Portworx to the AWS Developer Tools continuous deployment for Kubernetes reference architecture adds persistent storage and storage orchestration to a Kubernetes cluster. The example uses MongoDB as the demonstration of a stateful application. In practice, the workflow applies to any containerized application such as Cassandra, MySQL, Kafka, and Elasticsearch.

Using the reference architecture, a developer calls CodePipeline to trigger a snapshot of the running production MongoDB database. Portworx then creates a block-based, writable snapshot of the MongoDB volume. Meanwhile, the production MongoDB database continues serving end users and is uninterrupted.

Without the Portworx integrations, a manual process would require an application-level backup of the database instance that is outside of the CI/CD process. For larger databases, this could take hours and impact production. The use of block-based snapshots follows best practices for resilient and non-disruptive backups.

As part of the workflow, CodePipeline deploys a new MongoDB instance for staging onto the Kubernetes cluster and mounts the second Portworx volume that has the data from production. CodePipeline triggers the snapshot of a Portworx volume through an AWS Lambda function, as shown here

 

 

 

AWS Developer Tools with Kubernetes: Integrated Workflow with Portworx

In the following workflow, a developer is testing changes to a containerized application that calls on MongoDB. The tests are performed against a staging instance of MongoDB. The same workflow applies if changes were on the server side. The original production deployment is scheduled as a Kubernetes deployment object and uses Portworx as the storage for the persistent volume.

The continuous deployment pipeline runs as follows:

  • Developers integrate bug fix changes into a main development branch that gets merged into a CodeCommit master branch.
  • Amazon CloudWatch triggers the pipeline when code is merged into a master branch of an AWS CodeCommit repository.
  • AWS CodePipeline sends the new revision to AWS CodeBuild, which builds a Docker container image with the build ID.
  • AWS CodeBuild pushes the new Docker container image tagged with the build ID to an Amazon ECR registry.
  • Kubernetes downloads the new container (for the database client) from Amazon ECR and deploys the application (as a pod) and staging MongoDB instance (as a deployment object).
  • AWS CodePipeline, through a Lambda function, calls Portworx to snapshot the production MongoDB and deploy a staging instance of MongoDB• Portworx provides a snapshot of the production instance as the persistent storage of the staging MongoDB
    • The MongoDB instance mounts the snapshot.

At this point, the staging setup mimics a production environment. Teams can run integration and full end-to-end tests, using partner tooling, without impacting production workloads. The full pipeline is shown here.

 

Summary

This reference architecture showcases how development teams can easily move data between production and staging for the purposes of testing. Instead of taking application-specific manual steps, all operations in this CodePipeline architecture are automated and tracked as part of the CI/CD process.

This integrated experience is part of making stateful containers as easy as stateless. With AWS CodePipeline for CI/CD process, developers can easily deploy stateful containers onto a Kubernetes cluster with Portworx storage and automate data movement within their process.

The reference architecture and code are available on GitHub:

● Reference architecture: https://github.com/portworx/aws-kube-codesuite
● Lambda function source code for Portworx additions: https://github.com/portworx/aws-kube-codesuite/blob/master/src/kube-lambda.py

For more information about persistent storage for containers, visit the Portworx website. For more information about Code Pipeline, see the AWS CodePipeline User Guide.

Implement continuous integration and delivery of serverless AWS Glue ETL applications using AWS Developer Tools

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-serverless-aws-glue-etl-applications-using-aws-developer-tools/

AWS Glue is an increasingly popular way to develop serverless ETL (extract, transform, and load) applications for big data and data lake workloads. Organizations that transform their ETL applications to cloud-based, serverless ETL architectures need a seamless, end-to-end continuous integration and continuous delivery (CI/CD) pipeline: from source code, to build, to deployment, to product delivery. Having a good CI/CD pipeline can help your organization discover bugs before they reach production and deliver updates more frequently. It can also help developers write quality code and automate the ETL job release management process, mitigate risk, and more.

AWS Glue is a fully managed data catalog and ETL service. It simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, and job scheduling. AWS Glue crawls your data sources and constructs a data catalog using pre-built classifiers for popular data formats and data types, including CSV, Apache Parquet, JSON, and more.

When you are developing ETL applications using AWS Glue, you might come across some of the following CI/CD challenges:

  • Iterative development with unit tests
  • Continuous integration and build
  • Pushing the ETL pipeline to a test environment
  • Pushing the ETL pipeline to a production environment
  • Testing ETL applications using real data (live test)
  • Exploring and validating data

In this post, I walk you through a solution that implements a CI/CD pipeline for serverless AWS Glue ETL applications supported by AWS Developer Tools (including AWS CodePipeline, AWS CodeCommit, and AWS CodeBuild) and AWS CloudFormation.

Solution overview

The following diagram shows the pipeline workflow:

This solution uses AWS CodePipeline, which lets you orchestrate and automate the test and deploy stages for ETL application source code. The solution consists of a pipeline that contains the following stages:

1.) Source Control: In this stage, the AWS Glue ETL job source code and the AWS CloudFormation template file for deploying the ETL jobs are both committed to version control. I chose to use AWS CodeCommit for version control.

To get the ETL job source code and AWS CloudFormation template, download the gluedemoetl.zip file. This solution is developed based on a previous post, Build a Data Lake Foundation with AWS Glue and Amazon S3.

2.) LiveTest: In this stage, all resources—including AWS Glue crawlers, jobs, S3 buckets, roles, and other resources that are required for the solution—are provisioned, deployed, live tested, and cleaned up.

The LiveTest stage includes the following actions:

  • Deploy: In this action, all the resources that are required for this solution (crawlers, jobs, buckets, roles, and so on) are provisioned and deployed using an AWS CloudFormation template.
  • AutomatedLiveTest: In this action, all the AWS Glue crawlers and jobs are executed and data exploration and validation tests are performed. These validation tests include, but are not limited to, record counts in both raw tables and transformed tables in the data lake and any other business validations. I used AWS CodeBuild for this action.
  • LiveTestApproval: This action is included for the cases in which a pipeline administrator approval is required to deploy/promote the ETL applications to the next stage. The pipeline pauses in this action until an administrator manually approves the release.
  • LiveTestCleanup: In this action, all the LiveTest stage resources, including test crawlers, jobs, roles, and so on, are deleted using the AWS CloudFormation template. This action helps minimize cost by ensuring that the test resources exist only for the duration of the AutomatedLiveTest and LiveTestApproval

3.) DeployToProduction: In this stage, all the resources are deployed using the AWS CloudFormation template to the production environment.

Try it out

This code pipeline takes approximately 20 minutes to complete the LiveTest test stage (up to the LiveTest approval stage, in which manual approval is required).

To get started with this solution, choose Launch Stack:

This creates the CI/CD pipeline with all of its stages, as described earlier. It performs an initial commit of the sample AWS Glue ETL job source code to trigger the first release change.

In the AWS CloudFormation console, choose Create. After the template finishes creating resources, you see the pipeline name on the stack Outputs tab.

After that, open the CodePipeline console and select the newly created pipeline. Initially, your pipeline’s CodeCommit stage shows that the source action failed.

Allow a few minutes for your new pipeline to detect the initial commit applied by the CloudFormation stack creation. As soon as the commit is detected, your pipeline starts. You will see the successful stage completion status as soon as the CodeCommit source stage runs.

In the CodeCommit console, choose Code in the navigation pane to view the solution files.

Next, you can watch how the pipeline goes through the LiveTest stage of the deploy and AutomatedLiveTest actions, until it finally reaches the LiveTestApproval action.

At this point, if you check the AWS CloudFormation console, you can see that a new template has been deployed as part of the LiveTest deploy action.

At this point, make sure that the AWS Glue crawlers and the AWS Glue job ran successfully. Also check whether the corresponding databases and external tables have been created in the AWS Glue Data Catalog. Then verify that the data is validated using Amazon Athena, as shown following.

Open the AWS Glue console, and choose Databases in the navigation pane. You will see the following databases in the Data Catalog:

Open the Amazon Athena console, and run the following queries. Verify that the record counts are matching.

SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data";
SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake";

The following shows the raw data:

The following shows the transformed data:

The pipeline pauses the action until the release is approved. After validating the data, manually approve the revision on the LiveTestApproval action on the CodePipeline console.

Add comments as needed, and choose Approve.

The LiveTestApproval stage now appears as Approved on the console.

After the revision is approved, the pipeline proceeds to use the AWS CloudFormation template to destroy the resources that were deployed in the LiveTest deploy action. This helps reduce cost and ensures a clean test environment on every deployment.

Production deployment is the final stage. In this stage, all the resources—AWS Glue crawlers, AWS Glue jobs, Amazon S3 buckets, roles, and so on—are provisioned and deployed to the production environment using the AWS CloudFormation template.

After successfully running the whole pipeline, feel free to experiment with it by changing the source code stored on AWS CodeCommit. For example, if you modify the AWS Glue ETL job to generate an error, it should make the AutomatedLiveTest action fail. Or if you change the AWS CloudFormation template to make its creation fail, it should affect the LiveTest deploy action. The objective of the pipeline is to guarantee that all changes that are deployed to production are guaranteed to work as expected.

Conclusion

In this post, you learned how easy it is to implement CI/CD for serverless AWS Glue ETL solutions with AWS developer tools like AWS CodePipeline and AWS CodeBuild at scale. Implementing such solutions can help you accelerate ETL development and testing at your organization.

If you have questions or suggestions, please comment below.

 


Additional Reading

If you found this post useful, be sure to check out Implement Continuous Integration and Delivery of Apache Spark Applications using AWS and Build a Data Lake Foundation with AWS Glue and Amazon S3.

 


About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable Big data, Machine learning, Artificial Intelligence and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as Advanced Edge Computing, Machine learning at Edge. In his spare time, he enjoys spending time with his family.

 
Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improving the value of their solutions when using AWS.

 

 

 

Migrating .NET Classic Applications to Amazon ECS Using Windows Containers

Post Syndicated from Sundar Narasiman original https://aws.amazon.com/blogs/compute/migrating-net-classic-applications-to-amazon-ecs-using-windows-containers/

This post contributed by Sundar Narasiman, Arun Kannan, and Thomas Fuller.

AWS recently announced the general availability of Windows container management for Amazon Elastic Container Service (Amazon ECS). Docker containers and Amazon ECS make it easy to run and scale applications on a virtual machine by abstracting the complex cluster management and setup needed.

Classic .NET applications are developed with .NET Framework 4.7.1 or older and can run only on a Windows platform. These include Windows Communication Foundation (WCF), ASP.NET Web Forms, and an ASP.NET MVC web app or web API.

Why classic ASP.NET?

ASP.NET MVC 4.6 and older versions of ASP.NET occupy a significant footprint in the enterprise web application space. As enterprises move towards microservices for new or existing applications, containers are one of the stepping stones for migrating from monolithic to microservices architectures. Additionally, the support for Windows containers in Windows 10, Windows Server 2016, and Visual Studio Tooling support for Docker simplifies the containerization of ASP.NET MVC apps.

Getting started

In this post, you pick an ASP.NET 4.6.2 MVC application and get step-by-step instructions for migrating to ECS using Windows containers. The detailed steps, AWS CloudFormation template, Microsoft Visual Studio solution, ECS service definition, and ECS task definition are available in the aws-ecs-windows-aspnet GitHub repository.

To help you getting started running Windows containers, here is the reference architecture for Windows containers on GitHub: ecs-refarch-cloudformation-windows. This reference architecture is the layered CloudFormation stack, in that it calls the other stacks to create the environment. The CloudFormation YAML template in this reference architecture is referenced to create a single JSON CloudFormation stack, which is used in the steps for the migration.

Steps for Migration

The code and templates to implement this migration can be found on GitHub: https://github.com/aws-samples/aws-ecs-windows-aspnet.

  1. Your development environment needs to have the latest version and updates for Visual Studio 2017, Windows 10, and Docker for Windows Stable.
  2. Next, containerize the ASP.NET application and test it locally. The size of Windows container application images is generally larger compared to Linux containers. This is because the base image of the Windows container itself is large in size, typically greater than 9 GB.
  3. After the application is containerized, the container image needs to be pushed to Amazon Elastic Container Registry (Amazon ECR). Images stored in ECR are compressed to improve pull times and reduce storage costs. In this case, you can see that ECR compresses the image to around 1 GB, for an optimization factor of 90%.
  4. Create a CloudFormation stack using the template in the ‘CloudFormation template’ folder. This creates an ECS service, task definition (referring the containerized ASP.NET application), and other related components mentioned in the ECS reference architecture for Windows containers.
  5. After the stack is created, verify the successful creation of the ECS service, ECS instances, running tasks (with the threshold mentioned in the task definition), and the Application Load Balancer’s successful health check against running containers.
  6. Navigate to the Application Load Balancer URL and see the successful rendering of the containerized ASP.NET MVC app in the browser.

Key Notes

  • Generally, Windows container images occupy large amount of space (in the order of few GBs).
  • All the task definition parameters for Linux containers are not available for Windows containers. For more information, see Windows Task Definitions.
  • An Application Load Balancer can be configured to route requests to one or more ports on each container instance in a cluster. The dynamic port mapping allows you to have multiple tasks from a single service on the same container instance.
  • IAM roles for Windows tasks require extra configuration. For more information, see Windows IAM Roles for Tasks. For this post, configuration was handled by the CloudFormation template.
  • The ECS container agent log file can be accessed for troubleshooting Windows containers: C:\ProgramData\Amazon\ECS\log\ecs-agent.log

Summary

In this post, you migrated an ASP.NET MVC application to ECS using Windows containers.

The logical next step is to automate the activities for migration to ECS and build a fully automated continuous integration/continuous deployment (CI/CD) pipeline for Windows containers. This can be orchestrated by leveraging services such as AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, Amazon ECR, and Amazon ECS. You can learn more about how this is done in the Set Up a Continuous Delivery Pipeline for Containers Using AWS CodePipeline and Amazon ECS post.

If you have questions or suggestions, please comment below.

Continuous Deployment to Kubernetes using AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, Amazon ECR and AWS Lambda

Post Syndicated from Chris Barclay original https://aws.amazon.com/blogs/devops/continuous-deployment-to-kubernetes-using-aws-codepipeline-aws-codecommit-aws-codebuild-amazon-ecr-and-aws-lambda/

Thank you to my colleague Omar Lari for this blog on how to create a continuous deployment pipeline for Kubernetes!


You can use Kubernetes and AWS together to create a fully managed, continuous deployment pipeline for container based applications. This approach takes advantage of Kubernetes’ open-source system to manage your containerized applications, and the AWS developer tools to manage your source code, builds, and pipelines.

This post describes how to create a continuous deployment architecture for containerized applications. It uses AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, and AWS Lambda to deploy containerized applications into a Kubernetes cluster. In this environment, developers can remain focused on developing code without worrying about how it will be deployed, and development managers can be satisfied that the latest changes are always deployed.

What is Continuous Deployment?

There are many articles, posts and even conferences dedicated to the practice of continuous deployment. For the purposes of this post, I will summarize continuous delivery into the following points:

  • Code is more frequently released into production environments
  • More frequent releases allow for smaller, incremental changes reducing risk and enabling simplified roll backs if needed
  • Deployment is automated and requires minimal user intervention

For a more information, see “Practicing Continuous Integration and Continuous Delivery on AWS”.

How can you use continuous deployment with AWS and Kubernetes?

You can leverage AWS services that support continuous deployment to automatically take your code from a source code repository to production in a Kubernetes cluster with minimal user intervention. To do this, you can create a pipeline that will build and deploy committed code changes as long as they meet the requirements of each stage of the pipeline.

To create the pipeline, you will use the following services:

  • AWS CodePipeline. AWS CodePipeline is a continuous delivery service that models, visualizes, and automates the steps required to release software. You define stages in a pipeline to retrieve code from a source code repository, build that source code into a releasable artifact, test the artifact, and deploy it to production. Only code that successfully passes through all these stages will be deployed. In addition, you can optionally add other requirements to your pipeline, such as manual approvals, to help ensure that only approved changes are deployed to production.
  • AWS CodeCommit. AWS CodeCommit is a secure, scalable, and managed source control service that hosts private Git repositories. You can privately store and manage assets such as your source code in the cloud and configure your pipeline to automatically retrieve and process changes committed to your repository.
  • AWS CodeBuild. AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces artifacts that are ready to deploy. You can use AWS CodeBuild to both build your artifacts, and to test those artifacts before they are deployed.
  • AWS Lambda. AWS Lambda is a compute service that lets you run code without provisioning or managing servers. You can invoke a Lambda function in your pipeline to prepare the built and tested artifact for deployment by Kubernetes to the Kubernetes cluster.
  • Kubernetes. Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It provides a platform for running, deploying, and managing containers at scale.

An Example of Continuous Deployment to Kubernetes:

The following example illustrates leveraging AWS developer tools to continuously deploy to a Kubernetes cluster:

  1. Developers commit code to an AWS CodeCommit repository and create pull requests to review proposed changes to the production code. When the pull request is merged into the master branch in the AWS CodeCommit repository, AWS CodePipeline automatically detects the changes to the branch and starts processing the code changes through the pipeline.
  2. AWS CodeBuild packages the code changes as well as any dependencies and builds a Docker image. Optionally, another pipeline stage tests the code and the package, also using AWS CodeBuild.
  3. The Docker image is pushed to Amazon ECR after a successful build and/or test stage.
  4. AWS CodePipeline invokes an AWS Lambda function that includes the Kubernetes Python client as part of the function’s resources. The Lambda function performs a string replacement on the tag used for the Docker image in the Kubernetes deployment file to match the Docker image tag applied in the build, one that matches the image in Amazon ECR.
  5. After the deployment manifest update is completed, AWS Lambda invokes the Kubernetes API to update the image in the Kubernetes application deployment.
  6. Kubernetes performs a rolling update of the pods in the application deployment to match the docker image specified in Amazon ECR.
    The pipeline is now live and responds to changes to the master branch of the CodeCommit repository. This pipeline is also fully extensible, you can add steps for performing testing or adding a step to deploy into a staging environment before the code ships into the production cluster.

An example pipeline in AWS CodePipeline that supports this architecture can be seen below:

Conclusion

We are excited to see how you leverage this pipeline to help ease your developer experience as you develop applications in Kubernetes.

You’ll find an AWS CloudFormation template with everything necessary to spin up your own continuous deployment pipeline at the CodeSuite – Continuous Deployment Reference Architecture for Kubernetes repo on GitHub. The repository details exactly how the pipeline is provisioned and how you can use it to deploy your own applications. If you have any questions, feedback, or suggestions, please let us know!