Tag Archives: CI/CD

Combining Snyk’s Insight with Amazon Q Developer’s Assistance to Streamline Secure Development

Post Syndicated from Omar Faruk original https://aws.amazon.com/blogs/devops/combining-snyks-insight-with-amazon-q-developers-assistance-to-streamline-secure-development/

Developers today face a constant balancing act – building new features and functionality while also ensuring the security and reliability of their codebase. Two powerful tools, Snyk and Amazon Q Developer, can work in tandem to help developers navigate this challenge with greater efficiency and efficacy.

Snyk is a leading developer security platform that empowers developers to seamlessly secure their code, open-source dependencies, container images, and cloud infrastructure all from a single, unified platform. Amazon Q Developer is a generative AI-powered assistant designed to accelerate a variety of tasks across the software development lifecycle. By combining the security insights from Snyk with the assistive capabilities of Amazon Q Developer, developers can streamline their workflows and focus on delivery.

Getting started with Amazon Q Developer and Snyk IDE Plugins

To get started with Amazon Q Developer, you need to have an AWS Builder ID or be part of an organization with an AWS IAM Identity Center instance that allows you to use Amazon Q. To use Amazon Q Developer agents for software development in Visual Studio Code, start by installing the Amazon Q extension. Find the latest version of the extension on the Amazon Q Developer page. The extension is also available for JetBrains, Eclipse (Preview), and Visual Studio IDEs. For a detailed list of supported IDEs and the features available in each, refer to the Amazon Q Developer documentation.

To get started with Snyk, sign up for a free Snyk account or log in with your existing account. To use Snyk in your IDE to automatically find security issues, review the IDE documentation and install Snyk using your IDE extension marketplace. After Snyk is installed, navigate to the Snyk panel in your IDE and follow the on-screen instructions to authenticate with your Snyk account.

After authenticating, Snyk will automatically scan your entire codebase for security issues. Snyk will continue scanning periodically as you write code or generate code with Amazon Q Developer.

Walkthrough

Let’s explore how Snyk and Amazon Q Developer can be used together through a few examples. Imagine that you maintain an open-source project. As a new Snyk user, you would like to find and fix the security issues in the project. In this first and simple scenario, Snyk has identified many cases of security vulnerabilities in specific lines of code. Among the vulnerabilities, we’ll focus on the Information Exposure vulnerability.

Snyk's IDE plugin shares a list of vulnerabilities and an overview, such as the line of code with the vulnerability and detail, of the vulnerability when it is selected.

Figure 1 – Snyk IDE Plugin displaying vulnerability analysis of an Information Exposure issue, showing severity, affected code, and prevention tips.

Rather than manually researching and implementing the fix, you can simply highlight the flagged line, invoke Amazon Q Developer’s inline chat by pressing ⌘+I (Mac) or Ctrl+I (Windows), and request assistance. Amazon Q Developer will analyze the issue, propose the necessary code changes, and provide you with an inline diff to review and accept. This allows for rapid remediation of security flaws saving time while improving the code.

Activating inline Q Developer and making a prompt for the agent to resolve the information exposure vulnerability identified by Snyk.

Figure 2 – Activating Amazon Q Developer inline code generation to fix the detected information exposure vulnerability.

We are happy with the change Amazon Q Developer proposed, so we’ll simply hit enter to accept the suggestions. Of course, we could always hit escape to reject the suggestion if needed.

Q Developer makes an inline code suggestion to resolve the information exposure vulnerability.

Figure 3 – Amazon Q Developer displaying an inline code generation to fix the detected information exposure vulnerability.

In addition to the inline chat, you can pass the vulnerability details directly from the Snyk plugin’s Problems view into the Amazon Q Developer /dev agentic capability.

In the chat interface of Q Developer, the /dev agentic capability allows longer conversation, broader workspace context, and handle changes within multiple files and topics. When this workflow is invoked, the Amazon Q Developer Agent will generate code based on the description and existing code in the workspace, provide a list of suggestions to review and add to the workspace, and if needed, iterate on the code based on feedback.

Copying the information of the information exposure vulnerability from Snyk plugin and requesting a fix using /dev agent capability.

Figure 4 – Using Amazon Q’s /dev agent to implement project-wide fixes for Snyk-detected vulnerabilities across multiple files.

Not all issues are trivial as the prior example. In a more complex case, Snyk may surface a vulnerability that requires a deeper understanding of the code and the potential risk. Let’s look at another issue that Snyk identified in the project we have been discussing.

Snyk identified cross-site scripting vulnerability and explains the line of code, details, and prevention tips of the vulnerability.

Figure 5 – Snyk Plugin highlighting a cross-site scripting (XSS) vulnerability, showing the affected code line and prevention recommendations.

Here, you can switch to Amazon Q Developer’s chat interface, provide the details of the issue, and ask for a more thorough explanation. Amazon Q Developer can then dive into the codebase, explain the problem in detail, and walk you through the appropriate fixes. This collaborative approach empowers developers to make informed decisions and gain broader knowledge, rather than simply implementing a suggestion.

Chat interface that takes a prompt from user to explain why Snyk flagged an cross-site scripting vulnerability and its impact.

Figure 6 – Amazon Q Developer’s chat interface explaining an XSS vulnerability and its security implications through natural language dialogue.

Note that Amazon Q Developer provides links to documentation and other sources for further reading. In addition, you can continue discussing the issue to learn more. For example, imagine that you want to understand real world breaches that have occurred as a result of the issues that Synk has identified. Q provides a few examples for me to learn more.

A natural language query in the chat interface if there has been any major breaches caused by the issue of cross-site scripting. Q responds with popular and impactful incidents.

Figure 7 – Amazon Q Developer discussing notable real-world XSS breach examples and their security impacts.

Beyond fixing issues, Amazon Q Developer can also assist with other development tasks identified by Snyk, such as updating dependencies, refactoring code, or optimizing cloud infrastructure. By integrating these two tools, developers can streamline security scanning, issue investigation, and remediation, dramatically increasing their overall productivity.

Conclusion

In this blog, we took a look at how Snyk and Amazon Q Developer are a powerful duo in the modern developer’s toolkit. Integrating Snyk’s leading security insights with the generative AI capabilities of Amazon Q Developer empowers developers to more efficiently identify, comprehend, and address security vulnerabilities. This combination enables developers to upskill and enhance their own abilities as they work to resolve security issues. Get started with installing the Amazon Q Developer in the IDE and Snyk plugin.



Connect with AWS Partner Snyk.

Snyk – AWS Partner Spotlight

Snyk empowers the world’s developers to build secure applications and equip security teams to meet the demands of the digital world. Used by 1,200 customers worldwide, Snyk’s Developer Security Platform automatically integrates with a developer’s workflow and is purpose-built for security teams to collaborate with their development teams.

Snyk on AWS Marketplace

About the authors:

Omar Faruk

Omar Faruk is a DevOps Partner Solutions Architect at Amazon Web Services. He helps DevSecOps partners to design, build and operate their and shared customers’ workloads in AWS. He is passionate about CI/CD, Infrastructure as Code, and next-generation developer experience. Outside work, he enjoys family time and travel.

David Schott

David is a seasoned DevSecOps Solutions Engineer with 15+ years of experience helping Fortune 100 companies optimize their software delivery security and efficiency. After driving DevOps adoption and CI development at CloudBees, he now focuses on DevSecOps at Snyk, where he collaborates with strategic partners to enable secure innovation at scale.

Announcing General Availability of GitLab Duo with Amazon Q

Post Syndicated from Ryan Bachman original https://aws.amazon.com/blogs/devops/announcing-general-availability-of-gitlab-duo-with-amazon-q/

Announcing General Availability of GitLab Duo with Amazon Q

Today, we’re excited to announce the general availability of GitLab Duo with Amazon Q. This new offering is an integrated product, bringing together GitLab’s DevSecOps platform with Amazon Q’s generative AI capabilities. Gitlab Duo with Amazon Q embeds Amazon Q agent capabilities directly in GitLab’s DevSecOps platform to accelerate complex, multi-step tasks across the entire software development lifecycle.

In today’s fast-paced software development environment, developers are constantly looking for ways to improve productivity while maintaining best practices for code quality, security, and deployment. The integration of GitLab Duo with Amazon Q addresses these needs by combining GitLab’s comprehensive DevSecOps platform with Amazon Q’s intelligent coding assistance.

This integration enables developers to leverage AI throughout their entire workflow—from idea conception to deployment—all within the familiar GitLab environment they already use. For new and existing Amazon Q Developer users, this integration also leverages the same Q Developer agents available in the IDE, providing a consistent experience across different interfaces.

Key Benefits and Features

The GitLab Duo with Amazon Q integration delivers significant value to development teams by creating a more efficient, secure, and collaborative workflow.

This integration eliminates the need to switch between different tools and environments, as developers can access powerful AI assistance directly within GitLab. GitLab helps automate building, testing, packaging, and deployment of secure code, streamlining the entire development lifecycle. What makes this particularly powerful is how the AI agents utilize the context throughout a GitLab project to keep the SLDC “loop” going. So whether you are troubleshooting a failed pipeline, investigating a vulnerability, or writing a new feature, Amazon Q agents can leverage the appropriate context to assist you with the task at hand.

Security and compliance are foundational elements of this integration. End-to-end security controls are built directly into the platform. The Amazon Q agents come with appropriate guardrails to help customers meet compliance without affecting development velocity, all while leveraging AWS’s cloud infrastructure to scale your AI-enhanced development workflows with confidence. You can ask Amazon Q agents to help remediate a finding in the project’s vulnerability reports or help troubleshoot a failed pipeline.

Throughout your development workflow, you’ll find collaborative AI agents ready to assist with various tasks. Whether you need to upgrade Java code from version 8 or 11 to 17, get AI-powered code review suggestions, automatically generate comprehensive test cases, or transform ideas into complete merge requests—Amazon Q is there to help at every step. These intelligent agents work alongside your team, enhancing productivity.

Use Cases and Examples

To demonstrate how GitLab and Amazon Q complement each other to accelerate development productivity and help organizations with application security, I’ll be using a Java application enjoyed by puzzle enthusiasts.

A Video of playing Q Words

Idea to Merge Request

Whether you are looking to scale your developer teams or streamline the processes between feature requests and production, GitLab Duo with Amazon Q is now integrated into GitLab’s platform, so you can begin development simply by assigning a GitLab issue to Amazon Q Developer agents.

I start by creating a task in my GitLab project. I want to create a new feature to support multiple languages in the Q words game.

Creating a new issue for the Q Dev agent

From here I assign the task directly to the Amazon Q agent by using GitLab’s quick action /q dev in the issue’s comment section.

Invoking the Q Dev Agent with a quick action

The agent will automatically open up a merge request for me to review with suggested code changes. Here you can see changes the agent made across 11 files, accounting for front-end, API, and styling changes. In the past I would have opened my IDE, cloned the project, and coded these changes myself. Using GitLab Duo with Amazon Q, I just review and test the new code before I am ready to deploy.

The merge request created by the Q Dev Agent

Code Reviews

Code reviews play a critical function during the development life-cycle. They act as a quality gate to help maintain high quality security and coding standards. While important, code reviews add latency to software delivery, especially when reviewers are not available or when changes are complex.

The Amazon Q agent for Code Reviews in GitLab helps teams move faster through their code review. Using the quick action /q review in a merge request comment field sends the merge request to Amazon Q, where it will identify security and quality risks associated with code changes in the merge request.

I start by opening an open merge request. In this example, another developer had the task to add authentication to the Q words application.

I then invoke the agent with the /q review quick action.

Invoking the Q review agent

The review is returned as inline code suggestions to the merge request. Here you can see an example of a finding from the review agent. Comments include a description of the findings as well as guidance and links to help improve the code.

Amazon Q Review adds comments to the merge request

I next use the Gitlab Duo with Amazon Q chat agent in the web interface to ask for a summary of the change and ask it to highlight any critical issues. GitLab Duo chat allows me to ask questions about the current resource in the URL. In this example it is the merge request, but it could also be a GitLab issue I want to explain or a summary of a code file in a repository.

Chatting with GitLab Dou with Amazon Q

Test Generation

Next, I ask GitLab Duo with Amazon Q to generate tests using the /q test quick action. Adding this action to the comment field will generate recommended tests when the MR lacks sufficient tests.

A test recommened by the Q test agent

The summary I receive from GitLab Duo with Amazon Q helps me understand the scope of the changes and focuses my attention to the more important aspects of the change. Along with the tests that Q Developer agents recommended, I am able to approve the merge request in less time.

Java Transformation

Upgrading Java applications from older versions to Java 17 can be time-consuming and error-prone. With GitLab Duo and Amazon Q, I can leverage the transform agent to help me automate the migration from the current Java 8 code to Java 17 along with upgrading the project’s dependencies. I start by creating a new issue in my GitLab project that indicates the Java upgrade.

Creating a new issue for the Q Transform agent

To begin the upgrade, I use the GitLab Q quick action /q transform to begin the upgrade process. The Amazon Q transformation agent asks me to update the gitlab-ci.yaml file to continue the process.

Instructions on how to update your .gitlab-ci.yaml file

I can follow the agent’s progress by watching for updates in the Issue’s details. GitLab Duo with Amazon Q will also add a transformation plan to the issue so I can understand what types of changes will be involved to complete the upgrade.

The Amazon Q Agent will provide a transformation plan before it begins

When the transform is complete, a new merge request is opened for me to review. As you can see, my pom.xml file was updated to compile on Java 17 as well as additional changes to ensure the project compiles. It also includes a report detailing next steps to consider before merging and deploying the updated Java code.

A completed summary from the Q transform agent

Conclusion

In this post, I demonstrated how GitLab Duo with Amazon Q can help scale and improve application development. Using GitLab Duo with Amazon Q, I was able to quickly add additional features, review code changes, and upgrade my application to Java 17 all within GitLab’s collaborative interface. I now have a secure and modern java app that I can use to practice my Español.

The general availability of GitLab Duo with Amazon Q marks a significant milestone in AI-assisted software development. By combining GitLab’s comprehensive DevSecOps platform with Amazon Q ‘s generative AI capabilities, this integration empowers development teams to work more efficiently while maintaining high standards of security and compliance.

Organizations can now leverage this powerful integration to accelerate their software development lifecycle, reduce manual effort, and ship more secure code faster. The seamless developer experience, enterprise-grade security, and collaborative AI agents throughout the workflow make this integration a valuable addition to any development team’s toolkit. We’re excited to see how customers leverage this integration to transform their development processes and achieve new levels of productivity and innovation.

Learn More

About the Author:
Ryan Bachman profile image

Ryan Bachman

Ryan Bachman is a Sr. Specialist Solutions Architect with Amazon Web Services (AWS) Next Generation Developer Experience Team. Ryan is passionate about helping customers adopt process and services that increase their efficiency developing applications for the cloud. He has over 20 years professional experience as a technologist, including roles in development, network architecture, and technical product management.

Deploying and Managing Application Configurations using AWS AppConfig

Post Syndicated from Aditya Ranjan original https://aws.amazon.com/blogs/devops/deploying-and-managing-application-configurations-using-aws-appconfig/

The management of configurations across multiple environments and tenants poses a significant challenge in modern software development. Organizations must balance maintaining distinct settings for various environments while accommodating the unique needs of different tenants in multi-tenant architectures. This complexity is compounded by requirements for consistency, version control, security, and efficient troubleshooting.

AWS AppConfig offers a powerful solution to these challenges. AWS AppConfig centrally stores, manages, and deploys application configurations. It streamlines pushing changes without frequent code deployments. The service also enables automatic rollbacks, providing a safety net for configuration changes.

When integrated with a CI/CD pipeline, such as GitLab, AWS AppConfig becomes part of a streamlined, automated system for configuration management. This combination addresses the complexities of multi-environment and multi-tenant deployments, ensuring consistent, version-controlled, and secure configuration management across the entire application ecosystem.

Solution and Scenario Overview

The GitLab CI/CD pipeline in this blog focuses on the way application configurations are managed and deployed using AWS AppConfig. By automating the entire process from configuration updates to multi-environment deployment, it offers a streamlined approach to configuration management.

In this configuration management setup, we’re dealing with a multi-environment, multi-tenant application structure that leverages AWS AppConfig for configuration deployment.

It describes a multi-tenant configuration setup where each tenant has dedicated environments (dev and qa). Real-world examples of what these could represent:

  • Development (dev): Where developers test new features and changes
  • Quality Assurance (qa): Where quality assurance teams validate changes before production

The system supports multiple tenants (tenant1, tenant2), each with their own isolated environments. In real-world applications, these tenants could represent:

  • Different customers:
    • A retail company (tenant1)
    • A healthcare provider (tenant2)
  • Different business units:
    • North America division (tenant1)
    • EMEA division (tenant2)

Each tenant maintains separate configurations for their dev and qa environments, with three example configuration files:

  1. AllowList.yml
  2. FeatureFlags.yml
  3. ThrottlingLimits.yml

The ‘template’ directory provides base configuration files that can be inherited and customized by each tenant’s environment-specific configurations. This hierarchical structure ensures that tenants can maintain their unique configurations while adhering to a standardized template format.

Here’s an example of how the template YAML files might look:

  1. AllowList.yml
# AllowList.yml

# Network Access Controls
ip_allowlists:
  internal_networks:
    - "10.0.0.0/8"     # Internal corporate network
    - "172.16.0.0/12"  # VPC network range
    - "192.168.1.0/24" # Development network

# Domain Allowlist
domain_allowlist:
  api_consumers:
    - "api.partner1.com"
    - "services.partner2.com"
    - "*.trusted-client.com"
  1. FeatureFlags.yml
# FeatureFlags.yml

features:
  new_search:
    enabled: true
    rollout_percentage: 76
    description: "Enhanced search functionality"
    
  ai_recommendations:
    enabled: true

  chat_support:
    enabled: false
    description: "In-app chat support"
  1. ThrottlingLimits.yml
#ThrottlingLimits.yml
api_limits:
  global:
    requests_per_second: 100
    concurrent_requests: 50
    max_retry_attempts: 3

service_specific:
  user_service:
      requests_per_second: 80
      burst_limit: 100

These templates serve as the starting point for all environment and tenant-specific configurations.

The folder structure reflects a sophisticated approach to organizing configurations across different environments and tenants.

├── template
│   ├── AllowList.yml
│   ├── FeatureFlags.yml
│   └── ThrottlingLimits.yml

└── tenants
├── tenant1
│   ├── dev
│   │   ├── AllowList.yml
│   │   ├── FeatureFlags.yml
│   │   └── ThrottlingLimits.yml
│   └── qa
│       ├── AllowList.yml
│       ├── FeatureFlags.yml
│       └── ThrottlingLimits.yml
└── tenant2
├── dev
│   ├── AllowList.yml
│   ├── FeatureFlags.yml
│   └── ThrottlingLimits.yml
└── qa
├── AllowList.yml
├── FeatureFlags.yml
└── ThrottlingLimits.yml

At the root level, we have two main directories:

  1. template: Houses the base configuration templates
  2. tenants: Contains tenant-specific configurations

The ‘tenants’ directory follows a hierarchical structure where each tenant (tenant1, tenant2) has their own directory. Within each tenant’s directory, there are ‘dev’ and ‘qa’ environment subdirectories. Each environment directory contains three configuration files: AllowList.yml, FeatureFlags.yml, and ThrottlingLimits.yml. These files represent different aspects of the application’s configuration and can override the base templates found in the ‘template’ directory. This structure allows for environment-specific configurations while maintaining a clear separation between tenants and their respective environments.

This structure allows for:

  1. Standardization through templates: The base templates in the ‘template’ directory ensure consistency across all tenants, providing default configurations that can be selectively overridden by tenant-specific needs.
  2. Tenant-specific customization: Each tenant can maintain unique configurations in their dev and qa environments while inheriting from the base templates. This allows for customization without losing standardization benefits.
  3. Environment isolation: Clear separation between dev and qa environments within each tenant’s directory ensures that configuration changes in one environment don’t affect other
  4. Version control of configurations: By storing configurations in a Git repository, changes can be tracked, reviewed, and rolled back if necessary.
  5. AWS AppConfig integration:
    1. Each tenant gets their own Application in AWS AppConfig
    2. Configuration profiles map to different configuration types (AllowList, FeatureFlags, ThrottlingLimits)
    3. Separate environments (dev/qa) within each tenant’s application

The GitLab CI/CD pipeline we’re setting up will need to:

  1. Generate environment and tenant-specific configurations based on these templates
  2. Update the corresponding applications and configuration profiles in AWS AppConfig
  3. Deploy the appropriate configurations to each tenant and environment

Pre-Requisites

  1. Configuring GitLab CI/CD with AWS: Please refer Deploy to AWS from GitLab CI/CD
  2. Setting up GitLab Runners: Please refer Deploy and Manage Gitlab Runners on Amazon EC2 if you want to use Gitlab runners on EC2 or you can refer Install GitLab Runner and Configure GitLab Runner guides
  3. Configure Runner in .gitlab-ci.yml:
    • Use tags to specify which runner should execute your jobs:
job_name:
tags:
- aws-runner  # Tag of your specific runner

Setting Up the Directory Structure:

  1. First, create the base directory structure using these commands:
# Create directory structure and files
mkdir -p template tenants/{tenant1,tenant2}/{dev,qa}

  1. Create all required YAML files:
for file in AllowList.yml FeatureFlags.yml ThrottlingLimits.yml;do
  touch template/$file
  touch tenants/tenant{1,2}/{dev,qa}/$file
done

  1. Populate the template files:
Copy the content of each YAML file (AllowList.yml, FeatureFlags.yml, ThrottlingLimits.yml) shown above into the corresponding files in the template directory.
  1. For tenant-specific configurations:
Start by copying the template files to each tenant's environment directory
  1. Verify the folder structure.

Setting Up the GitLab CI/CD Pipeline

Code for the GitLab pipeline is in this repo.

This phase begins with gaining a clear understanding of the pipeline’s structure and flow, which forms the foundation for all subsequent steps.

Configuring .gitlab-ci.yml

    1. Creating the .gitlab-ci.yml file in your repository root
    2. Defining the base image for the pipeline (e.g., alpine:latest)
    3. Setting up pipeline stages: update-app-config, deploy-app-config
    4. Configuring global variables and default settings
      • Locate these sections in the .gitlab-ci.yml file below and Replace them with your AWS account details
variables:
  AWS_CREDS_TARGET_ROLE: arn:aws:iam::<aws_account_ID>:role/GitLab
  AWS_DEFAULT_REGION: <aws_region>
      •  Make sure to replace these variables in both stages (update-app-config and deploy-app-config) of the pipeline. The AWS role should have appropriate permissions to interact with AWS AppConfig service

Here’s the complete .gitlab-ci.yml file:

stages:
  - update-app-config
  - deploy-app-config

update-app-config:
  stage: update-app-config
  image:
    name: amazon/aws-cli:latest
    entrypoint:
      - '/usr/bin/env'
  script:
    - |
      # Get list of all tenant
      TENANTS=$(find tenants -mindepth 1 -maxdepth 1 -type d -exec basename {} \;)
      
      for TENANT in $TENANTS; do
        echo "Processing tenant: $TENANT"
        
        # Create/Get Application for tenant
        APP_ID=$(aws appconfig list-applications --query "Items[?Name=='$TENANT'].Id" --output text)
        if [ -z "$APP_ID" ]; then
          echo "Creating application for tenant '$TENANT'..."
          APP_ID=$(aws appconfig create-application --name $TENANT --query Id --output text)
        fi
        
        # Process each configuration type
        for CONFIG_TYPE in AllowList FeatureFlags ThrottlingLimits; do
          echo "Processing config type: $CONFIG_TYPE"
          
          # Create/Get Configuration Profile
          PROFILE_ID=$(aws appconfig list-configuration-profiles --application-id "$APP_ID" --query "Items[?Name=='$CONFIG_TYPE'].Id" --output text)
          if [ -z "$PROFILE_ID" ]; then
            echo "Creating configuration profile '$CONFIG_TYPE' for tenant '$TENANT'..."
            PROFILE_ID=$(aws appconfig create-configuration-profile --application-id "$APP_ID" --name "$CONFIG_TYPE" --description "Configuration profile for $CONFIG_TYPE" --location-uri hosted --query Id --output text)
          fi
          
          # Process each environment
          for ENV in dev qa; do
            echo "Processing environment: $ENV"
            
            # Priority: Use tenant-specific config if it exists, otherwise use template
            if [ -f "tenants/$TENANT/$ENV/$CONFIG_TYPE.yml" ]; then
              echo "Using tenant-specific configuration for $ENV"
              CONFIG_CONTENT=$(cat "tenants/$TENANT/$ENV/$CONFIG_TYPE.yml" | base64)
            else
              echo "Using template configuration for $ENV"
              CONFIG_CONTENT=$(cat "template/$CONFIG_TYPE.yml" | base64)
            fi
            
            echo "Creating new version for $CONFIG_TYPE configuration in $ENV..."
            aws appconfig create-hosted-configuration-version \
              --application-id "$APP_ID" \
              --configuration-profile-id "$PROFILE_ID" \
              --content "$CONFIG_CONTENT" \
              --content-type "application/json" \
              configuration_version_output
          done
        done
      done
  variables:
    AWS_CREDS_TARGET_ROLE: arn:aws:iam::<aws_account_ID>:role/GitLab 
    AWS_DEFAULT_REGION: <aws_region>

deploy-app-config:
  stage: deploy-app-config
  image: 
    name: amazon/aws-cli:latest
    entrypoint: 
      - '/usr/bin/env'
  script:
    - yum install -y jq
    - |
      TENANTS=$(find tenants -mindepth 1 -maxdepth 1 -type d -exec basename {} \;)
      
      for TENANT in $TENANTS; do
        echo "Processing tenant: $TENANT"
        APP_ID=$(aws appconfig list-applications --query "Items[?Name=='$TENANT'].Id" --output text)
        
        # Process each environment
        for ENV in dev qa; do
          echo "Processing environment: $ENV"
          
          # Create/Get Environment
          ENV_ID=$(aws appconfig list-environments --application-id "$APP_ID" --query "Items[?Name=='$ENV'].Id" --output text)
          if [ -z "$ENV_ID" ]; then
            echo "Creating environment '$ENV' for tenant '$TENANT'..."
            ENV_ID=$(aws appconfig create-environment --application-id "$APP_ID" --name "$ENV" --description "Environment for $ENV" --query Id --output text)
          fi
          
          # Process each configuration types
          for CONFIG_TYPE in AllowList FeatureFlags ThrottlingLimits; do
            echo "Processing $CONFIG_TYPE for $TENANT/$ENV"
            
            PROFILE_ID=$(aws appconfig list-configuration-profiles --application-id "$APP_ID" --query "Items[?Name=='$CONFIG_TYPE'].Id" --output text)

            echo " Profile ID $PROFILE_ID "
            # Get latest version for this specific profile
            LATEST_VERSION=$(aws appconfig list-hosted-configuration-versions \
              --application-id "$APP_ID" \
              --configuration-profile-id "$PROFILE_ID" \
              --query "Items[0].VersionNumber" \
              --output text)
            
            # Get current deployment for this specific profile
            CURRENT_DEPLOYMENT=$(aws appconfig list-deployments \
            --application-id "$APP_ID" \
            --environment-id "$ENV_ID" \
            --query "Items[?ConfigurationName=='$CONFIG_TYPE'].ConfigurationVersion | [0]" \
            --output text)


            echo "Current deployment $CURRENT_DEPLOYMENT"
              
            CURRENT_VERSION=$(aws appconfig list-deployments \
            --application-id "$APP_ID" \
            --environment-id "$ENV_ID" \
            --query "Items[?ConfigurationName=='$CONFIG_TYPE'].ConfigurationVersion | [0]" \
            --output text)
            
            echo "Latest Version: $LATEST_VERSION"
            echo "Current Version: $CURRENT_VERSION"
            
            if [[ "$CURRENT_DEPLOYMENT" == "None" ]] || [[ "$LATEST_VERSION" != "$CURRENT_VERSION" ]]; then
              echo "Starting deployment for $TENANT/$ENV/$CONFIG_TYPE..."
              DEPLOYMENT_RESPONSE=$(aws appconfig start-deployment \
                --application-id "$APP_ID" \
                --environment-id "$ENV_ID" \
                --deployment-strategy-id Linear50PercentEvery30Seconds \
                --configuration-profile-id "$PROFILE_ID" \
                --configuration-version "$LATEST_VERSION")
              
              DEPLOYMENT_ID=$(echo $DEPLOYMENT_RESPONSE | jq -r '.DeploymentNumber')
              
              # Monitor deployment
              max_attempts=10
              attempt=1
              while [ $attempt -le $max_attempts ]; do
                echo "Checking deployment status (attempt $attempt of $max_attempts)..."
                status=$(aws appconfig get-deployment \
                  --application-id "$APP_ID" \
                  --environment-id "$ENV_ID" \
                  --deployment-number "$DEPLOYMENT_ID" \
                  --query "State" \
                  --output text)
                
                if [ "$status" = "COMPLETE" ]; then
                  echo "Deployment completed successfully!"
                  break
                elif [ "$status" = "FAILED" ] || [ "$status" = "ROLLED_BACK" ]; then
                  echo "Deployment failed or was rolled back!"
                  exit 1
                fi
                
                if [ $attempt -eq $max_attempts ]; then
                  echo "Deployment timed out after $max_attempts attempts"
                  exit 1
                fi
                
                attempt=$((attempt + 1))
                sleep 30
              done
            else
              echo "No changes detected for $TENANT/$ENV/$CONFIG_TYPE (Current: $CURRENT_VERSION, Latest: $LATEST_VERSION). Skipping deployment..."
            fi
          done
        done
      done
  dependencies:
    - update-app-config
  variables:
    AWS_CREDS_TARGET_ROLE: arn:aws:iam::<aws_account_ID>:role/GitLab 
    AWS_DEFAULT_REGION: <aws_region>

Implementing Pipeline Stages

  1. Update-App-Config Stage:

  • Creates/Updates AWS AppConfig Applications:
    • Creates one application per tenant (tenant1, tenant2)
    • Uses tenant ID as application name
    • Retrieves existing application if already present
  • Manages Configuration Profiles:
    • Creates three profiles per tenant application (AllowList, FeatureFlags, ThrottlingLimits)
    • Each profile represents a distinct configuration type
    • Handles profile creation if not already existing
  • Creates Hosted Configuration Versions:
    • Processes changes from both template and tenant directories
    • Prioritizes tenant-specific configurations over templates
    • Creates new versions only for modified configurations
    • Uploads properly encoded configurations to AWS AppConfig
  1. Deploy-App-Config Stage:

    • Environment Deployment:
      • Manages dev and qa environments per tenant
      • Creates environments if not existing
      • Uses staged deployment strategy
    • Tenant Configuration Process:
      • Deploys per tenant and configuration type
      • Checks current deployed version against latest version
      • Only deploys if either of the follows is true:
        • No existing deployment is found
        • Latest Hosted Configuration version differs from currently deployed version
      • Maintains tenant-specific settings and version history
      • Provides clear deployment status messages, including cases where deployment is skipped
    • Deployment Management:
      • Executes AWS AppConfig deployments
      • Monitors deployment status
      • Handles failures and rollbacks
      • Times out after 10 retries

Executing the Pipeline

  1. Initiation:
    • Pipeline triggered by changes pushed to the repository
  1. Update-App-Config Stage:
    • Creates or updates applications and configuration profiles
    • Generates new versions of hosted configurations
  1. Deploy-App-Config Stage:
    • Iterates through each environment tenant and their environments
    • Checks current deployment status for each environment and tenant
    • Initiates new deployments only for changed configurations
    • Implements specified AWS AppConfig deployment strategy

Note: Deployment Strategy used in this example is a fast one used for testing (Linear50PercentEvery30Seconds) but for real production workloads, the reader should use the slower, AWS-recommended Linear20PercentEvery6Minutes strategy. More details here

This structured execution process ensures efficient and consistent deployment of configuration changes across the entire application ecosystem, maintaining synchronization between GitLab and AWS AppConfig.

Cleaning up

To clean up all AWS AppConfig resources created by this solution, you can use the following cleanup script. Create a file named delete_appconfig_resources.sh with this content:

#!/bin/bash

# List all applications
APPS=$(aws appconfig list-applications --query 'Items[*].Id' --output text)

for APP_ID in $APPS
do
  echo "Processing application $APP_ID"
  
  # List and delete all environments for this application
  ENVS=$(aws appconfig list-environments --application-id $APP_ID --query 'Items[*].Id' --output text)
  for ENV_ID in $ENVS
  do
    echo "  Deleting environment $ENV_ID"
    aws appconfig delete-environment --application-id $APP_ID --environment-id $ENV_ID
  done

  # List and delete all configuration profiles for this application
  PROFILES=$(aws appconfig list-configuration-profiles --application-id $APP_ID --query 'Items[*].Id' --output text)
  for PROFILE_ID in $PROFILES
  do
    echo "  Deleting configuration profile $PROFILE_ID"
    
    # Delete all hosted configuration versions for this profile
    VERSIONS=$(aws appconfig list-hosted-configuration-versions --application-id $APP_ID --configuration-profile-id $PROFILE_ID --query 'Items[*].VersionNumber' --output text)
    for VERSION in $VERSIONS
    do
      echo "    Deleting hosted configuration version $VERSION"
      aws appconfig delete-hosted-configuration-version --application-id $APP_ID --configuration-profile-id $PROFILE_ID --version-number $VERSION
    done

    # Delete the configuration profile
    aws appconfig delete-configuration-profile --application-id $APP_ID --configuration-profile-id $PROFILE_ID
  done

  # Delete the application
  echo "  Deleting application $APP_ID"
  aws appconfig delete-application --application-id $APP_ID
done

echo "All AppConfig resources have been deleted."


The script is a comprehensive cleanup utility for AWS AppConfig resources.

To execute this script, you need to have the AWS CLI installed and configured with appropriate credentials that have permissions to delete AppConfig resources. Make the script delete_appconfig_resources.sh  executable by running the command:

chmod +x cleanup_appconfig.sh.

Before running the script, ensure that you’re in the correct AWS account and region, as this script will delete ALL AppConfig resources in the configured account and region. To execute the script, simply run it from your terminal:  ./ delete_appconfig_resources.sh

It’s crucial to note that this script performs irreversible deletions. Use it with extreme caution, preferably in non-production environments or when you’re absolutely certain you want to remove all AppConfig resources.

Conclusion

This blog post has explored the powerful synergy between GitLab CI/CD and AWS AppConfig for managing application configurations in multi-tenant environments. We’ve demonstrated how this integration automates and streamlines the process of updating, versioning, and deploying configuration changes, offering benefits such as scalability, version control, and the balance between consistency and flexibility. By adopting this approach, development teams can significantly reduce manual errors, save time, and focus more on building features, ultimately leading to faster development cycles and more reliable applications in our increasingly complex and distributed computing landscape.

Key resources for further reading:

About the Author

Aditya Ranjan

Aditya Ranjan is a Lead Consultant with Amazon Web Services. He helps customers design and implement well-architected technical solutions using AWS’s latest technologies, including generative AI services, enabling them to achieve their business goals and objectives.

How GitHub uses CodeQL to secure GitHub

Post Syndicated from Brandon Stewart original https://github.blog/engineering/how-github-uses-codeql-to-secure-github/


GitHub’s Product Security Engineering team writes code and implements tools that help secure the code that powers GitHub. We use GitHub Advanced Security (GHAS) to discover, track, and remediate vulnerabilities and enforce secure coding standards at scale. One tool we rely heavily on to analyze our code at scale is CodeQL.

CodeQL is GitHub’s static analysis engine that powers automated security analyses. You can use it to query code in much the same way you would query a database. It provides a much more robust way to analyze code and uncover problems than an old-fashioned text search through a codebase.

The following post will detail how we use CodeQL to keep GitHub secure and how you can apply these lessons to your own organization. You will learn why and how we use:

  • Custom query packs (and how we create and manage them).
  • Custom queries.
  • Variant analysis to uncover potentially insecure programming practices.

Enabling CodeQL at scale

We employ CodeQL in a variety of ways at GitHub.

  1. Default setup with the default and security-extended query suites
    Default setup with the default and security-extended query suites meets the needs of the vast majority of our over 10,000 repositories. With these settings, pull requests automatically get a security review from CodeQL.
  2. Advanced setup with a custom query pack
    A few repositories, like our large Ruby monolith, need extra special attention, so we use advanced setup with a query pack containing custom queries to really tailor to our needs.
  3. Multi-repository variant analysis (MRVA)
    To conduct variant analysis and quick auditing, we use MRVA. We also write custom CodeQL queries to detect code patterns that are either specific to GitHub’s codebases or patterns we want a security engineer to manually review.

The specific custom Actions workflow step we use on our monolith is pretty simple. It looks like this:

- name: Initialize CodeQL
    uses: github/codeql-action/init@v3
    with:
      languages: ${{ matrix.language }}
      config-file: ./.github/codeql/${{ matrix.language }}/codeql-config.yml

Our Ruby configuration is pretty standard, but advanced setup offers a variety of configuration options using custom configuration files. The interesting part is the packs option, which is how we enable our custom query pack as part of the CodeQL analysis. This pack contains a collection of CodeQL queries we have written for Ruby, specifically for the GitHub codebase.

So, let’s dive deeper into why we did that—and how!

Publishing our CodeQL query pack

Initially, we published CodeQL query files directly to the GitHub monolith repository, but we moved away from this approach for several reasons:

  • It required going through the production deployment process for each new or updated query.
  • Queries not included in a query pack were not pre-compiled, which slowed down CodeQL analysis in CI.
  • Our test suite for CodeQL queries ran as part of the monolith’s CI jobs. When a new version of the CodeQL CLI was released, it sometimes caused the query tests to fail because of changes in the query output, even when there were no changes to the code in the pull request. This often led to confusion and frustration among engineers, as the failure wasn’t related to their pull request changes.

By switching to publishing a query pack to GitHub Container Registry (GCR), we’ve simplified our process and eliminated many of these pain points, making it easier to ship and maintain our CodeQL queries. So while it’s possible to deploy custom CodeQL query files directly to a repository, we recommend publishing CodeQL queries as a query pack to the GCR for easier deployment and faster iteration.

Creating our query pack

When setting up our custom query pack, we faced several considerations, particularly around managing dependencies like the ruby-all package.

To ensure our custom queries remain maintainable and concise, we extend classes from the default query suite, such as the ruby-all library. This allows us to leverage existing functionality rather than reinventing the wheel, keeping our queries concise and maintainable. However, changes to the CodeQL library API can introduce breaking changes, potentially deprecating our queries or causing errors. Since CodeQL runs as part of our CI, we wanted to minimize the chance of this happening, as this can lead to frustration and loss of trust from developers.

We develop our queries against the latest version of the ruby-all package, ensuring we’re always working with the most up-to-date functionality. To mitigate the risk of breaking changes affecting CI, we pin the ruby-all version when we’re ready to release, locking it in the codeql-pack.lock.yml file. This guarantees that when our queries are deployed, they will run with the specific version of ruby-all we’ve tested, avoiding potential issues from unintentional updates.

Here’s how we manage this setup:

  • In our qlpack.yml, we set the dependency to use the latest version of ruby-all
  • During development, this configuration pulls in the latest version) of ruby-all when running codeql pack init, ensuring we’re always up to date.
    // Our custom query pack's qlpack.yml
    
    library: false
    name: github/internal-ruby-codeql
    version: 0.2.3
    extractor: 'ruby'
    dependencies:
      codeql/ruby-all: "*"
    tests: 'test'
    description: "Ruby CodeQL queries used internally at GitHub"
    
  • Before releasing, we lock the version in the codeql-pack.lock.yml file, specifying the exact version to ensure stability and prevent issues in CI.
    // Our custom query pack's codeql-pack.lock.yml
    
    lockVersion: 1.0.0
    dependencies:
     ...
     codeql/ruby-all:
       version: 1.0.6
    

This approach allows us to balance developing against the latest features of the ruby-all package while ensuring stability when we release.

We also have a set of CodeQL unit tests that exercise our queries against sample code snippets, which helps us quickly determine if any query will cause errors before we publish our pack. These tests are run as part of the CI process in our query pack repository, providing an early check for issues. We strongly recommend writing unit tests for your custom CodeQL queries to ensure stability and reliability.

Altogether, the basic flow for releasing new CodeQL queries via our pack is as follows:

  • Open a pull request with the new query.
  • Write unit tests for the new query.
  • Merge the pull request.
  • Increment the pack version in a new pull request.
  • Run codeql pack init to resolve dependencies.
  • Correct unit tests as needed.
  • Publish the query pack to the GitHub Container Registry (GCR).
  • Repositories with the query pack in their config will start using the updated queries.

We have found this flow balances our team’s development experience while ensuring stability in our published query pack.

Configuring our repository to use our custom query pack

We won’t provide a general recommendation on configuration here, given that it ultimately depends on how your organization deploys code. We opted against locking our pack to a particular version in our CodeQL configuration file (see above). Instead, we chose to manage our versioning by publishing the CodeQL package in GCR. This results in the GitHub monolith retrieving the latest published version of the query pack. To roll back changes, we simply have to republish the package. In one instance, we released a query that had a high number of false positives and we were able to publish a new version of the pack that removed that query in less than 15 minutes. This is faster than the time it would have taken us to merge a pull request on the monolith repository to roll back the version in the CodeQL configuration file.

One of the problems we encountered with publishing the query pack in GCR was how to easily make the package available to multiple repositories within our enterprise. There are several approaches we explored.

  • Grant access permissions for individual repositories. On the package management page, you can grant permissions for individual repositories to access your package. This was not a good solution for us since we have too many repositories for it to be feasible to do manually, yet there is not currently a way to configure programmatically using an API.
  • Mint a personal access token for the CodeQL action runner. We could have minted a personal access token (PAT) that has access to read all packages for our organization and added that to the CodeQL action runner. However, this would have required managing a new token, and it seemed a bit more permissive than we wanted because it could read all of our private packages rather than ones we explicitly allow it to have access to.
  • Provide access permissions via a linked repository. We ended up implementing the third solution that we explored. We link a repository to the package and allow the package to inherit access permissions from the linked repository.

CodeQL query pack queries

We write a variety of custom queries to be used in our custom query packs. These cover GitHub-specific patterns that aren’t included in the default CodeQL query pack. This allows us to tailor the analysis to patterns and preferences that are specific to our company and codebase. Some of the types of things we alert on using our custom query pack include:

  • High-risk APIs specific to GitHub’s code that can be dangerous if they receive unsanitized user input.
  • Use of specific built-in Rails methods for which we have safer, custom methods or functions.
  • Required authorization methods not being used in our REST API endpoint definitions and GraphQL object/mutation definitions.
  • REST API endpoints and GraphQL mutations that require engineers to define access control methods to determine which actors can access them. (Specifically, the query detects the absence of this method definition to ensure that the actors’ permissions are being checked for these endpoints.)
  • Use of signed tokens so we can nudge engineers to include Product Security as a reviewer when using them.

Custom queries can be used more for educational purposes rather than being blockers to shipping code. For example, we want to alert engineers when they use the ActiveRecord::decrypt method. This method should generally not be used in production code, as it will cause an encrypted column to become decrypted. We use the recommendation severity in the query metadata so these alerts are treated as more of an informational alert. That means this may trigger an alert in a pull request, but it won’t cause the CodeQL CI job to fail. We use this lower severity level to allow engineers to assess the impact of new queries without immediate blocking. Additionally, this alert level isn’t tracked through our Fundamentals program, meaning it doesn’t require immediate action, reflecting the query’s maturity as we continue to refine its relevance and risk assessment.

/**
 * @id rb/github/use-of-activerecord-decrypt
 * @description Do not use the .decrypt method on AR models, this will decrypt all encrypted attributes and save
 * them unencrypted, effectively undoing encryption and possibly making the attributes inaccessible.
 * If you need to access the unencrypted value of any attribute, you can do so by calling my_model.attribute_name.
 * @kind problem
 * @severity recommendation
 * @name Use of ActiveRecord decrypt method
 * @tags security
 *      github-internal
 */

import ruby
import DataFlow
import codeql.ruby.DataFlow
import codeql.ruby.frameworks.ActiveRecord

/** Match against .decrypt method calls where the receiver may be an ActiveRecord object */
class ActiveRecordDecryptMethodCall extends ActiveRecordInstanceMethodCall {
  ActiveRecordDecryptMethodCall() { this.getMethodName() = "decrypt" }
}

from ActiveRecordDecryptMethodCall call
select call,
  "Do not use the .decrypt method on AR models, this will decrypt all encrypted attributes and save them unencrypted.

Another educational query is the one mentioned above in which we detect the absence of the `control_access` method in a class that defines a REST API endpoint. If a pull request introduces a new endpoint without `control_access`, a comment will appear on the pull request saying that the `control_access` method wasn’t found and it’s a requirement for REST API endpoints. This will notify the reviewer of a potential issue and prompt the developer to fix it.

/**
 * @id rb/github/api-control-access
 * @name Rest API Without 'control_access'
 * @description All REST API endpoints must call the 'control_access' method, to ensure that only specified actor types are able to access the given endpoint.
 * @kind problem
 * @tags security
 * github-internal
 * @precision high
 * @problem.severity recommendation
 */

import codeql.ruby.AST
import codeql.ruby.DataFlow
import codeql.ruby.TaintTracking
import codeql.ruby.ApiGraphs

// Api::App REST API endpoints should generally call the control_access method
private DataFlow::ModuleNode appModule() {
  result = API::getTopLevelMember("Api").getMember("App").getADescendentModule() and
  not result = protectedApiModule() and
  not result = staffAppApiModule()
}

// Api::Admin, Api::Staff, Api::Internal, and Api::ThirdParty REST API endpoints do not need to call the control_access method
private DataFlow::ModuleNode protectedApiModule() {
  result =
    API::getTopLevelMember(["Api"])
        .getMember(["Admin", "Staff", "Internal", "ThirdParty"])
        .getADescendentModule()
}

// Api::Staff::App REST API endpoints do not need to call the control_access method
private DataFlow::ModuleNode staffAppApiModule() {
  result =
    API::getTopLevelMember(["Api"]).getMember("Staff").getMember("App").getADescendentModule()
}

private class ApiRouteWithoutControlAccess extends DataFlow::CallNode {
  ApiRouteWithoutControlAccess() {
    this = appModule().getAModuleLevelCall(["get", "post", "delete", "patch", "put"]) and
    not performsAccessControl(this.getBlock())
  }
}

predicate performsAccessControl(DataFlow::BlockNode blocknode) {
  accessControlCalled(blocknode.asExpr().getExpr())
}

predicate accessControlCalled(Block block) {
  // the method `control_access` is called somewhere inside `block`
  block.getAStmt().getAChild*().(MethodCall).getMethodName() = "control_access"
}

from ApiRouteWithoutControlAccess api
select api.getLocation(),
  "The control_access method was not detected in this REST API endpoint. All REST API endpoints must call this method to ensure that the endpoint is only accessible to the specified actor types."

Variant analysis

Variant analysis (VA) refers to the process of searching for variants of security vulnerabilities. This is particularly useful when we’re responding to a bug bounty submission or a security incident. We use a combination of tools to do this, including GitHub’s code search functionality, custom scripts, and CodeQL. We will often start by using code search to find patterns similar to the one that caused a particular vulnerability across numerous repositories. This is sometimes not good enough, as code search is not semantically aware, meaning that it cannot determine whether a given variable is an Active Record object or whether it is being used in an `if` expression. To answer those types of questions we turn to CodeQL.

When we write CodeQL queries for variant analysis we are much less concerned about false positives, since the goal is to provide results for security engineers to analyze. The quality of the code is also not quite as important, as these queries will only be used for the duration of the VA effort. Some of the types of things we use CodeQL for during VAs are:

  • Where are we using SHA1 hashes?
  • One of our internal API endpoints was vulnerable to SQLi according to a recent bug bounty report. Where are we passing user input to that API endpoint?
  • There is a problem with how some HTTP request libraries in Ruby handle the proxy setting. Can we look at places we are instantiating our HTTP request libraries with a proxy setting?

One recent example involved a subtle vulnerability in Rails. We wanted to detect when the following condition was present in our code:

  • A parameter was used to look up an Active Record object.
  • That parameter is later reused after the Active Record object is looked up.

The concern with this condition is that it could lead to an insecure direct object reference (IDOR) vulnerability because Active Record finder methods can accept an array. If the code looks up an Active Record object in one call to determine if a given entity has access to a resource, but later uses a different element from that array to find an object reference, that can lead to an IDOR vulnerability. It would be difficult to write a query to detect all vulnerable instances of this pattern, but we were able to write a query that found potential vulnerabilities that gave us a list of code paths to manually analyze. We ran the query against a large number of our Ruby codebases using CodeQL’s MRVA.

The query, which is a bit hacky and not quite production grade, is below:

/**
 * @name wip array query
 * @description an array is passed to an AR finder object
 */

import ruby
import codeql.ruby.AST
import codeql.ruby.ApiGraphs
import codeql.ruby.frameworks.Rails
import codeql.ruby.frameworks.ActiveRecord
import codeql.ruby.frameworks.ActionController
import codeql.ruby.DataFlow
import codeql.ruby.Frameworks
import codeql.ruby.TaintTracking

// Gets the "final" receiver in a chain of method calls.
// For example, in `Foo.bar`, this would give the `Foo` access, and in
// `foo.bar.baz("arg")` it would give the `foo` variable access
private Expr getUltimateReceiver(MethodCall call) {
  exists(Expr recv |
    recv = call.getReceiver() and
    (
      result = getUltimateReceiver(recv)
      or
      not recv instanceof MethodCall and result = recv
    )
  )
}

// Names of class methods on ActiveRecord models that may return one or more
// instances of that model. This also includes the `initialize` method.
// See https://api.rubyonrails.org/classes/ActiveRecord/FinderMethods.html
private string staticFinderMethodName() {
  exists(string baseName |
    baseName = ["find_by", "find_or_create_by", "find_or_initialize_by", "where"] and
    result = baseName + ["", "!"]
  )
  // or
  // result = ["new", "create"]
}

private class ActiveRecordModelFinderCall extends ActiveRecordModelInstantiation, DataFlow::CallNode
{
  private ActiveRecordModelClass cls;

  ActiveRecordModelFinderCall() {
    exists(MethodCall call, Expr recv |
      call = this.asExpr().getExpr() and
      recv = getUltimateReceiver(call) and
      (
        // The receiver refers to an `ActiveRecordModelClass` by name
        recv.(ConstantReadAccess).getAQualifiedName() = cls.getAQualifiedName()
        or
        // The receiver is self, and the call is within a singleton method of
        // the `ActiveRecordModelClass`
        recv instanceof SelfVariableAccess and
        exists(SingletonMethod callScope |
          callScope = call.getCfgScope() and
          callScope = cls.getAMethod()
        )
      ) and
      (
        call.getMethodName() = staticFinderMethodName()
        or
        // dynamically generated finder methods
        call.getMethodName().indexOf("find_by_") = 0
      )
    )
  }

  final override ActiveRecordModelClass getClass() { result = cls }
}

class FinderCallArgument extends DataFlow::Node {
  private ActiveRecordModelFinderCall finderCallNode;

  FinderCallArgument() { this = finderCallNode.getArgument(_) }
}

class ParamsHashReference extends DataFlow::CallNode {
  private Rails::ParamsCall params;

  // TODO: only direct element references against `params` calls are considered
  ParamsHashReference() { this.getReceiver().asExpr().getExpr() = params }

  string getArgString() {
    result = this.getArgument(0).asExpr().getConstantValue().getStringlikeValue()
  }
}

class ArrayPassedToActiveRecordFinder extends TaintTracking::Configuration {
  ArrayPassedToActiveRecordFinder() { this = "ArrayPassedToActiveRecordFinder" }

  override predicate isSource(DataFlow::Node source) { source instanceof ParamsHashReference }

  override predicate isSink(DataFlow::Node sink) {
    sink instanceof FinderCallArgument
  }

  string getParamsArg(DataFlow::CallNode paramsCall) {
    result = paramsCall.getArgument(0).asExpr().getConstantValue().getStringlikeValue()
  }

  // this doesn't check for anything fancy like whether it's reuse in a if/else
  // only intended for quick manual audit filtering of interesting candidates
  // so remains fairly broad to not induce false negatives
  predicate paramsUsedAfterLookups(DataFlow::Node source) {
    exists(DataFlow::CallNode y | y instanceof ParamsHashReference
    and source.getEnclosingMethod() = y.getEnclosingMethod()
    and source != y
    and getParamsArg(source) = getParamsArg(y)
    // we only care if it's used again AFTER an object lookup
    and y.getLocation().getStartLine() > source.getLocation().getStartLine())
  }
}

from ArrayPassedToActiveRecordFinder config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink) and config.paramsUsedAfterLookups(source)
select source, sink.getLocation()

Conclusion

CodeQL can be very useful for product security engineering teams to detect and prevent vulnerabilities at scale. We use a combination of queries that run in CI using our query pack and one-off queries run through MRVA to find potential vulnerabilities and communicate them to engineers. CodeQL isn’t only useful for finding security vulnerabilities, though; it is also useful for detecting the presence or absence of security controls that are defined in code. This saves our security team time by surfacing certain security problems automatically, and saves our engineers time by detecting them earlier in the development process.

Writing custom CodeQL queries

Tips for getting started

We have a large number of articles and resources for writing custom CodeQL queries. If you haven’t written custom CodeQL queries before, here are some resources to help get you started:

Improve the security of your applications today by enabling CodeQL for free on your public repositories, or try GitHub Advanced Security for your organization.

Michael Recachinas, GitHub Staff Security Engineer, also contributed to this blog post.

The post How GitHub uses CodeQL to secure GitHub appeared first on The GitHub Blog.

Using Semantic Versioning to Simplify Release Management

Post Syndicated from Brandon Kindred original https://aws.amazon.com/blogs/devops/using-semantic-versioning-to-simplify-release-management/

Any organization that manages software libraries and applications needs a standardized way to catalog, reference, import, fix bugs and update the versions of those libraries and applications. Semantic Versioning enables developers, testers, and project managers to have a more standardized process for committing code and managing different versions. It’s benefits also extend beyond development teams to end users by using change logs and transparent feature documentation. This alleviates the operational burden of communicating changes for each release cycle.

In this post, we dive deep into how Semantic Versioning works and how to use the NodeJS package semantic-release/npm in a GitHub Actions continuous integration and continuous delivery (CI/CD) pipeline to automatically version an Amazon Web Services (AWS) Cloud Development Kit (AWS CDK) project. Once we’re done, you’ll be ready to deliver AWS CDK projects that are automatically versioned so your team can focus more on code instead of versioning for each release.

How does Semantic Versioning work?

With Semantic Versioning, version number changes convey a specific meaning about the underlying code change. This helps others to understand what has changed from one version to the next.

Semantic Versioning is defined in a Major.Minor.Patch format, which is a guideline to track the scope and potential impact of changes for feature development, major updates and bug fixes. The Major version number change signifies that this release will have breaking changes and any upgrade to this version should be validated for backward compatibility. The Minor version number change signifies a feature release in a backward-compatible manner. Finally, the Patch version number signifies a small insignificant change or a bug fix and users can upgrade to this version without introducing breaking changes or new features.

Each version number is always incremented by one digit and when a higher digit increments, the lower digits reset to zero. For example, version 1.5.2 means that the software is in its first major version, and has released 5 features with 2 patches implemented. When a new feature is added, it will be released as version 1.6.0. If there are any bugs reported and a fix is release for that bug, the next release version will be 1.6.1. For a version 2.0.0 release there will be breaking changes, and may not be backward compatible with previous 1.x.x versions.

What is semantic-release?

Now that we have covered how Semantic Versioning works, and why it’s useful in software projects, let’s review semantic-release. It is a Node.js tool that simplifies Semantic Versioning management. Semantic-release can do more than just apply Semantic Versioning to your projects. With a variety of plugins available, you can track what sort of changes are being made with commit analysis or by generating a changelog for each release. This automates what would otherwise have been a manual and time-consuming process to create and review roadmaps and documentation tracking the new features and fixes that comprise a release. Semantic-release is also straightforward to integrate with CI tools such as GitHub Actions, GitLab CI, Travis CI, and CircleCI 2.0 workflows.

Unfortunately, if your commit messages don’t match the format semantic-release requires, the automatic versioning won’t work correctly. To verify that commit messages are in the right format, you can integrate tools such as commitizen or commitlint, which can validate that commit messages are in a valid format.

Using semantic-release

Commit message formats

By default, semantic-release uses Angular Commit Message Conventions, however the commit message format can be altered by using config options. Let’s take a quick look at the three different types of commit messages, fixes, features, and breaking changes.

Patches and fixes

For patch or fix commits, you need to start your commit message with “fix(context):”. This tells semantic-release that this is a minor patch—meaning that if your version is 0.0.1, after this commit is merged the version will be incremented to 0.0.2. Following is an example commit message.

fix(printer): inform user when printer is out of paper

Features

Feature commit messages start with “feat(context):” which indicates to semantic-release that this is a feature release, also referred to as a minor release. If your version is 0.1.0, after a feature commit is merged, the version will be incremented to 0.2.0. Following is an example of a feature commit.

feat(printer): add option for printing in color

Breaking changes

Breaking change commits are intended for marking a major breaking change—meaning that the commit introduces changes that are not compatible with existing or previous versions. The way that commit messages are marked as breaking changes is slightly different than the previous commit types. For a breaking change, the commit footer needs to include “BREAKING CHANGE:”. For example, a breaking change commit, might look like the following.

perf(printer): remove color printing option

BREAKING CHANGE: The color printing option has been removed. The default black and white printing option is always used for performance reasons.

Release change log

To generate a release change log, we only need to add the semantic-release/changelog plugin to the package.json file. We’ll demonstrate how to do this in the next section.

Adding Semantic Release to an AWS CDK project

To demonstrate semantic-release in action we will build an example with AWS CDK and the NPM semantic-release library. All code provided in the remainder of this blog can be found in the semantic-versioning demo repository (aws-cdk-semantic-release) on GitHub.

To get started we’ll need to create a Node.js CDK application, add semantic-release and its plugins, commit changes, then build the project. If you don’t have AWS CDK installed yet, check out the Getting Stated with CDK guide.

To begin, navigate to the folder where the project will be saved. Next, we initialize a new AWS CDK project and add semantic-release. Then, we configure the branches that Semantic Versioning should be applied to and add a few NPM plugins to extend the functionality of semantic-release. While there are many plugins available, only a few are needed to get started. Below, we’ll walk through which plugins to install and how to install them.

Initiate a new AWS CDK Typescript project and add semantic-release

In a command terminal, navigate to the folder where you want to store your AWS CDK project and run the following three commands. The first line will create the project and the next two will install the semantic-release NPM and Git plugins in the AWS CDK project.

cd <<PROJECT_FOLDER>>
cdk init app --language typescript
npm install @semantic-release/npm -D
npm install @semantic-release/git -D

Define which branches to apply versioning to

We need to tell NPM which branches can trigger a new release. In the example we’ll use the main branch and the staging branches for release. We will add a new section to the package.json file along with the branches we want to use for releases.

"release": {
  "branches": ["main"]
}

Add supporting plugins to package.json

In order for semantic-release to pick up commits, generate release notes, and be able to perform a release with GitHub we need to add the commit-analyzer, release-notes-generator, changelog, NPM, Git, and GitHub plugins. Add the following plugins section to the package.json file after the “release” section we added in the previous step.

"plugins": [
    "@semantic-release/commit-analyzer",
    "@semantic-release/release-notes-generator",
    "@semantic-release/changelog",
    "@semantic-release/npm",
    [
        "@semantic-release/git",
        {
            "assets": [
                "package.json",
                "CHANGELOG.md"
            ],
            "message": "chore(release): ${nextRelease.version}[skip ci]\n\n${nextRelease.notes}"
        }
    ],
    "@semantic-release/github"
],

Putting it all together, our package.json file should look similar to the following.

{
    "name": "aws-cdk-semantic-release",
    "version": "0.1.0",
    "bin": {
        "aws-cdk-semantic-release": "bin/aws-cdk-semantic-release.js"
    },
    "scripts": {
        "build": "tsc",
        "watch": "tsc -w",
        "test": "jest",
        "cdk": "cdk"
    },
    "plugins": [
        "@semantic-release/commit-analyzer",
        "@semantic-release/release-notes-generator",
        "@semantic-release/changelog",
        "@semantic-release/npm",
        [
            "@semantic-release/git",
            {
                "assets": [
                    "package.json",
                    "CHANGELOG.md"
                ],
                "message": "chore(release): ${nextRelease.version} [skip ci]\n\n${nextRelease.notes}"
            }
        ],
        "@semantic-release/github"
    ],
    "release": {
        "branches": [
            "main"
        ]
    },
    "devDependencies": {
        "@semantic-release/changelog": "^6.0.3",
        "@semantic-release/git": "^10.0.1",
        "@semantic-release/npm": "^11.0.2",
        "@types/jest": "^29.5.12",
        "@types/node": "20.11.16",
        "aws-cdk": "2.127.0",
        "jest": "^29.7.0",
        "ts-jest": "^29.1.2",
        "ts-node": "^10.9.2",
        "typescript": "~5.3.3"
    },
    "dependencies": {
        "aws-cdk-lib": "2.127.0",
        "constructs": "^10.0.0",
        "source-map-support": "^0.5.21"
    }
}

With the necessary package.json changes in place, we can commit our code and build the project again.

git commit -m "feat(semver): added semantic release plugins and set main branch for versioning"
npm run build

Configure GitHub Actions

In the projects root folder, we are going to create folders for .github/workflows so that we can configure GitHub Actions release steps.

mkdir .github
cd .github
mkdir workflows
cd workflows

Create a file called release.yaml in the workflows folder that has the following.

name: Release
on:
  push:
    branches:
      - main
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
         uses: actions/checkout@v3
      - name: Setup Node.js
         uses: actions/setup-node@v3
         with:
           node-version: '20.8.1'
      - name: Install Dependencies
         run: npm install
      - name: Semantic Release
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
         run: npx semantic-release

The GITHUB_TOKEN is automatically generated and managed by GitHub Actions, so we don’t need to worry about retrieving or defining this value in our GitHub Actions steps. However, NPM_TOKEN is not automatically provided to us, so we need to define the NPM token. To do this we need to login to npmjs.com and create an account. Once we’re logged in, we need to create an access token. Once the NPM access token is created, then we need to add the token to GitHub secrets.

Verify Semantic-Release is setup correctly

Now that we have everything configured, new merges and commits to the main branch will kick off a CI/CD pipeline through GitHub Actions that builds the project and increments the version. To verify that everything is working correctly, we’ll alter the CDK code to include an AWS Lambda function. Afterward, we’ll commit changes to the main branch to verify that semantic-release is working correctly.

Make a change, commit, push and verify

In the AWS CDK project lib/aws-cdk-semantic-release-stack.ts file we’ll update the visibilityTimeout duration to 600, then commit the changes. We want to be sure to use the Semantic Versioning format for the commit message so that our changes are versioned correctly. For this we use the following commit message.

“fix(sqs-visibility): increased visibility timeout to 600”

After committing the changes, we push the changes to the remote GitHub repository to initiate the GitHub Actions build process. Once the build is complete, we see that the build number has been incremented from the previous version. You can review the releases for the sample code to see how it has been versioned. Following are examples of what it should look like before and after executing this. Please note that the version numbers depicted may not match yours.

GitHub releases view with information about release v1.1.0 including features and assets that were updated as part of v1.1.0

Screenshot of GitHub releases page showing release 1.0.0 and the associated features changelog

Figure 1 – Before the new commits

GitHub releases view with information about release v1.1.1 followed by the previous release v1.1.0.

Screenshot of GitHub releases page showing a previous release and the latest release along with their respective versions of 1.1.0 and 1.1.1

Figure 2 – After GitHub Actions generates a new build

Conclusion

Semantic Versioning offers a structured and reliable way to manage software changes. When paired with the significant benefits of using Node.js packages like semantic-release, version management can become effortless. All of this enables automated version management Node.js projects such as AWS CDK pipelines for automated code deployment. It enhances clarity, stability, and collaboration, while also supporting automation and fostering user confidence.

By adopting Semantic Versioning, development teams can achieve more predictable and efficient workflows, ultimately leading to higher quality software. It also builds confidence among users without going into much details with respect to software upgrades and backward compatibility. As a best practice, start incorporating Semantic Versioning for existing and future applications.

Contact an AWS Representative to know how we can help accelerate your business.

Reference

CDK setup and deployment of application with CDK V2 Construct:

Headshot of Brandon Kindred (CAA), a white man with short hair, a beard and wearing a black shirt

Brandon Kindred

Brandon is a Cloud Application Architect at AWS ProServe where he helps companies achieve their cloud goals through tailored strategies and cost-effective solutions. With a passion for teaching, he enjoys empowering others to harness the full potential of cloud technologies while optimizing their cloud spend. Brandon also enjoys providing guidance on development best practices, ensure teams build resilient and scalable applications in the cloud.

Sunita Dubey (Sr CAA)

Sunita Dubey

Sunita Dubey is a senior Cloud Application Architect based out of New Jersey. She focuses on designing, architecting and delivering end to end solutions for customers in financial domain. She has solved multiple customer challenges to move their workload to AWS and saved cost in the migration process. She insists on highest standard in all the deliverables.

Accelerate Serverless Streamlit App Deployment with Terraform

Post Syndicated from Kevon Mayers original https://aws.amazon.com/blogs/devops/accelerate-serverless-streamlit-app-deployment-with-terraform/

Image depicting the HashiCorp Terraform and Amazon Web Services (AWS) logos. Underneath the AWS logo are AWS service logos for Amazon Elastic Container Service (ECS), AWS CodePipeline, AWS CodeBuild, and Amazon CloudFront

Graphic created by Kevon Mayers.

Introduction

As customers increasingly seek to harness the power of generative AI (GenAI) and machine learning to deliver cutting-edge applications, the need for a flexible, intuitive, and scalable development platform has never been greater. In this landscape, Streamlit has emerged as a standout tool, making it easy for developers to prototype, build, and deploy GenAI-powered apps with minimal friction. It is an open-source Python framework designed to simplify the development of custom web applications for data science, machine learning, and GenAI projects. With Streamlit, developers can quickly transform Python scripts into interactive dashboards, LLM-powered chatbots, and web apps, using just a few lines of code. Its unique combination of simplicity, interactivity, and speed is the perfect complement to the rapid advancements in AI.

When deploying Streamlit applications, customers often face the challenge of ensuring their applications are highly available and can scale to meet a variable amount of demand. To achieve these goals, customers are looking at serverless approaches to deploying their Streamlit apps. With a serverless application, you only pay for the resources required and do not want have to worry about managing servers or capacity planning.

In this post, we will walk you through deploying containerized, serverless Streamlit applications automatically via HashiCorp Terraform, an Infrastructure as Code (IaC) tool that enables users to define and provision infrastructure across cloud platforms.

Solution Overview

For this solution, we have the Streamlit app running on an Amazon Elastic Container Service (ECS) cluster across multiple availability zones (AZs), using AWS Fargate to manage the compute. Fargate is a serverless, pay-as-you-go compute engine that lets you focus on building apps without managing servers. Using Fargate helps reduce the undifferentiated heavy lifting that can come with building and maintaining web applications. It is also often desirable to use a Content Delivery Network (CDN) to ensure low latency for users globally by caching the content at edge locations closer to where the users are geographically located.

Let’s zoom in on the two architectures – the Streamlit App hosting architecture, and the Streamlit App deployment pipeline.

Streamlit app hosting

Image depicting the AWS data flow architecture for the solution. The architecture shows an Amazon Elastic Container Service (ECS) cluster that spans across two availability zones. Within each availability zone are a public and private subnet. A NAT gateway is within the public subnet, and an ECS Cluster with AWS Fargate deployment type is in the private subnet. An Internet Gateway (IGW) is used to allow traffic to flow through the NAT Gateway out to the internet.An Application Load Balancer (ALB) is used to distribute the load to the ECS cluster. Amazon CloudFront is used as the content delivery network (CDN).

In the above architecture, the following flow applies:

  1. Users access the Streamlit App using the public DNS endpoint for an Amazon CloudFront distribution.
  2. Using an Internet Gateway (IGW), user requests are routed to a public-facing Application Load Balancer (ALB).
  3. This ALB has target groups which map to ECS task nodes that are part of an ECS cluster running in two AZs (us-east-1a and us-east-1b in this example).
  4. Fargate will automatically scale the underlying compute nodes in the ECS cluster based on the demand.

Streamlit app deployment pipeline

Image depicting the Streamlit app deployment pipeline architecture. Within it, a developer uploads a .zip file called streamlit-app-assets.zip to an Amazon S3 Bucket. This upload event is processed by Amazon EventBridge, which in turn invokes an AWS CodePipeline to run. Related artifacts are stored in a connected CodePipeline S3 bucket. CodePipeline orchestrates an AWS CodeBuild project that creates a new Docker image using the .zip file that was uploaded, and stores in an Amazon Elastic Container Registry (ECR) repository. This image upload triggers a new Amazon Elastic Container Service (ECS) deployment. Terraform then creates a Amazon CloudFront invalidation to serve the new version of the application to customers.

In the above architecture, the following flow applies:

  1. User develops a local Streamlit App and defines the path of these assets in the module configuration, then runs terraform apply to generate a local .zip file comprised of the Streamlit App directory, and upload this to an Amazon S3 bucket (Streamlit Assets) with versioning enabled, which is configured to trigger the Streamlit CI/CD pipeline to run.
  2. AWS CodePipeline (Streamlit CI/CD pipeline) begins running. The pipeline copies the .zip file from the Streamlit Assets S3 Bucket, stores the contents in a connected CodePipeline Artifacts S3 bucket, and passes the asset to the AWS CodeBuild project that is also part of the pipeline.
  3. CodeBuild (Streamlit CodeBuild Project) configures a compute/build environment and fetches a Python Docker Image from a public Amazon ECR repository. CodeBuild uses Docker to build a new Streamlit App image based on what is defined in the Dockerfile within the .zip file, and pushes the new image to a private ECR repository. It tags the image with latest, an app_version (user-defined in Terraform), as well as the S3 Version ID of the .zip file and pushes the image to ECR.
  4. ECS has a task definition that references the image in ECR based on the S3 Version ID tag which will always be a unique value, as it is generated whenever a new version of the file is created. This also serves as data lineage so versions of the Streamlit App .zip files in S3 can be linked to versions of the image stored in ECR. Once a new image is pushed to ECR (with a unique image tag), the task definition is updated and the ECS service begins a new deployment using the new version of the Streamlit App.
  5. When a new image is pushed to ECR, the Terraform Module is configured to use the local-exec provisioner to run an AWS CLI command that creates a CloudFront invalidation. This enables users of the Streamlit app to use the new version without waiting for the time-to-live (TTL) of the cached file to expire on the edge locations (default is 24 hours).
    Both of these pipelines are built and packaged into a Terraform module that can be reused efficiently with only a few lines of code.

Both of these pipelines are built and packaged into a Terraform module that can be reused efficiently with only a few lines of code.

Prerequisites

This solution requires the following prerequisites:

  • An AWS account. If you don’t have an account, you can sign up for one.
  • Terraform v1.0.0 or newer installed.
  • python v3.8 or newer installed.
  • A Streamlit app. If you don’t have a Streamlit project already, you can download this app directory as a sample Streamlit app for this post and save it to a local folder.

Your folder structure will look something like this:

terraform_streamlit_folder
├── README.md
└── app                 # Streamlit app directory
    ├── home.py         # Streamlit app entry point
    ├── Dockerfile      # Dockerfile
     └── pages/          # Streamlit pages

Create and initialize a Terraform project

In the same folder where you have the your Streamlit app saved, in the above example in the terraform_streamlit_folder, you will create and initialize a new Terraform project.

  1.  In your preferred terminal, create a new file named main.tf by running the following command on Unix/Linux machines, or an equivalent command on Windows machines:
    touch main.tf
  2. Open up the main.tf file and add the following code to it:
    module "serverless-streamlit-app" {
      source          = "aws-ia/serverless-streamlit-app/aws"
      app_name        = "streamlit-app"
      app_version     = "v1.1.0" 
      path_to_app_dir = "./app" # Replace with path to your app
    }

    This code utilizes a module block with a source pointing to the Terraform module, and the appropriate input variables passed in. When Terraform encounters a module block, it loads and processes that module’s configuration files using the source. The Serverless Streamlit App Terraform module has many optional input variables. If you have existing resources, such as an existing VPC, subnets, and security groups that you’d like to reuse instead of deploying new ones, you can use the module’s input variables to reference your existing resources. However, in this post, we’re deploying all of the resources in the above architecture from scratch. Here, we simply define the source that references the module hosted in the Terraform Registry, provide an app_name that will be used as a prefix for naming your resources, the app_version that is used for tracking changes to your app, and the path_to_app_dir which is the path to the local directory where the assets for your Streamlit app are stored.

  3. Save the file.
  4. To initialize the Terraform working directory, run the following command in your terminal:
    terraform init

    The output will contain a successful message like the following:

    "Terraform has been successfully initialized"

Output the CloudFront URL

To be able to easily access the Cloudfront URL of the deployed Streamlit application, you can add the URL as a Terraform output.

  1. In your terminal, create a new file named outputs.tf by running the following command on Unix/Linux machines, or an equivalent command on Windows machines:
    touch outputs.tf
  2. Open up the outputs.tf file and add the following code to it:
    output "streamlit_cloudfront_distribution_url" {
      value = module.serverless-streamlit-app.streamlit_cloudfront_distribution_url
    }
  3. Save the file.
    Now, your folder structure will look like:

    terraform_streamlit_folder
    ├── README.md
    ├── app                 # Streamlit app directory
    │   ├── home.py         # Streamlit app entry point
    │   ├── Dockerfile      # Dockerfile
    │   └── pages/          # Streamlit pages
    │     
    ├── main.tf             # Terraform Code (where you call the module) 
    └── outputs.tf          # Outputs definition

Deploy the solution

Now you can use Terraform to deploy the resources defined in your main.tf file.

  1. In your terminal, run the following command to apply to deploy the infrastructure. This includes the hosting for your Streamlit application using ECS and CloudFront, as well as the pipeline that is used to push updates.
    terraform apply

    When the apply command finishes running, you’ll see the Terraform outputs displayed in the terminal.

  2. Navigate to the streamlit_cloudfront_distribution_url to see your Streamlit application that is hosted on AWS.
  3. When you make changes to your Streamlit codebase, you can go ahead and re-run terraform apply to push your new changes to your cloud environment.

When updating the Streamlit codebase, the CodePipeline and CodeBuild processes kick off to automatically update your new changes, which get reflected on your Streamlit application. CodePipeline automates the entire software release process, managing stages like source retrieval, building, testing, and deployment. It integrates with AWS services and third-party tools (such as GitHub and Jenkins) to enhance automation, speed, and security. CodeBuild focuses on automating code compilation, testing, and packaging, supporting multiple languages and custom Docker environments, while integrating with CodePipeline for scalable, secure builds. With this CI/CD pipeline, when you make changes to your code, all you need to run is terraform apply to update your cloud environment. For an example buildspec, see the example in the repo.

You can find full examples of deploying the infrastructure with and without existing resources in the GitHub repository.

Clean up

When you no longer need the resources deployed in this post, you can clean up the resources by using the Terraform destroy command. Simply run terraform destroy . This will remove all of the resources you have deployed in this post with Terraform.

Conclusion

Building serverless Streamlit applications with Terraform on AWS offers a powerful combination of scalability, efficiency, and automation. As you continue to build and refine your Streamlit applications, Terraform’s flexibility ensures that your infrastructure can evolve seamlessly, supporting rapid innovation and agile development. With Streamlit and Terraform, you have the tools to create dynamic, serverless applications that scale effortlessly and operate reliably in the cloud.

Authors

Image depicting Kevon Mayers, a Solutions Architect at AWS

Kevon Mayers

Kevon Mayers is a Solutions Architect at AWS. Kevon is a Terraform Contributor and has led multiple Terraform initiatives within AWS. Prior to joining AWS he was working as a DevOps Engineer and Developer, and before that was working with the GRAMMYs/The Recording Academy as a Studio Manager, Music Producer, and Audio Engineer. He also owns a professional production company, MM Productions.

Image depicting Alexa Perlov, a Prototyping Architect at AWS

Alexa Perlov

Alexa Perlov is a Prototyping Architect with the Prototyping Acceleration team at AWS. She helps customers build with emerging technologies by open sourcing repeatable projects. She is currently based out of Pittsburgh, PA.

Image depicting Shravani Malipeddi, a Solutions Architect at AWS

Shravani Malipeddi

Shravani Malipeddi is a Solutions Architect at AWS who came out of the TechU Program. She currently supports strategic accounts and is based out of San Francisco, CA. .

Terraform CI/CD and testing on AWS with the new Terraform Test Framework

Post Syndicated from Kevon Mayers original https://aws.amazon.com/blogs/devops/terraform-ci-cd-and-testing-on-aws-with-the-new-terraform-test-framework/

Image of HashiCorp Terraform logo and Amazon Web Services (AWS) Logo. Underneath the AWS Logo are the service logos for AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and Amazon S3. Graphic created by Kevon Mayers

Graphic created by Kevon Mayers

 Introduction

Organizations often use Terraform Modules to orchestrate complex resource provisioning and provide a simple interface for developers to enter the required parameters to deploy the desired infrastructure. Modules enable code reuse and provide a method for organizations to standardize deployment of common workloads such as a three-tier web application, a cloud networking environment, or a data analytics pipeline. When building Terraform modules, it is common for the module author to start with manual testing. Manual testing is performed using commands such as terraform validate for syntax validation, terraform plan to preview the execution plan, and terraform apply followed by manual inspection of resource configuration in the AWS Management Console. Manual testing is prone to human error, not scalable, and can result in unintended issues. Because modules are used by multiple teams in the organization, it is important to ensure that any changes to the modules are extensively tested before the release. In this blog post, we will show you how to validate Terraform modules and how to automate the process using a Continuous Integration/Continuous Deployment (CI/CD) pipeline.

Terraform Test

Terraform test is a new testing framework for module authors to perform unit and integration tests for Terraform modules. Terraform test can create infrastructure as declared in the module, run validation against the infrastructure, and destroy the test resources regardless if the test passes or fails. Terraform test will also provide warnings if there are any resources that cannot be destroyed. Terraform test uses the same HashiCorp Configuration Language (HCL) syntax used to write Terraform modules. This reduces the burden for modules authors to learn other tools or programming languages. Module authors run the tests using the command terraform test which is available on Terraform CLI version 1.6 or higher.

Module authors create test files with the extension *.tftest.hcl. These test files are placed in the root of the Terraform module or in a dedicated tests directory. The following elements are typically present in a Terraform tests file:

  • Provider block: optional, used to override the provider configuration, such as selecting AWS region where the tests run.
  • Variables block: the input variables passed into the module during the test, used to supply non-default values or to override default values for variables.
  • Run block: used to run a specific test scenario. There can be multiple run blocks per test file, Terraform executes run blocks in order. In each run block you specify the command Terraform (plan or apply), and the test assertions. Module authors can specify the conditions such as: length(var.items) != 0. A full list of condition expressions can be found in the HashiCorp documentation.

Terraform tests are performed in sequential order and at the end of the Terraform test execution, any failed assertions are displayed.

Basic test to validate resource creation

Now that we understand the basic anatomy of a Terraform tests file, let’s create basic tests to validate the functionality of the following Terraform configuration. This Terraform configuration will create an AWS CodeCommit repository with prefix name repo-.

# main.tf

variable "repository_name" {
  type = string
}
resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description     = "Test repository."
}

Now we create a Terraform test file in the tests directory. See the following directory structure as an example:

├── main.tf 
└── tests 
└── basic.tftest.hcl

For this first test, we will not perform any assertion except for validating that Terraform execution plan runs successfully. In the tests file, we create a variable block to set the value for the variable repository_name. We also added the run block with command = plan to instruct Terraform test to run Terraform plan. The completed test should look like the following:

# basic.tftest.hcl

variables {
  repository_name = "MyRepo"
}

run "test_resource_creation" {
  command = plan
}

Now we will run this test locally. First ensure that you are authenticated into an AWS account, and run the terraform init command in the root directory of the Terraform module. After the provider is initialized, start the test using the terraform test command.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass

Our first test is complete, we have validated that the Terraform configuration is valid and the resource can be provisioned successfully. Next, let’s learn how to perform inspection of the resource state.

Create resource and validate resource name

Re-using the previous test file, we add the assertion block to checks if the CodeCommit repository name starts with a string repo- and provide error message if the condition fails. For the assertion, we use the startswith function. See the following example:

# basic.tftest.hcl

variables {
  repository_name = "MyRepo"
}

run "test_resource_creation" {
  command = plan

  assert {
    condition = startswith(aws_codecommit_repository.test.repository_name, "repo-")
    error_message = "CodeCommit repository name ${var.repository_name} did not start with the expected value of ‘repo-****’."
  }
}

Now, let’s assume that another module author made changes to the module by modifying the prefix from repo- to my-repo-. Here is the modified Terraform module.

# main.tf

variable "repository_name" {
  type = string
}
resource "aws_codecommit_repository" "test" {
  repository_name = format("my-repo-%s", var.repository_name)
  description = "Test repository."
}

We can catch this mistake by running the the terraform test command again.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... fail
╷
│ Error: Test assertion failed
│
│ on tests/basic.tftest.hcl line 9, in run "test_resource_creation":
│ 9: condition = startswith(aws_codecommit_repository.test.repository_name, "repo-")
│ ├────────────────
│ │ aws_codecommit_repository.test.repository_name is "my-repo-MyRepo"
│
│ CodeCommit repository name MyRepo did not start with the expected value 'repo-***'.
╵
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... fail

Failure! 0 passed, 1 failed.

We have successfully created a unit test using assertions that validates the resource name matches the expected value. For more examples of using assertions see the Terraform Tests Docs. Before we proceed to the next section, don’t forget to fix the repository name in the module (revert the name back to repo- instead of my-repo-) and re-run your Terraform test.

Testing variable input validation

When developing Terraform modules, it is common to use variable validation as a contract test to validate any dependencies / restrictions. For example, AWS CodeCommit limits the repository name to 100 characters. A module author can use the length function to check the length of the input variable value. We are going to use Terraform test to ensure that the variable validation works effectively. First, we modify the module to use variable validation.

# main.tf

variable "repository_name" {
  type = string
  validation {
    condition = length(var.repository_name) <= 100
    error_message = "The repository name must be less than or equal to 100 characters."
  }
}

resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description = "Test repository."
}

By default, when variable validation fails during the execution of Terraform test, the Terraform test also fails. To simulate this, create a new test file and insert the repository_name variable with a value longer than 100 characters.

# var_validation.tftest.hcl

variables {
  repository_name = “this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy”
}

run “test_invalid_var” {
  command = plan
}

Notice on this new test file, we also set the command to Terraform plan, why is that? Because variable validation runs prior to Terraform apply, thus we can save time and cost by skipping the entire resource provisioning. If we run this Terraform test, it will fail as expected.

❯ terraform test
tests/basic.tftest.hcl… in progress
run “test_resource_creation”… pass
tests/basic.tftest.hcl… tearing down
tests/basic.tftest.hcl… pass
tests/var_validation.tftest.hcl… in progress
run “test_invalid_var”… fail
╷
│ Error: Invalid value for variable
│
│ on main.tf line 1:
│ 1: variable “repository_name” {
│ ├────────────────
│ │ var.repository_name is “this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy”
│
│ The repository name must be less than or equal to 100 characters.
│
│ This was checked by the validation rule at main.tf:3,3-13.
╵
tests/var_validation.tftest.hcl… tearing down
tests/var_validation.tftest.hcl… fail

Failure! 1 passed, 1 failed.

For other module authors who might iterate on the module, we need to ensure that the validation condition is correct and will catch any problems with input values. In other words, we expect the validation condition to fail with the wrong input. This is especially important when we want to incorporate the contract test in a CI/CD pipeline. To prevent our test from failing due introducing an intentional error in the test, we can use the expect_failures attribute. Here is the modified test file:

# var_validation.tftest.hcl

variables {
  repository_name = “this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy”
}

run “test_invalid_var” {
  command = plan

  expect_failures = [
    var.repository_name
  ]
}

Now if we run the Terraform test, we will get a successful result.

❯ terraform test
tests/basic.tftest.hcl… in progress
run “test_resource_creation”… pass
tests/basic.tftest.hcl… tearing down
tests/basic.tftest.hcl… pass
tests/var_validation.tftest.hcl… in progress
run “test_invalid_var”… pass
tests/var_validation.tftest.hcl… tearing down
tests/var_validation.tftest.hcl… pass

Success! 2 passed, 0 failed.

As you can see, the expect_failures attribute is used to test negative paths (the inputs that would cause failures when passed into a module). Assertions tend to focus on positive paths (the ideal inputs). For an additional example of a test that validates functionality of a completed module with multiple interconnected resources, see this example in the Terraform CI/CD and Testing on AWS Workshop.

Orchestrating supporting resources

In practice, end-users utilize Terraform modules in conjunction with other supporting resources. For example, a CodeCommit repository is usually encrypted using an AWS Key Management Service (KMS) key. The KMS key is provided by end-users to the module using a variable called kms_key_id. To simulate this test, we need to orchestrate the creation of the KMS key outside of the module. In this section we will learn how to do that. First, update the Terraform module to add the optional variable for the KMS key.

# main.tf

variable "repository_name" {
  type = string
  validation {
    condition = length(var.repository_name) <= 100
    error_message = "The repository name must be less than or equal to 100 characters."
  }
}

variable "kms_key_id" {
  type = string
  default = ""
}

resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description = "Test repository."
  kms_key_id = var.kms_key_id != "" ? var.kms_key_id : null
}

In a Terraform test, you can instruct the run block to execute another helper module. The helper module is used by the test to create the supporting resources. We will create a sub-directory called setup under the tests directory with a single kms.tf file. We also create a new test file for KMS scenario. See the updated directory structure:

├── main.tf
└── tests
├── setup
│ └── kms.tf
├── basic.tftest.hcl
├── var_validation.tftest.hcl
└── with_kms.tftest.hcl

The kms.tf file is a helper module to create a KMS key and provide its ARN as the output value.

# kms.tf

resource "aws_kms_key" "test" {
  description = "test KMS key for CodeCommit repo"
  deletion_window_in_days = 7
}

output "kms_key_id" {
  value = aws_kms_key.test.arn
}

The new test will use two separate run blocks. The first run block (setup) executes the helper module to generate a KMS key. This is done by assigning the command apply which will run terraform apply to generate the KMS key. The second run block (codecommit_with_kms) will then use the KMS key ARN output of the first run as the input variable passed to the main module.

# with_kms.tftest.hcl

run "setup" {
  command = apply
  module {
    source = "./tests/setup"
  }
}

run "codecommit_with_kms" {
  command = apply

  variables {
    repository_name = "MyRepo"
    kms_key_id = run.setup.kms_key_id
  }

  assert {
    condition = aws_codecommit_repository.test.kms_key_id != null
    error_message = "KMS key ID attribute value is null"
  }
}

Go ahead and run the Terraform init, followed by Terraform test. You should get the successful result like below.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass
tests/var_validation.tftest.hcl... in progress
run "test_invalid_var"... pass
tests/var_validation.tftest.hcl... tearing down
tests/var_validation.tftest.hcl... pass
tests/with_kms.tftest.hcl... in progress
run "create_kms_key"... pass
run "codecommit_with_kms"... pass
tests/with_kms.tftest.hcl... tearing down
tests/with_kms.tftest.hcl... pass

Success! 4 passed, 0 failed.

We have learned how to run Terraform test and develop various test scenarios. In the next section we will see how to incorporate all the tests into a CI/CD pipeline.

Terraform Tests in CI/CD Pipelines

Now that we have seen how Terraform Test works locally, let’s see how the Terraform test can be leveraged to create a Terraform module validation pipeline on AWS. The following AWS services are used:

  • AWS CodeCommit – a secure, highly scalable, fully managed source control service that hosts private Git repositories.
  • AWS CodeBuild – a fully managed continuous integration service that compiles source code, runs tests, and produces ready-to-deploy software packages.
  • AWS CodePipeline – a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.
  • Amazon Simple Storage Service (Amazon S3) – an object storage service offering industry-leading scalability, data availability, security, and performance.
Terraform module validation pipeline Architecture. Multiple interconnected AWS services such as AWS CodeCommit, CodeBuild, CodePipeline, and Amazon S3 used to build a Terraform module validation pipeline.

Terraform module validation pipeline

In the above architecture for a Terraform module validation pipeline, the following takes place:

  • A developer pushes Terraform module configuration files to a git repository (AWS CodeCommit).
  • AWS CodePipeline begins running the pipeline. The pipeline clones the git repo and stores the artifacts to an Amazon S3 bucket.
  • An AWS CodeBuild project configures a compute/build environment with Checkov installed from an image fetched from Docker Hub. CodePipeline passes the artifacts (Terraform module) and CodeBuild executes Checkov to run static analysis of the Terraform configuration files.
  • Another CodeBuild project configured with Terraform from an image fetched from Docker Hub. CodePipeline passes the artifacts (repo contents) and CodeBuild runs Terraform command to execute the tests.

CodeBuild uses a buildspec file to declare the build commands and relevant settings. Here is an example of the buildspec files for both CodeBuild Projects:

# Checkov
version: 0.1
phases:
  pre_build:
    commands:
      - echo pre_build starting

  build:
    commands:
      - echo build starting
      - echo starting checkov
      - ls
      - checkov -d .
      - echo saving checkov output
      - checkov -s -d ./ > checkov.result.txt

In the above buildspec, Checkov is run against the root directory of the cloned CodeCommit repository. This directory contains the configuration files for the Terraform module. Checkov also saves the output to a file named checkov.result.txt for further review or handling if needed. If Checkov fails, the pipeline will fail.

# Terraform Test
version: 0.1
phases:
  pre_build:
    commands:
      - terraform init
      - terraform validate

  build:
    commands:
      - terraform test

In the above buildspec, the terraform init and terraform validate commands are used to initialize Terraform, then check if the configuration is valid. Finally, the terraform test command is used to run the configured tests. If any of the Terraform tests fails, the pipeline will fail.

For a full example of the CI/CD pipeline configuration, please refer to the Terraform CI/CD and Testing on AWS workshop. The module validation pipeline mentioned above is meant as a starting point. In a production environment, you might want to customize it further by adding Checkov allow-list rules, linting, checks for Terraform docs, or pre-requisites such as building the code used in AWS Lambda.

Choosing various testing strategies

At this point you may be wondering when you should use Terraform tests or other tools such as Preconditions and Postconditions, Check blocks or policy as code. The answer depends on your test type and use-cases. Terraform test is suitable for unit tests, such as validating resources are created according to the naming specification. Variable validations and Pre/Post conditions are useful for contract tests of Terraform modules, for example by providing error warning when input variables value do not meet the specification. As shown in the previous section, you can also use Terraform test to ensure your contract tests are running properly. Terraform test is also suitable for integration tests where you need to create supporting resources to properly test the module functionality. Lastly, Check blocks are suitable for end to end tests where you want to validate the infrastructure state after all resources are generated, for example to test if a website is running after an S3 bucket configured for static web hosting is created.

When developing Terraform modules, you can run Terraform test in command = plan mode for unit and contract tests. This allows the unit and contract tests to run quicker and cheaper since there are no resources created. You should also consider the time and cost to execute Terraform test for complex / large Terraform configurations, especially if you have multiple test scenarios. Terraform test maintains one or many state files within the memory for each test file. Consider how to re-use the module’s state when appropriate. Terraform test also provides test mocking, which allows you to test your module without creating the real infrastructure.

Conclusion

In this post, you learned how to use Terraform test and develop various test scenarios. You also learned how to incorporate Terraform test in a CI/CD pipeline. Lastly, we also discussed various testing strategies for Terraform configurations and modules. For more information about Terraform test, we recommend the Terraform test documentation and tutorial. To get hands on practice building a Terraform module validation pipeline and Terraform deployment pipeline, check out the Terraform CI/CD and Testing on AWS Workshop.

Authors

Kevon Mayers

Kevon Mayers is a Solutions Architect at AWS. Kevon is a Terraform Contributor and has led multiple Terraform initiatives within AWS. Prior to joining AWS he was working as a DevOps Engineer and Developer, and before that was working with the GRAMMYs/The Recording Academy as a Studio Manager, Music Producer, and Audio Engineer. He also owns a professional production company, MM Productions.

Welly Siauw

Welly Siauw is a Principal Partner Solution Architect at Amazon Web Services (AWS). He spends his day working with customers and partners, solving architectural challenges. He is passionate about service integration and orchestration, serverless and artificial intelligence (AI) and machine learning (ML). He has authored several AWS blog posts and actively leads AWS Immersion Days and Activation Days. Welly spends his free time tinkering with espresso machines and outdoor hiking.

Your DevOps and Developer Productivity guide to re:Invent 2023

Post Syndicated from Anubhav Rao original https://aws.amazon.com/blogs/devops/your-devops-and-developer-productivity-guide-to-reinvent-2023/

Your DevOps and Developer Productivity guide to re:Invent 2023

ICYMI – AWS re:Invent is less than a week away! We can’t wait to join thousands of builders in person and virtually for another exciting event. Still need to save your spot? You can register here.

With so much planned for the DevOps and Developer Productivity (DOP) track at re:Invent, we’re highlighting the most exciting sessions for technology leaders and developers in this post. Sessions span intermediate (200) through expert (400) levels of content in a mix of interactive chalk talks, hands-on workshops, and lecture-style breakout sessions.

You will experience the future of efficient development at the DevOps and Developer Productivity track and get a chance to talk to AWS experts about exciting services, tools, and new AI capabilities that optimize and automate your software development lifecycle. Attendees will leave re:Invent with the latest strategies to accelerate development, use generative AI to improve developer productivity, and focus on high-value work and innovation.

How to reserve a seat in the sessions

Reserved seating is available for registered attendees to secure seats in the sessions of their choice. Reserve a seat by signing in to the attendee portal and navigating to Event, then Sessions.

Do not miss the Innovation Talk led by Vice President of AWS Generative Builders, Adam Seligman. In DOP225-INT Build without limits: The next-generation developer experience at AWS, Adam will provide updates on the latest developer tools and services, including generative AI-powered capabilities, low-code abstractions, cloud development, and operations. He’ll also welcome special guests to lead demos of key developer services and showcase how they integrate to increase productivity and innovation.

DevOps and Developer Productivity breakout sessions

What are breakout sessions?

AWS re:Invent breakout sessions are lecture-style and 60 minutes long. These sessions are delivered by AWS experts and typically reserve 10–15 minutes for Q&A at the end. Breakout sessions are recorded and made available on-demand after the event.

Level 200 — Intermediate

DOP201 | Best practices for Amazon CodeWhisperer Generative AI can create new content and ideas, including conversations, stories, images, videos, and music. Learning how to interact with generative AI effectively and proficiently is a skill worth developing. Join this session to learn about best practices for engaging with Amazon CodeWhisperer, which uses an underlying foundation model to radically improve developer productivity by generating code suggestions in real time.

DOP202 | Realizing the developer productivity benefits of Amazon CodeWhisperer Developers spend a significant amount of their time writing undifferentiated code. Amazon CodeWhisperer radically improves productivity by generating code suggestions in real time to alleviate this burden. In this session, learn how CodeWhisperer can “write” much of this undifferentiated code, allowing developers to focus on business logic and accelerate the pace of their innovation.

DOP205 | Accelerate development with Amazon CodeCatalyst In this session, explore the newest features in Amazon CodeCatalyst. Learn firsthand how these practical additions to CodeCatalyst can simplify application delivery, improve team collaboration, and speed up the software development lifecycle from concept to deployment.

DOP206 | AWS infrastructure as code: A year in review AWS provides services that help with the creation, deployment, and maintenance of application infrastructure in a programmatic, descriptive, and declarative way. These services help provide rigor, clarity, and reliability to application development. Join this session to learn about the new features and improvements for AWS infrastructure as code with AWS CloudFormation and AWS Cloud Development Kit (AWS CDK) and how they can benefit your team.

DOP207 | Build and run it: Streamline DevOps with machine learning on AWS While organizations have improved how they deliver and operate software, development teams still run into issues when performing manual code reviews, looking for hard-to-find defects, and uncovering security-related problems. Developers have to keep up with multiple programming languages and frameworks, and their productivity can be impaired when they have to search online for code snippets. Additionally, they require expertise in observability to successfully operate the applications they build. In this session, learn how companies like Fidelity Investments use machine learning–powered tools like Amazon CodeWhisperer and Amazon DevOps Guru to boost application availability and write software faster and more reliably.

DOP208 | Continuous integration and delivery for AWS AWS provides one place where you can plan work, collaborate on code, build, test, and deploy applications with continuous integration/continuous delivery (CI/CD) tools. In this session, learn about how to create end-to-end CI/CD pipelines using infrastructure as code on AWS.

DOP209 | Governance and security with infrastructure as code In this session, learn how to use AWS CloudFormation and the AWS CDK to deploy cloud applications in regulated environments while enforcing security controls. Find out how to catch issues early with cdk-nag, validate your pipelines with cfn-guard, and protect your accounts from unintended changes with CloudFormation hooks.

DOP210 | Scale your application development with Amazon CodeCatalyst Amazon CodeCatalyst brings together everything you need to build, deploy, and collaborate on software into one integrated software development service. In this session, discover the ways that CodeCatalyst helps developers and teams build and ship code faster while spending more time doing the work they love.

DOP211 | Boost developer productivity with Amazon CodeWhisperer Generative AI is transforming the way that developers work. Writing code is already getting disrupted by tools like Amazon CodeWhisperer, which enhances developer productivity by providing real-time code completions based on natural language prompts. In this session, get insights into how to evaluate and measure productivity with the adoption of generative AI–powered tools. Learn from the AWS Disaster Recovery team who uses CodeWhisperer to solve complex engineering problems by gaining efficiency through longer productivity cycles and increasing velocity to market for ongoing fixes. Hear how integrating tools like CodeWhisperer into your workflows can boost productivity.

DOP212 | New AWS generative AI features and tools for developers Explore how generative AI coding tools are changing the way developers and companies build software. Generative AI–powered tools are boosting developer and business productivity by automating tasks, improving communication and collaboration, and providing insights that can inform better decision-making. In this session, see the newest AWS tools and features that make it easier for builders to solve problems with minimal technical expertise and that help technical teams boost productivity. Walk through how organizations like FINRA are exploring generative AI and beginning their journey using these tools to accelerate their pace of innovation.

DOP220 | Simplify building applications with AWS SDKs AWS SDKs play a vital role in using AWS services in your organization’s applications and services. In this session, learn about the current state and the future of AWS SDKs. Explore how they can simplify your developer experience and unlock new capabilities. Discover how SDKs are evolving, providing a consistent experience in multiple languages and empowering you to do more with high-level abstractions to make it easier to build on AWS. Learn how AWS SDKs are built using open source tools like Smithy, and how you can use these tools to build your own SDKs to serve your customers’ needs.

DevOps and Developer Productivity chalk talks

What are chalk talks?

Chalk Talks are highly interactive sessions with a small audience. Experts lead you through problems and solutions on a digital whiteboard as the discussion unfolds. Each begins with a short lecture (10–15 minutes) delivered by an AWS expert, followed by a 45- or 50-minute Q&A session with the audience.

Level 300 — Advanced

DOP306 | Streamline DevSecOps with a complete software development service Security is not just for application code—the automated software supply chains that build modern software can also be exploited by attackers. In this chalk talk, learn how you can use Amazon CodeCatalyst to incorporate security tests into every aspect of your software development lifecycle while maintaining a great developer experience. Discover how CodeCatalyst’s flexible actions-based CI/CD workflows streamline the process of adapting to security threats.

DOP309-R | AI for DevOps: Modernizing your DevOps operations with AWS As more organizations move to microservices architectures to scale their businesses, applications increasingly have become distributed, requiring the need for even greater visibility. IT operations professionals and developers need more automated practices to maintain application availability and reduce the time and effort required to detect, debug, and resolve operational issues. In this chalk talk, discover how you can use AWS services, including Amazon CodeWhisperer, Amazon CodeGuru and Amazon DevOps Guru, to start using AI for DevOps solutions to detect, diagnose, and remedy anomalous application behavior.

DOP310-R | Better together: GitHub Actions, Amazon CodeCatalyst, or AWS CodeBuild Learn how combining GitHub Actions with Amazon CodeCatalyst or AWS CodeBuild can maximize development efficiency. In this chalk talk, learn about the tradeoffs of using GitHub Actions runners hosted on Amazon EC2 or Amazon ECS with GitHub Actions hosted on CodeCatalyst or CodeBuild. Explore integration with other AWS services to enhance workflow automation. Join this talk to learn how GitHub Actions on AWS can take your development processes to the next level.

DOP311 | Building infrastructure as code with AWS CloudFormation AWS CloudFormation helps you manage your AWS infrastructure as code, increasing automation and supporting infrastructure-as-code best practices. In this chalk talk, learn the fundamentals of CloudFormation, including templates, stacks, change sets, and stack dependencies. See a demo of how to describe your AWS infrastructure in a template format and provision resources in an automated, repeatable way.

DOP312 | Creating custom constructs with AWS CDK Join this chalk talk to get answers to your questions about creating, publishing, and sharing your AWS CDK constructs publicly and privately. Learn about construct levels, how to test your constructs, how to discover and use constructs in your AWS CDK projects, and explore Construct Hub.

DOP313-R | Multi-account and multi-Region deployments at scale Many AWS customers are implementing multi-account strategies to more easily manage their cloud infrastructure and improve their security and compliance postures. In this chalk talk, learn about various options for deploying resources into multiple accounts and AWS Regions using AWS developer tools, including AWS CodePipeline, AWS CodeDeploy, and Amazon CodeCatalyst.

DOP314 | Simplifying cloud infrastructure creation with the AWS CDK The AWS Cloud Development Kit (AWS CDK) is an open source software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. In this chalk talk, get an introduction to the AWS CDK and see a demo of how it can simplify infrastructure creation. Through code examples and diagrams, see how the AWS CDK lets you use familiar programming languages for declarative infrastructure definition. Also learn how it provides higher-level abstractions and constructs over native CloudFormation.

DOP317 | Applying Amazon’s DevOps culture to your team In this chalk talk, learn how Amazon helps its developers rapidly release and iterate software while maintaining industry-leading standards on security, reliability, and performance. Learn about the culture of two-pizza teams and how to maintain a culture of DevOps in a large enterprise. Also, discover how you can help build such a culture at your own organization.

DOP318 | Testing for resilience with AWS Fault Injection Simulator As cloud-based systems grow in scale and complexity, there is increased need to test distributed systems for resiliency. AWS Fault Injection Simulator (FIS) allows you to stress test your applications to understand failure modes and build more resilient services. Through code examples and diagrams, see how to set up and run fault injection experiments on AWS. By the end of this session, understand how FIS helps identify weaknesses and validate improvements to build more resilient cloud-based systems.

DOP319-R | Zero-downtime deployment strategies AWS services support a wealth of deployment options to meet your needs, ranging from in-place updates to blue/green deployment to continuous configuration with feature flags. In this chalk talk, hear about multiple options for deploying changes to Amazon EC2, Amazon ECS, and AWS Lambda compute platforms using AWS CodeDeploy, AWS AppConfig, AWS CloudFormation, AWS Cloud Development Kit (AWS CDK), and Amazon CodeCatalyst.

DOP320 | Build a path to production with Amazon CodeCatalyst blueprints Amazon CodeCatalyst uses blueprints to configure your software projects in the service. Blueprints instruct CodeCatalyst on how to set up a code repository with working sample code, define cloud infrastructure, and run pre-configured CI/CD workflows for your project. In this session, learn how blueprints in CodeCatalyst can give developers a compliant software service they’ll want to use on AWS.

DOP321-R | Code faster with Amazon CodeWhisperer Traditionally, building applications requires developers to spend a lot of time manually writing code and trying to learn and keep up with new frameworks, SDKs, and libraries. In the last three years, AI models have grown exponentially in complexity and sophistication, enabling the creation of tools like Amazon CodeWhisperer that can generate code suggestions in real time based on a natural language description of the task. In this session, learn how CodeWhisperer can accelerate and enhance your software development with code generation, reference tracking, security scans, and more.

DOP324 | Accelerating application development with AWS client-side tools Did you know AWS has more than just services? There are dozens of AWS client-side tools and libraries designed to make developing quality applications easier. In this chalk talk, explore some of the tools available in your development workspace. Learn more about command line tooling (AWS CLI), libraries (AWS SDK), IDE integrations, and application frameworks that can accelerate your AWS application development. The audience helps set the agenda so there’s sure to be something for every builder.

DevOps and Developer Productivity workshops

What are workshops?

Workshops are two-hour interactive learning sessions where you work in small group teams to solve problems using AWS services. Each workshop starts with a short lecture (10–15 minutes) by the main speaker, and the rest of the time is spent working as a group.

Level 300 — Advanced

DOP301 | Boost your application availability with AIOps on AWS As applications become increasingly distributed and complex, developers and IT operations teams can benefit from more automated practices to maintain application availability and reduce the time and effort spent detecting, debugging, and resolving operational issues manually. In this workshop, learn how AWS AIOps solutions can help you make the shift toward more automation and proactive mechanisms so your IT team can innovate faster. The workshop includes use cases spanning multiple AWS services such as AWS Lambda, Amazon DynamoDB, Amazon API Gateway, Amazon RDS, and Amazon EKS. Learn how you can reduce MTTR and quickly identify issues within your AWS infrastructure. You must bring your laptop to participate.

DOP302 | Build software faster with Amazon CodeCatalyst In this workshop, learn about creating continuous integration and continuous delivery (CI/CD) pipelines using Amazon CodeCatalyst. CodeCatalyst is a unified software development service on AWS that brings together everything teams need to plan, code, build, test, and deploy applications with continuous CI/CD tools. You can utilize AWS services and integrate AWS resources into your projects by connecting your AWS accounts. With all of the stages of an application’s lifecycle in one tool, you can deliver quality software quickly and confidently. You must bring your laptop to participate.

DOP303-R | Continuous integration and delivery on AWS In this workshop, learn to create end-to-end continuous integration and continuous delivery (CI/CD) pipelines using AWS Cloud Development Kit (AWS CDK). Review the fundamental concepts of continuous integration, continuous deployment, and continuous delivery. Then, using TypeScript/Python, define an AWS CodePipeline, AWS CodeBuild, and AWS CodeCommit workflow. You must bring your laptop to participate.

DOP304 | Develop AWS CDK resources to deploy your applications on AWS In this workshop, learn how to build and deploy applications using infrastructure as code with AWS Cloud Development Kit (AWS CDK). Create resources using AWS CDK and learn maintenance and operations tips. In addition, get an introduction to building your own constructs. You must bring your laptop to participate.

DOP305 | Develop AWS CloudFormation templates to manage your infrastructure In this workshop, learn how to develop and test AWS CloudFormation templates. Create CloudFormation templates to deploy and manage resources and learn about CloudFormation language features that allow you to reuse and extend templates for many scenarios. Explore testing tools that can help you validate your CloudFormation templates, including cfn-lint and CloudFormation Guard. You must bring your laptop to participate.

DOP307-R | Hands-on with Amazon CodeWhisperer In this workshop, learn how to build applications faster and more securely with Amazon CodeWhisperer. The workshop begins with several examples highlighting how CodeWhisperer incorporates your comments and existing code to produce results. Then dive into a series of challenges designed to improve your productivity using multiple languages and frameworks. You must bring your laptop to participate.

DOP308 | Enforcing development standards with Amazon CodeCatalyst In this workshop, learn how Amazon CodeCatalyst can accelerate the application development lifecycle within your organization. Discover how your cloud center of excellence (CCoE) can provide standardized code and workflows to help teams get started quickly and securely. In addition, learn how to update projects as organization standards evolve. You must bring your laptop to participate.

Level 400 — Expert

DOP401 | Get better at building AWS CDK constructs In this workshop, dive deep into how to design AWS CDK constructs, which are reusable and shareable cloud components that help you meet your organization’s security, compliance, and governance requirements. Learn how to build, test, and share constructs representing a single AWS resource, as well as how to create higher-level abstractions that include built-in defaults and allow you to provision multiple AWS resources. You must bring your laptop to participate.

DevOps and Developer Productivity builders’ sessions

What are builders’ sessions?

These 60-minute group sessions are led by an AWS expert and provide an interactive learning experience for building on AWS. Builders’ sessions are designed to create a hands-on experience where questions are encouraged.

Level 300 — Advanced

DOP322-R | Accelerate data science coding with Amazon CodeWhisperer Generative AI removes the heavy lifting that developers experience today by writing much of the undifferentiated code, allowing them to build faster. Helping developers code faster could be one of the most powerful uses of generative AI that we will see in the coming years—and this framework can also be applied to data science projects. In this builders’ session, explore how Amazon CodeWhisperer accelerates the completion of data science coding tasks with extensions for JupyterLab and Amazon SageMaker. Learn how to build data processing pipeline and machine learning models with the help of CodeWhisperer and accelerate data science experiments in Python. You must bring your laptop to participate.

Level 400 — Expert

DOP402-R | Manage dev environments at scale with Amazon CodeCatalyst Amazon CodeCatalyst Dev Environments are cloud-based environments that you can use to quickly work on the code stored in the source repositories of your project. They are automatically created with pre-installed dependencies and language-specific packages so you can work on a new or existing project right away. In this session, learn how to create secure, reproducible, and consistent environments for VS Code, AWS Cloud9, and JetBrains IDEs. You must bring your laptop to participate.

DOP403-R | Hands-on with Amazon CodeCatalyst: Automating security in CI/CD pipelines In this session, learn how to build a CI/CD pipeline with Amazon CodeCatalyst and add the necessary steps to secure your pipeline. Learn how to perform tasks such as secret scanning, software composition analysis (SCA), static application security testing (SAST), and generating a software bill of materials (SBOM). You must bring your laptop to participate.

DevOps and Developer Productivity lightning talks

What are lightning talks?

Lightning talks are short, 20-minute demos led from a stage.

DOP221 | Amazon CodeCatalyst in real time: Deploying to production in minutes In this follow-up demonstration to DOP210, see how you can use an Amazon CodeCatalyst blueprint to build a production-ready application that is set up for long-term success. See in real time how to create a project using a CodeCatalyst Dev Environment and deploy it to production using a CodeCatalyst workflow.

DevOps and Developer Productivity code talks

What are code talks?

Code talks are 60-minute, highly-interactive discussions featuring live coding. Attendees are encouraged to dig in and ask questions about the speaker’s approach.

DOP203 | The future of development on AWS This code talk includes a live demo and an open discussion about how builders can use the latest AWS developer tools and generative AI to build production-ready applications in minutes. Starting at an Amazon CodeCatalyst blueprint and using integrated AWS productivity and security capabilities, see a glimpse of what the future holds for developing on AWS.

DOP204 | Tips and tricks for coding with Amazon CodeWhisperer Generative AI tools that can generate code suggestions, such as Amazon CodeWhisperer, are growing rapidly in popularity. Join this code talk to learn how CodeWhisperer can accelerate and enhance your software development with code generation, reference tracking, security scans, and more. Learn best practices for prompt engineering, and get tips and tricks that can help you be more productive when building applications.

Want to stay connected?

Get the latest updates for DevOps and Developer Productivity by following us on Twitter and visiting the AWS devops blog.

Blue/Green deployments using AWS CDK Pipelines and AWS CodeDeploy

Post Syndicated from Luiz Decaro original https://aws.amazon.com/blogs/devops/blue-green-deployments-using-aws-cdk-pipelines-and-aws-codedeploy/

Customers often ask for help with implementing Blue/Green deployments to Amazon Elastic Container Service (Amazon ECS) using AWS CodeDeploy. Their use cases usually involve cross-Region and cross-account deployment scenarios. These requirements are challenging enough on their own, but in addition to those, there are specific design decisions that need to be considered when using CodeDeploy. These include how to configure CodeDeploy, when and how to create CodeDeploy resources (such as Application and Deployment Group), and how to write code that can be used to deploy to any combination of account and Region.

Today, I will discuss those design decisions in detail and how to use CDK Pipelines to implement a self-mutating pipeline that deploys services to Amazon ECS in cross-account and cross-Region scenarios. At the end of this blog post, I also introduce a demo application, available in Java, that follows best practices for developing and deploying cloud infrastructure using AWS Cloud Development Kit (AWS CDK).

The Pipeline

CDK Pipelines is an opinionated construct library used for building pipelines with different deployment engines. It abstracts implementation details that developers or infrastructure engineers need to solve when implementing a cross-Region or cross-account pipeline. For example, in cross-Region scenarios, AWS CloudFormation needs artifacts to be replicated to the target Region. For that reason, AWS Key Management Service (AWS KMS) keys, an Amazon Simple Storage Service (Amazon S3) bucket, and policies need to be created for the secondary Region. This enables artifacts to be moved from one Region to another. In cross-account scenarios, CodeDeploy requires a cross-account role with access to the KMS key used to encrypt configuration files. This is the sort of detail that our customers want to avoid dealing with manually.

AWS CodeDeploy is a deployment service that automates application deployment across different scenarios. It deploys to Amazon EC2 instances, On-Premises instances, serverless Lambda functions, or Amazon ECS services. It integrates with AWS Identity and Access Management (AWS IAM), to implement access control to deploy or re-deploy old versions of an application. In the Blue/Green deployment type, it is possible to automate the rollback of a deployment using Amazon CloudWatch Alarms.

CDK Pipelines was designed to automate AWS CloudFormation deployments. Using AWS CDK, these CloudFormation deployments may include deploying application software to instances or containers. However, some customers prefer using CodeDeploy to deploy application software. In this blog post, CDK Pipelines will deploy using CodeDeploy instead of CloudFormation.

A pipeline build with CDK Pipelines that deploys to Amazon ECS using AWS CodeDeploy. It contains at least 5 stages: Source, Build, UpdatePipeline, Assets and at least one Deployment stage.

Design Considerations

In this post, I’m considering the use of CDK Pipelines to implement different use cases for deploying a service to any combination of accounts (single-account & cross-account) and regions (single-Region & cross-Region) using CodeDeploy. More specifically, there are four problems that need to be solved:

CodeDeploy Configuration

The most popular options for implementing a Blue/Green deployment type using CodeDeploy are using CloudFormation Hooks or using a CodeDeploy construct. I decided to operate CodeDeploy using its configuration files. This is a flexible design that doesn’t rely on using custom resources, which is another technique customers have used to solve this problem. On each run, a pipeline pushes a container to a repository on Amazon Elastic Container Registry (ECR) and creates a tag. CodeDeploy needs that information to deploy the container.

I recommend creating a pipeline action to scan the AWS CDK cloud assembly and retrieve the repository and tag information. The same action can create the CodeDeploy configuration files. Three configuration files are required to configure CodeDeploy: appspec.yaml, taskdef.json and imageDetail.json. This pipeline action should be executed before the CodeDeploy deployment action. I recommend creating template files for appspec.yaml and taskdef.json. The following script can be used to implement the pipeline action:

##
#!/bin/sh
#
# Action Configure AWS CodeDeploy
# It customizes the files template-appspec.yaml and template-taskdef.json to the environment
#
# Account = The target Account Id
# AppName = Name of the application
# StageName = Name of the stage
# Region = Name of the region (us-east-1, us-east-2)
# PipelineId = Id of the pipeline
# ServiceName = Name of the service. It will be used to define the role and the task definition name
#
# Primary output directory is codedeploy/. All the 3 files created (appspec.json, imageDetail.json and 
# taskDef.json) will be located inside the codedeploy/ directory
#
##
Account=$1
Region=$2
AppName=$3
StageName=$4
PipelineId=$5
ServiceName=$6
repo_name=$(cat assembly*$PipelineId-$StageName/*.assets.json | jq -r '.dockerImages[] | .destinations[] | .repositoryName' | head -1) 
tag_name=$(cat assembly*$PipelineId-$StageName/*.assets.json | jq -r '.dockerImages | to_entries[0].key')  
echo ${repo_name} 
echo ${tag_name} 
printf '{"ImageURI":"%s"}' "$Account.dkr.ecr.$Region.amazonaws.com/${repo_name}:${tag_name}" > codedeploy/imageDetail.json                     
sed 's#APPLICATION#'$AppName'#g' codedeploy/template-appspec.yaml > codedeploy/appspec.yaml 
sed 's#APPLICATION#'$AppName'#g' codedeploy/template-taskdef.json | sed 's#TASK_EXEC_ROLE#arn:aws:iam::'$Account':role/'$ServiceName'#g' | sed 's#fargate-task-definition#'$ServiceName'#g' > codedeploy/taskdef.json 
cat codedeploy/appspec.yaml
cat codedeploy/taskdef.json
cat codedeploy/imageDetail.json

Using a Toolchain

A good strategy is to encapsulate the pipeline inside a Toolchain to abstract how to deploy to different accounts and regions. This helps decoupling clients from the details such as how the pipeline is created, how CodeDeploy is configured, and how cross-account and cross-Region deployments are implemented. To create the pipeline, deploy a Toolchain stack. Out-of-the-box, it allows different environments to be added as needed. Depending on the requirements, the pipeline may be customized to reflect the different stages or waves that different components might require. For more information, please refer to our best practices on how to automate safe, hands-off deployments and its reference implementation.

In detail, the Toolchain stack follows the builder pattern used throughout the CDK for Java. This is a convenience that allows complex objects to be created using a single statement:

 Toolchain.Builder.create(app, Constants.APP_NAME+"Toolchain")
        .stackProperties(StackProps.builder()
                .env(Environment.builder()
                        .account(Demo.TOOLCHAIN_ACCOUNT)
                        .region(Demo.TOOLCHAIN_REGION)
                        .build())
                .build())
        .setGitRepo(Demo.CODECOMMIT_REPO)
        .setGitBranch(Demo.CODECOMMIT_BRANCH)
        .addStage(
                "UAT",
                EcsDeploymentConfig.CANARY_10_PERCENT_5_MINUTES,
                Environment.builder()
                        .account(Demo.SERVICE_ACCOUNT)
                        .region(Demo.SERVICE_REGION)
                        .build())                                                                                                             
        .build();

In the statement above, the continuous deployment pipeline is created in the TOOLCHAIN_ACCOUNT and TOOLCHAIN_REGION. It implements a stage that builds the source code and creates the Java archive (JAR) using Apache Maven.  The pipeline then creates a Docker image containing the JAR file.

The UAT stage will deploy the service to the SERVICE_ACCOUNT and SERVICE_REGION using the deployment configuration CANARY_10_PERCENT_5_MINUTES. This means 10 percent of the traffic is shifted in the first increment and the remaining 90 percent is deployed 5 minutes later.

To create additional deployment stages, you need a stage name, a CodeDeploy deployment configuration and an environment where it should deploy the service. As mentioned, the pipeline is, by default, a self-mutating pipeline. For example, to add a Prod stage, update the code that creates the Toolchain object and submit this change to the code repository. The pipeline will run and update itself adding a Prod stage after the UAT stage. Next, I show in detail the statement used to add a new Prod stage. The new stage deploys to the same account and Region as in the UAT environment:

... 
        .addStage(
                "Prod",
                EcsDeploymentConfig.CANARY_10_PERCENT_5_MINUTES,
                Environment.builder()
                        .account(Demo.SERVICE_ACCOUNT)
                        .region(Demo.SERVICE_REGION)
                        .build())                                                                                                                                      
        .build();

In the statement above, the Prod stage will deploy new versions of the service using a CodeDeploy deployment configuration CANARY_10_PERCENT_5_MINUTES. It means that 10 percent of traffic is shifted in the first increment of 5 minutes. Then, it shifts the rest of the traffic to the new version of the application. Please refer to Organizing Your AWS Environment Using Multiple Accounts whitepaper for best-practices on how to isolate and manage your business applications.

Some customers might find this approach interesting and decide to provide this as an abstraction to their application development teams. In this case, I advise creating a construct that builds such a pipeline. Using a construct would allow for further customization. Examples are stages that promote quality assurance or deploy the service in a disaster recovery scenario.

The implementation creates a stack for the toolchain and another stack for each deployment stage. As an example, consider a toolchain created with a single deployment stage named UAT. After running successfully, the DemoToolchain and DemoService-UAT stacks should be created as in the next image:

Two stacks are needed to create a Pipeline that deploys to a single environment. One stack deploys the Toolchain with the Pipeline and another stack deploys the Service compute infrastructure and CodeDeploy Application and DeploymentGroup. In this example, for an application named Demo that deploys to an environment named UAT, the stacks deployed are: DemoToolchain and DemoService-UAT

CodeDeploy Application and Deployment Group

CodeDeploy configuration requires an application and a deployment group. Depending on the use case, you need to create these in the same or in a different account from the toolchain (pipeline). The pipeline includes the CodeDeploy deployment action that performs the blue/green deployment. My recommendation is to create the CodeDeploy application and deployment group as part of the Service stack. This approach allows to align the lifecycle of CodeDeploy application and deployment group with the related Service stack instance.

CodePipeline allows to create a CodeDeploy deployment action that references a non-existing CodeDeploy application and deployment group. This allows us to implement the following approach:

  • Toolchain stack deploys the pipeline with CodeDeploy deployment action referencing a non-existing CodeDeploy application and deployment group
  • When the pipeline executes, it first deploys the Service stack that creates the related CodeDeploy application and deployment group
  • The next pipeline action executes the CodeDeploy deployment action. When the pipeline executes the CodeDeploy deployment action, the related CodeDeploy application and deployment will already exist.

Below is the pipeline code that references the (initially non-existing) CodeDeploy application and deployment group.

private IEcsDeploymentGroup referenceCodeDeployDeploymentGroup(
        final Environment env, 
        final String serviceName, 
        final IEcsDeploymentConfig ecsDeploymentConfig, 
        final String stageName) {

    IEcsApplication codeDeployApp = EcsApplication.fromEcsApplicationArn(
            this,
            Constants.APP_NAME + "EcsCodeDeployApp-"+stageName,
            Arn.format(ArnComponents.builder()
                    .arnFormat(ArnFormat.COLON_RESOURCE_NAME)
                    .partition("aws")
                    .region(env.getRegion())
                    .service("codedeploy")
                    .account(env.getAccount())
                    .resource("application")
                    .resourceName(serviceName)
                    .build()));

    IEcsDeploymentGroup deploymentGroup = EcsDeploymentGroup.fromEcsDeploymentGroupAttributes(
            this,
            Constants.APP_NAME + "-EcsCodeDeployDG-"+stageName,
            EcsDeploymentGroupAttributes.builder()
                    .deploymentGroupName(serviceName)
                    .application(codeDeployApp)
                    .deploymentConfig(ecsDeploymentConfig)
                    .build());

    return deploymentGroup;
}

To make this work, you should use the same application name and deployment group name values when creating the CodeDeploy deployment action in the pipeline and when creating the CodeDeploy application and deployment group in the Service stack (where the Amazon ECS infrastructure is deployed). This approach is necessary to avoid a circular dependency error when trying to create the CodeDeploy application and deployment group inside the Service stack and reference these objects to configure the CodeDeploy deployment action inside the pipeline. Below is the code that uses Service stack construct ID to name the CodeDeploy application and deployment group. I set the Service stack construct ID to the same name I used when creating the CodeDeploy deployment action in the pipeline.

   // configure AWS CodeDeploy Application and DeploymentGroup
   EcsApplication app = EcsApplication.Builder.create(this, "BlueGreenApplication")
           .applicationName(id)
           .build();

   EcsDeploymentGroup.Builder.create(this, "BlueGreenDeploymentGroup")
           .deploymentGroupName(id)
           .application(app)
           .service(albService.getService())
           .role(createCodeDeployExecutionRole(id))
           .blueGreenDeploymentConfig(EcsBlueGreenDeploymentConfig.builder()
                   .blueTargetGroup(albService.getTargetGroup())
                   .greenTargetGroup(tgGreen)
                   .listener(albService.getListener())
                   .testListener(listenerGreen)
                   .terminationWaitTime(Duration.minutes(15))
                   .build())
           .deploymentConfig(deploymentConfig)
           .build();

CDK Pipelines roles and permissions

CDK Pipelines creates roles and permissions the pipeline uses to execute deployments in different scenarios of regions and accounts. When using CodeDeploy in cross-account scenarios, CDK Pipelines deploys a cross-account support stack that creates a pipeline action role for the CodeDeploy action. This cross-account support stack is defined in a JSON file that needs to be published to the AWS CDK assets bucket in the target account. If the pipeline has the self-mutation feature on (default), the UpdatePipeline stage will do a cdk deploy to deploy changes to the pipeline. In cross-account scenarios, this deployment also involves deploying/updating the cross-account support stack. For this, the SelfMutate action in UpdatePipeline stage needs to assume CDK file-publishing and a deploy roles in the remote account.

The IAM role associated with the AWS CodeBuild project that runs the UpdatePipeline stage does not have these permissions by default. CDK Pipelines cannot grant these permissions automatically, because the information about the permissions that the cross-account stack needs is only available after the AWS CDK app finishes synthesizing. At that point, the permissions that the pipeline has are already locked-in­­. Hence, for cross-account scenarios, the toolchain should extend the permissions of the pipeline’s UpdatePipeline stage to include the file-publishing and deploy roles.

In cross-account environments it is possible to manually add these permissions to the UpdatePipeline stage. To accomplish that, the Toolchain stack may be used to hide this sort of implementation detail. In the end, a method like the one below can be used to add these missing permissions. For each different mapping of stage and environment in the pipeline it validates if the target account is different than the account where the pipeline is deployed. When the criteria is met, it should grant permission to the UpdatePipeline stage to assume CDK bootstrap roles (tagged using key aws-cdk:bootstrap-role) in the target account (with the tag value as file-publishing or deploy). The example below shows how to add permissions to the UpdatePipeline stage:

private void grantUpdatePipelineCrossAccoutPermissions(Map<String, Environment> stageNameEnvironment) {

    if (!stageNameEnvironment.isEmpty()) {

        this.pipeline.buildPipeline();
        for (String stage : stageNameEnvironment.keySet()) {

            HashMap<String, String[]> condition = new HashMap<>();
            condition.put(
                    "iam:ResourceTag/aws-cdk:bootstrap-role",
                    new String[] {"file-publishing", "deploy"});
            pipeline.getSelfMutationProject()
                    .getRole()
                    .addToPrincipalPolicy(PolicyStatement.Builder.create()
                            .actions(Arrays.asList("sts:AssumeRole"))
                            .effect(Effect.ALLOW)
                            .resources(Arrays.asList("arn:*:iam::"
                                    + stageNameEnvironment.get(stage).getAccount() + ":role/*"))
                            .conditions(new HashMap<String, Object>() {{
                                    put("ForAnyValue:StringEquals", condition);
                            }})
                            .build());
        }
    }
}

The Deployment Stage

Let’s consider a pipeline that has a single deployment stage, UAT. The UAT stage deploys a DemoService. For that, it requires four actions: DemoService-UAT (Prepare and Deploy), ConfigureBlueGreenDeploy and Deploy.

When using CodeDeploy the deployment stage is expected to have four actions: two actions to create CloudFormation change set and deploy the ECS or compute infrastructure, an action to configure CodeDeploy and the last action that deploys the application using CodeDeploy. In the diagram, these are (in the diagram in the respective order): DemoService-UAT.Prepare and DemoService-UAT.Deploy, ConfigureBlueGreenDeploy and Deploy.

The
DemoService-UAT.Deploy action will create the ECS resources and the CodeDeploy application and deployment group. The
ConfigureBlueGreenDeploy action will read the AWS CDK
cloud assembly. It uses the configuration files to identify the Amazon Elastic Container Registry (Amazon ECR) repository and the container image tag pushed. The pipeline will send this information to the
Deploy action.  The
Deploy action starts the deployment using CodeDeploy.

Solution Overview

As a convenience, I created an application, written in Java, that solves all these challenges and can be used as an example. The application deployment follows the same 5 steps for all deployment scenarios of account and Region, and this includes the scenarios represented in the following design:

A pipeline created by a Toolchain should be able to deploy to any combination of accounts and regions. This includes four scenarios: single-account and single-Region, single-account and cross-Region, cross-account and single-Region and cross-account and cross-Region

Conclusion

In this post, I identified, explained and solved challenges associated with the creation of a pipeline that deploys a service to Amazon ECS using CodeDeploy in different combinations of accounts and regions. I also introduced a demo application that implements these recommendations. The sample code can be extended to implement more elaborate scenarios. These scenarios might include automated testing, automated deployment rollbacks, or disaster recovery. I wish you success in your transformative journey.

Luiz Decaro

Luiz is a Principal Solutions architect at Amazon Web Services (AWS). He focuses on helping customers from the Financial Services Industry succeed in the cloud. Luiz holds a master’s in software engineering and he triggered his first continuous deployment pipeline in 2005.

How GitHub uses GitHub Actions and Actions larger runners to build and test GitHub.com

Post Syndicated from Max Wagner original https://github.blog/2023-09-26-how-github-uses-github-actions-and-actions-larger-runners-to-build-and-test-github-com/

The Developer Experience (DX) team at GitHub collaborated with a number of other teams to work on moving our continuous integration (CI) system to GitHub Actions to support the development and scaling demands of our engineering team. Our goal as a team is to enable our engineers to confidently and quickly ship software. To that end, we’ve worked on providing paved paths, a suite of automated tools and applications to streamline our development, runtime platforms, and deployments. Recently, we’ve been working to make our CI experience better by leveraging the newly released GitHub feature, Actions larger runners, to run our CI.

Read on to see how we run 15,000 CI jobs within an hour across 150,000 cores of compute!

Brief history of CI at GitHub

GitHub has invested in a variety of different CI systems throughout its history. With each system, our aim has been to enhance the development experience for both GitHub engineers writing and deploying code and for engineers maintaining the systems.

However, with past CI systems we faced challenges with scaling the system to meet the needs of our engineering team to provide both stable and ephemeral build environments. Neither of these challenges allowed us to provide the optimal developer experience.

Then, GitHub released GitHub Actions larger runners. This gave us an opportunity not only to transition to a fully featured CI system, but also to develop, experience, and utilize the systems we are creating for our customers and to drive feedback to help build the product. For the GitHub DX team, this transition was a great opportunity to move away from maintaining our past CI systems while delivering a superior developer experience.

What are larger runners?

Larger runners are GitHub Actions runners that are hosted by GitHub. They are managed virtual machines (VMs) with more RAM, CPU, and disk space than standard GitHub-hosted runners. There are a variety of different machine sizes offered for the runners as well as some additional features compared to the standard GitHub-hosted runners.

Larger runners are available to GitHub Team and GitHub Enterprise Cloud customers. Check out these docs to learn more about larger runners.

Why did we pick larger runners?

Autoscaling and managed

Coming from previous iterations of GitHub’s CI systems, we needed the ability to create CI machines on demand to meet the fast feedback cycles needed by GitHub engineers and to scale with the rate of change of the site.

With larger runners, we maintain the ability to autoscale our CI system because GitHub will automatically create multiple instances of a runner that scale up and down to match the job demands of our engineers. An added benefit is that the GitHub DX team no longer has to worry about the scaling of the runners since all of those complexities are handled by GitHub itself!

We wanted to share some raw numbers on our current peak utilization of larger runners:

  • Uses 4,500 concurrent 32-core runners
  • Runs 125,000 build minutes per hour
  • Queues and runs approximately 15,000 jobs within an hour
  • Allocates around 150,000 cores of compute

(Beta) Custom VM image support

GitHub Actions provides runners with a lot of tools already baked in, which is sufficient and convenient for a variety of projects across the company. However, for some complex production GitHub services, the prebuilt runners did not satisfy all our requirements.

To maintain an efficient and fast CI system, the DX team needed the ability to provide machines with all the tools needed to build those production services. We didn’t want to spend extra time installing tools or compiling projects during CI jobs.

We are currently building features into larger runners so they have the ability to be launched from a custom VM image, called custom images. While this feature is still in beta, using custom images is a huge benefit to GitHub’s CI lifecycle for a couple of reasons.

First, custom images allows GitHub to bundle all the required software and tools needed to build and test complex production bearing services. Anything that is unique to GitHub or one of our projects can be pre-installed on the image before a GitHub Actions workflow even starts.

Second, custom images enable GitHub to dramatically speed up our GitHub Actions workflows by acting as a bootstrapping cache for some projects. During custom image creation, we bundle a pre-built version of a project’s source code into the image. Subsequently, when the project starts a GitHub Actions workflow, it can utilize a cached version of its source code, and any other build artifacts, to speed up its build process.

The cached project source code on the custom VM image can quickly become out of date due to the rapid rate of development within GitHub. This, in turn, causes workflow durations to increase. The DX team worked with the GitHub Actions engineering team to create an API on GitHub to regularly update the custom image multiple times a day to keep the project source up to date.

In practice, this has reduced the bootstrapping time of our projects significantly. Without custom images, our workflows would take around 50 minutes from start to finish, versus the 12 minutes they take today. This is a game changer for our engineers.

We’re working on a way to offer this functionality at scale. If you are interested in custom images for your CI/CD workflows, please reach out to your account manager to learn more!

Important GitHub Actions features

There are thousands of projects at GitHub — from services that run production workloads to small tools that need to run CI to perform their daily operations. To make this a reality, GitHub leverages several important features in GitHub Actions that enable us to use the platform efficiently and securely across the company at scale.

Reusable workflows

One of the DX team’s driving goals is to pave paths for all repositories to run CI without introducing unnecessary repetition across repositories. Prior to GitHub Actions, we created single job configurations that could be used across multiple projects. In GitHub Actions, this was not as easy because any repository can define its own workflows. Reusable workflows to the rescue!

The reusable workflows feature in GitHub Actions provides a way to centrally manage a workflow in a repository that can be utilized by many other repositories in an organization. This was critical in our transition from our previous CI system to GitHub Actions. We were able to create several prebuilt workflows in a single repository, and many repositories could then use those workflows. This makes the process of adding CI to an existing or new project very much plug and play.

In our central repository hosting our reusable workflows, we can have workflows defined like:

on:
  workflow_call:
    inputs:
      cibuild-script:
        description: 'Which cibuild script to run.'
        type: string
        required: false
        default: "script/cibuild"
    secrets:
      service-api-key:
        required: true

jobs:
  reusable_workflow_job:
    runs-on: gh-larger-runner-medium
    name: Simple Workflow Job
    timeout-minutes: 20
    steps:
      - name: Checkout Project
        uses: actions/checkout@v3
      - name: Run cibuild script
        run: |
          bash ${{ inputs.cibuild-script }}
        shell: bash

And in consuming repositories, they can simply utilize the reusable workflow, with just a few lines of code!

name: my-new-project
on:
  workflow_dispatch:
  push:

jobs:
  call-reusable-workflow:
    uses: github/internal-actions/.github/workflows/default.yml@main
    with:
      cibuild-script: "script/cibuild-my-tests"
    secrets:
      service-api-key: ${{ secrets.SERVICE_API_KEY }}

Another great benefit of the reusable workflows feature is that the runner can be defined in the Reusable Workflow, meaning that we can guarantee all users of the workflow will run on our designated larger runner pool. Now, projects don’t need to worry about which runner they need to use!

(Beta) Reusing previous workflow outcomes

To optimize our developer experience, the DX team worked with our engineering team to create a feature for GitHub Actions that allows workflows to reuse the outcome of a previous workflow run where the outcomes would be the same.

In some cases, the file contents of a repository are exactly the same between workflow runs that run on different commits. That is, the Git tree IDs for the current commit is the same as the previous commit (there are no file differences). In these cases, we can bypass CI checks by reusing the previous workflow outcomes and allow engineers to not have to wait for CI to run again.

This feature saves GitHub engineers from running anywhere from 300 to 500 workflows runs a day!

Other challenges faced

Private service access

During some internal GitHub Actions workflow runs, the workflows need the ability to access some GitHub private services, within a GitHub virtual private cloud (VPC), over the network. These could be resources such as artifact storage, application metadata services, and other services that enable invocation of our test harness.

When we moved to larger runners, this requirement to access private services became a top-of-mind concern. In previous iterations of our CI infrastructure, these private services were accessible through other cloud and network configurations. However, larger runners are isolated from other production environments, meaning they cannot access our private services.

Like all companies, we need to focus on both the security of our platform as well as the developer experience. To satisfy these two requirements, GitHub developed a remote access solution that allows clients residing outside of our VPCs (larger runners) to securely access select private services.

This remote access solution works on the principle of minting an OIDC token in GitHub Actions, passing the OIDC token to a remote access gateway that authorizes the request by validating the OIDC token, and then proxying the request to the private service residing in a private network.

Flow diagram showing an OIDC token being mined in GitHub Actions, passed to a remote access gateway that authorizes the request by validating the OIDC token, and then proxying the request to the private service residing in a private network.

With this solution we are able to securely provide remote access from larger runners running GitHubActions to our private resources within our VPC.

GitHub has open sourced the basic scaffolding of this remote access gateway in the github/actions-oidc-gateway-example repository, so be sure to check it out!

Conclusion

GitHub Actions provides a robust and smooth developer experience for GitHub engineers working on GitHub.com. We have been able to accomplish this by using the power of GitHub Actions features, such as reusable workflows and reusable workflow outcomes, and by leveraging the scalability and manageability of the GitHub Actions larger runners. We have also used this effort to enhance the GitHub Actions product. To put it simply, GitHub runs on GitHub.

The post How GitHub uses GitHub Actions and Actions larger runners to build and test GitHub.com appeared first on The GitHub Blog.

How to add notifications and manual approval to an AWS CDK Pipeline

Post Syndicated from Jehu Gray original https://aws.amazon.com/blogs/devops/how-to-add-notifications-and-manual-approval-to-an-aws-cdk-pipeline/

A deployment pipeline typically comprises several stages such as dev, test, and prod, which ensure that changes undergo testing before reaching the production environment. To improve the reliability and stability of release processes, DevOps teams must review Infrastructure as Code (IaC) changes before applying them in production. As a result, implementing a mechanism for notification and manual approval that grants stakeholders improved access to changes in their release pipelines has become a popular practice for DevOps teams.

Notifications keep development teams and stakeholders informed in real-time about updates and changes to deployment status within release pipelines. Manual approvals establish thresholds for transitioning a change from one stage to the next in the pipeline. They also act as a guardrail to mitigate risks arising from errors and rework because of faulty deployments.

Please note that manual approvals, as described in this post, are not a replacement for the use of automation. Instead, they complement automated checks within the release pipeline.

In this blog post, we describe how to set up notifications and add a manual approval stage to AWS Cloud Development Kit (AWS CDK) Pipeline.

Concepts

CDK Pipeline

CDK Pipelines is a construct library for painless continuous delivery of CDK applications. CDK Pipelines can automatically build, test, and deploy changes to CDK resources. CDK Pipelines are self-mutating which means as application stages or stacks are added, the pipeline automatically reconfigures itself to deploy those new stages or stacks. Pipelines need only be manually deployed once, afterwards, the pipeline keeps itself up to date from the source code repository by pulling the changes pushed to the repository.

Notifications

Adding notifications to a pipeline provides visibility to changes made to the environment by utilizing the NotificationRule construct. You can also use this rule to notify pipeline users of important changes, such as when a pipeline starts execution. Notification rules specify both the events and the targets, such as Amazon Simple Notification Service (Amazon SNS) topic or AWS Chatbot clients configured for Slack which represents the nominated recipients of the notifications. An SNS topic is a logical access point that acts as a communication channel while Chatbot is an AWS service that enables DevOps and software development teams to use messaging program chat rooms to monitor and respond to operational events.

Manual Approval

In a CDK pipeline, you can incorporate an approval action at a specific stage, where the pipeline should pause, allowing a team member or designated reviewer to manually approve or reject the action. When an approval action is ready for review, a notification is sent out to alert the relevant parties. This combination of notifications and approvals ensures timely and efficient decision-making regarding crucial actions within the pipeline.

Solution Overview

The solution explains a simple web service that is comprised of an AWS Lambda function that returns a static web page served by Amazon API Gateway. Since Continuous Deployment and Continuous Integration (CI/CD) are important components to most web projects, the team implements a CDK Pipeline for their web project.

There are two important stages in this CDK pipeline; the Pre-production stage for testing and the Production stage, which contains the end product for users.

The flow of the CI/CD process to update the website starts when a developer pushes a change to the repository using their Integrated Development Environment (IDE). An Amazon CloudWatch event triggers the CDK Pipeline. Once the changes reach the pre-production stage for testing, the CI/CD process halts. This is because a manual approval gate is between the pre-production and production stages. So, it becomes a stakeholder’s responsibility to review the changes in the pre-production stage before approving them for production. The pipeline includes an SNS notification that notifies the stakeholder whenever the pipeline requires manual approval.

After approving the changes, the CI/CD process proceeds to the production stage and the updated version of the website becomes available to the end user. If the approver rejects the changes, the process ends at the pre-production stage with no impact to the end user.

The following diagram illustrates the solution architecture.

 

This diagram shows the CDK pipeline process in the solution and how applications or updates are deployed using AWS Lambda Function to end users.

Figure 1. This image shows the CDK pipeline process in our solution and how applications or updates are deployed using AWS Lambda Function to end users.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Add notification to the pipeline

In this tutorial, perform the following steps:

  • Add the import statements for AWS CodeStar notifications and SNS to the import section of the pipeline stack py
import aws_cdk.aws_codestarnotifications as notifications
import aws_cdk.pipelines as pipelines
import aws_cdk.aws_sns as sns
import aws_cdk.aws_sns_subscriptions as subs
  • Ensure the pipeline is built by calling the ‘build pipeline’ function.

pipeline.build_pipeline()

  • Create an SNS topic.

topic = sns.Topic(self, "MyTopic1")

  • Add a subscription to the topic. This specifies where the notifications are sent (Add the stakeholders’ email here).

topic.add_subscription(subs.EmailSubscription("[email protected]"))

  • Define a rule. This contains the source for notifications, the event trigger, and the target .

rule = notifications.NotificationRule(self, "NotificationRule", )

  • Assign the source the value pipeline.pipeline The first pipeline is the name of the CDK pipeline(variable) and the .pipeline is to show it is a pipeline(function).

source=pipeline.pipeline,

  • Define the events to be monitored. Specify notifications for when the pipeline starts, when it fails, when the execution succeeds, and finally when manual approval is needed.
events=["codepipeline-pipeline-pipeline-execution-started", "codepipeline-pipeline-pipeline-execution-failed","codepipeline-pipeline-pipeline-execution-succeeded", 
"codepipeline-pipeline-manual-approval-needed"],
  • For the complete list of supported event types for pipelines, see here
  • Finally, add the target. The target here is the topic created previously.

targets=[topic]

The combination of all the steps becomes:

pipeline.build_pipeline()
topic = sns.Topic(self, "MyTopic1")
topic.add_subscription(subs.EmailSubscription("[email protected]"))
rule = notifications.NotificationRule(self, "NotificationRule",
source=pipeline.pipeline,
events=["codepipeline-pipeline-pipeline-execution-started", "codepipeline-pipeline-pipeline-execution-failed","codepipeline-pipeline-pipeline-execution-succeeded", 
"codepipeline-pipeline-manual-approval-needed"],
targets=[topic]
)

Adding Manual Approval

  • Add the ManualApprovalStep import to the aws_cdk.pipelines import statement.
from aws_cdk.pipelines import (
CodePipeline,
CodePipelineSource,
ShellStep,
ManualApprovalStep
)
  • Add the ManualApprovalStep to the production stage. The code must be added to the add_stage() function.
 prod = WorkshopPipelineStage(self, "Prod")
        prod_stage = pipeline.add_stage(prod,
            pre = [ManualApprovalStep('PromoteToProduction')])

When a stage is added to a pipeline, you can specify the pre and post steps, which are arbitrary steps that run before or after the contents of the stage. You can use them to add validations like manual or automated gates to the pipeline. It is recommended to put manual approval gates in the set of pre steps, and automated approval gates in the set of post steps. So, the manual approval action is added as a pre step that runs after the pre-production stage and before the production stage .

  • The final version of the pipeline_stack.py becomes:
from constructs import Construct
import aws_cdk as cdk
import aws_cdk.aws_codestarnotifications as notifications
import aws_cdk.aws_sns as sns
import aws_cdk.aws_sns_subscriptions as subs
from aws_cdk import (
    Stack,
    aws_codecommit as codecommit,
    aws_codepipeline as codepipeline,
    pipelines as pipelines,
    aws_codepipeline_actions as cpactions,
    
)
from aws_cdk.pipelines import (
    CodePipeline,
    CodePipelineSource,
    ShellStep,
    ManualApprovalStep
)


class WorkshopPipelineStack(cdk.Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        
        # Creates a CodeCommit repository called 'WorkshopRepo'
        repo = codecommit.Repository(
            self, "WorkshopRepo", repository_name="WorkshopRepo",
            
        )
        
        #Create the Cdk pipeline
        pipeline = pipelines.CodePipeline(
            self,
            "Pipeline",
            
            synth=pipelines.ShellStep(
                "Synth",
                input=pipelines.CodePipelineSource.code_commit(repo, "main"),
                commands=[
                    "npm install -g aws-cdk",  # Installs the cdk cli on Codebuild
                    "pip install -r requirements.txt",  # Instructs Codebuild to install required packages
                    "npx cdk synth",
                ]
                
            ),
        )

        
         # Create the Pre-Prod Stage and its API endpoint
        deploy = WorkshopPipelineStage(self, "Pre-Prod")
        deploy_stage = pipeline.add_stage(deploy)
    
        deploy_stage.add_post(
            
            pipelines.ShellStep(
                "TestViewerEndpoint",
                env_from_cfn_outputs={
                    "ENDPOINT_URL": deploy.hc_viewer_url
                },
                commands=["curl -Ssf $ENDPOINT_URL"],
            )
    
        
        )
        deploy_stage.add_post(
            pipelines.ShellStep(
                "TestAPIGatewayEndpoint",
                env_from_cfn_outputs={
                    "ENDPOINT_URL": deploy.hc_endpoint
                },
                commands=[
                    "curl -Ssf $ENDPOINT_URL",
                    "curl -Ssf $ENDPOINT_URL/hello",
                    "curl -Ssf $ENDPOINT_URL/test",
                ],
            )
            
        )
        
        # Create the Prod Stage with the Manual Approval Step
        prod = WorkshopPipelineStage(self, "Prod")
        prod_stage = pipeline.add_stage(prod,
            pre = [ManualApprovalStep('PromoteToProduction')])
        
        prod_stage.add_post(
            
            pipelines.ShellStep(
                "ViewerEndpoint",
                env_from_cfn_outputs={
                    "ENDPOINT_URL": prod.hc_viewer_url
                },
                commands=["curl -Ssf $ENDPOINT_URL"],
                
            )
            
        )
        prod_stage.add_post(
            pipelines.ShellStep(
                "APIGatewayEndpoint",
                env_from_cfn_outputs={
                    "ENDPOINT_URL": prod.hc_endpoint
                },
                commands=[
                    "curl -Ssf $ENDPOINT_URL",
                    "curl -Ssf $ENDPOINT_URL/hello",
                    "curl -Ssf $ENDPOINT_URL/test",
                ],
            )
            
        )
        
        # Create The SNS Notification for the Pipeline
        
        pipeline.build_pipeline()
        
        topic = sns.Topic(self, "MyTopic")
        topic.add_subscription(subs.EmailSubscription("[email protected]"))
        rule = notifications.NotificationRule(self, "NotificationRule",
            source = pipeline.pipeline,
            events = ["codepipeline-pipeline-pipeline-execution-started", "codepipeline-pipeline-pipeline-execution-failed", "codepipeline-pipeline-manual-approval-needed", "codepipeline-pipeline-manual-approval-succeeded"],
            targets=[topic]
            )
  
    

When a commit is made with git commit -am "Add manual Approval" and changes are pushed with git push, the pipeline automatically self-mutates to add the new approval stage.

Now when the developer pushes changes to update the build environment or the end user application, the pipeline execution stops at the point where the approval action was added. The pipeline won’t resume unless a manual approval action is taken.

Image showing the CDK pipeline with the added Manual Approval action on the AWS Management Console

Figure 2. This image shows the pipeline with the added Manual Approval action.

Since there is a notification rule that includes the approval action, an email notification is sent with the pipeline information and approval status to the stakeholder(s) subscribed to the SNS topic.

Image showing the SNS email notification sent when the pipeline starts

Figure 3. This image shows the SNS email notification sent when the pipeline starts.

After pushing the updates to the pipeline, the reviewer or stakeholder can use the AWS Management Console to access the pipeline to approve or deny changes based on their assessment of these changes. This process helps eliminate any potential issues or errors and ensures only changes deemed relevant are made.

Image showing the review action on the AWS Management Console that gives the stakeholder the ability to approve or reject any changes.

Figure 4. This image shows the review action that gives the stakeholder the ability to approve or reject any changes. 

If a reviewer rejects the action, or if no approval response is received within seven days of the pipeline stopping for the review action, the pipeline status is “Failed.”

Image showing when a stakeholder rejects the action

Figure 5. This image depicts when a stakeholder rejects the action.

If a reviewer approves the changes, the pipeline continues its execution.

Image showing when a stakeholder approves the action

Figure 6. This image depicts when a stakeholder approves the action.

Considerations

It is important to consider any potential drawbacks before integrating a manual approval process into a CDK pipeline. one such consideration is its implementation may delay the delivery of updates to end users. An example of this is business hours limitation. The pipeline process might be constrained by the availability of stakeholders during business hours. This can result in delays if changes are made outside regular working hours and require approval when stakeholders are not immediately accessible.

Clean up

To avoid incurring future charges, delete the resources. Use cdk destroy via the command line to delete the created stack.

Conclusion

Adding notifications and manual approval to CDK Pipelines provides better visibility and control over the changes made to the pipeline environment. These features ideally complement the existing automated checks to ensure that all updates are reviewed before deployment. This reduces the risk of potential issues arising from bugs or errors. The ability to approve or deny changes through the AWS Management Console makes the review process simple and straightforward. Additionally, SNS notifications keep stakeholders updated on the status of the pipeline, ensuring a smooth and seamless deployment process.

Jehu Gray

Jehu Gray is an Enterprise Solutions Architect at Amazon Web Services where he helps customers design solutions that fits their needs. He enjoys exploring whats possible with IaC such as CDK.

Abiola Olanrewaju

Abiola Olanrewaju is an Enterprise Solutions Architect at Amazon Web Services where he helps customers design and implement scalable solutions that drive business outcomes. He has a keen interest in Data Analytics, Security and Automation.

Serge Poueme

Serge Poueme is a Solutions Architect on the AWS for Games Team. He started his career as a software development engineer and enjoys building new products. At AWS, Serge focuses on improving Builders Experience for game developers and optimize servers hosting using Containers. When he is not working he enjoys playing Far Cry or Fifa on his XBOX

Deploy serverless applications in a multicloud environment using Amazon CodeCatalyst

Post Syndicated from Deepak Kovvuri original https://aws.amazon.com/blogs/devops/deploy-serverless-applications-in-a-multicloud-environment-using-amazon-codecatalyst/

Amazon CodeCatalyst is an integrated service for software development teams adopting continuous integration and deployment practices into their software development process. CodeCatalyst puts the tools you need all in one place. You can plan work, collaborate on code, and build, test, and deploy applications by leveraging CodeCatalyst Workflows.

Introduction

In the first post of the blog series, we showed you how organizations can deploy workloads to instances, and virtual machines (VMs), across hybrid and multicloud environment. The second post of the series covered deploying containerized application in a multicloud environment. Finally, in this post, we explore how organizations can deploy modern, cloud-native, serverless application across multiple cloud platforms. Figure 1 shows the solution which we walk through in the post.

Figure 1 – Architecture diagram

The post walks through how to develop, deploy and test a HTTP RESTful API to Azure Functions using Amazon CodeCatalyst. The solution covers the following steps:

  • Set up CodeCatalyst development environment and develop your application using the Serverless Framework.
  • Build a CodeCatalyst workflow to test and then deploy to Azure Functions using GitHub Actions in Amazon CodeCatalyst.

An Amazon CodeCatalyst workflow is an automated procedure that describes how to build, test, and deploy your code as part of a continuous integration and continuous delivery (CI/CD) system. You can use GitHub Actions alongside native CodeCatalyst actions in a CodeCatalyst workflow.

Pre-requisites

Walkthrough

In this post, we will create a hello world RESTful API using the Serverless Framework. As we progress through the solution, we will focus on building a CodeCatalyst workflow that deploys and tests the functionality of the application. At the end of the post, the workflow will look similar to the one shown in Figure 2.

 CodeCatalyst CI/CD workflow

Figure 2 – CodeCatalyst CI/CD workflow

Environment Setup

Before we start developing the application, we need to setup a CodeCatalyst project and then link a code repository to the project. The code repository can be CodeCatalyst Repo or GitHub. In this scenario, we’ve used GitHub repository. By the time we develop the solution, the repository should look as shown below.

Files in solution's GitHub repository

Figure 3 – Files in GitHub repository

In Amazon CodeCatalyst, there’s an option to create Dev Environments, which can used to work on the code stored in the source repositories of a project. In the post, we create a Dev Environment, and associate it with the source repository created above and work off it. But you may choose not to use a Dev Environment, and can run the following commands, and commit to the repository. The /projects directory of a Dev Environment stores the files that are pulled from the source repository. In the dev environment, install the Serverless Framework using this command:

npm install -g serverless

and then initialize a serverless project in the source repository folder:

├── README.md
├── host.json
├── package.json
├── serverless.yml
└── src
    └── handlers
        ├── goodbye.js
        └── hello.js

We can push the code to the CodeCatalyst project using git. Now, that we have the code in CodeCatalyst, we can turn our focus to building the workflow using the CodeCatalyst console.

CI/CD Setup in CodeCatalyst

Configure access to the Azure Environment

We’ll use the GitHub action for Serverless to create and manage Azure Function. For the action to be able to access the Azure environment, it requires credentials associated with a Service Principal passed to the action as environment variables.

Service Principals in Azure are identified by the CLIENT_ID, CLIENT_SECRET, SUBSCRIPTION_ID, and TENANT_ID properties. Storing these values in plaintext anywhere in your repository should be avoided because anyone with access to the repository which contains the secret can see them. Similarly, these values shouldn’t be used directly in any workflow definitions because they will be visible as files in your repository. With CodeCatalyst, we can protect these values by storing them as secrets within the project, and then reference the secret in the CI\CD workflow.

We can create a secret by choosing Secrets (1) under CI\CD and then selecting ‘Create Secret’ (2) as shown in Figure 4. Now, we can key in the secret name and value of each of the identifiers described above.

Figure 4 – CodeCatalyst Secrets

Building the workflow

To create a new workflow, select CI/CD from navigation on the left and then select Workflows (1). Then, select Create workflow (2), leave the default options, and select Create (3) as shown in Figure 5.

Create CodeCatalyst CI/CD workflow

Figure 5 – Create CI/CD workflow

If the workflow editor opens in YAML mode, select Visual to open the visual designer. Now, we can start adding actions to the workflow.

Configure the Deploy action

We’ll begin by adding a GitHub action for deploying to Azure. Select “+ Actions” to open the actions list and choose GitHub from the dropdown menu. Find the Build action and click “+” to add a new GitHub action to the workflow.

Next, configure the GitHub action from the configurations tab by adding the following snippet to the GitHub Actions YAML property:

- name: Deploy to Azure Functions
  uses: serverless/[email protected]
  with:
    args: -c "serverless plugin install --name serverless-azure-functions && serverless deploy"
    entrypoint: /bin/sh
  env:
    AZURE_SUBSCRIPTION_ID: ${Secrets.SUBSCRIPTION_ID}
    AZURE_TENANT_ID: ${Secrets.TENANT_ID}
    AZURE_CLIENT_ID: ${Secrets.CLIENT_ID}
    AZURE_CLIENT_SECRET: ${Secrets.CLIENT_SECRET}

The above workflow configuration makes use of Serverless GitHub Action that wraps the Serverless Framework to run serverless commands. The action is configured to package and deploy the source code to Azure Functions using the serverless deploy command.

Please note how we were able to pass the secrets to GitHub action by referencing the secret identifiers in the above configuration.

Configure the Test action

Similar to the previous step, we add another GitHub action which will use the serverless framework’s serverless invoke command to test the API deployed on to Azure Functions.

- name: Test Function
  uses: serverless/[email protected]
  with:
    args: |
      -c "serverless plugin install --name serverless-azure-functions && \
          serverless invoke -f hello -d '{\"name\": \"CodeCatalyst\"}' && \
          serverless invoke -f goodbye -d '{\"name\": \"CodeCatalyst\"}'"
    entrypoint: /bin/sh
  env:
    AZURE_SUBSCRIPTION_ID: ${Secrets.SUBSCRIPTION_ID}
    AZURE_TENANT_ID: ${Secrets.TENANT_ID}
    AZURE_CLIENT_ID: ${Secrets.CLIENT_ID}
    AZURE_CLIENT_SECRET: ${Secrets.CLIENT_SECRET}

The workflow is now ready and can be validated by choosing ‘Validate’ and then saved to the repository by choosing ‘Commit’. The workflow should automatically kick-off after commit and the application is automatically deployed to Azure Functions.

The functionality of the API can now be verified from the logs of the test action of the workflow as shown in Figure 6.

Test action in CodeCatalyst CI/CD workfl

Figure 6 – CI/CD workflow Test action

Cleanup

If you have been following along with this workflow, you should delete the resources you deployed so you do not continue to incur charges. First, delete the Azure Function App (usually prefixed ‘sls’) using the Azure console. Second, delete the project from CodeCatalyst by navigating to Project settings and choosing Delete project. There’s no cost associated with the CodeCatalyst project and you can continue using it.

Conclusion

In summary, this post highlighted how Amazon CodeCatalyst can help organizations deploy cloud-native, serverless workload into multi-cloud environment. The post also walked through the solution detailing the process of setting up Amazon CodeCatalyst to deploy a serverless application to Azure Functions by leveraging GitHub Actions. Though we showed an application deployment to Azure Functions, you can follow a similar process and leverage CodeCatalyst to deploy any type of application to almost any cloud platform. Learn more and get started with your Amazon CodeCatalyst journey!

We would love to hear your thoughts, and experiences, on deploying serverless applications to multiple cloud platforms. Reach out to us if you’ve any questions, or provide your feedback in the comments section.

About Authors

Picture of Deepak

Deepak Kovvuri

Deepak Kovvuri is a Senior Solutions Architect at supporting Enterprise Customers at AWS in the US East area. He has over 6 years of experience in helping customers architecting a DevOps strategy for their cloud workloads. Deepak specializes in CI/CD, Systems Administration, Infrastructure as Code and Container Services. He holds an Masters in Computer Engineering from University of Illinois at Chicago.

Picture of Amandeep

Amandeep Bajwa

Amandeep Bajwa is a Senior Solutions Architect at AWS supporting Financial Services enterprises. He helps organizations achieve their business outcomes by identifying the appropriate cloud transformation strategy based on industry trends, and organizational priorities. Some of the areas Amandeep consults on are cloud migration, cloud strategy (including hybrid & multicloud), digital transformation, data & analytics, and technology in general.

Picture of Brian

Brian Beach

Brian Beach has over 20 years of experience as a Developer and Architect. He is currently a Principal Solutions Architect at Amazon Web Services. He holds a Computer Engineering degree from NYU Poly and an MBA from Rutgers Business School. He is the author of “Pro PowerShell for Amazon Web Services” from Apress. He is a regular author and has spoken at numerous events. Brian lives in North Carolina with his wife and three kids.

Picture of Pawan

Pawan Shrivastava

Pawan Shrivastava is a Partner Solution Architect at AWS in the WWPS team. He focusses on working with partners to provide technical guidance on AWS, collaborate with them to understand their technical requirements, and designing solutions to meet their specific needs. Pawan is passionate about DevOps, automation and CI CD pipelines. He enjoys watching MMA, playing cricket and working out in the Gym.

Deploy container applications in a multicloud environment using Amazon CodeCatalyst

Post Syndicated from Pawan Shrivastava original https://aws.amazon.com/blogs/devops/deploy-container-applications-in-a-multicloud-environment-using-amazon-codecatalyst/

In the previous post of this blog series, we saw how organizations can deploy workloads to virtual machines (VMs) in a hybrid and multicloud environment. This post shows how organizations can address the requirement of deploying containers, and containerized applications to hybrid and multicloud platforms using Amazon CodeCatalyst. CodeCatalyst is an integrated DevOps service which enables development teams to collaborate on code, and build, test, and deploy applications with continuous integration and continuous delivery (CI/CD) tools.

One prominent scenario where multicloud container deployment is useful is when organizations want to leverage AWS’ broadest and deepest set of Artificial Intelligence (AI) and Machine Learning (ML) capabilities by developing and training AI/ML models in AWS using Amazon SageMaker, and deploying the model package to a Kubernetes platform on other cloud platforms, such as Azure Kubernetes Service (AKS) for inference. As shown in this workshop for operationalizing the machine learning pipeline, we can train an AI/ML model, push it to Amazon Elastic Container Registry (ECR) as an image, and later deploy the model as a container application.

Scenario description

The solution described in the post covers the following steps:

  • Setup Amazon CodeCatalyst environment.
  • Create a Dockerfile along with a manifest for the application, and a repository in Amazon ECR.
  • Create an Azure service principal which has permissions to deploy resources to Azure Kubernetes Service (AKS), and store the credentials securely in Amazon CodeCatalyst secret.
  • Create a CodeCatalyst workflow to build, test, and deploy the containerized application to AKS cluster using Github Actions.

The architecture diagram for the scenario is shown in Figure 1.

Solution architecture diagram

Figure 1 – Solution Architecture

Solution Walkthrough

This section shows how to set up the environment, and deploy a HTML application to an AKS cluster.

Setup Amazon ECR and GitHub code repository

Create a new Amazon ECR and a code repository. In this case we’re using GitHub as the repository but you can create a source repository in CodeCatalyst or you can choose to link an existing source repository hosted by another service if that service is supported by an installed extension. Then follow the application and Docker image creation steps outlined in Step 1 in the environment creation process in exposing Multiple Applications on Amazon EKS. Create a file named manifest.yaml as shown, and map the “image” parameter to the URL of the Amazon ECR repository created above.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: multicloud-container-deployment-app
  labels:
    app: multicloud-container-deployment-app
spec:
  selector:
    matchLabels:
      app: multicloud-container-deployment-app
  replicas: 2
  template:
    metadata:
      labels:
        app: multicloud-container-deployment-app
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: ecs-web-page-container
        image: <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/<my_repository>
        imagePullPolicy: Always
        ports:
            - containerPort: 80
        resources:
          limits:
            memory: "100Mi"
            cpu: "200m"
      imagePullSecrets:
          - name: ecrsecret
---
apiVersion: v1
kind: Service
metadata:
  name: multicloud-container-deployment-service
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: multicloud-container-deployment-app

Push the files to Github code repository. The multicloud-container-app github repository should look similar to Figure 2 below

Files in multicloud container app github repository 

Figure 2 – Files in Github repository

Configure Azure Kubernetes Service (AKS) cluster to pull private images from ECR repository

Pull the docker images from a private ECR repository to your AKS cluster by running the following command. This setup is required during the azure/k8s-deploy Github Actions in the CI/CD workflow. Authenticate Docker to an Amazon ECR registry with get-login-password by using aws ecr get-login-password. Run the following command in a shell where AWS CLI is configured, and is used to connect to the AKS cluster. This creates a secret called ecrsecret, which is used to pull an image from the private ECR repository.

kubectl create secret docker-registry ecrsecret\
 --docker-server=<aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/<my_repository>\
 --docker-username=AWS\
 --docker-password= $(aws ecr get-login-password --region us-west-2)

Provide ECR URI in the variable “–docker-server =”.

CodeCatalyst setup

Follow these steps to set up CodeCatalyst environment:

Configure access to the AKS cluster

In this solution, we use three GitHub Actions – azure/login, azure/aks-set-context and azure/k8s-deploy – to login, set the AKS cluster, and deploy the manifest file to the AKS cluster respectively. For the Github Actions to access the Azure environment, they require credentials associated with an Azure Service Principal.

Service Principals in Azure are identified by the CLIENT_ID, CLIENT_SECRET, SUBSCRIPTION_ID, and TENANT_ID properties. Create the Service principal by running the following command in the azure cloud shell:

az ad sp create-for-rbac \
    --name "ghActionHTMLapplication" \
    --scope /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP> \
    --role Contributor \
    --sdk-auth

The command generates a JSON output (shown in Figure 3), which is stored in CodeCatalyst secret called AZURE_CREDENTIALS. This credential is used by azure/login Github Actions.

JSON output stored in AZURE-CREDENTIALS secret

Figure 3 – JSON output

Configure secrets inside CodeCatalyst Project

Create three secrets CLUSTER_NAME (Name of AKS cluster), RESOURCE_GROUP(Name of Azure resource group) and AZURE_CREDENTIALS(described in the previous step) as described in the working with secret document. The secrets are shown in Figure 4.

Secrets in CodeCatalyst

Figure 4 – CodeCatalyst Secrets

CodeCatalyst CI/CD Workflow

To create a new CodeCatalyst workflow, select CI/CD from the navigation on the left and select Workflows (1). Then, select Create workflow (2), leave the default options, and select Create (3) as shown in Figure 5.

Create CodeCatalyst CI/CD workflow

Figure 5 – Create CodeCatalyst CI/CD workflow

Add “Push to Amazon ECR” Action

Add the Push to Amazon ECR action, and configure the environment where you created the ECR repository as shown in Figure 6. Refer to adding an action to learn how to add CodeCatalyst action.

Create ‘Push to ECR’ CodeCatalyst Action

Figure 6 – Create ‘Push to ECR’ Action

Select the Configuration tab and specify the configurations as shown in Figure7.

Configure ‘Push to ECR’ CodeCatalyst Action

Figure 7 – Configure ‘Push to ECR’ Action

Configure the Deploy action

1. Add a GitHub action for deploying to AKS as shown in Figure 8.

Github action to deploy to AKS

Figure 8 – Github action to deploy to AKS

2. Configure the GitHub action from the configurations tab by adding the following snippet to the GitHub Actions YAML property:

- name: Install Azure CLI
  run: pip install azure-cli
- name: Azure login
  id: login
  uses: azure/[email protected]
  with:
    creds: ${Secrets.AZURE_CREDENTIALS}
- name: Set AKS context
  id: set-context
  uses: azure/aks-set-context@v3
  with:
    resource-group: ${Secrets.RESOURCE_GROUP}
    cluster-name: ${Secrets.CLUSTER_NAME}
- name: Setup kubectl
  id: install-kubectl
  uses: azure/setup-kubectl@v3
- name: Deploy to AKS
  id: deploy-aks
  uses: Azure/k8s-deploy@v4
  with:
    namespace: default
    manifests: manifest.yaml
    pull-images: true

Github action configuration for deploying application to AKS

Figure 9 – Github action configuration

3. The workflow is now ready and can be validated by choosing ‘Validate’ and then saved to the repository by choosing ‘Commit’.
We have implemented an automated CI/CD workflow that builds the container image of the application (refer Figure 10), pushes the image to ECR, and deploys the application to AKS cluster. This CI/CD workflow is triggered as application code is pushed to the repository.

Automated CI/CD workflow

Figure 10 – Automated CI/CD workflow

Test the deployment

When the HTML application runs, Kubernetes exposes the application using a public facing load balancer. To find the external IP of the load balancer, connect to the AKS cluster and run the following command:

kubectl get service multicloud-container-deployment-service

The output of the above command should look like the image in Figure 11.

Output of kubectl get service command

Figure 11 – Output of kubectl get service

Paste the External IP into a browser to see the running HTML application as shown in Figure 12.

HTML application running successfully in AKS

Figure 12 – Application running in AKS

Cleanup

If you have been following along with the workflow described in the post, you should delete the resources you deployed so you do not continue to incur charges. First, delete the Amazon ECR repository using the AWS console. Second, delete the project from CodeCatalyst by navigating to Project settings and choosing Delete project. There’s no cost associated with the CodeCatalyst project and you can continue using it. Finally, if you deployed the application on a new AKS cluster, delete the cluster from the Azure console. In case you deployed the application to an existing AKS cluster, run the following commands to delete the application resources.

kubectl delete deployment multicloud-container-deployment-app
kubectl delete services multicloud-container-deployment-service

Conclusion

In summary, this post showed how Amazon CodeCatalyst can help organizations deploy containerized workloads in a hybrid and multicloud environment. It demonstrated in detail how to set up and configure Amazon CodeCatalyst to deploy a containerized application to Azure Kubernetes Service, leveraging a CodeCatalyst workflow, and GitHub Actions. Learn more and get started with your Amazon CodeCatalyst journey!

If you have any questions or feedback, leave them in the comments section.

About Authors

Picture of Pawan

Pawan Shrivastava

Pawan Shrivastava is a Partner Solution Architect at AWS in the WWPS team. He focusses on working with partners to provide technical guidance on AWS, collaborate with them to understand their technical requirements, and designing solutions to meet their specific needs. Pawan is passionate about DevOps, automation and CI CD pipelines. He enjoys watching MMA, playing cricket and working out in the gym.

Picture of Brent

Brent Van Wynsberge

Brent Van Wynsberge is a Solutions Architect at AWS supporting enterprise customers. He accelerates the cloud adoption journey for organizations by aligning technical objectives to business outcomes and strategic goals, and defining them where needed. Brent is an IoT enthusiast, specifically in the application of IoT in manufacturing, he is also interested in DevOps, data analytics and containers.

Picture of Amandeep

Amandeep Bajwa

Amandeep Bajwa is a Senior Solutions Architect at AWS supporting Financial Services enterprises. He helps organizations achieve their business outcomes by identifying the appropriate cloud transformation strategy based on industry trends, and organizational priorities. Some of the areas Amandeep consults on are cloud migration, cloud strategy (including hybrid & multicloud), digital transformation, data & analytics, and technology in general.

Picture of Brian

Brian Beach

Brian Beach has over 20 years of experience as a Developer and Architect. He is currently a Principal Solutions Architect at Amazon Web Services. He holds a Computer Engineering degree from NYU Poly and an MBA from Rutgers Business School. He is the author of “Pro PowerShell for Amazon Web Services” from Apress. He is a regular author and has spoken at numerous events. Brian lives in North Carolina with his wife and three kids.

How to deploy workloads in a multicloud environment with AWS developer tools

Post Syndicated from Brent Van Wynsberge original https://aws.amazon.com/blogs/devops/how-to-deploy-workloads-in-a-multicloud-environment-with-aws-developer-tools/

As organizations embrace cloud computing as part of “cloud first” strategy, and migrate to the cloud, some of the enterprises end up in a multicloud environment.  We see that enterprise customers get the best experience, performance and cost structure when they choose a primary cloud provider. However, for a variety of reasons, some organizations end up operating in a multicloud environment. For example, in case of mergers & acquisitions, an organization may acquire an entity which runs on a different cloud platform, resulting in the organization operating in a multicloud environment. Another example is in the case where an ISV (Independent Software Vendor) provides services to customers operating on different cloud providers. One more example is the scenario where an organization needs to adhere to data residency and data sovereignty requirements, and ends up with workloads deployed to multiple cloud platforms across locations. Thus, the organization ends up running in a multicloud environment.

In the scenarios described above, one of the challenges organizations face operating such a complex environment is managing release process (building, testing, and deploying applications at scale) across multiple cloud platforms. If an organization’s primary cloud provider is AWS, they may want to continue using AWS developer tools to deploy workloads in other cloud platforms. Organizations facing such scenarios can leverage AWS services to develop their end-to-end CI/CD and release process instead of developing a release pipeline for each platform, which is complex, and not sustainable in the long run.

In this post we show how organizations can continue using AWS developer tools in a hybrid and multicloud environment. We walk the audience through a scenario where we deploy an application to VMs running on-premises and Azure, showcasing AWS’ hybrid and multicloud DevOps capabilities.

Solution and scenario overview

In this post we’re demonstrating the following steps:

  • Setup a CI/CD pipeline using AWS CodePipeline, and show how it’s run when application code is updated, and checked into the code repository (GitHub).
  • Check out application code from the code repository, and use an IDE (Visual Studio Code) to make changes, and check-in the code to the code repository.
  • Check in the modified application code to automatically run the release process built using AWS CodePipeline. It makes use of AWS CodeBuild to retrieve the latest version of code from code repository, compile it, build the deployment package, and test the application.
  • Deploy the updated application to VMs across on-premises, and Azure using AWS CodeDeploy.

The high-level solution is shown below. This post does not show all of the possible combinations and integrations available to build the CI/CD pipeline. As an example, you can integrate the pipeline with your existing tools for test and build such as Selenium, Jenkins, SonarQube etc.

This post focuses on deploying application in a multicloud environment, and how AWS Developer Tools can support virtually any scenario or use case specific to your organization. We will be deploying a sample application from this AWS tutorial to an on-premises server, and an Azure Virtual Machine (VM) running Red Hat Enterprise Linux (RHEL). In future posts in this series, we will cover how you can deploy any type of workload using AWS tools, including containers, and serverless applications.

Architecture Diagram

CI/CD pipeline setup

This section describes instructions for setting up a multicloud CI/CD pipeline.

Note: A key point to note is that the CI/CD pipeline setup, and related sub-sections in this post, are a one-time activity, and you’ll not need to perform these steps every time an application is deployed or modified.

Install CodeDeploy agent

The AWS CodeDeploy agent is a software package that is used to execute deployments on an instance. You can install the CodeDeploy agent on an on-premises server and Azure VM by either using the command line, or AWS Systems Manager.

Setup GitHub code repository

Setup GitHub code repository using the following steps:

  1. Create a new GitHub code repository or use a repository that already exists.
  2. Copy the Sample_App_Linux app (zip) from Amazon S3 as described in Step 3 of Upload a sample application to your GitHub repository tutorial.
  3. Commit the files to code repository
    git add .
    git commit -m 'Initial Commit'
    git push

You will use this repository to deploy your code across environments.

Configure AWS CodePipeline

Follow the steps outlined below to setup and configure CodePipeline to orchestrate the CI/CD pipeline of our application.

  1. Navigate to CodePipeline in the AWS console and click on ‘Create pipeline’
  2. Give your pipeline a name (eg: MyWebApp-CICD) and allow CodePipeline to create a service role on your behalf.
  3. For the source stage, select GitHub (v2) as your source provide and click on the Connect to GitHub button to give CodePipeline access to your git repository.
  4. Create a new GitHub connection and click on the Install a new App button to install the AWS Connector in your GitHub account.
  5. Back in the CodePipeline console select the repository and branch you would like to build and deploy.

Image showing the configured source stage

  1. Now we create the build stage; Select AWS CodeBuild as the build provider.
  2. Click on the ‘Create project’ button to create the project for your build stage, and give your project a name.
  3. Select Ubuntu as the operating system for your managed image, chose the standard runtime and select the ‘aws/codebuild/standard’ image with the latest version.

Image showing the configured environment

  1. In the Buildspec section select “Insert build commands” and click on switch to editor. Enter the following yaml code as your build commands:
version: 0.2
phases:
    build:
        commands:
            - echo "This is a dummy build command"
artifacts:
    files:
        - "*/*"

Note: you can also integrate build commands to your git repository by using a buildspec yaml file. More information can be found at Build specification reference for CodeBuild.

  1. Leave all other options as default and click on ‘Continue to CodePipeline’

Image showing the configured buildspec

  1. Back in the CodePipeline console your Project name will automatically be filled in. You can now continue to the next step.
  2. Click the “Skip deploy stage” button; We will create this in the next section.
  3. Review your changes and click “Create pipeline”. Your newly created pipeline will now build for the first time!

Image showing the first execution of the CI/CD pipeline

Configure AWS CodeDeploy on Azure and on-premises VMs

Now that we have built our application, we want to deploy it to both the environments – Azure, and on-premises. In the “Install CodeDeploy agent” section we’ve already installed the CodeDeploy agent. As a one-time step we now have to give the CodeDeploy agents access to the AWS environment.  You can leverage AWS Identity and Access Management (IAM) Roles Anywhere in combination with the code-deploy-session-helper to give access to the AWS resources needed.
The IAM Role should at least have the AWSCodeDeployFullAccess AWS managed policy and Read only access to the CodePipeline S3 bucket in your account (called codepipeline-<region>-<account-id>) .

For more information on how to setup IAM Roles Anywhere please refer how to extend AWS IAM roles to workloads outside of AWS with IAM Roles Anywhere. Alternative ways to configure access can be found in the AWS CodeDeploy user guide. Follow the steps below for instances you want to configure.

  1. Configure your CodeDeploy agent as described in the user guide. Ensure the AWS Command Line Interface (CLI) is installed on your VM and execute the following command to register the instance with CodeDeploy.
    aws deploy register-on-premises-instance --instance-name <name_for_your_instance> --iam-role-arn <arn_of_your_iam_role>
  1. Tag the instance as follows
    aws deploy add-tags-to-on-premises-instances --instance-names <name_for_your_instance> --tags Key=Application,Value=MyWebApp
  2. You should now see both instances registered in the “CodeDeploy > On-premises instances” panel. You can now deploy application to your Azure VM and on premises VMs!

Image showing the registered instances

Configure AWS CodeDeploy to deploy WebApp

Follow the steps mentioned below to modify the CI/CD pipeline to deploy the application to Azure, and on-premises environments.

  1. Create an IAM role named CodeDeployServiceRole and select CodeDeploy > CodeDeploy as your use case. IAM will automatically select the right policy for you. CodeDeploy will use this role to manage the deployments of your application.
  2. In the AWS console navigate to CodeDeploy > Applications. Click on “Create application”.
  3. Give your application a name and choose “EC2/On-premises” as the compute platform.
  4. Configure the instances we want to deploy to. In the detail view of your application click on “Create deployment group”.
  5. Give your deployment group a name and select the CodeDeployServiceRole.
  6. In the environment configuration section choose On-premises Instances.
  7. Configure the Application, MyWebApp key value pair.
  8. Disable load balancing and leave all other options default.
  9. Click on create deployment group. You should now see your newly created deployment group.

Image showing the created CodeDeploy Application and Deployment group

  1. We can now edit our pipeline to deploy to the newly created deployment group.
  2. Navigate to your previously created Pipeline in the CodePipeline section and click edit. Add the deploy stage by clicking on Add stage and name it Deploy. Aftewards click Add action.
  3. Name your action and choose CodeDeploy as your action provider.
  4. Select “BuildArtifact” as your input artifact and select your newly created application and deployment group.
  5. Click on Done and on Save in your pipeline to confirm the changes. You have now added the deploy step to your pipeline!

Image showing the updated pipeline

This completes the on-time devops pipeline setup, and you will not need to repeat the process.

Automated DevOps pipeline in action

This section demonstrates how the devops pipeline operates end-to-end, and automatically deploys application to Azure VM, and on-premises server when the application code changes.

  1. Click on Release Change to deploy your application for the first time. The release change button manually triggers CodePipeline to update your code. In the next section we will make changes to the repository which triggers the pipeline automatically.
  2. During the “Source” stage your pipeline fetches the latest version from github.
  3. During the “Build” stage your pipeline uses CodeBuild to build your application and generate the deployment artifacts for your pipeline. It uses the buildspec.yml file to determine the build steps.
  4. During the “Deploy” stage your pipeline uses CodeDeploy to deploy the build artifacts to the configured Deployment group – Azure VM and on-premises VM. Navigate to the url of your application to see the results of the deployment process.

Image showing the deployed sample application

 

Update application code in IDE

You can modify the application code using your favorite IDE. In this example we will change the background color and a paragraph of the sample application.

Image showing modifications being made to the file

Once you’ve modified the code, save the updated file followed by pushing the code to the code repository.

git add .
git commit -m "I made changes to the index.html file "
git push

DevOps pipeline (CodePipeline) – compile, build, and test

Once the code is updated, and pushed to GitHub, the DevOps pipeline (CodePipeline) automatically compiles, builds and tests the modified application. You can navigate to your pipeline (CodePipeline) in the AWS Console, and should see the pipeline running (or has recently completed). CodePipeline automatically executes the Build and Deploy steps. In this case we’re not adding any complex logic, but based on your organization’s requirements you can add any build step, or integrate with other tools.

Image showing CodePipeline in action

Deployment process using CodeDeploy

In this section, we describe how the modified application is deployed to the Azure, and on-premises VMs.

  1. Open your pipeline in the CodePipeline console, and click on the “AWS CodeDeploy” link in the Deploy step to navigate to your deployment group. Open the “Deployments” tab.

Image showing application deployment history

  1. Click on the first deployment in the Application deployment history section. This will show the details of your latest deployment.

Image showing deployment lifecycle events for the deployment

  1. In the “Deployment lifecycle events” section click on one of the “View events” links. This shows you the lifecycle steps executed by CodeDeploy and will display the error log output if any of the steps have failed.

Image showing deployment events on instance

  1. Navigate back to your application. You should now see your changes in the application. You’ve successfully set up a multicloud DevOps pipeline!

Image showing a new version of the deployed application

Conclusion

In summary, the post demonstrated how AWS DevOps tools and services can help organizations build a single release pipeline to deploy applications and workloads in a hybrid and multicloud environment. The post also showed how to set up CI/CD pipeline to deploy applications to AWS, on-premises, and Azure VMs.

If you have any questions or feedback, leave them in the comments section.

About the Authors

Picture of Amandeep

Amandeep Bajwa

Amandeep Bajwa is a Senior Solutions Architect at AWS supporting Financial Services enterprises. He helps organizations achieve their business outcomes by identifying the appropriate cloud transformation strategy based on industry trends, and organizational priorities. Some of the areas Amandeep consults on are cloud migration, cloud strategy (including hybrid & multicloud), digital transformation, data & analytics, and technology in general.

Picture of Pawan

Pawan Shrivastava

Pawan Shrivastava is a Partner Solution Architect at AWS in the WWPS team. He focusses on working with partners to provide technical guidance on AWS, collaborate with them to understand their technical requirements, and designing solutions to meet their specific needs. Pawan is passionate about DevOps, automation and CI CD pipelines. He enjoys watching mma, playing cricket and working out in the gym.

Picture of Brent

Brent Van Wynsberge

Brent Van Wynsberge is a Solutions Architect at AWS supporting enterprise customers. He guides organizations in their digital transformation and innovation journey and accelerates cloud adoption. Brent is an IoT enthusiast, specifically in the application of IoT in manufacturing, he is also interested in DevOps, data analytics, containers, and innovative technologies in general.

Picture of Mike

Mike Strubbe

Mike is a Cloud Solutions Architect Manager at AWS with a strong focus on cloud strategy, digital transformation, business value, leadership, and governance. He helps Enterprise customers achieve their business goals through cloud expertise, coupled with strong business acumen skills. Mike is passionate about implementing cloud strategies that enable cloud transformations, increase operational efficiency and drive business value.

Best practices to optimize your Amazon EC2 Spot Instances usage

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/best-practices-to-optimize-your-amazon-ec2-spot-instances-usage/

This blog post is written by Pranaya Anshu, EC2 PMM, and Sid Ambatipudi, EC2 Compute GTM Specialist.

Amazon EC2 Spot Instances are a powerful tool that thousands of customers use to optimize their compute costs. The National Football League (NFL) is an example of customer using Spot Instances, leveraging 4000 EC2 Spot Instances across more than 20 instance types to build its season schedule. By using Spot Instances, it saves 2 million dollars every season! Virtually any organization – small or big – can benefit from using Spot Instances by following best practices.

Overview of Spot Instances

Spot Instances let you take advantage of unused EC2 capacity in the AWS cloud and are available at up to a 90% discount compared to On-Demand prices. Through Spot Instances, you can take advantage of the massive operating scale of AWS and run hyperscale workloads at a significant cost saving. In exchange for these discounts, AWS has the option to reclaim Spot Instances when EC2 requires the capacity. AWS provides a two-minute notification before reclaiming Spot Instances, allowing workloads running on those instances to be gracefully shut down.

In this blog post, we explore four best practices that can help you optimize your Spot Instances usage and minimize the impact of Spot Instances interruptions: diversifying your instances, considering attribute-based instance type selection, leveraging Spot placement scores, and using the price-capacity-optimized allocation strategy. By applying these best practices, you’ll be able to leverage Spot Instances for appropriate workloads and ultimately reduce your compute costs. Note for the purposes of this blog, we will focus on the integration of Spot Instances with Amazon EC2 Auto Scaling groups.

Pre-requisites

Spot Instances can be used for various stateless, fault-tolerant, or flexible applications such as big data, containerized workloads, CI/CD, web servers, high-performance computing (HPC), and AI/ML workloads. However, as previously mentioned, AWS can interrupt Spot Instances with a two-minute notification, so it is best not to use Spot Instances for workloads that cannot handle individual instance interruption — that is, workloads that are inflexible, stateful, fault-intolerant, or tightly coupled.

Best practices

  1. Diversify your instances

The fundamental best practice when using Spot Instances is to be flexible. A Spot capacity pool is a set of unused EC2 instances of the same instance type (for example, m6i.large) within the same AWS Region and Availability Zone (for example, us-east-1a). When you request Spot Instances, you are requesting instances from a specific Spot capacity pool. Since Spot Instances are spare EC2 capacity, you want to base your selection (request) on as many spare pools of capacity as possible in order to increase your likelihood of getting Spot Instances. You should diversify across instance sizes, generations, instance types, and Availability Zones to maximize your savings with Spot Instances. For example, if you are currently using c5a.large in us-east-1a, consider including c6a instances (newer generation of instances), c5a.xl (larger size), or us-east-1b (different Availability Zone) to increase your overall flexibility. Instance diversification is beneficial not only for selecting Spot Instances, but also for scaling, resilience, and cost optimization.

To get hands-on experience with Spot Instances and to practice instance diversification, check out Amazon EC2 Spot Instances workshops. And once you’ve diversified your instances, you can leverage AWS Fault Injection Simulator (AWS FIS) to test your applications’ resilience to Spot Instance interruptions to ensure that they can maintain target capacity while still benefiting from the cost savings offered by Spot Instances. To learn more about stress testing your applications, check out the Back to Basics: Chaos Engineering with AWS Fault Injection Simulator video and AWS FIS documentation.

  1. Consider attribute-based instance type selection

We have established that flexibility is key when it comes to getting the most out of Spot Instances. Similarly, we have said that in order to access your desired Spot Instances capacity, you should select multiple instance types. While building and maintaining instance type configurations in a flexible way may seem daunting or time-consuming, it doesn’t have to be if you use attribute-based instance type selection. With attribute-based instance type selection, you can specify instance attributes — for example, CPU, memory, and storage — and EC2 Auto Scaling will automatically identify and launch instances that meet your defined attributes. This removes the manual-lift of configuring and updating instance types. Moreover, this selection method enables you to automatically use newly released instance types as they become available so that you can continuously have access to an increasingly broad range of Spot Instance capacity. Attribute-based instance type selection is ideal for workloads and frameworks that are instance agnostic, such as HPC and big data workloads, and can help to reduce the work involved with selecting specific instance types to meet specific requirements.

For more information on how to configure attribute-based instance selection for your EC2 Auto Scaling group, refer to Create an Auto Scaling Group Using Attribute-Based Instance Type Selection documentation. To learn more about attribute-based instance type selection, read the Attribute-Based Instance Type Selection for EC2 Auto Scaling and EC2 Fleet news blog or check out the Using Attribute-Based Instance Type Selection and Mixed Instance Groups section of the Launching Spot Instances workshop.

  1. Leverage Spot placement scores

Now that we’ve stressed the importance of flexibility when it comes to Spot Instances and covered the best way to select instances, let’s dive into how to find preferred times and locations to launch Spot Instances. Because Spot Instances are unused EC2 capacity, Spot Instances capacity fluctuates. Correspondingly, it is possible that you won’t always get the exact capacity at a specific time that you need through Spot Instances. Spot placement scores are a feature of Spot Instances that indicates how likely it is that you will be able to get the Spot capacity that you require in a specific Region or Availability Zone. Your Spot placement score can help you reduce Spot Instance interruptions, acquire greater capacity, and identify optimal configurations to run workloads on Spot Instances. However, it is important to note that Spot placement scores serve only as point-in-time recommendations (scores can vary depending on current capacity) and do not provide any guarantees in terms of available capacity or risk of interruption.  To learn more about how Spot placement scores work and to get started with them, see the Identifying Optimal Locations for Flexible Workloads With Spot Placement Score blog and Spot placement scores documentation.

As a near real-time tool, Spot placement scores are often integrated into deployment automation. However, because of its logging and graphic capabilities, you may find it to be a valuable resource even before you launch a workload in the cloud. If you are looking to understand historical Spot placement scores for your workload, you should check out the Spot placement score tracker, a tool that automates the capture of Spot placement scores and stores Spot placement score metrics in Amazon CloudWatch. The tracker is available through AWS Labs, a GitHub repository hosting tools. Learn more about the tracker through the Optimizing Amazon EC2 Spot Instances with Spot Placement Scores blog.

When considering ideal times to launch Spot Instances and exploring different options via Spot placement scores, be sure to consider running Spot Instances at off-peak hours – or hours when there is less demand for EC2 Instances. As you may assume, there is less unused capacity – Spot Instances – available during typical business hours than after business hours. So, in order to leverage as much Spot capacity as you can, explore the possibility of running your workload at hours when there is reduced demand for EC2 instances and thus greater availability of Spot Instances. Similarly, consider running your Spot Instances in “off-peak Regions” – or Regions that are not experiencing business hours at that certain time.

On a related note, to maximize your usage of Spot Instances, you should consider using previous generation of instances if they meet your workload needs. This is because, as with off-peak vs peak hours, there is typically greater capacity available for previous generation instances than current generation instances, as most people tend to use current generation instances for their compute needs.

  1. Use the price-capacity-optimized allocation strategy

Once you’ve selected a diversified and flexible set of instances, you should select your allocation strategy. When launching instances, your Auto Scaling group uses the allocation strategy that you specify to pick the specific Spot pools from all your possible pools. Spot offers four allocation strategies: price-capacity-optimized, capacity-optimized, capacity-optimized-prioritized, and lowest-price. Each of these allocation strategies select Spot Instances in pools based on price, capacity, a prioritized list of instances, or a combination of these factors.

The price-capacity-optimized strategy launched in November 2022. This strategy makes Spot Instance allocation decisions based on the most capacity at the lowest price. It essentially enables Auto Scaling groups to identify the Spot pools with the highest capacity availability for the number of instances that are launching. In other words, if you select this allocation strategy, we will find the Spot capacity pools that we believe have the lowest chance of interruption in the near term. Your Auto Scaling groups then request Spot Instances from the lowest priced of these pools.

We recommend you leverage the price-capacity-optimized allocation strategy for the majority of your workloads that run on Spot Instances. To see how the price-capacity-optimized allocation strategy selects Spot Instances in comparison with lowest-price and capacity-optimized allocation strategies, read the Introducing the Price-Capacity-Optimized Allocation Strategy for EC2 Spot Instances blog post.

Clean-up

If you’ve explored the different Spot Instances workshops we recommended throughout this blog post and spun up resources, please remember to delete resources that you are no longer using to avoid incurring future costs.

Conclusion

Spot Instances can be leveraged to reduce costs across a wide-variety of use cases, including containers, big data, machine learning, HPC, and CI/CD workloads. In this blog, we discussed four Spot Instances best practices that can help you optimize your Spot Instance usage to maximize savings: diversifying your instances, considering attribute-based instance type selection, leveraging Spot placement scores, and using the price-capacity-optimized allocation strategy.

To learn more about Spot Instances, check out Spot Instances getting started resources. Or to learn of other ways of reducing costs and improving performance, including leveraging other flexible purchase models such as AWS Savings Plans, read the Increase Your Application Performance at Lower Costs eBook or watch the Seven Steps to Lower Costs While Improving Application Performance webinar.

Multi-branch pipeline management and infrastructure deployment using AWS CDK Pipelines

Post Syndicated from Iris Kraja original https://aws.amazon.com/blogs/devops/multi-branch-pipeline-management-and-infrastructure-deployment-using-aws-cdk-pipelines/

This post describes how to use the AWS CDK Pipelines module to follow a Gitflow development model using AWS Cloud Development Kit (AWS CDK). Software development teams often follow a strict branching strategy during a solutions development lifecycle. Newly-created branches commonly need their own isolated copy of infrastructure resources to develop new features.

CDK Pipelines is a construct library module for continuous delivery of AWS CDK applications. CDK Pipelines are self-updating: if you add application stages or stacks, then the pipeline automatically reconfigures itself to deploy those new stages and/or stacks.

The following solution creates a new AWS CDK Pipeline within a development account for every new branch created in the source repository (AWS CodeCommit). When a branch is deleted, the pipeline and all related resources are also destroyed from the account. This GitFlow model for infrastructure provisioning allows developers to work independently from each other, concurrently, even in the same stack of the application.

Solution overview

The following diagram provides an overview of the solution. There is one default pipeline responsible for deploying resources to the different application environments (e.g., Development, Pre-Prod, and Prod). The code is stored in CodeCommit. When new changes are pushed to the default CodeCommit repository branch, AWS CodePipeline runs the default pipeline. When the default pipeline is deployed, it creates two AWS Lambda functions.

These two Lambda functions are invoked by CodeCommit CloudWatch events when a new branch in the repository is created or deleted. The Create Lambda function uses the boto3 CodeBuild module to create an AWS CodeBuild project that builds the pipeline for the feature branch. This feature pipeline consists of a build stage and an optional update pipeline stage for itself. The Destroy Lambda function creates another CodeBuild project which cleans all of the feature branch’s resources and the feature pipeline.

Figure 1. Architecture diagram.

Figure 1. Architecture diagram.

Prerequisites

Before beginning this walkthrough, you should have the following prerequisites:

  • An AWS account
  • AWS CDK installed
  • Python3 installed
  • Jq (JSON processor) installed
  • Basic understanding of continuous integration/continuous development (CI/CD) Pipelines

Initial setup

Download the repository from GitHub:

# Command to clone the repository
git clone https://github.com/aws-samples/multi-branch-cdk-pipelines.git
cd multi-branch-cdk-pipelines

Create a new CodeCommit repository in the AWS Account and region where you want to deploy the pipeline and upload the source code from above to this repository. In the config.ini file, change the repository_name and region variables accordingly.

Make sure that you set up a fresh Python environment. Install the dependencies:

pip install -r requirements.txt

Run the initial-deploy.sh script to bootstrap the development and production environments and to deploy the default pipeline. You’ll be asked to provide the following parameters: (1) Development account ID, (2) Development account AWS profile name, (3) Production account ID, and (4) Production account AWS profile name.

sh ./initial-deploy.sh --dev_account_id <YOUR DEV ACCOUNT ID> --
dev_profile_name <YOUR DEV PROFILE NAME> --prod_account_id <YOUR PRODUCTION
ACCOUNT ID> --prod_profile_name <YOUR PRODUCTION PROFILE NAME>

Default pipeline

In the CI/CD pipeline, we set up an if condition to deploy the default branch resources only if the current branch is the default one. The default branch is retrieved programmatically from the CodeCommit repository. We deploy an Amazon Simple Storage Service (Amazon S3) Bucket and two Lambda functions. The bucket is responsible for storing the feature branches’ CodeBuild artifacts. The first Lambda function is triggered when a new branch is created in CodeCommit. The second one is triggered when a branch is deleted.

if branch == default_branch:
    
...

    # Artifact bucket for feature AWS CodeBuild projects
    artifact_bucket = Bucket(
        self,
        'BranchArtifacts',
        encryption=BucketEncryption.KMS_MANAGED,
        removal_policy=RemovalPolicy.DESTROY,
        auto_delete_objects=True
    )
...
    # AWS Lambda function triggered upon branch creation
    create_branch_func = aws_lambda.Function(
        self,
        'LambdaTriggerCreateBranch',
        runtime=aws_lambda.Runtime.PYTHON_3_8,
        function_name='LambdaTriggerCreateBranch',
        handler='create_branch.handler',
        code=aws_lambda.Code.from_asset(path.join(this_dir, 'code')),
        environment={
            "ACCOUNT_ID": dev_account_id,
            "CODE_BUILD_ROLE_ARN": iam_stack.code_build_role.role_arn,
            "ARTIFACT_BUCKET": artifact_bucket.bucket_name,
            "CODEBUILD_NAME_PREFIX": codebuild_prefix
        },
        role=iam_stack.create_branch_role)


    # AWS Lambda function triggered upon branch deletion
    destroy_branch_func = aws_lambda.Function(
        self,
        'LambdaTriggerDestroyBranch',
        runtime=aws_lambda.Runtime.PYTHON_3_8,
        function_name='LambdaTriggerDestroyBranch',
        handler='destroy_branch.handler',
        role=iam_stack.delete_branch_role,
        environment={
            "ACCOUNT_ID": dev_account_id,
            "CODE_BUILD_ROLE_ARN": iam_stack.code_build_role.role_arn,
            "ARTIFACT_BUCKET": artifact_bucket.bucket_name,
            "CODEBUILD_NAME_PREFIX": codebuild_prefix,
            "DEV_STAGE_NAME": f'{dev_stage_name}-{dev_stage.main_stack_name}'
        },
        code=aws_lambda.Code.from_asset(path.join(this_dir,
                                                  'code')))

Then, the CodeCommit repository is configured to trigger these Lambda functions based on two events:

(1) Reference created

# Configure AWS CodeCommit to trigger the Lambda function when a new branch is created
repo.on_reference_created(
    'BranchCreateTrigger',
    description="AWS CodeCommit reference created event.",
    target=aws_events_targets.LambdaFunction(create_branch_func))

(2) Reference deleted

# Configure AWS CodeCommit to trigger the Lambda function when a branch is deleted
repo.on_reference_deleted(
    'BranchDeleteTrigger',
    description="AWS CodeCommit reference deleted event.",
    target=aws_events_targets.LambdaFunction(destroy_branch_func))

Lambda functions

The two Lambda functions build and destroy application environments mapped to each feature branch. An Amazon CloudWatch event triggers the LambdaTriggerCreateBranch function whenever a new branch is created. The CodeBuild client from boto3 creates the build phase and deploys the feature pipeline.

Create function

The create function deploys a feature pipeline which consists of a build stage and an optional update pipeline stage for itself. The pipeline downloads the feature branch code from the CodeCommit repository, initiates the Build and Test action using CodeBuild, and securely saves the built artifact on the S3 bucket.

The Lambda function handler code is as follows:

def handler(event, context):
    """Lambda function handler"""
    logger.info(event)

    reference_type = event['detail']['referenceType']

    try:
        if reference_type == 'branch':
            branch = event['detail']['referenceName']
            repo_name = event['detail']['repositoryName']

            client.create_project(
                name=f'{codebuild_name_prefix}-{branch}-create',
                description="Build project to deploy branch pipeline",
                source={
                    'type': 'CODECOMMIT',
                    'location': f'https://git-codecommit.{region}.amazonaws.com/v1/repos/{repo_name}',
                    'buildspec': generate_build_spec(branch)
                },
                sourceVersion=f'refs/heads/{branch}',
                artifacts={
                    'type': 'S3',
                    'location': artifact_bucket_name,
                    'path': f'{branch}',
                    'packaging': 'NONE',
                    'artifactIdentifier': 'BranchBuildArtifact'
                },
                environment={
                    'type': 'LINUX_CONTAINER',
                    'image': 'aws/codebuild/standard:4.0',
                    'computeType': 'BUILD_GENERAL1_SMALL'
                },
                serviceRole=role_arn
            )

            client.start_build(
                projectName=f'CodeBuild-{branch}-create'
            )
    except Exception as e:
        logger.error(e)

Create branch CodeBuild project’s buildspec.yaml content:

version: 0.2
env:
  variables:
    BRANCH: {branch}
    DEV_ACCOUNT_ID: {account_id}
    PROD_ACCOUNT_ID: {account_id}
    REGION: {region}
phases:
  pre_build:
    commands:
      - npm install -g aws-cdk && pip install -r requirements.txt
  build:
    commands:
      - cdk synth
      - cdk deploy --require-approval=never
artifacts:
  files:
    - '**/*'

Destroy function

The second Lambda function is responsible for the destruction of a feature branch’s resources. Upon the deletion of a feature branch, an Amazon CloudWatch event triggers this Lambda function. The function creates a CodeBuild Project which destroys the feature pipeline and all of the associated resources created by that pipeline. The source property of the CodeBuild Project is the feature branch’s source code saved as an artifact in Amazon S3.

The Lambda function handler code is as follows:

def handler(event, context):
    logger.info(event)
    reference_type = event['detail']['referenceType']

    try:
        if reference_type == 'branch':
            branch = event['detail']['referenceName']
            client.create_project(
                name=f'{codebuild_name_prefix}-{branch}-destroy',
                description="Build project to destroy branch resources",
                source={
                    'type': 'S3',
                    'location': f'{artifact_bucket_name}/{branch}/CodeBuild-{branch}-create/',
                    'buildspec': generate_build_spec(branch)
                },
                artifacts={
                    'type': 'NO_ARTIFACTS'
                },
                environment={
                    'type': 'LINUX_CONTAINER',
                    'image': 'aws/codebuild/standard:4.0',
                    'computeType': 'BUILD_GENERAL1_SMALL'
                },
                serviceRole=role_arn
            )

            client.start_build(
                projectName=f'CodeBuild-{branch}-destroy'
            )

            client.delete_project(
                name=f'CodeBuild-{branch}-destroy'
            )

            client.delete_project(
                name=f'CodeBuild-{branch}-create'
            )
    except Exception as e:
        logger.error(e)

Destroy the branch CodeBuild project’s buildspec.yaml content:

version: 0.2
env:
  variables:
    BRANCH: {branch}
    DEV_ACCOUNT_ID: {account_id}
    PROD_ACCOUNT_ID: {account_id}
    REGION: {region}
phases:
  pre_build:
    commands:
      - npm install -g aws-cdk && pip install -r requirements.txt
  build:
    commands:
      - cdk destroy cdk-pipelines-multi-branch-{branch} --force
      - aws cloudformation delete-stack --stack-name {dev_stage_name}-{branch}
      - aws s3 rm s3://{artifact_bucket_name}/{branch} --recursive

Create a feature branch

On your machine’s local copy of the repository, create a new feature branch using the following git commands. Replace user-feature-123 with a unique name for your feature branch. Note that this feature branch name must comply with the CodePipeline naming restrictions, as it will be used to name a unique pipeline later in this walkthrough.

# Create the feature branch
git checkout -b user-feature-123
git push origin user-feature-123

The first Lambda function will deploy the CodeBuild project, which then deploys the feature pipeline. This can take a few minutes. You can log in to the AWS Console and see the CodeBuild project running under CodeBuild.

Figure 2. AWS Console - CodeBuild projects.

Figure 2. AWS Console – CodeBuild projects.

After the build is successfully finished, you can see the deployed feature pipeline under CodePipelines.

Figure 3. AWS Console - CodePipeline pipelines.

Figure 3. AWS Console – CodePipeline pipelines.

The Lambda S3 trigger project from AWS CDK Samples is used as the infrastructure resources to demonstrate this solution. The content is placed inside the src directory and is deployed by the pipeline. When visiting the Lambda console page, you can see two functions: one by the default pipeline and one by our feature pipeline.

Figure 4. AWS Console - Lambda functions.

Figure 4. AWS Console – Lambda functions.

Destroy a feature branch

There are two common ways for removing feature branches. The first one is related to a pull request, also known as a “PR”. This occurs when merging a feature branch back into the default branch. Once it’s merged, the feature branch will be automatically closed. The second way is to delete the feature branch explicitly by running the following git commands:

# delete branch local
git branch -d user-feature-123

# delete branch remote
git push origin --delete user-feature-123

The CodeBuild project responsible for destroying the feature resources is now triggered. You can see the project’s logs while the resources are being destroyed in CodeBuild, under Build history.

Figure 5. AWS Console - CodeBuild projects.

Figure 5. AWS Console – CodeBuild projects.

Cleaning up

To avoid incurring future charges, log into the AWS console of the different accounts you used, go to the AWS CloudFormation console of the Region(s) where you chose to deploy, and select and click Delete on the main and branch stacks.

Conclusion

This post showed how you can work with an event-driven strategy and AWS CDK to implement a multi-branch pipeline flow using AWS CDK Pipelines. The described solutions leverage Lambda and CodeBuild to provide a dynamic orchestration of resources for multiple branches and pipelines.
For more information on CDK Pipelines and all the ways it can be used, see the CDK Pipelines reference documentation.

About the authors:

Iris Kraja

Iris is a Cloud Application Architect at AWS Professional Services based in New York City. She is passionate about helping customers design and build modern AWS cloud native solutions, with a keen interest in serverless technology, event-driven architectures and DevOps.  Outside of work, she enjoys hiking and spending as much time as possible in nature.

Jan Bauer

Jan is a Cloud Application Architect at AWS Professional Services. His interests are serverless computing, machine learning, and everything that involves cloud computing.

Rolando Santamaria Maso

Rolando is a senior cloud application development consultant at AWS Professional Services, based in Germany. He helps customers migrate and modernize workloads in the AWS Cloud, with a special focus on modern application architectures and development best practices, but he also creates IaC using AWS CDK. Outside work, he maintains open-source projects and enjoys spending time with family and friends.

Caroline Gluck

Caroline is an AWS Cloud application architect based in New York City, where she helps customers design and build cloud native data science applications. Caroline is a builder at heart, with a passion for serverless architecture and machine learning. In her spare time, she enjoys traveling, cooking, and spending time with family and friends.

Build, Test and Deploy ETL solutions using AWS Glue and AWS CDK based CI/CD pipelines

Post Syndicated from Puneet Babbar original https://aws.amazon.com/blogs/big-data/build-test-and-deploy-etl-solutions-using-aws-glue-and-aws-cdk-based-ci-cd-pipelines/

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development. It’s serverless, so there’s no infrastructure to set up or manage.

This post provides a step-by-step guide to build a continuous integration and continuous delivery (CI/CD) pipeline using AWS CodeCommit, AWS CodeBuild, and AWS CodePipeline to define, test, provision, and manage changes of AWS Glue based data pipelines using the AWS Cloud Development Kit (AWS CDK).

The AWS CDK is an open-source software development framework for defining cloud infrastructure as code using familiar programming languages and provisioning it through AWS CloudFormation. It provides you with high-level components called constructs that preconfigure cloud resources with proven defaults, cutting down boilerplate code and allowing for faster development in a safe, repeatable manner.

Solution overview

The solution constructs a CI/CD pipeline with multiple stages. The CI/CD pipeline constructs a data pipeline using COVID-19 Harmonized Data managed by Talend / Stitch. The data pipeline crawls the datasets provided by neherlab from the public Amazon Simple Storage Service (Amazon S3) bucket, exposes the public datasets in the AWS Glue Data Catalog so they’re available for SQL queries using Amazon Athena, performs ETL (extract, transform, and load) transformations to denormalize the datasets to a table, and makes the denormalized table available in the Data Catalog.

The solution is designed as follows:

  • A data engineer deploys the initial solution. The solution creates two stacks:
    • cdk-covid19-glue-stack-pipeline – This stack creates the CI/CD infrastructure as shown in the architectural diagram (labeled Tool Chain).
    • cdk-covid19-glue-stack – The cdk-covid19-glue-stack-pipeline stack deploys the cdk-covid19-glue-stack stack to create the AWS Glue based data pipeline as shown in the diagram (labeled ETL).
  • The data engineer makes changes on cdk-covid19-glue-stack (when a change in the ETL application is required).
  • The data engineer pushes the change to a CodeCommit repository (generated in the cdk-covid19-glue-stack-pipeline stack).
  • The pipeline is automatically triggered by the push, and deploys and updates all the resources in the cdk-covid19-glue-stack stack.

At the time of publishing of this post, the AWS CDK has two versions of the AWS Glue module: @aws-cdk/aws-glue and @aws-cdk/aws-glue-alpha, containing L1 constructs and L2 constructs, respectively. At this time, the @aws-cdk/aws-glue-alpha module is still in an experimental stage. We use the stable @aws-cdk/aws-glue module for the purpose of this post.

The following diagram shows all the components in the solution.

BDB-2467-architecture-diagram

Figure 1 – Architecture diagram

The data pipeline consists of an AWS Glue workflow, triggers, jobs, and crawlers. The AWS Glue job uses an AWS Identity and Access Management (IAM) role with appropriate permissions to read and write data to an S3 bucket. AWS Glue crawlers crawl the data available in the S3 bucket, update the AWS Glue Data Catalog with the metadata, and create tables. You can run SQL queries on these tables using Athena. For ease of identification, we followed the naming convention for triggers to start with t_*, crawlers with c_*, and jobs with j_*. A CI/CD pipeline based on CodeCommit, CodeBuild, and CodePipeline builds, tests and deploys the solution. The complete infrastructure is created using the AWS CDK.

The following table lists the tables created by this solution that you can query using Athena.

Table Name Description Dataset Location Access Location
neherlab_case_counts Total number of cases s3://covid19-harmonized-dataset/covid19tos3/neherlab_case_counts/ Read Public
neherlab_country_codes Country code s3://covid19-harmonized-dataset/covid19tos3/neherlab_country_codes/ Read Public
neherlab_icu_capacity Intensive Care Unit (ICU) capacity s3://covid19-harmonized-dataset/covid19tos3/neherlab_icu_capacity/ Read Public
neherlab_population Population s3://covid19-harmonized-dataset/covid19tos3/neherlab_population/ Read Public
neherla_denormalized Denormalized table that combines all the preceding tables into one table s3://<your-S3-bucket-name>/neherlab_denormalized Read/Write Reader’s AWS account

Anatomy of the AWS CDK application

In this section, we visit key concepts and anatomy of the AWS CDK application, review the important sections of the code, and discuss how the AWS CDK reduces complexity of the solution as compared to AWS CloudFormation.

An AWS CDK app defines one or more stacks. Stacks (equivalent to CloudFormation stacks) contain constructs, each of which defines one or more concrete AWS resources. Each stack in the AWS CDK app is associated with an environment. An environment is the target AWS account ID and Region into which the stack is intended to be deployed.

In the AWS CDK, the top-most object is the AWS CDK app, which contains multiple stacks vs. the top-level stack in AWS CloudFormation. Given this difference, you can define all the stacks required for the application in the AWS CDK app. In AWS Glue based ETL projects, developers need to define multiple data pipelines by subject area or business logic. In AWS CloudFormation, we can achieve this by writing multiple CloudFormation stacks and often deploy them independently. In some cases, developers write nested stacks, which over time becomes very large and complicated to maintain. In the AWS CDK, all stacks are deployed from the AWS CDK app, increasing modularity of the code and allowing developers to identify all the data pipelines associated with an application easily.

Our AWS CDK application consists of four main files:

  • app.py – This is the AWS CDK app and the entry point for the AWS CDK application
  • pipeline.py – The pipeline.py stack, invoked by app.py, creates the CI/CD pipeline
  • etl/infrastructure.py – The etl/infrastructure.py stack, invoked by pipeline.py, creates the AWS Glue based data pipeline
  • default-config.yaml – The configuration file contains the AWS account ID and Region.

The AWS CDK application reads the configuration from the default-config.yaml file, sets the environment information (AWS account ID and Region), and invokes the PipelineCDKStack class in pipeline.py. Let’s break down the preceding line and discuss the benefits of this design.

For every application, we want to deploy in pre-production environments and a production environment. The application in all the environments will have different configurations, such as the size of the deployed resources. In the AWS CDK, every stack has a property called env, which defines the stack’s target environment. This property receives the AWS account ID and Region for the given stack.

Lines 26–34 in app.py show the aforementioned details:

# Initiating the CodePipeline stack
PipelineCDKStack(
app,
"PipelineCDKStack",
config=config,
env=env,
stack_name=config["codepipeline"]["pipelineStackName"]
)

The env=env line sets the target AWS account ID and Region for PipelieCDKStack. This design allows an AWS CDK app to be deployed in multiple environments at once and increases the parity of the application in all environment. For our example, if we want to deploy PipelineCDKStack in multiple environments, such as development, test, and production, we simply call the PipelineCDKStack stack after populating the env variable appropriately with the target AWS account ID and Region. This was more difficult in AWS CloudFormation, where developers usually needed to deploy the stack for each environment individually. The AWS CDK also provides features to pass the stage at the command line. We look into this option and usage in the later section.

Coming back to the AWS CDK application, the PipelineCDKStack class in pipeline.py uses the aws_cdk.pipeline construct library to create continuous delivery of AWS CDK applications. The AWS CDK provides multiple opinionated construct libraries like aws_cdk.pipeline to reduce boilerplate code from an application. The pipeline.py file creates the CodeCommit repository, populates the repository with the sample code, and creates a pipeline with the necessary AWS CDK stages for CodePipeline to run the CdkGlueBlogStack class from the etl/infrastructure.py file.

Line 99 in pipeline.py invokes the CdkGlueBlogStack class.

The CdkGlueBlogStack class in etl/infrastructure.py creates the crawlers, jobs, database, triggers, and workflow to provision the AWS Glue based data pipeline.

Refer to line 539 for creating a crawler using the CfnCrawler construct, line 564 for creating jobs using the CfnJob construct, and line 168 for creating the workflow using the CfnWorkflow construct. We use the CfnTrigger construct to stitch together multiple triggers to create the workflow. The AWS CDK L1 constructs expose all the available AWS CloudFormation resources and entities using methods from popular programing languages. This allows developers to use popular programing languages to provision resources instead of working with JSON or YAML files in AWS CloudFormation.

Refer to etl/infrastructure.py for additional details.

Walkthrough of the CI/CD pipeline

In this section, we walk through the various stages of the CI/CD pipeline. Refer to CDK Pipelines: Continuous delivery for AWS CDK applications for additional information.

  • Source – This stage fetches the source of the AWS CDK app from the CodeCommit repo and triggers the pipeline every time a new commit is made.
  • Build – This stage compiles the code (if necessary), runs the tests, and performs a cdk synth. The output of the step is a cloud assembly, which is used to perform all the actions in the rest of the pipeline. The pytest is run using the amazon/aws-glue-libs:glue_libs_3.0.0_image_01 Docker image. This image comes with all the required libraries to run tests for AWS Glue version 3.0 jobs using a Docker container. Refer to Develop and test AWS Glue version 3.0 jobs locally using a Docker container for additional information.
  • UpdatePipeline – This stage modifies the pipeline if necessary. For example, if the code is updated to add a new deployment stage to the pipeline or add a new asset to your application, the pipeline is automatically updated to reflect the changes.
  • Assets – This stage prepares and publishes all AWS CDK assets of the app to Amazon S3 and all Docker images to Amazon Elastic Container Registry (Amazon ECR). When the AWS CDK deploys an app that references assets (either directly by the app code or through a library), the AWS CDK CLI first prepares and publishes the assets to Amazon S3 using a CodeBuild job. This AWS Glue solution creates four assets.
  • CDKGlueStage – This stage deploys the assets to the AWS account. In this case, the pipeline deploys the AWS CDK template etl/infrastructure.py to create all the AWS Glue artifacts.

Code

The code can be found at AWS Samples on GitHub.

Prerequisites

This post assumes you have the following:

Deploy the solution

To deploy the solution, complete the following steps:

  • Download the source code from the AWS Samples GitHub repository to the client machine:
$ git clone [email protected]:aws-samples/aws-glue-cdk-cicd.git
  • Create the virtual environment:
$ cd aws-glue-cdk-cicd 
$ python3 -m venv .venv

This step creates a Python virtual environment specific to the project on the client machine. We use a virtual environment in order to isolate the Python environment for this project and not install software globally.

  • Activate the virtual environment according to your OS:
    • On MacOS and Linux, use the following code:
$ source .venv/bin/activate
    • On a Windows platform, use the following code:
% .venv\Scripts\activate.bat

After this step, the subsequent steps run within the bounds of the virtual environment on the client machine and interact with the AWS account as needed.

  • Install the required dependencies described in requirements.txt to the virtual environment:
$ pip install -r requirements.txt
  • Bootstrap the AWS CDK app:
cdk bootstrap

This step populates a given environment (AWS account ID and Region) with resources required by the AWS CDK to perform deployments into the environment. Refer to Bootstrapping for additional information. At this step, you can see the CloudFormation stack CDKToolkit on the AWS CloudFormation console.

  • Synthesize the CloudFormation template for the specified stacks:
$ cdk synth # optional if not default (-c stage=default)

You can verify the CloudFormation templates to identify the resources to be deployed in the next step.

  • Deploy the AWS resources (CI/CD pipeline and AWS Glue based data pipeline):
$ cdk deploy # optional if not default (-c stage=default)

At this step, you can see CloudFormation stacks cdk-covid19-glue-stack-pipeline and cdk-covid19-glue-stack on the AWS CloudFormation console. The cdk-covid19-glue-stack-pipeline stack gets deployed first, which in turn deploys cdk-covid19-glue-stack to create the AWS Glue pipeline.

Verify the solution

When all the previous steps are complete, you can check for the created artifacts.

CloudFormation stacks

You can confirm the existence of the stacks on the AWS CloudFormation console. As shown in the following screenshot, the CloudFormation stacks have been created and deployed by cdk bootstrap and cdk deploy.

BDB-2467-cloudformation-stacks

Figure 2 – AWS CloudFormation stacks

CodePipeline pipeline

On the CodePipeline console, check for the cdk-covid19-glue pipeline.

BDB-2467-code-pipeline-summary

Figure 3 – AWS CodePipeline summary view

You can open the pipeline for a detailed view.

BDB-2467-code-pipeline-detailed

Figure 4 – AWS CodePipeline detailed view

AWS Glue workflow

To validate the AWS Glue workflow and its components, complete the following steps:

  • On the AWS Glue console, choose Workflows in the navigation pane.
  • Confirm the presence of the Covid_19 workflow.
BDB-2467-glue-workflow-summary

Figure 5 – AWS Glue Workflow summary view

You can select the workflow for a detailed view.

BDB-2467-glue-workflow-detailed

Figure 6 – AWS Glue Workflow detailed view

  • Choose Triggers in the navigation pane and check for the presence of seven t-* triggers.
BDB-2467-glue-triggers

Figure 7 – AWS Glue Triggers

  • Choose Jobs in the navigation pane and check for the presence of three j_* jobs.
BDB-2467-glue-jobs

Figure 8 – AWS Glue Jobs

The jobs perform the following tasks:

    • etlScripts/j_emit_start_event.py – A Python job that starts the workflow and creates the event
    • etlScripts/j_neherlab_denorm.py – A Spark ETL job to transform the data and create a denormalized view by combining all the base data together in Parquet format
    • etlScripts/j_emit_ended_event.py – A Python job that ends the workflow and creates the specific event
  • Choose Crawlers in the navigation pane and check for the presence of five neherlab-* crawlers.
BDB-2467-glue-crawlers

Figure 9 – AWS Glue Crawlers

Execute the solution

  • The solution creates a scheduled AWS Glue workflow which runs at 10:00 AM UTC on day 1 of every month. A scheduled workflow can also be triggered on-demand. For the purpose of this post, we will execute the workflow on-demand using the following command from the AWS CLI. If the workflow is successfully started, the command returns the run ID. For instructions on how to run and monitor a workflow in Amazon Glue, refer to Running and monitoring a workflow in Amazon Glue.
aws glue start-workflow-run --name Covid_19
  • You can verify the status of a workflow run by execution the following command from the AWS CLI. Please use the run ID returned from the above command. A successfully executed Covid_19 workflow should return a value of 7 for SucceededActions  and 0 for FailedActions.
aws glue get-workflow-run --name Covid_19 --run-id <run_ID>
  • A sample output of the above command is provided below.
{
"Run": {
"Name": "Covid_19",
"WorkflowRunId": "wr_c8855e82ab42b2455b0e00cf3f12c81f957447abd55a573c087e717f54a4e8be",
"WorkflowRunProperties": {},
"StartedOn": "2022-09-20T22:13:40.500000-04:00",
"CompletedOn": "2022-09-20T22:21:39.545000-04:00",
"Status": "COMPLETED",
"Statistics": {
"TotalActions": 7,
"TimeoutActions": 0,
"FailedActions": 0,
"StoppedActions": 0,
"SucceededActions": 7,
"RunningActions": 0
}
}
}
  • (Optional) To verify the status of the workflow run using AWS Glue console, choose Workflows in the navigation pane, select the Covid_19 workflow, click on the History tab, select the latest row and click on View run details. A successfully completed workflow is marked in green check marks. Please refer to the Legend section in the below screenshot for additional statuses.

    BDB-2467-glue-workflow-success

    Figure 10 – AWS Glue Workflow successful run

Check the output

  • When the workflow is complete, navigate to the Athena console to check the successful creation and population of neherlab_denormalized table. You can run SQL queries against all 5 tables to check the data. A sample SQL query is provided below.
SELECT "country", "location", "date", "cases", "deaths", "ecdc-countries",
        "acute_care", "acute_care_per_100K", "critical_care", "critical_care_per_100K" 
FROM "AwsDataCatalog"."covid19db"."neherlab_denormalized"
limit 10;
BDB-2467-athena

Figure 10 – Amazon Athena

Clean up

To clean up the resources created in this post, delete the AWS CloudFormation stacks in the following order:

  • cdk-covid19-glue-stack
  • cdk-covid19-glue-stack-pipeline
  • CDKToolkit

Then delete all associated S3 buckets:

  • cdk-covid19-glue-stack-p-pipelineartifactsbucketa-*
  • cdk-*-assets-<AWS_ACCOUNT_ID>-<AWS_REGION>
  • covid19-glue-config-<AWS_ACCOUNT_ID>-<AWS_REGION>
  • neherlab-denormalized-dataset-<AWS_ACCOUNT_ID>-<AWS_REGION>

Conclusion

In this post, we demonstrated a step-by-step guide to define, test, provision, and manage changes to an AWS Glue based ETL solution using the AWS CDK. We used an AWS Glue example, which has all the components to build a complex ETL solution, and demonstrated how to integrate individual AWS Glue components into a frictionless CI/CD pipeline. We encourage you to use this post and associated code as the starting point to build your own CI/CD pipelines for AWS Glue based ETL solutions.


About the authors

Puneet Babbar is a Data Architect at AWS, specialized in big data and AI/ML. He is passionate about building products, in particular products that help customers get more out of their data. During his spare time, he loves to spend time with his family and engage in outdoor activities including hiking, running, and skating. Connect with him on LinkedIn.

Suvojit Dasgupta is a Sr. Lakehouse Architect at Amazon Web Services. He works with customers to design and build data solutions on AWS.

Justin Kuskowski is a Principal DevOps Consultant at Amazon Web Services. He works directly with AWS customers to provide guidance and technical assistance around improving their value stream, which ultimately reduces product time to market and leads to a better customer experience. Outside of work, Justin enjoys traveling the country to watch his two kids play soccer and spending time with his family and friends wake surfing on the lakes in Michigan.

6 strategic ways to level up your CI/CD pipeline

Post Syndicated from Damian Brady original https://github.blog/2022-07-19-6-strategic-ways-to-level-up-your-ci-cd-pipeline/

In today’s world, a well-tuned CI/CD pipeline is a critical component for any development team looking to build and ship high-quality software fast. But here’s the thing: It’s rare you’ll find two CI/CD pipelines that are exactly the same. And that’s by design. Every CI/CD pipeline should be built to meet a team’s specific needs.

Despite this, there are levels of maturity when building a CI/CD pipeline that range from basic implementations to more advanced automation workflows. But wherever you are on your CI/CD journey, there are a few things you can do to level up your CI/CD pipeline.

With that, here are six strategic things I often see missing from CI/CD pipelines that can help any developer or team advance and improve their workflows.

Need a primer on how to build a CI/CD pipeline on GitHub? Check out our guide

1. Add performance, device compatibility, and accessibility testing

Performance, device compatibility, and accessibility testing are often a manual exercise—and something that some teams are only partially doing. Manually testing for these things can slow down your delivery cycle, so many teams either eat the costs or just don’t do it.

But if these things are important to you—and they should be—there are tools that can be included in your CI/CD pipeline to automate the testing for and discovery of any issues.

Performance and device compatibility testing

One tool, for example, is Playwright which can do end-to-end testing, automated testing, and everything in between. You can also use it to do UI testing so you can catch issues in your product.

Visual regression testing

There’s another class of tools that can help you automate visual regression testing to make sure you haven’t changed the UI when you weren’t intending to do so. That means you haven’t introduced any unexpected UI changes. This can be super useful for device compatibility testing too. If something looks bad on one device, you can quickly correct it.

Accessibility testing

This is another incredibly impactful class of automated tests to add to your CI/CD pipeline. Why? Because every one of your customers should be valuable to you—and if even just a fraction of your customers have trouble using your product, that matters.

There are a ton of accessibility testing tools that can tell you things like if you have appropriate content for screen readers or if the colors on your website make sense to someone with color blindness. A great example is Pa11y, an open source tool you can use to run automated accessibility tests via the command line or Node.js.

2. Incorporate more automated security testing

Security should always be part of your software delivery pipeline, and it’s incredibly vital in today’s environments. Even still, I’ve seen a number of teams and companies who aren’t incorporating automated security tests in their CI/CD pipelines and instead treat security as something that happens after the DevOps process takes place.

Here’s the good news: There are a lot of tools that can help you do this without too much effort—including GitHub-native tools like Dependabot, code scanning, secret scanning, and if you’re a GitHub Enterprise user, you can bundle all the security functionality GitHub offers and more with GitHub Advanced Security. But even with a free GitHub account, you still can use Dependabot on any public or private repository, and code scanning and secret scanning are available on all public repositories, too.

Dependabot, for example, can help you mitigate any potential issues in your dependencies by scanning them for outdated packages and automatically creating pull requests for teams to fix them. It can also be configured to automatically update any project dependencies, too.

This is super impactful. Developers and teams often don’t update their dependencies because of the time it takes—or, sometimes they even just forget to update their dependencies. Dependencies are a legitimate source of vulnerabilities that are all too often overlooked.

Additionally, code scanning and secret scanning are offered on the GitHub platform and can be built into your CI/CD pipeline to improve your security profile. Where code scanning offers SAST capabilities that show if your code itself contains any known vulnerabilities, secret scanning makes sure you’re not leaking any credentials to your repositories. It can also be used to prevent any pushes to your repository if there are any exposed credentials.

The biggest thing is that teams should treat security as something you do throughout the SDLC—and, not just before and after something goes to production. You should, of course, always be checking for security issues. But the earlier you can catch issues, the better (hello DevSecOps). So including security testing within your CI/CD pipeline is an essential practice.

A screenshot of automated security testing workflows on GitHub.
A screenshot of automated security testing workflows on GitHub.

3. Build a phased testing strategy

Phased testing is a great strategy for making sure you’re able to deliver secure software fast and at scale. But it’s also something that takes time to build. And consequently, a lot of teams just aren’t doing it.

Often, developers will put all or most of their automated testing at the build phase in their CI/CD pipelines. That means the build can take a long time to execute. And while there’s nothing necessarily wrong with this, you may find that it takes longer to get feedback on your code.

With phased testing, you can catch the big things early and get faster feedback on your codebase. The goal is to have a quick build that rapidly tests the fundamentals with simpler tests such as unit tests. After this, you may then perhaps deploy your build to a test environment to execute additional tests such as some accessibility testing, user testing, and other things that may take longer to execute. This means you’re working your way through a number of possible issues starting with the most critical elements first.

As you get closer to production in a phased testing model, you’ll want to test more and more things. This will likely include key items such as regression testing to make sure previous bugs aren’t reappearing in your codebase. At this stage, things are less likely to go wrong. But you’ll want to effectively catch the big things early and then narrow your testing down to ensure you’re shipping a very high-quality application.

Oh, and of course, there’s also testing in production, which is its own thing. But you can incorporate post-deployment tests into your production environment. You may have a hypothesis you want to test about if something works in production and execute tests to find out. At GitHub, we do this a lot by releasing new features behind feature flags and then enabling that flag for a subset of our user base to collect feedback.

4. Invest in blue-green deployments for easier rollouts

When it comes to releasing a new version of an application, what’s one word you think of? For me, the big word is “stress” (although “excitement” and “relief” are a close second and third). Blue-green deployments are one way to improve how you roll out a new version of an application in your CI/CD pipeline, but it can also be a bit more complex, too.

In the simplest terms, a blue-green deployment involves having two or more versions of your application in production and slowly moving your users from an older version to a newer one. This means that when you need to update or deploy a new version of an application, it goes to an “unused” production environment, and you can slowly move your users across safely.

The benefit of this is you can quickly roll back any changes by redirecting users to another prod environment. It also leads to drastically reduced downtime while you’re deploying a new application version. You can get everything set up in the environment and then just point people to a new one.

Blue-green deployments are perfect when you have two environments that are interchangeable. In reality with larger systems, you may have a suite of web servers or a number of serverless applications running. In practice, this means you might be using a load balancer that can distribute traffic across multiple locations. The canonical example of a load balancer is nginx—but every cloud has its own offerings (like Azure Front Door or Elastic Load Balancing on AWS).

This kind of strategy is common among organizations using Kubernetes. You may have a number of pods that are running and when you do a deployment, Kubernetes will deploy updates to new instances and redirects traffic. The management of which ones are up and running operates under the same principles as blue-green deployments—but you’re also navigating a far more complex architecture.

5. Adopt infrastructure-as-code for greater flexibility

Infrastructure provisioning is the practice of building IT infrastructure as you need it—and some teams will adopt infrastructure-as-code (IaC) in their CI/CD pipelines to provision resources automatically at specific points in the pipeline.

I strongly recommend doing this. The goal of IaC is that when you’re deploying your application, you’re also deploying your infrastructure. That means you always know what your infrastructure looks like in production, and your testing environment is also replicable to what’s in production.

There are two benefits to building IaC into your CI/CD pipeline:

  1. It helps you make sure that your application and the infrastructure it runs on are routinely being tested in tandem. The old school way of doing things was to say that this is a production machine and it looks like this—and this is our testing machine and we want it to be as close to production as possible. But almost always, you’ll find that production environments change over time—and it makes it harder to know what your production environment is.

  2. It helps you mitigate any real-time issues with your infrastructure. That means if your production server goes down, it’s not a disaster—you can just re-deploy it (and even automate your redeployment at that).

Last but not least: building IaC into your CI/CD pipeline means you can more effectively do things like blue-green deployments. You can deploy a new version of an application—code and infrastructure included—and reroute your DNS to go to that version. If it doesn’t work, that’s fine—you can quickly roll back to your previous version.

A screenshot of a GitHub Actions Terraform workflow.
A screenshot of a GitHub Actions Terraform workflow.

6. Create checkpoints for automated rollbacks

Ideally, you want to avoid ever having to roll back a software release. But let’s be honest. We all make mistakes and sometimes code that worked in your development or test environment doesn’t work perfectly in production.

When you need to roll back a release to a previous application version, automation makes it much easier to do so quickly. I think of a rollback as a general term for mitigating production problems by reverting to a previous version, whether that’s redeploying or restoring from backup. If you have a great CI/CD pipeline, you can ideally fix a problem and roll out an update immediately—so you can avoid having to go to a previous app version.

Looking for more ways to improve your CI/CD pipeline?

Try exploring the GitHub Marketplace for CI/CD and automation workflow templates. At the time I’m writing this, there are more than 14,000 pre-built, community-developed CI/CD and automation actions in the GitHub Marketplace. And, of course, you can always build your own custom workflows with GitHub Actions.

Explore the GitHub Marketplace

Additional resources

Jenkins high availability and disaster recovery on AWS

Post Syndicated from James Bland original https://aws.amazon.com/blogs/devops/jenkins-high-availability-and-disaster-recovery-on-aws/

We often hear from customers about their challenges architecting Jenkins for scale and high availability (HA). Jenkins was originally built as a continuous integration (CI) system to test software before it was committed to a repository. Since its beginning, Jenkins has grown out of necessity versus grand master plan. Developers who extended Jenkins favored speed of creating functionality over performance or scalability of the entire system. This is not to say that it’s impossible to scale Jenkins, it’s only mentioned here to highlight the challenges and technical debt that has accumulated because of the prioritization of features versus developing towards a specific architecture. In this post, we discuss these challenges and our proposed solution.

Challenges with Jenkins at scale and HA

Business and customer demand are forcing organizations to increase the speed and agility at which they release features and functionality. As organizations make this transition, the usage of continuous integration and continuous delivery (CI/CD) increases, which drives the need to scale Jenkins. Overlay this with an organization that commits hundreds of changes per day and works around the clock, with developers dispersed globally, and you end up with an operational situation where there is no room for downtime. To mitigate the risk of impacting an organization’s ability to release when they need it, developers require a system that not only scales but is also highly available.

The ability to scale Jenkins and provide HA comes down to two problems. One is the ability to scale compute to handle additional jobs, and the second is storage. To scale compute, we typically do it in one of two ways, horizontally or vertically. Horizontally means we scale Jenkins to add additional compute nodes. Scaling vertically means we scale Jenkins by adding more resources to the compute node.

Let’s start with the storage problem. Jenkins is designed around the local file system. Anyone who has spent time around Jenkins is aware that logs, cloned repos, plugins, and build artifacts are stored into JENKINS_HOME. Local file systems, while good for single-server designs, tend to be a challenge when HA comes into the picture. In on-premises designs, administrators have often used Network File System (NFS) and Storage Area Networks (SAN) to achieve some scale and resiliency. This type of design comes with a trade-off of performance and doesn’t provide the true HA and inherent disaster recovery (DR) required to meet the demands of the business.

Because of the local file system constraint, there are two native families of storage available in AWS: Amazon Elastic Block Store (Amazon EBS) and Amazon Elastic File System (Amazon EFS). Amazon EBS is great for a single-server design in a single Availability Zone. The challenge is trying to scale a single-server design to support HA. Because of the requirement to assign an EBS volume to a specific Availability Zone, you can’t automatically transition the EBS volume to another Availability Zone and attach it to a Jenkins instance. If you don’t mind having an impact on Recovery Time Objective (RTO) and Recovery Point Objective (RPO), a solution using Amazon EBS snapshots copied to additional Availability Zones might work. Although EBS snapshot copy is possible, it’s not a recommended solution because it doesn’t scale and has complexities in building and maintaining this type of solution.

Amazon EFS as an alternative has worked well for customers that don’t have high usage patterns of Jenkins. All Jenkins instances within the Region can access the Amazon EFS file system and data durably stored in multiple Availability Zones. If a single Availability Zone experiences an outage, the Jenkins file system is still accessible from other Availability Zones providing HA for the storage layer. This solution is not recommended for high-usage systems due to the way that Jenkins reads and writes data. Jenkins’s access pattern is skewed towards writing data such as logs, cloned repos, and building artifacts versus reading data. Amazon EFS, on the other hand, is designed for workloads that read more than they write. On high-usage workloads, customers have experienced Jenkins build slowness and Jenkins page load latency. This is why Amazon EFS isn’t recommended for high-usage Jenkins systems.

Solution for Jenkins at scale and HA

Solving the compute problem is relatively straightforward by using Amazon Elastic Kubernetes Service (Amazon EKS). In the context of Jenkins, an organization would run Jenkins in an Amazon EKS cluster that spans multiple Availability Zones, as shown in the following diagram.

Diagram showing Jenkins deployment in Amazon EKS with three availability zones inside a VPC

Figure 1 –Jenkins deployment in Amazon EKS with multiple availability zones.

Jenkins Controller and Agent would run in an Availability Zone as a Kubernetes pod. Amazon EKS is designed around Desired State Configuration (DSC), which means that it continuously make sure that the running environment matches the configuration that has been applied to Amazon EKS. In practice, when Amazon EKS is told that you want a single pod of Jenkins running, it monitors and makes sure that pod is always running. If an Availability Zone is unavailable, Amazon EKS launches a new node in another Availability Zone and deploys all pods to meet any necessary constraints defined in Amazon EKS. With this option, we still need to have the data in other Availability Zones, which we cover later in this post.

The only option of scaling Jenkins controllers is vertical. Scaling Jenkins horizontally could lead to an undesirable state because the system wasn’t designed to have multiple instances of Jenkins attached to the same storage layer. There is no exclusive file locking mechanism to ensure data consistency. For organizations that have exhausted the limits with vertical scaling, the recommendation is to run multiple independent Jenkins controllers and separate them per team or group. Vertical scaling of Jenkins is simpler in Amazon EKS. Node sizes and container memory are controlled by configuration. Increasing memory size is as simple as changing a container’s memory setting. Due to the ease of changing configuration, it’s best to start with a lower memory setting, monitor performance, and increase as necessary. You want to find a good balance between price and performance.

For Jenkins agents, there are many options to scale the compute. In the context of scale and HA, the best options are to use AWS CodeBuild, AWS Fargate for Amazon EKS, or Amazon EKS managed node groups. With CodeBuild, you don’t need to provision, manage, or scale your build servers. CodeBuild scales continuously and processes multiple builds concurrently. You can use the Jenkins plugin for CodeBuild to integrate CodeBuild with Jenkins. Fargate is a good option but has some challenges if you’re trying to build container images within a container due to permissions necessary that aren’t exposed in Fargate. For additional information on how to overcome this challenge with Jenkins, refer to How to build container images with Amazon EKS on Fargate.

Now let’s look at the storage layer and see how LINBIT is helping organizations solve this problem with LINSTOR. LINBIT’s LINSTOR is an open-source management tool designed to manage block storage devices. Its primary use case is to provide Linux block storage for Kubernetes and other public and private cloud platforms. LINBIT also provides enterprise subscription for LINSTOR, which include technical support with SLA.

The following diagram illustrates a LINSTOR storage solution running on Amazon EKS using multiple Availability Zones and Amazon Simple Storage Service (Amazon S3) for snapshots.

Diagram showing LINSTOR storage solution running on Amazon EKS across three availability zone with snapshot stored in Amazon S3.

Figure 2. LINSTOR storage solution running on Amazon EKS using multiple availability zones and S3 for snapshot.

LINSTOR is composed of a control plane and a data plane. The control plane consists of a set of containers deployed into Amazon EKS and is responsible for managing the data plane. The data plane consists of a collection of open-source block storage software, most importantly LINBIT’s Distributed Replicated Storage System (DRBD) software. DRBD is responsible for provisioning and synchronously replicating storage between Amazon EKS worker instances in different Availability Zones.

LINSTOR is deployed via Helm into Amazon EKS, and the LINSTOR cluster is initialized by the LINSTOR Operator. Once deployed, LINSTOR volumes and volume snapshots are managed via Kubernetes Storage Classes and Snapshot Classes in a Kubernetes native fashion. LINSTOR volumes are backed by LINSTOR objects known as storage pools, which are composed of one or more EBS volumes attached to each Amazon EKS worker instance.

LINSTOR volumes layer DRBD on top of the worker’s attached EBS volume to enable synchronous replication between peers in the Amazon EKS cluster. This ensures that you have an identical copy of your persistent volume on the EBS volumes in each Availability Zone. In the event of an Availability Zone outage or planned migration, Amazon EKS moves the Jenkins deployment to another Availability Zone where the persistent volume copy is available. In terms of scaling, LINBIT DRDB supports up to 32 replicas per volume, with a maximum size of 1 PiB per volume. LINSTOR node itself can scale beyond hundreds of nodes, as shown in this case study.

LINSTOR also provides an HA Controller component in its control plane to speed up failover times during outages. LINSTOR’s HA Controller looks for pods with a specific label, and if LINSTOR’s persistent volumes replication network becomes interrupted (like during an Availability Zone outage), LINSTOR reschedules the pod sooner than the default Kubernetes pod-eviction-timeout.

LINBIT provides a detailed full installation for Jenkins HA in AWS. A sample of LINSTOR’s helm values supporting these features is as follows:

operator:
  satelliteSet:
    storagePools:
      lvmThinPools:
      - name: lvm-thin
        thinVolume: thinpool
        volumeGroup: ""
        devicePaths:
        - /dev/nvme1n1
    kernelModuleInjectionMode: Compile
stork:
  enabled: false
csi:
  enableTopology: true
etcd:
  replicas: 3
haController:
  replicas: 3

After LINSTOR is deployed, you create a Kubernetes StorageClass supporting persistent volumes with three replicas using the following example:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: "linstor-csi-lvm-thin-r3"
provisioner: linstor.csi.linbit.com
parameters:
  allowRemoteVolumeAccess: "false"
  autoPlace: "3"
  storagePool: "lvm-thin"
  DrbdOptions/Disk/disk-flushes: "no"
  DrbdOptions/Disk/md-flushes: "no"
  DrbdOptions/Net/max-buffers: "10000"
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

Finally, Jenkins helm charts are deployed into Amazon EKS with the following Helm values to request a PV from the LINSTOR StorageClass:

persistence:
  storageClass: linstor-csi-lvm-thin-r3
  size: "200Gi"
controller:
  serviceType: LoadBalancer
  podLabels:
    linstor.csi.linbit.com/on-storage-lost: remove

To protect against entire AWS Region outages and provide disaster recovery, LINSTOR takes volume snapshots and replicates it cross-Region using Amazon S3. LINSTOR requires read and write access to the target S3 bucket using AWS credentials provided as Kubernetes secrets:

kind: Secret
apiVersion: v1
metadata:
  name: linstor-csi-s3-access
  namespace: default
type: linstor.csi.linbit.com/s3-credentials.v1
immutable: true
stringData:
  access-key: REDACTED
  secret-key: REDACTED

The target S3 bucket is referenced as a snapshot shipping target using a LINSTOR S3 VolumeSnapshotClass. The following example shows a VolumeSnapshotClass referencing the S3 bucket’s secret and additional configuration for the target S3 bucket:

kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
metadata:
  name: linstor-csi-snapshot-class-s3
driver: linstor.csi.linbit.com
deletionPolicy: Delete
parameters:
  snap.linstor.csi.linbit.com/type: S3
  snap.linstor.csi.linbit.com/remote-name: s3-us-west-2
  snap.linstor.csi.linbit.com/allow-incremental: "false"
  snap.linstor.csi.linbit.com/s3-bucket: name-of-bucket-123
  snap.linstor.csi.linbit.com/s3-endpoint: http://s3.us-west-2.amazonaws.com
  snap.linstor.csi.linbit.com/s3-signing-region: us-west-2
  snap.linstor.csi.linbit.com/s3-use-path-style: "false"
  # Secret to store access credentials
  csi.storage.k8s.io/snapshotter-secret-name: linstor-csi-s3-access
  csi.storage.k8s.io/snapshotter-secret-namespace: default

Jenkins deployment persistent volume claim (PVC) is stored as a snapshot in Amazon S3 by using a standard Kubernetes volumeSnapshot definition with LINSTOR’s snapshot class for Amazon S3:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: jenkins-dr-snapshot-0
spec:
  volumeSnapshotClassName: linstor-csi-snapshot-class-s3
  source:
    persistentVolumeClaimName: <jenkins-pvc-name>

Conclusion

In this post, we explained  the challenges to scale Jenkins for HA and DR. We also reviewed Jenkins storage architecture with Amazon EBS and Amazon EFS and where to apply these. We demonstrated how you can use Amazon EKS to scale Jenkins compute for HA and how AWS partner solutions such as LINBIT LINSTOR can help scale Jenkins storage for HA and DR. Combining both solutions can help organizations maintain their ability to deploy software with speed and agility. We hope you found this post useful as you think through building your CI/CD infrastructure in AWS. To learn more about running Jenkins in Amazon EKS, check out Orchestrate Jenkins Workloads using Dynamic Pod Autoscaling with Amazon EKS. To find out more information about LINBIT’s LINSTOR, check the Jenkins technical guide.

Authors:

James Bland

James is a 25+ year veteran in the IT industry helping organizations from startups to ultra large enterprises achieve their business objectives. He has held various leadership roles in software development, worldwide infrastructure automation, and enterprise architecture. James has been
practicing DevOps long before the term became popularized. He holds a doctorate in computer science with a focus on leveraging machine learning algorithms for scaling systems. In his current role at AWS as the APN Global Tech Lead for DevOps, he works with partners to help shape the future of technology.

Welly Siauw

Welly Siauw is a Sr. Partner Solution Architect at Amazon Web Services (AWS). He spends his day working with customers and partners, solving architectural challenges. He is passionate about service integration and orchestration, serverless and artificial intelligence (AI) and machine learning (ML). He authored several AWS blogs and actively leading AWS Immersion Days and Activation Days. Welly spends his free time tinkering with espresso machine and outdoor hiking.

Matt Kereczman

Matt Kereczman is a Solutions Architect at LINBIT with a long history of Linux System Administration and Linux System Engineering. Matt is a cornerstone in LINBIT’s technical team, and plays an important role in making LINBIT and LINBIT’s customer’s solutions great. Matt was President of the GNU/Linux Club at Northampton Area Community College prior to graduating with Honors from Pennsylvania College of Technology with a BS in Information Security. Open Source Software and Hardware are at the core of most of Matt’s hobbies.

Integrating with GitHub Actions – CI/CD pipeline to deploy a Web App to Amazon EC2

Post Syndicated from Mahesh Biradar original https://aws.amazon.com/blogs/devops/integrating-with-github-actions-ci-cd-pipeline-to-deploy-a-web-app-to-amazon-ec2/

Many Organizations adopt DevOps Practices to innovate faster by automating and streamlining the software development and infrastructure management processes. Beyond cultural adoption, DevOps also suggests following certain best practices and Continuous Integration and Continuous Delivery (CI/CD) is among the important ones to start with. CI/CD practice reduces the time it takes to release new software updates by automating deployment activities. Many tools are available to implement this practice. Although AWS has a set of native tools to help achieve your CI/CD goals, it also offers flexibility and extensibility for integrating with numerous third party tools.

In this post, you will use GitHub Actions to create a CI/CD workflow and AWS CodeDeploy to deploy a sample Java SpringBoot application to Amazon Elastic Compute Cloud (Amazon EC2) instances in an Autoscaling group.

GitHub Actions is a feature on GitHub’s popular development platform that helps you automate your software development workflows in the same place that you store code and collaborate on pull requests and issues. You can write individual tasks called actions, and then combine them to create a custom workflow. Workflows are custom automated processes that you can set up in your repository to build, test, package, release, or deploy any code project on GitHub.

AWS CodeDeploy is a deployment service that automates application deployments to Amazon EC2 instances, on-premises instances, serverless AWS Lambda functions, or Amazon Elastic Container Service (Amazon ECS) services.

Solution Overview

The solution utilizes the following services:

  1. GitHub Actions – Workflow Orchestration tool that will host the Pipeline.
  2. AWS CodeDeploy – AWS service to manage deployment on Amazon EC2 Autoscaling Group.
  3. AWS Auto Scaling – AWS Service to help maintain application availability and elasticity by automatically adding or removing Amazon EC2 instances.
  4. Amazon EC2 – Destination Compute server for the application deployment.
  5. AWS CloudFormation – AWS infrastructure as code (IaC) service used to spin up the initial infrastructure on AWS side.
  6. IAM OIDC identity provider – Federated authentication service to establish trust between GitHub and AWS to allow GitHub Actions to deploy on AWS without maintaining AWS Secrets and credentials.
  7. Amazon Simple Storage Service (Amazon S3) – Amazon S3 to store the deployment artifacts.

The following diagram illustrates the architecture for the solution:

Architecture Diagram

  1. Developer commits code changes from their local repo to the GitHub repository. In this post, the GitHub action is triggered manually, but this can be automated.
  2. GitHub action triggers the build stage.
  3. GitHub’s Open ID Connector (OIDC) uses the tokens to authenticate to AWS and access resources.
  4. GitHub action uploads the deployment artifacts to Amazon S3.
  5. GitHub action invokes CodeDeploy.
  6. CodeDeploy triggers the deployment to Amazon EC2 instances in an Autoscaling group.
  7. CodeDeploy downloads the artifacts from Amazon S3 and deploys to Amazon EC2 instances.

Prerequisites

Before you begin, you must complete the following prerequisites:

  • An AWS account with permissions to create the necessary resources.
  • A GitHub account with permissions to configure GitHub repositories, create workflows, and configure GitHub secrets.
  • A Git client to clone the provided source code.

Steps

The following steps provide a high-level overview of the walkthrough:

  1. Clone the project from the AWS code samples repository.
  2. Deploy the AWS CloudFormation template to create the required services.
  3. Update the source code.
  4. Setup GitHub secrets.
  5. Integrate CodeDeploy with GitHub.
  6. Trigger the GitHub Action to build and deploy the code.
  7. Verify the deployment.

Download the source code

  1. Clone the source code repository aws-codedeploy-github-actions-deployment.

git clone https://github.com/aws-samples/aws-codedeploy-github-actions-deployment.git

  1. Create an empty repository in your personal GitHub account. To create a GitHub repository, see Create a repo. Clone this repo to your computer. Furthermore, ignore the warning about cloning an empty repository.

git clone https://github.com/<username>/<repoName>.git

Figure2: Github Clone

  1. Copy the code. We need contents from the hidden .github folder for the GitHub actions to work.

cp -r aws-codedeploy-github-actions-deployment/. <new repository>

e.g. GitActionsDeploytoAWS

  1. Now you should have the following folder structure in your local repository.

Figure3: Directory Structure

Repository folder structure

  • The .github folder contains actions defined in the YAML file.
  • The aws/scripts folder contains code to run at the different deployment lifecycle events.
  • The cloudformation folder contains the template.yaml file to create the required AWS resources.
  • Spring-boot-hello-world-example is a sample application used by GitHub actions to build and deploy.
  • Root of the repo contains appspec.yml. This file is required by CodeDeploy to perform deployment on Amazon EC2. Find more details here.

The following commands will help make sure that your remote repository points to your personal GitHub repository.

git remote remove origin

git remote add origin <your repository url>

git branch -M main

git push -u origin main

Deploy the CloudFormation template

To deploy the CloudFormation template, complete the following steps:

  1. Open AWS CloudFormation console. Enter your account ID, user name, and Password.
  2. Check your region, as this solution uses us-east-1.
  3. If this is a new AWS CloudFormation account, select Create New Stack. Otherwise, select Create Stack.
  4. Select Template is Ready
  5. Select Upload a template file
  6. Select Choose File. Navigate to template.yml file in your cloned repository at “aws-codedeploy-github-actions-deployment/cloudformation/template.yaml”.
  7. Select the template.yml file, and select next.
  8. In Specify Stack Details, add or modify the values as needed.
    • Stack name = CodeDeployStack.
    • VPC and Subnets = (these are pre-populated for you) you can change these values if you prefer to use your own Subnets)
    • GitHubThumbprintList = 6938fd4d98bab03faadb97b34396831e3780aea1
    • GitHubRepoName – Name of your GitHub personal repository which you created.

Figure4: CloudFormation Parameters

  1. On the Options page, select Next.
  2. Select the acknowledgement box to allow for the creation of IAM resources, and then select Create. It will take CloudFormation approximately 10 minutes to create all of the resources. This stack would create the following resources.
    • Two Amazon EC2 Linux instances with Tomcat server and CodeDeploy agent are installed
    • Autoscaling group with Internet Application load balancer
    • CodeDeploy application name and deployment group
    • Amazon S3 bucket to store build artifacts
    • Identity and Access Management (IAM) OIDC identity provider
    • Instance profile for Amazon EC2
    • Service role for CodeDeploy
    • Security groups for ALB and Amazon EC2

Update the source code

  1.  On the AWS CloudFormation console, select the Outputs tab. Note that the Amazon S3 bucket name and the ARM of the GitHub IAM Role. We will use this in the next step.

Figure5: CloudFormation Output

  1. Update the Amazon S3 bucket in the workflow file deploy.yml. Navigate to /.github/workflows/deploy.yml from your Project root directory.

Replace ##s3-bucket## with the name of the Amazon S3 bucket created previously.

Replace ##region## with your AWS Region.

Figure6: Actions YML

  1. Update the Amazon S3 bucket name in after-install.sh. Navigate to aws/scripts/after-install.sh. This script would copy the deployment artifact from the Amazon S3 bucket to the tomcat webapps folder.

Figure7: CodeDeploy Instruction

Remember to save all of the files and push the code to your GitHub repo.

  1. Verify that you’re in your git repository folder by running the following command:

git remote -V

You should see your remote branch address, which is similar to the following:

username@3c22fb075f8a GitActionsDeploytoAWS % git remote -v

origin [email protected]:<username>/GitActionsDeploytoAWS.git (fetch)

origin [email protected]:<username>/GitActionsDeploytoAWS.git (push)

  1. Now run the following commands to push your changes:

git add .

git commit -m “Initial commit”

git push

Setup GitHub Secrets

The GitHub Actions workflows must access resources in your AWS account. Here we are using IAM OpenID Connect identity provider and IAM role with IAM policies to access CodeDeploy and Amazon S3 bucket. OIDC lets your GitHub Actions workflows access resources in AWS without needing to store the AWS credentials as long-lived GitHub secrets.

These credentials are stored as GitHub secrets within your GitHub repository, under Settings > Secrets. For more information, see “GitHub Actions secrets”.

  • Navigate to your github repository. Select the Settings tab.
  • Select Secrets on the left menu bar.
  • Select New repository secret.
  • Select Actions under Secrets.
    • Enter the secret name as ‘IAMROLE_GITHUB’.
    • enter the value as ARN of GitHubIAMRole, which you copied from the CloudFormation output section.

Figure8: Adding Github Secrets

Figure9: Adding New Secret

Integrate CodeDeploy with GitHub

For CodeDeploy to be able to perform deployment steps using scripts in your repository, it must be integrated with GitHub.

CodeDeploy application and deployment group are already created for you. Please use these applications in the next step:

CodeDeploy Application =CodeDeployAppNameWithASG

Deployment group = CodeDeployGroupName

To link a GitHub account to an application in CodeDeploy, follow until step 10 from the instructions on this page.

You can cancel the process after completing step 10. You don’t need to create Deployment.

Trigger the GitHub Actions Workflow

Now you have the required AWS resources and configured GitHub to build and deploy the code to Amazon EC2 instances.

The GitHub actions as defined in the GITHUBREPO/.github/workflows/deploy.yml would let us run the workflow. The workflow is currently setup to be manually run.

Follow the following steps to run it manually.

Go to your GitHub Repo and select Actions tab

Figure10: See Actions Tab

Select Build and Deploy link, and select Run workflow as shown in the following image.

Figure11: Running Workflow Manually

After a few seconds, the workflow will be displayed. Then, select Build and Deploy.

Figure12: Observing Workflow

You will see two stages:

  1. Build and Package.
  2. Deploy.

Build and Package

The Build and Package stage builds the sample SpringBoot application, generates the war file, and then uploads it to the Amazon S3 bucket.

Figure13: Completed Workflow

You should be able to see the war file in the Amazon S3 bucket.

Figure14: Artifacts saved in S3

Deploy

In this stage, workflow would invoke the CodeDeploy service and trigger the deployment.

Figure15: Deploy With Actions

Verify the deployment

Log in to the AWS Console and navigate to the CodeDeploy console.

Select the Application name and deployment group. You will see the status as Succeeded if the deployment is successful.

Figure16: Verifying Deployment

Point your browsers to the URL of the Application Load balancer.

Note: You can get the URL from the output section of the CloudFormation stack or Amazon EC2 console Load Balancers.

Figure17: Verifying Application

Optional – Automate the deployment on Git Push

Workflow can be automated by changing the following line of code in your .github/workflow/deploy.yml file.

From

workflow_dispatch: {}

To


  #workflow_dispatch: {}
  push:
    branches: [ main ]
  pull_request:

This will be interpreted by GitHub actions to automaticaly run the workflows on every push or pull requests done on the main branch.

After testing end-to-end flow manually, you can enable the automated deployment.

Clean up

To avoid incurring future changes, you should clean up the resources that you created.

  1. Empty the Amazon S3 bucket:
  2. Delete the CloudFormation stack (CodeDeployStack) from the AWS console.
  3. Delete the GitHub Secret (‘IAMROLE_GITHUB’)
    1. Go to the repository settings on GitHub Page.
    2. Select Secrets under Actions.
    3. Select IAMROLE_GITHUB, and delete it.

Conclusion

In this post, you saw how to leverage GitHub Actions and CodeDeploy to securely deploy Java SpringBoot application to Amazon EC2 instances behind AWS Autoscaling Group. You can further add other stages to your pipeline, such as Test and security scanning.

Additionally, this solution can be used for other programming languages.

About the Authors

Mahesh Biradar is a Solutions Architect at AWS. He is a DevOps enthusiast and enjoys helping customers implement cost-effective architectures that scale.
Suresh Moolya is a Cloud Application Architect with Amazon Web Services. He works with customers to architect, design, and automate business software at scale on AWS cloud.