Tag Archives: AWS CodePipeline

Accelerate deployments on AWS with effective governance

Post Syndicated from Rostislav Markov original https://aws.amazon.com/blogs/architecture/accelerate-deployments-on-aws-with-effective-governance/

Amazon Web Services (AWS) users ask how to accelerate their teams’ deployments on AWS while maintaining compliance with security controls. In this blog post, we describe common governance models introduced in mature organizations to manage their teams’ AWS deployments. These models are best used to increase the maturity of your cloud infrastructure deployments.

Governance models for AWS deployments

We distinguish three common models used by mature cloud adopters to manage their infrastructure deployments on AWS. The models differ in what they control: the infrastructure code, deployment toolchain, or provisioned AWS resources. We define the models as follows:

  1. Central pattern library, which offers a repository of curated deployment templates that application teams can re-use with their deployments.
  2. Continuous Integration/Continuous Delivery (CI/CD) as a service, which offers a toolchain standard to be re-used by application teams.
  3. Centrally managed infrastructure, which allows application teams to deploy AWS resources managed by central operations teams.

The decision of how much responsibility you shift to application teams depends on their autonomy, operating model, application type, and rate of change. The three models can be used in tandem to address different use cases and maximize impact. Typically, organizations start by gathering pre-approved deployment templates in a central pattern library.

Model 1: Central pattern library

With this model, cloud platform engineers publish a central pattern library from which teams can reference infrastructure as code templates. Application teams reuse the templates by forking the central repository or by copying the templates into their own repository. Application teams can also manage their own deployment AWS account and pipeline with AWS CodePipeline, as well as the resource-provisioning process, while reusing templates from the central pattern library with a service like AWS CodeCommit. Figure 1 provides an overview of this governance model.

Deployment governance with central pattern library

Figure 1. Deployment governance with central pattern library

The central pattern library represents the least intrusive form of enablement via reusable assets. Application teams appreciate the central pattern library model, as it allows them to maintain autonomy over their deployment process and toolchain. Reusing existing templates speeds up the creation of your teams’ first infrastructure templates and eases policy adherence, such as tagging policies and security controls.

After the reusable templates are in the application team’s repository, incremental updates can be pulled from the central library when the template has been enhanced. This allows teams to pull when they see fit. Changes to the team’s repository will trigger the pipeline to deploy the associated infrastructure code.

With the central pattern library model, application teams need to manage resource configuration and CI/CD toolchain on their own in order to gain the benefits of automated deployments. Model 2 addresses this.

Model 2: CI/CD as a service

In Model 2, application teams launch a governed deployment pipeline from AWS Service Catalog. This includes the infrastructure code needed to run the application and “hello world” source code to show the end-to-end deployment flow.

Cloud platform engineers develop the service catalog portfolio (in this case the CI/CD toolchain). Then, application teams can launch AWS Service Catalog products, which deploy an instance of the pipeline code and populated Git repository (Figure 2).

The pipeline is initiated immediately after the repository is populated, which results in the “hello world” application being deployed to the first environment. The infrastructure code (for example, Amazon Elastic Compute Cloud [Amazon EC2] and AWS Fargate) will be located in the application team’s repository. Incremental updates can be pulled by launching a product update from AWS Service Catalog. This allows application teams to pull when they see fit.
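For illustration, the following sketch shows how an application team might launch such a toolchain product through the AWS Service Catalog API with boto3. The product name, version, and parameters are placeholders rather than values from any specific portfolio; look up the real identifiers in your own catalog before provisioning.

import boto3

servicecatalog = boto3.client("servicecatalog")

# Launch the centrally published CI/CD toolchain product (names are hypothetical).
response = servicecatalog.provision_product(
    ProductName="ci-cd-pipeline",                # product published by cloud platform engineers
    ProvisioningArtifactName="v1.0",             # version of the product to launch
    ProvisionedProductName="team-a-pipeline",    # this team's instance of the toolchain
    ProvisioningParameters=[
        {"Key": "ApplicationName", "Value": "hello-world"},
    ],
)
print(response["RecordDetail"]["Status"])        # for example, CREATED or IN_PROGRESS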

Deployment governance with CI/CD as a service

Figure 2. Deployment governance with CI/CD as a service

This governance model is particularly suitable for mature developer organizations with full-stack responsibility or platform projects, as it provides end-to-end deployment automation to provision resources across multiple teams and AWS accounts. This model also adds security controls over the deployment process.

Since there is little room for teams to adapt the toolchain standard, the model can be perceived as very opinionated. The model expects application teams to manage their own infrastructure. Model 3 addresses this.

Model 3: Centrally managed infrastructure

This model allows application teams to provision resources managed by a central operations team as self-service. Cloud platform engineers publish infrastructure portfolios to AWS Service Catalog with pre-approved configuration by central teams (Figure 3). These portfolios can be shared with all AWS accounts used by application engineers.

Provisioning AWS resources via AWS Service Catalog products ensures resource configuration fulfills central operations requirements. Compared with Model 2, the pre-populated infrastructure templates launch AWS Service Catalog products, as opposed to directly referencing the API of the corresponding AWS service (for example Amazon EC2). This locks down how infrastructure is configured and provisioned.

Deployment governance with centrally managed infrastructure

Figure 3. Deployment governance with centrally managed infrastructure

In our experience, it is essential to manage the variety of AWS Service Catalog products. This avoids a proliferation of products with many slightly differing templates. Centrally managed infrastructure propagates an “on-premises” mindset, so it should be used only in cases where application teams cannot own the full stack.

Models 2 and 3 can be combined for application engineers to launch both deployment toolchain and resources as AWS Service Catalog products (Figure 4), while also maintaining the opportunity to provision from pre-populated infrastructure templates in the team repository. After the code is in their repository, incremental updates can be pulled by running an update from the provisioned AWS Service Catalog product. This allows the application team to pull an update as needed while avoiding manual deployments of service catalog products.
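As a rough illustration of pulling such an update programmatically (assuming boto3; the product and provisioned product names below are placeholders), an application team could run:

import boto3

servicecatalog = boto3.client("servicecatalog")

# Update an existing provisioned product to a newer, centrally published version.
response = servicecatalog.update_provisioned_product(
    ProvisionedProductName="team-a-pipeline",    # instance launched earlier by the team
    ProductName="ci-cd-pipeline",
    ProvisioningArtifactName="v1.1",             # newer version released by the platform team
)
print(response["RecordDetail"]["Status"])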

Using AWS Service Catalog to automate CI/CD and infrastructure resource provisioning

Figure 4. Using AWS Service Catalog to automate CI/CD and infrastructure resource provisioning

Comparing models

The three governance models differ along the following aspects (see Table 1):

  • Governance level: What component is managed centrally by cloud platform engineers?
  • Role of application engineers: What is the responsibility split and operating model?
  • Use case: When is each model applicable?

Table 1. Governance models for managing infrastructure deployments

 

Model 1: Central pattern library
  • Governance level: Centrally defined infrastructure templates
  • Role of cloud platform engineers: Manage pattern library and policy checks
  • Role of application teams: Manage deployment toolchain and resource provisioning
  • Use case: Federated governance with application teams maintaining autonomy over application and infrastructure

Model 2: CI/CD as a service
  • Governance level: Centrally defined deployment toolchain
  • Role of cloud platform engineers: Manage deployment toolchain and stage checks
  • Role of application teams: Manage resource provisioning
  • Use case: Platform projects or development organizations with strong preference for pre-defined deployment standards including toolchain

Model 3: Centrally managed infrastructure
  • Governance level: Centrally defined provisioning and management of AWS resources
  • Role of cloud platform engineers: Manage resource provisioning (including CI/CD)
  • Role of application teams: Manage application integration
  • Use case: Applications without development teams (e.g., “commercial-off-the-shelf”) or with separation of duty (e.g., infrastructure operations teams)

Conclusion

In this blog post, we distinguished three common governance models to manage the deployment of AWS resources. The three models can be used in tandem to address different use cases and maximize impact in your organization. The decision of how much responsibility is shifted to application teams depends on your organizational setup and use case.

Want to learn more?

Multi-Region Terraform Deployments with AWS CodePipeline using Terraform Built CI/CD

Post Syndicated from Lerna Ekmekcioglu original https://aws.amazon.com/blogs/devops/multi-region-terraform-deployments-with-aws-codepipeline-using-terraform-built-ci-cd/

As of February 2022, the AWS Cloud spans 84 Availability Zones within 26 geographic Regions, with announced plans for more Availability Zones and Regions. Customers can leverage this global infrastructure to expand their presence to their primary target users, satisfy data residency requirements, and implement disaster recovery strategies to ensure business continuity. Although leveraging a multi-Region architecture addresses these requirements, deploying and configuring consistent infrastructure stacks across multiple Regions can be challenging, as AWS Regions are designed to be autonomous in nature. Multi-Region deployments with Terraform and AWS CodePipeline can help customers with these challenges.

In this post, we’ll demonstrate the best practice for multi-Region deployments using HashiCorp Terraform as infrastructure as code (IaC), and AWS CodeBuild and AWS CodePipeline as continuous integration and continuous delivery (CI/CD), for consistency and repeatability of deployments into multiple AWS Regions and AWS accounts. We’ll dive deep on the IaC deployment pipeline architecture and the best practices for structuring the Terraform project and configuration for multi-Region deployment across multiple AWS target accounts.

You can find the sample code for this solution here.

Solutions Overview

Architecture

The following architecture diagram illustrates the main components of the multi-Region Terraform deployment pipeline with all of the resources built using IaC.

The DevOps engineer initially works against the infrastructure repo in a short-lived branch. Once changes in the short-lived branch are ready, the DevOps engineer gets them reviewed and merged into the main branch. Then, the DevOps engineer git-tags the repo. For any future changes in the infra repo, the DevOps engineer repeats the same process.

Git tags named “dev_us-east-1/research/1.0”, “dev_eu-central-1/research/1.0”, “dev_ap-southeast-1/research/1.0”, “dev_us-east-1/risk/1.0”, “dev_eu-central-1/risk/1.0”, and “dev_ap-southeast-1/risk/1.0” correspond to version 1.0 of the code released from the main branch using git tagging. A short-lived branch sits between each version of the code, followed by git tags corresponding to each subsequent version of the code, such as version 1.1 and version 2.0.

Fig 1. Tagging to release from the main branch.

  1. The deployment is triggered from DevOps engineer git tagging the repo, which contains the Terraform code to be deployed. This action starts the deployment pipeline execution.
    Tagging with ‘dev_us-east-1/research/1.0’ triggers a pipeline to deploy to the research dev account in us-east-1. In our example, the git tag ‘dev_us-east-1/research/1.0’ contains the target environment (i.e., dev), the AWS Region (i.e., us-east-1), the team (i.e., research), and a version number (i.e., 1.0) that maps to an annotated tag on a commit ID. The target workload account aliases (i.e., research dev, risk qa) are mapped to AWS account numbers in the environment configuration files of the infra repo in AWS CodeCommit.
The central tooling account contains the CodeCommit Terraform infra repo, where the DevOps engineer has git access, along with the pipeline trigger and the CodePipeline dev pipeline, consisting of the S3 bucket with the Terraform infra repo and git tag, and CodeBuild actions for the terraform tflint scan, checkov scan, plan, and apply. Terraform apply points, using the cross-account role, to the VPC containing an Application Load Balancer (ALB) in eu-central-1 in the dev target workload account. A qa pipeline, a staging pipeline, and a prod pipeline are included along with a qa target workload account, a staging target workload account, and a prod target workload account. EventBridge, Key Management Service, CloudTrail, and CloudWatch in the us-east-1 Region are in the central tooling account along with the Identity and Access Management service. In addition, the dev target workload account contains us-east-1 and ap-southeast-1 VPCs, each with an ALB, as well as Identity and Access Management.

Fig 2. Multi-Region AWS deployment with IaC and CI/CD pipelines.

  2. To capture the exact git tag that starts a pipeline, we use an Amazon EventBridge rule. The rule is triggered when the tag is created with an environment prefix for deploying to a respective environment (i.e., dev). The rule kicks off an AWS CodeBuild project that takes the git tag from the AWS CodeCommit event and stores it with a full clone of the repo into a versioned Amazon Simple Storage Service (Amazon S3) bucket for the corresponding environment.
  3. We have a continuous delivery pipeline defined in AWS CodePipeline. To make sure that the pipelines for each environment run independent of each other, we use a separate pipeline per environment. Each pipeline consists of three stages in addition to the Source stage:
    1. IaC linting stage – A stage for linting Terraform code. For illustration purposes, we’ll use the open source tool tflint.
    2. IaC security scanning stage – A stage for static security scanning of Terraform code. There are many tooling choices when it comes to the security scanning of Terraform code. Checkov, TFSec, and Terrascan are the commonly used tools. For illustration purposes, we’ll use the open source tool Checkov.
    3. IaC build stage – A stage for Terraform build. This includes an action for the Terraform execution plan followed by an action to apply the plan to deploy the stack to a specific Region in the target workload account.
  4. Once the Terraform apply is triggered, it deploys the infrastructure components in the target workload account to the AWS Region based on the git tag. In turn, you have the flexibility to point the deployment to any AWS Region or account configured in the repo.
  5. The sample infrastructure in the target workload account consists of an AWS Identity and Access Management (IAM) role, an external facing Application Load Balancer (ALB), as well as all of the required resources down to the Amazon Virtual Private Cloud (Amazon VPC). Upon successful deployment, browsing to the external facing ALB DNS Name URL displays a very simple message including the location of the Region.

Architectural considerations

Multi-account strategy

Leveraging a well-architected multi-account strategy, we have a separate central tooling account for housing the code repository and infrastructure pipeline, and a separate target workload account to house our sample workload infra-architecture. The clean account separation lets us easily control IAM permissions for granular access and apply different guardrails and security controls. Ultimately, this enforces the separation of concerns and minimizes the blast radius.

A dev pipeline, a qa pipeline, a staging pipeline, and a prod pipeline in the central tooling account, each targeting the workload account for the respective environment and pointing to the Regional resources containing a VPC and an ALB.

Fig 3. A separate pipeline per environment.

The sample architecture shown above contains a pipeline per environment (DEV, QA, STAGING, PROD) in the tooling account, deploying to the target workload account for the respective environment. At scale, you can consider having multiple infrastructure deployment pipelines for multiple business units in the central tooling account, thereby targeting workload accounts per environment and business unit. If your organization has a complex business unit structure and is bound to have different levels of compliance and security controls, then the central tooling account can be further divided into central tooling accounts per business unit.

Pipeline considerations

The infrastructure deployment pipeline is hosted in a central tooling account and targets workload accounts. The pipeline is the authoritative source managing the full lifecycle of resources. The goal is to decrease the risk of ad hoc changes (e.g., manual changes made directly via the console) that can’t be easily reproduced at a future date. The pipeline and the build step each run as their own IAM role that adheres to the principle of least privilege. The pipeline is configured with a stage to lint the Terraform code, as well as a static security scan of the Terraform resources following the principle of shifting security left in the SDLC.

As a further improvement for resiliency and applying the cell architecture principle to the CI/CD deployment, we can consider having multi-Region deployment of the AWS CodePipeline pipeline and AWS CodeBuild build resources, in addition to a clone of the AWS CodeCommit repository. We can use the approach detailed in this post to sync the repo across multiple regions. This means that both the workload architecture and the deployment infrastructure are multi-Region. However, it’s important to note that the business continuity requirements of the infrastructure deployment pipeline are most likely different than the requirements of the workloads themselves.

A dev pipeline in us-east-1, a dev pipeline in eu-central-1, a dev pipeline in ap-southeast-1, all in the central tooling account, each pointing respectively to the regional resources containing a VPC and an ALB for the respective Region in the dev target workload account.

Fig 4. Multi-Region CI/CD dev pipelines targeting the dev workload account resources in the respective Region.

Deeper dive into Terraform code

Backend configuration and state

As a prerequisite, we created Amazon S3 buckets to store the Terraform state files and Amazon DynamoDB tables for the state file locks. The latter is a best practice to prevent concurrent operations on the same state file. For naming the buckets and tables, our code expects the use of the same prefix (i.e., <tf_backend_config_prefix>-<env> for buckets and <tf_backend_config_prefix>-lock-<env> for tables). The value of this prefix must be passed in as an input param (i.e., “tf_backend_config_prefix”). Then, it’s fed into AWS CodeBuild actions for Terraform as an environment variable. Separation of remote state management resources (Amazon S3 bucket and Amazon DynamoDB table) across environments makes sure that we’re minimizing the blast radius.


-backend-config="bucket=${TF_BACKEND_CONFIG_PREFIX}-${ENV}" 
-backend-config="dynamodb_table=${TF_BACKEND_CONFIG_PREFIX}-lock-${ENV}"
A dev Terraform state files bucket named <prefix>-dev, a dev Terraform state locks DynamoDB table named <prefix>-lock-dev, a qa Terraform state files bucket named <prefix>-qa, a qa Terraform state locks DynamoDB table named <prefix>-lock-qa, a staging Terraform state files bucket named <prefix>-staging, a staging Terraform state locks DynamoDB table named <prefix>-lock-staging, a prod Terraform state files bucket named <prefix>-prod, and a prod Terraform state locks DynamoDB table named <prefix>-lock-prod, in us-east-1 in the central tooling account.

Fig 5. Terraform state file buckets and state lock tables per environment in the central tooling account.

The git tag that kicks off the pipeline is named with the following convention of “<env>_<region>/<team>/<version>” for regional deployments and “<env>_global/<team>/<version>” for global resource deployments. The stage following the source stage in our pipeline, the tflint stage, is where we parse the git tag. From the tag, we derive the values of environment, deployment scope (i.e., Region or global), and team to determine the Terraform state Amazon S3 object key uniquely identifying the Terraform state file for the deployment. The values of environment, deployment scope, and team are passed as environment variables to the subsequent AWS CodeBuild Terraform plan and apply actions.

-backend-config="key=${TEAM}/${ENV}-${TARGET_DEPLOYMENT_SCOPE}/terraform.tfstate"

We set the Region to the value of AWS_REGION env variable that is made available by AWS CodeBuild, and it’s the Region in which our build is running.

-backend-config="region=$AWS_REGION"

The following is how the Terraform backend config initialization looks in our AWS CodeBuild buildspec files for Terraform actions, such as tflint, plan, and apply.

terraform init \
  -backend-config="key=${TEAM}/${ENV}-${TARGET_DEPLOYMENT_SCOPE}/terraform.tfstate" \
  -backend-config="region=$AWS_REGION" \
  -backend-config="bucket=${TF_BACKEND_CONFIG_PREFIX}-${ENV}" \
  -backend-config="dynamodb_table=${TF_BACKEND_CONFIG_PREFIX}-lock-${ENV}" \
  -backend-config="encrypt=true"

Using this approach, the Terraform states for each combination of account and Region are kept in their own distinct state file. This means that if there is an issue with one Terraform state file, then the rest of the state files aren’t impacted.

In the central tooling account us-east-1 Region, Terraform state files named “research/dev-us-east-1/terraform.tfstate”, “risk/dev-ap-southeast-1/terraform.tfstate”, “research/dev-eu-central-1/terraform.tfstate”, and “research/dev-global/terraform.tfstate” are in the S3 bucket named <prefix>-dev along with the DynamoDB table for Terraform state locks named <prefix>-lock-dev. The Terraform state files named “research/qa-us-east-1/terraform.tfstate”, “risk/qa-ap-southeast-1/terraform.tfstate”, and “research/qa-eu-central-1/terraform.tfstate” are in the S3 bucket named <prefix>-qa along with the DynamoDB table for Terraform state locks named <prefix>-lock-qa. Similarly for staging and prod.

Fig 6. Terraform state files per account and Region for each environment in the central tooling account.

Following the example, a git tag of the form “dev_us-east-1/research/1.0” that kicks off the dev pipeline works against the research team’s dev account’s state file containing us-east-1 Regional resources (i.e., Amazon S3 object key “research/dev-us-east-1/terraform.tfstate” in the S3 bucket <tf_backend_config_prefix>-dev), and a git tag of the form “dev_ap-southeast-1/risk/1.0” that kicks off the dev pipeline works against the risk team’s dev account’s Terraform state file containing ap-southeast-1 Regional resources (i.e., Amazon S3 object key “risk/dev-ap-southeast-1/terraform.tfstate”). For global resources, we use a git tag of the form “dev_global/research/1.0” that kicks off a dev pipeline and works against the research team’s dev account’s global resources as they are at account level (i.e., “research/dev-global/terraform.tfstate”).
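To make that mapping concrete, here is a small illustrative Python sketch of the same convention. The sample actually derives these values in the CodeBuild tflint stage; this is not the repository’s code, just the naming logic spelled out.

def backend_settings(git_tag, tf_backend_config_prefix):
    # 'dev_us-east-1/research/1.0' -> env 'dev', scope 'us-east-1', team 'research', version '1.0'
    scope_part, team, version = git_tag.split("/")
    env, target_deployment_scope = scope_part.split("_", 1)
    return {
        "bucket": f"{tf_backend_config_prefix}-{env}",
        "dynamodb_table": f"{tf_backend_config_prefix}-lock-{env}",
        "key": f"{team}/{env}-{target_deployment_scope}/terraform.tfstate",
    }

# 'dev_ap-southeast-1/risk/1.0' resolves to key 'risk/dev-ap-southeast-1/terraform.tfstate'
print(backend_settings("dev_ap-southeast-1/risk/1.0", "<prefix>"))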

Git tag “dev_us-east-1/research/1.0” pointing to the Terraform state file named “research/dev-us-east-1/terraform.tfstate”, git tag “dev_ap-southeast-1/risk/1.0” pointing to “risk/dev-ap-southeast-1/terraform.tfstate”, git tag “dev_eu-central-1/research/1.0” pointing to “research/dev-eu-central-1/terraform.tfstate”, and git tag “dev_global/research/1.0” pointing to “research/dev-global/terraform.tfstate”, in the dev Terraform state files S3 bucket named <prefix>-dev along with the <prefix>-lock-dev DynamoDB dev Terraform state locks table.

Fig 7. Git tags and the respective Terraform state files.

This backend configuration makes sure that the state file for one account and Region is independent of the state file for the same account but different Region. Adding or expanding the workload to additional Regions would have no impact on the state files of existing Regions.

If we look at the further improvement where we make our deployment infrastructure also multi-Region, then we can consider each Region’s CI/CD deployment to be the authoritative source for its local Region’s deployments and Terraform state files. In this case, tagging against the repo triggers a pipeline within the local CI/CD Region to deploy resources in the Region. The Terraform state files in the local Region are used for keeping track of state for the account’s deployment within the Region. This further decreases cross-regional dependencies.

A dev pipeline in the central tooling account in us-east-1, pointing to the VPC containing an ALB in us-east-1 in the dev target workload account, along with a dev Terraform state files S3 bucket named <prefix>-use1-dev containing the us-east-1 Regional “research/dev/terraform.tfstate” and “risk/dev/terraform.tfstate” Terraform state files and a DynamoDB dev Terraform state locks table named <prefix>-use1-lock-dev. A dev pipeline in the central tooling account in eu-central-1, pointing to the VPC containing an ALB in eu-central-1 in the dev target workload account, along with a dev Terraform state files S3 bucket named <prefix>-euc1-dev containing the eu-central-1 Regional “research/dev/terraform.tfstate” and “risk/dev/terraform.tfstate” Terraform state files and a DynamoDB dev Terraform state locks table named <prefix>-euc1-lock-dev. A dev pipeline in the central tooling account in ap-southeast-1, pointing to the VPC containing an ALB in ap-southeast-1 in the dev target workload account, along with a dev Terraform state files S3 bucket named <prefix>-apse1-dev containing the ap-southeast-1 Regional “research/dev/terraform.tfstate” and “risk/dev/terraform.tfstate” Terraform state files and a DynamoDB dev Terraform state locks table named <prefix>-apse1-lock-dev.

Fig 8. Multi-Region CI/CD with Terraform state resources stored in the same Region as the workload account resources for the respective Region.

Provider

For deployments, we use the default Terraform AWS provider. The provider is parametrized with the value of the region passed in as an input parameter.

provider "aws" {
  region = var.region
   ...
}

Once the provider knows which Region to target, we can refer to the current AWS Region in the rest of the code.

# The value of the current AWS region is the name of the AWS region configured on the provider
# https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region
data "aws_region" "current" {} 

locals {
    region = data.aws_region.current.name # then use local.region where region is needed
}

The provider is configured to assume a cross-account IAM role defined in the workload account. The value of the account ID is fed in as an input parameter.

provider "aws" {
  region = var.region
  assume_role {
    role_arn     = "arn:aws:iam::${var.account}:role/InfraBuildRole"
    session_name = "INFRA_BUILD"
  }
}

This InfraBuildRole IAM role could be created as part of the account creation process. The AWS Control Tower Terraform Account Factory could be used to automate this.

Code

Minimize cross-regional dependencies

We keep the Regional resources and the global resources (e.g., IAM role or policy) in distinct namespaces following the cell architecture principle. We treat each Region as one cell, with the goal of decreasing cross-regional dependencies. Regional resources are created once in each Region. On the other hand, global resources are created once globally and may have cross-regional dependencies (e.g., DynamoDB global table with a replica table in multiple Regions). There’s no “global” Terraform AWS provider since the AWS provider requires a Region. This means that we pick a specific Region from which to deploy our global resources (i.e., global_resource_deploy_from_region input param). By creating a distinct Terraform namespace for Regional resources (e.g., module.regional) and a distinct namespace for global resources (e.g., module.global), we can target a deployment for each using pipelines scoped to the respective namespace (e.g., module.global or module.regional).

Deploying Regional resources: A dev pipeline in the central tooling account triggered via git tag “dev_eu-central-1/research/1.0” pointing to the eu-central-1 VPC containing ALB in the research dev target workload account corresponding to the module.regional Terraform namespace. Deploying global resources: a dev pipeline in the central tooling account triggered via git tag “dev_global/research/1.0” pointing to the IAM resource corresponding to the module.global Terraform namespace.

Fig 9. Deploying regional and global resources scoped to the Terraform namespace

Global resources are scoped to the whole account regardless of Region, while Regional resources are scoped to the respective Region in the account. One consideration, and a trade-off of having to pick a Region from which to deploy global resources, is that this introduces a dependency on that Region for the deployment of the global resources. In addition, a misconfiguration of a global resource may impact every Region in which we deployed our workloads. Consider a scenario where an IAM role has access to an S3 bucket: if the IAM role is misconfigured as a result of one of the deployments, then access to the S3 bucket may be impacted in each Region.

There are alternate approaches, such as creating an IAM role per Region (myrole-use1 with access to the S3 bucket in us-east-1, myrole-apse1 with access to the S3 bucket in ap-southeast-1, etc.). This would make sure that if the respective IAM role is misconfigured, then the impact is scoped to that Region. Another approach is versioning our global resources (e.g., myrole-v1, myrole-v2) with the ability to move to a new version and roll back to a previous version if needed. Each of these approaches has its own drawbacks; for example, duplicating global resources may make auditing more cumbersome, which is the trade-off for minimizing cross-Regional dependencies.

We recommend looking at the pros and cons of each approach and selecting the approach that best suits the requirements for your workloads regarding the flexibility to deploy to multiple Regions.

Consistency

We keep one copy of the infrastructure code and deploy the resources targeted for each Region using this same copy. Our code is built using versioned module composition as the “lego blocks”. This follows the DRY (Don’t Repeat Yourself) principle and decreases the risk of code drift per Region. We may deploy to any Region independently, including any Regions added at a future date with zero code changes and minimal additional configuration for that Region. We can see three advantages with this approach.

  1. The total deployment time per Region remains the same regardless of the addition of Regions. This helps for restrictions, such as tight release windows due to business requirements.
  2. If there’s an issue with one of the regional deployments, then the remaining Regions and their deployment pipelines aren’t affected.
  3. It allows staggering deployments, or not deploying to every Region in non-critical environments (e.g., dev), to minimize costs and remain in line with the Well-Architected Sustainability Pillar.

Conclusion

In this post, we demonstrated a multi-account, multi-region deployment approach, along with sample code, with a focus on architecture using IaC tool Terraform and CI/CD services AWS CodeBuild and AWS CodePipeline to help customers in their journey through multi-Region deployments.

Thanks to Welly Siauw, Kenneth Jackson, Andy Taylor, Rodney Bozo, Craig Edwards and Curtis Rissi for their contributions reviewing this post and its artifacts.

Author:

Lerna Ekmekcioglu

Lerna Ekmekcioglu is a Senior Solutions Architect with AWS where she helps Global Financial Services customers build secure, scalable and highly available workloads.
She brings over 17 years of platform engineering experience including authentication systems, distributed caching, and multi region deployments using IaC and CI/CD to name a few.
In her spare time, she enjoys hiking, sightseeing and backyard astronomy.

Jack Iu

Jack is a Global Solutions Architect at AWS Financial Services. Jack is based in New York City, where he works with Financial Services customers to help them design, deploy, and scale applications to achieve their business goals. In his spare time, he enjoys badminton and loves to spend time with his wife and Shiba Inu.

Continually assessing application resilience with AWS Resilience Hub and AWS CodePipeline

Post Syndicated from Scott Bryen original https://aws.amazon.com/blogs/architecture/continually-assessing-application-resilience-with-aws-resilience-hub-and-aws-codepipeline/

As customers commit to a DevOps mindset and embrace a nearly continuous integration/continuous delivery model to implement change with a higher velocity, assessing the impact of every change on an application’s resilience is key. This blog shows an architecture pattern for automating resiliency assessments as part of your CI/CD pipeline. By automatically running a resiliency assessment within CI/CD pipelines, development teams can fail fast and understand quickly if a change negatively impacts an application’s resilience. The pipeline can stop the deployment into further environments, such as QA/UAT and Production, until the resilience issues have been improved.

AWS Resilience Hub is a managed service that gives you a central place to define, validate, and track the resiliency of your AWS applications. It is integrated with AWS Fault Injection Simulator (FIS), a chaos engineering service, to provide fault-injection simulations of real-world failures. Using AWS Resilience Hub, you can assess your applications to uncover potential resilience enhancements. This allows you to validate your application’s recovery time objective (RTO) and recovery point objective (RPO), and optimize business continuity while reducing recovery costs. Resilience Hub also provides APIs for you to integrate its assessment and testing into your CI/CD pipelines for ongoing resilience validation.

AWS CodePipeline is a fully managed continuous delivery service for fast and reliable application and infrastructure updates. You can use AWS CodePipeline to model and automate your software release processes. This enables you to increase the speed and quality of your software updates by running all new changes through a consistent set of quality checks.

Continuous resilience assessments

Figure 1 shows the resilience assessment automation architecture in a multi-account setup. AWS CodePipeline, AWS Step Functions, and AWS Resilience Hub are defined in your deployment account, while the application AWS CloudFormation stacks are imported from your workload account. This pattern relies on AWS Resilience Hub’s ability to import CloudFormation stacks from different accounts, Regions, or both, when discovering an application’s structure.

High-level architecture pattern for automating resilience assessments

Figure 1. High-level architecture pattern for automating resilience assessments

Add application to AWS Resilience Hub

Begin by adding your application to AWS Resilience Hub and assigning a resilience policy. This can be done via the AWS Management Console or using CloudFormation. In this instance, the application has been created through the AWS Management Console. Sebastien Stormacq’s post, Measure and Improve Your Application Resilience with AWS Resilience Hub, walks you through how to add your application to AWS Resilience Hub.

In a multi-account environment, customers typically have a dedicated AWS workload account per environment, and we recommend you separate CI/CD capabilities into another account. In this post, the AWS Resilience Hub application has been created in the deployment account, and the resources have been discovered using a CloudFormation stack from the workload account. Proper permissions are required to use AWS Resilience Hub to manage applications in multiple accounts.
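If you prefer automation over the console, the following is a minimal boto3 sketch of registering the application and its policy; the RTO/RPO targets and names are placeholders, not values from the walkthrough above, and the exact parameters may differ for your setup.

import boto3

resiliencehub = boto3.client("resiliencehub")

# Define a resiliency policy (tier and RTO/RPO targets below are placeholders).
policy = resiliencehub.create_resiliency_policy(
    policyName="demo-policy",
    tier="Important",
    policy={
        "AZ":       {"rtoInSecs": 300, "rpoInSecs": 60},
        "Hardware": {"rtoInSecs": 300, "rpoInSecs": 60},
        "Software": {"rtoInSecs": 300, "rpoInSecs": 60},
        "Region":   {"rtoInSecs": 3600, "rpoInSecs": 900},
    },
)["policy"]

# Register the application in the deployment account and attach the policy.
app = resiliencehub.create_app(
    name="demo-app",
    description="Application discovered from a CloudFormation stack in the workload account",
    policyArn=policy["policyArn"],
)["app"]
print(app["appArn"])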

Adding application to AWS Resilience Hub

Figure 2. Adding application to AWS Resilience Hub

Create AWS Step Function to run resilience assessment

Whenever you make a change to your application’s CloudFormation template, you need to update and publish the latest version in AWS Resilience Hub to ensure you are assessing the latest changes. Now that AWS Step Functions SDK integrations support AWS Resilience Hub, you can build a state machine to coordinate the process, which will be triggered from AWS CodePipeline.

AWS Step Functions is a low-code, visual workflow service that developers use to build distributed applications, automate IT and business processes, and build data and machine learning pipelines using AWS services. Workflows manage failures, retries, parallelization, service integrations, and observability so developers can focus on higher-value business logic.

AWS Step Function for orchestrating AWS SDK calls

Figure 3. AWS Step Function for orchestrating AWS SDK calls

  1. The first step in the workflow is to update the resources associated with the application defined in AWS Resilience Hub by calling ImportResourcesToDraftApplication (a Python sketch of the full call sequence follows this list).
  2. Check for the import process to complete using a wait state, a call to DescribeDraftAppVersionResourcesImportStatus and then a choice state to decide whether to progress or continue waiting.
  3. Once complete, publish the draft application by calling PublishAppVersion to ensure we are assessing the latest version.
  4. Once published, call StartAppAssessment to kick-off a resilience assessment.
  5. Check for the assessment to complete using a wait state, a call to DescribeAppAssessment and then a choice state to decide whether to progress or continue waiting.
  6. In the choice state, use assessment status from the response to determine if the assessment is pending, in progress or successful.
  7. If successful, use the compliance status from the response to determine whether to progress to success or fail.
    • Compliance status will be either “PolicyMet” or “PolicyBreached”.
  8. If policy breached, publish onto SNS to alert the development team before moving to fail.
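The state machine above uses Step Functions SDK integrations; the following Python sketch shows roughly the same sequence of boto3 calls for readability. It is illustrative only, and the exact parameters may differ from the sample’s state machine.

import time
import boto3

resiliencehub = boto3.client("resiliencehub")
sns = boto3.client("sns")

def run_assessment(app_arn, stack_arn, topic_arn):
    # 1. Import the latest CloudFormation stack resources into the draft app version.
    resiliencehub.import_resources_to_draft_app_version(appArn=app_arn, sourceArns=[stack_arn])

    # 2. Wait for the import to complete.
    while resiliencehub.describe_draft_app_version_resources_import_status(
            appArn=app_arn)["status"] in ("Pending", "InProgress"):
        time.sleep(10)

    # 3. Publish the draft so the assessment runs against the latest version.
    app_version = resiliencehub.publish_app_version(appArn=app_arn)["appVersion"]

    # 4. Kick off the resilience assessment.
    assessment_arn = resiliencehub.start_app_assessment(
        appArn=app_arn, appVersion=app_version, assessmentName="pipeline-assessment"
    )["assessment"]["assessmentArn"]

    # 5-6. Poll until the assessment is no longer pending or in progress.
    while True:
        assessment = resiliencehub.describe_app_assessment(
            assessmentArn=assessment_arn)["assessment"]
        if assessment["assessmentStatus"] not in ("Pending", "InProgress"):
            break
        time.sleep(30)

    # 7-8. Alert the team and fail if the resiliency policy is breached.
    if assessment["assessmentStatus"] != "Success" or assessment.get("complianceStatus") != "PolicyMet":
        sns.publish(TopicArn=topic_arn, Message="Resilience assessment did not meet the policy")
        raise RuntimeError("Resilience policy breached or assessment failed")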

Create stage within code pipeline

Now that we have the AWS Step Function created, we need to integrate it into our pipeline. The post Fine-grained Continuous Delivery With CodePipeline and AWS Step Functions demonstrates how you can trigger a step function from AWS CodePipeline.

When adding the stage, you need to pass the ARN of the stack which was deployed in the previous stage, as well as the ARN of the application in AWS Resilience Hub. These are required for the AWS SDK calls, and you can pass them in as literals.

AWS CodePipeline stage step function input

Figure 4. AWS CodePipeline stage step function input

Example state using the input from AWS CodePipeline stage

Figure 5. Example state using the input from AWS CodePipeline stage

For more information about these AWS SDK calls, please refer to the AWS Resilience Hub API Reference documents.

Customers often run their workloads in lower environments in a less resilient way to save on cost. It’s important to add the assessment stage at the appropriate point of your pipeline. We recommend adding this to your pipeline after the deployment to a test environment which mirrors production but before deploying to production. By doing this you can fail fast and halt changes which will lower resilience in production.

A note on service quotas: AWS Resilience Hub allows you to run 20 assessments per month per application. If you need to increase this quota, please raise a ticket with AWS Support.

Conclusion

In this post, we have seen an approach to continuously assessing resilience as part of your CI/CD pipeline using AWS Resilience Hub, AWS CodePipeline and AWS Step Functions. This approach will enable you to understand fast if a change will weaken resilience.

AWS Resilience Hub also generates recommended AWS FIS Experiments that you can deploy and use to test the resilience of your application. In addition to assessing resilience, we also recommend integrating these tests into your pipeline. The post Chaos Testing with AWS Fault Injection Simulator and AWS CodePipeline demonstrates how you can achieve this.

Build Health Aware CI/CD Pipelines

Post Syndicated from sangusah original https://aws.amazon.com/blogs/devops/build-health-aware-ci-cd-pipelines/

Everything fails all the time — Werner Vogels, AWS CTO

At the moment of imminent failure, you want to avoid an unlucky deployment. I’ll start here with a short story that demonstrates the purpose of this post.

The DevOps team has just started a database upgrade with a planned outage of 30 minutes. The team automated the entire upgrade flow, triggered a CI/CD pipeline with no human intervention, and the upgrade is progressing smoothly. Then, 20 minutes in, the pipeline is stuck, and your upgrade isn’t progressing. The maintenance window has expired and customers can’t transact. You’ve created a support case, and the AWS engineer confirmed that the upgrade is failing because of a running AWS Health incident in the us-west-2 Region. The engineer has directed the DevOps team to continue monitoring the status.aws.amazon.com page for updates regarding incident resolution. The event continued running for three hours, during which time customers couldn’t transact. Once resolved, the DevOps team retried the failed pipeline, and it completed successfully.

After the incident, the DevOps team explored the possibilities for avoiding these types of incidents in the future. The team was made aware of the AWS Health API, which provides programmatic access to AWS Health information. In this post, we’ll help the DevOps team make the most of the AWS Health API to proactively prevent unintended outages.

AWS provides Business and Enterprise Support customers with access to the AWS Health API. Customers can have access to running events in the AWS infrastructure that may impact their service usage. Incidents could be Regional, AZ-specific, or even account specific. During these incidents, it isn’t recommended to deploy or change services that are impacted by the event.

In this post, I will walk you through how to embed AWS Health API insights into your CI/CD pipelines to automatically stop deployments whenever an AWS Health event is reported in a Region that you’re operating in. Furthermore, I will demonstrate how you can automate detection and remediation.

The Demo

In this demo, I will use AWS CodePipeline to demonstrate the idea. I will build a simple pipeline that demonstrates the concept without going into the build, test, and deployment specifics.

CodePipeline Flow

The CodePipeline flow consists of three steps:

  1. Source stage that downloads a CloudFormation template from AWS CodeCommit. The template will be deployed in the last stage.
  2. Custom stage that invokes the AWS Lambda function to evaluate the AWS Health. The Lambda function calls the AWS Health API, evaluates the health risk, and calls back CodePipeline with the assessment result.
  3. Deploy stage that deploys the CloudFormation templates downloaded from CodeCommit in the first stage.
The CodePipeline flow consists of 3 steps. First, a "source stage" downloads a CloudFormation template from CodeCommit; the template will be deployed in the last stage. Step 2 is a "custom stage" that invokes the Lambda function to evaluate AWS Health. The Lambda function calls the AWS Health API, evaluates the health risk, and calls back CodePipeline with the assessment result. Finally, step 3 is a "deploy stage" that deploys the CloudFormation template downloaded from CodeCommit in the first stage. If a health event is detected in step 2, the workflow will retry after a predefined timeout.

Figure 1. CodePipeline workflow.

Lambda evaluation logic

The Lambda function evaluates whether or not a running AWS Health event may be impacted by the deployment. In this case, the following criteria must be met to consider it as safe to deploy:

  • Deployment will take place in the North Virginia Region and accordingly the Lambda function will filter on the us-east-1 Region.
  • A closed event is irrelevant. The Lambda function will filter events with only the open status.
  • AWS Health API can return different event types that may not be relevant, such as: Scheduled Maintenance, and Account and Billing notifications. The Lambda function will filter only “Issue” type events.

The AWS Health API follows a multi-Region application architecture and has two regional endpoints in an active-passive configuration. To support active-passive DNS failover, AWS Health provides a global endpoint. The Python code is available on GitHub with more information in the README on how to build the Lambda code package.
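The full Lambda code is in the GitHub repository mentioned above; the following is only a condensed sketch of the evaluation logic described here, assuming boto3 and a Lambda function invoked directly by a CodePipeline action.

import boto3

# AWS Health exposes its describe APIs through a global endpoint; by default
# boto3 resolves it via us-east-1.
health = boto3.client("health", region_name="us-east-1")
codepipeline = boto3.client("codepipeline")

def handler(event, context):
    # The CodePipeline job ID is needed to report the result back to the pipeline.
    job_id = event["CodePipeline.job"]["id"]

    events = health.describe_events(
        filter={
            "regions": ["us-east-1"],          # the deployment Region
            "eventStatusCodes": ["open"],      # ignore closed events
            "eventTypeCategories": ["issue"],  # ignore scheduled maintenance and billing notices
        }
    )["events"]

    if events:
        codepipeline.put_job_failure_result(
            jobId=job_id,
            failureDetails={"type": "JobFailed",
                            "message": f"{len(events)} open AWS Health issue(s) in us-east-1"},
        )
    else:
        codepipeline.put_job_success_result(jobId=job_id)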

The Lambda function requires the following AWS Identity and Access Management (IAM) permissions to access AWS Health API, CodePipeline, and publish logs to CloudWatch:

{
  "Version": "2012-10-17", 
  "Statement": [
    {
      "Action": [ 
        "logs:CreateLogStream",
        "logs:CreateLogGroup",
        "logs:PutLogEvents"
      ],
      "Effect": "Allow", 
      "Resource": "arn:aws:logs:us-east-1:replaceWithAccountNumber:*"
    },
    {
      "Action": [
        "codepipeline:PutJobSuccessResult",
        "codepipeline:PutJobFailureResult"
        ],
        "Effect": "Allow",
        "Resource": "*"
     },
     {
        "Effect": "Allow",
        "Action": "health:DescribeEvents",
        "Resource": "*"
    }
  ]
}

Solution architecture

This is the solution architecture diagram. It involves three entities: AWS CodePipeline, AWS Lambda, and the AWS Health API. First, AWS CodePipeline invokes the Lambda function asynchronously. Second, the Lambda function calls the AWS Health API DescribeEvents operation. Third, the DescribeEvents API responds with a list of health events. Finally, the Lambda function responds with either a success or a failure result by calling PutJobSuccessResult or PutJobFailureResult, respectively.

Figure 2. Solution architecture diagram.

In CodePipeline, create a new stage with a single action to asynchronously invoke a Lambda function. The function will call AWS Health DescribeEvents API to retrieve the list of active health incidents. Then, the function will complete the event analysis and decide whether or not it may impact the running deployment. Finally, the function will call back CodePipeline with the evaluation results through either PutJobSuccessResult or PutJobFailureResult API operations.

If the Lambda evaluation succeeds, then it will call back the pipeline with a PutJobSuccessResult API. In turn, the pipeline will mark the step as successful and complete the execution.

AWS CodePipeline workflow execution snapshot from the AWS Console. The first step, Source, is a success after completing the source code download from the AWS CodeCommit service. The second step, which checks the AWS service health, is a success as well.

Figure 3. AWS CodePipeline workflow successful execution.

If the Lambda evaluation fails, then it will call back the pipeline with a PutJobFailureResult API specifying a failure message. Once the DevOps team is made aware that the event has been resolved, select the Retry button to re-evaluate the health status.

AWS CodePipeline workflow execution snapshot from the AWS Console. The first step, Source, is a success after completing the source code download from the AWS CodeCommit service. The second step, which checks the AWS service health, has failed after detecting a running health event/incident in the operating AWS Region.

Figure 4. AWS CodePipeline workflow failed execution.

Your DevOps team must be aware of failed deployments. Therefore, it’s a good idea to configure alerts to notify concerned stakeholders of failed stage executions. Create a notification rule that posts a Slack message if a stage fails. For detailed steps, see Create a notification rule – AWS CodePipeline. In case of failure, a Slack notification will be sent through AWS Chatbot.

A Slack UI snapshot showing the notification to be sent if a deployment fails to execute. The notification shows a title of "AWS CodePipeline Notification". The notification indicates that one action has failed in the stage aws-health-check. The notification also shows that the failure reason is that there is an Incident In Progress. The notification also mentions the Pipeline name as well as the failed stage name.

Figure 5. Slack UI snapshot notification for a failed deployment.

A more elegant solution involves pushing the notification to an SNS topic that in turn calls a Lambda function to retry the failed stage. The Lambda function extracts the failed stage identifier from the pipeline event, and then calls the RetryStageExecution CodePipeline API.
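A minimal sketch of such a retry Lambda is shown below. The shape of the SNS message body is an assumption based on CodePipeline stage-state change events; inspect a real notification from your rule before relying on these field names.

import json
import boto3

codepipeline = boto3.client("codepipeline")

def handler(event, context):
    # Assumed payload shape: the SNS message wraps a CodePipeline
    # "Stage Execution State Change" event with a "detail" section.
    detail = json.loads(event["Records"][0]["Sns"]["Message"])["detail"]

    # Re-run only the failed actions of the failed stage.
    codepipeline.retry_stage_execution(
        pipelineName=detail["pipeline"],
        stageName=detail["stage"],
        pipelineExecutionId=detail["execution-id"],
        retryMode="FAILED_ACTIONS",
    )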

Conclusion

We’ve learned how to create an automation that evaluates the risk associated with proceeding with a deployment in conjunction with a running AWS Health event. The automation then decides whether to proceed with the deployment or block the progress to avoid unintended downtime. Accordingly, this results in improved availability of your application.

This solution isn’t exclusive to CodePipeline; the pattern can be applied to other CI/CD tools that your DevOps team uses.

Author:

Islam Ghanim

Islam Ghanim is a Senior Technical Account Manager at Amazon Web Services in Melbourne, Australia. He enjoys helping customers build resilient and cost-efficient architectures. Outside work, he plays squash, tennis and almost any other racket sport.

Govern CI/CD best practices via AWS Service Catalog

Post Syndicated from César Prieto Ballester original https://aws.amazon.com/blogs/devops/govern-ci-cd-best-practices-via-aws-service-catalog/

Introduction

AWS Service Catalog enables organizations to create and manage Information Technology (IT) services catalogs that are approved for use on AWS. These IT services can include resources such as virtual machine images, servers, software, and databases to complete multi-tier application architectures. AWS Service Catalog lets you centrally manage deployed IT services and your applications, resources, and metadata, which helps you achieve consistent governance and meet your compliance requirements. In addition, this configuration enables users to quickly deploy only approved IT services.

In large organizations, as more products are created, Service Catalog management can become exponentially complicated when different teams work on various products. The following solution simplifies the provisioning of Service Catalog products by handling elements such as shared accounts, roles or users who can access portfolios, and tags, as best practices via Continuous Integration and Continuous Deployment (CI/CD) patterns.

This post demonstrates how Service Catalog Products can be delivered by taking advantage of the main benefits of CI/CD principles along with reducing complexity required to sync services. In this scenario, we have built a CI/CD Pipeline exclusively using AWS Services and the AWS Cloud Development Kit (CDK) Framework to provision the necessary Infrastructure.

Customers need the capability to consume services in a self-service manner, with services built on patterns that follow best practices, including focus areas such as compliance and security. The key tenets for these customers are the use of infrastructure as code (IaC) and CI/CD. For these reasons, we built the scalable and automated deployment solution covered in this post. Furthermore, this post is also inspired by another post from the AWS community, Building a Continuous Delivery Pipeline for AWS Service Catalog.

Solution Overview

The solution is built using a unified AWS CodeCommit repository with CDK v1 code, which manages and deploys the Service Catalog Product estate. The solution supports the following scenarios: 1) making Products available to accounts and 2) provisioning these Products directly into accounts. The configuration provides flexibility regarding which components must be deployed in accounts as opposed to making a collection of these components available to account owners/users who can in turn build upon and provision them via sharing.

Figure shows the stages of the created pipeline

The pipeline created is comprised of the following stages:

  1. Retrieve the code from the repository
  2. Synthesize the CDK code to transform it into a CloudFormation template
  3. Ensure the pipeline is defined correctly
  4. Deploy and/or share the defined Portfolios and Products to a hub account or multiple accounts

Deploying and using the solution

Deploy the pipeline

We have created a Python AWS Cloud Development Kit (AWS CDK) v1 application hosted in a Git Repository. Deploying this application will create the required components described in this post. For a list of the deployment prerequisites, see the project README.

Clone the repository to your local machine. Then, bootstrap and deploy the CDK stack following the next steps.

git clone https://github.com/aws-samples/aws-cdk-service-catalog-pipeline
cd aws-cdk-service-catalog-pipeline
pip install -r requirements.txt
cdk bootstrap aws://account_id/eu-west-1
cdk deploy

The infrastructure creation takes around 3-5 minutes to complete, deploying the AWS CodePipeline pipelines and creating the repository. Once CDK has deployed the components, you will have a new empty repository where we will define the target Service Catalog estate. To do so, clone the new repository and push our sample code into it:

git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/service-catalog-repo
cd service-catalog-repo
git checkout -b main
cp -aR ../cdk-service-catalog-pipeline/* .
git add .
git commit -am "First commit"
git push origin main

Review and update configuration

Our cdk.json file is used to manage context settings such as shared accounts, permissions, region to deploy, etc.

  • shared_accounts_ecs: AWS account IDs where the ECS portfolio will be shared
  • shared_accounts_storage: AWS account IDs where the Storage portfolio will be shared
  • roles: ARNs for the roles that will have permissions to access the Portfolio
  • users: ARNs for the users who will have permissions to access the Portfolio
  • groups: ARNs for the groups that will have permissions to access the Portfolio
  • hub_account: AWS account ID where the Portfolio will be created
  • pipeline_account: AWS account ID where the main Infrastructure Pipeline will be created
  • region: the AWS Region to be used for the deployment of the account
"shared_accounts_ecs":["012345678901","012345678902"],
    "shared_accounts_storage":["012345678901","012345678902"],
    "roles":[],
    "users":[],
    "groups":[],
    "hub_account":"012345678901",
    "pipeline_account":"012345678901",
    "region":"eu-west-1"

There are two mechanisms that can be used to create Service Catalog Products in this solution: 1) providing a CloudFormation template or 2) declaring a CDK stack (that will be transformed as part of the pipeline). Our sample contains two Products, each demonstrating one of these options: an Amazon Elastic Container Services (ECS) deployment and an Amazon Simple Storage Service (S3) product.

These Products are automatically shared with accounts specified in the shared_accounts_storage variable. Each product is managed by a CDK Python file in the cdk_service_catalog folder.

Figure shows Pipeline stages that AWS CodePipeline runs through

The Pipeline stages that AWS CodePipeline runs through are as follows:

  1. Download the AWS CodeCommit code
  2. Synthesize the CDK code to transform it into a CloudFormation template
  3. Auto-modify the Pipeline in case you have made manual changes to it
  4. Deploy and/or share the defined Portfolios and Products to a hub account in a Region or to multiple accounts

Adding new Portfolios and Products

To add a new Portfolio to the Pipeline, we recommend creating a new class under cdk_service_catalog similar to cdk_service_catalog_ecs_stack.py from our sample. Once the new class is created with the products you wish to associate, we instantiate the new class inside cdk_pipelines.py, and then add it inside the wave in the stage. There are two ways to create portfolio products. The first one is by creating a CloudFormation template, as can be seen in the Amazon Elastic Container Service (ECS) example. The second way is by creating a CDK stack that will be transformed into a template, as can be seen in the Storage example.

Product and Portfolio definition:

import aws_cdk as cdk
from aws_cdk import aws_ecs as ecs
from aws_cdk import aws_servicecatalog as servicecatalog


class ECSCluster(servicecatalog.ProductStack):
    def __init__(self, scope, id):
        super().__init__(scope, id)
        # Parameters for the Product Template
        cluster_name = cdk.CfnParameter(self, "clusterName", type="String", description="The name of the ECS cluster")
        container_insights_enable = cdk.CfnParameter(self, "container_insights", type="String", default="False", allowed_values=["False", "True"], description="Enable Container Insights")
        vpc = cdk.CfnParameter(self, "vpc", type="AWS::EC2::VPC::Id", description="VPC")
        # Create the ECS cluster that the provisioned product will deploy
        ecs.Cluster(self, "ECSCluster_template", enable_fargate_capacity_providers=True, cluster_name=cluster_name.value_as_string, container_insights=bool(container_insights_enable.value_as_string), vpc=vpc)
        cdk.Tags.of(self).add("key", "value")
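
The product stack above still needs to be attached to a Portfolio and shared with the target accounts. The following Python sketch illustrates how that wiring could look; the construct IDs, portfolio metadata, module path, and sharing loop are assumptions for illustration rather than the sample repository's exact code.

# Hypothetical portfolio stack sketch; names, IDs, and module path are assumptions.
import aws_cdk as cdk
from aws_cdk import aws_servicecatalog as servicecatalog
from constructs import Construct

from cdk_service_catalog.cdk_service_catalog_ecs_stack import ECSCluster  # assumed module path


class EcsPortfolioStack(cdk.Stack):
    def __init__(self, scope: Construct, construct_id: str, shared_accounts, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        portfolio = servicecatalog.Portfolio(
            self, "ECSPortfolio",
            display_name="ECS",
            provider_name="Platform Engineering",
        )

        product = servicecatalog.CloudFormationProduct(
            self, "ECSClusterProduct",
            product_name="ECS Cluster",
            owner="Platform Engineering",
            product_versions=[
                servicecatalog.CloudFormationProductVersion(
                    product_version_name="v1",
                    # Transform the CDK ProductStack into the CloudFormation template served by Service Catalog
                    cloud_formation_template=servicecatalog.CloudFormationTemplate.from_product_stack(
                        ECSCluster(self, "ECSClusterTemplate")
                    ),
                )
            ],
        )

        portfolio.add_product(product)
        for account_id in shared_accounts:  # for example, the shared_accounts_ecs context values
            portfolio.share_with_account(account_id)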

Clean up

After completing your demo, delete the stacks created by this post using the CDK CLI:

cdk destroy --all

Conclusion

In this post, we demonstrated how Service Catalog deployments can be accelerated by building a CI/CD pipeline using self-managed services. The Portfolio & Product estate is defined in its entirety by using Infrastructure-as-Code and automatically deployed based on your configuration. To learn more about AWS CDK Pipelines or AWS Service Catalog, visit the appropriate product documentation.

Authors:

 

César Prieto Ballester

César Prieto Ballester is a Senior DevOps Consultant at AWS. He enjoys automating everything and building infrastructure using code. Apart from work, he plays electric guitar and loves riding his mountain bike.

Daniel Mutale

Daniel Mutale is a Cloud Infrastructure Architect at AWS Professional Services. He enjoys creating cloud based architectures and building out the underlying infrastructure to support the architectures using code. Apart from work, he is an avid animal photographer and has a passion for interior design.

Raphael Sack

Raphael is a technical business development manager for Service Catalog and Control Tower. He enjoys tinkering with automation and code and is an active member of the management tools community.

How to unit test and deploy AWS Glue jobs using AWS CodePipeline

Post Syndicated from Praveen Kumar Jeyarajan original https://aws.amazon.com/blogs/devops/how-to-unit-test-and-deploy-aws-glue-jobs-using-aws-codepipeline/

This post shows how to unit test Python-based AWS Glue ETL jobs using the PyTest framework in AWS CodePipeline. Several options currently exist for unit testing Python scripts for Glue jobs in a local environment. Although a local development environment can be set up to build and unit test Python-based Glue jobs by following the documentation, replicating the same procedure in a DevOps pipeline is difficult and time consuming.

Unit test scripts are one of the initial quality gates used by developers to provide a high-quality build. These scripts must be reused during regression testing to make sure that all of the existing functionality is intact and that new releases don't disrupt key application functionality. Most regression test suites are expected to be integrated with the DevOps pipeline for execution. Unit testing is a fundamental task that evaluates whether each unit of code written by a programmer functions as expected, and it provides a mechanism to confirm that software quality hasn't been compromised. One of the difficulties in building Python-based Glue ETL jobs is incorporating their unit tests into the DevOps pipeline, especially when modernizing mainframe ETL processes to modern tech stacks on AWS.

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development. AWS Glue provides all of the capabilities needed for data integration. This means that you can start analyzing your data and putting it to use in minutes rather than months. AWS Glue provides both visual and code-based interfaces to make data integration easier.

Prerequisites

GitHub Repository

Amazon ECR Image URI for Glue Library

Solution overview

A typical enterprise-scale DevOps pipeline is illustrated in the following diagram. This solution describes how to incorporate the unit testing of Python-based AWS Glue ETL processes into the AWS DevOps Pipeline.

Figure 1 Solution Overview

The GitHub repository aws-glue-jobs-unit-testing has a sample Python-based Glue job in the src folder. Its associated unit test cases, built using the PyTest framework, are in the tests folder. An AWS CloudFormation template written in YAML is included in the deploy folder. AWS CodeBuild uses custom container images as its runtime environment; this feature is used to build the project with the Glue libraries from the public Amazon ECR repository so that the code package can be run to demonstrate unit testing integration.
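
To give a flavor of what such a test looks like, the following is a minimal, hypothetical PyTest case that can run inside the Glue library container; the transformation under test (keeping only active records) is illustrative and not the exact code in the sample repository.

# Hypothetical unit test for a Glue job transformation; the logic under test is illustrative.
import pytest
from pyspark.context import SparkContext
from awsglue.context import GlueContext


@pytest.fixture(scope="module")
def glue_context():
    # Reuse a single Spark/Glue context across the test module
    spark_context = SparkContext.getOrCreate()
    return GlueContext(spark_context)


def test_only_active_records_are_kept(glue_context):
    spark = glue_context.spark_session
    source = spark.createDataFrame(
        [("1", "ACTIVE"), ("2", "INACTIVE")],
        ["id", "status"],
    )
    result = source.filter(source.status == "ACTIVE").collect()
    assert [row.id for row in result] == ["1"]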

Solution walkthrough

Time to read: 7 min
Time to complete: 15-20 min
Learning level: 300
Services used: AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, Amazon Elastic Container Registry (Amazon ECR) Public Repositories, AWS CloudFormation

The container image in the Amazon ECR Public repository for AWS Glue libraries includes all of the binaries required to run PySpark-based AWS Glue ETL tasks locally, as well as to unit test them. The public container repository has three image tags, one for each supported AWS Glue version. To demonstrate the solution, we use the image tag glue_libs_3.0.0_image_01 in this post. To utilize this container image as a runtime image in CodeBuild, copy the image URI corresponding to the image tag that you intend to use, as shown in the following image.

Figure 2 Select Glue Library from Public ECR

The aws-glue-jobs-unit-testing GitHub repository contains a CloudFormation template, pipeline.yml, which deploys a CodePipeline pipeline with CodeBuild projects to create, test, and publish the AWS Glue job. As illustrated in the following snippet, the copied image URI from Amazon ECR Public is used to create and test a CodeBuild project.

  TestBuild:
    Type: AWS::CodeBuild::Project
    Properties:
      Artifacts:
        Type: CODEPIPELINE
      BadgeEnabled: false
      Environment:
        ComputeType: BUILD_GENERAL1_LARGE
        Image: "public.ecr.aws/glue/aws-glue-libs:glue_libs_3.0.0_image_01"
        ImagePullCredentialsType: CODEBUILD
        PrivilegedMode: false
        Type: LINUX_CONTAINER
      Name: !Sub "${RepositoryName}-${BranchName}-build"
      ServiceRole: !GetAtt CodeBuildRole.Arn  

The pipeline performs the following operations:

  1. It uses the CodeCommit repository as the source and transfers the most recent code from the main branch to the CodeBuild project for further processing.
  2. The following stage is build and test, in which the most recent code from the previous phase is unit tested and the test report is published to CodeBuild report groups.
  3. If all of the test results are good, then the next CodeBuild project is launched to publish the code to an Amazon Simple Storage Service (Amazon S3) bucket.
  4. Following the successful completion of the publish phase, the final step is to deploy the AWS Glue task using the CloudFormation template in the deploy folder.

Deploying the solution

Set up

Now we’ll deploy the solution using a CloudFormation template.

  • Using the GitHub web interface, download the code.zip file from the aws-glue-jobs-unit-testing repository. This zip file contains the GitHub repository's src, tests, and deploy folders. You can also create the zip file yourself using command-line tools such as git and zip. To create the zip file on Linux or macOS, open the terminal and enter the following commands.
git clone https://github.com/aws-samples/aws-glue-jobs-unit-testing.git
cd aws-glue-jobs-unit-testing
git checkout master
zip -r code.zip src/ tests/ deploy/
  • Sign in to the AWS Management Console and choose the AWS Region of your choice.
  • Create an Amazon S3 bucket. For more information, see How Do I Create an S3 Bucket? in the AWS documentation.
  • Upload the downloaded zip package, code.zip, to the Amazon S3 bucket that you created.

In this example, I created an Amazon S3 bucket named aws-glue-artifacts-us-east-1 in the N. Virginia (us-east-1) Region, and used the console to upload the zip package from the GitHub repository to the Amazon S3 bucket.

Figure 3 Upload code.zip file to S3 bucket

Creating the stack

  1.  In the CloudFormation console, choose Create stack.
  2. On the Specify template page, choose Upload a template file, and then choose the pipeline.yml template, downloaded from the GitHub repository

Figure 4 Upload pipeline.yml template to create a new CloudFormation stack

  3. Specify the following parameters:
  • Stack name: glue-unit-testing-pipeline (Choose a stack name of your choice)
  • ApplicationStackName: glue-codepipeline-app (This is the name of the CloudFormation stack that will be created by the pipeline)
  • BranchName: master (This is the name of the branch to be created in the CodeCommit repository to check-in the code from the Amazon S3 bucket zip file)
  • BucketName: aws-glue-artifacts-us-east-1 (This is the name of the Amazon S3 bucket that contains the zip file. This bucket will also be used by the pipeline for storing code artifacts)
  • CodeZipFile: code.zip (This is the key name of the sample code Amazon S3 object. The object should be a zip file)
  • RepositoryName: aws-glue-unit-testing (This is the name of the CodeCommit repository that will be created by the stack)
  • TestReportGroupName: glue-unittest-report (This is the name of the CodeBuild test report group that will be created to store the unit test reports)

Figure 5 Fill parameters for stack creation

  4. Choose Next, and then choose Next again.
  5. On the Review page, under Capabilities, select the following option:
  • I acknowledge that CloudFormation might create IAM resources with custom names.

Figure 6 Acknowledge IAM roles creation

  6. Choose Create stack to begin the stack creation process. Once the stack creation is complete, the resources that were created are displayed on the Resources tab. The stack creation takes approximately 5-7 minutes.

Figure 7 Successful completion of stack creation

The stack automatically creates a CodeCommit repository with the initial code checked in from the zip file uploaded to the Amazon S3 bucket. Furthermore, it creates a pipeline in CodePipeline that uses the CodeCommit repository as its source. In the above example, the CodeCommit repository is aws-glue-unit-test, and the pipeline is aws-glue-unit-test-pipeline.

Testing the solution

To test the deployed pipeline, open the CodePipeline console and select the pipeline created by the CloudFormation stack. Then choose Release change on the pipeline page.

Figure 8 Choose Release Change on pipeline page

The pipeline begins its execution with the most recent code in the CodeCommit repository.

When the Test_and_Build phase is finished, select the Details link to examine the execution logs.

Figure 9 Successfully completed the Test_and_Build stage

Select the Reports tab, and choose the test report from Report history to view the unit test execution results.

Figure 10 Test report from pipeline execution

Finally, after the deployment stage is complete, you can see, run, and monitor the deployed AWS Glue job on the AWS Glue console page. For more information, refer to the Running and monitoring AWS Glue documentation.

Figure 11 Successful pipeline execution

Cleanup

To avoid additional infrastructure costs, make sure that you delete the stack after experimenting with the examples provided in the post. On the CloudFormation console, select the stack that you created, and then choose Delete. This will delete all of the resources that it created, including CodeCommit repositories, IAM roles/policies, and CodeBuild projects.

Summary

In this post, we demonstrated how to unit test and deploy Python-based AWS Glue jobs in a pipeline with unit tests written with the PyTest framework. The approach is not limited to CodePipeline, and it can be used to build up a local development environment, as demonstrated in the Big Data blog. The aws-glue-jobs-unit-testing GitHub repository contains the example’s CloudFormation template, as well as sample AWS Glue Python code and Pytest code used in this post. If you have any questions or comments regarding this example, please open an issue or submit a pull request.

Authors:

Praveen Kumar Jeyarajan

Praveen Kumar Jeyarajan is a Senior DevOps Consultant at AWS, supporting Enterprise customers and their journey to the cloud. He has 11+ years of DevOps experience and is skilled in solving myriad technical challenges using the latest technologies. He holds a Masters degree in Software Engineering. Outside of work, he enjoys watching movies and playing tennis.

Vaidyanathan Ganesa Sankaran

Vaidyanathan Ganesa Sankaran is a Sr Modernization Architect at AWS, supporting global Enterprise customers on their journey towards modernization. He is specialized in artificial intelligence, legacy modernization, and cloud computing. He holds a Masters degree in Software Engineering and has 12+ years of modernization experience. Outside work, he loves conducting training sessions for college grads and professional starters who want to learn cloud and AI. His hobbies are playing tennis, philately, and traveling.

How MarketAxess® uses AWS Developer Tools to create scalable and secure CI/CD pipelines

Post Syndicated from Aaron Lima original https://aws.amazon.com/blogs/devops/how-marketaxess-uses-aws-developer-tools-to-create-scalable-and-secure-ci-cd-pipelines/

Enterprise organizations often strive to adopt modern DevOps practices to focus on governance and security without sacrificing development velocity. In this guest post, Prashant Joshi, Senior Cloud Engineer at MarketAxess, explains how they use the AWS Cloud Development Kit (AWS CDK), AWS CodePipeline, and AWS CodeBuild to simplify the developer experience by dynamically provisioning pipelines and maintaining governance at MarketAxess.

Problem Statement

MarketAxess is a financial technology company that operates an e-trading platform for institutional credit markets. As MarketAxess adopted DevOps firm-wide, we struggled to ensure pipeline consistency. We had developers using static code analysis and linting, but it wasn't enforced. As more teams began to adopt DevOps practices, the importance of providing consistency over code quality, security scanning, and artifact management grew. However, we were challenged with increasing our engineering workforce and implementing best practices across the various pipelines. As a small team, we needed a way to reliably manage and scale pipelines while reducing engineering overhead. We thought about the DevOps tenets, as well as the importance of automation, and we decided to build automation that would provision pipelines for development teams. These pipelines include best practices for continuous integration and continuous deployment (CI/CD). We wanted to build this automation with self-service, so that teams can get started developing a solution to a business problem without having to spend too much time on the CI/CD aspects of their projects.

We chose the AWS CDK to deploy AWS CodePipeline, AWS CodeBuild, and AWS Identity and Access Management (IAM) resources, and used an API webhook using AWS Lambda and Amazon API Gateway for integration. In this post, we provide an example of how these services can be used to create dynamic cross account CI/CD pipelines.

Solution

In developing our solution, we wanted to accomplish three main goals:

  1. Standardization and governance of pipelines – We wanted to ensure consistent practices in each team's pipeline to ensure code quality and security.
  2. Simplified Developer Interaction – We wanted developers to focus mainly on interacting with the code repository for their project.
  3. Improve Management of Dynamically Provisioned Pipelines – Knowing that we would need to make changes, improvements, and enhancements, we wanted tools and a process that was flexible.

We achieved these goals using AWS CDK to automate the creation of CodePipeline pipelines and define mandatory actions in each pipeline. We also created a webhook using API Gateway to integrate with our Bitbucket repositories and automatically trigger the automation. The pipelines can be dynamically provisioned or updated based on a YAML manifest file submitted to the repository. We process the manifest file with Amazon Elastic Container Service (Amazon ECS) Fargate tasks, because we had containerized the processing components using Docker. However, with the release of container support in Lambda, we are now considering this as a potential replacement. These pipelines run CI stages based on the programming language defined by development teams in the manifest file, and they deploy a tested, versioned artifact to the corresponding environments via standard software development lifecycle (SDLC) practices. As part of the CI stages, we semantically version our code and tag our commits accordingly, which lets us trace a commit to its pipeline execution. The following architecture diagram shows a CloudFormation pipeline generated via AWS CDK.

CloudFormation Pipeline Architecture Diagram

The process flow is as follows:

  1. A developer pushes a change to the repository.
  2. A webhook is triggered when the pull request is merged; it creates or modifies the pipeline based on the manifest file submitted to the repository.
  3. This triggers a Lambda function that performs the following:
    1. Clones the repository from the internally hosted Bitbucket repositories.
    2. Uploads the repository to the source Amazon Simple Storage Service (Amazon S3) bucket, which is encrypted using Customer Managed Keys (CMK) with AWS Key Management Service (AWS KMS).
    3. Runs an ECS task, passing the manifest file that provides the project parameters. Pipelines are built according to these project parameters.
  4. The ECS task processes the manifest file, runs the CDK logic, and finally triggers the pipeline.
    1. As source code progresses through the pipeline, the build stage outputs artifacts to the artifact bucket. Pipeline artifacts are encrypted with a CMK. The IAM roles in the target account only have access to this bucket.

Additionally, through the power of the IAM integration with CodePipeline, the team could implement session tags with IAM roles and Okta to make sure that independent teams can only approve pipelines owned by their respective teams. Furthermore, we use attribute-based tags to protect the production environment from unauthorized actions, so that deployment to production can only come through the pipeline.

The AWS CDK-based pipelines let MarketAxess enable teams to independently build and obtain immediate feedback, while still centrally governing CI and CD patterns. Building the CDK structure and support for the core languages and their corresponding CI and CD stages took two DevOps engineers six months of full-time work. We continue to iterate on the CDK code base and pipelines, incorporating feedback from our development community to ensure developer satisfaction.

Simplified Developer Interaction

Although we were enforcing standards via the automation, we still wanted to give development teams autonomy through a simple mechanism. We wanted developers to interact with our pipeline creation process through a pipeline manifest file that they submitted to their repository. An example of the manifest file schema is in the following screenshot:

Manifest File Schema

As shown above, the manifest lets developers define custom application configurations while preserving consistent quality gates. The manifest is checked in to source control, and a commit to the code repository triggers our automation. This lets our pipelines mutate on manifest file changes and ensures that the latest commit goes through the latest quality gates. Each repository gets its own pipeline and, to maintain the security of the pipeline, we use IAM session tags with Okta. We tag each pipeline and its associated resources with a unique attribute that is mapped to the development team, so that teams only have access to their pipelines and only authorized individuals may approve production deployments.
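
To make the idea concrete, the following is a minimal Python CDK sketch of how a parsed manifest could drive pipeline synthesis; the manifest fields, construct IDs, and S3-based source action are assumptions for illustration and not MarketAxess's actual implementation.

# Hypothetical sketch: synthesize a governed pipeline from a parsed manifest (all names are assumptions).
import aws_cdk as cdk
from aws_cdk import aws_codebuild as codebuild
from aws_cdk import aws_codepipeline as codepipeline
from aws_cdk import aws_codepipeline_actions as actions
from aws_cdk import aws_s3 as s3
from constructs import Construct


class ManifestPipelineStack(cdk.Stack):
    def __init__(self, scope: Construct, construct_id: str, manifest: dict, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        source_bucket = s3.Bucket.from_bucket_name(self, "SourceBucket", manifest["source_bucket"])
        source_output = codepipeline.Artifact()

        # Mandatory CI stage: linting, static analysis, and unit tests for the declared language
        ci_project = codebuild.PipelineProject(
            self, "CiProject",
            build_spec=codebuild.BuildSpec.from_object({
                "version": "0.2",
                "phases": {"build": {"commands": manifest["ci_commands"]}},
            }),
        )

        codepipeline.Pipeline(
            self, "Pipeline",
            pipeline_name=f"{manifest['repo_name']}-pipeline",
            stages=[
                codepipeline.StageProps(
                    stage_name="Source",
                    actions=[actions.S3SourceAction(
                        action_name="Source",
                        bucket=source_bucket,
                        bucket_key=manifest["source_key"],
                        output=source_output,
                    )],
                ),
                codepipeline.StageProps(
                    stage_name="CI",
                    actions=[actions.CodeBuildAction(
                        action_name="BuildAndTest",
                        project=ci_project,
                        input=source_output,
                    )],
                ),
            ],
        )

Because the manifest feeds directly into synthesis, a change to the manifest re-runs this logic and mutates the pipeline, which is the behavior described above.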

Using AWS CDK, AWS CodePipeline, and other AWS Services, we have been able to improve the stability and quality of the code being delivered. CodePipeline and AWS CDK have helped us develop a cloud native pipeline solution that meets our governance best practices and compliance requirements. We met our three goals, and we can iterate and change easily moving forward.

Conclusion

Organizations that achieve the automation and self-service ideals of DevOps can build, release, and deploy features and apps to users faster and at higher levels of quality. In this post, we saw a real-life example of using Infrastructure as Code with AWS CDK to build a service that helps maintain governance and helps developers get work done. Here are two other posts that demonstrate using AWS Service Catalog to create secure DevOps pipelines or DevOps pipelines that deploy containerized applications.



Prashant Joshi

Prashant Joshi is a Senior Cloud Engineer working in the Cloud Foundation team at MarketAxess. MarketAxess is a registered trademark of MarketAxess Holdings Inc.

Building Blue/Green application deployment to Micro Focus Enterprise Server

Post Syndicated from Kevin Yung original https://aws.amazon.com/blogs/devops/building-blue-green-application-deployment-to-micro-focus-enterprise-server/

Organizations running mainframe production workloads often follow the traditional approach of application deployment. To release new features of existing applications into production, the application is redeployed using the new version of software on the existing infrastructure. This poses the following challenges:

  • The cutover of the application deployment from testing to production usually takes place during a planned outage window with associated downtime.
  • Rollback is difficult, since the earlier version of the software must be redeployed from scratch on the existing infrastructure. This may result in applications being unavailable for longer durations owing to the rollback.
  • Due to differences between testing and production environments, some defects may leak into production, affecting application code quality and thus increasing the number of production outages.

Automated, robust application deployment is recognized as a prime driver for moving from a Mainframe to AWS, as service stability, security, and quality can be better managed. In this post, you will learn how to build Blue/Green (zero-downtime) deployments for mainframe applications rehosted to Micro Focus Enterprise Server with AWS Developer Tools (AWS CodeBuild, CodePipeline, and CodeDeploy).

This is a continuation of our previous post “Automate thousands of mainframe tests on AWS with the Micro Focus Enterprise Suite”. In our last post, we explained how you can implement a pattern for continuous integration and testing of mainframe applications with AWS Developer tools and Micro Focus Enterprise Suite. If you haven’t already checked it out, then we strongly recommend that you read through it before proceeding to the rest of this post.

Overview of solution

In this section, we explain the three important design “ingredients” to be implemented in the overall solution:

  1. Implementation of Enterprise Server Performance and Availability Cluster (PAC)
  2. End-to-end design of CI/CD pipeline for multiple teams development
  3. Blue/green deployment process for a rehosted mainframe application

First, let’s look at the solution design for the Micro Focus Enterprise Server PAC cluster.

Overview of Micro Focus Enterprise Server Performance and Availability Cluster (PAC)

In the Blue/Green deployment solution, Micro Focus Enterprise Server is the hosting environment for mainframe applications, with the software installed on Amazon EC2 instances. Application deployment in Amazon EC2 Auto Scaling is one of the critical requirements to build a Blue/Green deployment. Micro Focus Enterprise Server PAC technology is the feature that allows for the Auto Scaling of Enterprise Server instances. For details on how to build a Micro Focus Enterprise PAC cluster with Amazon EC2 Auto Scaling and Systems Manager, see our AWS Prescriptive Guidance document. An overview of the infrastructure architecture is shown in the following figure, and the following list explains the components in the architecture.

Infrastructure architecture overview for blue/green application deployment to Micro Focus Enterprise Server

  • Micro Focus Enterprise Servers – Deploy applications to the Micro Focus Enterprise Server PAC in an Amazon EC2 Auto Scaling group.
  • Micro Focus Enterprise Server Common Web Administration (ESCWA) – Manage the Micro Focus Enterprise Server PAC with the ESCWA server, e.g., adding or removing an Enterprise Server to or from a PAC.
  • Relational database for both user and system data files – Set up an Amazon Aurora RDS instance in Multi-AZ to host both user and system data files shared across the Enterprise Server instances.
  • Micro Focus Enterprise Server Scale-Out Repository (SOR) – Set up an Amazon ElastiCache for Redis instance and replicas in Multi-AZ to host user data.
  • Application endpoint and load balancer – Set up a Network Load Balancer to provide a hostname for end users to connect to the application, e.g., accessing the application through a 3270 emulator.

CI/CD pipeline design supporting multiple streams of mainframe development

In a previous DevOps post, Automate thousands of mainframe tests on AWS with the Micro Focus Enterprise Suite, we introduced two levels of pipelines. The first level of pipeline is used by mainframe project teams to test project scope changes. The second level of the pipeline is used for system integration tests, where the pipeline will perform tests for all of the promoted changes from the project pipelines and perform extensive systems tests.

In this post, we are extending the two-level pipeline to add a production deployment pipeline. When system testing is complete and successful, the tested application artifacts are promoted to the production pipeline in preparation for live production release. The following figure depicts each stage of the three levels of CI/CD pipeline and the purpose of each stage.

Different levels of CI/CD pipeline - Project Team Pipeline, Systems Test Pipeline and Production Deployment Pipeline

Let’s look at the artifact promotion to production pipeline in greater detail. The Systems Test Pipeline promotes the tested artifacts in binary format into an Amazon S3 bucket and the S3 event triggers production pipeline to kick-off. This artifact promotion process can be gated using a manual approval action in CodePipeline. For customers who want to have a fully automated continuous deployment, the manual promotion approval step can be removed.

The following diagram shows the AWS Stages in AWS CodePipeline of the production deployment pipeline:

Stages in production deployment pipeline using AWS CodePipeline

After the production pipeline is kicked off, it downloads the new version of the artifact from the S3 bucket. For details on how to set up the S3 bucket as a source for CodePipeline, see the AWS CodePipeline documentation on using Amazon S3 as a source.

In the following section, we explain each of these pipeline stages in detail:

  1. Prepare and package a new version of the production configuration artifacts, for example, the Micro Focus Enterprise Server config file and the blue/green deployment scripts.
  2. Use a CodeBuild project to kick off an application blue/green deployment with AWS CodeDeploy.
  3. Use a manual approval gate to wait for an operator to validate the new version of the application and approve continuing the production traffic switch.
  4. Continue the blue/green deployment by allowing traffic to the new version of the application and blocking traffic to the old version.
  5. After a successful blue/green switch and deployment, tag the production version in the code repository.
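
For illustration only, steps 4 and 5 can also be driven against the CodeDeploy API once the operator approves; the following boto3 sketch shows the traffic-switch call, with the deployment ID as a placeholder.

# Sketch of resuming the blue/green traffic switch after manual approval (deployment ID is a placeholder).
import boto3

codedeploy = boto3.client("codedeploy")
deployment_id = "d-EXAMPLE1234"

# Reroute production traffic to the replacement (green) fleet
codedeploy.continue_deployment(
    deploymentId=deployment_id,
    deploymentWaitType="READY_WAIT",
)

# Poll the deployment status before tagging the release in the code repository
status = codedeploy.get_deployment(deploymentId=deployment_id)["deploymentInfo"]["status"]
print(status)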

Now that you’ve seen the pipeline design, we will dive deep into the details of the blue/green deployment with AWS CodeDeploy.

Blue/green deployment with AWS CodeDeploy

In the blue/green deployment, we used the technique of swapping Auto Scaling Group behind an Elastic Load Balancer. Refer to the AWS Blue/Green deployment whitepaper for the details of the technique. As AWS CodeDeploy is a fully-managed service that automates software deployment, it is used to automate the entire Blue/Green process.

First, the following best practices are applied to set up the Enterprise Server infrastructure:

  1. AWS Image Builder is used to install the Micro Focus Enterprise Server software and the AWS CodeDeploy agent into an Amazon Machine Image (AMI). An EC2 launch template is then created with the Enterprise Server AMI ID.
  2. A Network Load Balancer is used to set up a TCP connection health check to validate that Micro Focus Enterprise Server is listening on the required ports, e.g., port 9270, so that connectivity is available for 3270 emulators.
  3. A script was created to confirm application deployment validity on each EC2 instance. This is achieved by using a PowerShell script that triggers a CICS transaction from the Micro Focus Enterprise Server command-line interface.

In the CodePipeline, we created a CodeBuild project to create a new deployment with CodeDeploy. The following walks through the CodeBuild buildspec.yaml configuration.

In the buildspec.yaml pre_build section, CodeBuild performs two steps:

  1. Create an initial Amazon EC2 Auto Scaling using Micro Focus Enterprise Server AMI and a Launch Template for the first-time deployment of the application.
  2. Use the AWS CLI to store the initial Auto Scaling group name in a Systems Manager Parameter Store parameter; it will later be used by CodeDeploy to create a copy during the blue/green deployment.
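
As a minimal illustration of the parameter hand-off in step 2 (the parameter name and value below are assumptions), the same steps could be performed with boto3 instead of the AWS CLI:

# Sketch of storing and retrieving the Auto Scaling group name via Parameter Store (names are assumptions).
import boto3

ssm = boto3.client("ssm")

# pre_build: record the name of the initial Auto Scaling group
ssm.put_parameter(
    Name="/bankdemo/blue-green/asg-name",
    Value="bankdemo-enterprise-server-asg",
    Type="String",
    Overwrite=True,
)

# build: look the name up again before updating the CodeDeploy deployment group
asg_name = ssm.get_parameter(Name="/bankdemo/blue-green/asg-name")["Parameter"]["Value"]
print(asg_name)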

In the build stage, the buildspec will perform the following steps:

  1. Retrieve the Auto Scaling Group name of the Enterprise Servers from the Systems Manager Parameter Store.
  2. Then, a blue/green deployment configuration is created for the deployment group of the application. In the AWS CLI command, we use the WITH_TRAFFIC_CONTROL option to let us manually verify and approve before switching the traffic to the new version of the application. The command snippet is shown here.
BlueGreenConf=\
        "terminateBlueInstancesOnDeploymentSuccess={action=TERMINATE}"\
        ",deploymentReadyOption={actionOnTimeout=STOP_DEPLOYMENT,waitTimeInMinutes=600}" \
        ",greenFleetProvisioningOption={action=COPY_AUTO_SCALING_GROUP}"

DeployType="BLUE_GREEN,deploymentOption=WITH_TRAFFIC_CONTROL"

/usr/local/bin/aws deploy update-deployment-group \
      --application-name "${APPLICATION_NAME}" \
      --current-deployment-group-name "${DEPLOYMENT_GROUP_NAME}" \
      --auto-scaling-groups "${AsgName}" \
      --load-balancer-info targetGroupInfoList=[{name="${TARGET_GROUP_NAME}"}] \
      --deployment-style "deploymentType=$DeployType" \
      --blue-green-deployment-configuration "$BlueGreenConf"
  3. Next, the new version of the application binary is released from the CodeBuild source DemoBin to the production S3 bucket.
release="bankdemo-$(date '+%Y-%m-%d-%H-%M').tar.gz"
RELEASE_FILE="s3://${PRODUCTION_BUCKET}/${release}"

/usr/local/bin/aws deploy push \
    --application-name ${APPLICATION_NAME} \
    --description "version - $(date '+%Y-%m-%d %H:%M')" \
    --s3-location ${RELEASE_FILE} \
    --source ${CODEBUILD_SRC_DIR_DemoBin}/
  4. Create a new deployment for the application to initiate the blue/green switch.
/usr/local/bin/aws deploy create-deployment \
    --application-name ${APPLICATION_NAME} \
    --s3-location bucket=${PRODUCTION_BUCKET},key=${release},bundleType=zip \
    --deployment-group-name "${DEPLOYMENT_GROUP_NAME}" \
    --description "Bankdemo Production Deployment ${release}"\
    --query deploymentId \
    --output text

After setting up the deployment options, the following is a snapshot of a deployment configuration from the AWS Management Console.

Snapshot of deployment configuration from AWS Management Console

In the AWS Post “Under the Hood: AWS CodeDeploy and Auto Scaling Integration”, we explain how AWS CodeDeploy sets up Auto Scaling lifecycle hooks to listen for Auto Scaling events. In the event of an EC2 instance launch and termination, AWS CodeDeploy can instruct its agent in the instance to run the prepared scripts.

The following list shows each lifecycle hook in the blue/green deployment and the tasks that run in it.

  • BeforeInstall – Create application folder structures in the newly launched Amazon EC2 instance and prepare for installation.
  • AfterInstall – Enable the Windows Firewall rule for application traffic; activate the Micro Focus license using the License Server; prepare the production database connections; import the config to create the region in Micro Focus Enterprise Server; deploy the latest application binaries into each of the Micro Focus Enterprise Servers.
  • ApplicationStart – Use the AWS CLI to start a Systems Manager Automation “Scale-Out” runbook with the ESCWA server as the target; the runbook adds the newly launched Micro Focus Enterprise Server instance into the PAC and starts the imported region on it. Validate that the application is listening on its service port (for example, port 9270), and use the Micro Focus command “castran” to run an online transaction in Micro Focus Enterprise Server to validate the service status.
  • AfterBlockTraffic – Use the AWS CLI to start a Systems Manager Automation “Scale-In” runbook with the ESCWA server as the target; the runbook stops the region on the terminating EC2 instance and removes the Enterprise Server instance from the PAC.

These tasks are automated using PowerShell, and the scripts are referenced in the appspec.yml config for CodeDeploy to orchestrate the deployment.

In the following appspec.yml, the locations of the binary files to be installed are defined, in addition to the Micro Focus Enterprise Server Region XML config file. During the AfterInstall stage, the XML config is imported into the Enterprise Server.

version: 0.0
os: windows
files:
  - source: scripts
    destination: C:\scripts\
  - source: online
    destination: C:\BANKDEMO\online\
  - source: common
    destination: C:\BANKDEMO\common\
  - source: batch
    destination: C:\BANKDEMO\batch\
  - source: scripts\BANKDEMO.xml
    destination: C:\BANKDEMO\
hooks:
  BeforeInstall: 
    - location: scripts\BeforeInstall.ps1
      timeout: 300
  AfterInstall: 
    - location: scripts\AfterInstall.ps1    
  ApplicationStart:
    - location: scripts\ApplicationStart.ps1
      timeout: 300
  ValidateService:
    - location: scripts\ValidateServer.cmd
      timeout: 300
  AfterBlockTraffic:
    - location: scripts\AfterBlockTraffic.ps1

Using the sample Micro Focus Bankdemo application and the steps outlined above, we have set up a blue/green deployment process in Micro Focus Enterprise Server.

There are four important considerations when setting up blue/green deployment:

  1. For batch applications, the blue/green deployment should be invoked only outside of the scheduled “batch window”.
  2. For online applications, AWS CodeDeploy will deregister the Auto Scaling group from the target group of the Network Load Balancer. The deregistration may take a while as the server has to finish processing the ongoing requests before it can continue deployment of the new application instance. In this case, enabling Elastic Load Balancing connection draining feature with appropriate timeout value can minimize the risk of closing unfinished transactions. In addition, consider doing deployment in low-traffic windows to improve the deployment speeds.
  3. For application changes that require updates to the database schema, the version roll-forward and rollback can be managed via DB migrations tools, e.g., Flyway and Fluent Migrator.
  4. For testing in production environments, adherence to any regulatory compliance, such as full audit trail of events, must be considered.

Conclusion

In this post, we introduced the solution to use Micro Focus Enterprise Server PAC, Amazon EC2 Auto Scaling, AWS Systems Manager, and AWS CodeDeploy to automate the blue/green deployment of rehosted mainframe applications in AWS.

Through the blue/green deployment methodology, we can shift traffic between two identical clusters running different application versions in parallel. This mitigates the risks commonly associated with mainframe application deployment, namely downtime and limited rollback capability, while ensuring higher code quality in production through “shift right” testing.

A demo of the solution is available on the AWS Partner Micro Focus website [Solution-Demo]. If you’re interested in modernizing your mainframe applications, then please contact Micro Focus and AWS mainframe business development at [email protected].

Additional Information

About the authors

Kevin Yung

Kevin is a Senior Modernization Architect in AWS Professional Services Global Mainframe and Midrange Modernization (GM3) team. Kevin currently is focusing on leading and delivering mainframe and midrange applications modernization for large enterprise customers.

Krithika Palani Selvam

Krithika is a Senior Modernization Architect in AWS Professional Services Global Mainframe and Midrange Modernization (GM3) team. She is currently working with enterprise customers for migrating and modernizing mainframe and midrange applications to cloud.

Peter Woods

Peter Woods has been with Micro Focus for over 30 years within the Application Modernisation & Connectivity portfolio. His diverse range of roles has included Technical Support, Channel Sales, Product Management, Strategic Alliances Management, and Pre-Sales, primarily based in the UK. In 2017 Peter relocated to Melbourne, Australia, and in his current role of AM2C APJ Regional Technical Leader and ANZ Pre-Sales Manager, he is charged with driving and supporting application modernisation sales activity across the APJ region.

Abraham Mercado Rondon

Abraham Rondon is a Solutions Architect working on Micro Focus Enterprise Solutions for the Application Modernization team, based in Melbourne. After completing a degree in Statistics and before joining Micro Focus, Abraham had a long career supporting mainframe applications in different countries, in progressive roles from Developer to Production Support, Business and Technical Analyst, and Project Team Lead. Now a vital part of the Micro Focus Application Modernization team, one of his main focuses is cloud implementations of mainframe DevOps and production workload rehosting.

Using DevOps Automation to Deploy Lambda APIs across Accounts and Environments

Post Syndicated from Subrahmanyam Madduru original https://aws.amazon.com/blogs/architecture/using-devops-automation-to-deploy-lambda-apis-across-accounts-and-environments/

by Subrahmanyam Madduru – Global Partner Solutions Architect Leader, AWS, Sandipan Chakraborti – Senior AWS Architect, Wipro Limited, Abhishek Gautam – AWS Developer and Solutions Architect, Wipro Limited, Arati Deshmukh – AWS Architect, Infosys

As more and more enterprises adopt serverless technologies to deliver their business capabilities in a more agile manner, it is imperative to automate release processes. Multiple AWS Accounts are needed to separate and isolate workloads in production versus non-production environments. Release automation becomes critical when you have multiple business units within an enterprise, each consisting of a number of AWS accounts that are continuously deploying to production and non-production environments.

As a DevOps best practice, the DevOps engineering team responsible for build-test-deploy in a non-production environment should not release the application and infrastructure code onto both non-production and production environments. This risks introducing errors in application and infrastructure deployments in production environments, which in turn results in significant rework and delays in delivering functionality and go-to-market initiatives. Deploying the code in a repeatable fashion while reducing manual error requires automating the entire release process. In this blog, we show how you can build a cross-account code pipeline that automates releases across different environments using AWS CloudFormation templates and AWS cross-account access.

A cross-account code pipeline enables an AWS Identity and Access Management (IAM) user to assume an IAM production role using AWS Security Token Service (see Managing AWS STS in an AWS Region – AWS Identity and Access Management) to switch between non-production and production deployments as required. An automated release pipeline goes through all the release stages, from source to build to deploy, in the non-production AWS account, and then calls the STS AssumeRole API (cross-account access) to get a temporary token and access to the production AWS account for deployment. This follows the least privilege model for granting role-based access through IAM policies, which ensures the secure automation of the production pipeline release.

Solution Overview

In this blog post, we show how a cross-account IAM assume role can be used to deploy AWS Lambda serverless API code into pre-production and production environments. We are building on the process outlined in the blog post Building a CI/CD pipeline for cross-account deployment of an AWS Lambda API with the Serverless Framework by programmatically automating the deployment of Amazon API Gateway using CloudFormation templates. For this use case, we are assuming a single-tenant customer with separate AWS accounts to isolate pre-production and production workloads. In Figure 1, we have represented the code pipeline workflow diagrammatically for our use case.

Figure 1. AWS cross-account CodePipeline for production and non-production workloads

Let us describe the code pipeline workflow in detail for each step noted in the preceding diagram:

  1. An IAM user belonging to the DevOps engineering team logs in to the AWS Command Line Interface (AWS CLI) from a local machine using an IAM access key and secret key.
  2. Next, the IAM user assumes the IAM role for the corresponding activities – AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline execution – and deploys the code to pre-production.
  3. A typical AWS CodePipeline comprises build, test, and deploy stages. In the build stage, the AWS CodeBuild service generates the CloudFormation template stack (template-export.yaml) into Amazon S3.
  4. In the deploy stage, AWS CodePipeline uses a CloudFormation template (a YAML file) to deploy the code from the S3 bucket and expose the application API endpoints via Amazon API Gateway in the pre-production environment.
  5. The final step in the pipeline workflow is to deploy the application code changes to the production environment by assuming the production IAM role via STS.

Since the AWS CodePipeline is fully automated, we can use the same pipeline by switching between pre-production and production accounts. These accounts assume the IAM role appropriate to the target environment and deploy the validated build to that environment using CloudFormation templates.
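
To illustrate this cross-account hand-off, the following is a minimal boto3 sketch; the role ARN, session name, and Region are placeholders and not values defined in this post.

# Sketch of assuming the production deployment role via STS (role ARN and Region are placeholders).
import boto3

sts = boto3.client("sts")
credentials = sts.assume_role(
    RoleArn="arn:aws:iam::<Your_Production_Account_ID>:role/CrossAccountDeploymentRole",
    RoleSessionName="prod-deployment",
)["Credentials"]

# Use the temporary credentials to talk to CloudFormation in the production account
cloudformation = boto3.client(
    "cloudformation",
    region_name="us-east-2",
    aws_access_key_id=credentials["AccessKeyId"],
    aws_secret_access_key=credentials["SecretAccessKey"],
    aws_session_token=credentials["SessionToken"],
)
for stack in cloudformation.list_stacks(StackStatusFilter=["CREATE_COMPLETE", "UPDATE_COMPLETE"])["StackSummaries"]:
    print(stack["StackName"])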

Prerequisites

Here are the pre-requisites before you get started with implementation.

  • A user with appropriate privileges (for example: Project Admin) in a production AWS account
  • A user with appropriate privileges (for example: Developer Lead) in a pre-production AWS account such as development
  • A CloudFormation template for deploying infrastructure in the pre-production account
  • Ensure your local machine has the AWS CLI installed and configured

Implementation Steps

In this section, we show how you can use AWS CodePipeline to release a serverless API in a secure manner to pre-production and production environments. AWS CloudWatch logging will be used to monitor the events on the AWS CodePipeline.

1. Create Resources in a pre-production account

In this step, we create the required resources such as a code repository, an S3 bucket, and a KMS key in a pre-production environment.

  • Clone the code repository into your CodeCommit repository. Make the necessary changes to index.js and ensure that buildspec.yaml is there to build the artifacts.
    • Using the codebase (Lambda APIs) as input, the build outputs a CloudFormation template and environment configuration JSON files (used for configuring production and other non-production environments such as dev and test). The build artifacts are packaged into a zip file using the AWS Serverless Application Model and uploaded to an S3 bucket created for storing artifacts. Make note of the repository name, as it will be required later.
  • Create an S3 bucket in a Region (example: us-east-2). This bucket will be used by the pipeline to get and put artifacts. Make a note of the bucket name.
    • Make sure you edit the bucket policy to include your production account ID and the bucket name. Refer to the Amazon S3 bucket policy documentation to make changes to Amazon S3 bucket policies and permissions.
  • Navigate to AWS Key Management Service (AWS KMS) and create a symmetric key.
  • Then create a new secret, configure the KMS key, and provide access to the development and production accounts. Make a note of the ARN of the key.

2. Create IAM Roles in the Production Account and required policies

In this step, we create the roles and policies required to deploy the code. The following KMS policy grants the cross-account role access to the key created in the development account:

{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": [
        "kms:DescribeKey",
        "kms:GenerateDataKey*",
        "kms:Encrypt",
        "kms:ReEncrypt*",
        "kms:Decrypt"
      ],
      "Resource": [
        "Your KMS Key ARN you created in Development Account"
      ]
    }
  ]
}

Once you’ve created both policies, attach them to the previously created cross-account role.

3. Create a CloudFormation Deployment role

In this step, you need to create another IAM role, “CloudFormationDeploymentRole” for Application deployment. Then attach the following four policies to it.

Policy 1: For CloudFormation to deploy the application in the Production account

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "cloudformation:DetectStackDrift",
        "cloudformation:CancelUpdateStack",
        "cloudformation:DescribeStackResource",
        "cloudformation:CreateChangeSet",
        "cloudformation:ContinueUpdateRollback",
        "cloudformation:DetectStackResourceDrift",
        "cloudformation:DescribeStackEvents",
        "cloudformation:UpdateStack",
        "cloudformation:DescribeChangeSet",
        "cloudformation:ExecuteChangeSet",
        "cloudformation:ListStackResources",
        "cloudformation:SetStackPolicy",
        "cloudformation:ListStacks",
        "cloudformation:DescribeStackResources",
        "cloudformation:DescribePublisher",
        "cloudformation:GetTemplateSummary",
        "cloudformation:DescribeStacks",
        "cloudformation:DescribeStackResourceDrifts",
        "cloudformation:CreateStack",
        "cloudformation:GetTemplate",
        "cloudformation:DeleteStack",
        "cloudformation:TagResource",
        "cloudformation:UntagResource",
        "cloudformation:ListChangeSets",
        "cloudformation:ValidateTemplate"
      ],
      "Resource": "arn:aws:cloudformation:us-east-2:940679525002:stack/DevOps-Automation-API*/*"        }
  ]
}

Policy 2: For CloudFormation to perform the required IAM actions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "iam:GetRole",
        "iam:GetPolicy",
        "iam:TagRole",
        "iam:DeletePolicy",
        "iam:CreateRole",
        "iam:DeleteRole",
        "iam:AttachRolePolicy",
        "iam:PutRolePolicy",
        "iam:TagPolicy",
        "iam:CreatePolicy",
        "iam:PassRole",
        "iam:DetachRolePolicy",
        "iam:DeleteRolePolicy"
      ],
      "Resource": "*"
    }
  ]
}

Policy 3: Lambda function service invocation policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "lambda:CreateFunction",
        "lambda:UpdateFunctionCode",
        "lambda:AddPermission",
        "lambda:InvokeFunction",
        "lambda:GetFunction",
        "lambda:DeleteFunction",
        "lambda:PublishVersion",
        "lambda:CreateAlias"
      ],
      "Resource": "arn:aws:lambda:us-east-2:Your_Production_AccountID:function:SampleApplication*"
    }
  ]
}

Policy 4: API Gateway service invocation policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:POST",
        "apigateway:GET"
      ],
      "Resource": [
        "arn:aws:apigateway:*::/restapis/*/deployments/*",
        "arn:aws:apigateway:*::/restapis/*/stages/*",
        "arn:aws:apigateway:*::/clientcertificates",
        "arn:aws:apigateway:*::/restapis/*/models",
        "arn:aws:apigateway:*::/restapis/*/resources/*",
        "arn:aws:apigateway:*::/restapis/*/models/*",
        "arn:aws:apigateway:*::/restapis/*/gatewayresponses/*",
        "arn:aws:apigateway:*::/restapis/*/stages",
        "arn:aws:apigateway:*::/restapis/*/resources",
        "arn:aws:apigateway:*::/restapis/*/gatewayresponses",
        "arn:aws:apigateway:*::/clientcertificates/*",
        "arn:aws:apigateway:*::/account",
        "arn:aws:apigateway:*::/restapis/*/deployments",
        "arn:aws:apigateway:*::/restapis"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:POST",
        "apigateway:GET"
      ],
      "Resource": "arn:aws:apigateway:*::/restapis/*/resources/*/methods/*/responses/*"
    },
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:GET"
      ],
      "Resource": "arn:aws:apigateway:*::/restapis/*"
    },
    {
      "Sid": "VisualEditor3",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:GET"
      ],
      "Resource": "arn:aws:apigateway:*::/restapis/*/resources/*/methods/*"
    }
  ]
}

Make sure that you also attach the S3 read/write access and KMS policies created in Step 2 to the CloudFormationDeploymentRole.

4. Setup and launch CodePipeline

You can launch the CodePipeline stack either manually in the AWS console using “Launch Stack” or programmatically via the command line with the AWS CLI.

On your local machine, go to the terminal or command prompt and run the following command:

aws cloudformation deploy --template-file <Path to pipeline.yaml> --region us-east-2 --stack-name <Name_Of_Your_Stack> --capabilities CAPABILITY_IAM --parameter-overrides ArtifactBucketName=<Your_Artifact_Bucket_Name> ArtifactEncryptionKeyArn=<Your_KMS_Key_ARN> ProductionAccountId=<Your_Production_Account_ID> ApplicationRepositoryName=<Your_Repository_Name> RepositoryBranch=master

If you have configured a profile in the AWS CLI, specify that profile when executing the command:

--profile <your_profile_name>

After launching the pipeline, your serverless API gets deployed in the pre-production as well as the production account. You can check the deployment of your API in the production or pre-production account by navigating to API Gateway in the AWS console and looking for your API in the Region where it was deployed.

Figure 2. Check your deployment in pre-production/production environment

Then select your API and navigate to Stages to view the published API with an endpoint. Validate your API response by selecting the API link.

Figure 3. Check whether your API is being published in pre-production/production environment

Alternatively, you can navigate to your APIs through the deployed application's CloudFormation stack by selecting the API link on the Resources tab.

Cleanup

If you are trying this out in your AWS accounts, make sure to delete all the resources created during this exercise to avoid incurring any AWS charges.

Conclusion

In this blog, we showed how to build a cross-account code pipeline to automate releases across different environments using AWS CloudFormation templates and AWS cross-account access. You also learned how serverless APIs can be securely deployed across pre-production and production accounts. This helps enterprises automate release deployments in a repeatable and agile manner, reduce manual errors, and deliver business capabilities more quickly.

Deploying AWS Lambda layers automatically across multiple Regions

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/deploying-aws-lambda-layers-automatically-across-multiple-regions/

This post is written by Ben Freiberg, Solutions Architect, and Markus Ziller, Solutions Architect.

Many developers import libraries and dependencies into their AWS Lambda functions. These dependencies can be zipped and uploaded as part of the build and deployment process but it’s often easier to use Lambda layers instead.

A Lambda layer is an archive containing additional code, such as libraries or dependencies. Layers are deployed as immutable versions, and the version number increments each time you publish a new layer. When you include a layer in a function, you specify the layer version you want to use.

Lambda layers simplify and speed up the development process by providing common dependencies and reducing the deployment size of your Lambda functions. To learn more, refer to Using Lambda layers to simplify your development process.

Many customers build Lambda layers for use across multiple Regions. But maintaining up-to-date and consistent layer versions across multiple Regions is a manual process. Layers are set as private automatically but they can be shared with other AWS accounts or shared publicly. Permissions only apply to a single version of a layer. This solution automates the creation and deployment of Lambda layers across multiple Regions from a centralized pipeline.

Overview of the solution

This solution uses AWS Lambda, AWS CodeCommit, AWS CodeBuild and AWS CodePipeline.

Reference architecture

This diagram outlines the workflow implemented in this blog:

  1. A CodeCommit repository contains the language-specific definition of dependencies and libraries that the layer contains, such as package.json for Node.js or requirements.txt for Python. Each commit to the main branch triggers an execution of the surrounding CodePipeline.
  2. A CodeBuild job uses the provided buildspec.yaml to create a zipped archive containing the layer contents. CodePipeline automatically stores the output of the CodeBuild job as artifacts in a dedicated Amazon S3 bucket.
  3. A Lambda function is invoked for each configured Region.
  4. The function first downloads the zip archive from S3.
  5. Next, the function creates the layer version in the specified Region with the configured permissions.
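
Steps 4 and 5 come down to a few SDK calls per Region. The following boto3 sketch illustrates them; the bucket, key, layer name, and target Regions are assumptions, and sharing with a wildcard principal is only one possible permission configuration.

# Sketch of the per-Region layer publishing step (names, Regions, and permissions are assumptions).
import boto3

REGIONS = ["eu-west-1", "us-east-1"]
LAYER_NAME = "sample-shared-layer"
ARTIFACT_BUCKET = "pipeline-artifact-bucket"
ARTIFACT_KEY = "layer/layer.zip"

# Step 4: download the zipped layer archive once from the artifact bucket
archive = boto3.client("s3").get_object(Bucket=ARTIFACT_BUCKET, Key=ARTIFACT_KEY)["Body"].read()

# Step 5: publish the layer version in every configured Region and set its permissions
for region in REGIONS:
    lambda_client = boto3.client("lambda", region_name=region)
    version = lambda_client.publish_layer_version(
        LayerName=LAYER_NAME,
        Content={"ZipFile": archive},
        CompatibleRuntimes=["nodejs12.x", "nodejs14.x"],
    )["Version"]
    lambda_client.add_layer_version_permission(
        LayerName=LAYER_NAME,
        VersionNumber=version,
        StatementId="share-layer-version",
        Action="lambda:GetLayerVersion",
        Principal="*",  # share publicly; restrict to specific account IDs if needed
    )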

Walkthrough

The following walkthrough explains the components and how the provisioning can be automated via CDK. For this walkthrough, you need:

To deploy the sample stack:

  1. Clone the associated GitHub repository by running the following command in a local directory:
    git clone https://github.com/aws-samples/multi-region-lambda-layers
  2. Open the repository in your preferred editor and review the contents of the src and cdk folder.
  3. Follow the instructions in the README.md to deploy the stack.
  4. Check the execution history of your pipeline in the AWS Management Console. The pipeline has been started once already and published a first version of the Lambda layer.
    Execution history

Code repository

The source code of the Lambda layers is stored in AWS CodeCommit. This is a secure, highly scalable, managed source control service that hosts private Git repositories. This example initializes a new repository as part of the CDK stack:

    const asset = new Asset(this, 'SampleAsset', {
      path: path.join(__dirname, '../../res')
    });

    const cfnRepository = new codecommit.CfnRepository(this, 'lambda-layer-source', {
      repositoryName: 'lambda-layer-source',
      repositoryDescription: 'Contains the source code for a nodejs12+14 Lambda layer.',
      code: {
        branchName: 'main',
        s3: {
          bucket: asset.s3BucketName,
          key: asset.s3ObjectKey
        }
      },
    });

This code uploads the contents of the ./cdk/res/ folder to an S3 bucket that is managed by the CDK. The CDK then initializes a new CodeCommit repository with the contents of the bucket. In this case, the repository gets initialized with the following files:

  • LICENSE: A text file describing the license for this Lambda layer
  • package.json: In Node.js, the package.json file is a manifest for projects. It defines dependencies, scripts, and metainformation about the project. The npm install command installs all project dependencies in a node_modules folder. This is where you define the contents of the Lambda layer.

The default package.json in the sample code defines a Lambda layer with the latest version of the AWS SDK for JavaScript:

{
    "name": "lambda-layer",
    "version": "1.0.0",
    "description": "Sample AWS Lambda layer",
    "dependencies": {
      "aws-sdk": "latest"
    }
}

To see what is included in the layer, run npm install in the ./cdk/res/ directory. This shows the files that are bundled into the Lambda layer. The contents of this folder initialize the CodeCommit repository, so delete node_modules and package-lock.json after inspecting these files.

Node modules directory

This blog post uses a new CodeCommit repository as the source but you can adapt this to other providers. CodePipeline also supports repositories on GitHub and Bitbucket. To connect to those providers, see the documentation.
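For example, with a GitHub repository, the CodeCommit source action could be swapped for a CodeStar Connections source action. The following is a hedged sketch only; the owner, repository name, and connection ARN are placeholders, and the connection must be created and authorized beforehand.

import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';
import * as codepipelineActions from 'aws-cdk-lib/aws-codepipeline-actions';

// Source action for a GitHub repository via an AWS CodeStar Connections connection.
const sourceOutput = new codepipeline.Artifact();
const gitHubSource = new codepipelineActions.CodeStarConnectionsSourceAction({
  actionName: 'GitHub-Source',
  owner: 'my-org',                      // placeholder organization
  repo: 'lambda-layer-source',          // placeholder repository name
  branch: 'main',
  connectionArn: 'arn:aws:codestar-connections:us-east-1:111111111111:connection/EXAMPLE',
  output: sourceOutput,
});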

CI/CD Pipeline

CodePipeline automates the process of building and distributing Lambda layers across Regions for every change to the main branch of the source repository. It is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline automates the build, test, and deploy phases of your release process every time there is a code change, based on the release model you define.

The CDK creates a pipeline in CodePipeline and configures it so that every change to the code base of the Lambda layer runs through the following three stages:

new codepipeline.Pipeline(this, 'Pipeline', {
      pipelineName: 'LambdaLayerBuilderPipeline',
      stages: [
        {
          stageName: 'Source',
          actions: [sourceAction]
        },
        {
          stageName: 'Build',
          actions: [buildAction]
        },
        {
          stageName: 'Distribute',
          actions: parallel,
        }
      ]
    });

Source

The source phase is the first phase of every run of the pipeline. It is typically triggered by a new commit to the main branch of the source repository. You can also start the source phase manually with the following AWS CLI command:

aws codepipeline start-pipeline-execution --name LambdaLayerBuilderPipeline

When started manually, the current head of the main branch is used. Otherwise CodePipeline checks out the code in the revision of the commit that triggered the pipeline execution.

CodePipeline stores the code locally and uses it as an output artifact of the Source stage. Stages use input and output artifacts that are stored in the Amazon S3 artifact bucket you chose when you created the pipeline. CodePipeline zips and transfers the files for input or output artifacts as appropriate for the action type in the stage.

Build

In the second phase of the pipeline, CodePipeline installs all dependencies and packages according to the specs of the targeted Lambda runtime. CodeBuild is a fully managed build service in the cloud. It reduces the need to provision, manage, and scale your own build servers. It provides prepackaged build environments for popular programming languages and build tools like npm for Node.js.

In CodeBuild, you use build specifications (buildspecs) to define what commands need to run to build your application. Here, it runs commands in a provisioned Docker image with Amazon Linux 2 to do the following:

  • Create the folder structure expected by Lambda Layer.
  • Run npm install to install all Node.js dependencies.
  • Package the code into a layer.zip file and define layer.zip as output of the Build stage.

The following CDK code highlights the specifications of the CodeBuild project.

const buildAction = new codebuild.PipelineProject(this, 'lambda-layer-builder', {
      buildSpec: codebuild.BuildSpec.fromObject({
        version: '0.2',
        phases: {
          install: {
            commands: [
              'mkdir -p node_layer/nodejs',
              'cp package.json ./node_layer/nodejs/package.json',
              'cd ./node_layer/nodejs',
              'npm install',
            ]
          },
          build: {
            commands: [
              'rm package-lock.json',
              'cd ..',
              'zip ../layer.zip * -r',
            ]
          }
        },
        artifacts: {
          files: [
            'layer.zip',
          ]
        }
      }),
      environment: {
        buildImage: codebuild.LinuxBuildImage.STANDARD_5_0
      }
    })

Distribute

In the final stage, a Lambda function uses layer.zip to create and publish a Lambda layer across multiple Regions. The sample code defines four Regions as targets for the distribution process:

regionCodesToDistribute: ['eu-central-1', 'eu-west-1', 'us-west-1', 'us-east-1']

The Distribute stage consists of n parallel invocations (one per Region) of the same Lambda function, each with userParameters.region set to the respective Region. This is defined in the CDK stack:

const parallel = props.regionCodesToDistribute.map((region) => new codepipelineActions.LambdaInvokeAction({
      actionName: `distribute-${region}`,
      lambda: distributor,
      inputs: [buildOutput],
      userParameters: { region, layerPrincipal: props.layerPrincipal }
}));

Each Lambda function runs the following code to publish a new Lambda layer in each Region:

// Simplified code for brevity
// Omitted error handling, permission management and logging 
// See code samples for full code.
export async function handler(event: any) {
    // #1 Get job specific parameters (e.g. target region)
    const { location } = event['CodePipeline.job'].data.inputArtifacts[0];
    const { region, layerPrincipal } = JSON.parse(event["CodePipeline.job"].data.actionConfiguration.configuration.UserParameters);
    
    // #2 Get location of layer.zip and download it locally
    const layerZip = await s3.getObject(/* Input artifact location */);
    const lambda = new Lambda({ region });
    // #3 Publish a new Lambda layer version based on layer.zip
    const layer = await lambda.publishLayerVersion({
        Content: {
            ZipFile: layerZip.Body
        },
        LayerName: 'sample-layer',
        CompatibleRuntimes: ['nodejs12.x', 'nodejs14.x']
    })
    
    // #4 Report the status of the operation back to CodePipeline
    return codepipeline.putJobSuccessResult(..);
}

After each Lambda function completes successfully, the pipeline ends. In a production application, you would likely have additional steps after publishing. For example, it may send notifications via Amazon SNS. To learn more about other possible integrations, read Working with pipelines in CodePipeline.
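For illustration, a notification step could look like the following sketch, written against the AWS SDK for JavaScript v3. The topic ARN and the helper name are assumptions and not part of the sample code.

import { SNSClient, PublishCommand } from '@aws-sdk/client-sns';

// Hypothetical helper: notify an SNS topic after a layer version was published.
export async function notifyLayerPublished(region: string, layerVersionArn: string) {
    const sns = new SNSClient({ region });
    await sns.send(new PublishCommand({
        TopicArn: 'arn:aws:sns:us-east-1:111111111111:layer-releases', // placeholder
        Subject: 'New Lambda layer version published',
        Message: JSON.stringify({ region, layerVersionArn }),
    }));
}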

Pipeline output

Testing the workflow

With this automation, you can release a new version of the Lambda layer by changing package.json in the source repository.

Add the AWS X-Ray SDK for Node.js as a dependency to your project, by making the following changes to package.json and committing the new version to your main branch:

{
    "name": "lambda-layer",
    "version": "1.0.0",
    "description": "Sample AWS Lambda layer",
    "dependencies": {
        "aws-sdk": "latest",
        "aws-xray-sdk": "latest"
    }
}

After committing the new version to the repository, the pipeline is triggered again. After a while, you see that an updated version of the Lambda layer is published to all Regions:

Execution history results

Cleaning up

Many services in this blog post are available in the AWS Free Tier. However, using this solution may incur costs, and you should tear down the stack if you don’t need it anymore. Cleanup steps are included in the README of the repository.

Conclusion

This blog post shows how to create a centralized pipeline to build and distribute Lambda layers consistently across multiple Regions. The pipeline is configurable and allows you to adapt the Regions and permissions according to your use-case.

For more serverless learning resources, visit Serverless Land.

Apply CI/CD DevOps principles to Amazon Redshift development

Post Syndicated from Ashok Srirama original https://aws.amazon.com/blogs/big-data/apply-ci-cd-devops-principles-to-amazon-redshift-development/

CI/CD in the context of application development is a well-understood topic, and developers can choose from numerous patterns and tools to build their pipelines to handle the build, test, and deploy cycle when a new commit gets into version control. For stored procedures or even schema changes that are directly related to the application, this is typically part of the code base and is included in the code repository of the application. These changes are then applied when the application gets deployed to the test or prod environment.

This post demonstrates how you can apply the same set of approaches to stored procedures, and even schema changes to data warehouses like Amazon Redshift.

Stored procedures are considered code and as such should undergo the same rigor as application code. This means that the pipeline should involve running tests against changes to make sure that no regressions are introduced to the production environment. Because we automate the deployment of both stored procedures and schema changes, this significantly reduces inconsistencies between environments.

Solution overview

The following diagram illustrates our solution architecture. We use AWS CodeCommit to store our code, AWS CodeBuild to run the build process and test environment, and AWS CodePipeline to orchestrate the overall deployment, from source, to test, to production.

Database migrations and tests also require connection information to the relevant Amazon Redshift cluster; we demonstrate how to integrate this securely using AWS Secrets Manager.

We discuss each service component in more detail later in the post.

You can see how all these components work together by completing the following steps:

  1. Clone the GitHub repo.
  2. Deploy the AWS CloudFormation template.
  3. Push code to the CodeCommit repository.
  4. Run the CI/CD pipeline.

Clone the GitHub repository

The CloudFormation template and the source code for the example application are available in the GitHub repo. Before you get started, you need to clone the repository using the following command:

git clone https://github.com/aws-samples/amazon-redshift-devops-blog

This creates a new folder, amazon-redshift-devops-blog, with the files inside.

Deploy the CloudFormation template

The CloudFormation stack creates the VPC, Amazon Redshift clusters, CodeCommit repository, CodeBuild projects for both test and prod, and the pipeline using CodePipeline to orchestrate the change release process.

  1. On the AWS CloudFormation console, choose Create stack.
  2. Choose With new resources (standard).
  3. Select Upload a template file.
  4. Choose Choose file and locate the template file (<cloned_directory>/cloudformation_template.yml).
  5. Choose Next.
  6. For Stack name, enter a name.
  7. In the Parameters section, provide the primary user name and password for both the test and prod Amazon Redshift clusters.

The username must be 1–128 alphanumeric characters, and it can’t be a reserved word.

The password has the following criteria:

  • Must be 8–64 characters
  • Must contain at least one uppercase letter
  • Must contain at least one lowercase letter
  • Must contain at least one number
  • Can only contain ASCII characters (ASCII codes 33–126), except ‘ (single quotation mark), ” (double quotation mark), /, \, or @

Please note that production credentials could be created separately by privileged admins, and you could pass in the ARN of a pre-existing secret instead of the actual password if you so choose.

  1. Choose Next.
  2. Leave the remaining settings at their default and choose Next.
  3. Select I acknowledge that AWS CloudFormation might create IAM resources.
  4. Choose Create stack.

You can choose the refresh icon on the stack’s Events page to track the progress of the stack creation.

Push code to the CodeCommit repository

When stack creation is complete, go to the CodeCommit console. Locate the redshift-devops-repo repository that the stack created. Choose the repository to view its details.

Before you can push any code into this repo, you have to set up your Git credentials using the instructions in Setup for HTTPS users using Git credentials. At Step 4 of that setup, copy the HTTPS URL, and instead of cloning, add the CodeCommit repo URL as a remote of the code that we cloned earlier:

git remote add codecommit <repo_https_url> 
git push codecommit main

The last step populates the repository; you can check it by refreshing the CodeCommit console. If you get prompted for a user name and password, enter the Git credentials that you generated and downloaded from Step 3 of the Setup for HTTPS users using Git credentials.

Run the CI/CD pipeline

After you push the code to the CodeCommit repository, this triggers the pipeline to deploy the code into both the test and prod Amazon Redshift clusters. You can monitor the progress on the CodePipeline console.

To dive deeper into the progress of the build, choose Details.

You’re redirected to the CodeBuild console, where you can see the run logs as well as the result of the test.

Components and dependencies

Although from a high-level perspective the test and prod environment look the same, there are some nuances with regards to how these environments are configured. Before diving deeper into the code, let’s look at the components first:

  • CodeCommit – This is the version control system where you store your code.
  • CodeBuild – This service runs the build process and test using Maven.
    • Build – During the build process, Maven uses FlyWay to connect to Amazon Redshift to determine the current version of the schema and what needs to be run to bring it up to the latest version.
    • Test – In the test environment, Maven runs JUnit tests against the test Amazon Redshift cluster. These tests may involve loading data and testing the behavior of the stored procedures. The results of the unit tests are published into the CodeBuild test reports.
  • Secrets Manager – We use Secrets Manager to securely store connection information to the various Amazon Redshift clusters. This includes host name, port, user name, password, and database name. CodeBuild refers to Secrets Manager for the relevant connection information when a build gets triggered. The underlying CodeBuild service role needs to have the corresponding permission to access the relevant secrets.
  • CodePipeline – CodePipeline is responsible for the overall orchestration from source to test to production.

As referenced in the components, we also use some additional dependencies at the code level:

  • Flyway – This framework is responsible for keeping different Amazon Redshift clusters in different environments in sync as far as schema and stored procedures are concerned.
  • JUnit – Unit testing framework written in Java.
  • Apache Maven – A dependency management and build tool. Maven is where we integrate Flyway and JUnit.

In the following sections, we dive deeper into how these dependencies are integrated.

Apache Maven

For Maven, the configuration file is pom.xml. For an example, you can check out the pom file from our demo app. The pertinent part of the xml is the build section:

<build>
        <plugins>
            <plugin>
                <groupId>org.flywaydb</groupId>
                <artifactId>flyway-maven-plugin</artifactId>
                <version>${flyway.version}</version>
                <executions>
                    <execution>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>migrate</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${surefire.version}</version>
            </plugin>
        </plugins>
    </build>

This section describes two things:

  • By default, the Surefire plugin triggers during the test phase of Maven. The plugin runs the unit tests and generates reports based on the results of those tests. These reports are stored in the target/surefire-reports folder. We reference this folder in the CodeBuild section.
  • Flyway is triggered during the process-resources phase of Maven, and it triggers the migrate goal of Flyway. Looking at Maven’s lifecycle, this phase is always triggered first and deploys the latest version of stored procedures and schemas to the test environment before running test cases.

Flyway

Changes to the database are called migrations, and these can be either versioned or repeatable. Developers indicate the migration type through the file naming convention, which Flyway uses to determine which one is which. The following diagram illustrates the naming convention.

A versioned migration consists of the regular SQL script that is run and an optional undo SQL script to reverse the specific version. You have to create this undo script in order to enable the undo functionality for a specific version. For example, a regular SQL script consists of creating a new table, and the corresponding undo script consists of dropping that table. Flyway is responsible for keeping track of which version a database is currently at, and runs N number of migrations depending on how far back the target database is compared to the latest version. Versioned migrations are the most common use of Flyway and are primarily used to maintain table schema and keep reference or lookup tables up to date by running data loads or updates via SQL statements. Versioned migrations are applied in order exactly one time.

Repeatable migrations don’t have a version; instead they’re rerun every time their checksum changes. They’re useful for maintaining user-defined functions and stored procedures. Instead of having multiple files to track changes over time, we can just use a single file and Flyway keeps track of when to rerun the statement to keep the target database up to date.
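For example, a migration folder following these conventions could contain files such as the following (the first two names are illustrative; the repeatable migration matches the one used later in this post):

  • V1__create_products_table.sql – a versioned migration that is applied exactly once, in version order
  • U1__create_products_table.sql – the optional undo script that reverses version 1
  • R__MergeStagedProducts.sql – a repeatable migration that is rerun whenever its checksum changes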

By default, these migration files are located in the classpath under db/migration, the full path being src/main/resources/db/migration. For our example application, you can find the source code on GitHub.

JUnit

When Flyway finishes running the migrations, the test cases are run. These test cases are under the folder src/test/java. You can find examples on GitHub that run a stored procedure via JDBC and validate the output or the impact.

Another aspect of unit testing to consider is how the test data is loaded and maintained in the test Amazon Redshift cluster. There are a couple of approaches to consider:

  • As per our example, we’re packaging the test data as part of our version control and loading the data when the first unit test is run. The advantage of this approach is that you get flexibility over when and where you run the test cases. You can start with either a completely empty or partially populated test cluster and end up with the right environment for the test case to run. Other advantages are that you can test data loading queries and have more granular control over the datasets that are being loaded for each test. The downside of this approach is that, depending on how big your test data is, it may add additional time for your test cases to complete.
  • Using an Amazon Redshift snapshot dedicated to the test environment is another way to manage the test data. With this approach, you have a couple more options:
    • Transient cluster – You can provision a transient Amazon Redshift cluster based on the snapshot when the CI/CD pipeline gets triggered. This cluster stops after the pipeline completes to save cost. The downside of this approach is that you have to factor in Amazon Redshift provisioning time in your end-to-end runtime.
    • Long-running cluster – Your test cases can connect to an existing cluster that is dedicated to running test cases. The test cases are responsible for making sure that data-related setup and teardown are done accordingly depending on the nature of the test that’s running. You can use @BeforeAll and @AfterAll JUnit annotations to trigger the setup and teardown, respectively.

CodeBuild

CodeBuild provides an environment where all of these dependencies run. As shown in our architecture diagram, we use CodeBuild for both test and prod. The differences are in the actual commands that run in each of those environments. These commands are stored in the buildspec.yml file. In our example, we provide a separate buildspec file for test and a different one for prod. During the creation of a CodeBuild project, we can specify which buildspec file to use.

There are a few differences between the test and prod CodeBuild project, which we discuss in the following sections.

Buildspec commands

In the test environment, we use mvn clean test and package the Surefire reports so the test results can be displayed via the CodeBuild console. In the prod environment, we just run mvn clean process-resources. The reason is that in the prod environment we only need to run the Flyway migrations, which are hooked up to the process-resources Maven lifecycle phase, whereas in the test environment we not only run the Flyway migrations, but also make sure that they didn’t introduce any regressions by running test cases. These test cases might have an impact on the underlying data, which is why we don’t run them against the production Amazon Redshift cluster. If you want to run the test cases against production data, you can use an Amazon Redshift production cluster snapshot and run the test cases against that.

Secrets via Secrets Manager

Both Flyway and JUnit need information to identify and connect to Amazon Redshift. We store this information in Secrets Manager. Using Secrets Manager has several benefits:

  • Secrets are encrypted automatically
  • Access to secrets can be tightly controlled via fine-grained AWS Identity and Access Management (IAM) policies
  • All activity with secrets is recorded, which enables easy auditing and monitoring
  • You can rotate secrets securely and safely without impacting applications

For our example application, we define the secret as follows:

{
  "username": "<Redshift username>",
  "password": "<Redshift password>",
  "host": "<Redshift hostname>",
  "port": <Redshift port>,
  "dbName": "<Redshift DB Name>"
}

CodeBuild is integrated with Secrets Manager, so we define the following environment variables as part of the CodeBuild project:

  • TEST_HOST: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:host
  • TEST_JDBC_USER: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:username
  • TEST_JDBC_PASSWORD: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:password
  • TEST_PORT: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:port
  • TEST_DB_NAME: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:dbName
  • TEST_REDSHIFT_IAM_ROLE: <ARN of IAM role> (This can be in plaintext and should be attached to the Amazon Redshift cluster)
  • TEST_DATA_S3_BUCKET: <bucket name> (This is where the test data is staged)

CodeBuild automatically retrieves the parameters from Secrets Manager and they’re available in the application as environment variables. If you look at the buildspec_prod.yml example, we use the preceding variables to populate the Flyway environment variables and JDBC connection URL.
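The sample’s CloudFormation template takes care of the permissions; if you wire this up yourself, remember that the CodeBuild service role must be allowed to read the secret, as noted in the components list. The following is a minimal sketch in CDK TypeScript, with the secret name and construct references assumed rather than taken from the sample stack.

import * as codebuild from 'aws-cdk-lib/aws-codebuild';
import * as secretsmanager from 'aws-cdk-lib/aws-secretsmanager';

// Assume the CodeBuild project and the secret already exist in this stack.
declare const testBuildProject: codebuild.Project;
const testClusterSecret = secretsmanager.Secret.fromSecretNameV2(
  this, 'TestRedshiftSecret', 'redshift/test' // placeholder secret name
);

// Grants secretsmanager:GetSecretValue (and related read actions) on the
// secret to the project's service role.
testClusterSecret.grantRead(testBuildProject);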

VPC configuration

For CodeBuild to be able to connect to Amazon Redshift, you need to configure which VPC it runs in. This includes the subnets and security group that it uses. The Amazon Redshift cluster’s security group also needs to allow access from the CodeBuild security group.
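The sample’s CloudFormation template configures this for you. Purely as an illustration, the same wiring in CDK TypeScript might look like the following sketch; the VPC, security groups, and buildspec file name are assumptions, not the sample’s exact resources.

import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as codebuild from 'aws-cdk-lib/aws-codebuild';

declare const vpc: ec2.IVpc;
declare const codeBuildSecurityGroup: ec2.SecurityGroup;
declare const redshiftSecurityGroup: ec2.SecurityGroup;

// Place the CodeBuild project in the VPC so it can reach the cluster endpoint.
new codebuild.PipelineProject(this, 'redshift-test-build', {
  vpc,
  subnetSelection: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
  securityGroups: [codeBuildSecurityGroup],
  buildSpec: codebuild.BuildSpec.fromSourceFilename('buildspec_test.yml'), // assumed file name
});

// The cluster's security group must allow inbound traffic from CodeBuild
// on the Redshift port (5439 by default).
redshiftSecurityGroup.addIngressRule(codeBuildSecurityGroup, ec2.Port.tcp(5439));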

CodePipeline

To bring all these components together, we use CodePipeline to orchestrate the flow from the source code through prod deployment. CodePipeline also has additional capabilities. For example, you can add an approval step between test and prod so a release manager can review the results of the tests before releasing the changes to production.

Example scenario

You can use tests as a form of documentation of what is the expected behavior of a function. To further illustrate this point, let’s look at a simple stored procedure:

create or replace procedure merge_staged_products()
as $$
BEGIN
    update products set status='CLOSED' where product_name in (select product_name from products_staging) and status='ACTIVE';
    insert into products(product_name,price) select product_name, price from products_staging;
END;
$$ LANGUAGE plpgsql;

If you deployed the example app from the previous section, you can follow along by copying the stored procedure code and pasting it in src/main/resources/db/migration/R__MergeStagedProducts.sql. Save it and push the change to the CodeCommit repository by issuing the following commands (assuming that you’re at the top of the project folder):

git add src
git commit -m "<commit message>"
git push codecommit main

After you push the changes to the CodeCommit repository, you can follow the progress of the build and test stages on the CodePipeline console.

We implement a basic Slowly Changing Dimension Type 2 approach in which we mark old data as CLOSED and append newer versions of the data. Although the stored procedure works as is, our test has the following expectations:

  • The number of rows marked CLOSED in the products table needs to correspond to the number of duplicate entries in the staging table.
  • The products table has a close_date column that needs to be populated so we know when a record was deprecated.
  • At the end of the merge, the staging table needs to be cleared for subsequent ETL runs.

The stored procedure will pass the first test, but fail later tests. When we push this change to CodeCommit and the CI/CD process runs, we can see results like in the following screenshot.

The tests show that the second and third tests failed. Failed tests result in the pipeline stopping, which means these bad changes don’t end up in production.

We can update the stored procedure and push the change to CodeCommit to trigger the pipeline again. The updated stored procedure is as follows:

create or replace procedure merge_staged_products()
as $$
BEGIN
    update products set status='CLOSED', close_date=CURRENT_DATE where product_name in (select product_name from products_staging) and status='ACTIVE';
    insert into products(product_name,price) select product_name, price from products_staging;
    truncate products_staging;
END;
$$ LANGUAGE plpgsql; 

All the tests passed this time, which allows CodePipeline to proceed with deployment to production.

We used Flyway’s repeatable migrations to make the changes to the stored procedure. Code is stored in a single file and Flyway verifies the checksum of the file to detect any changes and reapplies the migration if the checksum is different from the one that’s already deployed.

Clean up

After you’re done, it’s crucial to tear down the environment to avoid incurring additional charges beyond your testing. Before you delete the CloudFormation stack, go to the Resources tab of your stack and make sure the two buckets that were provisioned are empty. If they’re not empty, delete all the contents of the buckets.

Now that the buckets are empty, you can go back to the AWS CloudFormation console and delete the stack to complete the cleanup of all the provisioned resources.

Conclusion

Using CI/CD principles in the context of Amazon Redshift stored procedures and schema changes greatly improves confidence when updates are getting deployed to production environments. Similar to CI/CD in application development, proper test coverage of stored procedures is paramount to capturing potential regressions when changes are made. This includes testing both success paths as well as all possible failure modes.

In addition, versioning migrations enables consistency across multiple environments and prevents issues arising from schema changes that aren’t applied properly. This increases confidence when changes are being made and improves development velocity as teams spend more time developing functionality rather than hunting for issues due to environment inconsistencies.

We encourage you to try building a CI/CD pipeline for Amazon Redshift using the steps described in this blog post.


About the Authors

Ashok Srirama is a Senior Solutions Architect at Amazon Web Services, based in Washington Crossing, PA. He specializes in serverless applications, containers, devops, and architecting distributed systems. When he’s not spending time with his family, he enjoys watching cricket, and driving his bimmer.

Indira Balakrishnan is a Senior Solutions Architect in the AWS Analytics Specialist SA Team. She is passionate about helping customers build cloud-based analytics solutions to solve their business problems using data-driven decisions. Outside of work, she volunteers at her kids’ activities and spends time with her family.

Vaibhav Agrawal is an Analytics Specialist Solutions Architect at AWS. Throughout his career, he has focused on helping customers design and build well-architected analytics and decision support platforms.

Rajesh Francis is a Sr. Analytics Customer Experience Specialist at AWS. He specializes in Amazon Redshift and works with customers to build scalable Analytic solutions.

Jeetesh Srivastva is a Sr. Manager, Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and works with customers to implement scalable solutions using Amazon Redshift and other AWS Analytic services. He has worked to deliver on-premises and cloud-based analytic solutions for customers in banking and finance and hospitality industry verticals.

Parallel and dynamic SaaS deployments with AWS CDK Pipelines

Post Syndicated from Jani Muuriaisniemi original https://aws.amazon.com/blogs/devops/parallel-and-dynamic-saas-deployments-with-cdk-pipelines/

Software as a Service (SaaS) is an increasingly popular business model for independent software vendors (ISVs), including benefits such as a pay-as-you-go pricing model, scalability, and availability.

SaaS services can be built by using numerous architectural models. The silo model provides each tenant with dedicated resources and a shared-nothing architecture. Silo deployments also provide isolation between tenants’ compute resources and their data, and they help eliminate the noisy-neighbor problem. On the other hand, the pool model offers several benefits, such as lower maintenance overhead, simplified management and operations, and cost-saving opportunities, all due to a more efficient utilization of computing resources and capacity. In the bridge model, both silo and pool models are utilized side-by-side. The bridge model is a hybrid model, where parts of the system can be in a silo model, and parts in a pool.

End-customers benefit from SaaS delivery in numerous ways. For example, the service can be available from multiple locations, letting the customer choose what is best for them. The tenant onboarding process is often real-time and frictionless. To realize these benefits for their end-customers, SaaS providers need methods for reliable, fast, and multi-region capable provisioning and software lifecycle management.

This post will describe a deployment system for automating the provision and lifecycle management of workload components in pool or silo deployment models by using AWS Cloud Development Kit (AWS CDK) and CDK Pipelines. We will explore the system’s dynamic and database driven deployment model, as well as its multi-account and multi-region capabilities, and we will provision demo deployments of workload components in both the silo and pool models.

AWS Cloud Development Kit and CDK Pipelines

For this solution, we utilized AWS Cloud Development Kit (AWS CDK) and its CDK Pipelines construct library. AWS CDK is an open-source software development framework for modeling and provisioning cloud application resources by using familiar programming languages. AWS CDK lets you define your infrastructure as code and provision it through AWS CloudFormation.

CDK Pipelines is a high-level construct library with an opinionated implementation of a continuous deployment pipeline for your CDK applications. It is powered by AWS CodePipeline, a fully managed continuous delivery service that helps automate your release pipelines for fast and reliable application as well as infrastructure updates. No servers need to be provisioned or setup, and you only pay for what you use. This solution utilizes the recently released and stable CDK Pipelines modern API.

Business Scenario

As a baseline use case, we have selected the consideration of a fictitious ISV called Unicorn that wants to implement an SaaS business model.

Unicorn operates in several countries, and requires the storing of customer data within the customers’ chosen region. Currently, Unicorn needs two regions in order to satisfy its main customer base: one in EU and one in US. Unicorn expects rapid growth, and it needs a solution that can scale to thousands of tenants. Unicorn plans to have different tenant tiers with different isolation requirements. Their planned deployment model has the majority of tenants in shared pool instances, but they also plan to support dedicated silo instances for the tenants requiring it. The solution must also be easily extendable to new Regions as Unicorn’s business expands.

Unicorn is starting small with just a single development team responsible for currently the only component in their SaaS workload architecture. Following industry best practices, Unicorn has designed its workload architecture so that each component has a clear technical ownership boundary. The chosen solution must grow together with Unicorn, and support multiple independently developed and deployed components in the future.

Solution Overview

Today, many customers utilize AWS CodePipeline to build, test, and deploy their cloud applications. For an SaaS provider such as Unicorn, utilizing a single pipeline to manage every deployment raised concerns. At the scale that Unicorn requires, a single pipeline with potentially hundreds of actions runs the risk of becoming throughput limited. Moreover, a single pipeline would offer Unicorn limited control over how changes are released.

Our solution addresses this problem by having a separate dynamically provisioned pipeline for each pool and silo deployment. The solution is designed to manage multiple deployments of Unicorn’s single workload component, thereby aligning with their current needs and, with small changes, their future needs.

CDK Best Practices state that an AWS CDK application maps to a component as defined by the AWS Well-Architected Framework. A component is the code, configuration, and AWS Resources that together deliver against a workload requirement. And this is typically the unit of technical ownership. A component usually includes logical units (e.g., api, database), and can have a continuous deployment pipeline.

Utilizing CDK Pipelines provides a significant benefit: with no additional code, we can deploy cross-account and cross-region just as easily as we would to a single account and region. CDK Pipelines automatically creates and manages the required cross-account encryption keys and cross-region replication buckets. Furthermore, we only need to establish a trust relationship between the accounts during the CDK bootstrapping process.

The following diagram illustrates the solution architecture:

Solution Architecture Diagram

Figure 1: Solution architecture

Let’s look closer at the two primary high level solution flows: silo and pool pipeline provisioning (1 and 2), and component code deployment (3 and 4).

Provisioning is separated into a dedicated flow, so that code deployments do not interfere with tenant onboarding, and vice versa. At the heart of the provisioning flow is the deployment database (1), which is implemented by using an Amazon DynamoDB table.

Utilizing DynamoDB Streams and AWS Lambda Triggers, a new AWS CodeBuild provisioning project build (2) is automatically started after a record is inserted into the deployment database. The provisioning project directly provisions new silo and pool pipelines by using the “cdk deploy” command. Provisioning events are processed in parallel, so that the solution can handle possible bursts in Unicorn’s tenant onboarding volumes.
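The following CDK TypeScript sketch shows one way this trigger could be wired up. It is a simplified illustration under assumed names and paths; the sample repository’s implementation may differ in its details.

import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { DynamoEventSource } from 'aws-cdk-lib/aws-lambda-event-sources';

// Deployment database with a stream of newly inserted records.
const deploymentTable = new dynamodb.Table(this, 'DeploymentTable', {
  partitionKey: { name: 'id', type: dynamodb.AttributeType.STRING },
  stream: dynamodb.StreamViewType.NEW_IMAGE,
});

// Function that reacts to stream records; its handler (not shown) calls the
// CodeBuild StartBuild API for the provisioning project, which runs "cdk deploy".
const provisioningTrigger = new lambda.Function(this, 'ProvisioningTrigger', {
  runtime: lambda.Runtime.NODEJS_14_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset('lambda/provisioning-trigger'), // assumed path
});

provisioningTrigger.addEventSource(new DynamoEventSource(deploymentTable, {
  startingPosition: lambda.StartingPosition.LATEST,
  batchSize: 1, // one provisioning build per inserted deployment record
}));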

CDK best practices suggest that infrastructure and runtime code live in the same package. A single AWS CodeCommit repository (3) contains everything needed: the CI/CD pipeline definitions as well as the workload component code. This repository is the source artifact for every CodePipeline pipeline and CodeBuild project. The chapter “Managing application resources as code” describes related implementation details.

The CI/CD pipeline (4) is a CDK Pipelines pipeline, and it is responsible for the component’s Software Development Life Cycle (SDLC) activities. In addition to implementing the update release process, it is expected that most SaaS providers will also implement additional activities. This includes a variety of tests and pre-production environment deployments. The chapter “Controlling deployment updates” dives deeper into this topic.

Deployments have two parts: The pipeline (5) and the component resource stack(s) (6) that it manages. The pipelines are deployed to the central toolchain account and region, whereas the component resources are deployed to the AWS Account and Region, as specified in the deployments’ record in the deployment database.

Sample code for the solution is available in GitHub. The sample code is intended for utilization in conjunction with this post. Our solution is implemented in TypeScript.

Deployment Database

Our deployment database is an Amazon DynamoDB table, with the following structure:

Table structure explained in post.

Figure 2: DynamoDB table

  • ‘id’ is a unique identifier for each deployment.
  • ‘account’ is the AWS account ID for the component resources.
  • ‘region’ is the AWS region ID for the component resources.
  • ‘type’ is either ‘silo’ or ‘pool’, which defines the deployment model.

This design supports tenant deployment to multiple silo and pool deployments. Each of these can target any available and bootstrapped AWS Account and Region. For example, different pools can support tenants in different regions, with select tenants deployed to dedicated silos. As pools may be limited in how many tenants they can serve, the design also supports having multiple pools within a region, and it can easily be extended with an additional attribute to support the tiers concept.

Note that the deployment database does not contain tenant information. It is expected that such mapping is maintained in a separate tenant database, where each tenant record can map to the ID of the deployment that it is associated with.
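For example, a record in such a tenant database might have a shape like the following; this interface is an assumption for illustration, not part of the sample code.

// Hypothetical tenant record in a separate tenant database.
interface TenantRecord {
  tenantId: string;      // unique tenant identifier
  deploymentId: string;  // references 'id' in the deployment database, e.g. 'pool1' or 'silo1'
}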

Now that we have looked at our solution design and architecture, let’s move to the hands-on section, starting with the deployment requirements for the solution.

Prerequisites

The following tools are required to deploy the solution:

To follow this tutorial completely, you should have administrator access to at least one, but preferably two AWS accounts:

  • Toolchain: Account for the SDLC toolchain: the pipelines, the provisioning project, the repository, and the deployment database.
  • Workload (optional): Account for the component resources.

If you have only a single account, then the toolchain account can be used for both purposes. Credentials for the account(s) are assumed to be configured in AWS CLI profile(s).

The instructions in this post use the following placeholders, which you must replace with your specific values:

  • <TOOLCHAIN_ACCOUNT_ID>: The AWS Account ID for the toolchain account
  • <TOOLCHAIN_PROFILE_NAME>: The AWS CLI profile name for the toolchain account credentials
  • <WORKLOAD_ACCOUNT_ID>: The AWS Account ID for the workload account
  • <WORKLOAD_PROFILE_NAME>: The AWS CLI profile name for the workload account credentials

Bootstrapping

The toolchain account, and all workload account(s), must be bootstrapped prior to first-time deployment.

AWS CDK and our solution’s dependencies must be installed to start with. The easiest way to do this is to install them locally with npm. First, we need to download our sample code, so that we have the package.json configuration file available for npm.

Note that throughout these instructions, many commands are broken over multiple lines for readability. Take care to execute the commands completely. It is always safe to execute each code block as a whole.

Clone the sample code repository from GitHub, and then install the dependencies by using npm:

git clone https://github.com/aws-samples/aws-saas-parallel-deployments
cd aws-saas-parallel-deployments
npm ci 

CDK Pipelines requires use of modern bootstrapping. To ensure that this is enabled, start by setting the related environment variable:

export CDK_NEW_BOOTSTRAP=1

Then, bootstrap the toolchain account. You must bootstrap both the region where the toolchain stack is deployed, as well as every target region for component resources. Here, we will first bootstrap only the us-east-1 region, and later you can optionally bootstrap additional region(s).

To bootstrap, we use npx to execute the locally installed version of AWS CDK:

npx cdk bootstrap <TOOLCHAIN_ACCOUNT_ID>/us-east-1 --profile <TOOLCHAIN_PROFILE_NAME>

If you have a workload account that is separate from the toolchain account, then that account must also be bootstrapped. When bootstrapping the workload account, we will establish a trust relationship with the toolchain account. Skip this step if you don’t have a separate workload account.

The workload account bootstrapping follows the security best practice of least privilege. First, create an execution policy with the minimum permissions required to deploy our demo component resources. We provide a sample policy file in the solution repository for this purpose. Then, use that policy as the execution policy for the trust relationship between the toolchain account and the workload account:

aws iam create-policy \
  --profile <WORKLOAD_PROFILE_NAME> \
  --policy-name CDK-Exec-Policy \
  --policy-document file://policies/workload-cdk-exec-policy.json
npx cdk bootstrap <WORKLOAD_ACCOUNT_ID>/us-east-1 \
  --profile <WORKLOAD_PROFILE_NAME> \
  --trust <TOOLCHAIN_ACCOUNT_ID> \
  --cloudformation-execution-policies arn:aws:iam::<WORKLOAD_ACCOUNT_ID>:policy/CDK-Exec-Policy

Toolchain deployment

Prior to being able to deploy for the first time, you must create an AWS CodeCommit repository for the solution. Create this repository in the toolchain account:

aws codecommit create-repository \
  --profile <TOOLCHAIN_PROFILE_NAME> \
  --region us-east-1 \
  --repository-name unicorn-repository

Next, you must push the contents to the CodeCommit repository. For this, use the git command together with the git-remote-codecommit extension in order to authenticate to the repository with your AWS CLI credentials. Our pipelines are configured to use the main branch.

git remote add unicorn codecommit::us-east-1://<TOOLCHAIN_PROFILE_NAME>@unicorn-repository
git push unicorn main

Now we are ready to deploy the toolchain stack:

export AWS_REGION=us-east-1
npx cdk deploy --profile <TOOLCHAIN_PROFILE_NAME>

Workload deployments

At this point, our CI/CD pipeline, provisioning project, and deployment database have been created. The database is initially empty.

Note that the DynamoDB command line interface demonstrated below is not intended to be the SaaS provider’s provisioning interface for production use. SaaS providers typically have online registration portals, wherein the customer signs up for the service. When new deployments are needed, a record should automatically be inserted into the solution’s deployment database.

To demonstrate the solution’s capabilities, first we will provision two deployments, with an optional third cross-region deployment:

  1. A silo deployment (silo1) in the us-east-1 region.
  2. A pool deployment (pool1) in the us-east-1 region.
  3. A pool deployment (pool2) in the eu-west-1 region (optional).

To start, configure the AWS CLI environment variables:

export AWS_REGION=us-east-1
export AWS_PROFILE=<TOOLCHAIN_PROFILE_NAME>

Add the deployment database records for the first two deployments:

aws dynamodb put-item \
  --table-name unicorn-deployments \
  --item '{
    "id": {"S":"silo1"},
    "type": {"S":"silo"},
    "account": {"S":"<WORKLOAD_ACCOUNT_ID>"},
    "region": {"S":"us-east-1"}
  }'
aws dynamodb put-item \
  --table-name unicorn-deployments \
  --item '{
    "id": {"S":"pool1"},
    "type": {"S":"pool"},
    "account": {"S":"<WORKLOAD_ACCOUNT_ID>"},
    "region": {"S":"us-east-1"}
  }'

This will trigger two parallel builds of the provisioning CodeBuild project. Use the CodeBuild Console in order to observe the status and progress of each build.

Cross-region deployment (optional)

Optionally, also try a cross-region deployment. Skip this part if a cross-region deployment is not relevant for your use case.

First, you must bootstrap the target region in the toolchain and the workload accounts. Bootstrapping of eu-west-1 here is identical to the bootstrapping of the us-east-1 region earlier. First bootstrap the toolchain account:

npx cdk bootstrap <TOOLCHAIN_ACCOUNT_ID>/eu-west-1 --profile <TOOLCHAIN_PROFILE_NAME>

If you have a separate workload account, then we must also bootstrap it for the new region. Again, please skip this if you have only a single account:

npx cdk bootstrap <WORKLOAD_ACCOUNT_ID>/eu-west-1 \
  --profile <WORKLOAD_PROFILE_NAME> \
  --trust <TOOLCHAIN_ACCOUNT_ID> \
  --cloudformation-execution-policies arn:aws:iam::<WORKLOAD_ACCOUNT_ID>:policy/CDK-Exec-Policy

Then, add the cross-region deployment:

aws dynamodb put-item \
  --table-name unicorn-deployments \
  --item '{
    "id": {"S":"pool2"},
    "type": {"S":"pool"},
    "account": {"S":"<WORKLOAD_ACCOUNT_ID>"},
    "region": {"S":"eu-west-1"}
  }'

Validation of deployments

After the builds have completed, use the CodePipeline console to verify that the deployment pipelines were successfully created in the toolchain account:

CodePipeline console showing Pool-pool2-pipeline, Pool-pool1-pipeline and Silo-silo1-pipeline all Succeeded most recent execution.

Figure 3: CodePipeline console

Similarly, in the workload account, stacks containing your component resources will have been deployed to each configured region for the deployments. In this demo, we are deploying a single “hello world” container application utilizing AWS App Runner as the runtime environment. Successful deployment can be verified by using the CloudFormation console:

Console showing Pool-pool1-resources with status of CREATE_COMPLETE

Figure 4: CloudFormation console

Now that we have successfully finished with our demo deployments, let’s look at how updates to the pipelines and the component resources can be managed.

Managing application resources as code

As highlighted earlier in the Solution Overview, every aspect of our solution shares a single source repository. With all of our code in a single source, we can easily deliver complex changes impacting multiple aspects of our solution. And all of this can be packaged, tested, and released as a single change set. For example, a change can introduce a new stage to the CI/CD pipeline, modify an existing stage in the silo and pool pipelines, and/or make code and resource changes to the component resources.

Managing the pipeline definitions is made simple by the self-mutate capability of the CDK Pipelines. Once initially deployed, each CDK Pipelines pipeline can update its own definition. This is implemented by using a separate SelfMutate stage in the pipeline definition. This stage is executed before any deployment actions, thereby ensuring that the pipeline always executes the latest version that is defined by the source code.

Managing how and when the pipelines are triggered also required attention. By default, CDK Pipelines configures pipelines to start automatically on changes to the source repository. While this is a reasonable default, and it is great for the CI/CD pipeline, it is undesired for our silo and pool pipelines. If all of these pipelines executed automatically on code commits to the source repository, the CI/CD pipeline could not manage the release flow. To address this, we have configured the silo and pool pipelines with the trigger in the CodeCommitSourceOptions set to NONE.
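The following sketch illustrates both points for a silo or pool pipeline. The construct IDs and commands are placeholders, and the sample repository’s code may differ.

import * as codecommit from 'aws-cdk-lib/aws-codecommit';
import * as pipelines from 'aws-cdk-lib/pipelines';
import { CodeCommitTrigger } from 'aws-cdk-lib/aws-codepipeline-actions';

declare const repository: codecommit.IRepository;

// Source that does not start the pipeline on commits; the CI/CD pipeline
// decides when each silo and pool pipeline runs.
const source = pipelines.CodePipelineSource.codeCommit(repository, 'main', {
  trigger: CodeCommitTrigger.NONE,
});

new pipelines.CodePipeline(this, 'SiloPipeline', {
  selfMutation: true, // default: a SelfMutate stage updates the pipeline definition first
  synth: new pipelines.ShellStep('Synth', {
    input: source,
    commands: ['npm ci', 'npx cdk synth'],
  }),
});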

Controlling deployment updates

A key aspect of SaaS delivery is controlling how you roll out changes to tenants. Significant business risk can arise if changes are released to all tenants all-at-once in a single big bang.

This risk can be managed by utilizing a combination of silo and pool deployments. Reduce your risk by spreading tenants into multiple pools, and gradually rolling out your changes to these pools. Based on business needs and/or risk assessment, select customers can be provisioned into dedicated silo deployments, thereby allowing update control for those customers separately. Note that while all of a pool’s tenants get the same underlying update simultaneously, you can utilize feature flags to selectively enable new features only for specific tenants in the deployment.

In the demo solution, the CI/CD pipeline contains only a single custom stage “UpdateDeployments”. This CodeBuild action implements a simple “one-at-a-time” strategy. The code has been purposely written so that it is simple and provides you with a starting point to implement your own more complex strategy, as based on your unique business needs. In the default implementation, every silo and pool pipeline tracks the same “main” branch of the repository. Releases are governed by controlling when each pipeline executes to update its resources.
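A hedged sketch of such a strategy is shown below, using the AWS SDK for JavaScript v3. It assumes the deployment table and pipeline naming seen earlier in this post (unicorn-deployments, Silo-silo1-pipeline, Pool-pool1-pipeline); the actual CodeBuild action in the sample may be implemented differently.

import { DynamoDBClient, ScanCommand } from '@aws-sdk/client-dynamodb';
import { CodePipelineClient, StartPipelineExecutionCommand } from '@aws-sdk/client-codepipeline';

const ddb = new DynamoDBClient({});
const codePipeline = new CodePipelineClient({});

// One-at-a-time rollout: start each deployment pipeline sequentially.
export async function updateDeployments(): Promise<void> {
  const result = await ddb.send(new ScanCommand({ TableName: 'unicorn-deployments' }));
  for (const item of result.Items ?? []) {
    const id = item.id?.S ?? '';
    const type = item.type?.S ?? 'pool';
    const pipelineName = `${type === 'silo' ? 'Silo' : 'Pool'}-${id}-pipeline`;
    await codePipeline.send(new StartPipelineExecutionCommand({ name: pipelineName }));
    // A more complete strategy would wait for each execution to succeed
    // (or implement canary/batch logic) before starting the next one.
  }
}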

When designing your release strategy, look into how the planned process helps implement releases and changes with high quality and frequency. A typical starting point is a CI/CD pipeline with continuous automated deployments via multiple test and staging environments in order to validate your changes prior to deployment to any production tenants.

Furthermore, consider if utilizing a canary release strategy would help identify potential issues with your changes prior to rolling them out across all deployments in production. In a canary release, each change is first deployed only to a small subset of your deployments. Once you are satisfied with the change quality, then the change can either automatically or manually be released to the rest of your deployments. As an example, an AWS Step Functions state machine could be combined with the solution, and then utilized to control the release flow, execute validation tests, implement approval steps (either manual or automatic), and even conduct rollback if necessary.

Further considerations

The example in this post provisions every silo and pool deployment to a single AWS account. However, the solution is not limited to a single account, and it can deploy equally easily to multiple AWS accounts. When operating at scale, it is best-practice to spread your workloads to several accounts. The Organizing Your AWS Environment using Multiple Accounts whitepaper has in-depth guidance on strategies for spreading your workloads.

If combined with an AWS account-vending machine implementation, such as an AWS Control Tower Landing Zone, then the demo solution could be adapted so that new AWS accounts are provisioned automatically. This would be useful if your business requires full account-level deployment isolation, and you also want automated provisioning.

To meet Unicorn’s future needs for spreading their solution architecture over multiple separate components, the deployment database and associated lambda function could be decoupled from the rest of the toolchain components in order to provide a central deployment service. When provisioned as standalone, and amended with Amazon Simple Notification Service-based notifications sent to the component deployment systems for example, this central deployment service could be utilized for managing the deployments for multiple components.

In addition, you should analyze your deployment lifecycle transitions, and then consider what action should be taken when a tenant is disabled and/or deleted. Implementing a deployment archival/deletion process is not in the scope of this post.

Cleanup

To cleanup every resource deployed in this post, conduct the following actions:

  1. In the workload account:
    1. In us-east-1 Region, delete CloudFormation stacks named “pool-pool1-resources” and “silo-silo1-resources” and the CDK bootstrap stack “CDKToolKit”.
    2. In eu-west-1 Region, delete CloudFormation stack named “pool-pool2-resources” and the CDK Bootstrap stack “CDKToolKit”
  2. In the toolchain account:
    1. In us-east-1 Region, delete CloudFormation stacks “toolchain”, “pool-pool1-pipeline”, “pool-pool2-pipeline”, “silo-silo1-pipeline” and the CDK bootstrap stack “CDKToolKit”.
    2. In eu-west-1 Region, delete CloudFormation stack “pool-pool2-pipeline-support-eu-west-1” and the CDK bootstrap stack “CDKToolKit”
    3. Cleanup and delete S3 buckets “toolchain-*”, “pool-pool1-pipeline-*”, “pool-pool2-pipeline-*”, and “silo-silo1-pipeline-*”.

Conclusion

This solution demonstrated an implementation of an automated SaaS application component deployment factory. We covered how an ISV venturing into the SaaS model can utilize AWS CDK and CDK Pipelines in order to avoid a multitude of undifferentiated heavy lifting by leveraging and combining AWS CDK’s cross-region and cross-account capabilities with CDK Pipelines’ self-mutating deployment pipelines. Furthermore, we demonstrated how all of this can be written, managed, and released just like any other code you write. We also demonstrated how a single dynamic provisioning system can be utilized to operate in a mixed mode, with both silo and pool deployments.

Visit the AWS SaaS Factory Program page for further information on how AWS can help you on your SaaS journey — regardless of the stage you are currently in.

About the authors

Jani Muuriaisniemi

Jani is a Principal Solutions Architect at Amazon Web Services based out of Helsinki, Finland. With more than 20 years of industry experience, he works as a trusted advisor with a broad range of customers across different industries and segments, helping the customers on their cloud journey.

Jose Juhala

Jose is a Solutions Architect at Amazon Web Services based out of Tampere, Finland. He works with customers across different industries in the Nordics and Baltics, and guides them on their technical implementation and architectural questions.

Building an InnerSource ecosystem using AWS DevOps tools

Post Syndicated from Debashish Chakrabarty original https://aws.amazon.com/blogs/devops/building-an-innersource-ecosystem-using-aws-devops-tools/

InnerSource is the term for the emerging practice of organizations adopting the open source methodology, albeit to develop proprietary software. This blog discusses the building of a model InnerSource ecosystem that leverages multiple AWS services, such as CodeBuild, CodeCommit, CodePipeline, CodeArtifact, and CodeGuru, along with other AWS services and open source tools.

What is InnerSource and why is it gaining traction?

Most software companies leverage open source software (OSS) in their products, as it is a great mechanism for standardizing software and bringing in cost-effectiveness via the reuse of high-quality, time-tested code. Some organizations may allow OSS use as-is, while others may utilize a vetting mechanism to ensure that the OSS adheres to the organization’s standards for security, quality, and so on. This confidence in OSS stems from how these community projects are managed and sustained, as well as the culture of openness, collaboration, and creativity that they nurture.

Many organizations building closed source software are now trying to imitate these development principles and practices. This approach, which has been perhaps more discussed than adopted, is popularly called “InnerSource”. InnerSource serves as a great tool for collaborative software development within the organization’s perimeter, while keeping intellectual property and legal concerns in check. It provides collaboration and innovation avenues beyond the confines of organizational silos through knowledge and talent sharing. Organizations reap the benefits of better code quality and faster time-to-market, yet at only a fraction of the cost.

What constitutes an InnerSource ecosystem?

Infrastructure and processes that foster collaboration stand at the heart of an InnerSource ecosystem. These systems (refer to Figure 1) typically include tools supporting features such as code hosting, peer reviews, pull request (PR) approval flow, issue tracking, documentation, communication and collaboration, continuous integration, and automated testing, among others. Another major component of this system is an entry portal that enables employees to discover InnerSource projects and join the community, beginning as ordinary users of the reusable code and later graduating to contributors and committers.

A typical InnerSource ecosystem

Figure 1: A typical InnerSource ecosystem

More to InnerSource than meets the eye

This blog focuses on detailing a technical solution for establishing the required tools for an InnerSource system, primarily to enable a development workflow and infrastructure. But a successful InnerSource initiative in an enterprise requires many other ingredients.

InnerSource Roles & Use Cases

Figure 2: InnerSource Roles & Use Cases

InnerSource thrives on community collaboration and a low entry barrier to enable adoption. In turn, that demands a cultural makeover. While strategically deciding on the projects that can be InnerSourced, as well as the appropriate licensing model, enterprises should bootstrap the initiative with a seed product that draws the community, with maintainers and a first set of contributors. Many of these users will eventually be promoted, through a meritocracy-based system, to become trusted committers.

Over a set period, the organization should plan to move from an infrastructure-based model to a project-specific model. In a project-specific InnerSource model, the responsibility for a particular software asset is owned by a dedicated team funded by other business units. In the infrastructure-based InnerSource model, the organization provides the necessary infrastructure to create the ecosystem with code and document repositories, communication tools, and so on. This enables anybody in the organization to create a new InnerSource project, although each project initiator maintains their own projects. They could begin by establishing a community of practice, and designating a core team that provides continuing support to the InnerSource projects’ internal customers. Having a team of dedicated resources clearly indicates the organization’s long-term commitment to sustaining the initiative. The organization should promote this culture through regular boot camps, trainings, and a recognition program.

Lastly, the significance of having a modular architecture in InnerSource projects cannot be overstated. This architecture helps developers understand the code better, and it aids code reuse and parallel development, where multiple contributors can work on different code modules while avoiding conflicts during code merges.

A model InnerSource solution using AWS services

This blog discusses a solution that weaves various services together to create the necessary infrastructure for an InnerSource system. While it is not a full-blown solution, and it may lack some other components that an organization may desire in its own system, it can provide you with a good head start.

The ultimate goal of the model solution is to enable a developer workflow as depicted in Figure 3.

Typical developer workflow at InnerSource

Figure 3: Typical developer workflow at InnerSource

At the core of the InnerSource-verse is the distributed version control (AWS CodeCommit in our case). To maintain system transparency, openness, and participation, we must have a discovery mechanism where users could search for the projects and receive encouragement to contribute to the one they prefer (Step 1 in Figure 4).

Architecture diagram for the model InnerSource system

Figure 4: Architecture diagram for the model InnerSource system

For this purpose, the model solution utilizes an open source reference implementation of an InnerSource portal. The portal indexes data from AWS CodeCommit by using a crawler, and it lists available projects with associated metadata, such as the skills required, number of active branches, and average number of commits. For CodeCommit, you can use the crawler implementation that we created in the open source code repo at https://github.com/aws-samples/codecommit-crawler-innersource.
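
For illustration only, the following minimal Python (Boto3) sketch shows the kind of repository metadata such a crawler could collect for the portal index. The field names and the way the metrics are derived are assumptions made for this example; they are not the actual crawler implementation from the repository linked above.

import boto3

codecommit = boto3.client("codecommit", region_name="us-east-1")

# Page through every repository in the Region.
repositories = []
for page in codecommit.get_paginator("list_repositories").paginate():
    repositories.extend(page["repositories"])

portal_index = []
for repo in repositories:
    name = repo["repositoryName"]
    # Basic metadata the portal could display for project discovery.
    metadata = codecommit.get_repository(repositoryName=name)["repositoryMetadata"]
    branches = codecommit.list_branches(repositoryName=name)["branches"]
    portal_index.append(
        {
            "name": name,
            "description": metadata.get("repositoryDescription", ""),
            "defaultBranch": metadata.get("defaultBranch"),
            "activeBranches": len(branches),
        }
    )

print(portal_index)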

The major portal feature is providing an option to contribute to a project by using a “Contribute” link. This can present a pop-up form to “apply as a contributor” (Step 2 in Figure 4), which, when submitted, sends an email (or creates a ticket) to the project maintainer/committer, who can create an IAM user (Step 3 in Figure 4) with access to the particular repository. Note that the pop-up form functionality is not built into the open source version of the portal. However, it would be trivial to add one with an associated action (send an email, cut a ticket, etc.).

InnerSource portal indexes CodeCommit repos and provides a bird’s eye view

Figure 5: InnerSource portal indexes CodeCommit repos and provides a bird’s eye view

Upon receiving access, the contributor logs in to CodeCommit, clones the InnerSource project (Step 4 in Figure 4), creates a fix or feature branch from the mainline branch, and starts altering or adding code. Once done, the contributor commits the code to the branch and raises a PR (Step 5 in Figure 4). A pull request is a mechanism to offer code to an existing repository; the code is then peer-reviewed and tested before acceptance for inclusion.

The PR triggers a CodeGuru review (Step 6 in Figure 4) that adds the recommendations as comments on the PR. Furthermore, it triggers a CodeBuild process (Steps 7 to 10 in Figure 4) and logs the build result in the PR. At this point, the code can be peer reviewed by Trusted Committers or Owners of the project repository. The number of approvals depends on the approval rule template configured in CodeCommit. The Committer(s) can approve the PR (Step 12 in Figure 4) and merge the code to the mainline branch, that is, once they verify that the code serves its purpose, has passed the required tests, and doesn’t break the build. They could also rely on the approval vote from a sanity test conducted by a CodeBuild process. Optionally, a build process could deploy the latest mainline code (Step 14 in Figure 4) on the PR merge commit.
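
As a sketch of the approval mechanics mentioned above, the following Python (Boto3) snippet creates a CodeCommit approval rule template that requires two approvals and associates it with a repository. The template name, repository name, destination branch, and approval pool ARN are hypothetical placeholders, not values from this solution.

import json
import boto3

codecommit = boto3.client("codecommit", region_name="us-east-1")

# Require two approvals from a pool of trusted committers for PRs targeting the mainline branch.
rule_content = {
    "Version": "2018-11-08",
    "DestinationReferences": ["refs/heads/main"],
    "Statements": [
        {
            "Type": "Approvers",
            "NumberOfApprovalsNeeded": 2,
            "ApprovalPoolMembers": [
                "arn:aws:sts::111122223333:assumed-role/TrustedCommitterRole/*"
            ],
        }
    ],
}

codecommit.create_approval_rule_template(
    approvalRuleTemplateName="innersource-two-approvals",
    approvalRuleTemplateContent=json.dumps(rule_content),
    approvalRuleTemplateDescription="Two trusted committers must approve each PR",
)

codecommit.associate_approval_rule_template_with_repository(
    approvalRuleTemplateName="innersource-two-approvals",
    repositoryName="my-innersource-project",
)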

To maintain transparency in all communications related to progress, bugs, and feature requests to downstream users and contributors, a communication tool may be needed. This solution does not show integration with any Issue/Bug tracking tool out of the box. However, multiple of these tools are available at the AWS marketplace, with some offering forum and Wiki add-ons in order to elicit discussions. Standard project documentation can be kept within the repository by using the constructs of the README.md file to provide project mission details and the CONTRIBUTING.md file to guide the potential code contributors.

An overview of the AWS services used in the model solution

The model solution employs the following AWS services:

  • AWS CodeCommit: a fully managed source control service to host secure and highly scalable private Git repositories.
  • AWS CodeBuild: a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy.
  • AWS CodeDeploy: a service that automates code deployments to any instance, including EC2 instances and instances running on-premises.
  • Amazon CodeGuru: a developer tool providing intelligent recommendations to improve code quality and identify an application’s most expensive lines of code.
  • AWS CodePipeline: a fully managed continuous delivery service that helps automate release pipelines for fast and reliable application and infrastructure updates.
  • AWS CodeArtifact: a fully managed artifact repository service that makes it easy to securely store, publish, and share software packages used in your software development process.
  • Amazon S3: an object storage service that offers industry-leading scalability, data availability, security, and performance.
  • Amazon EC2: a web service providing secure, resizable compute capacity in the cloud. It is designed to ease web-scale computing for developers.
  • Amazon EventBridge: a serverless event bus that eases the building of event-driven applications at scale by using events generated from applications and AWS services.
  • AWS Lambda: a serverless compute service that lets you run code without provisioning or managing servers.

The journey of a thousand miles begins with a single step

InnerSource might not be the right fit for every organization, but it is a great step for those wanting to encourage a culture of quality and innovation, as well as break down silos through enhanced collaboration. It requires backing from leadership to sponsor the engineering initiatives, as well as to champion the establishment of an open and transparent culture that grants developers across the organization the autonomy to contribute to projects outside of their teams. The organizations best suited for InnerSource have already participated in open source initiatives, have engineering teams that are adept with CI/CD tools, and are willing to adopt OSS practices. They should start small with a pilot and build upon their successes.

Conclusion

Ever more enterprises are adopting the open source culture to develop proprietary software by establishing an InnerSource program. This instills innovation, transparency, and collaboration that result in cost-effective, high-quality software development. This blog discussed a model solution to build the developer workflow inside an InnerSource ecosystem, from project discovery to PR approval and deployment. Additional features, like an integrated issue tracker, real-time chat, and a wiki or forum, can further enrich this solution.

If you need help, AWS Professional Services can adapt and implement this model InnerSource solution in your enterprise. Moreover, our Advisory services can help establish the governance model to accelerate OSS culture adoption through Experience Based Acceleration (EBA) parties.

About the authors

Debashish Chakrabarty

Debashish is a Senior Engagement Manager at AWS Professional Services, India, managing complex projects on DevOps, security, and modernization, and helping ProServe customers accelerate their adoption of AWS services. He loves to stay connected to his technical roots. Outside of work, Debashish is a Hindi podcaster and blogger. He also loves binge-watching Amazon Prime and spending time with family.

Akash Verma

Akash works as a Cloud Consultant for AWS Professional Services, India. He enjoys learning new technologies and helping customers solve complex technical problems and drive business outcomes by providing solutions using AWS products and services. Outside of work, Akash loves to travel, meet new people, and try different cuisines. He also enjoys gardening, watching stand-up comedy, and listening to poetry.

Validate IAM policies in CloudFormation templates using IAM Access Analyzer

Post Syndicated from Matt Luttrell original https://aws.amazon.com/blogs/security/validate-iam-policies-in-cloudformation-templates-using-iam-access-analyzer/

In this blog post, I introduce IAM Policy Validator for AWS CloudFormation (cfn-policy-validator), an open source tool that extracts AWS Identity and Access Management (IAM) policies from an AWS CloudFormation template, and allows you to run existing IAM Access Analyzer policy validation APIs against the template. I also show you how to run the tool in a continuous integration and continuous delivery (CI/CD) pipeline to validate IAM policies in a CloudFormation template before they are deployed to your AWS environment.

Embedding this validation in a CI/CD pipeline can help prevent IAM policies that have IAM Access Analyzer findings from being deployed to your AWS environment. This tool acts as a guardrail that can allow you to delegate the creation of IAM policies to the developers in your organization. You can also use the tool to provide additional confidence in your existing policy authoring process, enabling you to catch mistakes prior to IAM policy deployment.

What is IAM Access Analyzer?

IAM Access Analyzer mathematically analyzes access control policies that are attached to resources, and determines which resources can be accessed publicly or from other accounts. IAM Access Analyzer can also validate both identity and resource policies against over 100 checks, each designed to improve your security posture and to help you to simplify policy management at scale.

The IAM Policy Validator for AWS CloudFormation tool

IAM Policy Validator for AWS CloudFormation (cfn-policy-validator) is a new command-line tool that parses resource-based and identity-based IAM policies from your CloudFormation template, and runs the policies through IAM Access Analyzer checks. The tool is designed to run in the CI/CD pipeline that deploys your CloudFormation templates, and to prevent a deployment when an IAM Access Analyzer finding is detected. This ensures that changes made to IAM policies are validated before they can be deployed.

The cfn-policy-validator tool looks for all identity-based policies, and a subset of resource-based policies, from your templates. For the full list of supported resource-based policies, see the cfn-policy-validator GitHub repository.

Parsing IAM policies from a CloudFormation template

One of the challenges you can face when parsing IAM policies from a CloudFormation template is that these policies often contain CloudFormation intrinsic functions (such as Ref and Fn::GetAtt) and pseudo parameters (such as AWS::AccountId and AWS::Region). As an example, it’s common for least privileged IAM policies to reference the Amazon Resource Name (ARN) of another CloudFormation resource. Take a look at the following example CloudFormation resources that create an Amazon Simple Queue Service (Amazon SQS) queue, and an IAM role with a policy that grants access to perform the sqs:SendMessage action on the SQS queue.
 

Figure 1: Example policy in CloudFormation template

As you can see in Figure 1, line 21 uses the function Fn::Sub to restrict this policy to MySQSQueue created earlier in the template.

In this example, if you were to pass the root policy (lines 15-21) as written to IAM Access Analyzer, you would get an error because !Sub ${MySQSQueue.Arn} is syntax that is specific to CloudFormation. The cfn-policy-validator tool takes the policy and translates the CloudFormation syntax to valid IAM policy syntax that Access Analyzer can parse.

The cfn-policy-validator tool recognizes when an intrinsic function such as !Sub ${MySQSQueue.Arn} evaluates to a resource’s ARN, and generates a valid ARN for the resource. The tool creates the ARN by mapping the type of CloudFormation resource (in this example AWS::SQS::Queue) to a pattern that represents what the ARN of the resource will look like when it is deployed. For example, the following is what the mapping looks like for the SQS queue referenced previously:

AWS::SQS::Queue.Arn -> arn:${Partition}:sqs:${Region}:${Account}:${QueueName}

For some CloudFormation resources, the Ref intrinsic function also returns an ARN. The cfn-policy-validator tool handles these cases as well.

Cfn-policy-validator walks through each of the six parts of an ARN and substitutes values for variables in the ARN pattern (any text contained within ${}). The values of ${Partition} and ${Account} are taken from the identity of the role that runs the cfn-policy-validator tool, and the value for ${Region} is provided as an input flag. The cfn-policy-validator tool performs a best-effort resolution of the QueueName, but typically defaults it to the name of the CloudFormation resource (in the previous example, MySQSQueue). Validation of policies with IAM Access Analyzer does not rely on the name of the resource, so the cfn-policy-validator tool is able to substitute a replacement name without affecting the policy checks.

The final ARN generated for the MySQSQueue resource looks like the following (for an account with ID of 111111111111):

arn:aws:sqs:us-east-1:111111111111:MySQSQueue

The cfn-policy-validator tool substitutes this generated ARN for !Sub ${MySQSQueue.Arn}, which allows the cfn-policy-validator tool to parse a policy from the template that can be fed into IAM Access Analyzer for validation. The cfn-policy-validator tool walks through your entire CloudFormation template and performs this ARN substitution until it has generated ARNs for all policies in your template.
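
To make the substitution behavior concrete, here is a standalone Python sketch that mimics the steps described above. The pattern table and helper function are simplified illustrations under assumed names; they are not the cfn-policy-validator source code.

# Illustrative ARN pattern mapping and substitution, mirroring the behavior described above.
ARN_PATTERNS = {
    "AWS::SQS::Queue": "arn:${Partition}:sqs:${Region}:${Account}:${QueueName}",
}

def resolve_arn(resource_type, logical_id, partition, region, account):
    pattern = ARN_PATTERNS[resource_type]
    # Best-effort: use the logical resource name for name-like variables such as ${QueueName}.
    return (pattern
            .replace("${Partition}", partition)
            .replace("${Region}", region)
            .replace("${Account}", account)
            .replace("${QueueName}", logical_id))

print(resolve_arn("AWS::SQS::Queue", "MySQSQueue", "aws", "us-east-1", "111111111111"))
# -> arn:aws:sqs:us-east-1:111111111111:MySQSQueue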

Validating the policies with IAM Access Analyzer

After the cfn-policy-validator tool has your IAM policies in a valid format (with no CloudFormation intrinsic functions or pseudo parameters), it can take those policies and feed them into IAM Access Analyzer for validation. The cfn-policy-validator tool runs resource-based and identity-based policies in your CloudFormation template through the ValidatePolicy action of the IAM Access Analyzer. ValidatePolicy is what ensures that your policies have correct grammar and follow IAM policy best practices (for example, not allowing iam:PassRole to all resources). The cfn-policy-validator tool also makes a call to the CreateAccessPreview action for supported resource policies to determine if the policy would grant unintended public or cross-account access to your resource.

The cfn-policy-validator tool categorizes findings from IAM Access Analyzer into the categories blocking or non-blocking. Findings categorized as blocking cause the tool to exit with a non-zero exit code, thereby causing your deployment to fail and preventing your CI/CD pipeline from continuing. If there are no findings, or only non-blocking findings are detected, the tool exits with an exit code of zero (0) and your pipeline can continue to the next stage. For more information about how the cfn-policy-validator tool decides which findings to categorize as blocking and non-blocking, as well as how to customize the categorization, see the cfn-policy-validator GitHub repository.
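
To make this flow concrete, the following minimal Python (Boto3) sketch runs a single policy document through the IAM Access Analyzer ValidatePolicy API and exits non-zero when blocking findings are present. Treating ERROR and SECURITY_WARNING as blocking mirrors the default behavior described above, but the script itself is only an illustration, not part of the tool.

import json
import sys

import boto3

analyzer = boto3.client("accessanalyzer", region_name="us-east-1")

# A policy with a misspelled action, similar to the example later in this post.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sqs:ReceiveMessages", "Resource": "*"}
    ],
}

response = analyzer.validate_policy(
    policyDocument=json.dumps(policy),
    policyType="IDENTITY_POLICY",
)

# Treat ERROR and SECURITY_WARNING findings as blocking; everything else is non-blocking.
blocking = [
    finding for finding in response["findings"]
    if finding["findingType"] in ("ERROR", "SECURITY_WARNING")
]

print(json.dumps({"BlockingFindings": blocking}, indent=2, default=str))
sys.exit(1 if blocking else 0)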

Example of running the cfn-policy-validator tool

This section guides you through an example of what happens when you run a CloudFormation template that has some policy violations through the cfn-policy-validator tool.

The following template has two CloudFormation resources with policy findings: an SQS queue policy that grants account 111122223333 access to the SQS queue, and an IAM role with a policy that allows the role to perform a misspelled sqs:ReceiveMessages action. These issues are highlighted in the policy below.

Important: The policy in Figure 2 is written to illustrate a CloudFormation template with potentially undesirable IAM policies. You should be careful when setting the Principal element to an account that is not your own.

Figure 2: CloudFormation template with undesirable IAM policies

When you pass this template as a parameter to the cfn-policy-validator tool, you specify the AWS Region that you want to deploy the template to, as follows:

cfn-policy-validator validate --template-path ./template.json --region us-east-1

After the cfn-policy-validator tool runs, it returns the validation results, which includes the actual response from IAM Access Analyzer:

{
    "BlockingFindings": [
        {
            "findingType": "ERROR",
            "code": "INVALID_ACTION",
            "message": "The action sqs:ReceiveMessages does not exist.",
            "resourceName": "MyRole",
            "policyName": "root",
            "details": …
        },
        {
            "findingType": "SECURITY_WARNING",
            "code": "EXTERNAL_PRINCIPAL",
            "message": "Resource policy allows access from external principals.",
            "resourceName": "MyQueue",
            "policyName": "QueuePolicy",
            "details": …
        }
    ],
    "NonBlockingFindings": []
}

The output from the cfn-policy-validator tool includes the type of finding, the code for the finding, and a message that explains the finding. It also includes the resource and policy name from the CloudFormation template, to allow you to quickly track down and resolve the finding in your template. In the previous example, you can see that IAM Access Analyzer has detected two findings, one security warning and one error, which the cfn-policy-validator tool has classified as blocking. The actual response from IAM Access Analyzer is returned under details, but is excluded above for brevity.

If account 111122223333 is an account that you trust and you are certain that it should have access to the SQS queue, then you can suppress the finding for external access from the 111122223333 account in this example. Modify the call to the cfn-policy-validator tool to ignore this specific finding by using the --allow-external-principals flag, as follows:

cfn-policy-validator validate --template-path ./template.json --region us-east-1 --allow-external-principals 111122223333

When you look at the output that follows, you’re left with only the blocking finding that states that sqs:ReceiveMessages does not exist.

{
    "BlockingFindings": [
        {
            "findingType": "ERROR",
            "code": "INVALID_ACTION",
            "message": "The action sqs:ReceiveMessages does not exist.",
            "resourceName": "MyRole",
            "policyName": "root",
            "details": …
    ],
    "NonBlockingFindings": []
}

To resolve this finding, update the template to have the correct spelling, sqs:ReceiveMessage (without trailing s).

For the full list of available flags and commands supported, see the cfn-policy-validator GitHub repository.

Now that you’ve seen an example of how you can run the cfn-policy-validator tool to validate policies that are on your local machine, you will take it a step further and see how you can embed the cfn-policy-validator tool in a CI/CD pipeline. By embedding the cfn-policy-validator tool in your CI/CD pipeline, you can ensure that your IAM policies are validated each time a commit is made to your repository.

Embedding the cfn-policy-validator tool in a CI/CD pipeline

The CI/CD pipeline you will create in this post uses AWS CodePipeline and AWS CodeBuild. AWS CodePipeline is a continuous delivery service that enables you to model, visualize, and automate the steps required to release your software. AWS CodeBuild is a fully-managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. You will also use AWS CodeCommit as the source repository where your CloudFormation template is stored. AWS CodeCommit is a fully managed source-control service that hosts secure Git-based repositories.

To deploy the pipeline

  1. To deploy the CloudFormation template that builds the source repository and AWS CodePipeline pipeline, select the following Launch Stack button.
    Select the Launch Stack button to launch the template
  2. In the CloudFormation console, choose Next until you reach the Review page.
  3. Select I acknowledge that AWS CloudFormation might create IAM resources and choose Create stack.
  4. Open the AWS CodeCommit console and choose Repositories.
  5. Select cfn-policy-validator-source-repository.
  6. Download the template.json and template-configuration.json files to your machine.
  7. In the cfn-policy-validator-source-repository, on the right side, select Add file and choose Upload file.
  8. Choose Choose File and select the template.json file that you downloaded previously.
  9. Enter an Author name and an E-mail address and choose Commit changes.
  10. In the cfn-policy-validator-source-repository, repeat steps 7-9 for the template-configuration.json file.

To view validation in the pipeline

  1. In the AWS CodePipeline console choose the IAMPolicyValidatorPipeline.
  2. Watch as your commit travels through the pipeline. If you followed the previous instructions and made two separate commits, you can ignore the failed results of the first pipeline execution. As shown in Figure 3, you will see that the pipeline fails in the Validation stage on the CfnPolicyValidator action, because it detected a blocking finding in the template you committed, which prevents the invalid policy from reaching your AWS environment.
     
    Figure 3: Validation failed on the CfnPolicyValidator action

  3. Under CfnPolicyValidator, choose Details, as shown in Figure 3.
  4. In the Action execution failed pop-up, choose Link to execution details to view the cfn-policy-validator tool output in AWS CodeBuild.

Architectural overview of deploying cfn-policy-validator in your pipeline

You can see the architecture diagram for the CI/CD pipeline you deployed in Figure 4.
 
Figure 4: CI/CD pipeline that performs IAM policy validation using the AWS CloudFormation Policy Validator and IAM Access Analyzer

Figure 4 shows the following steps, starting with the CodeCommit source action on the left:

  1. The pipeline starts when you commit to your AWS CodeCommit source code repository. The AWS CodeCommit repository is what contains the CloudFormation template that has the IAM policies that you would like to deploy to your AWS environment.
  2. AWS CodePipeline detects the change in your source code repository and begins the Validation stage. The first step it takes is to start an AWS CodeBuild project that runs the CloudFormation template through the AWS CloudFormation Linter (cfn-lint). The cfn-lint tool validates your template against the CloudFormation resource specification. Taking this initial step ensures that you have a valid CloudFormation template before validating your IAM policies. This is an optional step, but a recommended one. Early schema validation provides fast feedback for any typos or mistakes in your template. There’s little benefit to running additional static analysis tools if your template has an invalid schema.
  3. If the cfn-lint tool completes successfully, you then call a separate AWS CodeBuild project that invokes the IAM Policy Validator for AWS CloudFormation (cfn-policy-validator). The cfn-policy-validator tool then extracts the identity-based and resource-based policies from your template, as described earlier, and runs the policies through IAM Access Analyzer.

    Note: If your template has parameters, then you need to provide them to the cfn-policy-validator tool. You can provide parameters as command-line arguments, or use a template configuration file. It is recommended to use a template configuration file when running validation with an AWS CodePipeline pipeline. The same file can also be used in the deploy stage to deploy the CloudFormation template. By using the template configuration file, you can ensure that you use the same parameters to validate and deploy your template. The CloudFormation template for the pipeline provided with this blog post defaults to using a template configuration file.

    If there are no blocking findings found in the policy validation, the cfn-policy-validator tool exits with an exit code of zero (0) and the pipeline moves to the next stage. If any blocking findings are detected, the cfn-policy-validator tool will exit with a non-zero exit code and the pipeline stops, to prevent the deployment of undesired IAM policies.

  4. The final stage of the pipeline uses the AWS CloudFormation action in AWS CodePipeline to deploy the template to your environment. Your template will only make it to this stage if it passes all static analysis checks run in the Validation Stage.

Cleaning Up

To avoid incurring future charges, in the AWS CloudFormation console delete the validate-iam-policy-pipeline stack. This will remove the validation pipeline from your AWS account.

Summary

In this blog post, I introduced the IAM Policy Validator for AWS CloudFormation (cfn-policy-validator). The cfn-policy-validator tool automates the parsing of identity-based and resource-based IAM policies from your CloudFormation templates and runs those policies through IAM Access Analyzer. This enables you to validate that the policies in your templates follow IAM best practices and do not allow unintended external access to your AWS resources.

I showed you how the IAM Policy Validator for AWS CloudFormation can be included in a CI/CD pipeline. This allows you to run validation on your IAM policies on every commit to your repository and only deploy the template if validation succeeds.

For more information, or to provide feedback and feature requests, see the cfn-policy-validator GitHub repository.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Matt Luttrell

Matt is a Sr. Solutions Architect on the AWS Identity Solutions team. When he’s not spending time chasing his kids around, he enjoys skiing, cycling, and the occasional video game.

17 additional AWS services authorized for DoD workloads in the AWS GovCloud Regions

Post Syndicated from Tyler Harding original https://aws.amazon.com/blogs/security/17-additional-aws-services-authorized-for-dod-workloads-in-the-aws-govcloud-regions/

I’m pleased to announce that the Defense Information Systems Agency (DISA) has authorized 17 additional Amazon Web Services (AWS) services and features in the AWS GovCloud (US) Regions, bringing the total to 105 services and major features that are authorized for use by the U.S. Department of Defense (DoD). AWS now offers additional services to DoD mission owners in these categories: business applications; computing; containers; cost management; developer tools; management and governance; media services; security, identity, and compliance; and storage.

Why does authorization matter?

DISA authorization of 17 new cloud services enables mission owners to build secure innovative solutions to include systems that process unclassified national security data (for example, Impact Level 5). DISA’s authorization demonstrates that AWS effectively implemented more than 421 security controls by using applicable criteria from NIST SP 800-53 Revision 4, the US General Services Administration’s FedRAMP High baseline, and the DoD Cloud Computing Security Requirements Guide.

Recently authorized AWS services at DoD Impact Levels (IL) 4 and 5 include the following:

Business Applications

Compute

Containers

Cost Management

  • AWS Budgets – Set custom budgets to track your cost and usage, from the simplest to the most complex use cases
  • AWS Cost Explorer – An interface that lets you visualize, understand, and manage your AWS costs and usage over time
  • AWS Cost & Usage Report – Itemize usage at the account or organization level by product code, usage type, and operation

Developer Tools

  • AWS CodePipeline – Automate continuous delivery pipelines for fast and reliable updates
  • AWS X-Ray – Analyze and debug production and distributed applications, such as those built using a microservices architecture

Management & Governance

Media Services

  • Amazon Textract – Extract printed text, handwriting, and data from virtually any document

Security, Identity & Compliance

  • Amazon Cognito – Secure user sign-up, sign-in, and access control
  • AWS Security Hub – Centrally view and manage security alerts and automate security checks

Storage

  • AWS Backup – Centrally manage and automate backups across AWS services

Figure 1 shows the IL 4 and IL 5 AWS services that are now authorized for DoD workloads, broken out into functional categories.
 

Figure 1: The AWS services newly authorized by DISA

To learn more about AWS solutions for the DoD, see our AWS solution offerings. Follow the AWS Security Blog for updates on our Services in Scope by Compliance Program. If you have feedback about this blog post, let us know in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Tyler Harding

Tyler is the DoD Compliance Program Manager for AWS Security Assurance. He has over 20 years of experience providing information security solutions to the federal civilian, DoD, and intelligence agencies.

How MOIA built a fully automated GDPR compliant data lake using AWS Lake Formation, AWS Glue, and AWS CodePipeline

Post Syndicated from Leonardo Pêpe original https://aws.amazon.com/blogs/big-data/how-moia-built-a-fully-automated-gdpr-compliant-data-lake-using-aws-lake-formation-aws-glue-and-aws-codepipeline/

This is a guest blog post co-written by Leonardo Pêpe, a Data Engineer at MOIA.

MOIA is an independent company of the Volkswagen Group with locations in Berlin and Hamburg, and operates its own ride pooling services in Hamburg and Hanover. The company was founded in 2016 and develops mobility services independently or in partnership with cities and existing transport systems. MOIA’s focus is on ride pooling and the holistic development of the software and hardware for it. In October 2017, MOIA started a pilot project in Hanover to test a ride pooling service, which was brought into public operation in July 2018. MOIA covers the entire value chain in the area of ride pooling, and has developed a ridesharing system to avoid individual car traffic and use the road infrastructure more efficiently.

In this post, we discuss how MOIA uses AWS Lake Formation, AWS Glue, and AWS CodePipeline to store and process gigabytes of data on a daily basis to serve 20 different teams with individual user data access and implement fine-grained control of the data lake to comply with General Data Protection Regulation (GDPR) guidelines. This involves controlling access to data at a granular level. The solution enables MOIA’s fast pace of innovation to automatically adapt user permissions to new tables and datasets as they become available.

Background

Each MOIA vehicle can carry six passengers. Customers interact with the MOIA app to book a trip, cancel a trip, and give feedback. The highly distributed system prepares multiple offers for reaching the destination, with different pickup points and prices. Customers select an option and are picked up from their chosen location. All interactions between the customers and the app, as well as all the interactions between internal components and systems (the backend’s and vehicle’s IoT components), are sent to MOIA’s data lake.

Data from the vehicle, the app, and the backend must be centralized to keep planned and completed trips in sync and to collect passenger feedback. Centralized data is also needed to provide different pricing and routing options. MOIA therefore decided to build and secure its Amazon Simple Storage Service (Amazon S3) based data lake using AWS Lake Formation.

Different MOIA teams, including data analysts, data scientists, and data engineers, need to access centralized data from different sources for the development and operation of the application workloads. It’s a legal requirement to control the access to and format of this data for these different teams. The app development team needs to understand customer feedback in an anonymized way, pricing-related data must be accessed only by the business analytics team, vehicle data is meant to be used only by the vehicle maintenance team, and the routing team needs access to customer location and destination.

Architecture overview

The following diagram illustrates MOIA solution architecture.

The solution has the following components:

  1. Data input – The apps, backend, and IoT devices continuously stream event messages as nested JSON via Amazon Kinesis Data Streams.
  2. Data transformation – When the raw data is received, the data pipeline orchestrator (Apache Airflow) starts the data transformation jobs to meet requirements for respective data formats. Amazon EMR, AWS Lambda, or AWS Fargate cleans, transforms, and loads the events in the S3 data lake.
  3. Persistence layer – After the data is transformed and normalized, AWS Glue crawlers classify the data to determine the format, schema, and associated properties. They group data into tables or partitions and write metadata to the AWS Glue Data Catalog based on the structured partitions created in Amazon S3. The data is written to Amazon S3, partitioned by its event type and timestamp, so it can be easily accessed via Amazon Athena and the correct permissions can be applied using Lake Formation. The crawlers are run at a fixed interval in accordance with the batch jobs, so no manual runs are required.
  4. Governance layer – Access on the data lake is managed using the Lake Formation governance layer. MOIA uses Lake Formation to centrally define security, governance, and auditing policies in one place. These policies are consistently implemented, which eliminates the need to manually configure them across security services like AWS Identity and Access Management (IAM), AWS Key Management Service (AWS KMS), storage services like Amazon S3, or analytics and machine learning (ML) services like Amazon Redshift, Athena, Amazon EMR for Apache Spark. Lake Formation reduces the effort in configuring policies across services and provides consistent assistance for GDPR requirements and compliance. When a user makes a request to access Data Catalog resources or underlying data from the data lake, for the request to succeed, it must pass permission checks by both IAM and Lake Formation. Lake Formation permissions control access to Data Catalog resources, Amazon S3 locations, and the underlying data at those locations. User permissions to the data on the table, row, or column level are defined in Lake Formation, so that users only access the data if they’re authorized.
  5. Data access layer – After the flattened and structured data is available, data analysts, scientists, and engineers can use Amazon Redshift or Athena to perform SQL analysis (a minimal Athena example follows this list), or AWS Data Wrangler and Apache Spark to perform programmatic analysis and build ML use cases.
  6. Infrastructure orchestration and data pipeline – After the data is ingested and transformed, the infrastructure orchestration layer (CodePipeline and Lake Formation) creates or updates the governance layer with the proper, predefined permissions every hour (see more about the automation in the governance layer later in this post).
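
As referenced in the data access layer item above, the following Python (Boto3) sketch shows how an analyst tool could query the partitioned, crawler-registered data through Athena, subject to the caller’s Lake Formation permissions. The Region, database, table, partition columns, and S3 output location are hypothetical names chosen for this example.

import boto3

athena = boto3.client("athena", region_name="eu-central-1")  # placeholder Region

# Query the partitioned events table that the AWS Glue crawler registered in the Data Catalog.
# Database, table, partition columns, and the results bucket are illustrative names.
query = """
SELECT event_type, count(*) AS events
FROM app_events
WHERE year = '2021' AND month = '06'
GROUP BY event_type
"""

execution = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "data_lake_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-query-results/"},
)

print(execution["QueryExecutionId"])  # poll get_query_execution() until the state is SUCCEEDED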

Governance challenge

MOIA wants to continuously evolve its ML models for routing, demand prediction, and business models. This requires MOIA to constantly review and update models; therefore, power users such as data administrators and engineers frequently redesign the table schemas in the AWS Glue Data Catalog as part of the data engineering workflow. This highly dynamic metadata transformation requires an equally dynamic governance layer pipeline that can assign the right user permissions to all tables and adapt to these changes transparently, without disruptions to end users. Due to GDPR requirements, there is no room for error, and without automation the right permissions would have to be assigned manually across a data lake holding many terabytes of data. That would require many developers for administration and introduce human error into the workflows, which is not acceptable. Manual administration and access management isn’t a scalable solution. MOIA needed to innovate faster, with GDPR compliance, with a small team of developers.

Data schemas and data structures often change at MOIA, resulting in new tables being created in the data lake. To guarantee that new tables inherit the permissions granted to the same group of users who already have access to the entire database, MOIA uses an automated process that grants Lake Formation permissions on newly created tables, columns, and databases, as described in the next section.
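
A minimal Python (Boto3) sketch of this idea is shown below: it lists every table currently registered in a Data Catalog database and grants a read-only Lake Formation permission to an analyst role. The Region, database name, and role ARN are placeholders, and MOIA’s actual implementation generates CloudFormation resources through stacker and troposphere rather than making direct API calls like this.

import boto3

REGION = "eu-central-1"  # placeholder Region
DATABASE = "data_lake_db"  # placeholder Data Catalog database
ANALYST_ROLE_ARN = "arn:aws:iam::111111111111:role/data-analyst"  # placeholder lake persona

glue = boto3.client("glue", region_name=REGION)
lakeformation = boto3.client("lakeformation", region_name=REGION)

# Discover every table currently registered in the database ...
table_names = []
for page in glue.get_paginator("get_tables").paginate(DatabaseName=DATABASE):
    table_names.extend(table["Name"] for table in page["TableList"])

# ... and grant the analyst persona read access to each one.
for name in table_names:
    lakeformation.grant_permissions(
        Principal={"DataLakePrincipalIdentifier": ANALYST_ROLE_ARN},
        Resource={"Table": {"DatabaseName": DATABASE, "Name": name}},
        Permissions=["SELECT", "DESCRIBE"],
    )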

Solution overview

The following diagram illustrates the continuous deployment loop using AWS CloudFormation.

The workflow contains the following steps:

  1. MOIA data scientists and engineers create or modify new tables to evolve the business and ML models. This results in new tables or changes in the existing schema of the table.
  2. The data pipeline is orchestrated via Apache Airflow, which controls the whole cycle of ingestion, cleansing, and data transformation using Spark jobs (as shown in Step 1). When the data is in its desired format and state, Apache Airflow triggers the AWS Glue crawler job to discover the data schema in the S3 bucket and add the new partitions to the Data Catalog. Apache Airflow also triggers CodePipeline to assign the right permissions on all the tables.
  3. Apache Airflow triggers CodePipeline every hour, which uses MOIA’s libraries to discover the new tables, columns, and databases, and grants Lake Formation permissions in the AWS CloudFormation template for all tables, including new and modified. This orchestration layer ensures Lake Formation permissions are granted. Without this step, nobody can access new tables, and new data isn’t visible to data scientists, business analysts, and others who need it.
  4. MOIA uses Stacker and Troposphere to identify all the tables, assign the tables the right permissions, and deploy the CloudFormation stack. Troposphere is the infrastructure as code (IaC) framework that renders the CloudFormation templates, and stacker helps with the parameterization and deployment of the templates on AWS. Stacker also helps produce reusable templates in the form of blueprints as pure Python code, which can be easily parameterized using YAML files. The Python code uses Boto3 clients to look up the Data Catalog and search for databases and tables.
  5. The stacker blueprint libraries developed by MOIA (which internally use Boto3 and troposphere) are used to discover newly created databases or tables in the Data Catalog.
  6. A Permissions configuration file allows MOIA to predefine data lake tables (which can be a wildcard indicating all tables available in one database) and specific personas who have access to these tables. These roles are split into different categories called lake personas, and have specific access levels. Example lake personas include data domain engineers, data domain technical users, data administrators, and data analysts. Users and their permissions are defined in the file. In case access to personally identifiable information (PII) data is granted, the GDPR officer ensures that a record of processing activities is in place and the usage of personal data is documented according to the law.
  7. MOIA uses stacker to read the predefined permissions file, and uses the customized stacker blueprints library containing the logic to assign permissions to lake personas for each of the tables discovered in Step 5.
  8. The discovered and modified tables are matched with predefined permissions in Step 6 and Step 7. Stacker uses YAML format as the input for the CloudFormation template parameters to define users and data access permissions. These parameters are related to the user’s roles and define access to the tables. Based on this, MOIA creates a customized stacker that creates AWS CloudFormation resources dynamically. The following are the example stacks in AWS CloudFormation:
    1. data-lake-domain-database – Holds all resources that are relevant during the creation of the database, which includes lake permissions at the database level that allow the domain database admin personas to perform operations in the database.
    2. data-lake-domain-crawler – Scans Amazon S3 constantly to discover and create new tables and updates.
    3. data-lake-lake-table-permissions – Uses Boto3 combined with troposphere to list the available tables and grant lake permissions to the domain roles that access it.

This way, new CloudFormation templates are created, containing permissions for new or modified tables. This process guarantees a fully automated governance layer for the data lake. The generated CloudFormation template contains Lake Formation permission resources for each table, database, or column. The process of managing Lake Formation permissions on Data Catalog databases, tables, and columns is simplified by granting Data Catalog permissions using the Lake Formation tag-based access control method. The advantage of generating a CloudFormation template is auditability. When the new version of the CloudFormation stack is prepared with an access control set on new or modified tables, administrators can compare that stack with the older version to discover newly prepared and modified tables. MOIA can view the differences via the AWS CloudFormation console before new stack deployment.

  9. Either the CloudFormation stack is deployed in the account with the right permissions for all users, or a stack deployment failure notice is sent to the developers via a Slack channel (Step 10). This ensures visibility into the deployment process and into failed permission-assignment pipelines.
  10. Whenever the pipeline fails, MOIA receives the notification via Slack so the engineers can promptly fix the errors. This loop is very important to guarantee that whenever the schema changes, those changes are reflected in the data lake without manual intervention.
  11. Following this automated pipeline, all the tables are assigned the right permissions.

Benefits

This solution delivers the following benefits:

  • Automated enforcement of data access permissions allows people to access the data without manual interventions in a scalable way. With automation, 1,000 hours of manual work are saved every year.
  • The GDPR team does the assessment internally every month. The GDPR team constantly provides guidance and approves each change in permissions. This audit trail for records of processing activities is automated with the help of Lake Formation (otherwise it needs a dedicated human resource).
  • The automated workflow of permission assignment is integrated into existing CI/CD processes, resulting in faster onboarding of new teams, features, and dataset releases. An average of 48 releases are done each month (including major and minor version releases and new event types). Onboarding new teams and forming internal new teams is very easy now.
  • This solution enables you to create a data lake using IaC processes that are easy and GDPR-compliant.

Conclusion

MOIA has created scalable, automated, and versioned permissions with a GDPR-supported, governed data lake using Lake Formation. This solution helps them bring new features and models to market faster, and reduces administrative and repetitive tasks. MOIA can focus on 48 average releases every month, contributing to a great customer experience and new data insights.


About the Authors

Leonardo Pêpe is a Data Engineer at MOIA. With a strong background in infrastructure and application support and operations, he is immersed in the DevOps philosophy. He’s helping MOIA build automated solutions for its data platform and enabling the teams to be more data-driven and agile. Outside of MOIA, Leonardo enjoys nature, Jiu-Jitsu and martial arts, and explores the good of life with his family.

 

Sushant Dhamnekar is a Solutions Architect at AWS. As a trusted advisor, Sushant helps automotive customers to build highly scalable, flexible, and resilient cloud architectures, and helps them follow the best practices around advanced cloud-based solutions. Outside of work, Sushant enjoys hiking, food, travel, and CrossFit workouts.

 

 

Shiv Narayanan is Global Business Development Manager for Data Lakes and Analytics solutions at AWS. He works with AWS customers across the globe to strategize, build, develop and deploy modern data platforms. Shiv loves music, travel, food and trying out new tech.

Field Notes: How to Deploy End-to-End CI/CD in the China Regions Using AWS CodePipeline

Post Syndicated from Ashutosh Pateriya original https://aws.amazon.com/blogs/architecture/field-notes-how-to-deploy-end-to-end-ci-cd-in-the-china-regions-using-aws-codepipeline/

This post was co-authored by Ravi Intodia, Cloud Architect, Infosys Technologies Ltd.; Nirmal Tomar, Principal Consultant, Infosys Technologies Ltd.; and Ashutosh Pateriya, Solution Architect, AWS.

Today’s businesses must contend with fast-changing competitive environments, expanding security needs, and scalability issues. Businesses must find a way to reconcile the need for operational stability with the need for quick product development. Continuous integration and continuous delivery (CI/CD) enables rapid software iterations while maintaining system stability and security.

With an increase in AWS Cloud and DevOps adoption, many organizations seek solutions which go beyond geographical boundaries. AWS CodePipeline, along with its related services, lets you integrate and deploy your solutions across multiple AWS accounts and Regions. However, it becomes more challenging when you want to deploy your application in multiple AWS Regions as well as in China, due to the unavailability of AWS CodePipeline in the Beijing and Ningxia Regions.

In this blog post, you will learn how to overcome the unique challenges when deploying applications across many parts of the world, including China. For this solution, we will use the power and flexibility of AWS CodeBuild to implement AWS Command Line Interface (AWS CLI) commands to perform custom actions that are not directly supported by CodePipeline or AWS CodeDeploy.

CodePipeline for multi-account and multi-Region deployment consists of the following components:

  • ArtifactStore and encryption keys – In the AWS account which hosts CodePipeline, there should be an Amazon Simple Storage Service (Amazon S3) bucket and an AWS Key Management Service (AWS KMS) key for each Region where resources need to be deployed.
  • CodeBuild and CodePipeline roles – In the AWS account which hosts CodePipeline there should be roles created that can be used or assumed by CodeBuild and CodePipeline projects for performing required actions.
  • Cross-account roles – In each AWS account where cross-account deployments are required, an AWS role with the necessary permissions must be created. The CodePipeline role of the deploying account must be allowed to assume this role for all required accounts. Cross-account roles will also have access to the required S3 buckets and AWS KMS keys for deploying accounts.

Figure 1. High-level solution for AWS Regions

Although the solution works for most Regions, we encounter challenges when we try to expand our current worldwide solutions into the China Regions.

The challenges are as follows:

  • Cross-account roles – Cross-account roles cannot be created between accounts in non-China Regions and the China Regions. This means that CodeDeploy will be unable to assume the target account role necessary to complete component deployment.
  • Availability of services – Services required to configure a cloud native CI/CD pipeline are unavailable in the China Regions.
  • Connectivity – There is no direct network connectivity available between the China Regions and other AWS Regions.
  • User management – Accounts by users in China are distinct from AWS Region user accounts, and must be maintained independently.

Due to the lack of cross-account roles and the CodePipeline service, setting up a worldwide CI/CD pipeline that includes the China Regions is not automatically supported.

High-level solution

In the proposed solution, we will build and deploy the application to both Regions using the AWS CI/CD services in the non-China Region, and we will create an IAM access key in the China account with permissions to deploy the application using services such as AWS Lambda, AWS Elastic Beanstalk, Amazon Elastic Container Service, and Amazon Elastic Kubernetes Service. This access key is stored, encrypted, in the non-China account as an SSM parameter. On each commit, the CodePipeline in the non-China Region is initiated; it builds the package and deploys the application in both Regions from a single place.

Solution architecture

Figure 2. High-level solution for cross-account deployment from AWS Regions to a China Region

In this architecture, AWS CLI commands are used to configure an AWS profile on the CodeBuild instance with the China credentials (retrieved from the AWS Systems Manager Parameter Store). This enables the CodeBuild instance to run AWS CloudFormation package and deploy commands directly against the China account, thereby deploying the required resources in the desired China Region.
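
The same idea can be sketched in Python (Boto3) as shown below: read the China IAM user credentials from SecureString SSM parameters in the toolchain Region, open a session against the China partition, and deploy the packaged template. The parameter names, file name, and stack name are illustrative placeholders, and the actual solution drives these steps with AWS CLI commands inside CodeBuild.

import boto3

# In the toolchain Region (us-east-1): read the China IAM user credentials
# from the SecureString SSM parameters described later in this post.
ssm = boto3.client("ssm", region_name="us-east-1")
access_key = ssm.get_parameter(Name="/China/Dev/UserAccessKey", WithDecryption=True)["Parameter"]["Value"]
secret_key = ssm.get_parameter(Name="/China/Dev/UserSecretAccessKey", WithDecryption=True)["Parameter"]["Value"]

# Open a session against the China partition and deploy the packaged template.
china = boto3.Session(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    region_name="cn-north-1",
)
cloudformation = china.client("cloudformation")

with open("packaged-template.yaml") as f:  # placeholder: output of `aws cloudformation package`
    template_body = f.read()

cloudformation.create_stack(
    StackName="lambda-demo-cn",            # placeholder stack name
    TemplateBody=template_body,
    Capabilities=["CAPABILITY_NAMED_IAM"],
)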

This solution does not rely on any AWS CI/CD services, such as CodeDeploy, in the China Region. With this solution, we can create a complete CI/CD pipeline running in one AWS Region that deploys an application in both Regions.

The following key components are needed for deployment:

  • AWS Identity and Access Management (IAM) user credentials – An IAM user needs to be created in the target account in China.
  • SSM parameters (secure string) – The China IAM user’s access key ID and secret access key need to be saved as secure string SSM parameters in the deployment AWS account.
  • Update CloudFormation templates – CloudFormation templates need to be updated to support China Region mappings (such as using “arn:aws-cn” instead of “arn:aws”).
  • Enhance CodeBuild to support build and deployment – CodeBuild buildspec.yml needs to be enhanced to perform build and deployment to China accounts, as mentioned in the following.

Prerequisites

  • Two AWS accounts: One AWS account outside of China, and one account in China.
  • Practical experience in deploying Lambda functions using CodeBuild, CodeDeploy, and CodePipeline, and using the AWS CLI. Because this example focuses specifically on extending CodePipeline from Regions outside of China to deploy in the China Regions, we are not going to explore a standard CodePipeline setup.

Detailed Implementation

This solution is built using CodePipeline, CodeCommit, CodeBuild, AWS CloudFormation templates, and IAM.

Steps

  1. Perform a one-time access key generation for an IAM user in the China account with the access necessary to deploy the application, including creation of an S3 bucket for CodeBuild artifacts.
    Note: As a best practice, we suggest rotating the access key every 30 days.
  2. Complete the setup of CodePipeline to deploy the application in the Region outside of China as well as in the China Region.

As a demonstration, let’s deploy a Lambda function in us-east-1 and cn-north-1 and discuss the steps in detail. The same steps can be followed to deploy any other AWS service.

Part 1 – In the account based in the China Region: cn-north-1

  1. Create an S3 bucket with default encryption enabled for CodeBuild artifacts.
  2. Create an IAM user (with programmatic access only) with the required permissions to deploy Lambda functions and related resources. The IAM user will also have access to the S3 bucket created for CodeBuild artifacts. To create an IAM policy, refer to the AWS IAM Policy resource. A CLI sketch of these steps follows this list.
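The following is a minimal AWS CLI sketch of these Part 1 steps, assuming it is run with credentials for the China account. The bucket, user, and policy names are illustrative placeholders, and the policy document is assumed to exist locally; adjust them to your environment.

# S3 bucket for CodeBuild artifacts, with default encryption enabled
aws s3api create-bucket --bucket china-codebuild-artifacts-example \
    --create-bucket-configuration LocationConstraint=cn-north-1 --region cn-north-1
aws s3api put-bucket-encryption --bucket china-codebuild-artifacts-example \
    --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}' \
    --region cn-north-1

# IAM user with programmatic access only, plus an inline deployment policy
aws iam create-user --user-name china-deployment-user
aws iam put-user-policy --user-name china-deployment-user \
    --policy-name china-deployment-policy \
    --policy-document file://china-deployment-policy.json

# One-time access key generation (rotate every 30 days as noted earlier)
aws iam create-access-key --user-name china-deployment-user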

Part 2 – In the AWS account based in the non-China Region: us-east-1

  1. Create two SSM parameters of type secure string.

SSM Parameter Name           SSM Parameter Value
/China/Dev/UserAccessKey     <Value of China IAM User Access Key>
/China/Dev/UserSecretKey     <Value of China IAM User Secret Access Key>

To create secure SSM parameters using the AWS CLI, refer to the Create a Systems Manager parameter (AWS CLI) tutorial.

To create secure SSM parameters using the AWS Management Console, refer to the Create a Systems Manager parameter (console) tutorial.

Note: Creating secure string SSM parameters is not supported by CloudFormation templates. As a security best practice, you should also avoid placing any sensitive information in CloudFormation templates.
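A minimal AWS CLI sketch for creating the two secure string parameters in the deployment account follows. The secret key parameter name (/China/Dev/UserSecretKey) is an assumption that mirrors the access key naming; use whatever name your buildspec environment variables point to.

aws ssm put-parameter --name "/China/Dev/UserAccessKey" --type SecureString \
    --value "<China IAM user access key ID>" --region us-east-1
aws ssm put-parameter --name "/China/Dev/UserSecretKey" --type SecureString \
    --value "<China IAM user secret access key>" --region us-east-1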

  2. Create an AWS KMS key for encrypting CodeBuild or CodePipeline artifacts (for cross-Region deployments, create an AWS KMS key in all Regions, and create SSM parameters for each in the Region hosting CodePipeline).
  3. Create an artifacts S3 bucket for CodeBuild or CodePipeline artifacts.
  4. Create CI/CD-related roles. For CI/CD service roles, refer to:
    https://docs.aws.amazon.com/codepipeline/latest/userguide/pipelines-create-service-role.html
    https://docs.aws.amazon.com/codebuild/latest/userguide/setting-up.html#setting-up-service-role
  5. Create a CodeCommit repository.
  6. Create SSM parameters for the following.

SSM Parameter Name               SSM Parameter Value
/China/Dev/DeploymentS3Bucket    <Artifact Bucket Name in China Region>
/US/Dev/CodeBuildRole            <Role ARN of CodeBuild Service Role>
/US/Dev/CodePipelineRole         <Role ARN of CodePipeline Service Role>
/US/Dev/CloudformationRole       <Role ARN of CloudFormation Service Role>
/US/Dev/DeploymentS3Bucket       <Artifact Bucket Name in Pipeline Region>
/US/Dev/CodeBuildImage           <CodeBuild Image Details>
/US/Stage/CrossAccountStageRole  <Role ARN of Cross Account Service Role for Stage>
/US/Prod/CrossAccountStageRole   <Role ARN of Cross Account Service Role for Prod>
  7. In CodeCommit, push the Lambda code and the CloudFormation template for deploying the Lambda resources (Lambda function, Lambda role, Lambda log group, and so forth).
  8. In CodeCommit, push two buildspec .yml files, one for us-east-1 and one for cn-north-1.
    1. buildspec.yml: For us-east-1
# Buildspec Reference Doc: https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html
version: 0.2
phases:
  install:
    runtime-versions:
      python: 3.7
  pre_build:
    commands:
      - echo "[+] Updating PIP...."
      - pip install --upgrade pip
      - echo "[+] Installing dependencies...."
      #- Commands To Install required dependencies
      - yum install zip unzip -y -q
      - pip install awscli --upgrade
      
  build:
    commands:
      - echo "Starting build `date` in `pwd`"
      - echo "Starting SAM packaging `date` in `pwd`"
      - aws cloudformation package --template-file cloudformation_template.yaml --s3-bucket ${S3_BUCKET} --output-template-file transform-packaged.yaml
      # Additional package commands for cross-region deployments
      - echo "SAM packaging completed on `date`"
      - echo "Build completed `date` in `pwd`"
     
artifacts:
  type: zip
  files:
    - transform-packaged.yaml
    # - additional artifacts for cross-region deployments 

  discard-paths: yes

cache:
  paths:
    - '/root/.cache/pip'
    2. buildspec-china.yml: For cn-north-1
      buildspec-china.yml is customized to perform both the build and the deployment. Refer to the following for details.
# Buildspec Reference Doc: https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html
version: 0.2    
phases:
  install:
    runtime-versions:
      python: 3.7

  pre_build:
    commands:
      - echo "[+] Updating PIP...."
      - pip install --upgrade pip
      - echo "[+] Installing dependencies...."
      #- Commands To Install required dependencies
      - yum install zip unzip -y -q
      - pip install awscli --upgrade
      # Setting China Region IAM User Profile
      - echo "Start setting User Profile  `date` in `pwd`"
      - USER_ACCESS_KEY=`aws ssm get-parameter --name ${USER_ACCESS_KEY_SSM} --with-decryption --query Parameter.Value --output text`
      - USER_SECRET_KEY=`aws ssm get-parameter --name ${USER_SECRET_KEY_SSM} --with-decryption --query Parameter.Value --output text`
      - aws configure --profile china set aws_access_key_id ${USER_ACCESS_KEY}
      - aws configure --profile china set aws_secret_access_key ${USER_SECRET_KEY}
      - echo "Setting User Profile Completed `date` in `pwd`"

  build:
    commands:
      # Creating Deployment Package 
      - echo "Start build/packaging  `date` in `pwd`"
      # Export so the variable is visible inside the bash -c subshells below
      - export S3_BUCKET=`aws ssm get-parameter --name ${S3_BUCKET_SSM} --query Parameter.Value --output text`
      - zip -q -r package.zip *
      - >
        bash -c '
        aws cloudformation package 
        --template-file cloudformation_template_china.yaml
        --s3-bucket ${S3_BUCKET}
        --output-template-file transformed-template-china.yaml
        --profile china
        --region cn-north-1'
      - echo "Completed build/packaging  `date` in `pwd`"
  post_build:
    commands:
      # Deploying 
      - echo "Start deployment  `date` in `pwd`"
      - >
        bash -c '
        aws cloudformation deploy 
        --capabilities CAPABILITY_NAMED_IAM
        --template-file transformed-template-china.yaml
        --stack-name ${ProjectName}-app-stack-dev
        --profile china
        --region cn-north-1'
      - echo "Completed deployment  `date` in `pwd`"
artifacts:
  type: zip
  files:
    - package.zip
    - transformed-template-china.yaml

The CodeBuild project for cn-north-1 requires the following environment variables: USER_ACCESS_KEY_SSM, USER_SECRET_KEY_SSM, and S3_BUCKET_SSM.
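One way to wire these up, sketched below with the AWS CLI, is to set them on the CodeBuild project as plaintext values that hold the SSM parameter names (the buildspec decrypts the actual secrets at run time with --with-decryption). The project name and build image here are assumptions; when you pass --environment you must restate the full environment block, so match it to your existing project.

aws codebuild update-project --name China-Deploy-CodeBuild \
    --environment '{
      "type": "LINUX_CONTAINER",
      "image": "aws/codebuild/amazonlinux2-x86_64-standard:3.0",
      "computeType": "BUILD_GENERAL1_SMALL",
      "environmentVariables": [
        {"name": "USER_ACCESS_KEY_SSM", "value": "/China/Dev/UserAccessKey", "type": "PLAINTEXT"},
        {"name": "USER_SECRET_KEY_SSM", "value": "/China/Dev/UserSecretKey", "type": "PLAINTEXT"},
        {"name": "S3_BUCKET_SSM", "value": "/China/Dev/DeploymentS3Bucket", "type": "PLAINTEXT"}
      ]
    }'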

After creating and committing the previous files, your CodeCommit repository will look like the following.

Now that we have a CodeCommit repository, next we will create a CodePipeline for Lambda with the following stages:

  1. Source – Use the previously created CodeCommit repository (CodeRepository-US-East-1) as the source.
  2. Build – The CodeBuild project uses buildspec.yml by default, and takes output of Source stage as input and builds artifacts for us-east-1.
  3. Deploy
    1. Code-Deploy Project for deploying to us-east-1
      This takes the output of the previous CodeBuild stage as input and performs deployment in two steps: create-changeset and execute-changeset (assuming the role required for deployment is attached to the deploy action).
    2. CodeBuild Project for deploying to cn-north-1
      This takes the output of the Source stage as input and performs both build and deployment to cn-north-1 using buildspec-china.yml. It also uses the China IAM user credentials and bucket SSM parameters from environment variables. CodeBuild project details are outlined in the following image.

  4. Optional – Add further steps like manual approval, deployment to higher environment, and so forth, as required.

Congratulations! You have just created a CodePipeline with Lambda deployed in both a non-China Region and a China Region. Your CodePipeline should appear similar to the following.

Figure 3. CodePipeline implementation for both Regions

Note: The actual CodePipeline view is vertical, with each environment deployment appearing one after the other. For the purpose of this example, we have placed them side-by-side to more easily showcase multiple environments.

CodePipeline Implementation Steps

We have created this pipeline with the following high-level steps, and you can add or remove steps as needed.

Step 1. After you commit the source code, CodePipeline will launch in the non-China Region and fetch the source code.

Step 2. Build the package using buildspec.yml.

Step 3. Deploy the application in both Regions by following the sub-steps below for the development environment (a CLI sketch of the changeset steps follows this list).

    1. Create the changeset for the development environment.
    2. Implement the changeset for the development environment.
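The CodePipeline deploy action performs these two steps natively; a roughly equivalent AWS CLI sketch is shown below. The stack and changeset names are illustrative, and --change-set-type CREATE is needed on the first run when the stack does not yet exist.

# Create the changeset from the packaged template produced by the build stage
aws cloudformation create-change-set \
    --stack-name sample-app-stack-dev \
    --change-set-name sample-app-changeset-dev \
    --template-body file://transform-packaged.yaml \
    --capabilities CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND \
    --region us-east-1

# Wait for changeset creation, then execute it
aws cloudformation wait change-set-create-complete \
    --stack-name sample-app-stack-dev --change-set-name sample-app-changeset-dev --region us-east-1
aws cloudformation execute-change-set \
    --stack-name sample-app-stack-dev --change-set-name sample-app-changeset-dev --region us-east-1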

Step 4. Repeat step 3, but deploy the application in the staging environment.

Step 5. Wait for approval from your administrator or application owner before deploying application in production environment.

Step 6. Repeat steps 3 and 4 to deploy the application in the production environment.

Cleaning up

To avoid incurring future charges, clean up the resources created as part of this blog post.

  • Delete the CloudFormation stack created in the non-China Region.
  • Delete the SSM parameter created to store the access key.
  • Delete the access key created in the China Region.

Conclusion

In this blog post, we have explored the question: how can you use AWS services to implement CI/CD in a China Region and keep it in sync with another AWS Region? Although we are using us-east-1 as an example here, this solution works from any Region where CodePipeline is available and can deploy to any Region, including the China Regions.

The question has been answered by dividing it into three problem statements as follows.

Problem 1: CodePipeline is not available in the China Regions.
Solution: Set up CodePipeline in a non-China Region and deploy to a China Region.

Problem 2: AWS cross-account roles are not possible between a non-China Region and the China Regions.
Solution: Use the power and flexibility of CodeBuild to build your application and also deploy your application using the AWS CLI.

Problem 3: Keep a non-China Region and the China Regions in sync.
Solution: Maintain all code in, and manage deployments from, a common deployment AWS account.

Reference:

AWS CodeCommit | Managed Source Control Service

AWS CodePipeline | Continuous Integration and Continuous Delivery

AWS CodeBuild – Fully Managed Build Service

Using AWS CodePipeline for deploying container images to AWS Lambda Functions

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/using-aws-codepipeline-for-deploying-container-images-to-aws-lambda-functions/

AWS Lambda launched support for packaging and deploying functions as container images at re:Invent 2020. In the post Working with Lambda layers and extensions in container images, we demonstrated packaging Lambda functions with layers while using container images. This post shows how to use AWS CodePipeline to deploy docker images for a microservices architecture involving multiple Lambda functions and a common layer used as a separate container image. Lambda functions packaged as container images do not support adding Lambda layers to the function configuration. Instead, we can use a container image as a common layer while building the other container images for the Lambda functions shown in this post. Packaging Lambda functions as container images enables familiar tooling and larger deployment limits.

Here are some advantages of using container images for Lambda:

  • Easier dependency management and application building with container
    • Install native operating system packages
    • Install language-compatible dependencies
  • Consistent tool set for containers and Lambda-based applications
    • Utilize the same container registry to store application artifacts (Amazon ECR, Docker Hub)
    • Utilize the same build and pipeline tools to deploy
    • Tools that can inspect Dockerfile work the same
  • Deploy large applications with AWS-provided or third-party images up to 10 GB
    • Include larger application dependencies that previously were impossible

When using container images with Lambda, CodePipeline automatically detects code changes in the source repository in AWS CodeCommit, passes the artifact to a build service such as AWS CodeBuild, and pushes the container images to Amazon ECR, from which they are then deployed to the Lambda functions.

Architecture diagram

 

DevOps Architecture

Lambda-docker-images-DevOps-Architecture

Application Architecture

lambda-docker-image-microservices-app

In the architecture diagram above, two architectures are combined: (1) the DevOps architecture and (2) the microservices application architecture. The DevOps architecture demonstrates the use of AWS developer services such as AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild, along with Amazon Elastic Container Registry (Amazon ECR) and AWS CloudFormation. These are used to support continuous integration and continuous deployment/delivery (CI/CD) for both infrastructure and application code. The microservices application architecture demonstrates how the various AWS Lambda functions that are part of the microservices utilize container images for application code. This post focuses on performing CI/CD for Lambda functions utilizing container images. The application code used here is a simplified version taken from the Serverless Data Lake Framework (SDLF). For more information, refer to the AWS Samples GitHub repository for SDLF here.

DevOps workflow

There are two CodePipelines: one for building and pushing the common layer docker image to Amazon ECR, and another for building and pushing the docker images for all the Lambda functions within the microservices architecture to Amazon ECR, as well as deploying the microservices architecture involving Lambda functions via CloudFormation. The common layer container image functions as a common layer in all the other Lambda function container images, so its code is maintained in a separate CodeCommit repository that is used as the source stage for its own CodePipeline. The common layer CodePipeline takes the code from that CodeCommit repository and passes the artifact to a CodeBuild project that builds the container image and pushes it to an Amazon ECR repository. This common layer ECR repository then functions as a source for the microservices architecture CodePipeline, in addition to the CodeCommit repository holding the code for all the other Lambda functions and resources.

Because all, or the majority, of the Lambda functions in the microservices architecture require the common layer container image as a layer, any change made to it should invoke the microservices architecture CodePipeline, which builds the container images for all of the Lambda functions. In addition, the CodeCommit repository holding the code for every resource in the microservices architecture is another source that invokes that CodePipeline. The pipeline therefore has two sources, because the container images in the microservices architecture should be rebuilt both for changes to the common layer container image and for code changes pushed to the CodeCommit repository.

Below is the sample dockerfile that uses the common layer container image as a layer:

ARG ECR_COMMON_DATALAKE_REPO_URL
FROM ${ECR_COMMON_DATALAKE_REPO_URL}:latest AS layer
FROM public.ecr.aws/lambda/python:3.8
# Layer Code
WORKDIR /opt
COPY --from=layer /opt/ .
# Function Code
WORKDIR /var/task
COPY src/lambda_function.py .
CMD ["lambda_function.lambda_handler"]

where the argument ECR_COMMON_DATALAKE_REPO_URL should resolve to the ECR URL of the common layer container image, which is provided via --build-arg to the docker build command. For example:

export ECR_COMMON_DATALAKE_REPO_URL="0123456789.dkr.ecr.us-east-2.amazonaws.com/dev-routing-lambda"
docker build --build-arg ECR_COMMON_DATALAKE_REPO_URL=$ECR_COMMON_DATALAKE_REPO_URL .

Deploying a Sample

  • Step 1: Clone the repository codepipeline-lambda-docker-images to your workstation. If using the zip file, then unzip the file to a local directory.
    • git clone https://github.com/aws-samples/codepipeline-lambda-docker-images.git
  • Step 2: Change the directory to the cloned directory or extracted directory. The local code repository structure should appear as follows:
    • cd codepipeline-lambda-docker-images

code-repository-structure

  • Step 3: Deploy the CloudFormation stack used in the template file CodePipelineTemplate/codepipeline.yaml to your AWS account. This deploys the resources required for the DevOps architecture involving AWS CodePipelines for the common layer code and the microservices architecture code. Deploy CloudFormation stacks using the AWS console by following the documentation here, providing the name for the stack (for example datalake-infra-resources) and passing the parameters while navigating the console. Alternatively, use the AWS CLI to deploy a CloudFormation stack by following the documentation here.
  • Step 4: When the CloudFormation Stack deployment completes, navigate to the AWS CloudFormation console and to the Outputs section of the deployed stack, then note the CodeCommit repository urls. Three CodeCommit repo urls are available in the CloudFormation stack outputs section for each CodeCommit repository. Choose one of them based on the way you want to access it. Refer to the following documentation Setting up for AWS CodeCommit. I will be using the git-remote-codecommit (grc) method throughout this post for CodeCommit access.
  • Step 5: Clone the CodeCommit repositories and add code:
      • Common Layer CodeCommit repository: Take the value of the Output for the key oCommonLayerCodeCommitHttpsGrcRepoUrl from datalake-infra-resources CloudFormation Stack Outputs section which looks like below:

    commonlayercodeoutput

      • Clone the repository:
        • git clone codecommit::us-east-2://dev-CommonLayerCode
      • Change the directory to dev-CommonLayerCode
        • cd dev-CommonLayerCode
      •  Add contents to the cloned repository from the source code downloaded in Step 1. Copy the code from the CommonLayerCode directory and the repo contents should appear as follows:

    common-layer-repository

      • Create the main branch and push to the remote repository
        git checkout -b main
        git add ./
        git commit -m "Initial Commit"
        git push -u origin main
      • Application CodeCommit repository: Take the value of the Output for the key oAppCodeCommitHttpsGrcRepoUrl from datalake-infra-resources CloudFormation Stack Outputs section which looks like below:

    appcodeoutput

      • Clone the repository:
        • git clone codecommit::us-east-2://dev-AppCode
      • Change the directory to dev-AppCode
        • cd dev-AppCode
      • Add contents to the cloned repository from the source code downloaded in Step 1. Copy the code from the ApplicationCode directory and the repo contents should appear as follows from the root:

    app-layer-repository

    • Create the main branch and push to the remote repository
      git checkout -b main
      git add ./
      git commit -m "Initial Commit"
      git push -u origin main

What happens now?

  • Now the Common Layer CodePipeline goes to the InProgress state and invokes the Common Layer CodeBuild project that builds the docker image and pushes it to the Common Layer Amazon ECR repository. The image tag utilized for the container image is the value resolved for the environment variable available in the AWS CodeBuild project CODEBUILD_RESOLVED_SOURCE_VERSION. This is the CodeCommit git Commit Id in this case.
    For example, if the CommitId in CodeCommit is f1769c87, then the pushed docker image will have this tag along with latest
  • The buildspec.yaml file appears as follows:
    version: 0.2
    phases:
      install:
        runtime-versions:
          docker: 19
      pre_build:
        commands:
          - echo Logging in to Amazon ECR...
          - aws --version
          - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
          - REPOSITORY_URI=$ECR_COMMON_DATALAKE_REPO_URL
          - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
          - IMAGE_TAG=${COMMIT_HASH:=latest}
      build:
        commands:
          - echo Build started on `date`
          - echo Building the Docker image...          
          - docker build -t $REPOSITORY_URI:latest .
          - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
      post_build:
        commands:
          - echo Build completed on `date`
          - echo Pushing the Docker images...
          - docker push $REPOSITORY_URI:latest
          - docker push $REPOSITORY_URI:$IMAGE_TAG
  • Now the microservices architecture CodePipeline goes to the InProgress state and invokes the application image builder CodeBuild project, which builds the docker images and pushes them to the Amazon ECR repository.
    • To improve performance, every docker image is built in parallel within the CodeBuild project. The buildspec.yaml executes the build.sh script, which has the logic to build the docker images required for each Lambda function that is part of the microservices architecture. The docker images used for this sample architecture took approximately 4 to 5 minutes to build serially. After switching to parallel building, it took approximately 40 to 50 seconds.
    • The buildspec.yaml file appears as follows:
      version: 0.2
      phases:
        install:
          runtime-versions:
            docker: 19
          commands:
            - uname -a
            - set -e
            - chmod +x ./build.sh
            - ./build.sh
      artifacts:
        files:
          - cfn/**/*
        name: builds/$CODEBUILD_BUILD_NUMBER/cfn-artifacts
    • build.sh file appears as follows:
      #!/bin/bash
      set -eu
      set -o pipefail
      
      RESOURCE_PREFIX="${RESOURCE_PREFIX:=stg}"
      AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION:=us-east-1}"
      ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text 2>&1)
      ECR_COMMON_DATALAKE_REPO_URL="${ECR_COMMON_DATALAKE_REPO_URL:=$ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com\/$RESOURCE_PREFIX-common-datalake-library}"
      pids=()
      pids1=()
      
      PROFILE='new-profile'
      aws configure --profile $PROFILE set credential_source EcsContainer
      
      aws --version
      $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      BUILD_TAG=build-$(echo $CODEBUILD_BUILD_ID | awk -F":" '{print $2}')
      IMAGE_TAG=${BUILD_TAG:=COMMIT_HASH:=latest}
      
      cd dockerfiles;
      mkdir ../logs
      function pwait() {
          while [ $(jobs -p | wc -l) -ge $1 ]; do
              sleep 1
          done
      }
      
      function build_dockerfiles() {
          if [ -d $1 ]; then
              directory=$1
              cd $directory
              echo $directory
              echo "---------------------------------------------------------------------------------"
              echo "Start creating docker image for $directory..."
              echo "---------------------------------------------------------------------------------"
                  REPOSITORY_URI=$ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$RESOURCE_PREFIX-$directory
                  docker build --build-arg ECR_COMMON_DATALAKE_REPO_URL=$ECR_COMMON_DATALAKE_REPO_URL . -t $REPOSITORY_URI:latest -t $REPOSITORY_URI:$IMAGE_TAG -t $REPOSITORY_URI:$COMMIT_HASH
                  echo Build completed on `date`
                  echo Pushing the Docker images...
                  docker push $REPOSITORY_URI
              cd ../
              echo "---------------------------------------------------------------------------------"
              echo "End creating docker image for $directory..."
              echo "---------------------------------------------------------------------------------"
          fi
      }
      
      for directory in *; do 
         echo "------Started processing code in $directory directory-----"
         build_dockerfiles $directory 2>&1 1>../logs/$directory-logs.log | tee -a ../logs/$directory-logs.log &
         pids+=($!)
         pwait 20
      done
      
      for pid in "${pids[@]}"; do
        wait "$pid"
      done
      
      cd ../cfn/
      function build_cfnpackages() {
          if [ -d ${directory} ]; then
              directory=$1
              cd $directory
              echo $directory
              echo "---------------------------------------------------------------------------------"
              echo "Start packaging cloudformation package for $directory..."
              echo "---------------------------------------------------------------------------------"
              aws cloudformation package --profile $PROFILE --template-file template.yaml --s3-bucket $S3_BUCKET --output-template-file packaged-template.yaml
              echo "Replace the parameter 'pEcrImageTag' value with the latest built tag"
              echo $(jq --arg Image_Tag "$IMAGE_TAG" '.Parameters |= . + {"pEcrImageTag":$Image_Tag}' parameters.json) > parameters.json
              cat parameters.json
              ls -al
              cd ../
              echo "---------------------------------------------------------------------------------"
              echo "End packaging cloudformation package for $directory..."
              echo "---------------------------------------------------------------------------------"
          fi
      }
      
      for directory in *; do
          echo "------Started processing code in $directory directory-----"
          build_cfnpackages $directory 2>&1 1>../logs/$directory-logs.log | tee -a ../logs/$directory-logs.log &
          pids1+=($!)
          pwait 20
      done
      
      for pid in "${pids1[@]}"; do
        wait "$pid"
      done
      
      cd ../logs/
      ls -al
      for f in *; do
        printf '%s\n' "$f"
        paste /dev/null - < "$f"
      done
      
      cd ../
      

The function build_dockerfiles() loops through each directory within the dockerfiles directory and runs the docker build command in order to build the docker image. The name of the docker image, and therefore of the ECR repository, is determined by the name of the directory containing the Dockerfile. For example, if the Dockerfile directory is routing-lambda and the environment variables take the values below,

ACCOUNT_ID=0123456789
AWS_DEFAULT_REGION=us-east-2
RESOURCE_PREFIX=dev
directory=routing-lambda
REPOSITORY_URI=$ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$RESOURCE_PREFIX-$directory

Then REPOSITORY_URI becomes 0123456789.dkr.ecr.us-east-2.amazonaws.com/dev-routing-lambda
And the docker image is pushed to this resolved REPOSITORY_URI. Similarly, docker images for all other directories are built and pushed to Amazon ECR.

Important Note: The ECR repository names match the directory names where the Dockerfiles exist and were already created as part of the CloudFormation template codepipeline.yaml deployed in Step 3. In order to add more Lambda functions to the microservices architecture, make sure that the ECR repository name added to the codepipeline.yaml template matches the directory name within the AppCode repository's dockerfiles directory.

Every docker image is built in parallel in order to save time. Each build runs as a separate operating system process, and the resulting image is pushed to the Amazon ECR repository. The number of processes that run in parallel is controlled by the argument passed to the pwait function within the loop. For example, pwait 20 limits the number of parallel processes to 20 at any given time. The image tag for all docker images used for Lambda functions is constructed from the CodeBuild build ID, which is available via the environment variable $CODEBUILD_BUILD_ID, in order to ensure that a new image gets a new tag. This is required for CloudFormation to detect changes and update the Lambda functions with the new container image tag.

Once every docker image is built and pushed to Amazon ECR in the CodeBuild project, it packages every CloudFormation template by uploading all local artifacts to Amazon S3 via the aws cloudformation package CLI command for the templates available in its own directory within the cfn directory. Moreover, it updates each directory's parameters.json file with the ECR image tag as the value of the parameter pEcrImageTag. This is required for CloudFormation to detect changes and update the Lambda function with the new image tag.

After this, the CodeBuild project will output the packaged CloudFormation templates and parameters files as an artifact to AWS CodePipeline so that it can be deployed via AWS CloudFormation in further stages. This is done by first creating a ChangeSet and then deploying it at the next stage.

Testing the microservices architecture

As stated above, the sample application utilized for the microservices architecture involving multiple Lambda functions is a modified version of the Serverless Data Lake Framework. The microservices architecture CodePipeline deployed every AWS resource required to run the SDLF application via AWS CloudFormation stages. As part of SDLF, it also deployed a set of DynamoDB tables required for the applications to run. I utilized the meteorites sample for this, so the DynamoDB tables must be populated with the necessary data for the application to run.

Utilize the AWS console to write data to the Amazon DynamoDB tables. For more information, refer to this documentation. The sample json files are in the utils/DynamoDbConfig/ directory.

1. Add the record below to the octagon-Pipelines-dev DynamoDB table:

{
  "description": "Main Pipeline to Ingest Data",
  "ingestion_frequency": "WEEKLY",
  "last_execution_date": "2020-03-11",
  "last_execution_duration_in_seconds": 4.761,
  "last_execution_id": "5445249c-a097-447a-a957-f54f446adfd2",
  "last_execution_status": "COMPLETED",
  "last_execution_timestamp": "2020-03-11T02:34:23.683Z",
  "last_updated_timestamp": "2020-03-11T02:34:23.683Z",
  "modules": [
    { "name": "pandas", "version": "0.24.2" },
    { "name": "Python", "version": "3.7" }
  ],
  "name": "engineering-main-pre-stage",
  "owner": "Yuri Gagarin",
  "owner_contact": "y.gagarin@",
  "status": "ACTIVE",
  "tags": [
    { "key": "org", "value": "VOSTOK" }
  ],
  "type": "INGESTION",
  "version": 127
}

2. Add the record below to the octagon-Pipelines-dev DynamoDB table:

{
  "description": "Main Pipeline to Merge Data",
  "ingestion_frequency": "WEEKLY",
  "last_execution_date": "2020-03-11",
  "last_execution_duration_in_seconds": 570.559,
  "last_execution_id": "0bb30d20-ace8-4cb2-a9aa-694ad018694f",
  "last_execution_status": "COMPLETED",
  "last_execution_timestamp": "2020-03-11T02:44:36.069Z",
  "last_updated_timestamp": "2020-03-11T02:44:36.069Z",
  "modules": [
    { "name": "PySpark", "version": "1.0" }
  ],
  "name": "engineering-main-post-stage",
  "owner": "Neil Armstrong",
  "owner_contact": "n.armstrong@",
  "status": "ACTIVE",
  "tags": [
    { "key": "org", "value": "NASA" }
  ],
  "type": "TRANSFORM",
  "version": 4
}

3. Add the record below to the octagon-Datasets-dev DynamoDB table:

{
  "classification": "Orange",
  "description": "Meteorites Name, Location and Classification",
  "frequency": "DAILY",
  "max_items_process": 250,
  "min_items_process": 1,
  "name": "engineering-meteorites",
  "owner": "NASA",
  "owner_contact": "[email protected]",
  "pipeline": "main",
  "tags": [
    { "key": "cost", "value": "meteorites division" }
  ],
  "transforms": {
    "stage_a_transform": "light_transform_blueprint",
    "stage_b_transform": "heavy_transform_blueprint"
  },
  "type": "TRANSACTIONAL",
  "version": 1
}

 

If you want to create these samples using AWS CLI, please refer to this documentation.

Record 1:

aws dynamodb put-item --table-name octagon-Pipelines-dev --item '{"description":{"S":"Main Pipeline to Merge Data"},"ingestion_frequency":{"S":"WEEKLY"},"last_execution_date":{"S":"2021-03-16"},"last_execution_duration_in_seconds":{"N":"930.097"},"last_execution_id":{"S":"e23b7dae-8e83-4982-9f97-5784a9831a14"},"last_execution_status":{"S":"COMPLETED"},"last_execution_timestamp":{"S":"2021-03-16T04:31:16.968Z"},"last_updated_timestamp":{"S":"2021-03-16T04:31:16.968Z"},"modules":{"L":[{"M":{"name":{"S":"PySpark"},"version":{"S":"1.0"}}}]},"name":{"S":"engineering-main-post-stage"},"owner":{"S":"Neil Armstrong"},"owner_contact":{"S":"n.armstrong@"},"status":{"S":"ACTIVE"},"tags":{"L":[{"M":{"key":{"S":"org"},"value":{"S":"NASA"}}}]},"type":{"S":"TRANSFORM"},"version":{"N":"8"}}'

Record 2:

aws dynamodb put-item --table-name octagon-Pipelines-dev --item '{"description":{"S":"Main Pipeline to Ingest Data"},"ingestion_frequency":{"S":"WEEKLY"},"last_execution_date":{"S":"2021-03-28"},"last_execution_duration_in_seconds":{"N":"1.75"},"last_execution_id":{"S":"7e0e04e7-b05e-41a6-8ced-829d47866a6a"},"last_execution_status":{"S":"COMPLETED"},"last_execution_timestamp":{"S":"2021-03-28T20:23:06.031Z"},"last_updated_timestamp":{"S":"2021-03-28T20:23:06.031Z"},"modules":{"L":[{"M":{"name":{"S":"pandas"},"version":{"S":"0.24.2"}}},{"M":{"name":{"S":"Python"},"version":{"S":"3.7"}}}]},"name":{"S":"engineering-main-pre-stage"},"owner":{"S":"Yuri Gagarin"},"owner_contact":{"S":"y.gagarin@"},"status":{"S":"ACTIVE"},"tags":{"L":[{"M":{"key":{"S":"org"},"value":{"S":"VOSTOK"}}}]},"type":{"S":"INGESTION"},"version":{"N":"238"}}'

Record 3:

aws dynamodb put-item --table-name octagon-Datasets-dev --item '{"classification":{"S":"Orange"},"description":{"S":"Meteorites Name, Location and Classification"},"frequency":{"S":"DAILY"},"max_items_process":{"N":"250"},"min_items_process":{"N":"1"},"name":{"S":"engineering-meteorites"},"owner":{"S":"NASA"},"owner_contact":{"S":"[email protected]"},"pipeline":{"S":"main"},"tags":{"L":[{"M":{"key":{"S":"cost"},"value":{"S":"meteorites division"}}}]},"transforms":{"M":{"stage_a_transform":{"S":"light_transform_blueprint"},"stage_b_transform":{"S":"heavy_transform_blueprint"}}},"type":{"S":"TRANSACTIONAL"},"version":{"N":"1"}}'

Now upload the sample json files to the raw S3 bucket. The raw S3 bucket name can be obtained from the output of the common-cloudformation stack deployed as part of the microservices architecture CodePipeline. Navigate to the CloudFormation console in the Region where the CodePipeline was deployed, locate the stack with the name common-cloudformation, navigate to the Outputs section, and note the bucket name with the key oCentralBucket. Navigate to the Amazon S3 console, locate the bucket for oCentralBucket, create the path engineering/meteorites, and upload every sample json file to this directory (a CLI sketch follows). The meteorites sample json files are available in the utils/meteorites-test-json-files directory of the previously cloned repository. Wait a few minutes and then navigate to the stage bucket noted from the common-cloudformation stack output with the key oStageBucket. You can see the json files converted into csv in the pre-stage/engineering/meteorites folder in S3. Wait a few more minutes and then navigate to the post-stage/engineering/meteorites folder in the oStageBucket to see the csv files converted to parquet format.
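As a convenience, the upload can also be done with the AWS CLI. This sketch assumes you are in the root of the cloned repository and substitutes your actual oCentralBucket output value for the placeholder.

# Upload the meteorites sample files to the raw bucket (name from the oCentralBucket output)
aws s3 cp utils/meteorites-test-json-files/ \
    s3://<oCentralBucket-value>/engineering/meteorites/ --recursive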

 

Cleanup

Navigate to the AWS CloudFormation console, note the S3 bucket names from the common-cloudformation stack outputs, and empty the S3 buckets. Refer to Emptying the Bucket for more information.

Delete the CloudFormation stacks in the following order:
1. Common-Cloudformation
2. stagea
3. stageb
4. sdlf-engineering-meteorites
Then delete the infrastructure CloudFormation stack datalake-infra-resources deployed using the codepipeline.yaml template. Refer to the following documentation to delete CloudFormation Stacks: Deleting a stack on the AWS CloudFormation console or Deleting a stack using AWS CLI.

 

Conclusion

This method lets us use CI/CD via CodePipeline, CodeCommit, and CodeBuild, along with other AWS services, to automatically deploy container images to Lambda functions that are part of a microservices architecture. Furthermore, we can build a common layer, equivalent to a Lambda layer, independently via its own CodePipeline, which builds the container image and pushes it to Amazon ECR. The common layer container image in Amazon ECR then functions as a source for the microservices architecture CodePipeline, alongside the CodeCommit repository that holds the code for the microservices. Having two sources for the microservices architecture pipeline means every docker image is rebuilt both when the common layer docker image (referenced by the other docker images) changes and when code is pushed to the CodeCommit repository holding the other microservices, including the Lambda functions.

 

About the Author

kirankumar.jpeg Kirankumar Chandrashekar is a Sr. DevOps Consultant at AWS Professional Services. He focuses on leading customers in architecting DevOps technologies. Kirankumar is passionate about DevOps, infrastructure as code, and solving complex customer issues. He enjoys music, as well as cooking and traveling.

 

Chaos Testing with AWS Fault Injection Simulator and AWS CodePipeline

Post Syndicated from Matt Chastain original https://aws.amazon.com/blogs/architecture/chaos-testing-with-aws-fault-injection-simulator-and-aws-codepipeline/

The COVID-19 pandemic has proven to be the largest stress test of our technology infrastructures in generations. A meteoric increase in internet consumption followed, due in large part to working and schooling from home. The chaotic, early months of the pandemic have clearly demonstrated the value of resiliency in production. How can we better prepare our critical systems for these global events in the future? A more modern approach to testing and validating your application architecture is needed. Chaos engineering has emerged as an innovative approach to solving these types of challenges.

This blog shows an architecture pattern for automating chaos testing as part of your continuous integration/continuous delivery (CI/CD) process. By automating the implementation of chaos experiments inside CI/CD pipelines, complex risks and modeled failure scenarios can be tested against application environments with every deployment. Application teams can use the results of these experiments to prioritize improvements in their architecture. These results will give your team the confidence they need to operate in an unpredictable production environment.

AWS Fault Injection Simulator (FIS) is a managed service that enables you to perform fault injection experiments on your AWS workloads. Fault injection is based on the principles of chaos engineering. These experiments stress an application by creating disruptive events so that you can observe how your application responds. You can then use this information to improve the performance and resiliency of your applications. With AWS FIS, you set up and run experiments that help you create the real-world conditions needed to uncover application issues.

AWS CodePipeline is a fully managed continuous delivery service for fast and reliable application and infrastructure updates. You can use AWS CodePipeline to model and automate your software release processes. Automating your build, test, and release process allows you to quickly test each code change. You can ensure the quality of your application or infrastructure code by running each change through your staging and release process.

Continuous chaos testing

Figure 1. High-level architecture pattern for automating chaos engineering

Create FIS experiments

Begin by creating an FIS experiment template, configuring one or more actions (an action set) to run against the target resources of the application architecture. Here we have created an action to stop Amazon EC2 instances in our Amazon Elastic Container Service (ECS) cluster, identified by a tag. Target resources can be identified by resource IDs, filters, or tags. We can also set up action parameters, such as running an action before or after another action and the action duration. Additionally, you can set up Amazon CloudWatch alarms to stop running one or more fault experiments once a particular threshold or boundary has been reached. In Increase your e-commerce website reliability using chaos engineering and AWS Fault Injection Simulator, Bastien Leblanc shares how to set up CloudWatch metric thresholds as stop conditions for experiments.
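For reference, the following is a hedged AWS CLI sketch of such an experiment template. The tag key/value, selection mode, role ARN, and alarm ARN are placeholders for illustration; the action and resource type identifiers (aws:ec2:stop-instances and aws:ec2:instance) are the ones used for stopping EC2 instances.

cat > fis-template.json <<'EOF'
{
  "description": "Stop tagged EC2 container instances in the ECS cluster",
  "targets": {
    "ecsClusterInstances": {
      "resourceType": "aws:ec2:instance",
      "resourceTags": { "chaos-target": "ecs-cluster" },
      "selectionMode": "COUNT(1)"
    }
  },
  "actions": {
    "stopInstances": {
      "actionId": "aws:ec2:stop-instances",
      "parameters": { "startInstancesAfterDuration": "PT5M" },
      "targets": { "Instances": "ecsClusterInstances" }
    }
  },
  "stopConditions": [
    { "source": "aws:cloudwatch:alarm", "value": "arn:aws:cloudwatch:us-east-1:111122223333:alarm:example-stop-alarm" }
  ],
  "roleArn": "arn:aws:iam::111122223333:role/example-fis-experiment-role"
}
EOF
aws fis create-experiment-template --cli-input-json file://fis-template.json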

Figure 2. AWS FIS experiment template

Author AWS Lambda to initiate FIS experiments

Create an IAM role with the required FIS permissions and add it to the Lambda function in the configuration permissions section. To start a specific FIS experiment, we use the experimentTemplateId parameter in our Lambda code. Refer to the AWS FIS API Reference when writing your Lambda code. When integrating the Lambda function into your pipeline, a new AWS CodePipeline can be created or an existing one can be used. A new pipeline stage is added at the point where we initiate our Lambda function (post-deployment stage), which launches our FIS experiment.
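The Lambda function calls the FIS StartExperiment API through the AWS SDK. Shown here as a hedged AWS CLI sketch with a placeholder variable, the equivalent call is simply:

# EXPERIMENT_TEMPLATE_ID would typically be supplied via the function's environment
aws fis start-experiment --experiment-template-id "$EXPERIMENT_TEMPLATE_ID"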

Figure 3. AWS Lambda function initiating AWS FIS experiment

 

Figure 4. AWS CodePipeline with AWS FIS experiment stage

The experimentTemplateId parameter can also be staged as a key/value ‘environment variable’ in your Lambda function configuration. This is useful as it allows you to change your FIS experiment template without having to adjust your function code. You can use the same Lambda function code by dynamically injecting the experimentTemplateId in multiple environments on your way to production.
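A sketch of setting that environment variable with the AWS CLI (the function name and template ID are placeholders):

aws lambda update-function-configuration \
    --function-name start-fis-experiment \
    --environment "Variables={EXPERIMENT_TEMPLATE_ID=EXT123456789abcdef}"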

Verify FIS experiment results on deployed application

By continuously performing fault injection post-deployment in AWS CodePipeline, you learn about complex failure conditions, which you must solve. User experience and availability testing on your application during the runtime of the FIS experiment can be started by a notification rule. In AWS CodePipeline, you can use an Amazon Simple Notification Service (SNS) topic or chatbot integration. CloudWatch Synthetics can be used for those looking to automate experience testing on the candidate application while other FIS experiments are running.

Figure 5. AWS CodePipeline notification rule setting

Summary

Using AWS CodePipeline to automate chaos engineering experiments on application architecture with AWS FIS is straightforward. Following are some benefits from automating fault injection testing in our CI/CD pipelines:

  • Our team can achieve a higher degree of confidence in meeting the resiliency requirements of our application. We use a more modern approach to testing and automating this experimentation inside our existing CI/CD process with AWS CodePipeline.
  • We know more about the unknown risks to our application. All testing results we receive provide benefit and learning opportunities for our team. We use these results to understand what we do well, where we need to improve, or what we are willing to tolerate based on our application requirements.
  • Continuously evaluating our architectural fitness inside CI/CD allows our team to validate the impact each feature or component iteration has on the resiliency of our application.

However, the value of automating chaos testing is not limited to finding, fixing, or documenting the risks that surface in our application. Additional confidence is gained through constantly validating your operational practices, such as alerts and alarms, monitoring, and notifications.

FIS gives you a controlled and repeatable way to reproduce necessary conditions to fine-tune your operational procedures and runbooks. Automating this testing inside a CI/CD pipeline ensures a nearly continuous feedback loop for these operational practices.

CICD on Serverless Applications using AWS CodeArtifact

Post Syndicated from Anand Krishna original https://aws.amazon.com/blogs/devops/cicd-on-serverless-applications-using-aws-codeartifact/

Developing and deploying applications rapidly to users requires a working pipeline that accepts the user code (usually via a Git repository). AWS CodeArtifact was announced in 2020. It’s a secure and scalable artifact management product that easily integrates with other AWS products and services. CodeArtifact allows you to publish, store, and view packages, list package dependencies, and share your application’s packages.

In this post, I will show how we can build a simple DevOps pipeline for a sample Java application (JAR file) to be built with Maven.

Solution Overview

We utilize the following AWS services, tools, and frameworks to set up our continuous integration, continuous deployment (CI/CD) pipeline:

  • AWS CodeCommit
  • AWS CodePipeline
  • AWS CodeBuild
  • AWS CodeArtifact
  • AWS CloudFormation
  • AWS Cloud Development Kit (AWS CDK)
  • AWS Lambda
  • Amazon CloudWatch Events
  • Amazon Simple Notification Service (Amazon SNS)
  • Apache Maven

The following diagram illustrates the pipeline architecture and flow:

 

aws-codeartifact-pipeline

 

Our pipeline is built on CodePipeline with CodeCommit as the source (CodePipeline Source Stage). This triggers the pipeline via a CloudWatch Events rule. Then the code is fetched from the CodeCommit repository branch (main) and sent to the next pipeline phase. This CodeBuild phase is specifically for compiling, packaging, and publishing the code to CodeArtifact by utilizing a package manager—in this case Maven.

After Maven publishes the code to CodeArtifact, the pipeline asks for a manual approval to be directly approved in the pipeline. It can also optionally trigger an email alert via Amazon Simple Notification Service (Amazon SNS). After approval, the pipeline moves to another CodeBuild phase. This downloads the latest packaged JAR file from a CodeArtifact repository and deploys to the AWS Lambda function.

Clone the Repository

Clone the GitHub repository as follows:

git clone https://github.com/aws-samples/aws-cdk-codeartifact-pipeline-sample.git

Code Deep Dive

After the Git repository is cloned, the directory structure is shown as in the following screenshot :

aws-codeartifact-pipeline-code

Let’s study the files and code to understand how the pipeline is built.

The directory java-events is a sample Java Maven project. Find numerous sample applications on GitHub. For this post, we use the sample application java-events.

To add your own application code, place the pom.xml and settings.xml files in the root directory for the AWS CDK project.

Let’s study the code in the file lib/cdk-pipeline-codeartifact-new-stack.ts of the stack CdkPipelineCodeartifactStack. This is the heart of the AWS CDK code that builds the whole pipeline. The stack does the following:

  • Creates a CodeCommit repository called ca-pipeline-repository.
  • References a CloudFormation template (lib/ca-template.yaml) in the AWS CDK code via the module @aws-cdk/cloudformation-include.
  • Creates a CodeArtifact domain called cdkpipelines-codeartifact.
  • Creates a CodeArtifact repository called cdkpipelines-codeartifact-repository.
  • Creates a CodeBuild project called JarBuild_CodeArtifact. This CodeBuild phase does all of the code compiling, packaging, and publishing to CodeArtifact into a repository called cdkpipelines-codeartifact-repository.
  • Creates a CodeBuild project called JarDeploy_Lambda_Function. This phase fetches the latest artifact from CodeArtifact created in the previous step (cdkpipelines-codeartifact-repository) and deploys to the Lambda function.
  • Finally, creates a pipeline with four phases:
    • Source as CodeCommit (ca-pipeline-repository).
    • CodeBuild project JarBuild_CodeArtifact.
    • A Manual approval Stage.
    • CodeBuild project JarDeploy_Lambda_Function.

 

CodeArtifact shows the domain-specific and repository-specific connection settings to mention/add in the application’s pom.xml and settings.xml files as below:

aws-codeartifact-repository-connections
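These connection settings rely on a short-lived CodeArtifact authorization token. A minimal sketch of fetching it with the AWS CLI follows; the environment variable name is the one the generated settings.xml snippet conventionally references, so treat it as an assumption and align it with your own settings.xml.

export CODEARTIFACT_AUTH_TOKEN=$(aws codeartifact get-authorization-token \
    --domain cdkpipelines-codeartifact --domain-owner $Account_Id \
    --query authorizationToken --output text)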

Deploy the Pipeline

The AWS CDK code requires the following packages in order to build the CI/CD pipeline:

  • @aws-cdk/core
  • @aws-cdk/aws-codepipeline
  • @aws-cdk/aws-codepipeline-actions
  • @aws-cdk/aws-codecommit
  • @aws-cdk/aws-codebuild
  • @aws-cdk/aws-iam
  • @aws-cdk/cloudformation-include

 

Install the required AWS CDK packages as below:

npm i @aws-cdk/core @aws-cdk/aws-codepipeline @aws-cdk/aws-codepipeline-actions @aws-cdk/aws-codecommit @aws-cdk/aws-codebuild @aws-cdk/pipelines @aws-cdk/aws-iam @aws-cdk/cloudformation-include

Compile the AWS CDK code:

npm run build

Deploy the AWS CDK code:

cdk synth
cdk deploy

After the AWS CDK code is deployed, view the final output on the stack’s detail page on the AWS CloudFormation :

aws-codeartifact-pipeline-cloudformation-stack

 

How the pipeline works with artifact versions (using SNAPSHOTS)

In this demo, I publish SNAPSHOT to the repository. As per the documentation here and here, a SNAPSHOT refers to the most recent code along a branch. It’s a development version preceding the final release version. Identify a snapshot version of a Maven package by the suffix SNAPSHOT appended to the package version.

The application settings are defined in the pom.xml file. For this post, we define the following:

  • The version to be used, called 1.0-SNAPSHOT.
  • The specific packaging, called jar.
  • The specific project display name, called JavaEvents.
  • The specific group ID, called JavaEvents.

The screenshot below shows the pom.xml settings we utilised in the application:

aws-codeartifact-pipeline-pom-xml

 

You can’t republish a package asset that already exists with different content, as per the documentation here.

When a Maven snapshot is published, its previous version is preserved in a new version called a build. Each time a Maven snapshot is published, a new build version is created.

When a Maven snapshot is published, its status is set to Published, and the status of the build containing the previous version is set to Unlisted. If you request a snapshot, the version with status Published is returned. This is always the most recent Maven snapshot version.
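One quick way to inspect these version statuses, sketched here with the AWS CLI, is to list the package versions and filter by status (the --status filter is optional; omit it to see Unlisted build versions as well):

aws codeartifact list-package-versions \
    --domain cdkpipelines-codeartifact --domain-owner $Account_Id \
    --repository cdkpipelines-codeartifact-repository \
    --format maven --namespace JavaEvents --package JavaEvents \
    --status Published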

For example, the image below shows the state after the first pipeline run. The latest version has the status Published and previous builds are marked Unlisted.

aws-codeartifact-repository-package-versions

 

For all subsequent pipeline runs, multiple Unlisted versions will occur every time the pipeline is run, as all previous versions of a snapshot are maintained in its build versions.

aws-codeartifact-repository-package-versions

 

Fetching the Latest Code

Retrieve the snapshot from the repository in order to deploy the code to an AWS Lambda Function. I have used AWS CLI to list and fetch the latest asset of package version 1.0-SNAPSHOT.

 

Listing the latest snapshot

export ListLatestArtifact=`aws codeartifact list-package-version-assets --domain cdkpipelines-codeartifact --domain-owner $Account_Id --repository cdkpipelines-codeartifact-repository --namespace JavaEvents --format maven --package JavaEvents --package-version "1.0-SNAPSHOT" | jq ".assets[].name" | grep jar | sed 's/"//g'`

NOTE: Please note the variable $Account_Id, which represents the AWS Account ID.

 

Fetching the latest code using Package Version

aws codeartifact get-package-version-asset --domain cdkpipelines-codeartifact --repository cdkpipelines-codeartifact-repository --format maven --package JavaEvents --package-version 1.0-SNAPSHOT --namespace JavaEvents --asset $ListLatestArtifact demooutput

Notice that I'm referring to the latest code by using the variable $ListLatestArtifact. This always fetches the latest code, and demooutput is the output file of the AWS CLI command where the content (code) is saved.
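The final CodeBuild phase then deploys this artifact to the Lambda function. A hedged sketch of the equivalent AWS CLI call, using the function name shown later in the screenshots, looks like this (a JAR is a valid zip archive for --zip-file):

aws lambda update-function-code \
    --function-name codeartifact-test-function \
    --zip-file fileb://demooutput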

 

Testing the Pipeline

Now clone the CodeCommit repository that we created with the following code:

git clone https://git-codecommit.<region>.amazonaws.com/v1/repos/ca-pipeline-repository

 

Enter the following code to push the code to the CodeCommit repository:

cp -rp cdk-pipeline-codeartifact-new/* ca-pipeline-repository
cd ca-pipeline-repository
git checkout -b main
git add .
git commit -m "testing the pipeline"
git push origin main

Once the code is pushed to the Git repository, the pipeline is automatically triggered by Amazon CloudWatch Events.

The following screenshots show the second phase (AWS CodeBuild Phase – JarBuild_CodeArtifact) of the pipeline, wherein the asset is successfully compiled and published to the CodeArtifact repository by Maven:

aws-codeartifact-pipeline-codebuild-jarbuild

aws-codeartifact-pipeline-codebuild-screenshot

aws-codeartifact-pipeline-codebuild-screenshot2

 

The following screenshots show the last phase (AWS CodeBuild Phase – Deploy-to-Lambda) of the pipeline, wherein the latest asset is successfully pulled and deployed to AWS Lambda Function.

Asset JavaEvents-1.0-20210618.131629-5.jar is the latest snapshot code for the package version 1.0-SNAPSHOT. This is the same asset version code that will be deployed to AWS Lambda Function, as seen in the screenshots below:

aws-codeartifact-pipeline-codebuild-jardeploy

aws-codeartifact-pipeline-codebuild-screenshot-jarbuild

The following screenshot of the pipeline shows a successful run. The code was fetched and deployed to the existing Lambda function (codeartifact-test-function).

aws-codeartifact-pipeline-codepipeline

Cleanup

To clean up, you can either delete the entire stack through the AWS CloudFormation console or use the AWS CDK command below:

cdk destroy

For more information on the AWS CDK commands, please check the documentation here or the sample here.

Summary

In this post, I demonstrated how to build a CI/CD pipeline for your serverless application with AWS CodePipeline by utilizing AWS CDK with AWS CodeArtifact. Please check the documentation here for an in-depth explanation regarding other package managers and the getting started guide.