Tag Archives: CloudFormation

Configuration driven dynamic multi-account CI/CD solution on AWS

Post Syndicated from Anshul Saxena original https://aws.amazon.com/blogs/devops/configuration-driven-dynamic-multi-account-ci-cd-solution-on-aws/

Many organizations require durable automated code delivery for their applications. They leverage multi-account continuous integration/continuous deployment (CI/CD) pipelines to deploy code and run automated tests in multiple environments before deploying to Production. In cases where the testing strategy is release specific, you must update the pipeline before every release. Traditional pipeline stages are predefined and static in nature, and once the pipeline stages are defined it’s hard to update them. In this post, we present a configuration driven dynamic CI/CD solution per repository. The pipeline state is maintained and governed by configurations stored in Amazon DynamoDB. This gives you the advantage of automatically customizing the pipeline for every release based on the testing requirements.

By following this post, you will set up a dynamic multi-account CI/CD solution. Your pipeline will deploy and test a sample pet store API application. Refer to Automating your API testing with AWS CodeBuild, AWS CodePipeline, and Postman for more details on this application. New code deployments will be delivered with custom pipeline stages based on the pipeline configuration that you create. This solution uses services such as AWS Cloud Development Kit (AWS CDK), AWS CloudFormation, Amazon DynamoDB, AWS Lambda, and AWS Step Functions.

Solution overview

The following diagram illustrates the solution architecture:

The image represents the solution workflow, highlighting the integration of the AWS components involved.

Figure 1: Architecture Diagram

  1. Users insert/update/delete entry in the DynamoDB table.
  2. The Step Function Trigger Lambda is invoked on all modifications.
  3. The Step Function Trigger Lambda evaluates the incoming event and does the following:
    1. On insert and update, triggers the Step Function.
    2. On delete, finds the appropriate CloudFormation stack and deletes it.
  4. Steps in the Step Function are as follows:
    1. Collect Information (Pass State) – Filters the relevant information from the event, such as repositoryName and referenceName.
    2. Get Mapping Information (Backed by CodeCommit event filter Lambda) – Retrieves the mapping information from the Pipeline config stored in the DynamoDB.
    3. Deployment Configuration Exist? (Choice State) – If the StatusCode == 200, then the DynamoDB entry is found, and Initiate CloudFormation Stack step is invoked, or else StepFunction exits with Successful.
    4. Initiate CloudFormation Stack (Backed by stack create Lambda) – Constructs the CloudFormation parameters and creates/updates the dynamic pipeline based on the configuration stored in the DynamoDB via CloudFormation.

Code deliverables

The code deliverables include the following:

  1. AWS CDK app – The AWS CDK app contains the code for all the Lambdas, Step Functions, and CloudFormation templates.
  2. sample-application-repo – This directory contains the sample application repository used for deployment.
  3. automated-tests-repo– This directory contains the sample automated tests repository for testing the sample repo.

Deploying the CI/CD solution

  1. Clone this repository to your local machine.
  2. Follow the README to deploy the solution to your main CI/CD account. Upon successful deployment, the following resources should be created in the CI/CD account:
    1. A DynamoDB table
    2. Step Function
    3. Lambda Functions
  3. Navigate to the Amazon Simple Storage Service (Amazon S3) console in your main CI/CD account and search for a bucket with the name: cloudformation-template-bucket-<AWS_ACCOUNT_ID>. You should see two CloudFormation templates (templates/codepipeline.yaml and templates/childaccount.yaml) uploaded to this bucket.
  4. Run the childaccount.yaml in every target CI/CD account (Alpha, Beta, Gamma, and Prod) by going to the CloudFormation Console. Provide the main CI/CD account number as the “CentralAwsAccountId” parameter, and execute.
  5. Upon successful creation of Stack, two roles will be created in the Child Accounts:
    1. ChildAccountFormationRole
    2. ChildAccountDeployerRole

Pipeline configuration

Make an entry into devops-pipeline-table-info for the Repository name and branch combination. A sample entry can be found in sample-entry.json.

The pipeline is highly configurable, and everything can be configured through the DynamoDB entry.

The following are the top-level keys:

RepoName: Name of the repository for which AWS CodePipeline is configured.
RepoTag: Name of the branch used in CodePipeline.
BuildImage: Build image used for application AWS CodeBuild project.
BuildSpecFile: Buildspec file used in the application CodeBuild project.
DeploymentConfigurations: This key holds the deployment configurations for the pipeline. Under this key are the environment specific configurations. In our case, we’ve named our environments Alpha, Beta, Gamma, and Prod. You can configure to any name you like, but make sure that the entries in json are the same as in the codepipeline.yaml CloudFormation template. This is because there is a 1:1 mapping between them. Sub-level keys under DeploymentConfigurations are as follows:

  • EnvironmentName. This is the top-level key for environment specific configuration. In our case, it’s Alpha, Beta, Gamma, and Prod. Sub level keys under this are:
    • <Env>AwsAccountId: AWS account ID of the target environment.
    • Deploy<Env>: A key specifying whether or not the artifact should be deployed to this environment. Based on its value, the CodePipeline will have a deployment stage to this environment.
    • ManualApproval<Env>: Key representing whether or not manual approval is required before deployment. Enter your email or set to false.
    • Tests: Once again, this is a top-level key with sub-level keys. This key holds the test related information to be run on specific environments. Each test based on whether or not it will be run will add an additional step to the CodePipeline. The tests’ related information is also configurable with the ability to specify the test repository, branch name, buildspec file, and build image for testing the CodeBuild project.


  1. Make an entry into the devops-pipeline-table-info DynamoDB table in the main CI/CD account. A sample entry can be found in sample-entry.json. Make sure to replace the configuration values with appropriate values for your environment. An explanation of the values can be found in the Pipeline Configuration section above.
  2. After the entry is made in the DynamoDB table, you should see a CloudFormation stack being created. This CloudFormation stack will deploy the CodePipeline in the main CI/CD account by reading and using the entry in the DynamoDB table.

Customize the solution for different combinations such as deploying to an environment while skipping for others by updating the pipeline configurations stored in the devops-pipeline-table-info DynamoDB table. The following is the pipeline configured for the sample-application repository’s main branch.

The image represents the dynamic CI/CD pipeline deployed in your account.

The image represents the dynamic CI/CD pipeline deployed in your account.

The image represents the dynamic CI/CD pipeline deployed in your account.

The image represents the dynamic CI/CD pipeline deployed in your account.

Figure 2: Dynamic Multi-Account CI/CD Pipeline

Clean up your dynamic multi-account CI/CD solution and related resources

To avoid ongoing charges for the resources that you created following this post, you should delete the following:

  1. The pipeline configuration stored in the DynamoDB
  2. The CloudFormation stacks deployed in the target CI/CD accounts
  3. The AWS CDK app deployed in the main CI/CD account
  4. Empty and delete the retained S3 buckets.


This configuration-driven CI/CD solution provides the ability to dynamically create and configure your pipelines in DynamoDB. IDEMIA, a global leader in identity technologies, adopted this approach for deploying their microservices based application across environments. This solution created by AWS Professional Services allowed them to dynamically create and configure their pipelines per repository per release. As Kunal Bajaj, Tech Lead of IDEMIA, states, “We worked with AWS pro-serve team to create a dynamic CI/CD solution using lambdas, step functions, SQS, and other native AWS services to conduct cross-account deployments to our different environments while providing us the flexibility to add tests and approvals as needed by the business.”

About the authors:

Anshul Saxena

Anshul is a Cloud Application Architect at AWS Professional Services and works with customers helping them in their cloud adoption journey. His expertise lies in DevOps, serverless architectures, and architecting and implementing cloud native solutions aligning with best practices.

Libin Roy

Libin is a Cloud Infrastructure Architect at AWS Professional Services. He enjoys working with customers to design and build cloud native solutions to accelerate their cloud journey. Outside of work, he enjoys traveling, cooking, playing sports and weight training.

Using CloudFormation events to build custom workflows for post provisioning management

Post Syndicated from Vivek Kumar original https://aws.amazon.com/blogs/devops/using-cloudformation-events-to-build-custom-workflows-for-post-provisioning-management/

Over one million active customers manage application resources with AWS CloudFormation every week. CloudFormation is a service that helps you model, provision, and manage your cloud resources by treating Infrastructure as Code (IaC). It can simplify infrastructure management, quickly replicate your environment to multiple AWS regions with a single turn-key solution, and let you easily control and track changes in your infrastructure.

You can create various AWS resources using CloudFormation to setup an environment for your workloads. You continue to interact with and manage those resources throughout the workload lifecycle to make sure the resource configuration is aligned with business objectives such as adhering to security compliance standards, meeting required reliability targets, and aligning with budget requirements. The inability to perform a hand-off between resource provisioning actions in CloudFormation and resource management actions in other relevant AWS and non-AWS services poses a challenge. For example, after provisioning of resources, customers might need to perform additional tasks to manage these resources such as adding cost allocation tags, populating resource inventory database or trigger downstream processes.

While they are able to obtain the logical resource grouping that is tied to a workload or a workload component with a CloudFormation stack, that context does not extend beyond CloudFormation for the most part when they use various AWS and non-AWS services to conduct post-provisioning resource management. These AWS and non-AWS services typically offer a resource level view, or in some cases offer basic aggregated views such as supporting a tag group, or an account level abstraction to see all resources in a given account. For a CloudFormation customer, the inability to not have the context of a stack beyond resource provisioning provides a disjointed experience given there is no hand-off between resource provisioning actions in CloudFormation and resource management actions in other relevant AWS and non-AWS services. The various management actions customers take with their workload resources through out their lifecycle are

CloudFormation events provide a robust way to track the status of individual resources during the lifecycle of a stack. You can send CloudFormation events to Amazon EventBridge whenever a create, update,  or drift detection action is performed on your stack. Then you can set up additional workflows based on those events from EventBridge. For example, by tagging the resources automatically, you can reference that tag group when using AWS Trusted Advisor, and continue your resource management experience post-provisioning. CloudFormation sends these events to EventBridge automatically so that you don’t need to do anything. One real-world use case is to use these events to create actionable tasks for your teams to troubleshoot issues. CloudFormation events published to EventBridge can be used to create OpsItems within AWS Systems Manager OpsCenter. OpsItems are the work items created in OpsCenter for engineers to view, investigate and remediate tasks/issues. This enables teams to respond and resolve any issues more efficiently.


To set up the EventBridge rule, go to the AWS console and navigate to EventBridge. Select on Create Rule to get started. Enter Name, description and select Next:

Create Rule

On the next screen, select AWS events in the Event source section.

This sample event is for the CREATE_COMPLETE event. It contains the source, AWS account number, AWS region, event type, resources and details about the event.

On the same page in the Event pattern section:

Select Custom patterns (JSON editor) and enter the following event pattern. This will match any events when a resource fails to create, update, or delete. Learn more about EventBridge event patterns.

    "source": [
    "detail-type": [
        "CloudFormation Resource Status Change"
    "detail": {
        "status-details": {
            "status": [

Custom patterns - JSON editor

Select Next. On the Target screen, select AWS service, then select System Manager OpsItem as the target for this rule.

Target 1

Add a second target – an Amazon Simple Notification Service (SNS) Topic – to notify the Ops team whenever a failure occurs and an OpsItem has been created.

Target 2

Select Next and optionally add tags.

Select next to review the selections, and select Create rule.

Now your rule is created and whenever a stack failure occurs, an OpsItem gets created and a notification is sent out for the operators to troubleshoot and fix the issue. The OpsItem contains operational data, such as the resource that failed, the reason for failure, as well as the stack to which it belongs, which is useful for troubleshooting the issue. Operators can take manual actions or use runbooks codified as Systems Manager Documents to take corrective actions. From the AWS Console you can go to OpsCenter to see the events:

operational data

Once the issues have been addressed, operators can mark the OpsItem as resolved, and retry the stack operation that failed, resulting in a swift resolution of the issue, and preventing duplication of efforts.

This walkthrough is for the Console but you can use AWS Command Line Interface (AWS CLI), AWS SDK or even CloudFormation to accomplish all of this. Refer to AWS CLI documentation for more information on creating EventBridge rules through CLI. Furthermore, refer to AWS SDK documentation for creating EventBridge rules through AWS SDK. You can use following CloudFormation template to deploy the EventBridge rules example used as part of the walkthrough in this blog post:

	"Parameters": {
		"SNSTopicARN": {
			"Type": "String",
			"Description": "Enter the ARN of the SNS Topic where you want stack failure notifications to be sent."
	"Resources": {
		"CFNEventsRule": {
			"Type": "AWS::Events::Rule",
			"Properties": {
				"Description": "Event rule to capture CloudFormation failure events",
				"EventPattern": {
					"source": [
					"detail-type": [
						"CloudFormation Resource Status Change"
					"detail": {
						"status-details": {
							"status": [
				"Name": "cfn-stack-failure-test",
				"State": "ENABLED",
				"Targets": [
						"Arn": {
							"Fn::Sub": "arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:opsitem"
						"Id": "opsitems",
						"RoleArn": {
							"Fn::GetAtt": [
						"Arn": {
							"Ref": "SNSTopicARN"
						"Id": "sns"
		"TargetInvocationRole": {
			"Type": "AWS::IAM::Role",
			"Properties": {
				"AssumeRolePolicyDocument": {
					"Version": "2012-10-17",
					"Statement": [
							"Effect": "Allow",
							"Principal": {
								"Service": [
							"Action": [
				"Path": "/",
				"Policies": [
						"PolicyName": "createopsitem",
						"PolicyDocument": {
							"Version": "2012-10-17",
							"Statement": [
									"Effect": "Allow",
									"Action": [
									"Resource": "*"
		"AllowSNSPublish": {
			"Type": "AWS::SNS::TopicPolicy",
			"Properties": {
				"PolicyDocument": {
					"Statement": [
							"Sid": "grant-eventbridge-publish",
							"Effect": "Allow",
							"Principal": {
								"Service": "events.amazonaws.com"
							"Action": [
							"Resource": {
								"Ref": "SNSTopicARN"
				"Topics": [
						"Ref": "SNSTopicARN"


Responding to CloudFormation stack events becomes easy with the integration between CloudFormation and EventBridge. CloudFormation events can be used to perform post-provisioning actions on workload resources. With the variety of targets available to EventBridge rules, various actions such as adding tags and, troubleshooting issues can be performed. This example above uses Systems Manager and Amazon SNS but you can have numerous targets including, Amazon API gateway, AWS Lambda, Amazon Elastic Container Service (Amazon ECS) task, Amazon Kinesis services, Amazon Redshift, Amazon SageMaker pipeline, and many more. These events are available for free in EventBridge.

Learn more about Managing events with CloudFormation and EventBridge.

About the Author

Vivek is a Solutions Architect at AWS based out of New York. He works with customers providing technical assistance and architectural guidance on various AWS services. He brings more than 25 years of experience in software engineering and architecture roles for various large-scale enterprises.



Mahanth is a Solutions Architect at Amazon Web Services (AWS). As part of the AWS Well-Architected team, he works with customers and AWS Partner Network partners of all sizes to help them build secure, high-performing, resilient, and efficient infrastructure for their applications. He spends his free time playing with his pup Cosmo, learning more about astronomy, and is an avid gamer.



Sukhchander is a Solutions Architect at Amazon Web Services. He is passionate about helping startups and enterprises adopt the cloud in the most scalable, secure, and cost-effective way by providing technical guidance, best practices, and well architected solutions.

Implementing long running deployments with AWS CloudFormation Custom Resources using AWS Step Functions

Post Syndicated from DAMODAR SHENVI WAGLE original https://aws.amazon.com/blogs/devops/implementing-long-running-deployments-with-aws-cloudformation-custom-resources-using-aws-step-functions/

AWS CloudFormation custom resource provides mechanisms to provision AWS resources that don’t have built-in support from CloudFormation. It lets us write custom provisioning logic for resources that aren’t supported as resource types under CloudFormation. This post focusses on the use cases where CloudFormation custom resource is used to implement a long running task/job. With custom resources, you can manage these custom tasks (which are one-off in nature) as deployment stack resources.

The routine pattern used for implementing custom resources is via AWS Lambda function. However, when using the Lambda function as the custom resource provider, you must consider its trade-offs, such as its 15 minute timeout. Tasks involved in the provisioning of certain AWS resources can be long running and could span beyond the Lambda timeout. In these scenarios, you must look beyond the conventional Lambda function-based approach for custom resources.

In this post, I’ll demonstrate how to use AWS Step Functions to implement custom resources using AWS Cloud Development Kit (AWS CDK). Step Functions allow complex deployment tasks to be orchestrated as a step-by-step workflow. It also offers direct integration with any AWS service via AWS SDK integrations. By default the CloudFormation stack waits for 1 hour before timing out. The timeout can be increased to maximum 12 hours using wait conditions. In this post, you’ll also see how to use wait conditions with custom resource to run long running deployment tasks as part of a CloudFormation stack.


Before proceeding any further, you must identify and designate an AWS account required for the solution to work. You must also create an AWS account profile in ~/.aws/credentials for the designated AWS account, if you don’t already have one. The profile must have sufficient permissions to run an AWS CDK stack. It should be your private profile and only be used during the course of this post. Therefore, it should be fine if you want to use admin privileges. Don’t share the profile details, especially if it has admin privileges. I recommend removing the profile when you’re finished with this walkthrough. For more information about creating an AWS account profile, see Configuring the AWS CLI.

Services and frameworks used in the post include CloudFormation, Step Functions, Lambda, DynamoDB, Amazon S3, and AWS CDK.

Solution overview

The following architecture diagram shows the application of Step Functions to implement custom resources.

Architecture diagram

Figure 1. Architecture diagram

  1. The user deploys a CloudFormation stack that includes a custom resource implementation.
  2. The CloudFormation custom resource triggers a Lambda function with the appropriate event which can be CREATE/UPDATE/DELETE.
  3. The custom resource Lambda function invokes Step Functions workflow and offloads the event handling responsibility. The CloudFormation event and context are wrapped inside the Step Function input at the time of invocation.
  4. The custom resource Lambda function returns SUCCESS back to CloudFormation stack indicating that the custom resource provisioning has begun. CloudFormation stack then goes into waiting mode where it waits for a SUCCESS or FAILURE signal to continue.
  5. In the interim, Step Functions workflow handles the custom resource event through one or more steps.
  6. Step Functions workflow prepares the response to be sent back to CloudFormation stack.
  7. Send Response Lambda function sends a success/failure response back to CloudFormation stack. This propels CloudFormation stack out of the waiting mode and into completion.

Solution deep dive

In this section I will get into the details of several key aspects of the solution

Custom Resource Definition

Following code snippet shows the custom resource definition which can be found here. Please note that we also define AWS::CloudFormation::WaitCondition and AWS::CloudFormation::WaitConditionHandle alongside the custom resource. AWS::CloudFormation::WaitConditionHandle resource sets up a pre-signed URL which is passed into the CallbackUrl property of the Custom Resource.

The final completion signal for the custom resource i.e. SUCCESS/FAILURE is received over this CallbackUrl. To learn more about wait conditions please refer to its user guide here. Note that, when updating the custom resource, you cannot use the existing WaitCondition-WaitConditionHandle resource pair. You need to create a new pair for tracking each update/delete operation on the custom resource.

/************************** Custom Resource Definition *****************************/
// When you intend to update CustomResource make sure that a new WaitCondition and 
// a new WaitConditionHandle resource is created to track CustomResource update.
// The strategy we are using here is to create a hash of Custom Resource properties.
// The resource names for WaitCondition and WaitConditionHandle carry this hash.
// Anytime there is an update to the custom resource properties, a new hash is generated,
// which automatically leads to new WaitCondition and WaitConditionHandle resources.
const resourceName: string = getNormalizedResourceName('DemoCustomResource');
const demoData = {
    pk: 'demo-sfn',
    sk: resourceName,
    ts: Date.now().toString()
const dataHash = hash(demoData);
const wcHandle = new CfnWaitConditionHandle(
const customResource = new CustomResource(this, resourceName, {
    serviceToken: customResourceLambda.functionArn,
    properties: {
        DDBTable: String(demoTable.tableName),
        Data: JSON.stringify(demoData),
        CallbackUrl: wcHandle.ref
// Note: AWS::CloudFormation::WaitCondition resource type does not support updates.
new CfnWaitCondition(
        count: 1,
        timeout: '300',
        handle: wcHandle.ref

Custom Resource Lambda

Following code snippet shows how the custom resource lambda function passes the CloudFormation event as an input into the StepFunction at the time of invocation. CloudFormation event contains the CallbackUrl resource property I discussed in the previous section.

private async startExecution() {
    const input = {
        cfnEvent: this.event,
        cfnContext: this.context
    const params: StartExecutionInput = {
        stateMachineArn: String(process.env.SFN_ARN),
        input: JSON.stringify(input)
    let attempt = 0;
    let retry = false;
    do {
        try {
            const response = await this.sfnClient.startExecution(params).promise();
            console.debug('Response: ' + JSON.stringify(response));
            retry = false;

Custom Resource StepFunction

The StepFunction handles the CloudFormation event based on the event type. The CloudFormation event containing CallbackUrl is passed down the stages of StepFunction all the way to the final step. The last step of the StepFunction sends back the response over CallbackUrl via send-cfn-response lambda function as shown in the following code snippet.

 * Send response back to cloudformation
 * @param event
 * @param context
 * @param response
export async function sendResponse(event: any, context: any, response: any) {
    const responseBody = JSON.stringify({
        Status: response.Status,
        Reason: "Success",
        UniqueId: response.PhysicalResourceId,
        Data: JSON.stringify(response.Data)
    console.debug("Response body:\n", responseBody);
    const parsedUrl = url.parse(event.ResourceProperties.CallbackUrl);
    const options = {
        hostname: parsedUrl.hostname,
        port: 443,
        path: parsedUrl.path,
        method: "PUT",
        headers: {
            "content-type": "",
            "content-length": responseBody.length
    await new Promise(() => {
        const request = https.request(options, function(response: any) {
	    console.debug("Status code: " + response.statusCode);
	    console.debug("Status message: " + response.statusMessage);
	request.on("error", function(error) {
	    console.debug("send(..) failed executing https.request(..): " + error);


Clone the GitHub repo cfn-custom-resource-using-step-functions and navigate to the folder cfn-custom-resource-using-step-functions. Now, execute the script script-deploy.sh by passing the name of the AWS profile that you created in the prerequisites section above. This should deploy the solution. The commands are shown as follows for your reference. Note that if you don’t pass the AWS profile name ‘default’ the profile will be used for deployment.

git clone 
cd cfn-custom-resource-using-step-functions
./script-deploy.sh "<AWS- ACCOUNT-PROFILE-NAME>"

The deployed solution consists of 2 stacks as shown in the following screenshot

  1. cfn-custom-resource-common-lib: Deploys common components
    • DynamoDB table that custom resources write to during their lifecycle events
    • Lambda layer used across the rest of the stacks
  2. cfn-custom-resource-sfn: Deploys Step Functions backed custom resource implementation
CloudFormation stacks deployed

Figure 2. CloudFormation stacks deployed

For demo purposes, I implemented a custom resource that inserts data into the DynamoDB table. When you deploy the solution for the first time, like you just did in the previous step, it initiates a CREATE event resulting in the creation of a new custom resource using Step Functions. You should see a new record with unix epoch timestamp in the DynamoDB table, indicating that the resource was created as shown in the following screenshot. You can find the DynamoDB table name/arn from the SSM Parameter Store /CUSTOM_RESOURCE_PATTERNS/DYNAMODB/ARN

DynamoDB record indicating custom resource creation

Figure 3. DynamoDB record indicating custom resource creation

Now, execute the script script-deploy.sh again. This should initiate an UPDATE event, resulting in the update of custom resources. The code also automatically creates new WaitConditionHandle and WaitCondition resources required to wait for the update event to finish. Now you should see that the records in the DynamoDb table have been updated with new values for lastOperation and ts attributes as follows.

DynamoDB record indicating custom resource update

Figure 4. DynamoDB record indicating custom resource update

Cleaning up

To remove all of the stacks, run the script script-undeploy.sh as follows.

./script-undeploy.sh "<AWS- ACCOUNT-PROFILE-NAME>"


In this post I showed how to look beyond the conventional approach of building CloudFormation custom resources using a Lambda function. I discussed implementing custom resources using Step Functions and CloudFormation wait conditions. Try this solution in scenarios where you must execute a long running deployment task/job as part of your CloudFormation stack deployment.



About the author:

Damodar Shenvi

Damodar Shenvi Wagle is a Cloud Application Architect at AWS Professional Services. His areas of expertise include architecting serverless solutions, CI/CD and automation.

Deploy and manage OpenAPI/Swagger RESTful APIs with the AWS Cloud Development Kit

Post Syndicated from Luke Popplewell original https://aws.amazon.com/blogs/devops/deploy-and-manage-openapi-swagger-restful-apis-with-the-aws-cloud-development-kit/

This post demonstrates how AWS Cloud Development Kit (AWS CDK) Infrastructure as Code (IaC) constructs and AWS serverless technology can be used to build and deploy a RESTful Application Programming Interface (API) defined in the OpenAPI specification. This post uses an example API that describes  Widget resources and demonstrates how to use an AWS CDK Pipeline to:

  • Deploy a RESTful API stage to Amazon API Gateway from an OpenAPI specification.
  • Build and deploy an AWS Lambda function that contains the API functionality.
  • Auto-generate API documentation and publish it to an Amazon Simple Storage Service (Amazon S3)-hosted website served by the Amazon CloudFront content delivery network (CDN) service. This provides technical and non-technical stakeholders with versioned, current, and accessible API documentation.
  • Auto-generate client libraries for invoking the API and deploy them to AWS CodeArtifact, which is a fully-managed artifact repository service. This allows API client development teams to integrate with different versions of the API in different environments.

The diagram shown in the following figure depicts the architecture of the AWS services and resources described in this post.

 The architecture described in this post consists of an AWS CodePipeline pipeline, provisioned using the AWS CDK, that deploys the Widget API to AWS Lambda and API Gateway. The pipeline then auto-generates the API’s documentation as a website served by CloudFront and deployed to S3. Finally, the pipeline auto-generates a client library for the API and deploys this to CodeArtifact.

Figure 1 – Architecture

The code that accompanies this post, written in Java, is available here.


APIs must be understood by all stakeholders and parties within an enterprise including business areas, management, enterprise architecture, and other teams wishing to consume the API. Unfortunately, API definitions are often hidden in code and lack up-to-date documentation. Therefore, they remain inaccessible for the majority of the API’s stakeholders. Furthermore, it’s often challenging to determine what version of an API is present in different environments at any one time.

This post describes some solutions to these issues by demonstrating how to continuously deliver up-to-date and accessible API documentation, API client libraries, and API deployments.


The AWS CDK is a software development framework for defining cloud IaC and is available in multiple languages including TypeScript, JavaScript, Python, Java, C#/.Net, and Go. The AWS CDK Developer Guide provides best practices for using the CDK.

This post uses the CDK to define IaC in Java which is synthesized to a cloud assembly. The cloud assembly includes one to many templates and assets that are deployed via an AWS CodePipeline pipeline. A unit of deployment in the CDK is called a Stack.

OpenAPI specification (formerly Swagger specification)

OpenAPI specifications describe the capabilities of an API and are both human and machine-readable. They consist of definitions of API components which include resources, endpoints, operation parameters, authentication methods, and contact information.

Project composition

The API project that accompanies this post consists of three directories:

  • app
  • api
  • cdk

app directory

This directory contains the code for the Lambda function which is invoked when the Widget API is invoked via API Gateway. The code has been developed in Java as an Apache Maven project.

The Quarkus framework has been used to define a WidgetResource class (see src/main/java/aws/sample/blog/cdkopenapi/app/WidgetResources.java ) that contains the methods that align with HTTP Methods of the Widget API.
api directory

The api directory contains the OpenAPI specification file ( openapi.yaml ). This file is used as the source for:

  • Defining the REST API using API Gateway’s support for OpenApi.
  • Auto-generating the API documentation.
  • Auto-generating the API client artifact.

The api directory also contains the following files:

  • openapi-generator-config.yaml : This file contains configuration settings for the OpenAPI Generator framework, which is described in the section CI/CD Pipeline.
  • maven-settings.xml: This file is used support the deployment of the generated SDKs or libraries (Apache Maven artifacts) for the API and is described in the CI/CD Pipeline section of this post.

This directory contains a sub directory called docker . The docker directory contains a Dockerfile which defines the commands for building a Docker image:

FROM ruby:2.6.5-alpine
RUN apk update \
 && apk upgrade --no-cache \
 && apk add --no-cache --repository http://dl-cdn.alpinelinux.org/alpine/v3.14/main/ nodejs=14.20.0-r0 npm \
 && apk add git \
 && apk add --no-cache build-base
# Install Widdershins node packages and ruby gem bundler 
RUN npm install -g widdershins \
 && gem install bundler 
# working directory
WORKDIR /openapi
# Clone and install the Slate framework
RUN git clone https://github.com/slatedocs/slate
RUN cd slate \
 && bundle install

The Docker image incorporates two open source projects, the NodeJS Widdershins library and the Ruby Slate-framework. These are used together to auto-generate the documentation for the API from the OpenAPI specification.  This Dockerfile is referenced and built by the  ApiStack class, which is described in the CDK Stacks section of this post.

cdk directory

This directory contains an Apache Maven Project developed in Java for provisioning the CDK stacks for the  Widget API.

Under the  src/main/java  folder, the package  aws.sample.blog.cdkopenapi.cdk  contains the files and classes that define the application’s CDK stacks and also the entry point (main method) for invoking the stacks from the CDK Toolkit CLI:

  • CdkApp.java: This file contains the  CdkApp class which provides the main method that is invoked from the AWS CDK Toolkit to build and deploy the  application stacks.
  • ApiStack.java: This file contains the   ApiStack class which defines the  OpenApiBlogAPI   stack and is described in the CDK Stacks section of this post.
  • PipelineStack.java: This file contains the   PipelineStack class which defines the OpenAPIBlogPipeline  stack and is described in the CDK Stacks section of this post.
  • ApiStackStage.java: This file contains the  ApiStackStage class which defines a CDK stage. As detailed in the CI/CD Pipeline section of this post, a DEV stage, containing the  OpenApiBlogAPI stack resources for a DEV environment, is deployed from the  OpenApiBlogPipeline pipeline.

CDK stacks


Note that the CDK bundling functionality is used at multiple points in the  ApiStack  class to produce CDK Assets. The post, Building, bundling, and deploying applications with the AWS CDK, provides more details regarding using CDK bundling mechanisms.

The  ApiStack  class defines multiple resources including:

  • Widget API Lambda function: This is bundled by the CDK in a Docker container using the Java 11 runtime image.
  • Widget  REST API on API Gateway: The REST API is created from an Inline API Definition which is passed as an S3 CDK Asset. This asset includes a reference to the  Widget API OpenAPI specification located under the  api folder (see  api/openapi.yaml ) and builds upon the SpecRestApi construct and API Gateway’s support for OpenApi.
  • API documentation Docker Image Asset: This is the Docker image that contains the open source frameworks (Widdershins and Slate) that are leveraged to generate the API documentation.
  • CDK Asset bundling functionality that leverages the API documentation Docker image to auto-generate documentation for the API.
  • An S3 Bucket for holding the API documentation website.
  • An origin access identity (OAI) which allows CloudFront to securely serve the S3 Bucket API documentation content.
  • A CloudFront distribution which provides CDN functionality for the S3 Bucket website.

Note that the  ApiStack class features the following code which is executed on the  Widget API Lambda construct:

CfnFunction apiCfnFunction = (CfnFunction)apiLambda.getNode().getDefaultChild();

The CDK, by default, auto-assigns an ID for each defined resource but in this case the generated ID is being overridden with “APILambda”. The reason for this is that inside of the  Widget API OpenAPI specification (see  api/openapi.yaml ), there is a reference to the Lambda function by name (“APILambda”) so that the function can be integrated as a proxy for each listed API path and method combination. The OpenAPI specification includes this name as a variable to derive the Amazon Resource Name (ARN) for the Lambda function:

	Fn::Sub: "arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${APILambda.Arn}/invocations"


The  PipelineStack class defines a CDK CodePipline construct which is a higher level construct and pattern. Therefore, the construct doesn’t just map directly to a single CloudFormation resource, but provisions multiple resources to fulfil the requirements of the pattern. The post, CDK Pipelines: Continuous delivery for AWS CDK applications, provides more detail on creating pipelines with the CDK.

final CodePipeline pipeline = CodePipeline.Builder.create(this, "OpenAPIBlogPipeline")

CI/CD pipeline

The diagram in the following figure shows the multiple CodePipeline stages and actions created by the CDK CodePipeline construct that is defined in the PipelineStack class.

The CI/CD pipeline’s stages include the Source stage, the Synth stage, the Update pipeline, the Assets stage, and the DEV stage.

Figure 2 – CI/CD Pipeline

The stages defined include the following:

  • Source stage: The pipeline is passed the source code contents from this stage.
  • Synth stage: This stage consists of a Synth Action that synthesizes the CloudFormation templates for the application’s CDK stacks and compiles and builds the project Lambda API function.
  • Update Pipeline stage: This stage checks the OpenAPIBlogPipeline stack and reinitiates the pipeline when changes to its definition have been deployed.
  • Assets stage: The application’s CDK stacks produce multiple file assets (for example, zipped Lambda code) which are published to Amazon S3. Docker image assets are published to a managed CDK framework Amazon Elastic Container Registry (Amazon ECR) repository.
  • DEV stage: The API’s CDK stack ( OpenApiBlogAPI ) is deployed to a hypothetical development environment in this stage. A post stage deployment action is also defined in this stage. Through the use of a CDK ShellStep construct, a Bash script is executed that deploys a generated client Java Archive (JAR) for the Widget API to CodeArtifact. The script employs the OpenAPI Generator project for this purpose:
CodeBuildStep codeArtifactStep = CodeBuildStep.Builder.create("CodeArtifactDeploy")
           	"echo $REPOSITORY_DOMAIN",
           	"echo $REPOSITORY_NAME",
           	"export CODEARTIFACT_TOKEN=`aws codeartifact get-authorization-token --domain $REPOSITORY_DOMAIN --query authorizationToken --output text`",
           	"export REPOSITORY_ENDPOINT=$(aws codeartifact get-repository-endpoint --domain $REPOSITORY_DOMAIN --repository $REPOSITORY_NAME --format maven | jq .repositoryEndpoint | sed 's/\\\"//g')",
           	"echo $REPOSITORY_ENDPOINT",
           	"cd api",
           	"wget -q https://repo1.maven.org/maven2/org/openapitools/openapi-generator-cli/5.4.0/openapi-generator-cli-5.4.0.jar -O openapi-generator-cli.jar",
     	          "cp ./maven-settings.xml /root/.m2/settings.xml",
        	          "java -jar openapi-generator-cli.jar batch openapi-generator-config.yaml",
                    "cd client",
                    "mvn --no-transfer-progress deploy -DaltDeploymentRepository=openapi--prod::default::$REPOSITORY_ENDPOINT"
      .rolePolicyStatements(Arrays.asList(codeArtifactStatement, codeArtifactStsStatement))
.env(new HashMap<String, String>() {{
      		put("REPOSITORY_DOMAIN", codeArtifactDomainName);
            	put("REPOSITORY_NAME", codeArtifactRepositoryName);

Running the project

To run this project, you must install the AWS CLI v2, the AWS CDK Toolkit CLI, a Java/JDK 11 runtime, Apache Maven, Docker, and a Git client. Furthermore, the AWS CLI must be configured for a user who has administrator access to an AWS Account. This is required to bootstrap the CDK in your AWS account (if not already completed) and provision the required AWS resources.

To build and run the project, perform the following steps:

  1. Fork the OpenAPI blog project in GitHub.
  2. Open the AWS Console and create a connection to GitHub. Note the connection’s ARN.
  3. In the Console, navigate to AWS CodeArtifact and create a domain and repository.  Note the names used.
  4. From the command line, clone your forked project and change into the project’s directory:
git clone https://github.com/<your-repository-path>
cd <your-repository-path>
  1. Edit the CDK JSON file at  cdk/cdk.json  and enter the details:
"RepositoryString": "<your-github-repository-path>",
"RepositoryBranch": "<your-github-repository-branch-name>",
"CodestarConnectionArn": "<connection-arn>",
"CodeArtifactDomain": "<code-artifact-domain-name>",
"CodeArtifactRepository": "<code-artifact-repository-name>"

Please note that for setting configuration values in CDK applications, it is recommend to use environment variables or AWS Systems Manager parameters.

  1. Commit and push your changes back to your GitHub repository:
git push origin main
  1. Change into the  cdk directory and bootstrap the CDK in your AWS account if you haven’t already done so (enter “Y” when prompted):
cd cdk
cdk bootstrap
  1. Deploy the CDK pipeline stack (enter “Y” when prompted):
cdk deploy OpenAPIBlogPipeline

Once the stack deployment completes successfully, the pipeline  OpenAPIBlogPipeline will start running. This will build and deploy the API and its associated resources. If you open the Console and navigate to AWS CodePipeline, then you’ll see a pipeline in progress for the API.

Once the pipeline has completed executing, navigate to AWS CloudFormation to get the output values for the  DEV-OpenAPIBlog  stack deployment:

  1. Select the  DEV-OpenAPIBlog  stack entry and then select the Outputs column. Record the REST_URL value for the key that begins with   OpenAPIBlogRestAPIEndpoint .
  2. Record the CLOUDFRONT_URL value for the key  OpenAPIBlogCloudFrontURL .

The API ping method at https://<REST_URL>/ping can now be invoked using your browser or an API development tool like Postman. Other API other methods, as defined by the OpenApi specification, are also available for invocation (For example, GET https://<REST_URL>/widgets).

To view the generated API documentation, open a browser at https://< CLOUDFRONT_URL>.

The following figure shows the API documentation website that has been auto-generated from the API’s OpenAPI specification. The documentation includes code snippets for using the API from multiple programming languages.

The API’s auto-generated documentation website provides descriptions of the API’s methods and resources as well as code snippets in multiple languages including JavaScript, Python, and Java.

Figure 3 – Auto-generated API documentation

To view the generated API client code artifact, open the Console and navigate to AWS CodeArtifact. The following figure shows the generated API client artifact that has been published to CodeArtifact.

The CodeArtifact service user interface in the Console shows the different versions of the API’s auto-generated client libraries.

Figure 4 – API client artifact in CodeArtifact

Cleaning up

  1. From the command change to the  cdk directory and remove the API stack in the DEV stage (enter “Y” when prompted):
cd cdk
cdk destroy OpenAPIBlogPipeline/DEV/OpenAPIBlogAPI
  1. Once this has completed, delete the pipeline stack:
cdk destroy OpenAPIBlogPipeline
  1. Delete the S3 bucket created to support pipeline operations. Open the Console and navigate to Amazon S3. Delete buckets with the prefix  openapiblogpipeline .


This post demonstrates the use of the AWS CDK to deploy a RESTful API defined by the OpenAPI/Swagger specification. Furthermore, this post describes how to use the AWS CDK to auto-generate API documentation, publish this documentation to a web site hosted on Amazon S3, auto-generate API client libraries or SDKs, and publish these artifacts to an Apache Maven repository hosted on CodeArtifact.

The solution described in this post can be improved by:

  • Building and pushing the API documentation Docker image to Amazon ECR, and then using this image in CodePipeline API pipelines.
  • Creating stages for different environments such as TEST, PREPROD, and PROD.
  • Adding integration testing actions to make sure that the API Deployment is working correctly.
  • Adding Manual approval actions for that are executed before deploying the API to PROD.
  • Using CodeBuild caching of artifacts including Docker images and libraries.

About the author:

Luke Popplewell

Luke Popplewell works primarily with federal entities in the Australian Government. In his role as an architect, Luke uses his knowledge and experience to help organisations reach their goals on the AWS cloud. Luke has a keen interest in serverless technology, modernization, DevOps and event-driven architectures.

Choosing a CI/CD approach: AWS Services with BigHat Biosciences

Post Syndicated from Mike Apted original https://aws.amazon.com/blogs/devops/choosing-ci-cd-aws-services-bighat-biosciences/

Founded in 2019, BigHat Biosciences’ mission is to improve human health by reimagining antibody discovery and engineering to create better antibodies faster. Their integrated computational + experimental approach speeds up antibody design and discovery by combining high-speed molecular characterization with machine learning technologies to guide the search for better antibodies. They apply these design capabilities to develop new generations of safer and more effective treatments for patients suffering from today’s most challenging diseases. Their platform, from wet lab robots to cloud-based data and logistics plane, is woven together with rapidly changing BigHat-proprietary software. BigHat uses continuous integration and continuous deployment (CI/CD) throughout their data engineering workflows and when training and evaluating their machine learning (ML) models.


BigHat Biosciences Logo


In a previous post, we discussed the key considerations when choosing a CI/CD approach. In this post, we explore BigHat’s decisions and motivations in adopting managed AWS CI/CD services. You may find that your organization has commonalities with BigHat and some of their insights may apply to you. Throughout the post, considerations are informed and choices are guided by the best practices in the AWS Well-Architected Framework.

How did BigHat decide what they needed?

Making decisions on appropriate (CI/CD) solutions requires understanding the characteristics of your organization, the environment you operate in, and your current priorities and goals.

“As a company designing therapeutics for patients rather than software, the role of technology at BigHat is to enable a radically better approach to develop therapeutic molecules,” says Eddie Abrams, VP of Engineering at BigHat. “We need to automate as much as possible. We need the speed, agility, reliability and reproducibility of fully automated infrastructure to enable our company to solve complex problems with maximum scientific rigor while integrating best in class data analysis. Our engineering-first approach supports that.”

BigHat possesses a unique insight to an unsolved problem. As an early stage startup, their core focus is optimizing the fully integrated platform that they built from the ground-up to guide the design for better molecules. They respond to feedback from partners and learn from their own internal experimentation. With each iteration, the quality of what they’re creating improves, and they gain greater insight and improved models to support the next iteration. More than anything, they need to be able to iterate rapidly. They don’t need any additional complexity that would distract from their mission. They need uncomplicated and enabling solutions.

They also have to take into consideration the regulatory requirements that apply to them as a company, the data they work with and its security requirements; and the market segment they compete in. Although they don’t control these factors, they can control how they respond to them, and they want to be able to respond quickly. It’s not only speed that matters in designing for security and compliance, but also visibility and track-ability. These often overlooked and critical considerations are instrumental in choosing a CI/CD strategy and platform.

“The ability to learn faster than your competitors may be the only sustainable competitive advantage,” says Cindy Alvarez in her book Lean Customer Development.

The tighter the feedback loop, the easier it is to make a change. Rapid iteration allows BigHat to easily build upon what works, and make adjustments as they identify avenues that won’t lead to success.

Feature set

CI/CD is applicable to more than just the traditional use case. It doesn’t have to be software delivered in a classic fashion. In the case of BigHat, they apply CI/CD in their data engineering workflows and in training their ML models. BigHat uses automated solutions in all aspects of their workflow. Automation further supports taking what they have created internally and enabling advances in antibody design and development for safer, more effective treatments of conditions.

“We see a broadening of the notion of what can come under CI/CD,” says Abrams. “We use automated solutions wherever possible including robotics to perform scaled assays. The goal in tightening the loop is to improve precision and speed, and reduce latency and lag time.”

BigHat reached the conclusion that they would adopt managed service offerings wherever possible, including in their CI/CD tooling and other automation initiatives.

“The phrase ‘undifferentiated heavy lifting’ has always resonated,” says Abrams. “Building, scaling, and operating core software and infrastructure are hard problems, but solving them isn’t itself a differentiating advantage for a therapeutics company. But whether we can automate that infrastructure, and how we can use that infrastructure at scale on a rock solid control plane to provide our custom solutions iteratively, reliably and efficiently absolutely does give us an edge. We need an end-to-end, complete infrastructure solution that doesn’t force us to integrate a patchwork of solutions ourselves. AWS provides exactly what we need in this regard.”

Reducing risk

Startups can be full of risk, with the upside being potential future reward. They face risk in finding the right problem, in finding a solution to that problem, and in finding a viable customer base to buy that solution.

A key priority for early stage startups is removing risk from as many areas of the business as possible. Any steps an early stage startup can take to remove risk without commensurately limiting reward makes them more viable. The more risk a startup can drive out of their hypothesis the more likely their success, in part because they’re more attractive to customers, employees, and investors alike. The more likely their product solves their problem, the more willing a customer is to give it a chance. Likewise, the more attractive they are to investors when compared to alternative startups with greater risk in reaching their next major milestone.

Adoption of managed services for CI/CD accomplishes this goal in several ways. The most important advantage remains speed. The core functionality required can be stood up very quickly, as it’s an existing service. Customers have a large body of reference examples and documentation available to demonstrate how to use that service. They also insulate teams from the need to configure and then operate the underlying infrastructure. The team remains focused on their differentiation and their core value proposition.

“We are automated right up to the organizational level and because of this, running those services ourselves represents operational risk,” says Abrams. “The largest day-to-day infrastructure risk to us is having the business stalled while something is not working. Do I want to operate these services, and focus my staff on that? There is no guarantee I can just throw more compute at a self-managed software service I’m running and make it scale effectively. There is no guarantee that if one datacenter is having a network or electrical problem that I can simply switch to another datacenter. I prefer AWS manages those scale and uptime problems.”

Embracing an opinionated model

BigHat is a startup with a singular focus on using ML to reduce the time and difficulty of designing antibodies and other therapeutic proteins. By adopting managed services, they have removed the burden of implementing and maintaining CI/CD systems.

Accepting the opinionated guardrails of the managed service approach allows, and to a degree reinforces, the focus on what makes a startup unique. Rather than being focused on performance tuning, making decisions on what OS version to use, or which of the myriad optional puzzle pieces to put together, they can use a well-integrated set of tools built to work with each other in a defined fashion.

The opinionated model means best practices are baked into the toolchain. Instead of hiring for specialized administration skills they’re hiring for specialized biotech skills.

“The only degrees of freedom I care about are the ones that improve our technologies and reduce the time, cost, and risk of bringing a therapeutic to market,” says Abrams. “We focus on exactly where we can gain operational advantages by simply adopting managed services that already embrace the Well-Architected Framework. If we had to tackle all of these engineering needs with limited resources, we would be spending into a solved problem. Before AWS, startups just didn’t do these sorts of things very well. Offloading this effort to a trusted partner is pretty liberating.”

Beyond the reduction in operational concerns, BigHat can also expect continuous improvement of that service over time to be delivered automatically by the provider. For their use case they will likely derive more benefit for less cost over time without any investment required.

Overview of solution

BigHat uses the following key services:


BigHat Reference Architecture


Managed services are supported, owned and operated by the provider . This allows BigHat to leave concerns like patching and security of the underlying infrastructure and services to the provider. BigHat continues to maintain ownership in the shared responsibility model, but their scope of concern is significantly narrowed. The surface area the’re responsible for is reduced, helping to minimize risk. Choosing a partner with best in class observability, tracking, compliance and auditing tools is critical to any company that manages sensitive data.

Cost advantages

A startup must also make strategic decisions about where to deploy the capital they have raised from their investors. The vendor managed services bring a model focused on consumption, and allow the startup to make decisions about where they want to spend. This is often referred to as an operational expense (OpEx) model, in other words “pay as you go”, like a utility. This is in contrast to a large upfront investment in both time and capital to build these tools. The lack of need for extensive engineering efforts to stand up these tools, and continued investment to evolve them, acts as a form of capital expenditure (CapEx) avoidance. Startups can allocate their capital where it matters most for them.

“This is corporate-level changing stuff,” says Abrams. “We engage in a weekly leadership review of cost budgets. Operationally I can set the spending knob where I want it monthly, weekly or even daily, and avoid the risks involved in traditional capacity provisioning.”

The right tool for the right time

A key consideration for BigHat was the ability to extend the provider managed tools, where needed, to incorporate extended functionality from the ecosystem. This allows for additional functionality that isn’t covered by the core managed services, while maintaining a focus on their product development versus operating these tools.

Startups must also ask themselves what they need now, versus what they need in the future. As their needs change and grow, they can augment, extend, and replace the tools they have chosen to meet the new requirements. Starting with a vendor-managed service is not a one-way door; it’s an opportunity to defer investment in building and operating these capabilities yourself until that investment is justified. The time to value in using managed services initially doesn’t leave a startup with a sunk cost that limits future options.

“You have to think about the degree you want to adopt a hybrid model for the services you run. Today we aren’t running any software or services that require us to run our own compute instances. It’s very rare we run into something that is hard to do using just the services AWS already provides. Where our needs are not met, we can communicate them to AWS and we can choose to wait for them on their roadmap, which we have done in several cases, or we can elect to do it ourselves,” says Abrams. “This freedom to tweak and expand our service model at will is incomparably liberating.”


BigHat Biosciences was able to make an informed decision by considering the priorities of the business at this stage of its lifecycle. They adopted and embraced opinionated and service provider-managed tooling, which allowed them to inherit a largely best practice set of technology and practices, de-risk their operations, and focus on product velocity and customer feedback. This maintains future flexibility, which delivers significantly more value to the business in its current stage.

“We believe that the underlying engineering, the underlying automation story, is an advantage that applies to every aspect of what we do for our customers,” says Abrams. “By taking those advantages into every aspect of the business, we deliver on operations in a way that provides a competitive advantage a lot of other companies miss by not thinking about it this way.”

About the authors

Mike is a Principal Solutions Architect with the Startup Team at Amazon Web Services. He is a former founder, current mentor, and enjoys helping startups live their best cloud life.




Sean is a Senior Startup Solutions Architect at AWS. Before AWS, he was Director of Scientific Computing at the Howard Hughes Medical Institute.

Using AWS DevOps Tools to model and provision AWS Glue workflows

Post Syndicated from Nuatu Tseggai original https://aws.amazon.com/blogs/devops/provision-codepipeline-glue-workflows/

This post provides a step-by-step guide on how to model and provision AWS Glue workflows utilizing a DevOps principle known as infrastructure as code (IaC) that emphasizes the use of templates, source control, and automation. The cloud resources in this solution are defined within AWS CloudFormation templates and provisioned with automation features provided by AWS CodePipeline and AWS CodeBuild. These AWS DevOps tools are flexible, interchangeable, and well suited for automating the deployment of AWS Glue workflows into different environments such as dev, test, and production, which typically reside in separate AWS accounts and Regions.

AWS Glue workflows allow you to manage dependencies between multiple components that interoperate within an end-to-end ETL data pipeline by grouping together a set of related jobs, crawlers, and triggers into one logical run unit. Many customers using AWS Glue workflows start by defining the pipeline using the AWS Management Console and then move on to monitoring and troubleshooting using either the console, AWS APIs, or the AWS Command Line Interface (AWS CLI).

Solution overview

The solution uses COVID-19 datasets. For more information on these datasets, see the public data lake for analysis of COVID-19 data, which contains a centralized repository of freely available and up-to-date curated datasets made available by the AWS Data Lake team.

Because the primary focus of this solution showcases how to model and provision AWS Glue workflows using AWS CloudFormation and CodePipeline, we don’t spend much time describing intricate transform capabilities that can be performed in AWS Glue jobs. As shown in the Python scripts, the business logic is optimized for readability and extensibility so you can easily home in on the functions that aggregate data based on monthly and quarterly time periods.

The ETL pipeline reads the source COVID-19 datasets directly and writes only the aggregated data to your S3 bucket.

The solution exposes the datasets in the following tables:

Table Name Description Dataset location Provider
countrycode Lookup table for country codes s3://covid19-lake/static-datasets/csv/countrycode/ Rearc
countypopulation Lookup table for the population of each county s3://covid19-lake/static-datasets/csv/CountyPopulation/ Rearc
state_abv Lookup table for US state abbreviations s3://covid19-lake/static-datasets/json/state-abv/ Rearc
rearc_covid_19_nyt_data_in_usa_us_counties Data on COVID-19 cases at US county level s3://covid19-lake/rearc-covid-19-nyt-data-in-usa/csv/us-counties/ Rearc
rearc_covid_19_nyt_data_in_usa_us_states Data on COVID-19 cases at US state level s3://covid19-lake/rearc-covid-19-nyt-data-in-usa/csv/us-states/ Rearc
rearc_covid_19_testing_data_states_daily Data on COVID-19 cases at US state level s3://covid19-lake/rearc-covid-19-testing-data/csv/states_daily/ Rearc
rearc_covid_19_testing_data_us_daily US total test daily trend s3://covid19-lake/rearc-covid-19-testing-data/csv/us_daily/ Rearc
rearc_covid_19_testing_data_us_total_latest US total tests s3://covid19-lake/rearc-covid-19-testing-data/csv/us-total-latest/ Rearc
rearc_covid_19_world_cases_deaths_testing World total tests s3://covid19-lake/rearc-covid-19-world-cases-deaths-testing/ Rearc
rearc_usa_hospital_beds Hospital beds and their utilization in the US s3://covid19-lake/rearc-usa-hospital-beds/ Rearc
world_cases_deaths_aggregates Monthly and quarterly aggregate of the world s3://<your-S3-bucket-name>/covid19/world-cases-deaths-aggregates/ Aggregate


This post assumes you have the following:

  • Access to an AWS account
  • The AWS CLI (optional)
  • Permissions to create a CloudFormation stack
  • Permissions to create AWS resources, such as AWS Identity and Access Management (IAM) roles, Amazon Simple Storage Service (Amazon S3) buckets, and various other resources
  • General familiarity with AWS Glue resources (triggers, crawlers, and jobs)


The CloudFormation template glue-workflow-stack.yml defines all the AWS Glue resources shown in the following diagram.

architecture diagram showing ETL process

Figure: AWS Glue workflow architecture diagram

Modeling the AWS Glue workflow using AWS CloudFormation

Let’s start by exploring the template used to model the AWS Glue workflow: glue-workflow-stack.yml

We focus on two resources in the following snippet:

  • AWS::Glue::Workflow
  • AWS::Glue::Trigger

From a logical perspective, a workflow contains one or more triggers that are responsible for invoking crawlers and jobs. Building a workflow starts with defining the crawlers and jobs as resources within the template and then associating it with triggers.

Defining the workflow

This is where the definition of the workflow starts. In the following snippet, we specify the type as AWS::Glue::Workflow and the property Name as a reference to the parameter GlueWorkflowName.

    Type: String
    Description: Glue workflow that tracks all triggers, jobs, crawlers as a single entity
    Default: Covid_19

    Type: AWS::Glue::Workflow
      Description: Glue workflow that tracks specified triggers, jobs, and crawlers as a single entity
      Name: !Ref GlueWorkflowName

Defining the triggers

This is where we define each trigger and associate it with the workflow. In the following snippet, we specify the property WorkflowName on each trigger as a reference to the logical ID Covid19Workflow.

These triggers allow us to create a chain of dependent jobs and crawlers as specified by the properties Actions and Predicate.

The trigger t_Start utilizes a type of SCHEDULED, which means that it starts at a defined time (in our case, one time a day at 8:00 AM UTC). Every time it runs, it starts the job with the logical ID Covid19WorkflowStarted.

The trigger t_GroupA utilizes a type of CONDITIONAL, which means that it starts when the resources specified within the property Predicate have reached a specific state (when the list of Conditions specified equals SUCCEEDED). Every time t_GroupA runs, it starts the crawlers with the logical ID’s CountyPopulation and Countrycode, per the Actions property containing a list of actions.

    Type: AWS::Glue::Trigger
      Name: t_Start
      Type: SCHEDULED
      Schedule: cron(0 8 * * ? *) # Runs once a day at 8 AM UTC
      StartOnCreation: true
      WorkflowName: !Ref GlueWorkflowName
        - JobName: !Ref Covid19WorkflowStarted

    Type: AWS::Glue::Trigger
      Name: t_GroupA
      StartOnCreation: true
      WorkflowName: !Ref GlueWorkflowName
        - CrawlerName: !Ref CountyPopulation
        - CrawlerName: !Ref Countrycode
          - JobName: !Ref Covid19WorkflowStarted
            LogicalOperator: EQUALS
            State: SUCCEEDED

Provisioning the AWS Glue workflow using CodePipeline

Now let’s explore the template used to provision the CodePipeline resources: codepipeline-stack.yml

This template defines an S3 bucket that is used as the source action for the pipeline. Any time source code is uploaded to a specified bucket, AWS CloudTrail logs the event, which is detected by an Amazon CloudWatch Events rule configured to start running the pipeline in CodePipeline. The pipeline orchestrates CodeBuild to get the source code and provision the workflow.

For more information on any of the available source actions that you can use with CodePipeline, such as Amazon S3, AWS CodeCommit, Amazon Elastic Container Registry (Amazon ECR), GitHub, GitHub Enterprise Server, GitHub Enterprise Cloud, or Bitbucket, see Start a pipeline execution in CodePipeline.

We start by deploying the stack that sets up the CodePipeline resources. This stack can be deployed in any Region where CodePipeline and AWS Glue are available. For more information, see AWS Regional Services.

Cloning the GitHub repo

Clone the GitHub repo with the following command:

$ git clone https://github.com/aws-samples/provision-codepipeline-glue-workflows.git

Deploying the CodePipeline stack

Deploy the CodePipeline stack with the following command:

$ aws cloudformation deploy \
--stack-name codepipeline-covid19 \
--template-file cloudformation/codepipeline-stack.yml \
--capabilities CAPABILITY_NAMED_IAM \
--no-fail-on-empty-changeset \
--region <AWS_REGION>

When the deployment is complete, you can view the pipeline that was provisioned on the CodePipeline console.

CodePipeline console showing the deploy pipeline in failed state

Figure: CodePipeline console

The preceding screenshot shows that the pipeline failed. This is because we haven’t uploaded the source code yet.

In the following steps, we zip and upload the source code, which triggers another (successful) run of the pipeline.

Zipping the source code

Zip the source code containing Glue scripts, CloudFormation templates, and Buildspecs file with the following command:

$ zip -r source.zip . -x images/\* *.history* *.git* *.DS_Store*

You can omit *.DS_Store* from the preceding command if you are not a Mac user.

Uploading the source code

Upload the source code with the following command:

$ aws s3 cp source.zip s3://covid19-codepipeline-source-<AWS_ACCOUNT_ID>-<AWS_REGION>

Make sure to provide your account ID and Region in the preceding command. For example, if your AWS account ID is 111111111111 and you’re using Region us-west-2, use the following command:

$ aws s3 cp source.zip s3://covid19-codepipeline-source-111111111111-us-west-2

Now that the source code has been uploaded, view the pipeline again to see it in action.

CodePipeline console showing the deploy pipeline in success state

Figure: CodePipeline console displaying stage “Deploy” in-progress

Choose Details within the Deploy stage to see the build logs.

CodeBuild console displaying build logs

Figure: CodeBuild console displaying build logs

To modify any of the commands that run within the Deploy stage, feel free to modify: deploy-glue-workflow-stack.yml

Try uploading the source code a few more times. Each time it’s uploaded, CodePipeline starts and runs another deploy of the workflow stack. If nothing has changed in the source code, AWS CloudFormation automatically determines that the stack is already up to date. If something has changed in the source code, AWS CloudFormation automatically determines that the stack needs to be updated and proceeds to run the change set.

Viewing the provisioned workflow, triggers, jobs, and crawlers

To view your workflows on the AWS Glue console, in the navigation pane, under ETL, choose Workflows.

Glue console showing workflows

Figure: Navigate to Workflows

To view your triggers, in the navigation pane, under ETL, choose Triggers.

Glue console showing triggers

Figure: Navigate to Triggers

To view your crawlers, under Data Catalog, choose Crawlers.

Glue console showing crawlers

Figure: Navigate to Crawlers

To view your jobs, under ETL, choose Jobs.

Glue console showing jobs

Figure: Navigate to Jobs

Running the workflow

The workflow runs automatically at 8:00 AM UTC. To start the workflow manually, you can use either the AWS CLI or the AWS Glue console.

To start the workflow with the AWS CLI, enter the following command:

$ aws glue start-workflow-run --name Covid_19 --region <AWS_REGION>

To start the workflow on the AWS Glue console, on the Workflows page, select your workflow and choose Run on the Actions menu.

Glue console run workflow

Figure: AWS Glue console start workflow run

To view the run details of the workflow, choose the workflow on the AWS Glue console and choose View run details on the History tab.

Glue console view run details of a workflow

Figure: View run details

The following screenshot shows a visual representation of the workflow as a graph with your run details.

Glue console showing visual representation of the workflow as a graph.

Figure: AWS Glue console displaying details of successful workflow run

Cleaning up

To avoid additional charges, delete the stack created by the CloudFormation template and the contents of the buckets you created.

1. Delete the contents of the covid19-dataset bucket with the following command:

$ aws s3 rm s3://covid19-dataset-<AWS_ACCOUNT_ID>-<AWS_REGION> --recursive

2. Delete your workflow stack with the following command:

$ aws cloudformation delete-stack --stack-name glue-covid19 --region <AWS_REGION>

To delete the contents of the covid19-codepipeline-source bucket, it’s simplest to use the Amazon S3 console because it makes it easy to delete multiple versions of the object at once.

3. Navigate to the S3 bucket named covid19-codepipeline-source-<AWS_ACCOUNT_ID>- <AWS_REGION>.

4. Choose List versions.

5. Select all the files to delete.

6. Choose Delete and follow the prompts to permanently delete all the objects.

S3 console delete all object versions

Figure: AWS S3 console delete all object versions

7. Delete the contents of the covid19-codepipeline-artifacts bucket:

$ aws s3 rm s3://covid19-codepipeline-artifacts-<AWS_ACCOUNT_ID>-<AWS-REGION> --recursive

8. Delete the contents of the covid19-cloudtrail-logs bucket:

$ aws s3 rm s3://covid19-cloudtrail-logs-<AWS_ACCOUNT_ID>-<AWS-REGION> --recursive

9. Delete the pipeline stack:

$ aws cloudformation delete-stack --stack-name codepipeline-covid19 --region <AWS-REGION>


In this post, we stepped through how to use AWS DevOps tooling to model and provision an AWS Glue workflow that orchestrates an end-to-end ETL pipeline on a real-world dataset.

You can download the source code and template from this Github repository and adapt it as you see fit for your data pipeline use cases. Feel free to leave comments letting us know about the architectures you build for your environment. To learn more about building ETL pipelines with AWS Glue, see the AWS Glue Developer Guide and the AWS Data Analytics learning path.

About the Authors

Nuatu Tseggai

Nuatu Tseggai is a Cloud Infrastructure Architect at Amazon Web Services. He enjoys working with customers to design and build event-driven distributed systems that span multiple services.

Suvojit Dasgupta

Suvojit Dasgupta is a Sr. Customer Data Architect at Amazon Web Services. He works with customers to design and build complex data solutions on AWS.

Easily configure Amazon DevOps Guru across multiple accounts and Regions using AWS CloudFormation StackSets

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/configure-devops-guru-multiple-accounts-regions-using-cfn-stacksets/

As applications become increasingly distributed and complex, operators need more automated practices to maintain application availability and reduce the time and effort spent on detecting, debugging, and resolving operational issues.

Enter Amazon DevOps Guru (preview).

Amazon DevOps Guru is a machine learning (ML) powered service that gives you a simpler way to improve an application’s availability and reduce expensive downtime. Without involving any complex configuration setup, DevOps Guru automatically ingests operational data in your AWS Cloud. When DevOps Guru identifies a critical issue, it automatically alerts you with a summary of related anomalies, the likely root cause, and context on when and where the issue occurred. DevOps Guru also, when possible, provides prescriptive recommendations on how to remediate the issue.

Using Amazon DevOps Guru is easy and doesn’t require you to have any ML expertise. To get started, you need to configure DevOps Guru and specify which AWS resources to analyze. If your applications are distributed across multiple AWS accounts and AWS Regions, you need to configure DevOps Guru for each account-Region combination. Though this may sound complex, it’s in fact very simple to do so using AWS CloudFormation StackSets. This post walks you through the steps to configure DevOps Guru across multiple AWS accounts or organizational units, using AWS CloudFormation StackSets.


Solution overview

The goal of this post is to provide you with sample templates to facilitate onboarding Amazon DevOps Guru across multiple AWS accounts. Instead of logging into each account and enabling DevOps Guru, you use AWS CloudFormation StackSets from the primary account to enable DevOps Guru across multiple accounts in a single AWS CloudFormation operation. When it’s enabled, DevOps Guru monitors your associated resources and provides you with detailed insights for anomalous behavior along with intelligent recommendations to mitigate and incorporate preventive measures.

We consider various options in this post for enabling Amazon DevOps Guru across multiple accounts and Regions:

  • All resources across multiple accounts and Regions
  • Resources from specific CloudFormation stacks across multiple accounts and Regions
  • For All resources in an organizational unit

In the following diagram, we launch the AWS CloudFormation StackSet from a primary account to enable Amazon DevOps Guru across two AWS accounts and carry out operations to generate insights. The StackSet uses a single CloudFormation template to configure DevOps Guru, and deploys it across multiple accounts and regions, as specified in the command.

Figure: Shows enabling of DevOps Guru using CloudFormation StackSets

Figure: Shows enabling of DevOps Guru using CloudFormation StackSets

When Amazon DevOps Guru is enabled to monitor your resources within the account, it uses a combination of vended Amazon CloudWatch metrics, AWS CloudTrail logs, and specific patterns from its ML models to detect an anomaly. When the anomaly is detected, it generates an insight with the recommendations.

Figure: Shows DevOps Guru generating Insights based upon ingested metrics

Figure: Shows DevOps Guru monitoring the resources and generating insights for anomalies detected



To complete this post, you should have the following prerequisites:

  • Two AWS accounts. For this post, we use the account numbers 111111111111 (primary account) and 222222222222. We will carry out the CloudFormation operations and monitoring of the stacks from this primary account.
  • To use organizations instead of individual accounts, identify the organization unit (OU) ID that contains at least one AWS account.
  • Access to a bash environment, either using an AWS Cloud9 environment or your local terminal with the AWS Command Line Interface (AWS CLI) installed.
  • AWS Identity and Access Management (IAM) roles for AWS CloudFormation StackSets.
  • Knowledge of CloudFormation StackSets


(a) Using an AWS Cloud9 environment or AWS CLI terminal
We recommend using AWS Cloud9 to create an environment to get access to the AWS CLI from a bash terminal. Make sure you select Linux2 as the operating system for the AWS Cloud9 environment.

Alternatively, you may use your bash terminal in your favorite IDE and configure your AWS credentials in your terminal.

(b) Creating IAM roles

If you are using Organizations for account management, you would not need to create the IAM roles manually and instead use Organization based trusted access and SLRs. You may skip the sections (b), (c) and (d). If not using Organizations, please read further.

Before you can deploy AWS CloudFormation StackSets, you must have the following IAM roles:

  • AWSCloudFormationStackSetAdministrationRole
  • AWSCloudFormationStackSetExecutionRole

The IAM role AWSCloudFormationStackSetAdministrationRole should be created in the primary account whereas AWSCloudFormationStackSetExecutionRole role should be created in all the accounts where you would like to run the StackSets.

If you’re already using AWS CloudFormation StackSets, you should already have these roles in place. If not, complete the following steps to provision these roles.

(c) Creating the AWSCloudFormationStackSetAdministrationRole role
To create the AWSCloudFormationStackSetAdministrationRole role, sign in to your primary AWS account and go to the AWS Cloud9 terminal.

Execute the following command to download the file:

curl -O https://s3.amazonaws.com/cloudformation-stackset-sample-templates-us-east-1/AWSCloudFormationStackSetAdministrationRole.yml

Execute the following command to create the stack:

aws cloudformation create-stack \
--stack-name AdminRole \
--template-body file:///$PWD/AWSCloudFormationStackSetAdministrationRole.yml \
--capabilities CAPABILITY_NAMED_IAM \
--region us-east-1

(d) Creating the AWSCloudFormationStackSetExecutionRole role
You now create the role AWSCloudFormationStackSetExecutionRole in the primary account and other target accounts where you want to enable DevOps Guru. For this post, we create it for our two accounts and two Regions (us-east-1 and us-east-2).

Execute the following command to download the file:

curl -O https://s3.amazonaws.com/cloudformation-stackset-sample-templates-us-east-1/AWSCloudFormationStackSetExecutionRole.yml

Execute the following command to create the stack:

aws cloudformation create-stack \
--stack-name ExecutionRole \
--template-body file:///$PWD/AWSCloudFormationStackSetExecutionRole.yml \
--parameters ParameterKey=AdministratorAccountId,ParameterValue=111111111111 \
--capabilities CAPABILITY_NAMED_IAM \
--region us-east-1

Now that the roles are provisioned, you can use AWS CloudFormation StackSets in the next section.


Running AWS CloudFormation StackSets to enable DevOps Guru

With the required IAM roles in place, now you can deploy the stack sets to enable DevOps Guru across multiple accounts.

As a first step, go to your bash terminal and clone the GitHub repository to access the CloudFormation templates:

git clone https://github.com/aws-samples/amazon-devopsguru-samples
cd amazon-devopsguru-samples/enable-devopsguru-stacksets


(a) Configuring Amazon SNS topics for DevOps Guru to send notifications for operational insights

If you want to receive notifications for operational insights generated by Amazon DevOps Guru, you need to configure an Amazon Simple Notification Service (Amazon SNS) topic across multiple accounts. If you have already configured SNS topics and want to use them, identify the topic name and directly skip to the step to enable DevOps Guru.

Note for Central notification target: You may prefer to configure an SNS Topic in the central AWS account so that all Insight notifications are sent to a single target. In such a case, you would need to modify the central account SNS topic policy to allow other accounts to send notifications.

To create your stack set, enter the following command (provide an email for receiving insights):

aws cloudformation create-stack-set \
--stack-set-name CreateDevOpsGuruTopic \
--template-body file:///$PWD/CreateSNSTopic.yml \
--parameters ParameterKey=EmailAddress,ParameterValue=<[email protected]> \
--region us-east-1

Instantiate AWS CloudFormation StackSets instances across multiple accounts and multiple Regions (provide your account numbers and Regions as needed):

aws cloudformation create-stack-instances \
--stack-set-name CreateDevOpsGuruTopic \
--accounts '["111111111111","222222222222"]' \
--regions '["us-east-1","us-east-2"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

After running this command, the SNS topic devops-guru is created across both the accounts. Go to the email address specified and confirm the subscription by clicking the Confirm subscription link in each of the emails that you receive. Your SNS topic is now fully configured for DevOps Guru to use.

Figure: Shows creation of SNS topic to receive insights from DevOps Guru

Figure: Shows creation of SNS topic to receive insights from DevOps Guru


(b) Enabling DevOps Guru

Let us first examine the CloudFormation template format to enable DevOps Guru and configure it to send notifications over SNS topics. See the following code snippet:

    Type: AWS::DevOpsGuru::ResourceCollection
          StackNames: *

    Type: AWS::DevOpsGuru::NotificationChannel
          TopicArn: arn:aws:sns:us-east-1:111111111111:SnsTopic


When the StackNames property is fed with a value of *, it enables DevOps Guru for all CloudFormation stacks. However, you can enable DevOps Guru for only specific CloudFormation stacks by providing the desired stack names as shown in the following code:


    Type: AWS::DevOpsGuru::ResourceCollection
          - StackA
          - StackB


For the CloudFormation template in this post, we provide the names of the stacks using the parameter inputs. To enable the AWS CLI to accept a list of inputs, we need to configure the input type as CommaDelimitedList, instead of a base string. We also provide the parameter SnsTopicName, which the template substitutes into the TopicArn property.

See the following code:

AWSTemplateFormatVersion: 2010-09-09
Description: Enable Amazon DevOps Guru

    Type: CommaDelimitedList
    Description: Comma separated names of the CloudFormation Stacks for DevOps Guru to analyze.
    Default: "*"

    Type: String
    Description: Name of SNS Topic

    Type: AWS::DevOpsGuru::ResourceCollection
          StackNames: !Ref CfnStackNames

    Type: AWS::DevOpsGuru::NotificationChannel
          TopicArn: !Sub arn:aws:sns:${AWS::Region}:${AWS::AccountId}:${SnsTopicName}


Now that we reviewed the CloudFormation syntax, we will use this template to implement the solution. For this post, we will consider three use cases for enabling Amazon DevOps Guru:

(i) For all resources across multiple accounts and Regions

(ii) For all resources from specific CloudFormation stacks across multiple accounts and Regions

(iii) For all resources in an organization

Let us review each of the above points in detail.

(i) Enabling DevOps Guru for all resources across multiple accounts and Regions

Note: Carry out the following steps in your primary AWS account.

You can use the CloudFormation template (EnableDevOpsGuruForAccount.yml) from the current directory, create a stack set, and then instantiate AWS CloudFormation StackSets instances across desired accounts and Regions.

The following command creates the stack set:

aws cloudformation create-stack-set \
--stack-set-name EnableDevOpsGuruForAccount \
--template-body file:///$PWD/EnableDevOpsGuruForAccount.yml \
--parameters ParameterKey=CfnStackNames,ParameterValue=* \
ParameterKey=SnsTopicName,ParameterValue=devops-guru \
--region us-east-1

The following command instantiates AWS CloudFormation StackSets instances across multiple accounts and Regions:

aws cloudformation create-stack-instances \
--stack-set-name EnableDevOpsGuruForAccount \
--accounts '["111111111111","222222222222"]' \
--regions '["us-east-1","us-east-2"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1


The following screenshot of the AWS CloudFormation console in the primary account running StackSet, shows the stack set deployed in both accounts.

Figure: Screenshot for deployed StackSet and Stack instances

Figure: Screenshot for deployed StackSet and Stack instances


The following screenshot of the Amazon DevOps Guru console shows DevOps Guru is enabled to monitor all CloudFormation stacks.

Figure: Screenshot of DevOps Guru dashboard showing DevOps Guru enabled for all CloudFormation stacks

Figure: Screenshot of DevOps Guru dashboard showing DevOps Guru enabled for all CloudFormation stacks


(ii) Enabling DevOps Guru for specific CloudFormation stacks for individual accounts

Note: Carry out the following steps in your primary AWS account

In this use case, we want to enable Amazon DevOps Guru only for specific CloudFormation stacks for individual accounts. We use the AWS CloudFormation StackSets override parameters feature to rerun the stack set with specific values for CloudFormation stack names as parameter inputs. For more information, see Override parameters on stack instances.

If you haven’t created the stack instances for individual accounts, use the create-stack-instances AWS CLI command and pass the parameter overrides. If you have already created stack instances, update the existing stack instances using update-stack-instances and pass the parameter overrides. Replace the required account number, Regions, and stack names as needed.

In account 111111111111, create instances with the parameter override with the following command, where CloudFormation stacks STACK-NAME-1 and STACK-NAME-2 belong to this account in us-east-1 Region:

aws cloudformation create-stack-instances \
--stack-set-name  EnableDevOpsGuruForAccount \
--accounts '["111111111111"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-1>,<STACK-NAME-2>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

Update the instances with the following command:

aws cloudformation update-stack-instances \
--stack-set-name EnableDevOpsGuruForAccount \
--accounts '["111111111111"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-1>,<STACK-NAME-2>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1


In account 222222222222, create instances with the parameter override with the following command, where CloudFormation stacks STACK-NAME-A and STACK-NAME-B belong to this account in the us-east-1 Region:

aws cloudformation create-stack-instances \
--stack-set-name  EnableDevOpsGuruForAccount \
--accounts '["222222222222"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-A>,<STACK-NAME-B>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

Update the instances with the following command:

aws cloudformation update-stack-instances \
--stack-set-name EnableDevOpsGuruForAccount \
--accounts '["222222222222"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-A>,<STACK-NAME-B>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1


The following screenshot of the DevOps Guru console shows that DevOps Guru is enabled for only two CloudFormation stacks.

Figure: Screenshot of DevOps Guru dashboard showing DevOps Guru enabled for only two CloudFormation stacks

Figure: Screenshot of DevOps Guru dashboard with DevOps Guru enabled for two CloudFormation stacks


(iii) Enabling DevOps Guru for all resources in an organization

Note: Carry out the following steps in your primary AWS account

If you’re using AWS Organizations, you can take advantage of the AWS CloudFormation StackSets feature support for Organizations. This way, you don’t need to add or remove stacks when you add or remove accounts from the organization. For more information, see New: Use AWS CloudFormation StackSets for Multiple Accounts in an AWS Organization.

The following example shows the command line using multiple Regions to demonstrate the use. Update the OU as needed. If you need to use additional Regions, you may have to create an SNS topic in those Regions too.

To create a stack set for an OU and across multiple Regions, enter the following command:

aws cloudformation create-stack-set \
--stack-set-name EnableDevOpsGuruForAccount \
--template-body file:///$PWD/EnableDevOpsGuruForAccount.yml \
--parameters ParameterKey=CfnStackNames,ParameterValue=* \
ParameterKey=SnsTopicName,ParameterValue=devops-guru \
--region us-east-1 \
--permission-model SERVICE_MANAGED \
--auto-deployment Enabled=true,RetainStacksOnAccountRemoval=true

Instantiate AWS CloudFormation StackSets instances for an OU and across multiple Regions with the following command:

aws cloudformation create-stack-instances \
--stack-set-name  EnableDevOpsGuruForAccount \
--deployment-targets OrganizationalUnitIds='["<organizational-unit-id>"]' \
--regions '["us-east-1","us-east-2"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

In this way, you can run CloudFormation StackSets to enable and configure DevOps Guru across multiple accounts, Regions, with simple and easy steps.


Reviewing DevOps Guru insights

Amazon DevOps Guru monitors for anomalies in the resources in the CloudFormation stacks that are enabled for monitoring. The following screenshot shows the initial dashboard.

Figure: Screenshot of DevOps Guru dashboard

Figure: Screenshot of DevOps Guru dashboard

On enabling DevOps Guru, it may take up to 24 hours to analyze the resources and baseline the normal behavior. When it detects an anomaly, it highlights the impacted CloudFormation stack, logs insights that provide details about the metrics indicating an anomaly, and prints actionable recommendations to mitigate the anomaly.

Figure: Screenshot of DevOps Guru dashboard showing ongoing reactive insight

Figure: Screenshot of DevOps Guru dashboard showing ongoing reactive insight

The following screenshot shows an example of an insight (which now has been resolved) that was generated for the increased latency for an ELB. The insight provides various sections in which it provides details about the metrics, the graphed anomaly along with the time duration, potential related events, and recommendations to mitigate and implement preventive measures.

Figure: Screenshot for an Insight generated about ELB Latency

Figure: Screenshot for an Insight generated about ELB Latency


Cleaning up

When you’re finished walking through this post, you should clean up or un-provision the resources to avoid incurring any further charges.

  1. On the AWS CloudFormation StackSets console, choose the stack set to delete.
  2. On the Actions menu, choose Delete stacks from StackSets.
  3. After you delete the stacks from individual accounts, delete the stack set by choosing Delete StackSet.
  4. Un-provision the environment for AWS Cloud9.



This post reviewed how to enable Amazon DevOps Guru using AWS CloudFormation StackSets across multiple AWS accounts or organizations to monitor the resources in existing CloudFormation stacks. Upon detecting an anomaly, DevOps Guru generates an insight that includes the vended CloudWatch metric, the CloudFormation stack in which the resource existed, and actionable recommendations.

We hope this post was useful to you to onboard DevOps Guru and that you try using it for your production needs.


About the Authors

Author's profile photo


Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing in the area of DevOps services. He builds technical content for the field enablement and offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality.





Nuatu Tseggai is a Cloud Infrastructure Architect at Amazon Web Services. He enjoys working with customers to design and build event-driven distributed systems that span multiple services.


Integrating AWS CloudFormation Guard into CI/CD pipelines

Post Syndicated from Sergey Voinich original https://aws.amazon.com/blogs/devops/integrating-aws-cloudformation-guard/

In this post, we discuss and build a managed continuous integration and continuous deployment (CI/CD) pipeline that uses AWS CloudFormation Guard to automate and simplify pre-deployment compliance checks of your AWS CloudFormation templates. This enables your teams to define a single source of truth for what constitutes valid infrastructure definitions, to be compliant with your company guidelines and streamline AWS resources’ deployment lifecycle.

We use the following AWS services and open-source tools to set up the pipeline:

Solution overview

The CI/CD workflow includes the following steps:

  1. A code change is committed and pushed to the CodeCommit repository.
  2. CodePipeline automatically triggers a CodeBuild job.
  3. CodeBuild spins up a compute environment and runs the phases specified in the buildspec.yml file:
  4. Clone the code from the CodeCommit repository (CloudFormation template, rule set for CloudFormation Guard, buildspec.yml file).
  5. Clone the code from the CloudFormation Guard repository on GitHub.
  6. Provision the build environment with necessary components (rust, cargo, git, build-essential).
  7. Download CloudFormation Guard release from GitHub.
  8. Run a validation check of the CloudFormation template.
  9. If the validation is successful, pass the control over to CloudFormation and deploy the stack. If the validation fails, stop the build job and print a summary to the build job log.

The following diagram illustrates this workflow.

Architecture Diagram

Architecture Diagram of CI/CD Pipeline with CloudFormation Guard


For this walkthrough, complete the following prerequisites:

Creating your CodeCommit repository

Create your CodeCommit repository by running a create-repository command in the AWS CLI:

aws codecommit create-repository --repository-name cfn-guard-demo --repository-description "CloudFormation Guard Demo"

The following screenshot indicates that the repository has been created.

CodeCommit Repository

CodeCommit Repository has been created

Populating the CodeCommit repository

Populate your repository with the following artifacts:

  1. A buildspec.yml file. Modify the following code as per your requirements:
version: 0.2
    # Definining CloudFormation Teamplate and Ruleset as variables - part of the code repo
    CF_TEMPLATE: "cfn_template_file_example.yaml"
    CF_ORG_RULESET:  "cfn_guard_ruleset_example"
      - apt-get update
      - apt-get install build-essential -y
      - apt-get install cargo -y
      - apt-get install git -y
      - echo "Setting up the environment for AWS CloudFormation Guard"
      - echo "More info https://github.com/aws-cloudformation/cloudformation-guard"
      - echo "Install Rust"
      - curl https://sh.rustup.rs -sSf | sh -s -- -y
       - echo "Pull GA release from github"
       - echo "More info https://github.com/aws-cloudformation/cloudformation-guard/releases"
       - wget https://github.com/aws-cloudformation/cloudformation-guard/releases/download/1.0.0/cfn-guard-linux-1.0.0.tar.gz
       - echo "Extract cfn-guard"
       - tar xvf cfn-guard-linux-1.0.0.tar.gz .
       - echo "Validate CloudFormation template with cfn-guard tool"
       - echo "More information https://github.com/aws-cloudformation/cloudformation-guard/blob/master/cfn-guard/README.md"
       - cfn-guard-linux/cfn-guard check --rule_set $CF_ORG_RULESET --template $CF_TEMPLATE --strict-checks
    - cfn_template_file_example.yaml
  name: guard_templates
  1. An example of a rule set file (cfn_guard_ruleset_example) for CloudFormation Guard. Modify the following code as per your requirements:
#CFN Guard rules set example

#List of multiple references
let allowed_azs = [us-east-1a,us-east-1b]
let allowed_ec2_instance_types = [t2.micro,t3.nano,t3.micro]
let allowed_security_groups = [sg-08bbcxxc21e9ba8e6,sg-07b8bx98795dcab2]

#EC2 Policies
AWS::EC2::Instance AvailabilityZone IN %allowed_azs
AWS::EC2::Instance ImageId == ami-0323c3dd2da7fb37d
AWS::EC2::Instance InstanceType IN %allowed_ec2_instance_types
AWS::EC2::Instance SecurityGroupIds == ["sg-07b8xxxsscab2"]
AWS::EC2::Instance SubnetId == subnet-0407a7casssse558

#EBS Policies
AWS::EC2::Volume AvailabilityZone == us-east-1a
AWS::EC2::Volume Encrypted == true
AWS::EC2::Volume Size == 50 |OR| AWS::EC2::Volume Size == 100
AWS::EC2::Volume VolumeType == gp2
  1. An example of a CloudFormation template file (.yaml). Modify the following code as per your requirements:
AWSTemplateFormatVersion: "2010-09-09"
Description: "EC2 instance with encrypted EBS volume for AWS CloudFormation Guard Testing"


    Type: AWS::EC2::Instance
      ImageId: 'ami-0323c3dd2da7fb37d'
      AvailabilityZone: 'us-east-1a'
      KeyName: "your-ssh-key"
      InstanceType: 't3.micro'
      SubnetId: 'subnet-0407a7xx68410e558'
        - 'sg-07b8b339xx95dcab2'
          Device: '/dev/sdf'
          VolumeId: !Ref EBSVolume
       - Key: Name
         Value: cfn-guard-ec2

   Type: AWS::EC2::Volume
     Size: 100
     AvailabilityZone: 'us-east-1a'
     Encrypted: true
     VolumeType: gp2
       - Key: Name
         Value: cfn-guard-ebs
   DeletionPolicy: Snapshot

    Description: The Instance ID
    Value: !Ref EC2Instance
    Description: The Volume ID
    Value: !Ref  EBSVolume
AWS CodeCommit

Optional CodeCommit Repository Structure

The following screenshot shows a potential CodeCommit repository structure.

Creating a CodeBuild project

Our CodeBuild project orchestrates around CloudFormation Guard and runs validation checks of our CloudFormation templates as a phase of the CI process.

  1. On the CodeBuild console, choose Build projects.
  2. Choose Create build projects.
  3. For Project name, enter your project name.
  4. For Description, enter a description.
AWS CodeBuild

Create CodeBuild Project

  1. For Source provider, choose AWS CodeCommit.
  2. For Repository, choose the CodeCommit repository you created in the previous step.
AWS CodeBuild

Define the source for your CodeBuild Project

To setup CodeBuild environment we will use managed image based on Ubuntu 18.04

  1. For Environment Image, select Managed image.
  2. For Operating system, choose Ubuntu.
  3. For Service role¸ select New service role.
  4. For Role name, enter your service role name.
CodeBuild Environment

Setup the environment, the OS image and other settings for the CodeBuild

  1. Leave the default settings for additional configuration, buildspec, batch configuration, artifacts, and logs.

You can also use CodeBuild with custom build environments to help you optimize billing and improve the build time.

Creating IAM roles and policies

Our CI/CD pipeline needs two AWS Identity and Access Management (IAM) roles to run properly: one role for CodePipeline to work with other resources and services, and one role for AWS CloudFormation to run the deployments that passed the validation check in the CodeBuild phase.

Creating permission policies

Create your permission policies first. The following code is the policy in JSON format for CodePipeline:

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
            "Resource": "*"
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "*",
            "Condition": {
                "StringEqualsIfExists": {
                    "iam:PassedToService": [

To create your policy for CodePipeline, run the following CLI command:

aws iam create-policy --policy-name CodePipeline-Cfn-Guard-Demo --policy-document file://CodePipelineServiceRolePolicy_example.json

Capture the policy ARN that you get in the output to use in the next steps.

The following code is the policy in JSON format for AWS CloudFormation:

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "iam:CreateServiceLinkedRole",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "iam:AWSServiceName": [
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
            "Resource": "*"

Create the policy for AWS CloudFormation by running the following CLI command:

aws iam create-policy --policy-name CloudFormation-Cfn-Guard-Demo --policy-document file://CloudFormationRolePolicy_example.json

Capture the policy ARN that you get in the output to use in the next steps.

Creating roles and trust policies

The following code is the trust policy for CodePipeline in JSON format:

  "Version": "2012-10-17",
  "Statement": [
      "Effect": "Allow",
      "Principal": {
        "Service": "codepipeline.amazonaws.com"
      "Action": "sts:AssumeRole"

Create your role for CodePipeline with the following CLI command:

aws iam create-role --role-name CodePipeline-Cfn-Guard-Demo-Role --assume-role-policy-document file://RoleTrustPolicy_CodePipeline.json

Capture the role name for the next step.

The following code is the trust policy for AWS CloudFormation in JSON format:

  "Version": "2012-10-17",
  "Statement": [
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudformation.amazonaws.com"
      "Action": "sts:AssumeRole"

Create your role for AWS CloudFormation with the following CLI command:

aws iam create-role --role-name CF-Cfn-Guard-Demo-Role --assume-role-policy-document file://RoleTrustPolicy_CloudFormation.json

Capture the role name for the next step.


Finally, attach the permissions policies created in the previous step to the IAM roles you created:

aws iam attach-role-policy --role-name CodePipeline-Cfn-Guard-Demo-Role  --policy-arn "arn:aws:iam::<AWS Account Id >:policy/CodePipeline-Cfn-Guard-Demo"

aws iam attach-role-policy --role-name CF-Cfn-Guard-Demo-Role  --policy-arn "arn:aws:iam::<AWS Account Id>:policy/CloudFormation-Cfn-Guard-Demo"

Creating a pipeline

We can now create our pipeline to assemble all the components into one managed, continuous mechanism.

  1. On the CodePipeline console, choose Pipelines.
  2. Choose Create new pipeline.
  3. For Pipeline name, enter a name.
  4. For Service role, select Existing service role.
  5. For Role ARN, choose the service role you created in the previous step.
  6. Choose Next.
CodePipeline Setup

Setting Up CodePipeline environment

  1. In the Source section, for Source provider, choose AWS CodeCommit.
  2. For Repository name¸ enter your repository name.
  3. For Branch name, choose master.
  4. For Change detection options, select Amazon CloudWatch Events.
  5. Choose Next.
AWS CodePipeline Source

Adding CodeCommit to CodePipeline

  1. In the Build section, for Build provider, choose AWS CodeBuild.
  2. For Project name, choose the CodeBuild project you created.
  3. For Build type, select Single build.
  4. Choose Next.
CodePipeline Build Stage

Adding Build Project to Pipeline Stage

Now we will create a deploy stage in our CodePipeline to deploy CloudFormation templates that passed the CloudFormation Guard inspection in the CI stage.

  1. In the Deploy section, for Deploy provider, choose AWS CloudFormation.
  2. For Action mode¸ choose Create or update stack.
  3. For Stack name, choose any stack name.
  4. For Artifact name, choose BuildArtifact.
  5. For File name, enter the CloudFormation template name in your CodeCommit repository (In case of our demo it is cfn_template_file_example.yaml).
  6. For Role name, choose the role you created earlier for CloudFormation.
CodePipeline - Deploy Stage

Adding deploy stage to CodePipeline

22. In the next step review your selections for the pipeline to be created. The stages and action providers in each stage are shown in the order that they will be created. Click Create pipeline. Our CodePipeline is ready.

Validating the CI/CD pipeline operation

Our CodePipeline has two basic flows and outcomes. If the CloudFormation template complies with our CloudFormation Guard rule set file, the resources in the template deploy successfully (in our use case, we deploy an EC2 instance with an encrypted EBS volume).

CloudFormation Deployed

CloudFormation Console

If our CloudFormation template doesn’t comply with the policies specified in our CloudFormation Guard rule set file, our CodePipeline stops at the CodeBuild step and you see an error in the build job log indicating the resources that are non-compliant:

[EBSVolume] failed because [Encrypted] is [false] and the permitted value is [true]
[EC2Instance] failed because [t3.2xlarge] is not in [t2.micro,t3.nano,t3.micro] for [InstanceType]
Number of failures: 2

Note: To demonstrate the above functionality I changed my CloudFormation template to use unencrypted EBS volume and switched the EC2 instance type to t3.2xlarge which do not adhere to the rules that we specified in the Guard rule set file

Cleaning up

To avoid incurring future charges, delete the resources that we have created during the walkthrough:

  • CloudFormation stack resources that were deployed by the CodePipeline
  • CodePipeline that we have created
  • CodeBuild project
  • CodeCommit repository


In this post, we covered how to integrate CloudFormation Guard into CodePipeline and fully automate pre-deployment compliance checks of your CloudFormation templates. This allows your teams to have an end-to-end automated CI/CD pipeline with minimal operational overhead and stay compliant with your organizational infrastructure policies.