Tag Archives: Infrastructure as Code

Terraform CI/CD and testing on AWS with the new Terraform Test Framework

Post Syndicated from Kevon Mayers original https://aws.amazon.com/blogs/devops/terraform-ci-cd-and-testing-on-aws-with-the-new-terraform-test-framework/

Image of HashiCorp Terraform logo and Amazon Web Services (AWS) Logo. Underneath the AWS Logo are the service logos for AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and Amazon S3. Graphic created by Kevon Mayers

Introduction

Organizations often use Terraform Modules to orchestrate complex resource provisioning and provide a simple interface for developers to enter the required parameters to deploy the desired infrastructure. Modules enable code reuse and provide a method for organizations to standardize deployment of common workloads such as a three-tier web application, a cloud networking environment, or a data analytics pipeline. When building Terraform modules, it is common for the module author to start with manual testing. Manual testing is performed using commands such as terraform validate for syntax validation, terraform plan to preview the execution plan, and terraform apply followed by manual inspection of resource configuration in the AWS Management Console. Manual testing is prone to human error, not scalable, and can result in unintended issues. Because modules are used by multiple teams in the organization, it is important to ensure that any changes to the modules are extensively tested before the release. In this blog post, we will show you how to validate Terraform modules and how to automate the process using a Continuous Integration/Continuous Deployment (CI/CD) pipeline.

Terraform Test

Terraform test is a new testing framework for module authors to perform unit and integration tests for Terraform modules. Terraform test can create infrastructure as declared in the module, run validation against the infrastructure, and destroy the test resources regardless of whether the test passes or fails. Terraform test will also provide warnings if there are any resources that cannot be destroyed. Terraform test uses the same HashiCorp Configuration Language (HCL) syntax used to write Terraform modules. This reduces the burden for module authors to learn other tools or programming languages. Module authors run the tests using the command terraform test, which is available in Terraform CLI version 1.6 or higher.

Module authors create test files with the extension *.tftest.hcl. These test files are placed in the root of the Terraform module or in a dedicated tests directory. The following elements are typically present in a Terraform tests file:

  • Provider block: optional, used to override the provider configuration, such as selecting AWS region where the tests run.
  • Variables block: the input variables passed into the module during the test, used to supply non-default values or to override default values for variables.
  • Run block: used to run a specific test scenario. There can be multiple run blocks per test file, and Terraform executes them in order. In each run block you specify the Terraform command (plan or apply) and the test assertions. Module authors can specify conditions such as length(var.items) != 0. A full list of condition expressions can be found in the HashiCorp documentation.

Terraform tests are performed in sequential order and at the end of the Terraform test execution, any failed assertions are displayed.

Basic test to validate resource creation

Now that we understand the basic anatomy of a Terraform tests file, let’s create basic tests to validate the functionality of the following Terraform configuration. This Terraform configuration will create an AWS CodeCommit repository with prefix name repo-.

# main.tf

variable "repository_name" {
  type = string
}
resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description     = "Test repository."
}

Now we create a Terraform test file in the tests directory. See the following directory structure as an example:

├── main.tf
└── tests
    └── basic.tftest.hcl

For this first test, we will not perform any assertion except for validating that Terraform execution plan runs successfully. In the tests file, we create a variable block to set the value for the variable repository_name. We also added the run block with command = plan to instruct Terraform test to run Terraform plan. The completed test should look like the following:

# basic.tftest.hcl

variables {
  repository_name = "MyRepo"
}

run "test_resource_creation" {
  command = plan
}

Now we will run this test locally. First ensure that you are authenticated into an AWS account, and run the terraform init command in the root directory of the Terraform module. After the provider is initialized, start the test using the terraform test command.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass

Our first test is complete. Because the run block uses command = plan, we have validated that the Terraform configuration is valid and that Terraform can produce an execution plan for the resource, without creating it. Next, let’s learn how to perform inspection of the resource state.

Create resource and validate resource name

Re-using the previous test file, we add an assert block that checks whether the CodeCommit repository name starts with the string repo- and provides an error message if the condition fails. For the assertion, we use the startswith function. See the following example:

# basic.tftest.hcl

variables {
  repository_name = "MyRepo"
}

run "test_resource_creation" {
  command = plan

  assert {
    condition = startswith(aws_codecommit_repository.test.repository_name, "repo-")
    error_message = "CodeCommit repository name ${var.repository_name} did not start with the expected value 'repo-***'."
  }
}

Now, let’s assume that another module author made changes to the module by modifying the prefix from repo- to my-repo-. Here is the modified Terraform module.

# main.tf

variable "repository_name" {
  type = string
}
resource "aws_codecommit_repository" "test" {
  repository_name = format("my-repo-%s", var.repository_name)
  description = "Test repository."
}

We can catch this mistake by running the terraform test command again.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... fail
╷
│ Error: Test assertion failed
│
│ on tests/basic.tftest.hcl line 9, in run "test_resource_creation":
│ 9: condition = startswith(aws_codecommit_repository.test.repository_name, "repo-")
│ ├────────────────
│ │ aws_codecommit_repository.test.repository_name is "my-repo-MyRepo"
│
│ CodeCommit repository name MyRepo did not start with the expected value 'repo-***'.
╵
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... fail

Failure! 0 passed, 1 failed.

We have successfully created a unit test using assertions that validates the resource name matches the expected value. For more examples of using assertions see the Terraform Tests Docs. Before we proceed to the next section, don’t forget to fix the repository name in the module (revert the name back to repo- instead of my-repo-) and re-run your Terraform test.

Testing variable input validation

When developing Terraform modules, it is common to use variable validation as a contract test to validate any dependencies / restrictions. For example, AWS CodeCommit limits the repository name to 100 characters. A module author can use the length function to check the length of the input variable value. We are going to use Terraform test to ensure that the variable validation works effectively. First, we modify the module to use variable validation.

# main.tf

variable "repository_name" {
  type = string
  validation {
    condition = length(var.repository_name) <= 100
    error_message = "The repository name must be less than or equal to 100 characters."
  }
}

resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description = "Test repository."
}

By default, when variable validation fails during the execution of Terraform test, the Terraform test also fails. To simulate this, create a new test file and insert the repository_name variable with a value longer than 100 characters.

# var_validation.tftest.hcl

variables {
  repository_name = "this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy"
}

run "test_invalid_var" {
  command = plan
}

Notice that this new test file also sets the command to plan. Variable validation runs during the plan, before any resource is created, so we can save time and cost by skipping resource provisioning entirely. If we run this Terraform test, it will fail as expected.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass
tests/var_validation.tftest.hcl... in progress
run "test_invalid_var"... fail
╷
│ Error: Invalid value for variable
│
│ on main.tf line 1:
│ 1: variable "repository_name" {
│ ├────────────────
│ │ var.repository_name is "this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy"
│
│ The repository name must be less than or equal to 100 characters.
│
│ This was checked by the validation rule at main.tf:3,3-13.
╵
tests/var_validation.tftest.hcl... tearing down
tests/var_validation.tftest.hcl... fail

Failure! 1 passed, 1 failed.

For other module authors who might iterate on the module, we need to ensure that the validation condition is correct and will catch any problems with input values. In other words, we expect the validation condition to fail with the wrong input. This is especially important when we want to incorporate the contract test in a CI/CD pipeline. To prevent our test from failing due to the intentional error introduced in the test, we can use the expect_failures attribute. Here is the modified test file:

# var_validation.tftest.hcl

variables {
  repository_name = "this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy"
}

run "test_invalid_var" {
  command = plan

  expect_failures = [
    var.repository_name
  ]
}

Now if we run the Terraform test, we will get a successful result.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass
tests/var_validation.tftest.hcl... in progress
run "test_invalid_var"... pass
tests/var_validation.tftest.hcl... tearing down
tests/var_validation.tftest.hcl... pass

Success! 2 passed, 0 failed.

As you can see, the expect_failures attribute is used to test negative paths (the inputs that would cause failures when passed into a module). Assertions tend to focus on positive paths (the ideal inputs). For an additional example of a test that validates functionality of a completed module with multiple interconnected resources, see this example in the Terraform CI/CD and Testing on AWS Workshop.
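The length contract in this section can also be reasoned about outside Terraform. The following Python sketch is an illustration only and is not part of the module: it mirrors the validation condition and the repo- name format from main.tf, using hypothetical function names. It also surfaces one detail worth keeping in mind: because the module prepends repo-, the final repository name is five characters longer than the input, so an input of 96 to 100 characters passes the variable validation while producing a name longer than the 100-character limit mentioned above.

```python
# Illustrative only: mirrors the validation rule and name format from main.tf.
MAX_NAME_LENGTH = 100  # limit cited in this post for CodeCommit repository names
PREFIX = "repo-"


def passes_variable_validation(repository_name: str) -> bool:
    """Same condition as the validation block: length(var.repository_name) <= 100."""
    return len(repository_name) <= MAX_NAME_LENGTH


def final_repository_name(repository_name: str) -> str:
    """Same result as format("repo-%s", var.repository_name)."""
    return f"{PREFIX}{repository_name}"


def fits_service_limit(repository_name: str) -> bool:
    """Check the name that is actually sent to CodeCommit, prefix included."""
    return len(final_repository_name(repository_name)) <= MAX_NAME_LENGTH


if __name__ == "__main__":
    # Print, for a few input lengths, whether each check accepts the name.
    for length in (95, 96, 100, 101):
        name = "a" * length
        print(length, passes_variable_validation(name), fits_service_limit(name))
```

If the stricter behavior is what you want, one option is to have the validation condition account for the prefix length. Whether that matters depends on how the module is used.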

Orchestrating supporting resources

In practice, end-users utilize Terraform modules in conjunction with other supporting resources. For example, a CodeCommit repository is usually encrypted using an AWS Key Management Service (KMS) key. The KMS key is provided by end-users to the module using a variable called kms_key_id. To simulate this test, we need to orchestrate the creation of the KMS key outside of the module. In this section we will learn how to do that. First, update the Terraform module to add the optional variable for the KMS key.

# main.tf

variable "repository_name" {
  type = string
  validation {
    condition = length(var.repository_name) <= 100
    error_message = "The repository name must be less than or equal to 100 characters."
  }
}

variable "kms_key_id" {
  type = string
  default = ""
}

resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description = "Test repository."
  kms_key_id = var.kms_key_id != "" ? var.kms_key_id : null
}

In a Terraform test, you can instruct the run block to execute another helper module. The helper module is used by the test to create the supporting resources. We will create a sub-directory called setup under the tests directory with a single kms.tf file. We also create a new test file for KMS scenario. See the updated directory structure:

├── main.tf
└── tests
    ├── setup
    │   └── kms.tf
    ├── basic.tftest.hcl
    ├── var_validation.tftest.hcl
    └── with_kms.tftest.hcl

The kms.tf file is a helper module to create a KMS key and provide its ARN as the output value.

# kms.tf

resource "aws_kms_key" "test" {
  description = "test KMS key for CodeCommit repo"
  deletion_window_in_days = 7
}

output "kms_key_id" {
  value = aws_kms_key.test.arn
}

The new test will use two separate run blocks. The first run block (setup) executes the helper module to generate a KMS key. This is done by assigning the command apply which will run terraform apply to generate the KMS key. The second run block (codecommit_with_kms) will then use the KMS key ARN output of the first run as the input variable passed to the main module.

# with_kms.tftest.hcl

run "setup" {
  command = apply
  module {
    source = "./tests/setup"
  }
}

run "codecommit_with_kms" {
  command = apply

  variables {
    repository_name = "MyRepo"
    kms_key_id = run.setup.kms_key_id
  }

  assert {
    condition = aws_codecommit_repository.test.kms_key_id != null
    error_message = "KMS key ID attribute value is null"
  }
}

Go ahead and run terraform init, followed by terraform test. You should get a successful result like the one below.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass
tests/var_validation.tftest.hcl... in progress
run "test_invalid_var"... pass
tests/var_validation.tftest.hcl... tearing down
tests/var_validation.tftest.hcl... pass
tests/with_kms.tftest.hcl... in progress
run "setup"... pass
run "codecommit_with_kms"... pass
tests/with_kms.tftest.hcl... tearing down
tests/with_kms.tftest.hcl... pass

Success! 4 passed, 0 failed.

We have learned how to run Terraform test and develop various test scenarios. In the next section we will see how to incorporate all the tests into a CI/CD pipeline.

Terraform Tests in CI/CD Pipelines

Now that we have seen how Terraform Test works locally, let’s see how the Terraform test can be leveraged to create a Terraform module validation pipeline on AWS. The following AWS services are used:

  • AWS CodeCommit – a secure, highly scalable, fully managed source control service that hosts private Git repositories.
  • AWS CodeBuild – a fully managed continuous integration service that compiles source code, runs tests, and produces ready-to-deploy software packages.
  • AWS CodePipeline – a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.
  • Amazon Simple Storage Service (Amazon S3) – an object storage service offering industry-leading scalability, data availability, security, and performance.
Terraform module validation pipeline Architecture. Multiple interconnected AWS services such as AWS CodeCommit, CodeBuild, CodePipeline, and Amazon S3 used to build a Terraform module validation pipeline.

Terraform module validation pipeline

In the above architecture for a Terraform module validation pipeline, the following takes place:

  • A developer pushes Terraform module configuration files to a git repository (AWS CodeCommit).
  • AWS CodePipeline begins running the pipeline. The pipeline clones the git repo and stores the artifacts to an Amazon S3 bucket.
  • An AWS CodeBuild project configures a compute/build environment with Checkov installed from an image fetched from Docker Hub. CodePipeline passes the artifacts (Terraform module) and CodeBuild executes Checkov to run static analysis of the Terraform configuration files.
  • Another CodeBuild project is configured with Terraform from an image fetched from Docker Hub. CodePipeline passes the artifacts (repo contents) and CodeBuild runs the terraform test command to execute the tests.

CodeBuild uses a buildspec file to declare the build commands and relevant settings. Here is an example of the buildspec files for both CodeBuild Projects:

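# Checkov
version: 0.1
phases:
  pre_build:
    commands:
      - echo pre_build starting

  build:
    commands:
      - echo build starting
      - ls
      # Save the report first with soft-fail (-s) so it is written even when checks fail
      - echo saving checkov output
      - checkov -s -d ./ > checkov.result.txt
      # Run again without soft-fail so that failed checks fail the build
      - echo starting checkov
      - checkov -d .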

In the above buildspec, Checkov is run against the root directory of the cloned CodeCommit repository. This directory contains the configuration files for the Terraform module. Checkov also saves the output to a file named checkov.result.txt for further review or handling if needed. If Checkov fails, the pipeline will fail.

# Terraform Test
version: 0.1
phases:
  pre_build:
    commands:
      - terraform init
      - terraform validate

  build:
    commands:
      - terraform test

In the above buildspec, the terraform init and terraform validate commands are used to initialize Terraform, then check if the configuration is valid. Finally, the terraform test command is used to run the configured tests. If any of the Terraform tests fails, the pipeline will fail.
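The build fails when terraform test exits with a non-zero status. If you also want to report pass and fail counts, for example in a build notification, one option is to parse the summary line from the captured output. The following Python sketch assumes the summary format shown earlier in this post ("Success! 2 passed, 0 failed."). The function name, class name, and regular expression are illustrative, and the format may differ between Terraform versions.

```python
import re
from typing import NamedTuple, Optional

# Matches the summary line printed at the end of a run, as shown in this post.
SUMMARY_PATTERN = re.compile(r"^(Success|Failure)! (\d+) passed, (\d+) failed")


class RunSummary(NamedTuple):
    succeeded: bool
    passed: int
    failed: int


def parse_test_summary(output: str) -> Optional[RunSummary]:
    """Return the pass/fail counts from captured test output, or None if absent."""
    # Search from the end, because the summary is the last thing printed.
    for line in reversed(output.splitlines()):
        match = SUMMARY_PATTERN.match(line.strip())
        if match:
            status, passed, failed = match.groups()
            return RunSummary(status == "Success", int(passed), int(failed))
    return None


if __name__ == "__main__":
    sample = 'run "test_invalid_var"... fail\n\nFailure! 1 passed, 1 failed.\n'
    print(parse_test_summary(sample))
```

The exit code remains the authoritative signal for the pipeline. Parsing the text is only a convenience for reporting.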

For a full example of the CI/CD pipeline configuration, please refer to the Terraform CI/CD and Testing on AWS workshop. The module validation pipeline mentioned above is meant as a starting point. In a production environment, you might want to customize it further by adding Checkov allow-list rules, linting, checks for Terraform docs, or pre-requisites such as building the code used in AWS Lambda.

Choosing various testing strategies

At this point you may be wondering when you should use Terraform tests or other tools such as Preconditions and Postconditions, Check blocks, or policy as code. The answer depends on your test type and use cases. Terraform test is suitable for unit tests, such as validating that resources are created according to the naming specification. Variable validations and Pre/Post conditions are useful for contract tests of Terraform modules, for example by returning an error when input variable values do not meet the specification. As shown in the previous section, you can also use Terraform test to ensure your contract tests are running properly. Terraform test is also suitable for integration tests where you need to create supporting resources to properly test the module functionality. Lastly, Check blocks are suitable for end-to-end tests where you want to validate the infrastructure state after all resources are generated, for example to test if a website is running after an S3 bucket configured for static web hosting is created.

When developing Terraform modules, you can run Terraform test in command = plan mode for unit and contract tests. This allows the unit and contract tests to run quicker and cheaper since no resources are created. You should also consider the time and cost to execute Terraform test for complex or large Terraform configurations, especially if you have multiple test scenarios. Terraform test maintains one or more state files in memory for each test file. Consider how to re-use the module’s state when appropriate. Terraform test also provides test mocking (available in Terraform 1.7 and later), which allows you to test your module without creating the real infrastructure.

Conclusion

In this post, you learned how to use Terraform test and develop various test scenarios. You also learned how to incorporate Terraform test in a CI/CD pipeline. Lastly, we also discussed various testing strategies for Terraform configurations and modules. For more information about Terraform test, we recommend the Terraform test documentation and tutorial. To get hands on practice building a Terraform module validation pipeline and Terraform deployment pipeline, check out the Terraform CI/CD and Testing on AWS Workshop.

Authors

Kevon Mayers

Kevon Mayers is a Solutions Architect at AWS. Kevon is a Terraform Contributor and has led multiple Terraform initiatives within AWS. Prior to joining AWS he was working as a DevOps Engineer and Developer, and before that was working with the GRAMMYs/The Recording Academy as a Studio Manager, Music Producer, and Audio Engineer. He also owns a professional production company, MM Productions.

Welly Siauw

Welly Siauw is a Principal Partner Solution Architect at Amazon Web Services (AWS). He spends his day working with customers and partners, solving architectural challenges. He is passionate about service integration and orchestration, serverless and artificial intelligence (AI) and machine learning (ML). He has authored several AWS blog posts and actively leads AWS Immersion Days and Activation Days. Welly spends his free time tinkering with espresso machines and outdoor hiking.

How we sped up AWS CloudFormation deployments with optimistic stabilization

Post Syndicated from Bhavani Kanneganti original https://aws.amazon.com/blogs/devops/how-we-sped-up-aws-cloudformation-deployments-with-optimistic-stabilization/

Introduction

AWS CloudFormation customers often inquire about the behind-the-scenes process of provisioning resources and why certain resources or stacks take longer to provision compared to the AWS Management Console or AWS Command Line Interface (AWS CLI). In this post, we will delve into the various factors affecting resource provisioning in CloudFormation, specifically focusing on resource stabilization, which allows CloudFormation and other Infrastructure as Code (IaC) tools to ensure resilient deployments. We will also introduce a new optimistic stabilization strategy that improves CloudFormation stack deployment times by up to 40% and provides greater visibility into resource provisioning through the new CONFIGURATION_COMPLETE status.

AWS CloudFormation is an IaC service that allows you to model your AWS and third-party resources in template files. By creating CloudFormation stacks, you can provision and manage the lifecycle of the template-defined resources manually via the AWS CLI, the console, or AWS SAM, or automatically through AWS CodePipeline (where the CLI and SAM can also be leveraged) or through Git sync. You can also use AWS Cloud Development Kit (AWS CDK) to define cloud infrastructure in familiar programming languages and provision it through CloudFormation, or leverage AWS Application Composer to design your application architecture, visualize dependencies, and generate templates to create CloudFormation stacks.

Deploying a CloudFormation stack

Let’s examine a deployment of a containerized application using AWS CloudFormation to understand CloudFormation’s resource provisioning.

Figure 1. Sample application architecture to deploy an ECS service

For deploying a containerized application, you need to create an Amazon ECS service. To set up the ECS service, several key resources must first exist: an ECS cluster, an Amazon ECR repository, a task definition, and associated Amazon VPC infrastructure such as security groups and subnets.

Since you want to manage both the infrastructure and application deployments using AWS CloudFormation, you will first define a CloudFormation template that includes: an ECS cluster resource (AWS::ECS::Cluster), a task definition (AWS::ECS::TaskDefinition), an ECR repository (AWS::ECR::Repository), required VPC resources like subnets (AWS::EC2::Subnet) and security groups (AWS::EC2::SecurityGroup), and finally, the ECS Service (AWS::ECS::Service) itself. When you create the CloudFormation stack using this template, the ECS service (AWS::ECS::Service) is the final resource created, as it waits for the other resources to finish creation. This brings up the concept of Resource Dependencies.

Resource Dependency:

In CloudFormation, resources can have dependencies on other resources being created first. There are two types of resource dependencies:

  • Implicit: CloudFormation automatically infers dependencies when a resource uses intrinsic functions to reference another resource. These implicit dependencies ensure the resources are created in the proper order.
  • Explicit: Dependencies can be directly defined in the template using the DependsOn attribute. This allows you to customize the creation order of resources.

The following template snippet and dependency graph show the ECS service’s dependencies:

Template snippet:

ECSService:
    DependsOn: [PublicRoute] #Explicit Dependency
    Type: 'AWS::ECS::Service'
    Properties:
      ServiceName: cfn-service
      Cluster: !Ref ECSCluster #Implicit Dependency
      DesiredCount: 2
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          SecurityGroups:
            - !Ref SecurityGroup #Implicit Dependency
          Subnets:
            - !Ref PublicSubnet #Implicit Dependency
      TaskDefinition: !Ref TaskDefinition #Implicit Dependency

Dependency Graph:

Figure 2. CloudFormation’s dependency graph for a containerized application

Note: VPC Resources in the above graph include PublicSubnet (AWS::EC2::Subnet), SecurityGroup (AWS::EC2::SecurityGroup), PublicRoute (AWS::EC2::Route)

In the above template snippet, the ECS Service (AWS::ECS::Service) resource has an explicit dependency on the PublicRoute resource, specified using the DependsOn attribute. The ECS service also has implicit dependencies on the ECSCluster, SecurityGroup, PublicSubnet, and TaskDefinition resources. Even without an explicit DependsOn, CloudFormation understands that these resources must be created before the ECS service, since the service references them using the Ref intrinsic function. Now that you understand how CloudFormation creates resources in a specific order based on their definition in the template file, let’s look at the time taken to provision these resources.
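To make the ordering concrete, here is a minimal Python sketch that derives a valid creation order from the dependencies named in the template snippet. It is an illustration of dependency ordering in general, not CloudFormation's actual engine. Only the ECS service's own dependencies are modeled (other edges, such as those among the VPC resources, are omitted), and the function names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set

# Each key depends on the resources in its set, as in the template snippet.
# PublicRoute is the explicit (DependsOn) dependency; the rest are implicit (Ref).
DEPENDENCIES: Dict[str, Set[str]] = {
    "ECSService": {
        "PublicRoute",
        "ECSCluster",
        "SecurityGroup",
        "PublicSubnet",
        "TaskDefinition",
    },
    "PublicRoute": set(),
    "ECSCluster": set(),
    "SecurityGroup": set(),
    "PublicSubnet": set(),
    "TaskDefinition": set(),
}


def creation_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return one valid order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


def creation_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group resources into waves whose members could be started in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    print(creation_order(DEPENDENCIES))
    print(creation_waves(DEPENDENCIES))
```

With these edges, the ECS service always comes last, and the five resources it depends on form a single wave that can start together.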

Resource Provisioning Time:

The total time for CloudFormation to provision the stack depends on the time required to create each individual resource defined in the template. The provisioning duration per resource is determined by several time factors:

  • Engine Time: CloudFormation Engine Time refers to the duration spent by the service reading and persisting data related to a resource. This includes the time taken for operations like parsing and interpreting the CloudFormation template, and for the resolution of intrinsic functions like Fn::GetAtt and Ref.
  • Resource Creation Time: The actual time an AWS service requires to create and configure the resource. This can vary across resource types provisioned by the service.
  • Resource Stabilization Time: The duration required for a resource to reach a usable state after creation.

What is Resource Stabilization?

When provisioning AWS resources, CloudFormation makes the necessary API calls to the underlying services to create the resources. After creation, CloudFormation then performs eventual consistency checks to ensure the resources are ready to process the intended traffic, a process known as resource stabilization. For example, when creating an ECS service in the application, the service is not readily accessible immediately after creation completes (after creation time). To ensure the ECS service is available to use, CloudFormation performs additional verification checks defined specifically for ECS service resources. Resource stabilization is not unique to CloudFormation and must be handled to some degree by all IaC tools.

Stabilization Criteria and Stabilization Timeout

For CloudFormation to mark a resource as CREATE_COMPLETE, the resource must meet specific stabilization criteria called stabilization parameters. These checks validate that the resource is not only created but also ready for use.

If a resource fails to meet its stabilization parameters within the allowed stabilization timeout period, CloudFormation will mark the resource status as CREATE_FAILED and roll back the operation. Stabilization criteria and timeouts are defined uniquely for each AWS resource supported in CloudFormation by the service, and are applied during both resource create and update workflows.

AWS CloudFormation vs AWS CLI to provision resources

Now, you will create a similar ECS service using the AWS CLI. You can use the following AWS CLI command to deploy an ECS service using the same task definition, ECS cluster and VPC resources created earlier using CloudFormation.

Command:

aws ecs create-service \
    --cluster CFNCluster \
    --service-name service-cli \
    --task-definition task-definition-cfn:1 \
    --desired-count 2 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-yyy],assignPublicIp=ENABLED}" \
    --region us-east-1

The following snippet from the output of the above command shows that the ECS Service has been successfully created and its status is ACTIVE.

Snapshot of the ECS service API call's response

Figure 3. Snapshot of the ECS service API call

However, when you navigate to the ECS console and review the service, tasks are still in the Pending state, and you are unable to access the application.

ECS tasks status in the AWS console

Figure 4. ECS console view

You have to wait for the service to reach a steady state before you can successfully access the application.

Figure 5. ECS service events from the AWS console

When you create the same ECS service using AWS CloudFormation, the service is accessible immediately after the resource reaches a status of CREATE_COMPLETE in the stack. This reliable availability is due to CloudFormation’s resource stabilization process. After initially creating the ECS service, CloudFormation waits and continues calling the ECS DescribeServices API action until the service reaches a steady state. Once the ECS service passes its consistency checks and is fully ready for use, only then will CloudFormation mark the resource status as CREATE_COMPLETE in the stack. This creation and stabilization orchestration allows you to access the service right away without any further delays.

The following is an AWS CloudTrail snippet of CloudFormation performing DescribeServices API calls during Stabilization:

Snapshot of AWS CloudTrail event for DescribeServices API call

Figure 6. Snapshot of AWS CloudTrail event

By handling resource stabilization natively, CloudFormation saves you the extra coding effort and complexity of having to implement custom status checks and availability polling logic after resource creation. You would have to develop this additional logic using tools like the AWS CLI or API across all the infrastructure and application resources. With CloudFormation’s built-in stabilization orchestration, you can deploy the template once and trust that the services will be fully ready after creation, allowing you to focus on developing your application functionality.
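To give a sense of the polling logic you would otherwise write yourself, here is a minimal Python sketch of a stabilization loop with a timeout. It is a simplified illustration, not CloudFormation's implementation. The describe callable stands in for a call such as ECS DescribeServices and returns a simplified dictionary. Treating "running count equals desired count" as the steady-state criterion is an assumption made for this example, and the function and exception names are hypothetical.

```python
import time
from typing import Callable, Dict


class StabilizationTimeout(Exception):
    """Raised when the resource does not stabilize within the allowed time."""


def is_steady(service: Dict[str, int]) -> bool:
    # Assumed criterion for illustration: all desired tasks are running.
    return service["runningCount"] == service["desiredCount"]


def wait_until_stable(
    describe: Callable[[], Dict[str, int]],
    timeout_seconds: float,
    poll_interval_seconds: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll `describe` until the service is steady; return the number of polls."""
    deadline = clock() + timeout_seconds
    polls = 0
    while True:
        polls += 1
        if is_steady(describe()):
            return polls
        if clock() >= deadline:
            raise StabilizationTimeout(f"not stable after {polls} polls")
        sleep(poll_interval_seconds)


if __name__ == "__main__":
    # Fake responses standing in for successive describe results.
    responses = iter([
        {"runningCount": 0, "desiredCount": 2},
        {"runningCount": 1, "desiredCount": 2},
        {"runningCount": 2, "desiredCount": 2},
    ])
    polls = wait_until_stable(
        lambda: next(responses), timeout_seconds=60, sleep=lambda _: None
    )
    print(f"stable after {polls} polls")
```

The sleep and clock parameters are injectable so the loop can be exercised without real waiting. A real script would need this kind of logic for every resource type it creates, each with its own readiness criterion.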

Evolution of Stabilization Strategy

CloudFormation’s stabilization strategy couples resource creation with stabilization such that the provisioning of a resource is not considered COMPLETE until stabilization is complete.

Historic Stabilization Strategy

For resources that have no interdependencies, CloudFormation starts the provisioning process in parallel. However, if a resource depends on another resource, CloudFormation will wait for the entire resource provisioning operation of the dependency resource to complete before starting the provisioning of the dependent resource.

Figure 7. CloudFormation’s historic stabilization strategy

The diagram above shows a deployment of some of the ECS application resources that you deploy using AWS CloudFormation. The Task Definition (AWS::ECS::TaskDefinition) resource depends on the ECR Repository (AWS::ECR::Repository) resource, and the ECS Service (AWS::ECS::Service) resource depends on both the Task Definition and ECS Cluster (AWS::ECS::Cluster) resources. The ECS Cluster resource has no dependencies defined. CloudFormation initiates creation of the ECR Repository and ECS Cluster resources in parallel. It then waits for the ECR Repository to complete consistency checks before starting provisioning of the Task Definition resource. Similarly, creation of the ECS Service resource begins only when the Task Definition and ECS Cluster resources have been created and are ready. This sequential approach ensures safety and stability but causes delays: because each dependent resource waits for its dependencies to finish stabilizing, overall stack deployment time grows with the number of interdependent resources and becomes a bottleneck for the whole stack operation.

New Optimistic Stabilization Strategy

To improve stack provisioning times and deployment performance, AWS CloudFormation recently launched a new optimistic stabilization strategy. The optimistic strategy can reduce customer stack deployment duration by up to 40%. It allows dependent resources to be created in parallel. This concurrent resource creation helps significantly improve deployment speed.

Figure 8. CloudFormation’s new optimistic stabilization strategy

The diagram above shows deployment of the same four resources discussed in the historic strategy. The Task Definition (AWS::ECS::TaskDefinition) resource depends on the ECR Repository (AWS::ECR::Repository) resource, and the ECS Service (AWS::ECS::Service) resource depends on both the Task Definition and ECS Cluster (AWS::ECS::Cluster) resources. The ECS Cluster resource has no dependencies defined. CloudFormation initiates creation of the ECR Repository and ECS Cluster resources in parallel. Then, instead of waiting for the ECR Repository to complete consistency checks, it starts creating the Task Definition when the ECR Repository completes creation, but before stabilization is complete. Similarly, creation of the ECS Service resource begins after Task Definition and ECS Cluster creation. The change was made because not all resources require their dependencies to complete consistency checks before starting creation. If the ECS Service fails to provision because the Task Definition or ECS Cluster resources are still undergoing consistency checks, CloudFormation will wait for those dependencies to complete their consistency checks before attempting to create the ECS Service again.

Figure 9. CloudFormation’s new stabilization strategy with the retry capability

This parallel creation of dependent resources with automatic retry capabilities results in faster deployment times compared to the historic linear resource provisioning strategy. The optimistic stabilization strategy currently applies only to create workflows with resources that have implicit dependencies (for example, one resource referencing another with Ref or Fn::GetAtt). For resources with an explicit dependency (declared with DependsOn), CloudFormation uses the historic strategy.
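The difference between the two strategies can be sketched with a toy timing model. Everything here is an assumption for illustration: the durations are made up, all four dependencies are treated as implicit, and retries are ignored. It is not how CloudFormation schedules work internally.

```typescript
type Strategy = 'historic' | 'optimistic';

interface Dependency {
  name: string;
  explicit: boolean; // true models an explicit dependency
}

interface ResourceModel {
  createSeconds: number;    // hypothetical time until creation completes
  stabilizeSeconds: number; // hypothetical time spent in consistency checks
  dependsOn: Dependency[];
}

// Returns the modeled time until every resource is created and stabilized.
// Assumes the dependency graph has no cycles.
function stackDuration(resources: Record<string, ResourceModel>, strategy: Strategy): number {
  const startTimes = new Map<string, number>();

  const startOf = (resourceName: string): number => {
    const cached = startTimes.get(resourceName);
    if (cached !== undefined) return cached;
    let start = 0;
    for (const dep of resources[resourceName].dependsOn) {
      const target = resources[dep.name];
      const created = startOf(dep.name) + target.createSeconds;
      // Historic strategy and explicit dependencies wait for stabilization.
      const mustStabilize = strategy === 'historic' || dep.explicit;
      start = Math.max(start, mustStabilize ? created + target.stabilizeSeconds : created);
    }
    startTimes.set(resourceName, start);
    return start;
  };

  let total = 0;
  for (const key of Object.keys(resources)) {
    const r = resources[key];
    total = Math.max(total, startOf(key) + r.createSeconds + r.stabilizeSeconds);
  }
  return total;
}

// The four resources from the diagrams, with made-up durations.
const ecsApp: Record<string, ResourceModel> = {
  Repository: { createSeconds: 5, stabilizeSeconds: 10, dependsOn: [] },
  Cluster: { createSeconds: 5, stabilizeSeconds: 10, dependsOn: [] },
  TaskDefinition: {
    createSeconds: 5,
    stabilizeSeconds: 5,
    dependsOn: [{ name: 'Repository', explicit: false }],
  },
  Service: {
    createSeconds: 10,
    stabilizeSeconds: 60,
    dependsOn: [
      { name: 'TaskDefinition', explicit: false },
      { name: 'Cluster', explicit: false },
    ],
  },
};

console.log(stackDuration(ecsApp, 'historic'));   // 95
console.log(stackDuration(ecsApp, 'optimistic')); // 80
```

In this model the saving comes entirely from overlapping the stabilization of dependencies with the creation of their dependents. Marking every dependency as explicit makes the optimistic result equal the historic one.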

Improved Visibility into Resource Provisioning

When creating a CloudFormation stack, a resource can sometimes take longer to provision, making it appear as if it’s stuck in an IN_PROGRESS state. This can be because CloudFormation is waiting for the resource to complete consistency checks during its resource stabilization step. To improve visibility into resource provisioning status, CloudFormation has introduced a new CONFIGURATION_COMPLETE event. This event is emitted at both the individual resource level and the overall stack level during create workflows, when resource creation or configuration is complete but stabilization is still in progress.

Figure 10. CloudFormation stack events of the ECS Application

The screenshot above shows the stack events of the ECS application’s CloudFormation stack named ECSApplication. Read the events from bottom to top:

  • At 10:46:08 UTC-0600, ECSService (AWS::ECS::Service) resource creation was initiated.
  • At 10:46:09 UTC-0600, the ECSService has CREATE_IN_PROGRESS status in the Status tab and CONFIGURATION_COMPLETE status in the Detailed status tab, meaning the resource was successfully created and the consistency check was initiated.
  • At 10:46:09 UTC-0600, the stack ECSApplication has CREATE_IN_PROGRESS status in the Status tab and CONFIGURATION_COMPLETE status in the Detailed status tab, meaning all the resources in the ECSApplication stack are successfully created and are going through stabilization. This stack level CONFIGURATION_COMPLETE status can also be viewed in the stack’s Overview tab.

Figure 11. CloudFormation Overview tab for the ECSApplication stack

  • At 10:47:09 UTC-0600, the ECSService has CREATE_COMPLETE status in the Status tab, meaning the service has been created and has completed its consistency checks.
  • At 10:47:10 UTC-0600, ECSApplication has CREATE_COMPLETE status in the Status tab, meaning all the resources have been successfully created and have completed their consistency checks.
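The way these statuses combine can be sketched as a small classifier. The StackEvent shape and the phase names are hypothetical, and the calendar date in the sample timestamps is a placeholder, since the screenshot only provides times of day.

```typescript
type Phase = 'creating' | 'stabilizing' | 'ready' | 'other';

// Hypothetical, simplified view of one stack event.
interface StackEvent {
  timestamp: string; // ISO 8601
  logicalId: string;
  status: string;
  detailedStatus?: string;
}

// Maps the Status / Detailed status pair to a provisioning phase.
function provisioningPhase(event: StackEvent): Phase {
  if (event.status === 'CREATE_COMPLETE') return 'ready';
  if (event.status === 'CREATE_IN_PROGRESS') {
    return event.detailedStatus === 'CONFIGURATION_COMPLETE' ? 'stabilizing' : 'creating';
  }
  return 'other';
}

// Seconds between CONFIGURATION_COMPLETE and CREATE_COMPLETE for one logical ID,
// or undefined if either event is missing.
function stabilizationSeconds(events: StackEvent[], logicalId: string): number | undefined {
  const mine = events.filter((e) => e.logicalId === logicalId);
  const configured = mine.find((e) => provisioningPhase(e) === 'stabilizing');
  const ready = mine.find((e) => provisioningPhase(e) === 'ready');
  if (!configured || !ready) return undefined;
  return (Date.parse(ready.timestamp) - Date.parse(configured.timestamp)) / 1000;
}

// Events for ECSService from the list above (placeholder date).
const serviceEvents: StackEvent[] = [
  { timestamp: '2024-01-01T10:46:08-06:00', logicalId: 'ECSService', status: 'CREATE_IN_PROGRESS' },
  {
    timestamp: '2024-01-01T10:46:09-06:00',
    logicalId: 'ECSService',
    status: 'CREATE_IN_PROGRESS',
    detailedStatus: 'CONFIGURATION_COMPLETE',
  },
  { timestamp: '2024-01-01T10:47:09-06:00', logicalId: 'ECSService', status: 'CREATE_COMPLETE' },
];

console.log(stabilizationSeconds(serviceEvents, 'ECSService')); // 60
```

Read this way, the one-minute gap between the last two events is time spent in consistency checks, not in creating the service.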

Conclusion:

In this post, I hope you gained some insights into how CloudFormation deploys resources and the various time factors that contribute to the creation of a stack and its resources. You also took a deeper look into what CloudFormation does under the hood with resource stabilization and how it ensures the safe, consistent, and reliable provisioning of resources in critical, high-availability production infrastructure deployments. Finally, you learned about the new optimistic stabilization strategy to shorten stack deployment times and improve visibility into resource provisioning.

About the authors:

Bhavani Kanneganti

Bhavani is a Principal Engineer at AWS Support. She has over 7 years of experience solving complex customer issues on the AWS Cloud pertaining to infrastructure-as-code and container orchestration services such as CloudFormation, ECS, and EKS. She also works closely with teams across AWS to design solutions that improve customer experience. Outside of work, Bhavani enjoys cooking and traveling.

Idriss Laouali Abdou

Idriss is a Senior Product Manager at AWS, working on delivering the best experience for AWS IaC customers. Outside of work, you can find him creating educational content that helps thousands of students, cooking, or dancing.

Best practices for managing Terraform State files in AWS CI/CD Pipeline

Post Syndicated from Arun Kumar Selvaraj original https://aws.amazon.com/blogs/devops/best-practices-for-managing-terraform-state-files-in-aws-ci-cd-pipeline/

Introduction

Today customers want to reduce manual operations for deploying and maintaining their infrastructure. The recommended method to deploy and manage infrastructure on AWS is to follow the Infrastructure as Code (IaC) model using tools like AWS CloudFormation, AWS Cloud Development Kit (AWS CDK), or Terraform.

One of the critical components in terraform is the state file, which keeps track of your configuration and resources. When you run terraform in an AWS CI/CD pipeline, the state file has to be stored in a secure, common location that the pipeline can access. You also need a mechanism to lock it when multiple developers in the team want to access it at the same time.

In this blog post, we will explain how to manage terraform state files in AWS, cover best practices for configuring them, and walk through an example of managing them efficiently in a Continuous Integration pipeline built with AWS Developer Tools such as AWS CodeCommit and AWS CodeBuild. This blog post assumes you have a basic knowledge of terraform, AWS Developer Tools and AWS CI/CD pipeline. Let’s dive in!

Challenges with handling state files

By default, the state file is stored locally where terraform runs, which is not a problem if you are a single developer working on the deployment. For teams, however, storing state files locally is not ideal, as you may run into the following problems:

  • When working in teams or collaborative environments, multiple people need access to the state file
  • Data in the state file is stored in plain text which may contain secrets or sensitive information
  • Local files can get lost, corrupted, or deleted

Best practices for handling state files

The recommended practice for managing state files is to use terraform’s built-in support for remote backends. These are:

Remote backend on Amazon Simple Storage Service (Amazon S3): You can configure terraform to store state files in an Amazon S3 bucket, which provides a durable and scalable storage solution. Storing state in Amazon S3 also enables collaboration, because you can share the state file with others.

Remote backend on Amazon S3 with Amazon DynamoDB: In addition to using an Amazon S3 bucket for managing the files, you can use an Amazon DynamoDB table to lock the state file. This will allow only one person to modify a particular state file at any given time. It will help to avoid conflicts and enable safe concurrent access to the state file.
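The locking idea can be sketched with an in-memory stand-in for the lock table. This models only the concept of a write that succeeds when no item with the same LockID exists. The class, its method names, and the lock ID format used in the example are assumptions for illustration, not terraform's implementation.

```typescript
// In-memory stand-in for a lock table keyed by LockID.
class InMemoryLockTable {
  private readonly items = new Map<string, string>(); // LockID -> current holder

  // Succeeds only if nobody holds the lock (mimics a conditional write).
  acquire(lockId: string, owner: string): boolean {
    if (this.items.has(lockId)) return false;
    this.items.set(lockId, owner);
    return true;
  }

  // Only the current holder may release the lock.
  release(lockId: string, owner: string): boolean {
    if (this.items.get(lockId) !== owner) return false;
    this.items.delete(lockId);
    return true;
  }
}

const lockTable = new InMemoryLockTable();
const lockId = 'tfbackend-bucket/terraform.tfstate'; // assumed ID format

console.log(lockTable.acquire(lockId, 'alice')); // true
console.log(lockTable.acquire(lockId, 'bob'));   // false, alice still holds the lock
console.log(lockTable.release(lockId, 'alice')); // true
console.log(lockTable.acquire(lockId, 'bob'));   // true
```

The second caller is refused rather than queued, which is why a concurrent run fails fast instead of overwriting the state.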

There are other options available as well, such as a remote backend on Terraform Cloud and third-party backends. Ultimately, the best method for managing terraform state files on AWS will depend on your specific requirements.

When deploying terraform on AWS, the preferred choice of managing state is using Amazon S3 with Amazon DynamoDB.

AWS configurations for managing state files

  1. Create an Amazon S3 bucket using terraform. Implement security measures for the Amazon S3 bucket by creating an AWS Identity and Access Management (AWS IAM) policy or an Amazon S3 bucket policy to restrict access, configure object versioning for data protection and recovery, and enable AES-256 server-side encryption with AWS KMS keys (SSE-KMS) for encryption control.
  2. Next, create an Amazon DynamoDB table using terraform with the primary key set to LockID. You can also set any additional configuration options such as read/write capacity units. Once the table is created, you will configure the terraform backend to use it for state locking by specifying the table name in the terraform block of your configuration.
  3. For a single AWS account with multiple environments and projects, you can use a single Amazon S3 bucket. If you have multiple applications in multiple environments across multiple AWS accounts, you can create one Amazon S3 bucket for each account. In that Amazon S3 bucket, you can create appropriate folders for each environment, storing project state files with specific prefixes.
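The steps above can be sketched as a small helper that renders a backend block for a given environment and project. The helper names, the key layout, and the environment and project values are hypothetical. The argument names follow the S3 backend as it existed when this post was written, so check the backend documentation for your terraform version.

```typescript
interface S3BackendSettings {
  bucket: string;
  region: string;
  lockTable: string;
  environment: string;
  project: string;
}

// Assumed key layout: one prefix per environment, then one per project.
function stateKey(environment: string, project: string): string {
  return `${environment}/${project}/terraform.tfstate`;
}

// Renders the text of a terraform block that points at S3 and the lock table.
function renderBackendBlock(s: S3BackendSettings): string {
  return [
    'terraform {',
    '  backend "s3" {',
    `    bucket         = "${s.bucket}"`,
    `    key            = "${stateKey(s.environment, s.project)}"`,
    `    region         = "${s.region}"`,
    `    dynamodb_table = "${s.lockTable}"`,
    '    encrypt        = true',
    '  }',
    '}',
  ].join('\n');
}

console.log(
  renderBackendBlock({
    bucket: 'tfbackend-bucket',
    region: 'eu-central-1',
    lockTable: 'tfstate-lock',
    environment: 'dev',         // hypothetical environment name
    project: 'sample-project',  // hypothetical project name
  }),
);
```

Generating the key from the environment and project keeps state files for different workloads under separate prefixes in the same bucket.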

Now that you know how to handle terraform state files on AWS, let’s look at an example of how you can configure them in a Continuous Integration pipeline in AWS.

Architecture

Figure 1: Example architecture on how to use terraform in an AWS CI pipeline

This diagram outlines the workflow implemented in this blog:

  1. The AWS CodeCommit repository contains the application code
  2. The AWS CodeBuild job contains the buildspec files and references the source code in AWS CodeCommit
  3. The AWS Lambda function contains the application code created after running terraform apply
  4. Amazon S3 contains the state file created after running terraform apply. Amazon DynamoDB locks the state file present in Amazon S3

Implementation

Pre-requisites

Before you begin, you must complete the following prerequisites:

Setting up the environment

  1. You need an AWS access key ID and secret access key to configure AWS CLI. To learn more about configuring the AWS CLI, follow these instructions.
  2. Clone the repo for complete example: git clone https://github.com/aws-samples/manage-terraform-statefiles-in-aws-pipeline
  3. After cloning, you will see the following folder structure:

Figure 2: AWS CodeCommit repository structure

Let’s break down the terraform code into 2 parts – one for preparing the infrastructure and another for preparing the application.

Preparing the Infrastructure

  1. The main.tf file is the core component. It does the following:
      • It creates an Amazon S3 bucket to store the state file. We configure bucket ACL, bucket versioning and encryption so that the state file is secure.
      • It creates an Amazon DynamoDB table which will be used to lock the state file.
      • It creates two AWS CodeBuild projects, one for ‘terraform plan’ and another for ‘terraform apply’.

    Note – It also has the code block (commented out by default) to create the AWS Lambda function, which you will use at a later stage.

  2. AWS CodeBuild projects should be able to access Amazon S3, Amazon DynamoDB, AWS CodeCommit and AWS Lambda. So, the AWS IAM role with the permissions required to access these resources is created via the iam.tf file.
  3. Next you will find two buildspec files named buildspec-plan.yaml and buildspec-apply.yaml that will execute terraform commands – terraform plan and terraform apply respectively.
  4. Modify the AWS region in the provider.tf file.
  5. Update the Amazon S3 bucket name, Amazon DynamoDB table name, AWS CodeBuild compute types, and AWS Lambda role and policy names to the required values using the variable.tf file. You can also use this file to easily customize parameters for different environments.

With this, the infrastructure setup is complete.

You can use your local terminal and execute the commands below, in the order shown, to deploy the above-mentioned resources in your AWS account.

terraform init
terraform validate
terraform plan
terraform apply

Once the apply is successful and all the above resources have been successfully deployed in your AWS account, proceed with deploying your application. 

Preparing the Application

  1. In the cloned repository, use the backend.tf file to create your own Amazon S3 backend to store the state file. By default, it has the values below. You can override them with your required values.
bucket = "tfbackend-bucket" 
key    = "terraform.tfstate" 
region = "eu-central-1"
  2. The repository has sample Python code stored in main.py that returns a simple message when invoked.
  3. In the main.tf file, you can find the following block of code to create and deploy the Lambda function that uses the main.py code (uncomment these code blocks).
data "archive_file" "lambda_archive_file" {
    ……
}

resource "aws_lambda_function" "lambda" {
    ……
}
  4. Now you can deploy the application using AWS CodeBuild instead of running terraform commands locally, which is the main advantage of using AWS CodeBuild.
  5. Run the two AWS CodeBuild projects to execute terraform plan and terraform apply again.
  6. Once successful, you can verify your deployment by testing the code in AWS Lambda. To test a Lambda function (console):
    • Open AWS Lambda console and select your function “tf-codebuild”
    • In the navigation pane, in Code section, click Test to create a test event
    • Provide your required name, for example “test-lambda”
    • Accept default values and click Save
    • Click Test again to trigger your test event “test-lambda”

It should return the sample message you provided in your main.py file. In the default case, it will display “Hello from AWS Lambda !” message as shown below.

Figure 3: Sample AWS Lambda function response

  7. To verify your state file, go to the Amazon S3 console and select the backend bucket created (tfbackend-bucket). It will contain your state file.

Figure 4: Amazon S3 bucket with terraform state file

  8. Open the Amazon DynamoDB console and check your table tfstate-lock. It will have an entry with LockID.

Figure 5: Amazon DynamoDB table with LockID

Thus, you have securely stored and locked your terraform state file using terraform backend in a Continuous Integration pipeline.

Cleanup

To delete all the resources created as part of the repository, run the following command from your terminal.

terraform destroy

Conclusion

In this blog post, we explored the fundamentals of terraform state files, discussed best practices for storing them securely within AWS environments, and covered mechanisms for locking these files to prevent conflicting concurrent changes. Finally, we showed you an example of how you can manage them efficiently in a Continuous Integration pipeline in AWS.

You can apply the same methodology to manage state files in a Continuous Delivery pipeline in AWS. For more information, see CI/CD pipeline on AWS, Terraform backends types, Purpose of terraform state.

Arun Kumar Selvaraj

Arun Kumar Selvaraj is a Cloud Infrastructure Architect with AWS Professional Services. He loves building world class capability that provides thought leadership, operating standards and platform to deliver accelerated migration and development paths for his customers. His interests include Migration, CCoE, IaC, Python, DevOps, Containers and Networking.

Manasi Bhutada

Manasi Bhutada is an ISV Solutions Architect based in the Netherlands. She helps customers design and implement well architected solutions in AWS that address their business problems. She is passionate about data analytics and networking. Beyond work she enjoys experimenting with food, playing pickleball, and diving into fun board games.

A new and improved AWS CDK construct for Amazon DynamoDB tables

Post Syndicated from Anirudh Sharma original https://aws.amazon.com/blogs/devops/a-new-and-improved-aws-cdk-construct-for-amazon-dynamodb-tables/

Recently, we launched a new AWS Cloud Development Kit (CDK) construct for Amazon DynamoDB tables, known as TableV2. This construct provides a number of new features in addition to what the original construct offered, enabling CDK authors to create global tables, simplifying the configuration of global secondary indexes and auto scaling, as well as supporting AWS CloudFormation drift detection and import operations. We believe that this new construct will make it easier for organizations to build and manage their DynamoDB tables at scale, in addition to providing more flexibility and control over the configuration of tables.

AWS CDK is a framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. Developers can use any of the supported programming languages to define reusable cloud components known as Constructs. A construct is a reusable and programmable component that represents AWS resources. CDK translates the high-level constructs defined by you into equivalent AWS CloudFormation templates. CloudFormation provisions the resources specified in the template, streamlining the usage of Infrastructure as Code (IaC) on AWS.

In this post we’ll explore:

  • The reasoning behind the creation of a new L2 construct for DynamoDB tables.
  • Features of new L2 constructs along with examples.
  • The benefits of leveraging this new construct in terms of scalability, flexibility, and simplicity.

By understanding the reasons behind its development and exploring its capabilities through practical examples, you will gain a comprehensive understanding of how this new L2 construct can enhance your DynamoDB experience. Let’s dive in.

Background

The original DynamoDB L2 Table construct is a powerful and versatile tool for creating and managing DynamoDB tables. It allows you to easily define the schema of your table, as well as the provisioned throughput and replicas. It also supports features like global tables, secondary indexes, and streams.

However, the Table construct uses a custom resource to add replicas to the primary table. This means that a separate Lambda function is created as the resource provider in addition to the Table resources (primary table and any replicas). This can be cumbersome to manage and can lead to drift detection issues.

The new TableV2 construct is an abstraction built on top of the GlobalTable L1 construct. It uses the CloudFormation resource AWS::DynamoDB::GlobalTable to create and manage DynamoDB tables. This has two important benefits:

  1. CloudFormation is in control and aware of all replicas that make up the Global Table, which means you will experience drift detection across all the replicas. With the original table construct, CloudFormation was not aware of any replicas since this was being handled through the Lambda function being used as a resource provider.
  2. No extra resource (Lambda function) is created when replicas are configured with TableV2. This eliminates the need to manage an extra resource and the risk of troubleshooting issues that may arise with the custom resource. TableV2 simplifies the setup and maintenance of DynamoDB tables by using native CloudFormation constructs to directly manage replicas, without the need for a Lambda function. This results in a more efficient and streamlined experience for users.

The new TableV2 construct provides more fine-grained control to customers over the replicas created as part of the Global Table. Specifically, customers can specify properties like contributor insights, deletion protection, point-in-time recovery, table class, read capacity, and global secondary index options on a per-replica basis.

This means that customers can tailor their table setup to meet their specific needs and optimize their overall experience with the Global Table feature. For example, a customer might want to enable contributor insights for all replicas, but only enable deletion protection for the primary replica. Or, a customer might want to use a different table class for each replica, depending on the expected workload.

The new TableV2 construct also offers greater flexibility and customization options by allowing customers to specify these properties on a per-replica basis. This can be helpful for customers who need to have different configurations for their replicas, or who want to fine-tune the performance and availability of their tables.

In the next section, we will explore each of these properties in more detail and how they can be specified in the new construct.

Features Walk-through

The new TableV2 construct is the recommended CDK DynamoDB construct for creating both single tables and global tables. In this section, we will review some specific aspects of the TableV2 construct and how they can be implemented. The walkthrough will cover features like Replicas, Billing, and Encryption, providing a comprehensive understanding of its capabilities.

Replicas

One of the most important benefits of the new L2 construct is the ability to configure properties on a per-replica basis. For example, the following code creates a global DynamoDB table with contributor insights and point-in-time recovery enabled for the table:

import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'Stack', { env: { region: 'us-west-2' } });

const globalTable = new dynamodb.TableV2(stack, 'GlobalTable', {
  partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
  contributorInsights: true,
  pointInTimeRecovery: true,
  replicas: [
    {
      region: 'us-east-1',
      tableClass: dynamodb.TableClass.STANDARD_INFREQUENT_ACCESS,
      pointInTimeRecovery: false,
    },
    {
      region: 'us-east-2',
      contributorInsights: false,
    },
  ],
});

// This is an ITableV2 instance for the replica table in us-east-1
const replica = globalTable.replica('us-east-1');

This code creates two replicas, one in the us-east-1 region and one in the us-east-2 region. For the replica in the us-east-1 region, we disable point-in-time recovery and set the table class to STANDARD_INFREQUENT_ACCESS. For the replica in the us-east-2 region, we disable contributor insights. The TableV2 construct also enables users to work with individual instances of the replicas in a global table via the replica() method. We see how this can be utilized from the above code where an ITableV2 instance representing the replica in us-east-1 is returned.

This is particularly useful for the grant() and metric() methods. For example, the following code gives a user write access to a replica in us-east-1 region:

import { Construct } from 'constructs';
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { AttributeType, ITableV2, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import * as iam from 'aws-cdk-lib/aws-iam';


class FooStack extends Stack {
  public readonly globalTable: TableV2;

  public constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);

    this.globalTable = new TableV2(this, 'GlobalTable', {
      partitionKey: { name: 'pk', type: AttributeType.STRING },
      replicas: [
        { region: 'us-east-1' },
        { region: 'us-east-2' },
      ],
    });
  }
}

interface BarStackProps extends StackProps {
  readonly replicaTable: ITableV2;
}

class BarStack extends Stack {
  public constructor(scope: Construct, id: string, props: BarStackProps) {
    super(scope, id, props);
    const user = new iam.User(this, 'User');

    // user is given grantWriteData permissions to replica in us-east-1
    props.replicaTable.grantWriteData(user);
  }
}

const app = new App();

const fooStack = new FooStack(app, 'FooStack', { env: { region: 'us-west-2', account: process.env.CDK_DEFAULT_ACCOUNT } });
const barStack = new BarStack(app, 'BarStack', {
  replicaTable: fooStack.globalTable.replica('us-east-1'),
  env: { region: 'us-east-1', account: process.env.CDK_DEFAULT_ACCOUNT },
});

Before the replica() method was introduced, grant methods on the original Table construct applied to the primary table and all replicas. This was because there was no way to pull out a specific replica. This limited a user’s ability to grant a specific principal read, write, or read/write permission to a specific replica. The replica() method enables granting specific permissions to individual replicas in a global table. It maintains consistent behavior across all methods in the ITableV2 interface, including grants and metrics.
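The scoping difference can be sketched by listing the table ARNs each kind of grant would cover. The model type, the function names, the account ID, and the table name are hypothetical, and the policy that CDK actually synthesizes may contain additional resources.

```typescript
// Builds a DynamoDB table ARN for one region.
function tableArn(region: string, account: string, tableName: string): string {
  return `arn:aws:dynamodb:${region}:${account}:table/${tableName}`;
}

// Hypothetical, simplified description of a global table.
interface GlobalTableModel {
  account: string;
  tableName: string;
  primaryRegion: string;
  replicaRegions: string[];
}

// Behavior described for the original construct: the grant covers the
// primary table and every replica.
function arnsForWholeTableGrant(t: GlobalTableModel): string[] {
  return [t.primaryRegion, ...t.replicaRegions].map((region) =>
    tableArn(region, t.account, t.tableName),
  );
}

// With a per-replica handle, the grant can be scoped to a single region.
function arnsForReplicaGrant(t: GlobalTableModel, region: string): string[] {
  if (region !== t.primaryRegion && t.replicaRegions.indexOf(region) === -1) {
    throw new Error(`No replica in ${region}`);
  }
  return [tableArn(region, t.account, t.tableName)];
}

const sampleTable: GlobalTableModel = {
  account: '111122223333',      // placeholder account ID
  tableName: 'SampleTable',     // hypothetical table name
  primaryRegion: 'us-west-2',
  replicaRegions: ['us-east-1', 'us-east-2'],
};

console.log(arnsForWholeTableGrant(sampleTable).length);    // 3
console.log(arnsForReplicaGrant(sampleTable, 'us-east-1')); // one ARN, in us-east-1
```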

Billing

Table billing is easily configured using the onDemand() or provisioned() static methods of the Billing class. If provisioned billing is configured, the user must provide read and write capacity, which can be easily configured using the fixed() or autoscaled() static methods of the Capacity class.

For example, to configure on-demand billing:

import * as cdk from 'aws-cdk-lib';
import { AttributeType, Billing, TableClass, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';


export class DynamodbStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    new TableV2(this, 'DynamoDBTable', {
      partitionKey: { name: 'id', type: AttributeType.STRING},
      replicas: [
        {region: 'us-east-2'},
        {region: 'us-west-1'}
      ],
      billing: Billing.onDemand(),
      tableClass: TableClass.STANDARD
    })
  }
}

To configure provisioned billing:

import * as cdk from 'aws-cdk-lib';
import { AttributeType, Billing, Capacity, TableClass, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';

export class DynamodbStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    new TableV2(this, 'DynamoDBTable', {
      partitionKey: { name: 'id', type: AttributeType.STRING},
      replicas: [
        {region: 'us-east-2'},
        {region: 'us-west-1'}
      ],
      billing: Billing.provisioned({
        readCapacity: Capacity.fixed(5),
        writeCapacity: Capacity.autoscaled({maxCapacity: 10})
      }),
      tableClass: TableClass.STANDARD
    })
  }
}

Note that with the previous Table construct, users had to set a billingMode property and configure readCapacity and writeCapacity as separate properties. Additionally, configuring autoscaled capacity required calling the autoScaleReadCapacity() or autoScaleWriteCapacity() method on an instance of the Table construct. Lastly, since readCapacity, writeCapacity, and billingMode were all individual properties, a user had to know not to provision read and write capacity for a table with PAY_PER_REQUEST billing mode. With the new Billing class, the user is guided into providing necessary properties via the onDemand() and provisioned() static methods.
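The design idea behind the Billing class can be illustrated with a plain TypeScript sketch that does not use the CDK. The type names, the helper objects, and the default minimum capacity are assumptions made for this sketch; they are not the CDK's implementation. Capacity settings exist only on the provisioned variant, so an on-demand table with read or write capacity cannot be expressed.

```typescript
type CapacityModel =
  | { kind: 'fixed'; units: number }
  | { kind: 'autoscaled'; minCapacity: number; maxCapacity: number };

type BillingModel =
  | { mode: 'PAY_PER_REQUEST' }
  | { mode: 'PROVISIONED'; readCapacity: CapacityModel; writeCapacity: CapacityModel };

// Factory helpers that mirror the shape of fixed() and autoscaled().
const capacity = {
  fixed: (units: number): CapacityModel => ({ kind: 'fixed', units }),
  autoscaled: (o: { minCapacity?: number; maxCapacity: number }): CapacityModel => ({
    kind: 'autoscaled',
    minCapacity: o.minCapacity ?? 1, // default chosen for this sketch
    maxCapacity: o.maxCapacity,
  }),
};

// Factory helpers that mirror the shape of onDemand() and provisioned().
const billing = {
  onDemand: (): BillingModel => ({ mode: 'PAY_PER_REQUEST' }),
  provisioned: (o: { readCapacity: CapacityModel; writeCapacity: CapacityModel }): BillingModel => ({
    mode: 'PROVISIONED',
    readCapacity: o.readCapacity,
    writeCapacity: o.writeCapacity,
  }),
};

function describeBilling(b: BillingModel): string {
  if (b.mode === 'PAY_PER_REQUEST') return 'on-demand';
  const show = (c: CapacityModel): string =>
    c.kind === 'fixed' ? `${c.units}` : `${c.minCapacity}-${c.maxCapacity}`;
  return `provisioned (read ${show(b.readCapacity)}, write ${show(b.writeCapacity)})`;
}

console.log(describeBilling(billing.onDemand())); // on-demand
console.log(
  describeBilling(
    billing.provisioned({
      readCapacity: capacity.fixed(5),
      writeCapacity: capacity.autoscaled({ maxCapacity: 10 }),
    }),
  ),
); // provisioned (read 5, write 1-10)
```

With separate billingMode, readCapacity, and writeCapacity properties, the same mistake is only caught when the configuration is validated, not when it is written.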

Encryption

The TableEncryptionV2 class allows you to provide your own KMS keys for each replica instead of using the default AWS owned keys, thus encrypting every replica with a custom KMS key. This provides more granular control over the encryption of your DynamoDB tables.

Here is an example of how to use the TableEncryptionV2 class to encrypt each replica of a global table with a custom KMS key:

import * as cdk from 'aws-cdk-lib';
import { AttributeType, TableEncryptionV2, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import { IKey, Key } from 'aws-cdk-lib/aws-kms';
import { Construct } from 'constructs';

interface KMSkeys extends cdk.StackProps {
  kmsuswest1: IKey;
  kmsuseast2: IKey;
}

export class GlobalTableStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: KMSkeys) {
    super(scope, id, props);

    // KMS key ARNs for the replica tables, keyed by replica region
    const replicaTableKeys = {
      'us-west-1': props.kmsuswest1.keyArn,
      'us-east-2': props.kmsuseast2.keyArn,
    };

    // KMS key for the table in this stack's own region
    const tableKmsKey = new Key(this, 'TableKMSKey', {
      alias: 'KMSuswest2Stack',
    });

    new TableV2(this, 'GlobalTable', {
      tableName: 'FooTableFour',
      encryption: TableEncryptionV2.customerManagedKey(tableKmsKey, replicaTableKeys),
      partitionKey: {
        name: 'FooHashKey',
        type: AttributeType.STRING,
      },
      replicas: [
        { region: 'us-west-1' },
        { region: 'us-east-2' },
      ],
    });
  }
}

The ability to provide custom KMS keys for each replica can help to improve the security of your DynamoDB tables. It also gives you more control over the encryption of your data. This can help you to meet specific compliance requirements.
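Because the replica keys are passed as a map from region to key ARN, a quick pre-check that every replica region has an entry can catch a missing key early. The helper below is a hypothetical convenience written in plain TypeScript, not part of the CDK, and the key ARN is a placeholder.

```typescript
// Returns the replica regions that have no entry in the region-to-key-ARN map.
function regionsMissingKeys(
  replicaRegions: string[],
  replicaKeyArns: Record<string, string>,
): string[] {
  return replicaRegions.filter(
    (region) => !Object.prototype.hasOwnProperty.call(replicaKeyArns, region),
  );
}

const replicaKeyArns: Record<string, string> = {
  // Placeholder ARN for illustration only.
  'us-west-1': 'arn:aws:kms:us-west-1:111122223333:key/example-key-id',
};

console.log(regionsMissingKeys(['us-west-1', 'us-east-2'], replicaKeyArns)); // [ 'us-east-2' ]
```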

Conclusion

In this post, I introduced the new AWS CDK TableV2 construct, highlighting its advantages over the original construct. Notably, TableV2 enables drift detection for replica tables and eliminates the need for an extra Lambda function custom resource. I delved into practical implementations, focusing on three key aspects: Replicas, Billing, and Encryption.

To summarize, TableV2 marks a substantial improvement over the original construct in several ways, such as:

  • Direct support for global tables: TableV2 makes it easy to create and manage global DynamoDB tables.
  • Easier configuration of global secondary indexes and Autoscaling: TableV2 provides a simplified and streamlined process for configuring global secondary indexes and Autoscaling.
  • More granular control over replicas: TableV2 allows you to configure properties on a per-replica basis, giving you more control over the performance and availability of your tables.
  • Improved API design and user experience: TableV2 improves the API design and user experience by implementing new classes for billing, capacity, and encryption.

Overall, TableV2 is a powerful and flexible construct that makes it easier to build and manage DynamoDB tables at scale. It is the preferred CDK DynamoDB construct for creating both single tables and global tables.

If you’re new to CDK and eager to get started, we highly recommend checking out the CDK documentation and the CDK workshop.

Anirudh Sharma

Anirudh is a Cloud Support Engineer 2 with an extensive background in DevOps offerings at AWS, and he is also a Subject Matter Expert in AWS ElasticBeanstalk and AWS CodeDeploy services. He loves helping customers and learning new services and technologies. He also loves travelling and has a goal to visit Japan someday. He is a Golden State Warriors fan and loves spending time with his family.

Getting started with Projen and AWS CDK

Post Syndicated from Michael Tran original https://aws.amazon.com/blogs/devops/getting-started-with-projen-and-aws-cdk/

In the modern world of cloud computing, Infrastructure as Code (IaC) has become a vital practice for deploying and managing cloud resources. AWS Cloud Development Kit (AWS CDK) is a popular open-source framework that allows developers to define cloud resources using familiar programming languages. A related open source tool called Projen is a powerful project generator that simplifies the management of complex software configurations. In this post, we’ll explore how to get started with Projen and AWS CDK, and discuss the pros and cons of using Projen.

What is Projen?

Building modern and high quality software requires a large number of tools and configuration files to handle tasks like linting, testing, and automating releases. Each tool has its own configuration interface, such as JSON or YAML, and a unique syntax, increasing maintenance complexity.

When starting a new project, you rarely start from scratch. More often you use a scaffolding tool (for instance, create-react-app) to generate a new project structure. A large amount of configuration is created on your behalf, and you take ownership of those files. Moreover, there are many project generation tools, with new ones created almost every day.

Projen is a project generator that helps developers to efficiently manage project configuration files and build high quality software. It allows you to define your project structure and configuration in code, making it easier to maintain and share across different environments and projects.

Out of the box, Projen supports multiple project types like AWS CDK construct libraries, React applications, Java projects, and Python projects. New project types can be added by contributors, and projects can be developed in multiple languages. Projen uses the jsii library, which allows us to write APIs once and generate libraries in several languages. Moreover, Projen provides a single interface, the projenrc file, to manage the configuration of your entire project!

The diagram below provides an overview of the deployment process of AWS cloud resources using Projen:

Projen Overview of Deployment process of AWS Resources

 

  1. In this example, Projen is used to generate a new project, for instance, a new CDK TypeScript application.
  2. Developers define their infrastructure and application code using AWS CDK resources. To modify the project configuration, developers use the projenrc file instead of directly editing files like package.json.
  3. The project is synthesized to produce an AWS CloudFormation template.
  4. The CloudFormation template is deployed in an AWS account, and provisions AWS cloud resources.

Diagram 1 – Projen packaged features: Projen helps get your project started and allows you to focus on coding instead of worrying about the other project variables. It comes out of the box with linting, unit tests and code coverage, and a number of GitHub Actions workflows for release, versioning, and dependency management.

Pros and Cons of using Projen

Pros

  1. Consistency: Projen ensures consistency across different projects by allowing you to define standard project templates. You don’t need to use different project generators, only Projen.
  2. Version Control: Since project configuration is defined in code, it can be version-controlled, making it easier to track changes and collaborate with others.
  3. Extensibility: Projen supports various plugins and extensions, allowing you to customize the project configuration to fit your specific needs.
  4. Integration with AWS CDK: Projen provides seamless integration with AWS CDK, simplifying the process of defining and deploying cloud resources.
  5. Polyglot CDK constructs library: Build once, run in multiple runtimes. Projen can convert and publish a CDK construct developed in TypeScript to Java (Maven) and Python (PyPI) with jsii support.
  6. API documentation: If you are building a CDK construct, Projen can generate API documentation from the code comments.

Cons

  1. Microsoft Windows support. There are a number of open issues about Projen not completely working with the Windows environment (https://github.com/projen/projen/issues/2427 and https://github.com/projen/projen/issues/498).
  2. The framework, Projen, is very opinionated with a lot of assumptions on architecture, best practices and conventions.
  3. Projen is still not GA, with the version at the time of this writing at v0.77.5.

Walkthrough

Step 1: Set up prerequisites

  • An AWS account
  • Download and install Node
  • Install yarn
  • AWS CLI : configure your credentials
  • Deploying stacks with the AWS CDK requires dedicated Amazon S3 buckets and other containers to be available to AWS CloudFormation during deployment (More information).

Note: Projen doesn’t need to be installed globally. You will be using npx to run Projen which takes care of all required setup steps. npx is a tool for running npm packages that:

  • live inside of a local node_modules folder
  • are not installed globally.

npx comes bundled with npm version 5.2+

Step 2: Create a New Projen Project

You can create a new Projen project using the following command:

mkdir test_project && cd test_project
npx projen new awscdk-app-ts

This command creates a new TypeScript project with AWS CDK support. The exhaustive list of supported project types is available through the official documentation: Projen.io, or by running the npx projen new command without a project type. Projen also supports npx projen new awscdk-construct, which creates a reusable construct that can then be published to package managers.

The created project structure should be as follows:

test_project
| .github/
| .projen/
| src/
| test/
| .eslintrc
| .gitattributes
| .gitignore
| .mergify.yml
| .npmignore
| .projenrc.js
| cdk.json
| LICENSE
| package.json
| README.md
| tsconfig.dev.json
| yarn.lock

Projen generated a new project including:

  • An empty git repository, with the associated GitHub workflow files to build and upgrade the project. The release workflow can be customized with projen tasks.
  • .projenrc.js, the main Projen configuration file for the project.
  • A tasks.json file for integration with Visual Studio Code.
  • A src folder containing an empty CDK stack.
  • License and README files.
  • package.json, which contains functional metadata about the project such as name, version, and dependencies.
  • .gitignore and .gitattributes files to manage your files with git.
  • .eslintrc, for identifying and reporting on patterns in JavaScript code.
  • .npmignore, to keep files out of the published npm package.
  • .mergify.yml, for managing pull requests.
  • tsconfig.json, to configure the compiler options.

Most of the generated files include a disclaimer:

# ~~ Generated by projen. To modify, edit .projenrc.js and run "npx projen".

Projen’s power lies in its single configuration file, .projenrc.js. By editing this file, you can manage your project’s lint rules, dependencies, .gitignore, and more. Projen will propagate your changes across all generated files, simplifying and unifying dependency management across your projects.

Projen generated files are considered implementation details and are not meant to be edited manually. If you do make manual changes, they will be overwritten the next time you run npx projen.

To edit your project configuration, simply edit .projenrc.js and then run npx projen to synthesize again. For more information on the Projen API, please see the documentation: http://projen.io/api/API.html.

Projen uses the configuration in the .projenrc.js file to instantiate a new AwsCdkTypeScriptApp with some basic metadata: the project name, CDK version and the default release branch. Additional APIs are available for this project type to customize it (for instance, to add runtime dependencies).
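As a rough illustration, such a configuration might look like the sketch below. It is written in TypeScript, as a hypothetical .projenrc.ts, rather than the JavaScript file that the command above generates. The cdkVersion value and the example dependency are assumptions, not values taken from a real run.

```typescript
// .projenrc.ts: hypothetical TypeScript variant of the generated .projenrc.js
import { awscdk } from 'projen';

const project = new awscdk.AwsCdkTypeScriptApp({
  name: 'test_project',
  cdkVersion: '2.1.0', // assumed value; keep whatever version Projen generated for you
  defaultReleaseBranch: 'main',
  deps: ['uuid'], // example runtime dependency, chosen only for illustration
});

// Regenerates package.json, tsconfig, workflows, and the other managed files.
project.synth();
```

Because every generated file is derived from this one object, adding a dependency here and running npx projen is enough to update package.json.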

Let’s try to modify a property and see how Projen reacts. As an example, let’s update the project name in .projenrc.js:

name: 'test_project_2',

and then run the npx projen command:

npx projen

Once done, you can see that the project name was updated in the package.json file.

Step 3: Define AWS CDK Resources

Inside your Projen project, you can define AWS CDK resources using familiar programming languages like TypeScript. Here’s an example of defining an Amazon Simple Storage Service (Amazon S3) bucket:

1. Navigate to your main.ts file in the src/ directory
2. Modify the imports at the top of the file as follows:

import { App, CfnOutput, Stack, StackProps } from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

3. Replace line 9 “// define resources here…” with the code below:

const bucket = new s3.Bucket(this, 'MyBucket', {
  versioned: true,
});

new CfnOutput(this, 'TestBucket', { value: bucket.bucketArn });

Step 4: Synthesize and Deploy

Next we will bootstrap our application. Run the following in a terminal:

$ npx cdk bootstrap

Once you’ve defined your resources, you can synthesize a cloud assembly, which includes a CloudFormation template (or many depending on the application) using:

$ npx projen build

npx projen build will perform several actions:

  1. Build the application
  2. Synthesize the CloudFormation template
  3. Run tests and linter

The synth() method of Projen performs the actual synthesizing (and updating) of all configuration files managed by Projen. This is achieved by deleting all Projen-managed files (if there are any), and then re-synthesizing them based on the latest configuration specified by the user.

You can find an exhaustive list of the available npx projen commands in .projen/tasks.json. You can also use the Projen API project.addTask to add a new task that performs any custom action you need. Tasks are a project-level feature to define a project command system backed by shell scripts.
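A custom task could be added along the following lines. This is a minimal sketch, again written as a hypothetical .projenrc.ts; the task name print-stacks and its shell command are illustrative choices, and the cdkVersion value is an assumption.

```typescript
// .projenrc.ts: hypothetical sketch of registering a custom task
import { awscdk } from 'projen';

const project = new awscdk.AwsCdkTypeScriptApp({
  name: 'test_project',
  cdkVersion: '2.1.0', // assumed value
  defaultReleaseBranch: 'main',
});

// Illustrative task: lists the stacks defined in the CDK app.
project.addTask('print-stacks', {
  description: 'List the stacks in this CDK app',
  exec: 'cdk list',
});

project.synth();
```

After running npx projen to synthesize, the task should appear in .projen/tasks.json and can be run with npx projen print-stacks.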

Deploy the CDK application:

$ npx projen deploy

Projen will use the cdk deploy command to deploy the CloudFormation stack in the configured AWS account by creating and executing a change set based on the template generated by CDK synthesis. The output of the step above should look as follows:

deploy | cdk deploy

✨ Synthesis time: 3.28s

toto-dev: start: Building 387a3a724050aec67aa083b74c69485b08a876f038078ec7ea1018c7131f4605:263905523351-us-east-1
toto-dev: success: Built 387a3a724050aec67aa083b74c69485b08a876f038078ec7ea1018c7131f4605:263905523351-us-east-1
toto-dev: start: Publishing 387a3a724050aec67aa083b74c69485b08a876f038078ec7ea1018c7131f4605:263905523351-us-east-1
toto-dev: success: Published 387a3a724050aec67aa083b74c69485b08a876f038078ec7ea1018c7131f4605:263905523351-us-east-1
toto-dev: deploying... [1/1]
toto-dev: creating CloudFormation changeset...

✅ testproject-dev

✨ Deployment time: 33.48s

Outputs:
testproject-dev.TestBucket = arn:aws:s3:::testproject-dev-mybucketf68f3ff0-1xy2f0vk0ve4r
Stack ARN:
arn:aws:cloudformation:us-east-1:263905523351:stack/testproject-dev/007e7b20-48df-11ee-b38d-0aa3a92c162d

✨ Total time: 36.76s

The application was successfully deployed in the configured AWS account! Also, the Amazon Resource Name (ARN) of the S3 bucket created is available through the CloudFormation stack Outputs tab, and displayed in your terminal under the ‘Outputs’ section.

Clean up

Delete CloudFormation Stack

To clean up the resources created in this post, navigate to the CloudFormation console and delete the stack created. You can also perform the same task programmatically:

$ npx projen destroy

Which should produce the following output:

destroy | cdk destroy
Are you sure you want to delete: testproject-dev (y/n)? y
testproject-dev: destroying... [1/1]

✅ testproject-dev: destroyed

Delete S3 Buckets

The S3 bucket is not deleted with the stack, because the CDK Bucket construct defaults to a removal policy of RETAIN. Navigate to the S3 console and delete the created bucket. If you added files to that bucket, you will need to empty it before deletion. See the Deleting a bucket documentation for more information.
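For throwaway environments, you may prefer that the bucket is removed together with the stack. The sketch below shows one way to do that in src/main.ts, assuming the stack class is named MyStack and the stack ID matches the deployment output above. Both removalPolicy and autoDeleteObjects are standard Bucket properties, but whether deleting data automatically is acceptable depends on your workload.

```typescript
import { App, CfnOutput, RemovalPolicy, Stack, StackProps } from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

// Assumed class name; adjust to match your generated main.ts.
export class MyStack extends Stack {
  constructor(scope: Construct, id: string, props: StackProps = {}) {
    super(scope, id, props);

    const bucket = new s3.Bucket(this, 'MyBucket', {
      versioned: true,
      // Delete the bucket when the stack is destroyed instead of retaining it.
      removalPolicy: RemovalPolicy.DESTROY,
      // Empty the bucket first; CDK adds a custom resource to do this.
      autoDeleteObjects: true,
    });

    new CfnOutput(this, 'TestBucket', { value: bucket.bucketArn });
  }
}

const app = new App();
new MyStack(app, 'testproject-dev');
app.synth();
```

With these two properties set, npx projen destroy should remove the bucket and its contents, so no manual cleanup in the S3 console is needed.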

Conclusion

Projen and AWS CDK together provide a powerful combination for managing cloud resources and project configuration. By leveraging Projen, you can ensure consistency, version control, and extensibility across your projects. The integration with AWS CDK allows you to define and deploy cloud resources using familiar programming languages, making the entire process more developer-friendly.

Whether you’re a seasoned cloud developer or just getting started, Projen and AWS CDK offer a streamlined approach to cloud resource management. Give it a try and experience the benefits of Infrastructure as Code with the flexibility and power of modern development tools.

Alain Krok

Alain Krok is a Senior Solutions Architect with a passion for emerging technologies. His past experience includes designing and implementing IIoT solutions for the oil and gas industry and working on robotics projects. He enjoys pushing the limits and indulging in extreme sports when he is not designing software.

 

Dinesh Sajwan

Dinesh Sajwan is a Senior Solutions Architect. His passion for emerging technologies allows him to stay on the cutting edge and identify new ways to apply the latest advancements to solve even the most complex business problems. His diverse expertise and enthusiasm for both technology and adventure position him as a uniquely creative problem-solver.

Michael Tran

Michael Tran is a Sr. Solutions Architect with Prototyping Acceleration team at Amazon Web Services. He provides technical guidance and helps customers innovate by showing the art of the possible on AWS. He specializes in building prototypes in the AI/ML space. You can contact him @Mike_Trann on Twitter.

How to build a consistent workflow for development and operations teams

Post Syndicated from Mark Paulsen original https://github.blog/2023-02-28-how-to-build-a-consistent-workflow-for-development-and-operations-teams/

In GitHub’s recent 2022 State of the Octoverse report, HashiCorp Configuration Language (HCL) was the fastest growing programming language on GitHub. HashiCorp is a leading provider of Infrastructure as Code (IaC) automation for cloud computing. HCL is HashiCorp’s configuration language used with tools like Terraform and Vault to deliver IaC capabilities in a human-readable configuration file across multi-cloud and on-premises environments.

HCL’s growth shows the importance of bringing together the worlds of infrastructure, operations, and developers. This was always the goal of DevOps. But in reality, these worlds remain siloed for many enterprises.

In this post we’ll look at the business and cultural influences that bring development and operations together, as well as security, governance, and networking teams. Then, we’ll explore how GitHub and HashiCorp can enable consistent workflows and guardrails throughout the entire CI/CD pipeline.

The traditional world of operations (Ops)

Armon Dadgar, co-founder of HashiCorp, uses the analogy of a tree to explain the traditional world of Ops. The trunk includes all of the shared and consistent services you need in an enterprise to get stuff done. Think of things like security requirements, Active Directory, and networking configurations. A branch represents the different lines of business within an enterprise, providing services and products internally or externally. The leaves represent the different environments and technologies where your software or services are deployed: cloud, on-premises, and container environments, among others.

In many enterprises, the communication channels and processes between these different business areas can be cumbersome and expensive. If there is a significant change to the infrastructure or architecture, multiple tickets are typically submitted to multiple teams for reviews and approvals across different parts of the enterprise. Change Advisory Boards are commonly used to protect the organization. The change is usually unable to proceed unless the documentation is complete. Commonly, there’s a set of governance logs and auditable artifacts which are required for future audits.

Wouldn’t it be more beneficial for companies if teams had an optimized, automated workflow that could be used to speed up delivery and empower teams to get the work done in a set of secure guardrails? This could result in significant time and cost savings, leading to added business value.

After all, a recent Forrester report found that over three years, using GitHub drove 433% ROI for a composite organization simply with the combined power of all GitHub’s enterprise products. Not to mention the potential for time savings and efficiency increase, along with other qualitative benefits that come with consistency and streamlining work.

Your products and services would be deployed through an optimized path with security and governance built-in, rather than a sluggish, manual and error-prone process. After all, isn’t that the dream of DevOps, GitOps, and Cloud Native?

Introducing IaC

Let’s use a different analogy. Think of IaC as the blueprint for resources (such as servers, databases, networking components, or PaaS services) that host our software and services.

If you were architecting a hospital or a school, you wouldn’t use the same overall blueprint for both scenarios as they serve entirely different purposes with significantly different requirements. But there are likely building blocks or foundations that can be reused across the two designs.

IaC solutions, such as HCL, allow us to define and reuse these building blocks, similarly to how we reuse methods, modules, and package libraries in software development. With it being IaC, we can start adopting the same recommended practices for infrastructure that we use when collaborating and deploying on applications.

After all, we know that teams that adopt DevOps methodologies will see improved productivity, cloud-enabled scalability, collaboration, and security.

A better way to deliver

With that context, let’s explore the tangible benefits that we gain in codifying our infrastructure and how they can help us transform our traditional Ops culture.

Storing code in repositories

Let’s start with the lowest-hanging fruit. With it being IaC, we can start storing infrastructure and architectural patterns in source code repositories such as GitHub. This gives us a single source of truth with a complete version history. This allows us to easily rollback changes if needed, or deploy a specific version of the truth from history.

Teams across the enterprise can collaborate in separate branches in a Git repository. Branches allow teams and individuals to be productive in “their own space” and not have to worry about negatively impacting the in-progress work of other teams, away from the “production” source of truth (typically, the main branch).

Terraform modules, the reusable building blocks mentioned in the last section, are also stored and versioned in Git repositories. From there, modules can be imported to the private registry in Terraform Cloud to make them easily discoverable by all teams. When a new release version is tagged in GitHub, it is automatically updated in the registry.

Collaborate early and often

As we discussed above, teams can make changes in separate branches to not impact the current state. But what happens when you want to bring those changes to the production codebase? If you’re unfamiliar with Git, then you may not have heard of a pull request before. As the name implies, we can “pull” changes from one branch into another.

Pull requests in GitHub are a great way to collaborate with other users in the team, being able to get peer reviews so feedback can be incorporated into your work. The pull request process is deliberately very social, to foster collaboration across the team.

In GitHub, you could consider setting branch protection rules so that direct changes to your main branch are not allowed. That way, all users must go through a pull request to get their code into production. You can even specify the minimum number of reviewers needed in branch protection rules.

Tip: you could use a special type of file, the CODEOWNERS file in GitHub, to automatically add reviewers to a pull request based on the files being edited. For example, all HCL files may need a review by the core infrastructure team. Or IaC configurations for line of business core banking systems might require review by a compliance team.

Unlike Change Advisory Boards, which typically take place on a specified cadence, pull requests become a natural part of the process to bring code into production. The quality of the decisions and discussions also evolves. Rather than being a “yes/no” decision with recommendations in an external system, the context and recommendations can be viewed directly in the pull request.

Collaboration is also critical in the provisioning process, and GitHub’s integrations with Terraform Cloud will help you scale these processes across multiple teams. Terraform Cloud offers workflow features like secure storage for your Terraform state and direct integration with your GitHub repositories for a turnkey experience around the pull request and merge lifecycle.

Bringing automated quality reviews into the process

Building on from the previous section, pull requests also allow us to automatically check the quality of the changes that are being proposed. It is common in software to check that the application still compiles correctly, that unit tests pass, that no security vulnerabilities are introduced, and more.

From an IaC perspective, we can bring similar automated checks into our process. This is achieved by using GitHub status checks and gives us a clear understanding of whether certain criteria have been met or not.

GitHub Actions are commonly used to execute some of these automated checks in pull requests on GitHub. To determine the quality of IaC, you could include checks such as:

  • Validating that the code is syntactically correct (for example, terraform validate).
  • Linting the code to ensure a certain set of standards are being followed (for example, TFLint or terraform fmt).
  • Static code analysis to identify any misconfigurations in your infrastructure at “design time” (for example, tfsec or terrascan).
  • Relevant unit or integration tests (using tools such as Terratest).
  • Deploying the infrastructure into a “smoke test” environment to verify that the infrastructure configuration (along with a known set of parameters) deploys into the desired state.

Getting started with Terraform on GitHub is easy. Versions of Terraform are installed on our Linux-based GitHub-hosted runners, and HashiCorp has an official GitHub Action to set up Terraform on a runner using a Terraform version that you specify.

Compliance as an automated check

We recently blogged about building compliance, security, and audit into your delivery pipelines and the benefits of this approach. When you add IaC to your existing development pipelines and workflows, you’ll have the ability to describe previously manual compliance testing and artifacts as code directly into your HCL configurations files.

A natural extension to IaC, policy as code allows your security and compliance teams to centralize the definitions of your organization’s requirements. Terraform Cloud’s built-in support for the HashiCorp Sentinel and Open Policy Agent (OPA) frameworks allows policy sets to be automatically ingested from GitHub repositories and applied consistently across all provisioning runs. This ensures policies are applied before misconfigurations have a chance to make it to production.

An added bonus mentioned in another recent blog is the ability to leverage AI-powered compliance solutions to optimize your delivery even more. Imagine a future where generative AI could create compliance-focused unit-tests across your entire development and infrastructure delivery pipeline with no manual effort.

Security in the background

You may have heard of Dependabot, our handy tool to help you keep your dependencies up to date. But did you know that Dependabot supports Terraform? That means you could rely on Dependabot to help keep your Terraform provider and module versions up to date.

Checks complete, time to deploy

With the checks complete, it’s now time for us to deploy our new infrastructure configuration! Branching and deployment strategies are beyond the scope of this post, so we’ll leave that for another discussion.

However, GitHub Actions can help us with the deployment aspect as well! As we explained earlier, getting started with Terraform on GitHub is easy. Versions of Terraform are installed on our Linux-based GitHub-hosted runners, and HashiCorp has an official GitHub Action to set up Terraform on a runner using a Terraform version that you specify.

But you can take this even further! In Terraform, it is very common to use the command terraform plan to understand the impact of changes before you push them to production. terraform apply is then used to execute the changes.

Reviewing environment changes in a pull request

HashiCorp provides an example of automating Terraform with GitHub Actions. This example orchestrates a release through Terraform Cloud by using GitHub Actions. The example takes the output of the terraform plan command and copies the output into your pull request for approval (again, this depends on the development flow that you’ve chosen).

Reviewing environment changes using GitHub Actions environments

Let’s consider another example, based on the example from HashiCorp. GitHub Actions has a built-in concept of environments. Think of these environments as a logical mapping to a target deployment location. You can associate a protection rule with an environment so that an approval is given before deploying.

So, with that context, let’s create a GitHub Action workflow that has two environments—one which is used for planning purposes, and another which is used for deployment:

name: 'Review and Deploy to EnvironmentA'
on: [push]

jobs:
  review:
    name: 'Terraform Plan'
    environment: environment_a_plan
    runs-on: ubuntu-latest

    steps:
      - name: 'Checkout'
        uses: actions/checkout@v2

      - name: 'Terraform Setup'
        uses: hashicorp/setup-terraform@v2
        with:
          cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}

      - name: 'Terraform Init'
        run: terraform init


      - name: 'Terraform Format'
        run: terraform fmt -check

      - name: 'Terraform Plan'
        run: terraform plan -input=false

  deploy:
    name: 'Terraform'
    environment: environment_a_deploy
    runs-on: ubuntu-latest
    needs: [review]

    steps:
      - name: 'Checkout'
        uses: actions/checkout@v2

      - name: 'Terraform Setup'
        uses: hashicorp/setup-terraform@v2
        with:
          cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}

      - name: 'Terraform Init'
        run: terraform init

      - name: 'Terraform Apply'
        run: terraform apply -auto-approve -input=false

Before executing the workflow, we can create the environments in the GitHub repository and associate protection rules with the environment_a_deploy environment. This means that a review is required before a production deployment.

Learn more

Check out HashiCorp’s Practitioner’s Guide to Using HashiCorp Terraform Cloud with GitHub for some common recommendations on getting started. And find out how we at GitHub are using Terraform to deliver mission-critical functionality faster and at lower cost.

Using CloudFormation events to build custom workflows for post provisioning management

Post Syndicated from Vivek Kumar original https://aws.amazon.com/blogs/devops/using-cloudformation-events-to-build-custom-workflows-for-post-provisioning-management/

Over one million active customers manage application resources with AWS CloudFormation every week. CloudFormation is a service that helps you model, provision, and manage your cloud resources by treating Infrastructure as Code (IaC). It can simplify infrastructure management, quickly replicate your environment to multiple AWS regions with a single turn-key solution, and let you easily control and track changes in your infrastructure.

You can create various AWS resources using CloudFormation to set up an environment for your workloads. You continue to interact with and manage those resources throughout the workload lifecycle to make sure the resource configuration is aligned with business objectives such as adhering to security compliance standards, meeting required reliability targets, and aligning with budget requirements. The lack of a hand-off between resource provisioning actions in CloudFormation and resource management actions in other relevant AWS and non-AWS services poses a challenge. For example, after provisioning resources, customers might need to perform additional tasks to manage them, such as adding cost allocation tags, populating a resource inventory database, or triggering downstream processes.

A CloudFormation stack gives customers a logical grouping of resources tied to a workload or a workload component, but that context mostly does not extend beyond CloudFormation when they use other AWS and non-AWS services for post-provisioning resource management. These services typically offer a resource-level view, or in some cases basic aggregated views such as a tag group or an account-level view of all resources in a given account. Without the stack context beyond resource provisioning, and with no hand-off between provisioning actions in CloudFormation and management actions in other services, customers get a disjointed experience. The various management actions customers take with their workload resources throughout their lifecycle are

CloudFormation events provide a robust way to track the status of individual resources during the lifecycle of a stack. You can send CloudFormation events to Amazon EventBridge whenever a create, update, or drift detection action is performed on your stack. Then you can set up additional workflows based on those events from EventBridge. For example, by tagging the resources automatically, you can reference that tag group when using AWS Trusted Advisor, and continue your resource management experience post-provisioning. CloudFormation sends these events to EventBridge automatically so that you don’t need to do anything. One real-world use case is to use these events to create actionable tasks for your teams to troubleshoot issues. CloudFormation events published to EventBridge can be used to create OpsItems within AWS Systems Manager OpsCenter. OpsItems are the work items created in OpsCenter for engineers to view, investigate, and remediate tasks and issues. This enables teams to respond to and resolve issues more efficiently.

Walkthrough

To set up the EventBridge rule, go to the AWS console and navigate to EventBridge. Select Create Rule to get started. Enter a name and description, then select Next:

Create Rule

On the next screen, select AWS events in the Event source section.

This sample event is for the CREATE_COMPLETE event. It contains the source, AWS account number, AWS region, event type, resources and details about the event.

On the same page in the Event pattern section:

Select Custom patterns (JSON editor) and enter the following event pattern. This will match any events when a resource fails to create, update, or delete. Learn more about EventBridge event patterns.

{
    "source": [
        "aws.cloudformation"
    ],
    "detail-type": [
        "CloudFormation Resource Status Change"
    ],
    "detail": {
        "status-details": {
            "status": [
                "CREATE_FAILED",
                "UPDATE_FAILED",
                "DELETE_FAILED"
            ]
        }
    }
}

Custom patterns - JSON editor
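
To make the matching behavior concrete, here is a minimal TypeScript sketch of what the pattern above accepts and rejects. It is not how EventBridge is implemented; it only mirrors the three fields used in this pattern, and the names CfnResourceStatusEvent and matchesFailurePattern are hypothetical.

```typescript
// Only the fields this pattern inspects; real events carry more (assumed shape).
interface CfnResourceStatusEvent {
  source: string;
  "detail-type": string;
  detail: { "status-details": { status: string } };
}

// The same values listed in the event pattern above.
const FAILURE_PATTERN = {
  source: ["aws.cloudformation"],
  detailType: ["CloudFormation Resource Status Change"],
  status: ["CREATE_FAILED", "UPDATE_FAILED", "DELETE_FAILED"],
};

// Hypothetical helper: an event matches when each inspected field's value
// appears in the corresponding list of the pattern.
function matchesFailurePattern(event: CfnResourceStatusEvent): boolean {
  return (
    FAILURE_PATTERN.source.indexOf(event.source) !== -1 &&
    FAILURE_PATTERN.detailType.indexOf(event["detail-type"]) !== -1 &&
    FAILURE_PATTERN.status.indexOf(event.detail["status-details"].status) !== -1
  );
}

const failedCreate: CfnResourceStatusEvent = {
  source: "aws.cloudformation",
  "detail-type": "CloudFormation Resource Status Change",
  detail: { "status-details": { status: "CREATE_FAILED" } },
};
console.log(matchesFailurePattern(failedCreate)); // true
```

A CREATE_COMPLETE event from the same source would not match, which is why the rule only fires on failures.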

Select Next. On the Target screen, select AWS service, then select Systems Manager OpsItem as the target for this rule.

Target 1

Add a second target – an Amazon Simple Notification Service (SNS) Topic – to notify the Ops team whenever a failure occurs and an OpsItem has been created.

Target 2

Select Next and optionally add tags.

Select Next to review the selections, and select Create rule.

Now your rule is created and whenever a stack failure occurs, an OpsItem gets created and a notification is sent out for the operators to troubleshoot and fix the issue. The OpsItem contains operational data, such as the resource that failed, the reason for failure, as well as the stack to which it belongs, which is useful for troubleshooting the issue. Operators can take manual actions or use runbooks codified as Systems Manager Documents to take corrective actions. From the AWS Console you can go to OpsCenter to see the events:

operational data
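
If you forward the same events to your own tooling, the details an operator needs can be pulled out of the event's detail section. The TypeScript sketch below assumes the detail fields are named stack-id, logical-resource-id, resource-type, and status-details (with status and status-reason); verify these against a real event before relying on them. The helper name summarizeFailure is hypothetical.

```typescript
// Assumed subset of the "detail" section of a resource status change event.
interface FailureDetail {
  "stack-id": string;
  "logical-resource-id": string;
  "resource-type": string;
  "status-details": { status: string; "status-reason": string };
}

interface FailureSummary {
  stackName: string;
  resource: string;
  status: string;
  reason: string;
}

// Hypothetical helper: condenses a failure event into the facts an operator
// needs first (which stack, which resource, what happened, and why).
function summarizeFailure(detail: FailureDetail): FailureSummary {
  // Stack ARNs have the form arn:aws:cloudformation:<region>:<account>:stack/<name>/<uuid>.
  const parts = detail["stack-id"].split("/");
  const stackName = parts[1] ?? detail["stack-id"];
  return {
    stackName,
    resource: `${detail["logical-resource-id"]} (${detail["resource-type"]})`,
    status: detail["status-details"].status,
    reason: detail["status-details"]["status-reason"],
  };
}

console.log(
  summarizeFailure({
    "stack-id": "arn:aws:cloudformation:us-east-1:111122223333:stack/demo-stack/abc-123",
    "logical-resource-id": "DemoBucket",
    "resource-type": "AWS::S3::Bucket",
    "status-details": { status: "CREATE_FAILED", "status-reason": "Bucket name already exists" },
  }),
);
```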

Once the issues have been addressed, operators can mark the OpsItem as resolved, and retry the stack operation that failed, resulting in a swift resolution of the issue, and preventing duplication of efforts.

This walkthrough uses the Console, but you can also use the AWS Command Line Interface (AWS CLI), an AWS SDK, or even CloudFormation to accomplish all of this. Refer to the AWS CLI documentation for more information on creating EventBridge rules through the CLI, and to the AWS SDK documentation for creating them through an SDK. You can use the following CloudFormation template to deploy the EventBridge rule example used in the walkthrough in this blog post:

{
	"Parameters": {
		"SNSTopicARN": {
			"Type": "String",
			"Description": "Enter the ARN of the SNS Topic where you want stack failure notifications to be sent."
		}
	},
	"Resources": {
		"CFNEventsRule": {
			"Type": "AWS::Events::Rule",
			"Properties": {
				"Description": "Event rule to capture CloudFormation failure events",
				"EventPattern": {
					"source": [
						"aws.cloudformation"
					],
					"detail-type": [
						"CloudFormation Resource Status Change"
					],
					"detail": {
						"status-details": {
							"status": [
								"CREATE_FAILED",
								"UPDATE_FAILED",
								"DELETE_FAILED"
							]
						}
					}
				},
				"Name": "cfn-stack-failure-test",
				"State": "ENABLED",
				"Targets": [
					{
						"Arn": {
							"Fn::Sub": "arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:opsitem"
						},
						"Id": "opsitems",
						"RoleArn": {
							"Fn::GetAtt": [
								"TargetInvocationRole",
								"Arn"
							]
						}
					},
					{
						"Arn": {
							"Ref": "SNSTopicARN"
						},
						"Id": "sns"
					}
				]
			}
		},
		"TargetInvocationRole": {
			"Type": "AWS::IAM::Role",
			"Properties": {
				"AssumeRolePolicyDocument": {
					"Version": "2012-10-17",
					"Statement": [
						{
							"Effect": "Allow",
							"Principal": {
								"Service": [
									"events.amazonaws.com"
								]
							},
							"Action": [
								"sts:AssumeRole"
							]
						}
					]
				},
				"Path": "/",
				"Policies": [
					{
						"PolicyName": "createopsitem",
						"PolicyDocument": {
							"Version": "2012-10-17",
							"Statement": [
								{
									"Effect": "Allow",
									"Action": [
										"ssm:CreateOpsItem"
									],
									"Resource": "*"
								}
							]
						}
					}
				]
			}
		},
		"AllowSNSPublish": {
			"Type": "AWS::SNS::TopicPolicy",
			"Properties": {
				"PolicyDocument": {
					"Statement": [
						{
							"Sid": "grant-eventbridge-publish",
							"Effect": "Allow",
							"Principal": {
								"Service": "events.amazonaws.com"
							},
							"Action": [
								"sns:Publish"
							],
							"Resource": {
								"Ref": "SNSTopicARN"
							}
						}
					]
				},
				"Topics": [
					{
						"Ref": "SNSTopicARN"
					}
				]
			}
		}
	}
}
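
For the SDK route mentioned above, the rule and its targets are created with two separate operations, PutRule and PutTargets. The TypeScript sketch below only builds the plain input objects for those two operations using the same values as the template; it makes no AWS calls, and the function name buildRuleInputs and the example ARNs are hypothetical. The IAM role and SNS topic policy from the template would still need to exist.

```typescript
// Input shapes for the EventBridge PutRule and PutTargets operations
// (only the fields this example needs).
interface RuleInputs {
  putRule: { Name: string; Description: string; State: "ENABLED" | "DISABLED"; EventPattern: string };
  putTargets: { Rule: string; Targets: Array<{ Id: string; Arn: string; RoleArn?: string }> };
}

// Hypothetical helper mirroring the CloudFormation template above.
function buildRuleInputs(
  region: string,
  accountId: string,
  snsTopicArn: string,
  invocationRoleArn: string,
): RuleInputs {
  const ruleName = "cfn-stack-failure-test";
  const pattern = {
    source: ["aws.cloudformation"],
    "detail-type": ["CloudFormation Resource Status Change"],
    detail: { "status-details": { status: ["CREATE_FAILED", "UPDATE_FAILED", "DELETE_FAILED"] } },
  };
  return {
    putRule: {
      Name: ruleName,
      Description: "Event rule to capture CloudFormation failure events",
      State: "ENABLED",
      // The API expects the event pattern as a JSON string, not an object.
      EventPattern: JSON.stringify(pattern),
    },
    putTargets: {
      Rule: ruleName,
      Targets: [
        // OpsItem target: needs a role that allows ssm:CreateOpsItem.
        { Id: "opsitems", Arn: `arn:aws:ssm:${region}:${accountId}:opsitem`, RoleArn: invocationRoleArn },
        // SNS target: access is granted through the topic policy instead of a role.
        { Id: "sns", Arn: snsTopicArn },
      ],
    },
  };
}

const inputs = buildRuleInputs(
  "us-east-1",
  "111122223333",
  "arn:aws:sns:us-east-1:111122223333:ops-notifications",
  "arn:aws:iam::111122223333:role/eventbridge-opsitem-role",
);
console.log(JSON.stringify(inputs, null, 2));
```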

Summary

Responding to CloudFormation stack events becomes easy with the integration between CloudFormation and EventBridge. CloudFormation events can be used to perform post-provisioning actions on workload resources. With the variety of targets available to EventBridge rules, various actions such as adding tags and troubleshooting issues can be performed. The example above uses Systems Manager and Amazon SNS, but you can have numerous targets, including Amazon API Gateway, AWS Lambda, Amazon Elastic Container Service (Amazon ECS) tasks, Amazon Kinesis services, Amazon Redshift, Amazon SageMaker pipelines, and many more. These events are available for free in EventBridge.

Learn more about Managing events with CloudFormation and EventBridge.

About the Author

Vivek is a Solutions Architect at AWS based out of New York. He works with customers providing technical assistance and architectural guidance on various AWS services. He brings more than 25 years of experience in software engineering and architecture roles for various large-scale enterprises.

Mahanth is a Solutions Architect at Amazon Web Services (AWS). As part of the AWS Well-Architected team, he works with customers and AWS Partner Network partners of all sizes to help them build secure, high-performing, resilient, and efficient infrastructure for their applications. He spends his free time playing with his pup Cosmo, learning more about astronomy, and is an avid gamer.

Sukhchander is a Solutions Architect at Amazon Web Services. He is passionate about helping startups and enterprises adopt the cloud in the most scalable, secure, and cost-effective way by providing technical guidance, best practices, and well architected solutions.

Deploying Alexa Skills with the AWS CDK

Post Syndicated from Jeff Gardner original https://aws.amazon.com/blogs/devops/deploying-alexa-skills-with-aws-cdk/

So you’re expanding your reach by leveraging voice interfaces for your applications through the Alexa ecosystem. You’ve experimented with a new Alexa Skill via the Alexa Developer Console, and now you’re ready to productionalize it for your customers. How exciting!

You are also a proponent of Infrastructure as Code (IaC). You appreciate the speed, consistency, and change management capabilities enabled by IaC. Perhaps you have other applications that you provision and maintain via DevOps practices, and you want to deploy and maintain your Alexa Skill in the same way. Great idea!

That’s where AWS CloudFormation and the AWS Cloud Development Kit (AWS CDK) come in. AWS CloudFormation lets you treat infrastructure as code, so that you can easily model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles. The AWS CDK is an open-source software development framework for modeling and provisioning your cloud application resources via familiar programming languages, like TypeScript, Python, Java, and .NET. AWS CDK utilizes AWS CloudFormation in the background in order to provision resources in a safe and repeatable manner.

In this post, we show you how to achieve Infrastructure as Code for your Alexa Skills by leveraging powerful AWS CDK features.

Concepts

Alexa Skills Kit (ASK)

In addition to the Alexa Developer Console, skill developers can utilize the Alexa Skills Kit (ASK) to build interactive voice interfaces for Alexa. ASK provides a suite of self-service APIs and tools for building and interacting with Alexa Skills, including the ASK CLI, the Skill Management API (SMAPI), and SDKs for Node.js, Java, and Python. These tools provide a programmatic interface for your Alexa Skills in order to update them with code rather than through a user interface.

AWS CloudFormation

AWS CloudFormation lets you create templates written in either YAML or JSON format to model your infrastructure in code form. CloudFormation templates are declarative and idempotent, allowing you to check them into a versioned code repository, deploy them automatically, and track changes over time.

The ASK CloudFormation resource allows you to incorporate Alexa Skills in your CloudFormation templates alongside your other infrastructure. However, this has limitations that we’ll discuss in further detail in the Problem section below.

AWS Cloud Development Kit (AWS CDK)

Think of the AWS CDK as a developer-centric toolkit that leverages the power of modern programming languages to define your AWS infrastructure as code. When AWS CDK applications are run, they compile down to fully formed CloudFormation JSON/YAML templates that are then submitted to the CloudFormation service for provisioning. Because the AWS CDK leverages CloudFormation, you still enjoy every benefit provided by CloudFormation, such as safe deployment, automatic rollback, and drift detection. AWS CDK currently supports TypeScript, JavaScript, Python, Java, C#, and Go (currently in Developer Preview).

Perhaps the most compelling part of AWS CDK is the concept of constructs—the basic building blocks of AWS CDK apps. The three levels of constructs reflect the level of abstraction from CloudFormation. A construct can represent a single resource, like an AWS Lambda Function, or it can represent a higher-level component consisting of multiple AWS resources.

The three different levels of constructs begin with low-level constructs, called L1 (short for “level 1”) or Cfn (short for CloudFormation) resources. These constructs directly represent all of the resources available in AWS CloudFormation. The next level of constructs, called L2, also represents AWS resources, but it has a higher-level and intent-based API. They provide not only similar functionality, but also the defaults, boilerplate, and glue logic you’d be writing yourself with a CFN Resource construct. Finally, the AWS Construct Library includes even higher-level constructs, called L3 constructs, or patterns. These are designed to help you complete common tasks in AWS, often involving multiple resource types. Learn more about constructs in the AWS CDK developer guide.

One L2 construct example is the Custom Resources module. This lets you execute custom logic via a Lambda Function as part of your deployment in order to cover scenarios that the AWS CDK doesn’t support yet. While the Custom Resources module leverages CloudFormation’s native Custom Resource functionality, it also greatly reduces the boilerplate code in your CDK project and simplifies the necessary code in the Lambda Function. The open-source construct library referenced in the Solution section of this post utilizes Custom Resources to avoid some limitations of what CloudFormation and CDK natively support for Alexa Skills.

Problem

The primary issue with utilizing the Alexa::ASK::Skill CloudFormation resource, and its corresponding CDK CfnSkill construct, arises when you define the Skill’s backend Lambda Function in the same CloudFormation template or CDK project. When the Skill’s endpoint is set to a Lambda Function, the ASK service validates that the Skill has the appropriate permissions to invoke that Lambda Function. The best practice is to enable Skill ID verification in your Lambda Function. This effectively restricts the Lambda Function to be invokable only by the configured Skill ID. The problem is that in order to configure Skill ID verification, the Lambda Permission must reference the Skill ID, so it cannot be added to the Lambda Function until the Alexa Skill has been created. If we try creating the Alexa Skill without the Lambda Permission in place, insufficient permissions will cause the validation to fail. The endpoint validation causes a circular dependency preventing us from defining our desired end state with just the native CloudFormation resource.

Unfortunately, the AWS CDK also does not yet support any L2 constructs for Alexa skills. While the ASK Skill Management API is another option, managing imperative API calls within a CI/CD pipeline would not be ideal.
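
The circular dependency described above can be illustrated with a small dependency-ordering sketch in TypeScript. This is not how CloudFormation resolves dependencies internally; it is a plain depth-first topological sort over hypothetical resource labels, showing that no valid creation order exists when the Lambda Permission needs the Skill ID and the Skill needs the Lambda Permission.

```typescript
// Each resource maps to the resources that must exist before it (labels are hypothetical).
type DependencyGraph = Record<string, string[]>;

// Returns a valid creation order, or null when the graph contains a cycle.
function findCreationOrder(graph: DependencyGraph): string[] | null {
  const order: string[] = [];
  const state: Record<string, "visiting" | "done"> = {};

  const visit = (node: string): boolean => {
    if (state[node] === "done") return true;
    if (state[node] === "visiting") return false; // reached a node already on the current path: cycle
    state[node] = "visiting";
    for (const dependency of graph[node] ?? []) {
      if (!visit(dependency)) return false;
    }
    state[node] = "done";
    order.push(node);
    return true;
  };

  for (const node of Object.keys(graph)) {
    if (!visit(node)) return null;
  }
  return order;
}

// Skill ID verification: the permission references the Skill ID, while the
// Skill's endpoint validation requires the permission to be in place already.
const withSkillIdVerification: DependencyGraph = {
  LambdaFunction: [],
  LambdaPermission: ["LambdaFunction", "AlexaSkill"],
  AlexaSkill: ["LambdaFunction", "LambdaPermission"],
};
console.log(findCreationOrder(withSkillIdVerification)); // null

// Without Skill ID verification the permission no longer depends on the Skill,
// so an order exists, at the cost of a less restrictive permission.
const withoutSkillIdVerification: DependencyGraph = {
  LambdaFunction: [],
  LambdaPermission: ["LambdaFunction"],
  AlexaSkill: ["LambdaFunction", "LambdaPermission"],
};
console.log(findCreationOrder(withoutSkillIdVerification));
```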

Solution

Overview

AWS CDK is extensible in that if there isn’t a native construct that does what you want, you can simply create your own! You can also publish your custom constructs publicly or privately for others to leverage via package registries like npm, PyPI, NuGet, Maven, etc.

We could write our own code to solve the problem, but luckily this use case allows us to leverage an open-source construct library that addresses our needs. This library is currently available for TypeScript (npm) and Python (PyPI).

The complete solution can be found at the GitHub repository, here. The code is in TypeScript, but you can easily port it to another language if necessary. See the AWS CDK Developer Guide for more guidance on translating between languages.

Prerequisites

You will need the following in order to build and deploy the solution presented below. Please be mindful of any prerequisites for these tools.

  • Alexa Developer Account
  • AWS Account
  • Docker
    • Used by CDK for bundling assets locally during synthesis and deployment.
    • See Docker website for installation instructions based on your operating system.
  • AWS CLI
    • Used by CDK to deploy resources to your AWS account.
    • See AWS CLI user guide for installation instructions based on your operating system.
  • Node.js
    • The CDK Toolset and backend run on Node.js regardless of the project language. See the detailed requirements in the AWS CDK Getting Started Guide.
    • See the Node.js website to download the specific installer for your operating system.

Clone Code Repository and Install Dependencies

The code for the solution in this post is located in this repository on GitHub. First, clone this repository and install its local dependencies by executing the following commands in your local Terminal:

# clone repository
git clone https://github.com/aws-samples/aws-devops-blog-alexa-cdk-walkthrough
# navigate to project directory
cd aws-devops-blog-alexa-cdk-walkthrough
# install dependencies
npm install

Note that CLI commands in the sections below (ask, cdk) use npx. This executes the command from local project binaries if they exist, or, if not, it installs the binaries required to run the command. In our case, the local binaries are installed as part of the npm install command above. Therefore, npx will utilize the local version of the binaries even if you already have those tools installed globally. We use this method to simplify setup and alleviate any issues arising from version discrepancies.

Get Alexa Developer Credentials

To create and manage Alexa Skills via CDK, we will need to provide Alexa Developer account credentials, which are separate from our AWS credentials. The following values must be supplied in order to authenticate:

  • Vendor ID: Represents the Alexa Developer account.
  • Client ID: Represents the developer, tool, or organization requiring permission to perform a list of operations on the skill. In this case, our AWS CDK project.
  • Client Secret: The secret value associated with the Client ID.
  • Refresh Token: A token for reauthentication. The ASK service uses access tokens for authentication that expire one hour after creation. Refresh tokens do not expire and can retrieve a new access token when needed.
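
The relationship between the refresh token and the short-lived access tokens can be sketched as a small cache in TypeScript. This is an illustration only, not the ASK CLI's or the construct library's implementation: the exchange function is injected and hypothetical (a real one would call the LWA token endpoint asynchronously), and the one-minute safety margin is an arbitrary choice.

```typescript
interface AccessToken {
  value: string;
  expiresAtMs: number;
}

// Hypothetical exchange function: trades a refresh token for a new access token.
type TokenExchange = (refreshToken: string) => { accessToken: string; expiresInSeconds: number };

class AccessTokenCache {
  private current: AccessToken | null = null;

  constructor(
    private readonly refreshToken: string,
    private readonly exchange: TokenExchange,
    private readonly now: () => number = () => Date.now(),
  ) {}

  // Returns the cached access token, requesting a new one when none exists
  // or when the cached one is within the safety margin of expiring.
  get(): string {
    const safetyMarginMs = 60000;
    let token = this.current;
    if (token === null || this.now() >= token.expiresAtMs - safetyMarginMs) {
      const fresh = this.exchange(this.refreshToken);
      token = { value: fresh.accessToken, expiresAtMs: this.now() + fresh.expiresInSeconds * 1000 };
      this.current = token;
    }
    return token.value;
  }
}

// Usage with a fake exchange that issues one-hour tokens.
let issued = 0;
const cache = new AccessTokenCache("example-refresh-token", () => {
  issued += 1;
  return { accessToken: `access-token-${issued}`, expiresInSeconds: 3600 };
});
console.log(cache.get()); // access-token-1
console.log(cache.get()); // access-token-1 (served from the cache)
```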

Follow the steps below to retrieve each of these values.

Get Alexa Developer Vendor ID

Easily retrieve your Alexa Developer Vendor ID from the Alexa Developer Console.

  1. Navigate to the Alexa Developer console and log in with your Amazon account.
  2. After logging in, on the main screen click on the “Settings” tab.

Screenshot of Alexa Developer console showing location of Settings tab

  3. Your Vendor ID is listed in the “My IDs” section. Note this value.

Screenshot of Alexa Developer console showing location of Vendor ID

Create Login with Amazon (LWA) Security Profile

The Skill Management API utilizes Login with Amazon (LWA) for authentication, so first we must create a security profile for LWA under the same Amazon account that we will use to create the Alexa Skill.

  1. Navigate to the LWA console and log in with your Amazon account.
  2. Click the “Create a New Security Profile” button.

Screenshot of Login with Amazon console showing location of Create a New Security Profile button

  3. Fill out the form with a Name, Description, and Consent Privacy Notice URL, and then click “Save”.

Screenshot of Login with Amazon console showing Create a New Security Profile form

  4. The new Security Profile should now be listed. Hover over the gear icon, located to the right of the new profile name, and click “Web Settings”.

Screenshot of Login with Amazon console showing location of Web Settings link

  5. Click the “Edit” button and add the following under “Allowed Return URLs”:
    • http://127.0.0.1:9090/cb
    • https://s3.amazonaws.com/ask-cli/response_parser.html
  6. Click the “Save” button to save your changes.
  7. Click the “Show Secret” button to reveal your Client Secret. Note your Client ID and Client Secret.

Screenshot of Login with Amazon console showing location of Client ID and Client Secret values

Get Refresh Token from ASK CLI

Your Client ID and Client Secret let you generate a refresh token for authenticating with the ASK service.

  1. Navigate to your local Terminal and enter the following command, replacing <your Client ID> and <your Client Secret> with your Client ID and Client Secret, respectively:
# ensure you are in the root directory of the repository
npx ask util generate-lwa-tokens --client-id "<your Client ID>" --client-confirmation "<your Client Secret>" --scopes "alexa::ask:skills:readwrite alexa::ask:models:readwrite"
  2. A browser window should open with a login screen. Supply credentials for the same Amazon account with which you created the LWA Security Profile previously.
  3. Click the “Allow” button to grant the refresh token appropriate access to your Amazon Developer account.
  4. Return to your Terminal. The credentials, including your new refresh token, should be printed. Note the value in the refresh_token field.

NOTE: If your Terminal shows an error like CliFileNotFoundError: File ~/.ask/cli_config not exists., you need to first initialize the ASK CLI with the command npx ask configure. This command will open a browser with a login screen, and you should enter the credentials for the Amazon account with which you created the LWA Security Profile previously. After signing in, return to your Terminal and enter n to decline linking your AWS account. After completing this process, try the generate-lwa-tokens command above again.

NOTE: If your Terminal shows an error like CliError: invalid_client, make sure that you have included the quotation marks (") around the values of the --client-id and --client-confirmation arguments.

Add Alexa Developer Credentials to AWS SSM Parameter Store / AWS Secrets Manager

Our AWS CDK project requires access to the Alexa Developer credentials we just generated (Client ID, Client Secret, Refresh Token) in order to create and manage our Skill. To avoid hard-coding these values into our code, we can store the values in AWS Systems Manager (SSM) Parameter Store and AWS Secrets Manager, and then retrieve them programmatically when deploying our CDK project. In our case, we are using SSM Parameter Store to store the non-sensitive values in plaintext, and Secrets Manager to store the secret values in encrypted form.

The repository contains a shell script at scripts/upload-credentials.sh that can create the appropriate parameters and secrets via AWS CLI. You’ll just need to supply the credential values from the previous steps. Alternatively, instructions for creating parameters and secrets via the AWS Console or AWS CLI can each be found in the AWS Systems Manager User Guide and AWS Secrets Manager User Guide.

You will need the following resources created in your AWS account before proceeding:

Name                                      | Service             | Type
/alexa-cdk-blog/alexa-developer-vendor-id | SSM Parameter Store | String
/alexa-cdk-blog/lwa-client-id             | SSM Parameter Store | String
/alexa-cdk-blog/lwa-client-secret         | Secrets Manager     | Plaintext / secret-string
/alexa-cdk-blog/lwa-refresh-token         | Secrets Manager     | Plaintext / secret-string
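
The four names above share one prefix, which the stack code later in this post also uses. As a rough TypeScript sketch, the required names can be derived from that prefix and compared against what already exists in your account. The helpers requiredCredentialResources and missingResources are hypothetical and make no AWS calls; you would supply the list of existing names yourself.

```typescript
// Prefix shared by every parameter and secret listed above.
const PARAM_PREFIX = "/alexa-cdk-blog/";

type Store = "SSM Parameter Store" | "Secrets Manager";

interface CredentialResource {
  name: string;
  store: Store;
}

// Name suffix and the service that holds each value, as listed above.
const CREDENTIAL_SUFFIXES: Array<[string, Store]> = [
  ["alexa-developer-vendor-id", "SSM Parameter Store"],
  ["lwa-client-id", "SSM Parameter Store"],
  ["lwa-client-secret", "Secrets Manager"],
  ["lwa-refresh-token", "Secrets Manager"],
];

// Hypothetical helper: full names the stack expects to find.
function requiredCredentialResources(prefix: string = PARAM_PREFIX): CredentialResource[] {
  return CREDENTIAL_SUFFIXES.map(([suffix, store]) => ({ name: `${prefix}${suffix}`, store }));
}

// Hypothetical pre-deploy check: required names not present in existingNames.
function missingResources(existingNames: string[]): string[] {
  return requiredCredentialResources()
    .map((resource) => resource.name)
    .filter((name) => existingNames.indexOf(name) === -1);
}

console.log(missingResources(["/alexa-cdk-blog/lwa-client-id"]));
```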

Code Walkthrough

Skill Package

When you programmatically create an Alexa Skill, you supply a Skill Package, which is a zip file consisting of a set of files defining your Skill. A skill package includes a manifest JSON file, and optionally a set of interaction model files, in-skill product files, and/or image assets for your skill. See the Skill Management API documentation for details regarding skill packages.

The repository contains a skill package that defines a simple Time Teller Skill at src/skill-package. If you want to use an existing Skill instead, replace the contents of src/skill-package with your skill package.

If you want to export the skill package of an existing Skill, use the ASK CLI:

  1. Navigate to the Alexa Developer console and log in with your Amazon account.
  2. Find the Skill you want to export and click the link under the name “Copy Skill ID”. Either make sure this stays on your clipboard or note the Skill ID for the next step.
  3. Navigate to your local Terminal and enter the following command, replacing <your Skill ID> with your Skill ID:
# ensure you are in the root directory of the repository
cd src
npx ask smapi export-package --stage development --skill-id <your Skill ID>

NOTE: To export the skill package for a live skill, replace --stage development with --stage live.

NOTE: The CDK code in this solution will dynamically populate the manifest.apis section in skill.json. If that section is populated in your skill package, either clear it out or know that it will be replaced when the project is deployed.
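
To show what replacing the manifest.apis section amounts to, here is a minimal TypeScript sketch that sets a custom skill's endpoint to a Lambda ARN. It is an illustration under assumptions, not the construct library's actual code: the helper name withLambdaEndpoint is hypothetical, and only the custom endpoint URI is modeled.

```typescript
// Just enough of the skill.json structure for this example; other manifest
// keys are carried through untouched.
interface SkillManifest {
  manifest: {
    apis?: { custom?: { endpoint?: { uri: string } } };
    [key: string]: unknown;
  };
}

// Hypothetical helper: returns a copy of the manifest whose apis section
// points the custom skill endpoint at the given Lambda Function ARN.
// Anything previously under manifest.apis is replaced, as the note above warns.
function withLambdaEndpoint(skill: SkillManifest, lambdaArn: string): SkillManifest {
  return {
    ...skill,
    manifest: {
      ...skill.manifest,
      apis: { custom: { endpoint: { uri: lambdaArn } } },
    },
  };
}

const exported: SkillManifest = {
  manifest: {
    manifestVersion: "1.0",
    apis: { custom: { endpoint: { uri: "arn:aws:lambda:us-east-1:111122223333:function:old-backend" } } },
  },
};
const updated = withLambdaEndpoint(exported, "arn:aws:lambda:us-east-1:111122223333:function:skill-backend");
console.log(JSON.stringify(updated, null, 2));
```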

Skill Backend Lambda Function

The Lambda Function code for the Time Teller Alexa Skill’s backend also resides within the CDK project at src/lambda/skill-backend. If you want to use an existing Skill instead, replace the contents of src/lambda/skill-backend with your Lambda code. Also note the following if you want to use your own Lambda code:

  • The CDK code in the repository assumes that the Lambda Function runtime is Python. However, you can modify for another runtime if necessary by using either the aws-lambda or aws-lambda-nodejs CDK module instead of aws-lambda-python.
  • If you’re using your own Python Lambda Function code, please note the following to ensure the Lambda Function definition compatibility in the sample CDK project. If your Lambda Function varies from what is below, then you may need to modify the CDK code. See the Python Lambda code in the repository for an example.
    • The skill-backend/ directory should contain all of the necessary resources for your Lambda Function. For Python functions, this should include at least a file named index.py that contains your Lambda entrypoint, and a requirements.txt file containing your pip dependencies.
    • For Python functions, your Lambda handler function should be called handler(). This generally looks like handler = SkillBuilder().lambda_handler() when using the Python ASK SDK.

Open-Source Alexa Skill Construct Library

As mentioned above, this solution utilizes an open-source construct library to create and manage the Alexa Skill. This construct library utilizes the L1 CfnSkill construct along with other L1 and L2 constructs to create a complete Alexa Skill with a functioning backend Lambda Function. Utilizing this construct library means that we are no longer limited by the shortcomings of only using the Alexa::ASK::Skill CloudFormation resource or L1 CfnSkill construct.

Look into the construct library code if you’re curious. There’s only one construct—Skill—and you can follow the code to see how it dodges the Lambda Permission issue.

CDK Stack

The CDK stack code is located in lib/alexa-cdk-stack.ts. Let’s dive in to understand what’s happening. We’ll look at one section at a time:

...
const PARAM_PREFIX = '/alexa-cdk-blog/'

export class AlexaCdkStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Get Alexa Developer credentials from SSM Parameter Store/Secrets Manager.
    // NOTE: Parameters and secrets must have been created in the appropriate account before running `cdk deploy` on this stack.
    //       See sample script at scripts/upload-credentials.sh for how to create appropriate resources via AWS CLI.
    const alexaVendorId = ssm.StringParameter.valueForStringParameter(this, `${PARAM_PREFIX}alexa-developer-vendor-id`);
    const lwaClientId = ssm.StringParameter.valueForStringParameter(this, `${PARAM_PREFIX}lwa-client-id`);
    const lwaClientSecret = cdk.SecretValue.secretsManager(`${PARAM_PREFIX}lwa-client-secret`);
    const lwaRefreshToken = cdk.SecretValue.secretsManager(`${PARAM_PREFIX}lwa-refresh-token`);
    ...
  }
}

First, within the stack’s constructor, after calling the constructor of the base class, we retrieve the credentials we uploaded earlier to SSM and Secrets Manager. This lets us store our account credentials in a safe place (encrypted, in the case of the lwaClientSecret and lwaRefreshToken secrets) and avoid keeping sensitive data in plaintext or source control.

...
export class AlexaCdkStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    ...
    // Create the Lambda Function for the Skill Backend
    const skillBackend = new lambdaPython.PythonFunction(this, 'SkillBackend', {
      entry: 'src/lambda/skill-backend',
      timeout: cdk.Duration.seconds(7)
    });
    ...
  }
}

Next, we create the Lambda Function containing the skill’s backend logic. In this case, we are using the aws-lambda-python module, which transparently handles dependency installation and packaging for us. Rather than keep the default 3-second timeout, we specify a 7-second timeout so that the function finishes within the Alexa service timeout of 8 seconds.

...

export class AlexaCdkStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    ...
    // Create the Alexa Skill
    const skill = new Skill(this, 'Skill', {
      endpointLambdaFunction: skillBackend,
      skillPackagePath: 'src/skill-package',
      alexaVendorId: alexaVendorId,
      lwaClientId: lwaClientId,
      lwaClientSecret: lwaClientSecret,
      lwaRefreshToken: lwaRefreshToken
    });
  }
}

Finally, we create our Skill! All we need to do is pass in the Lambda Function containing the Skill’s backend code, the location of the skill package, and the credentials for authenticating with our Alexa Developer account. All of the wiring for deploying the skill package and connecting the Lambda Function to the Skill is handled transparently within the construct code.

Deploy CDK project

Now that all of our code is in place, we can deploy our project and test it out!

  1. Make sure that you have bootstrapped your AWS account for CDK. If not, you can bootstrap with the following command:
# ensure you are in the root directory of the repository
npx cdk bootstrap
  2. Make sure that the Docker daemon is running locally. This is generally done by starting the Docker Desktop application.
    • You can also use the following Terminal command to determine whether the Docker daemon is running. The command will return an error if the daemon is not running.
docker ps -q
    • See more details regarding starting the Docker daemon based on your operating system via the Docker website.
  3. Synthesize your CDK project in order to confirm that your project is building properly.
# ensure you are in the root directory of the repository
npx cdk synth

NOTE: In addition to generating the CloudFormation template for this project, this command also bundles the Lambda Function code via Docker, so it may take a few minutes to complete.

  4. Deploy!
# ensure you are in the root directory of the repository
npx cdk deploy
    • Feel free to review the IAM policies that will be created, and enter y to continue when prompted.
    • If you would like to skip the security approval requirement and deploy in one step, use cdk deploy --require-approval never instead.

Check it out!

Once your project finishes deploying, take a look at your new Skill!

  1. Navigate to the Alexa Developer console and log in with your Amazon account.
  2. After logging in, on the main screen you should now see your new Skill listed. Click on the name to go to the “Build” screen.
  3. Investigate the console to confirm that your Skill was created as expected.
  4. On the left-hand navigation menu, click “Endpoint” and confirm that the ARN for your backend Lambda Function is showing in the “Default Region” field. This ARN was added dynamically by our CDK project.

Screenshot of Alexa Developer console showing location of Endpoint text box

  5. Test the Skill to confirm that it functions properly.
    1. Click on the “Test” tab and enable testing for the “Development” stage of your skill.
    2. Type your Skill’s invocation name in the Alexa Simulator in order to launch the skill and invoke a response.
      • If you deployed the sample skill package and Lambda Function, the invocation name is “time teller”. If Alexa responds with the current time in UTC, it is working properly!

Bonus Points

Now that you can deploy your Alexa Skill via the AWS CDK, can you incorporate your new project into a CI/CD pipeline for automated deployments? Extra kudos if the pipeline is defined with the CDK 🙂 Follow these links for some inspiration:

Cleanup

After you are finished, delete the resources you created to avoid incurring future charges. This can be easily done by deleting the CloudFormation stack from the CloudFormation console, or by executing the following command in your Terminal, which has the same effect:

# ensure you are in the root directory of the repository
npx cdk destroy

Conclusion

You can, and should, strive for IaC and CI/CD in every project, and the powerful AWS CDK features make that easier with a set of simple yet flexible constructs. Leverage the simplicity of declarative infrastructure definitions with convenient default configurations and helper methods via the AWS CDK. This example also reveals that if there are any gaps in the built-in functionality, you can easily fill them with a custom resource construct, or one of the thousands of open-source construct libraries shared by fellow CDK developers around the world. Happy coding!

Carlos Santos

Jeff Gardner

Jeff Gardner is a Solutions Architect with Amazon Web Services (AWS). In his role, Jeff helps enterprise customers through their cloud journey, leveraging his experience with application architecture and DevOps practices. Outside of work, Jeff enjoys watching and playing sports and chasing around his three young children.