Tag Archives: devops

Quickly adopt new AWS features with the Terraform AWS Cloud Control provider

2024-05-30 Welly Siauw

Post Syndicated from Welly Siauw original https://aws.amazon.com/blogs/devops/quickly-adopt-new-aws-features-with-the-terraform-aws-cloud-control-provider/

Introduction

Today, we are pleased to announce the general availability of the Terraform AWS Cloud Control (AWS CC) Provider, enabling our customers to take advantage of AWS innovations faster. AWS has been continually expanding its services to support virtually any cloud workload; supporting over 200 fully featured services and delighting customers through its rapid pace of innovation with over 3,400 significant new features in 2023. Our customers use Infrastructure as Code (IaC) tools such as HashiCorp Terraform among others as a best-practice to provision and manage these AWS features and services as part of their cloud infrastructure at scale. With the Terraform AWS CC Provider launch, AWS customers using Terraform as their IaC tool can now benefit from faster time-to-market by building cloud infrastructure with the latest AWS innovations that are typically available on the Terraform AWS CC Provider on the day of launch. For example, AWS customer Meta’s Oculus Studios was able to quickly leverage Amazon GameLift to support their game development. “AWS and Hashicorp have been great partners in helping Oculus Studios standardize how we deploy our GameLift infrastructure using industry best practices.” said Mick Afaneh, Meta’s Oculus Studios Central Technology.

The Terraform AWS CC Provider leverages AWS Cloud Control API to automatically generate support for hundreds of AWS resource types, such as Amazon EC2 instances and Amazon S3 buckets. Since the AWS CC provider is automatically generated, new features and services on AWS can be supported as soon as they are available on AWS Cloud Control API, addressing any coverage gaps in the existing Terraform AWS standard provider. This automated process allows the AWS CC provider to deliver new resources faster because it does not have to wait for the community to author schema and resource implementations for each new service. Today, the AWS CC provider supports 950+ AWS resources and data sources, with more support being added as AWS service teams continue to adopt the Cloud Control API standard.

As a Terraform practitioner, using the AWS CC Provider would feel familiar to the existing workflow. You can employ the configuration blocks shown below, while specifying your preferred region.

terraform {
  required_providers {
    awscc = {
      source  = "hashicorp/awscc"
      version = "~> 1.0"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "awscc" {
  region = "us-east-1"
}

provider "aws" {
  region = "us-east-1"
}

During Terraform plan or apply, the AWS CC Terraform provider interacts with AWS Cloud Control API to provision the resources by calling its consistent Create, Read, Update, Delete, or List (CRUD-L) APIs.

AWS Cloud Control API

AWS service teams own, publish, and maintain resources on the AWS CloudFormation Registry using a standardized resource model. This resource model uses uniform JSON schemas and provisioning logic that codifies the expected behavior and error handling associated with CRUD-L operations. This resource model enables AWS service teams to expose their service features in an easily discoverable, intuitive, and uniform format with standardized behavior. Launched in September 2021, AWS Cloud Control API exposes these resources through a set of five consistent CRUD-L operations without any additional work from service teams. Using Cloud Control API, developers can manage the lifecycle of hundreds of AWS and third-party resources with consistent resource-oriented API instead of using distinct service-specific APIs. Furthermore, Cloud Control API is up-to-date with the latest AWS resources as soon as they are available on the CloudFormation Registry, typically on the day of launch. You can read more on launch day requirement for Cloud Control API in this blog post. This enables AWS Partners such as HashiCorp to take advantage of consistent CRUD-L API operations and integrate Terraform with Cloud Control API just once, and then automatically access new AWS resources without additional integration work.

History and Evolution of the Terraform AWS CC Provider

The general availability of Terraform AWS CC Provider project is a culmination of 4+ years of collaboration between AWS and HashiCorp. Our teams partnered across the Product, Engineering, Partner, and Customer Support functions in influencing, shaping, and defining the customer experience leading up to the the technical preview announcement of the AWS CC provider in September 2021. At technical preview, the provider supported more than 300 resources. Since then, we have added an additional 600+ resources to the provider, bringing the total to 950+ supported resources at general availability.

Beyond just increasing resource coverage, we gathered additional signals from customer feedback during the technical preview and rolled out several improvements since September 2021. Customers care deeply about the user experience on the providers available on the Terraform registry. Customers sought practical examples in the form of sample HCL configurations for each resource that they could use to immediately test in order to confidently start using the provider. This prompted us to enrich the AWS CC provider with hundreds of practical examples for popular AWS CC provider resources in the Terraform registry. This was made possible by contributions of hundreds of Amazonians who became early adopters of the AWS CC provider. We also published a how-to guide for anyone interested in contributing to AWS CC provider examples. Furthermore, customers also wanted to minimize context switching by moving between Terraform and AWS service documentation on what each attribute of a resource signified and the type of values it needed as part of configuration. This empowered us to prioritize augmenting the provider with rich resource attribute description with information taken from AWS documentation. The documentation provides detailed information of how to use the attributes, enumerations of the accepted attribute values and other relevant information for dozens of popularly used AWS resources.

We also worked with HashiCorp on various bug fixes and feature enhancements for the AWS CC provider, as well as the upstream Cloud Control API dependencies. We improved handling for resources with complex nested attribute schemas, implemented various bug fixes to resolve unintended resource replacement, and refined provider behavior under various conditions to support the idempotency expected by Terraform practitioners. While this are not an exhaustive list of improvements, we continue to listen to customer feedback and iterate on improving the experience. We encourage you to try out the provider and share feedback on the AWS CC provider’s GitHub page.

Using the AWS CC Provider

Let’s take an example of a recently introduced service, Amazon Q Business, a fully managed, generative AI-powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. Amazon Q Business resources were available in AWS CC provider shortly after the April 30th 2024 launch announcement. In the following example, we’ll create a demo Amazon Q Business application and deploy the web experience.

data "aws_caller_identity" "current" {}

data "aws_ssoadmin_instances" "example" {}

resource "awscc_qbusiness_application" "example" {
  description                  = "Example QBusiness Application"
  display_name                 = "Demo_QBusiness_App"
  attachments_configuration    = {
    attachments_control_mode = "ENABLED"
  }
  identity_center_instance_arn = data.aws_ssoadmin_instances.example.arns[0]
}

resource "awscc_qbusiness_web_experience" "example" {
  application_id              = awscc_qbusiness_application.example.id
  role_arn                    = awscc_iam_role.example.arn
  subtitle                    = "Drop a file and ask questions"
  title                       = "Demo Amazon Q Business"
  welcome_message             = "Welcome, please enter your questions"
}

resource "awscc_iam_role" "example" {
  role_name   = "Amazon-QBusiness-WebExperience-Role"
  description = "Grants permissions to AWS Services and Resources used or managed by Amazon Q Business"
  assume_role_policy_document = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "QBusinessTrustPolicy"
        Effect = "Allow"
        Principal = {
          Service = "application.qbusiness.amazonaws.com"
        }
        Action = [
          "sts:AssumeRole",
          "sts:SetContext"
        ]
        Condition = {
          StringEquals = {
            "aws:SourceAccount" = data.aws_caller_identity.current.account_id
          }
          ArnEquals = {
            "aws:SourceArn" = awscc_qbusiness_application.example.application_arn
          }
        }
      }
    ]
  })
  policies = [{
    policy_name = "qbusiness_policy"
    policy_document = jsonencode({
      Version = "2012-10-17"
      Statement = [
        {
          Sid = "QBusinessConversationPermission"
          Effect = "Allow"
          Action = [
            "qbusiness:Chat",
            "qbusiness:ChatSync",
            "qbusiness:ListMessages",
            "qbusiness:ListConversations",
            "qbusiness:DeleteConversation",
            "qbusiness:PutFeedback",
            "qbusiness:GetWebExperience",
            "qbusiness:GetApplication",
            "qbusiness:ListPlugins",
            "qbusiness:GetChatControlsConfiguration"
          ]
          Resource = awscc_qbusiness_application.example.application_arn
        }
      ]
    })
  }]
}

As you see in this example, you can use both the AWS and AWS CC providers in the same configuration file. This allows you to easily incorporate new resources available in the AWS CC provider into your existing configuration with minimal changes. The AWS CC provider also accepts the same authentication method and provider-level features available in the AWS provider. This means you don’t have to add additional configuration in your CI/CD pipeline to start using the AWS CC provider. In addition, you can also add custom agent information inside the provider block as described in this documentation.

Things to know

The AWS CC provider is unique due to how it was developed and its dependencies with Cloud Control API and AWS resource model in the CloudFormation registry. As such, there are things that you should know before you start using the AWS CC provider.

The AWS CC provider is generated from the latest CloudFormation schemas, and will release weekly containing all new AWS services and enhancements added to Cloud Control API.
Certain resources available in the CloudFormation schema are not compatible with the AWS CC provider due to nuances in the schema implementation. You can find them on the GitHub issue list here. We are actively working to add these resources to the AWS CC provider.
The AWS CC provider requires Terraform CLI version 1.0.7 or higher.
Every AWS CC provider resource includes a top-level attribute `id` that acts as the resource identifier. If the CloudFormation resource schema also has a similarly named top-level attribute `id`, then that property is mapped to a new attribute named `<type>_id`. For example `web_experience_id` for `awscc_qbusiness_web_experience` resource.
If a resource attribute is not defined in the Terraform configuration, the AWS CC provider will honor the default values specified in the CloudFormation resource schema. If the resource schema does not include a default value, AWS CC provider will use attribute value stored in the Terraform state (taken from Cloud Control API GetResponse after resource was created).
In correlation to the default value behavior as stated above, when an attribute value is removed from the Terraform configuration (e.g. by commenting the attribute), the AWS CC provider will use the previous attribute value stored in the Terraform state. As such, no drift will be detected on the resource configuration when you run Terraform plan / apply.
The AWS CC provider data sources are either plural or singular with filters based on `id` attribute. Currently there is no native support for metadata sources such as `aws_region` or `aws_caller_identity`. You can continue to leverage the AWS provider data sources to complement your Terraform configuration.

If you want to dive deeper into AWS CC provider resource behavior, we encourage you to check the documentation here.

Conclusion

The AWS CC provider is now generally available and will be the fastest way for customers to access newly launched AWS features and services using Terraform. We will continue to add support for more resources, additional examples and enriching the schema descriptions. You can start using the AWS CC provider alongside your existing AWS standard provider. To learn more about the AWS CC provider, please check the HashiCorp announcement blog post. You can also follow the workshop on how to get started with AWS CC provider. If you are interested in contributing with practical examples for AWS CC provider resources, check out the how-to guide. For more questions or if you run into any issues with the new provider, don’t hesitate to submit your issue in the AWS CC provider GitHub repository.

Authors

Accelerate your Software Development Lifecycle with Amazon Q

2024-05-08 Chetan Makvana

Post Syndicated from Chetan Makvana original https://aws.amazon.com/blogs/devops/accelerate-your-software-development-lifecycle-with-amazon-q/

Software development teams are constantly looking for ways to accelerate their software development lifecycle (SDLC) to release quality software faster. Amazon Q, a generative AI–powered assistant, can help software development teams work more efficiently throughout the SDLC—from research to maintenance.

Software development teams spend significant time on undifferentiated tasks while analyzing requirements, building, testing, and operating applications. Trained on 17 years’ worth of AWS expertise, Amazon Q can transform how you build, deploy, and operate applications and workloads on AWS. By automating mundane tasks, Amazon Q enables development teams to spend more innovating and building. Amazon Q can speed up on-boarding, reduce context switching, and accelerate development of applications on AWS.

This blog post explores how Amazon Q can accelerate development tasks across the SDLC using an example To-Do API project. Throughout this blog, we will navigate through the various phases of the SDLC while implementing To-Do API by leveraging Amazon Q Business and Amazon Q Developer. We will walk through common use cases for Amazon Q Business in the planning and research phases, and Amazon Q Developer in the research, design, development, testing, and maintenance phases.

Planning

As a product owner, you spend significant time on requirements analysis and user story development. You research internal documents like functional specifications and business requirements to understand the desired functionalities and objectives. Manually sifting through documentation is time consuming. You can leverage Amazon Q Business to quickly extract relevant information from your internal documents or wikis, such as Confluence. Amazon Q Business quickly connects to your business data, information, and systems so that you can have tailored conversations, solve problems, generate content, and take actions relevant to your business. Amazon Q Business offers over 40 built-in connectors to popular enterprise applications and document repositories, including Amazon Simple Storage Service (Amazon S3), Confluence, Salesforce and more, enabling you to create a generative AI solution with minimal configuration. Amazon Q Business also provides plugins to interact with third-party applications. These plugins support read and write actions that can help boost end user productivity.

So, instead of digging through the internal documentations, you can simply ask Amazon Q Business about requirements using natural language and it will provide immediate and relevant information to the users, and helps streamline tasks and accelerate problem solving.

For our To-Do API example, let’s consider the business requirements are documented in Confluence, and Jira is utilized for issue management. You can configure Amazon Q Business with Confluence and Jira through the Confluence connector and Jira plugin, respectively. To understand requirements, you may ask Amazon Q Business for an overview of the use case, business drivers, non-functional requirements, and other related questions. Amazon Q Business then pulls the relevant details from the Confluence documents and presents them to you in a clear and concise manner. This allows you to save time gathering requirements and focus more on developing user stories.

Once you have a good understanding of the requirements, you can ask Amazon Q Business to write a user story and even create a Jira issue for you. For the To-Do API use case, Amazon Q Business generates the user stories tailored to the requirements and creates the corresponding Jira ticket ready for your team, saving you time and ensuring efficiency in the project workflow.

Research and Design

Let’s consider a scenario where the above mentioned user story is assigned to you and you have to implement it based on the technology stacks described in the confluence page.

First, you ask Amazon Q Business to gain insights into the technology stacks aligning with the organization’s development guidelines. Amazon Q Business promptly provides you with details sourced from the internal development guidelines document hosted on Confluence along with references and citations.

As a developer, you can use Amazon Q Developer in your integrated development environment (IDE) to get software development assistance, including code explanation, code generation, and code improvements such as debugging and optimization. Amazon Q Developer can help by analyzing the requirements, assessing different approaches, and creating an implementation plan and sample code. It can investigate options, weigh tradeoffs, recommend best practices, and even brainstorm with you to optimize the design.

Let’s see how Amazon Q Developer can help analyze the user story, design, and brainstorm with you arrive at an implementation plan.

Let us further refine the design with adding non-functional requirements such as security and performance.

Develop and Test

Amazon Q Developer can generate code snippets that meet your specified business and technical needs. You can review the auto-generated code, manually copy, and paste it into your editor, or use the Insert at cursor option to directly incorporate it into your source code. This allows you to rapidly prototype and iterate on new capabilities for your application. Amazon Q Developer uses the context of your conversation to inform future responses for the duration of your conversation. This makes it easy to help you focus on building applications because you don’t have to leave your IDE to get answers and context-specific coding guidance.

Amazon Q Developer is particularly useful for answering questions related to the following areas:

Building on AWS, including AWS service selection, limits, and best practices.
General software development concepts, including programming language syntax and application development.
Writing code, including explaining code, debugging code, and writing unit tests.
Upgrading and modernizing existing application code using Amazon Q Developer Agent for Code Transformation.

Expanding on the same user story design generated by Amazon Q Developer, you can ask Amazon Q Developer to implement the API and refine based on additional requirements and parameters. Let’s collaborate with Amazon Q Developer, to expand our design to implementation. You can leverage Amazon Q Developer’s expertise to ideate, evaluate options, and arrive at an optimal solution. Amazon Q Developer can have an intelligent discussion to brainstorm creative new test cases based on the requirements. It can then help construct an implementation plan, suggesting ways to efficiently add robust, comprehensive tests that cover edge cases.

Let’s ask Amazon Q Developer to generate code based on the design.

Now, let’s ask Amazon Q Developer to implement the AWS Lambda function.

Amazon Q Developer can provide code examples and snippets that show how to implement the design. You can review the code, get Amazon Q Developer’s feedback, and seamlessly integrate it into the project. Collaborating with Amazon Q Developer allows you to amplify your productivity by leveraging its knowledge to quickly iterate and enrich our application capabilities.

Amazon Q Developer can also review the code and find opportunities for improvements, optimization based on performance and other parameters. Let us ask Amazon Q Developer to find any opportunities for improvements on the code for our To-do API.

Debugging and Troubleshooting

Amazon Q Developer can assist you with troubleshooting and debugging your code. For unfamiliar error codes or exception types, you can ask Amazon Q Developer to research their meaning and common resolutions. Amazon Q Developer can also help by analyzing your applications’ debug logs, highlighting any anomalies, errors, or warnings that could indicate potential issues.

Amazon Q Developer can troubleshoot network connectivity issues caused by misconfiguration, providing concise problem analysis and resolution suggestions. Amazon Q Developer can also research AWS best practices to identify areas not aligned with recommendations. For code issues, it can answer questions and debug problems within supported IDEs. Leveraging its knowledge of AWS services and their interactions, Amazon Q Developer can provide service-specific guidance. In the AWS Management Console, Amazon Q Developer can troubleshoot errors you receive while working with AWS services such as insufficient permissions, incorrect configuration, and exceeding service limits.

Let’s test our To-Do API by hitting the Amazon API Gateway endpoint using cURL.

The API Gateway endpoint invokes the Lambda function to insert records in the Amazon DynamoDB table. Since it throws Internal Server Error, let’s go to the Lambda console to troubleshoot this further and test the function directly by creating a test event for the POST method. You can troubleshoot different console errors with Amazon Q Developer, directly in the AWS Management Console. For the above error, Amazon Q analyzes the issue and helps find the resolution. Amazon Q explains how to fix this error directly on the console by adding an environment variable for DynamoDB table name.

Now, let’s ask Amazon Q Developer in IDE to generate code to fix this error. Amazon Q Developer then generates a code snippet to set the desired environment variable in the AWS Cloud Development Kit (AWS CDK) code for Lambda function.

Conclusion

In this post, you learned how to leverage Amazon Q Business and Amazon Q Developer to streamline SDLC and accelerate time-to-market. With its deep understanding of code and AWS resources, Amazon Q Developer enables development teams to work efficiently throughout the research, design, development, testing, and review phases. By automating mundane tasks, offering expert guidance, generating code snippets, optimizing implementations, and troubleshooting issues, Amazon Q Developer allows developers to redirect their focus towards higher-value activities that drive innovation. Moreover, through Amazon Q Business, teams can leverage the power of generative AI to expedite the requirements planning and research phases.

Automate Terraform Deployments with Amazon CodeCatalyst and Terraform Community action

2024-05-02 Vineeth Nair

Post Syndicated from Vineeth Nair original https://aws.amazon.com/blogs/devops/automate-terraform-deployments-with-amazon-codecatalyst-and-terraform-community-action/

Amazon CodeCatalyst integrates continuous integration and deployment (CI/CD) by bringing key development tools together on one platform. With the entire application lifecycle managed in one tool, CodeCatalyst empowers rapid, dependable software delivery. CodeCatalyst offers a range of actions which is the main building block of a workflow, and defines a logical unit of work to perform during a workflow run. Typically, a workflow includes multiple actions that run sequentially or in parallel depending on how you’ve configured them.

Introduction

Infrastructure as code (IaC) has become a best practice for managing IT infrastructure. IaC uses code to provision and manage your infrastructure in a consistent, programmatic way. Terraform by HashiCorp is one of most common tools for IaC.

With Terraform, you define the desired end state of your infrastructure resources in declarative configuration files. Terraform determines the necessary steps to reach the desired state and provisions the infrastructure automatically. This removes the need for manual processes while enabling version control, collaboration, and reproducibility across your infrastructure.

In this blog post, we will demonstrate using the “Terraform Community Edition” action in CodeCatalyst to create resources in an AWS account.

Amazon CodeCatalyst workflow overview
Figure 1: Amazon CodeCatalyst Action

Prerequisites

To follow along with the post, you will need the following items:

An AWS Builder ID for signing in to CodeCatalyst.
A CodeCatalyst space
Have the Space administrator role assigned in your CodeCatalyst space
Have an AWS account associated with your space along with an associated IAM role
A CodeCatalyst project with a source repository
A CodeCatalyst environment configured with a connection to your target AWS account
An Amazon S3 Bucket to store Terraform remote state file
An Amazon DynamoDB Table to manage the locking of the state file during Terraform operations.

Walkthrough

In this walkthrough we create an Amazon S3 bucket using the Terraform Community Edition action in Amazon CodeCatalyst. The action will execute the Terraform commands needed to apply your configuration. You configure the action with a specified Terraform version. When the action runs it uses that Terraform version to deploy your Terraform templates, provisioning the defined infrastructure. This action will run terraform init to initialize the working directory, terraform plan to preview changes, and terraform apply to create the Amazon S3 bucket based on the Terraform configuration in a target AWS Account. At the end of the post your workflow will look like the following:

Amazon CodeCatalyst Workflow with Terraform Community Action

Figure 2: Amazon CodeCatalyst Workflow with Terraform Community Action

Create the base workflow

To begin, we create a workflow that will execute our Terraform code. In the CodeCatalyst project, click on CI/CD on left pane and select Workflows. In the Workflows pane, click on Create Workflow.

Creating Amazon CodeCatalyst Workflow

Figure 3: Creating Amazon CodeCatalyst Workflow

We have taken an existing repository my-sample-terraform-repository as a source repository.

Creating Workflow from source repository

Figure 4 : Creating Workflow from source repository

Once the source repository is selected, select Branch as main and click Create. You will have an empty workflow. You can edit the workflow from within the CodeCatalyst console. Click on the Commit button to create an initial commit:

Initial Workflow commit

Figure 5: Initial Workflow commit

On the Commit Workflow dialogue, add a commit message, and click on Commit. Ignore any validation errors at this stage:

Completing Initial Commit for Workflow

Figure 6: Completing Initial Commit for Workflow

Connect to CodeCatalyst Dev Environment

For this post, we will use an AWS Cloud9 Dev Environment to edit our workflow. Your first step is to connect to the dev environment. Select Code → Dev Environments.

Navigate to CodeCatalyst Dev Environments

Figure 7 : Navigate to CodeCatalyst Dev Environments

If you do not already have a Dev Environment you can create an instance by selecting the Create Dev Environment dropdown and selecting AWS Cloud9 (in browser). Leave the options as default and click on Create to provision a new Dev Environment.

Create CodeCatalyst Dev Environment

Figure 8: Create CodeCatalyst Dev Environment

Once the Dev Environment has provisioned, you are redirected to a Cloud9 instance in browser. The Dev Environment automatically clones the existing repository for the Terraform project code. We at first create a main.tf file in root of the repository with the Terraform code for creating an Amazon S3 bucket. To do this, we right click on the repository folder in the tree-pane view on the left side of the Cloud9 Console window and select New File

Creating a new file in Cloud9

Figure 9: Creating a new file in Cloud9

We are presented with a new file which we will name main.tf, this file will store the Terraform code. We then edit main.tf by right clicking on the file and selecting open. We insert the code below into main.tf. The code has a Terraform resource block to create an AWS S3 Bucket. The configuration also uses Terraform AWS datasources to obtain AWS region and AWS Account ID data which is used to form part of the bucket name. Finally, we use a backend block to configure Terraform to use an AWS S3 bucket to store Terraform state data. To save our changes we select File -> Save

: Adding Terraform Code

Figure 10: Adding Terraform Code

Now let’s start creating Terraform Workflow using Amazon CodeCatalyst Terraform Community Action. Within your repository go to .codecatalyst/workflows directory and open the <workflowname.yaml> file.

Creating CodeCatalyst Workflow

Figure 11: Creating CodeCatalyst Workflow

The below code snippet is an example workflow definition with terraform plan and terraform apply. We will enter this into our workflow file, with the relevant configuration settings for our environment.

The workflow does the following:

When a change is pushed to the main branch, a new workflow execution is triggered. This workflow carries a Terraform plan and subsequent apply operation.

Name: terraform-action-workflow
Compute:
  Type: EC2
  Fleet: Linux.x86-64.Large
SchemaVersion: "1.0"
Triggers:
  - Type: Push
    Branches:
      -  main
Actions: 
  PlanTerraform:
    Identifier: codecatalyst-labs/provision-with-terraform-community@v1
    Environment:
      Name: dev 
      Connections:
        - Name: codecatalyst
          Role: CodeCatalystWorkflowDevelopmentRole # The IAM role to be used
    Inputs:
      Sources:
        - WorkflowSource
    Outputs:
      Artifacts:
        - Name: tfplan # generates a tfplan output artifact
          Files:
            - tfplan.out
    Configuration:
      AWSRegion: eu-west-2
      StateBucket: tfstate-bucket # The Terraform state S3 Bucket
      StateKey: terraform.tfstate # The Terraform state file
      StateKeyPrefix: states/ # The path to the state file (optional)
      StateTable: tfstate-table # The Dynamo DB database
      TerraformVersion: ‘1.5.1’ # The Terraform version to be used
      TerraformOperationMode: plan # The Terraform operation- can be plan or apply
  ApplyTerraform:
    Identifier: codecatalyst-labs/provision-with-terraform-community@v1
    DependsOn:
      - PlanTerraform
    Environment:
      Name: dev 
      Connections:
        - Name: codecatalyst
          Role: CodeCatalystWorkflowDevelopmentRole
    Inputs:
      Sources:
        - WorkflowSource
      Artifacts:
        - tfplan
    Configuration:
      AWSRegion: eu-west-2
      StateBucket: tfstate-bucket
      StateKey: terraform.tfstate
      StateKeyPrefix: states/
      StateTable: tfstate-table
      TerraformVersion: '1.5.1'
      TerraformOperationMode: apply

Key configuration parameters are:
- Environment.Name: The name of our CodeCatalyst Environment
- Environment.Connections.Name: The name of the CodeCatalyst connection
- Environment.Connections.Role: The IAM role used for the workflow
- AWSRegion: The AWS region that hosts the Terraform state bucket
- Environment.Name: The name of our CodeCatalyst Environment
- Identifier: codecatalyst-labs/provision-with-terraform-community@v1
- StateBucket: The Terraform state bucket
- StateKey: The Terraform statefile e.g. terraform.tfstate
- StateKeyPrefix: The folder location of the State file (optional)
- StateTable: The DynamoDB State table
- TerraformVersion: The version of Terraform to be installed
- TerraformOperationMode: The operation mode for Terraform – this can be either ‘plan’ or ‘apply’

The workflow now contains CodeCatalyst action for Terraform Plan and Terraform Apply.

To save our changes we select File -> Save, we can then commit these to our git repository by typing the following at the terminal:

git add . && git commit -m ‘adding terraform workflow and main.tf’ && git push

The above command adds the workflow file and Terraform code to be tracked by git. It then commits the code and pushes the changes to CodeCatalyst git repository. As we have a branch trigger for main defined, this will trigger a run of the workflow. We can monitor the status of the workflow in the CodeCatalyst console by selecting CICD -> Workflows. Locate your workflow and click on Runs to view the status. You will be able to observe that the workflow has successfully completed and Amazon S3 bucket is created.

: CodeCatalyst Workflow Status

Figure 12: CodeCatalyst Workflow Status

Cleaning up

If you have been following along with this workflow, you should delete the resources that you have deployed to avoid further charges. The walkthrough will create an Amazon S3 bucket named <your-aws-account-id>-<your-aws-region>-terraform-sample-bucket in your AWS account. In the AWS Console > S3, locate the bucket that was created, then select and click Delete to remove the bucket.

Conclusion

In this post, we explained how you can easily get started deploying IaC to your AWS accounts with Amazon CodeCatalyst. We outlined how the Terraform Community Edition action can streamline the process of planning and applying Terraform configurations and how to create a workflow that can leverage this action. Get started with Amazon CodeCatalyst today.

Add your Ruby gems to AWS CodeArtifact

2024-05-01 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/add-your-ruby-gems-to-aws-codeartifact/

Ruby developers can now use AWS CodeArtifact to securely store and retrieve their gems. CodeArtifact integrates with standard developer tools like gem and bundler.

Applications often use numerous packages to speed up development by providing reusable code for common tasks like network access, cryptography, or data manipulation. Developers also embed SDKs–such as the AWS SDKs–to access remote services. These packages may come from within your organization or from third parties like open source projects. Managing packages and dependencies is integral to software development. Languages like Java, C#, JavaScript, Swift, and Python have tools for downloading and resolving dependencies, and Ruby developers typically use gem and bundler.

However, using third-party packages presents legal and security challenges. Organizations must ensure package licenses are compatible with their projects and don’t violate intellectual property. They must also verify that the included code is safe and doesn’t introduce vulnerabilities, a tactic known as a supply chain attack. To address these challenges, organizations typically use private package servers. Developers can only use packages vetted by security and legal teams made available through private repositories.

CodeArtifact is a managed service that allows the safe distribution of packages to internal developer teams without managing the underlying infrastructure. CodeArtifact now supports Ruby gems in addition to npm, PyPI, Maven, NuGet, SwiftPM, and generic formats.

You can publish and download Ruby gem dependencies from your CodeArtifact repository in the AWS Cloud, working with existing tools such as gem and bundler. After storing packages in CodeArtifact, you can reference them in your Gemfile. Your build system will then download approved packages from the CodeArtifact repository during the build process.

How to get started
Imagine I’m working on a package to be shared with other development teams in my organization.

In this demo, I show you how I prepare my environment, upload the package to the repository, and use this specific package build as a dependency for my project. I focus on the steps specific to Ruby packages. You can read the tutorial written by my colleague Steven to get started with CodeArtifact.

I use an AWS account that has a package repository (MyGemsRepo) and domain (stormacq-test) already configured.

To let the Ruby tools acess my CodeArtifact repository, I start by collecting an authentication token from CodeArtifact.

export CODEARTIFACT_AUTH_TOKEN=`aws codeartifact get-authorization-token \
                                     --domain stormacq-test              \
                                     --domain-owner 012345678912         \
                                     --query authorizationToken          \
                                     --output text`

export GEM_HOST_API_KEY="Bearer $CODEARTIFACT_AUTH_TOKEN"

Note that the authentication token expires after 12 hours. I must repeat this command after 12 hours to obtain a fresh token.

Then, I request the repository endpoint. I pass the domain name and domain owner (the AWS account ID). Notice the --format ruby option.

export RUBYGEMS_HOST=`aws codeartifact get-repository-endpoint  \
                           --domain stormacq-test               \
                           --domain-owner 012345678912          \
                           --format ruby                        \
                           --repository MyGemsRepo              \
                           --query repositoryEndpoint           \
                           --output text`

Now that I have the repository endpoint and an authentication token, gem will use these environment variable values to connect to my private package repository.

I create a very simple project, build it, and send it to the package repository.

$ gem build hola.gemspec 

Successfully built RubyGem
  Name: hola-codeartifact
  Version: 0.0.0
  File: hola-codeartifact-0.0.0.gem
  
$ gem push hola-codeartifact-0.0.0.gem 
Pushing gem to https://stormacq-test-486652066693.d.codeartifact.us-west-2.amazonaws.com/ruby/MyGemsRepo...

I verify in the console that the package is available.

Now that the package is available, I can use it in my projects as usual. This involves configuring the local ~/.gemrc file on my machine. I follow the instructions provided by the console, and I make sure I replace ${CODEARTIFACT_AUTH_TOKEN} with its actual value.

Once ~/.gemrc is correctly configured, I can install gems as usual. They will be downloaded from my private gem repository.

$ gem install hola-codeartifact

Fetching hola-codeartifact-0.0.0.gem
Successfully installed hola-codeartifact-0.0.0
Parsing documentation for hola-codeartifact-0.0.0
Installing ri documentation for hola-codeartifact-0.0.0
Done installing documentation for hola-codeartifact after 0 seconds
1 gem installed

Install from upstream
I can also associate my repository with an upstream source. It will automatically fetch gems from upstream when I request one.

To associate the repository with rubygems.org, I use the console, or I type

aws codeartifact  associate-external-connection \
                   --domain stormacq-test       \
                   --repository MyGemsRepo      \
                   --external-connection public:ruby-gems-org

{
    "repository": {
        "name": "MyGemsRepo",
        "administratorAccount": "012345678912",
        "domainName": "stormacq-test",
        "domainOwner": "012345678912",
        "arn": "arn:aws:codeartifact:us-west-2:012345678912:repository/stormacq-test/MyGemsRepo",
        "upstreams": [],
        "externalConnections": [
            {
                "externalConnectionName": "public:ruby-gems-org",
                "packageFormat": "ruby",
                "status": "AVAILABLE"
            }
        ],
        "createdTime": "2024-04-12T12:58:44.101000+02:00"
    }
}

Once associated, I can pull any gems through CodeArtifact. It will automatically fetch packages from upstream when not locally available.

$ gem install rake 

Fetching rake-13.2.1.gem
Successfully installed rake-13.2.1
Parsing documentation for rake-13.2.1
Installing ri documentation for rake-13.2.1
Done installing documentation for rake after 0 seconds
1 gem installed

I use the console to verify the rake package is now available in my repo.

Things to know
There are some things to keep in mind before uploading your first Ruby packages.

Be sure to update to the latest version of the AWS CLI before trying any command shown in the preceding instructions. Only the latest version of the CLI knows about Ruby repositories in CodeArtifact.
The authentication token expires after 12 hours. We suggest writing a script to automate its renewal or using a scheduled AWS Lambda function and securely storing the token in AWS Secrets Manager (for example).

Pricing and availability
CodeArtifact costs for Ruby packages are the same as for the other package formats already supported. CodeArtifact billing depends on three metrics: the storage (measured in GB per month), the number of requests, and the data transfer out to the internet or to other AWS Regions. Data transfer to AWS services in the same Region is not charged, meaning you can run your continuous integration and delivery (CI/CD) jobs on Amazon Elastic Compute Cloud (Amazon EC2) or AWS CodeBuild, for example, without incurring a charge for the CodeArtifact data transfer. As usual, the pricing page has the details.

CodeArtifact for Ruby packages is available in all 13 Regions where CodeArtifact is available.

Now, go build your Ruby applications and upload your private packages to CodeArtifact!

— seb

De-risk releases with AWS CodePipeline rollbacks

2024-04-29 Matt Laver

Post Syndicated from Matt Laver original https://aws.amazon.com/blogs/devops/de-risk-releases-with-aws-codepipeline-rollbacks/

It’s an established practice for development teams to build deployment pipelines, with services such as AWS CodePipeline, to increase the quality of application and infrastructure releases through reliable, repeatable and consistent automation.

Automating the deployment process helps build quality into our products by introducing continuous integration to build and test code as early as possible, aiming to catch errors before they reach production, but with all the best will in the world, issues can be missed and not caught until after a production release has been initiated.

In our AWS DevOps Guidance we include a DevOps Sagas indicator to Implement automatic rollbacks for failed deployments:

“Implement an automatic rollback strategy to enhance system reliability and minimize service disruptions. The strategy should be defined as a proactive measure in case of an operational event, which prioritizes customer impact mitigation even before identifying whether the new deployment is the cause of the issue.”

When release problems occur, pressure is put on teams to fix the issue quickly which can be time consuming and stressful. Is the issue related to code in the last change? Has the environment changed? Should we manually fix-forward by urgently fixing the code and re-releasing?

AWS CodePipeline recently added a stage level rollback feature that enables customers to recover from a failed pipeline quickly by processing the source revisions that previously successfully completed the failed stage.

In this blog post I will cover how the rollback feature can be enabled and walk through two scenarios to cover both automatic and manual rollbacks.

AWS CodePipeline

AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.

Figure 1. AWS CodePipeline orchestration stages

CodePipeline has three core constructs:

Pipelines – A pipeline is a workflow construct that describes how software changes go through a release process. Each pipeline is made up of a series of stages.
Stages – A stage is a logical unit you can use to isolate an environment and to limit the number of concurrent changes in that environment. Each stage contains actions that are performed on the application artifacts.
Actions – An action is a set of operations performed on application code and configured so that the actions run in the pipeline at a specified point.

A pipeline execution is a set of changes released by a pipeline. Each pipeline execution is unique and has its own ID. An execution corresponds to a set of changes, such as a merged commit or a manual release of the latest commit.

Enable Automatic rollbacks

Automatic rollbacks can be enabled at the stage level within V2 type pipelines. When automatic rollbacks are enabled on a stage that has failed, CodePipeline will automatically select the source revisions from the most recent pipeline execution that successfully completed the stage and initiate the rollback.

Enabling automatic rollbacks can be set on V2 pipeline creation by selecting Configure automatic rollback on stage failure on a given stage, as per Figure 2 below.

Figure 2. Enable automatic rollback feature on pipeline creation

Automated rollbacks can be enabled for any stage, except the “Source” stage.

For existing V2 pipelines, Automatic rollbacks can also be toggled by editing a stage and toggling Configure automatic rollback on stage failure, see Figure 3 below.

Figure 3. Enable automatic rollback feature on an existing pipeline

Enabling automatic rollback on an existing pipeline will change the pipeline configuration which means pipeline executions before this step will not be considered an eligible pipeline execution. Only successful pipeline executions from this point onwards will be able to be considered for a rollback.

Example 1 – Automated Rollbacks

To demonstrate how auto-rollbacks work, I’ve created a simple pipeline based on one of the CodePipeline tutorials: Create a simple pipeline (S3 bucket). When following this tutorial, In Step 4: Create your first pipeline in CodePipeline there are two differences:

Ensure a V2 pipeline type is selected when creating the pipeline
When in the add deploy stage, select Configure automatic rollback on stage failure

Once the pipeline has been created, it should complete its first run within a few minutes.

Figure 4. Simple pipeline (S3 bucket)

Now I am going to simulate a deployment failure, and this is how:

Locate and un-zip either SampleApp_Windows.zip or SampleApp_Linux.zip depending on which server instances was selected when following the CodePipeline tutorial.
Delete the appspec.yml file.
Re-zip the application and upload to the Amazon S3 bucket created in Step1: Create an S3 bucket for your application.

The pipeline will trigger again after the new archive is uploaded and after a few minutes the deployment will report a stage failure and trigger an automatic rollback on the “Deploy” stage.

It’s possible to observe the Rollback as it re-deploys the last successful pipeline execution, initially showing a status of “In-progress” as per Figure 5 below:

Figure 5. Rollback in-progress

A new pipeline execution ID can be seen in the stage with the source revisions from the previous successfully completed pipeline execution. After a few minutes the rollback will be marked as “Succeeded”:

Figure 6. Rollback Succeeded

Note that the rollback started a new Pipeline execution ID but it used the artifacts and variables from the prior successful pipeline execution.

With our production systems back in a healthy state, I can begin the investigations into why the deploy failed, the execution history, see Figure 7 below, shows the failed execution and allows the Execution ID to be inspected for more details on the failure.

Figure 7. Execution history showing the failed pipeline execution.

Note that the Trigger column also reports the FailedPipelineExecutionId, we can use the provided link to start investigating the failure.

Example 2 – Manual Rollbacks

There may be cases where the pipeline release was successful but it caused application errors such as errors in logs and infrastructure alarms. In these scenarios a manual rollback can be initiated via the Start rollback button.

Start rollback button available in Deploy stage

Figure 8. Start rollback

When manually triggering a rollback, it is possible to select a specific Execution ID and Source revision to re-deploy:

Figure 9. Select Execution ID

Selecting Start rollback will create a new Pipeline execution in the stage with the selected source revisions.

Clean up

Instructions on how to clean up the resources created in this blog can be found in the last step of the tutorial used to create the pipeline: Step 7: Clean up resources.

Conclusion

The absence of a rollback strategy can lead to prolonged service disruptions and compatibility issues. Introducing an automatic rollback mechanism to stages within a release pipeline can reduce downtime, maintain system reliability and can help return to a stable state in the event of a fault. Having a rollback plan for deployments, without any disruption for our customers, is critical in making a service reliable. Explicitly testing for rollback safety eliminates the need to rely on manual analysis, which can be error-prone.

To learn more about CodePipeline rollbacks, visit https://docs.aws.amazon.com/codepipeline/latest/userguide/stage-rollback.html.

About the author:

Accelerate security automation using Amazon CodeWhisperer

2024-04-15 Brendan Jenkins

Post Syndicated from Brendan Jenkins original https://aws.amazon.com/blogs/security/accelerate-security-automation-using-amazon-codewhisperer/

In an ever-changing security landscape, teams must be able to quickly remediate security risks. Many organizations look for ways to automate the remediation of security findings that are currently handled manually. Amazon CodeWhisperer is an artificial intelligence (AI) coding companion that generates real-time, single-line or full-function code suggestions in your integrated development environment (IDE) to help you quickly build software. By using CodeWhisperer, security teams can expedite the process of writing security automation scripts for various types of findings that are aggregated in AWS Security Hub, a cloud security posture management (CSPM) service.

In this post, we present some of the current challenges with security automation and walk you through how to use CodeWhisperer, together with Amazon EventBridge and AWS Lambda, to automate the remediation of Security Hub findings. Before reading further, please read the AWS Responsible AI Policy.

Current challenges with security automation

Many approaches to security automation, including Lambda and AWS Systems Manager Automation, require software development skills. Furthermore, the process of manually writing code for remediation can be a time-consuming process for security professionals. To help overcome these challenges, CodeWhisperer serves as a force multiplier for qualified security professionals with development experience to quickly and effectively generate code to help remediate security findings.

Security professionals should still cultivate software development skills to implement robust solutions. Engineers should thoroughly review and validate any generated code, as manual oversight remains critical for security.

Solution overview

Figure 1 shows how the findings that Security Hub produces are ingested by EventBridge, which then invokes Lambda functions for processing. The Lambda code is generated with the help of CodeWhisperer.

Figure 1: Diagram of the solution

Security Hub integrates with EventBridge so you can automatically process findings with other services such as Lambda. To begin remediating the findings automatically, you can configure rules to determine where to send findings. This solution will do the following:

Ingest an Amazon Security Hub finding into EventBridge.
Use an EventBridge rule to invoke a Lambda function for processing.
Use CodeWhisperer to generate the Lambda function code.

It is important to note that there are two types of automation for Security Hub finding remediation:

Partial automation, which is initiated when a human worker selects the Security Hub findings manually and applies the automated remediation workflow to the selected findings.
End-to-end automation, which means that when a finding is generated within Security Hub, this initiates an automated workflow to immediately remediate without human intervention.

Important: When you use end-to-end automation, we highly recommend that you thoroughly test the efficiency and impact of the workflow in a non-production environment first before moving forward with implementation in a production environment.

Prerequisites

To follow along with this walkthrough, make sure that you have the following prerequisites in place:

An AWS account
Visual Studio Code (VS Code) or supported JetBrains IDEs
Python
CodeWhisperer enabled locally in your IDE
AWS Config enabled
Security Hub enabled, with the NIST Special Publication 800-53 Revision 5 standard selected

Implement security automation

In this scenario, you have been tasked with making sure that versioning is enabled across all Amazon Simple Storage Service (Amazon S3) buckets in your AWS account. Additionally, you want to do this in a way that is programmatic and automated so that it can be reused in different AWS accounts in the future.

To do this, you will perform the following steps:

Generate the remediation script with CodeWhisperer
Create the Lambda function
Integrate the Lambda function with Security Hub by using EventBridge
Create a custom action in Security Hub
Create an EventBridge rule to target the Lambda function
Run the remediation

Generate a remediation script with CodeWhisperer

The first step is to use VS Code to create a script so that CodeWhisperer generates the code for your Lambda function in Python. You will use this Lambda function to remediate the Security Hub findings generated by the [S3.14] S3 buckets should use versioning control.

Note: The underlying model of CodeWhisperer is powered by generative AI, and the output of CodeWhisperer is nondeterministic. As such, the code recommended by the service can vary by user. By modifying the initial code comment to prompt CodeWhisperer for a response, customers can change the corresponding output to help meet their needs. Customers should subject all code generated by CodeWhisperer to typical testing and review protocols to verify that it is free of errors and is in line with applicable organizational security policies. To learn about best practices on prompt engineering with CodeWhisperer, see this AWS blog post.

To generate the remediation script

Open a new VS Code window, and then open or create a new folder for your file to reside in.
Create a Python file called cw-blog-remediation.py as shown in Figure 2.

Figure 2: New VS Code file created called cw-blog-remediation.py
Add the following imports to the Python file.
```
import json
import boto3
```
Because you have the context added to your file, you can now prompt CodeWhisperer by using a natural language comment. In your file, below the import statements, enter the following comment and then press Enter.
```
# Create lambda function that turns on versioning for an S3 bucket after the function is triggered from Amazon EventBridge
```
Accept the first recommendation that CodeWhisperer provides by pressing Tab to use the Lambda function handler, as shown in Figure 3.
&ngsp;

Figure 3: Generation of Lambda handler

To get the recommendation for the function from CodeWhisperer, press Enter. Make sure that the recommendation you receive looks similar to the following. CodeWhisperer is nondeterministic, so its recommendations can vary.

import json
import boto3

# Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
def lambda_handler(event, context):
    s3 = boto3.client('s3')
    bucket = event['detail']['requestParameters']['bucketName']
    response = s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={
            'Status': 'Enabled'
        }
    )
    print(response)
    return {
        'statusCode': 200,
        'body': json.dumps('Versioning enabled for bucket ' + bucket)
    }

Take a moment to review the user actions and keyboard shortcut keys. Press Tab to accept the recommendation.
You can change the function body to fit your use case. To get the Amazon Resource Name (ARN) of the S3 bucket from the EventBridge event, replace the bucket variable with the following line:
```
bucket = event['detail']['findings'][0]['Resources'][0]['Id']
```

To prompt CodeWhisperer to extract the bucket name from the bucket ARN, use the following comment:

# Take the S3 bucket name from the ARN of the S3 bucket

Your function code should look similar to the following:

import json
import boto3

# Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
def lambda_handler(event, context):
    s3 = boto3.client('s3')
   bucket = event['detail']['findings'][0]['Resources'][0]['Id']
         # Take the S3 bucket name from the ARN of the S3 bucket
   bucket = bucket.split(':')[5]

    response = s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={
            'Status': 'Enabled'
        }
    )
    print(response)
    return {
        'statusCode': 200,
        'body': json.dumps('Versioning enabled for bucket ' + bucket)
    }

Create a .zip file for cw-blog-remediation.py. Find the file in your local file manager, right-click the file, and select compress/zip. You will use this .zip file in the next section of the post.

Create the Lambda function

The next step is to use the automation script that you generated to create the Lambda function that will enable versioning on applicable S3 buckets.

To create the Lambda function

Open the AWS Lambda console.
In the left navigation pane, choose Functions, and then choose Create function.
Select Author from Scratch and provide the following configurations for the function:
1. For Function name, select sec_remediation_function.
2. For Runtime, select Python 3.12.
3. For Architecture, select x86_64.
4. For Permissions, select Create a new role with basic Lambda permissions.
Choose Create function.
To upload your local code to Lambda, select Upload from and then .zip file, and then upload the file that you zipped.
Verify that you created the Lambda function successfully. In the Code source section of Lambda, you should see the code from the automation script displayed in a new tab, as shown in Figure 4.

Figure 4: Source code that was successfully uploaded
Choose the Code tab.
Scroll down to the Runtime settings pane and choose Edit.
For Handler, enter cw-blog-remediation.lambda_handler for your function handler, and then choose Save, as shown in Figure 5.

Figure 5: Updated Lambda handler
For security purposes, and to follow the principle of least privilege, you should also add an inline policy to the Lambda function’s role to perform the tasks necessary to enable versioning on S3 buckets.
1. In the Lambda console, navigate to the Configuration tab and then, in the left navigation pane, choose Permissions. Choose the Role name, as shown in Figure 6.
  
  Figure 6: Lambda role in the AWS console
2. In the Add permissions dropdown, select Create inline policy.
  
  Figure 7: Create inline policy
3. Choose JSON, add the following policy to the policy editor, and then choose Next.
```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "s3:PutBucketVersioning",
            "Resource": "*"
        }
    ]
}
```
4. Name the policy PutBucketVersioning and choose Create policy.

Create a custom action in Security Hub

In this step, you will create a custom action in Security Hub.

To create the custom action

Open the Security Hub console.
In the left navigation pane, choose Settings, and then choose Custom actions.
Choose Create custom action.
Provide the following information, as shown in Figure 8:
- For Name, enter TurnOnS3Versioning.
- For Description, enter Action that will turn on versioning for a specific S3 bucket.
- For Custom action ID, enter TurnOnS3Versioning.
  
  Figure 8: Create a custom action in Security Hub
Choose Create custom action.
Make a note of the Custom action ARN. You will need this ARN when you create a rule to associate with the custom action in EventBridge.

Create an EventBridge rule to target the Lambda function

The next step is to create an EventBridge rule to capture the custom action. You will define an EventBridge rule that matches events (in this case, findings) from Security Hub that were forwarded by the custom action that you defined previously.

To create the EventBridge rule

Navigate to the EventBridge console.
On the right side, choose Create rule.
On the Define rule detail page, give your rule a name and description that represents the rule’s purpose—for example, you could use the same name and description that you used for the custom action. Then choose Next.
Scroll down to Event pattern, and then do the following:
1. For Event source, make sure that AWS services is selected.
2. For AWS service, select Security Hub.
3. For Event type, select Security Hub Findings – Custom Action.
4. Select Specific custom action ARN(s) and enter the ARN for the custom action that you created earlier.
Figure 9: Specify the EventBridge event pattern for the Security Hub custom action workflow

As you provide this information, the Event pattern updates.
Choose Next.
On the Select target(s) step, in the Select a target dropdown, select Lambda function. Then from the Function dropdown, select sec_remediation_function.
Choose Next.
On the Configure tags step, choose Next.
On the Review and create step, choose Create rule.

Run the automation

Your automation is set up and you can now test the automation. This test covers a partial automation workflow, since you will manually select the finding and apply the remediation workflow to one or more selected findings.

Important: As we mentioned earlier, if you decide to make the automation end-to-end, you should assess the impact of the workflow in a non-production environment. Additionally, you may want to consider creating preventative controls if you want to minimize the risk of event occurrence across an entire environment.

To run the automation

In the Security Hub console, on the Findings tab, add a filter by entering Title in the search box and selecting that filter. Select IS and enter S3 general purpose buckets should have versioning enabled (case sensitive). Choose Apply.
In the filtered list, choose the Title of an active finding.
Before you start the automation, check the current configuration of the S3 bucket to confirm that your automation works. Expand the Resources section of the finding.
Under Resource ID, choose the link for the S3 bucket. This opens a new tab on the S3 console that shows only this S3 bucket.
In your browser, go back to the Security Hub tab (don’t close the S3 tab—you will need to return to it), and on the left side, select this same finding, as shown in Figure 10.

Figure 10: Filter out Security Hub findings to list only S3 bucket-related findings
In the Actions dropdown list, choose the name of your custom action.

Figure 11: Choose the custom action that you created to start the remediation workflow
When you see a banner that displays Successfully started action…, go back to the S3 browser tab and refresh it. Verify that the S3 versioning configuration on the bucket has been enabled as shown in figure 12.

Figure 12: Versioning successfully enabled

Conclusion

In this post, you learned how to use CodeWhisperer to produce AI-generated code for custom remediations for a security use case. We encourage you to experiment with CodeWhisperer to create Lambda functions that remediate other Security Hub findings that might exist in your account, such as the enforcement of lifecycle policies on S3 buckets with versioning enabled, or using automation to remove multiple unused Amazon EC2 elastic IP addresses. The ability to automatically set public S3 buckets to private is just one of many use cases where CodeWhisperer can generate code to help you remediate Security Hub findings.

To sum up, CodeWhisperer acts as a tool that can help boost the productivity of security experts who have coding abilities, assisting them to swiftly write code to address security issues. However, security specialists should continue building their software development capabilities to implement robust solutions. Engineers should carefully review and test any generated code, since human oversight is still vital for security.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Terraform CI/CD and testing on AWS with the new Terraform Test Framework

2024-04-03 Kevon Mayers

Post Syndicated from Kevon Mayers original https://aws.amazon.com/blogs/devops/terraform-ci-cd-and-testing-on-aws-with-the-new-terraform-test-framework/

Graphic created by Kevon Mayers

Introduction

Organizations often use Terraform Modules to orchestrate complex resource provisioning and provide a simple interface for developers to enter the required parameters to deploy the desired infrastructure. Modules enable code reuse and provide a method for organizations to standardize deployment of common workloads such as a three-tier web application, a cloud networking environment, or a data analytics pipeline. When building Terraform modules, it is common for the module author to start with manual testing. Manual testing is performed using commands such as terraform validate for syntax validation, terraform plan to preview the execution plan, and terraform apply followed by manual inspection of resource configuration in the AWS Management Console. Manual testing is prone to human error, not scalable, and can result in unintended issues. Because modules are used by multiple teams in the organization, it is important to ensure that any changes to the modules are extensively tested before the release. In this blog post, we will show you how to validate Terraform modules and how to automate the process using a Continuous Integration/Continuous Deployment (CI/CD) pipeline.

Terraform Test

Terraform test is a new testing framework for module authors to perform unit and integration tests for Terraform modules. Terraform test can create infrastructure as declared in the module, run validation against the infrastructure, and destroy the test resources regardless if the test passes or fails. Terraform test will also provide warnings if there are any resources that cannot be destroyed. Terraform test uses the same HashiCorp Configuration Language (HCL) syntax used to write Terraform modules. This reduces the burden for modules authors to learn other tools or programming languages. Module authors run the tests using the command terraform test which is available on Terraform CLI version 1.6 or higher.

Module authors create test files with the extension *.tftest.hcl. These test files are placed in the root of the Terraform module or in a dedicated tests directory. The following elements are typically present in a Terraform tests file:

Provider block: optional, used to override the provider configuration, such as selecting AWS region where the tests run.
Variables block: the input variables passed into the module during the test, used to supply non-default values or to override default values for variables.
Run block: used to run a specific test scenario. There can be multiple run blocks per test file, Terraform executes run blocks in order. In each run block you specify the command Terraform (plan or apply), and the test assertions. Module authors can specify the conditions such as: length(var.items) != 0. A full list of condition expressions can be found in the HashiCorp documentation.

Terraform tests are performed in sequential order and at the end of the Terraform test execution, any failed assertions are displayed.

Basic test to validate resource creation

Now that we understand the basic anatomy of a Terraform tests file, let’s create basic tests to validate the functionality of the following Terraform configuration. This Terraform configuration will create an AWS CodeCommit repository with prefix name repo-.

# main.tf

variable "repository_name" {
  type = string
}
resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description     = "Test repository."
}

Now we create a Terraform test file in the tests directory. See the following directory structure as an example:

├── main.tf 
└── tests 
└── basic.tftest.hcl

For this first test, we will not perform any assertion except for validating that Terraform execution plan runs successfully. In the tests file, we create a variable block to set the value for the variable repository_name. We also added the run block with command = plan to instruct Terraform test to run Terraform plan. The completed test should look like the following:

# basic.tftest.hcl

variables {
  repository_name = "MyRepo"
}

run "test_resource_creation" {
  command = plan
}

Now we will run this test locally. First ensure that you are authenticated into an AWS account, and run the terraform init command in the root directory of the Terraform module. After the provider is initialized, start the test using the terraform test command.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass

Our first test is complete, we have validated that the Terraform configuration is valid and the resource can be provisioned successfully. Next, let’s learn how to perform inspection of the resource state.

Create resource and validate resource name

Re-using the previous test file, we add the assertion block to checks if the CodeCommit repository name starts with a string repo- and provide error message if the condition fails. For the assertion, we use the startswith function. See the following example:

# basic.tftest.hcl

variables {
  repository_name = "MyRepo"
}

run "test_resource_creation" {
  command = plan

  assert {
    condition = startswith(aws_codecommit_repository.test.repository_name, "repo-")
    error_message = "CodeCommit repository name ${var.repository_name} did not start with the expected value of ‘repo-****’."
  }
}

Now, let’s assume that another module author made changes to the module by modifying the prefix from repo- to my-repo-. Here is the modified Terraform module.

# main.tf

variable "repository_name" {
  type = string
}
resource "aws_codecommit_repository" "test" {
  repository_name = format("my-repo-%s", var.repository_name)
  description = "Test repository."
}

We can catch this mistake by running the the terraform test command again.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... fail
╷
│ Error: Test assertion failed
│
│ on tests/basic.tftest.hcl line 9, in run "test_resource_creation":
│ 9: condition = startswith(aws_codecommit_repository.test.repository_name, "repo-")
│ ├────────────────
│ │ aws_codecommit_repository.test.repository_name is "my-repo-MyRepo"
│
│ CodeCommit repository name MyRepo did not start with the expected value 'repo-***'.
╵
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... fail

Failure! 0 passed, 1 failed.

We have successfully created a unit test using assertions that validates the resource name matches the expected value. For more examples of using assertions see the Terraform Tests Docs. Before we proceed to the next section, don’t forget to fix the repository name in the module (revert the name back to repo- instead of my-repo-) and re-run your Terraform test.

Testing variable input validation

When developing Terraform modules, it is common to use variable validation as a contract test to validate any dependencies / restrictions. For example, AWS CodeCommit limits the repository name to 100 characters. A module author can use the length function to check the length of the input variable value. We are going to use Terraform test to ensure that the variable validation works effectively. First, we modify the module to use variable validation.

# main.tf

variable "repository_name" {
  type = string
  validation {
    condition = length(var.repository_name) <= 100
    error_message = "The repository name must be less than or equal to 100 characters."
  }
}

resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description = "Test repository."
}

By default, when variable validation fails during the execution of Terraform test, the Terraform test also fails. To simulate this, create a new test file and insert the repository_name variable with a value longer than 100 characters.

# var_validation.tftest.hcl

variables {
  repository_name = “this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy”
}

run “test_invalid_var” {
  command = plan
}

Notice on this new test file, we also set the command to Terraform plan, why is that? Because variable validation runs prior to Terraform apply, thus we can save time and cost by skipping the entire resource provisioning. If we run this Terraform test, it will fail as expected.

❯ terraform test
tests/basic.tftest.hcl… in progress
run “test_resource_creation”… pass
tests/basic.tftest.hcl… tearing down
tests/basic.tftest.hcl… pass
tests/var_validation.tftest.hcl… in progress
run “test_invalid_var”… fail
╷
│ Error: Invalid value for variable
│
│ on main.tf line 1:
│ 1: variable “repository_name” {
│ ├────────────────
│ │ var.repository_name is “this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy”
│
│ The repository name must be less than or equal to 100 characters.
│
│ This was checked by the validation rule at main.tf:3,3-13.
╵
tests/var_validation.tftest.hcl… tearing down
tests/var_validation.tftest.hcl… fail

Failure! 1 passed, 1 failed.

For other module authors who might iterate on the module, we need to ensure that the validation condition is correct and will catch any problems with input values. In other words, we expect the validation condition to fail with the wrong input. This is especially important when we want to incorporate the contract test in a CI/CD pipeline. To prevent our test from failing due introducing an intentional error in the test, we can use the expect_failures attribute. Here is the modified test file:

# var_validation.tftest.hcl

variables {
  repository_name = “this_is_a_repository_name_longer_than_100_characters_7rfD86rGwuqhF3TH9d3Y99r7vq6JZBZJkhw5h4eGEawBntZmvy”
}

run “test_invalid_var” {
  command = plan

  expect_failures = [
    var.repository_name
  ]
}

Now if we run the Terraform test, we will get a successful result.

❯ terraform test
tests/basic.tftest.hcl… in progress
run “test_resource_creation”… pass
tests/basic.tftest.hcl… tearing down
tests/basic.tftest.hcl… pass
tests/var_validation.tftest.hcl… in progress
run “test_invalid_var”… pass
tests/var_validation.tftest.hcl… tearing down
tests/var_validation.tftest.hcl… pass

Success! 2 passed, 0 failed.

As you can see, the expect_failures attribute is used to test negative paths (the inputs that would cause failures when passed into a module). Assertions tend to focus on positive paths (the ideal inputs). For an additional example of a test that validates functionality of a completed module with multiple interconnected resources, see this example in the Terraform CI/CD and Testing on AWS Workshop.

Orchestrating supporting resources

In practice, end-users utilize Terraform modules in conjunction with other supporting resources. For example, a CodeCommit repository is usually encrypted using an AWS Key Management Service (KMS) key. The KMS key is provided by end-users to the module using a variable called kms_key_id. To simulate this test, we need to orchestrate the creation of the KMS key outside of the module. In this section we will learn how to do that. First, update the Terraform module to add the optional variable for the KMS key.

# main.tf

variable "repository_name" {
  type = string
  validation {
    condition = length(var.repository_name) <= 100
    error_message = "The repository name must be less than or equal to 100 characters."
  }
}

variable "kms_key_id" {
  type = string
  default = ""
}

resource "aws_codecommit_repository" "test" {
  repository_name = format("repo-%s", var.repository_name)
  description = "Test repository."
  kms_key_id = var.kms_key_id != "" ? var.kms_key_id : null
}

In a Terraform test, you can instruct the run block to execute another helper module. The helper module is used by the test to create the supporting resources. We will create a sub-directory called setup under the tests directory with a single kms.tf file. We also create a new test file for KMS scenario. See the updated directory structure:

├── main.tf
└── tests
├── setup
│ └── kms.tf
├── basic.tftest.hcl
├── var_validation.tftest.hcl
└── with_kms.tftest.hcl

The kms.tf file is a helper module to create a KMS key and provide its ARN as the output value.

# kms.tf

resource "aws_kms_key" "test" {
  description = "test KMS key for CodeCommit repo"
  deletion_window_in_days = 7
}

output "kms_key_id" {
  value = aws_kms_key.test.arn
}

The new test will use two separate run blocks. The first run block (setup) executes the helper module to generate a KMS key. This is done by assigning the command apply which will run terraform apply to generate the KMS key. The second run block (codecommit_with_kms) will then use the KMS key ARN output of the first run as the input variable passed to the main module.

# with_kms.tftest.hcl

run "setup" {
  command = apply
  module {
    source = "./tests/setup"
  }
}

run "codecommit_with_kms" {
  command = apply

  variables {
    repository_name = "MyRepo"
    kms_key_id = run.setup.kms_key_id
  }

  assert {
    condition = aws_codecommit_repository.test.kms_key_id != null
    error_message = "KMS key ID attribute value is null"
  }
}

Go ahead and run the Terraform init, followed by Terraform test. You should get the successful result like below.

❯ terraform test
tests/basic.tftest.hcl... in progress
run "test_resource_creation"... pass
tests/basic.tftest.hcl... tearing down
tests/basic.tftest.hcl... pass
tests/var_validation.tftest.hcl... in progress
run "test_invalid_var"... pass
tests/var_validation.tftest.hcl... tearing down
tests/var_validation.tftest.hcl... pass
tests/with_kms.tftest.hcl... in progress
run "create_kms_key"... pass
run "codecommit_with_kms"... pass
tests/with_kms.tftest.hcl... tearing down
tests/with_kms.tftest.hcl... pass

Success! 4 passed, 0 failed.

We have learned how to run Terraform test and develop various test scenarios. In the next section we will see how to incorporate all the tests into a CI/CD pipeline.

Terraform Tests in CI/CD Pipelines

Now that we have seen how Terraform Test works locally, let’s see how the Terraform test can be leveraged to create a Terraform module validation pipeline on AWS. The following AWS services are used:

AWS CodeCommit – a secure, highly scalable, fully managed source control service that hosts private Git repositories.
AWS CodeBuild – a fully managed continuous integration service that compiles source code, runs tests, and produces ready-to-deploy software packages.
AWS CodePipeline – a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.
Amazon Simple Storage Service (Amazon S3) – an object storage service offering industry-leading scalability, data availability, security, and performance.

Terraform module validation pipeline

In the above architecture for a Terraform module validation pipeline, the following takes place:

A developer pushes Terraform module configuration files to a git repository (AWS CodeCommit).
AWS CodePipeline begins running the pipeline. The pipeline clones the git repo and stores the artifacts to an Amazon S3 bucket.
An AWS CodeBuild project configures a compute/build environment with Checkov installed from an image fetched from Docker Hub. CodePipeline passes the artifacts (Terraform module) and CodeBuild executes Checkov to run static analysis of the Terraform configuration files.
Another CodeBuild project configured with Terraform from an image fetched from Docker Hub. CodePipeline passes the artifacts (repo contents) and CodeBuild runs Terraform command to execute the tests.

CodeBuild uses a buildspec file to declare the build commands and relevant settings. Here is an example of the buildspec files for both CodeBuild Projects:

# Checkov
version: 0.1
phases:
  pre_build:
    commands:
      - echo pre_build starting

  build:
    commands:
      - echo build starting
      - echo starting checkov
      - ls
      - checkov -d .
      - echo saving checkov output
      - checkov -s -d ./ > checkov.result.txt

In the above buildspec, Checkov is run against the root directory of the cloned CodeCommit repository. This directory contains the configuration files for the Terraform module. Checkov also saves the output to a file named checkov.result.txt for further review or handling if needed. If Checkov fails, the pipeline will fail.

# Terraform Test
version: 0.1
phases:
  pre_build:
    commands:
      - terraform init
      - terraform validate

  build:
    commands:
      - terraform test

In the above buildspec, the terraform init and terraform validate commands are used to initialize Terraform, then check if the configuration is valid. Finally, the terraform test command is used to run the configured tests. If any of the Terraform tests fails, the pipeline will fail.

For a full example of the CI/CD pipeline configuration, please refer to the Terraform CI/CD and Testing on AWS workshop. The module validation pipeline mentioned above is meant as a starting point. In a production environment, you might want to customize it further by adding Checkov allow-list rules, linting, checks for Terraform docs, or pre-requisites such as building the code used in AWS Lambda.

Choosing various testing strategies

At this point you may be wondering when you should use Terraform tests or other tools such as Preconditions and Postconditions, Check blocks or policy as code. The answer depends on your test type and use-cases. Terraform test is suitable for unit tests, such as validating resources are created according to the naming specification. Variable validations and Pre/Post conditions are useful for contract tests of Terraform modules, for example by providing error warning when input variables value do not meet the specification. As shown in the previous section, you can also use Terraform test to ensure your contract tests are running properly. Terraform test is also suitable for integration tests where you need to create supporting resources to properly test the module functionality. Lastly, Check blocks are suitable for end to end tests where you want to validate the infrastructure state after all resources are generated, for example to test if a website is running after an S3 bucket configured for static web hosting is created.

When developing Terraform modules, you can run Terraform test in command = plan mode for unit and contract tests. This allows the unit and contract tests to run quicker and cheaper since there are no resources created. You should also consider the time and cost to execute Terraform test for complex / large Terraform configurations, especially if you have multiple test scenarios. Terraform test maintains one or many state files within the memory for each test file. Consider how to re-use the module’s state when appropriate. Terraform test also provides test mocking, which allows you to test your module without creating the real infrastructure.

Conclusion

In this post, you learned how to use Terraform test and develop various test scenarios. You also learned how to incorporate Terraform test in a CI/CD pipeline. Lastly, we also discussed various testing strategies for Terraform configurations and modules. For more information about Terraform test, we recommend the Terraform test documentation and tutorial. To get hands on practice building a Terraform module validation pipeline and Terraform deployment pipeline, check out the Terraform CI/CD and Testing on AWS Workshop.

Authors

Sequential Testing Keeps the World Streaming Netflix Part 2: Counting Processes

2024-03-18 Netflix Technology Blog

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/sequential-testing-keeps-the-world-streaming-netflix-part-2-counting-processes-da6805341642

Michael Lindon, Chris Sanden, Vache Shirikian, Yanjun Liu, Minal Mishra, Martin Tingley

Have you ever encountered a bug while streaming Netflix? Did your title stop unexpectedly, or not start at all? In the first installment of this blog series on sequential testing, we described our canary testing methodology for continuous metrics such as play-delay. One of our readers commented

What if the new release is not related to a new play/streaming feature? For example, what if the new release includes modified login functionality? Will you still monitor the “play-delay” metric?

Netflix monitors a large suite of metrics, many of which can be classified as counts. These include metrics such as the number of logins, errors, successful play starts, and even the number of customer call center contacts. In this second installment, we describe our sequential methodology for testing count metrics, outlined in the NeurIPS paper Anytime Valid Inference for Multinomial Count Data.

Spot the Difference

Suppose we are about to deploy new code that changes the login behavior. To de-risk the software rollout we A/B test the new code, known also as a canary test. Whenever an event such as a login occurs, a log flows through our real-time backend and the corresponding timestamp is recorded. Figure 1 illustrates the sequences of timestamps generated by devices assigned to the new (treatment) and existing (control) software versions. A question that naturally concerns us is whether there are fewer login events in the treatment. Can you tell?

Figure 1: Timestamps of events occurring in control and treatment

It is not immediately obvious by simple inspection of the point processes in Figure 1. The difference becomes immediately obvious when we visualize the observed counting processes, shown in Figure 2.

Figure 2: Visualizing the counting processes — the number of events observed by time t

The counting processes are functions that increment by 1 whenever a new event arrives. Clearly, there are fewer events occurring in the treatment than in the control. If these were login events, this would suggest that the new code contains a bug that prevents some users from being able to log in successfully.

This is a common situation when dealing with event timestamps. To give another example, if events corresponded to errors or crashes, we would like to know if these are accruing faster in the treatment than in the control. Moreover, we want to answer that question as quickly as possible to prevent any further disruption to the service. This necessitates sequential testing techniques which were introduced in part 1.

Time-Inhomogeneous Poisson Process

Our data for each treatment group is a realization of a one-dimensional point process, that is, a sequence of timestamps. As the rate at which the events arrive is time-varying (in both treatment and control), we model the point process as a time-inhomogeneous Poisson point process. This point process is defined by an intensity function λ: ℝ → [0, ∞). The number of events in the interval [0,t), denoted N(t), has the following Poisson distribution

N(t) ~ Poisson(Λ(t)), where Λ(t) = ∫₀ᵗ λ(s) ds.

We seek to test the null hypothesis H₀: λᴬ(t) = λᴮ(t) for all t i.e. the intensity functions for control (A) and treatment (B) are the same. This can be done semiparametrically without making any assumptions about the intensity functions λᴬ and λᴮ. Moreover, the novelty of the research is that this can be done sequentially, as described in section 4 of our paper. Conveniently, the only data required to test this hypothesis at time t is Nᴬ(t) and Nᴮ(t), the total number of events observed so far in control and treatment. In other words, all you need to test the null hypothesis is two integers, which can easily be updated as new events arrive. Here is an example from a simulated A/A test, in which we know by design that the intensity function is the same for the control (A) and the treatment (B), albeit nonstationary.

Figure 3: (Left) An A/A simulation of two inhomogeneous Poisson point processes. (Right) Confidence sequence on the log-difference of intensity functions, and sequential p-value.

Figure 3 provides an illustration of an A/A setting. The left figure presents the raw data and the intensity functions, and the right figure presents the sequential statistical analysis. The blue and red rug plots indicate the observed arrival timestamps of events from the treatment and control streams respectively. The dashed lines are the observed counting processes. As this data is simulated under the null, the intensity functions are identical and overlay each other. The left axis of the right figure visualizes the evolution of the confidence sequence on the log-difference of intensity functions. The right axis of the right figure visualizes the evolution of the sequential p-value. We can make the two following observations

Under the null, the difference of log intensities is zero, which is correctly covered by the 0.95 confidence sequence at all times.
The sequential p-value is greater than 0.05 at all times

Now let’s consider an illustration of an A/B setting. Figure 4 shows observed arrival times for treatment and control when the intensity functions differ. As this is a simulation, the true difference between log intensities is known.

Figure 4: (Left) An A/B simulation of two inhomogeneous Poisson point processes. (Right) Confidence sequence on the difference of log of intensity functions, and sequential p-value.

We can make the following observations

The 0.95 confidence sequence covers the true log-difference at all times
The sequential p-value falls below 0.05 at the same time the 0.95 confidence sequence excludes the null value of zero

Now we present a number of case studies where this methodology has rapidly detected serious problems in a number of count metrics

Case Study 1: Drop in Successful Title Starts

Figure 2 actually presents counts of title start events from a real canary test. Whenever a title starts successfully, an event is sent from the device to Netflix. We have a stream of title start events from treatment devices and a stream of title start events from control devices. Whenever fewer title starts are observed among treatment devices, there is usually a bug in the new client preventing playback.

In this case, the canary test detected a bug that was later determined to have prevented approximately 60% of treatment devices from being able to start their streams. The confidence sequence is shown in Figure 5, in addition to the (sequential) p-value. While the exact units of time have been omitted, this bug was detected at the sub-second level.

Figure 5: 0.99 Confidence sequence on the difference of log-intensities with sequential p-value.

Case Study 2: Increase in Abnormal Shutdowns

In addition to title start events, we also monitor whenever the Netflix client shuts down unexpectedly. As before, we have two streams of abnormal shutdown events, one from treatment devices, and one from control devices. The following screenshots are taken directly from our Lumen dashboards.

Figure 6: Counts of Abnormal Shutdowns over time, cumulative and non-cumulative. Treatment (Black) and Control (Blue)

Figure 6 illustrates two important points. There is clearly nonstationarity in the arrival of abnormal shutdown events. It is also not easy to visibly see any difference between treatment and control from the non-cumulative view. The difference is, however, much easier to see from the cumulative view by observing the counting process. There is a small but visible increase in the number of abnormal shutdowns in the treatment. Figure 7 shows how our sequential statistical methodology is even able to identify such small differences.

Figure 7: Abnormal Shutdowns. (Top Panel) Confidence sequences on λᴮ(t)/λᴬ(t) (shaded blue) with observed counting processes for treatment (black dashed) and control (blue dashed). (Bottom Panel) sequential p-values.

Case Study 3: Increase in Errors

Netflix also monitors the number of errors produced by treatment and control. This is a high cardinality metric as every error is annotated with a code indicating the type of error. Monitoring errors segmented by code helps developers diagnose issues quickly. Figure 8 shows the sequential p-values, on the log scale, for a set of error codes that Netflix monitors during client rollouts. In this example, we have detected a higher volume of 3.1.18 errors being produced by treatment devices. Devices experiencing this error are presented with the following message:

“We’re having trouble playing this title right now”

Figure 8: Sequential p-values for start play errors by error code

Figure 9: Observed error-3.1.18 timestamps and counting processes for treatment (blue) and control (red)

Knowing which errors increased can streamline the process of identifying the bug for our developers. We immediately send developers alerts through Slack integrations, such as the following

Figure 10: Notifications via Slack Integrations

The next time you are watching Netflix and encounter an error, know that we’re on it!

Try it Out!

The statistical approach outlined in our paper is remarkably easy to implement in practice. All you need are two integers, the number of events observed so far in the treatment and control. The code is available in this short GitHub gist. Here are two usage examples:

> counts = [100, 101]
> assignment_probabilities = [0.5, 0.5]
> sequential_p_value(counts, assignment_probabilities)
  1

> counts = [100, 201]
> assignment_probabilities = [0.5, 0.5]
> sequential_p_value(counts, assignment_probabilities)
  5.06061172163498e-06

The code generalizes to more than just two treatment groups. For full details, including hyperparameter tuning, see section 4 of the paper.

Infrastructure as Code development with Amazon CodeWhisperer

2024-03-12 Eric Z. Beard

Post Syndicated from Eric Z. Beard original https://aws.amazon.com/blogs/devops/infrastructure-as-code-development-with-amazon-codewhisperer/

At re:Invent in 2023, AWS announced Infrastructure as Code (IaC) support for Amazon CodeWhisperer. CodeWhisperer is an AI-powered productivity tool for the IDE and command line that helps software developers to quickly and efficiently create cloud applications to run on AWS. Languages currently supported for IaC are YAML and JSON for AWS CloudFormation, Typescript and Python for AWS CDK, and HCL for HashiCorp Terraform. In addition to providing code recommendations in the editor, CodeWhisperer also features a security scanner that alerts the developer to potentially insecure infrastructure code, and offers suggested fixes than can be applied with a single click.

In this post, we will walk you through some common scenarios and show you how to get the most out of CodeWhisperer in the IDE. CodeWhisperer is supported by several IDEs, such as Visual Studio Code and JetBrains. For the purposes of this post, we’ll focus on Visual Studio Code. There are a few things that you need to follow along with the examples, listed in the prerequisites section below.

Prerequisites

An AWS Builder ID or an AWS Identity Center login controlled by your organization
A supported IDE, like Visual Studio Code
The AWS Toolkit IDE extension
Authenticate and Connect

CloudFormation

Now that you have the toolkit configured, open a new source file with the yaml extension. Since YAML files can represent a wide variety of different configuration file types, it helps to add the AWSTemplateFormatVersion: '2010-09-09' header to the file to let CodeWhisperer know that you are editing a CloudFormation file. Just typing the first few characters of that header is likely to result in a recommendation from CodeWhisperer. Press TAB to accept recommendations and Escape to ignore them.

AWSTemplateFormatVersion header

If you have a good idea about the various resources you want to include in your template, include them in a top level Description field. This will help CodeWhisperer to understand the relationships between the resources you will create in the file. In the example below, we describe the stack we want as a “VPC with public and private subnets”. You can be more descriptive if you want, using a multi-line YAML string to add more specific details about the resources you want to create.

Creating a CloudFormation template with a description

After accepting that recommendation for the parameters, you can continue to create resources.

Creating CloudFormation resources

You can also trigger recommendations with inline comments and descriptive logical IDs if you want to create one resource at a time. The more code you have in the file, the more CodeWhisperer will understand from context what you are trying to achieve.

CDK

It’s also possible to create CDK code using CodeWhisperer. In the example below, we started with a CDK project using cdk init, wrote a few lines of code to create a VPC in a TypeScript file, and CodeWhisperer proposed some code suggestions using what we started to write. After accepting the suggestion, it is possible to customize the code to fit your needs. CodeWhisperer will learn from your coding style and make more precise suggestions as you add more code to the project.

Create a CDK stack

You can choose whether you want to get suggestions that include code with references with the professional version of CodeWhisperer. If you choose to get the references, you can find them in the Code Reference Log. These references let you know when the code recommendation was a near exact match for code in an open source repository, allowing you to inspect the license and decide if you want to use that code or not.

References

Terraform HCL

After a close collaboration between teams at Hashicorp and AWS, Terraform HashiCorp Configuration Language (HCL) is also supported by CodeWhisperer. CodeWhisperer recommendations are triggered by comments in the file. In this example, we repeat a prompt that is similar to what we used with CloudFormation and CDK.

Terraform code suggestion

Security Scanner

In addition to CodeWhisperer recommendations, the toolkit configuration also includes a built in security scanner. Considering that the resulting code can be edited and combined with other preexisting code, it’s good practice to scan the final result to see if there are any best-practice security recommendations that can be applied.

Expand the CodeWhisperer section of the AWS Toolkit to see the “Run Security Scan” button. Click it to initiate a scan, which might take up to a minute to run. In the example below, we defined an S3 bucket that can be read by anyone on the internet.

Security scanner

Once the security scan completes, the code with issues is underlined and each suggestion is added to the ‘Problems’ tab. Click on any of those to get more details.

Scan results

CodeWhisperer provides a clickable link to get more information about the vulnerability, and what you can do to fix it.

Scanner Link

Conclusion

The integration of generative AI tools like Amazon CodeWhisperer are transforming the landscape of cloud application development. By supporting Infrastructure as Code (IaC) languages such as CloudFormation, CDK, and Terraform HCL, CodeWhisperer is expanding its reach beyond traditional development roles. This advancement is pivotal in merging runtime and infrastructure code into a cohesive unit, significantly enhancing productivity and collaboration in the development process. The inclusion of IaC enables a broader range of professionals, especially Site Reliability Engineers (SREs), to actively engage in application development, automating and optimizing infrastructure management tasks more efficiently.

CodeWhisperer’s capability to perform security scans on the generated code aligns with the critical objectives of system reliability and security, essential for both developers and SREs. By providing insights into security best practices, CodeWhisperer enables robust and secure infrastructure management on the AWS cloud. This makes CodeWhisperer a valuable tool not just for developers, but as a comprehensive solution that bridges different technical disciplines, fostering a collaborative environment for innovation in cloud-based solutions.

Bio

Eric Beard is a Solutions Architect at AWS specializing in DevOps, CI/CD, and Infrastructure as Code, the author of the AWS Sysops Cookbook, and an editor for the AWS DevOps blog channel. When he’s not helping customers to design Well-Architected systems on AWS, he is usually playing tennis or watching tennis.

Amar Meriche is a Sr Technical Account Manager at AWS in Paris. He helps his customers improve their operational posture through advocacy and guidance, and is an active member of the DevOps and IaC community at AWS. He’s passionate about helping customers use the various IaC tools available at AWS following best practices.

How we sped up AWS CloudFormation deployments with optimistic stabilization

2024-03-11 Bhavani Kanneganti

Post Syndicated from Bhavani Kanneganti original https://aws.amazon.com/blogs/devops/how-we-sped-up-aws-cloudformation-deployments-with-optimistic-stabilization/

Introduction

AWS CloudFormation customers often inquire about the behind-the-scenes process of provisioning resources and why certain resources or stacks take longer to provision compared to the AWS Management Console or AWS Command Line Interface (AWS CLI). In this post, we will delve into the various factors affecting resource provisioning in CloudFormation, specifically focusing on resource stabilization, which allows CloudFormation and other Infrastructure as Code (IaC) tools to ensure resilient deployments. We will also introduce a new optimistic stabilization strategy that improves CloudFormation stack deployment times by up to 40% and provides greater visibility into resource provisioning through the new CONFIGURATION_COMPLETE status.

AWS CloudFormation is an IaC service that allows you to model your AWS and third-party resources in template files. By creating CloudFormation stacks, you can provision and manage the lifecycle of the template-defined resources manually via the AWS CLI, Console, AWS SAM, or automatically through an AWS CodePipeline, where CLI and SAM can also be leveraged or through Git sync. You can also use AWS Cloud Development Kit (AWS CDK) to define cloud infrastructure in familiar programming languages and provision it through CloudFormation, or leverage AWS Application Composer to design your application architecture, visualize dependencies, and generate templates to create CloudFormation stacks.

Deploying a CloudFormation stack

Let’s examine a deployment of a containerized application using AWS CloudFormation to understand CloudFormation’s resource provisioning.

Figure 1. Sample application architecture to deploy an ECS service

For deploying a containerized application, you need to create an Amazon ECS service. To set up the ECS service, several key resources must first exist: an ECS cluster, an Amazon ECR repository, a task definition, and associated Amazon VPC infrastructure such as security groups and subnets.
Since you want to manage both the infrastructure and application deployments using AWS CloudFormation, you will first define a CloudFormation template that includes: an ECS cluster resource (AWS::ECS::Cluster), a task definition (AWS::ECS::TaskDefinition), an ECR repository (AWS::ECR::Repository), required VPC resources like subnets (AWS::EC2::Subnet) and security groups (AWS::EC2::SecurityGroup), and finally, the ECS Service (AWS::ECS::Service) itself. When you create the CloudFormation stack using this template, the ECS service (AWS::ECS::Service) is the final resource created, as it waits for the other resources to finish creation. This brings up the concept of Resource Dependencies.

Resource Dependency:

In CloudFormation, resources can have dependencies on other resources being created first. There are two types of resource dependencies:

Implicit: CloudFormation automatically infers dependencies when a resource uses intrinsic functions to reference another resource. These implicit dependencies ensure the resources are created in the proper order.
Explicit: Dependencies can be directly defined in the template using the DependsOn attribute. This allows you to customize the creation order of resources.

The following template snippet shows the ECS service’s dependencies visualized in a dependency graph:

Template snippet:

ECSService:
    DependsOn: [PublicRoute] #Explicit Dependency
    Type: 'AWS::ECS::Service'
    Properties:
      ServiceName: cfn-service
      Cluster: !Ref ECSCluster #Implicit Dependency
      DesiredCount: 2
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          SecurityGroups:
            - !Ref SecurityGroup #Implicit Dependency
          Subnets:
            - !Ref PublicSubnet #Implicit Dependency
      TaskDefinition: !Ref TaskDefinition #Implicit Dependency

Dependency Graph:

Figure 2. CloudFormation’s dependency graph for a containerized application

Note: VPC Resources in the above graph include PublicSubnet (AWS::EC2::Subnet), SecurityGroup (AWS::EC2::SecurityGroup), PublicRoute (AWS::EC2::Route)

In the above template snippet, the ECS Service (AWS::ECS::Service) resource has an explicit dependency on the PublicRoute resource, specified using the DependsOn attribute. The ECS service also has implicit dependencies on the ECSCluster, SecurityGroup, PublicSubnet, and TaskDefinition resources. Even without an explicit DependsOn, CloudFormation understands that these resources must be created before the ECS service, since the service references them using the Ref intrinsic function. Now that you understand how CloudFormation creates resources in a specific order based on their definition in the template file, let’s look at the time taken to provision these resources.

Resource Provisioning Time:

The total time for CloudFormation to provision the stack depends on the time required to create each individual resource defined in the template. The provisioning duration per resource is determined by several time factors:

Engine Time: CloudFormation Engine Time refers to the duration spent by the service reading and persisting data related to a resource. This includes the time taken for operations like parsing and interpreting the CloudFormation template, and for the resolution of intrinsic functions like Fn::GetAtt and Ref.
Resource Creation Time: The actual time an AWS service requires to create and configure the resource. This can vary across resource types provisioned by the service.
Resource Stabilization Time: The duration required for a resource to reach a usable state after creation.

What is Resource Stabilization?

When provisioning AWS resources, CloudFormation makes the necessary API calls to the underlying services to create the resources. After creation, CloudFormation then performs eventual consistency checks to ensure the resources are ready to process the intended traffic, a process known as resource stabilization. For example, when creating an ECS service in the application, the service is not readily accessible immediately after creation completes (after creation time). To ensure the ECS service is available to use, CloudFormation performs additional verification checks defined specifically for ECS service resources. Resource stabilization is not unique to CloudFormation and must be handled to some degree by all IaC tools.

Stabilization Criteria and Stabilization Timeout

For CloudFormation to mark a resource as CREATE_COMPLETE, the resource must meet specific stabilization criteria called stabilization parameters. These checks validate that the resource is not only created but also ready for use.

If a resource fails to meet its stabilization parameters within the allowed stabilization timeout period, CloudFormation will mark the resource status as CREATE_FAILED and roll back the operation. Stabilization criteria and timeouts are defined uniquely for each AWS resource supported in CloudFormation by the service, and are applied during both resource create and update workflows.

AWS CloudFormation vs AWS CLI to provision resources

Now, you will create a similar ECS service using the AWS CLI. You can use the following AWS CLI command to deploy an ECS service using the same task definition, ECS cluster and VPC resources created earlier using CloudFormation.

Command:

aws ecs create-service \
    --cluster CFNCluster \
    --service-name service-cli \
    --task-definition task-definition-cfn:1 \
    --desired-count 2 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-yyy],assignPublicIp=ENABLED}" \
    --region us-east-1

The following snippet from the output of the above command shows that the ECS Service has been successfully created and its status is ACTIVE.

Snapshot of the ECS service API call's response

Figure 3. Snapshot of the ECS service API call

However, when you navigate to the ECS console and review the service, tasks are still in the Pending state, and you are unable to access the application.

Figure 4. ECS console view

You have to wait for the service to reach a steady state before you can successfully access the application.

Figure 5. ECS service events from the AWS console

When you create the same ECS service using AWS CloudFormation, the service is accessible immediately after the resource reaches a status of CREATE_COMPLETE in the stack. This reliable availability is due to CloudFormation’s resource stabilization process. After initially creating the ECS service, CloudFormation waits and continues calling the ECS DescribeServices API action until the service reaches a steady state. Once the ECS service passes its consistency checks and is fully ready for use, only then will CloudFormation mark the resource status as CREATE_COMPLETE in the stack. This creation and stabilization orchestration allows you to access the service right away without any further delays.

The following is an AWS CloudTrail snippet of CloudFormation performing DescribeServices API calls during Stabilization:

Snapshot of AWS CloudTrail event for DescribeServices API call

Figure 6. Snapshot of AWS CloudTrail event

By handling resource stabilization natively, CloudFormation saves you the extra coding effort and complexity of having to implement custom status checks and availability polling logic after resource creation. You would have to develop this additional logic using tools like the AWS CLI or API across all the infrastructure and application resources. With CloudFormation’s built-in stabilization orchestration, you can deploy the template once and trust that the services will be fully ready after creation, allowing you to focus on developing your application functionality.

Evolution of Stabilization Strategy

CloudFormation’s stabilization strategy couples resource creation with stabilization such that the provisioning of a resource is not considered COMPLETE until stabilization is complete.

Historic Stabilization Strategy

For resources that have no interdependencies, CloudFormation starts the provisioning process in parallel. However, if a resource depends on another resource, CloudFormation will wait for the entire resource provisioning operation of the dependency resource to complete before starting the provisioning of the dependent resource.

Figure 7. CloudFormation’s historic stabilization strategy

The diagram above shows a deployment of some of the ECS application resources that you deploy using AWS CloudFormation. The Task Definition (AWS::ECS::TaskDefinition) resource depends on the ECR Repository (AWS::ECR::Repository) resource, and the ECS Service (AWS::ECS:Service) resource depends on both the Task Definition and ECS Cluster (AWS::ECS::Cluster) resources. The ECS Cluster resource has no dependencies defined. CloudFormation initiates creation of the ECR Repository and ECS Cluster resources in parallel. It then waits for the ECR Repository to complete consistency checks before starting provisioning of the Task Definition resource. Similarly, creation of the ECS Service resource begins only when the Task Definition and ECS Cluster resources have been created and are ready. This sequential approach ensures safety and stability but causes delays. CloudFormation strictly deploys dependent resources one after the other, slowing down deployment of the entire stack. As the number of interdependent resources grows, the overall stack deployment time increases, creating a bottleneck that prolongs the whole stack operation.

New Optimistic Stabilization Strategy

To improve stack provisioning times and deployment performance, AWS CloudFormation recently launched a new optimistic stabilization strategy. The optimistic strategy can reduce customer stack deployment duration by up to 40%. It allows dependent resources to be created in parallel. This concurrent resource creation helps significantly improve deployment speed.

Figure 8. CloudFormation’s new optimistic stabilization strategy

The diagram above shows deployment of the same 4 resources discussed in the historic strategy. The Task Definition (AWS::ECS::TaskDefinition) resource depends on the ECR Repository (AWS::ECR::Repository) resource, and the ECS Service (AWS::ECS:Service) resource depends on both the Task Definition and ECS Cluster (AWS::ECS::Cluster) resources. The ECS Cluster resource has no dependencies defined. CloudFormation initiates creation of the ECR Repository and ECS Cluster resources in parallel. Then, instead of waiting for the ECR Repository to complete consistency checks, it starts creating the Task Definition when the ECR Repository completes creation, but before stabilization is complete. Similarly, creation of the ECS Service resource begins after Task Definition and ECS Cluster creation. The change was made because not all resources require their dependent resources to complete consistency checks before starting creation. If the ECS Service fails to provision because the Task Definition or ECS Cluster resources are still undergoing consistency checks, CloudFormation will wait for those dependencies to complete their consistency checks before attempting to create the ECS Service again.

Figure 9. CloudFormation’s new stabilization strategy with the retry capability

This parallel creation of dependent resources with automatic retry capabilities results in faster deployment times compared to the historical linear resource provisioning strategy. The Optimistic stabilization strategy currently applies only to create workflows with resources that have implicit dependencies. For resources with an explicit dependency, CloudFormation leverages the historic strategy in deploying resources.

Improved Visibility into Resource Provisioning

When creating a CloudFormation stack, a resource can sometimes take longer to provision, making it appear as if it’s stuck in an IN_PROGRESS state. This can be because CloudFormation is waiting for the resource to complete consistency checks during its resource stabilization step. To improve visibility into resource provisioning status, CloudFormation has introduced a new “CONFIGURATION_COMPLETE” event. This event is emitted at both the individual resource level and the overall stack level during create workflow when resource(s) creation or configuration is complete, but stabilization is still in progress.

Figure 10. CloudFormation stack events of the ECS Application

The above diagram shows the snapshot of stack events of the ECS application’s CloudFormation stack named ECSApplication. Observe the events from the bottom to top:

At 10:46:08 UTC-0600, ECSService (AWS::ECS::Service) resource creation was initiated.
At 10:46:09 UTC-0600, the ECSService has CREATE_IN_PROGRESS status in the Status tab and CONFIGURATION_COMPLETE status in the Detailed status tab, meaning the resource was successfully created and the consistency check was initiated.
At 10:46:09 UTC-0600, the stack ECSApplication has CREATE_IN_PROGRESS status in the Status tab and CONFIGURATION_COMPLETE status in the Detailed status tab, meaning all the resources in the ECSApplication stack are successfully created and are going through stabilization. This stack level CONFIGURATION_COMPLETE status can also be viewed in the stack’s Overview tab.

Figure 11. CloudFormation Overview tab for the ECSApplication stack

At 10:47:09 UTC-0600, the ECSService has CREATE_COMPLETE status in the Status tab, meaning the service is created and completed consistency checks.
At 10:47:10 UTC-0600, ECSApplication has CREATE_COMPLETE status in the Status tab, meaning all the resources are successfully created and completed consistency checks.

Conclusion:

In this post, I hope you gained some insights into how CloudFormation deploys resources and the various time factors that contribute to the creation of a stack and its resources. You also took a deeper look into what CloudFormation does under the hood with resource stabilization and how it ensures the safe, consistent, and reliable provisioning of resources in critical, high-availability production infrastructure deployments. Finally, you learned about the new optimistic stabilization strategy to shorten stack deployment times and improve visibility into resource provisioning.

About the authors:

Creating a User Activity Dashboard for Amazon CodeWhisperer

2024-03-04 David Ernst

Post Syndicated from David Ernst original https://aws.amazon.com/blogs/devops/creating-a-user-activity-dashboard-for-amazon-codewhisperer/

Maximizing the value from Enterprise Software tools requires an understanding of who and how users interact with those tools. As we have worked with builders rolling out Amazon CodeWhisperer to their enterprises, identifying usage patterns has been critical.

This blog post is a result of that work, builds on Introducing Amazon CodeWhisperer Dashboard blog and Amazon CloudWatch metrics and enables customers to build dashboards to support their rollouts. Note that these features are only available in CodeWhisperer Professional plan.

Organizations have leveraged the existing Amazon CodeWhisperer Dashboard to gain insights into developer usage. This blog explores how we can supplement the existing dashboard with detailed user analytics. Identifying leading contributors has accelerated tool usage and adoption within organizations. Acknowledging and incentivizing adopters can accelerate a broader adoption.

The architecture diagram outlines a streamlined process for tracking and analyzing Amazon CodeWhisperer usage events. It begins with logging these events in CodeWhisperer and AWS CloudTrail and then forwarding them to Amazon CloudWatch Logs. Configuring AWS CloudTrail involves using Amazon S3 for storage and AWS Key Management Service (KMS) for log encryption. An AWS Lambda function analyzes the logs, extracting information about user activity. This blog also introduces a AWS CloudFormation template that simplifies the setup process, including creating the CloudTrail with an S3 bucket KMS key and the Lambda function. The template also configures AWS IAM permissions, ensuring the Lambda function has access rights to interact with other AWS services.

Configuring CloudTrail for CodeWhisperer User Tracking

This section details the process for monitoring user interactions while using Amazon CodeWhisperer. The aim is to utilize AWS CloudTrail to record instances where users receive code suggestions from CodeWhisperer. This involves setting up a new CloudTrail trail tailored to log events related to these interactions. By accomplishing this, you lay a foundational framework for capturing detailed user activity data, which is crucial for the subsequent steps of analyzing and visualizing this data through a custom AWS Lambda function and an Amazon CloudWatch dashboard.

Setup CloudTrail for CodeWhisperer

1. Navigate to AWS CloudTrail Service.

2. Create Trail

3. Choose Trail Attributes

a. Click on Create Trail

b. Provide a Trail Name, for example, “cwspr-preprod-cloudtrail”

c. Choose Enable for all accounts in my organization

d. Choose Create a new Amazon S3 bucket to configure the Storage Location

e. For Trail log bucket and folder, note down the given unique trail bucket name in order to view the logs at a future point.

f. Check Enabled to encrypt log files with SSE-KMS encryption

j. Enter an AWS Key Management Service alias for log file SSE-KMS encryption, for example, “cwspr-preprod-cloudtrail”

h. Select Enabled for CloudWatch Logs

i. Select New

j. Copy the given CloudWatch Log group name, you will need this for the testing the Lambda function in a future step.

k. Provide a Role Name, for example, “CloudTrailRole-cwspr-preprod-cloudtrail”

l. Click Next.

This image depicts how to choose the trail attributes within CloudTrail for CodeWhisperer User Tracking.

4. Choose Log Events

a. Check “Management events“ and ”Data events“

b. Under Management events, keep the default options under API activity, Read and Write

c. Under Data event, choose CodeWhisperer for Data event type

d. Keep the default Log all events under Log selector template

e. Click Next

f. Review and click Create Trail

This image depicts how to choose the log events for CloudTrail for CodeWhisperer User Tracking.

Please Note: The logs will need to be included on the account which the management account or member accounts are enabled.

Gathering Application ARN for CodeWhisperer application

Step 1: Access AWS IAM Identity Center

1. Locate and click on the Services dropdown menu at the top of the console.

2. Search for and select IAM Identity Center (SSO) from the list of services.

Step 2: Find the Application ARN for CodeWhisperer application

1. In the IAM Identity Center dashboard, click on Application Assignments. -> Applications in the left-side navigation pane.

2. Locate the application with Service as CodeWhisperer and click on it

An image displays where you can find the Application in IAM Identity Center.

3. Copy the Application ARN and store it in a secure place. You will need this ID to configure your Lambda function’s JSON event.

An image shows where you will find the Application ARN after you click on you AWS managed application.

User Activity Analysis in CodeWhisperer with AWS Lambda

This section focuses on creating and testing our custom AWS Lambda function, which was explicitly designed to analyze user activity within an Amazon CodeWhisperer environment. This function is critical in extracting, processing, and organizing user activity data. It starts by retrieving detailed logs from CloudWatch containing CodeWhisperer user activity, then cross-references this data with the membership details obtained from the AWS Identity Center. This allows the function to categorize users into active and inactive groups based on their engagement within a specified time frame.

The Lambda function’s capability extends to fetching and structuring detailed user information, including names, display names, and email addresses. It then sorts and compiles these details into a comprehensive HTML output. This output highlights the CodeWhisperer usage in an organization.

Creating and Configuring Your AWS Lambda Function

1. Navigate to the Lambda service.

2. Click on Create function.

3. Choose Author from scratch.

4. Enter a Function name, for example, “AmazonCodeWhispererUserActivity”.

5. Choose Python 3.11 as the Runtime.

6. Click on ‘Create function’ to create your new Lambda function.

7. Access the Function: After creating your Lambda function, you will be directed to the function’s dashboard. If not, navigate to the Lambda service, find your function “AmazonCodeWhispererUserActivity”, and click on it.

8. Copy and paste your Python code into the inline code editor on the function’s dashboard. The lambda function code can be found here.

9. Click ‘Deploy’ to save and deploy your code to the Lambda function.

10. You have now successfully created and configured an AWS Lambda function with our Python code.

This image depicts how to configure your AWS Lambda function for tracking user activity in CodeWhisperer.

Updating the Execution Role for Your AWS Lambda Function

After you’ve created your Lambda function, you need to ensure it has the appropriate permissions to interact with other AWS services like CloudWatch Logs and AWS Identity Store. Here’s how you can update the IAM role permissions:

Locate the Execution Role:

1. Open Your Lambda Function’s Dashboard in the AWS Management Console.

2. Click on the ‘Configuration’ tab located near the top of the dashboard.

3. Set the Time Out setting to 15 minutes from the default 3 seconds

4. Select the ‘Permissions’ menu on the left side of the Configuration page.

5. Find the ‘Execution role’ section on the Permissions page.

6. Click on the Role Name to open the IAM (Identity and Access Management) role associated with your Lambda function.

7. In the IAM role dashboard, click on the Policy Name under the Permissions policies.

8. Edit the existing policy: Replace the policy with the following JSON.

9. Save the changes to the policy.

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Action":[
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
            "logs:StartQuery",
            "logs:GetQueryResults",
            "sso:ListInstances",
            "sso:ListApplicationAssignments"
            "identitystore:DescribeUser",
            "identitystore:ListUsers",
            "identitystore:ListGroupMemberships"
         ],
         "Resource":"*",
         "Effect":"Allow"
      },
      {
         "Action":[
            "cloudtrail:DescribeTrails",
            "cloudtrail:GetTrailStatus"
         ],
         "Resource":"*",
         "Effect":"Allow"
      }
   ]
} Your AWS Lambda function now has the necessary permissions to execute and interact with CloudWatch Logs and AWS Identity Store.

Testing Lambda Function with custom input

1. On your Lambda function’s dashboard.

2. On the function’s dashboard, locate the Test button near the top right corner.

3. Click on Test. This opens a dialog for configuring a new test event.

4. In the dialog, you’ll see an option to create a new test event. If it’s your first test, you’ll be prompted automatically to create a new event.

5. For Event name, enter a descriptive name for your test, such as “TestEvent”.

6. In the event code area, replace the existing JSON with your specific input:

{
"log_group_name": "{Insert Log Group Name}",
"start_date": "{Insert Start Date}",
"end_date": "{Insert End Date}",
"codewhisperer_application_arn": "{Insert Codewhisperer Application ARN}", 
"identity_store_region": "{Insert Region}", 
"codewhisperer_region": "{Insert Region}"
}

7. This JSON structure includes:

a. log_group_name: The name of the log group in CloudWatch Logs.

b. start_date: The start date and time for the query, formatted as “YYYY-MM-DD HH:MM:SS”.

c. end_date: The end date and time for the query, formatted as “YYYY-MM-DD HH:MM:SS”.

e. codewhisperer_application_arn: The ARN of the Code Whisperer Application in the AWS Identity Store.

f. identity_store_region: The region of the AWS Identity Store.

f. codewhisperer_region: The region of where Amazon CodeWhisperer is configured.

8. Click on Save to store this test configuration.

This image depicts an example of creating a test event for the Lambda function with example JSON parameters entered.

9. With the test event selected, click on the Test button again to execute the function with this event.

10. The function will run, and you’ll see the execution result at the top of the page. This includes execution status, logs, and output.

11. Check the Execution result section to see if the function executed successfully.

This image depicts what a test case that successfully executed looks like.

Visualizing CodeWhisperer User Activity with Amazon CloudWatch Dashboard

This section focuses on effectively visualizing the data processed by our AWS Lambda function using a CloudWatch dashboard. This part of the guide provides a step-by-step approach to creating a “CodeWhispererUserActivity” dashboard within CloudWatch. It details how to add a custom widget to display the results from the Lambda Function. The process includes configuring the widget with the Lambda function’s ARN and the necessary JSON parameters.

1.Navigate to the Amazon CloudWatch service from within the AWS Management Console

2. Choose the ‘Dashboards’ option from the left-hand navigation panel.

3. Click on ‘Create dashboard’ and provide a name for your dashboard, for example: “CodeWhispererUserActivity”.

4. Click the ‘Create Dashboard’ button.

5. Select “Other Content Types” as your ‘Data sources types’ option before choosing “Custom Widget” for your ‘Widget Configuration’ and then click ‘Next’.

6. On the “Create a custom widget” page click the ‘Next’ button without making a selection from the dropdown.

7. On the ‘Create a custom widget’ page:

a. Enter your Lambda function’s ARN (Amazon Resource Name) or use the dropdown menu to find and select your “CodeWhispererUserActivity” function.

b. Add the JSON parameters that you provided in the test event, without including the start and end dates.

{
"log_group_name": "{Insert Log Group Name}",
“codewhisperer_application_arn”:”{Insert Codewhisperer Application ARN}”,
"identity_store_region": "{Insert identity Store Region}",
"codewhisperer_region": "{Insert Codewhisperer Region}"
}

This image depicts an example of creating a custom widget.

8. Click the ‘Add widget’ button. The dashboard will update to include your new widget and will run the Lambda function to retrieve initial data. You’ll need to click the “Execute them all” button in the upper banner to let CloudWatch run the initial Lambda retrieval.

This image depicts the execute them all button on the upper right of the screen.

9. Customize Your Dashboard: Arrange the dashboard by dragging and resizing widgets for optimal organization and visibility. Adjust the time range and refresh settings as needed to suit your monitoring requirements.

10. Save the Dashboard Configuration: After setting up and customizing your dashboard, click ‘Save dashboard’ to preserve your layout and settings.

This image depicts what the dashboard looks like. It showcases active users and inactive users, with first name, last name, display name, and email.

CloudFormation Deployment for the CodeWhisperer Dashboard

The blog post concludes with a detailed AWS CloudFormation template designed to automate the setup of the necessary infrastructure for the Amazon CodeWhisperer User Activity Dashboard. This template provisions AWS resources, streamlining the deployment process. It includes the configuration of AWS CloudTrail for tracking user interactions, setting up CloudWatch Logs for logging and monitoring, and creating an AWS Lambda function for analyzing user activity data. Additionally, the template defines the required IAM roles and permissions, ensuring the Lambda function has access to the needed AWS services and resources.

The blog post also provides a JSON configuration for the CloudWatch dashboard. This is because, at the time of writing, AWS CloudFormation does not natively support the creation and configuration of CloudWatch dashboards. Therefore, the JSON configuration is necessary to manually set up the dashboard in CloudWatch, allowing users to visualize the processed data from the Lambda function. The CloudFormation template can be found here.

Create a CloudWatch Dashboard and import the JSON below.

{
   "widgets":[
      {
         "height":30,
         "width":20,
         "y":0,
         "x":0,
         "type":"custom",
         "properties":{
            "endpoint":"{Insert ARN of Lambda Function}",
            "updateOn":{
               "refresh":true,
               "resize":true,
               "timeRange":true
            },
            "params":{
               "log_group_name":"{Insert Log Group Name}",
               "codewhisperer_application_arn":"{Insert Codewhisperer Application ARN}",
               "identity_store_region":"{Insert identity Store Region}",
               "codewhisperer_region":"{Insert Codewhisperer Region}"
            }
         }
      }
   ]
}

Conclusion

In this blog, we detail a comprehensive process for establishing a user activity dashboard for Amazon CodeWhisperer to deliver data to support an enterprise rollout. The journey begins with setting up AWS CloudTrail to log user interactions with CodeWhisperer. This foundational step ensures the capture of detailed activity events, which is vital for our subsequent analysis. We then construct a tailored AWS Lambda function to sift through CloudTrail logs. Then, create a dashboard in AWS CloudWatch. This dashboard serves as a central platform for displaying the user data from our Lambda function in an accessible, user-friendly format.

You can reference the existing CodeWhisperer dashboard for additional insights. The Amazon CodeWhisperer Dashboard offers a view summarizing data about how your developers use the service.

Overall, this dashboard empowers you to track, understand, and influence the adoption and effective use of Amazon CodeWhisperer in your organizations, optimizing the tool’s deployment and fostering a culture of informed data-driven usage.

About the authors:

AWS CodePipeline adds support for Branch-based development and Monorepos

2024-02-10 Michael Ohde

Post Syndicated from Michael Ohde original https://aws.amazon.com/blogs/devops/aws-codepipeline-adds-support-for-branch-based-development-and-monorepos/

AWS CodePipeline is a managed continuous delivery service that automates your release pipelines for application and infrastructure updates. Today, CodePipeline adds triggers and new execution modes to support teams with various delivery strategies. These features give customers more choice in the pipelines they build.

In this post, I am going to show you how to use triggers and pipeline execution modes together to create three pipeline designs. These examples are requested by customers that practice branch-based development or manage multiple projects within a monorepo.

Pipeline #1: Create a GitFlow (multi-branch) release pipeline.
Pipeline #2: Run a pipeline on all pull requests (PRs).
Pipeline #3: Run a pipeline on a single folder within a monorepo.

As I walkthrough each of the pipelines you will learn more about these features and how to use them. After completing the blog, you can use triggers and execution modes to adapt these examples to your pipeline needs.

Pipeline #1 – Create a GitFlow (multi-branch) release pipeline

GitFlow is a development model that manages large projects with parallel development and releases using long-running branches. GitFlow uses two permanent branches, main and develop, along with supporting feature, release, and hotfix branches. Since I will cover triggering a pipeline from multiple branches, these concepts can be applied to simplify other multi-branch pipeline strategies such as GitHub flow.

I can create pipelines using the AWS Management Console, AWS CLI, AWS CloudFormation, or by writing code that calls the CodePipeline CreatePipeline API. In this blog, I will keep things simple by creating two pipelines, a release pipeline and a feature development pipeline. I start by navigating to the CodePipeline console and choosing Create pipeline.

In the first step, Pipeline settings, as per Figure 1 below, you will now see options for the newly added Execution modes – Queued and Parallel.

Figure 1. Example GitFlow release pipeline settings for Queued execution mode.

The execution mode of the pipeline determines the handling of multiple executions:

Superseded – an execution that started more recently can overtake one that began earlier. Before today, CodePipeline only supported Superseded execution mode.
Queued – the executions wait and do not overtake executions that have already started. Executions are processed one by one in the order that they are queued.
Parallel – the executions are independent of one another and do not wait for other executions to complete before starting.

The first pipeline, release pipeline, will trigger for main, develop, hotfix, and release branches. I select Queued, since I want to run every push to these branches in the order triggered by the pipeline. I make sure the Pipeline type chosen is V2 and I click Next.

Figure 2. Example source connection and repository.

In step two, Add source stage, I select my Source provider, Connection, Repository name, and Default branch. I need to use a source provider that uses a connection to my external code repository, in this example I’m using GitHub so I select Connect to GitHub. Connections authorize and establish configurations that associate a third-party provider such as GitHub with CodePipeline.

Now that I have my Source setup, I’m going to configure a Trigger. Triggers define the event type that starts the pipeline, such as a code push or pull request. I select the Specify filter from the Trigger types, since I want to add a filtered trigger.

For this pipeline, I select Push for the Event type. A push trigger starts a pipeline when a change is pushed to the source repository. The execution uses the files in the branch that is pushed to, the destination branch. Next, I select the Filter type of Branch. The branch filter type specifies the branches in GitHub connected repository that the trigger monitors in order to know when to start an execution.

There are two types of branch filters:

Include – the trigger will start a pipeline if the branch name matches the pattern.
Exclude – the trigger will NOT start a pipeline if the branch name matches the pattern.

Note: If Include and Exclude both have the same pattern, then the default is to exclude the pattern.

Branching patterns are entered in the glob format, detailed in Working with glob patterns in syntax, to specify the branch I want to trigger, I enter main,develop,hotfix/**,release/** in Include and I leave Exclude empty.

Figure 3. Example GitFlow release pipeline for push event type and branch filters.

I am done configuring the filters and I click Next.

To keep the focus of the blog on the pipeline and not the application, I will skip ahead to Create pipeline. If you are curious about by my application and build step, I followed the example in AWS CodeBuild adds support for AWS Lambda compute mode.

Next, I create the feature development pipeline. The feature development pipeline will trigger for feature branches. This time, I select the Parallel Execution mode, as developers should not be blocked by their peers working in other feature branches. I make sure the Pipeline type chosen is V2 and I click Next.

Figure 4. Example GitFlow feature pipeline settings for Queued execution mode.

In Step 2, the source provider and connection is setup the same as per the previous release pipeline, see Figure 2 above. Once the Source step is complete, I configure my Trigger with an Event type of Push, but this time I only enter feature/** for Include.

Figure 5. Example GitFlow feature pipeline for push event type and branch filters.

I am done configuring the filters and I skip forward to Create pipeline.

After the pipeline is finished creating, I can now see both of the pipelines I created – the release pipeline and the feature development pipeline.

Figure 6. Example GitFlow pipelines.

To verify my pipeline setup, I create and merge multiple code changes to feature branches and to the release branches – develop, release, and main. The Pipeline view now displays the executions that have been triggered by the matching branches. Note how these executions have been successfully added to the queue by the pipeline.

Figure 7. Example GitFlow release executions queued.

I have now implemented a GitFlow release pipeline. By using Branch filter types and Push event triggers, you can now extend this example to meet your branch-based development needs.

Pipeline #2 – Run a pipeline on all pull requests (PRs)

Before proceeding, I recommend you review the concepts covered in Pipeline #1, as you will build on that knowledge.

Triggering a pipeline on a pull request (PR) is a common continuous integration pattern to catch build and test failures before the PR is merged into the branch. A PR pipeline is often faster and lighter than the full release by limiting tests like security scans, validation tests, or performance tests to the changes in the PR rather than running them on every commit. Having a single pipeline triggered for all PRs allows reviewing and validating any proposed changes to the repository before merging.

To start I create a new pipeline, by clicking Create Pipeline. I change the Execution mode to Parallel. I choose Parallel because the development team will be working on multiple features at the same time and it is wasteful to wait for other executions to finish. I make sure the Pipeline type chosen is V2 and I click Next.

Figure 8. Example PR pipeline settings for Parallel execution mode.

As per the previous pipeline, the Source provider and connection is setup as show in Figure 2 above. Once the Source step is setup, I configure my Pull Request Trigger.

For this pipeline, I select Pull Request for the Event type. A pull request trigger starts a pipeline when a pull request is opened, updated, or closed in the source repository. The execution will use the files in the branch that the change is being pulled from, the source branch. Next, I select Pull request is created and New revision is made to pull request for Events for pull request. To match pull requests for all branches, I enter ** under Include for Branches and leave Exclude empty.

Figure 9. Example PR pipeline for pull request event type.

I will fast-forward to the Create pipeline, skipping the details of the build and deploy steps, similar to what I did in Pipeline #1.

Once the pipeline has finished creating, I open a few PRs in my GitHub repository as a test. Back in CodePipeline when I click on my pipeline, I notice the pipeline takes me straight to the Execution history view. The reason I’m redirected to the execution history is the pipeline execution mode is Parallel and all executions are independent. From this view, I see the Trigger column displaying details about each pull request that has triggered the pipeline.

Figure 10. Example PR pipeline with executions in parallel.

Note: To view an individual execution Pipeline, click the Execution ID.

I have now implemented a PR validation pipeline for all PRs across branches. By using Pull request event triggers and Branch filter types, you can now extend this example to meet your PR pipeline needs.

Pipeline #3 – Run a pipeline on a single folder within a monorepo

Before proceeding, I recommend you review the concepts covered in Pipeline #1, as you will build on that knowledge.

A monorepo is a software-development strategy where a single repository is used to contain the code for multiple projects. Running pipelines for each project contained in the monorepo on every commit can be inefficient, especially when each project requires different pipelines. For this pipeline example, I want to limit pipeline executions to only changes inside the infrastructure folder in the main branch. This can reduce cost, speed up deployments, and optimize resource usage.

To start, I create a new pipeline by clicking Create Pipeline. For this example, I keep the default Execution mode as Suspended, since I do not have any specific execution mode requirements. I make sure the Pipeline type chosen is V2 and I click Next.

As per the previous pipeline, the Source provider and connection is setup as per Figure 2 above. Once the Source step is complete, I configure my Trigger to focus on the infrastructure folder in the main branch.

For this pipeline, I select Push for the Event type. Next, I select the Filter type of Branch. To match pushes to only main, I enter main under Include for Branches and leave Exclude empty. Under File paths, for Include, I enter infrastructure/** and I leave Exclude empty. The file paths filter type specifies file path names in the source repository that the trigger monitors in order to know when to start an execution. Similar to branch filters, I can specify file path name patterns in glob format under Include and Exclude.

Figure 11. Example monorepo pipeline for push event type and file path filters.

I click Next, since I am done configuring the filters.

I will jump ahead to the Create pipeline, omitting the details of the build and deploy steps, like I did in Pipeline #1.

Once the pipeline has been created, I can test the pipeline Trigger in GitHub by making changes on the main branch inside and outside the infrastructure folder. To verify it is only invoking the pipelines inside the infrastructure folder, I open the History for the pipeline in CodePipeline. I confirm that only the changes I’m expecting are running.

Figure 12. Example monorepo pipeline with only infrastructure executions.

I have now selectively invoked a pipeline based on repository changes in a monorepo. By using File paths filter types, you can now extend this example to meet your monorepo release pipelines.

Conclusion

AWS CodePipeline’s new triggers and execution modes unlock new patterns for building pipelines on AWS. In this post, I discussed the new features and three popular pipeline patterns you can build. If you are creating GitFlow or your own multi-branch strategy, CodePipeline simplifies managing release pipelines for multi-branch models. Whether you are using File path filter types for monorepos or leveraging Parallel execution mode to unblock developers, CodePipeline accelerates the delivery of your software. Check out the AWS CodePipeline User Guide and hands-on tutorials to automate your delivery workflows today.

Fulfilling the promise of single-vendor SASE through network modernization

2024-02-07 Michael Keane http://blog.cloudflare.com/author/michael-keane/

Post Syndicated from Michael Keane http://blog.cloudflare.com/author/michael-keane/ original https://blog.cloudflare.com/single-vendor-sase-announcement-2024

As more organizations collectively progress toward adopting a SASE architecture, it has become clear that the traditional SASE market definition (SSE + SD-WAN) is not enough. It forces some teams to work with multiple vendors to address their specific needs, introducing performance and security tradeoffs. More worrisome, it draws focus more to a checklist of services than a vendor’s underlying architecture. Even the most advanced individual security services or traffic on-ramps don’t matter if organizations ultimately send their traffic through a fragmented, flawed network.

Single-vendor SASE is a critical trend to converge disparate security and networking technologies, yet enterprise “any-to-any connectivity” needs true network modernization for SASE to work for all teams. Over the past few years, Cloudflare has launched capabilities to help organizations modernize their networks as they navigate their short- and long-term roadmaps of SASE use cases. We’ve helped simplify SASE implementation, regardless of the team leading the initiative.

Announcing (even more!) flexible on-ramps for single-vendor SASE

Today, we are announcing a series of updates to our SASE platform, Cloudflare One, that further the promise of a single-vendor SASE architecture. Through these new capabilities, Cloudflare makes SASE networking more flexible and accessible for security teams, more efficient for traditional networking teams, and uniquely extend its reach to an underserved technical team in the larger SASE connectivity conversation: DevOps.

These platform updates include:

Flexible on-ramps for site-to-site connectivity that enable both agent/proxy-based and appliance/routing-based implementations, simplifying SASE networking for both security and networking teams.
New WAN-as-a-service (WANaaS) capabilities like high availability, application awareness, a virtual machine deployment option, and enhanced visibility and analytics that boost operational efficiency while reducing network costs through a “light branch, heavy cloud” approach.
Zero Trust connectivity for DevOps: mesh and peer-to-peer (P2P) secure networking capabilities that extend ZTNA to support service-to-service workflows and bidirectional traffic.

Cloudflare offers a wide range of SASE on- and off-ramps — including connectors for your WAN, applications, services, systems, devices, or any other internal network resources — to more easily route traffic to and from Cloudflare services. This helps organizations align with their best fit connectivity paradigm, based on existing environment, technical familiarity, and job role.

We recently dove into the Magic WAN Connector in a separate blog post and have explained how all our on-ramps fit together in our SASE reference architecture, including our new WARP Connector. This blog focuses on the main impact those technologies have for customers approaching SASE networking from different angles.

More flexible and accessible for security teams

The process of implementing a SASE architecture can challenge an organization’s status quo for internal responsibilities and collaboration across IT, security, and networking. Different teams own various security or networking technologies whose replacement cycles are not necessarily aligned, which can reduce the organization’s willingness to support particular projects.

Security or IT practitioners need to be able to protect resources no matter where they reside. Sometimes a small connectivity change would help them more efficiently protect a given resource, but the task is outside their domain of control. Security teams don’t want to feel reliant on their networking teams in order to do their jobs, and yet they also don’t need to cause downstream trouble with existing network infrastructure. They need an easier way to connect subnets, for instance, without feeling held back by bureaucracy.

Agent/proxy-based site-to-site connectivity

To help push these security-led projects past the challenges associated with traditional siloes, Cloudflare offers both agent/proxy-based and appliance/routing-based implementations for site-to-site or subnet-to-subnet connectivity. This way, networking teams can pursue the traditional networking concepts with which they are familiar through our appliance/routing-based WANaaS — a modern architecture vs. legacy SD-WAN overlays. Simultaneously, security/IT teams can achieve connectivity through agent/proxy-based software connectors (like the WARP Connector) that may be more approachable to implement. This agent-based approach blurs the lines between industry norms for branch connectors and app connectors, bringing WAN and ZTNA technology closer together to help achieve least-privileged access everywhere.

Agent/proxy-based connectivity may be a complementary fit for a subset of an organization’s total network connectivity. These software-driven site-to-site use cases could include microsites with no router or firewall, or perhaps cases in which teams are unable to configure IPsec or GRE tunnels like in tightly regulated managed networks or cloud environments like Kubernetes. Organizations can mix and match traffic on-ramps to fit their needs; all options can be used composably and concurrently.

Our agent/proxy-based approach to site-to-site connectivity uses the same underlying technology that helps security teams fully replace VPNs, supporting ZTNA for apps with server-initiated or bidirectional traffic. These include services such as Voice over Internet Protocol (VoIP) and Session Initiation Protocol (SIP) traffic, Microsoft’s System Center Configuration Manager (SCCM), Active Directory (AD) domain replication, and as detailed later in this blog, DevOps workflows.

This new Cloudflare on-ramp enables site-to-site, bidirectional, and mesh networking connectivity without requiring changes to underlying network routing infrastructure, acting as a router for the subnet within the private network to on-ramp and off-ramp traffic through Cloudflare.

More efficient for networking teams

Meanwhile, for networking teams who prefer a network-layer appliance/routing-based implementation for site-to-site connectivity, the industry norms still force too many tradeoffs between security, performance, cost, and reliability. Many (if not most) large enterprises still rely on legacy forms of private connectivity such as MPLS. MPLS is generally considered expensive and inflexible, but it is highly reliable and has features such as quality of service (QoS) that are used for bandwidth management.

Commodity Internet connectivity is widely available in most parts of the inhabited world, but has a number of challenges which make it an imperfect replacement to MPLS. In many countries, high speed Internet is fast and cheap, but this is not universally true. Speed and costs depend on the local infrastructure and the market for regional service providers. In general, broadband Internet is also not as reliable as MPLS. Outages and slowdowns are not unusual, with customers having varying degrees of tolerance to the frequency and duration of disrupted service. For businesses, outages and slowdowns are not tolerable. Disruptions to network service means lost business, unhappy customers, lower productivity and frustrated employees. Thus, despite the fact that a significant amount of corporate traffic flows have shifted to the Internet anyway, many organizations face difficulty migrating away from MPLS.

SD-WAN introduced an alternative to MPLS that is transport neutral and improves networking stability over conventional broadband alone. However, it introduces new topology and security challenges. For example, many SD-WAN implementations can increase risk if they bypass inspection between branches. It also has implementation-specific challenges such as how to address scaling and the use/control (or more precisely, the lack of) a middle mile. Thus, the promise of making a full cutover to Internet connectivity and eliminating MPLS remains unfulfilled for many organizations. These issues are also not very apparent to some customers at the time of purchase and require continuing market education.

Evolution of the enterprise WAN

Cloudflare Magic WAN follows a different paradigm built from the ground up in Cloudflare’s connectivity cloud; it takes a “light branch, heavy cloud” approach to augment and eventually replace existing network architectures including MPLS circuits and SD-WAN overlays. While Magic WAN has similar cloud-native routing and configuration controls to what customers would expect from traditional SD-WAN, it is easier to deploy, manage, and consume. It scales with changing business requirements, with security built in. Customers like Solocal agree that the benefits of this architecture ultimately improve their total cost of ownership:

“Cloudflare’s Magic WAN Connector offers a centralized and automated management of network and security infrastructure, in an intuitive approach. As part of Cloudflare’s SASE platform, it provides a consistent and homogeneous single-vendor architecture, founded on market standards and best practices. Control over all data flows is ensured, and risks of breaches or security gaps are reduced. It is obvious to Solocal that it should provide us with significant savings, by reducing all costs related to acquiring, installing, maintaining, and upgrading our branch network appliances by up to 40%. A high-potential connectivity solution for our IT to modernize our network.”
– Maxime Lacour, Network Operations Manager, Solocal

This is quite different from other single-vendor SASE vendor approaches which have been trying to reconcile acquisitions that were designed around fundamentally different design philosophies. These “stitched together” solutions lead to a non-converged experience due to their fragmented architectures, similar to what organizations might see if they were managing multiple separate vendors anyway. Consolidating the components of SASE with a vendor that has built a unified, integrated solution, versus piecing together different solutions for networking and security, significantly simplifies deployment and management by reducing complexity, bypassed security, and potential integration or connectivity challenges.

Magic WAN can automatically establish IPsec tunnels to Cloudflare via our Connector device, manually via Anycast IPsec or GRE Tunnels initiated on a customer’s edge router or firewall, or via Cloudflare Network Interconnect (CNI) at private peering locations or public cloud instances. It pushes beyond “integration” claims with SSE to truly converge security and networking functionality and help organizations more efficiently modernize their networks.

New Magic WAN Connector capabilities

In October 2023, we announced the general availability of the Magic WAN Connector, a lightweight device that customers can drop into existing network environments for zero-touch connectivity to Cloudflare One, and ultimately used to replace other networking hardware such as legacy SD-WAN devices, routers, and firewalls. Today, we’re excited to announce new capabilities of the Magic WAN Connector including:

High Availability (HA) configurations for critical environments: In enterprise deployments, organizations generally desire support for high availability to mitigate the risk of hardware failure. High availability uses a pair of Magic WAN Connectors (running as a VM or on a supported hardware device) that work in conjunction with one another to seamlessly resume operation if one device fails. Customers can manage HA configuration, like all other aspects of the Magic WAN Connector, from the unified Cloudflare One dashboard.
Application awareness: One of the central differentiating features of SD-WAN vs. more traditional networking devices has been the ability to create traffic policies based on well-known applications, in addition to network-layer attributes like IP and port ranges. Application-aware policies provide easier management and more granularity over traffic flows. Cloudflare’s implementation of application awareness leverages the intelligence of our global network, using the same categorization/classification already shared across security tools like our Secure Web Gateway, so IT and security teams can expect consistent behavior across routing and inspection decisions – a capability not available in dual-vendor or stitched-together SASE solutions.
Virtual machine deployment option: The Magic WAN Connector is now available as a virtual appliance software image, that can be downloaded for immediate deployment on any supported virtualization platform / hypervisor. The virtual Magic WAN Connector has the same ultra-low-touch deployment model and centralized fleet management experience as the hardware appliance, and is offered to all Magic WAN customers at no additional cost.
Enhanced visibility and analytics: The Magic WAN Connector features enhanced visibility into key metrics such as connectivity status, CPU utilization, memory consumption, and device temperature. These analytics are available via dashboard and API so operations teams can integrate the data into their NOCs.

Extending SASE’s reach to DevOps

Complex continuous integration and continuous delivery (CI/CD) pipeline interaction is famous for being agile, so the connectivity and security supporting these workflows should match. DevOps teams too often rely on traditional VPNs to accomplish remote access to various development and operational tools. VPNs are cumbersome to manage, susceptible to exploit with known or zero-day vulnerabilities, and use a legacy hub-and-spoke connectivity model that is too slow for modern workflows.

Of any employee group, developers are particularly capable of finding creative workarounds that decrease friction in their daily workflows, so all corporate security measures need to “just work,” without getting in their way. Ideally, all users and servers across build, staging, and production environments should be orchestrated through centralized, Zero Trust access controls, no matter what components and tools are used and no matter where they are located. Ad hoc policy changes should be accommodated, as well as temporary Zero Trust access for contractors or even emergency responders during a production server incident.

Zero Trust connectivity for DevOps

ZTNA works well as an industry paradigm for secure, least-privileged user-to-app access, but it should extend further to secure networking use cases that involve server-initiated or bidirectional traffic. This follows an emerging trend that imagines an overlay mesh connectivity model across clouds, VPCs, or network segments without a reliance on routers. For true any-to-any connectivity, customers need flexibility to cover all of their network connectivity and application access use cases. Not every SASE vendor’s network on-ramps can extend beyond client-initiated traffic without requiring network routing changes or making security tradeoffs, so generic “any-to-any connectivity” claims may not be what they initially seem.

Cloudflare extends the reach of ZTNA to ensure all user-to-app use cases are covered, plus mesh and P2P secure networking to make connectivity options as broad and flexible as possible. DevOps service-to-service workflows can run efficiently on the same platform that accomplishes ZTNA, VPN replacement, or enterprise-class SASE. Cloudflare acts as the connectivity “glue” across all DevOps users and resources, regardless of the flow of traffic at each step. This same technology, i.e., WARP Connector, enables admins to manage different private networks with overlapping IP ranges — VPC & RFC1918, support server-initiated traffic and P2P apps (e.g., SCCM, AD, VoIP & SIP traffic) connectivity over existing private networks, build P2P private networks (e.g., CI/CD resource flows), and deterministically route traffic. Organizations can also automate management of their SASE platform with Cloudflare’s Terraform provider.

The Cloudflare difference

Cloudflare’s single-vendor SASE platform, Cloudflare One, is built on our connectivity cloud — the next evolution of the public cloud, providing a unified, intelligent platform of programmable, composable services that enable connectivity between all networks (enterprise and Internet), clouds, apps, and users. Our connectivity cloud is flexible enough to make “any-to-any connectivity” a more approachable reality for organizations implementing a SASE architecture, accommodating deployment preferences alongside prescriptive guidance. Cloudflare is built to offer the breadth and depth needed to help organizations regain IT control through single-vendor SASE and beyond, while simplifying workflows for every team that contributes along the way.

Other SASE vendors designed their data centers for egress traffic to the Internet. They weren’t designed to handle or secure East-West traffic, providing neither middle mile nor security services for traffic passing from branch to HQ or branch to branch. Cloudflare’s middle mile global backbone supports security and networking for any-to-any connectivity, whether users are on-prem or remote, and whether apps are in the data center or in the cloud.

To learn more, read our reference architecture, “Evolving to a SASE architecture with Cloudflare,” or talk to a Cloudflare One expert.

Import entire applications into AWS CloudFormation

2024-02-02 Dan Blanco

Post Syndicated from Dan Blanco original https://aws.amazon.com/blogs/devops/import-entire-applications-into-aws-cloudformation/

AWS Infrastructure as Code (IaC) enables customers to manage, model, and provision infrastructure at scale. You can declare your infrastructure as code in YAML or JSON by using AWS CloudFormation, in a general purpose programming language using the AWS Cloud Development Kit (CDK), or visually using Application Composer. IaC configurations can then be audited and version controlled in a version control system of your choice. Finally, deploying AWS IaC enables deployment previews using change sets, automated rollbacks, proactive enforcement of resource compliance using hooks, and more. Millions of customers enjoy the safety and reliability of AWS IaC products.

Not every resource starts in IaC, however. Customers create non-IaC resources for various reasons: they didn’t know about IaC, or they prefer to work in the CLI or management console. In 2019, we introduced the ability to import existing resources into CloudFormation. While this feature proved integral for bringing resources into IaC on an individual basis, the process of manually creating templates to match those resources wasn’t ideal. Customers were required to look up documentation on resources and painstakingly copy values manually. Customers also told us they traditionally engaged with applications (that is, groupings of related resources), so dealing with individual resources didn’t match that experience. We set out to create a more holistic flow for managing resources and their relations.

Recently, we announced the IaC generator and CDK Migrate, an end-to-end experience that enables customers to create an IaC configuration based off a resource as well as its relationships. This works by scanning an AWS account and using the CloudFormation resource type schema to find relationships between resources. Once this configuration is created, you can use it to either import those resources into an existing stack, or create a brand new stack from scratch. It’s now possible to bring entire applications into a managed CloudFormation stack without having to recreate any resources!

In this post, I’ll explore a common use case we’ve seen and expect the IaC generator to solve: an existing network architecture, created outside of any IaC tool, needs to be managed by CloudFormation.

IaC generator in Action

Consider the following scenario:

As a new hire to an organization that’s just starting its cloud adoption journey, you’ve been tasked with continuing the development of the team’s shared Amazon Virtual Private Cloud (VPC) resources. These are actively in use by the development teams. As you dig around, you find out that these resources were created without any form of IaC. There’s no documentation, and the person who set it up is no longer with the team. Confounding the problem, you have multiple VPCs and their related resources, such subnets, route tables, and internet gateways.

You understand the benefits of IaC – repeatability, reliability, auditability, and safety. Bringing these resources under CloudFormation management will extend these benefits to your existing resources. You’ve imported resources into CloudFormation before, so you set about the task of finding all related resources manually to create a template. You quickly discover, however, that this won’t be a simple task. VPCs don’t store relations to items; instead, relations are reversed – items know which VPC they belong to, but VPCs don’t know which items belong to them. In order to find all the resources that are related to a VPC, you’ll have to manually go through all the VPC-related resources and scan to see which vpc-id they belong to. You’ll have to be diligent, as it’s very easy to miss a resource because you weren’t aware that it existed or it may even be different class of resource altogether! For example, some resources may use an elastic network interface (ENI) to attach to the VPC, like an Amazon Relational Database Service instance.

You, however, recently learned about the IaC generator. The generator works by running a scan of your account and creating an up-to-date inventory of resources. CloudFormation will then leverage the resource type schema to find relationships between resources. For example, it can determine that a subnet has a relationship to a VPC via a vpc-id property. Once these relationships have been determined, you can then select the top-level resources you want to generate a template for. Finally, you’ll be able to leverage the wizard to create a stack from this existing template.

You can navigate to the IaC generator page in the Amazon Management Console and start a scan on your account. Scans last for 30 days, and you can run three scans per day in an account.

Scan account button and status

Once the scan completes, you create a template by selecting the Create Template button. After selecting Start from a new template, you fill out the relevant details about the stack, including the Template name and any stack policies. In this case, you leave it as Retain.

Create template section with "Start from a new template" selected

On the next page, you’ll see all the scanned resources. You can add filters to the resource such as tags to view a subset of scanned resources. This example will only use a Resource type prefix filter. More information on filters can be found here. Once you find the VPC, you can select it from the list.

A VPC selected in the scanned resources list]

On the next page, you’ll see the list of resources that CloudFormation has determined to have a link to this VPC. You see this includes a myriad of networking related resource. You keep these all selected to create a template from them.

A list of related resources, all selected

At this point, you select Create template and CloudFormation will generate a template from the existing resources. Since you don’t have an existing stack to import these resource into, you must create a new stack. You now select this template and then select the Import to stack button.

The template detail page with an import to stack button

After entering the Stack name, you can then enter any Parameters your template needs.

The specify stack details page, with a stack name of "networking" entered

CloudFormation will create a change set for your new stack. Change sets allow you to see the changes CloudFormation will apply to a stack. In this example, all of the resources will have the Import status. You see the resources CloudFormation found, and once you’re satisfied, you create the stack.

A change set indicating the previously found resources will be created

At this point, the create stack operation will proceed as normal, going through each resource and importing it into the stack. You can report back to your team that you have successfully imported your entire networking stack! As next steps, you should source this template in a version control system. We recently announced a new feature to keep CloudFormation templates synced with popular version control systems. Finally, make sure to make any changes through CloudFormation to avoid a configuration drift between the stated configuration and the existing configuration.

This example was primarily CloudFormation-based, but CDK customers can use CDK Migrate to import this configuration into a CDK application.

Available Now

The IaC generator is now available in all regions where CloudFormation is supported. You can access the IaC generator using the console, CLI, and SDK.

Conclusion

In this post, we explored the new IaC generator feature of CloudFormation. We walked through a scenario of needing to manage previously existing resources and using the IaC generator’s provided wizard flow to generate a CloudFormation template. We then used that template and created a stack to manage these resources. These resources will now enjoy the safety and repeatability that IaC provides. Though this is just one example, we foresee other use cases for this feature, such as enabling a console-first development experience. We’re really excited to hear your thoughts about the feature. Please let us know how you feel!

Announcing CDK Migrate: A single command to migrate to the AWS CDK

2024-02-02 Adam Keller

Post Syndicated from Adam Keller original https://aws.amazon.com/blogs/devops/announcing-cdk-migrate-a-single-command-to-migrate-to-the-aws-cdk/

Today we’re excited to announce the general availability of CDK Migrate, a component of the AWS Cloud Development Kit (CDK). This feature enables users to migrate AWS CloudFormation templates, previously deployed CloudFormation stacks, or resources created outside of Infrastructure as Code (IaC) into a CDK application. This feature is being launched in tandem with the CloudFormation IaC Generator, which helps customers import resources created outside of CloudFormation into a template, and into a newly generated, fully managed CloudFormation stack. To read more on this feature, check out the launch post.

There are various ways to create and manage resources in AWS, whether that be via “ClickOps” (creating and updating via the AWS Console), via AWS API’s, or using Infrastructure as Code (IaC). While it’s a good and recommended practice to manage the lifecycle of resources using IaC, there can be an on-ramp to getting started. For those that aren’t ready to use IaC, it is likely that they use the console to create the resources and update them accordingly. While this can be acceptable for smaller use cases or for testing out a new service, it becomes more challenging as the complexity of the environment grows. This is further exacerbated when there is a need to re-deploy the exact configuration to other accounts, environments, or regions, as the process becomes very error prone when trying to replicate it. IaC is built to help solve this problem by allowing users to define once and deploy everywhere. For those who have been putting off the move to IaC, now is the time to take the plunge with the IaC generator functionality and CDK migrate, which can accelerate and simplify the move.

Getting Started

The first step when migrating resources into the AWS CDK is to understand the best mechanism for how the users would prefer to interact with their IaC.

For users that are looking to define their IaC declaratively (manage resources via a configuration language like YAML), it is recommended that they look at IaC generator, which can generate a CloudFormation template as well as manage the existing resources in a CloudFormation stack.
For users that are looking to manage their IaC via a higher level programming language as well as build on top of those templates with higher level abstractions and automation, the AWS Cloud Development Kit and CDK migrate serve as an excellent option,

There is also functionality in the CDK CLI to import resources into an existing CDK application. Let’s review the use cases for when to use CDK migrate vs when to use CDK import.

CDK Migrate

Users are looking to migrate one or many resources into a new CDK application.
- Examples of existing resources in the AWS region to be migrated:
  - Resources created outside of IaC
  - A deployed CloudFormation Stack
Users want to migrate from CloudFormation templates into a new CDK application
Users are looking for a managed experience to generate CDK code from existing resources and/or CloudFormation templates.
While the CDK migrate feature is designed to help accelerate those users looking to use the AWS CDK, it’s important to understand that there are limitations. For more information on the limitations, please review the documentation.

CDK Import

Users have an existing CDK application and want to import one or many resources that were created outside of the CDK.
- Examples of existing resources in the AWS region to be migrated:
  - Resources created outside of IaC (via ClickOps)
  - A deployed CloudFormation Stack
- The user must define the resources in their CDK app on their own, and ensure that the resources defined in the CDK code map directly to the resource as it exists in the account. There is a multi-step process to follow when using this feature, for more information see here.

This post will walk through an example of how to take a local CloudFormation template and convert it into a new CDK application.

Walkthrough

To start, take the CloudFormation template below that will be converted to a CDK application. The template creates an AWS Lambda Function, AWS Identity and Access Management (IAM) role, and an Amazon S3 Bucket along with some parameters to help make some of the inputs dynamic. Below is the template in full:

AWSTemplateFormatVersion: "2010-09-09"
Description: AWS CDK Migrate Demo Template
Parameters:
  FunctionResponse:
    Description: Response message from the Lambda function
    Type: String
    Default: Hello World
  BucketTag:
    Description: The tag value of the S3 bucket
    Type: String
    Default: ChangeMe
Resources:
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
  HelloWorldFunction:
    Type: AWS::Lambda::Function
    Properties:
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: |
          import os
          def lambda_handler(event, context):
            function_response = os.getenv('FUNCTION_RESPONSE')
            return {
              "statusCode": 200,
              "body": function_response
            }
      Handler: index.lambda_handler
      Runtime: python3.11
      Environment:
        Variables:
          FUNCTION_RESPONSE: !Ref FunctionResponse
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      Tags:
        - Key: Application
          Value: Git-Sync-Demo
        - Key: DynamicTag
          Value: !Ref BucketTag
Outputs:
  S3BucketName:
    Description: The name of the S3 bucket
    Value: !Ref S3Bucket
    Export:
      Name: !Sub ${AWS::StackName}-S3BucketName

This is the template that you will use when running the migration command. As a reminder, this demo migrates a CloudFormation template to a CDK application, but you can also migrate a previously deployed stack or non IaC created resources.

Migrate

The migration from the CloudFormation template to the CDK is done with a single command: cdk migrate. Simply point to the local CloudFormation template file (let’s call it demo-template.yaml), and watch as the CLI converts the template into a CDK application. The output and result from running the command will be a directory comprised of the CDK code and dependencies, but will not deploy the stack.

cdk migrate --stack-name CDK-Local-Template-Migrate-Demo --language typescript --from-path ../demoTemplate.yaml

CDK Migrate command

In the above command, you’re instructing the CDK CLI to consume the CloudFormation template file using the --from-path parameter, and choose the language as the output for the CDK application. The CDK CLI will convert the template as well as create a project folder along with the required dependencies for the CDK application.

When the migration is complete, the CDK application along with the project structure and files are available and ready to use, but have not yet been deployed. Below is the file structure of what was generated:

cdk app directory structure

The above output represents the scaffold for your CDK Typescript application, ready for deployment. The two directories that house the CDK code are bin and lib. Within the bin directory you’ll find the code that creates our CDK app and calls the CDK Stack class. The name of the files will match the input that was passed into the –stack-name parameter when running the migrate command, so in this case the file is named: bin/cdk-local-template-migrate-demo.ts. Below is the generated code:

CDK App Code

The CdkLocalTemplateMigrateDemoStack is imported and then instantiated. This is where the code that was converted from the existing CloudFormation template (or stack, or resources) resides. Again, similar to how the file was named above, the filename and location for the CDK stack code is lib/cdk-local-template-migrate-demo-stack.ts. Let’s look at the code that was converted.

CDK Stack Code

Comparing the above auto generated code to the original CloudFormation template, the definitions of the resources look similar. This is because the migrate command is generating the CDK code using L1 constructs, which represent all resources available in CloudFormation. For more information on CDK constructs and the various levels of abstraction they offer, check out this video.

The CloudFormation parameters were converted to properties inside of an interface, which are passed in to the Stack class. Inside of the Stack class code, it honors the defaults set in the properties based on the defaults were set in the original CloudFormation parameters. If you wanted to override those defaults, you could pass those properties into the CDK stack as follows:

CDK App Code Cleaned Up

With your newly created CDK application, you’re ready to deploy it to your AWS account.

Deploy

If this is the first time that you are using the CDK in the account and region, you will need to run the cdk bootstrap command, which creates assets required for the CDK to properly deploy resources to the region and account. For more information see here. Assuming the bootstrap process has happened, you can proceed to deployment.

The Infrastructure as Code is ready to deploy, but prior to deploying you should run a cdk diff to see what will be deployed. Running the diff command creates a change set and surfaces the changes being proposed (in this case it is a brand new stack with new resources).

Cdk Diff command

From the output you can see that all new resources are being created. If the cdk diff command was run against existing resources or stacks, assuming nothing changed (like above where I updated the properties), the diff would show no changes to the existing resources.

Next, deploy the stack (by running the cdk deploy command) and once the deployment is complete, head over to the AWS console and find your Lambda function. Run a test on your lambda function, and the response should match the functionResponse property that was updated as “CDK Migrate Demo Blog”. Lambda test execution output

Wrapping up

In this post, we discussed how the CDK migrate command can help you move your resources to the CDK to manage your infrastructure as code, whether it’s from a CloudFormation template, previously deployed CloudFormation stack, or from importing resources via the CloudFormation IaC generator feature. As always, we encourage you to test this feature and provide feedback and/or feature requests in our GitHub repo. In addition, if you’re new to the CDK there are some resources that can help you get started.

CDK hands on workshop
CDK Live! (Youtube channel with live streams and walkthroughs)
CDK Best Practices Guide

A new and improved AWS CDK construct for Amazon DynamoDB tables

2024-02-01 Anirudh Sharma

Post Syndicated from Anirudh Sharma original https://aws.amazon.com/blogs/devops/a-new-and-improved-aws-cdk-construct-for-amazon-dynamodb-tables/

Recently, we launched a new AWS Cloud Development Kit (CDK) construct for Amazon DynamoDB tables, known as TableV2. This construct provides a number of new features in addition to what the original construct offered, enabling CDK authors to create global tables, simplifying the configuration of global secondary indexes and auto scaling, as well as supporting AWS CloudFormation drift detection and import operations. We believe that this new construct will make it easier for organizations to build and manage their DynamoDB tables at scale, in addition to providing more flexibility and control over the configuration of tables.

AWS CDK is a framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. Developers can use any of the supported programming languages to define reusable cloud components known as Constructs. A construct is a reusable and programmable component that represents AWS resources. CDK translates the high-level constructs defined by you into equivalent AWS CloudFormation templates. CloudFormation provisions the resources specified in the template, streamlining the usage of Infrastructure as a Code (IaC) on AWS.

In this post we’ll explore:

The reasoning behind the creation of a new L2 construct for DynamoDB tables.
Features of new L2 constructs along with examples.
The benefits of leveraging this new construct in terms of scalability, flexibility, and simplicity.

By understanding the reasons behind its development and exploring its capabilities through practical examples, you will gain a comprehensive understanding of how this new L2 construct can enhance their DynamoDB experience. Let’s dive in.

Background

The original DynamoDB L2 Table construct is a powerful and versatile tool for creating and managing DynamoDB tables. It allows you to easily define the schema of your table, as well as the provisioned throughput and replicas. It also supports features like global tables, secondary indexes, and streams.

However, the Table construct uses a custom resource to add replicas to the primary table. This means that a separate Lambda function is created as the resource provider in addition to the Table resources (primary table and any replicas). This can be cumbersome to manage and can lead to drift detection issues.

The new TableV2 construct is an abstraction built on top of the GlobalTable L1 construct. It uses the CloudFormation resource AWS::DynamoDB::GlobalTable to create and manage DynamoDB tables. This has two important benefits:

CloudFormation is in control and aware of all replicas that make up the Global Table, which means you will experience drift detection across all the replicas. With the original table construct, CloudFormation was not aware of any replicas since this was being handled through the Lambda function being used as a resource provider.
No extra resource (Lambda function) is created when replicas are configured with TableV2. This eliminates the need to manage an extra resource and the risk of troubleshooting issues that may arise with the custom resource. TableV2 simplifies the setup and maintenance of DynamoDB tables by using native CloudFormation constructs to directly manage replicas, without the need for a Lambda function. This results in a more efficient and streamlined experience for users.

The new TableV2 construct provides more fine-grained control to customers over the replicas created as part of the Global Table. Specifically, customers can specify properties like contributor insights, deletion protection, point-in-time recovery, table class, read capacity, and global secondary index options on a per-replica basis.

This means that customers can tailor their table setup to meet their specific needs and optimize their overall experience with the Global Table feature. For example, a customer might want to enable contributor insights for all replicas, but only enable deletion protection for the primary replica. Or, a customer might want to use a different table class for each replica, depending on the expected workload.

The new TableV2 construct also offers greater flexibility and customization options by allowing customers to specify these properties on a per-replica basis. This can be helpful for customers who need to have different configurations for their replicas, or who want to fine-tune the performance and availability of their tables.

In the next section, we will explore each of these properties in more detail and how they can be specified in the new construct.

Features Walk-through

The new TableV2 construct is the recommended CDK DynamoDB construct for creating both single tables and global tables. In this section, we will review some specific aspects of the TableV2 construct and how they can be implemented. The walkthrough will cover features like Replicas, Billing, and Encryption, providing a comprehensive understanding of its capabilities.

Replicas

One of the most important benefits of the new L2 construct is the ability to configure properties on a per-replica basis. For example, the following code creates a global DynamoDB table with contributor insights and point-in-time recovery enabled for the table:

import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'Stack', { env: { region: 'us-west-2' } });

const globalTable = new dynamodb.TableV2(stack, 'GlobalTable', {
  partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
  contributorInsights: true,
  pointInTimeRecovery: true,
  replicas: [
    {
      region: 'us-east-1',
      tableClass: dynamodb.TableClass.STANDARD_INFREQUENT_ACCESS,
      pointInTimeRecovery: false,
    },
    {
      region: 'us-east-2',
      contributorInsights: false,
    },
  ],
});

// This is an ITableV2 instance for the replica table in us-east-1
const replica = globalTable.replica('us-east-1');

This code creates two replicas, one in the us-east-1 region and one in the us-east-2 region. For the replica in the us-east-1 region, we disable point-in-time recovery and set the table class to STANDARD_INFREQUENT_ACCESS. For the replica in the us-east-2 region, we disable contributor insights. The TableV2 construct also enables users to work with individual instances of the replicas in a global table via the replica() method. We see how this can be utilized from the above code where an ITableV2 instance representing the replica in us-east-1 is returned.

This is particularly useful for the grant() and metric() methods. For example, the following code gives a user write access to a replica in us-east-1 region:

import { Construct } from 'constructs';
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { ITableV2, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import { AttributeType } from 'aws-cdk-lib/aws-dynamodb';
import * as iam from 'aws-cdk-lib/aws-iam';


class FooStack extends Stack {
  public readonly globalTable: TableV2;

  public constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);

    this.globalTable = new TableV2(this, 'GlobalTable', {
      partitionKey: { name: 'pk', type: AttributeType.STRING },
      replicas: [
        { region: 'us-east-1' },
        { region: 'us-east-2' },
      ],
    });
  }
}

interface BarStackProps extends StackProps {
  readonly replicaTable: ITableV2;
}

class BarStack extends Stack {
  public constructor(scope: Construct, id: string, props: BarStackProps) {
    super(scope, id, props);
    const user = new iam.User(this, 'User')

    // user is given grantWriteData permissions to replica in us-east-1
    props.replicaTable.grantWriteData(user);
  }
}

const app = new App();

const fooStack = new FooStack(app, 'FooStack', { env: { region: 'us-west-2', account: process.env.CDK_DEFAULT_ACCOUNT } });
const barStack = new BarStack(app, 'BarStack', {
  replicaTable: fooStack.globalTable.replica('us-east-1'),
  env: { region: 'us-east-1', account: process.env.CDK_DEFAULT_ACCOUNT },
});

Before the replica() method was introduced, grant methods on the original Table construct applied to the primary table and all replicas. This was because there was no way to pull out a specific replica. This limited a user’s ability to grant a specific principal read, write, or read/write permission to a specific replica. The replica() method enables granting specific permissions to individual replicas in a global table. It maintains consistent behavior across all methods in the ITableV2 interface, including grants and metrics.

Billing

Table billing is easily configured using the onDemand() or provisioned() static methods of the Billing class. If provisioned billing is configured, the user must provide read and write capacity, which can be easily configured using the fixed() or autoscaled() static methods of the Capacity class.

For example, to configure on-demand billing:

import * as cdk from 'aws-cdk-lib';
import { AttributeType, Billing, TableClass, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';


export class DynamodbStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    new TableV2(this, 'DynamoDBTable', {
      partitionKey: { name: 'id', type: AttributeType.STRING},
      replicas: [
        {region: 'us-east-2'},
        {region: 'us-west-1'}
      ],
      billing: Billing.onDemand(),
      tableClass: TableClass.STANDARD
    })
  }
}

To configure provisioned billing:

import * as cdk from 'aws-cdk-lib';
import { AttributeType, Billing, Capacity, TableClass, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';

export class DynamodbStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    new TableV2(this, 'DynamoDBTable', {
      partitionKey: { name: 'id', type: AttributeType.STRING},
      replicas: [
        {region: 'us-east-2'},
        {region: 'us-west-1'}
      ],
      billing: Billing.provisioned({
        readCapacity: Capacity.fixed(5),
        writeCapacity: Capacity.autoscaled({maxCapacity: 10})
      }),
      tableClass: TableClass.STANDARD
    })
  }
}

Note that with the previous Table construct, users had to set a billingMode property and configure readCapacity and writeCapacity as separate properties. Additionally, configuring autoscaled capacity required calling the autoScaleReadCapacity() or autoScaleWriteCapacity() method on an instance of the Table construct. Lastly, since readCapacity, writeCapacity, and billingMode were all individual properties, a user had to know not to provision read and write capacity for a table with PAY_PER_REQUEST billing mode. With the new Billing class, the user is guided into providing necessary properties via the onDemand() and provisioned() static methods.

Encryption

The TableEncryptionV2 class allows you to provide your own KMS keys for each replica instead of using the default AWS owned keys, thus encrypting every replica with a custom KMS key. This provides more granular control over the encryption of your DynamoDB tables.

Here is an example of how to use the TableEncryptionV2 class to encrypt each replica of a global table with a custom KMS key:

import * as cdk from 'aws-cdk-lib';
import { AttributeType, Billing, BillingMode, Capacity, TableBaseV2, TableEncryptionV2, TableV2 } from 'aws-cdk-lib/aws-dynamodb';
import { IKey, Key } from 'aws-cdk-lib/aws-kms';
import { Construct } from 'constructs';

interface KMSkeys extends cdk.StackProps {
  kmsuswest1: IKey;
  kmsuseast2: IKey;
}

export class GlobalTableStack extends cdk.Stack {
  //public readonly globalTable: TableV2;
  constructor(scope: Construct, id: string, props: KMSkeys) {
    super(scope, id, props);

    const replicaTableKeys = {
      "us-west-1": props.kmsuswest1.keyArn,
      "us-east-2": props.kmsuseast2.keyArn
    }
    const TableKMSKey=new Key(this, 'TableKMSKey', {
      alias: 'KMSuswest2Stack',
    }
    )

    new TableV2(this, 'GlobalTable', {
    tableName: 'FooTableFour',
    encryption: TableEncryptionV2.customerManagedKey(TableKMSKey,replicaTableKeys),

    partitionKey: {
    name: 'FooHashKey',
    type: AttributeType.STRING,
    },
    replicas: [
    {
      region: 'us-west-1',  
    },
    {
      region: 'us-east-2',
    },
  ],
    })
  }
}

The ability to provide custom KMS keys for each replica can help to improve the security of your DynamoDB tables. It also gives you more control over the encryption of your data. This can help you to meet specific compliance requirements.

Conclusion

In this post, I introduced the new AWS CDK TableV2 construct, highlighting its advantages over the original construct. Notably, TableV2 enables drift detection for replica tables and eliminates the need for an extra Lambda function custom resource. I delved into practical implementations, focusing on three key aspects: Replicas, Billing, and Encryption.

To summarize, TableV2 marks a substantial improvement over the original construct. Its user experience provides significant improvement over the original construct in several ways, such as:

Direct support for global tables: TableV2 makes it easy to create and manage global DynamoDB tables.
Easier configuration of global secondary indexes and Autoscaling: TableV2 provides a simplified and streamlined process for configuring global secondary indexes and Autoscaling.
More granular control over replicas: TableV2 allows you to configure properties on a per-replica basis, giving you more control over the performance and availability of your tables.
Improved API design and user experience: TableV2 improves the API design and user experience by implementing new classes for billing, capacity, and encryption.

Overall, TableV2 is a powerful and flexible construct that makes it easier to build and manage DynamoDB tables at scale. It is the preferred CDK DynamoDB construct for creating both single tables and global tables. If you are looking for a powerful and flexible way to build and manage DynamoDB tables, TableV2 is the perfect choice for you.

If you’re new to CDK and eager to get started, we highly recommend checking out the CDK documentation and the CDK workshop.

Announcing Generative AI CDK Constructs

2024-01-31 Michael Tran

Post Syndicated from Michael Tran original https://aws.amazon.com/blogs/devops/announcing-generative-ai-cdk-constructs/

Announced by Werner Vogels in his 2023 re:Invent Keynote, Generative AI CDK Constructs, an open-source extension of the AWS Cloud Development Kit (AWS CDK), provides well-architected multi-service patterns to quickly and efficiently create repeatable infrastructure required for generative AI projects on AWS. Our initial release includes five CDK constructs enabling key generative AI capabilities like question and answering, summarization, data ingestion for Retrieval Augmented Generation (RAG), and model deployment capabilities.

Simplify and Accelerate the Development of Applications with Generative AI

Developing generative AI applications is particularly challenging due to the rapidly evolving nature of the underlying technologies. As a result, developers are faced with the challenge of keeping up with changing paradigms and best practices. This often leads to varied and disjointed patterns as developers try crafting their own solutions from scratch. For instance, there are already hundreds of implementations of patterns like retrieval augmented generation(RAG), summarization, or chatbots.

AWS introduced the Cloud Development Kit (CDK), an open-source software development framework to define cloud infrastructure in code and provision it through AWS CloudFormation. CDK empowers developers to use high-level construct libraries that encapsulate AWS best practices, allowing for the creation of cloud applications without needing to know every detail of the AWS services.

A ‘construct‘ in CDK terminology is a building block that represents an AWS resource or a combination of AWS resources configured together. These constructs can range from low-level (individual resources) to high-level (complete architectures), enabling developers to compose and share their cloud application models as code.

Our library harnesses the power of CDK, offering pre-built constructs designed to expedite the deployment of generative AI applications. These constructs use LLMs/FMs available in Amazon Bedrock facilitating easier modeling and rapid deployment of cloud architectures tailored for generative AI. With these constructs available in both Python and Typescript, developers can leverage AWS services to build industry-agnostic solutions swiftly, regardless of their expertise level in generative AI. The constructs are designed to integrate smoothly with existing CDK applications, offering a scalable approach to enhancing generative AI capabilities across various business verticals.

Getting Started with the Constructs

Prerequisites

Node >= v18.12.1
AWS CDK >= 2.111.0
Python >= 3.9
Projen >= 0.77.44

You can install the CDK constructs in your preferred language:

Typescript app: npm i @cdklabs/generative-ai-cdk-constructs
Python app: pip install cdklabs.generative-ai-cdk-constructs

Within your existing CDK application, import the new library and get access to available constructs:

Typescript app: import * as genai from '@cdklabs/generative-ai-cdk-constructs';
Python app: import cdklabs.generative-ai-cdk-constructs as genai

Example

aws-qa-appsync-opensearch is one of the constructs available in the generative-ai-cdk-constructs library. This construct provides a question answering workflow using Amazon Bedrock and a provisioned Amazon OpenSearch cluster. Amazon Cognito is required to authenticate calls to the AWS AppSync GraphQL API created by the construct.

Typescript app:



import { Construct } from 'constructs';
import { Stack, StackProps } from 'aws-cdk-lib';
import * as os from 'aws-cdk-lib/aws-opensearchservice';
import * as cognito from 'aws-cdk-lib/aws-cognito';
import { QaAppsyncOpensearch, QaAppsyncOpensearchProps } from '@cdklabs/generative-ai-cdk-constructs';


class MyStack extends Stack {
    // get an existing OpenSearch provisioned cluster
    const osDomain = os.Domain.fromDomainAttributes(this, 'osdomain', {
        domainArn: 'arn:aws:es:us-east-1:XXXXXX',
        domainEndpoint: 'https://XXXXX.us-east-1.es.amazonaws.com'
    });
    
    // get an existing userpool 
    const cognitoPoolId = 'us-east-1_XXXXX';
    const userPoolLoaded = cognito.UserPool.fromUserPoolId(this, 'myuserpool', cognitoPoolId);
   
    // Create a QA Appsync OpenSearch construct 
    const ragSource = new QaAppsyncOpensearch(
        this,
        'QaAppsyncOpensearch',
        {
            existingOpensearchDomain: osDomain,
            openSearchIndexName: 'demoindex',
            cognitoUserPool: userPoolLoaded
        }
    )

Python app:


from aws_cdk import (
    Stack,
    aws_opensearchservice as os,
    aws_cognito as cognito
)
from constructs import Construct
from cdklabs.generative_ai_cdk_constructs import QaAppsyncOpensearch

class MyStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)

        # Get an existing OpenSearch provisioned cluster
        os_domain = os.Domain.from_domain_attributes(self, 'osdomain',
            domain_arn='arn:aws:es:us-east-1:XXXXXX',
            domain_endpoint='https://XXXXX.us-east-1.es.amazonaws.com'
        )

        # Get an existing user pool
        cognito_pool_id = 'us-east-1_XXXXX'
        user_pool_loaded = cognito.UserPool.from_user_pool_id(self, 'myuserpool', cognito_pool_id)

        # Create a QA Appsync OpenSearch construct
        rag_source = QaAppsyncOpensearch(
            self, 
            'QaAppsyncOpensearch',
            existing_opensearch_domain=os_domain,
            open_search_index_name='demoindex',
            cognito_user_pool=user_pool_loaded
        )

Refer to the documentation for additional guidance on a particular construct: Catalog

Currently, the following constructs that are available:

Feature	Description
Data ingestion pipeline	Ingestion pipeline providing a RAG (retrieval augmented generation) source for storing documents in a knowledge base.
Question answering	Question answering with a large language model (Anthropic Claude V2) using a RAG (retrieval augmented generation) source and/or long context.
Summarization	Document summarization with a large language model (Anthropic Claude V2).
Lambda layer	Python Lambda layer providing dependencies and utilities to develop generative AI applications on AWS.
SageMaker model deployment	Deploy a foundation model from Amazon SageMaker JumpStart, Hugging Face, or an S3 location to an Amazon SageMaker endpoint.

Explore More Constructs

The library includes constructs for data ingestion pipelines, question answering workflows, document summarization, Lambda layer for generative AI applications, and SageMaker model deployment.

Next Steps

Visit the Generative AI CDK Constructs GitHub repository for a full list of constructs and documentation. For practical examples, check out the AWS samples repository. Your feedback and contributions are welcome on our GitHub repository to enhance the capabilities of Generative AI CDK Constructs.

Deploy CloudFormation Hooks to an Organization with service-managed StackSets

2024-01-18 Kirankumar Chandrashekar

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/deploy-cloudformation-hooks-to-an-organization-with-service-managed-stacksets/

This post demonstrates using AWS CloudFormation StackSets to deploy CloudFormation Hooks from a centralized delegated administrator account to all accounts within an Organization Unit(OU). It provides step-by-step guidance to deploy controls at scale to your AWS Organization as Hooks using StackSets. By following this post, you will learn how to deploy a hook to hundreds of AWS accounts in minutes.

AWS CloudFormation StackSets help deploy CloudFormation stacks to multiple accounts and regions with a single operation. Using service-managed permissions, StackSets automatically generate the IAM roles required to deploy stack instances, eliminating the need for manual creation in each target account prior to deployment. StackSets provide auto-deploy capabilities to deploy stacks to new accounts as they’re added to an Organizational Unit (OU) in AWS Organization. With StackSets, you can deploy AWS well-architected multi-account solutions organization-wide in a single click and target stacks to selected accounts in OUs. You can also leverage StackSets to auto deploy foundational stacks like networking, policies, security, monitoring, disaster recovery, billing, and analytics to new accounts. This ensures consistent security and governance reflecting AWS best practices.

AWS CloudFormation Hooks allow customers to invoke custom logic to validate resource configurations before a CloudFormation stack create/update/delete operation. This helps enforce infrastructure-as-code policies by preventing non-compliant resources. Hooks enable policy-as-code to support consistency and compliance at scale. Without hooks, controlling CloudFormation stack operations centrally across accounts is more challenging because governance checks and enforcement have to be implemented through disjointed workarounds across disparate services after the resources are deployed. Other options like Config rules evaluate resource configurations on a timed basis rather than on stack operations. And SCPs manage account permissions but don’t include custom logic tailored to granular resource configurations. In contrast, CloudFormation hooks allows customer-defined automation to validate each resource as new stacks are deployed or existing ones updated. This enables stronger compliance guarantees and rapid feedback compared to asynchronous or indirect policy enforcement via other mechanisms.

Follow the later sections of this post that provide a step-by-step implementation for deploying hooks across accounts in an organization unit (OU) with a StackSet including:

Configure service-managed permissions to automatically create IAM roles
Create the StackSet in the delegated administrator account
Target the OU to distribute hook stacks to member accounts

This shows how to easily enable a policy-as-code framework organization-wide.

I will show you how to register a custom CloudFormation hook as a private extension, restricting permissions and usage to internal administrators and automation. Registering the hook as a private extension limits discoverability and access. Only approved accounts and roles within the organization can invoke the hook, following security best practices of least privilege.

StackSets Architecture

As depicted in the following AWS StackSets architecture diagram, a dedicated Delegated Administrator Account handles creation, configuration, and management of the StackSet that defines the template for standardized provisioning. In addition, these centrally managed StackSets are deploying a private CloudFormation hook into all member accounts that belong to the given Organization Unit. Registering this as a private CloudFormation hook enables administrative control over the deployment lifecycle events it can respond to. Private hooks prevent public usage, ensuring the hook can only be invoked by approved accounts, roles, or resources inside your organization.

Architecture for deploying CloudFormation Hooks to accounts in an Organization

Diagram 1: StackSets Delegated Administration and Member Account Diagram

In the above architecture, Member accounts join the StackSet through their inclusion in a central Organization Unit. By joining, these accounts receive deployed instances of the StackSet template which provisions resources consistently across accounts, including the controlled private hook for administrative visibility and control.

The delegation of StackSet administration responsibilities to the Delegated Admin Account follows security best practices. Rather than having the sensitive central Management Account handle deployment logistics, delegation isolates these controls to an admin account with purpose-built permissions. The Management Account representing the overall AWS Organization focuses more on high-level compliance governance and organizational oversight. The Delegated Admin Account translates broader guardrails and policies into specific infrastructure automation leveraging StackSets capabilities. This separation of duties ensures administrative privileges are restricted through delegation while also enabling an organization-wide StackSet solution deployment at scale.

Centralized StackSets facilitate account governance through code-based infrastructure management rather than manual account-by-account changes. In summary, the combination of account delegation roles, StackSet administration, and joining through Organization Units creates an architecture to allow governed, infrastructure-as-code deployments across any number of accounts in an AWS Organization.

Sample Hook Development and Deployment

In the section, we will develop a hook on a workstation using the AWS CloudFormation CLI, package it, and upload it to the Hook Package S3 Bucket. Then we will deploy a CloudFormation stack that in turn deploys a hook across member accounts within an Organization Unit (OU) using StackSets.

The sample hook used in this blog post enforces that server-side encryption must be enabled for any S3 buckets and SQS queues created or updated on a CloudFormation stack. This policy requires that all S3 buckets and SQS queues be configured with server-side encryption when provisioned, ensuring security is built into our infrastructure by default. By enforcing encryption at the CloudFormation level, we prevent data from being stored unencrypted and minimize risk of exposure. Rather than manually enabling encryption post-resource creation, our developers simply enable it as a basic CloudFormation parameter. Adding this check directly into provisioning stacks leads to a stronger security posture across environments and applications. This example hook demonstrates functionality for mandating security best practices on infrastructure-as-code deployments.

Prerequisites

On the AWS Organization:

An AWS Organization setup with a management account or a delegated administrator for stack set deployments. For more information, see CloudFormation StackSets delegated administration for setting a delegated administrator.
An AWS Organization setup with Organization Units with AWS accounts. For more information, refer Managing organizational units.
Enable trusted access to CloudFormation StackSets with AWS Organizations. This is required for the service-managed permissions to work. Follow the steps in the AWS CloudFormation User Guide to enable trusted access with AWS Organizations.

On the workstation where the hooks will be developed:

AWS CloudFormation CLI installed
Any Code editor or a Cloud9 Instance
AWS CLI installedand configured with your AWS credentials that has access to the delegated administrator account. This post uses the us-west-2 Region.
```
aws configure
Default region name [us-east-1]: us-west-2
```

In the Delegated Administrator account:

Create a hooks package S3 bucket within the delegated administrator account. Upload the hooks package and CloudFormation templates that StackSets will deploy. Ensure the S3 bucket policy allows access from the AWS accounts within the OU. This access lets AWS CloudFormation access the hooks package objects and CloudFormation template objects in the S3 bucket from the member accounts during stack deployment.

Follow these steps to deploy a CloudFormation template that sets up the S3 bucket and permissions:

Click here to download the admin-cfn-hook-deployment-s3-bucket.yaml template file in to your local workstation.
Note: Make sure you model the S3 bucket and IAM policies as least privilege as possible. For the above S3 Bucket policy, you can add a list of IAM Role ARNs created by the StackSets service managed permissions instead of AWS: “*”, which allows S3 bucket access to all the IAM entities from the accounts in the OU. The ARN of this role will be “arn:aws:iam:::role/stacksets-exec-” in every member account within the OU. For more information about equipping least privilege access to IAM policies and S3 Bucket Policies, refer IAM Policies and Bucket Policies and ACLs! Oh, My! (Controlling Access to S3 Resources) blog post.
Execute the following command to deploy the template admin-cfn-hook-deployment-s3-bucket.yaml using AWS CLI. For more information see Creating a stack using the AWS Command Line Interface. If using AWS CloudFormation console, see Creating a stack on the AWS CloudFormation console.
To get the OU Id, see Viewing the details of an OU. OU Id starts with “ou-“. To get the Organization Id, see Viewing details about your organization. Organization Id starts with “o-”
```
aws cloudformation create-stack \
--stack-name hooks-asset-stack \
--template-body file://admin-cfn-deployment-s3-bucket.yaml \
--parameters ParameterKey=OrgId,ParameterValue="&lt;Org_id&gt;" \
ParameterKey=OUId,ParameterValue="&lt;OU_id&gt;"
```
After deploying the stack, note down the AWS S3 bucket name from the CloudFormation Outputs.

Hook Development

In this section, you will develop a sample CloudFormation hook package that will enforce encryption for S3 Buckets and SQS queues within the preCreate and preDelete hook. Follow the steps in the walkthrough to develop a sample hook and generate a zip package for deploying and enabling them in all the accounts within an OU. While following the walkthrough, within the Registering hooks section, make sure that you stop right after executing the cfn submit --dry-run command. The --dry-run option will make sure that your hook is built and packaged your without registering it with CloudFormation on your account. While initiating a Hook project if you created a new directory with the name mycompany-testing-mytesthook, the hook package will be generated as a zip file with the name mycompany-testing-mytesthook.zip at the root your hooks project.

Upload mycompany-testing-mytesthook.zip file to the hooks package S3 bucket within the Delegated Administrator account. The packaged zip file can then be distributed to enable the encryption hooks across all accounts in the target OU.

Note: If you are using your own hooks project and not doing the tutorial, irrespective of it, you should make sure that you are executing the cfn submit command with the --dry-run option. This ensures you have a hooks package that can be distributed and reused across multiple accounts.

Hook Deployment using CloudFormation Stack Sets

In this section, deploy the sample hook developed previously across all accounts within an OU. Use a centralized CloudFormation stack deployed from the delegated administrator account via StackSets.

Deploying hooks via CloudFormation requires these key resources:

AWS::CloudFormation::HookVersion: Publishes a new hook version to the CloudFormation registry
AWS::CloudFormation::HookDefaultVersion: Specifies the default hook version for the AWS account and region
AWS::CloudFormation::HookTypeConfig: Defines the hook configuration
AWS::IAM::Role #1: Task execution role that grants the hook permissions
AWS::IAM::Role #2: (Optional) role for CloudWatch logging that CloudFormation will assume to send log entries during hook execution
AWS::Logs::LogGroup: (Optional) Enables CloudWatch error logging for hook executions

Follow these steps to deploy CloudFormation Hooks to accounts within the OU using StackSets:

Click here to download the hooks-template.yaml template file into your local workstation and upload it into the Hooks package S3 bucket in the Delegated Administrator account.
Deploy the hooks CloudFormation template hooks-template.yaml to all accounts within an OU using StackSets. Leverage service-managed permissions for automatic IAM role creation across the OU.
To deploy the hooks template hooks-template.yaml across OU using StackSets, click here to download the CloudFormation StackSets template hooks-stack-sets-template.yaml locally, and upload it to the hooks package S3 bucket in the delegated administrator account. This StackSets template contains an AWS::CloudFormation::StackSet resource that will deploy the necessary hooks resources from hooks-template.yaml to all accounts in the target OU. Using SERVICE_MANAGED permissions model automatically handle provisioning the required IAM execution roles per account within the OU.
Execute the following command to deploy the template hooks-stack-sets-template.yaml using AWS CLI. For more information see Creating a stack using the AWS Command Line Interface. If using AWS CloudFormation console, see Creating a stack on the AWS CloudFormation console.To get the S3 Https URL for the hooks template, hooks package and StackSets template, login to the AWS S3 service on the AWS console, select the respective object and click on Copy URL button as shown in the following screenshot:
Diagram 2: S3 Https URL

To get the OU Id, see Viewing the details of an OU. OU Id starts with “ou-“.
Make sure to replace the <S3BucketName> and then <OU_Id> accordingly in the following command:
```
aws cloudformation create-stack --stack-name hooks-stack-set-stack \
--template-url https://<S3BucketName>.s3.us-west-2.amazonaws.com/hooks-stack-sets-template.yaml \
--parameters ParameterKey=OuId,ParameterValue="<OU_Id>" \
ParameterKey=HookTypeName,ParameterValue="MyCompany::Testing::MyTestHook" \
ParameterKey=s3TemplateURL,ParameterValue="https://<S3BucketName>.s3.us-west-2.amazonaws.com/hooks-template.yaml" \
ParameterKey=SchemaHandlerPackageS3URL,ParameterValue="https://<S3BucketName>.s3.us-west-2.amazonaws.com/mycompany-testing-mytesthook.zip"
```
Check the progress of the stack deployment using the aws cloudformation describe-stack command. Move to the next section when the stack status is CREATE_COMPLETE.
```
aws cloudformation describe-stacks --stack-name hooks-stack-set-stack
```
If you navigate to the AWS CloudFormation Service’s StackSets section in the console, you can view the stack instances deployed to the accounts within the OU. Alternatively, you can execute the AWS CloudFormation list-stack-instances CLI command below to list the deployed stack instances:
```
aws cloudformation list-stack-instances --stack-set-name MyTestHookStackSet
```

Testing the deployed hook

Deploy the following sample templates into any AWS account that is within the OU where the hooks was deployed and activated. Follow the steps in the Creating a stack on the AWS CloudFormation console. If using AWS CloudFormation CLI, follow the steps in the Creating a stack using the AWS Command Line Interface.

Provision a non-compliant stack without server-side encryption using the following template:
```
AWSTemplateFormatVersion: 2010-09-09
Description: |
  This CloudFormation template provisions an S3 Bucket
Resources:
  S3Bucket:
    Type: 'AWS::S3::Bucket'
    Properties: {}
```
The stack deployment will not succeed and will give the following error message

The following hook(s) failed: [MyCompany::Testing::MyTestHook] and the hook status reason as shown in the following screenshot:

Diagram 3: S3 Bucket creation failure with hooks execution

Provision a stack using the following template that has server-side encryption for the S3 Bucket.

AWSTemplateFormatVersion: 2010-09-09
Description: |
  This CloudFormation template provisions an encrypted S3 Bucket. **WARNING** This template creates an Amazon S3 bucket and a KMS key that you will be charged for. You will be billed for the AWS resources used if you create a stack from this template.
Resources:
  EncryptedS3Bucket:
    Type: "AWS::S3::Bucket"
    Properties:
      BucketName: !Sub "encryptedbucket-${AWS::Region}-${AWS::AccountId}"
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: "aws:kms"
              KMSMasterKeyID: !Ref EncryptionKey
            BucketKeyEnabled: true
  EncryptionKey:
    Type: "AWS::KMS::Key"
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      Description: KMS key used to encrypt the resource type artifacts
      EnableKeyRotation: true
      KeyPolicy:
        Version: 2012-10-17
        Statement:
          - Sid: Enable full access for owning account
            Effect: Allow
            Principal:
              AWS: !Ref "AWS::AccountId"
            Action: "kms:*"
            Resource: "*"
Outputs:
  EncryptedBucketName:
    Value: !Ref EncryptedS3Bucket

The deployment will succeed as it will pass the hook validation with the following hook status reason as shown in the following screenshot:

stack deployment pass due to hooks execution Diagram 4: S3 Bucket creation success with hooks execution

Updating the hooks package

To update the hooks package, follow the same steps described in the Hooks Development section to change the hook code accordingly. Then, execute the cfn submit --dry-run command to build and generate the hooks package file with the registering the type with the CloudFormation registry. Make sure to rename the zip file with a unique name compared to what was previously used. Otherwise, while updating the CloudFormation StackSets stack, it will not see any changes in the template and thus not deploy updates. The best practice is to use a CI/CD pipeline to manage the hook package. Typically, it is good to assign unique version numbers to the hooks packages so that CloudFormation stacks with the new changes get deployed.

Cleanup

Navigate to the AWS CloudFormation console on the Delegated Administrator account, and note down the Hooks package S3 bucket name and empty its contents. Refer to Emptying the Bucket for more information.

Delete the CloudFormation stacks in the following order:

Test stack that failed
Test stack that passed
StackSets CloudFormation stack. This has a DeletionPolicy set to Retain, update the stack by removing the DeletionPolicy and then initiate a stack deletion via CloudFormation or physically delete the StackSet instances and StackSets from the Console or CLI by following: 1. Delete stack instances from your stack set 2. Delete a stack set
Hooks asset CloudFormation stack

Refer to the following documentation to delete CloudFormation Stacks: Deleting a stack on the AWS CloudFormation console or Deleting a stack using AWS CLI.

Conclusion

Throughout this blog post, you have explored how AWS StackSets enable the scalable and centralized deployment of CloudFormation hooks across all accounts within an Organization Unit. By implementing hooks as reusable code templates, StackSets provide consistency benefits and slash the administrative labor associated with fragmented and manual installs. As organizations aim to fortify governance, compliance, and security through hooks, StackSets offer a turnkey mechanism to efficiently reach hundreds of accounts. By leveraging the described architecture of delegated StackSet administration and member account joining, organizations can implement a single hook across hundreds of accounts rather than manually enabling hooks per account. Centralizing your hook code-base within StackSets templates facilitates uniform adoption while also simplifying maintenance. Administrators can update hooks in one location instead of attempting fragmented, account-by-account changes. By enclosing new hooks within reusable StackSets templates, administrators benefit from infrastructure-as-code descriptiveness and version control instead of one-off scripts. Once configured, StackSets provide automated hook propagation without overhead. The delegated administrator merely needs to include target accounts through their Organization Unit alignment rather than handling individual permissions. New accounts added to the OU automatically receive hook deployments through the StackSet orchestration engine.

About the Author

Kirankumar Chandrashekar is a Sr. Solutions Architect for Strategic Accounts at AWS. He focuses on leading customers in architecting DevOps, modernization using serverless, containers and container orchestration technologies like Docker, ECS, EKS to name a few. Kirankumar is passionate about DevOps, Infrastructure as Code, modernization and solving complex customer issues. He enjoys music, as well as cooking and traveling.

Strengthen the DevOps pipeline and protect data with AWS Secrets Manager, AWS KMS, and AWS Certificate Manager

2024-01-10 Magesh Dhanasekaran

Post Syndicated from Magesh Dhanasekaran original https://aws.amazon.com/blogs/security/strengthen-the-devops-pipeline-and-protect-data-with-aws-secrets-manager-aws-kms-and-aws-certificate-manager/

In this blog post, we delve into using Amazon Web Services (AWS) data protection services such as Amazon Secrets Manager, AWS Key Management Service (AWS KMS), and AWS Certificate Manager (ACM) to help fortify both the security of the pipeline and security in the pipeline. We explore how these services contribute to the overall security of the DevOps pipeline infrastructure while enabling seamless integration of data protection measures. We also provide practical insights by demonstrating the implementation of these services within a DevOps pipeline for a three-tier WordPress web application deployed using Amazon Elastic Kubernetes Service (Amazon EKS).

DevOps pipelines involve the continuous integration, delivery, and deployment of cloud infrastructure and applications, which can store and process sensitive data. The increasing adoption of DevOps pipelines for cloud infrastructure and application deployments has made the protection of sensitive data a critical priority for organizations.

Some examples of the types of sensitive data that must be protected in DevOps pipelines are:

Credentials: Usernames and passwords used to access cloud resources, databases, and applications.
Configuration files: Files that contain settings and configuration data for applications, databases, and other systems.
Certificates: TLS certificates used to encrypt communication between systems.
Secrets: Any other sensitive data used to access or authenticate with cloud resources, such as private keys, security tokens, or passwords for third-party services.

Unintended access or data disclosure can have serious consequences such as loss of productivity, legal liabilities, financial losses, and reputational damage. It’s crucial to prioritize data protection to help mitigate these risks effectively.

The concept of security of the pipeline encompasses implementing security measures to protect the entire DevOps pipeline—the infrastructure, tools, and processes—from potential security issues. While the concept of security in the pipeline focuses on incorporating security practices and controls directly into the development and deployment processes within the pipeline.

By using Secrets Manager, AWS KMS, and ACM, you can strengthen the security of your DevOps pipelines, safeguard sensitive data, and facilitate secure and compliant application deployments. Our goal is to equip you with the knowledge and tools to establish a secure DevOps environment, providing the integrity of your pipeline infrastructure and protecting your organization’s sensitive data throughout the software delivery process.

Sample application architecture overview

WordPress was chosen as the use case for this DevOps pipeline implementation due to its popularity, open source nature, containerization support, and integration with AWS services. The sample architecture for the WordPress application in the AWS cloud uses the following services:

Amazon Route 53: A DNS web service that routes traffic to the correct AWS resource.
Amazon CloudFront: A global content delivery network (CDN) service that securely delivers data and videos to users with low latency and high transfer speeds.
AWS WAF: A web application firewall that protects web applications from common web exploits.
AWS Certificate Manager (ACM): A service that provides SSL/TLS certificates to enable secure connections.
Application Load Balancer (ALB): Routes traffic to the appropriate container in Amazon EKS.
Amazon Elastic Kubernetes Service (Amazon EKS): A scalable and highly available Kubernetes cluster to deploy containerized applications.
Amazon Relational Database Service (Amazon RDS): A managed relational database service that provides scalable and secure databases for applications.
AWS Key Management Service (AWS KMS): A key management service that allows you to create and manage the encryption keys used to protect your data at rest.
AWS Secrets Manager: A service that provides the ability to rotate, manage, and retrieve database credentials.
AWS CodePipeline: A fully managed continuous delivery service that helps to automate release pipelines for fast and reliable application and infrastructure updates.
AWS CodeBuild: A fully managed continuous integration service that compiles source code, runs tests, and produces ready-to-deploy software packages.
AWS CodeCommit: A secure, highly scalable, fully managed source-control service that hosts private Git repositories.

Before we explore the specifics of the sample application architecture in Figure 1, it’s important to clarify a few aspects of the diagram. While it displays only a single Availability Zone (AZ), please note that the application and infrastructure can be developed to be highly available across multiple AZs to improve fault tolerance. This means that even if one AZ is unavailable, the application remains operational in other AZs, providing uninterrupted service to users.

Figure 1: Sample application architecture

The flow of the data protection services in the post and depicted in Figure 1 can be summarized as follows:

First, we discuss securing your pipeline. You can use Secrets Manager to securely store sensitive information such as Amazon RDS credentials. We show you how to retrieve these secrets from Secrets Manager in your DevOps pipeline to access the database. By using Secrets Manager, you can protect critical credentials and help prevent unauthorized access, strengthening the security of your pipeline.

Next, we cover data encryption. With AWS KMS, you can encrypt sensitive data at rest. We explain how to encrypt data stored in Amazon RDS using AWS KMS encryption, making sure that it remains secure and protected from unauthorized access. By implementing KMS encryption, you add an extra layer of protection to your data and bolster the overall security of your pipeline.

Lastly, we discuss securing connections (data in transit) in your WordPress application. ACM is used to manage SSL/TLS certificates. We show you how to provision and manage SSL/TLS certificates using ACM and configure your Amazon EKS cluster to use these certificates for secure communication between users and the WordPress application. By using ACM, you can establish secure communication channels, providing data privacy and enhancing the security of your pipeline.

Note: The code samples in this post are only to demonstrate the key concepts. The actual code can be found on GitHub.

Securing sensitive data with Secrets Manager

In this sample application architecture, Secrets Manager is used to store and manage sensitive data. The AWS CloudFormation template provided sets up an Amazon RDS for MySQL instance and securely sets the master user password by retrieving it from Secrets Manager using KMS encryption.

Here’s how Secrets Manager is implemented in this sample application architecture:

Creating a Secrets Manager secret.
1. Create a Secrets Manager secret that includes the Amazon RDS database credentials using CloudFormation.
2. The secret is encrypted using an AWS KMS customer managed key.
3. Sample code:
```
RDSMySQL:
    Type: AWS::RDS::DBInstance
    Properties: 
		ManageMasterUserPassword: true
		MasterUserSecret:
        		KmsKeyId: !Ref RDSMySqlSecretEncryption
```
The ManageMasterUserPassword: true line in the CloudFormation template indicates that the stack will manage the master user password for the Amazon RDS instance. To securely retrieve the password for the master user, the CloudFormation template uses the MasterUserSecret parameter, which retrieves the password from Secrets Manager. The KmsKeyId: !Ref RDSMySqlSecretEncryption line specifies the KMS key ID that will be used to encrypt the secret in Secrets Manager.

By setting the MasterUserSecret parameter to retrieve the password from Secrets Manager, the CloudFormation stack can securely retrieve and set the master user password for the Amazon RDS MySQL instance without exposing it in plain text. Additionally, specifying the KMS key ID for encryption adds another layer of security to the secret stored in Secrets Manager.
Retrieving secrets from Secrets Manager.
1. The secrets store CSI driver is a Kubernetes-native driver that provides a common interface for Secrets Store integration with Amazon EKS. The secrets-store-csi-driver-provider-aws is a specific provider that provides integration with the Secrets Manager.
2. To set up Amazon EKS, the first step is to create a SecretProviderClass, which specifies the secret ID of the Amazon RDS database. This SecretProviderClass is then used in the Kubernetes deployment object to deploy the WordPress application and dynamically retrieve the secrets from the secret manager during deployment. This process is entirely dynamic and verifies that no secrets are recorded anywhere. The SecretProviderClass is created on a specific app namespace, such as the wp namespace.
3. Sample code:
```
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
spec:
  provider: aws
  parameters:
    objects: |
        - objectName: 'rds!db-0x0000-0x0000-0x0000-0x0000-0x0000'
```

When using Secrets manager, be aware of the following best practices for managing and securing Secrets Manager secrets:

Use AWS Identity and Access Management (IAM) identity policies to define who can perform specific actions on Secrets Manager secrets, such as reading, writing, or deleting them.
Secrets Manager resource policies can be used to manage access to secrets at a more granular level. This includes defining who has access to specific secrets based on attributes such as IP address, time of day, or authentication status.
Encrypt the Secrets Manager secret using an AWS KMS key.
Using CloudFormation templates to automate the creation and management of Secrets Manager secrets including rotation.
Use AWS CloudTrail to monitor access and changes to Secrets Manager secrets.
Use CloudFormation hooks to validate the Secrets Manager secret before and after deployment. If the secret fails validation, the deployment is rolled back.

Encrypting data with AWS KMS

Data encryption involves converting sensitive information into a coded form that can only be accessed with the appropriate decryption key. By implementing encryption measures throughout your pipeline, you make sure that even if unauthorized individuals gain access to the data, they won’t be able to understand its contents.

Here’s how data at rest encryption using AWS KMS is implemented in this sample application architecture:

Amazon RDS secret encryption

Encrypting secrets: An AWS KMS customer managed key is used to encrypt the secrets stored in Secrets Manager to ensure their confidentiality during the DevOps build process.

Sample code:

RDSMySQL:
    Type: AWS::RDS::DBInstance
    Properties:
      ManageMasterUserPassword: true
      MasterUserSecret:
        KmsKeyId: !Ref RDSMySqlSecretEncryption

RDSMySqlSecretEncryption:
    Type: "AWS::KMS::Key"
    Properties:
      KeyPolicy:
        Id: rds-mysql-secret-encryption
        Statement:
          - Sid: Allow administration of the key
            Effect: Allow
            "Action": [
                "kms:Create*",
                "kms:Describe*",
                "kms:Enable*",
                "kms:List*",
                "kms:Put*",
					.
					.
					.
            ]
          - Sid: Allow use of the key
            Effect: Allow
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKey",
                "kms:DescribeKey"
            ]

Amazon RDS data encryption

Enable encryption for an Amazon RDS instance using CloudFormation. Specify the KMS key ARN in the CloudFormation stack and RDS will use the specified KMS key to encrypt data at rest.

Sample code:

RDSMySQL:
    Type: AWS::RDS::DBInstance
    Properties:
  KmsKeyId: !Ref RDSMySqlDataEncryption
        StorageEncrypted: true

RDSMySqlDataEncryption:
    Type: "AWS::KMS::Key"
    Properties:
      KeyPolicy:
        Id: rds-mysql-data-encryption
        Statement:
          - Sid: Allow administration of the key
            Effect: Allow
            "Action": [
                "kms:Create*",
                "kms:Describe*",
                "kms:Enable*",
                "kms:List*",
                "kms:Put*",
.
.
.
            ]
          - Sid: Allow use of the key
            Effect: Allow
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKey",
                "kms:DescribeKey"
            ]

Kubernetes Pods storage
1. Use encrypted Amazon Elastic Block Store (Amazon EBS) volumes to store configuration data. Create a managed encrypted Amazon EBS volume using the following code snippet, and then deploy a Kubernetes pod with the persistent volume claim (PVC) mounted as a volume.
2. Sample code:
```
kind: StorageClass
provisioner: ebs.csi.aws.com
parameters:
  csi.storage.k8s.io/fstype: xfs
  encrypted: "true"

kind: Deployment
spec:
  volumes:      
      - name: persistent-storage
        persistentVolumeClaim:
          claimName: ebs-claim
```

Amazon ECR

To secure data at rest in Amazon Elastic Container Registry (Amazon ECR), enable encryption at rest for Amazon ECR repositories using the AWS Management Console or AWS Command Line Interface (AWS CLI). ECR uses AWS KMS to encrypt the data at rest.
Create a KMS key for Amazon ECR and use that key to encrypt the data at rest.
Automate the creation of encrypted ECR repositories and enable encryption at rest using a DevOps pipeline, use CodePipeline to automate the deployment of the CloudFormation stack.
Define the creation of encrypted Amazon ECR repositories as part of the pipeline.

Sample code:

ECRRepository:
    Type: AWS::ECR::Repository
    Properties: 
      EncryptionConfiguration: 
        EncryptionType: KMS
        KmsKey: !Ref ECREncryption

ECREncryption:
    Type: AWS::KMS::Key
    Properties:
      KeyPolicy:
        Id: ecr-encryption-key
        Statement:
          - Sid: Allow administration of the key
            Effect: Allow
            "Action": [
                "kms:Create*",
                "kms:Describe*",
                "kms:Enable*",
                "kms:List*",
                "kms:Put*",
.
.
.
 ]
          - Sid: Allow use of the key
            Effect: Allow
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKey",
                "kms:DescribeKey"
            ]

AWS best practices for managing encryption keys in an AWS environment

To effectively manage encryption keys and verify the security of data at rest in an AWS environment, we recommend the following best practices:

Use separate AWS KMS customer managed KMS keys for data classifications to provide better control and management of keys.
Enforce separation of duties by assigning different roles and responsibilities for key management tasks, such as creating and rotating keys, setting key policies, or granting permissions. By segregating key management duties, you can reduce the risk of accidental or intentional key compromise and improve overall security.
Use CloudTrail to monitor AWS KMS API activity and detect potential security incidents.
Rotate KMS keys as required by your regulatory requirements.
Use CloudFormation hooks to validate KMS key policies to verify that they align with organizational and regulatory requirements.

Following these best practices and implementing encryption at rest for different services such as Amazon RDS, Kubernetes Pods storage, and Amazon ECR, will help ensure that data is encrypted at rest.

Securing communication with ACM

Secure communication is a critical requirement for modern environments and implementing it in a DevOps pipeline is crucial for verifying that the infrastructure is secure, consistent, and repeatable across different environments. In this WordPress application running on Amazon EKS, ACM is used to secure communication end-to-end. Here’s how to achieve this:

Provision TLS certificates with ACM using a DevOps pipeline
1. To provision TLS certificates with ACM in a DevOps pipeline, automate the creation and deployment of TLS certificates using ACM. Use AWS CloudFormation templates to create the certificates and deploy them as part of infrastructure as code. This verifies that the certificates are created and deployed consistently and securely across multiple environments.
2. Sample code:
```
DNSDomainCertificate:
    Type: AWS::CertificateManager::Certificate
    Properties:
      DomainName: !Ref DNSDomainName
      ValidationMethod: 'DNS'

DNSDomainName:
    Description: dns domain name 
    TypeM: String
    Default: "example.com"
```

Provisioning of ALB and integration of TLS certificate using AWS ALB Ingress Controller for Kubernetes

Use a DevOps pipeline to create and configure the TLS certificates and ALB. This verifies that the infrastructure is created consistently and securely across multiple environments.

Sample code:

kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:000000000000:certificate/0x0000-0x0000-0x0000-0x0000-0x0000
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/security-groups:  sg-0x00000x0000,sg-0x00000x0000
spec:
  ingressClassName: alb

CloudFront and ALB

To secure communication between CloudFront and the ALB, verify that the traffic from the client to CloudFront and from CloudFront to the ALB is encrypted using the TLS certificate.

Sample code:

CloudFrontDistribution:
    Type: AWS::CloudFront::Distribution
    Properties:
      DistributionConfig:
        Origins:
          - DomainName: !Ref ALBDNSName
            Id: !Ref ALBDNSName
            CustomOriginConfig:
              HTTPSPort: '443'
              OriginProtocolPolicy: 'https-only'
              OriginSSLProtocols:
                - LSv1
	    ViewerCertificate:
AcmCertificateArn: !Sub 'arn:aws:acm:${AWS::Region}:${AWS::AccountId}:certificate/${ACMCertificateIdentifier}'
            SslSupportMethod:  'sni-only'
            MinimumProtocolVersion: 'TLSv1.2_2021'

ALBDNSName:
    Description: alb dns name
    Type: String
    Default: "k8s-wp-ingressw-x0x0000x000-x0x0000x000.us-east-1.elb.amazonaws.com"

ALB to Kubernetes Pods
1. To secure communication between the ALB and the Kubernetes Pods, use the Kubernetes ingress resource to terminate SSL/TLS connections at the ALB. The ALB sends the PROTO metadata http connection header to the WordPress web server. The web server checks the incoming traffic type (http or https) and enables the HTTPS connection only hereafter. This verifies that pod responses are sent back to ALB only over HTTPS.
2. Additionally, using the X-Forwarded-Proto header can help pass the original protocol information and help avoid issues with the $_SERVER[‘HTTPS’] variable in WordPress.
3. Sample code:
```
define('WP_HOME','https://example.com/');
define('WP_SITEURL','https://example.com/');

define('FORCE_SSL_ADMIN', true);
if (isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && strpos($_SERVER['HTTP_X_FORWARDED_PROTO'], 'https') !== false) {
    $_SERVER['HTTPS'] = 'on';
```
Kubernetes Pods to Amazon RDS
1. To secure communication between the Kubernetes Pods in Amazon EKS and the Amazon RDS database, use SSL/TLS encryption on the database connection.
2. Configure an Amazon RDS MySQL instance with enhanced security settings to verify that only TLS-encrypted connections are allowed to the database. This is achieved by creating a DB parameter group with a parameter called require_secure_transport set to ‘1‘. The WordPress configuration file is also updated to enable SSL/TLS communication with the MySQL database. Then enable the TLS flag on the MySQL client and the Amazon RDS public certificate is passed to ensure that the connection is encrypted using the TLS_AES_256_GCM_SHA384 protocol. The sample code that follows focuses on enhancing the security of the RDS MySQL instance by enforcing encrypted connections and configuring WordPress to use SSL/TLS for communication with the database.
3. Sample code:
```
RDSDBParameterGroup:
    Type: 'AWS::RDS::DBParameterGroup'
    Properties:
      DBParameterGroupName: 'rds-tls-custom-mysql'
      Parameters:
        require_secure_transport: '1'

RDSMySQL:
    Type: AWS::RDS::DBInstance
    Properties:
      DBName: 'wordpress'
      DBParameterGroupName: !Ref RDSDBParameterGroup

wp-config-docker.php:
// Enable SSL/TLS between WordPress and MYSQL database
define('MYSQL_CLIENT_FLAGS', MYSQLI_CLIENT_SSL);//This activates SSL mode
define('MYSQL_SSL_CA', '/usr/src/wordpress/amazon-global-bundle-rds.pem');
```

In this architecture, AWS WAF is enabled at CloudFront to protect the WordPress application from common web exploits. AWS WAF for CloudFront is recommended and use AWS managed WAF rules to verify that web applications are protected from common and the latest threats.

Here are some AWS best practices for securing communication with ACM:

Use SSL/TLS certificates: Encrypt data in transit between clients and servers. ACM makes it simple to create, manage, and deploy SSL/TLS certificates across your infrastructure.
Use ACM-issued certificates: This verifies that your certificates are trusted by major browsers and that they are regularly renewed and replaced as needed.
Implement certificate revocation: Implement certificate revocation for SSL/TLS certificates that have been compromised or are no longer in use.
Implement strict transport security (HSTS): This helps protect against protocol downgrade attacks and verifies that SSL/TLS is used consistently across sessions.
Configure proper cipher suites: Configure your SSL/TLS connections to use only the strongest and most secure cipher suites.

Monitoring and auditing with CloudTrail

In this section, we discuss the significance of monitoring and auditing actions in your AWS account using CloudTrail. CloudTrail is a logging and tracking service that records the API activity in your AWS account, which is crucial for troubleshooting, compliance, and security purposes. Enabling CloudTrail in your AWS account and securely storing the logs in a durable location such as Amazon Simple Storage Service (Amazon S3) with encryption is highly recommended to help prevent unauthorized access. Monitoring and analyzing CloudTrail logs in real-time using CloudWatch Logs can help you quickly detect and respond to security incidents.

In a DevOps pipeline, you can use infrastructure-as-code tools such as CloudFormation, CodePipeline, and CodeBuild to create and manage CloudTrail consistently across different environments. You can create a CloudFormation stack with the CloudTrail configuration and use CodePipeline and CodeBuild to build and deploy the stack to different environments. CloudFormation hooks can validate the CloudTrail configuration to verify it aligns with your security requirements and policies.

It’s worth noting that the aspects discussed in the preceding paragraph might not apply if you’re using AWS Organizations and the CloudTrail Organization Trail feature. When using those services, the management of CloudTrail configurations across multiple accounts and environments is streamlined. This centralized approach simplifies the process of enforcing security policies and standards uniformly throughout the organization.

By following these best practices, you can effectively audit actions in your AWS environment, troubleshoot issues, and detect and respond to security incidents proactively.

Complete code for sample architecture for deployment

The complete code repository for the sample WordPress application architecture demonstrates how to implement data protection in a DevOps pipeline using various AWS services. The repository includes both infrastructure code and application code that covers all aspects of the sample architecture and implementation steps.

The infrastructure code consists of a set of CloudFormation templates that define the resources required to deploy the WordPress application in an AWS environment. This includes the Amazon Virtual Private Cloud (Amazon VPC), subnets, security groups, Amazon EKS cluster, Amazon RDS instance, AWS KMS key, and Secrets Manager secret. It also defines the necessary security configurations such as encryption at rest for the RDS instance and encryption in transit for the EKS cluster.

The application code is a sample WordPress application that is containerized using Docker and deployed to the Amazon EKS cluster. It shows how to use the Application Load Balancer (ALB) to route traffic to the appropriate container in the EKS cluster, and how to use the Amazon RDS instance to store the application data. The code also demonstrates how to use AWS KMS to encrypt and decrypt data in the application, and how to use Secrets Manager to store and retrieve secrets. Additionally, the code showcases the use of ACM to provision SSL/TLS certificates for secure communication between the CloudFront and the ALB, thereby ensuring data in transit is encrypted, which is critical for data protection in a DevOps pipeline.

Conclusion

Strengthening the security and compliance of your application in the cloud environment requires automating data protection measures in your DevOps pipeline. This involves using AWS services such as Secrets Manager, AWS KMS, ACM, and AWS CloudFormation, along with following best practices.

By automating data protection mechanisms with AWS CloudFormation, you can efficiently create a secure pipeline that is reproducible, controlled, and audited. This helps maintain a consistent and reliable infrastructure.

Monitoring and auditing your DevOps pipeline with AWS CloudTrail is crucial for maintaining compliance and security. It allows you to track and analyze API activity, detect any potential security incidents, and respond promptly.

By implementing these best practices and using data protection mechanisms, you can establish a secure pipeline in the AWS cloud environment. This enhances the overall security and compliance of your application, providing a reliable and protected environment for your deployments.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Looking beyond code coverage with Amazon CodeWhisperer

2024-01-08 Saurabh Kumar

Post Syndicated from Saurabh Kumar original https://aws.amazon.com/blogs/devops/looking-beyond-code-coverage-with-amazon-codewhisperer/

Code coverage is a code quality metric leveraging unit tests. Coming up with test cases with every combination of parameters requires developer’s time, which is already scarce. Developers’ focus is (mis)directed at just meeting the coverage threshold. In doing so, quality of code may be compromised and resulting code may still result in unexpected outcomes.

In this blog, we will walk you through a Java application and demonstrate how to look beyond code coverage by leveraging Amazon CodeWhisperer, an AI coding companion, for generating a combination of test cases, including boundary conditions which are often overlooked due to time and resource constraints. Taking this approach, you can improve code quality as well as improve your productivity.

Prerequisites

Create an AWS Account. If you don’t already have an AWS account, you’ll need to create one in order to follow the steps outlined to AWS Management Console. For this, it is recommended to create a new user.
To set up CodeWhisperer for your development environment, follow the setup instructions AWS Toolkit.
Download and set up Java: Install the Java SE Development Kit.
Install Maven.
You can use any IDE of your choice such as Eclipse, Spring Tools, VS Code or IntelliJ. For this blog, we have used IntelliJ community edition which you can download from here. You can use IntelliJ Ultimate if you have a license for it.
Clone repo: Looking beyond code coverage with CodeWhisperer.
Import cloned project in IntelliJ following importing-a-project guide.

Following the above steps, you would have setup the project locally. Figure1 below shows how the initial project should look after you have imported it as maven project in your IDE:

Figure1. initial Java project

The following screen recording shows how you can use CodeWhisperer to generate code for ‘add’ method using the comment “//method for adding two numbers”.

ScreenRecording 1. Initial application code generation

Now let’s generate one simple test case using CodeWhisperer as shown in the ScreenRecording 2. First test case generation:

ScreenRecording 2. First test case generation

Let’s run the test case with code coverage. In figure2. “First test with code coverage”, you can see that we have achieved 100% coverage on the Calculator class. If we just go by the coverage, we can conclude that the code is ready.

Figure2. First test with code coverage

Just one unit test is not sufficient to ensure quality and side-effect-free code. Next, you will see how CodeWhisperer can assist in generating additional test cases. As soon as you begin typing a comment like, “// Test,” it provides suggestions for new test cases, such as “// test with one negative number,” “// test with two negative numbers,” “Test with one zero number,” etc. This feature, as shown in the ScreenRecording 3 titled ‘Generating additional test cases’ below, makes the task of generating a variety of test cases easier and enables developers to create more tests in a shorter amount of time.

ScreenRecording 3. Generating additional test cases

So far, we have generated test cases with different arguments and still have 100% code coverage. Let’s now turn our focus towards safety of the code and think about different arguments that can lead to unexpected outcomes. Each argument should fall within the range of -2,147,483,648 (the minimum value of type int) to 2,147,483,647 (the maximum value of type int). Let’s use CodeWhisperer to generate additional test cases to challenge code safety as shown in the screen recording below:

ScreenRecording 4. Generating test for boundary conditions

Here, CodeWhisperer first generated a test case which adds 1 to maximum value of integer. We have also added a statement to print the result to console so that we can see the actual value ‘add’ method returns when this use case is executed. Point to note here: the generated test case is expecting the output to be the MINIMUM value of type int. Upon execution, the test case prints something unexpected. Because of the way signed integer operations work, adding 1 to max int results in the min value.

Let’s consider this in more practical terms. Imagine using the ‘add’ method in a banking system, where every time a customer deposits money into their bank account, you add up the recent deposit to calculate the final amount in their account. Now, imagine a customer’s reaction when they find their balance to be negative after depositing $2.14 billion, and they now owe huge overdraft charges.

This example demonstrates that even code with 100% coverage has unexpected side effects. The focus should be identifying combinations of parameters which can potentially produce unexpected outcomes so that code can be corrected before it manifests this behavior in production.

Now, let’s use CodeWhisperer to generate another test case that could create an unexpected result: “add -1 to the minimum value of ‘int’”. Again, adding -1 to the minimum int value results in the MAXIMUM value. Using the same example as above, a customer would be more than happy to notice that they still have money in their bank account, even after withdrawing $2.14 billion.

Again, the point is that developers should focus on ensuring that the code doesn’t have unwanted, unexpected consequences, rather than chasing a coverage target.

Now that we have seen that the add methods runs into integer overflow in certain conditions, let’s improve the code using CodeWhisperer using comment “check a and b for integer overflow”:

ScreenRecording 5. Code improvement- overflow check

After adding the safety checks, the test cases are not resulting in unexpected outcomes and resulting in ArithmaticException as shown in the above screen recording. However, the test cases are failing, and failing test cases can interrupt the CI/CD pipeline. So, let’s refactor these test cases to expect this runtime exception and pass the test case as shown in the screen recording below.

ScreenRecording 6. Test case improvement- overflow checks

Having rerun the test cases with coverage, you can see that the test cases are not only passing, but also have 100% code coverage.

For this blog, the majority of the code and its corresponding test cases are generated by CodeWhisperer, an AI coding companion. This tool enables us to enhance code by easily exploring libraries. In our example, this led us to the ‘Math.addExact’ method, which provides checks for boundary conditions relevant to our task. Let’s refactor the code to utilize this method, as shown below in Figure 3 final code.

Figure 3. Final code

If we rerun the test suite with coverage, we find that all the test cases are passing and coverage is also maintained at 100%.

Figure 4. Final tests with coverage

Conclusion:

Through this blog post, we have demonstrated that high code coverage alone does not guarantee high quality code. Tools like Amazon CodeWhisperer can boost developer productivity by generating code and as well as corresponding test suite, including boundary conditions. This frees up developers to concentrate on business logic and to learn new frameworks and libraries, thereby resulting in overall improvement in quality and safety of code.

While our example focused on Java, this concept applies to other programming languages as well. Checkout the complete list of programming languages and IDEs supported by CodeWhisperer in the FAQs.

Try out CodeWhisperer Individual Tier for free to see how it can help you write high-quality code more efficiently using CodeWhisperer getting started guides.

Happy coding!

Introduction

AWS Cloud Control API

History and Evolution of the Terraform AWS CC Provider

Using the AWS CC Provider

Things to know

Conclusion

Authors

Planning

Research and Design

Develop and Test

Debugging and Troubleshooting

Conclusion

Introduction

Prerequisites

Walkthrough

Create the base workflow

Connect to CodeCatalyst Dev Environment

Cleaning up

Conclusion

AWS CodePipeline

Enable Automatic rollbacks

Example 1 – Automated Rollbacks

Example 2 – Manual Rollbacks

Clean up

Conclusion

About the author:

Current challenges with security automation

Solution overview

Prerequisites

Implement security automation

Generate a remediation script with CodeWhisperer

To generate the remediation script

Create the Lambda function

To create the Lambda function

Create a custom action in Security Hub

To create the custom action

Create an EventBridge rule to target the Lambda function

To create the EventBridge rule

Run the automation

To run the automation

Conclusion

Introduction

Terraform Test

Basic test to validate resource creation

Create resource and validate resource name

Testing variable input validation

Orchestrating supporting resources

Terraform Tests in CI/CD Pipelines

Choosing various testing strategies

Conclusion

Authors

Spot the Difference

Time-Inhomogeneous Poisson Process

Case Study 1: Drop in Successful Title Starts

Case Study 2: Increase in Abnormal Shutdowns

Case Study 3: Increase in Errors

Try it Out!

Further Reading

Prerequisites

CloudFormation

CDK

Terraform HCL

Security Scanner

Conclusion

Bio

Introduction

Deploying a CloudFormation stack

Resource Dependency:

Template snippet:

Dependency Graph:

Resource Provisioning Time:

What is Resource Stabilization?

Stabilization Criteria and Stabilization Timeout

AWS CloudFormation vs AWS CLI to provision resources

Command:

Evolution of Stabilization Strategy

Historic Stabilization Strategy

New Optimistic Stabilization Strategy

Improved Visibility into Resource Provisioning

Conclusion: