Tag Archives: AWS CodeCommit

Deploying AWS Lambda layers automatically across multiple Regions

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/deploying-aws-lambda-layers-automatically-across-multiple-regions/

This post is written by Ben Freiberg, Solutions Architect, and Markus Ziller, Solutions Architect.

Many developers import libraries and dependencies into their AWS Lambda functions. These dependencies can be zipped and uploaded as part of the build and deployment process but it’s often easier to use Lambda layers instead.

A Lambda layer is an archive containing additional code, such as libraries or dependencies. Layers are deployed as immutable versions, and the version number increments each time you publish a new layer. When you include a layer in a function, you specify the layer version you want to use.

Lambda layers simplify and speed up the development process by providing common dependencies and reducing the deployment size of your Lambda functions. To learn more, refer to Using Lambda layers to simplify your development process.
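
As an illustration only (not part of the sample repository), the following minimal AWS CDK TypeScript sketch defines a layer from a local folder and attaches it to a function. The paths and construct names are placeholders, module paths assume CDK v2 (aws-cdk-lib), and the code is assumed to run inside a CDK stack:

import * as path from 'path';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// The layer folder is expected to contain a nodejs/node_modules structure.
const dependenciesLayer = new lambda.LayerVersion(this, 'DependenciesLayer', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'layer')),
  compatibleRuntimes: [lambda.Runtime.NODEJS_14_X],
  description: 'Shared third-party dependencies',
});

// The function references this specific, immutable layer version.
new lambda.Function(this, 'ConsumerFunction', {
  runtime: lambda.Runtime.NODEJS_14_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset(path.join(__dirname, 'src')),
  layers: [dependenciesLayer],
});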

Many customers build Lambda layers for use across multiple Regions. But maintaining up-to-date and consistent layer versions across multiple Regions is a manual process. Layers are private by default, but they can be shared with other AWS accounts or made public. Permissions apply only to a single version of a layer. This solution automates the creation and deployment of Lambda layers across multiple Regions from a centralized pipeline.

Overview of the solution

This solution uses AWS Lambda, AWS CodeCommit, AWS CodeBuild, and AWS CodePipeline.

Reference architecture

This diagram outlines the workflow implemented in this blog:

  1. A CodeCommit repository contains the language-specific definition of dependencies and libraries that the layer contains, such as package.json for Node.js or requirements.txt for Python. Each commit to the main branch triggers an execution of the surrounding CodePipeline.
  2. A CodeBuild job uses the provided buildspec.yaml to create a zipped archive containing the layer contents. CodePipeline automatically stores the output of the CodeBuild job as artifacts in a dedicated Amazon S3 bucket.
  3. A Lambda function is invoked for each configured Region.
  4. The function first downloads the zip archive from S3.
  5. Next, the function creates the layer version in the specified Region with the configured permissions.

Walkthrough

The following walkthrough explains the components and how the provisioning can be automated via CDK. For this walkthrough, you need:

To deploy the sample stack:

  1. Clone the associated GitHub repository by running the following command in a local directory:
    git clone https://github.com/aws-samples/multi-region-lambda-layers
  2. Open the repository in your preferred editor and review the contents of the src and cdk folder.
  3. Follow the instructions in the README.md to deploy the stack.
  4. Check the execution history of your pipeline in the AWS Management Console. The pipeline has been started once already and published a first version of the Lambda layer.
    Execution history

Code repository

The source code of the Lambda layers is stored in AWS CodeCommit. This is a secure, highly scalable, managed source control service that hosts private Git repositories. This example initializes a new repository as part of the CDK stack:

    const asset = new Asset(this, 'SampleAsset', {
      path: path.join(__dirname, '../../res')
    });

    const cfnRepository = new codecommit.CfnRepository(this, 'lambda-layer-source', {
      repositoryName: 'lambda-layer-source',
      repositoryDescription: 'Contains the source code for a nodejs12+14 Lambda layer.',
      code: {
        branchName: 'main',
        s3: {
          bucket: asset.s3BucketName,
          key: asset.s3ObjectKey
        }
      },
    });

This code uploads the contents of the ./cdk/res/ folder to an S3 bucket that is managed by the CDK. The CDK then initializes a new CodeCommit repository with the contents of the bucket. In this case, the repository gets initialized with the following files:

  • LICENSE: A text file describing the license for this Lambda layer
  • package.json: In Node.js, the package.json file is a manifest for projects. It defines dependencies, scripts, and metainformation about the project. The npm install command installs all project dependencies in a node_modules folder. This is where you define the contents of the Lambda layer.

The default package.json in the sample code defines a Lambda layer with the latest version of the AWS SDK for JavaScript:

{
    "name": "lambda-layer",
    "version": "1.0.0",
    "description": "Sample AWS Lambda layer",
    "dependencies": {
      "aws-sdk": "latest"
    }
}

To see what is included in the layer, run npm install in the ./cdk/res/ directory. This shows the files that are bundled into the Lambda layer. The contents of this folder initialize the CodeCommit repository, so delete node_modules and package-lock.json after inspecting these files.

Node modules directory

This blog post uses a new CodeCommit repository as the source but you can adapt this to other providers. CodePipeline also supports repositories on GitHub and Bitbucket. To connect to those providers, see the documentation.
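
For example, a hedged sketch of replacing the CodeCommit source with a GitHub source through an AWS CodeStar Connections connection could look like the following. The owner, repository name, and connection ARN are placeholders, and the connection itself must be created beforehand:

import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';
import * as codepipelineActions from 'aws-cdk-lib/aws-codepipeline-actions';

const sourceOutput = new codepipeline.Artifact();

// Connection-based GitHub source action instead of the CodeCommit source action.
const sourceAction = new codepipelineActions.CodeStarConnectionsSourceAction({
  actionName: 'GitHubSource',
  owner: 'my-org',                // placeholder organization
  repo: 'lambda-layer-source',    // placeholder repository
  branch: 'main',
  connectionArn: 'arn:aws:codestar-connections:eu-central-1:111111111111:connection/EXAMPLE',
  output: sourceOutput,
});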

CI/CD Pipeline

CodePipeline automates the process of building and distributing Lambda layers across Regions for every change to the main branch of the source repository. It is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline automates the build, test, and deploy phases of your release process every time there is a code change, based on the release model you define.

The CDK creates a pipeline in CodePipeline and configures it so that every change to the code base of the Lambda layer runs through the following three stages:

new codepipeline.Pipeline(this, 'Pipeline', {
      pipelineName: 'LambdaLayerBuilderPipeline',
      stages: [
        {
          stageName: 'Source',
          actions: [sourceAction]
        },
        {
          stageName: 'Build',
          actions: [buildAction]
        },
        {
          stageName: 'Distribute',
          actions: parallel,
        }
      ]
    });

Source

The source phase is the first phase of every run of the pipeline. It is typically triggered by a new commit to the main branch of the source repository. You can also start the source phase manually with the following AWS CLI command:

aws codepipeline start-pipeline-execution --name LambdaLayerBuilderPipeline

When started manually, the current head of the main branch is used. Otherwise CodePipeline checks out the code in the revision of the commit that triggered the pipeline execution.

CodePipeline stores the code locally and uses it as an output artifact of the Source stage. Stages use input and output artifacts that are stored in the Amazon S3 artifact bucket you chose when you created the pipeline. CodePipeline zips and transfers the files for input or output artifacts as appropriate for the action type in the stage.
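
The sourceAction referenced in the pipeline definition above is defined in the CDK stack. The following is a minimal sketch of a CodeCommit source action; the sample repository contains the authoritative definition:

import * as codecommit from 'aws-cdk-lib/aws-codecommit';
import * as codepipeline from 'aws-cdk-lib/aws-codepipeline';
import * as codepipelineActions from 'aws-cdk-lib/aws-codepipeline-actions';

const repository = codecommit.Repository.fromRepositoryName(this, 'Source', 'lambda-layer-source');
const sourceOutput = new codepipeline.Artifact('SourceOutput');

// Checks out the triggering commit and hands it to the Build stage as an artifact.
const sourceAction = new codepipelineActions.CodeCommitSourceAction({
  actionName: 'CodeCommitSource',
  repository,
  branch: 'main',
  output: sourceOutput,
});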

Build

In the second phase of the pipeline, CodePipeline installs all dependencies and packages according to the specs of the targeted Lambda runtime. CodeBuild is a fully managed build service in the cloud. It reduces the need to provision, manage, and scale your own build servers. It provides prepackaged build environments for popular programming languages and build tools like npm for Node.js.

In CodeBuild, you use build specifications (buildspecs) to define what commands need to run to build your application. Here, it runs commands in a provisioned Docker image with Amazon Linux 2 to do the following:

  • Create the folder structure expected by Lambda layers.
  • Run npm install to install all Node.js dependencies.
  • Package the code into a layer.zip file and define layer.zip as the output of the Build stage.

The following CDK code highlights the specifications of the CodeBuild project.

const buildAction = new codebuild.PipelineProject(this, 'lambda-layer-builder', {
      buildSpec: codebuild.BuildSpec.fromObject({
        version: '0.2',
        phases: {
          install: {
            commands: [
              'mkdir -p node_layer/nodejs',
              'cp package.json ./node_layer/nodejs/package.json',
              'cd ./node_layer/nodejs',
              'npm install',
            ]
          },
          build: {
            commands: [
              'rm package-lock.json',
              'cd ..',
              'zip ../layer.zip * -r',
            ]
          }
        },
        artifacts: {
          files: [
            'layer.zip',
          ]
        }
      }),
      environment: {
        buildImage: codebuild.LinuxBuildImage.STANDARD_5_0
      }
    })

Distribute

In the final stage, a Lambda function uses layer.zip to create and publish a Lambda layer across multiple Regions. The sample code defines four Regions as targets for the distribution process:

regionCodesToDistribute: ['eu-central-1', 'eu-west-1', 'us-west-1', 'us-east-1']

The Distribution phase consists of n (one per Region) parallel invocations of the same Lambda function, each with userParameters.region set to the respective Region. This is defined in the CDK stack:

const parallel = props.regionCodesToDistribute.map((region) => new codepipelineActions.LambdaInvokeAction({
      actionName: `distribute-${region}`,
      lambda: distributor,
      inputs: [buildOutput],
      userParameters: { region, layerPrincipal: props.layerPrincipal }
}));

Each Lambda function runs the following code to publish a new Lambda layer in each Region:

// Simplified code for brevity
// Omitted error handling, permission management and logging 
// See code samples for full code.
export async function handler(event: any) {
    // #1 Get job specific parameters (e.g. target region)
    const { location } = event['CodePipeline.job'].data.inputArtifacts[0];
    const { region, layerPrincipal } = JSON.parse(event["CodePipeline.job"].data.actionConfiguration.configuration.UserParameters);
    
    // #2 Get location of layer.zip and download it locally
    const layerZip = s3.getObject(/* Input artifact location*/);
    const lambda = new Lambda({ region });
    // #3 Publish a new Lambda layer version based on layer.zip
    const layer = lambda.publishLayerVersion({
        Content: {
            ZipFile: layerZip.Body
        },
        LayerName: 'sample-layer',
        CompatibleRuntimes: ['nodejs12.x', 'nodejs14.x']
    })
    
    // #4 Report the status of the operation back to CodePipeline
    return codepipeline.putJobSuccessResult(..);
}
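
The omitted permission handling and error reporting could look like the following sketch, which continues the simplified handler above. Because permissions apply per layer version, the configured layerPrincipal has to be granted access after every publish, and any error is reported back to CodePipeline so the execution fails visibly. The exact implementation in the full sample differs:

// Sketch only: continues the simplified handler above.
try {
    const version = await lambda.publishLayerVersion({
        Content: { ZipFile: layerZip.Body },
        LayerName: 'sample-layer',
        CompatibleRuntimes: ['nodejs12.x', 'nodejs14.x']
    });

    // Layer permissions are per version, so grant access after each publish.
    await lambda.addLayerVersionPermission({
        LayerName: 'sample-layer',
        VersionNumber: version.Version,
        StatementId: 'share-with-configured-principal',
        Action: 'lambda:GetLayerVersion',
        Principal: layerPrincipal, // e.g. an AWS account ID, or '*' for public access
    });

    await codepipeline.putJobSuccessResult({ jobId: event['CodePipeline.job'].id });
} catch (err) {
    await codepipeline.putJobFailureResult({
        jobId: event['CodePipeline.job'].id,
        failureDetails: { type: 'JobFailed', message: String(err) },
    });
}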

After each Lambda function completes successfully, the pipeline ends. In a production application, you would likely have additional steps after publishing. For example, you might send notifications via Amazon SNS. To learn more about other possible integrations, read Working with pipelines in CodePipeline.

Pipeline output

Testing the workflow

With this automation, you can release a new version of the Lambda layer by changing package.json in the source repository.

Add the AWS X-Ray SDK for Node.js as a dependency to your project, by making the following changes to package.json and committing the new version to your main branch:

{
    "name": "lambda-layer",
    "version": "1.0.0",
    "description": "Sample AWS Lambda layer",
    "dependencies": {
        "aws-sdk": "latest",
        "aws-xray-sdk": "latest"
    }
}

After committing the new version to the repository, the pipeline is triggered again. After a while, you see that an updated version of the Lambda layer is published to all Regions:

Execution history results

Cleaning up

Many services in this blog post are available in the AWS Free Tier. However, using this solution may incur costs, and you should tear down the stack if you don’t need it anymore. Cleanup steps are included in the README of the repository.

Conclusion

This blog post shows how to create a centralized pipeline to build and distribute Lambda layers consistently across multiple Regions. The pipeline is configurable and allows you to adapt the Regions and permissions according to your use case.

For more serverless learning resources, visit Serverless Land.

Apply CI/CD DevOps principles to Amazon Redshift development

Post Syndicated from Ashok Srirama original https://aws.amazon.com/blogs/big-data/apply-ci-cd-devops-principles-to-amazon-redshift-development/

CI/CD in the context of application development is a well-understood topic, and developers can choose from numerous patterns and tools to build their pipelines to handle the build, test, and deploy cycle when a new commit gets into version control. For stored procedures or even schema changes that are directly related to the application, this is typically part of the code base and is included in the code repository of the application. These changes are then applied when the application gets deployed to the test or prod environment.

This post demonstrates how you can apply the same set of approaches to stored procedures, and even schema changes to data warehouses like Amazon Redshift.

Stored procedures are considered code and as such should undergo the same rigor as application code. This means that the pipeline should involve running tests against changes to make sure that no regressions are introduced to the production environment. Because we automate the deployment of both stored procedures and schema changes, this significantly reduces inconsistencies between environments.

Solution overview

The following diagram illustrates our solution architecture. We use AWS CodeCommit to store our code, AWS CodeBuild to run the build process and test environment, and AWS CodePipeline to orchestrate the overall deployment, from source, to test, to production.

Database migrations and tests also require connection information to the relevant Amazon Redshift cluster; we demonstrate how to integrate this securely using AWS Secrets Manager.

We discuss each service component in more detail later in the post.

You can see how all these components work together by completing the following steps:

  1. Clone the GitHub repo.
  2. Deploy the AWS CloudFormation template.
  3. Push code to the CodeCommit repository.
  4. Run the CI/CD pipeline.

Clone the GitHub repository

The CloudFormation template and the source code for the example application are available in the GitHub repo. Before you get started, you need to clone the repository using the following command:

git clone https://github.com/aws-samples/amazon-redshift-devops-blog

This creates a new folder, amazon-redshift-devops-blog, with the files inside.

Deploy the CloudFormation template

The CloudFormation stack creates the VPC, Amazon Redshift clusters, CodeCommit repository, CodeBuild projects for both test and prod, and the pipeline using CodePipeline to orchestrate the change release process.

  1. On the AWS CloudFormation console, choose Create stack.
  2. Choose With new resources (standard).
  3. Select Upload a template file.
  4. Choose Choose file and locate the template file (<cloned_directory>/cloudformation_template.yml).
  5. Choose Next.
  6. For Stack name, enter a name.
  7. In the Parameters section, provide the primary user name and password for both the test and prod Amazon Redshift clusters.

The username must be 1–128 alphanumeric characters, and it can’t be a reserved word.

The password has the following criteria:

  • Must be 8–64 characters in length
  • Must contain at least one uppercase letter
  • Must contain at least one lowercase letter
  • Must contain at least one number
  • Can only contain ASCII characters (ASCII codes 33–126), except ' (single quotation mark), " (double quotation mark), /, \, or @

Please note that production credentials could be created separately by privileged admins, and you could pass in the ARN of a pre-existing secret instead of the actual password if you so choose.

  1. Choose Next.
  2. Leave the remaining settings at their default and choose Next.
  3. Select I acknowledge that AWS CloudFormation might create IAM resources.
  4. Choose Create stack.

You can choose the refresh icon on the stack’s Events page to track the progress of the stack creation.

Push code to the CodeCommit repository

When stack creation is complete, go to the CodeCommit console. Locate the redshift-devops-repo repository that the stack created. Choose the repository to view its details.

Before you can push any code into this repo, you have to set up your Git credentials using the instructions in Setup for HTTPS users using Git credentials. At Step 4 of those instructions, copy the HTTPS URL, and instead of cloning, add the CodeCommit repo as a remote of the code that we cloned earlier:

git remote add codecommit <repo_https_url> 
git push codecommit main

The last step populates the repository; you can check it by refreshing the CodeCommit console. If you get prompted for a user name and password, enter the Git credentials that you generated and downloaded from Step 3 of Setup for HTTPS users using Git credentials.

Run the CI/CD pipeline

After you push the code to the CodeCommit repository, this triggers the pipeline to deploy the code into both the test and prod Amazon Redshift clusters. You can monitor the progress on the CodePipeline console.

To dive deeper into the progress of the build, choose Details.

You’re redirected to the CodeBuild console, where you can see the run logs as well as the result of the test.

Components and dependencies

Although from a high-level perspective the test and prod environment look the same, there are some nuances with regards to how these environments are configured. Before diving deeper into the code, let’s look at the components first:

  • CodeCommit – This is the version control system where you store your code.
  • CodeBuild – This service runs the build process and test using Maven.
    • Build – During the build process, Maven uses Flyway to connect to Amazon Redshift to determine the current version of the schema and what needs to be run to bring it up to the latest version.
    • Test – In the test environment, Maven runs JUnit tests against the test Amazon Redshift cluster. These tests may involve loading data and testing the behavior of the stored procedures. The results of the unit tests are published into the CodeBuild test reports.
  • Secrets Manager – We use Secrets Manager to securely store connection information to the various Amazon Redshift clusters. This includes host name, port, user name, password, and database name. CodeBuild refers to Secrets Manager for the relevant connection information when a build gets triggered. The underlying CodeBuild service role needs to have the corresponding permission to access the relevant secrets.
  • CodePipeline – CodePipeline is responsible for the overall orchestration from source to test to production.

As referenced in the components, we also use some additional dependencies at the code level:

  • Flyway – This framework is responsible for keeping different Amazon Redshift clusters in different environments in sync as far as schema and stored procedures are concerned.
  • JUnit – Unit testing framework written in Java.
  • Apache Maven – A dependency management and build tool. Maven is where we integrate Flyway and JUnit.

In the following sections, we dive deeper into how these dependencies are integrated.

Apache Maven

For Maven, the configuration file is pom.xml. For an example, you can check out the pom file from our demo app. The pertinent part of the xml is the build section:

<build>
        <plugins>
            <plugin>
                <groupId>org.flywaydb</groupId>
                <artifactId>flyway-maven-plugin</artifactId>
                <version>${flyway.version}</version>
                <executions>
                    <execution>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>migrate</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${surefire.version}</version>
            </plugin>
        </plugins>
    </build>

This section describes two things:

  • By default, the Surefire plugin triggers during the test phase of Maven. The plugin runs the unit tests and generates reports based on the results of those tests. These reports are stored in the target/surefire-reports folder. We reference this folder in the CodeBuild section.
  • Flyway is triggered during the process-resources phase of Maven, and it triggers the migrate goal of Flyway. Looking at Maven’s lifecycle, this phase is always triggered first and deploys the latest version of stored procedures and schemas to the test environment before running test cases.

Flyway

Changes to the database are called migrations, and these can be either versioned or repeatable. Developers indicate the migration type through the file naming convention that Flyway uses to determine which one is which. The following diagram illustrates the naming convention.

A versioned migration consists of the regular SQL script that is run and an optional undo SQL script to reverse the specific version. You have to create this undo script in order to enable the undo functionality for a specific version. For example, a regular SQL script consists of creating a new table, and the corresponding undo script consists of dropping that table. Flyway is responsible for keeping track of which version a database is currently at, and runs N number of migrations depending on how far back the target database is compared to the latest version. Versioned migrations are the most common use of Flyway and are primarily used to maintain table schema and keep reference or lookup tables up to date by running data loads or updates via SQL statements. Versioned migrations are applied in order exactly one time.

Repeatable migrations don’t have a version; instead they’re rerun every time their checksum changes. They’re useful for maintaining user-defined functions and stored procedures. Instead of having multiple files to track changes over time, we can just use a single file and Flyway keeps track of when to rerun the statement to keep the target database up to date.

By default, these migration files are located in the classpath under db/migration, the full path being src/main/resources/db/migration. For our example application, you can find the source code on GitHub.

JUnit

When Flyway finishes running the migrations, the test cases are run. These test cases are under the folder src/test/java. You can find examples on GitHub that run a stored procedure via JDBC and validate the output or the impact.

Another aspect of unit testing to consider is how the test data is loaded and maintained in the test Amazon Redshift cluster. There are a couple of approaches to consider:

  • As per our example, we’re packaging the test data as part of our version control and loading the data when the first unit test is run. The advantage of this approach is that you get flexibility over when and where you run the test cases. You can start with either a completely empty or partially populated test cluster and still end up with the right environment for the test case to run. Other advantages are that you can test data loading queries and have more granular control over the datasets that are being loaded for each test. The downside of this approach is that, depending on how big your test data is, it may add additional time for your test cases to complete.
  • Using an Amazon Redshift snapshot dedicated to the test environment is another way to manage the test data. With this approach, you have a couple more options:
    • Transient cluster – You can provision a transient Amazon Redshift cluster based on the snapshot when the CI/CD pipeline gets triggered. This cluster stops after the pipeline completes to save cost. The downside of this approach is that you have to factor in Amazon Redshift provisioning time in your end-to-end runtime.
    • Long-running cluster – Your test cases can connect to an existing cluster that is dedicated to running test cases. The test cases are responsible for making sure that data-related setup and teardown are done accordingly depending on the nature of the test that’s running. You can use @BeforeAll and @AfterAll JUnit annotations to trigger the setup and teardown, respectively.

CodeBuild

CodeBuild provides an environment where all of these dependencies run. As shown in our architecture diagram, we use CodeBuild for both test and prod. The differences are in the actual commands that run in each of those environments. These commands are stored in the buildspec.yml file. In our example, we provide a separate buildspec file for test and a different one for prod. During the creation of a CodeBuild project, we can specify which buildspec file to use.

There are a few differences between the test and prod CodeBuild project, which we discuss in the following sections.

Buildspec commands

In the test environment, we use mvn clean test and package the Surefire reports so the test results can be displayed via the CodeBuild console. In the prod environment, we just run mvn clean process-resources. This is because in the prod environment we only need to run the Flyway migrations, which are hooked up to the process-resources Maven lifecycle phase, whereas in the test environment we not only run the Flyway migrations, but also make sure that they didn’t introduce any regressions by running test cases. These test cases might have an impact on the underlying data, which is why we don’t run them against the production Amazon Redshift cluster. If you want to run the test cases against production data, you can use an Amazon Redshift production cluster snapshot and run the test cases against that.

Secrets via Secrets Manager

Both Flyway and JUnit need information to identify and connect to Amazon Redshift. We store this information in Secrets Manager. Using Secrets Manager has several benefits:

  • Secrets are encrypted automatically
  • Access to secrets can be tightly controlled via fine-grained AWS Identity and Access Management (IAM) policies
  • All activity with secrets is recorded, which enables easy auditing and monitoring
  • You can rotate secrets securely and safely without impacting applications

For our example application, we define the secret as follows:

{
  "username": "<Redshift username>",
  "password": "<Redshift password>",
  "host": "<Redshift hostname>",
  "port": <Redshift port>,
  "dbName": "<Redshift DB Name>"
}

CodeBuild is integrated with Secrets Manager, so we define the following environment variables as part of the CodeBuild project:

  • TEST_HOST: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:host
  • TEST_JDBC_USER: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:username
  • TEST_JDBC_PASSWORD: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:password
  • TEST_PORT: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:port
  • TEST_DB_NAME: arn:aws:secretsmanager:<region>:<AWS Account Id>:secret:<secret name>:dbName
  • TEST_REDSHIFT_IAM_ROLE: <ARN of IAM role> (This can be in plaintext and should be attached to the Amazon Redshift cluster)
  • TEST_DATA_S3_BUCKET: <bucket name> (This is where the test data is staged)

CodeBuild automatically retrieves the parameters from Secrets Manager and they’re available in the application as environment variables. If you look at the buildspec_prod.yml example, we use the preceding variables to populate the Flyway environment variables and JDBC connection URL.
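
The sample configures these variables in the CloudFormation template. For reference only, the same mapping expressed with the AWS CDK in TypeScript might look like the following sketch; the ARNs and names are placeholders:

import * as codebuild from 'aws-cdk-lib/aws-codebuild';

// Each TEST_* variable points at a JSON key inside the Secrets Manager secret,
// so CodeBuild resolves the actual value at build time.
const testEnvironmentVariables: { [name: string]: codebuild.BuildEnvironmentVariable } = {
  TEST_JDBC_USER: {
    type: codebuild.BuildEnvironmentVariableType.SECRETS_MANAGER,
    value: 'arn:aws:secretsmanager:<region>:<account-id>:secret:<secret name>:username',
  },
  TEST_JDBC_PASSWORD: {
    type: codebuild.BuildEnvironmentVariableType.SECRETS_MANAGER,
    value: 'arn:aws:secretsmanager:<region>:<account-id>:secret:<secret name>:password',
  },
  TEST_REDSHIFT_IAM_ROLE: {
    // Plaintext is the default variable type; no secret reference is needed here.
    value: 'arn:aws:iam::<account-id>:role/<redshift-role>',
  },
};
// Passed to the CodeBuild project as `environmentVariables: testEnvironmentVariables`.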

VPC configuration

For CodeBuild to be able to connect to Amazon Redshift, you need to configure which VPC it runs in. This includes the subnets and security group that it uses. The Amazon Redshift cluster’s security group also needs to allow access from the CodeBuild security group.
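
If you were to express this VPC configuration with the AWS CDK rather than the provided CloudFormation template, a sketch could look like the following. The VPC, subnets, and the Redshift cluster’s security group are assumed to exist already, and the code is assumed to run inside a CDK stack:

import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as codebuild from 'aws-cdk-lib/aws-codebuild';

declare const vpc: ec2.IVpc;
declare const redshiftSecurityGroup: ec2.ISecurityGroup;

const codeBuildSecurityGroup = new ec2.SecurityGroup(this, 'CodeBuildSg', { vpc });

// Run the build in private subnets so that it can reach the cluster endpoint.
new codebuild.Project(this, 'redshift-test-build', {
  vpc,
  subnetSelection: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
  securityGroups: [codeBuildSecurityGroup],
  buildSpec: codebuild.BuildSpec.fromSourceFilename('buildspec_test.yml'),
});

// Allow inbound connections from CodeBuild on the Redshift port.
redshiftSecurityGroup.addIngressRule(codeBuildSecurityGroup, ec2.Port.tcp(5439));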

CodePipeline

To bring all these components together, we use CodePipeline to orchestrate the flow from the source code through prod deployment. CodePipeline also has additional capabilities. For example, you can add an approval step between test and prod so a release manager can review the results of the tests before releasing the changes to production.

Example scenario

You can use tests as a form of documentation of the expected behavior of a function. To further illustrate this point, let’s look at a simple stored procedure:

create or replace procedure merge_staged_products()
as $$
BEGIN
    update products set status='CLOSED' where product_name in (select product_name from products_staging) and status='ACTIVE';
    insert into products(product_name,price) select product_name, price from products_staging;
END;
$$ LANGUAGE plpgsql;

If you deployed the example app from the previous section, you can follow along by copying the stored procedure code and pasting it in src/main/resources/db/migration/R__MergeStagedProducts.sql. Save it and push the change to the CodeCommit repository by issuing the following commands (assuming that you’re at the top of the project folder):

git add src
git commit -m "<commit message>"
git push codecommit main

After you push the changes to the CodeCommit repository, you can follow the progress of the build and test stages on the CodePipeline console.

We implement a basic Slowly Changing Dimension Type 2 approach in which we mark old data as CLOSED and append newer versions of the data. Although the stored procedure works as is, our test has the following expectations:

  • The number of rows with CLOSED status in the products table needs to correspond to the number of duplicate entries in the staging table.
  • The products table has a close_date column that needs to be populated so we know when a row was deprecated.
  • At the end of the merge, the staging table needs to be cleared for subsequent ETL runs.

The stored procedure will pass the first test, but fail the later tests. When we push this change to CodeCommit and the CI/CD process runs, we can see results like those in the following screenshot.

The tests show that the second and third tests failed. Failed tests result in the pipeline stopping, which means these bad changes don’t end up in production.

We can update the stored procedure and push the change to CodeCommit to trigger the pipeline again. The updated stored procedure is as follows:

create or replace procedure merge_staged_products()
as $$
BEGIN
    update products set status='CLOSED', close_date=CURRENT_DATE where product_name in (select product_name from products_staging) and status='ACTIVE';
    insert into products(product_name,price) select product_name, price from products_staging;
    truncate products_staging;
END;
$$ LANGUAGE plpgsql; 

All the tests passed this time, which allows CodePipeline to proceed with deployment to production.

We used Flyway’s repeatable migrations to make the changes to the stored procedure. Code is stored in a single file and Flyway verifies the checksum of the file to detect any changes and reapplies the migration if the checksum is different from the one that’s already deployed.

Clean up

After you’re done, it’s crucial to tear down the environment to avoid incurring additional charges beyond your testing. Before you delete the CloudFormation stack, go to the Resources tab of your stack and make sure the two buckets that were provisioned are empty. If they’re not empty, delete all the contents of the buckets.

Now that the buckets are empty, you can go back to the AWS CloudFormation console and delete the stack to complete the cleanup of all the provisioned resources.

Conclusion

Using CI/CD principles in the context of Amazon Redshift stored procedures and schema changes greatly improves confidence when updates are getting deployed to production environments. Similar to CI/CD in application development, proper test coverage of stored procedures is paramount to capturing potential regressions when changes are made. This includes testing both success paths as well as all possible failure modes.

In addition, versioning migrations enables consistency across multiple environments and prevents issues arising from schema changes that aren’t applied properly. This increases confidence when changes are being made and improves development velocity as teams spend more time developing functionality rather than hunting for issues due to environment inconsistencies.

We encourage you to try building a CI/CD pipeline for Amazon Redshift using the steps described in this post.


About the Authors

Ashok Srirama is a Senior Solutions Architect at Amazon Web Services, based in Washington Crossing, PA. He specializes in serverless applications, containers, devops, and architecting distributed systems. When he’s not spending time with his family, he enjoys watching cricket, and driving his bimmer.

Indira Balakrishnan is a Senior Solutions Architect in the AWS Analytics Specialist SA Team. She is passionate about helping customers build cloud-based analytics solutions to solve their business problems using data-driven decisions. Outside of work, she volunteers at her kids’ activities and spends time with her family.

Vaibhav Agrawal is an Analytics Specialist Solutions Architect at AWS. Throughout his career, he has focused on helping customers design and build well-architected analytics and decision support platforms.

Rajesh Francis is a Sr. Analytics Customer Experience Specialist at AWS. He specializes in Amazon Redshift and works with customers to build scalable Analytic solutions.

Jeetesh Srivastva is a Sr. Manager, Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and works with customers to implement scalable solutions using Amazon Redshift and other AWS Analytic services. He has worked to deliver on-premises and cloud-based analytic solutions for customers in banking and finance and hospitality industry verticals.

Parallel and dynamic SaaS deployments with AWS CDK Pipelines

Post Syndicated from Jani Muuriaisniemi original https://aws.amazon.com/blogs/devops/parallel-and-dynamic-saas-deployments-with-cdk-pipelines/

Software as a Service (SaaS) is an increasingly popular business model for independent software vendors (ISVs), including benefits such as a pay-as-you-go pricing model, scalability, and availability.

SaaS services can be built by using numerous architectural models. The silo model provides each tenant with dedicated resources and a shared-nothing architecture. Silo deployments also provide isolation between tenants’ compute resources and their data, and they help eliminate the noisy-neighbor problem. On the other hand, the pool model offers several benefits, such as lower maintenance overhead, simplified management and operations, and cost-saving opportunities, all due to a more efficient utilization of computing resources and capacity. In the bridge model, both silo and pool models are utilized side-by-side. The bridge model is a hybrid model, where parts of the system can be in a silo model, and parts in a pool.

End-customers benefit from SaaS delivery in numerous ways. For example, the service can be available from multiple locations, letting the customer choose what is best for them. The tenant onboarding process is often real-time and frictionless. To realize these benefits for their end-customers, SaaS providers need methods for reliable, fast, and multi-region capable provisioning and software lifecycle management.

This post will describe a deployment system for automating the provision and lifecycle management of workload components in pool or silo deployment models by using AWS Cloud Development Kit (AWS CDK) and CDK Pipelines. We will explore the system’s dynamic and database driven deployment model, as well as its multi-account and multi-region capabilities, and we will provision demo deployments of workload components in both the silo and pool models.

AWS Cloud Development Kit and CDK Pipelines

For this solution, we utilized AWS Cloud Development Kit (AWS CDK) and its CDK Pipelines construct library. AWS CDK is an open-source software development framework for modeling and provisioning cloud application resources by using familiar programming languages. AWS CDK lets you define your infrastructure as code and provision it through AWS CloudFormation.

CDK Pipelines is a high-level construct library with an opinionated implementation of a continuous deployment pipeline for your CDK applications. It is powered by AWS CodePipeline, a fully managed continuous delivery service that helps automate your release pipelines for fast and reliable application as well as infrastructure updates. No servers need to be provisioned or setup, and you only pay for what you use. This solution utilizes the recently released and stable CDK Pipelines modern API.

Business Scenario

As a baseline use case, we consider a fictitious ISV called Unicorn that wants to implement a SaaS business model.

Unicorn operates in several countries, and requires the storing of customer data within the customers’ chosen region. Currently, Unicorn needs two regions in order to satisfy its main customer base: one in the EU and one in the US. Unicorn expects rapid growth, and it needs a solution that can scale to thousands of tenants. Unicorn plans to have different tenant tiers with different isolation requirements. Their planned deployment model has the majority of tenants in shared pool instances, but they also plan to support dedicated silo instances for the tenants requiring it. The solution must also be easily extendable to new Regions as Unicorn’s business expands.

Unicorn is starting small with just a single development team responsible for currently the only component in their SaaS workload architecture. Following industry best practices, Unicorn has designed its workload architecture so that each component has a clear technical ownership boundary. The chosen solution must grow together with Unicorn, and support multiple independently developed and deployed components in the future.

Solution Overview

Today, many customers utilize AWS CodePipeline to build, test, and deploy their cloud applications. For a SaaS provider such as Unicorn, utilizing a single pipeline to manage every deployment raises concerns. At the scale that Unicorn requires, a single pipeline with potentially hundreds of actions runs the risk of becoming throughput limited. Moreover, a single pipeline would offer Unicorn limited control over how changes are released.

Our solution addresses this problem by having a separate dynamically provisioned pipeline for each pool and silo deployment. The solution is designed to manage multiple deployments of Unicorn’s single workload component, thereby aligning with their current needs and, with small changes, their future needs as well.

CDK Best Practices state that an AWS CDK application maps to a component as defined by the AWS Well-Architected Framework. A component is the code, configuration, and AWS Resources that together deliver against a workload requirement. And this is typically the unit of technical ownership. A component usually includes logical units (e.g., api, database), and can have a continuous deployment pipeline.

Utilizing CDK Pipelines provides a significant benefit: with no additional code, we can deploy cross-account and cross-region just as easily as we would to a single account and region. CDK Pipelines automatically creates and manages the required cross-account encryption keys and cross-region replication buckets. Furthermore, we only need to establish a trust relationship between the accounts during the CDK bootstrapping process.

The following diagram illustrates the solution architecture:

Solution Architecture Diagram

Figure 1: Solution architecture

Let’s look closer at the two primary high level solution flows: silo and pool pipeline provisioning (1 and 2), and component code deployment (3 and 4).

Provisioning is separated into a dedicated flow, so that code deployments do not interfere with tenant onboarding, and vice versa. At the heart of the provisioning flow is the deployment database (1), which is implemented by using an Amazon DynamoDB table.

Utilizing DynamoDB Streams and AWS Lambda Triggers, a new AWS CodeBuild provisioning project build (2) is automatically started after a record is inserted into the deployment database. The provisioning project directly provisions new silo and pool pipelines by using the “cdk deploy” command. Provisioning events are processed in parallel, so that the solution can handle possible bursts in Unicorn’s tenant onboarding volumes.
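
A sketch of such a stream-triggered function is shown below. The environment variable holding the provisioning project name and the variable names passed to the build are assumptions; the sample repository contains the actual implementation:

import { DynamoDBStreamEvent } from 'aws-lambda';
import { CodeBuild } from '@aws-sdk/client-codebuild';
import { unmarshall } from '@aws-sdk/util-dynamodb';

const codebuild = new CodeBuild({});

export async function handler(event: DynamoDBStreamEvent): Promise<void> {
  // Start one provisioning build per newly inserted deployment record.
  for (const record of event.Records) {
    if (record.eventName !== 'INSERT' || !record.dynamodb?.NewImage) {
      continue;
    }
    const deployment = unmarshall(record.dynamodb.NewImage as any);

    await codebuild.startBuild({
      projectName: process.env.PROVISIONING_PROJECT_NAME!, // assumed environment variable
      environmentVariablesOverride: [
        { name: 'DEPLOYMENT_ID', value: deployment.id, type: 'PLAINTEXT' },
        { name: 'DEPLOYMENT_TYPE', value: deployment.type, type: 'PLAINTEXT' },
        { name: 'DEPLOYMENT_ACCOUNT', value: deployment.account, type: 'PLAINTEXT' },
        { name: 'DEPLOYMENT_REGION', value: deployment.region, type: 'PLAINTEXT' },
      ],
    });
  }
}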

CDK best practices suggest that infrastructure and runtime code live in the same package. A single AWS CodeCommit repository (3) contains everything needed: the CI/CD pipeline definitions as well as the workload component code. This repository is the source artifact for every CodePipeline pipeline and CodeBuild project. The chapter “Managing application resources as code” describes related implementation details.

The CI/CD pipeline (4) is a CDK Pipelines pipeline, and it is responsible for the component’s Software Development Life Cycle (SDLC) activities. In addition to implementing the update release process, it is expected that most SaaS providers will also implement additional activities. This includes a variety of tests and pre-production environment deployments. The chapter “Controlling deployment updates” dives deeper into this topic.

Deployments have two parts: The pipeline (5) and the component resource stack(s) (6) that it manages. The pipelines are deployed to the central toolchain account and region, whereas the component resources are deployed to the AWS Account and Region, as specified in the deployments’ record in the deployment database.
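
Conceptually, each provisioned silo or pool pipeline adds a stage whose target environment comes from its deployment record, which is what enables the cross-account and cross-region deployments. The following simplified TypeScript sketch illustrates the idea; construct names are illustrative and the component stack is omitted:

import { Stage, StageProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { CodePipeline } from 'aws-cdk-lib/pipelines';

// A Stage wraps the component resource stack(s) of one silo or pool deployment.
class WorkloadStage extends Stage {
  constructor(scope: Construct, id: string, props: StageProps) {
    super(scope, id, props);
    // new WorkloadStack(this, 'resources'); // component resources go here
  }
}

declare const pipeline: CodePipeline;   // the silo/pool CDK Pipelines pipeline
declare const pipelineStack: Construct; // the stack that defines the pipeline
declare const deployment: { id: string; account: string; region: string };

// The deployment record decides where the component resources land.
pipeline.addStage(new WorkloadStage(pipelineStack, `${deployment.id}-resources`, {
  env: { account: deployment.account, region: deployment.region },
}));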

Sample code for the solution is available in GitHub. The sample code is intended for utilization in conjunction with this post. Our solution is implemented in TypeScript.

Deployment Database

Our deployment database is an Amazon DynamoDB table, with the following structure:

Table structure explained in post.

Figure 2: DynamoDB table

  • ‘id’ is a unique identifier for each deployment.
  • ‘account’ is the AWS account ID for the component resources.
  • ‘region’ is the AWS region ID for the component resources.
  • ‘type’ is either ‘silo’ or ‘pool’, which defines the deployment model.

This design supports tenant deployment to multiple silo and pool deployments. Each of these can target any available and bootstrapped AWS Account and Region. For example, different pools can support tenants in different regions, with select tenants deployed to dedicated silos. As pools may be limited in how many tenants they can serve, the design also supports having multiple pools within a region, and it can easily be extended with an additional attribute to support the tiers concept.
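
A minimal CDK sketch of such a table, with streams enabled so that inserts can start the provisioning flow, could look like this. The sample repository contains the authoritative definition; DynamoDB is schemaless, so only the key attribute needs to be declared:

import { aws_dynamodb as dynamodb } from 'aws-cdk-lib';

const deploymentTable = new dynamodb.Table(this, 'unicorn-deployments', {
  partitionKey: { name: 'id', type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
  // New records on the stream trigger the provisioning flow.
  stream: dynamodb.StreamViewType.NEW_IMAGE,
});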

Note that the deployment database does not contain tenant information. It is expected that such mapping is maintained in a separate tenant database, where each tenant record can map to the ID of the deployment that it is associated with.

Now that we have looked at our solution design and architecture, let’s move to the hands-on section, starting with the deployment requirements for the solution.

Prerequisites

The following tools are required to deploy the solution:

To follow this tutorial completely, you should have administrator access to at least one, but preferably two AWS accounts:

  • Toolchain: Account for the SDLC toolchain: the pipelines, the provisioning project, the repository, and the deployment database.
  • Workload (optional): Account for the component resources.

If you have only a single account, then the toolchain account can be used for both purposes. Credentials for the account(s) are assumed to be configured in AWS CLI profile(s).

The instructions in this post use the following placeholders, which you must replace with your specific values:

  • <TOOLCHAIN_ACCOUNT_ID>: The AWS Account ID for the toolchain account
  • <TOOLCHAIN_PROFILE_NAME>: The AWS CLI profile name for the toolchain account credentials
  • <WORKLOAD_ACCOUNT_ID>: The AWS Account ID for the workload account
  • <WORKLOAD_PROFILE_NAME>: The AWS CLI profile name for the workload account credentials

Bootstrapping

The toolchain account, and all workload account(s), must be bootstrapped prior to first-time deployment.

AWS CDK and our solution’s dependencies must be installed to start with. The easiest way to do this is to install them locally with npm. First, we need to download our sample code, so that we have the package.json configuration file available for npm.

Note that throughout these instructions, many commands are broken over multiple lines for readability. Take care to execute the commands completely. It is always safe to execute each code block as a whole.

Clone the sample code repository from GitHub, and then install the dependencies by using npm:

git clone https://github.com/aws-samples/aws-saas-parallel-deployments
cd aws-saas-parallel-deployments
npm ci 

CDK Pipelines requires use of modern bootstrapping. To ensure that this is enabled, start by setting the related environment variable:

export CDK_NEW_BOOTSTRAP=1

Then, bootstrap the toolchain account. You must bootstrap both the region where the toolchain stack is deployed, as well as every target region for component resources. Here, we will first bootstrap only the us-east-1 region, and later you can optionally bootstrap additional region(s).

To bootstrap, we use npx to execute the locally installed version of AWS CDK:

npx cdk bootstrap <TOOLCHAIN_ACCOUNT_ID>/us-east-1 --profile <TOOLCHAIN_PROFILE_NAME>

If you have a workload account that is separate from the toolchain account, then that account must also be bootstrapped. When bootstrapping the workload account, we will establish a trust relationship with the toolchain account. Skip this step if you don’t have a separate workload account.

The workload account bootstrapping follows the security best practice of least privilege. First, create an execution policy with the minimum permissions required to deploy our demo component resources. We provide a sample policy file in the solution repository for this purpose. Then, use that policy as the execution policy for the trust relationship between the toolchain account and the workload account:

aws iam create-policy \
  --profile <WORKLOAD_PROFILE_NAME> \
  --policy-name CDK-Exec-Policy \
  --policy-document file://policies/workload-cdk-exec-policy.json
npx cdk bootstrap <WORKLOAD_ACCOUNT_ID>/us-east-1 \
  --profile <WORKLOAD_PROFILE_NAME> \
  --trust <TOOLCHAIN_ACCOUNT_ID> \
  --cloudformation-execution-policies arn:aws:iam::<WORKLOAD_ACCOUNT_ID>:policy/CDK-Exec-Policy

Toolchain deployment

Prior to being able to deploy for the first time, you must create an AWS CodeCommit repository for the solution. Create this repository in the toolchain account:

aws codecommit create-repository \
  --profile <TOOLCHAIN_PROFILE_NAME> \
  --region us-east-1 \
  --repository-name unicorn-repository

Next, you must push the contents to the CodeCommit repository. For this, use the git command together with the git-remote-codecommit extension in order to authenticate to the repository with your AWS CLI credentials. Our pipelines are configured to use the main branch.

git remote add unicorn codecommit::us-east-1://<TOOLCHAIN_PROFILE_NAME>@unicorn-repository
git push unicorn main

Now we are ready to deploy the toolchain stack:

export AWS_REGION=us-east-1
npx cdk deploy --profile <TOOLCHAIN_PROFILE_NAME>

Workload deployments

At this point, our CI/CD pipeline, provisioning project, and deployment database have been created. The database is initially empty.

Note that the DynamoDB command line interface demonstrated below is not intended to be the SaaS provider’s provisioning interface for production use. SaaS providers typically have online registration portals, wherein the customer signs up for the service. When new deployments are needed, a record should automatically be inserted into the solution’s deployment database.

To demonstrate the solution’s capabilities, first we will provision two deployments, with an optional third cross-region deployment:

  1. A silo deployment (silo1) in the us-east-1 region.
  2. A pool deployment (pool1) in the us-east-1 region.
  3. A pool deployment (pool2) in the eu-west-1 region (optional).

To start, configure the AWS CLI environment variables:

export AWS_REGION=us-east-1
export AWS_PROFILE=<TOOLCHAIN_PROFILE_NAME>

Add the deployment database records for the first two deployments:

aws dynamodb put-item \
  --table-name unicorn-deployments \
  --item '{
    "id": {"S":"silo1"},
    "type": {"S":"silo"},
    "account": {"S":"<WORKLOAD_ACCOUNT_ID>"},
    "region": {"S":"us-east-1"}
  }'
aws dynamodb put-item \
  --table-name unicorn-deployments \
  --item '{
    "id": {"S":"pool1"},
    "type": {"S":"pool"},
    "account": {"S":"<WORKLOAD_ACCOUNT_ID>"},
    "region": {"S":"us-east-1"}
  }'

This will trigger two parallel builds of the provisioning CodeBuild project. Use the CodeBuild Console in order to observe the status and progress of each build.

Cross-region deployment (optional)

Optionally, also try a cross-region deployment. Skip this part if a cross-region deployment is not relevant for your use case.

First, you must bootstrap the target region in the toolchain and the workload accounts. Bootstrapping of eu-west-1 here is identical to the bootstrapping of the us-east-1 region earlier. First bootstrap the toolchain account:

npx cdk bootstrap <TOOLCHAIN_ACCOUNT_ID>/eu-west-1 --profile <TOOLCHAIN_PROFILE_NAME>

If you have a separate workload account, then we must also bootstrap it for the new region. Again, please skip this if you have only a single account:

npx cdk bootstrap <WORKLOAD_ACCOUNT_ID>/eu-west-1 \
  --profile <WORKLOAD_PROFILE_NAME> \
  --trust <TOOLCHAIN_ACCOUNT_ID> \
  --cloudformation-execution-policies arn:aws:iam::<WORKLOAD_ACCOUNT_ID>:policy/CDK-Exec-Policy

Then, add the cross-region deployment:

aws dynamodb put-item \
  --table-name unicorn-deployments \
  --item '{
    "id": {"S":"pool2"},
    "type": {"S":"pool"},
    "account": {"S":"<WORKLOAD_ACCOUNT_ID>"},
    "region": {"S":"eu-west-1"}
  }'

Validation of deployments

After the builds have completed, use the CodePipeline console to verify that the deployment pipelines were successfully created in the toolchain account:

CodePipeline console showing Pool-pool2-pipeline, Pool-pool1-pipeline and Silo-silo1-pipeline all Succeeded most recent execution.

Figure 3: CodePipeline console

Similarly, in the workload account, stacks containing your component resources will have been deployed to each configured region for the deployments. In this demo, we are deploying a single “hello world” container application utilizing AWS App Runner as the runtime environment. Successful deployment can be verified by using the CloudFormation console:

Console showing Pool-pool1-resources with status of CREATE_COMPLETE

Figure 4: CloudFormation console

Now that we have successfully finished with our demo deployments, let’s look at how updates to the pipelines and the component resources can be managed.

Managing application resources as code

As highlighted earlier in the Solution Overview, every aspect of our solution shares a single source repository. With all of our code in a single source, we can easily deliver complex changes impacting multiple aspects of our solution. And all of this can be packaged, tested, and released as a single change set. For example, a change can introduce a new stage to the CI/CD pipeline, modify an existing stage in the silo and pool pipelines, and/or make code and resource changes to the component resources.

Managing the pipeline definitions is made simple by the self-mutate capability of the CDK Pipelines. Once initially deployed, each CDK Pipelines pipeline can update its own definition. This is implemented by using a separate SelfMutate stage in the pipeline definition. This stage is executed before any deployment actions, thereby ensuring that the pipeline always executes the latest version that is defined by the source code.

Managing how and when the pipelines trigger to execute also requires attention. By default, CDK Pipelines configures pipelines to start automatically on changes in the source repository. While this is a reasonable default, and it is great for the CI/CD pipeline, it is undesired for our silo and pool pipelines. If all of these pipelines executed automatically on code commits to the source repository, the CI/CD pipeline could not manage the release flow. To address this, we have configured the silo and pool pipelines with the trigger in the CodeCommitSourceOptions set to NONE.
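
With the modern CDK Pipelines API, that configuration can be sketched as follows; the repository construct is assumed to be defined elsewhere:

import { CodePipelineSource } from 'aws-cdk-lib/pipelines';
import { CodeCommitTrigger } from 'aws-cdk-lib/aws-codepipeline-actions';
import * as codecommit from 'aws-cdk-lib/aws-codecommit';

declare const repository: codecommit.IRepository;

// Silo and pool pipelines do not start on commits by themselves;
// the CI/CD pipeline's release stage decides when they execute.
const input = CodePipelineSource.codeCommit(repository, 'main', {
  trigger: CodeCommitTrigger.NONE,
});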

Controlling deployment updates

A key aspect of SaaS delivery is controlling how you roll out changes to tenants. Significant business risk can arise if changes are released to all tenants all-at-once in a single big bang.

This risk can be managed by utilizing a combination of silo and pool deployments. Reduce your risk by spreading tenants into multiple pools, and gradually rolling out your changes to these pools. Based on business needs and/or risk assessment, select customers can be provisioned into dedicated silo deployments, thereby allowing update control for those customers separately. Note that while all of a pool’s tenants get the same underlying update simultaneously, you can utilize feature flags to selectively enable new features only for specific tenants in the deployment.

In the demo solution, the CI/CD pipeline contains only a single custom stage “UpdateDeployments”. This CodeBuild action implements a simple “one-at-a-time” strategy. The code has been purposely written so that it is simple and provides you with a starting point to implement your own more complex strategy, as based on your unique business needs. In the default implementation, every silo and pool pipeline tracks the same “main” branch of the repository. Releases are governed by controlling when each pipeline executes to update its resources.

When designing your release strategy, look into how the planned process helps implement releases and changes with high quality and frequency. A typical starting point is a CI/CD pipeline with continuous automated deployments via multiple test and staging environments in order to validate your changes prior to deployment to any production tenants.

Furthermore, consider if utilizing a canary release strategy would help identify potential issues with your changes prior to rolling them out across all deployments in production. In a canary release, each change is first deployed only to a small subset of your deployments. Once you are satisfied with the change quality, then the change can either automatically or manually be released to the rest of your deployments. As an example, an AWS Step Functions state machine could be combined with the solution, and then utilized to control the release flow, execute validation tests, implement approval steps (either manual or automatic), and even conduct rollback if necessary.

Further considerations

The example in this post provisions every silo and pool deployment to a single AWS account. However, the solution is not limited to a single account, and it can deploy equally easily to multiple AWS accounts. When operating at scale, it is best-practice to spread your workloads to several accounts. The Organizing Your AWS Environment using Multiple Accounts whitepaper has in-depth guidance on strategies for spreading your workloads.

If combined with an AWS account-vending machine implementation, such as an AWS Control Tower Landing Zone, then the demo solution could be adapted so that new AWS accounts are provisioned automatically. This would be useful if your business requires full account-level deployment isolation, and you also want automated provisioning.

To meet Unicorn’s future needs for spreading their solution architecture over multiple separate components, the deployment database and associated Lambda function could be decoupled from the rest of the toolchain components in order to provide a central deployment service. When provisioned as a standalone service, and amended with, for example, Amazon Simple Notification Service-based notifications sent to the component deployment systems, this central deployment service could be utilized to manage the deployments of multiple components.

In addition, you should analyze your deployment lifecycle transitions, and then consider what action should be taken when a tenant is disabled and/or deleted. Implementing a deployment archival/deletion process is not in the scope of this post.

Cleanup

To clean up every resource deployed in this post, take the following actions:

  1. In the workload account:
    1. In the us-east-1 Region, delete the CloudFormation stacks named “pool-pool1-resources” and “silo-silo1-resources”, and the CDK bootstrap stack “CDKToolKit”.
    2. In the eu-west-1 Region, delete the CloudFormation stack named “pool-pool2-resources” and the CDK bootstrap stack “CDKToolKit”.
  2. In the toolchain account:
    1. In the us-east-1 Region, delete the CloudFormation stacks “toolchain”, “pool-pool1-pipeline”, “pool-pool2-pipeline”, and “silo-silo1-pipeline”, and the CDK bootstrap stack “CDKToolKit”.
    2. In the eu-west-1 Region, delete the CloudFormation stack “pool-pool2-pipeline-support-eu-west-1” and the CDK bootstrap stack “CDKToolKit”.
    3. Empty and delete the S3 buckets “toolchain-*”, “pool-pool1-pipeline-*”, “pool-pool2-pipeline-*”, and “silo-silo1-pipeline-*”.

Conclusion

This solution demonstrated an implementation of an automated SaaS application component deployment factory. We covered how an ISV venturing into the SaaS model can utilize AWS CDK and CDK Pipelines to avoid much of the undifferentiated heavy lifting by combining AWS CDK’s cross-region and cross-account capabilities with CDK Pipelines’ self-mutating deployment pipelines. Furthermore, we demonstrated how all of this can be written, managed, and released just like any other code you write. We also demonstrated how a single dynamic provisioning system can operate in a mixed mode, with both silo and pool deployments.

Visit the AWS SaaS Factory Program page for further information on how AWS can help you on your SaaS journey — regardless of the stage you are currently in.

About the authors

Jani Muuriaisniemi

Jani is a Principal Solutions Architect at Amazon Web Services based out of Helsinki, Finland. With more than 20 years of industry experience, he works as a trusted advisor with a broad range of customers across different industries and segments, helping the customers on their cloud journey.

Jose Juhala

Jose is a Solutions Architect at Amazon Web Services based out of Tampere, Finland. He works with customers in the Nordic and Baltic countries, across different industries, and guides them through their technical implementation and architectural questions.

CICD on Serverless Applications using AWS CodeArtifact

Post Syndicated from Anand Krishna original https://aws.amazon.com/blogs/devops/cicd-on-serverless-applications-using-aws-codeartifact/

Developing and deploying applications rapidly to users requires a working pipeline that accepts the user code (usually via a Git repository). AWS CodeArtifact was announced in 2020. It’s a secure and scalable artifact management product that easily integrates with other AWS products and services. CodeArtifact allows you to publish, store, and view packages, list package dependencies, and share your application’s packages.

In this post, I will show how to build a simple DevOps pipeline for a sample Java application (a JAR file) built with Maven.

Solution Overview

We utilize the following AWS services, tools, and frameworks to set up our continuous integration, continuous deployment (CI/CD) pipeline:

The following diagram illustrates the pipeline architecture and flow:

 

aws-codeartifact-pipeline

 

Our pipeline is built on CodePipeline with CodeCommit as the source (CodePipeline Source Stage). This triggers the pipeline via a CloudWatch Events rule. Then the code is fetched from the CodeCommit repository branch (main) and sent to the next pipeline phase. This CodeBuild phase is specifically for compiling, packaging, and publishing the code to CodeArtifact by utilizing a package manager—in this case Maven.

After Maven publishes the code to CodeArtifact, the pipeline requires a manual approval directly in the pipeline. It can also optionally trigger an email alert via Amazon Simple Notification Service (Amazon SNS). After approval, the pipeline moves to another CodeBuild phase, which downloads the latest packaged JAR file from the CodeArtifact repository and deploys it to the AWS Lambda function.

Clone the Repository

Clone the GitHub repository as follows:

git clone https://github.com/aws-samples/aws-cdk-codeartifact-pipeline-sample.git

Code Deep Dive

After the Git repository is cloned, the directory structure looks like the following screenshot:

aws-codeartifact-pipeline-code

Let’s study the files and code to understand how the pipeline is built.

The directory java-events is a sample Java Maven project. You can find numerous sample applications on GitHub; for this post, we use the java-events sample application.

To add your own application code, place the pom.xml and settings.xml files in the root directory for the AWS CDK project.

Let’s study the code in the file lib/cdk-pipeline-codeartifact-new-stack.ts of the stack CdkPipelineCodeartifactStack. This is the heart of the AWS CDK code that builds the whole pipeline. The stack does the following:

  • Creates a CodeCommit repository called ca-pipeline-repository.
  • References a CloudFormation template (lib/ca-template.yaml) in the AWS CDK code via the module @aws-cdk/cloudformation-include.
  • Creates a CodeArtifact domain called cdkpipelines-codeartifact.
  • Creates a CodeArtifact repository called cdkpipelines-codeartifact-repository.
  • Creates a CodeBuild project called JarBuild_CodeArtifact. This CodeBuild phase does all of the code compiling, packaging, and publishing to CodeArtifact into a repository called cdkpipelines-codeartifact-repository.
  • Creates a CodeBuild project called JarDeploy_Lambda_Function. This phase fetches the latest artifact from CodeArtifact created in the previous step (cdkpipelines-codeartifact-repository) and deploys to the Lambda function.
  • Finally, creates a pipeline with four phases:
    • Source as CodeCommit (ca-pipeline-repository).
    • CodeBuild project JarBuild_CodeArtifact.
    • A Manual approval Stage.
    • CodeBuild project JarDeploy_Lambda_Function.

 

CodeArtifact shows the domain-specific and repository-specific connection settings to add to the application’s pom.xml and settings.xml files, as shown below:

aws-codeartifact-repository-connections
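The settings.xml file authenticates to CodeArtifact with a short-lived authorization token, typically exposed to Maven as an environment variable. As a hedged illustration (how the sample obtains the token is not shown here), such a token for the domain created by this stack can be fetched programmatically:

import boto3

codeartifact = boto3.client("codeartifact")

# Fetch a short-lived token for the cdkpipelines-codeartifact domain.
# Maven's settings.xml can then reference it, for example via an
# environment variable such as CODEARTIFACT_AUTH_TOKEN.
token = codeartifact.get_authorization_token(
    domain="cdkpipelines-codeartifact",
    durationSeconds=3600,
)["authorizationToken"]
print(token)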

Deploy the Pipeline

The AWS CDK code requires the following packages in order to build the CI/CD pipeline:

  • @aws-cdk/core
  • @aws-cdk/aws-codepipeline
  • @aws-cdk/aws-codepipeline-actions
  • @aws-cdk/aws-codecommit
  • @aws-cdk/aws-codebuild
  • @aws-cdk/aws-iam
  • @aws-cdk/cloudformation-include

 

Install the required AWS CDK packages as below:

npm i @aws-cdk/core @aws-cdk/aws-codepipeline @aws-cdk/aws-codepipeline-actions @aws-cdk/aws-codecommit @aws-cdk/aws-codebuild @aws-cdk/pipelines @aws-cdk/aws-iam @aws-cdk/cloudformation-include

Compile the AWS CDK code:

npm run build

Deploy the AWS CDK code:

cdk synth
cdk deploy

After the AWS CDK code is deployed, view the final output on the stack’s detail page in the AWS CloudFormation console:

aws-codeartifact-pipeline-cloudformation-stack

 

How the pipeline works with artifact versions (using SNAPSHOTS)

In this demo, I publish a SNAPSHOT version to the repository. As per the documentation here and here, a SNAPSHOT refers to the most recent code along a branch. It’s a development version preceding the final release version. A snapshot version of a Maven package is identified by the suffix -SNAPSHOT appended to the package version.

The application settings are defined in the pom.xml file. For this post, we define the following:

  • The version to be used, called 1.0-SNAPSHOT.
  • The specific packaging, called jar.
  • The specific project display name, called JavaEvents.
  • The specific group ID, called JavaEvents.

The screenshot below shows the pom.xml settings we utilized in the application:

aws-codeartifact-pipeline-pom-xml

 

You can’t republish a package asset that already exists with different content, as per the documentation here.

When a Maven snapshot is published, its previous version is preserved in a new version called a build. Each time a Maven snapshot is published, a new build version is created.

When a Maven snapshot is published, its status is set to Published, and the status of the build containing the previous version is set to Unlisted. If you request a snapshot, the version with status Published is returned. This is always the most recent Maven snapshot version.

For example, the image below shows the state after the first pipeline run. The latest version has the status Published and previous builds are marked Unlisted.

aws-codeartifact-repository-package-versions

 

For all subsequent pipeline runs, additional Unlisted versions appear each time the pipeline runs, because all previous versions of a snapshot are kept as build versions.

aws-codeartifact-repository-package-versions
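To inspect these version statuses outside the console, a small boto3 sketch (using the domain, repository, and package names from this walkthrough) could look like the following:

import boto3

codeartifact = boto3.client("codeartifact")

response = codeartifact.list_package_versions(
    domain="cdkpipelines-codeartifact",
    repository="cdkpipelines-codeartifact-repository",
    format="maven",
    namespace="JavaEvents",
    package="JavaEvents",
)

# Prints each version string together with its status (Published or Unlisted).
for version in response["versions"]:
    print(version["version"], version["status"])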

 

Fetching the Latest Code

To deploy the code to the AWS Lambda function, retrieve the snapshot from the repository. I use the AWS CLI to list and fetch the latest asset of package version 1.0-SNAPSHOT.

 

Listing the latest snapshot

export ListLatestArtifact=$(aws codeartifact list-package-version-assets --domain cdkpipelines-codeartifact --domain-owner $Account_Id --repository cdkpipelines-codeartifact-repository --namespace JavaEvents --format maven --package JavaEvents --package-version "1.0-SNAPSHOT" | jq ".assets[].name" | grep jar | sed 's/"//g')

NOTE: $Account_Id is a variable, populated by the CDK code, that represents your AWS account ID.

 

Fetching the latest code using Package Version

aws codeartifact get-package-version-asset --domain cdkpipelines-codeartifact --repository cdkpipelines-codeartifact-repository --format maven --package JavaEvents --package-version 1.0-SNAPSHOT --namespace JavaEvents --asset $ListLatestArtifact demooutput

Notice that I reference the latest artifact by using the variable $ListLatestArtifact. This always fetches the latest code, and demooutput is the output file of the AWS CLI command where the content (code) is saved.

 

Testing the Pipeline

Now clone the CodeCommit repository that we created with the following code:

git clone https://git-codecommit.<region>.amazonaws.com/v1/repos/ca-pipeline-repository

 

Enter the following code to push the code to the CodeCommit repository:

cp -rp cdk-pipeline-codeartifact-new/* ca-pipeline-repository
cd ca-pipeline-repository
git checkout -b main
git add .
git commit -m "testing the pipeline"
git push origin main

Once the code is pushed to the Git repository, the pipeline is automatically triggered by an Amazon CloudWatch Events rule.

The following screenshots show the second phase (AWS CodeBuild Phase – JarBuild_CodeArtifact) of the pipeline, wherein the asset is successfully compiled and published to the CodeArtifact repository by Maven:

aws-codeartifact-pipeline-codebuild-jarbuild

aws-codeartifact-pipeline-codebuild-screenshot

aws-codeartifact-pipeline-codebuild-screenshot2

 

The following screenshots show the last phase (AWS CodeBuild Phase – Deploy-to-Lambda) of the pipeline, wherein the latest asset is successfully pulled and deployed to AWS Lambda Function.

Asset JavaEvents-1.0-20210618.131629-5.jar is the latest snapshot code for the package version 1.0-SNAPSHOT. This is the same asset version code that will be deployed to AWS Lambda Function, as seen in the screenshots below:

aws-codeartifact-pipeline-codebuild-jardeploy

aws-codeartifact-pipeline-codebuild-screenshot-jarbuild

The following screenshot of the pipeline shows a successful run. The code was fetched and deployed to the existing Lambda function (codeartifact-test-function).

aws-codeartifact-pipeline-codepipeline

Cleanup

To clean up, you can either delete the entire stack through the AWS CloudFormation console or use the following AWS CDK command:

cdk destroy

For more information on AWS CDK commands, please check the documentation here or the samples here.

Summary

In this post, I demonstrated how to build a CI/CD pipeline for your serverless application with AWS CodePipeline by utilizing AWS CDK with AWS CodeArtifact. Please check the documentation here for an in-depth explanation regarding other package managers and the getting started guide.

Secure and analyse your Terraform code using AWS CodeCommit, AWS CodePipeline, AWS CodeBuild and tfsec

Post Syndicated from César Prieto Ballester original https://aws.amazon.com/blogs/devops/secure-and-analyse-your-terraform-code-using-aws-codecommit-aws-codepipeline-aws-codebuild-and-tfsec/

Introduction

More and more customers are using Infrastructure-as-Code (IaC) to design and implement their infrastructure on AWS. This is why it is essential to have pipelines with Continuous Integration/Continuous Deployment (CI/CD) for infrastructure deployment. HashiCorp Terraform is one of the popular IaC tools for customers on AWS.

In this blog, I will guide you through building a CI/CD pipeline on AWS to analyze and identify possible configuration issues in your Terraform code templates. This will help mitigate security risks within our infrastructure deployment pipelines as part of our CI/CD. To do this, we utilize AWS tools and the open-source tfsec tool, a static analysis security scanner for your Terraform code that includes more than 90 preconfigured checks with the ability to add custom checks.

Solutions Overview

The architecture is built around a CI/CD pipeline created on AWS using AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, and Amazon ECR.

Our demo has two separate pipelines:

  1. CI/CD Pipeline to build and push our custom Docker image to Amazon ECR
  2. CI/CD Pipeline where our tfsec analysis is executed and Terraform provisions infrastructure

The tfsec configuration and the Terraform commands run through a buildspec file defined within an AWS CodeBuild action. This action calculates how many potential security risks we currently have within our Terraform templates, which is then displayed in our manual acceptance process for verification.
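The buildspec itself is not shown here, but the risk-counting step can be sketched as follows; this is a hedged example that assumes tfsec is available in the build image and uses its JSON output format (verify the flags against the tfsec version you pin):

import json
import subprocess

# Run tfsec against the working directory without failing the build,
# then count the findings so they can be surfaced in the approval step.
proc = subprocess.run(
    ["tfsec", ".", "--format", "json", "--soft-fail"],
    capture_output=True,
    text=True,
)
findings = json.loads(proc.stdout).get("results") or []
print(f"tfsec found {len(findings)} potential security issues")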

Architecture diagram

Provisioning the infrastructure

We have created an AWS Cloud Development Kit (AWS CDK) app hosted in a Git Repository written in Python. Here you can deploy the two main pipelines in order to manage this scenario. For a list of the deployment prerequisites, see the README.md file.

Clone the repo in your local machine. Then, bootstrap and deploy the CDK stack:

git clone https://github.com/aws-samples/aws-cdk-tfsec
cd aws-cdk-tfsec
pip install -r requirements.txt
cdk bootstrap aws://account_id/eu-west-1
cdk deploy --all

The infrastructure creation takes around 5-10 minutes due to the creation of the AWS CodePipeline pipelines and the referenced repositories. Once the CDK has deployed the infrastructure, clone the two new AWS CodeCommit repos that have already been created and push the example code: first the custom Docker image, and then your Terraform code, like this:

git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/awsome-terraform-example-container
cd awsome-terraform-example-container
git checkout -b main
cp repos/docker_image/* .
git add .
git commit -am "First commit"
git push origin main

Once the Docker image is built and pushed to Amazon ECR, proceed with the Terraform repo. Check the pipeline process on the AWS CodePipeline console.

Screenshot of CI/CD Pipeline to build Docker Image

git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/awsome-terraform-example
cd awsome-terraform-example
git checkout -b main
cp -aR repos/terraform_code/* .
git add .
git commit -am "First commit"
git push origin main

The Terraform provisioning pipeline in AWS CodePipeline looks like the following:

Screenshot of CodePipeline to run security and orchestrate IaC

The pipeline has three main stages:

  • Source – AWS CodeCommit stores the Terraform infrastructure repository, and every time we push code to the main branch the AWS CodePipeline pipeline is triggered.
  • tfsec analysis – AWS CodeBuild looks for a buildspec to execute the tfsec actions configured on the same buildspec.

Screenshot showing tfsec analysis

The output shows the potential security issues detected by tfsec for our Terraform code. The output links to the different security checks already defined by tfsec. Check the security checks defined by tfsec here. After the tfsec execution, a manual approval action is set up to decide whether to proceed to the next steps or to reject and stop the AWS CodePipeline execution.

The URL for review links to our tfsec output console.

Screenshot of tfsec output

 

  • Terraform plan and Terraform apply – These stages plan and apply the changes to our infrastructure. After the terraform plan command and before terraform apply, a manual approval action is set up to decide if we can apply the changes.

After going through all of the stages, our Terraform infrastructure should be created.

Clean up

After completing your demo, feel free to delete your stack using the CDK CLI:

cdk destroy --all

Conclusion

At AWS, security is our top priority. This post demonstrates how to build a CI/CD pipeline by using AWS Services to automate and secure your infrastructure as code via Terraform and tfsec.

Learn more about tfsec through the official documentation: https://tfsec.dev/

About the authors

 

César Prieto Ballester is a DevOps Consultant at Amazon Web Services. He enjoys automating everything and building infrastructure using code. Apart from work, he plays electric guitar and loves riding his mountain bike.

 

 

 

Bruno Bardelli is a Senior DevOps Consultant at Amazon Web Services. He loves to build applications and in his free time plays video games, practices aikido, and goes on walks with his dog.

Blue/Green deployment with AWS Developer tools on Amazon EC2 using Amazon EFS to host application source code

Post Syndicated from Rakesh Singh original https://aws.amazon.com/blogs/devops/blue-green-deployment-with-aws-developer-tools-on-amazon-ec2-using-amazon-efs-to-host-application-source-code/

Many organizations building modern applications require a shared and persistent storage layer for hosting and deploying data-intensive enterprise applications, such as content management systems, media and entertainment, distributed applications like machine learning training, etc. These applications demand a centralized file share that scales to petabytes without disrupting running applications and remains concurrently accessible from potentially thousands of Amazon EC2 instances.

Simultaneously, customers want to automate the end-to-end deployment workflow and leverage continuous methodologies utilizing AWS developer tools services for performing a blue/green deployment with zero downtime. A blue/green deployment is a deployment strategy wherein you create two separate, but identical environments. One environment (blue) is running the current application version, and one environment (green) is running the new application version. The blue/green deployment strategy increases application availability by generally isolating the two application environments and ensuring that spinning up a parallel green environment won’t affect the blue environment resources. This isolation reduces deployment risk by simplifying the rollback process if a deployment fails.

Amazon Elastic File System (Amazon EFS) provides a simple, scalable, and fully-managed elastic NFS file system for use with AWS Cloud services and on-premises resources. It scales on demand, thereby eliminating the need to provision and manage capacity in order to accommodate growth. Utilize Amazon EFS to create a shared directory that stores and serves code and content for numerous applications. Your application can treat a mounted Amazon EFS volume like local storage. This means you don’t have to deploy your application code every time the environment scales up to multiple instances to distribute load.

In this blog post, I will guide you through an automated process to deploy a sample web application on Amazon EC2 instances utilizing Amazon EFS mount to host application source code, and utilizing a blue/green deployment with AWS code suite services in order to deploy the application source code with no downtime.

How this solution works

This blog post includes a CloudFormation template to provision all of the resources needed for this solution. The CloudFormation stack deploys a Hello World application on Amazon Linux 2 EC2 Instances running behind an Application Load Balancer and utilizes Amazon EFS mount point to store the application content. The AWS CodePipeline project utilizes AWS CodeCommit as the version control, AWS CodeBuild for installing dependencies and creating artifacts,  and AWS CodeDeploy to conduct deployment on EC2 instances running in an Amazon EC2 Auto Scaling group.

Figure 1 below illustrates our solution architecture.

Sample solution architecture

Figure 1: Sample solution architecture

The event flow in Figure 1 is as follows:

  1. A developer commits code changes from their local repo to the CodeCommit repository. The commit triggers CodePipeline execution.
  2. CodeBuild execution begins to compile source code, install dependencies, run custom commands, and create deployment artifact as per the instructions in the Build specification reference file.
  3. During the build phase, CodeBuild copies the source-code artifact to Amazon EFS file system and maintains two different directories for current (green) and new (blue) deployments.
  4. After successfully completing the build step, CodeDeploy deployment kicks in to conduct a Blue/Green deployment to a new Auto Scaling Group.
  5. During the deployment phase, CodeDeploy mounts the EFS file system on new EC2 instances as per the CodeDeploy AppSpec file reference and conducts other deployment activities.
  6. After successful deployment, a Lambda function triggers in order to store a deployment environment parameter in Systems Manager parameter store. The parameter stores the current EFS mount name that the application utilizes.
  7. The AWS Lambda function updates the parameter value during every successful deployment with the current EFS location.

Prerequisites

For this walkthrough, the following are required:

Deploy the solution

Once you’ve assembled the prerequisites, download or clone the GitHub repo and store the files on your local machine. Utilize the commands below to clone the repo:

mkdir -p ~/blue-green-sample/
cd ~/blue-green-sample/
git clone https://github.com/aws-samples/blue-green-deployment-pipeline-for-efs

Once completed, utilize the following steps to deploy the solution in your AWS account:

  1. Create a private Amazon Simple Storage Service (Amazon S3) bucket by using this documentation
    AWS S3 console view when creating a bucket

    Figure 2: AWS S3 console view when creating a bucket

     

  2. Upload the cloned or downloaded GitHub repo files to the root of the S3 bucket. The S3 bucket object structure should look similar to Figure 3:
    AWS S3 bucket object structure after you upload the Github repo content

    Figure 3: AWS S3 bucket object structure

     

  3. Go to the S3 bucket and select the template name solution-stack-template.yml, and then copy the object URL.
  4. Open the CloudFormation console. Choose the appropriate AWS Region, and then choose Create Stack. Select With new resources.
  5. Select Amazon S3 URL as the template source, paste the object URL that you copied in Step 3, and then choose Next.
  6. On the Specify stack details page, enter a name for the stack and provide the following input parameter. Modify the default values for other parameters in order to customize the solution for your environment. You can leave everything as default for this walkthrough.
  • ArtifactBucket– The name of the S3 bucket that you created in the first step of the solution deployment. This is a mandatory parameter with no default value.
Defining the stack name and input parameters for the CloudFormation stack

Figure 4: Defining the stack name and input parameters for the CloudFormation stack

  1. Choose Next.
  2. On the Options page, keep the default values and then choose Next.
  3. On the Review page, confirm the details, acknowledge that CloudFormation might create IAM resources with custom names, and then choose Create Stack.
  4. Once the stack creation is marked as CREATE_COMPLETE, the following resources are created:
  • A virtual private cloud (VPC) configured with two public and two private subnets.
  • NAT Gateway, an EIP address, and an Internet Gateway.
  • Route tables for private and public subnets.
  • Auto Scaling Group with a single EC2 Instance.
  • Application Load Balancer and a Target Group.
  • Three security groups—one each for ALB, web servers, and EFS file system.
  • Amazon EFS file system with a mount target for each Availability Zone.
  • CodePipeline project with CodeCommit repository, CodeBuild, and CodeDeploy resources.
  • SSM parameter to store the environment current deployment status.
  • Lambda function to update the SSM parameter for every successful pipeline execution.
  • Required IAM Roles and policies.

      Note: It may take anywhere from 10-20 minutes to complete the stack creation.

Test the solution

Now that the solution stack is deployed, follow the steps below to test the solution:

  1. Validate CodePipeline execution status

After successfully creating the CloudFormation stack, a CodePipeline execution automatically triggers to deploy the default application code version from the CodeCommit repository.

  • In the AWS console, choose Services and then CloudFormation. Select your stack name. On the stack Outputs tab, look for the CodePipelineURL key and click on the URL.
  • Validate that all steps have successfully completed. For a successful CodePipeline execution, you should see something like Figure 5. Wait for the execution to complete in case it is still in progress.
CodePipeline console showing execution status of all stages

Figure 5: CodePipeline console showing execution status of all stages

 

  1. Validate the Website URL

After completing the pipeline execution, hit the website URL on a browser to check if it’s working.

  • On the stack Outputs tab, look for the WebsiteURL key and click on the URL.
  • For a successful deployment, it should open a default page similar to Figure 6.
Sample “Hello World” application (Green deployment)

Figure 6: Sample “Hello World” application (Green deployment)

 

  1. Validate the EFS share

After the website deployed successfully, we will get into the application server and validate the EFS mount point and the application source code directory.

  • Open the Amazon EC2 console, and then choose Instances in the left navigation pane.
  • Select the instance named bg-sample and choose Connect.
  • For Connection method, choose Session Manager, and then choose Connect.

After the connection is made, run the following bash commands to validate the EFS mount and the deployed content. Figure 7 shows a sample output from running the bash commands.

sudo df -h | grep efs
ls -la /efs/green
ls -la /var/www/
Sample output from the bash command (Green deployment)

Figure 7: Sample output from the bash command (Green deployment)

 

  1. Deploy a new revision of the application code

After verifying the application status and the deployed code on the EFS share, commit some changes to the CodeCommit repository in order to trigger a new deployment.

  • On the stack Outputs tab, look for the CodeCommitURL key and click on the corresponding URL.
  • Choose the HTML file in the repository.
  • Choose Edit.
  • Uncomment line 9 and comment line 10, so that the new lines look like those below after the changes:
background-color: #0188cc; 
#background-color: #90ee90;
  • Add Author name, Email address, and then choose Commit changes.

After you commit the code, the CodePipeline triggers and executes Source, Build, Deploy, and Lambda stages. Once the execution completes, hit the Website URL and you should see a new page like Figure 8.

New Application version (Blue deployment)

Figure 8: New Application version (Blue deployment)

 

On the EFS side, the application directory on the new EC2 instance now points to /efs/blue as shown in Figure 9.

Sample output from the bash command (Blue deployment)

Figure 9: Sample output from the bash command (Blue deployment)

Solution review

Let’s review the pipeline stage details and what happens during the Blue/Green deployment:

1) Build stage

For this sample application, the CodeBuild project is configured to mount the EFS file system and utilize the buildspec.yml file present in the source code root directory to run the build. Following is the sample build spec utilized in this solution:

version: 0.2
phases:
  install:
    runtime-versions:
      php: latest   
  build:
    commands:
      - current_deployment=$(aws ssm get-parameter --name $SSM_PARAMETER --query "Parameter.Value" --region $REGION --output text)
      - echo $current_deployment
      - echo $SSM_PARAMETER
      - echo $EFS_ID $REGION
      - if [[ "$current_deployment" == "null" ]]; then echo "this is the first GREEN deployment for this project" ; dir='/efs/green' ; fi
      - if [[ "$current_deployment" == "green" ]]; then dir='/efs/blue' ; else dir='/efs/green' ; fi
      - if [ ! -d $dir ]; then  mkdir $dir >/dev/null 2>&1 ; fi
      - echo $dir
      - rsync -ar $CODEBUILD_SRC_DIR/ $dir/
artifacts:
  files:
      - '**/*'

During the build job, the following activities occur:

  • Installs the latest PHP runtime version.
  • Reads the SSM parameter value in order to know the current deployment and decide which directory to utilize. The SSM parameter value flips between green and blue for every successful deployment.
  • Synchronizes the latest source code to the EFS mount point.
  • Creates artifacts to be utilized in subsequent stages.

Note: Utilize the default buildspec.yml as a reference and customize it further as per your requirement. See this link for more examples.

2) Deploy Stage

The solution is utilizing CodeDeploy blue/green deployment type for EC2/On-premises. The deployment environment is configured to provision a new EC2 Auto Scaling group for every new deployment in order to deploy the new application revision. CodeDeploy creates the new Auto Scaling group by copying the current one. See this link for more details on blue/green deployment configuration with CodeDeploy. During each deployment event, CodeDeploy utilizes the appspec.yml file to run the deployment steps as per the defined life cycle hooks. Following is the sample AppSpec file utilized in this solution.

version: 0.0
os: linux
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies
      timeout: 180
      runas: root
  AfterInstall:
    - location: scripts/app_deployment
      timeout: 180
      runas: root
  BeforeAllowTraffic:
    - location: scripts/check_app_status
      timeout: 180
      runas: root

Note: The scripts mentioned in the AppSpec file are available in the scripts directory of the CodeCommit repository. Utilize these sample scripts as a reference and modify as per your requirement.

For this sample, the following steps are conducted during a deployment:

  • BeforeInstall:
    • Installs required packages on the EC2 instance.
    • Mounts the EFS file system.
    • Creates a symbolic link to point the apache home directory /var/www/html to the appropriate EFS mount point. It also ensures that the new application version deploys to a different EFS directory without affecting the current running application.
  • AfterInstall:
    • Stops the Apache web server.
    • Fetches the current EFS directory name from Systems Manager.
    • Runs some cleanup commands.
    • Restarts the Apache web server.
  • BeforeAllowTraffic:
    • Checks whether the application is running fine.
    • Exits the deployment with an error if the app returns a non-200 HTTP status code.

3) Lambda Stage

After completing the deploy stage, CodePipeline triggers a Lambda function in order to update the SSM parameter value with the updated EFS directory name. This parameter value alternates between “blue” and “green” to help CodePipeline identify the right EFS file system path during the next deployment.
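The packaged function is not reproduced here, but the idea can be sketched as follows; the parameter name is hypothetical, and the put_job_* calls are how a Lambda action invoked by CodePipeline reports its result back to the pipeline:

import boto3

ssm = boto3.client("ssm")
codepipeline = boto3.client("codepipeline")

PARAMETER_NAME = "/blue-green-sample/current-deployment"  # placeholder name

def lambda_handler(event, context):
    """Flip the stored deployment value after a successful deployment."""
    job_id = event["CodePipeline.job"]["id"]
    try:
        current = ssm.get_parameter(Name=PARAMETER_NAME)["Parameter"]["Value"]
        new_value = "blue" if current == "green" else "green"
        ssm.put_parameter(Name=PARAMETER_NAME, Value=new_value,
                          Type="String", Overwrite=True)
        codepipeline.put_job_success_result(jobId=job_id)
    except Exception as exc:
        codepipeline.put_job_failure_result(
            jobId=job_id,
            failureDetails={"type": "JobFailed", "message": str(exc)},
        )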

CodeDeploy Blue/Green deployment

Let’s review the sequence of events flow during the CodeDeploy deployment:

  1. CodeDeploy creates a new Auto Scaling group by copying the original one.
  2. Provisions a replacement EC2 instance in the new Auto Scaling Group.
  3. Conducts the deployment on the new instance as per the instructions in the appspec.yml file.
  4. Sets up health checks and redirects traffic to the new instance.
  5. Terminates the original instance along with the Auto Scaling Group.
  6. After completing the deployment, it should appear as shown in Figure 10.
AWS CodeDeploy console view of a Blue/Green CodeDeploy deployment on EC2

Figure 10: AWS console view of a Blue/Green CodeDeploy deployment on EC2

Troubleshooting

To troubleshoot any service-related issues, see the following links:

More information

Now that you have tested the solution, here are some additional points worth noting:

  • The sample template and code utilized in this blog can work in any AWS region and are mainly intended for demonstration purposes. Utilize the sample as a reference and modify it further as per your requirement.
  • This solution works with a single account, Region, and VPC combination.
  • For this sample, we have utilized AWS CodeCommit as version control, but you can also utilize any other source supported by AWS CodePipeline, like Bitbucket, GitHub, or GitHub Enterprise Server.

Clean up

Follow these steps to delete the components and avoid any future incurring charges:

  1. Open the AWS CloudFormation console.
  2. On the Stacks page in the CloudFormation console, select the stack that you created for this blog post. The stack must be currently running.
  3. In the stack details pane, choose Delete.
  4. Select Delete stack when prompted.
  5. Empty and delete the S3 bucket created during deployment step 1.

Conclusion

In this blog post, you learned how to set up a complete CI/CD pipeline for conducting a blue/green deployment on EC2 instances utilizing Amazon EFS file share as mount point to host application source code. The EFS share will be the central location hosting your application content, and it will help reduce your overall deployment time by eliminating the need for deploying a new revision on every EC2 instance local storage. It also helps to preserve any dynamically generated content when the life of an EC2 instance ends.

Author bio

Rakesh Singh

Rakesh is a Senior Technical Account Manager at Amazon. He loves automation and enjoys working directly with customers to solve complex technical issues and provide architectural guidance. Outside of work, he enjoys playing soccer, singing karaoke, and watching thriller movies.

Use the Snyk CLI to scan Python packages using AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild

Post Syndicated from BK Das original https://aws.amazon.com/blogs/devops/snyk-cli-scan-python-codecommit-codepipeline-codebuild/

One of the primary advantages of working in the cloud is achieving agility in product development. You can adopt practices like continuous integration and continuous delivery (CI/CD) and GitOps to increase your ability to release code at quicker iterations. Development models like these demand agility from security teams as well. This means your security team has to provide the tooling and visibility to developers for them to fix security vulnerabilities as quickly as possible.

Vulnerabilities in cloud-native applications can be roughly classified into infrastructure misconfigurations and application vulnerabilities. In this post, we focus on enabling developers to scan vulnerable data around Python open-source packages using the Snyk Command Line Interface (CLI).

The world of package dependencies

Traditionally, code scanning is performed by the security team; they either ship the code to the scanning instance, or in some cases ship it to the vendor for vulnerability scanning. After the vendor finishes the scan, the results are provided to the security team and forwarded to the developer. The end-to-end process of organizing the repositories, sending the code to security team for scanning, getting results back, and remediating them is counterproductive to the agility of working in the cloud.

Let’s take an example of package A, which uses packages B and C. To scan package A, you must scan packages B and C as well. Similar to package A having dependencies on B and C, packages B and C can have their own dependencies too. So the dependencies for each package become complex and cumbersome to scan over time. The ideal method is to scan all the dependencies in one go, without manual intervention to understand the dependencies between packages.

Building on the foundation of GitOps and Gitflow

GitOps was introduced in 2017 by Weaveworks as a DevOps model to implement continuous deployment for cloud-native applications. It focuses on the developer’s ability to ship code faster. Because security is a non-negotiable piece of any application, this solution includes security as part of the deployment process. We define the Snyk scanner as declarative and immutable AWS Cloud Development Kit (AWS CDK) code, which ensures that new Python code committed to the repository is scanned.

Another continuous delivery practice that we base this solution on is Gitflow. Gitflow is a strict branching model that enables project release by enforcing a framework for managing Git projects. As a brief introduction on Gitflow, typically you have a main branch, which is the code sent to production, and you have a development branch where new code is committed. After the code in development branch passes all tests, it’s merged to the main branch, thereby becoming the code in production. In this solution, we aim to provide this scanning capability in all your branches, providing security observability through your entire Gitflow.

AWS services used in this solution

We use the following AWS services as part of this solution:

  • AWS CDK – The AWS CDK is an open-source software development framework to define your cloud application resources using familiar programming languages. In this solution, we use Python to write our AWS CDK code.
  • AWS CodeBuild – CodeBuild is a fully managed build service in the cloud. CodeBuild compiles your source code, runs unit tests, and produces artifacts that are ready to deploy. CodeBuild eliminates the need to provision, manage, and scale your own build servers.
  • AWS CodeCommit – CodeCommit is a fully managed source control service that hosts secure Git-based repositories. It makes it easy for teams to collaborate on code in a secure and highly scalable ecosystem. CodeCommit eliminates the need to operate your own source control system or worry about scaling its infrastructure. You can use CodeCommit to securely store anything from source code to binaries, and it works seamlessly with your existing Git tools.
  • AWS CodePipeline – CodePipeline is a continuous delivery service you can use to model, visualize, and automate the steps required to release your software. You can quickly model and configure the different stages of a software release process. CodePipeline automates the steps required to release your software changes continuously.
  • Amazon EventBridge – EventBridge rules deliver a near-real-time stream of system events that describe changes in AWS resources. With simple rules that you can quickly set up, you can match events and route them to one or more target functions or streams.
  • AWS Systems Manager Parameter Store – Parameter Store, a capability of AWS Systems Manager, provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, Amazon Machine Image (AMI) IDs, and license codes as parameter values.

Prerequisites

Before you get started, make sure you have the following prerequisites:

  • An AWS account (use a Region that supports CodeCommit, CodeBuild, Parameter Store, and CodePipeline)
  • A Snyk account
  • An existing CodeCommit repository you want to test on

Architecture overview

After you complete the steps in this post, you will have a working pipeline that scans your Python code for open-source vulnerabilities.

We use the Snyk CLI, which is available to customers on all plans, including the Free Tier, and provides the ability to programmatically scan repositories for vulnerabilities in open-source dependencies as well as base image recommendations for container images. The following reference architecture represents a general workflow of how Snyk performs the scan in an automated manner. The design uses DevSecOps principles of automation, event-driven triggers, and keeping humans out of the loop for its run.

As developers keep working on their code, they continue to commit their code to the CodeCommit repository. Upon each commit, a CodeCommit API call is generated, which is then captured using the EventBridge rule. You can customize this event rule for a specific event or feature branch you want to trigger the pipeline for.

When the developer commits code to the specified branch, that EventBridge event rule triggers a CodePipeline pipeline. This pipeline has a build stage using CodeBuild. This stage interacts with the Snyk CLI, and uses the token stored in Parameter Store. The Snyk CLI uses this token as authentication and starts scanning the latest code committed to the repository. When the scan is complete, you can review the results on the Snyk console.
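To illustrate how the rule can be customized, the event pattern for pushes to a single branch looks like the following. This is a hedged boto3 sketch rather than the solution's CDK construct, and the rule name is a placeholder; the construct additionally registers the pipeline as the rule's target, which requires an IAM role allowing codepipeline:StartPipelineExecution.

import json
import boto3

events = boto3.client("events")

# Matches pushes (branch created or updated) to the "main" branch.
# Change referenceName to watch a specific feature branch instead.
event_pattern = {
    "source": ["aws.codecommit"],
    "detail-type": ["CodeCommit Repository State Change"],
    "detail": {
        "event": ["referenceCreated", "referenceUpdated"],
        "referenceType": ["branch"],
        "referenceName": ["main"],
    },
}

events.put_rule(
    Name="snyk-scan-on-main-commit",  # placeholder rule name
    EventPattern=json.dumps(event_pattern),
    State="ENABLED",
)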

This code is built for Python pip packages. You can edit the buildspec.yml to incorporate any other language that Snyk supports.

The following diagram illustrates our architecture.

snyk architecture codepipeline

Code overview

The code in this post is written using the AWS CDK in Python. If you’re not familiar with the AWS CDK, we recommend reading Getting started with AWS CDK before you customize and deploy the code.

Repository URL: https://github.com/aws-samples/aws-cdk-codecommit-snyk

This AWS CDK construct uses the Snyk CLI within the CodeBuild job in the pipeline to scan the Python packages for open-source package vulnerabilities. The construct uses CodePipeline to create a two-stage pipeline: one source, and one build (the Snyk scan stage). The construct takes the input of the CodeCommit repository you want to scan, the Snyk organization ID, and Snyk auth token.

Resources deployed

This solution deploys the following resources:

For the deployment, we use the AWS CDK construct in the codebase cdk_snyk_construct/cdk_snyk_construct_stack.py in the AWS CDK stack cdk-snyk-stack. The construct requires the following parameters:

  • ARN of the CodeCommit repo you want to scan
  • Name of the repository branch you want to be monitored
  • Parameter Store name of the Snyk organization ID
  • Parameter Store name for the Snyk auth token

Set up the organization ID and auth token before deploying the stack. Because these are confidential and sensitive data, you should deploy them as a separate stack or through a manual process. In this solution, the parameters are stored with the SecureString parameter type and encrypted using the AWS managed KMS key.
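Once you have the two values from the Snyk console (described next), a minimal sketch of storing them with boto3 could look like this; the parameter names are placeholders, so use whatever names you later pass to the stack:

import boto3

ssm = boto3.client("ssm")

# SecureString parameters are encrypted with the AWS managed KMS key by default.
for name, value in [
    ("/snyk/org-id", "<your-snyk-org-id>"),
    ("/snyk/auth-token", "<your-snyk-auth-token>"),
]:
    ssm.put_parameter(Name=name, Value=value, Type="SecureString", Overwrite=True)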

You create the organization ID and auth token on the Snyk console. On the Settings page, choose General in the navigation pane to find these values.

snyk settings console

 

You can retrieve the names of the parameters on the Systems Manager console by navigating to Parameter Store and finding the name on the Overview tab.

SSM Parameter Store

Create a requirements.txt file in the CodeCommit repository

We now create a repository in CodeCommit to store the code. For simplicity, we primarily store the requirements.txt file in our repository. In Python, a requirements file stores the packages that are used. Having clearly defined packages and versions makes it easier for development, especially in virtual environments.

For more information on the requirements file in Python, see Requirement Specifiers.

To create a CodeCommit repository, run the following AWS Command Line Interface (AWS CLI) command in your AWS accounts:

aws codecommit create-repository --repository-name snyk-repo \
--repository-description "Repository for Snyk to scan Python packages"

Now let’s create a branch called main in the repository using the following command:

aws codecommit create-branch --repository-name snyk-repo \
--branch-name main

After you create the repository, commit a file named requirements.txt with the following content. The packages below are pinned to particular versions that have known vulnerabilities. This file is our hypothetical vulnerable set of packages that has been committed into your development code.

PyYAML==5.3.1
Pillow==7.1.2
pylint==2.5.3
urllib3==1.25.8

 

For instructions on committing files in CodeCommit, see Connect to an AWS CodeCommit repository.

When you store the Snyk auth token and organization ID in Parameter Store, note the parameter names—you need to pass them as parameters during the deployment step.

Now clone the CDK code from the GitHub repository with the command below:

git clone https://github.com/aws-samples/aws-cdk-codecommit-snyk.git

After the cloning is complete you should see a directory named aws-cdk-codecommit-snyk on your machine.

When you’re ready to deploy, enter the aws-cdk-codecommit-snyk directory, and run the following command with the appropriate values:

cdk deploy cdk-snyk-stack \
--parameters RepoName=<name-of-codecommit-repo> \
--parameters RepoBranch=<branch-to-be-scanned>  \
--parameters SnykOrgId=<value> \
--parameters SnykAuthToken=<value>

After the stack deployment is complete, you can see a new pipeline in your AWS account, which is configured to be triggered every time a commit occurs on the main branch.

You can view the results of the scan on the Snyk console. After the pipeline runs, log in to snyk.io and you should see a project named as per your repository (see the following screenshot).

snyk dashboard

 

Choose the repo name to get a detailed view of the vulnerabilities found. Depending on what packages you put in your requirements.txt, your report will differ from the following screenshot.

snyk-vuln-details

 

To fix the vulnerability identified, you can change the version of these packages in the requirements.txt file. The edited requirements file should look like the following:

PyYAML==5.4
Pillow==8.2.0
pylint==2.6.1
urllib3==1.25.9

After you update the requirements.txt file in your repository, push your changes back to the CodeCommit repository you created earlier on the main branch. The push starts the pipeline again.

After the commit is performed to the targeted branch, you don’t see the vulnerability reported on the Snyk dashboard because the pinned version 5.4 doesn’t contain that vulnerability.

Clean up

To avoid accruing further cost for the resources deployed in this solution, run cdk destroy to remove all the AWS resources you deployed through CDK.

As the CodeCommit repository was created using the AWS CLI, the following command deletes the CodeCommit repository:

aws codecommit delete-repository --repository-name snyk-repo

Conclusion

In this post, we provided a solution so developers can self-remediate vulnerabilities in their code by monitoring it through Snyk. This solution provides observability, agility, and security for your Python application by following DevOps principles.

A similar architecture has been used at NFL to shift left the security of their code. According to the shift-left design principle, security should be moved closer to the developers to identify and remediate security issues earlier in the development cycle. NFL implemented a similar architecture, which made the total process, from committing code on the branch to remediation, 15 times faster than their previous code scanning setup.

Here’s what NFL has to say about their experience:

“NFL used Snyk to scan Python packages for a service launch. Traditionally it would have taken 10 days to scan the packages through our existing process but with Snyk we were able to follow DevSecOps principles and get the scans completed, and reviewed within matter of days. This simplified our time to market while maintaining visibility into our security posture.” – Joe Steinke (Director, Data Solution Architect)

How Banks Can Use AWS to Meet Compliance

Post Syndicated from Jiwan Panjiker original https://aws.amazon.com/blogs/architecture/how-banks-can-use-aws-to-meet-compliance/

Since the 2008 financial crisis, banking supervisory institutions such as the Basel Committee on Banking Supervision (BCBS) have strengthened regulations. There is now increased oversight over the financial services industry. For banks, making the necessary changes to comply with these rules is a challenging, multi-year effort.

Basel IV, a massive update to existing rules, is due for implementation in January 2023. Basel IV standardizes the approach to calculating credit risk, increases the impact of risk-weighted assets (RWAs) and emphasizes data transparency.

Given the complexity of data, modeling, and numerous assumptions that have to be made, compliance under Basel IV implementation will be challenging. Standardization omits nuances unique to your business, which can drive up costs, but violating guidelines will result in steep penalties.

This post will address these challenges by outlining a mechanism that facilitates a healthy, data-driven dialogue between banks and regulators to better achieve compliance objectives. The reference architecture will focus on enabling fast, iterative releases with the help of serverless AWS services.

There are four key actions to take in order to support this mechanism:

  1. Automate data management
  2. Establish a continuous integration/continuous delivery (CI/CD) pipeline
  3. Enable fast, point-in-time audit replays
  4. Set up proactive monitoring and notifications

Automate data management

Due to frequent merger activity, banks are typically composed of a web of integrated systems and siloed business units, making it difficult to consolidate data. Under Basel IV guidelines, auditors want banks to provide detailed data in a presentable way.

You can tackle this first challenge by establishing a data pipeline as shown in Figure 1. Take inventory of each data source as it is incorporated into the pipeline. Identify the critical internal and external data sources that will be used to populate the initial landing area. Amazon Simple Storage Service (S3) is a great choice for this.

Figure 1. Data pipeline that cleans, processes, and segments data

Figure 1. Data pipeline that cleans, processes, and segments data

Amazon S3 is a highly available, durable service that is a popular data lake solution. S3 offers WORM storage capabilities like S3 Glacier Vault Lock and S3 Object Lock to protect the integrity of your archived data in accordance with U.S. SEC and FINRA rules.

Basel IV regulations also require banks to use many attributes to develop accurate credit risk models. The attributes can be a mix of datasets such as financial statements, internal balanced scorecards, macro-economic data, and credit ratings. The risk models themselves can also be segmented by portfolio types, industry segments, asset types and much more.

You can split data into different domains and designate data owners with separate S3 buckets. Credit risk model developers, analysts, and data scientists can then use the structure of the S3 buckets to pull together relevant datasets. They can then store the outputs in S3 buckets.

To support fast, automated data retrieval, store object metadata in a highly scalable, and queryable database. You can set up Amazon S3 so that an event can initiate a function to populate Amazon DynamoDB. Developers can use AWS Lambda to write these functions using popular languages like Python.
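A minimal sketch of such a function is shown below; the table name and attribute names are illustrative rather than prescribed by this post:

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("credit-risk-object-metadata")  # placeholder table name

def lambda_handler(event, context):
    """Record metadata for every object landed in the data lake bucket."""
    for record in event["Records"]:
        s3_info = record["s3"]
        table.put_item(Item={
            "object_key": s3_info["object"]["key"],
            "bucket": s3_info["bucket"]["name"],
            "size_bytes": s3_info["object"]["size"],
            "event_time": record["eventTime"],
        })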

With AWS Glue, you can automate extract, transform, and load (ETL) processes to clean and move data to the different S3 buckets. AWS Glue can also support data operations by automatically cataloging your various data sources.

Taking on a structured approach will simplify data governance and transparency as the business continues to grow and operate.

Establish a CI/CD pipeline

Adopt tools that machine learning teams can use to build a streamlined CI/CD solution as demonstrated in Figure 2.

Figure 2. An end-to-end machine learning development and deployment pipeline

Figure 2. An end-to-end machine learning development and deployment pipeline

Using tightly integrated AWS services, your teams can minimize time spent managing tools and deployment processes, and instead, focus on tuning the models and analyzing the results.

Amazon SageMaker brings together a powerful set of machine learning capabilities on the AWS Cloud. It helps data scientists and engineers build insightful models. Figure 2 depicts the high-level architecture and shows how Amazon SageMaker Pipelines helps teams orchestrate the automation and deployment processes.

The core of the pipeline uses a set of AWS deployment services so that your teams can collaborate and review effectively. With AWS CodeCommit, your teams can set up a Git-based repository to store and version models for data processing, training, and evaluation. The repository can also store code and configuration files using AWS CloudFormation for deployment. You can use AWS CodePipeline and AWS CodeBuild to create and update a model endpoint based on the approved/reviewed changes.

Any updates detected in the AWS CodeCommit repository initiate a deployment whenever a new model version is added to the Model Registry. Amazon S3 can be used to store generated model artifacts, historical data, and models.

Enable fast, point-in-time audit replays

Figure 3. Containers offer a lightweight, powerful solution to run audits using historical assets

Figure 3. Containers offer a lightweight, powerful solution to run audits using historical assets

One of the main themes of Basel IV is transparency. Figure 3 illustrates a solution to build trust with regulators by allowing them to verify and understand modeling activity.

A lightweight application is hosted in AWS Fargate and enables auditors to re-run Basel credit risk models under specified conditions. With AWS Fargate, you don’t need to manually manage instances or container orchestration. Configure the CPU or memory specifications at the task level and set guidelines around scalability for your service. Your tasks then scale up and down automatically, based on demand, and will optimize cost efficiency and availability.

Figure 3 shows the following:

  1. The application takes inputs such as date, release version, and model type.
  2. It then queries DynamoDB with this information.
  3. The query will return the data necessary to retrieve model artifacts from previous CI/CD deployments and relevant datasets from historical S3 buckets.
  4. Using this information, it can spin up as many containers as needed to run the model.
  5. It then stores the outputs in a separate S3 bucket.
  6. Auditors will have a detailed trace of all the attributes, assumptions, and data that went into the modeling effort. To streamline this process, the app can also compare the outputs of the historical runs to the recent replay and highlight any significant deviations.

Though internal models will be de-emphasized under Basel IV, banks will continue to run internal models as a benchmark against the broader standards. Schedule AWS Fargate tasks to run these models regularly to capitalize on highly performant compute services while minimizing costs.
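A hedged sketch of launching one replay task on demand with boto3 is shown below; the cluster, task definition, container name, subnet, and environment variable names are all placeholders:

import boto3

ecs = boto3.client("ecs")

ecs.run_task(
    cluster="audit-replay-cluster",
    launchType="FARGATE",
    taskDefinition="basel-credit-risk-replay",
    count=1,
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",
        }
    },
    overrides={
        "containerOverrides": [{
            "name": "replay",
            "environment": [
                {"name": "MODEL_VERSION", "value": "2021-06"},
                {"name": "AS_OF_DATE", "value": "2021-03-31"},
            ],
        }]
    },
)

The same call, wrapped in a scheduled job, also covers the regular benchmark runs mentioned above.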

Set up proactive monitoring and notifications

Figure 4. Scheduled jobs can send out notifications using Amazon SNS when certain thresholds are breached

The last principle is based around establishing an early warning system, enabling banks to take on a more proactive role in maintaining compliance.

With automated monitoring and notifications, banks will be able to respond quickly to potential concerns. For instance, there can be a daily scheduled job that launches containers and runs the models against the latest data. If any thresholds are breached, alerts can be sent out via SMS or email. Operational teams can be subscribed to certain message topics using Amazon Simple Notification Service (SNS). They can then respond before actual compliance issues emerge.
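As a sketch of the notification side, the following AWS CLI commands create a topic, subscribe an operations team, and publish an alert. The topic name, email address, and message text are placeholders:

# Hypothetical sketch: wire up SNS notifications for threshold breaches
TOPIC_ARN=$(aws sns create-topic --name model-threshold-alerts --query TopicArn --output text)

# Subscribe the operations team (the recipient must confirm the subscription)
aws sns subscribe \
  --topic-arn "$TOPIC_ARN" \
  --protocol email \
  --notification-endpoint ops-team@example.com

# The scheduled job would publish something like this when a breach is detected
aws sns publish \
  --topic-arn "$TOPIC_ARN" \
  --subject "Model threshold breached" \
  --message "Daily model run breached a configured threshold. See the run outputs in the results S3 bucket."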

Conclusion

With a Well-Architected approach, AWS helps you control your data, deploy new features, and embrace a serverless approach. This frees you to innovate quickly and address regulatory challenges.

You can iterate with new AWS services and bring machine learning to bear on various streams of data to identify high impact pools of value. You can get a clearer picture of the data to make it easier to identify areas where you can reduce RWAs. Using Amazon S3, you can turn on AWS analytics services such as Amazon QuickSight and Amazon Athena to visualize the data. You’ll be able to fulfill reporting requirements such as those found in regulatory studies like CCAR, DFAST, CECL, and IFRS9.

For more information about establishing a data pipeline, read Lake House Formation Architecture. It is a powerful pattern that combines a few concepts that will help bring your data together cohesively. To set up a robust CI/CD pipeline, explore the AWS Serverless CI/CD Reference Architecture.

Continuous Compliance Workflow for Infrastructure as Code: Part 2

Post Syndicated from DAMODAR SHENVI WAGLE original https://aws.amazon.com/blogs/devops/continuous-compliance-workflow-for-infrastructure-as-code-part-2/

In the first post of this series, we introduced a continuous compliance workflow in which an enterprise security and compliance team can release guardrails in a continuous integration, continuous deployment (CI/CD) fashion in your organization.

In this post, we focus on the technical implementation of the continuous compliance workflow. We demonstrate how to use AWS Developer Tools to create a CI/CD pipeline that releases guardrails for Terraform application workloads.

We use the Terraform-Compliance framework to define the guardrails. Terraform-Compliance is a lightweight, security and compliance-focused test framework for Terraform to enable the negative testing capability for your infrastructure as code (IaC).

With this compliance framework, we can ensure that the implemented Terraform code follows security standards and your own custom standards. Currently, HashiCorp provides Sentinel (a policy-as-code framework) for its enterprise products. AWS has CloudFormation Guard, an open-source policy-as-code evaluation tool for AWS CloudFormation templates. Terraform-Compliance provides similar functionality for Terraform, and is open source.
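To give a sense of how the framework is used, the following is a minimal sketch of a typical Terraform-Compliance run against a Terraform plan. The directory and file names are illustrative and not taken from the solution repository:

# Produce a plan and export it as JSON for the compliance checks
terraform init
terraform plan -out=plan.out
terraform show -json plan.out > plan.out.json

# Run the BDD-style guardrails stored in ./guardrails against the plan
terraform-compliance -p plan.out.json -f ./guardrails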

This post is from the perspective of a security and compliance engineer, and assumes that the engineer is familiar with the practices of IaC, CI/CD, behavior-driven development (BDD), and negative testing.

Solution overview

You start by building the necessary resources as listed in the workload (application development team) account:

  • An AWS CodeCommit repository for the Terraform workload
  • A CI/CD pipeline built using AWS CodePipeline to deploy the workload
  • A cross-account AWS Identity and Access Management (IAM) role that gives the security and compliance account the permissions to pull the Terraform workload from the workload account repository for testing their guardrails in observation mode

Next, we build the resources in the security and compliance account:

  • A CodeCommit repository to hold the security and compliance standards (guardrails)
  • A CI/CD pipeline built using CodePipeline to release new guardrails
  • A cross-account role that gives the workload account the permissions to pull the activated guardrails from the main branch of the security and compliance account repository.

The following diagram shows our solution architecture.

solution architecture diagram

The architecture has two workflows: security and compliance (Steps 1–4) and application delivery (Steps 5–7).

  1. When a new security and compliance guardrail is introduced into the develop branch of the compliance repository, it triggers the security and compliance pipeline.
  2. The pipeline pulls the Terraform workload.
  3. The pipeline tests this compliance check guardrail against the Terraform workload in the workload account repository.
  4. If the workload is compliant, the guardrail is automatically merged into the main branch. This activates the guardrail by making it available for all Terraform application workload pipelines to consume. By doing this, we make sure that we don’t break the Terraform application deployment pipeline by introducing new guardrails. It also provides the security and compliance team visibility into the resources in the application workload that are noncompliant. The security and compliance team can then reach out to the application delivery team and suggest appropriate remediation before the new standards are activated. If the compliance check fails, the automatic merge to the main branch is stopped. The security and compliance team has an option to force merge the guardrail into the main branch if it’s deemed critical and they need to activate it immediately.
  5. The Terraform deployment pipeline in the workload account always pulls the latest security and compliance checks from the main branch of the compliance repository.
  6. Checks are run against the Terraform workload to ensure that it meets the organization’s security and compliance standards.
  7. Only secure and compliant workloads are deployed by the pipeline. If the workload is noncompliant, the security and compliance checks fail and break the pipeline, forcing the application delivery team to remediate the issue and check in the corrected code.

Prerequisites

Before proceeding any further, you need to identify and designate two AWS accounts required for the solution to work:

  • Security and Compliance – In which you create a CodeCommit repository to hold compliance standards that are written based on Terraform-Compliance framework. You also create a CI/CD pipeline to release new compliance guardrails.
  • Workload – In which the Terraform workload resides. The pipeline to deploy the Terraform workload enforces the compliance guardrails prior to the deployment.

You also need to create two AWS account profiles in ~/.aws/credentials, one for the security and compliance account and one for the workload account, if you don’t already have them. These profiles need to have sufficient permissions to run an AWS Cloud Development Kit (AWS CDK) stack. They should be your private profiles and only be used during the course of this use case. Therefore, it should be fine if you want to use admin privileges. Don’t share the profile details, especially if they have admin privileges. I recommend removing the profiles when you’re finished with this walkthrough. For more information about creating an AWS account profile, see Configuring the AWS CLI.
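If you need to create the profiles, the following commands are one way to do it; the profile names are examples only:

# Create the two profiles interactively (enter the access key, secret key, and Region for each)
aws configure --profile compliance-account
aws configure --profile workload-account

# Verify which identity each profile resolves to
aws sts get-caller-identity --profile compliance-account
aws sts get-caller-identity --profile workload-account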

In addition, you need to generate a cucumber-sandwich.jar file by following the steps in the cucumber-sandwich GitHub repo. The JAR file is needed to generate pretty HTML compliance reports. The security and compliance team can use these reports to make sure that the standards are met.

To implement our solution, we complete the following high-level steps:

  1. Create the security and compliance account stack.
  2. Create the workload account stack.
  3. Test the compliance workflow.

Create the security and compliance account stack

We create the following resources in the security and compliance account:

  • A CodeCommit repo to hold the security and compliance guardrails
  • A CI/CD pipeline to roll out the Terraform compliance guardrails
  • An IAM role that trusts the application workload account and allows it to pull compliance guardrails from its CodeCommit repo

In this section, we set up the properties for the pipeline and cross-account role stacks, and run the deployment scripts.

Set up properties for the pipeline stack

Clone the GitHub repo aws-continuous-compliance-for-terraform and navigate to the folder security-and-compliance-account/stacks. This contains the folder pipeline_stack/, which holds the code and properties for creating the pipeline stack.

The folder has a JSON file cdk-stack-param.json, which has the parameter TERRAFORM_APPLICATION_WORKLOADS, which represents the list of application workloads that the security and compliance pipeline pulls and runs tests against to make sure that the workloads are compliant. In the workload list, you have the following parameters:

  • GIT_REPO_URL – The HTTPS URL of the CodeCommit repository in the workload account against which the security and compliance check pipeline runs compliance guardrails.
  • CROSS_ACCOUNT_ROLE_ARN – The ARN for the cross-account role we create in the next section. This role gives the security and compliance account permissions to pull Terraform code from the workload account.

For CROSS_ACCOUNT_ROLE_ARN, replace <workload-account-id> with the account ID for your designated AWS workload account. For GIT_REPO_URL, replace <region> with the AWS Region where the repository resides.

security and compliance pipeline stack parameters

Set up properties for the cross-account role stack

In the cloned GitHub repo aws-continuous-compliance-for-terraform from the previous step, navigate to the folder security-and-compliance-account/stacks. This contains the folder cross_account_role_stack/, which holds the code and properties for creating the cross-account role.

The folder has a JSON file cdk-stack-param.json, which has the parameter TERRAFORM_APPLICATION_WORKLOAD_ACCOUNTS, which represents the list of Terraform workload accounts that intend to integrate with the security and compliance account for running compliance checks. All these accounts are trusted by the security and compliance account and given permissions to pull compliance guardrails. Replace <workload-account-id> with the account ID for your designated AWS workload account.

security and compliance cross account role stack parameters

Run the deployment script

Run deploy.sh by passing the name of the AWS security and compliance account profile you created earlier. The script uses the AWS CDK CLI to bootstrap and deploy the two stacks we discussed. See the following code:

cd aws-continuous-compliance-for-terraform/security-and-compliance-account/
./deploy.sh "<AWS-COMPLIANCE-ACCOUNT-PROFILE-NAME>"

You should now see three stacks in the security and compliance account:

  • CDKToolkit – AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an Amazon Simple Storage Service (Amazon S3) bucket needed to hold deployment assets such as an AWS CloudFormation template and AWS Lambda code package.
  • cf-CrossAccountRoles – This stack creates the cross-account IAM role.
  • cf-SecurityAndCompliancePipeline – This stack creates the pipeline. On the Outputs tab of the stack, you can find the CodeCommit source repo URL from the key OutSourceRepoHttpUrl. Record the URL to use later.

security and compliance stack

Create a workload account stack

We create the following resources in the workload account:

  • A CodeCommit repo to hold the Terraform workload to be deployed
  • A CI/CD pipeline to deploy the Terraform workload
  • An IAM role that trusts the security and compliance account and allows it to pull Terraform code from its CodeCommit repo for testing

We follow similar steps as in the previous section to set up the properties for the pipeline stack and cross-account role stack, and then run the deployment script.

Set up properties for the pipeline stack

In the already cloned repo, navigate to the folder workload-account/stacks. This contains the folder pipeline_stack/, which holds the code and properties for creating the pipeline stack.

The folder has a JSON file cdk-stack-param.json, which has the parameter COMPLIANCE_CODE, which provides details on where to pull the compliance guardrails from. The pipeline pulls and runs compliance checks prior to deployment, to make sure that the application workload is compliant. You have the following parameters:

  • GIT_REPO_URL – The HTTPS URL of the CodeCommit repository in the security and compliance account, which contains the compliance guardrails that the pipeline in the workload account pulls to carry out compliance checks.
  • CROSS_ACCOUNT_ROLE_ARN – The ARN for the cross-account role we created in the previous step in the security and compliance account. This role gives the workload account permissions to pull the Terraform compliance code from its respective security and compliance account.

For CROSS_ACCOUNT_ROLE_ARN, replace <compliance-account-id> with the account ID for your designated AWS security and compliance account. For GIT_REPO_URL, replace <region> with the Region where the repository resides.

workload pipeline stack config

Set up the properties for cross-account role stack

In the already cloned repo, navigate to the folder workload-account/stacks. This contains the folder cross_account_role_stack/, which holds the code and properties for creating the cross-account role stack.

The folder has a JSON file cdk-stack-param.json, which has the parameter COMPLIANCE_ACCOUNT, which represents the security and compliance account that intends to integrate with the workload account for running compliance checks. This account is trusted by the workload account and given permissions to pull compliance guardrails. Replace <compliance-account-id> with the account ID for your designated AWS security and compliance account.

workload cross account role stack config

Run the deployment script

Run deploy.sh by passing the name of the AWS workload account profile you created earlier. The script uses the AWS CDK CLI to bootstrap and deploy the two stacks we discussed. See the following code:

cd aws-continuous-compliance-for-terraform/workload-account/
./deploy.sh "<AWS-WORKLOAD-ACCOUNT-PROFILE-NAME>"

You should now see three stacks in the workload account:

  • CDKToolkit – AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an S3 bucket needed to hold deployment assets such as a CloudFormation template and Lambda code package.
  • cf-CrossAccountRoles – This stack creates the cross-account IAM role.
  • cf-TerraformWorkloadPipeline – This stack creates the pipeline. On the Outputs tab of the stack, you can find the CodeCommit source repo URL from the key OutSourceRepoHttpUrl. Record the URL to use later.

workload pipeline stack

Test the compliance workflow

In this section, we walk through the following steps to test our workflow:

  1. Push the application workload code into its repo.
  2. Push the security and compliance code into its repo and run its pipeline to release the compliance guardrails.
  3. Run the application workload pipeline to exercise the compliance guardrails.
  4. Review the generated reports.

Push the application workload code into its repo

Clone the empty CodeCommit repo from the workload account. You can find the URL from the variable OutSourceRepoHttpUrl on the Outputs tab of the cf-TerraformWorkloadPipeline stack we deployed in the previous section.

  1. Create a new branch main and copy the workload code into it.
  2. Copy the cucumber-sandwich.jar file you generated in the prerequisites section into a new folder /lib.
  3. Create a directory called reports with an empty file dummy. The reports directory is where the Terraform-Compliance framework creates compliance reports.
  4. Push the code to the remote origin.

See the following sample script:

git checkout -b main
# Copy the code from git repo location
# Create reports directory and a dummy file.
mkdir reports
touch reports/dummy
git add .
git commit -m "Initial commit"
git push origin main

The folder structure of workload code repo should match the structure shown in the following screenshot.

workload code folder structure

The first commit triggers the pipeline-workload-main pipeline, which fails in the stage RunComplianceCheck due to the security and compliance repo not being present (which we add in the next section).

Push the security and compliance code into its repo and run its pipeline

Clone the empty CodeCommit repo from the security and compliance account. You can find the URL from the variable OutSourceRepoHttpUrl on the Outputs tab of the cf-SecurityAndCompliancePipeline stack we deployed in the previous section.

  1. Create a new local branch main and push the empty branch to the remote origin so that the main branch exists in the remote repository. Skipping this step leads to a failure in the code merge step of the pipeline due to the absence of the main branch.
  2. Create a new branch develop and copy the security and compliance code into it. This is required because the security and compliance pipeline is configured to be triggered from the develop branch for the purposes of this post.
  3. Copy the cucumber-sandwich.jar file you generated in the prerequisites section into a new folder /lib.

See the following sample script:

cd security-and-compliance-code
git checkout -b main
git add .
git commit --allow-empty -m "initial commit"
git push origin main
git checkout -b develop main
# Here copy the code from git repo location
# You also copy cucumber-sandwich.jar into a new folder /lib
git add .
git commit -m "Initial commit"
git push origin develop

The folder structure of security and compliance code repo should match the structure shown in the following screenshot.

security and compliance code folder structure

The code push to the develop branch of the security-and-compliance-code repo triggers the security and compliance pipeline. The pipeline pulls the code from the workload account repo, then runs the compliance guardrails against the Terraform workload to make sure that the workload is compliant. If the workload is compliant, the pipeline merges the compliance guardrails into the main branch. If the workload fails the compliance test, the pipeline fails. The following screenshot shows a sample run of the pipeline.

security and compliance pipeline

Run the application workload pipeline to exercise the compliance guardrails

After we set up the security and compliance repo and the pipeline runs successfully, the workload pipeline is ready to proceed (see the following screenshot of its progress).

workload pipeline

The service delivery teams are now subject to the security and compliance guardrails (the RunComplianceCheck stage), and their pipeline breaks if any resource is noncompliant.

Review the generated reports

CodeBuild supports viewing reports generated in cucumber JSON format. In our workflow, we generate reports in cucumber JSON and BDD XML formats, and we use this capability of CodeBuild to generate and view HTML reports. Our implementation also generates a report directly in HTML using the cucumber-sandwich library.

The following screenshot is a snippet of the script compliance-check.sh, which implements report generation.

compliance check script

The bug noted in the screenshot is in the radish-bdd library that Terraform-Compliance uses for the cucumber JSON format report generation. For more information, you can review the defect logged against radish-bdd for this issue.

After the script generates the reports, CodeBuild needs to be configured to access them to generate HTML reports. The following screenshot shows a snippet from buildspec-compliance-check.yml, which shows how the reports section is set up for report generation:

buildspec compliance check

For more details on how to set up buildspec file for CodeBuild to generate reports, see Create a test report.

CodeBuild displays the compliance run reports as shown in the following screenshot.

code build cucumber report

We can also view a trending graph for multiple runs.

code build cucumber report

The other report generated by the workflow is the pretty HTML report generated by the cucumber-sandwich library.

code build cucumber report

The reports are available for download from the S3 bucket <OutPipelineBucketName>/pipeline-security-an/report_App/<zip file>.

The cucumber-sandwich report marks scenarios with skipped tests as failed scenarios. This is the only noticeable difference between the CodeBuild-generated and cucumber-sandwich-generated HTML reports.

Clean up

To remove all the resources from the workload account, complete the following steps in order:

  1. Go to the folder where you cloned the workload code and edit buildspec-workload-deploy.yml:
    • Comment line 44 (- ./workload-deploy.sh).
    • Uncomment line 45 (- ./workload-deploy.sh --destroy).
    • Commit and push the code change to the remote repo. The workload pipeline is triggered, which cleans up the workload.
  2. Delete the CloudFormation stack cf-CrossAccountRoles. This step removes the cross-account role from the workload account, which gives permission to the security and compliance account to pull the Terraform workload.
  3. Go to the CloudFormation stack cf-TerraformWorkloadPipeline and note the OutPipelineBucketName and OutStateFileBucketName on the Outputs tab. Empty the two buckets and then delete the stack. This removes pipeline resources from workload account.
  4. Go to the CDKToolkit stack and note the BucketName on the Outputs tab. Empty that bucket and then delete the stack.

To remove all the resources from the security and compliance account, complete the following steps in order:

  1. Delete the CloudFormation stack cf-CrossAccountRoles. This step removes the cross-account role from the security and compliance account, which gives permission to the workload account to pull the compliance code.
  2. Go to CloudFormation stack cf-SecurityAndCompliancePipeline and note the OutPipelineBucketName on the Outputs tab. Empty that bucket and then delete the stack. This removes pipeline resources from the security and compliance account.
  3. Go to the CDKToolkit stack and note the BucketName on the Outputs tab. Empty that bucket and then delete the stack.

Security considerations

Cross-account IAM roles are very powerful and need to be handled carefully. For this post, we strictly limited the cross-account IAM role to specific CodeCommit permissions. This ensures that the cross-account role can perform only those specific actions.
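As an illustration, a policy of that shape could be attached to the role with the AWS CLI. The role name, repository name, account ID, and Region below are placeholders, and the exact set of CodeCommit actions should match what your pipeline actually needs:

# Hypothetical sketch: allow the trusted account to do nothing more than pull one repository
aws iam put-role-policy \
  --role-name cross-account-codecommit-read \
  --policy-name AllowGitPullOnComplianceRepo \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": ["codecommit:GitPull"],
      "Resource": "arn:aws:codecommit:us-east-1:111111111111:security-and-compliance-code"
    }]
  }'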

Conclusion

In this post in our two-part series, we implemented a continuous compliance workflow using CodePipeline and the open-source Terraform-Compliance framework. The Terraform-Compliance framework allows you to build guardrails for securing Terraform applications deployed on AWS.

We also showed how you can use AWS developer tools to seamlessly integrate security and compliance guardrails into an application release cycle and catch noncompliant AWS resources before they are deployed into AWS.

Try implementing the solution in your enterprise as shown in this post, and leave your thoughts and questions in the comments.

About the authors

sumit mishra

 

Sumit Mishra is a Senior DevOps Architect at AWS Professional Services. His areas of expertise include IaC, pipeline security, CI/CD, and automation.

 

 

 

Damodar Shenvi Wagle

 

Damodar Shenvi Wagle is a Cloud Application Architect at AWS Professional Services. His areas of expertise include architecting serverless solutions, CI/CD and automation.

Keeping up with your dependencies: building a feedback loop for shared libraries

Post Syndicated from Joerg Woehrle original https://aws.amazon.com/blogs/devops/keeping-up-with-your-dependencies-building-a-feedback-loop-for-shared-libraries/

In a microservices world, it’s common to share as little as possible between services. This enables teams to work independently of each other, helps to reduce wait times and decreases coupling between services.

However, it’s also a common scenario that libraries for cross-cutting concerns (such as security or logging) are developed once and offered to other teams for consumption. Although it’s vital to offer an opt-out of those libraries (that is, let teams use their own code to address the cross-cutting concern, for example when there is no version for a given language), shared libraries also provide the benefit of better governance and time savings.

To get the benefits of shared artifacts without the associated pitfalls, two points are important:

  • For consumers of shared libraries, it’s important to stay up to date with new releases in order to benefit from security, performance, and feature improvements.
  • For producers of shared libraries, it’s important to get quick feedback in case of an involuntarily added breaking change.

Based on those two factors, we’re looking for the following solution:

  • A frictionless and automated way to update consumer’s code to the latest release version of a given library
  • Immediate feedback to the library producer in case of a breaking change (the new version of the library breaks the build of a downstream system)

In this blog post I develop a solution that takes care of both those problems. I use Amazon EventBridge to be notified on new releases of a library in AWS CodeArtifact. I use an AWS Lambda function along with an AWS Fargate task to automatically create a pull request (PR) with the new release version on AWS CodeCommit. Finally, I use AWS CodeBuild to kick off a build of the PR and notify the library producer via EventBridge and Amazon Simple Notification Service (Amazon SNS) in case of a failure.

Overview of solution

Let’s start with a short introduction on the services I use for this solution:

  1. CodeArtifact – A fully managed artifact repository service that makes it easy for organizations of any size to securely store, publish, and share software packages used in their software development process. CodeArtifact works with commonly used package managers and build tools like Maven, Gradle, npm, yarn, twine, and pip.
  2. CodeBuild – A fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy.
  3. CodeCommit – A fully-managed source control service that hosts secure Git-based repositories.
  4. EventBridge – A serverless event bus that makes it easy to connect applications together using data from your own applications, integrated software as a service (SaaS) applications, and AWS services. EventBridge makes it easy to build event-driven applications because it takes care of event ingestion and delivery, security, authorization, and error handling.
  5. Fargate – A serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.
  6. Lambda – Lets you run code without provisioning or managing servers. You pay only for the compute time you consume.
  7. Amazon SNS – A fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.

The resulting flow through the system looks like the following diagram.

Architecture Diagram

 

In my example, I look at two independent teams working in two different AWS accounts. Team A is the provider of the shared library, and Team B is the consumer.

Let’s do a high-level walkthrough of the involved steps and components:

  1. A new library version is released by Team A and pushed to CodeArtifact.
  2. CodeArtifact creates an event when the new version is published.
  3. I send this event to the default event bus in Team B’s AWS account.
  4. An EventBridge rule in Team B’s account triggers a Lambda function for further processing.
  5. The function filters SNAPSHOT releases (in Maven a SNAPSHOT represents an artifact still under development that doesn’t have a final release yet) and runs an Amazon ECS Fargate task for non-SNAPSHOT versions.
  6. The Fargate task checks out the source that uses the shared library, updates the library’s version in the pom.xml, and creates a pull request to integrate the change into the mainline of the code repository.
  7. The pull request creation results in an event being published.
  8. An EventBridge rule triggers the CodeBuild project of the downstream artifact.
  9. The result of the build is published as an event.
  10. If the build fails, this failure is propagated back to the event bus of Team A.
  11. The failure is forwarded to an SNS topic that notifies the subscribers of the failure.

Amazon EventBridge

A central component of the solution is Amazon EventBridge. I use EventBridge to receive and react to events emitted by the various AWS services in the solution (for example, whenever a new version of an artifact is uploaded to CodeArtifact, when a PR is created in CodeCommit, or when a build fails in CodeBuild). Let’s have a high-level look at some of the central concepts of EventBridge:

  • Event Bus – An event bus is a pipeline that receives events. There is a default event bus in each account which receives events from AWS services. One can send events to an event bus via the PutEvents API.
  • Event – An event indicates a change in, for example, an AWS environment, a SaaS partner service or application, or one of your applications.
  • Rule – A rule matches incoming events on an event bus and sends them to targets for processing. To react to a particular event, you create a rule that matches this event. To learn more about the rule concept, check out Rules in the EventBridge documentation.
  • Target – When an event matches the event pattern defined in a rule, it is sent to a target. There are currently more than 20 target types available in EventBridge. In this blog post I use the targets provided for an event bus in a different account, a Lambda function, a CodeBuild project, and an SNS topic. For a detailed list of available targets, see Amazon EventBridge targets.

Solution details

In this section I walk through the most important parts of the solution. The complete code can be found on GitHub. For a detailed view on the resources created in each account please refer to the GitHub repository.

I use the AWS Cloud Development Kit (CDK) to create my infrastructure. For some of the resource types I create, no higher-level constructs are available yet (at the time of writing, I used AWS CDK version 1.108.1). This is why I sometimes use low-level AWS CloudFormation (Cfn) constructs or the provided escape hatches to work with CloudFormation resources directly.

The code for the shared library producer and consumer is written in Java and uses Apache Maven for dependency management. However, the same concepts apply to other ecosystems, such as Node.js and npm.

Notify another account of new releases

To send events from EventBridge to another account, the receiving account needs to specify an EventBusPolicy. The AWS CDK code on the consumer account looks like the following code:

new events.CfnEventBusPolicy(this, 'EventBusPolicy', {
    statementId: 'AllowCrossAccount',
    action: 'events:PutEvents',
    principal: producerAccount
});

With that, the producer account has permission to publish events into the consumer account’s event bus.

I’m interested in CodeArtifact events that are published on the release of a new artifact. I first create a Rule which matches those events. Next, I add a target to the rule which targets the event bus of account B. As of this writing there is no CDK construct available to directly add another account as a target. That is why I use the underlying CloudFormation CfnRule to do that. This is called an escape hatch in CDK. For more information about escape hatches, see Escape hatches.

const onLibraryReleaseRule = new events.Rule(this, 'LibraryReleaseRule', {
  eventPattern: {
    source: [ 'aws.codeartifact' ],
    detailType: [ 'CodeArtifact Package Version State Change' ],
    detail: {
      domainOwner: [ this.account ],
      domainName: [ codeArtifactDomain.domainName ],
      repositoryName: [ codeArtifactRepo.repositoryName ],
      packageVersionState: [ 'Published' ],
      packageFormat: [ 'maven' ]
    }
  }
});
/* there is currently no CDK construct provided to add an event bus in another account as a target. 
That's why we use the underlying CfnRule directly */
const cfnRule = onLibraryReleaseRule.node.defaultChild as events.CfnRule;
cfnRule.targets = [ {arn: `arn:aws:events:${this.region}:${consumerAccount}:event-bus/default`, id: 'ConsumerAccount'} ];

For more information about event formats, see CodeArtifact event format and example.

Act on new releases in the consumer account

I established the connection between the events produced by Account A and Account B: The events now are available in Account B’s event bus. To use them, I add a rule which matches this event in Account B:

const onLibraryReleaseRule = new events.Rule(this, 'LibraryReleaseRule', {
  eventPattern: {
    source: [ 'aws.codeartifact' ],
    detailType: [ 'CodeArtifact Package Version State Change' ],
    detail: {
      domainOwner: [ producerAccount ],
      packageVersionState: [ 'Published' ],
      packageFormat: [ 'maven' ]
    }
  }
});

Add a Lambda function target

Now that I have a rule that triggers whenever a new package version is published, I add an EventBridge target that invokes my runTaskLambda Lambda function. The following CDK code shows how I add the Lambda function as a target to the onLibraryReleaseRule rule. Notice how I extract information from the event’s payload and pass it into the Lambda function’s invocation event.

onLibraryReleaseRule.addTarget(
    new targets.LambdaFunction( runTaskLambda,{
      event: events.RuleTargetInput.fromObject({
        groupId: events.EventField.fromPath('$.detail.packageNamespace'),
        artifactId: events.EventField.fromPath('$.detail.packageName'),
        version: events.EventField.fromPath('$.detail.packageVersion'),
        repoUrl: codeCommitRepo.repositoryCloneUrlHttp,
        region: this.region
      })
    }));

Filter SNAPSHOT versions

Because I’m not interested in Maven SNAPSHOT versions (such as 1.0.1-SNAPSHOT), I have to find a way to filter those and only act upon non-SNAPSHOT versions. Even though content-based filtering on event patterns is supported by Amazon EventBridge, filtering on suffixes is not supported as of this writing. This is why the Lambda function filters SNAPSHOT versions and only acts upon real, non-SNAPSHOT releases. For those, I start a custom Amazon ECS Fargate task by using the AWS JavaScript SDK. My function passes some environment overrides to the Fargate task in order to have the required information about the artifact available at runtime.

In the following function code, I pass all required information to create a pull request into the environment of the Fargate task:

const AWS = require('aws-sdk');

const ECS = new AWS.ECS();
exports.handler = async (event) => {
    console.log(`Received event: ${JSON.stringify(event)}`)
    const artifactVersion = event.version;
    const artifactId = event.artifactId;
    if ( artifactVersion.indexOf('SNAPSHOT') > -1 ) {
        console.log(`Skipping SNAPSHOT version ${artifactVersion}`)
    } else {
        console.log(`Triggering task to create pull request for version ${artifactVersion} of artifact ${artifactId}`);
        const params = {
            launchType: 'FARGATE',
            taskDefinition: process.env.TASK_DEFINITION_ARN,
            cluster: process.env.CLUSTER_ARN,
            networkConfiguration: {
                awsvpcConfiguration: {
                    subnets: process.env.TASK_SUBNETS.split(',')
                }
            },
            overrides: {
                containerOverrides: [ {
                    name: process.env.CONTAINER_NAME,
                    environment: [
                        {name: 'REPO_URL', value: process.env.REPO_URL},
                        {name: 'REPO_NAME', value: process.env.REPO_NAME},
                        {name: 'REPO_REGION', value: process.env.REPO_REGION},
                        {name: 'ARTIFACT_VERSION', value: artifactVersion},
                        {name: 'ARTIFACT_ID', value: artifactId}
                    ]
                } ]
            }
        };
        await ECS.runTask(params).promise();
    }
};

Create the pull request

With the environment set, I can use a simple bash script inside the container to create a new Git branch, update the pom.xml with the new dependency version, push the branch to CodeCommit, and use the AWS Command Line Interface (AWS CLI) to create the pull request. The Docker entrypoint looks like the following code:

#!/usr/bin/env bash
set -e

# clone the repository and create a new branch for the change
git clone --depth 1 $REPO_URL repo && cd repo
branch="library_update_$(date +"%Y-%m-%d_%H-%M-%S")"
git checkout -b "$branch"

# replace whatever version is currently used by the new version of the library
sed -i "s/<shared\.library\.version>.*<\/shared\.library\.version>/<shared\.library\.version>${ARTIFACT_VERSION}<\/shared\.library\.version>/g" pom.xml

# stage, commit and push the change
git add pom.xml
git -c "user.name=ECS Pull Request Creator" -c "user.email=<email-address>" commit -m "Update version of ${ARTIFACT_ID} to ${ARTIFACT_VERSION}"
git push --set-upstream origin "$branch"

# create pull request
aws codecommit create-pull-request --title "Update version of ${ARTIFACT_ID} to ${ARTIFACT_VERSION}" --targets repositoryName="$REPO_NAME",sourceReference="$branch",destinationReference=main --region "$REPO_REGION"

After a successful run, I can check the CodeCommit UI for the created pull request. The following screenshot shows the changes introduced by one of my pull requests during testing:

Screenshot of the Pull Request in AWS CodeCommit

Now that I have the pull request in place, I want to verify that the dependency update does not break my consumer code. I do this by triggering a CodeBuild project with the help of EventBridge.

Build the pull request

The ingredients I use are the same as with the CodeArtifact event. I create a rule that matches the event emitted by CodeCommit (limiting it to branches that match the prefix used by our Fargate task). Afterwards I add a target to the rule to start the CodeBuild project:

const onPullRequestCreatedRule = new events.Rule(this, 'PullRequestCreatedRule', {
  eventPattern: {
    source: [ 'aws.codecommit' ],
    detailType: [ 'CodeCommit Pull Request State Change' ],
    resources: [ codeCommitRepo.repositoryArn ],
    detail: {
      event: [ 'pullRequestCreated' ],
      sourceReference: [ {
        prefix: 'refs/heads/library_update_'
      } ],
      destinationReference: [ 'refs/heads/main' ]
    }
  }
});
onPullRequestCreatedRule.addTarget( new targets.CodeBuildProject(codeBuild, {
  event: events.RuleTargetInput.fromObject( {
    projectName: codeBuild.projectName,
    sourceVersion: events.EventField.fromPath('$.detail.sourceReference')
  })
}));

This triggers the build whenever a new pull request is created with a branch prefix of refs/head/library_update_.
You can easily add the build results as a comment back to CodeCommit. For more information, see Validating AWS CodeCommit Pull Requests with AWS CodeBuild and AWS Lambda.
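As a hedged sketch, a build step could post its result back to the pull request with the AWS CLI. The pull request ID, repository name, and commit IDs are placeholders that would normally come from the pull request event payload or the CodeBuild environment:

# Hypothetical sketch: comment on the pull request with the build outcome
aws codecommit post-comment-for-pull-request \
  --pull-request-id 42 \
  --repository-name library-consumer \
  --before-commit-id 0123456abcdef \
  --after-commit-id 789def0123456 \
  --content "Build FAILED for the library update. See the CodeBuild logs for details."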

My last step is to notify an SNS topic in case of a failing build. The SNS topic is a resource in Account A. To target a resource in a different account, I need to forward the event to that account’s event bus. From there I then target the SNS topic.

First, I forward the failed build event from Account B into the default event bus of Account A:

const onFailedBuildRule = new events.Rule(this, 'BrokenBuildRule', {
  eventPattern: {
    detailType: [ 'CodeBuild Build State Change' ],
    source: [ 'aws.codebuild' ],
    detail: {
      'build-status': [ 'FAILED' ]
    }
  }
});
const producerAccountTarget = new targets.EventBus(events.EventBus.fromEventBusArn(this, 'cross-account-event-bus', `arn:aws:events:${this.region}:${producerAccount}:event-bus/default`))
onFailedBuildRule.addTarget(producerAccountTarget);

Then I target the SNS topic in Account A to be notified of failures:

const onFailedBuildRule = new events.Rule(this, 'BrokenBuildRule', {
  eventPattern: {
    detailType: [ 'CodeBuild Build State Change' ],
    source: [ 'aws.codebuild' ],
    account: [ consumerAccount ],
    detail: {
      'build-status': [ 'FAILED' ]
    }
  }
});
onFailedBuildRule.addTarget(new targets.SnsTopic(notificationTopic));

See it in action

I use the cdk-assume-role-credential-plugin to deploy to both accounts, producer and consumer, with a single CDK command issued to the producer account. To do this I create roles for cross account access from the producer account in the consumer account as described here. I also make sure that the accounts are bootstrapped for CDK as described here. After that I run the following steps:

  1. Deploy the Stacks:
    cd cdk && cdk deploy --context region=<YOUR_REGION> --context producerAccount=<PRODUCER_ACCOUNT_NO> --context consumerAccount=<CONSUMER_ACCOUNT_NO> --all && cd -
  2. After a successful deployment CDK prints a set of export commands. I set my environment from those Outputs:
    ❯ export CODEARTIFACT_ACCOUNT=<MY_PRODUCER_ACCOUNT>
    ❯ export CODEARTIFACT_DOMAIN=<MY_CODEARTIFACT_DOMAIN>
    ❯ export CODEARTIFACT_REGION=<MY_REGION>
    ❯ export CODECOMMIT_URL=<MY_CODECOMMIT_URL>
  3. Setup Maven to authenticate to CodeArtifact
    export CODEARTIFACT_TOKEN=$(aws codeartifact get-authorization-token --domain $CODEARTIFACT_DOMAIN --domain-owner $CODEARTIFACT_ACCOUNT --query authorizationToken --output text)
  4. Release the first version of the shared library to CodeArtifact:
    cd library_producer/library && mvn --settings ./settings.xml deploy && cd -
  5. From a console which is authenticated/authorized for CodeCommit in the Consumer Account
    1. Setup git to work with CodeCommit
    2. Push the code of the library consumer to CodeCommit:
      cd library_consumer/library && git init && git add . && git commit -m "Add consumer to codecommit" && git remote add codecommit $CODECOMMIT_URL && git push --set-upstream codecommit main && cd -
  6. Release a new version of the shared library:
    cd library_producer/library && sed -i '' 's/<version>1.0.0/<version>1.0.1/' pom.xml && mvn --settings settings.xml deploy && cd -
  7. After 1-3 minutes a Pull Request is created in the CodeCommit repo in the Consumer Account and a build is run to verify this PR:
    Screenshot of AWS CodeBuild running the build
  8. In case of a build failure, you can create a subscription to the SNS topic in Account A to act upon the broken build.

Clean up

If you followed along with this blog post and want to avoid incurring costs, you have to delete the created resources. Run cdk destroy --context region=<YOUR_REGION> --context producerAccount=<PRODUCER_ACCOUNT_NO> --context consumerAccount=<CONSUMER_ACCOUNT_NO> --all to delete the CloudFormation stacks.

Conclusion

In this post, I automated the manual task of updating a shared library dependency version. I used a workflow that not only updates the dependency version, but also notifies the library producer in case the new artifact introduces a regression (for example, an API incompatibility with an older version). By using Amazon EventBridge I’ve created a loosely coupled solution which can be used as a basis for a feedback loop between library creators and consumers.

What next?

To improve the solution, I suggest looking into error handling for the Fargate task. What happens if the git operation fails? How do we signal such a failure? You might want to replace the AWS Fargate portion with a Lambda-only solution and use AWS Step Functions for better error handling.

As a next step, I could think of a solution that automates updates for libraries stored in Maven Central. Wouldn’t it be nice to never miss the release of a new Spring Boot version? A Fargate task run on a schedule and the following code should get you going:

curl -sS 'https://search.maven.org/solrsearch/select?q=g:org.springframework.boot%20a:spring-boot-starter&start=0&rows=1&wt=json' | jq -r '.response.docs[ 0 ].latestVersion'

Happy Building!

Author bio

Picture of the author: Joerg Woehrle Joerg is a Solutions Architect at AWS and works with manufacturing customers in Germany. As a former Developer, DevOps Engineer and SRE he enjoys building and automating things.

 

Building a CI/CD pipeline to update an AWS CloudFormation StackSets

Post Syndicated from Karim Afifi original https://aws.amazon.com/blogs/devops/building-a-ci-cd-pipeline-to-update-an-aws-cloudformation-stacksets/

AWS CloudFormation StackSets extend the functionality of CloudFormation stacks by enabling you to create, update, or delete one or more stacks across multiple accounts. As a developer working in a large enterprise or for a group that supports multiple AWS accounts, you may often find yourself challenged with updating AWS CloudFormation StackSets. If you’re building a CI/CD pipeline to automate the process of updating CloudFormation stacks, you can do so natively: AWS CodePipeline can initiate a workflow that builds and tests a stack, and then pushes it to production. The workflow can create or manipulate an existing stack; however, working with AWS CloudFormation StackSets is not a supported action at the time of this writing.

You can update an existing CloudFormation stack using one of two methods:

  • Directly updating the stack – AWS immediately deploys the changes that you submit. You can use this method when you want to quickly deploy your updates.
  • Running change sets – You can preview the changes AWS CloudFormation will make to the stack, and decide whether to proceed with the changes.

You have several options when building a CI/CD pipeline to automate creating or updating a stack. You can create or update a stack, delete a stack, create or replace a change set, or run a change set. Creating or updating a CloudFormation StackSet, however, is not a supported action.
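For a single stack, for example, the change set flow that CodePipeline can automate natively maps to AWS CLI calls similar to the following; the stack name, change set name, and parameter values are placeholders:

# Preview the proposed update as a change set
aws cloudformation create-change-set \
  --stack-name my-stack \
  --change-set-name update-kms-id \
  --use-previous-template \
  --parameters ParameterKey=KMSId,ParameterValue=newCustomValue

# Review the changes, then apply them
aws cloudformation describe-change-set --stack-name my-stack --change-set-name update-kms-id
aws cloudformation execute-change-set --stack-name my-stack --change-set-name update-kms-id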

The following screenshot shows the existing actions supported by CodePipeline against AWS CloudFormation on the CodePipeline console.

CodePipeline console

This post explains how to use CodePipeline to update an existing CloudFormation StackSet. For this post, we update the StackSet’s parameters. Parameters enable you to input custom values to your template each time you create or update a stack.

Overview of solution

To implement this solution, we walk you through the following high-level steps:

  1. Update a parameter for a StackSet by passing a parameter key and its associated value via an AWS CodeCommit repository.
  2. Create an AWS CodeBuild project.
  3. Build a CI/CD pipeline.
  4. Run your pipeline and monitor its status.

After completing all the steps in this post, you will have a fully functional CI/CD pipeline that updates the CloudFormation StackSet parameters. The pipeline starts automatically after you push the intended changes to the CodeCommit repository.

The following diagram illustrates the solution architecture.

Solution Architecture

The solution workflow is as follows:

  1. Developers integrate changes into a main branch hosted within a CodeCommit repository.
  2. CodePipeline polls the source code repository and triggers the pipeline to run when a new version is detected.
  3. CodePipeline runs a build of the new revision in CodeBuild.
  4. CodeBuild runs the commands in the buildspec.yml file, which include the changes against the StackSet. (To update all the stack instances associated with this StackSet, do not specify DeploymentTargets or Regions in the buildspec.yml file; a scoped example follows this list.)
  5. Verify that the changes were applied successfully.
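The scoped variant of the update, mentioned in step 4, could look like the following sketch. The account IDs and Regions are placeholders; omit --accounts and --regions to update every stack instance in the StackSet:

# Hypothetical sketch: limit the StackSet update to specific accounts and Regions
aws cloudformation update-stack-set \
  --stack-set-name StackSet-Test \
  --use-previous-template \
  --parameters ParameterKey=KMSId,ParameterValue=newCustomValue \
  --accounts 111111111111 222222222222 \
  --regions us-east-1 eu-west-1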

Prerequisites

To complete this tutorial, you should have the following prerequisites:

Retrieving your StackSet parameters

Your first step is to verify that you have a StackSet in the AWS account you intend to use. If not, create one before proceeding. For this post, we use an existing StackSet called StackSet-Test.

  1. Sign in to your AWS account.
  2. On the CloudFormation console, choose StackSets.
  3. Choose your StackSet.

StackSet

For this post, we modify the value of the parameter with the key KMSId.

  1. On the Parameters tab, note the value of the key KMSId.

Parameters

Creating a CodeCommit repository

To create your repository, complete the following steps:

  1. On the CodeCommit console, choose Repositories.
  2. Choose Create repository.

Repositories name

  1. For Repository name, enter a name (for example, Demo-Repo).
  2. Choose Create.

Repositories Description

  1. Choose Create file to populate the repository with the following artifacts.

Create file

A buildspec.yml file informs CodeBuild of all the actions that should be taken during a build run for our application. We divide the build run into separate predefined phases for logical organization, and list the commands that run on the provisioned build server performing a build job.

  1. Enter the following code in the code editor:


version: 0.2
phases:
  pre_build:
    commands:
      - aws cloudformation update-stack-set --stack-set-name StackSet-Test --use-previous-template --parameters ParameterKey=KMSId,ParameterValue=newCustomValue

The preceding AWS CloudFormation command updates a StackSet with the name StackSet-Test. The command results in updating the parameter value of the parameter key KMSId to newCustomValue.

  1. Name the file buildspec.yml.
  2. Provide an author name and email address.
  3. Choose Commit changes.

Creating a CodeBuild project

To create your CodeBuild project, complete the following steps:

  1. On the CodeBuild console, choose Build projects.
  2. Choose Create build project.

create build project

  1. For Project name, enter your project name (for example, Demo-Build).
  2. For Description, enter an optional description.

project name

  1. For Source provider, choose AWS CodeCommit.
  2. For Repository, choose the CodeCommit repository you created in the previous step.
  3. For Reference type, keep default selection Branch.
  4. For Branch, choose master.

Source configuration

To set up the CodeBuild environment, we use a managed image based on Amazon Linux 2.

  1. For Environment Image, select Managed image.
  2. For Operating system, choose Amazon Linux 2.
  3. For Runtime(s), choose Standard.
  4. For Image, choose amazonlinux2-aarch64-standard:1.0.
  5. For Image version, choose Always use the latest for this runtime version.

Environment

  1. For Service role, select New service role.
  2. For Role name, enter your service role name.

Service Role

  1. Choose Create build project.

Creating a CodePipeline pipeline

To create your pipeline, complete the following steps:

  1. On the CodePipeline console, choose Pipelines.
  2. Choose Create pipeline.

Code Pipeline

  1. For Pipeline name, enter a name for the pipeline (for example, DemoPipeline).
  2. For Service role, select New service role.
  3. For Role name, enter your service role name.

Pipeline name

  1. Choose Next.
  2. For Source provider, choose AWS CodeCommit.
  3. For Repository name, choose the repository you created.
  4. For Branch name, choose master.

Source Configurations

  1. Choose Next.
  2. For Build provider, choose AWS CodeBuild.
  3. For Region, choose your Region.
  4. For Project name, choose the build project you created.

CodeBuild

  1. Choose Next.
  2. Choose Skip deploy stage.
  3. When prompted, choose Skip to confirm.
  4. Choose Create pipeline.

The pipeline is now created successfully.

Running and monitoring your pipeline

We use the pipeline to release changes. By default, a pipeline starts automatically when it’s created and any time a change is made in a source repository. You can also manually run the most recent revision through your pipeline, as in the following steps:

  1. On the CodePipeline console, choose the pipeline you created.
  2. On the pipeline details page, choose Release change.
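If you prefer the AWS CLI, the equivalent of choosing Release change is a single command; the pipeline name is the one you chose earlier, for example DemoPipeline:

# Start a new execution of the most recent revision
aws codepipeline start-pipeline-execution --name DemoPipeline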

The following screenshot shows the status of the run from the pipeline.

Release change

  1. Under Build, choose Details to view build logs, phase details, reports, environment variables, and build details.

Build details

  1. Choose the Build logs tab to view the logs generated as a result of the build in more detail.

The following screenshot shows that we ran the AWS CloudFormation command that was provided in the buildspec.yml file. It also shows that all phases of the build process are successfully complete.

 

Phase Details

The StackSet parameter KMSId has been updated successfully with the new value newCustomValue as a result of running the pipeline. Note that we used the KMSId parameter as an example for demonstration purposes; any other parameter that is part of your StackSet could be used instead.

Cleaning up

You may delete the resources that you created during this post:

  • AWS CloudFormation StackSet.
  • AWS CodeCommit repository.
  • AWS CodeBuild project.
  • AWS CodePipeline.

Conclusion

In this post, we explored how to use CodePipeline, CodeBuild, and CodeCommit to update an existing CloudFormation StackSet. Happy coding!

About the author

Karim Afifi is a Solutions Architect Leader with Amazon Web Services. He is part of the Global Life Sciences Solution Architecture team. He is based out of New York, and enjoys helping customers throughout their journey to innovation.

 

Use AWS CodeCommit to mirror an Azure DevOps repository using an Azure DevOps pipeline

Post Syndicated from Michael Massey original https://aws.amazon.com/blogs/devops/use-aws-codecommit-to-mirror-an-azure-devops-repository-using-an-azure-devops-pipeline/

AWS customers with Git repositories in Azure DevOps can automatically back up their repositories in the AWS Cloud using an AWS CodeCommit repository as a replica. By configuring an Azure DevOps pipeline, the source and replica repositories can be automatically kept in sync. When updates are pushed to the source repository, the pipeline is triggered to clone the repository and push it to the replica repository in AWS.

In this post, we show you how to automatically sync a source repository in Azure DevOps to a replica repository in AWS CodeCommit using an Azure DevOps pipeline.

Solution overview

The following diagram shows a high-level architecture of the pipeline.
Solution architecture diagram
To replicate your repository in the AWS Cloud, you perform the following steps, which we cover in this blog post:

  1. Create a repository in CodeCommit.
  2. Create a policy, user, and HTTPS Git credentials in AWS Identity and Access Management (IAM).
  3. Create a pipeline in Azure DevOps.

Prerequisites

Before you get started, make sure you have the following prerequisites set up:

  • An AWS account
  • An Azure DevOps repository

Creating a repository in CodeCommit

You first create a new repository in CodeCommit to use as your replica repository. You need the URL and Amazon Resource Name (ARN) of the replica repository to complete this example pipeline. Follow these steps to create the repository and get the URL and ARN:

  1. Create a CodeCommit repository in the Region of your choice. Choose a name to help you remember that this repository is a replica or backup repository (for example, MyRepoReplica). Important: Do not manually push any changes to this replica repository. It will cause conflicts later when your pipeline pushes changes in the source repository. Treat it as a read-only repository and push all of your development changes to your source repository.
  2. On the AWS CodeCommit console, choose Repositories.
    CodeCommit consol screenshot
  3. Choose your repository and choose View Repository.
    CodeCommit repo screenshot
  4. Choose Clone URL and choose Clone HTTPS. This copies the repository’s URL. Save it by pasting it into a plain-text editor.
    CodeCommit console screenshot
  5. On the navigation pane, under Repositories, choose Settings.
    CodeCommit console screenshot
  6. Copy the value of Repository ARN and save it by pasting it into a plain-text editor.
    CodeCommit repo screenshot

Creating a policy, user, and HTTPS Git credentials in IAM

The pipeline needs permissions and credentials to push commits to your CodeCommit repository. In this example, you create an IAM policy, IAM user, and HTTPS Git credentials for the pipeline to give it access to your repository in AWS. You grant least privilege to the IAM user so the pipeline can only push to your replica repository.

To create the IAM policy, complete the following steps:

  1. On the IAM console, choose Policies.
  2. Choose Create Policy.
  3. Choose JSON.
    IAM console screenshot
  4. Enter a policy that grants permission to push commits to your repository. You can use a policy that’s similar to the following. For the resource element, specify the ARN of your CodeCommit repository:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "codecommit:GitPush",
                "Resource": "arn:aws:codecommit:us-east-1:123456789012:MyRepoReplica"
            }
        ]
    }
    

  5. Choose Review policy.
    IAM console screenshot
  6. For Name, enter a name for your policy (for example, CodeCommitMyRepoReplicaGitPush).
  7. Choose Create policy.

For more information, see Creating IAM Policies.
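If you script this setup, the same policy can be created with the AWS CLI. This is a minimal sketch; it assumes you have saved the JSON policy document above to a local file named codecommit-gitpush-policy.json (a hypothetical file name) and uses the example policy name from step 6.

    # Create the least-privilege Git push policy from the JSON document above
    aws iam create-policy \
      --policy-name CodeCommitMyRepoReplicaGitPush \
      --policy-document file://codecommit-gitpush-policy.json
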

You can now create the IAM user.

  1. On the IAM console, choose Users.
  2. Choose Add user.
  3. Enter a User name (for example, azure-devops-pipeline).
    IAM console screenshot
  4. Select Programmatic access.
    IAM console screenshot
  5. Choose Next: Permissions.
  6. Select Attach existing policies directly and select the IAM policy you created.
    IAM console screenshot
  7. Choose Next: Tags.
  8. Choose Next: Review.
  9. Choose Create user.
  10. When presented with security credentials, choose Close.
    IAM console screenshot
  11. Choose your new user by clicking on the user name link.
  12. Choose Security Credentials.
    IAM console screenshot
  13. Under Access keys, remove the existing access key.
    IAM console screenshot
  14. Under HTTPS Git credentials for AWS CodeCommit, choose Generate credentials.
    IAM console screenshot
  15. Choose Download credentials to save the user name and password.
    IAM console screenshot
  16. Choose Close.

For more information, see Creating an IAM user and Setup for HTTPS users using Git credentials.
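The user setup can also be scripted. The sketch below assumes the example user name azure-devops-pipeline and the policy created in the previous section; replace 123456789012 with your AWS account ID. Note that this flow never creates access keys, so there is nothing to remove afterwards.

    # Create the pipeline user and attach the Git push policy
    aws iam create-user --user-name azure-devops-pipeline
    aws iam attach-user-policy \
      --user-name azure-devops-pipeline \
      --policy-arn arn:aws:iam::123456789012:policy/CodeCommitMyRepoReplicaGitPush

    # Generate HTTPS Git credentials for CodeCommit
    # (the response contains the Git user name and password for the pipeline)
    aws iam create-service-specific-credential \
      --user-name azure-devops-pipeline \
      --service-name codecommit.amazonaws.com
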

Creating a pipeline in Azure DevOps

The pipeline in this post clones a mirror of your source repository and pushes it to your CodeCommit repository. The pipeline requires the URL of your source repository and HTTPS Git credentials to clone it.

To find the URL of your source repository and to generate HTTPS Git credentials, complete the following steps:

  1. Go to the Repos page within Azure DevOps and choose your repository.
  2. Choose Clone.
  3. Choose HTTPS.
  4. Copy and save the URL by pasting it into a plain-text editor.
  5. Choose Generate Git Credentials.
  6. Copy the user name and password and save them by pasting them into a plain-text editor.
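Before creating the pipeline, you can optionally verify the URLs and credentials by running the mirror flow once from any machine with Git installed. This is a minimal sketch of the same commands the pipeline runs; passwords containing reserved characters must be URL-encoded, and the repository URLs should not include the prefixes noted in the variables table later in this post.

    # One-time manual test of the mirror flow
    git clone --mirror https://<AZURE_GIT_USERNAME>:<AZURE_GIT_PASSWORD>@<AZURE_REPO_URL> repo-mirror
    cd repo-mirror
    git push --mirror https://<AWS_GIT_USERNAME>:<AWS_GIT_PASSWORD>@<AWS_REPO_URL>
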

Now that you have the URL and HTTPS Git credentials, create a pipeline.

  1. Go to the Pipeline page within Azure DevOps.
  2. Choose Create Pipeline (or New Pipeline).
  3. Choose Azure Repos Git.
    Azure DevOps screenshot
  4. Choose your repository.
    Azure DevOps screenshot
  5. Choose Starter pipeline.
    Azure DevOps screenshot
  6. Enter the following YAML code to replace the default pipeline YAML:
    # Pipeline to automatically mirror
    # an Azure DevOps repository in AWS CodeCommit
    
    # Trigger on all branches
    trigger:
    - '*'
    
    # Use latest Ubuntu image
    pool:
      vmImage: 'ubuntu-latest'
    
    # Pipeline
    steps:
    - checkout: none
    - script: |
          
          # Install urlencode function to encode reserved characters in passwords
          sudo apt-get install gridsite-clients
    
          # Create local mirror of Azure DevOps repository
          git clone --mirror https://${AZURE_GIT_USERNAME}:$(urlencode ${AZURE_GIT_PASSWORD})@${AZURE_REPO_URL} repo-mirror
          
          # Sync AWS CodeCommit repository
          cd repo-mirror
          git push --mirror https://${AWS_GIT_USERNAME}:$(urlencode ${AWS_GIT_PASSWORD})@${AWS_REPO_URL}
          
      displayName: 'Sync repository with AWS CodeCommit'
      env:
        AZURE_REPO_URL: $(AZURE_REPO_URL)
        AZURE_GIT_USERNAME: $(AZURE_GIT_USERNAME)
        AZURE_GIT_PASSWORD: $(AZURE_GIT_PASSWORD)
        AWS_REPO_URL: $(AWS_REPO_URL)
        AWS_GIT_USERNAME: $(AWS_GIT_USERNAME)
        AWS_GIT_PASSWORD: $(AWS_GIT_PASSWORD)
    

Add the following variables to your pipeline using the steps below:

  • AZURE_REPO_URL – Your Azure DevOps repository URL, without the leading https://user@ (keep secret: optional)
  • AZURE_GIT_USERNAME – Your Azure HTTPS Git credentials user name (keep secret: yes)
  • AZURE_GIT_PASSWORD – Your Azure HTTPS Git credentials password (keep secret: yes)
  • AWS_REPO_URL – Your CodeCommit repository URL, without the leading https:// (keep secret: optional)
  • AWS_GIT_USERNAME – Your AWS HTTPS Git credentials user name (keep secret: yes)
  • AWS_GIT_PASSWORD – Your AWS HTTPS Git credentials password (keep secret: yes)

  1. Choose Variables.
  2. Choose New Variable.
  3. Enter the variable Name and Value.
  4. Select Keep this value secret when adding any user name or password variable.
    Azure DevOps screenshot
  5. Choose OK.
  6. Repeat for each variable.
    Azure DevOps screenshot
  7. Choose Save.
  8. Choose Save and run.
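If you prefer to script the variable creation, the Azure DevOps CLI can add pipeline variables as well. The following is a sketch only; it assumes the azure-devops extension for the Azure CLI is installed and signed in to your organization and project, and exact flags may vary between extension versions.

    # Example: add one secret variable to the pipeline (repeat for each variable)
    az pipelines variable create \
      --pipeline-name <your-pipeline-name> \
      --name AWS_GIT_PASSWORD \
      --value '<your-codecommit-git-password>' \
      --secret true
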

Verifying the pipeline

When you save the pipeline, it commits the pipeline’s YAML file (azure-pipelines.yml) to the root of your source repository’s primary branch and then runs. You can verify that the pipeline ran successfully by viewing the pipeline job in Azure DevOps pipelines and viewing your replica repository on the CodeCommit console.

  1. Go to the Pipeline page within Azure DevOps and choose your pipeline.
  2. Choose the entry for the latest run.
  3. Under Jobs, choose Job to view the output of your pipeline.
    Azure DevOps screenshot
  4. On the CodeCommit console, choose Repositories.
  5. Choose your repository and choose View Repository.
  6. On the navigation pane, choose Commits.
  7. Verify that the CodeCommit repository contains the latest commit from Azure DevOps.
    CodeCommit console screenshot

The pipeline runs whenever a new commit is pushed to the source repository. All updates are mirrored in the replica CodeCommit repository, including commits, branches, and references.
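Beyond the console checks above, you can confirm the two repositories are in sync by comparing their refs directly. A quick sketch using the URLs and Git credentials from earlier (Git prompts for the passwords):

    # Compare branch and tag refs between the source and the replica
    git ls-remote https://<AZURE_GIT_USERNAME>@<AZURE_REPO_URL> | sort > azure-refs.txt
    git ls-remote https://<AWS_GIT_USERNAME>@<AWS_REPO_URL> | sort > aws-refs.txt
    diff azure-refs.txt aws-refs.txt && echo "Repositories are in sync"
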

Cleaning up

When you’ve completed all steps and are finished testing, follow these steps to delete resources to avoid incurring costs:

  1. On the CodeCommit console, choose Repositories.
  2. Choose your repository and choose Delete Repository.
  3. On the IAM console, choose Users.
  4. Choose your pipeline user and choose Delete User.
  5. On the navigation pane, choose Policies.
  6. Choose your CodeCommit Git push policy and choose Policy Actions and Delete.
  7. Go to the Pipeline page within Azure DevOps and choose your pipeline.
  8. Choose More Actions and choose Delete.

Conclusion

This post showed how you can use an Azure DevOps pipeline to mirror an Azure DevOps repository in CodeCommit. It provided detailed instructions on setting up your replica repository in CodeCommit, creating a least privilege access policy and user credentials for the pipeline in IAM, and creating the pipeline in Azure DevOps. You can use this solution to automatically replicate your Azure DevOps repositories in AWS for backup purposes or as a source to build CI/CD pipelines within AWS.

About the author

Michael Massey is a Cloud Application Architect at Amazon Web Services. He helps AWS customers achieve their goals by building highly available and highly scalable solutions on the AWS Cloud.

Hackathons with AWS Cloud9: Collaboration simplified for your next big idea

Post Syndicated from Mahesh Biradar original https://aws.amazon.com/blogs/devops/hackathons-with-aws-cloud9-collaboration-simplified-for-your-next-big-idea/

Many organizations host ideation events to innovate and prototype new ideas faster. These events usually run for a short duration and involve collaboration between members of participating teams. By the end of the event, teams are expected to demonstrate a working prototype, and the winner or the next steps are determined. Therefore, it's important to build a working proof of concept quickly, and to do that, teams need to be able to share code and get it peer reviewed in real time.

In this post, you see how AWS Cloud9 can help teams collaborate, pair program, and track each other’s inputs in real time for a successful hackathon experience.

AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug code from any machine with just a browser. A shared environment is an AWS Cloud9 development environment that multiple users have been invited to participate in and can edit or view its shared resources.

Pair programming and mob programming are development approaches in which two or more developers collaborate simultaneously to design, code, or test solutions. At the core is the premise that two or more people collaborate on the same code at the same time, which allows for real-time code review and can result in higher quality software.

Hackathons are one of the best ways to collaboratively solve problems, often with code. Cross-functional two-pizza teams compete with limited resources under time constraints to solve a challenging business problem. Several companies have adopted the concept of hackathons to foster a culture of innovation, providing a platform for developers to showcase their creativity and acquire new skills. Teams are either provided a roster of ideas to choose from or come up with their own new idea.

Solution overview

In this post, you create an AWS Cloud9 environment shared with three AWS Identity and Access Management (IAM) users (the hackathon team). You also see how this team can code together to develop a sample serverless application using an AWS Serverless Application Model (AWS SAM) template.

 

The following diagram illustrates the deployment architecture.

Architecture diagram

Figure 1: Solution overview

Prerequisites

To complete the steps in this post, you need an AWS account with administrator privileges.

Set up the environment

To start setting up your environment, complete the following steps:

    1. Create an AWS Cloud9 environment in your AWS account.
    2. Create and attach an instance profile to AWS Cloud9 to call AWS services from an environment. For more information, see Create and store permanent access credentials in an environment.
    3. On the AWS Cloud9 console, select the environment you just created and choose View details.

      Screenshot of Cloud9 console

      Figure 2: Cloud9 View details

    4. Note the environment ID from the Environment ARN value; we use this ID in a later step.

      Screenshot of Cloud9 console showing ARN

      Figure 3: Environment ARN

    5. In your AWS Cloud9 terminal, create the file usersetup.sh with the following contents:
      #USAGE: 
      #STEP 1: Execute following command within Cloud9 terminal to retrieve environment id
      # aws cloud9 list-environments
      #STEP 2: Execute following command by providing appropriate parameters: -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3 
      # sh usersetup.sh -e 877f86c3bb80418aabc9956580436e9a -u User1,User2
      function usage() {
        echo "USAGE: sh usersetup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3"
      }
      while getopts ":e:u:" opt; do
        case $opt in
          e)  if ! aws cloud9 describe-environment-status --environment-id "$OPTARG" 2>&1 >/dev/null; then
                echo "Please provide valid cloud9 environmentid."
                usage
                exit 1
              fi
              environmentId="$OPTARG" ;;
          u)  if [ "$OPTARG" == "" ]; then
                echo "Please provide comma separated list of usernames."
                usage
                exit 1
              fi
              users="$OPTARG" ;;
          \?) echo "Incorrect arguments."
              usage
              exit 1;;
        esac
      done
      if [ "$OPTIND" -lt 5 ]; then
        echo "Missing required arguments."
        usage
        exit 1
      fi
      IFS=',' read -ra userNames <<< "$users"
      groupName='HackathonUsers'
      groupPolicy='arn:aws:iam::aws:policy/AdministratorAccess'
      userArns=()
      function createUsers() {
          userList=""    
          if aws iam get-group --group-name $groupName  > /dev/null 2>&1; then
            echo "$groupName group already exists."  
          else
            if aws iam create-group --group-name $groupName 2>&1 >/dev/null; then
              echo "Created user group - $groupName."  
            else
              echo "Error creating user group - $groupName."  
              exit 1
            fi
          fi
          if aws iam attach-group-policy --policy-arn $groupPolicy --group-name $groupName; then
            echo "Attached group policy."  
          else
            echo "Error attaching group policy to - $groupName."  
            exit 1
          fi
          
          for userName in "${userNames[@]}" ; do 
              
              randomPwd=`aws secretsmanager get-random-password \
              --require-each-included-type \
              --password-length 20 \
              --no-include-space \
              --output text`
          
              userList="$userList"$'\n'"Username: $userName, Password: $randomPwd"
              
              userArn=`aws iam create-user \
              --user-name $userName \
              --query 'User.Arn' | sed -e 's/\/.*\///g' | tr -d '"'`
              
              userArns+=( $userArn )
            
              aws iam wait user-exists \
              --user-name $userName
              
              echo "Successfully created user $userName."
              
              aws iam create-login-profile \
              --user-name $userName \
              --password $randomPwd \
              --password-reset-required 2>&1 >/dev/null
              
              aws iam add-user-to-group \
              --user-name $userName \
              --group-name $groupName
          done
          echo "Waiting for users profile setup..."
          sleep 8
          
          for arn in "${userArns[@]}" ; do 
            aws cloud9 create-environment-membership \
              --environment-id $environmentId \
              --user-arn $arn \
              --permissions read-write 2>&1 >/dev/null
          done
          echo "Following users have been created and added to $groupName group."
          echo "$userList"
      }
      createUsers
      
    6. Run the following command by replacing the following parameters:
        1. ENVIRONMENTID – The environment ID you saved earlier
        2. USERNAME1, USERNAME2… – A comma-separated list of users. In this example, we use three users.

      sh usersetup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3
      The script creates the following resources:

        • The number of IAM users that you defined
        • The IAM user group HackathonUsers, with the users created in the previous step added to it and granted administrator access
        • A random password for each user, which must be changed before their first login
        • Output in the AWS Cloud9 terminal listing the user names and passwords, which you can share with your team
    7. Instruct your team to sign in to the AWS Cloud9 console and open the shared environment by choosing Shared with you.

      Screenshot of Cloud9 console showing environments

      Figure 4: Shared environments

    8. Run the create-repository command, specifying a unique name, optional description, and optional tags:
      aws codecommit create-repository --repository-name hackathon-repo --repository-description "Hackathon repository" --tags Team=hackathon
    9. Note the cloneUrlHttp value from the output; we use this in a later step.
      Terminal showing environment metadata after running the command

      Figure 5: CodeCommit repository URL

      The environment is now ready for the hackathon team to start coding.

    10. Instruct your team members to open the shared environment from the AWS Cloud9 dashboard.
    11. For demo purposes, you can quickly create a sample Python-based Hello World application using the AWS SAM CLI (see the sample command after these steps).
    12. Run the following commands to commit the files to the local repo:

      cd hackathon-repo
      git config --global init.defaultBranch main
      git init
      git add .
      git commit -m "Initial commit
    13. Run the following command to push the local repo to AWS CodeCommit by replacing CLONE_URL_HTTP with the cloneUrlHttp value you noted earlier:
      git push <CLONE_URL_HTTP> --all
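The following command is one way to scaffold the Hello World application referenced in step 11. It is a sketch that assumes the AWS SAM CLI is installed in the Cloud9 environment; the project name hackathon-repo is used so that the generated directory matches the later commit steps, and the runtime can be any Python version your team prefers.

    # Scaffold a Python Hello World application with the AWS SAM CLI
    sam init \
      --runtime python3.9 \
      --app-template hello-world \
      --package-type Zip \
      --name hackathon-repo \
      --no-interactive
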

For a sample collaboration scenario, watch the video Collaboration with Cloud9.

 

Clean up

The following cleanup script deletes all of the resources that the setup script created. Make a local copy of any files you want to save.

  1. Create a file named cleanup.sh with the following content:
    #USAGE: 
    #STEP 1: Execute following command within Cloud9 terminal to retrieve environment id
    # aws cloud9 list-environments
    #STEP 2: Execute following command by providing appropriate parameters: -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3 
    # sh cleanup.sh -e 877f86c3bb80418aabc9956580436e9a -u User1,User2
    function usage() {
      echo "USAGE: sh cleanup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3"
    }
    while getopts ":e:u:" opt; do
      case $opt in
        e)  if ! aws cloud9 describe-environment-status --environment-id "$OPTARG" 2>&1 >/dev/null; then
              echo "Please provide valid cloud9 environmentid."
              usage
              exit 1
            fi
            environmentId="$OPTARG" ;;
        u)  if [ "$OPTARG" == "" ]; then
              echo "Please provide comma separated list of usernames."
              usage
              exit 1
            fi
            users="$OPTARG" ;;
        \?) echo "Incorrect arguments."
            usage
            exit 1;;
      esac
    done
    if [ "$OPTIND" -lt 5 ]; then
      echo "Missing required arguments."
      usage
      exit 1
    fi
    IFS=',' read -ra userNames <<< "$users"
    groupName='HackathonUsers'
    groupPolicy='arn:aws:iam::aws:policy/AdministratorAccess'
    function cleanUp() {
        echo "Starting cleanup..."
        groupExists=false
        if aws iam get-group --group-name $groupName  > /dev/null 2>&1; then
          groupExists=true
        else
          echo "$groupName does not exist."  
        fi
        
        for userName in "${userNames[@]}" ; do 
            if ! aws iam get-user --user-name $userName >/dev/null 2>&1; then
              echo "$userName does not exist."  
            else
              userArn=$(aws iam get-user \
              --user-name $userName \
              --query 'User.Arn' | tr -d '"') 
              
              if $groupExists ; then 
                aws iam remove-user-from-group \
                --user-name $userName \
                --group-name $groupName
              fi
      
              aws iam delete-login-profile \
              --user-name $userName 
      
              if aws iam delete-user --user-name $userName ; then
                echo "Succesfully deleted $userName"
              fi
              
              aws cloud9 delete-environment-membership \
              --environment-id $environmentId --user-arn $userArn
              
            fi
        done
        if $groupExists ; then 
          aws iam detach-group-policy \
          --group-name $groupName \
          --policy-arn $groupPolicy
      
          if aws iam delete-group --group-name $groupName ; then
            echo "Succesfully deleted $groupName user group"
          fi
        fi
        
        echo "Cleanup complete."
    }
    cleanUp
  2. Run the script by passing the same parameters you passed when setting up the script:
    sh cleanup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3
  3. Delete the CodeCommit repository by running the following commands in the root directory with the appropriate repository name:
    aws codecommit delete-repository --repository-name hackathon-repo
    rm -rf hackathon-repo
  4. You can delete the Cloud9 environment when the event is over (a CLI example follows these steps).
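As mentioned in the last step, the shared environment itself can be deleted from the AWS Cloud9 console, or with a single CLI call using the environment ID you retrieved during setup:

    # Delete the shared Cloud9 environment once the event is over
    aws cloud9 delete-environment --environment-id <ENVIRONMENTID>
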

 

Conclusion

In this post, you saw how to use an AWS Cloud9 IDE to collaborate as a team and code together to develop a working prototype. For organizations looking to host hackathon events, these tools can be a powerful way to deliver a rich user experience. For more information about AWS Cloud9 capabilities, see the AWS Cloud9 User Guide. If you plan on using AWS Cloud9 for ongoing collaboration, refer to the best practices for sharing environments in Working with shared environments in AWS Cloud9.

About the authors

Mahesh Biradar is a Solutions Architect at AWS. He is a DevOps enthusiast and enjoys helping customers implement cost-effective architectures that scale.
Guy Savoie is a Senior Solutions Architect at AWS working with SMB customers, primarily in Florida. In his role as a technical advisor, he focuses on unlocking business value through outcome based innovation.
Ramesh Chidirala is a Solutions Architect focused on SMB customers in the Central region. He is passionate about helping customers solve challenging technical problems with AWS and help them achieve their desired business outcomes.

 

Choosing a Well-Architected CI/CD approach: Open-source software and AWS Services

Post Syndicated from Brian Carlson original https://aws.amazon.com/blogs/devops/choosing-well-architected-ci-cd-open-source-software-aws-services/

This series of posts discusses making informed decisions when choosing to implement open-source tools on AWS services, adopt managed AWS services to satisfy the same needs, or use a combination of both.

We look at key considerations for evaluating open-source software and AWS services using the perspectives of a startup company and a mature company as examples. You can use these two different points of view to compare to your own organization. To make this investigation easier we will use Continuous Integration (CI) and Continuous Delivery (CD) capabilities as the target of our investigation.


In two related posts, we follow two AWS customers, Iponweb and BigHat Biosciences, as they share their CI/CD journeys, their perspectives, the decisions they made, and why. To end the series, we explore an example reference architecture showing the benefits AWS provides regardless of your emphasis on open-source tools or managed AWS services.

Why CI/CD?

Until your creations are in the hands of your customers, investment in development has provided no return. The faster valuable changes enter production, the greater positive impact you can have on your customer. In today’s highly competitive world, the ability to frequently and consistently deliver value is a competitive advantage. The Operational Excellence (OE) pillar of the AWS Well-Architected Framework recognizes this impact and focuses on the capabilities of CI/CD in two dedicated sections.

The concepts in CI/CD originate from software engineering but apply equally to any form of content. The goal is to support development, integration, testing, deployment, and delivery to production. For example, making changes to an application, updating your machine learning (ML) models, changing your multimedia assets, or referring to the AWS Well-Architected Framework.

Adopting CI/CD and the best practices from the Operational Excellence pillar can help you address risks in your environment, and limit errors from manual processes. More importantly, they help free your teams from the related manual processes, so they can focus on satisfying customer needs, differentiating your organization, and accelerating the flow of valuable changes into production.


How do you decide what you need?

The first question in the Operational Excellence pillar is about understanding needs and making informed decisions. To help you frame your own decision-making process, we explore key considerations from the perspective of a fictional startup company and a fictional mature company. In our two related posts, we explore these same considerations with Iponweb and BigHat.

The key considerations include:

  • Functional requirements – Providing specific features and capabilities that deliver value to your customers.
  • Non-functional requirements – Enabling the safe, effective, and efficient delivery of the functional requirements. Non-functional requirements include security, reliability, performance, and cost requirements.
    • Without security, you can’t earn customer trust. If your customers can’t trust you, you won’t have customers.
    • Without reliability you aren’t available to serve your customers. If you can’t serve your customers, you won’t have customers.
    • Performance is focused on timely and efficient delivery of value, not delivering as fast as possible.
    • Cost is focused on optimizing the value received for the resources spent (for example, money, time, or effort), not minimizing expense.
  • Operational requirements – Enabling you to effectively and efficiently support, maintain, sustain, and improve the delivery of value to your customers. When you “Design with Ops in Mind,” you’re enabling effective and efficient support for your business outcomes.

These non-feature-related key considerations are why Operational Excellence, Security, Reliability, Performance Efficiency, and Cost Optimization are the five pillars of the AWS Well-Architected Framework.

The startup company

Any startup begins as a small team of inspired people working together to realize the unique solution they believe solves an unsolved problem.

For our fictional small team, everyone knows each other personally and all speak frequently. We share processes and procedures in discussions, and everyone knows what needs to be done. Our team members bring their expertise and dedicate it, and the majority of their work time, to delivering our solution. The results of our efforts inform the changes we make to support our next iteration.

However, our manual activities are error-prone and inconsistencies exist in the way we do them. Performing these tasks takes time away from delivering our solution. When errors occur, they have the potential to disrupt everyone’s progress.

We have capital available to make some investments. We would prefer to bring in more team members who can contribute directly to developing our solution. We need to iterate faster if we are to achieve a broadly viable product in time to qualify for our next round of funding. We need to decide what investments to make.

  • Goals – Reach the next milestone and secure funding to continue development
  • Needs – Reduce or eliminate the manual processes and associated errors
  • Priority – Rapid iteration
  • CI/CD emphasis – Baseline CI/CD capabilities and non-functional requirements are emphasized over a rich feature set

The mature company

Our second fictional company is a large and mature organization operating in a mature market segment. We’re focused on consistent, quality customer experiences to serve and retain our customers.

Our size limits the personal relationships between our service and development teams. The process to make requests, and the interfaces between teams and their systems, are well documented and understood.

However, the systems we have implemented over time, as needs were identified and addressed, aren’t well documented. Our existing tool chain includes some in-house scripting and both supported and unsupported versions of open-source tools. There are limited opportunities for us to acquire new customers.

When conditions change and new features are desired, we want to be able to rapidly implement and deploy those features as fast as possible. If we can differentiate our services, however briefly, we may be able to win customers away from our competitors. Our other path to improved profitability is to evolve our processes, maximizing integration and efficiencies, and capturing cost reductions.

  • Goals – Differentiate ourselves in the marketplace with desired new features
  • Needs – Address the risks of poorly documented systems and unsupported software
  • Priority – Evolve efficiency
  • CI/CD emphasis – Rich feature set and integrations are emphasized over improving the existing non-functional capabilities

Open-source tools on AWS vs. AWS services

The choice of open-source tools or AWS service is not binary. You can select the combination of solutions that provides the greatest value. You can implement open-source tools for their specific benefits where they outweigh the costs and operational burden, using underlying AWS services like Amazon Elastic Compute Cloud (Amazon EC2) to host them. You can then use AWS managed services, like AWS CodeBuild, for the undifferentiated features you need, without additional cost or operational burden.


Feature Set

Our fictional organizations both want to accelerate the flow of beneficial changes into production and are evaluating CI/CD alternatives to support that outcome. Our startup company wants a working solution—basic capabilities, author/code, build, and deploy, so that they can focus on development. Our mature company is seeking every advantage—a rich feature set, extensive opportunities for customization, integration capabilities, and fine-grained control.

Open-source tools

Open-source tools often excel at meeting functional requirements. When a new functionality, capability, or integration is desired, any developer can implement it for themselves and then contribute their code back to the project. As the user community for an open-source project expands, the number of use cases and identified features grows, and so does the number of potential solutions and potential contributors. Developers are using these tools to support their efforts and implement new features that provide value to them.

However, features may be released in unsupported versions and then later added to the supported feature set. Non-functional requirements take time and are less appealing because they don’t typically bring immediate value to the product. Non-functional capabilities may lag behind the feature set.

Consider the following:

  • Open-source tools may have more features and existing integrations to other tools
  • The pace of feature set delivery may be extremely rapid
  • The features delivered are those desired and created by the active members of the community
  • You are free to implement the features your company desires
  • There is no commitment to long-term support for the project or any given feature
  • You can implement open-source tools on multiple cloud providers or on premises
  • If the project is abandoned, you’re responsible for maintaining your implementation

AWS services

AWS services are driven by customer needs. Services and features are supported by dedicated teams. These customer-obsessed teams focus on all customer needs, with security being their top priority. Both functional and non-functional requirements are addressed with an emphasis on enabling customer outcomes while minimizing the effort they expend to achieve them.

Consider the following:

  • The pace of delivery of feature sets is consistent
  • The feature roadmap is driven by customer need and customer requests
  • The AWS service team is dedicated to support of the service
  • AWS services are available on the AWS Cloud and on premises through AWS Outposts


Cost Optimization

Why are we discussing cost after the feature set? Security and reliability are fundamentally more important. Leadership naturally gravitates to following the operational excellence best practice of evaluating trade-offs. Having looked at the potential benefits from the feature set, the next question is typically, “What is this going to cost?” Leadership defines the priorities and allocates the resources necessary (capital, time, effort). We review cost optimization second so that leadership can make a comparison of the expected benefits between CI/CD investments, and investments in other efforts, so they can make an informed decision.

Our organizations are both cost conscious. Our startup is working with finite capital and time. In contrast, our mature company can plan to make investments over time and budget for the needed capital. Early investment in a robust and feature-rich CI/CD tool chain could provide significant advantages towards the startup’s long-term success, but if the startup fails early, the value of that investment will never be realized. The mature company can afford to realize the value of their investment over time and can make targeted investments to address specific short-term needs.

Open-source tools

Open-source software doesn’t have to be purchased, but there are costs to adopt. Open-source tools require appropriate skills in order to be implemented, and to perform management and maintenance activities. Those skills must be gained through dedicated training of team members, team member self-study, or by hiring new team members with the existing skills. The availability of skilled practitioners of open-source tools varies with how popular a tool is and how long it has had an active community. Loss of skilled team members includes the loss of their institutional knowledge and intimacy with the implementation. Skills must be maintained with changes to the tools and as team members join or leave. Time is required from skilled team members to support management and maintenance activities. If commercial support for the tool is desired, it may be available through third-parties at an additional cost.

The time to value of an open-source implementation includes the time to implement and configure the resources and software. Additional value may be realized through investment of time configuring or implementing desired integrations and capabilities. There may be existing community-supported integrations or capabilities that reduce the level of effort to achieve these.

Consider the following:

  • No cost to acquire the software.
  • The availability of skilled practitioners of open-source tools may be lower. Cost (capital and time) to acquire, establish, or maintain the skill set may be higher.
  • There is an ongoing cost to maintain the team member skills necessary to support the open-source tools.
  • There is an ongoing cost of time for team members to perform management and maintenance activities.
  • Additional commercial support for open-source tools may be available at additional cost
  • Time to value includes implementation and configuration of resources and the open-source software. There may be more predefined community integrations.

AWS services

AWS services are provided pay-as-you-go with no required upfront costs. As of August 2020, more than 400,000 individuals hold active AWS Certifications, a number that grew more than 85% between August 2019 and August 2020.

Time to value for AWS services is extremely short and limited to the time to instantiate or configure the service for your use. Additional value may be realized through the investment of time configuring or implementing desired integrations. Predefined integrations for AWS services are added as part of the service development roadmap. However, there may be fewer existing integrations to reduce your level of effort.

Consider the following:

  • No cost to acquire the software; AWS services are pay-as-you-go for use.
  • AWS skill sets are broadly available. Cost (capital and time) to acquire, establish, or maintain skill sets may be lower.
  • AWS services are fully managed, and service teams are responsible for the operation of the services.
  • Time to value is limited to the time to instantiate or configure the service. There may be fewer predefined integrations.
  • Additional support for AWS services is available through AWS Support. Cost for support varies based on level of support and your AWS utilization.

Open-source tools on AWS services

Open-source tools on AWS services don’t impact these cost considerations. Migration off of either of these solutions is similarly not differentiated. In either case, you have to invest time in replacing the integrations and customizations you wish to maintain.


Security

Both organizations are concerned about reputation and customer trust. They both want to act to protect their information systems and are focusing on confidentiality and integrity of data. They both take security very seriously. Our startup wants to be secure by default and wants to trust the vendor to address vulnerabilities within the service. Our mature company has dedicated resources that focus on security, and the company practices defense in depth across internal organizations.

The startup and the mature company both want to know whether a choice is safe, secure, and can validate the security of their choice. They also want to understand their responsibilities and the shared responsibility model that applies.

Open-source tools

Open-source tools are the product of the contributors and may contain flaws or vulnerabilities. The entire community has access to the code to test and validate. There are frequently many eyes evaluating the security of the tools. A company or individual may perform a validation for themselves. However, there may be limited guidance on secure configurations. Controls in the implementer’s environment may reduce potential risk.

Consider the following:

  • You’re responsible for the security of the open-source software you implement
  • You control the security of your data within your open-source implementation
  • You can validate the security of the code and act as desired

AWS services

AWS service teams make security their highest priority and are able to respond rapidly when flaws are identified. There is robust guidance provided to support configuring AWS services securely.

Consider the following:

  • AWS is responsible for the security of the cloud and the underlying services
  • You are responsible for the security of your data in the cloud and how you configure AWS services
  • You must rely on the AWS service team to validate the security of the code

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation and the configuration of the AWS services it consumes. AWS is responsible for the security of the AWS Cloud and the managed AWS services.


Reliability

Everyone wants reliable capabilities. What varies between companies is their appetite for risk, and how much they can tolerate the impact of non-availability. The startup emphasized the need for their systems to be available to support their rapid iterations. The mature company is operating with some existing reliability risks, including unsupported open-source tools and in-house scripts.

The startup and the mature company both want to understand the expected reliability of a choice, meaning what percentage of the time it is expected to be available. They both want to know if a choice is designed for high availability and will remain available even if a portion of the systems fails or is in a degraded state. They both want to understand the durability of their data, how to perform backups of their data, and how to perform recovery in the event of a failure.

Both companies need to determine what is an acceptable outage duration, commonly referred to as a Recovery Time Objective (RTO), and for what quantity of elapsed time it is acceptable to lose transactions (including committing changes), commonly referred to as Recovery Point Objective (RPO). They need to evaluate if they can achieve their RTO and RPO objectives with each of the choices they are considering.

Open-source tools

Open-source reliability is dependent upon the effectiveness of the company’s implementation, the underlying resources supporting the implementation, and the reliability of the open-source software. Open-source tools are the product of the contributors and may or may not incorporate high availability features. Depending on the implementation and tool, there may be a requirement for downtime for specific management or maintenance activities. The ability to support RTO and RPO depends on the teams supporting the company system, the implementation, and the mechanisms implemented for backup and recovery.

Consider the following:

  • You are responsible for implementing your open-source software to satisfy your reliability needs and high availability needs
  • Open-source tools may have downtime requirements to support specific management or maintenance activities
  • You are responsible for defining, implementing, and testing the backup and recovery mechanisms and procedures
  • You are responsible for the satisfaction of your RTO and RPO in the event of a failure of your open-source system

AWS services

AWS services are designed to support customer availability needs. As managed services, the service teams are responsible for maintaining the health of the services.

Consider the following:

  • AWS services are fully managed, and service teams are responsible for the availability and health of the services
  • You are responsible for configuring the services you consume, and for the durability, backup, and recovery of your data within them
  • You are responsible for evaluating whether your configuration of the services can satisfy your RTO and RPO

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation (including data durability, backup, and recovery) and the configuration of the AWS services it consumes. AWS is responsible for the health of the AWS Cloud and the managed services.


Performance

What defines timely and efficient delivery of value varies between our two companies. Each wants results returned before an engineer is left idle waiting for them. The startup iterates rapidly based on the results of each prior iteration, and there is little other work for a startup engineer to do while waiting on actionable results. Our mature company is more likely to have a backlog of improvements that can be acted upon while changes move through the pipeline.

Open-source tools

Open-source performance is defined by the resources upon which it is deployed. Open-source tools that can scale out can dynamically improve their performance when resource constrained. Performance can also be improved by scaling up, which is required when performance is constrained by resources and scaling out isn’t supported. The performance of open-source tools may be constrained by characteristics of how they were implemented in code or the libraries they use. If this is the case, the code is available for community or implementer-created improvements to address the limitation.

Consider the following:

  • You are responsible for managing the performance of your open-source tools
  • The performance of open-source tools may be constrained by the resources they are implemented upon; the code and libraries used; their system, resource, and software configuration; and the code and libraries present within the tools

AWS services

AWS services are designed to be highly scalable. CodeCommit has a highly scalable architecture, and CodeBuild scales up and down dynamically to meet your build volume. CodePipeline allows you to run actions in parallel in order to increase your workflow speeds.

Consider the following:

  • AWS services are fully managed, and service teams are responsible for the performance of the services.
  • AWS services are designed to scale automatically.
  • Your configuration of the services you consume can affect the performance of those services.
  • AWS service quotas exist to prevent unexpected costs. You can make changes to service quotas that may affect performance and costs.

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation (including the selection and configuration of the AWS Cloud resources) and the configuration of the AWS services it consumes. AWS is responsible for the performance of the AWS Cloud and the managed AWS services.


Operations

Our startup company wants to limit its operations burden as much as possible in order to focus on development efforts. Our mature company has an established and robust operations capability. In both cases, they perform the management and maintenance activities necessary to support their needs.

Open-source tools

Open-source tools are supported by their volunteer communities. That support is voluntary, without any obligation or commitment from the users. If either company adopts open-source tools, they’re responsible for the management and maintenance of the system. If they want additional support with an obligation and commitment to support their implementation, third parties may provide commercial support at additional cost.

Consider the following:

  • You are responsible for supporting your implementation.
  • The open-source community may provide volunteer support for the software.
  • There is no commitment to support the software by the open-source community.
  • There may be less documentation, or accepted best practices, available to support open-source tools.
  • Early adoption of open-source tools, or the use of development builds, includes the chance of encountering unidentified edge cases and unanticipated issues.
  • The complexity of an implementation and its integrations may make open-source tools more difficult to support. During an incident, that complexity can extend the time needed to identify contributing factors. Maintaining a set of skilled team members with a deep understanding of your implementation may help mitigate this risk.
  • You may be able to acquire commercial support through a third party.

AWS services

AWS services are committed to providing long-term support for their customers.

Consider the following:

  • There is long-term commitment from AWS to support the service
  • As a managed service, the service team maintains current documentation
  • Additional levels of support are available through AWS Support
  • Support for AWS is available through partners and third parties

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations. The company is responsible for operating the open-source tools (for example, software configuration changes, updates, patching, and responding to faults). AWS is responsible for the operation of the AWS Cloud and the managed AWS services.

Conclusion

In this post, we discussed how to make informed decisions when choosing to implement open-source tools on AWS services, adopt managed AWS services, or use a combination of both. To do so, you must examine your organization and evaluate the benefits and risks.


Examine your organization

You can make an informed decision about the capabilities you adopt. The insight you need can be gained by examining your organization to identify your goals, needs, and priorities, and discovering what your current emphasis is. Ask the following questions:

  • What is your organization trying to accomplish and why?
  • How large is your organization and how is it structured?
  • How are roles and responsibilities distributed across teams?
  • How well defined and understood are your processes and procedures?
  • How do you manage development, testing, delivery, and deployment today?
  • What are the major challenges your organization faces?
  • What are the challenges you face managing development?
  • What problems are you trying to solve with CI/CD tools?
  • What do you want to achieve with CI/CD tools?

Evaluate benefits and risk

Armed with that knowledge, the next step is to explore the trade-offs between open-source options and managed AWS services. Then evaluate the benefits and risks in terms of the key considerations:

  • Features
  • Cost
  • Security
  • Reliability
  • Performance
  • Operations

When asked “What is the correct answer?” the answer should never be “It depends.” We need to change the question to “What is our use case and what are our needs?” The answer will emerge from there.

Make an informed decision

A Well-Architected solution can include open-source tools, AWS Services, or any combination of both! A Well-Architected choice is an informed decision that evaluates trade-offs, balances benefits and risks, satisfies your requirements, and most importantly supports the achievement of your business outcomes.

Read the other posts in this series and take this journey with BigHat Biosciences and Iponweb as they share their perspectives, the decisions they made, and why.

Resources

Want to learn more? Check out the following CI/CD and developer tools on AWS:

Continuous integration (CI)
Continuous delivery (CD)
AWS Developer Tools

For more information about the AWS Well-Architected Framework, refer to the following whitepapers:

AWS Well-Architected Framework
AWS Well-Architected Operational Excellence pillar
AWS Well-Architected Security pillar
AWS Well-Architected Reliability pillar
AWS Well-Architected Performance Efficiency pillar
AWS Well-Architected Cost Optimization pillar


Author bio

portrait photo of Brian Carlson Brian is the global Operational Excellence lead for the AWS Well-Architected program. Formerly the technical lead for an international network, Brian works with customers and partners researching the operations best practices with the greatest positive impact and produces guidance to help you achieve your goals.

 

Building end-to-end AWS DevSecOps CI/CD pipeline with open source SCA, SAST and DAST tools

Post Syndicated from Srinivas Manepalli original https://aws.amazon.com/blogs/devops/building-end-to-end-aws-devsecops-ci-cd-pipeline-with-open-source-sca-sast-and-dast-tools/

DevOps is a combination of cultural philosophies, practices, and tools that combine software development with information technology operations. These combined practices enable companies to deliver new application features and improved services to customers at a higher velocity. DevSecOps takes this a step further, integrating security into DevOps. With DevSecOps, you can deliver secure and compliant application changes rapidly while running operations consistently with automation.

Having a complete DevSecOps pipeline is critical to building a successful software factory, which includes continuous integration (CI), continuous delivery and deployment (CD), continuous testing, continuous logging and monitoring, auditing and governance, and operations. Identifying the vulnerabilities during the initial stages of the software development process can significantly help reduce the overall cost of developing application changes, but doing it in an automated fashion can accelerate the delivery of these changes as well.

To identify security vulnerabilities at various stages, organizations can integrate various tools and services (cloud and third-party) into their DevSecOps pipelines. Integrating various tools and aggregating the vulnerability findings can be a challenge to do from scratch. AWS has the services and tools necessary to accelerate this objective and provides the flexibility to build DevSecOps pipelines with easy integrations of AWS cloud native and third-party tools. AWS also provides services to aggregate security findings.

In this post, we provide a DevSecOps pipeline reference architecture on AWS that covers the aforementioned practices, including SCA (Software Composition Analysis), SAST (Static Application Security Testing), DAST (Dynamic Application Security Testing), and aggregation of vulnerability findings into a single pane of glass. Additionally, this post addresses the concepts of security of the pipeline and security in the pipeline.

You can deploy this pipeline in either the AWS GovCloud (US) Region or standard AWS Regions. As of this writing, all listed AWS services are available in AWS GovCloud (US) and authorized for FedRAMP High workloads within the Region, with the exception of AWS CodePipeline and AWS Security Hub, which are available in the Region but still under JAB review for FedRAMP High authorization.

Services and tools

In this section, we discuss the various AWS services and third-party tools used in this solution.

CI/CD services

For CI/CD, we use the following AWS services:

  • AWS CodeBuild – A fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy.
  • AWS CodeCommit – A fully managed source control service that hosts secure Git-based repositories.
  • AWS CodeDeploy – A fully managed deployment service that automates software deployments to a variety of compute services such as Amazon Elastic Compute Cloud (Amazon EC2), AWS Fargate, AWS Lambda, and your on-premises servers.
  • AWS CodePipeline – A fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.
  • AWS Lambda – A service that lets you run code without provisioning or managing servers. You pay only for the compute time you consume.
  • Amazon Simple Notification Service – Amazon SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.
  • Amazon Simple Storage Service – Amazon S3 is storage for the internet. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web.
  • AWS Systems Manager Parameter Store – Parameter Store gives you visibility and control of your infrastructure on AWS.

Continuous testing tools

The following are open-source scanning tools that are integrated in the pipeline for the purposes of this post, but you could integrate other tools that meet your specific requirements. You can use the static code review tool Amazon CodeGuru for static analysis, but at the time of this writing, it’s not yet available in GovCloud and currently supports Java and Python (available in preview).

  • OWASP Dependency-Check – A Software Composition Analysis (SCA) tool that attempts to detect publicly disclosed vulnerabilities contained within a project’s dependencies.
  • SonarQube (SAST) – Catches bugs and vulnerabilities in your app, with thousands of automated Static Code Analysis rules.
  • PHPStan (SAST) – Focuses on finding errors in your code without actually running it. It catches whole classes of bugs even before you write tests for the code.
  • OWASP Zap (DAST) – Helps you automatically find security vulnerabilities in your web applications while you’re developing and testing your applications.

Continuous logging and monitoring services

The following are AWS services for continuous logging and monitoring:

  • Amazon CloudWatch – Provides monitoring and observability for your AWS resources; in this pipeline, CloudWatch Events captures pipeline and build state changes and sends notifications.

Auditing and governance services

The following are AWS auditing and governance services:

  • AWS CloudTrail – Enables governance, compliance, operational auditing, and risk auditing of your AWS account.
  • AWS Identity and Access Management – Enables you to manage access to AWS services and resources securely. With IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources.
  • AWS Config – Allows you to assess, audit, and evaluate the configurations of your AWS resources.

Operations services

The following are AWS operations services:

  • AWS Security Hub – Gives you a comprehensive view of your security alerts and security posture across your AWS accounts. This post uses Security Hub to aggregate all the vulnerability findings as a single pane of glass.
  • AWS CloudFormation – Gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles, by treating infrastructure as code.
  • AWS Systems Manager Parameter Store – Provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, Amazon Machine Image (AMI) IDs, and license codes as parameter values.
  • AWS Elastic Beanstalk – An easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS. This post uses Elastic Beanstalk to deploy LAMP stack with WordPress and Amazon Aurora MySQL. Although we use Elastic Beanstalk for this post, you could configure the pipeline to deploy to various other environments on AWS or elsewhere as needed.

Pipeline architecture

The following diagram shows the architecture of the solution.

AWS DevSecOps CICD pipeline architecture


The main steps are as follows:

  1. When a user commits code to a CodeCommit repository, a CloudWatch event is generated, which triggers CodePipeline.
  2. CodeBuild packages the build and uploads the artifacts to an S3 bucket. CodeBuild retrieves the authentication information (for example, scanning tool tokens) from Parameter Store to initiate the scanning. As a best practice, use an artifact repository such as AWS CodeArtifact to store the artifacts; for simplicity, this post uses S3.
  3. CodeBuild scans the code with an SCA tool (OWASP Dependency-Check) and a SAST tool (SonarQube or PHPStan; in the provided CloudFormation template, you can pick one of these tools during the deployment, but CodeBuild is fully enabled for a bring your own tool approach).
  4. If there are any vulnerabilities from either the SCA analysis or the SAST analysis, CodeBuild invokes the Lambda function. The function parses the results into AWS Security Finding Format (ASFF) and posts them to Security Hub. Security Hub helps aggregate and view all the vulnerability findings in one place as a single pane of glass. The Lambda function also uploads the scanning results to an S3 bucket.
  5. If there are no vulnerabilities, CodeDeploy deploys the code to the staging Elastic Beanstalk environment.
  6. After the deployment succeeds, CodeBuild triggers the DAST scanning with the OWASP ZAP tool (again, this is fully enabled for a bring your own tool approach).
  7. If there are any vulnerabilities, CodeBuild invokes the Lambda function, which parses the results into ASFF and posts them to Security Hub. The function also uploads the scanning results to an S3 bucket (similar to step 4).
  8. If there are no vulnerabilities, the approval stage is triggered, and an email is sent to the approver for action.
  9. After approval, CodeDeploy deploys the code to the production Elastic Beanstalk environment.
  10. During the pipeline run, CloudWatch Events captures the build state changes and sends email notifications to subscribed users through SNS notifications.
  11. CloudTrail tracks the API calls and sends notifications on critical events on the pipeline and CodeBuild projects, such as UpdatePipeline, DeletePipeline, CreateProject, and DeleteProject, for auditing purposes.
  12. AWS Config tracks all the configuration changes of AWS services. The following AWS Config rules are added in this pipeline as security best practices (the sketch after this list shows one way to enable such managed rules with the AWS CDK):
    • CODEBUILD_PROJECT_ENVVAR_AWSCRED_CHECK – Checks whether the project contains the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The rule is NON_COMPLIANT when the project environment variables contain plaintext credentials.
    • CLOUD_TRAIL_LOG_FILE_VALIDATION_ENABLED – Checks whether CloudTrail creates a signed digest file with logs. AWS recommends that file validation be enabled on all trails. The rule is NON_COMPLIANT if validation is not enabled.
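The following AWS CDK (TypeScript) sketch shows one way you could enable managed AWS Config rules such as these. It is not part of the provided CloudFormation template; the stack and construct names are placeholders.

import { Stack, StackProps } from 'aws-cdk-lib';
import * as config from 'aws-cdk-lib/aws-config';
import { Construct } from 'constructs';

// Hypothetical stack that enables the two managed AWS Config rules listed above.
// The identifiers are the managed rule names referenced by this pipeline.
export class PipelineConfigRulesStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Flags CodeBuild projects that keep AWS credentials in plaintext environment variables.
    new config.ManagedRule(this, 'CodeBuildEnvVarCredentialsCheck', {
      identifier: 'CODEBUILD_PROJECT_ENVVAR_AWSCRED_CHECK',
    });

    // Flags CloudTrail trails that do not have log file validation enabled.
    new config.ManagedRule(this, 'CloudTrailLogFileValidationCheck', {
      identifier: 'CLOUD_TRAIL_LOG_FILE_VALIDATION_ENABLED',
    });
  }
}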

Security of the pipeline is implemented by using IAM roles and S3 bucket policies to restrict access to pipeline resources. Pipeline data at rest and in transit is protected using encryption and SSL secure transport. We use Parameter Store to store sensitive information such as API tokens and passwords. To be fully compliant with frameworks such as FedRAMP, other things may be required, such as MFA.

Security in the pipeline is implemented by performing the SCA, SAST and DAST security checks. Alternatively, the pipeline can utilize IAST (Interactive Application Security Testing) techniques that would combine SAST and DAST stages.

As a best practice, encryption should be enabled for the code and artifacts, whether at rest or in transit.

In the next section, we explain how to deploy and run the pipeline CloudFormation template used for this example. Refer to the provided service links to learn more about each of the services in the pipeline. If utilizing CloudFormation templates to deploy infrastructure using pipelines, we recommend using linting tools like cfn-nag to scan CloudFormation templates for security vulnerabilities.

Prerequisites

Before getting started, make sure you have the following prerequisites:

Deploying the pipeline

To deploy the pipeline, first download the CloudFormation template and pipeline code from the GitHub repo, then complete the following steps:

  1. Log in to your AWS account if you have not done so already.
  2. On the CloudFormation console, choose Create Stack.
  3. Choose the CloudFormation pipeline template.
  4. Choose Next.
  5. Provide the stack parameters:
    • Under Code, provide code details, such as repository name and the branch to trigger the pipeline.
    • Under SAST, choose the SAST tool (SonarQube or PHPStan) for code analysis, enter the API token and the SAST tool URL. You can skip SonarQube details if using PHPStan as the SAST tool.
    • Under DAST, choose the DAST tool (OWASP Zap) for dynamic testing and enter the API token, DAST tool URL, and the application URL to run the scan.
    • Under Lambda functions, enter the Lambda function S3 bucket name, filename, and the handler name.
    • Under STG Elastic Beanstalk Environment and PRD Elastic Beanstalk Environment, enter the Elastic Beanstalk environment and application details for staging and production to which this pipeline deploys the application code.
    • Under General, enter the email addresses to receive notifications for approvals and pipeline status changes.


CloudFormation deployment - Passing parameter values

CloudFormation template deployment

After the pipeline is deployed, confirm the subscription by choosing the provided link in the email to receive the notifications.

The provided CloudFormation template in this post is formatted for AWS GovCloud. If you’re setting this up in a standard Region, you have to adjust the partition name in the CloudFormation template. For example, change ARN values from arn:aws-us-gov to arn:aws.

Running the pipeline

To trigger the pipeline, commit changes to your application repository files. That generates a CloudWatch event and triggers the pipeline. CodeBuild scans the code and if there are any vulnerabilities, it invokes the Lambda function to parse and post the results to Security Hub.

When posting the vulnerability finding information to Security Hub, we need to provide a normalized vulnerability severity level. Based on the provided value, Security Hub assigns the severity label as follows (a sketch of posting a finding with a normalized severity appears after the list). Adjust the severity levels in your code based on your organization's requirements.

  • 0 – INFORMATIONAL
  • 1–39 – LOW
  • 40–69 – MEDIUM
  • 70–89 – HIGH
  • 90–100 – CRITICAL
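The following TypeScript sketch shows how a finding with a normalized severity could be posted to Security Hub with the AWS SDK for JavaScript. It is not the post's actual Lambda code; the finding ID, generator ID, product ARN, account ID, and resource are placeholders.

import * as AWS from 'aws-sdk';

const securityHub = new AWS.SecurityHub();

// Post a single ASFF finding. Security Hub derives the severity label
// (INFORMATIONAL, LOW, MEDIUM, HIGH, CRITICAL) from Severity.Normalized (0-100).
async function postFinding(normalizedSeverity: number, title: string, description: string): Promise<void> {
  const now = new Date().toISOString();
  await securityHub.batchImportFindings({
    Findings: [{
      SchemaVersion: '2018-10-08',
      Id: `scan-finding-${Date.now()}`,                        // placeholder ID
      ProductArn: 'arn:aws:securityhub:<region>:<account-id>:product/<account-id>/default',
      GeneratorId: 'sca-sast-scan',                            // placeholder generator
      AwsAccountId: '<account-id>',
      Types: ['Software and Configuration Checks/Vulnerabilities'],
      CreatedAt: now,
      UpdatedAt: now,
      Severity: { Normalized: normalizedSeverity },
      Title: title,
      Description: description,
      Resources: [{ Type: 'Other', Id: 'codebuild-scan-artifact' }],   // placeholder resource
    }],
  }).promise();
}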

The following screenshot shows the progression of your pipeline.

CodePipeline stages


SCA and SAST scanning

In our architecture, CodeBuild triggers the SCA and SAST scanning in parallel. In this section, we discuss scanning with OWASP Dependency-Check, SonarQube, and PHPStan.

Scanning with OWASP Dependency-Check (SCA)

The following is the code snippet from the Lambda function, where the SCA analysis results are parsed and posted to Security Hub. Based on the results, the equivalent Security Hub severity level (normalized_severity) is assigned.

Lambda code snippet for OWASP Dependency-check


You can see the results in Security Hub, as in the following screenshot.

SecurityHub report from OWASP Dependency-check scanning


Scanning with SonarQube (SAST)

The following is the code snippet from the Lambda function, where the SonarQube code analysis results are parsed and posted to Security Hub. Based on SonarQube results, the equivalent Security Hub severity level (normalized_severity) is assigned.

Lambda code snippet for SonarQube


The following screenshot shows the results in Security Hub.

SecurityHub report from SonarQube scanning


Scanning with PHPStan (SAST)

The following is the code snippet from the Lambda function, where the PHPStan code analysis results are parsed and posted to Security Hub.

Lambda code snippet for PHPStan


The following screenshot shows the results in Security Hub.

SecurityHub report from PHPStan scanning


DAST scanning

In our architecture, CodeBuild triggers the DAST scanning with the configured DAST tool (OWASP Zap).

If there are no vulnerabilities in the DAST scan, the pipeline proceeds to the manual approval stage and an email is sent to the approver. The approver can review and approve or reject the deployment. If approved, the pipeline moves to the next stage and deploys the application to the provided Elastic Beanstalk environment.

Scanning with OWASP Zap

After deployment is successful, CodeBuild initiates the DAST scanning. When scanning is complete, if there are any vulnerabilities, it invokes the Lambda function similar to SAST analysis. The function parses and posts the results to Security Hub. The following is the code snippet of the Lambda function.

Lambda code snippet for OWASP-Zap


The following screenshot shows the results in Security Hub.

SecurityHub report from OWASP-Zap scanning


Aggregation of vulnerability findings in Security Hub provides opportunities to automate the remediation. For example, based on the vulnerability finding, you can trigger a Lambda function to take the needed remediation action. This also reduces the burden on operations and security teams because they can now address the vulnerabilities from a single pane of glass instead of logging into multiple tool dashboards.
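As an illustrative sketch (not part of the provided template), you could wire up such automated remediation by matching imported Security Hub findings with an EventBridge rule and targeting a remediation Lambda function, for example with the AWS CDK. The remediation function below is a placeholder.

import { Stack } from 'aws-cdk-lib';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Construct } from 'constructs';

export class FindingRemediationStack extends Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    // Placeholder remediation function; real remediation logic would replace the inline handler.
    const remediationFn = new lambda.Function(this, 'RemediationFunction', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline('exports.handler = async (event) => console.log(event);'),
    });

    // Invoke the function whenever Security Hub imports a HIGH or CRITICAL finding.
    new events.Rule(this, 'HighSeverityFindingRule', {
      eventPattern: {
        source: ['aws.securityhub'],
        detailType: ['Security Hub Findings - Imported'],
        detail: {
          findings: {
            Severity: { Label: ['HIGH', 'CRITICAL'] },
          },
        },
      },
      targets: [new targets.LambdaFunction(remediationFn)],
    });
  }
}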

Conclusion

In this post, I presented a DevSecOps pipeline that includes CI/CD, continuous testing, continuous logging and monitoring, auditing and governance, and operations. I demonstrated how to integrate various open-source scanning tools, such as SonarQube, PHPStan, and OWASP Zap for SAST and DAST analysis. I explained how to aggregate vulnerability findings in Security Hub as a single pane of glass. This post also talked about how to implement security of the pipeline and in the pipeline using AWS cloud native services. Finally, I provided the DevSecOps pipeline as code using AWS CloudFormation. For additional information on AWS DevOps services and to get started, see AWS DevOps and DevOps Blog.

 

Srinivas Manepalli is a DevSecOps Solutions Architect in the U.S. Fed SI SA team at Amazon Web Services (AWS). He is passionate about helping customers, building and architecting DevSecOps and highly available software systems. Outside of work, he enjoys spending time with family, nature and good food.

Discovering sensitive data in AWS CodeCommit with AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/discovering-sensitive-data-in-aws-codecommit-with-aws-lambda-2/

This post is courtesy of Markus Ziller, Solutions Architect.

Today, git is a de facto standard for version control in modern software engineering. The workflows enabled by git’s branching capabilities are a major reason for this. However, with git’s distributed nature, it can be difficult to reliably remove changes that have been committed from all copies of the repository. This is problematic when secrets such as API keys have been accidentally committed into version control. The longer it takes to identify and remove secrets from git, the more likely that the secret has been checked out by another user.

This post shows a solution that automatically identifies credentials pushed to AWS CodeCommit in near-real-time. I also show three remediation measures that you can use to reduce the impact of secrets pushed into CodeCommit:

  • Notify users about the leaked credentials.
  • Lock the repository for non-admins.
  • Hard reset the CodeCommit repository to a healthy state.

I use the AWS Cloud Development Kit (CDK). This is an open source software development framework to model and provision cloud application resources. Using the CDK can reduce the complexity and amount of code needed to automate the deployment of resources.

Overview of solution

The services in this solution are AWS Lambda, AWS CodeCommit, Amazon EventBridge, and Amazon SNS. These services are part of the AWS serverless platform. They help reduce undifferentiated work around managing servers, infrastructure, and the parts of the application that add less value to your customers. With serverless, the solution scales automatically, has built-in high availability, and you only pay for the resources you use.

Solution architecture

This diagram outlines the workflow implemented in this blog:

  1. After a developer pushes changes to CodeCommit, it emits an event to an event bus.
  2. A rule defined on the event bus routes this event to a Lambda function.
  3. The Lambda function uses the AWS SDK for JavaScript to get the changes introduced by commits pushed to the repository.
  4. It analyzes the changes for secrets. If secrets are found, it publishes another event to the event bus.
  5. Rules associated with this event type then trigger invocations of three Lambda functions A, B, and C with information about the problematic changes.
  6. Each of the Lambda functions runs a remediation measure:
    • Function A sends out a notification to an SNS topic that informs users about the situation (A1).
    • Function B locks the repository by setting a tag with the AWS SDK (B2). It sends out a notification about this action (B1).
    • Function C runs git commands that remove the problematic commit from the CodeCommit repository (C2). It also sends out a notification (C1).

Walkthrough

The following walkthrough explains the required components, their interactions and how the provisioning can be automated via CDK.

For this walkthrough, you need:

Checkout and deploy the sample stack:

  1. After completing the prerequisites, clone the associated GitHub repository by running the following command in a local directory:
    git clone git@github.com:aws-samples/discover-sensitive-data-in-aws-codecommit-with-aws-lambda.git
  2. Open the repository in a local editor and review the contents of cdk/lib/resources.ts, src/handlers/commits.ts, and src/handlers/remediations.ts.
  3. Follow the instructions in the README.md to deploy the stack.

The CDK will deploy resources for the following services in your account.

Using CodeCommit to manage your git repositories

The CDK creates a new empty repository called TestRepository and adds a tag RepoState with an initial value of ok. You later use this tag in the LockRepo remediation strategy to restrict access.

It also creates two IAM groups with one user in each. Members of the CodeCommitSuperUsers group are always able to access the repository, while members of the CodeCommitUsers group can only access the repository when the value of the tag RepoState is not locked.

I also import the CodeCommitSystemUser into the CDK. Since the user requires git credentials in a downloaded CSV file, it cannot be created by the CDK. Instead it must be created as described in the README file.

The following CDK code sets up all the described resources:

const TAG_NAME = "RepoState";

const superUsers = new Group(this, "CodeCommitSuperUsers", { groupName: "CodeCommitSuperUsers" });
superUsers.addUser(new User(this, "CodeCommitSuperUserA", {
    password: new Secret(this, "CodeCommitSuperUserPassword").secretValue,
    userName: "CodeCommitSuperUserA"
}));

const users = new Group(this, "CodeCommitUsers", { groupName: "CodeCommitUsers" });
users.addUser(new User(this, "User", {
    password: new Secret(this, "CodeCommitUserPassword").secretValue,
    userName: "CodeCommitUserA"
}));

const systemUser = User.fromUserName(this, "CodeCommitSystemUser", props.codeCommitSystemUserName);

const repo = new Repository(this, "Repository", {
    repositoryName: "TestRepository",
    description: "The repository to test this project out",
});
Tags.of(repo).add(TAG_NAME, "ok");

users.addToPolicy(new PolicyStatement({
    effect: Effect.ALLOW,
    actions: ["*"],
    resources: [repo.repositoryArn],
    conditions: {
        StringNotEquals: {
            [`aws:ResourceTag/${TAG_NAME}`]: "locked"
        }
    }
}));

superUsers.addToPolicy(new PolicyStatement({
    effect: Effect.ALLOW,
    actions: ["*"],
    resources: [repo.repositoryArn]
}));

Using EventBridge to pass events between components

I use EventBridge, a serverless event bus, to connect the Lambda functions together. Many AWS services like CodeCommit are natively integrated into EventBridge and publish events about changes in their environment.

repo.onCommit is a higher-level CDK construct. It creates the required resources to invoke a Lambda function for every commit to a given repository. The created events rule looks like this:

EventBridge rule definition

Note that this event rule only matches commit events in TestRepository. To send commits of all repositories in that account to the inspecting Lambda function, remove the resources filter in the event pattern.
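For reference, the rule that repo.onCommit generates is roughly equivalent to the following event pattern (a sketch based on the CodeCommit event format, not a copy of the generated template):

// Roughly what the generated rule matches: pushes (reference updates) to TestRepository.
const commitEventPattern = {
  source: ['aws.codecommit'],
  'detail-type': ['CodeCommit Repository State Change'],
  resources: [repo.repositoryArn],   // remove this filter to match commits in all repositories
  detail: {
    event: ['referenceCreated', 'referenceUpdated'],
  },
};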

CodeCommit Repository State Change is a default event that is published by CodeCommit if changes are made to a repository. In addition, I define CodeCommit Security Event, a custom event, which Lambda publishes to the same event bus if secrets are discovered in the inspected code.

The sample below shows how you can set up Lambda functions as targets for both types of events.

const DETAIL_TYPE = "CodeCommit Security Event";
const eventBus = new EventBus(this, "CodeCommitEventBus", {
    eventBusName: "CodeCommitSecurityEvents"
});

repo.onCommit("AnyCommitEvent", {
    ruleName: "CallLambdaOnAnyCodeCommitEvent",
    target: new targets.LambdaFunction(commitInspectLambda)
});


new Rule(this, "CodeCommitSecurityEvent", {
    eventBus,
    enabled: true,
    ruleName: "CodeCommitSecurityEventRule",
    eventPattern: {
        detailType: [DETAIL_TYPE]
    },
    targets: [
        new targets.LambdaFunction(lockRepositoryLambda),
        new targets.LambdaFunction(raiseAlertLambda),
        new targets.LambdaFunction(forcefulRevertLambda)
    ]
});

Using Lambda functions to run remediation measures

AWS Lambda functions allow you to run code in response to events. The example defines four Lambda functions.

The commitInspectLambda function analyzes whether a commit introduces secrets by comparing its content with that of its predecessor. With the CDK, you can create a Lambda function with:

const myLambdaInCDK = new Function(this, "UniqueIdentifierRequiredByCDK", {
    runtime: Runtime.NODEJS_12_X,
    handler: "<handlerfile>.<function name>",
    code: Code.fromAsset(path.join(__dirname, "..", "..", "src", "handlers")),
    // See git repository for complete code
});

The code for this Lambda function uses the AWS SDK for JavaScript to fetch the details of the commit, the differences introduced, and the new content.

The code checks each modified file line by line with a regular expression that matches typical secret formats. In src/handlers/regex.json, I provide a few regular expressions that match common secrets. You can extend this with your own patterns.

If a secret is discovered, a CodeCommit Security Event is published to the event bus. EventBridge then invokes all Lambda functions that are registered as targets with this event. This demo triggers three remediation measures.
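The following TypeScript sketch illustrates that flow in simplified form. It is not the full handler from the repository: the regular expressions, the event source name, and the detail payload shape are illustrative assumptions.

import * as AWS from 'aws-sdk';

const codecommit = new AWS.CodeCommit();
const eventBridge = new AWS.EventBridge();

// Illustrative patterns; the sample repository ships its own regex.json.
const secretPatterns: RegExp[] = [
  /AKIA[0-9A-Z]{16}/,                          // AWS access key ID shape
  /-----BEGIN (RSA )?PRIVATE KEY-----/,
];

// Simplified helper: fetch the content of files touched by a commit
// by reading the "after" blob of every difference against the parent commit.
async function loadChangedLines(repositoryName: string, commitId: string): Promise<string[]> {
  const { commit } = await codecommit.getCommit({ repositoryName, commitId }).promise();
  const parentId = commit.parents?.[0];

  const diff = await codecommit.getDifferences({
    repositoryName,
    beforeCommitSpecifier: parentId,
    afterCommitSpecifier: commitId,
  }).promise();

  const lines: string[] = [];
  for (const d of diff.differences ?? []) {
    if (!d.afterBlob?.blobId) continue;        // deleted files have no "after" blob
    const blob = await codecommit.getBlob({ repositoryName, blobId: d.afterBlob.blobId }).promise();
    lines.push(...blob.content.toString().split('\n'));
  }
  return lines;
}

export async function inspectCommit(repositoryName: string, commitId: string): Promise<void> {
  const lines = await loadChangedLines(repositoryName, commitId);
  const hasSecret = lines.some((line) => secretPatterns.some((re) => re.test(line)));

  if (hasSecret) {
    // Publish the custom event that the remediation rules listen for.
    await eventBridge.putEvents({
      Entries: [{
        EventBusName: 'CodeCommitSecurityEvents',
        Source: 'custom.codecommit.inspector',   // illustrative source name
        DetailType: 'CodeCommit Security Event',
        Detail: JSON.stringify({ repositoryName, commitId }),
      }],
    }).promise();
  }
}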

The raiseAlertLambda function uses the AWS SDK for JavaScript to send out a notification to all subscribers (that is, CodeCommit administrators) on an SNS topic. It takes no further action.

SNS.publish({
    TopicArn: <TOPIC_ARN>,
    Subject: `[ACTION REQUIRED] Secrets discovered in <repo>`,
    Message: `<Your message>`
})

Notification about secrets discovered in a commit in TestRepository

The lockRepositoryLambda function uses the AWS SDK for JavaScript to change the RepoState tag from ok to locked. This restricts access to members of the CodeCommitSuperUsers IAM group.

CodeCommit.tagResource({
    resourceArn: event.detail.repositoryArn,
    tags: {
        RepoState: "locked"
    }
})

In addition, the Lambda function uses SNS to send out a notification. The forcefulRevertLambda function runs the following git commands:

git clone <repository>
git checkout <branch>
git reset --hard <previousCommitId>
git push origin <branch> --force

These commands reset the repository to the last accepted commit, by forcefully removing the respective commit from the git history of your CodeCommit repo. I advise you to handle this with care and only activate it on a real project if you fully understand the consequences of rewriting git history.

The Node.js v12 runtime for Lambda does not have a git runtime installed by default. You can add one by using the git-lambda2 Lambda layer. This allows you to run git commands from within the Lambda function.
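If you provision the function with the CDK, attaching such a layer could look like the following fragment. The layer ARN is a placeholder for the published git-lambda2 ARN in your Region, and forcefulRevertFunction refers to the Lambda function defined elsewhere in the stack.

import * as lambda from 'aws-cdk-lib/aws-lambda';

// Placeholder ARN: look up the published git-lambda2 layer ARN for your Region and runtime.
const gitLayer = lambda.LayerVersion.fromLayerVersionArn(
  this,
  'GitLayer',
  'arn:aws:lambda:<region>:<account-id>:layer:git-lambda2:<version>'
);

// forcefulRevertFunction is the Lambda function created elsewhere in this stack.
forcefulRevertFunction.addLayers(gitLayer);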

Logs for the remediation measure Hard Reset

Finally, this Lambda function also sends out a notification. The complete code is available in the GitHub repo.

Using SNS to notify users

To notify users about secrets discovered and actions taken, you create an SNS topic and subscribe to it via email.

const topic = new Topic(this, "CodeCommitSecurityEventNotification", {
    displayName: "CodeCommitSecurityEventNotification",
});

topic.addSubscription(new subs.EmailSubscription(/* your email address */));

Testing the solution

You can test the deployed solution by running these two sets of commands. First, add a file with no credentials:

echo "Clean file - no credentials here" > clean_file.txt
git add clean_file.txt
git commit clean_file.txt -m "Adds clean_file.txt"
git push

Then add a file containing credentials:

SECRET_LIKE_STRING=$(cat /dev/urandom | env LC_CTYPE=C tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
echo "secret=$SECRET_LIKE_STRING" > problematic_file.txt
git add problematic_file.txt
git commit problematic_file.txt -m "Adds secret-like string to problematic_file.txt"
git push

The first set of commands creates, commits, and pushes an unproblematic file, clean_file.txt, that passes the checks of commitInspectLambda. The second set creates, commits, and pushes problematic_file.txt, which matches the regular expressions and triggers the remediation measures.

If you check your email, you soon receive notifications about actions taken by the Lambda functions.

Cleaning up

To avoid incurring charges, delete the resources by running cdk destroy and confirming the deletion.

Conclusion

This post demonstrates how you can implement a solution to discover secrets in commits to AWS CodeCommit repositories. It also defines different strategies to remediate this.

The CDK code to set up all components is minimal and can be extended for remediation measures. The template is portable between Regions and uses serverless technologies to minimize cost and complexity.

For more serverless learning resources, visit Serverless Land.

Automating deployments to Raspberry Pi devices using AWS CodePipeline

Post Syndicated from Ahmed ElHaw original https://aws.amazon.com/blogs/devops/automating-deployments-to-raspberry-pi-devices-using-aws-codepipeline/

Managing application deployments on a Raspberry Pi can be cumbersome, especially in headless mode and at scale, when the devices are placed outdoors and out of reach, such as in home automation projects, in the yard (for motion detection), or on the roof (as a humidity and temperature sensor). In these use cases, you have to connect remotely via Secure Shell (SSH) to administer the device.

It can be complicated to keep connecting physically, because you need a monitor, keyboard, and mouse. Alternatively, you can connect via SSH over your home network, provided your client workstation is on the same private network.

In this post, we discuss using Raspberry Pi as a headless server with minimal-to-zero direct interaction by using AWS CodePipeline. We examine two use cases:

  • Managing and automating operational tasks of the Raspberry Pi, running Raspbian OS or any other Linux distribution. For more information about this configuration, see Manage Raspberry Pi devices using AWS Systems Manager.
  • Automating deployments to one or more Raspberry Pi devices in headless mode (in which you don’t use a monitor or keyboard to run your device). If you use headless mode but still need to do some wireless setup, you can enable wireless networking and SSH when creating an image.

Solution overview

Our solution uses the following services:

We use CodePipeline to manage continuous integration and deployment to Raspberry Pi running Ubuntu Server 18 for ARM. As of this writing, CodeDeploy agents are supported on Windows OS, Red Hat, and Ubuntu.

For this use case, we use the image ubuntu-18.04.4-preinstalled-server-arm64+raspi3.img.

To close the loop, you edit your code or commit new revisions from your PC or Amazon Elastic Compute Cloud (Amazon EC2) to trigger the pipeline to deploy to Pi. The following diagram illustrates the architecture of our automated pipeline.

 

Solution Overview architectural diagram

Setting up a Raspberry Pi device

To set up a CodeDeploy agent on a Raspberry Pi device, the device should be running Ubuntu Server 18 for ARM, which is supported by the Raspberry Pi processor architecture and the CodeDeploy agent, and it should be connected to the internet. You will need a keyboard and a monitor for the initial setup.

Follow these instructions for your initial setup:

  1. Download the Ubuntu image.

Pick the image based on your Raspberry Pi model. For this use case, we use Raspberry Pi 4 with Ubuntu 18.04.4 LTS.

  2. Burn the Ubuntu image to your microSD card using disk imager software (or another reliable tool). For instructions, see Create an Ubuntu Image for a Raspberry Pi on Windows.
  3. Configure WiFi on the Ubuntu server.

After booting from the newly flashed microSD, you can configure the OS.

  4. To enable DHCP, enter the following YAML (or create the file if it doesn’t exist) in /etc/netplan/wireless.yaml:
network:
  version: 2
  wifis:
    wlan0:
      dhcp4: yes
      dhcp6: no
      access-points:
        "<your network ESSID>":
          password: "<your wifi password>"

Replace the variables <your network ESSID> and <your wifi password> with your wireless network SSID and password, respectively.

  5. Apply the netplan configuration by entering the following command:
ubuntu@ubuntu:~$ sudo netplan try

Installing CodeDeploy and registering Raspberry Pi as an on-premises instance

When the Raspberry Pi is connected to the internet, you’re ready to install the AWS Command Line Interface (AWS CLI) and the CodeDeploy agent to manage automated deployments through CodeDeploy.

To register an on-premises instance, you must use an AWS Identity and Access Management (IAM) identity to authenticate your requests. You can choose from the following options for the IAM identity and registration method you use:

  • An IAM user ARN. This is best for registering a single on-premises instance.
  • An IAM role to authenticate requests with periodically refreshed temporary credentials generated with the AWS Security Token Service (AWS STS). This is best for registering a large number of on-premises instances.

For this post, we use the first option and create an IAM user and register a single Raspberry Pi. You can use this procedure for a handful of devices. Make sure you limit the privileges of the IAM user to what you need to achieve; a scoped-down IAM policy is given in the documentation instructions. For more information, see Use the register command (IAM user ARN) to register an on-premises instance.

  1. Install the AWS CLI on Raspberry Pi with the following code:
ubuntu@ubuntu:~$ sudo apt install awscli
  2. Configure the AWS CLI and enter your newly created IAM access key, secret access key, and Region (for example, eu-west-1):
ubuntu@ubuntu:~$ sudo aws configure
AWS Access Key ID [None]: <IAM Access Key>
AWS Secret Access Key [None]: <Secret Access Key>
Default region name [None]: <AWS Region>
Default output format [None]: Leave default, press Enter.
  3. Now that the AWS CLI running on the Raspberry Pi has access to CodeDeploy API operations, you can register the device as an on-premises instance:
ubuntu@ubuntu:~$ sudo aws deploy register --instance-name rpi4UbuntuServer --iam-user-arn arn:aws:iam::<AWS_ACCOUNT_ID>:user/Rpi --tags Key=Name,Value=Rpi4 --region eu-west-1
Registering the on-premises instance... DONE
Adding tags to the on-premises instance... DONE

Tags allow you to assign metadata to your AWS resources. Each tag is a simple label consisting of a customer-defined key and an optional value that can make it easier to manage, search for, and filter resources by purpose, owner, environment, or other criteria.

When working with on-premises instances with CodeDeploy, tags are mandatory to select the instances for deployment. For this post, we tag the first device with Key=Name,Value=Rpi4. Generally speaking, it’s good practice to use tags on all applicable resources.
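To illustrate how such a tag is used, the following AWS SDK for JavaScript sketch creates a deployment group that targets on-premises instances by the tag registered earlier. This isn't part of the post's console-based setup; the application name, deployment group name, and service role ARN are placeholders.

import * as AWS from 'aws-sdk';

const codedeploy = new AWS.CodeDeploy({ region: 'eu-west-1' });

// Create a deployment group that selects registered on-premises instances
// tagged Name=Rpi4, matching the tag used in the register command above.
async function createPiDeploymentGroup(): Promise<void> {
  await codedeploy.createDeploymentGroup({
    applicationName: '<your-application-name>',             // placeholder
    deploymentGroupName: '<your-deployment-group-name>',    // placeholder
    serviceRoleArn: 'arn:aws:iam::<account-id>:role/<codedeploy-service-role>',
    onPremisesInstanceTagFilters: [
      { Key: 'Name', Value: 'Rpi4', Type: 'KEY_AND_VALUE' },
    ],
  }).promise();
}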

You should see something like the following screenshot on the CodeDeploy console.

CodeDeploy console

Or from the CLI, you should see the following output:

ubuntu@ubuntu:~$ sudo aws deploy list-on-premises-instances
{
    "instanceNames": [
        "rpi4UbuntuServer"
    ]
}
  4. Install the CodeDeploy agent:
ubuntu@ubuntu:~$ sudo aws deploy install --override-config --config-file /etc/codedeploy-agent/conf/codedeploy.onpremises.yml --region eu-west-1

If the preceding command fails due to dependencies, you can get the CodeDeploy package and install it manually:

ubuntu@ubuntu:~$ sudo apt-get install ruby
ubuntu@ubuntu:~$ sudo wget https://aws-codedeploy-us-west-2.s3.amazonaws.com/latest/install
--2020-03-28 18:58:15--  https://aws-codedeploy-us-west-2.s3.amazonaws.com/latest/install
Resolving aws-codedeploy-us-west-2.s3.amazonaws.com (aws-codedeploy-us-west-2.s3.amazonaws.com)... 52.218.249.82
Connecting to aws-codedeploy-us-west-2.s3.amazonaws.com (aws-codedeploy-us-west-2.s3.amazonaws.com)|52.218.249.82|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13819 (13K) []
Saving to: ‘install’
install 100%[====================================================================>]  13.50K  --.-KB/s    in 0.003s 
2020-03-28 18:58:16 (3.81 MB/s) - ‘install’ saved [13819/13819]
ubuntu@ubuntu:~$ sudo chmod +x ./install
ubuntu@ubuntu:~$ sudo ./install auto

 Check the service status with the following code:

ubuntu@ubuntu:~$ sudo service codedeploy-agent status
codedeploy-agent.service - LSB: AWS CodeDeploy Host Agent
   Loaded: loaded (/etc/init.d/codedeploy-agent; generated)
   Active: active (running) since Sat 2020-08-15 14:18:22 +03; 17s ago
     Docs: man:systemd-sysv-generator(8)
    Tasks: 3 (limit: 4441)
   CGroup: /system.slice/codedeploy-agent.service
           └─4243 codedeploy-agent: master 4243

Start the service (if not started automatically):

ubuntu@ubuntu:~$ sudo service codedeploy-agent start

Congratulations! Now that the CodeDeploy agent is installed and the Raspberry Pi is registered as an on-premises instance, CodeDeploy can deploy your application build to the device.

Creating your source stage

You’re now ready to create your source stage.

  1. On the CodeCommit console, under Source, choose Repositories.
  2. Choose Create repository.

For instructions on connecting your repository from your local workstation, see Setup for HTTPS users using Git credentials.

CodeCommit repo

  3. In the root directory of the repository, include an AppSpec file for an EC2/On-Premises deployment, where the file name must be appspec.yml for a YAML-based file. The file name is case-sensitive.

AppSpec file

The following example code is from the appspec.yml file:

version: 0.0
os: linux
files:
  - source: /
    destination: /home/ubuntu/AQI/
hooks:
  BeforeInstall:
    - location: scripts/testGPIO.sh
      timeout: 60
      runas: root
  AfterInstall:
    - location: scripts/testSensors.sh
      timeout: 300
      runas: root
  ApplicationStart:
    - location: startpublishdht11toshadow.sh
    - location: startpublishnovatoshadow.sh
      timeout: 300
      runas: root

The files section defines the files to copy from the repository to the destination path on the Raspberry Pi.

The hooks section runs one time per deployment to an instance. If an event hook isn’t present, no operation runs for that event. This section is required only if you’re running scripts as part of the deployment. It’s useful to implement some basic testing before and after installation of your application revisions. For more information about hooks, see AppSpec ‘hooks’ section for an EC2/On-Premises deployment.

Creating your deploy stage

To create your deploy stage, complete the following steps:

  1. On the CodeDeploy console, choose Applications.
  2. Create your application and deployment group.
    1. For Deployment type, select In-place.

Deployment group

  3. For Environment configuration, select On-premises instances.
  4. Add the tags that you registered the instance with in the previous step (for this post, we add the key-value pair Name=Rpi4).

on-premises tags

Creating your pipeline

You’re now ready to create your pipeline.

  1. On the CodePipeline console, choose Pipelines.
  2. Choose Create pipeline.
  3. For Pipeline name, enter a descriptive name.
  4. For Service role, select New service role.
  5. For Role name, enter your service role name.
  6. Leave the advanced settings at their default.
  7. Choose Next.

 

  Pipeline settings

  1. For Source provider, choose AWS CodeCommit.
  2. For Repository name, choose the repository you created earlier.
  3. For Branch name, enter your repository branch name.
  4. For Change detection options, select Amazon CloudWatch Events.
  5. Choose Next.

Source stage

 

As an optional step, you can add a build stage, depending on whether your application is built with an interpreted language like Python or a compiled one like .NET C#. CodeBuild creates a fully managed build server on your behalf that runs the build commands using the buildspec.yml in the source code root directory.

 

  1. For Deploy provider, choose AWS CodeDeploy.
  2. For Region, choose your Region.
  3. For Application name, choose your application.
  4. For Deployment group, choose your deployment group.
  5. Choose Next.

Deploy stage

  1. Review your settings and create your pipeline.

Cleaning up

If you no longer plan to deploy to your Raspberry Pi and want to remove the CodeDeploy agent from your device, you can clean up with the following steps.

Uninstalling the agent

Automatically uninstall the CodeDeploy agent and remove the configuration file from an on-premises instance with the following code:

ubuntu@ubuntu:~$ sudo aws deploy uninstall
(Reading database ... 238749 files and directories currently installed.)
Removing codedeploy-agent (1.0-1.1597) ...
Processing triggers for systemd (237-3ubuntu10.39) ...
Processing triggers for ureadahead (0.100.0-21) ...
Uninstalling the AWS CodeDeploy Agent... DONE
Deleting the on-premises instance configuration... DONE

The uninstall command does the following:

  1. Stops the running CodeDeploy agent on the on-premises instance.
  2. Uninstalls the CodeDeploy agent from the on-premises instance.
  3. Removes the configuration file from the on-premises instance. (For Ubuntu Server and RHEL, this is /etc/codedeploy-agent/conf/codedeploy.onpremises.yml. For Windows Server, this is C:\ProgramData\Amazon\CodeDeploy\conf.onpremises.yml.)

De-registering the on-premises instance

This step is only supported using the AWS CLI. To de-register your instance, enter the following code:

ubuntu@ubuntu:~$ sudo aws deploy deregister --instance-name rpi4UbuntuServer --region eu-west-1
Retrieving on-premises instance information... DONE
IamUserArn: arn:aws:iam::XXXXXXXXXXXX:user/Rpi
Tags: Key=Name,Value=Rpi4
Removing tags from the on-premises instance... DONE
Deregistering the on-premises instance... DONE
Deleting the IAM user policies... DONE
Deleting the IAM user access keys... DONE
Deleting the IAM user (Rpi)... DONE

Optionally, delete your application from CodeDeploy, and your repository from CodeCommit and CodePipeline from the respective service consoles.

Conclusion

You’re now ready to automate your deployments to your Raspberry Pi or any supported on-premises operating system. Automated deployments and source code version control free up more time for developing your applications. Continuous deployment helps with the automation and version tracking of the scripts and applications deployed on the device.

For more information about IoT projects created using a Raspberry Pi, see my Air Pollution demo and Kid Monitor demo.

About the author

Ahmed ElHaw is a Sr. Solutions Architect at Amazon Web Services (AWS) with background in telecom, web development and design, and is passionate about spatial computing and AWS serverless technologies. He enjoys providing technical guidance to customers, helping them architect and build solutions that make the best use of AWS. Outside of work he enjoys spending time with his kids and playing video games.

Agile website delivery with Hugo and AWS Amplify

Post Syndicated from Nigel Harris original https://aws.amazon.com/blogs/devops/agile-website-delivery-with-hugo-and-aws-amplify/

In this post, we show how you can rapidly configure and deploy a website using Hugo, an AWS Cloud9 integrated development environment (IDE) for content editing, AWS CodeCommit for source code control, and AWS Amplify to implement a source code-controlled, automated deployment process.

When hosting a website on AWS, you can choose from several options. One popular option is to use Amazon Simple Storage Service (Amazon S3) to host a static website. If you prefer full access to the infrastructure hosting your website, you can use the NGINX Quick Start to quickly deploy web server infrastructure using AWS CloudFormation.

Static website generators such as Hugo and MkDocs accelerate the website content generation process, and can be a valuable tool when trying to rapidly deliver technical documentation or similar content. Typically, the content creation process requires programming in HTML and CSS.

Hugo is written in Go and available under the Apache 2.0 license. It provides several themes (collections of layouts) that accelerate website creation by drastically reducing the need to focus on format. You can author content in Markdown and output in multiple languages and formats (including ebook formats). Excellent examples of public websites built using Hugo include Digital.gov and Kubernetes.io.

 

Solution overview

This solution illustrates how to provision a hosted, source code-controlled Hugo generated website using CodeCommit and Amplify Console. The provisioned website is configured with a custom subdomain and an SSL certificate. We use an AWS Cloud9 IDE to enable content creation in the cloud.

 

Setting up an AWS Cloud9 IDE

Start by provisioning an AWS Cloud9 IDE. AWS Cloud9 environments run using Amazon Elastic Compute Cloud (Amazon EC2). You need to provision your AWS Cloud9 environment into an existing public subnet in an Amazon Virtual Private Cloud (Amazon VPC) within your AWS account. You can complete this in the following steps:

1. Access your AWS account with an identity that has administrative privileges. If you don’t have an AWS account, you can create one.

2. Create a new AWS Cloud9 environment using the wizard on the AWS Cloud9 console.

3. Enter a name for your desktop and an optional description.

4. Choose Next step.

Naming your Cloud 9 environment

5. In the Environment settings section, for Environment type, select Create a new EC2 instance for environment (direct access).

6. For Instance type, select your preferred instance type (the default, t2.micro, works for this use case)

7. Under Network settings, for Network (VPC), choose a VPC that you wish to deploy your AWS Cloud9 instance into. You may wish to use your default VPC, which is suitable for the purpose of this tutorial.

8. Choose a public subnet from this VPC for deployment.

Cloud9 Settings

9. Leave all other settings unchanged and choose Next step.
10. Review your choices and choose Create environment.

Environment creation takes a few minutes to complete. When the environment is ready, you receive access to the AWS Cloud9 IDE in your browser. We return to it shortly to develop content for your Hugo website.

Your Cloud9 Desktop

Configuring a source code repository to track content changes

Static website generators enable rapid changes to website content and layout. Source control management (SCM) systems provide a revision history for your code, and allow you to revert to previous versions of a project when unintended changes are introduced. SCM systems become increasingly important as the velocity of change and the number of team members introducing change increases.

You now create a source code repository to track changes to your content. You use CodeCommit, a fully-managed source control service that hosts secure Git-based repositories.

1. In a new browser, sign in to the CodeCommit Console and create a new repository.

2. For Repository name, enter amplify-website.

3. For Description, enter an appropriate description.

4. Choose Create.

Create repository

Repository creation takes just a few moments.

5. In the Connection steps section, choose the appropriate method to connect to your repository based on how you accessed your AWS account.

For this post, I signed in to my AWS account using federated access, so I choose the HTTPS Git Remote CodeCommit (HTTPS-GRC) tab. This is the recommended connection method for this sign-in type. You can also configure a connection to your repository using SSH or Git credentials over HTTPS. SSH and Git credentials over HTTPS are appropriate methods if you have signed in to your AWS account as an AWS Identity and Access Management (IAM) user. The Amazon CodeCommit console provides additional information regarding each of these connection types, including links to supporting documentation.

Connect to Repo

 

Configuring and deploying an example website

You’re now ready to configure and deploy your website.

1. Return to the browser with your AWS Cloud9 IDE and place your cursor in the lower terminal pane of the IDE.

The terminal pane provides Bash shell access on the EC2 instance running AWS Cloud9.

You now create a Hugo website. The website design is based on Hugo-theme-learn. Themes are collections of Hugo layouts that take all the hassle out of building your website. Learn is a multilingual-ready theme authored by Mathieu Cornic, designed for building technical documentation websites.

Hugo provides a variety of themes on their website. Many of the themes include bundled example website content that you can easily adapt by following the accompanying theme documentation.

2. Enter the following code to download an existing example website stored as a .zip file, extract it, and commit the contents into CodeCommit from your AWS Cloud9 IDE:

cd ~/environment
aws s3 cp s3://ee-assets-prod-us-east-1/modules/3c5ba9cb6ff44465b96993d210f67147/v1/example-website.zip ~/environment/example-website.zip
unzip example-website.zip
rm example-website.zip

The following screenshot shows your output.

example website copy commands

 

Next, we run commands to create a directory to host your website and initialize it as a Git repository with a new default branch called main (formerly referred to as the master branch), local to our AWS Cloud9 instance. We then copy files into place from the example website. After adding and committing them locally, we push all our changes to the remote CodeCommit repository.

3. Enter the following code:

mkdir ~/environment/amplify-website/
cd ~/environment/amplify-website/
git init
git remote add origin codecommit::us-east-1://amplify-website
git remote -v
git checkout -b main
cp -rp ~/environment/example-website/* ~/environment/amplify-website/
git add *
git commit -am "first commit"
git push -u origin main

Deployment and hosting is achieved by using Amplify Console, a static web hosting service that accelerates your application release cycle by providing a simple CI/CD workflow for building and deploying static web applications.

4. On the Amplify console, under Deploy, choose Get Started.

Amplify banner

5. On the Get started with the Amplify Console page, select AWS CodeCommit as your source code repository.

6. Choose Continue.

Amplify get started page

7. On the Add repository branch page, for Recently updated repositories, choose your repository.

8. For Branch, choose main.

9. Choose Next.

add branch

On the Configure build settings page, Amplify automatically uses the amplify.yml file for build settings for your deployment. You committed this into your source code repository in the previous step. The amplify.yml file is detected from the root of your website directory structure.

10. Choose Next.

Amplify configure build settings

11. On the review page, choose Save and deploy.

Amplify builds and deploys your Amplify website within minutes, and shows you its progress. When deployment is complete, you can access the website to see the sample content.

amplify website

The following screenshot shows your example website.

sample website

 

Promoting changes to the website

We can now update the line of text in the home page and commit and publish this change.

1. Return to the browser with your AWS Cloud9 IDE and place your cursor in the lower terminal pane of the IDE.

2. On the navigation pane, choose the file ~/environment/amplify-website/workshop/content/_index.en.md.

The contents of the file open under a new tab in the upper pane.

3. Change the string First Line of Text to First Update to Website.

content change

4. From the File menu, choose Save to save the changes you have made to the _index.en.md file.

save content changes

5. Commit the changes and push to CodeCommit by running the following command in the lower terminal pane in AWS Cloud9:

git add *; git commit -am "homepage update"; git push origin main

The output in your AWS Cloud9 terminal should appear similar to the following screenshot.

commit output

6. Return to the Amplify Console and observe how the committed change in CodeCommit is automatically detected. Amplify runs deployment steps to push your changes to the website.

amplify deploy changes

7. Access the URL of your website after this update is complete to verify that the first line of text on your home page has changed.

updated website

You can repeat this process to make source-code controlled, automated changes to your website.

Adding a custom domain

Adding a custom domain to your Amplify configuration makes it easier for clients to access your content. You can register new domains using Amazon Route 53 or, if you have an existing domain registered outside of AWS, you can integrate it with Route 53 and Amplify. For our use case, the domain www.hugoonamplify.com is a domain name registered with a third-party registrar (NameCheap). You can manage DNS configurations for domains registered outside of AWS using Route 53.

Start by configuring a public hosted zone in Route 53.

1. On the Route 53 console, choose Hosted zones.

2. Choose Create hosted zone.

hosted zones

3. For Domain name, enter hugoonamplify.com.

4. For Description, enter an appropriate description.

5. For Type, select Public hosted zone.

hosted zones configuration

6. Choose Create hosted zone.

7. Save the addresses of the name servers that respond to client DNS lookup requests for the custom domain.

create hosted zone

8. In a separate browser, access the console of your DNS registrar.

9. Configure a custom DNS name servers setting on the console of the third-party domain name registrar.

This configuration specifies the Route 53 assigned name servers as authoritative DNS for our custom domain. For this use case, propagation of this change may take up to 48 hours.

namecheap console

10. Use https://who.is to verify that the AWS name servers are listed correctly for your custom domain to internet clients.

whois lookup

You can now set up your custom domain in Amplify. Amplify helps you configure DNS and set up SSL for your desired custom domain.

domain management

11. On the Amplify Console, under App settings, choose Domain management.

12. Choose Add domain.

13. For Domain, enter your custom domain name (hugoonamplify.com).

14. Choose Configure domain.

15. For Subdomain, I only want to set up www and choose to exclude the root of my custom domain.

16. Choose Save.

Amplify begins the process of creating the SSL certificates. Amplify sends a notification that it’s issuing an SSL certificate to secure traffic to the custom domain.

ssl domain management

After a few moments, it proceeds to SSL configuration and indicates that ownership of domain is in progress.

ssl domain management configuration

Amplify verifies domain ownership by creating a sample CNAME record in your hosted zone file. When ownership is verified, the domain is propagated onto an Amazon CloudFront distribution managed by the Amplify service, and domain activation is complete.

ssl domain management configured

Clients can now access the website using the custom domain name www.hugoonamplify.com.

access website via custom domain

 

Establishing a subdomain for development

You can create a development website in Amplify that is aligned to a development code branch in CodeCommit that enables testing changes prior to production release.

1. Access the AWS Cloud9 IDE and use the terminal to enter the following commands, which create a development branch from the current content of the main branch and push it to CodeCommit:

git checkout -b development
git branch
git remote -v
git add *; git commit -am "first development commit";
git push -u origin development

2. Open and edit the file ~/environment/amplify-website/workshop/content/_index.en.md and change the string Update to Website to something else.

Alternatively, run the following Unix sed command from the terminal in AWS Cloud9 to make that content change:

sed -i 's/Update to Website/Update to Development/g' ~/environment/amplify-website/workshop/content/_index.en.md

3. Commit and push your change with the following code:

git add *; git commit -am "second development commit"; git push -u origin development

You now configure a subdomain in Amplify to allow developers to review changes.

4. Return to the amplify-website app.

5. Choose Connect branch.

connect branch

6. For Branch, choose the development branch you created and committed code into.

7. Choose Next.

add development branch

Amplify builds a second website based on the contents of the development branch. You can see the instance of your website matched to the development code branch on Amplify Console.

amplify two branches

8. Access the domain management menu item in your Amplify application to add a friendly subdomain.

9. Edit the domain and add a subdomain item with a name of your choice (for example, dev).

10. Associate it to the development branch containing the committed code and content changes.

11. Choose Add.

add dev domain

You can access the subdomain to verify the changes.

verify domain

Controlling access to development

You may wish to restrict access to new content as it’s deployed into the development website.

1. On Amplify Console, choose your application.

2. Choose Access control.

3. Under Access control settings, choose your preferred settings.

You have the option to restrict access globally or on a branch-by-branch basis. For this use case, we create a simple password protection for a user named developer on the development branch and site.

access control settings

 

Cleaning up

Unless you plan to keep the website you have constructed, you can quickly clean up provisioned assets and avoid any unnecessary costs.

1. On Amplify Console, select the app you created.

2. From the Actions drop-down menu, choose Delete app.

3. In the pop-up window, confirm the deletion.

4. On the CodeCommit dashboard, select the repository you created.

5. Choose Delete.

6. In the pop-up window, confirm the deletion.

7. On the AWS Cloud9 dashboard, select the IDE you created.

8. Choose Delete.

9. In the pop-up window, confirm the deletion.

 

Conclusion

Hugo is a powerful tool that enables accelerated delivery of content in a variety of formats including image portfolios, online resume presentation, blogging, and technical documentation. Amplify Console provides a convenient, easy-to-use, static web hosting service that can greatly accelerate delivery of static content.

When combining Hugo with Amplify Console, you can rapidly deploy websites in minutes with features such as friendly URLs, environments matched to code branches, and encryption (SSL). Visit gohugo.io to find out more about Hugo. For more information about how Amplify Console can help you rapidly deploy Hugo and other modern web applications, see the AWS Amplify Console User Guide.

Nigel Harris


Nigel Harris is an Enterprise Solutions Architect at Amazon Web Services. He works with AWS customers to provide guidance and technical assistance on AWS architectures.

Integrating AWS CloudFormation Guard into CI/CD pipelines

Post Syndicated from Sergey Voinich original https://aws.amazon.com/blogs/devops/integrating-aws-cloudformation-guard/

In this post, we discuss and build a managed continuous integration and continuous deployment (CI/CD) pipeline that uses AWS CloudFormation Guard to automate and simplify pre-deployment compliance checks of your AWS CloudFormation templates. This enables your teams to define a single source of truth for what constitutes valid infrastructure definitions, stay compliant with your company guidelines, and streamline the deployment lifecycle of your AWS resources.

We use the following AWS services and open-source tools to set up the pipeline:

Solution overview

The CI/CD workflow includes the following steps:

  1. A code change is committed and pushed to the CodeCommit repository.
  2. CodePipeline automatically triggers a CodeBuild job.
  3. CodeBuild spins up a compute environment and runs the phases specified in the buildspec.yml file:
    • Clone the code from the CodeCommit repository (CloudFormation template, rule set for CloudFormation Guard, buildspec.yml file).
    • Clone the code from the CloudFormation Guard repository on GitHub.
    • Provision the build environment with necessary components (rust, cargo, git, build-essential).
    • Download CloudFormation Guard release from GitHub.
    • Run a validation check of the CloudFormation template.
    • If the validation is successful, pass the control over to CloudFormation and deploy the stack. If the validation fails, stop the build job and print a summary to the build job log.

The following diagram illustrates this workflow.


Architecture Diagram of CI/CD Pipeline with CloudFormation Guard

Prerequisites

For this walkthrough, complete the following prerequisites:

Creating your CodeCommit repository

Create your CodeCommit repository by running a create-repository command in the AWS CLI:

aws codecommit create-repository --repository-name cfn-guard-demo --repository-description "CloudFormation Guard Demo"

The following screenshot indicates that the repository has been created.


CodeCommit Repository has been created

Populating the CodeCommit repository

Populate your repository with the following artifacts:

  1. A buildspec.yml file. Modify the following code as per your requirements:
version: 0.2
env:
  variables:
    # Defining CloudFormation Template and Ruleset as variables - part of the code repo
    CF_TEMPLATE: "cfn_template_file_example.yaml"
    CF_ORG_RULESET:  "cfn_guard_ruleset_example"
phases:
  install:
    commands:
      - apt-get update
      - apt-get install build-essential -y
      - apt-get install cargo -y
      - apt-get install git -y
  pre_build:
    commands:
      - echo "Setting up the environment for AWS CloudFormation Guard"
      - echo "More info https://github.com/aws-cloudformation/cloudformation-guard"
      - echo "Install Rust"
      - curl https://sh.rustup.rs -sSf | sh -s -- -y
  build:
    commands:
       - echo "Pull GA release from github"
       - echo "More info https://github.com/aws-cloudformation/cloudformation-guard/releases"
       - wget https://github.com/aws-cloudformation/cloudformation-guard/releases/download/1.0.0/cfn-guard-linux-1.0.0.tar.gz
       - echo "Extract cfn-guard"
       - tar xvf cfn-guard-linux-1.0.0.tar.gz .
  post_build:
    commands:
       - echo "Validate CloudFormation template with cfn-guard tool"
       - echo "More information https://github.com/aws-cloudformation/cloudformation-guard/blob/master/cfn-guard/README.md"
       - cfn-guard-linux/cfn-guard check --rule_set $CF_ORG_RULESET --template $CF_TEMPLATE --strict-checks
artifacts:
  files:
    - cfn_template_file_example.yaml
  name: guard_templates
  2. An example of a rule set file (cfn_guard_ruleset_example) for CloudFormation Guard. Modify the following code as per your requirements:
#CFN Guard rule set example

#List of multiple references
let allowed_azs = [us-east-1a,us-east-1b]
let allowed_ec2_instance_types = [t2.micro,t3.nano,t3.micro]
let allowed_security_groups = [sg-08bbcxxc21e9ba8e6,sg-07b8bx98795dcab2]

#EC2 Policies
AWS::EC2::Instance AvailabilityZone IN %allowed_azs
AWS::EC2::Instance ImageId == ami-0323c3dd2da7fb37d
AWS::EC2::Instance InstanceType IN %allowed_ec2_instance_types
AWS::EC2::Instance SecurityGroupIds == ["sg-07b8xxxsscab2"]
AWS::EC2::Instance SubnetId == subnet-0407a7casssse558

#EBS Policies
AWS::EC2::Volume AvailabilityZone == us-east-1a
AWS::EC2::Volume Encrypted == true
AWS::EC2::Volume Size == 50 |OR| AWS::EC2::Volume Size == 100
AWS::EC2::Volume VolumeType == gp2
  3. An example of a CloudFormation template file (.yaml). Modify the following code as per your requirements:
AWSTemplateFormatVersion: "2010-09-09"
Description: "EC2 instance with encrypted EBS volume for AWS CloudFormation Guard Testing"

Resources:

  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: 'ami-0323c3dd2da7fb37d'
      AvailabilityZone: 'us-east-1a'
      KeyName: "your-ssh-key"
      InstanceType: 't3.micro'
      SubnetId: 'subnet-0407a7xx68410e558'
      SecurityGroupIds:
        - 'sg-07b8b339xx95dcab2'
      Volumes:
        - Device: '/dev/sdf'
          VolumeId: !Ref EBSVolume
      Tags:
        - Key: Name
          Value: cfn-guard-ec2

  EBSVolume:
    Type: AWS::EC2::Volume
    Properties:
      Size: 100
      AvailabilityZone: 'us-east-1a'
      Encrypted: true
      VolumeType: gp2
      Tags:
        - Key: Name
          Value: cfn-guard-ebs
    DeletionPolicy: Snapshot

Outputs:
  InstanceID:
    Description: The Instance ID
    Value: !Ref EC2Instance
  Volume:
    Description: The Volume ID
    Value: !Ref EBSVolume
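
Before committing these artifacts, you can optionally run the same CloudFormation Guard check locally that CodeBuild runs in the pipeline. This is a sketch that assumes a Linux environment and uses the same cfn-guard 1.0.0 release and flags as the buildspec above:

# Download and extract the cfn-guard 1.0.0 release (same version the buildspec uses)
wget https://github.com/aws-cloudformation/cloudformation-guard/releases/download/1.0.0/cfn-guard-linux-1.0.0.tar.gz
tar xvf cfn-guard-linux-1.0.0.tar.gz

# Validate the template against the rule set locally before pushing
cfn-guard-linux/cfn-guard check \
  --rule_set cfn_guard_ruleset_example \
  --template cfn_template_file_example.yaml \
  --strict-checks
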
The following screenshot shows a potential CodeCommit repository structure.

Optional CodeCommit repository structure

Creating a CodeBuild project

Our CodeBuild project runs CloudFormation Guard to validate our CloudFormation templates as a phase of the CI process.

  1. On the CodeBuild console, choose Build projects.
  2. Choose Create build project.
  3. For Project name, enter your project name.
  4. For Description, enter a description.
Create CodeBuild project

  1. For Source provider, choose AWS CodeCommit.
  2. For Repository, choose the CodeCommit repository you created in the previous step.
Define the source for your CodeBuild project

To set up the CodeBuild environment, we use a managed image based on Ubuntu 18.04.

  1. For Environment Image, select Managed image.
  2. For Operating system, choose Ubuntu.
  3. For Service role, select New service role.
  4. For Role name, enter your service role name.
Set up the environment, OS image, and other settings for CodeBuild

  1. Leave the default settings for additional configuration, buildspec, batch configuration, artifacts, and logs.

You can also use CodeBuild with custom build environments to help you optimize billing and improve the build time.

Creating IAM roles and policies

Our CI/CD pipeline needs two AWS Identity and Access Management (IAM) roles to run properly: one role for CodePipeline to work with other resources and services, and one role for AWS CloudFormation to run the deployments that passed the validation check in the CodeBuild phase.

Creating permission policies

Create your permission policies first. The following code is the policy in JSON format for CodePipeline:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "codecommit:UploadArchive",
                "codecommit:CancelUploadArchive",
                "codecommit:GetCommit",
                "codecommit:GetUploadArchiveStatus",
                "codecommit:GetBranch",
                "codestar-connections:UseConnection",
                "codebuild:BatchGetBuilds",
                "codedeploy:CreateDeployment",
                "codedeploy:GetApplicationRevision",
                "codedeploy:RegisterApplicationRevision",
                "codedeploy:GetDeploymentConfig",
                "codedeploy:GetDeployment",
                "codebuild:StartBuild",
                "codedeploy:GetApplication",
                "s3:*",
                "cloudformation:*",
                "ec2:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "*",
            "Condition": {
                "StringEqualsIfExists": {
                    "iam:PassedToService": [
                        "cloudformation.amazonaws.com",
                        "ec2.amazonaws.com"
                    ]
                }
            }
        }
    ]
}

To create your policy for CodePipeline, run the following CLI command:

aws iam create-policy --policy-name CodePipeline-Cfn-Guard-Demo --policy-document file://CodePipelineServiceRolePolicy_example.json

Capture the policy ARN that you get in the output to use in the next steps.
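
If you script these steps, you can capture the ARN directly instead of copying it from the output. The following is a minimal sketch that uses the CLI's --query option; the shell variable name is only illustrative:

# Create the policy and store its ARN in a shell variable for later use
CODEPIPELINE_POLICY_ARN=$(aws iam create-policy \
  --policy-name CodePipeline-Cfn-Guard-Demo \
  --policy-document file://CodePipelineServiceRolePolicy_example.json \
  --query 'Policy.Arn' --output text)
echo "$CODEPIPELINE_POLICY_ARN"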

The following code is the policy in JSON format for AWS CloudFormation:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "iam:CreateServiceLinkedRole",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "iam:AWSServiceName": [
                        "autoscaling.amazonaws.com",
                        "ec2scheduled.amazonaws.com",
                        "elasticloadbalancing.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:GetObjectAcl",
                "s3:GetObject",
                "cloudwatch:*",
                "ec2:*",
                "autoscaling:*",
                "s3:List*",
                "s3:HeadBucket"
            ],
            "Resource": "*"
        }
    ]
}

Create the policy for AWS CloudFormation by running the following CLI command:

aws iam create-policy --policy-name CloudFormation-Cfn-Guard-Demo --policy-document file://CloudFormationRolePolicy_example.json

Capture the policy ARN that you get in the output to use in the next steps.

Creating roles and trust policies

The following code is the trust policy for CodePipeline in JSON format:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "codepipeline.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Create your role for CodePipeline with the following CLI command:

aws iam create-role --role-name CodePipeline-Cfn-Guard-Demo-Role --assume-role-policy-document file://RoleTrustPolicy_CodePipeline.json

Capture the role name for the next step.

The following code is the trust policy for AWS CloudFormation in JSON format:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudformation.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Create your role for AWS CloudFormation with the following CLI command:

aws iam create-role --role-name CF-Cfn-Guard-Demo-Role --assume-role-policy-document file://RoleTrustPolicy_CloudFormation.json

Capture the role name for the next step.

 

Finally, attach the permissions policies created in the previous step to the IAM roles you created:

aws iam attach-role-policy --role-name CodePipeline-Cfn-Guard-Demo-Role --policy-arn "arn:aws:iam::<AWS Account Id>:policy/CodePipeline-Cfn-Guard-Demo"

aws iam attach-role-policy --role-name CF-Cfn-Guard-Demo-Role --policy-arn "arn:aws:iam::<AWS Account Id>:policy/CloudFormation-Cfn-Guard-Demo"
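
To confirm that the permission policies are attached as expected, you can list the attached policies for each role. A quick sketch using the role names created above:

# Verify the attached permission policies for each role
aws iam list-attached-role-policies --role-name CodePipeline-Cfn-Guard-Demo-Role
aws iam list-attached-role-policies --role-name CF-Cfn-Guard-Demo-Role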

Creating a pipeline

We can now create our pipeline to assemble all the components into one managed, continuous mechanism.

  1. On the CodePipeline console, choose Pipelines.
  2. Choose Create new pipeline.
  3. For Pipeline name, enter a name.
  4. For Service role, select Existing service role.
  5. For Role ARN, choose the service role you created in the previous step.
  6. Choose Next.
Setting up the CodePipeline environment

  1. In the Source section, for Source provider, choose AWS CodeCommit.
  2. For Repository name, enter your repository name.
  3. For Branch name, choose master.
  4. For Change detection options, select Amazon CloudWatch Events.
  5. Choose Next.
Adding CodeCommit to CodePipeline

  1. In the Build section, for Build provider, choose AWS CodeBuild.
  2. For Project name, choose the CodeBuild project you created.
  3. For Build type, select Single build.
  4. Choose Next.
Adding the build project to the pipeline stage

Now we will create a deploy stage in our CodePipeline to deploy CloudFormation templates that passed the CloudFormation Guard inspection in the CI stage.

  1. In the Deploy section, for Deploy provider, choose AWS CloudFormation.
  2. For Action mode, choose Create or update stack.
  3. For Stack name, enter a stack name.
  4. For Artifact name, choose BuildArtifact.
  5. For File name, enter the name of the CloudFormation template in your CodeCommit repository (in our demo, cfn_template_file_example.yaml).
  6. For Role name, choose the role you created earlier for CloudFormation.
Adding the deploy stage to CodePipeline

  7. Review your selections for the pipeline. The stages and action providers in each stage are shown in the order in which they will be created. Choose Create pipeline. Our CodePipeline is ready.

Validating the CI/CD pipeline operation

Our CodePipeline has two basic flows and outcomes. If the CloudFormation template complies with our CloudFormation Guard rule set file, the resources in the template deploy successfully (in our use case, we deploy an EC2 instance with an encrypted EBS volume).

CloudFormation console showing the deployed stack

If our CloudFormation template doesn’t comply with the policies specified in our CloudFormation Guard rule set file, our CodePipeline stops at the CodeBuild step and you see an error in the build job log indicating the resources that are non-compliant:

[EBSVolume] failed because [Encrypted] is [false] and the permitted value is [true]
[EC2Instance] failed because [t3.2xlarge] is not in [t2.micro,t3.nano,t3.micro] for [InstanceType]
Number of failures: 2

Note: To demonstrate this behavior, we changed the CloudFormation template to use an unencrypted EBS volume and switched the EC2 instance type to t3.2xlarge, neither of which adheres to the rules specified in the Guard rule set file.
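
You can also inspect the pipeline from the CLI rather than the console. The following sketch assumes the pipeline name you entered earlier (shown here as a placeholder); the failing CodeBuild action appears in the stage state:

# Show the current state of each stage and action in the pipeline
aws codepipeline get-pipeline-state --name <your-pipeline-name>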

Cleaning up

To avoid incurring future charges, delete the resources that we have created during the walkthrough:

  • CloudFormation stack resources that were deployed by the CodePipeline
  • CodePipeline that we have created
  • CodeBuild project
  • CodeCommit repository

Conclusion

In this post, we covered how to integrate CloudFormation Guard into CodePipeline and fully automate pre-deployment compliance checks of your CloudFormation templates. This allows your teams to have an end-to-end automated CI/CD pipeline with minimal operational overhead and stay compliant with your organizational infrastructure policies.

Complete CI/CD with AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline

Post Syndicated from Nitin Verma original https://aws.amazon.com/blogs/devops/complete-ci-cd-with-aws-codecommit-aws-codebuild-aws-codedeploy-and-aws-codepipeline/

Many organizations have been shifting to DevOps, the combination of cultural philosophies, practices, and tools that increases your organization’s ability to deliver applications and services at high velocity; for example, evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes.

DevOps feedback flow diagram

An integral part of DevOps is adopting the culture of continuous integration and continuous delivery/deployment (CI/CD), where a commit or change to code passes through various automated stage gates, all the way from building and testing to deploying applications, from development to production environments.

This post uses the AWS suite of CI/CD services to compile, build, and install a version-controlled Java application onto a set of Amazon Elastic Compute Cloud (Amazon EC2) Linux instances via a fully automated and secure pipeline. The goal is to promote a code commit or change to pass through various automated stage gates all the way from development to production environments, across AWS accounts.

AWS services

This solution uses the following AWS services:

  • AWS CodeCommit – A fully-managed source control service that hosts secure Git-based repositories. CodeCommit makes it easy for teams to collaborate on code in a secure and highly scalable ecosystem. This solution uses CodeCommit to create a repository to store the application and deployment codes.
  • AWS CodeBuild – A fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy, on a dynamically created build server. This solution uses CodeBuild to build and test the code, which we deploy later.
  • AWS CodeDeploy – A fully managed deployment service that automates software deployments to a variety of compute services such as Amazon EC2, AWS Fargate, AWS Lambda, and your on-premises servers. This solution uses CodeDeploy to deploy the code or application onto a set of EC2 instances running CodeDeploy agents.
  • AWS CodePipeline – A fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. This solution uses CodePipeline to create an end-to-end pipeline that fetches the application code from CodeCommit, builds and tests using CodeBuild, and finally deploys using CodeDeploy.
  • Amazon CloudWatch Events – A CloudWatch Events rule is created to trigger CodePipeline on a Git commit to the CodeCommit repository.
  • Amazon Simple Storage Service (Amazon S3) – An object storage service that offers industry-leading scalability, data availability, security, and performance. This solution uses an S3 bucket to store the build and deployment artifacts created during the pipeline run.
  • AWS Key Management Service (AWS KMS) – AWS KMS makes it easy for you to create and manage cryptographic keys and control their use across a wide range of AWS services and in your applications. This solution uses AWS KMS to make sure that the build and deployment artifacts stored on the S3 bucket are encrypted at rest.

Overview of solution

This solution uses two separate AWS accounts: a dev account (111111111111) and a prod account (222222222222) in Region us-east-1.

We use the dev account to deploy and set up the CI/CD pipeline, along with the source code repo. It also builds and tests the code locally and performs a test deploy.

The prod account is any other account where the application is required to be deployed from the pipeline in the dev account.

In summary, the solution has the following workflow:

  • A change or commit to the code in the CodeCommit application repository triggers CodePipeline with the help of a CloudWatch event.
  • The pipeline downloads the code from the CodeCommit repository, initiates the Build and Test action using CodeBuild, and securely saves the built artifact on the S3 bucket.
  • If the preceding step is successful, the pipeline triggers the Deploy in Dev action using CodeDeploy and deploys the app in dev account.
  • If successful, the pipeline triggers the Deploy in Prod action using CodeDeploy and deploys the app in the prod account.

The following diagram illustrates the workflow:

Overall CI/CD flow diagram

Failsafe deployments

This example uses the IN_PLACE deployment type in CodeDeploy. However, to minimize downtime, CodeDeploy supports multiple deployment strategies. This example makes use of the following features: rolling deployments and automatic rollback.

CodeDeploy provides the following three predefined deployment configurations, to minimize the impact during application upgrades:

  • CodeDeployDefault.OneAtATime – Deploys the application revision to only one instance at a time
  • CodeDeployDefault.HalfAtATime – Deploys to up to half of the instances at a time (with fractions rounded down)
  • CodeDeployDefault.AllAtOnce – Attempts to deploy an application revision to as many instances as possible at once

For OneAtATime and HalfAtATime, CodeDeploy monitors and evaluates instance health during the deployment and only proceeds to the next instance or next half if the previous deployment is healthy. For more information, see Working with deployment configurations in CodeDeploy.

You can also configure a deployment group or deployment to automatically roll back when a deployment fails or when a monitoring threshold you specify is met. In this case, the last known good version of an application revision is automatically redeployed after a failure with the new application version.
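
Automatic rollback can be enabled on the deployment group itself. The following is a minimal sketch that turns on rollback when a deployment fails; the application and deployment group names are the example values used later in this post:

# Enable automatic rollback on deployment failure for an existing deployment group
aws deploy update-deployment-group \
  --application-name MyCDWebApp \
  --current-deployment-group-name MyCICD-Deployment-Group-Dev \
  --auto-rollback-configuration '{"enabled": true, "events": ["DEPLOYMENT_FAILURE"]}'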

How CodePipeline in the dev account deploys apps in the prod account

In this post, the deployment pipeline using CodePipeline is set up in the dev account, but it has permissions to deploy the application in the prod account. We create a special cross-account role in the prod account, which has the following:

  • Permission to fetch artifacts (the app) from Amazon S3 and deploy them locally in the account using CodeDeploy
  • Trust with the dev account where the pipeline runs

CodePipeline in the dev account assumes this cross-account role in the prod account to deploy the app.
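
The trust policy on that cross-account role is what allows the assumption. The following is a sketch (not the exact setup script) of creating such a role in the prod account; the role name matches Table Roles-1 later in this post, the trusted principal is the dev account root, and the attached policy ARN is an illustrative name derived from the cicd_codepipeline_cross_ac_policy.json template:

# In the prod account: create the cross-account role that trusts the dev account (111111111111)
aws iam create-role \
  --role-name cicd_codepipeline_cross_ac_role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
        "Action": "sts:AssumeRole"
      }
    ]
  }'

# Attach the permissions policy created from cicd_codepipeline_cross_ac_policy.json (illustrative ARN)
aws iam attach-role-policy \
  --role-name cicd_codepipeline_cross_ac_role \
  --policy-arn arn:aws:iam::222222222222:policy/cicd_codepipeline_cross_ac_policy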

Do I need multiple accounts?

If you answer “yes” to any of the following questions, you should consider creating more AWS accounts:

  • Does your business require administrative isolation between workloads? Administrative isolation by account is the most straightforward way to grant independent administrative groups different levels of administrative control over AWS resources based on workload, development lifecycle, business unit (BU), or data sensitivity.
  • Does your business require limited visibility and discoverability of workloads? Accounts provide a natural boundary for visibility and discoverability. Workloads cannot be accessed or viewed unless an administrator of the account enables access to users managed in another account.
  • Does your business require isolation to minimize blast radius? Separate accounts help define boundaries and provide natural blast-radius isolation to limit the impact of a critical event such as a security breach, an unavailable AWS Region or Availability Zone, account suspensions, and so on.
  • Does your business require a particular workload to operate within AWS service limits without impacting the limits of another workload? You can use AWS account service limits to impose restrictions on a business unit, development team, or project. For example, if you create an AWS account for a project group, you can limit the number of Amazon Elastic Compute Cloud (Amazon EC2) or high performance computing (HPC) instances that can be launched by the account.
  • Does your business require strong isolation of recovery or auditing data? If regulatory requirements require you to control access and visibility to auditing data, you can isolate the data in an account separate from the one where you run your workloads (for example, by writing AWS CloudTrail logs to a different account).

Prerequisites

For this walkthrough, you should complete the following prerequisites:

  1. Have access to at least two AWS accounts. For this post, the dev and prod accounts are in us-east-1. You can search and replace the Region and account IDs in all the steps and sample AWS Identity and Access Management (IAM) policies in this post.
  2. Ensure you have EC2 Linux instances with the CodeDeploy agent installed in all the accounts or VPCs where the sample Java application is to be installed (dev and prod accounts).
    • To manually create EC2 instances with the CodeDeploy agent, refer to Create an Amazon EC2 instance for CodeDeploy (AWS CLI or Amazon EC2 console). Keep in mind the following:
      • CodeDeploy uses EC2 instance tags to identify instances to use to deploy the application, so it’s important to set tags appropriately. For this post, we use the tag name Application with the value MyWebApp to identify instances where the sample app is installed.
      • Make sure to use an EC2 instance profile (AWS Service Role for EC2 instance) with permissions to read the S3 bucket containing artifacts built by CodeBuild. Refer to the IAM role cicd_ec2_instance_profile in the table Roles-1 below for the set of permissions required. You must update this role later with the actual KMS key and S3 bucket name created as part of the deployment process.
    • To create EC2 Linux instances via AWS CloudFormation, download and launch the AWS CloudFormation template from the GitHub repo: cicd-ec2-instance-with-codedeploy.json
      • This deploys an EC2 instance with AWS CodeDeploy agent.
      • Inputs required:
        • AMI: Enter the name of the Linux AMI in your Region. (This template has been tested with the latest Amazon Linux 2 AMI.)
        • Ec2SshKeyPairName: Name of an existing SSH KeyPair
        • Ec2IamInstanceProfile: Name of an existing EC2 instance profile. Note: Use the permissions in the template cicd_ec2_instance_profile_policy.json to create the policy for this EC2 Instance Profile role. You must update this role later with the actual KMS key and S3 bucket name created as part of the deployment process.
        • Update the EC2 instance Tags per your need.
  3. Ensure required IAM permissions. Have an IAM user with an IAM Group or Role that has the following access levels or permissions:

    AWS Service / Components | Access Level | Accounts | Comments
    AWS CodeCommit | Full (admin) | Dev | Use AWS managed policy AWSCodeCommitFullAccess.
    AWS CodePipeline | Full (admin) | Dev | Use AWS managed policy AWSCodePipelineFullAccess.
    AWS CodeBuild | Full (admin) | Dev | Use AWS managed policy AWSCodeBuildAdminAccess.
    AWS CodeDeploy | Full (admin) | All | Use AWS managed policy AWSCodeDeployFullAccess.
    Create S3 bucket and bucket policies | Full (admin) | Dev | IAM policies can be restricted to a specific bucket.
    Create KMS key and policies | Full (admin) | Dev | IAM policies can be restricted to a specific KMS key.
    AWS CloudFormation | Full (admin) | Dev | Use AWS managed policy AWSCloudFormationFullAccess.
    Create and pass IAM roles | Full (admin) | All | Ability to create IAM roles and policies can be restricted to specific IAM roles or actions. Also, an admin team with IAM privileges could create all the required roles. Refer to the IAM table Roles-1 below.
    AWS Management Console and AWS CLI | As per IAM user permissions | All | To access the suite of Code services.

  4. Create Git credentials for CodeCommit in the pipeline account (dev account). AWS allows you to either use Git credentials or associate SSH public keys with your IAM user. For this post, use Git credentials associated with your IAM user (created in the previous step). For instructions on creating a Git user, see Create Git credentials for HTTPS connections to CodeCommit. Download and save the Git credentials to use later for deploying the application.
  5. Create all AWS IAM roles as per the following tables (Roles-1). Make sure to update the following references in all the given IAM roles and policies:
    • Replace the sample dev account (111111111111) and prod account (222222222222) with actual account IDs
    • Replace the S3 bucket mywebapp-codepipeline-bucket-us-east-1-111111111111 with your preferred bucket name.
    • Replace the KMS key ID key/82215457-e360-47fc-87dc-a04681c91ce1 with your KMS key ID.

Table: Roles-1

AWS CodePipeline – Service role – Dev (111111111111) – cicd_codepipeline_service_role
  • How to create: Select Another AWS Account and use this account as the account ID to create the role. Later, update the trust relationship to: “Principal”: {“Service”: “codepipeline.amazonaws.com”}
  • Role policy: Use the permissions in the template cicd_codepipeline_service_policy.json to create the policy for this role.
  • Permissions: This CodePipeline service role has appropriate permissions to the following services in the local account:
    • Manage CodeCommit repos
    • Initiate builds via CodeBuild
    • Create deployments via CodeDeploy
    • Assume the cross-account CodeDeploy role in the prod account to deploy the application

AWS CodePipeline – IAM role – Dev (111111111111) – cicd_codepipeline_trigger_cwe_role
  • How to create: Select Another AWS Account and use this account as the account ID to create the role. Later, update the trust relationship to: “Principal”: {“Service”: “events.amazonaws.com”}
  • Role policy: Use the permissions in the template cicd_codepipeline_trigger_cwe_policy.json to create the policy for this role.
  • Permissions: CodePipeline uses this role to set a CloudWatch event to trigger the pipeline when there is a change or commit made to the code repository.

AWS CodePipeline – IAM role – Prod (222222222222) – cicd_codepipeline_cross_ac_role
  • How to create: Choose Another AWS Account and use the dev account as the trusted account ID to create the role.
  • Role policy: Use the permissions in the template cicd_codepipeline_cross_ac_policy.json to create the policy for this role.
  • Permissions: This role is created in the prod account and has permissions to use CodeDeploy and fetch from Amazon S3. The role is assumed by CodePipeline from the dev account to deploy the app in the prod account. Make sure to set up trust with the dev account for this IAM role on the Trust relationships tab.

AWS CodeBuild – Service role – Dev (111111111111) – cicd_codebuild_service_role
  • How to create: Choose CodeBuild as the use case to create the role.
  • Role policy: Use the permissions in the template cicd_codebuild_service_policy.json to create the policy for this role.
  • Permissions: This CodeBuild service role has appropriate permissions to:
    • The S3 bucket to store artifacts
    • Stream logs to CloudWatch Logs
    • Pull code from CodeCommit
    • Get the SSM parameter for CodeBuild
    • Miscellaneous Amazon EC2 permissions

AWS CodeDeploy – Service role – Dev (111111111111) and Prod (222222222222) – cicd_codedeploy_service_role
  • How to create: Choose CodeDeploy as the use case to create the role.
  • Role policy: Use the built-in AWS managed policy AWSCodeDeployRole for this role.
  • Permissions: This CodeDeploy service role has appropriate permissions to:
    • Miscellaneous Amazon EC2 Auto Scaling
    • Miscellaneous Amazon EC2
    • Publish Amazon SNS topics
    • AWS CloudWatch metrics
    • Elastic Load Balancing

EC2 Instance – Service role for EC2 instance profile – Dev (111111111111) and Prod (222222222222) – cicd_ec2_instance_profile
  • How to create: Choose EC2 as the use case to create the role.
  • Role policy: Use the permissions in the template cicd_ec2_instance_profile_policy.json to create the policy for this role.
  • Permissions: This is set as the EC2 instance profile for the EC2 instances where the app is deployed. It has appropriate permissions to fetch artifacts from Amazon S3 and decrypt contents using the KMS key. You must update this role later with the actual KMS key and S3 bucket name created as part of the deployment process.

Setting up the prod account

To set up the prod account, complete the following steps:

  1. Download and launch the AWS CloudFormation template from the GitHub repo: cicd-codedeploy-prod.json
    • This deploys the CodeDeploy app and deployment group.
    • Make sure that you already have a set of EC2 Linux instances with the CodeDeploy agent installed in all the accounts where the sample Java application is to be installed (dev and prod accounts). If not, refer back to the Prerequisites section.
  2. Update the existing EC2 IAM instance profile (cicd_ec2_instance_profile):
    • Replace the S3 bucket name mywebapp-codepipeline-bucket-us-east-1-111111111111 with your S3 bucket name (the one used for the CodePipelineArtifactS3Bucket variable when you launched the CloudFormation template in the dev account).
    • Replace the KMS key ARN arn:aws:kms:us-east-1:111111111111:key/82215457-e360-47fc-87dc-a04681c91ce1 with your KMS key ARN (the one created as part of the CloudFormation template launch in the dev account).

Setting up the dev account

To set up your dev account, complete the following steps:

  1. Download and launch the CloudFormation template from the GitHub repo: cicd-aws-code-suite-dev.json
    The stack deploys the following services in the dev account:

    • CodeCommit repository
    • CodePipeline
    • CodeBuild environment
    • CodeDeploy app and deployment group
    • CloudWatch event rule
    • KMS key (used to encrypt the S3 bucket)
    • S3 bucket and bucket policy
  2. Use following values as inputs to the CloudFormation template. You should have created all the existing resources and roles beforehand as part of the prerequisites.

    Key | Example Value | Comments
    CodeCommitWebAppRepo | MyWebAppRepo | Name of the new CodeCommit repository for your web app.
    CodeCommitMainBranchName | master | Main branch name on your CodeCommit repository. Default is master (which is pushed to the prod environment).
    CodeBuildProjectName | MyCBWebAppProject | Name of the new CodeBuild environment.
    CodeBuildServiceRole | arn:aws:iam::111111111111:role/cicd_codebuild_service_role | ARN of an existing IAM service role to be associated with CodeBuild to build the web app code.
    CodeDeployApp | MyCDWebApp | Name of the new CodeDeploy app to be created for your web app. We assume that the CodeDeploy app name is the same in all accounts where deployment needs to occur (in this case, the prod account).
    CodeDeployGroupDev | MyCICD-Deployment-Group-Dev | Name of the new CodeDeploy deployment group to be created in the dev account.
    CodeDeployGroupProd | MyCICD-Deployment-Group-Prod | Name of the existing CodeDeploy deployment group in the prod account. Created as part of the prod account setup.
    CodeDeployGroupTagKey | Application | Name of the tag key that CodeDeploy uses to identify the existing EC2 fleet for the deployment group to use.
    CodeDeployGroupTagValue | MyWebApp | Value of the tag that CodeDeploy uses to identify the existing EC2 fleet for the deployment group to use.
    CodeDeployConfigName | CodeDeployDefault.OneAtATime | Desired CodeDeploy configuration name. Valid options are CodeDeployDefault.OneAtATime, CodeDeployDefault.HalfAtATime, and CodeDeployDefault.AllAtOnce. For more information, see Deployment configurations on an EC2/on-premises compute platform.
    CodeDeployServiceRole | arn:aws:iam::111111111111:role/cicd_codedeploy_service_role | ARN of an existing IAM service role to be associated with CodeDeploy to deploy the web app.
    CodePipelineName | MyWebAppPipeline | Name of the new CodePipeline to be created for your web app.
    CodePipelineArtifactS3Bucket | mywebapp-codepipeline-bucket-us-east-1-111111111111 | Name of the new S3 bucket to be created to store artifacts for this web app’s pipeline.
    CodePipelineServiceRole | arn:aws:iam::111111111111:role/cicd_codepipeline_service_role | ARN of an existing IAM service role to be associated with CodePipeline to deploy the web app.
    CodePipelineCWEventTriggerRole | arn:aws:iam::111111111111:role/cicd_codepipeline_trigger_cwe_role | ARN of an existing IAM role used to trigger the pipeline you named earlier upon a code push to the CodeCommit repository.
    CodeDeployRoleXAProd | arn:aws:iam::222222222222:role/cicd_codepipeline_cross_ac_role | ARN of an existing IAM role in the cross-account (prod) for CodePipeline to assume to deploy the app.

    It should take 5–10 minutes for the CloudFormation stack to complete. When the stack is complete, you can see that CodePipeline has built the pipeline (MyWebAppPipeline) with the CodeCommit repository and CodeBuild environment, along with actions for CodeDeploy in local (dev) and cross-account (prod). CodePipeline should be in a failed state because your CodeCommit repository is empty initially.

  3. Update the existing Amazon EC2 IAM instance profile (cicd_ec2_instance_profile):
    • Replace the S3 bucket name mywebapp-codepipeline-bucket-us-east-1-111111111111 with your S3 bucket name (the one used for the CodePipelineArtifactS3Bucket parameter when launching the CloudFormation template in the dev account).
    • Replace the KMS key ARN arn:aws:kms:us-east-1:111111111111:key/82215457-e360-47fc-87dc-a04681c91ce1 with your KMS key ARN (the one created as part of the CloudFormation template launch in the dev account).

Deploying the application

You’re now ready to deploy the application via your desktop or PC.

  1. Assuming you have the required HTTPS Git credentials for CodeCommit as part of the prerequisites, clone the CodeCommit repo that was created earlier as part of the dev account setup. Obtain the name of the CodeCommit repo to clone from the CodeCommit console. Enter the Git user name and password when prompted. For example:
    $ git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/MyWebAppRepo my-web-app-repo
    Cloning into 'my-web-app-repo'...
    Username for 'https://git-codecommit.us-east-1.amazonaws.com/v1/repos/MyWebAppRepo': xxxx
    Password for 'https://xxxx@git-codecommit.us-east-1.amazonaws.com/v1/repos/MyWebAppRepo': xxxx

  2. Download the MyWebAppRepo.zip file containing a sample Java application, CodeBuild configuration to build the app, and CodeDeploy config file to deploy the app.
  3. Copy and unzip the file into the my-web-app-repo Git repository folder created earlier.
  4. Assuming this is the sample app to be deployed, commit these changes to the Git repo. For example:
    $ cd my-web-app-repo 
    $ git add -A 
    $ git commit -m "initial commit" 
    $ git push

For more information, see Tutorial: Create a simple pipeline (CodeCommit repository).

After you commit the code, the CodePipeline will be triggered and all the stages and your application should be built, tested, and deployed all the way to the production environment!
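
You can also follow the pipeline run from the CLI. A minimal sketch, assuming the pipeline name from the CloudFormation parameters above:

# List recent executions and their status for the pipeline
aws codepipeline list-pipeline-executions --pipeline-name MyWebAppPipeline

# Or inspect the state of every stage and action
aws codepipeline get-pipeline-state --name MyWebAppPipeline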

The following screenshot shows the entire pipeline and its latest run:

 

Troubleshooting

To troubleshoot any service-related issues, see the troubleshooting sections in the AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline user guides.

Cleaning up

To avoid incurring future charges or to remove any unwanted resources, delete the following:

  • The EC2 instances used to deploy the application
  • The CloudFormation stacks launched for this post (deleting them removes the AWS resources they created)
  • The IAM users or roles created for this walkthrough

Conclusion

Using this solution, you can easily set up and manage an entire CI/CD pipeline in AWS accounts using the native AWS suite of CI/CD services, where a commit or change to code passes through various automated stage gates all the way from building and testing to deploying applications, from development to production environments.

FAQs

In this section, we answer some frequently asked questions:

  1. Can I expand this deployment to more than two accounts?
    • Yes. You can deploy a pipeline in a tooling account and use dev, non-prod, and prod accounts to deploy code on EC2 instances via CodeDeploy. Changes are required to the templates and policies accordingly.
  2. Can I ensure the application isn’t automatically deployed in the prod account via CodePipeline and needs manual approval?
    • Yes. You can add a manual approval action to the pipeline before the cross-account deploy stage, so a reviewer must approve the release before it reaches the prod account.
  3. Can I use a CodeDeploy group with an Auto Scaling group?
    • Yes. Minor changes required to the CodeDeploy group creation process. Refer to the following Solution Variations section for more information.
  4. Can I use this pattern for EC2 Windows instances?
    • Yes. CodeDeploy also supports EC2 Windows instances; install the CodeDeploy agent for Windows and adjust the deployment scripts accordingly.

Solution variations

In this section, we provide a few variations to our solution:

Author bio


Nitin Verma

Nitin is currently a Sr. Cloud Architect in AWS Managed Services (AMS). He has many years of experience with DevOps-related tools and technologies. Speak to your AWS Managed Services representative to deploy this solution in AMS!