Tag Archives: Developer Tools

CDK Corner – January 2021

Post Syndicated from Christian Weber original https://aws.amazon.com/blogs/devops/cdk-corner-february-2021/

Social: Events in the Community

CDK Day is coming up on April 30th! This is your chance to meet and engage with the CDK Community! Last year’s event included an incredible amount of content, whether it was learning the origin story of CDK, learning how CDK is used in a Large Enterprise, there were many great sessions, as well as Eric Johnson cosplaying as the official CDK Mascot.

Do you have a story to share about using CDK, about something funny/crazy/interesting/cool/another adjective? The CFPs are now open — the community wants to hear your stories; so go ahead and submit here!

Updates: Changes made across CDK

In January, the CDK Community and the AWS CDK team were together hard at work, bringing in new changes, features, or, as NetaNir likes to call them, many new “goodies” to the CDK!

AWS Construct Library and Core

The CDK Team announced General Availability of the EKS Module in CDK with PR#12640. Moving a CDK Module from Experimental to Stable requires substantial effort from both the CDK Community and Team — the appreciation for everyone that contributed to this effort cannot be understated. Take a look at the project milestone to explore some of the work that contributed to releasing the EKS constrcut to GA. Great job everyone!

External assets are now supported from PR#12259. With this change, you can now setup cdk-assets.json with Files, Archives, or even Docker Images built by external utilities. This is great if your CDK Application relies on assets from other sources, such as an internal pipeline, or if you want to pull the latest Docker Image built from some external utility.

CDK will now alert you if your stack hits the maximum number of CloudFormation Resources. If you’re deploying complex CDK Stacks, you’ll know that sometimes you will hit this cap which seems to only happen when you’ve walked away from your computer to make a coffee while your stack is deploying, only to come back with a latte and a command line full of exceptions. This wonderful quality-of-life change was merged in PR#12193.

AWS CodeBuild

AWS CodeBuild in CDK can now be configured with Standard 5.0 Runtime Environments, which now supports many new runtime environments, including support for Python 3.9 which means, for example, CodeBuild now natively understands the union operator in Python dictionaries you’ve been using to combine dictionaries in your project.

AWS EC2

There is now support for m6gd and r6gd Graviton EC2 Instances from CDK with PR#12302. Graviton Instances are a great way to utilize ARM Archicture at a lower cost.

Support for new io2 and and gp3 EBS Volumes were announced at re:Invent, followed up with a community contribution from leandrodamascena in PR#12074

AWS ElasticSearch

A big cost savings feature to support ElasticSearch UltraWarm nodes in CDK, now gives CDK users the opportunity to store data in S3 instead of an SSD with ElasticSearch, which can substantially reduce storage costs.

AWS S3

Securing S3 Buckets is a standard practice, and CDK has tightened its security on S3 Buckets by limiting the PutObject permission of Bucket.grantWrite() to just s3:PutObject instead of s3:PutObject*. This subtle change means that only the first permission is added to the IAM Principal, instead of any other IAM permission prefixed with PutObject (Such as s3:PutObjectAcl). You still have the flexibility to make this permission add-on if needed, though.

AWS StepFunctions

A member of the CDK Community, ayush987goyal, submitted PR#12436 for StepFunctions-Tasks. This feature now lets users specify the family and revision of a taskDefinitionFamily inside EcsRunTask, thanks to their effort. This modifies previous behavior of the construct where a user could only deploy the latest revision of a Task by supplying the ARN of the Task.

CloudFormation and new L1 Resources

As CDK synthesizes CloudFormation Templates, it’s important that CDK stays up to date with the CloudFormation Resource Specification these updates to our collection of L1 Constructs. Now that they’re here, the community and team can begin implementing beautiful L2 Constructs for these L1s. Interested in contributing an L2 from these L1s? Take a look at our CONTRIBUTING doc to get up and running.

In January the team introduced several updates of the CloudFormation Resource Spec to CDK, bringing support for a whole slew of new Resources, Attribute Updates and Property Changes. These updates, among others, include new resource types for CloudFormation Modules, SageMaker Pipelines, AWS Config Saved Queries, AWS DataSync, AWS Service Catalog App Registry, AWS QuickSight, Virtual Clusters for EMR Containers for Amazon Elastic MapReduce, support for DNSSEC in Route53, and support for ECR Public Repositories.

My favorite of all these is ECR Public Repositories. Public Repositories support was just recently announced, in December at AWS re:Invent. Now you can deploy and manage a public repository with CDK as an L1 Construct. So, if you have an exciting Container Image that you’ve been wanting to share with the world with your own Public Repository, set it all up with CDK!

To be in the know on updates to the CDK, and updates to CDK’s CloudFormation Resource Spec, update your repository notification settings to watch for new CDK Releases , and browse the cfnspec CHANGELOG.

Learning: Level up your CDK Knowledge

AWS has released a new training module for the CDK. This free 7 module course teaches users the fundamental concepts of the CDK, from explaining its core benefits, to defining the common language and terms, to tips for troubleshooting CDK Projects. This is a great course for developers, or related stakeholders who may be considering whether or not to adopt CDK in their team or organization.

Community Acknowledgements: Thanks for your hard work

We love highlighting Pull Requests from our community of CDK users. This month’s spotlight goes to Jacob-Doetsch, who submitted a fix when deploying Bastion Hosts backed by ARM Architecture. As ARM based architecture increases in usage across AWS, identifying and resolving these types of bugs helps CDK maintain the ability to help Developers continue moving quickly. Great job Jacob!

And finally, to round out the CDK Corner, a round of applause to the following users who merged their first Pull Request to CDK in January! The CDK Community appreciates your hard work and effort!

Improving the CPU and latency performance of Amazon applications using AWS CodeGuru Profiler

Post Syndicated from Neha Gupta original https://aws.amazon.com/blogs/devops/improving-the-cpu-and-latency-performance-of-amazon-applications-using-aws-codeguru-profiler/

Amazon CodeGuru Profiler is a developer tool powered by machine learning (ML) that helps identify an application’s most expensive lines of code and provides intelligent recommendations to optimize it. You can identify application performance issues and troubleshoot latency and CPU utilization issues in your application.

You can use CodeGuru Profiler to optimize performance for any application running on AWS Lambda, Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), AWS Fargate, or AWS Elastic Beanstalk, and on premises.

This post gives a high-level overview of how CodeGuru Profiler has reduced CPU usage and latency by approximately 50% and saved around $100,000 a year for a particular Amazon retail service.

Technical and business value of CodeGuru Profiler

CodeGuru Profiler is easy and simple to use, just turn it on and start using it. You can keep it running in the background and you can just look into the CodeGuru Profiler findings and implement the relevant changes.

It’s fairly low cost and unlike traditional tools that take up lot of CPU and RAM, running CodeGuru Profiler has less than 1% impact on total CPU usage overhead to applications and typically uses no more than 100 MB of memory.

You can run it in a pre-production environment to test changes to ensure no impact occurs on your application’s key metrics.

It automatically detects performance anomalies in the application stack traces that start consuming more CPU or show increased latency. It also provides visualizations and recommendations on how to fix performance issues and the estimated cost of running inefficient code. Detecting the anomalies early prevents escalating the issue in production. This helps you prioritize remediation by giving you enough time to fix the issue before it impacts your service’s availability and your customers’ experience.

How we used CodeGuru Profiler at Amazon

Amazon has on-boarded many of its applications to CodeGuru Profiler, which has resulted in an annual savings of millions of dollars and latency improvements. In this post, we discuss how we used CodeGuru Profiler on an Amazon Prime service. A simple code change resulted in saving around $100,000 for the year.

Opportunity to improve

After a change to one of our data sources that caused its payload size to increase, we expected a slight increase to our service latency, but what we saw was higher than expected. Because CodeGuru Profiler is easy to integrate, we were able to quickly make and deploy the changes needed to get it running on our production environment.

After loading up the profile in Amazon CodeGuru Profiler, it was immediately apparent from the visualization that a very large portion of the service’s CPU time was being taken up by Jackson deserialization (37%, across the two call sites). It was also interesting that most of the blocking calls in the program (in blue) was happening in the jackson.databind method _createAndCacheValueDeserializer.

Flame graphs represent the relative amount of time that the CPU spends at each point in the call graph. The wider it is, the more CPU usage it corresponds to.

The following flame graph is from before the performance improvements were implemented.

The Flame Graph before the deployment

Looking at the source for _createAndCacheValueDeserializer confirmed that there was a synchronized block. From within it, _createAndCache2 was called, which actually did the adding to the cache. Adding to the cache was guarded by a boolean condition which had a comment that indicated that caching would only be enabled for custom serializers if @JsonCachable was set.

Solution

Checking the documentation for @JsonCachable confirmed that this annotation looked like the correct solution for this performance issue. After we deployed a quick change to add @JsonCachable to our four custom deserializers, we observed that no visible time was spent in _createAndCacheValueDeserializer.

Results

Adding a one-line annotation in four different places made the code run twice as fast. Because it was holding a lock while it recreated the same deserializers for every call, this was allowing only one of the four CPU cores to be used and therefore causing latency and inefficiency. Reusing the deserializers avoided repeated work and saved us lot of resources.

After the CodeGuru Profiler recommendations were implemented, the amount of CPU spent in Jackson reduced from 37% to 5% across the two call paths, and there was no visible blocking. With the removal of the blocking, we could run higher load on our hosts and reduce the fleet size, saving approximately $100,000 a year in Amazon EC2 costs, thereby resulting in overall savings.

The following flame graph shows performance after the deployment.

The Flame Graph after the deployment

Metrics

The following graph shows that CPU usage reduced by almost 50%. The blue line shows the CPU usage the week before we implemented CodeGuru Profiler recommendations, and green shows the dropped usage after deploying. We could later safely scale down the fleet to reduce costs, while still having better performance than prior to the change.

Average Fleet CPU Utilization

 

The following graph shows the server latency, which also dropped by almost 50%. The latency dropped from 100 milliseconds to 50 milliseconds as depicted in the initial portion of the graph. The orange line depicts p99, green p99.9, and blue p50 (mean latency).

Server Latency

 

Conclusion

With a few lines of changed code and a half-hour investigation, we removed the bottleneck which led to lower utilization of resources and  thus we were able to decrease the fleet size. We have seen many similar cases, and in one instance, a change of literally six characters of inefficient code, reduced CPU usage from 99% to 5%.

Across Amazon, CodeGuru Profiler has been used internally among various teams and resulted in millions of dollars of savings and performance optimization. You can use CodeGuru Profiler for quick insights into performance issues of your application. The more efficient the code and application is, the less costly it is to run. You can find potential savings for any application running in production and significantly reduce infrastructure costs using CodeGuru Profiler. Reducing fleet size, latency, and CPU usage is a major win.

 

 

About the Authors

Neha Gupta

Neha Gupta is a Solutions Architect at AWS and have 16 years of experience as a Database architect/ DBA. Apart from work, she’s outdoorsy and loves to dance.

Ian Clark

Ian is a Senior Software engineer with the Last Mile organization at Amazon. In his spare time, he enjoys exploring the Vancouver area with his family.

Using AWS DevOps Tools to model and provision AWS Glue workflows

Post Syndicated from Nuatu Tseggai original https://aws.amazon.com/blogs/devops/provision-codepipeline-glue-workflows/

This post provides a step-by-step guide on how to model and provision AWS Glue workflows utilizing a DevOps principle known as infrastructure as code (IaC) that emphasizes the use of templates, source control, and automation. The cloud resources in this solution are defined within AWS CloudFormation templates and provisioned with automation features provided by AWS CodePipeline and AWS CodeBuild. These AWS DevOps tools are flexible, interchangeable, and well suited for automating the deployment of AWS Glue workflows into different environments such as dev, test, and production, which typically reside in separate AWS accounts and Regions.

AWS Glue workflows allow you to manage dependencies between multiple components that interoperate within an end-to-end ETL data pipeline by grouping together a set of related jobs, crawlers, and triggers into one logical run unit. Many customers using AWS Glue workflows start by defining the pipeline using the AWS Management Console and then move on to monitoring and troubleshooting using either the console, AWS APIs, or the AWS Command Line Interface (AWS CLI).

Solution overview

The solution uses COVID-19 datasets. For more information on these datasets, see the public data lake for analysis of COVID-19 data, which contains a centralized repository of freely available and up-to-date curated datasets made available by the AWS Data Lake team.

Because the primary focus of this solution showcases how to model and provision AWS Glue workflows using AWS CloudFormation and CodePipeline, we don’t spend much time describing intricate transform capabilities that can be performed in AWS Glue jobs. As shown in the Python scripts, the business logic is optimized for readability and extensibility so you can easily home in on the functions that aggregate data based on monthly and quarterly time periods.

The ETL pipeline reads the source COVID-19 datasets directly and writes only the aggregated data to your S3 bucket.

The solution exposes the datasets in the following tables:

Table NameDescriptionDataset locationProvider
countrycodeLookup table for country codess3://covid19-lake/static-datasets/csv/countrycode/Rearc
countypopulationLookup table for the population of each countys3://covid19-lake/static-datasets/csv/CountyPopulation/Rearc
state_abvLookup table for US state abbreviationss3://covid19-lake/static-datasets/json/state-abv/Rearc
rearc_covid_19_nyt_data_in_usa_us_countiesData on COVID-19 cases at US county levels3://covid19-lake/rearc-covid-19-nyt-data-in-usa/csv/us-counties/Rearc
rearc_covid_19_nyt_data_in_usa_us_statesData on COVID-19 cases at US state levels3://covid19-lake/rearc-covid-19-nyt-data-in-usa/csv/us-states/Rearc
rearc_covid_19_testing_data_states_dailyData on COVID-19 cases at US state levels3://covid19-lake/rearc-covid-19-testing-data/csv/states_daily/Rearc
rearc_covid_19_testing_data_us_dailyUS total test daily trends3://covid19-lake/rearc-covid-19-testing-data/csv/us_daily/Rearc
rearc_covid_19_testing_data_us_total_latestUS total testss3://covid19-lake/rearc-covid-19-testing-data/csv/us-total-latest/Rearc
rearc_covid_19_world_cases_deaths_testingWorld total testss3://covid19-lake/rearc-covid-19-world-cases-deaths-testing/Rearc
rearc_usa_hospital_bedsHospital beds and their utilization in the USs3://covid19-lake/rearc-usa-hospital-beds/Rearc
world_cases_deaths_aggregatesMonthly and quarterly aggregate of the worlds3://<your-S3-bucket-name>/covid19/world-cases-deaths-aggregates/Aggregate

Prerequisites

This post assumes you have the following:

  • Access to an AWS account
  • The AWS CLI (optional)
  • Permissions to create a CloudFormation stack
  • Permissions to create AWS resources, such as AWS Identity and Access Management (IAM) roles, Amazon Simple Storage Service (Amazon S3) buckets, and various other resources
  • General familiarity with AWS Glue resources (triggers, crawlers, and jobs)

Architecture

The CloudFormation template glue-workflow-stack.yml defines all the AWS Glue resources shown in the following diagram.

architecture diagram showing ETL process

Figure: AWS Glue workflow architecture diagram

Modeling the AWS Glue workflow using AWS CloudFormation

Let’s start by exploring the template used to model the AWS Glue workflow: glue-workflow-stack.yml

We focus on two resources in the following snippet:

  • AWS::Glue::Workflow
  • AWS::Glue::Trigger

From a logical perspective, a workflow contains one or more triggers that are responsible for invoking crawlers and jobs. Building a workflow starts with defining the crawlers and jobs as resources within the template and then associating it with triggers.

Defining the workflow

This is where the definition of the workflow starts. In the following snippet, we specify the type as AWS::Glue::Workflow and the property Name as a reference to the parameter GlueWorkflowName.

Parameters:
  GlueWorkflowName:
    Type: String
    Description: Glue workflow that tracks all triggers, jobs, crawlers as a single entity
    Default: Covid_19

Resources:
  Covid19Workflow:
    Type: AWS::Glue::Workflow
    Properties: 
      Description: Glue workflow that tracks specified triggers, jobs, and crawlers as a single entity
      Name: !Ref GlueWorkflowName

Defining the triggers

This is where we define each trigger and associate it with the workflow. In the following snippet, we specify the property WorkflowName on each trigger as a reference to the logical ID Covid19Workflow.

These triggers allow us to create a chain of dependent jobs and crawlers as specified by the properties Actions and Predicate.

The trigger t_Start utilizes a type of SCHEDULED, which means that it starts at a defined time (in our case, one time a day at 8:00 AM UTC). Every time it runs, it starts the job with the logical ID Covid19WorkflowStarted.

The trigger t_GroupA utilizes a type of CONDITIONAL, which means that it starts when the resources specified within the property Predicate have reached a specific state (when the list of Conditions specified equals SUCCEEDED). Every time t_GroupA runs, it starts the crawlers with the logical ID’s CountyPopulation and Countrycode, per the Actions property containing a list of actions.

  TriggerJobCovid19WorkflowStart:
    Type: AWS::Glue::Trigger
    Properties:
      Name: t_Start
      Type: SCHEDULED
      Schedule: cron(0 8 * * ? *) # Runs once a day at 8 AM UTC
      StartOnCreation: true
      WorkflowName: !Ref GlueWorkflowName
      Actions:
        - JobName: !Ref Covid19WorkflowStarted

  TriggerCrawlersGroupA:
    Type: AWS::Glue::Trigger
    Properties:
      Name: t_GroupA
      Type: CONDITIONAL
      StartOnCreation: true
      WorkflowName: !Ref GlueWorkflowName
      Actions:
        - CrawlerName: !Ref CountyPopulation
        - CrawlerName: !Ref Countrycode
      Predicate:
        Conditions:
          - JobName: !Ref Covid19WorkflowStarted
            LogicalOperator: EQUALS
            State: SUCCEEDED

Provisioning the AWS Glue workflow using CodePipeline

Now let’s explore the template used to provision the CodePipeline resources: codepipeline-stack.yml

This template defines an S3 bucket that is used as the source action for the pipeline. Any time source code is uploaded to a specified bucket, AWS CloudTrail logs the event, which is detected by an Amazon CloudWatch Events rule configured to start running the pipeline in CodePipeline. The pipeline orchestrates CodeBuild to get the source code and provision the workflow.

For more information on any of the available source actions that you can use with CodePipeline, such as Amazon S3, AWS CodeCommit, Amazon Elastic Container Registry (Amazon ECR), GitHub, GitHub Enterprise Server, GitHub Enterprise Cloud, or Bitbucket, see Start a pipeline execution in CodePipeline.

We start by deploying the stack that sets up the CodePipeline resources. This stack can be deployed in any Region where CodePipeline and AWS Glue are available. For more information, see AWS Regional Services.

Cloning the GitHub repo

Clone the GitHub repo with the following command:

$ git clone https://github.com/aws-samples/provision-codepipeline-glue-workflows.git

Deploying the CodePipeline stack

Deploy the CodePipeline stack with the following command:

$ aws cloudformation deploy \
--stack-name codepipeline-covid19 \
--template-file cloudformation/codepipeline-stack.yml \
--capabilities CAPABILITY_NAMED_IAM \
--no-fail-on-empty-changeset \
--region <AWS_REGION>

When the deployment is complete, you can view the pipeline that was provisioned on the CodePipeline console.

CodePipeline console showing the deploy pipeline in failed state

Figure: CodePipeline console

The preceding screenshot shows that the pipeline failed. This is because we haven’t uploaded the source code yet.

In the following steps, we zip and upload the source code, which triggers another (successful) run of the pipeline.

Zipping the source code

Zip the source code containing Glue scripts, CloudFormation templates, and Buildspecs file with the following command:

$ zip -r source.zip . -x images/\* *.history* *.git* *.DS_Store*

You can omit *.DS_Store* from the preceding command if you are not a Mac user.

Uploading the source code

Upload the source code with the following command:

$ aws s3 cp source.zip s3://covid19-codepipeline-source-<AWS_ACCOUNT_ID>-<AWS_REGION>

Make sure to provide your account ID and Region in the preceding command. For example, if your AWS account ID is 111111111111 and you’re using Region us-west-2, use the following command:

$ aws s3 cp source.zip s3://covid19-codepipeline-source-111111111111-us-west-2

Now that the source code has been uploaded, view the pipeline again to see it in action.

CodePipeline console showing the deploy pipeline in success state

Figure: CodePipeline console displaying stage “Deploy” in-progress

Choose Details within the Deploy stage to see the build logs.

CodeBuild console displaying build logs

Figure: CodeBuild console displaying build logs

To modify any of the commands that run within the Deploy stage, feel free to modify: deploy-glue-workflow-stack.yml

Try uploading the source code a few more times. Each time it’s uploaded, CodePipeline starts and runs another deploy of the workflow stack. If nothing has changed in the source code, AWS CloudFormation automatically determines that the stack is already up to date. If something has changed in the source code, AWS CloudFormation automatically determines that the stack needs to be updated and proceeds to run the change set.

Viewing the provisioned workflow, triggers, jobs, and crawlers

To view your workflows on the AWS Glue console, in the navigation pane, under ETL, choose Workflows.

Glue console showing workflows

Figure: Navigate to Workflows

To view your triggers, in the navigation pane, under ETL, choose Triggers.

Glue console showing triggers

Figure: Navigate to Triggers

To view your crawlers, under Data Catalog, choose Crawlers.

Glue console showing crawlers

Figure: Navigate to Crawlers

To view your jobs, under ETL, choose Jobs.

Glue console showing jobs

Figure: Navigate to Jobs

Running the workflow

The workflow runs automatically at 8:00 AM UTC. To start the workflow manually, you can use either the AWS CLI or the AWS Glue console.

To start the workflow with the AWS CLI, enter the following command:

$ aws glue start-workflow-run – name Covid_19 – region <AWS_REGION>

To start the workflow on the AWS Glue console, on the Workflows page, select your workflow and choose Run on the Actions menu.

Glue console run workflow

Figure: AWS Glue console start workflow run

To view the run details of the workflow, choose the workflow on the AWS Glue console and choose View run details on the History tab.

Glue console view run details of a workflow

Figure: View run details

The following screenshot shows a visual representation of the workflow as a graph with your run details.

Glue console showing visual representation of the workflow as a graph.

Figure: AWS Glue console displaying details of successful workflow run

Cleaning up

To avoid additional charges, delete the stack created by the CloudFormation template and the contents of the buckets you created.

1. Delete the contents of the covid19-dataset bucket with the following command:

$ aws s3 rm s3://covid19-dataset-<AWS_ACCOUNT_ID>-<AWS_REGION> – recursive

2. Delete your workflow stack with the following command:

$ aws cloudformation delete-stack – stack-name glue-covid19 – region <AWS_REGION>

To delete the contents of the covid19-codepipeline-source bucket, it’s simplest to use the Amazon S3 console because it makes it easy to delete multiple versions of the object at once.

3. Navigate to the S3 bucket named covid19-codepipeline-source-<AWS_ACCOUNT_ID>- <AWS_REGION>.

4. Choose List versions.

5. Select all the files to delete.

6. Choose Delete and follow the prompts to permanently delete all the objects.

S3 console delete all object versions

Figure: AWS S3 console delete all object versions

7. Delete the contents of the covid19-codepipeline-artifacts bucket:

$ aws s3 rm s3://covid19-codepipeline-artifacts-<AWS_ACCOUNT_ID>-<AWS-REGION> – recursive

8. Delete the contents of the covid19-cloudtrail-logs bucket:

$ aws s3 rm s3://covid19-cloudtrail-logs-<AWS_ACCOUNT_ID>-<AWS-REGION> – recursive

9. Delete the pipeline stack:

$ aws cloudformation delete-stack – stack-name codepipeline-covid19 – region <AWS-REGION>

Conclusion

In this post, we stepped through how to use AWS DevOps tooling to model and provision an AWS Glue workflow that orchestrates an end-to-end ETL pipeline on a real-world dataset.

You can download the source code and template from this Github repository and adapt it as you see fit for your data pipeline use cases. Feel free to leave comments letting us know about the architectures you build for your environment. To learn more about building ETL pipelines with AWS Glue, see the AWS Glue Developer Guide and the AWS Data Analytics learning path.

About the Authors

Nuatu Tseggai

Nuatu Tseggai is a Cloud Infrastructure Architect at Amazon Web Services. He enjoys working with customers to design and build event-driven distributed systems that span multiple services.

Suvojit Dasgupta

Suvojit Dasgupta is a Sr. Customer Data Architect at Amazon Web Services. He works with customers to design and build complex data solutions on AWS.

Understanding memory usage in your Java application with Amazon CodeGuru Profiler

Post Syndicated from Fernando Ciciliati original https://aws.amazon.com/blogs/devops/understanding-memory-usage-in-your-java-application-with-amazon-codeguru-profiler/

“Where has all that free memory gone?” This is the question we ask ourselves every time our application emits that dreaded OutOfMemoyError just before it crashes. Amazon CodeGuru Profiler can help you find the answer.

Thanks to its brand-new memory profiling capabilities, troubleshooting and resolving memory issues in Java applications (or almost anything that runs on the JVM) is much easier. AWS launched the CodeGuru Profiler Heap Summary feature at re:Invent 2020. This is the first step in helping us, developers, understand what our software is doing with all that memory it uses.

The Heap Summary view shows a list of Java classes and data types present in the Java Virtual Machine heap, alongside the amount of memory they’re retaining and the number of instances they represent. The following screenshot shows an example of this view.

Amazon CodeGuru Profiler heap summary view example

Figure: Amazon CodeGuru Profiler Heap Summary feature

Because CodeGuru Profiler is a low-overhead, production profiling service designed to be always on, it can capture and represent how memory utilization varies over time, providing helpful visual hints about the object types and the data types that exhibit a growing trend in memory consumption.

In the preceding screenshot, we can see that several lines on the graph are trending upwards:

  • The red top line, horizontal and flat, shows how much memory has been reserved as heap space in the JVM. In this case, we see a heap size of 512 MB, which can usually be configured in the JVM with command line parameters like -Xmx.
  • The second line from the top, blue, represents the total memory in use in the heap, independent of their type.
  • The third, fourth, and fifth lines show how much memory space each specific type has been using historically in the heap. We can easily spot that java.util.LinkedHashMap$Entry and java.lang.UUID display growing trends, whereas byte[] has a flat line and seems stable in memory usage.

Types that exhibit constantly growing trend of memory utilization with time deserve a closer look. Profiler helps you focus your attention on these cases. Associating the information presented by the Profiler with your own knowledge of your application and code base, you can evaluate whether the amount of memory being used for a specific data type can be considered normal, or if it might be a memory leak – the unintentional holding of memory by an application due to the failure in freeing-up unused objects. In our example above, java.util.LinkedHashMap$Entry and java.lang.UUIDare good candidates for investigation.

To make this functionality available to customers, CodeGuru Profiler uses the power of Java Flight Recorder (JFR), which is now openly available with Java 8 (since OpenJDK release 262) and above. The Amazon CodeGuru Profiler agent for Java, which already does an awesome job capturing data about CPU utilization, has been extended to periodically collect memory retention metrics from JFR and submit them for processing and visualization via Amazon CodeGuru Profiler. Thanks to its high stability and low overhead, the Profiler agent can be safely deployed to services in production, because it is exactly there, under real workloads, that really interesting memory issues are most likely to show up.

Summary

For more information about CodeGuru Profiler and other AI-powered services in the Amazon CodeGuru family, see Amazon CodeGuru. If you haven’t tried the CodeGuru Profiler yet, start your 90-day free trial right now and understand why continuous profiling is becoming a must-have in every production environment. For Amazon CodeGuru customers who are already enjoying the benefits of always-on profiling, this new feature is available at no extra cost. Just update your Profiler agent to version 1.1.0 or newer, and enable Heap Summary in your agent configuration.

 

Happy profiling!

Building a Jenkins Pipeline with AWS SAM

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/building-a-jenkins-pipeline-with-aws-sam/

This post is courtesy of Tarun Kumar Mall, SDE at AWS.

This post shows how to set up a multi-stage pipeline on a Jenkins host for a serverless application, using the AWS Serverless Application Model (AWS SAM).

Overview

This tutorial uses Jenkins Pipeline plugin. A commit to the main branch of the repository starts and deploys the application, using the AWS SAM CLI. This tutorial deploys a small serverless API application called HelloWorldApi.

The pipeline consists of stages to build and deploy the application. Jenkins first ensures that the build environment is set up and installs any necessary tools. Next, Jenkins prepares the build artifacts. It promotes the artifacts to the next stage, where they are deployed to a beta environment using the AWS SAM CLI. Integration tests are run after deployment. If the tests pass, the application is deployed to the production environment.

CICD workflow diagram

CICD workflow diagram

The following prerequisites are required:

Setting up the backend application and development stack

Using AWS CloudFormation to define the infrastructure, you can create multiple environments or stacks from the same infrastructure definition. A “dev stack” is a copy of production infrastructure deployed to a developer account for testing purposes.

As serverless services use a pay-for-value model, it can be cost effective to use a high-fidelity copy of your production stack. Dev stacks are created by each developer as needed and deleted without having any negative impact on production.

For complex applications, it may not be feasible for every developer to have their own stack. However, for this tutorial, setting up the dev stack first for testing is recommended. Setting up a dev stack takes you through a manual process of how a stack is created. Later, this process is used to automate the setup using Jenkins.

To create a dev stack:

  1. Clone backend application repository https://github.com/aws-samples/aws-sam-jenkins-pipeline-tutorial
    git clone https://github.com/aws-samples/aws-sam-jenkins-pipeline-tutorial.git
  2. Build the application and run the guided deploy command:
    cd aws-sam-jenkins-pipeline-tutorial
    sam build
    sam deploy – guided

    AWS SAM guided deploy output

    AWS SAM guided deploy output

This sets up a development stack and saves the settings in the samconfig.toml file with configuration environment specific to a user. This also triggers a deployment.

  1. After deployment, make a small code change. For example, in the file hello-world/app.js change the message Hello world to Hello world from user <your name>.
  2. Deploy the updated code:
    sam build
    sam deploy -–config-env <your_username>

With this command, each developer can create their own configuration environment. They can use this for deploying to their development stack and testing changes before pushing changes to the repository.

Once deployment finishes, the API endpoint is displayed in the console output. You can use this endpoint to make GET requests and test the API manually.

Deployment output

Deployment output

To update and run the integration test:

  1. Open the hello-world/tests/integ/test-integ-api.js file.
  2. Update the assert statement in line 32 to include <your name> from the previous step:
    it("verifies if response contains my username", async () => {
      assert.include(apiResponse.data.message, "<your name>");
    });
  3. Open package.json and add the line in bold:
    {
      ...
      "scripts": {
        "test": "mocha tests/unit/",
        "integ-test": "mocha tests/integ/"
      }
      ...
    }
  4. From the terminal, run the following commands:
    cd hello-world
    npm install
    AWS_REGION=us-west-2 STACK_NAME=sam-app-user1-dev-stack npm run integ-test
    If you are using Microsoft Windows, instead run:
    cd hello-world
    npm install
    set AWS_REGION=us-west-2
    set STACK_NAME=sam-app-user1-dev-stack
    npm run integ-test

    Test results

    Test results

You have deployed a fully configured development stack with working integration tests. To push the code to GitHub:

  1. Create a new repository in GitHub.
    1. From the GitHub account homepage, choose New.
    2. Enter a repository name and choose Create Repository.
    3. Copy the repository URL.
  2. From the root directory of the AWS SAM project, run:
    git init
    git commit -am “first commit”
    git remote add origin <your-repository-url>
    git push -u origin main

Creating an IAM user for Jenkins

To create an IAM user for the Jenkins deployment:

  1. Sign in to the AWS Management Console and navigate to IAM.
  2. Select Users from side navigation and choose Add user.
  3. Enter the User name as sam-jenkins-demo-credentials and grant Programmatic access to this user.
  4. On the next page, select Attach existing policies directly and choose Create Policy.
  5. Select the JSON tab and enter the following policy. Replace <YOUR_ACCOUNT_ID> with your AWS account ID:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CloudFormationTemplate",
                "Effect": "Allow",
                "Action": [
                    "cloudformation:CreateChangeSet"
                ],
                "Resource": [
                    "arn:aws:cloudformation:*:aws:transform/Serverless-2016-10-31"
                ]
            },
            {
                "Sid": "CloudFormationStack",
                "Effect": "Allow",
                "Action": [
                    "cloudformation:CreateChangeSet",
                    "cloudformation:DeleteStack",
                    "cloudformation:DescribeChangeSet",
                    "cloudformation:DescribeStackEvents",
                    "cloudformation:DescribeStacks",
                    "cloudformation:ExecuteChangeSet",
                    "cloudformation:GetTemplateSummary"
                ],
                "Resource": [
                    "arn:aws:cloudformation:*:<YOUR_ACCOUNT_ID>:stack/*"
                ]
            },
            {
                "Sid": "S3",
                "Effect": "Allow",
                "Action": [
                    "s3:CreateBucket",
                    "s3:GetObject",
                    "s3:PutObject"
                ],
                "Resource": [
                    "arn:aws:s3:::*/*"
                ]
            },
            {
                "Sid": "Lambda",
                "Effect": "Allow",
                "Action": [
                    "lambda:AddPermission",
                    "lambda:CreateFunction",
                    "lambda:DeleteFunction",
                    "lambda:GetFunction",
                    "lambda:GetFunctionConfiguration",
                    "lambda:ListTags",
                    "lambda:RemovePermission",
                    "lambda:TagResource",
                    "lambda:UntagResource",
                    "lambda:UpdateFunctionCode",
                    "lambda:UpdateFunctionConfiguration"
                ],
                "Resource": [
                    "arn:aws:lambda:*:<YOUR_ACCOUNT_ID>:function:*"
                ]
            },
            {
                "Sid": "IAM",
                "Effect": "Allow",
                "Action": [
                    "iam:AttachRolePolicy",
                    "iam:CreateRole",
                    "iam:DeleteRole",
                    "iam:DetachRolePolicy",
                    "iam:GetRole",
                    "iam:PassRole",
                    "iam:TagRole"
                ],
                "Resource": [
                    "arn:aws:iam::<YOUR_ACCOUNT_ID>:role/*"
                ]
            },
            {
                "Sid": "APIGateway",
                "Effect": "Allow",
                "Action": [
                    "apigateway:DELETE",
                    "apigateway:GET",
                    "apigateway:PATCH",
                    "apigateway:POST",
                    "apigateway:PUT"
                ],
                "Resource": [
                    "arn:aws:apigateway:*::*"
                ]
            }
        ]
    }
  6. Choose Review Policy and add a policy name on the next page.
  7. Choose Create Policy button.
  8. Return to the previous tab to continue creating the IAM user. Choose Refresh and search for the policy name you created. Select the policy.
  9. Choose Next Tags and then Review.
  10. Choose Create user and save the Access key ID and Secret access key.

Configuring Jenkins

To configure AWS credentials in Jenkins:

  1. On the Jenkins dashboard, go to Manage Jenkins > Manage Plugins in the Available tab. Search for the Pipeline: AWS Steps plugin and choose Install without restart.
  2. Navigate to Manage Jenkins > Manage Credentials > Jenkins (global) > Global Credentials > Add Credentials.
  3. Select Kind as AWS credentials and use the ID sam-jenkins-demo-credentials.
  4. Enter the access key ID and secret access key and choose OK.

    Jenkins credential configuration

    Jenkins credential configuration

  5. Create Amazon S3 buckets for each Region in the pipeline. S3 bucket names must be unique within a partition:
    aws s3 mb s3://sam-jenkins-demo-us-west-2-<your_name> – region us-west-2
    aws s3 mb s3://sam-jenkins-demo-us-east-1-<your_name> – region us-east-1
  6. Create a file named Jenkinsfile at the root of the project and add:
    pipeline {
      agent any
     
      stages {
        stage('Install sam-cli') {
          steps {
            sh 'python3 -m venv venv && venv/bin/pip install aws-sam-cli'
            stash includes: '**/venv/**/*', name: 'venv'
          }
        }
        stage('Build') {
          steps {
            unstash 'venv'
            sh 'venv/bin/sam build'
            stash includes: '**/.aws-sam/**/*', name: 'aws-sam'
          }
        }
        stage('beta') {
          environment {
            STACK_NAME = 'sam-app-beta-stage'
            S3_BUCKET = 'sam-jenkins-demo-us-west-2-user1'
          }
          steps {
            withAWS(credentials: 'sam-jenkins-demo-credentials', region: 'us-west-2') {
              unstash 'venv'
              unstash 'aws-sam'
              sh 'venv/bin/sam deploy – stack-name $STACK_NAME -t template.yaml – s3-bucket $S3_BUCKET – capabilities CAPABILITY_IAM'
              dir ('hello-world') {
                sh 'npm ci'
                sh 'npm run integ-test'
              }
            }
          }
        }
        stage('prod') {
          environment {
            STACK_NAME = 'sam-app-prod-stage'
            S3_BUCKET = 'sam-jenkins-demo-us-east-1-user1'
          }
          steps {
            withAWS(credentials: 'sam-jenkins-demo-credentials', region: 'us-east-1') {
              unstash 'venv'
              unstash 'aws-sam'
              sh 'venv/bin/sam deploy – stack-name $STACK_NAME -t template.yaml – s3-bucket $S3_BUCKET – capabilities CAPABILITY_IAM'
            }
          }
        }
      }
    }
  7. Commit and push the code to the GitHub repository by running following commands:
    git commit -am “Adding Jenkins pipeline config.”
    git push origin -u main

Next, create a Jenkins Pipeline project:

  1. From the Jenkins dashboard, choose New Item, select Pipeline, and enter the project name sam-jenkins-demo-pipeline.

    Jenkins Pipeline creation wizard

    Jenkins Pipeline creation wizard

  2. Under Build Triggers, select Poll SCM and enter * * * * *. This polls the repository for changes every minute.

    Jenkins build triggers configuration

    Jenkins build triggers configuration

  3. Under the Pipeline section, select Definition as Pipeline script from SCM.
    • Select GIT under SCM and enter the repository URL.
    • Set Branches to build to */main.
    • Set the Script Path to Jenkinsfile.

      Jenkins pipeline configuration

      Jenkins pipeline configuration

  4. Save the project.

After the build finishes, you see the pipeline:

Jenkins pipeline stages

Jenkins pipeline stages

Review the logs for the beta stage to check that the integration test is completed successfully.

Jenkins stage logs

Jenkins stage logs

Conclusion

This tutorial uses a Jenkins Pipeline to add an automated CI/CD pipeline to an AWS SAM-generated example application. Jenkins automatically builds, tests, and deploys the changes after each commit to the repository.

Using Jenkins, developers can gain the benefits of continuous integration and continuous deployment of serverless applications to the AWS Cloud with minimal configuration.

For more information, see the Jenkins Pipeline and AWS Serverless Application Model documentation.

We want to hear your feedback! Is this approach useful for your organization? Do you want to see another implementation? Contact us on Twitter @edjgeek or via comments!

Resource leak detection in Amazon CodeGuru Reviewer

Post Syndicated from Pranav Garg original https://aws.amazon.com/blogs/devops/resource-leak-detection-in-amazon-codeguru/

This post discusses the resource leak detector for Java in Amazon CodeGuru Reviewer. CodeGuru Reviewer automatically analyzes pull requests (created in supported repositories such as AWS CodeCommit, GitHub, GitHub Enterprise, and Bitbucket) and generates recommendations for improving code quality. For more information, see Automating code reviews and application profiling with Amazon CodeGuru. This blog does not describe the resource leak detector for Python programs that is now available in preview.

What are resource leaks?

Resources are objects with a limited availability within a computing system. These typically include objects managed by the operating system, such as file handles, database connections, and network sockets. Because the number of such resources in a system is limited, they must be released by an application as soon as they are used. Otherwise, you will run out of resources and you won’t be able to allocate new ones. The paradigm of acquiring a resource and releasing it is also followed by other categories of objects such as metric wrappers and timers.

Resource leaks are bugs that arise when a program doesn’t release the resources it has acquired. Resource leaks can lead to resource exhaustion. In the worst case, they can cause the system to slow down or even crash.

Starting with Java 7, most classes holding resources implement the java.lang.AutoCloseable interface and provide a close() method to release them. However, a close() call in source code doesn’t guarantee that the resource is released along all program execution paths. For example, in the following sample code, resource r is acquired by calling its constructor and is closed along the path corresponding to the if branch, shown using green arrows. To ensure that the acquired resource doesn’t leak, you must also close r along the path corresponding to the else branch (the path shown using red arrows).

A resource must be closed along all execution paths to prevent resource leaks

Often, resource leaks manifest themselves along code paths that aren’t frequently run, or under a heavy system load, or after the system has been running for a long time. As a result, such leaks are latent and can remain dormant in source code for long periods of time before manifesting themselves in production environments. This is the primary reason why resource leak bugs are difficult to detect or replicate during testing, and why automatically detecting these bugs during pull requests and code scans is important.

Detecting resource leaks in CodeGuru Reviewer

For this post, we consider the following Java code snippet. In this code, method getConnection() attempts to create a connection in the connection pool associated with a data source. Typically, a connection pool limits the maximum number of connections that can remain open at any given time. As a result, you must close connections after their use so as to not exhaust this limit.

 1     private Connection getConnection(final BasicDataSource dataSource, ...)
               throws ValidateConnectionException, SQLException {
 2         boolean connectionAcquired = false;
 3         // Retrying three times to get the connection.
 4         for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
 5             Connection connection = dataSource.getConnection();
 6             // validateConnection may throw ValidateConnectionException
 7             if (! validateConnection(connection, ...)) {
 8                 // connection is invalid
 9                 DbUtils.closeQuietly(connection);
10             } else {
11                 // connection is established
12                 connectionAcquired = true;
13                 return connection;
14             }
15         }
16         return null;
17     }

At first glance, it seems that the method getConnection() doesn’t leak connection resources. If a valid connection is established in the connection pool (else branch on line 10 is taken), the method getConnection() returns it to the client for use (line 13). If the connection established is invalid (if branch on line 7 is taken), it’s closed in line 9 before another attempt is made to establish a connection.

However, method validateConnection() at line 7 can throw a ValidateConnectionException. If this exception is thrown after a connection is established at line 5, the connection is neither closed in this method nor is it returned upstream to the client to be closed later. Furthermore, if this exceptional code path runs frequently, for instance, if the validation logic throws on a specific recurring service request, each new request causes a connection to leak in the connection pool. Eventually, the client can’t acquire new connections to the data source, impacting the availability of the service.

A typical recommendation to prevent resource leak bugs is to declare the resource objects in a try-with-resources statement block. However, we can’t use try-with-resources to fix the preceding method because this method is required to return an open connection for use in the upstream client. The CodeGuru Reviewer recommendation for the preceding code snippet is as follows:

“Consider closing the following resource: connection. The resource is referenced at line 7. The resource is closed at line 9. The resource is returned at line 13. There are other execution paths that don’t close the resource or return it, for example, when validateConnection throws an exception. To prevent this resource leak, close connection along these other paths before you exit this method.”

As mentioned in the Reviewer recommendation, to prevent this resource leak, you must close the established connection when method validateConnection() throws an exception. This can be achieved by inserting the validation logic (lines 7–14) in a try block. In the finally block associated with this try, the connection must be closed by calling DbUtils.closeQuietly(connection) if connectionAcquired == false. The method getConnection() after this fix has been applied is as follows:

private Connection getConnection(final BasicDataSource dataSource, ...) 
        throws ValidateConnectionException, SQLException {
    boolean connectionAcquired = false;
    // Retrying three times to get the connection.
    for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
        Connection connection = dataSource.getConnection();
        try {
            // validateConnection may throw ValidateConnectionException
            if (! validateConnection(connection, ...)) {
                // connection is invalid
                DbUtils.closeQuietly(connection);
            } else {
                // connection is established
                connectionAcquired = true;
                return connection;
            }
        } finally {
            if (!connectionAcquired) {
                DBUtils.closeQuietly(connection);
            }
        }
    }
    return null;
}

As shown in this example, resource leaks in production services can be very disruptive. Furthermore, leaks that manifest along exceptional or less frequently run code paths can be hard to detect or replicate during testing and can remain dormant in the code for long periods of time before manifesting themselves in production environments. With the resource leak detector, you can detect such leaks on objects belonging to a large number of popular Java types such as file streams, database connections, network sockets, timers and metrics, etc.

Combining static code analysis with machine learning for accurate resource leak detection

In this section, we dive deep into the inner workings of the resource leak detector. The resource leak detector in CodeGuru Reviewer uses static analysis algorithms and techniques. Static analysis algorithms perform code analysis without running the code. These algorithms are generally prone to high false positives (the tool might report correct code as having a bug). If the number of these false positives is high, it can lead to alarm fatigue and low adoption of the tool. As a result, the resource leak detector in CodeGuru Reviewer prioritizes precision over recall— the findings we surface are resource leaks with a high accuracy, though CodeGuru Reviewer could potentially miss some resource leak findings.

The main reason for false positives in static code analysis is incomplete information available to the analysis. CodeGuru Reviewer requires only the Java source files and doesn’t require all dependencies or the build artifacts. Not requiring the external dependencies or the build artifacts reduces the friction to perform automated code reviews. As a result, static analysis only has access to the code in the source repository and doesn’t have access to its external dependencies. The resource leak detector in CodeGuru Reviewer combines static code analysis with a machine learning (ML) model. This ML model is used to reason about external dependencies to provide accurate recommendations.

To understand the use of the ML model, consider again the code above for method getConnection() that had a resource leak. In the code snippet, a connection to the data source is established by calling BasicDataSource.getConnection() method, declared in the Apache Commons library. As mentioned earlier, we don’t require the source code of external dependencies like the Apache library for code analysis during pull requests. Without access to the code of external dependencies, a pure static analysis-driven technique doesn’t know whether the Connection object obtained at line 5 will leak, if not closed. Similarly, it doesn’t know that DbUtils.closeQuietly() is a library function that closes the connection argument passed to it at line 9. Our detector combines static code analysis with ML that learns patterns over such external function calls from a large number of available code repositories. As a result, our resource leak detector knows that the connection doesn’t leak along the following code path:

  • A connection is established on line 5
  • Method validateConnection() returns false at line 7
  • DbUtils.closeQuietly() is called on line 9

This suppresses the possible false warning. At the same time, the detector knows that there is a resource leak when the connection is established at line 5, and validateConnection() throws an exception at line 7 that isn’t caught.

When we run CodeGuru Reviewer on this code snippet, it surfaces only the second leak scenario and makes an appropriate recommendation to fix this bug.

The ML model used in the resource leak detector has been trained on a large number of internal Amazon and GitHub code repositories.

Responses to the resource leak findings

Although closing an open resource in code isn’t difficult, doing so properly along all program paths is important to prevent resource leaks. This can easily be overlooked, especially along exceptional or less frequently run paths. As a result, the resource leak detector in CodeGuru Reviewer has observed a relatively high frequency, and has alerted developers within Amazon to thousands of resource leaks before they hit production.

The resource leak detections have witnessed a high developer acceptance rate, and developer feedback towards the resource leak detector has been very positive. Some of the feedback from developers includes “Very cool, automated finding,” “Good bot :),” and “Oh man, this is cool.” Developers have also concurred that the findings are important and need to be fixed.

Conclusion

Resource leak bugs are difficult to detect or replicate during testing. They can impact the availability of production services. As a result, it’s important to automatically detect these bugs early on in the software development workflow, such as during pull requests or code scans. The resource leak detector in CodeGuru Reviewer combines static code analysis algorithms with ML to surface only the high confidence leaks. It has a high developer acceptance rate and has alerted developers within Amazon to thousands of leaks before those leaks hit production.

New for Amazon CodeGuru – Python Support, Security Detectors, and Memory Profiling

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-amazon-codeguru-python-support-security-detectors-and-memory-profiling/

Amazon CodeGuru is a developer tool that helps you improve your code quality and has two main components:

  • CodeGuru Reviewer uses program analysis and machine learning to detect potential defects that are difficult to find in your code and offers suggestions for improvement.
  • CodeGuru Profiler collects runtime performance data from your live applications, and provides visualizations and recommendations to help you fine-tune your application performance.

Today, I am happy to announce three new features:

  • Python Support for CodeGuru Reviewer and Profiler (Preview) – You can now use CodeGuru to improve applications written in Python. Before this release, CodeGuru Reviewer could analyze Java code, and CodeGuru Profiler supported applications running on a Java virtual machine (JVM).
  • Security Detectors for CodeGuru Reviewer – A new set of detectors for CodeGuru Reviewer to identify security vulnerabilities and check for security best practices in your Java code.
  • Memory Profiling for CodeGuru Profiler – A new visualization of memory retention per object type over time. This makes it easier to find memory leaks and optimize how your application is using memory.

Let’s see these functionalities in more detail.

Python Support for CodeGuru Reviewer and Profiler (Preview)
Python Support for CodeGuru Reviewer is available in Preview and offers recommendations on how to improve the Python code of your applications in multiple categories such as concurrency, data structures and control flow, scientific/math operations, error handling, using the standard library, and of course AWS best practices.

You can now also use CodeGuru Profiler to collect runtime performance data from your Python applications and get visualizations to help you identify how code is running on the CPU and where time is consumed. In this way, you can detect the most expensive lines of code of your application. Focusing your tuning activities on those parts helps you reduce infrastructure cost and improve application performance.

Let’s see the CodeGuru Reviewer in action with some Python code. When I joined AWS eight years ago, one of the first projects I created was a Filesystem in Userspace (FUSE) interface to Amazon Simple Storage Service (S3) called yas3fs (Yet Another S3-backed File System). It was inspired by the more popular s3fs-fuse project but rewritten from scratch to implement a distributed cache synchronized by Amazon Simple Notification Service (SNS) notifications (now, thanks to the many contributors, it’s using S3 event notifications). It was also a good excuse for me to learn more about Python programming and S3. It’s a personal project that at the time made available as open source. Today, if you need a shared file system, you can use Amazon Elastic File System (EFS).

In the CodeGuru console, I associate the yas3fs repository. You can associate repositories from GitHub, including GitHub Enterprise Cloud and GitHub Enterprise Server, Bitbucket, or AWS CodeCommit.

After that, I can get a code review from CodeGuru in two ways:

  • Automatically, when I create a pull request. This is a great way to use it as you and your team are working on a code base.
  • Manually, creating a repository analysis to get a code review for all the code in one branch. This is useful to start using GodeGuru with an existing code base.

Since I just associated the whole repository, I go for a full analysis and write down the branch name to review (apologies, I was still using master at the time, now I use main for new projects).

After a few minutes, the code review is completed, and there are 14 recommendations. Not bad, but I can definitely improve the code. Here’s a few of the recommendations I get. I was using exceptions and global variables too much at the time.

Security Detectors for CodeGuru Reviewer
The new CodeGuru Reviewer Security Detector uses automated reasoning to analyze all code paths and find potential security issues deep in your Java code, even ones that span multiple methods and files and that may involve multiple sequences of operations. To build this detector, we used learning and best practices from Amazon’s 20+ years of experience.

The Security Detector is also identifying security vulnerabilities in the top 10 Open Web Application Security Project (OWASP) categories, such as weak hash encryption.

If the security detector discovers an issue, it offers a suggested remediation along with an explanation. In this way, it’s much easier to follow security best practices for AWS APIs, such as those for AWS Key Management Service (KMS) and Amazon Elastic Compute Cloud (EC2), and for common Java cryptography and TLS/SSL libraries.

With help from the security detector, security engineers can focus on architectural and application-specific security best-practices, and code reviewers can focus their attention on other improvements.

Memory Profiling for CodeGuru Profiler
For applications running on a JVM, CodeGuru Profiler can now show the Heap Summary, a consolidated view of memory retention during a time frame, tracking both overall sizes and number of objects per object type (such as String, int, char[], and custom types). These metrics are presented in a timeline graph, so that you can easily spot trends and peaks of memory utilization per object type.

Here are a couple of scenarios where this can help:

Memory Leaks – A constantly growing memory utilization curve for one or more object types may indicate a leak (intended here as unnecessary retention of memory objects by the application), possibly leading to out-of-memory errors and application crashes.

Memory Optimizations – Having a breakdown of memory utilization per object type is a step beyond traditional memory utilization monitoring, based solely on JVM-level metrics like total heap usage. By knowing that an unexpectedly high amount of memory has been associated with a specific object type, you can focus your analysis and optimization efforts on the parts of your application that are responsible for allocating and referencing objects of that type.

For example, here is a graph showing how memory is used by a Java application over an interval of time. Apart from the total capacity available and the used space, I can see how memory is being used by some specific object types, such as byte[], java.lang.UUID, and the entries of a java.util.LinkedHashMap. The continuous growth over time of the memory retained by these object types is suspicious. There is probably a memory leak I have to investigate.

In the table just below, I have a longer list of object types allocating memory on the heap. The first three are selected and for that reason are shown in the graph above. Here, I can inspect other object types and select them to see their memory usage over time. It looks like the three I already selected are the ones with more risk of being affected by a memory leak.

Available Now
These new features are available today in all regions where Amazon CodeGuru is offered. For more information, please see the AWS Regional Services table.

There are no pricing changes for Python support, security detectors, and memory profiling. You pay for what you use without upfront fees or commitments.

Learn more about Amazon CodeGuru and start using these new features today to improve the code quality of your applications.  

Danilo

Raising code quality for Python applications using Amazon CodeGuru

Post Syndicated from Ran Fu original https://aws.amazon.com/blogs/devops/raising-code-quality-for-python-applications-using-amazon-codeguru/

We are pleased to announce the launch of Python support for Amazon CodeGuru, a service for automated code reviews and application performance recommendations. CodeGuru is powered by program analysis and machine learning, and trained on best practices and hard-learned lessons across millions of code reviews and thousands of applications profiled on open-source projects and internally at Amazon.

Amazon CodeGuru has two services:

  • Amazon CodeGuru Reviewer – Helps you improve source code quality by detecting hard-to-find defects during application development and recommending how to remediate them.
  • Amazon CodeGuru Profiler – Helps you find the most expensive lines of code, helps reduce your infrastructure cost, and fine-tunes your application performance.

The launch of Python support extends CodeGuru beyond its original Java support. Python is a widely used language for various use cases, including web app development and DevOps. Python’s growth in data analysis and machine learning areas is driven by its rich frameworks and libraries. In this post, we discuss how to use CodeGuru Reviewer and Profiler to improve your code quality for Python applications.

CodeGuru Reviewer for Python

CodeGuru Reviewer now allows you to analyze your Python code through pull requests and full repository analysis. For more information, see Automating code reviews and application profiling with Amazon CodeGuru. We analyzed large code corpuses and Python documentation to source hard-to-find coding issues and trained our detectors to provide best practice recommendations. We expect such recommendations to benefit beginners as well as expert Python programmers.

CodeGuru Reviewer generates recommendations in the following categories:

  • AWS SDK APIs best practices
  • Data structures and control flow, including exception handling
  • Resource leaks
  • Secure coding practices to protect from potential shell injections

In the following sections, we provide real-world examples of bugs that can be detected in each of the categories:

AWS SDK API best practices

AWS has hundreds of services and thousands of APIs. Developers can now benefit from CodeGuru Reviewer recommendations related to AWS APIs. AWS recommendations in CodeGuru Reviewer cover a wide range of scenarios such as detecting outdated or deprecated APIs, warning about API misuse, authentication and exception scenarios, and efficient API alternatives.

Consider the pagination trait, implemented by over 1,000 APIs from more than 150 AWS services. The trait is commonly used when the response object is too large to return in a single response. To get the complete set of results, iterated calls to the API are required, until the last page is reached. If developers were not aware of this, they would write the code as the following (this example is patterned after actual code):

def sync_ddb_table(source_ddb, destination_ddb):
    response = source_ddb.scan(TableName=“table1”)
    for item in response['Items']:
        ...
        destination_ddb.put_item(TableName=“table2”, Item=item)
    …   

Here the scan API is used to read items from one Amazon DynamoDB table and the put_item API to save them to another DynamoDB table. The scan API implements the Pagination trait. However, the developer missed iterating on the results beyond the first scan, leading to only partial copying of data.

The following screenshot shows what CodeGuru Reviewer recommends:

The following screenshot shows CodeGuru Reviewer recommends on the need for pagination

The developer fixed the code based on this recommendation and added complete handling of paginated results by checking the LastEvaluatedKey value in the response object of the paginated API scan as follows:

def sync_ddb_table(source_ddb, destination_ddb):
    response = source_ddb.scan(TableName==“table1”)
    for item in response['Items']:
        ...
        destination_ddb.put_item(TableName=“table2”, Item=item)
    # Keeps scanning util LastEvaluatedKey is null
    while "LastEvaluatedKey" in response:
        response = source_ddb.scan(
            TableName="table1",
            ExclusiveStartKey=response["LastEvaluatedKey"]
        )
        for item in response['Items']:
            destination_ddb.put_item(TableName=“table2”, Item=item)
    …   

CodeGuru Reviewer recommendation is rich and offers multiple options for implementing Paginated scan. We can also initialize the ExclusiveStartKey value to None and iteratively update it based on the LastEvaluatedKey value obtained from the scan response object in a loop. This fix below conforms to the usage mentioned in the official documentation.

def sync_ddb_table(source_ddb, destination_ddb):
    table = source_ddb.Table(“table1”)
    scan_kwargs = {
                  …
    }
    done = False
    start_key = None
    while not done:
        if start_key:
            scan_kwargs['ExclusiveStartKey'] = start_key
        response = table.scan(**scan_kwargs)
        for item in response['Items']:
            destination_ddb.put_item(TableName=“table2”, Item=item)
        start_key = response.get('LastEvaluatedKey', None)
        done = start_key is None

Data structures and control flow

Python’s coding style is different from other languages. For code that does not conform to Python idioms, CodeGuru Reviewer provides a variety of suggestions for efficient and correct handling of data structures and control flow in the Python 3 standard library:

  • Using DefaultDict for compact handling of missing dictionary keys over using the setDefault() API or dealing with KeyError exception
  • Using a subprocess module over outdated APIs for subprocess handling
  • Detecting improper exception handling such as catching and passing generic exceptions that can hide latent issues.
  • Detecting simultaneous iteration and modification to loops that might lead to unexpected bugs because the iterator expression is only evaluated one time and does not account for subsequent index changes.

The following code is a specific example that can confuse novice developers.

def list_sns(region, creds, sns_topics=[]):
    sns = boto_session('sns', creds, region)
    response = sns.list_topics()
    for topic_arn in response["Topics"]:
        sns_topics.append(topic_arn["TopicArn"])
    return sns_topics
  
def process():
    ...
    for region, creds in jobs["auth_config"]:
        arns = list_sns(region, creds)
        ... 

The process() method iterates over different AWS Regions and collects Regional ARNs by calling the list_sns() method. The developer might expect that each call to list_sns() with a Region parameter returns only the corresponding Regional ARNs. However, the preceding code actually leaks the ARNs from prior calls to subsequent Regions. This happens due to an idiosyncrasy of Python relating to the use of mutable objects as default argument values. Python default value are created exactly one time, and if that object is mutated, subsequent references to the object refer to the mutated value instead of re-initialization.

The following screenshot shows what CodeGuru Reviewer recommends:

The following screenshot shows CodeGuru Reviewer recommends about initializing a value for mutable objects

The developer accepted the recommendation and issued the below fix.

def list_sns(region, creds, sns_topics=None):
    sns = boto_session('sns', creds, region)
    response = sns.list_topics()
    if sns_topics is None: 
        sns_topics = [] 
    for topic_arn in response["Topics"]:
        sns_topics.append(topic_arn["TopicArn"])
    return sns_topics

Resource leaks

A Pythonic practice for resource handling is using Context Managers. Our analysis shows that resource leaks are rampant in Python code where a developer may open external files or windows and forget to close them eventually. A resource leak can slow down or crash your system. Even if a resource is closed, using Context Managers is Pythonic. For example, CodeGuru Reviewer detects resource leaks in the following code:

def read_lines(file):
    lines = []
    f = open(file, ‘r’)
    for line in f:
        lines.append(line.strip(‘\n’).strip(‘\r\n’))
    return lines

The following screenshot shows that CodeGuru Reviewer recommends that the developer either use the ContextLib with statement or use a try-finally block to explicitly close a resource.

The following screenshot shows CodeGuru Reviewer recommend about fixing the potential resource leak

The developer accepted the recommendation and fixed the code as shown below.

def read_lines(file):
    lines = []
    with open(file, ‘r’) as f: 
        for line in f:
            lines.append(line.strip(‘\n’).strip(‘\r\n’))
    return lines

Secure coding practices

Python is often used for scripting. An integral part of such scripts is the use of subprocesses. As of this writing, CodeGuru Reviewer makes a limited, but important set of recommendations to make sure that your use of eval functions or subprocesses is secure from potential shell injections. It issues a warning if it detects that the command used in eval or subprocess scenarios might be influenced by external factors. For example, see the following code:

def execute(cmd):
    try:
        retcode = subprocess.call(cmd, shell=True)
        ...
    except OSError as e:
        ...

The following screenshot shows the CodeGuru Reviewer recommendation:

The following screenshot shows CodeGuru Reviewer recommends about potential shell injection vulnerability

The developer accepted this recommendation and made the following fix.

def execute(cmd):
    try:
        retcode = subprocess.call(shlex.quote(cmd), shell=True)
        ...
    except OSError as e:
        ...

As shown in the preceding recommendations, not only are the code issues detected, but a detailed recommendation is also provided on how to fix the issues, along with a link to the Python official documentation. You can provide feedback on recommendations in the CodeGuru Reviewer console or by commenting on the code in a pull request. This feedback helps improve the performance of Reviewer so that the recommendations you see get better over time.

Now let’s take a look at CodeGuru Profiler.

CodeGuru Profiler for Python

Amazon CodeGuru Profiler analyzes your application’s performance characteristics and provides interactive visualizations to show you where your application spends its time. These visualizations a. k. a. flame graphs are a powerful tool to help you troubleshoot which code methods have high latency or are over utilizing your CPU.

Thanks to the new Python agent, you can now use CodeGuru Profiler on your Python applications to investigate performance issues.

The following list summarizes the supported versions as of this writing.

  • AWS Lambda functions: Python3.8, Python3.7, Python3.6
  • Other environments: Python3.9, Python3.8, Python3.7, Python3.6

Onboarding your Python application

For this post, let’s assume you have a Python application running on Amazon Elastic Compute Cloud (Amazon EC2) hosts that you want to profile. To onboard your Python application, complete the following steps:

1. Create a new profiling group in CodeGuru Profiler console called ProfilingGroupForMyApplication. Give access to your Amazon EC2 execution role to submit to this profiling group. See the documentation for details about how to create a Profiling Group.

2. Install the codeguru_profiler_agent module:

pip3 install codeguru_profiler_agent

3. Start the profiler in your application.

An easy way to profile your application is to start your script through the codeguru_profiler_agent module. If you have an app.py script, use the following code:

python -m codeguru_profiler_agent -p ProfilingGroupForMyApplication app.py

Alternatively, you can start the agent manually inside the code. This must be done only one time, preferably in your startup code:

from codeguru_profiler_agent import Profiler

if __name__ == "__main__":
     Profiler(profiling_group_name='ProfilingGroupForMyApplication')
     start_application()    # your code in there....

Onboarding your Python Lambda function

Onboarding for an AWS Lambda function is quite similar.

  1. Create a profiling group called ProfilingGroupForMyLambdaFunction, this time we select “Lambda” for the compute platform. Give access to your Lambda function role to submit to this profiling group. See the documentation for details about how to create a Profiling Group.
  2. Include the codeguru_profiler_agent module in your Lambda function code.
  3. Add the with_lambda_profiler decorator to your handler function:
from codeguru_profiler_agent import with_lambda_profiler

@with_lambda_profiler(profiling_group_name='ProfilingGroupForMyLambdaFunction')
def handler_function(event, context):
      # Your code here

Alternatively, you can profile an existing Lambda function without updating the source code by adding a layer and changing the configuration. For more information, see Profiling your applications that run on AWS Lambda.

Profiling a Lambda function helps you see what is slowing down your code so you can reduce the duration, which reduces the cost and improves latency. You need to have continuous traffic on your function in order to produce a usable profile.

Viewing your profile

After running your profile for some time, you can view it on the CodeGuru console.

Screenshot of Flame graph visualization by CodeGuru Profiler

Each frame in the flame graph shows how much that function contributes to latency. In this example, an outbound call that crosses the network is taking most of the duration in the Lambda function, caching its result would improve the latency.

For more information, see Investigating performance issues with Amazon CodeGuru Profiler.

Supportability for CodeGuru Profiler is documented here.

If you don’t have an application to try CodeGuru Profiler on, you can use the demo application in the following GitHub repo.

Conclusion

This post introduced how to leverage CodeGuru Reviewer to identify hard-to-find code defects in various issue categories and how to onboard your Python applications or Lambda function in CodeGuru Profiler for CPU profiling. Combining both services can help you improve code quality for Python applications. CodeGuru is now available for you to try. For more pricing information, please see Amazon CodeGuru pricing.

 

About the Authors

Neela Sawant is a Senior Applied Scientist in the Amazon CodeGuru team. Her background is building AI-powered solutions to customer problems in a variety of domains such as software, multimedia, and retail. When she isn’t working, you’ll find her exploring the world anew with her toddler and hacking away at AI for social good.

 

 

Pierre Marieu is a Software Development Engineer in the Amazon CodeGuru Profiler team in London. He loves building tools that help the day-to-day life of other software engineers. Previously, he worked at Amadeus IT, building software for the travel industry.

 

 

 

Ran Fu is a Senior Product Manager in the Amazon CodeGuru team. He has a deep customer empathy, and love exploring who are the customers, what are their needs, and why those needs matter. Besides work, you may find him snowboarding in Keystone or Vail, Colorado.

 

Using NuGet with AWS CodeArtifact

Post Syndicated from John Standish original https://aws.amazon.com/blogs/devops/using-nuget-with-aws-codeartifact/

Managing NuGet packages for .NET development can be a challenge. Tasks such as initial configuration, ongoing maintenance, and scaling inefficiencies are the biggest pain points for developers and organizations. With its addition of NuGet package support, AWS CodeArtifact now provides easy-to-configure and scalable package management for .NET developers. You can use NuGet packages stored in CodeArtifact in Visual Studio, allowing you to use the tools you already know.

In this post, we show how you can provision NuGet repositories in 5 minutes. Then we demonstrate how to consume packages from your new NuGet repositories, all while using .NET native tooling.

All relevant code for this post is available in the aws-codeartifact-samples GitHub repo.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Architecture overview

Two core resource types make up CodeArtifact: domains and repositories. Domains provide an easy way manage multiple repositories within an organization. Repositories store packages and their assets. You can connect repositories to other CodeArtifact repositories, or popular public package repositories such as nuget.org, using upstream and external connections. For more information about these concepts, see AWS CodeArtifact Concepts.

The following diagram illustrates this architecture.

AWS CodeArtifact core concepts

Figure: AWS CodeArtifact core concepts

Creating CodeArtifact resources with AWS CloudFormation

The AWS CloudFormation template provided in this post provisions three CodeArtifact resources: a domain, a team repository, and a shared repository. The team repository is configured to use the shared repository as an upstream repository, and the shared repository has an external connection to nuget.org.

The following diagram illustrates this architecture.

Example AWS CodeArtifact architecture

Figure: Example AWS CodeArtifact architecture

The following CloudFormation template used in this walkthrough:

AWSTemplateFormatVersion: '2010-09-09'
Description: AWS CodeArtifact resources for dotnet

Resources:
  # Create Domain
  ExampleDomain:
    Type: AWS::CodeArtifact::Domain
    Properties:
      DomainName: example-domain
      PermissionsPolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              AWS: 
              - !Sub arn:aws:iam::${AWS::AccountId}:root
            Resource: "*"
            Action:
              - codeartifact:CreateRepository
              - codeartifact:DescribeDomain
              - codeartifact:GetAuthorizationToken
              - codeartifact:GetDomainPermissionsPolicy
              - codeartifact:ListRepositoriesInDomain

  # Create External Repository
  MyExternalRepository:
    Type: AWS::CodeArtifact::Repository
    Condition: ProvisionNugetTeamAndUpstream
    Properties:
      DomainName: !GetAtt ExampleDomain.Name
      RepositoryName: my-external-repository       
      ExternalConnections:
        - public:nuget-org
      PermissionsPolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              AWS: 
              - !Sub arn:aws:iam::${AWS::AccountId}:root
            Resource: "*"
            Action:
              - codeartifact:DescribePackageVersion
              - codeartifact:DescribeRepository
              - codeartifact:GetPackageVersionReadme
              - codeartifact:GetRepositoryEndpoint
              - codeartifact:ListPackageVersionAssets
              - codeartifact:ListPackageVersionDependencies
              - codeartifact:ListPackageVersions
              - codeartifact:ListPackages
              - codeartifact:PublishPackageVersion
              - codeartifact:PutPackageMetadata
              - codeartifact:ReadFromRepository

  # Create Repository
  MyTeamRepository:
    Type: AWS::CodeArtifact::Repository
    Properties:
      DomainName: !GetAtt ExampleDomain.Name
      RepositoryName: my-team-repository
      Upstreams:
        - !GetAtt MyExternalRepository.Name
      PermissionsPolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              AWS: 
              - !Sub arn:aws:iam::${AWS::AccountId}:root
            Resource: "*"
            Action:
              - codeartifact:DescribePackageVersion
              - codeartifact:DescribeRepository
              - codeartifact:GetPackageVersionReadme
              - codeartifact:GetRepositoryEndpoint
              - codeartifact:ListPackageVersionAssets
              - codeartifact:ListPackageVersionDependencies
              - codeartifact:ListPackageVersions
              - codeartifact:ListPackages
              - codeartifact:PublishPackageVersion
              - codeartifact:PutPackageMetadata
              - codeartifact:ReadFromRepository

Getting the CloudFormation template

To use the CloudFormation stack, we recommend you clone the following GitHub repo so you also have access to the example projects. See the following code:

git clone https://github.com/aws-samples/aws-codeartifact-samples.git
cd aws-codeartifact-samples/getting-started/dotnet/cloudformation/

Alternatively, you can copy the previous template into a file on your local filesystem named deploy.yml.

Provisioning the CloudFormation stack

Now that you have a local copy of the template, you need to provision the resources using a CloudFormation stack. You can deploy the stack using the AWS CLI or on the AWS CloudFormation console.

To use the AWS CLI, enter the following code:

aws cloudformation deploy \
--template-file deploy.yml \
--region <YOUR_PREFERRED_REGION> \
--stack-name CodeArtifact-GettingStarted-DotNet

To use the AWS CloudFormation console, complete the following steps:

  1. On the AWS CloudFormation console, choose Create stack.
  2. Choose With new resources (standard).
  3. Select Upload a template file.
  4. Choose Choose file.
  5. Name the stack CodeArtifact-GettingStarted-DotNet.
  6. Continue to choose Next until prompted to create the stack.

Configuring your local development experience

We use the CodeArtifact credential provider to connect the Visual Studio IDE to a CodeArtifact repository. You need to download and install the AWS Toolkit for Visual Studio to configure the credential provider. The toolkit is an extension for Microsoft Visual Studio on Microsoft Windows that makes it easy to develop, debug, and deploy .NET applications to AWS. The credential provider automates fetching and refreshing the authentication token required to pull packages from CodeArtifact. For more information about the authentication process, see AWS CodeArtifact authentication and tokens.

To connect to a repository, you complete the following steps:

  1. Configure an account profile in the AWS Toolkit.
  2. Copy the source endpoint from the AWS Explorer.
  3. Set the NuGet package source as the source endpoint.
  4. Add packages for your project via your CodeArtifact repository.

Configuring an account profile in the AWS Toolkit

Before you can use the Toolkit for Visual Studio, you must provide a set of valid AWS credentials. In this step, we set up a profile that has access to interact with CodeArtifact. For instructions, see Providing AWS Credentials.

Visual Studio Toolkit for AWS Account Profile Setup

Figure: Visual Studio Toolkit for AWS Account Profile Setup

Copying the NuGet source endpoint

After you set up your profile, you can see your provisioned repositories.

  1. In the AWS Explorer pane, navigate to the repository you want to connect to.
  2. Choose your repository (right-click).
  3. Choose Copy NuGet Source Endpoint.
AWS CodeArtifact repositories shown in the AWS Explorer

Figure: AWS CodeArtifact repositories shown in the AWS Explorer

 

You use the source endpoint later to configure your NuGet package sources.

Setting the package source using the source endpoint

Now that you have your source endpoint, you can set up the NuGet package source.

  1. In Visual Studio, under Tools, choose Options.
  2. Choose NuGet Package Manager.
  3. Under Options, choose the + icon to add a package source.
  4. For Name , enter codeartifact.
  5. For Source, enter the source endpoint you copied from the previous step.
Configuring Nuget package sources for AWS CodeArtifact

Figure: Configuring NuGet package sources for AWS CodeArtifact

 

Adding packages via your CodeArtifact repository

After the package source is configured against your team repository, you can pull packages via the upstream connection to the shared repository.

  1. Choose Manage NuGet Packages for your project.
    • You can now see packages from nuget.org.
  2. Choose any package to add it to your project.
Exploring packages while connected to a AWS CodeArtifact repository

Exploring packages while connected to a AWS CodeArtifact repository

Viewing packages stored in your CodeArtifact team repository

Packages are stored in a repository you pull from, or referenced via the upstream connection. Because we’re pulling packages from nuget.org through an external connection, you can see cached copies of those packages in your repository. To view the packages, navigate to your repository on the CodeArtifact console.

Packages stored in a AWS CodeArtifact repository

Packages stored in a AWS CodeArtifact repository

Cleaning Up

When you’re finished with this walkthrough, you may want to remove any provisioned resources. To remove the resources that the CloudFormation template created, navigate to the stack on the AWS CloudFormation console and choose Delete Stack. It may take a few minutes to delete all provisioned resources.

After the resources are deleted, there are no more cleanup steps.

Conclusion

We have shown you how to set up CodeArtifact in minutes and easily integrate it with NuGet. You can build and push your package faster, from hours or days to minutes. You can also integrate CodeArtifact directly in your Visual Studio environment with four simple steps. With CodeArtifact repositories, you inherit the durability and security posture from the underlying storage of CodeArtifact for your packages.

As of November 2020, CodeArtifact is available in the following AWS Regions:

  • US: US East (Ohio), US East (N. Virginia), US West (Oregon)
  • AP: Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo)
  • EU: Europe (Frankfurt), Europe (Ireland), Europe (Stockholm)

For an up-to-date list of Regions where CodeArtifact is available, see AWS CodeArtifact FAQ.

About the Authors

John Standish

John Standish is a Solutions Architect at AWS and spent over 13 years as a Microsoft .Net developer. Outside of work, he enjoys playing video games, cooking, and watching hockey.

Nuatu Tseggai

Nuatu Tseggai is a Cloud Infrastructure Architect at Amazon Web Services. He enjoys working with customers to design and build event-driven distributed systems that span multiple services.

Neha Gupta

Neha Gupta is a Solutions Architect at AWS and have 16 years of experience as a Database architect/ DBA. Apart from work, she’s outdoorsy and loves to dance.

Elijah Batkoski

Elijah is a Technical Writer for Amazon Web Services. Elijah has produced technical documentation and blogs for a variety of tools and services, primarily focused around DevOps.

Publishing private npm packages with AWS CodeArtifact

Post Syndicated from Ryan Sonshine original https://aws.amazon.com/blogs/devops/publishing-private-npm-packages-aws-codeartifact/

This post demonstrates how to create, publish, and download private npm packages using AWS CodeArtifact, allowing you to share code across your organization without exposing your packages to the public.

The ability to control CodeArtifact repository access using AWS Identity and Access Management (IAM) removes the need to manage additional credentials for a private npm repository when developers already have IAM roles configured.

You can use private npm packages for a variety of use cases, such as:

  • Reducing code duplication
  • Configuration such as code linting and styling
  • CLI tools for internal processes

This post shows how to easily create a sample project in which we publish an npm package and install the package from CodeArtifact. For more information about pipeline integration, see AWS CodeArtifact and your package management flow – Best Practices for Integration.

Solution overview

The following diagram illustrates this solution.

Diagram showing npm package publish and install with CodeArtifact

In this post, you create a private scoped npm package containing a sample function that can be used across your organization. You create a second project to download the npm package. You also learn how to structure your npm package to make logging in to CodeArtifact automatic when you want to build or publish the package.

The code covered in this post is available on GitHub:

Prerequisites

Before you begin, you need to complete the following:

  1. Create an AWS account.
  2. Install the AWS Command Line Interface (AWS CLI). CodeArtifact is supported in these CLI versions:
    1. 18.83 or later: install the AWS CLI version 1
    2. 0.54 or later: install the AWS CLI version 2
  3. Create a CodeArtifact repository.
  4. Add required IAM permissions for CodeArtifact.

Creating your npm package

You can create your npm package in three easy steps: set up the project, create your npm script for authenticating with CodeArtifact, and publish the package.

Setting up your project

Create a directory for your new npm package. We name this directory my-package because it serves as the name of the package. We use an npm scope for this package, where @myorg represents the scope all of our organization’s packages are published under. This helps us distinguish our internal private package from external packages. See the following code:

npm init – [email protected] -y

{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

The package.json file specifies that the main file of the package is called index.js. Next, we create that file and add our package function to it:

module.exports.helloWorld = function() {
  console.log('Hello world!');
}

Creating an npm script

To create your npm script, complete the following steps:

  1. On the CodeArtifact console, choose the repository you created as part of the prerequisites.

If you haven’t created a repository, create one before proceeding.

CodeArtifact repository details console

  1. Select your CodeArtifact repository and choose Details to view the additional details for your repository.

We use two items from this page:

  • Repository name (my-repo)
  • Domain (my-domain)
  1. Create a script named co:login in our package.json. The package.json contains the following code:
{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "co:login": "aws codeartifact login – tool npm – repository my-repo – domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Running this script updates your npm configuration to use your CodeArtifact repository and sets your authentication token, which expires after 12 hours.

  1. To test our new script, enter the following command:

npm run co:login

The following code is the output:

> aws codeartifact login – tool npm – repository my-repo – domain my-domain
Successfully configured npm to use AWS CodeArtifact repository https://my-domain-<ACCOUNT ID>.d.codeartifact.us-east-1.amazonaws.com/npm/my-repo/
Login expires in 12 hours at 2020-09-04 02:16:17-04:00
  1. Add a prepare script to our package.json to run our login command:
{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login – tool npm – repository my-repo – domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

This configures our project to automatically authenticate and generate an access token anytime npm install or npm publish run on the project.

If you see an error containing Invalid choice, valid choices are:, you need to update the AWS CLI according to the versions listed in the perquisites of this post.

Publishing your package

To publish our new package for the first time, run npm publish.

The following screenshot shows the output.

Terminal showing npm publish output

If we navigate to our CodeArtifact repository on the CodeArtifact console, we now see our new private npm package ready to be downloaded.

CodeArtifact console showing published npm package

Installing your private npm package

To install your private npm package, you first set up the project and add the CodeArtifact configs. After you install your package, it’s ready to use.

Setting up your project

Create a directory for a new application and name it my-app. This is a sample project to download our private npm package published in the previous step. You can apply this pattern to all repositories you intend on installing your organization’s npm packages in.

npm init -y

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "A sample application consuming a private scoped npm package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Adding CodeArtifact configs

Copy the npm scripts prepare and co:login created earlier to your new project:

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "A sample application consuming a private scoped npm package",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login – tool npm – repository my-repo – domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Installing your new private npm package

Enter the following command:

npm install @myorg/my-package

Your package.json should now list @myorg/my-package in your dependencies:

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login – tool npm – repository my-repo – domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "dependencies": {
    "@myorg/my-package": "^1.0.0"
  }
}

Using your new npm package

In our my-app application, create a file named index.js to run code from our package containing the following:

const { helloWorld } = require('@myorg/my-package');

helloWorld();

Run node index.js in your terminal to see the console print the message from our @myorg/my-package helloWorld function.

Cleaning Up

If you created a CodeArtifact repository for the purposes of this post, use one of the following methods to delete the repository:

Remove the changes made to your user profile’s npm configuration by running npm config delete registry, this will remove the CodeArtifact repository from being set as your default npm registry.

Conclusion

In this post, you successfully published a private scoped npm package stored in CodeArtifact, which you can reuse across multiple teams and projects within your organization. You can use npm scripts to streamline the authentication process and apply this pattern to save time.

About the Author

Ryan Sonshine

Ryan Sonshine is a Cloud Application Architect at Amazon Web Services. He works with customers to drive digital transformations while helping them architect, automate, and re-engineer solutions to fully leverage the AWS Cloud.

 

 

Automate thousands of mainframe tests on AWS with the Micro Focus Enterprise Suite

Post Syndicated from Kevin Yung original https://aws.amazon.com/blogs/devops/automate-mainframe-tests-on-aws-with-micro-focus/

Micro Focus – AWS Advanced Technology Parnter, they are a global infrastructure software company with 40 years of experience in delivering and supporting enterprise software.

We have seen mainframe customers often encounter scalability constraints, and they can’t support their development and test workforce to the scale required to support business requirements. These constraints can lead to delays, reduce product or feature releases, and make them unable to respond to market requirements. Furthermore, limits in capacity and scale often affect the quality of changes deployed, and are linked to unplanned or unexpected downtime in products or services.

The conventional approach to address these constraints is to scale up, meaning to increase MIPS/MSU capacity of the mainframe hardware available for development and testing. The cost of this approach, however, is excessively high, and to ensure time to market, you may reject this approach at the expense of quality and functionality. If you’re wrestling with these challenges, this post is written specifically for you.

To accompany this post, we developed an AWS prescriptive guidance (APG) pattern for developer instances and CI/CD pipelines: Mainframe Modernization: DevOps on AWS with Micro Focus.

Overview of solution

In the APG, we introduce DevOps automation and AWS CI/CD architecture to support mainframe application development. Our solution enables you to embrace both Test Driven Development (TDD) and Behavior Driven Development (BDD). Mainframe developers and testers can automate the tests in CI/CD pipelines so they’re repeatable and scalable. To speed up automated mainframe application tests, the solution uses team pipelines to run functional and integration tests frequently, and uses systems test pipelines to run comprehensive regression tests on demand. For more information about the pipelines, see Mainframe Modernization: DevOps on AWS with Micro Focus.

In this post, we focus on how to automate and scale mainframe application tests in AWS. We show you how to use AWS services and Micro Focus products to automate mainframe application tests with best practices. The solution can scale your mainframe application CI/CD pipeline to run thousands of tests in AWS within minutes, and you only pay a fraction of your current on-premises cost.

The following diagram illustrates the solution architecture.

Mainframe DevOps On AWS Architecture Overview, on the left is the conventional mainframe development environment, on the left is the CI/CD pipelines for mainframe tests in AWS

Figure: Mainframe DevOps On AWS Architecture Overview

 

Best practices

Before we get into the details of the solution, let’s recap the following mainframe application testing best practices:

  • Create a “test first” culture by writing tests for mainframe application code changes
  • Automate preparing and running tests in the CI/CD pipelines
  • Provide fast and quality feedback to project management throughout the SDLC
  • Assess and increase test coverage
  • Scale your test’s capacity and speed in line with your project schedule and requirements

Automated smoke test

In this architecture, mainframe developers can automate running functional smoke tests for new changes. This testing phase typically “smokes out” regression of core and critical business functions. You can achieve these tests using tools such as py3270 with x3270 or Robot Framework Mainframe 3270 Library.

The following code shows a feature test written in Behave and test step using py3270:

# home_loan_calculator.feature
Feature: calculate home loan monthly repayment
  the bankdemo application provides a monthly home loan repayment caculator 
  User need to input into transaction of home loan amount, interest rate and how many years of the loan maturity.
  User will be provided an output of home loan monthly repayment amount

  Scenario Outline: As a customer I want to calculate my monthly home loan repayment via a transaction
      Given home loan amount is <amount>, interest rate is <interest rate> and maturity date is <maturity date in months> months 
       When the transaction is submitted to the home loan calculator
       Then it shall show the monthly repayment of <monthly repayment>

    Examples: Homeloan
      | amount  | interest rate | maturity date in months | monthly repayment |
      | 1000000 | 3.29          | 300                     | $4894.31          |

 

# home_loan_calculator_steps.py
import sys, os
from py3270 import Emulator
from behave import *

@given("home loan amount is {amount}, interest rate is {rate} and maturity date is {maturity_date} months")
def step_impl(context, amount, rate, maturity_date):
    context.home_loan_amount = amount
    context.interest_rate = rate
    context.maturity_date_in_months = maturity_date

@when("the transaction is submitted to the home loan calculator")
def step_impl(context):
    # Setup connection parameters
    tn3270_host = os.getenv('TN3270_HOST')
    tn3270_port = os.getenv('TN3270_PORT')
	# Setup TN3270 connection
    em = Emulator(visible=False, timeout=120)
    em.connect(tn3270_host + ':' + tn3270_port)
    em.wait_for_field()
	# Screen login
    em.fill_field(10, 44, 'b0001', 5)
    em.send_enter()
	# Input screen fields for home loan calculator
    em.wait_for_field()
    em.fill_field(8, 46, context.home_loan_amount, 7)
    em.fill_field(10, 46, context.interest_rate, 7)
    em.fill_field(12, 46, context.maturity_date_in_months, 7)
    em.send_enter()
    em.wait_for_field()    

    # collect monthly replayment output from screen
    context.monthly_repayment = em.string_get(14, 46, 9)
    em.terminate()

@then("it shall show the monthly repayment of {amount}")
def step_impl(context, amount):
    print("expected amount is " + amount.strip() + ", and the result from screen is " + context.monthly_repayment.strip())
assert amount.strip() == context.monthly_repayment.strip()

To run this functional test in Micro Focus Enterprise Test Server (ETS), we use AWS CodeBuild.

We first need to build an Enterprise Test Server Docker image and push it to an Amazon Elastic Container Registry (Amazon ECR) registry. For instructions, see Using Enterprise Test Server with Docker.

Next, we create a CodeBuild project and uses the Enterprise Test Server Docker image in its configuration.

The following is an example AWS CloudFormation code snippet of a CodeBuild project that uses Windows Container and Enterprise Test Server:

  BddTestBankDemoStage:
    Type: AWS::CodeBuild::Project
    Properties:
      Name: !Sub '${AWS::StackName}BddTestBankDemo'
      LogsConfig:
        CloudWatchLogs:
          Status: ENABLED
      Artifacts:
        Type: CODEPIPELINE
        EncryptionDisabled: true
      Environment:
        ComputeType: BUILD_GENERAL1_LARGE
        Image: !Sub "${EnterpriseTestServerDockerImage}:latest"
        ImagePullCredentialsType: SERVICE_ROLE
        Type: WINDOWS_SERVER_2019_CONTAINER
      ServiceRole: !Ref CodeBuildRole
      Source:
        Type: CODEPIPELINE
        BuildSpec: bdd-test-bankdemo-buildspec.yaml

In the CodeBuild project, we need to create a buildspec to orchestrate the commands for preparing the Micro Focus Enterprise Test Server CICS environment and issue the test command. In the buildspec, we define the location for CodeBuild to look for test reports and upload them into the CodeBuild report group. The following buildspec code uses custom scripts DeployES.ps1 and StartAndWait.ps1 to start your CICS region, and runs Python Behave BDD tests:

version: 0.2
phases:
  build:
    commands:
      - |
        # Run Command to start Enterprise Test Server
        CD C:\
        .\DeployES.ps1
        .\StartAndWait.ps1

        py -m pip install behave

        Write-Host "waiting for server to be ready ..."
        do {
          Write-Host "..."
          sleep 3  
        } until(Test-NetConnection 127.0.0.1 -Port 9270 | ? { $_.TcpTestSucceeded } )

        CD C:\tests\features
        MD C:\tests\reports
        $Env:Path += ";c:\wc3270"

        $address=(Get-NetIPAddress -AddressFamily Ipv4 | where { $_.IPAddress -Match "172\.*" })
        $Env:TN3270_HOST = $address.IPAddress
        $Env:TN3270_PORT = "9270"
        
        behave.exe – color – junit – junit-directory C:\tests\reports
reports:
  bankdemo-bdd-test-report:
    files: 
      - '**/*'
    base-directory: "C:\\tests\\reports"

In the smoke test, the team may run both unit tests and functional tests. Ideally, these tests are better to run in parallel to speed up the pipeline. In AWS CodePipeline, we can set up a stage to run multiple steps in parallel. In our example, the pipeline runs both BDD tests and Robot Framework (RPA) tests.

The following CloudFormation code snippet runs two different tests. You use the same RunOrder value to indicate the actions run in parallel.

#...
        - Name: Tests
          Actions:
            - Name: RunBDDTest
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName: !Ref BddTestBankDemoStage
                PrimarySource: Config
              InputArtifacts:
                - Name: DemoBin
                - Name: Config
              RunOrder: 1
            - Name: RunRbTest
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName : !Ref RpaTestBankDemoStage
                PrimarySource: Config
              InputArtifacts:
                - Name: DemoBin
                - Name: Config
              RunOrder: 1  
#...

The following screenshot shows the example actions on the CodePipeline console that use the preceding code.

Screenshot of CodePipeine parallel execution tests using a same run order value

Figure – Screenshot of CodePipeine parallel execution tests

Both DBB and RPA tests produce jUnit format reports, which CodeBuild can ingest and show on the CodeBuild console. This is a great way for project management and business users to track the quality trend of an application. The following screenshot shows the CodeBuild report generated from the BDD tests.

CodeBuild report generated from the BDD tests showing 100% pass rate

Figure – CodeBuild report generated from the BDD tests

Automated regression tests

After you test the changes in the project team pipeline, you can automatically promote them to another stream with other team members’ changes for further testing. The scope of this testing stream is significantly more comprehensive, with a greater number and wider range of tests and higher volume of test data. The changes promoted to this stream by each team member are tested in this environment at the end of each day throughout the life of the project. This provides a high-quality delivery to production, with new code and changes to existing code tested together with hundreds or thousands of tests.

In enterprise architecture, it’s commonplace to see an application client consuming web services APIs exposed from a mainframe CICS application. One approach to do regression tests for mainframe applications is to use Micro Focus Verastream Host Integrator (VHI) to record and capture 3270 data stream processing and encapsulate these 3270 data streams as business functions, which in turn are packaged as web services. When these web services are available, they can be consumed by a test automation product, which in our environment is Micro Focus UFT One. This uses the Verastream server as the orchestration engine that translates the web service requests into 3270 data streams that integrate with the mainframe CICS application. The application is deployed in Micro Focus Enterprise Test Server.

The following diagram shows the end-to-end testing components.

Regression Test the end-to-end testing components using ECS Container for Exterprise Test Server, Verastream Host Integrator and UFT One Container, all integration points are using Elastic Network Load Balancer

Figure – Regression Test Infrastructure end-to-end Setup

To ensure we have the coverage required for large mainframe applications, we sometimes need to run thousands of tests against very large production volumes of test data. We want the tests to run faster and complete as soon as possible so we reduce AWS costs—we only pay for the infrastructure when consuming resources for the life of the test environment when provisioning and running tests.

Therefore, the design of the test environment needs to scale out. The batch feature in CodeBuild allows you to run tests in batches and in parallel rather than serially. Furthermore, our solution needs to minimize interference between batches, a failure in one batch doesn’t affect another running in parallel. The following diagram depicts the high-level design, with each batch build running in its own independent infrastructure. Each infrastructure is launched as part of test preparation, and then torn down in the post-test phase.

Regression Tests in CodeBuoild Project setup to use batch mode, three batches running in independent infrastructure with containers

Figure – Regression Tests in CodeBuoild Project setup to use batch mode

Building and deploying regression test components

Following the design of the parallel regression test environment, let’s look at how we build each component and how they are deployed. The followings steps to build our regression tests use a working backward approach, starting from deployment in the Enterprise Test Server:

  1. Create a batch build in CodeBuild.
  2. Deploy to Enterprise Test Server.
  3. Deploy the VHI model.
  4. Deploy UFT One Tests.
  5. Integrate UFT One into CodeBuild and CodePipeline and test the application.

Creating a batch build in CodeBuild

We update two components to enable a batch build. First, in the CodePipeline CloudFormation resource, we set BatchEnabled to be true for the test stage. The UFT One test preparation stage uses the CloudFormation template to create the test infrastructure. The following code is an example of the AWS CloudFormation snippet with batch build enabled:

#...
        - Name: SystemsTest
          Actions:
            - Name: Uft-Tests
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: 1
              Configuration:
                ProjectName : !Ref UftTestBankDemoProject
                PrimarySource: Config
                BatchEnabled: true
                CombineArtifacts: true
              InputArtifacts:
                - Name: Config
                - Name: DemoSrc
              OutputArtifacts:
                - Name: TestReport                
              RunOrder: 1
#...

Second, in the buildspec configuration of the test stage, we provide a build matrix setting. We use the custom environment variable TEST_BATCH_NUMBER to indicate which set of tests runs in each batch. See the following code:

version: 0.2
batch:
  fast-fail: true
  build-matrix:
    static:
      ignore-failure: false
    dynamic:
      env:
        variables:
          TEST_BATCH_NUMBER:
            - 1
            - 2
            - 3 
phases:
  pre_build:
commands:
#...

After setting up the batch build, CodeBuild creates multiple batches when the build starts. The following screenshot shows the batches on the CodeBuild console.

Regression tests Codebuild project ran in batch mode, three batches ran in prallel successfully

Figure – Regression tests Codebuild project ran in batch mode

Deploying to Enterprise Test Server

ETS is the transaction engine that processes all the online (and batch) requests that are initiated through external clients, such as 3270 terminals, web services, and websphere MQ. This engine provides support for various mainframe subsystems, such as CICS, IMS TM and JES, as well as code-level support for COBOL and PL/I. The following screenshot shows the Enterprise Test Server administration page.

Enterprise Server Administrator window showing configuration for CICS

Figure – Enterprise Server Administrator window

In this mainframe application testing use case, the regression tests are CICS transactions, initiated from 3270 requests (encapsulated in a web service). For more information about Enterprise Test Server, see the Enterprise Test Server and Micro Focus websites.

In the regression pipeline, after the stage of mainframe artifact compiling, we bake in the artifact into an ETS Docker container and upload the image to an Amazon ECR repository. This way, we have an immutable artifact for all the tests.

During each batch’s test preparation stage, a CloudFormation stack is deployed to create an Amazon ECS service on Windows EC2. The stack uses a Network Load Balancer as an integration point for the VHI’s integration.

The following code is an example of the CloudFormation snippet to create an Amazon ECS service using an Enterprise Test Server Docker image:

#...
  EtsService:
    DependsOn:
    - EtsTaskDefinition
    - EtsContainerSecurityGroup
    - EtsLoadBalancerListener
    Properties:
      Cluster: !Ref 'WindowsEcsClusterArn'
      DesiredCount: 1
      LoadBalancers:
        -
          ContainerName: !Sub "ets-${AWS::StackName}"
          ContainerPort: 9270
          TargetGroupArn: !Ref EtsPort9270TargetGroup
      HealthCheckGracePeriodSeconds: 300          
      TaskDefinition: !Ref 'EtsTaskDefinition'
    Type: "AWS::ECS::Service"

  EtsTaskDefinition:
    Properties:
      ContainerDefinitions:
        -
          Image: !Sub "${AWS::AccountId}.dkr.ecr.us-east-1.amazonaws.com/systems-test/ets:latest"
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'SystemsTestLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: ets
          Name: !Sub "ets-${AWS::StackName}"
          cpu: 4096
          memory: 8192
          PortMappings:
            -
              ContainerPort: 9270
          EntryPoint:
          - "powershell.exe"
          Command: 
          - '-F'
          - .\StartAndWait.ps1
          - 'bankdemo'
          - C:\bankdemo\
          - 'wait'
      Family: systems-test-ets
    Type: "AWS::ECS::TaskDefinition"
#...

Deploying the VHI model

In this architecture, the VHI is a bridge between mainframe and clients.

We use the VHI designer to capture the 3270 data streams and encapsulate the relevant data streams into a business function. We can then deliver this function as a web service that can be consumed by a test management solution, such as Micro Focus UFT One.

The following screenshot shows the setup for getCheckingDetails in VHI. Along with this procedure we can also see other procedures (eg calcCostLoan) defined that get generated as a web service. The properties associated with this procedure are available on this screen to allow for the defining of the mapping of the fields between the associated 3270 screens and exposed web service.

example of VHI designer to capture the 3270 data streams and encapsulate the relevant data streams into a business function getCheckingDetails

Figure – Setup for getCheckingDetails in VHI

The following screenshot shows the editor for this procedure and is initiated by the selection of the Procedure Editor. This screen presents the 3270 screens that are involved in the business function that will be generated as a web service.

VHI designer Procedure Editor shows the procedure

Figure – VHI designer Procedure Editor shows the procedure

After you define the required functional web services in VHI designer, the resultant model is saved and deployed into a VHI Docker image. We use this image and the associated model (from VHI designer) in the pipeline outlined in this post.

For more information about VHI, see the VHI website.

The pipeline contains two steps to deploy a VHI service. First, it installs and sets up the VHI models into a VHI Docker image, and it’s pushed into Amazon ECR. Second, a CloudFormation stack is deployed to create an Amazon ECS Fargate service, which uses the latest built Docker image. In AWS CloudFormation, the VHI ECS task definition defines an environment variable for the ETS Network Load Balancer’s DNS name. Therefore, the VHI can bootstrap and point to an ETS service. In the VHI stack, it uses a Network Load Balancer as an integration point for UFT One test integration.

The following code is an example of a ECS Task Definition CloudFormation snippet that creates a VHI service in Amazon ECS Fargate and integrates it with an ETS server:

#...
  VhiTaskDefinition:
    DependsOn:
    - EtsService
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: systems-test-vhi
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      ExecutionRoleArn: !Ref FargateEcsTaskExecutionRoleArn
      Cpu: 2048
      Memory: 4096
      ContainerDefinitions:
        - Cpu: 2048
          Name: !Sub "vhi-${AWS::StackName}"
          Memory: 4096
          Environment:
            - Name: esHostName 
              Value: !GetAtt EtsInternalLoadBalancer.DNSName
            - Name: esPort
              Value: 9270
          Image: !Ref "${AWS::AccountId}.dkr.ecr.us-east-1.amazonaws.com/systems-test/vhi:latest"
          PortMappings:
            - ContainerPort: 9680
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'SystemsTestLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: vhi

#...

Deploying UFT One Tests

UFT One is a test client that uses each of the web services created by the VHI designer to orchestrate running each of the associated business functions. Parameter data is supplied to each function, and validations are configured against the data returned. Multiple test suites are configured with different business functions with the associated data.

The following screenshot shows the test suite API_Bankdemo3, which is used in this regression test process.

the screenshot shows the test suite API_Bankdemo3 in UFT One test setup console, the API setup for getCheckingDetails

Figure – API_Bankdemo3 in UFT One Test Editor Console

For more information, see the UFT One website.

Integrating UFT One and testing the application

The last step is to integrate UFT One into CodeBuild and CodePipeline to test our mainframe application. First, we set up CodeBuild to use a UFT One container. The Docker image is available in Docker Hub. Then we author our buildspec. The buildspec has the following three phrases:

  • Setting up a UFT One license and deploying the test infrastructure
  • Starting the UFT One test suite to run regression tests
  • Tearing down the test infrastructure after tests are complete

The following code is an example of a buildspec snippet in the pre_build stage. The snippet shows the command to activate the UFT One license:

version: 0.2
batch: 
# . . .
phases:
  pre_build:
    commands:
      - |
        # Activate License
        $process = Start-Process -NoNewWindow -RedirectStandardOutput LicenseInstall.log -Wait -File 'C:\Program Files (x86)\Micro Focus\Unified Functional Testing\bin\HP.UFT.LicenseInstall.exe' -ArgumentList @('concurrent', 10600, 1, ${env:AUTOPASS_LICENSE_SERVER})        
        Get-Content -Path LicenseInstall.log
        if (Select-String -Path LicenseInstall.log -Pattern 'The installation was successful.' -Quiet) {
          Write-Host 'Licensed Successfully'
        } else {
          Write-Host 'License Failed'
          exit 1
        }
#...

The following command in the buildspec deploys the test infrastructure using the AWS Command Line Interface (AWS CLI)

aws cloudformation deploy – stack-name $stack_name `
--template-file cicd-pipeline/systems-test-pipeline/systems-test-service.yaml `
--parameter-overrides EcsCluster=$cluster_arn `
--capabilities CAPABILITY_IAM

Because ETS and VHI are both deployed with a load balancer, the build detects when the load balancers become healthy before starting the tests. The following AWS CLI commands detect the load balancer’s target group health:

$vhi_health_state = (aws elbv2 describe-target-health – target-group-arn $vhi_target_group_arn – query 'TargetHealthDescriptions[0].TargetHealth.State' – output text)
$ets_health_state = (aws elbv2 describe-target-health – target-group-arn $ets_target_group_arn – query 'TargetHealthDescriptions[0].TargetHealth.State' – output text)          

When the targets are healthy, the build moves into the build stage, and it uses the UFT One command line to start the tests. See the following code:

$process = Start-Process -Wait  -NoNewWindow -RedirectStandardOutput UFTBatchRunnerCMD.log `
-FilePath "C:\Program Files (x86)\Micro Focus\Unified Functional Testing\bin\UFTBatchRunnerCMD.exe" `
-ArgumentList @("-source", "${env:CODEBUILD_SRC_DIR_DemoSrc}\bankdemo\tests\API_Bankdemo\API_Bankdemo${env:TEST_BATCH_NUMBER}")

The next release of Micro Focus UFT One (November or December 2020) will provide an exit status to indicate a test’s success or failure.

When the tests are complete, the post_build stage tears down the test infrastructure. The following AWS CLI command tears down the CloudFormation stack:


#...
	post_build:
	  finally:
	  	- |
		  Write-Host "Clean up ETS, VHI Stack"
		  #...
		  aws cloudformation delete-stack – stack-name $stack_name
          aws cloudformation wait stack-delete-complete – stack-name $stack_name

At the end of the build, the buildspec is set up to upload UFT One test reports as an artifact into Amazon Simple Storage Service (Amazon S3). The following screenshot is the example of a test report in HTML format generated by UFT One in CodeBuild and CodePipeline.

UFT One HTML report shows regression testresult and test detals

Figure – UFT One HTML report

A new release of Micro Focus UFT One will provide test report formats supported by CodeBuild test report groups.

Conclusion

In this post, we introduced the solution to use Micro Focus Enterprise Suite, Micro Focus UFT One, Micro Focus VHI, AWS developer tools, and Amazon ECS containers to automate provisioning and running mainframe application tests in AWS at scale.

The on-demand model allows you to create the same test capacity infrastructure in minutes at a fraction of your current on-premises mainframe cost. It also significantly increases your testing and delivery capacity to increase quality and reduce production downtime.

A demo of the solution is available in AWS Partner Micro Focus website AWS Mainframe CI/CD Enterprise Solution. If you’re interested in modernizing your mainframe applications, please visit Micro Focus and contact AWS mainframe business development at [email protected].

References

Micro Focus

 

Peter Woods

Peter Woods

Peter has been with Micro Focus for almost 30 years, in a variety of roles and geographies including Technical Support, Channel Sales, Product Management, Strategic Alliances Management and Pre-Sales, primarily based in Europe but for the last four years in Australia and New Zealand. In his current role as Pre-Sales Manager, Peter is charged with driving and supporting sales activity within the Application Modernization and Connectivity team, based in Melbourne.

Leo Ervin

Leo Ervin

Leo Ervin is a Senior Solutions Architect working with Micro Focus Enterprise Solutions working with the ANZ team. After completing a Mathematics degree Leo started as a PL/1 programming with a local insurance company. The next step in Leo’s career involved consulting work in PL/1 and COBOL before he joined a start-up company as a technical director and partner. This company became the first distributor of Micro Focus software in the ANZ region in 1986. Leo’s involvement with Micro Focus technology has continued from this distributorship through to today with his current focus on cloud strategies for both DevOps and re-platform implementations.

Kevin Yung

Kevin Yung

Kevin is a Senior Modernization Architect in AWS Professional Services Global Mainframe and Midrange Modernization (GM3) team. Kevin currently is focusing on leading and delivering mainframe and midrange applications modernization for large enterprise customers.

Rapid and flexible Infrastructure as Code using the AWS CDK with AWS Solutions Constructs

Post Syndicated from Biff Gaut original https://aws.amazon.com/blogs/devops/rapid-flexible-infrastructure-with-solutions-constructs-cdk/

Introduction

As workloads move to the cloud and all infrastructure becomes virtual, infrastructure as code (IaC) becomes essential to leverage the agility of this new world. JSON and YAML are the powerful, declarative modeling languages of AWS CloudFormation, allowing you to define complex architectures using IaC. Just as higher level languages like BASIC and C abstracted away the details of assembly language and made developers more productive, the AWS Cloud Development Kit (AWS CDK) provides a programming model above the native template languages, a model that makes developers more productive when creating IaC. When you instantiate CDK objects in your Typescript (or Python, Java, etc.) application, those objects “compile” into a YAML template that the CDK deploys as an AWS CloudFormation stack.

AWS Solutions Constructs take this simplification a step further by providing a library of common service patterns built on top of the CDK. These multi-service patterns allow you to deploy multiple resources with a single object, resources that follow best practices by default – both independently and throughout their interaction.

Comparison of an Application stack with Assembly Language, 4th generation language and Object libraries such as Hibernate with an IaC stack of CloudFormation, AWS CDK and AWS Solutions Constructs

Application Development Stack vs. IaC Development Stack

Solution overview

To demonstrate how using Solutions Constructs can accelerate the development of IaC, in this post you will create an architecture that ingests and stores sensor readings using Amazon Kinesis Data Streams, AWS Lambda, and Amazon DynamoDB.

An architecture diagram showing sensor readings being sent to a Kinesis data stream. A Lambda function will receive the Kinesis records and store them in a DynamoDB table.

Prerequisite – Setting up the CDK environment

Tip – If you want to try this example but are concerned about the impact of changing the tools or versions on your workstation, try running it on AWS Cloud9. An AWS Cloud9 environment is launched with an AWS Identity and Access Management (AWS IAM) role and doesn’t require configuring with an access key. It uses the current region as the default for all CDK infrastructure.

To prepare your workstation for CDK development, confirm the following:

  • Node.js 10.3.0 or later is installed on your workstation (regardless of the language used to write CDK apps).
  • You have configured credentials for your environment. If you’re running locally you can do this by configuring the AWS Command Line Interface (AWS CLI).
  • TypeScript 2.7 or later is installed globally (npm -g install typescript)

Before creating your CDK project, install the CDK toolkit using the following command:

npm install -g aws-cdk

Create the CDK project

  1. First create a project folder called stream-ingestion with these two commands:

mkdir stream-ingestion
cd stream-ingestion

  1. Now create your CDK application using this command:

npx [email protected] init app – language=typescript

Tip – This example will be written in TypeScript – you can also specify other languages for your projects.

At this time, you must use the same version of the CDK and Solutions Constructs. We’re using version 1.68.0 of both based upon what’s available at publication time, but you can update this with a later version for your projects in the future.

Let’s explore the files in the application this command created:

  • bin/stream-ingestion.ts – This is the module that launches the application. The key line of code is:

new StreamIngestionStack(app, 'StreamIngestionStack');

This creates the actual stack, and it’s in StreamIngestionStack that you will write the CDK code that defines the resources in your architecture.

  • lib/stream-ingestion-stack.ts – This is the important class. In the constructor of StreamIngestionStack you will add the constructs that will create your architecture.

During the deployment process, the CDK uploads your Lambda function to an Amazon S3 bucket so it can be incorporated into your stack.

  1. To create that S3 bucket and any other infrastructure the CDK requires, run this command:

cdk bootstrap

The CDK uses the same supporting infrastructure for all projects within a region, so you only need to run the bootstrap command once in any region in which you create CDK stacks.

  1. To install the required Solutions Constructs packages for our architecture, run the these two commands from the command line:

npm install @aws-solutions-constructs/[email protected]
npm install @aws-solutions-constructs/aws-lambda[email protected]

Write the code

First you will write the Lambda function that processes the Kinesis data stream messages.

  1. Create a folder named lambda under stream-ingestion
  2. Within the lambda folder save a file called lambdaFunction.js with the following contents:
var AWS = require("aws-sdk");

// Create the DynamoDB service object
var ddb = new AWS.DynamoDB({ apiVersion: "2012-08-10" });

AWS.config.update({ region: process.env.AWS_REGION });

// We will configure our construct to 
// look for the .handler function
exports.handler = async function (event) {
  try {
    // Kinesis will deliver records 
    // in batches, so we need to iterate through
    // each record in the batch
    for (let record of event.Records) {
      const reading = parsePayload(record.kinesis.data);
      await writeRecord(record.kinesis.partitionKey, reading);
    };
  } catch (err) {
    console.log(`Write failed, err:\n${JSON.stringify(err, null, 2)}`);
    throw err;
  }
  return;
};

// Write the provided sensor reading data to the DynamoDB table
async function writeRecord(partitionKey, reading) {

  var params = {
    // Notice that Constructs automatically sets up 
    // an environment variable with the table name.
    TableName: process.env.DDB_TABLE_NAME,
    Item: {
      partitionKey: { S: partitionKey },  // sensor Id
      timestamp: { S: reading.timestamp },
      value: { N: reading.value}
    },
  };

  // Call DynamoDB to add the item to the table
  await ddb.putItem(params).promise();
}

// Decode the payload and extract the sensor data from it
function parsePayload(payload) {

  const decodedPayload = Buffer.from(payload, "base64").toString(
    "ascii"
  );

  // Our CLI command will send the records to Kinesis
  // with the values delimited by '|'
  const payloadValues = decodedPayload.split("|", 2)
  return {
    value: payloadValues[0],
    timestamp: payloadValues[1]
  }
}

We won’t spend a lot of time explaining this function – it’s pretty straightforward and heavily commented. It receives an event with one or more sensor readings, and for each reading it extracts the pertinent data and saves it to the DynamoDB table.

You will use two Solutions Constructs to create your infrastructure:

The aws-kinesisstreams-lambda construct deploys an Amazon Kinesis data stream and a Lambda function.

  • aws-kinesisstreams-lambda creates the Kinesis data stream and Lambda function that subscribes to that stream. To support this, it also creates other resources, such as IAM roles and encryption keys.

The aws-lambda-dynamodb construct deploys a Lambda function and a DynamoDB table.

  • aws-lambda-dynamodb creates an Amazon DynamoDB table and a Lambda function with permission to access the table.
  1. To deploy the first of these two constructs, replace the code in lib/stream-ingestion-stack.ts with the following code:
import * as cdk from "@aws-cdk/core";
import * as lambda from "@aws-cdk/aws-lambda";
import { KinesisStreamsToLambda } from "@aws-solutions-constructs/aws-kinesisstreams-lambda";

import * as ddb from "@aws-cdk/aws-dynamodb";
import { LambdaToDynamoDB } from "@aws-solutions-constructs/aws-lambda-dynamodb";

export class StreamIngestionStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const kinesisLambda = new KinesisStreamsToLambda(
      this,
      "KinesisLambdaConstruct",
      {
        lambdaFunctionProps: {
          // Where the CDK can find the lambda function code
          runtime: lambda.Runtime.NODEJS_10_X,
          handler: "lambdaFunction.handler",
          code: lambda.Code.fromAsset("lambda"),
        },
      }
    );

    // Next Solutions Construct goes here
  }
}

Let’s explore this code:

  • It instantiates a new KinesisStreamsToLambda object. This Solutions Construct will launch a new Kinesis data stream and a new Lambda function, setting up the Lambda function to receive all the messages in the Kinesis data stream. It will also deploy all the additional resources and policies required for the architecture to follow best practices.
  • The third argument to the constructor is the properties object, where you specify overrides of default values or any other information the construct needs. In this case you provide properties for the encapsulated Lambda function that informs the CDK where to find the code for the Lambda function that you stored as lambda/lambdaFunction.js earlier.
  1. Now you’ll add the second construct that connects the Lambda function to a new DynamoDB table. In the same lib/stream-ingestion-stack.ts file, replace the line // Next Solutions Construct goes here with the following code:
    // Define the primary key for the new DynamoDB table
    const primaryKeyAttribute: ddb.Attribute = {
      name: "partitionKey",
      type: ddb.AttributeType.STRING,
    };

    // Define the sort key for the new DynamoDB table
    const sortKeyAttribute: ddb.Attribute = {
      name: "timestamp",
      type: ddb.AttributeType.STRING,
    };

    const lambdaDynamoDB = new LambdaToDynamoDB(
      this,
      "LambdaDynamodbConstruct",
      {
        // Tell construct to use the Lambda function in
        // the first construct rather than deploy a new one
        existingLambdaObj: kinesisLambda.lambdaFunction,
        tablePermissions: "Write",
        dynamoTableProps: {
          partitionKey: primaryKeyAttribute,
          sortKey: sortKeyAttribute,
          billingMode: ddb.BillingMode.PROVISIONED,
          removalPolicy: cdk.RemovalPolicy.DESTROY
        },
      }
    );

    // Add autoscaling
    const readScaling = lambdaDynamoDB.dynamoTable.autoScaleReadCapacity({
      minCapacity: 1,
      maxCapacity: 50,
    });

    readScaling.scaleOnUtilization({
      targetUtilizationPercent: 50,
    });

Let’s explore this code:

  • The first two const objects define the names and types for the partition key and sort key of the DynamoDB table.
  • The LambdaToDynamoDB construct instantiated creates a new DynamoDB table and grants access to your Lambda function. The key to this call is the properties object you pass in the third argument.
    • The first property sent to LambdaToDynamoDB is existingLambdaObj – by setting this value to the Lambda function created by KinesisStreamsToLambda, you’re telling the construct to not create a new Lambda function, but to grant the Lambda function in the other Solutions Construct access to the DynamoDB table. This illustrates how you can chain many Solutions Constructs together to create complex architectures.
    • The second property sent to LambdaToDynamoDB tells the construct to limit the Lambda function’s access to the table to write only.
    • The third property sent to LambdaToDynamoDB is actually a full properties object defining the DynamoDB table. It provides the two attribute definitions you created earlier as well as the billing mode. It also sets the RemovalPolicy to DESTROY. This policy setting ensures that the table is deleted when you delete this stack – in most cases you should accept the default setting to protect your data.
  • The last two lines of code show how you can use statements to modify a construct outside the constructor. In this case we set up auto scaling on the new DynamoDB table, which we can access with the dynamoTable property on the construct we just instantiated.

That’s all it takes to create the all resources to deploy your architecture.

  1. Save all the files, then compile the Typescript into a CDK program using this command:

npm run build

  1. Finally, launch the stack using this command:

cdk deploy

(Enter “y” in response to Do you wish to deploy all these changes (y/n)?)

You will see some warnings where you override CDK default values. Because you are doing this intentionally you may disregard these, but it’s always a good idea to review these warnings when they occur.

Tip – Many mysterious CDK project errors stem from mismatched versions. If you get stuck on an inexplicable error, check package.json and confirm that all CDK and Solutions Constructs libraries have the same version number (with no leading caret ^). If necessary, correct the version numbers, delete the package-lock.json file and node_modules tree and run npm install. Think of this as the “turn it off and on again” first response to CDK errors.

You have now deployed the entire architecture for the demo – open the CloudFormation stack in the AWS Management Console and take a few minutes to explore all 12 resources that the program deployed (and the 380 line template generated to created them).

Feed the Stream

Now use the CLI to send some data through the stack.

Go to the Kinesis Data Streams console and copy the name of the data stream. Replace the stream name in the following command and run it from the command line.

aws kinesis put-records \
--stream-name StreamIngestionStack-KinesisLambdaConstructKinesisStreamXXXXXXXX-XXXXXXXXXXXX \
--records \
PartitionKey=1301,'Data=15.4|2020-08-22T01:16:36+00:00' \
PartitionKey=1503,'Data=39.1|2020-08-22T01:08:15+00:00'

Tip – If you are using the AWS CLI v2, the previous command will result in an “Invalid base64…” error because v2 expects the inputs to be Base64 encoded by default. Adding the argument --cli-binary-format raw-in-base64-out will fix the issue.

To confirm that the messages made it through the service, open the DynamoDB console – you should see the two records in the table.

Now that you’ve got it working, pause to think about what you just did. You deployed a system that can ingest and store sensor readings and scale to handle heavy loads. You did that by instantiating two objects – well under 60 lines of code. Experiment with changing some property values and deploying the changes by running npm run build and cdk deploy again.

Cleanup

To clean up the resources in the stack, run this command:

cdk destroy

Conclusion

Just as languages like BASIC and C allowed developers to write programs at a higher level of abstraction than assembly language, the AWS CDK and AWS Solutions Constructs allow us to create CloudFormation stacks in Typescript, Java, or Python instead JSON or YAML. Just as there will always be a place for assembly language, there will always be situations where we want to write CloudFormation templates manually – but for most situations, we can now use the AWS CDK and AWS Solutions Constructs to create complex and complete architectures in a fraction of the time with very little code.

AWS Solutions Constructs can currently be used in CDK applications written in Typescript, Javascript, Java and Python and will be available in C# applications soon.

About the Author

Biff Gaut has been shipping software since 1983, from small startups to large IT shops. Along the way he has contributed to 2 books, spoken at several conferences and written many blog posts. He is now a Principal Solutions Architect at AWS working on the AWS Solutions Constructs team, helping customers deploy better architectures more quickly.

Building, bundling, and deploying applications with the AWS CDK

Post Syndicated from Cory Hall original https://aws.amazon.com/blogs/devops/building-apps-with-aws-cdk/

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to model and provision your cloud application resources using familiar programming languages.

The post CDK Pipelines: Continuous delivery for AWS CDK applications showed how you can use CDK Pipelines to deploy a TypeScript-based AWS Lambda function. In that post, you learned how to add additional build commands to the pipeline to compile the TypeScript code to JavaScript, which is needed to create the Lambda deployment package.

In this post, we dive deeper into how you can perform these build commands as part of your AWS CDK build process by using the native AWS CDK bundling functionality.

If you’re working with Python, TypeScript, or JavaScript-based Lambda functions, you may already be familiar with the PythonFunction and NodejsFunction constructs, which use the bundling functionality. This post describes how to write your own bundling logic for instances where a higher-level construct either doesn’t already exist or doesn’t meet your needs. To illustrate this, I walk through two different examples: a Lambda function written in Golang and a static site created with Nuxt.js.

Concepts

A typical CI/CD pipeline contains steps to build and compile your source code, bundle it into a deployable artifact, push it to artifact stores, and deploy to an environment. In this post, we focus on the building, compiling, and bundling stages of the pipeline.

The AWS CDK has the concept of bundling source code into a deployable artifact. As of this writing, this works for two main types of assets: Docker images published to Amazon Elastic Container Registry (Amazon ECR) and files published to Amazon Simple Storage Service (Amazon S3). For files published to Amazon S3, this can be as simple as pointing to a local file or directory, which the AWS CDK uploads to Amazon S3 for you.

When you build an AWS CDK application (by running cdk synth), a cloud assembly is produced. The cloud assembly consists of a set of files and directories that define your deployable AWS CDK application. In the context of the AWS CDK, it might include the following:

  • AWS CloudFormation templates and instructions on where to deploy them
  • Dockerfiles, corresponding application source code, and information about where to build and push the images to
  • File assets and information about which S3 buckets to upload the files to

Use case

For this use case, our application consists of front-end and backend components. The example code is available in the GitHub repo. In the repository, I have split the example into two separate AWS CDK applications. The repo also contains the Golang Lambda example app and the Nuxt.js static site.

Golang Lambda function

To create a Golang-based Lambda function, you must first create a Lambda function deployment package. For Go, this consists of a .zip file containing a Go executable. Because we don’t commit the Go executable to our source repository, our CI/CD pipeline must perform the necessary steps to create it.

In the context of the AWS CDK, when we create a Lambda function, we have to tell the AWS CDK where to find the deployment package. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  runtime: lambda.Runtime.GO_1_X,
  handler: 'main',
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-go-executable')),
});

In the preceding code, the lambda.Code.fromAsset() method tells the AWS CDK where to find the Golang executable. When we run cdk synth, it stages this Go executable in the cloud assembly, which it zips and publishes to Amazon S3 as part of the PublishAssets stage.

If we’re running the AWS CDK as part of a CI/CD pipeline, this executable doesn’t exist yet, so how do we create it? One method is CDK bundling. The lambda.Code.fromAsset() method takes a second optional argument, AssetOptions, which contains the bundling parameter. With this bundling parameter, we can tell the AWS CDK to perform steps prior to staging the files in the cloud assembly.

Breaking down the BundlingOptions parameter further, we can perform the build inside a Docker container or locally.

Building inside a Docker container

For this to work, we need to make sure that we have Docker running on our build machine. In AWS CodeBuild, this means setting privileged: true. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-source-code'), {
    bundling: {
      image: lambda.Runtime.GO_1_X.bundlingDockerImage,
      command: [
        'bash', '-c', [
          'go test -v',
          'GOOS=linux go build -o /asset-output/main',
      ].join(' && '),
    },
  })
  ...
});

We specify two parameters:

  • image (required) – The Docker image to perform the build commands in
  • command (optional) – The command to run within the container

The AWS CDK mounts the folder specified as the first argument to fromAsset at /asset-input inside the container, and mounts the asset output directory (where the cloud assembly is staged) at /asset-output inside the container.

After we perform the build commands, we need to make sure we copy the Golang executable to the /asset-output location (or specify it as the build output location like in the preceding example).

This is the equivalent of running something like the following code:

docker run \
  --rm \
  -v folder-containing-source-code:/asset-input \
  -v cdk.out/asset.1234a4b5/:/asset-output \
  lambci/lambda:build-go1.x \
  bash -c 'GOOS=linux go build -o /asset-output/main'

Building locally

To build locally (not in a Docker container), we have to provide the local parameter. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-source-code'), {
    bundling: {
      image: lambda.Runtime.GO_1_X.bundlingDockerImage,
      command: [],
      local: {
        tryBundle(outputDir: string) {
          try {
            spawnSync('go version')
          } catch {
            return false
          }

          spawnSync(`GOOS=linux go build -o ${path.join(outputDir, 'main')}`);
          return true
        },
      },
    },
  })
  ...
});

The local parameter must implement the ILocalBundling interface. The tryBundle method is passed the asset output directory, and expects you to return a boolean (true or false). If you return true, the AWS CDK doesn’t try to perform Docker bundling. If you return false, it falls back to Docker bundling. Just like with Docker bundling, you must make sure that you place the Go executable in the outputDir.

Typically, you should perform some validation steps to ensure that you have the required dependencies installed locally to perform the build. This could be checking to see if you have go installed, or checking a specific version of go. This can be useful if you don’t have control over what type of build environment this might run in (for example, if you’re building a construct to be consumed by others).

If we run cdk synth on this, we see a new message telling us that the AWS CDK is bundling the asset. If we include additional commands like go test, we also see the output of those commands. This is especially useful if you wanted to fail a build if tests failed. See the following code:

$ cdk synth
Bundling asset GolangLambdaStack/MyGoFunction/Code/Stage...
✓  . (9ms)
✓  clients (5ms)

DONE 8 tests in 11.476s
✓  clients (5ms) (coverage: 84.6% of statements)
✓  . (6ms) (coverage: 78.4% of statements)

DONE 8 tests in 2.464s

Cloud Assembly

If we look at the cloud assembly that was generated (located at cdk.out), we see something like the following code:

$ cdk synth
Bundling asset GolangLambdaStack/MyGoFunction/Code/Stage...
✓  . (9ms)
✓  clients (5ms)

DONE 8 tests in 11.476s
✓  clients (5ms) (coverage: 84.6% of statements)
✓  . (6ms) (coverage: 78.4% of statements)

DONE 8 tests in 2.464s

It contains our GolangLambdaStack CloudFormation template that defines our Lambda function, as well as our Golang executable, bundled at asset.01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952/main.

Let’s look at how the AWS CDK uses this information. The GolangLambdaStack.assets.json file contains all the information necessary for the AWS CDK to know where and how to publish our assets (in this use case, our Golang Lambda executable). See the following code:

{
  "version": "5.0.0",
  "files": {
    "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952": {
      "source": {
        "path": "asset.01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952",
        "packaging": "zip"
      },
      "destinations": {
        "current_account-current_region": {
          "bucketName": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}",
          "objectKey": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip",
          "assumeRoleArn": "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/cdk-hnb659fds-file-publishing-role-${AWS::AccountId}-${AWS::Region}"
        }
      }
    }
  }
}

The file contains information about where to find the source files (source.path) and what type of packaging (source.packaging). It also tells the AWS CDK where to publish this .zip file (bucketName and objectKey) and what AWS Identity and Access Management (IAM) role to use (assumeRoleArn). In this use case, we only deploy to a single account and Region, but if you have multiple accounts or Regions, you see multiple destinations in this file.

The GolangLambdaStack.template.json file that defines our Lambda resource looks something like the following code:

{
  "Resources": {
    "MyGoFunction0AB33E85": {
      "Type": "AWS::Lambda::Function",
      "Properties": {
        "Code": {
          "S3Bucket": {
            "Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
          },
          "S3Key": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip"
        },
        "Handler": "main",
        ...
      }
    },
    ...
  }
}

The S3Bucket and S3Key match the bucketName and objectKey from the assets.json file. By default, the S3Key is generated by calculating a hash of the folder location that you pass to lambda.Code.fromAsset(), (for this post, folder-containing-source-code). This means that any time we update our source code, this calculated hash changes and a new Lambda function deployment is triggered.

Nuxt.js static site

In this section, I walk through building a static site using the Nuxt.js framework. You can apply the same logic to any static site framework that requires you to run a build step prior to deploying.

To deploy this static site, we use the BucketDeployment construct. This is a construct that allows you to populate an S3 bucket with the contents of .zip files from other S3 buckets or from a local disk.

Typically, we simply tell the BucketDeployment construct where to find the files that it needs to deploy to the S3 bucket. See the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-directory')),
  ],
  destinationBucket: myBucket
});

To deploy a static site built with a framework like Nuxt.js, we need to first run a build step to compile the site into something that can be deployed. For Nuxt.js, we run the following two commands:

  • yarn install – Installs all our dependencies
  • yarn generate – Builds the application and generates every route as an HTML file (used for static hosting)

This creates a dist directory, which you can deploy to Amazon S3.

Just like with the Golang Lambda example, we can perform these steps as part of the AWS CDK through either local or Docker bundling.

Building inside a Docker container

To build inside a Docker container, use the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-nuxtjs-project'), {
      bundling: {
        image: cdk.BundlingDockerImage.fromRegistry('node:lts'),
        command: [
          'bash', '-c', [
            'yarn install',
            'yarn generate',
            'cp -r /asset-input/dist/* /asset-output/',
          ].join(' && '),
        ],
      },
    }),
  ],
  ...
});

For this post, we build inside the publicly available node:lts image hosted on DockerHub. Inside the container, we run our build commands yarn install && yarn generate, and copy the generated dist directory to our output directory (the cloud assembly).

The parameters are the same as described in the Golang example we walked through earlier.

Building locally

To build locally, use the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-nuxtjs-project'), {
      bundling: {
        local: {
          tryBundle(outputDir: string) {
            try {
              spawnSync('yarn – version');
            } catch {
              return false
            }

            spawnSync('yarn install && yarn generate');

       fs.copySync(path.join(__dirname, ‘path-to-nuxtjs-project’, ‘dist’), outputDir);
            return true
          },
        },
        image: cdk.BundlingDockerImage.fromRegistry('node:lts'),
        command: [],
      },
    }),
  ],
  ...
});

Building locally works the same as the Golang example we walked through earlier, with one exception. We have one additional command to run that copies the generated dist folder to our output directory (cloud assembly).

Conclusion

This post showed how you can easily compile your backend and front-end applications using the AWS CDK. You can find the example code for this post in this GitHub repo. If you have any questions or comments, please comment on the GitHub repo. If you have any additional examples you want to add, we encourage you to create a Pull Request with your example!

Our code also contains examples of deploying the applications using CDK Pipelines, so if you’re interested in deploying the example yourself, check out the example repo.

 

About the author

Cory Hall

Cory is a Solutions Architect at Amazon Web Services with a passion for DevOps and is based in Charlotte, NC. Cory works with enterprise AWS customers to help them design, deploy, and scale applications to achieve their business goals.

Improving customer experience and reducing cost with CodeGuru Profiler

Post Syndicated from Rajesh original https://aws.amazon.com/blogs/devops/improving-customer-experience-and-reducing-cost-with-codeguru-profiler/

Amazon CodeGuru is a set of developer tools powered by machine learning that provides intelligent recommendations for improving code quality and identifying an application’s most expensive lines of code. Amazon CodeGuru Profiler allows you to profile your applications in a low impact, always on manner. It helps you improve your application’s performance, reduce cost and diagnose application issues through rich data visualization and proactive recommendations. CodeGuru Profiler has been a very successful and widely used service within Amazon, before it was offered as a public service. This post discusses a few ways in which internal Amazon teams have used and benefited from continuous profiling of their production applications. These uses cases can provide you with better insights on how to reap similar benefits for your applications using CodeGuru Profiler.

Inside Amazon, over 100,000 applications currently use CodeGuru Profiler across various environments globally. Over the last few years, CodeGuru Profiler has served as an indispensable tool for resolving issues in the following three categories:

  1. Performance bottlenecks, high latency and CPU utilization
  2. Cost and Infrastructure utilization
  3. Diagnosis of an application impacting event

API latency improvement for CodeGuru Profiler

What could be a better example than CodeGuru Profiler using itself to improve its own performance?
CodeGuru Profiler offers an API called BatchGetFrameMetricData, which allows you to fetch time series data for a set of frames or methods. We noticed that the 99th percentile latency (i.e. the slowest 1 percent of requests over a 5 minute period) metric for this API was approximately 5 seconds, higher than what we wanted for our customers.

Solution

CodeGuru Profiler is built on a micro service architecture, with the BatchGetFrameMetricData API implemented as set of AWS Lambda functions. It also leverages other AWS services such as Amazon DynamoDB to store data and Amazon CloudWatch to record performance metrics.

When investigating the latency issue, the team found that the 5-second latency spikes were happening during certain time intervals rather than continuously, which made it difficult to easily reproduce and determine the root cause of the issue in pre-production environment. The new Lambda profiling feature in CodeGuru came in handy, and so the team decided to enable profiling for all its Lambda functions. The low impact, continuous profiling capability of CodeGuru Profiler allowed the team to capture comprehensive profiles over a period of time, including when the latency spikes occurred, enabling the team to better understand the issue.
After capturing the profiles, the team went through the flame graphs of one of the Lambda functions (TimeSeriesMetricsGeneratorLambda) and learned that all of its CPU time was spent by the thread responsible to publish metrics to CloudWatch. The following screenshot shows a flame graph during one of these spikes.

TimeSeriesMetricsGeneratorLambda taking 100% CPU

As seen, there is a single call stack visible in the above flame graph, indicating all the CPU time was taken by the thread invoking above code. This helped the team immediately understand what was happening. Above code was related to the thread responsible for publishing the CloudWatch metrics. This thread was publishing these metrics in a synchronized block and as this thread took most of the CPU, it caused all other threads to wait and the latency to spike. To fix the issue, the team simply changed the TimeSeriesMetricsGeneratorLambda Lambda code, to publish CloudWatch metrics at the end of the function, which eliminated contention of this thread with all other threads.

Improvement

After the fix was deployed, the 5 second latency spikes were gone, as seen in the following graph.

Latency reduction for BatchGetFrameMetricData API

Cost, infrastructure and other improvements for CAGE

CAGE is an internal Amazon retail service that does royalty aggregation for digital products, such as Kindle eBooks, MP3 songs and albums and more. Like many other Amazon services, CAGE is also customer of CodeGuru Profiler.

CAGE was experiencing latency delays and growing infrastructure cost, and wanted to reduce them. Thanks to CodeGuru Profiler’s always-on profiling capabilities, rich visualization and recommendations, the team was able to successfully diagnose the issues, determine the root cause and fix them.

Solution

With the help of CodeGuru Profiler, the CAGE team identified several reasons for their degraded service performance and increased hardware utilization:

  • Excessive garbage collection activity – The team reviewed the service flame graphs (see the following screenshot) and identified that a lot of CPU time was spent getting garbage collection activities, 65.07% of the total service CPU.

Excessive garbage collection activities for CAGE

  • Metadata overhead – The team followed CodeGuru Profiler recommendation to identify that the service’s DynamoDB responses were consuming higher CPU, 2.86% of total CPU time. This was due to the response metadata caching in the AWS SDK v1.x HTTP client that was turned on by default. This was causing higher CPU overhead for high throughput applications such as CAGE. The following screenshot shows the relevant recommendation.

Response metadata recommendation for CAGE

  • Excessive logging – The team also identified excessive logging of its internal Amazon ION structures. The team initially added this logging for debugging purposes, but was unaware of its impact on the CPU cost, taking 2.28% of the overall service CPU. The following screenshot is part of the flame graph that helped identify the logging impact.

Excessive logging in CAGE service

The team used these flame graphs and CodeGuru Profiler provided recommendations to determine the root cause of the issues and systematically resolve them by doing the following:

  • Switching to a more efficient garbage collector
  • Removing excessive logging
  • Disabling metadata caching for Dynamo DB response

Improvements

After making these changes, the team was able to reduce their infrastructure cost by 25%, saving close to $2600 per month. Service latency also improved, with a reduction in service’s 99th percentile latency from approximately 2,500 milliseconds to 250 milliseconds in their North America (NA) region as shown below.

CAGE Latency Reduction

The team also realized a side benefit of having reduced log verbosity and saw a reduction in log size by 55%.

Event Analysis of increased checkout latency for Amazon.com

During one of the high traffic times, Amazon retail customers experienced higher than normal latency on their checkout page. The issue was due to one of the downstream service’s API experiencing high latency and CPU utilization. While the team quickly mitigated the issue by increasing the service’s servers, the always-on CodeGuru Profiler came to the rescue to help diagnose and fix the issue permanently.

Solution

The team analyzed the flame graphs from CodeGuru Profiler at the time of the event and noticed excessive CPU consumption (69.47%) when logging exceptions using Log4j2. See the following screenshot taken from an earlier version of CodeGuru Profiler user interface.

Excessive CPU consumption when logging exceptions using Log4j2

With CodeGuru Profiler flame graph and other metrics, the team quickly confirmed that the issue was due to excessive exception logging using Log4j2. This downstream service had recently upgraded to Log4j2 version 2.8, in which exception logging could be expensive, due to the way Log4j2 handles class-loading of certain stack frames. Log4j 2.x versions enabled class loading by default, which was disabled in 1.x versions, causing the increased latency and CPU utilization. The team was not able to detect this issue in pre-production environment, as the impact was observable only in high traffic situations.

Improvement

After they understood the issue, the team successfully rolled out the fix, removing the unnecessary exception trace logging to fix the issue. Such performance issues and many others are proactively offered as CodeGuru Profiler recommendations, to ensure you can proactively learn about such issues with your applications and quickly resolve them.

Conclusion

I hope this post provided a glimpse into various ways CodeGuru Profiler can benefit your business and applications. To get started using CodeGuru Profiler, see Setting up CodeGuru Profiler.
For more information about CodeGuru Profiler, see the following:

Investigating performance issues with Amazon CodeGuru Profiler

Optimizing application performance with Amazon CodeGuru Profiler

Find Your Application’s Most Expensive Lines of Code and Improve Code Quality with Amazon CodeGuru

 

Event-driven architecture for using third-party Git repositories as source for AWS CodePipeline

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/event-driven-architecture-for-using-third-party-git-repositories-as-source-for-aws-codepipeline/

In the post Using Custom Source Actions in AWS CodePipeline for Increased Visibility for Third-Party Source Control, we demonstrated using custom actions in AWS CodePipeline and a worker that periodically polls for jobs and processes further to get the artifact from the Git repository. In this post, we discuss using an event-driven architecture to trigger an AWS CodePipeline pipeline that has a third-party Git repository within the source stage that is part of a custom action.

Instead of using a worker to periodically poll for available jobs across all pipelines, we can define a custom source action on a particular pipeline to trigger an Amazon CloudWatch Events rule when the webhook on CodePipeline receives an event and puts it into an In Progress state. This works exactly like how CodePipeline works with natively supported Git repositories like AWS CodeCommit or GitHub as a source.

Solution architecture

The following diagram shows how you can use an event-driven architecture with a custom source stage that is associated with a third-party Git repository that isn’t supported by CodePipeline natively. For our use case, we use GitLab, but you can use any Git repository that supports Git webhooks.

3rdparty-gitblog-new.jpg

The architecture includes the following steps:

1. A user commits code to a Git repository.

2. The commit invokes a Git webhook.

3. This invokes a CodePipeline webhook.

4. The CodePipeline source stage is put into In Progress status.

5. The source stage action triggers a CloudWatch Events rule that indicates the stage started.

6. The CloudWatch event triggers an AWS Lambda function.

7. The function polls for the job details of the custom action.

8. The function also triggers AWS CodeBuild and passes all the job-related information.

9. CodeBuild gets the public SSH key stored in AWS Secrets Manager (or user name and password, if using HTTPS Git access).

10. CodeBuild clones the repository for a particular branch.

11. CodeBuild zips and uploads the archive to the CodePipeline artifact store Amazon Simple Storage Service (Amazon S3) bucket.

12. A Lambda function sends a success message to the CodePipeline source stage so it can proceed to the next stage.

Similarly, with the same setup, if you chose a release change for the pipeline that has custom source stage, a CloudWatch event is triggered, which triggers a Lambda function, and the same process repeats until it gets the artifact from the Git repository.

Solution overview

To set up the solution, you complete the following steps:

1. Create an SSH key pair for authenticating to the Git repository.

2. Publish the key to Secrets Manager.

3. Launch the AWS CloudFormation stack to provision resources.

4. Deploy a sample CodePipeline and test the custom action type.

5. Retrieve the webhook URL.

6. Create a webhook and add the webhook URL.

Creating an SSH key pair
You first create an SSH key pair to use for authenticating to the Git repository using ssh-keygen on your terminal. See the following code:

ssh-keygen -t rsa -b 4096 -C "[email protected]"

Follow the prompt from ssh-keygen and give a name for the key, for example codepipeline_git_rsa. This creates two new files in the current directory: codepipeline_git_rsa and codepipeline_git_rsa.pub.

Make a note of the contents of codepipeline_git_rsa.pub and add it as an authorized key for your Git user. For instructions, see Adding an SSH key to your GitLab account.

Publishing the key
Publish this key to Secrets Manager using the AWS Command Line Interface (AWS CLI):

export SecretsManagerArn=$(aws secretsmanager create-secret – name codepipeline_git \
--secret-string file://codepipeline_git_rsa – query ARN – output text)

Make a note of the ARN, which is required later.

Alternative, you can create a secret on the Secrets Manager console.

Make sure that the lines in the private key codepipeline_git are the same when the value to the secret is added.

Launching the CloudFormation stack

Clone the git repository aws-codepipeline-third-party-git-repositories that contains the AWS CloudFormation templates and AWS Lambda function code using the below command:

git clone https://github.com/aws-samples/aws-codepipeline-third-party-git-repositories.git .

Now you should have the below files in the cloned repository

cfn/
|--sample_pipeline_custom.yaml
`--third_party_git_custom_action.yaml
lambda/
`--lambda_function.py

Launch the CloudFormation stack using the template third_party_git_custom_action.yaml from the cfn directory. The main resources created by this stack are:

1. CodePipeline Custom Action Type. ResourceType: AWS::CodePipeline::CustomActionType
2. Lambda Function. ResourceType: AWS::Lambda::Function
3. CodeBuild Project. ResourceType: AWS::CodeBuild::Project
4. Lambda Execution Role. ResourceType: AWS::IAM::Role
5. CodeBuild Service Role. ResourceType: AWS::IAM::Role

These resources help uplift the logic for connecting to the Git repository, which for this post is GitLab.

Upload the Lambda function code to any S3 bucket in the same Region where the stack is being deployed. To create a new S3 bucket, use the following code (make sure to provide a unique name):

export ACCOUNT_ID=$(aws sts get-caller-identity – query Account – output text)
export S3_BUCKET_NAME=codepipeline-git-custom-action-${ACCOUNT_ID} 
aws s3 mb s3://${S3_BUCKET_NAME} – region us-east-1

Then zip the contents of the function and upload to the S3 bucket (substitute the appropriate bucket name):

export ZIP_FILE_NAME="codepipeline_git.zip"
zip -jr ${ZIP_FILE_NAME} ./lambda/lambda_function.py && \
aws s3 cp codepipeline_git.zip \
s3://${S3_BUCKET_NAME}/${ZIP_FILE_NAME}

If you don’t have a VPC and subnets that Lambda and CodeBuild require, you can create those by launching the following CloudFormation stack.

Run the following AWS CLI command to deploy the third-party Git source solution stack:

export vpcId="vpc-123456789"
export subnetId1="subnet-12345"
export subnetId2="subnet-54321"
export GIT_SOURCE_STACK_NAME="thirdparty-codepipeline-git-source"
aws cloudformation create-stack \
--stack-name ${GIT_SOURCE_STACK_NAME} \
--template-body file://$(pwd)/cfn/third_party_git_custom_action.yaml \
--parameters ParameterKey=SourceActionVersion,ParameterValue=1 \
ParameterKey=SourceActionProvider,ParameterValue=CustomSourceForGit \
ParameterKey=GitPullLambdaSubnet,ParameterValue=${subnetId1}\\,${subnetId2} \
ParameterKey=GitPullLambdaVpc,ParameterValue=${vpcId} \
ParameterKey=LambdaCodeS3Bucket,ParameterValue=${S3_BUCKET_NAME} \
ParameterKey=LambdaCodeS3Key,ParameterValue=${ZIP_FILE_NAME} \
--capabilities CAPABILITY_IAM

Alternatively, launch the stack by choosing Launch Stack:

cloudformation-launch-stack.png

For more information about the VPC requirements for Lambda and CodeBuild, see Internet and service access for VPC-connected functions and Use AWS CodeBuild with Amazon Virtual Private Cloud, respectively.

A custom source action type is now available on the account where you deployed the stack. You can check this on the CodePipeline console by attempting to create a new pipeline. You can see your source type listed under Source provider.

codepipeline-source-stage-dropdown.png

Testing the pipeline

We now deploy a sample pipeline and test the custom action type using the template sample_pipeline_custom.yaml from the cfn directory . You can run the following AWS CLI command to deploy the CloudFormation stack:

Note: Please provide the GitLab repository url to SSH_URL environment variable that you have access to or create a new GitLab project and repository. The example url "[email protected]:kirankumar15/test.git" is for illustration purposes only.

export SSH_URL="[email protected]:kirankumar15/test.git"
export SAMPLE_STACK_NAME="third-party-codepipeline-git-source-test"
aws cloudformation create-stack \
--stack-name ${SAMPLE_STACK_NAME}\
--template-body file://$(pwd)/cfn/sample_pipeline_custom.yaml \
--parameters ParameterKey=Branch,ParameterValue=master \
ParameterKey=GitUrl,ParameterValue=${SSH_URL} \
ParameterKey=SourceActionVersion,ParameterValue=1 \
ParameterKey=SourceActionProvider,ParameterValue=CustomSourceForGit \
ParameterKey=CodePipelineName,ParameterValue=sampleCodePipeline \
ParameterKey=SecretsManagerArnForSSHPrivateKey,ParameterValue=${SecretsManagerArn} \
ParameterKey=GitWebHookIpAddress,ParameterValue=34.74.90.64/28 \
--capabilities CAPABILITY_IAM

Alternatively, choose Launch stack:

cloudformation-launch-stack.png

Retrieving the webhook URL
When the stack creation is complete, retrieve the CodePipeline webhook URL from the stack outputs. Use the following AWS CLI command:

aws cloudformation describe-stacks \
--stack-name ${SAMPLE_STACK_NAME}\ 
--output text \
--query "Stacks[].Outputs[?OutputKey=='CodePipelineWebHookUrl'].OutputValue"

For more information about stack outputs, see Outputs.

Creating a webhook

You can use an existing GitLab repository or create a new GitLab repository and follow the below steps to add a webhook to it.
To create your webhook, complete the following steps:

1. Navigate to the Webhooks Settings section on the GitLab console for the repository that you want to have as a source for CodePipeline.

2. For URL, enter the CodePipeline webhook URL you retrieved in the previous step.

3. Select Push events and optionally enter a branch name.

4. Select Enable SSL verification.

5. Choose Add webhook.

gitlab-webhook.png

For more information about webhooks, see Webhooks.

We’re now ready to test the solution.

Testing the solution

To test the solution, we make changes to the branch that we passed as the branch parameter in the GitLab repository. This should trigger the pipeline. On the CodePipeline console, you can see the Git Commit ID on the source stage of the pipeline when it succeeds.

Note: Please provide the GitLab repository url that you have access to or create a new GitLab repository. and make sure that it has buildspec.yml in the contents to execute in AWS CodeBuild project in the Build stage. The example url "[email protected]:kirankumar15/test.git" is for illustration purposes only.

Enter the following code to clone your repository:

git clone [email protected]:kirankumar15/test.git .

Add a sample file to the repository with the name sample.txt, then commit and push it to the repository:

echo "adding a sample file" >> sample_text_file.txt
git add ./
git commit -m "added sample_test_file.txt to the repository"
git push -u origin master

The pipeline should show the status In Progress.

codepipeline_inprogress.png

After few minutes, it changes to Succeeded status and you see the Git Commit message on the source stage.

codepipeline_succeeded.png

You can also view the Git Commit message by choosing the execution ID of the pipeline, navigating to the timeline section, and choosing the source action. You should see the Commit message and Commit ID that correlates with the Git repository.

codepipeline-commit-msg.png

Troubleshooting

If the CodePipeline fails, check the Lambda function logs for the function with the name GitLab-CodePipeline-Source-${ACCOUNT_ID}. For instructions on checking logs, see Accessing Amazon CloudWatch logs for AWS Lambda.

If the Lambda logs has the CodeBuild build ID, then check the CodeBuild run logs for that build ID for errors. For instructions, see View detailed build information.

Cleaning up

Delete the CloudFormation stacks that you created. You can use the following AWS CLI commands:

aws cloudformation delete-stack – stack-name ${SAMPLE_STACK_NAME}

aws cloudformation delete-stack – stack-name ${GIT_SOURCE_STACK_NAME}

Alternatively, delete the stack on the AWS CloudFormation console.

Additionally, empty the S3 bucket and delete it. Locate the bucket in the ${SAMPLE_STACK_NAME} stack. Then use the following AWS CLI command:

aws s3 rb s3://${S3_BUCKET_NAME} – force

You can also delete and empty the bucket on the Amazon S3 console.

Conclusion

You can use the architecture in this post for any Git repository that supports webhooks. This solution also works if the repository is reachable only from on premises, and if the endpoints can be accessed from a VPC. This event-driven architecture works just like using any natively supported source for CodePipeline.

 

About the Author

kirankumar.jpegKirankumar Chandrashekar is a DevOps consultant at AWS Professional Services. He focuses on leading customers in architecting DevOps technologies. Kirankumar is passionate about Infrastructure as Code and DevOps. He enjoys music, as well as cooking and travelling.

 

Building a cross-account CI/CD pipeline for single-tenant SaaS solutions

Post Syndicated from Rafael Ramos original https://aws.amazon.com/blogs/devops/cross-account-ci-cd-pipeline-single-tenant-saas/

With the increasing demand from enterprise customers for a pay-as-you-go consumption model, more and more independent software vendors (ISVs) are shifting their business model towards software as a service (SaaS). Usually this kind of solution is architected using a multi-tenant model. It means that the infrastructure resources and applications are shared across multiple customers, with mechanisms in place to isolate their environments from each other. However, you may not want or can’t afford to share resources for security or compliance reasons, so you need a single-tenant environment.

To achieve this higher level of segregation across the tenants, it’s recommended to isolate the environments on the AWS account level. This strategy brings benefits, such as no network overlapping, no account limits sharing, and simplified usage tracking and billing, but it comes with challenges from an operational standpoint. Whereas multi-tenant solutions require management of a single shared production environment, single-tenant installations consist of dedicated production environments for each customer, without any shared resources across the tenants. When the number of tenants starts to grow, delivering new features at a rapid pace becomes harder to accomplish, because each new version needs to be manually deployed on each tenant environment.

This post describes how to automate this deployment process to deliver software quickly, securely, and less error-prone for each existing tenant. I demonstrate all the steps to build and configure a CI/CD pipeline using AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, and AWS CloudFormation. For each new version, the pipeline automatically deploys the same application version on the multiple tenant AWS accounts.

There are different caveats to build such cross-account CI/CD pipelines on AWS. Because of that, I use AWS Command Line Interface (AWS CLI) to manually go through the process and demonstrate in detail the various configuration aspects you have to handle, such as artifact encryption, cross-account permission granting, and pipeline actions.

Single-tenancy vs. multi-tenancy

One of the first aspects to consider when architecting your SaaS solution is its tenancy model. Each brings their own benefits and architectural challenges. On multi-tenant installations, each customer shares the same set of resources, including databases and applications. With this mode, you can use the servers’ capacity more efficiently, which generally leads to significant cost-saving opportunities. On the other hand, you have to carefully secure your solution to prevent a customer from accessing sensitive data from another. Designing for high availability becomes even more critical on multi-tenant workloads, because more customers are affected in the event of downtime.

Because the environments are by definition isolated from each other, single-tenant solutions are simpler to design when it comes to security, networking isolation, and data segregation. Likewise, you can customize the applications per customer, and have different versions for specific tenants. You also have the advantage of eliminating the noisy-neighbor effect, and can plan the infrastructure for the customer’s scalability requirements. As a drawback, in comparison with multi-tenant, the single-tenant model is operationally more complex because you have more servers and applications to maintain.

Which tenancy model to choose depends ultimately on whether you can meet your customer needs. They might have specific governance requirements, be bound to a certain industry regulation, or have compliance criteria that influences which model they can choose. For more information about modeling your SaaS solutions, see SaaS on AWS.

Solution overview

To demonstrate this solution, I consider a fictitious single-tenant ISV with two customers: Unicorn and Gnome. It uses one central account where the tools reside (Tooling account), and two other accounts, each representing a tenant (Unicorn and Gnome accounts). As depicted in the following architecture diagram, when a developer pushes code changes to CodeCommit, Amazon CloudWatch Events  triggers the CodePipeline CI/CD pipeline, which automatically deploys a new version on each tenant’s AWS account. It ensures that the fictitious ISV doesn’t have the operational burden to manually re-deploy the same version for each end-customers.

Architecture diagram of a CI/CD pipeline for single-tenant SaaS solutions

For illustration purposes, the sample application I use in this post is an AWS Lambda function that returns a simple JSON object when invoked.

Prerequisites

Before getting started, you must have the following prerequisites:

Setting up the Git repository

Your first step is to set up your Git repository.

  1. Create a CodeCommit repository to host the source code.

The CI/CD pipeline is automatically triggered every time new code is pushed to that repository.

  1. Make sure Git is configured to use IAM credentials to access AWS CodeCommit via HTTP by running the following command from the terminal:
git config – global credential.helper '!aws codecommit credential-helper [email protected]'
git config – global credential.UseHttpPath true
  1. Clone the newly created repository locally, and add two files in the root folder: index.js and application.yaml.

The first file is the JavaScript code for the Lambda function that represents the sample application. For our use case, the function returns a JSON response object with statusCode: 200 and the body Hello!\n. See the following code:

exports.handler = async (event) => {
    const response = {
        statusCode: 200,
        body: `Hello!\n`,
    };
    return response;
};

The second file is where the infrastructure is defined using AWS CloudFormation. The sample application consists of a Lambda function, and we use AWS Serverless Application Model (AWS SAM) to simplify the resources creation. See the following code:

AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: Sample Application.

Parameters:
    S3Bucket:
        Type: String
    S3Key:
        Type: String
    ApplicationName:
        Type: String
        
Resources:
    SampleApplication:
        Type: 'AWS::Serverless::Function'
        Properties:
            FunctionName: !Ref ApplicationName
            Handler: index.handler
            Runtime: nodejs12.x
            CodeUri:
                Bucket: !Ref S3Bucket
                Key: !Ref S3Key
            Description: Hello Lambda.
            MemorySize: 128
            Timeout: 10
  1. Push both files to the remote Git repository.

Creating the artifact store encryption key

By default, CodePipeline uses server-side encryption with an AWS Key Management Service (AWS KMS) managed customer master key (CMK) to encrypt the release artifacts. Because the Unicorn and Gnome accounts need to decrypt those release artifacts, you need to create a customer managed CMK in the Tooling account.

From the terminal, run the following command to create the artifact encryption key:

aws kms create-key – region <YOUR_REGION>

This command returns a JSON object with the key ARN property if run successfully. Its format is similar to arn:aws:kms:<YOUR_REGION>:<TOOLING_ACCOUNT_ID>:key/<KEY_ID>. Record this value to use in the following steps.

The encryption key has been created manually for educational purposes only, but it’s considered a best practice to have it as part of the Infrastructure as Code (IaC) bundle.

Creating an Amazon S3 artifact store and configuring a bucket policy

Our use case uses Amazon Simple Storage Service (Amazon S3) as artifact store. Every release artifact is encrypted and stored as an object in an S3 bucket that lives in the Tooling account.

To create and configure the artifact store, follow these steps in the Tooling account:

  1. From the terminal, create an S3 bucket and give it a unique name:
aws s3api create-bucket \
    – bucket <BUCKET_UNIQUE_NAME> \
    – region <YOUR_REGION> \
    – create-bucket-configuration LocationConstraint=<YOUR_REGION>
  1. Configure the bucket to use the customer managed CMK created in the previous step. This makes sure the objects stored in this bucket are encrypted using that key, replacing <KEY_ARN> with the ARN property from the previous step:
aws s3api put-bucket-encryption \
    – bucket <BUCKET_UNIQUE_NAME> \
    – server-side-encryption-configuration \
        '{
            "Rules": [
                {
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "aws:kms",
                        "KMSMasterKeyID": "<KEY_ARN>"
                    }
                }
            ]
        }'
  1. The artifacts stored in the bucket need to be accessed from the Unicorn and Gnome Configure the bucket policies to allow cross-account access:
aws s3api put-bucket-policy \
    – bucket <BUCKET_UNIQUE_NAME> \
    – policy \
        '{
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": [
                        "s3:GetBucket*",
                        "s3:List*"
                    ],
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": [
                            "arn:aws:iam::<UNICORN_ACCOUNT_ID>:root",
                            "arn:aws:iam::<GNOME_ACCOUNT_ID>:root"
                        ]
                    },
                    "Resource": [
                        "arn:aws:s3:::<BUCKET_UNIQUE_NAME>"
                    ]
                },
                {
                    "Action": [
                        "s3:GetObject*"
                    ],
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": [
                            "arn:aws:iam::<UNICORN_ACCOUNT_ID>:root",
                            "arn:aws:iam::<GNOME_ACCOUNT_ID>:root"
                        ]
                    },
                    "Resource": [
                        "arn:aws:s3:::<BUCKET_UNIQUE_NAME>/CrossAccountPipeline/*"
                    ]
                }
            ]
        }' 

This S3 bucket has been created manually for educational purposes only, but it’s considered a best practice to have it as part of the IaC bundle.

Creating a cross-account IAM role in each tenant account

Following the security best practice of granting least privilege, each action declared on CodePipeline should have its own IAM role.  For this use case, the pipeline needs to perform changes in the Unicorn and Gnome accounts from the Tooling account, so you need to create a cross-account IAM role in each tenant account.

Repeat the following steps for each tenant account to allow CodePipeline to assume role in those accounts:

  1. Configure a named CLI profile for the tenant account to allow running commands using the correct access keys.
  2. Create an IAM role that can be assumed from another AWS account, replacing <TENANT_PROFILE_NAME> with the profile name you defined in the previous step:
aws iam create-role \
    – role-name CodePipelineCrossAccountRole \
    – profile <TENANT_PROFILE_NAME> \
    – assume-role-policy-document \
        '{
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": "arn:aws:iam::<TOOLING_ACCOUNT_ID>:root"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }'
  1. Create an IAM policy that grants access to the artifact store S3 bucket and to the artifact encryption key:
aws iam create-policy \
    – policy-name CodePipelineCrossAccountArtifactReadPolicy \
    – profile <TENANT_PROFILE_NAME> \
    – policy-document \
        '{
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": [
                        "s3:GetBucket*",
                        "s3:ListBucket"
                    ],
                    "Resource": [
                        "arn:aws:s3:::<BUCKET_UNIQUE_NAME>"
                    ],
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "s3:GetObject*",
                        "s3:Put*"
                    ],
                    "Resource": [
                        "arn:aws:s3:::<BUCKET_UNIQUE_NAME>/CrossAccountPipeline/*"
                    ],
                    "Effect": "Allow"
                },
                {
                    "Action": [ 
                        "kms:DescribeKey", 
                        "kms:GenerateDataKey*", 
                        "kms:Encrypt", 
                        "kms:ReEncrypt*", 
                        "kms:Decrypt" 
                    ], 
                    "Resource": "<KEY_ARN>",
                    "Effect": "Allow"
                }
            ]
        }'
  1. Attach the CodePipelineCrossAccountArtifactReadPolicy IAM policy to the CodePipelineCrossAccountRole IAM role:
aws iam attach-role-policy \
    – profile <TENANT_PROFILE_NAME> \
    – role-name CodePipelineCrossAccountRole \
    – policy-arn arn:aws:iam::<TENANT_ACCOUNT_ID>:policy/CodePipelineCrossAccountArtifactReadPolicy
  1. Create an IAM policy that allows to pass the IAM role CloudFormationDeploymentRole to CloudFormation and to perform CloudFormation actions on the application Stack:
aws iam create-policy \
    – policy-name CodePipelineCrossAccountCfnPolicy \
    – profile <TENANT_PROFILE_NAME> \
    – policy-document \
        '{
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": [
                        "iam:PassRole"
                    ],
                    "Resource": "arn:aws:iam::<TENANT_ACCOUNT_ID>:role/CloudFormationDeploymentRole",
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "cloudformation:*"
                    ],
                    "Resource": "arn:aws:cloudformation:<YOUR_REGION>:<TENANT_ACCOUNT_ID>:stack/SampleApplication*/*",
                    "Effect": "Allow"
                }
            ]
        }'
  1. Attach the CodePipelineCrossAccountCfnPolicy IAM policy to the CodePipelineCrossAccountRole IAM role:
aws iam attach-role-policy \
    – profile <TENANT_PROFILE_NAME> \
    – role-name CodePipelineCrossAccountRole \
    – policy-arn arn:aws:iam::<TENANT_ACCOUNT_ID>:policy/CodePipelineCrossAccountCfnPolicy

Additional configuration is needed in the Tooling account to allow access, which you complete later on.

Creating a deployment IAM role in each tenant account

After CodePipeline assumes the CodePipelineCrossAccountRole IAM role into the tenant account, it triggers AWS CloudFormation to provision the infrastructure based on the template defined in the application.yaml file. For that, AWS CloudFormation needs to assume an IAM role that grants privileges to create resources into the tenant AWS account.

Repeat the following steps for each tenant account to allow AWS CloudFormation to create resources in those accounts:

  1. Create an IAM role that can be assumed by AWS CloudFormation:
aws iam create-role \
    – role-name CloudFormationDeploymentRole \
    – profile <TENANT_PROFILE_NAME> \
    – assume-role-policy-document \
        '{
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "cloudformation.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }'
  1. Create an IAM policy that grants permissions to create AWS resources:
aws iam create-policy \
    – policy-name CloudFormationDeploymentPolicy \
    – profile <TENANT_PROFILE_NAME> \
    – policy-document \
        '{
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "iam:PassRole",
                    "Resource": "arn:aws:iam::<TENANT_ACCOUNT_ID>:role/*",
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "iam:GetRole",
                        "iam:CreateRole",
                        "iam:DeleteRole",
                        "iam:AttachRolePolicy",
                        "iam:DetachRolePolicy"
                    ],
                    "Resource": "arn:aws:iam::<TENANT_ACCOUNT_ID>:role/*",
                    "Effect": "Allow"
                },
                {
                    "Action": "lambda:*",
                    "Resource": "*",
                    "Effect": "Allow"
                },
                {
                    "Action": "codedeploy:*",
                    "Resource": "*",
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "s3:GetObject*",
                        "s3:GetBucket*",
                        "s3:List*"
                    ],
                    "Resource": [
                        "arn:aws:s3:::<BUCKET_UNIQUE_NAME>",
                        "arn:aws:s3:::<BUCKET_UNIQUE_NAME>/*"
                    ],
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "kms:Decrypt",
                        "kms:DescribeKey"
                    ],
                    "Resource": "<KEY_ARN>",
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "cloudformation:CreateStack",
                        "cloudformation:DescribeStack*",
                        "cloudformation:GetStackPolicy",
                        "cloudformation:GetTemplate*",
                        "cloudformation:SetStackPolicy",
                        "cloudformation:UpdateStack",
                        "cloudformation:ValidateTemplate"
                    ],
                    "Resource": "arn:aws:cloudformation:<YOUR_REGION>:<TENANT_ACCOUNT_ID>:stack/SampleApplication*/*",
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "cloudformation:CreateChangeSet"
                    ],
                    "Resource": "arn:aws:cloudformation:<YOUR_REGION>:aws:transform/Serverless-2016-10-31",
                    "Effect": "Allow"
                }
            ]
        }'

The granted permissions in this IAM policy depend on the resources your application needs to be provisioned. Because the application in our use case consists of a simple Lambda function, the IAM policy only needs permissions over Lambda. The other permissions declared are to access and decrypt the Lambda code from the artifact store, use AWS CodeDeploy to deploy the function, and create and attach the Lambda execution role.

  1. Attach the IAM policy to the IAM role:
aws iam attach-role-policy \
    – profile <TENANT_PROFILE_NAME> \
    – role-name CloudFormationDeploymentRole \
    – policy-arn arn:aws:iam::<TENANT_ACCOUNT_ID>:policy/CloudFormationDeploymentPolicy

Configuring an artifact store encryption key

Even though the IAM roles created in the tenant accounts declare permissions to use the CMK encryption key, that’s not enough to have access to the key. To access the key, you must update the CMK key policy.

From the terminal, run the following command to attach the new policy:

aws kms put-key-policy \
    – key-id <KEY_ARN> \
    – policy-name default \
    – region <YOUR_REGION> \
    – policy \
        '{
             "Id": "TenantAccountAccess",
             "Version": "2012-10-17",
             "Statement": [
                {
                    "Sid": "Enable IAM User Permissions",
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": "arn:aws:iam::<TOOLING_ACCOUNT_ID>:root"
                    },
                    "Action": "kms:*",
                    "Resource": "*"
                },
                {
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": [
                            "arn:aws:iam::<GNOME_ACCOUNT_ID>:role/CloudFormationDeploymentRole",
                            "arn:aws:iam::<GNOME_ACCOUNT_ID>:role/CodePipelineCrossAccountRole",
                            "arn:aws:iam::<UNICORN_ACCOUNT_ID>:role/CloudFormationDeploymentRole",
                            "arn:aws:iam::<UNICORN_ACCOUNT_ID>:role/CodePipelineCrossAccountRole"
                        ]
                    },
                    "Action": [
                        "kms:Decrypt",
                        "kms:DescribeKey"
                    ],
                    "Resource": "*"
                }
             ]
         }'

Provisioning the CI/CD pipeline

Each CodePipeline workflow consists of two or more stages, which are composed by a series of parallel or serial actions. For our use case, the pipeline is made up of four stages:

  • Source – Declares CodeCommit as the source control for the application code.
  • Build – Using CodeBuild, it installs the dependencies and builds deployable artifacts. In this use case, the sample application is too simple and this stage is used for illustration purposes.
  • Deploy_Dev – Deploys the sample application on a sandbox environment. At this point, the deployable artifacts generated at the Build stage are used to create a CloudFormation stack and deploy the Lambda function.
  • Deploy_Prod – Similar to Deploy_Dev, at this stage the sample application is deployed on the tenant production environments. For that, it contains two actions (one per tenant) that are run in parallel. CodePipeline uses CodePipelineCrossAccountRole to assume a role on the tenant account, and from there, CloudFormationDeploymentRole is used to effectively deploy the application.

To provision your resources, complete the following steps from the terminal:

  1. Download the CloudFormation pipeline template:
curl -LO https://cross-account-ci-cd-pipeline-single-tenant-saas.s3.amazonaws.com/pipeline.yaml
  1. Deploy the CloudFormation stack using the pipeline template:
aws cloudformation deploy \
    – template-file pipeline.yaml \
    – region <YOUR_REGION> \
    – stack-name <YOUR_PIPELINE_STACK_NAME> \
    – capabilities CAPABILITY_IAM \
    – parameter-overrides \
        ArtifactBucketName=<BUCKET_UNIQUE_NAME> \
        ArtifactEncryptionKeyArn=<KMS_KEY_ARN> \
        UnicornAccountId=<UNICORN_TENANT_ACCOUNT_ID> \
        GnomeAccountId=<GNOME_TENANT_ACCOUNT_ID> \
        SampleApplicationRepositoryName=<YOUR_CODECOMMIT_REPOSITORY_NAME> \
        RepositoryBranch=<YOUR_CODECOMMIT_MAIN_BRANCH>

This is the list of the required parameters to deploy the template:

    • ArtifactBucketName – The name of the S3 bucket where the deployment artifacts are to be stored.
    • ArtifactEncryptionKeyArn – The ARN of the customer managed CMK to be used as artifact encryption key.
    • UnicornAccountId – The AWS account ID for the first tenant (Unicorn) where the application is to be deployed.
    • GnomeAccountId – The AWS account ID for the second tenant (Gnome) where the application is to be deployed.
    • SampleApplicationRepositoryName – The name of the CodeCommit repository where source changes are detected.
    • RepositoryBranch – The name of the CodeCommit branch where source changes are detected. The default value is master in case no value is provided.
  1. Wait for AWS CloudFormation to create the resources.

When stack creation is complete, the pipeline starts automatically.

For each existing tenant, an action is declared within the Deploy_Prod stage. The following code is a snippet of how these actions are configured to deploy the application on a different account:

RoleArn: !Sub arn:aws:iam::${UnicornAccountId}:role/CodePipelineCrossAccountRole
Configuration:
    ActionMode: CREATE_UPDATE
    Capabilities: CAPABILITY_IAM,CAPABILITY_AUTO_EXPAND
    StackName: !Sub SampleApplication-unicorn-stack-${AWS::Region}
    RoleArn: !Sub arn:aws:iam::${UnicornAccountId}:role/CloudFormationDeploymentRole
    TemplatePath: CodeCommitSource::application.yaml
    ParameterOverrides: !Sub | 
        { 
            "ApplicationName": "SampleApplication-Unicorn",
            "S3Bucket": { "Fn::GetArtifactAtt" : [ "ApplicationBuildOutput", "BucketName" ] },
            "S3Key": { "Fn::GetArtifactAtt" : [ "ApplicationBuildOutput", "ObjectKey" ] }
        }

The code declares two IAM roles. The first one is the IAM role assumed by the CodePipeline action to access the tenant AWS account, whereas the second is the IAM role used by AWS CloudFormation to create AWS resources in the tenant AWS account. The ParameterOverrides configuration declares where the release artifact is located. The S3 bucket and key are in the Tooling account and encrypted using the customer managed CMK. That’s why it was necessary to grant access from external accounts using a bucket and KMS policies.

Besides the CI/CD pipeline itself, this CloudFormation template declares IAM roles that are used by the pipeline and its actions. The main IAM role is named CrossAccountPipelineRole, which is used by the CodePipeline service. It contains permissions to assume the action roles. See the following code:

{
    "Action": "sts:AssumeRole",
    "Effect": "Allow",
    "Resource": [
        "arn:aws:iam::<TOOLING_ACCOUNT_ID>:role/<PipelineSourceActionRole>",
        "arn:aws:iam::<TOOLING_ACCOUNT_ID>:role/<PipelineApplicationBuildActionRole>",
        "arn:aws:iam::<TOOLING_ACCOUNT_ID>:role/<PipelineDeployDevActionRole>",
        "arn:aws:iam::<UNICORN_ACCOUNT_ID>:role/CodePipelineCrossAccountRole",
        "arn:aws:iam::<GNOME_ACCOUNT_ID>:role/CodePipelineCrossAccountRole"
    ]
}

When you have more tenant accounts, you must add additional roles to the list.

After CodePipeline runs successfully, test the sample application by invoking the Lambda function on each tenant account:

aws lambda invoke – function-name SampleApplication – profile <TENANT_PROFILE_NAME> – region <YOUR_REGION> out

The output should be:

{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}

Cleaning up

Follow these steps to delete the components and avoid future incurring charges:

  1. Delete the production application stack from each tenant account:
aws cloudformation delete-stack – profile <TENANT_PROFILE_NAME> – region <YOUR_REGION> – stack-name SampleApplication-<TENANT_NAME>-stack-<YOUR_REGION>
  1. Delete the dev application stack from the Tooling account:
aws cloudformation delete-stack – region <YOUR_REGION> – stack-name SampleApplication-dev-stack-<YOUR_REGION>
  1. Delete the pipeline stack from the Tooling account:
aws cloudformation delete-stack – region <YOUR_REGION> – stack-name <YOUR_PIPELINE_STACK_NAME>
  1. Delete the customer managed CMK from the Tooling account:
aws kms schedule-key-deletion – region <YOUR_REGION> – key-id <KEY_ARN>
  1. Delete the S3 bucket from the Tooling account:
aws s3 rb s3://<BUCKET_UNIQUE_NAME> – force
  1. Optionally, delete the IAM roles and policies you created in the tenant accounts

Conclusion

This post demonstrated what it takes to build a CI/CD pipeline for single-tenant SaaS solutions isolated on the AWS account level. It covered how to grant cross-account access to artifact stores on Amazon S3 and artifact encryption keys on AWS KMS using policies and IAM roles. This approach is less error-prone because it avoids human errors when manually deploying the exact same application for multiple tenants.

For this use case, we performed most of the steps manually to better illustrate all the steps and components involved. For even more automation, consider using the AWS Cloud Development Kit (AWS CDK) and its pipeline construct to create your CI/CD pipeline and have everything as code. Moreover, for production scenarios, consider having integration tests as part of the pipeline.

Rafael Ramos

Rafael Ramos

Rafael is a Solutions Architect at AWS, where he helps ISVs on their journey to the cloud. He spent over 13 years working as a software developer, and is passionate about DevOps and serverless. Outside of work, he enjoys playing tabletop RPG, cooking and running marathons.

Integrating AWS CloudFormation Guard into CI/CD pipelines

Post Syndicated from Sergey Voinich original https://aws.amazon.com/blogs/devops/integrating-aws-cloudformation-guard/

In this post, we discuss and build a managed continuous integration and continuous deployment (CI/CD) pipeline that uses AWS CloudFormation Guard to automate and simplify pre-deployment compliance checks of your AWS CloudFormation templates. This enables your teams to define a single source of truth for what constitutes valid infrastructure definitions, to be compliant with your company guidelines and streamline AWS resources’ deployment lifecycle.

We use the following AWS services and open-source tools to set up the pipeline:

Solution overview

The CI/CD workflow includes the following steps:

  1. A code change is committed and pushed to the CodeCommit repository.
  2. CodePipeline automatically triggers a CodeBuild job.
  3. CodeBuild spins up a compute environment and runs the phases specified in the buildspec.yml file:
  4. Clone the code from the CodeCommit repository (CloudFormation template, rule set for CloudFormation Guard, buildspec.yml file).
  5. Clone the code from the CloudFormation Guard repository on GitHub.
  6. Provision the build environment with necessary components (rust, cargo, git, build-essential).
  7. Download CloudFormation Guard release from GitHub.
  8. Run a validation check of the CloudFormation template.
  9. If the validation is successful, pass the control over to CloudFormation and deploy the stack. If the validation fails, stop the build job and print a summary to the build job log.

The following diagram illustrates this workflow.

Architecture Diagram

Architecture Diagram of CI/CD Pipeline with CloudFormation Guard

Prerequisites

For this walkthrough, complete the following prerequisites:

Creating your CodeCommit repository

Create your CodeCommit repository by running a create-repository command in the AWS CLI:

aws codecommit create-repository – repository-name cfn-guard-demo – repository-description "CloudFormation Guard Demo"

The following screenshot indicates that the repository has been created.

CodeCommit Repository

CodeCommit Repository has been created

Populating the CodeCommit repository

Populate your repository with the following artifacts:

  1. A buildspec.yml file. Modify the following code as per your requirements:
version: 0.2
env:
  variables:
    # Definining CloudFormation Teamplate and Ruleset as variables - part of the code repo
    CF_TEMPLATE: "cfn_template_file_example.yaml"
    CF_ORG_RULESET:  "cfn_guard_ruleset_example"
phases:
  install:
    commands:
      - apt-get update
      - apt-get install build-essential -y
      - apt-get install cargo -y
      - apt-get install git -y
  pre_build:
    commands:
      - echo "Setting up the environment for AWS CloudFormation Guard"
      - echo "More info https://github.com/aws-cloudformation/cloudformation-guard"
      - echo "Install Rust"
      - curl https://sh.rustup.rs -sSf | sh -s – -y
  build:
    commands:
       - echo "Pull GA release from github"
       - echo "More info https://github.com/aws-cloudformation/cloudformation-guard/releases"
       - wget https://github.com/aws-cloudformation/cloudformation-guard/releases/download/1.0.0/cfn-guard-linux-1.0.0.tar.gz
       - echo "Extract cfn-guard"
       - tar xvf cfn-guard-linux-1.0.0.tar.gz .
  post_build:
    commands:
       - echo "Validate CloudFormation template with cfn-guard tool"
       - echo "More information https://github.com/aws-cloudformation/cloudformation-guard/blob/master/cfn-guard/README.md"
       - cfn-guard-linux/cfn-guard check – rule_set $CF_ORG_RULESET – template $CF_TEMPLATE – strict-checks
artifacts:
  files:
    - cfn_template_file_example.yaml
  name: guard_templates
  1. An example of a rule set file (cfn_guard_ruleset_example) for CloudFormation Guard. Modify the following code as per your requirements:
#CFN Guard rules set example

#List of multiple references
let allowed_azs = [us-east-1a,us-east-1b]
let allowed_ec2_instance_types = [t2.micro,t3.nano,t3.micro]
let allowed_security_groups = [sg-08bbcxxc21e9ba8e6,sg-07b8bx98795dcab2]

#EC2 Policies
AWS::EC2::Instance AvailabilityZone IN %allowed_azs
AWS::EC2::Instance ImageId == ami-0323c3dd2da7fb37d
AWS::EC2::Instance InstanceType IN %allowed_ec2_instance_types
AWS::EC2::Instance SecurityGroupIds == ["sg-07b8xxxsscab2"]
AWS::EC2::Instance SubnetId == subnet-0407a7casssse558

#EBS Policies
AWS::EC2::Volume AvailabilityZone == us-east-1a
AWS::EC2::Volume Encrypted == true
AWS::EC2::Volume Size == 50 |OR| AWS::EC2::Volume Size == 100
AWS::EC2::Volume VolumeType == gp2
  1. An example of a CloudFormation template file (.yaml). Modify the following code as per your requirements:
AWSTemplateFormatVersion: "2010-09-09"
Description: "EC2 instance with encrypted EBS volume for AWS CloudFormation Guard Testing"

Resources:

 EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: 'ami-0323c3dd2da7fb37d'
      AvailabilityZone: 'us-east-1a'
      KeyName: "your-ssh-key"
      InstanceType: 't3.micro'
      SubnetId: 'subnet-0407a7xx68410e558'
      SecurityGroupIds:
        - 'sg-07b8b339xx95dcab2'
      Volumes:
         - 
          Device: '/dev/sdf'
          VolumeId: !Ref EBSVolume
      Tags:
       - Key: Name
         Value: cfn-guard-ec2

 EBSVolume:
   Type: AWS::EC2::Volume
   Properties:
     Size: 100
     AvailabilityZone: 'us-east-1a'
     Encrypted: true
     VolumeType: gp2
     Tags:
       - Key: Name
         Value: cfn-guard-ebs
   DeletionPolicy: Snapshot

Outputs:
  InstanceID:
    Description: The Instance ID
    Value: !Ref EC2Instance
  Volume:
    Description: The Volume ID
    Value: !Ref  EBSVolume
AWS CodeCommit

Optional CodeCommit Repository Structure

The following screenshot shows a potential CodeCommit repository structure.

Creating a CodeBuild project

Our CodeBuild project orchestrates around CloudFormation Guard and runs validation checks of our CloudFormation templates as a phase of the CI process.

  1. On the CodeBuild console, choose Build projects.
  2. Choose Create build projects.
  3. For Project name, enter your project name.
  4. For Description, enter a description.
AWS CodeBuild

Create CodeBuild Project

  1. For Source provider, choose AWS CodeCommit.
  2. For Repository, choose the CodeCommit repository you created in the previous step.
AWS CodeBuild

Define the source for your CodeBuild Project

To setup CodeBuild environment we will use managed image based on Ubuntu 18.04

  1. For Environment Image, select Managed image.
  2. For Operating system, choose Ubuntu.
  3. For Service role¸ select New service role.
  4. For Role name, enter your service role name.
CodeBuild Environment

Setup the environment, the OS image and other settings for the CodeBuild

  1. Leave the default settings for additional configuration, buildspec, batch configuration, artifacts, and logs.

You can also use CodeBuild with custom build environments to help you optimize billing and improve the build time.

Creating IAM roles and policies

Our CI/CD pipeline needs two AWS Identity and Access Management (IAM) roles to run properly: one role for CodePipeline to work with other resources and services, and one role for AWS CloudFormation to run the deployments that passed the validation check in the CodeBuild phase.

Creating permission policies

Create your permission policies first. The following code is the policy in JSON format for CodePipeline:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "codecommit:UploadArchive",
                "codecommit:CancelUploadArchive",
                "codecommit:GetCommit",
                "codecommit:GetUploadArchiveStatus",
                "codecommit:GetBranch",
                "codestar-connections:UseConnection",
                "codebuild:BatchGetBuilds",
                "codedeploy:CreateDeployment",
                "codedeploy:GetApplicationRevision",
                "codedeploy:RegisterApplicationRevision",
                "codedeploy:GetDeploymentConfig",
                "codedeploy:GetDeployment",
                "codebuild:StartBuild",
                "codedeploy:GetApplication",
                "s3:*",
                "cloudformation:*",
                "ec2:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "*",
            "Condition": {
                "StringEqualsIfExists": {
                    "iam:PassedToService": [
                        "cloudformation.amazonaws.com",
                        "ec2.amazonaws.com"
                    ]
                }
            }
        }
    ]
}

To create your policy for CodePipeline, run the following CLI command:

aws iam create-policy – policy-name CodePipeline-Cfn-Guard-Demo – policy-document file://CodePipelineServiceRolePolicy_example.json

Capture the policy ARN that you get in the output to use in the next steps.

The following code is the policy in JSON format for AWS CloudFormation:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "iam:CreateServiceLinkedRole",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "iam:AWSServiceName": [
                        "autoscaling.amazonaws.com",
                        "ec2scheduled.amazonaws.com",
                        "elasticloadbalancing.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:GetObjectAcl",
                "s3:GetObject",
                "cloudwatch:*",
                "ec2:*",
                "autoscaling:*",
                "s3:List*",
                "s3:HeadBucket"
            ],
            "Resource": "*"
        }
    ]
}

Create the policy for AWS CloudFormation by running the following CLI command:

aws iam create-policy – policy-name CloudFormation-Cfn-Guard-Demo – policy-document file://CloudFormationRolePolicy_example.json

Capture the policy ARN that you get in the output to use in the next steps.

Creating roles and trust policies

The following code is the trust policy for CodePipeline in JSON format:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "codepipeline.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Create your role for CodePipeline with the following CLI command:

aws iam create-role – role-name CodePipeline-Cfn-Guard-Demo-Role – assume-role-policy-document file://RoleTrustPolicy_CodePipeline.json

Capture the role name for the next step.

The following code is the trust policy for AWS CloudFormation in JSON format:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudformation.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Create your role for AWS CloudFormation with the following CLI command:

aws iam create-role – role-name CF-Cfn-Guard-Demo-Role – assume-role-policy-document file://RoleTrustPolicy_CloudFormation.json

Capture the role name for the next step.

 

Finally, attach the permissions policies created in the previous step to the IAM roles you created:

aws iam attach-role-policy – role-name CodePipeline-Cfn-Guard-Demo-Role  – policy-arn "arn:aws:iam::<AWS Account Id >:policy/CodePipeline-Cfn-Guard-Demo"

aws iam attach-role-policy – role-name CF-Cfn-Guard-Demo-Role  – policy-arn "arn:aws:iam::<AWS Account Id>:policy/CloudFormation-Cfn-Guard-Demo"

Creating a pipeline

We can now create our pipeline to assemble all the components into one managed, continuous mechanism.

  1. On the CodePipeline console, choose Pipelines.
  2. Choose Create new pipeline.
  3. For Pipeline name, enter a name.
  4. For Service role, select Existing service role.
  5. For Role ARN, choose the service role you created in the previous step.
  6. Choose Next.
CodePipeline Setup

Setting Up CodePipeline environment

  1. In the Source section, for Source provider, choose AWS CodeCommit.
  2. For Repository name¸ enter your repository name.
  3. For Branch name, choose master.
  4. For Change detection options, select Amazon CloudWatch Events.
  5. Choose Next.
AWS CodePipeline Source

Adding CodeCommit to CodePipeline

  1. In the Build section, for Build provider, choose AWS CodeBuild.
  2. For Project name, choose the CodeBuild project you created.
  3. For Build type, select Single build.
  4. Choose Next.
CodePipeline Build Stage

Adding Build Project to Pipeline Stage

Now we will create a deploy stage in our CodePipeline to deploy CloudFormation templates that passed the CloudFormation Guard inspection in the CI stage.

  1. In the Deploy section, for Deploy provider, choose AWS CloudFormation.
  2. For Action mode¸ choose Create or update stack.
  3. For Stack name, choose any stack name.
  4. For Artifact name, choose BuildArtifact.
  5. For File name, enter the CloudFormation template name in your CodeCommit repository (In case of our demo it is cfn_template_file_example.yaml).
  6. For Role name, choose the role you created earlier for CloudFormation.
CodePipeline - Deploy Stage

Adding deploy stage to CodePipeline

22. In the next step review your selections for the pipeline to be created. The stages and action providers in each stage are shown in the order that they will be created. Click Create pipeline. Our CodePipeline is ready.

Validating the CI/CD pipeline operation

Our CodePipeline has two basic flows and outcomes. If the CloudFormation template complies with our CloudFormation Guard rule set file, the resources in the template deploy successfully (in our use case, we deploy an EC2 instance with an encrypted EBS volume).

CloudFormation Deployed

CloudFormation Console

If our CloudFormation template doesn’t comply with the policies specified in our CloudFormation Guard rule set file, our CodePipeline stops at the CodeBuild step and you see an error in the build job log indicating the resources that are non-compliant:

[EBSVolume] failed because [Encrypted] is [false] and the permitted value is [true]
[EC2Instance] failed because [t3.2xlarge] is not in [t2.micro,t3.nano,t3.micro] for [InstanceType]
Number of failures: 2

Note: To demonstrate the above functionality I changed my CloudFormation template to use unencrypted EBS volume and switched the EC2 instance type to t3.2xlarge which do not adhere to the rules that we specified in the Guard rule set file

Cleaning up

To avoid incurring future charges, delete the resources that we have created during the walkthrough:

  • CloudFormation stack resources that were deployed by the CodePipeline
  • CodePipeline that we have created
  • CodeBuild project
  • CodeCommit repository

Conclusion

In this post, we covered how to integrate CloudFormation Guard into CodePipeline and fully automate pre-deployment compliance checks of your CloudFormation templates. This allows your teams to have an end-to-end automated CI/CD pipeline with minimal operational overhead and stay compliant with your organizational infrastructure policies.

Standardizing CI/CD pipelines for .NET web applications with AWS Service Catalog

Post Syndicated from Borja Prado Miguelez original https://aws.amazon.com/blogs/devops/standardizing-cicd-pipelines-net-web-applications-aws-service-catalog/

As companies implement DevOps practices, standardizing the deployment of continuous integration and continuous deployment (CI/CD) pipelines is increasingly important. Your developer team may not have the ability or time to create your own CI/CD pipelines and processes from scratch for each new project. Additionally, creating a standardized DevOps process can help your entire company ensure that all development teams are following security and governance best practices.

Another challenge that large enterprise and small organization IT departments deal with is managing their software portfolio. This becomes even harder in agile scenarios working with mobile and web applications where you need to not only provision the cloud resources for hosting the application, but also have a proper DevOps process in place.

Having a standardized portfolio of products for your development teams enables you to provision the infrastructure resources needed to create development environments, and helps reduce the operation overhead and accelerate the overall development process.

This post shows you how to provide your end-users a catalog of resources with all the functionality a development team needs to check in code and run it in a highly scalable load balanced cloud compute environment.

We use AWS Service Catalog to provision a cloud-based AWS Cloud9 IDE, a CI/CD pipeline using AWS CodePipeline, and the AWS Elastic Beanstalk compute service to run the website. AWS Service Catalog allows organizations to keep control of the services and products that can be provisioned across the organization’s AWS account, and there’s an effective software delivery process in place by using CodePipeline to orchestrate the application deployment. The following diagram illustrates this architecture.

Architecture Diagram

You can find all the templates we use in this post on the AWS Service Catalog Elastic Beanstalk Reference architecture GitHub repo.

Provisioning the AWS Service Catalog portfolio

To get started, you must provision the AWS Service Catalog portfolio with AWS CloudFormation.

  1. Choose Launch Stack, which creates the AWS Service Catalog portfolio in your AWS account.Launch Stack action
  2. If you’re signed into AWS as an AWS Identity and Access Management (IAM) role, add your role name in the LinkedRole1 parameter.
  3. Continue through the stack launch screens using the defaults and choosing Next.
  4. Select the acknowledgements in the Capabilities box on the third screen.

When the stack is complete, a top-level CloudFormation stack with the default name SC-RA-Beanstalk-Portfolio, which contains five nested stacks, has created the AWS Service Catalog products with the services the development team needs to implement a CI/CD pipeline and host the web application. This AWS Service Catalog reference architecture provisions the AWS Service Catalog products needed to set up the DevOps pipeline and the application environment.

Cloudformation Portfolio Stack

When the portfolio has been created, you have completed the administrator setup. As an end-user (any roles you added to the LinkedRole1 or LinkedRole2 parameters), you can access the portfolio section on the AWS Service Catalog console and review the product list, which now includes the AWS Cloud9 IDE, Elastic Beanstalk application, and CodePipeline project that we will use for continuous delivery.

Service Catalog Products

On the AWS Service Catalog administrator section, inside the Elastic Beanstalk reference architecture portfolio, we can add and remove groups, roles, and users by choosing Add groups, roles, users on the Group, roles, and users tab. This lets us enable developers or other users to deploy the products from this portfolio.

Service Catalog Groups, Roles, and Users

Solution overview

The rest of this post walks you through how to provision the resources you need for CI/CD and web application deployment. You complete the following steps:

  1. Deploy the CI/CD pipeline.
  2. Provision the AWS Cloud9 IDE.
  3. Create the Elastic Beanstalk environment.

Deploying the CI/CD pipeline

The first product you need is the CI/CD pipeline, which manages the code and deployment process.

  1. Sign in to the AWS Service Catalog console in the same Region where you launched the CloudFormation stack earlier.
  2. On the Products list page, locate the CodePipeline product you created earlier.
    Service Catalog Products List
  3. Choose Launch product.

You now provision the CI/CI pipeline. For this post, we use some name examples for the pipeline name, Elastic Beanstalk application name, and code repository, which you can of course modify.

  1. Enter a name for the provisioned Codepipeline product.
  2. Select the Windows version and click Next.
  3. For the application and repository name, enter dotnetapp.
  4. Leave all other settings at their default and click Next.
  5. Choose Launch to start the provisioning of the CodePipeline product.

When you’re finished, the provisioned pipeline should appear on the Provisioned products list.

CodePipeline Product Provisioned

  1. Copy the CloneUrlHttp output to use in a later step.

You now have the CI/CD pipeline ready, with the code repository and the continuous integration service that compiles the code, runs tests, and generates the software bundle stored in Amazon Simple Storage Service (Amazon S3) ready to be deployed. The following diagram illustrates this architecture.

CodePipeline Configuration Diagram

When the Elastic Beanstalk environment is provisioned, the deploy stage takes care of deploying the bundle application stored in Amazon S3, so the DevOps pipeline takes care of the full software delivery as shown in the earlier architecture diagram.

The Region we use should support the WINDOWS_SERVER_2019_CONTAINER build image that AWS CodeBuild uses. You can modify the environment type or create a custom one by editing the CloudFormation template used for the CodePipeline for Windows.

Provisioning the AWS Cloud9 IDE

To show the full lifecycle of the deployment of a web application with Elastic Beanstalk, we use a .NET web application, but this reference architecture also supports Linux. To provision an AWS Cloud9 environment, complete the following steps:

  1. From the AWS Service Catalog product list, choose the AWS Cloud9 IDE.
  2. Click Launch product.
  3. Enter a name for the provisioned Cloud9 product and click Next.
  4. Enter an EnvironmentName and select the InstanceType.
  5. Set LinkedRepoPath to /dotnetapp.
  6. For LinkedRepoCloneUrl, enter the CloneUrlHttp from the previous step.
  7. Leave the default parameters for tagOptions and Notifications, and click Launch.
    Cloud9 Environment Settings

Now we download a sample ASP.NET MVC application in the AWS Cloud9 IDE, move it under the folder we specified in the previous step, and push the code.

  1. Open the IDE with the Cloud9Url link from AWS Service Catalog output.
  2. Get the sample .NET web application and move it under the dotnetapp. See the following code:
  3. cp -R aws-service-catalog-reference-architectures/labs/SampleDotNetApplication/* dotnetapp/
  1. Check in to the sample application to the CodeCommit repo:
  2. cd dotnetapp
    git add – all
    git commit -m "initial commit"
    git push

Now that we have committed the application to the code repository, it’s time to review the DevOps pipeline.

  1. On the CodePipeline console, choose Pipelines.

You should see the pipeline ElasticBeanstalk-ProductPipeline-dotnetapp running.

CodePipeline Execution

  1. Wait until the three pipeline stages are complete, this may take several minutes.

The code commitment and web application build stages are successful, but the code deployment stage fails because we haven’t provisioned the Elastic Beanstalk environment yet.

If you want to deploy your own sample or custom ASP.NET web application, CodeBuild requires the build specification file buildspec-build-dotnet.yml for the .NET Framework, which is located under the elasticbeanstalk/codepipeline folder in the GitHub repo. See the following example code:

version: 0.2
env:
  variables:
    DOTNET_FRAMEWORK: 4.6.1
phases:
  build:
    commands:
      - nuget restore
      - msbuild /p:TargetFrameworkVersion=v$env:DOTNET_FRAMEWORK /p:Configuration=Release /p:DeployIisAppPath="Default Web Site" /t:Package
      - dir obj\Release\Package
artifacts:
  files:
    - 'obj/**/*'
    - 'codepipeline/*'

Creating the Elastic Beanstalk environment

Finally, it’s time to provision the hosting system, an Elastic Beanstalk Windows-based environment, where the .NET sample web application runs. For this, we follow the same approach from the previous steps and provision the Elastic Beanstalk AWS Service Catalog product.

  1. On the AWS Service Catalog console, on the Product list page, choose the Elastic Beanstalk application product.
  2. Choose Launch product.
  3. Enter an environment name and click Next.
  4. Enter the application name.
  5. Enter the S3Bucket and S3SourceBundle that were generated (you can retrieve them from the Amazon S3 console).
  6. Set the SolutionStackName to 64bit Windows Server Core 2019 v2.5.8 running IIS 10.0. Follow this link for up to date platform names.
  7. Elastic Beanstalk Environment Settings
  1. Launch the product.
  2. To verify that you followed the steps correctly, review that the provisioned products are all available (AWS Cloud9 IDE, Elastic Beanstalk CodePipeline project, and Elastic Beanstalk application) and the recently created Elastic Beanstalk environment is healthy.

As in the previous step, if you’re planning to deploy your own sample or custom ASP.NET web application, AWS CodeDeploy requires the deploy specification file buildspec-deploy-dotnet.yml for the .NET Framework, which should be located under the codepipeline folder in the GitHub repo. See the following code:

version: 0.2
phases:
  pre_build:
    commands:          
      - echo application deploy started on `date`      
      - ls -l
      - ls -l obj/Release/Package
      - aws s3 cp ./obj/Release/Package/SampleWebApplication.zip s3://$ARTIFACT_BUCKET/$EB_APPLICATION_NAME-$CODEBUILD_BUILD_NUMBER.zip
  build:
    commands:
      - echo Pushing package to Elastic Beanstalk...      
      - aws elasticbeanstalk create-application-version – application-name $EB_APPLICATION_NAME – version-label v$CODEBUILD_BUILD_NUMBER – description "Auto deployed from CodeCommit build $CODEBUILD_BUILD_NUMBER" – source-bundle S3Bucket="$ARTIFACT_BUCKET",S3Key="$EB_APPLICATION_NAME-$CODEBUILD_BUILD_NUMBER.zip"
      - aws elasticbeanstalk update-environment – environment-name "EB-ENV-$EB_APPLICATION_NAME" – version-label v$CODEBUILD_BUILD_NUMBER

The same codepipeline folder contains some build and deploy specification files besides the .NET ones, which you could use if you prefer to use a different framework like Python to deploy a web application with Elastic Beanstalk.

  1. To complete the application deployment, go to the application pipeline and release the change, which triggers the pipeline with the application environment now ready.
    Deployment Succeeded

When you create the environment through the AWS Service Catalog, you can access the provisioned Elastic Beanstalk environment.

  1. In the Events section, locate the LoadBalancerURL, which is the public endpoint that we use to access the website.
    Elastic Beanstalk LoadBalancer URL
  1. In our preferred browser, we can check that the website has been successfully deployed.ASP.NET Sample Web Application

Cleaning up

When you’re finished, you should complete the following steps to delete the resources you provisioned to avoid incurring further charges and keep the account free of unused resources.

  1. The CodePipeline product creates an S3 bucket which you must empty from the S3 console.
  2. On the AWS Service Catalog console, end the provisioned resources from the Provisioned products list.
  3. As administrator, in the CloudFormation console, delete the stack SC-RA-Beanstalk-Portfolio.

Conclusions

This post has shown you how to deploy a standardized DevOps pipeline which was then used to manage and deploy a sample .NET application on Elastic Beanstalk using the Service Catalog Elastic Beanstalk reference architecture. AWS Service Catalog is the ideal service for administrators who need to centrally provision and manage the AWS services needed with a consistent governance model. Deploying web applications to Elastic Beanstalk is very simple for developers and provides built in scalability, patch maintenance, and self-healing for your applications.

The post includes the information and references on how to extend the solution with other programming languages and operating systems supported by Elastic Beanstalk.

About the Authors

Borja Prado
Borja Prado Miguelez

Borja is a Senior Specialist Solutions Architect for Microsoft workloads at Amazon Web Services. He is passionate about web architectures and application modernization, helping customers to build scalable solutions with .NET and migrate their Windows workloads to AWS.

Chris Chapman
Chris Chapman

Chris is a Partner Solutions Architect covering AWS Marketplace, Service Catalog, and Control Tower. Chris was a software developer and data engineer for many years and now his core mission is helping customers and partners automate AWS infrastructure deployment and provisioning.

Integrating Jenkins with AWS CodeArtifact to publish and consume Python artifacts

Post Syndicated from Matt Ulinski original https://aws.amazon.com/blogs/devops/using-jenkins-with-codeartifact/

Python packages are used to share and reuse code across projects. Centralized artifact storage allows sharing versioned artifacts across an organization. This post explains how you can set up two Jenkins projects. The first project builds the Python package and publishes it to AWS CodeArtifact using twine (Python utility for publishing packages), and the second project consumes the package using pip and deploys an application to AWS Fargate.

Solution overview

The following diagram illustrates this architecture.

Architecture Diagram

 

The solution consists of two GitHub repositories and two Jenkins projects. The first repository contains the source code of a Python package. Jenkins builds this package and publishes it to a CodeArtifact repository.

The second repository contains the source code of a Python Flask application that has a dependency on the package produced by the first repository. Jenkins builds a Docker image containing the application and its dependencies, pushes the image to an Amazon Elastic Container Registry (Amazon ECR) registry, and deploys it to AWS Fargate using AWS CloudFormation.

Prerequisites

For this walkthrough, you should have the following prerequisites:

To create a new Jenkins server that includes the required dependencies, complete the following steps:

  1. Launch a CloudFormation stack with the following link:
    Launch CloudFormation stack
  2. Choose Next.
  3. Enter the name for your stack.
  4. Select the Amazon Elastic Compute Cloud (Amazon EC2) instance type for your Jenkins server.
  5. Select the subnet and corresponding VPC.
  6. Choose Next.
  7. Scroll down to the bottom of the page and choose Next.
  8. Review the stack configuration and choose Create stack.

AWS CloudFormation creates the following resources:

  • JenkinsInstance – Amazon EC2 instance that Jenkins and its dependencies is installed on
  • JenkinsWaitCondition – CloudFormation wait condition that waits for Jenkins to be fully installed before finishing the deployment
  • JenkinsSecurityGroup – Security group attached to the EC2 instance that allows inbound traffic on port 8080

The stack takes a few minutes to deploy. When it’s fully deployed, you can find the URL and initial password for Jenkins on the Outputs tab of the stack.

CloudFormation outputs tab

Use the initial password to unlock the Jenkins installation, then follow the setup wizard to install the suggested plugins and create a new Jenkins user. After the user is created, the initial password no longer works.

On the Jenkins homepage, complete the following steps:

  1. Choose Manage Jenkins.
  2. Choose Manage Plugins.
  3. On the Available tab, search for “Docker Pipeline” and select it.
    Jenkins plugins available tab
  4. Choose Download now and install after restart.
  5. Select Restart Jenkins when installation is complete and no jobs are running.

Jenkins plugins installation complete

Jenkins is ready to use after it restarts. Log in with the user you created with the setup wizard.

Setting up a CodeArtifact repository

To get started, create a CodeArtifact repository to store the Python packages.

  1. On the CodeArtifact console, choose Create repository.
  2. For Repository name, enter a name (for this post, I use my-repository).
  3. For Public upstream repositories, choose pypi-store.
  4. Choose Next.
    AWS CodeArtifact repository wizard
  5. Choose This AWS account.
  6. If you already have a CodeArtifact domain, choose it from the drop-down menu. If you don’t already have a CodeArtifact domain, choose a name for your domain and the console creates it for you. For this post, I named my domain my-domain.
  7. Choose Next.
  8. Review the repository details and choose Create repository.
    CodeArtifact repository overview

You now have a CodeArtifact repository created, which you use to store and retrieve Python packages used by the application.

Configuring Jenkins: Creating an IAM user

  1. On the IAM console, choose User.
  2. Choose Add user.
  3. Enter a name for the user (for this post, I used the name Jenkins).
  4. Select Programmatic access as the access type.
  5. Choose Next: Permissions.
  6. Select Attach existing policies directly.
  7. Choose the following policies:
    1. AmazonEC2ContainerRegistryPowerUser – Allows Jenkins to push Docker images to ECR.
    2. AmazonECS_FullAccess – Allows Jenkins to deploy your application to AWS Fargate.
    3. AWSCloudFormationFullAccess – Allows Jenkins to update the CloudFormation stack.
    4. AWSCodeArtifactAdminAccessAllows Jenkins access to the CodeArtifact repository.
  8. Choose Next: Tags.
  9. Choose Next: Review.
  10. Review the configuration and choose Create user.
  11. Record the Access key ID and Secret access key; you need them to configure Jenkins.

Configuring Jenkins: Adding credentials

After you create your IAM user, you need to set up the credentials in Jenkins.

  1. Open Jenkins.
  2. From the left pane, choose Manage Jenkins
  3. Choose Manage Credentials.
  4. Hover over the (global) domain and expand the drop-down menu.
  5. Choose Add credentials.
    Jenkins credentials
  6. Enter the following credentials:
    1. Kind – User name with password.
    2. Scope – Global (Jenkins, nodes, items, all child items).
    3. Username – Enter the Access key ID for the Jenkins IAM user.
    4. Password – Enter the Secret access key for the Jenkins IAM user.
    5. ID – Name for the credentials (for this post, I used AWS).
  7. Choose OK.

You use the credentials to make API calls to AWS as part of the builds.

Publishing a Python package

To publish your Python package, complete the following steps:

  1. Create a new GitHub repo to store the source of the sample package.
  2. Clone the sample GitHub repo onto your local machine.
  3. Navigate to the package_src directory.
  4. Place its contents in your GitHub repo.
    Package repository contents

When your GitHub repo is populated with the sample package, you can create the first Jenkins project.

  1. On the Jenkins homepage, choose New Item.
  2. Enter a name for the project; for example, producer.
  3. Choose Freestyle project.
  4. Choose OK.
    Jenkins new project wizard
  5. In the Source Code Management section, choose Git.
  6. Enter the HTTP clone URL of your GitHub repo into the Repository URL
  7. To make sure that the workspace is clean before each build, under Additional Behaviors, choose Add and select Clean before checkout.
    Jenkins source code managnment
  8. To have builds start automatically when a change occurs in the repository, under Build Triggers, select Poll SCM and enter * * * * * in the Schedule
    Jenkins build triggers
  9. In the Build Environment section, select Use secret text(s) or file(s).
  10. Choose Add and choose Username and password (separated).
  11. Enter the following information:
    1. UsernameAWS_ACCESS_KEY_ID
    2. PasswordAWS_SECRET_ACCESS_KEY
    3. Credentials – Select Specific Credentials and from the drop-down menu and choose the previously created credentials.
      Jenkins credential binding
  12. In the Build section, choose Add build step.
  13. Choose Execute shell.
  14. Enter the following command and replace my-domain, my-repository, and my-region with the name of your CodeArtifact domain, repository, and Region:
    python3 setup.py sdist bdist_wheel
    aws codeartifact login – tool twine – domain my-domain – repository my-repository – region my-region
    python3 -m twine upload dist/* – repository codeartifact

    These commands do the following:

    • Build the Python package
    • Run the aws codeartifact login AWS Command Line Interface (AWS CLI) command, which retrieves the access token for CodeArtifact and configures the twine client
    • Use twine to publish the Python package to CodeArtifact
  15. Choose Save.
  16. Start a new build by choosing Build Now in the left pane.After a build starts, it shows in the Build History on the left pane. To view the build’s details, choose the build’s ID number.
    Jenkins project builds
  17. To view the results of the run commands, from the build details page, choose Console Output.
  18. To see that the package has been successfully published, check the CodeArtifact repository on the console.
    CodeArtifact console showing package

When a change is pushed to the repo, Jenkins will start a new build and attempt to publish the package. CodeArtifact will prevent publishing duplicates of the same package version, failing the Jenkins build.

If you want to publish a new version of the package, you will need to increment the version number.

The sample package uses semantic versioning (major.minor.maintenance), to change the version number modify the version='1.0.0' value in the setup.py file. You can do this manually before pushing any changes to the repo, or automatically as part of the build process by using the python-semantic-release package, or a similar solution.

Consuming a package and deploying an application

After you have a package published, you can use it in an application.

  1. Create a new GitHub repo for this application.
  2. Populate it with the contents of the application_src directory from the sample repo.
    Sample application repository

The version of the sample package used by the application is defined in the requirements.txt file. If you have published a new version of the package and want the application to use it modify the fantastic-ascii==1.0.0 value in this file.

After the repository created, you need to deploy the CloudFormation template application.yml. The template creates the following resources:

  • ECRRepository – Amazon ECR repository to store your Docker image.
  • ClusterAmazon Elastic Container Service (Amazon ECS) cluster that contains the service of your application.
  • TaskDefinition – ECS task definition that defines how your Docker image is deployed.
  • ExecutionRole – IAM role that Amazon ECS uses to pull the Docker image.
  • TaskRole – IAM role provided to the ECS task.
  • ContainerSecurityGroup – Security group that allows outbound traffic to ports 8080 and 80.
  • Service – Amazon ECS service that launches and manages your Docker containers.
  • TargetGroup – Target group used by the Load Balancer to send traffic to Docker containers.
  • Listener – Load Balancer Listener that listens for incoming traffic on port 80.
  • LoadBalancer – Load Balancer that sends traffic to the ECS task.
  1. Choose the following link to create the application’s CloudFormation stack:
    Launch CloudFormation stack
  2. Choose Next.
  3. Enter the following parameters:
    1. Stack name – Name for the CloudFormation stack. For this post, I use the name Consumer.
    2. Container Name – Name for your application (for this post, I use application).
    3. Image Tag – Leave this field blank. Jenkins populates it when you deploy the application.
    4. VPC – Choose a VPC in your account that contains two public subnets.
    5. SubnetA – Choose a public subnet from the previously chosen VPC.
    6. SubnetB – Choose a public subnet from the previously chosen VPC.
  4. Choose Next.
  5. Scroll down to the bottom of the page and choose Next.
  6. Review the configuration of the stack.
  7. Acknowledge the IAM resources warning to allow CloudFormation to create the TaskRole IAM role.
  8. Choose Create Stack.

After the stack is created, the Outputs tab contains information you can use to configure the Jenkins project.

Application stack outputs tab

To access the sample application, choose the ApplicationUrl link. Because the application has not yet been deployed, you receive an error message.

You can now create the second Jenkins project, which uses a configured through a Jenkinsfile stored in the source repository. The Jenkinsfile defines the steps that the build takes to build and deploy a Docker image containing your application.

The Jenkinsfile included in the sample instructs Jenkins to perform these steps:

  1. Get the authorization token for CodeArtifact:
    withCredentials([usernamePassword(
        credentialsId: CREDENTIALS_ID,
        passwordVariable: 'AWS_SECRET_ACCESS_KEY',
        usernameVariable: 'AWS_ACCESS_KEY_ID'
    )]) {
        authToken = sh(
                returnStdout: true,
                script: 'aws codeartifact get-authorization-token \
                – domain $AWS_CA_DOMAIN \
                – query authorizationToken \
                – output text \
                – duration-seconds 900'
        ).trim()
    }

  2. Start a Docker build and pass the authorization token as an argument to the build:
    sh ("""
        set +x
        docker build -t $CONTAINER_NAME:$BUILD_NUMBER \
        – build-arg CODEARTIFACT_TOKEN='$authToken' \
        – build-arg DOMAIN=$AWS_CA_DOMAIN-$AWS_ACCOUNT_ID \
        – build-arg REGION=$AWS_REGION \
        – build-arg REPO=$AWS_CA_REPO .
    """)

  3. Inside of Docker, the passed argument is used to configure pip to use CodeArtifact:
    RUN pip config set global.index-url "https://aws:[email protected]$DOMAIN.d.codeartifact.$REGION.amazonaws.com/pypi/$REPO/simple/"
    RUN pip install -r requirements.txt

  4. Test the image by starting a container and performing a simple GET request.
  5. Log in to the Amazon ECR repository and push the Docker image.
  6. Update the CloudFormation template and start a deployment of the application.

Look at the Jenkinsfile and Dockerfile in your repository to review the exact commands being used, then take the following steps to setup the second Jenkins projects:

  1. Change the variables defined in the environment section at the top of the Jenkinsfile:
    environment {
        AWS_ACCOUNT_ID = 'Your AWS Account ID'
        AWS_REGION = 'Region you used for this project'
        AWS_CA_DOMAIN = 'Name of your CodeArtifact domain'
        AWS_CA_REPO = 'Name of your CodeArtifact repository'
        AWS_STACK_NAME = 'Name of the CloudFormation stack'
        CONTAINER_NAME = 'Container name provided to CloudFormation'
        CREDENTIALS_ID = 'Jenkins credentials ID
    }
  2. Commit the changes to the GitHub repo.
  3. To create a new Jenkins project, on the Jenkins homepage, choose New Item.
  4. Enter a name for the project, for example, Consumer.
  5. Choose Pipeline.
  6. Choose OK.
    Jenkins pipeline wizard
  7. To have a new build start automatically when a change is detected in the repository, under Build Triggers, select Poll SCM and enter * * * * * in the Schedule field.
    Jenkins source polling configuration
  8. In the Pipeline section, choose Pipeline script from SCM from the Definition drop-down menu.
  9. Choose Git for the SCM
  10. Enter the HTTP clone URL of your GitHub repo into the Repository URL
  11. To make sure that your workspace is clean before each build, under Additional Behaviors, choose Add and select Clean before checkout.
    Jenkins source configuration
  12. Choose Save.

The Jenkins project is now ready. To start a new job, choose Build Now from the navigation pane. You see a visualization of the pipeline as it moves through the various stages, gathering the dependencies and deploying your application.

Jenkins application pipeline visualization

When the Deploy to ECS stage of the pipeline is complete, you can choose ApplicationUrl on the Outputs tab of the CloudFormation stack. You see a simple webpage that uses the Python package to display the current time.

Deployed application displaying in browser

Cleaning up

To avoid incurring future charges, delete the resources created in this post.

To empty the Amazon ECR repository:

  1. Open the application’s CloudFormation stack.
  2. On the Resources tab, choose the link next to the ECRRepository
  3. Select the check-box next to each of the images in the repository.
  4. Choose Delete.
  5. Confirm the deletion.

To delete the CloudFormation stacks:

  1. On the AWS CloudFormation console, select the application stack you deployed earlier.
  2. Choose Delete.
  3. Confirm the deletion.

If you created a Jenkins as part of this post, select the Jenkins stack and delete it.

To delete the CodeArtifact repository:

  1. On the CodeArtifact console, navigate to the repository you created.
  2. Choose Delete.
  3. Confirm the deletion.

If you’re not using the CodeArtifact domain for other repositories, you should follow the previous steps to delete the pypi-store repository, because it contains the public packages that were used by the application, then delete the CodeArtifact domain:

  1. On the CodeArtifact console, navigate to the domain you created.
  2. Choose Delete.
  3. Confirm the deletion.

Conclusion

In this post I showed how you can use Jenkins to publish and consume a Python package with Jenkins and CodeArtifact. I walked you through creating two Jenkins projects, a Jenkins freestyle project that built a package and published it to CodeArtifact, and a Jenkins pipeline project that built a Docker image that used the package in an application that was deployed to AWS Fargate.

About the author

Matt Ulinski is a Cloud Support Engineer with Amazon Web Services.

 

 

Cross-account and cross-region deployment using GitHub actions and AWS CDK

Post Syndicated from DAMODAR SHENVI WAGLE original https://aws.amazon.com/blogs/devops/cross-account-and-cross-region-deployment-using-github-actions-and-aws-cdk/

GitHub Actions is a feature on GitHub’s popular development platform that helps you automate your software development workflows in the same place you store code and collaborate on pull requests and issues. You can write individual tasks called actions, and combine them to create a custom workflow. Workflows are custom automated processes that you can set up in your repository to build, test, package, release, or deploy any code project on GitHub.

A cross-account deployment strategy is a CI/CD pattern or model in AWS. In this pattern, you have a designated AWS account called tools, where all CI/CD pipelines reside. Deployment is carried out by these pipelines across other AWS accounts, which may correspond to dev, staging, or prod. For more information about a cross-account strategy in reference to CI/CD pipelines on AWS, see Building a Secure Cross-Account Continuous Delivery Pipeline.

In this post, we show you how to use GitHub Actions to deploy an AWS Lambda-based API to an AWS account and Region using the cross-account deployment strategy.

Using GitHub Actions may have associated costs in addition to the cost associated with the AWS resources you create. For more information, see About billing for GitHub Actions.

Prerequisites

Before proceeding any further, you need to identify and designate two AWS accounts required for the solution to work:

  • Tools – Where you create an AWS Identity and Access Management (IAM) user for GitHub Actions to use to carry out deployment.
  • Target – Where deployment occurs. You can call this as your dev/stage/prod environment.

You also need to create two AWS account profiles in ~/.aws/credentials for the tools and target accounts, if you don’t already have them. These profiles need to have sufficient permissions to run an AWS Cloud Development Kit (AWS CDK) stack. They should be your private profiles and only be used during the course of this use case. So, it should be fine if you want to use admin privileges. Don’t share the profile details, especially if it has admin privileges. I recommend removing the profile when you’re finished with this walkthrough. For more information about creating an AWS account profile, see Configuring the AWS CLI.

Solution overview

You start by building the necessary resources in the tools account (an IAM user with permissions to assume a specific IAM role from the target account to carry out deployment). For simplicity, we refer to this IAM role as the cross-account role, as specified in the architecture diagram.

You also create the cross-account role in the target account that trusts the IAM user in the tools account and provides the required permissions for AWS CDK to bootstrap and initiate creating an AWS CloudFormation deployment stack in the target account. GitHub Actions uses the tools account IAM user credentials to the assume the cross-account role to carry out deployment.

In addition, you create an AWS CloudFormation execution role in the target account, which AWS CloudFormation service assumes in the target account. This role has permissions to create your API resources, such as a Lambda function and Amazon API Gateway, in the target account. This role is passed to AWS CloudFormation service via AWS CDK.

You then configure your tools account IAM user credentials in your Git secrets and define the GitHub Actions workflow, which triggers upon pushing code to a specific branch of the repo. The workflow then assumes the cross-account role and initiates deployment.

The following diagram illustrates the solution architecture and shows AWS resources across the tools and target accounts.

Architecture diagram

Creating an IAM user

You start by creating an IAM user called git-action-deployment-user in the tools account. The user needs to have only programmatic access.

  1. Clone the GitHub repo aws-cross-account-cicd-git-actions-prereq and navigate to folder tools-account. Here you find the JSON parameter file src/cdk-stack-param.json, which contains the parameter CROSS_ACCOUNT_ROLE_ARN, which represents the ARN for the cross-account role we create in the next step in the target account. In the ARN, replace <target-account-id> with the actual account ID for your designated AWS target account.                                             Replace <target-account-id> with designated AWS account id
  2. Run deploy.sh by passing the name of the tools AWS account profile you created earlier. The script compiles the code, builds a package, and uses the AWS CDK CLI to bootstrap and deploy the stack. See the following code:
cd aws-cross-account-cicd-git-actions-prereq/tools-account/
./deploy.sh "<AWS-TOOLS-ACCOUNT-PROFILE-NAME>"

You should now see two stacks in the tools account: CDKToolkit and cf-GitActionDeploymentUserStack. AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an Amazon Simple Storage Service (Amazon S3) bucket needed to hold deployment assets such as a CloudFormation template and Lambda code package. cf-GitActionDeploymentUserStack creates the IAM user with permission to assume git-action-cross-account-role (which you create in the next step). On the Outputs tab of the stack, you can find the user access key and the AWS Secrets Manager ARN that holds the user secret. To retrieve the secret, you need to go to Secrets Manager. Record the secret to use later.

Stack that creates IAM user with its secret stored in secrets manager

Creating a cross-account IAM role

In this step, you create two IAM roles in the target account: git-action-cross-account-role and git-action-cf-execution-role.

git-action-cross-account-role provides required deployment-specific permissions to the IAM user you created in the last step. The IAM user in the tools account can assume this role and perform the following tasks:

  • Upload deployment assets such as the CloudFormation template and Lambda code package to a designated S3 bucket via AWS CDK
  • Create a CloudFormation stack that deploys API Gateway and Lambda using AWS CDK

AWS CDK passes git-action-cf-execution-role to AWS CloudFormation to create, update, and delete the CloudFormation stack. It has permissions to create API Gateway and Lambda resources in the target account.

To deploy these two roles using AWS CDK, complete the following steps:

  1. In the already cloned repo from the previous step, navigate to the folder target-account. This folder contains the JSON parameter file cdk-stack-param.json, which contains the parameter TOOLS_ACCOUNT_USER_ARN, which represents the ARN for the IAM user you previously created in the tools account. In the ARN, replace <tools-account-id> with the actual account ID for your designated AWS tools account.                                             Replace <tools-account-id> with designated AWS account id
  2. Run deploy.sh by passing the name of the target AWS account profile you created earlier. The script compiles the code, builds the package, and uses the AWS CDK CLI to bootstrap and deploy the stack. See the following code:
cd ../target-account/
./deploy.sh "<AWS-TARGET-ACCOUNT-PROFILE-NAME>"

You should now see two stacks in your target account: CDKToolkit and cf-CrossAccountRolesStack. AWS CDK creates the CDKToolkit stack when we bootstrap the AWS CDK app. This creates an S3 bucket to hold deployment assets such as the CloudFormation template and Lambda code package. The cf-CrossAccountRolesStack creates the two IAM roles we discussed at the beginning of this step. The IAM role git-action-cross-account-role now has the IAM user added to its trust policy. On the Outputs tab of the stack, you can find these roles’ ARNs. Record these ARNs as you conclude this step.

Stack that creates IAM roles to carry out cross account deployment

Configuring secrets

One of the GitHub actions we use is aws-actions/[email protected]. This action configures AWS credentials and Region environment variables for use in the GitHub Actions workflow. The AWS CDK CLI detects the environment variables to determine the credentials and Region to use for deployment.

For our cross-account deployment use case, aws-actions/[email protected] takes three pieces of sensitive information besides the Region: AWS_ACCESS_KEY_ID, AWS_ACCESS_KEY_SECRET, and CROSS_ACCOUNT_ROLE_TO_ASSUME. Secrets are recommended for storing sensitive pieces of information in the GitHub repo. It keeps the information in an encrypted format. For more information about referencing secrets in the workflow, see Creating and storing encrypted secrets.

Before we continue, you need your own empty GitHub repo to complete this step. Use an existing repo if you have one, or create a new repo. You configure secrets in this repo. In the next section, you check in the code provided by the post to deploy a Lambda-based API CDK stack into this repo.

  1. On the GitHub console, navigate to your repo settings and choose the Secrets tab.
  2. Add a new secret with name as TOOLS_ACCOUNT_ACCESS_KEY_ID.
  3. Copy the access key ID from the output OutGitActionDeploymentUserAccessKey of the stack GitActionDeploymentUserStack in tools account.
  4. Enter the ID in the Value field.                                                                                                                                                                Create secret
  5. Repeat this step to add two more secrets:
    • TOOLS_ACCOUNT_SECRET_ACCESS_KEY (value retrieved from the AWS Secrets Manager in tools account)
    • CROSS_ACCOUNT_ROLE (value copied from the output OutCrossAccountRoleArn of the stack cf-CrossAccountRolesStack in target account)

You should now have three secrets as shown below.

All required git secrets

Deploying with GitHub Actions

As the final step, first clone your empty repo where you set up your secrets. Download and copy the code from the GitHub repo into your empty repo. The folder structure of your repo should mimic the folder structure of source repo. See the following screenshot.

Folder structure of the Lambda API code

We can take a detailed look at the code base. First and foremost, we use Typescript to deploy our Lambda API, so we need an AWS CDK app and AWS CDK stack. The app is defined in app.ts under the repo root folder location. The stack definition is located under the stack-specific folder src/git-action-demo-api-stack. The Lambda code is located under the Lambda-specific folder src/git-action-demo-api-stack/lambda/ git-action-demo-lambda.

We also have a deployment script deploy.sh, which compiles the app and Lambda code, packages the Lambda code into a .zip file, bootstraps the app by copying the assets to an S3 bucket, and deploys the stack. To deploy the stack, AWS CDK has to pass CFN_EXECUTION_ROLE to AWS CloudFormation; this role is configured in src/params/cdk-stack-param.json. Replace <target-account-id> with your own designated AWS target account ID.

Update cdk-stack-param.json in git-actions-cross-account-cicd repo with TARGET account id

Finally, we define the Git Actions workflow under the .github/workflows/ folder per the specifications defined by GitHub Actions. GitHub Actions automatically identifies the workflow in this location and triggers it if conditions match. Our workflow .yml file is named in the format cicd-workflow-<region>.yml, where <region> in the file name identifies the deployment Region in the target account. In our use case, we use us-east-1 and us-west-2, which is also defined as an environment variable in the workflow.

The GitHub Actions workflow has a standard hierarchy. The workflow is a collection of jobs, which are collections of one or more steps. Each job runs on a virtual machine called a runner, which can either be GitHub-hosted or self-hosted. We use the GitHub-hosted runner ubuntu-latest because it works well for our use case. For more information about GitHub-hosted runners, see Virtual environments for GitHub-hosted runners. For more information about the software preinstalled on GitHub-hosted runners, see Software installed on GitHub-hosted runners.

The workflow also has a trigger condition specified at the top. You can schedule the trigger based on the cron settings or trigger it upon code pushed to a specific branch in the repo. See the following code:

name: Lambda API CICD Workflow
# This workflow is triggered on pushes to the repository branch master.
on:
  push:
    branches:
      - master

# Initializes environment variables for the workflow
env:
  REGION: us-east-1 # Deployment Region

jobs:
  deploy:
    name: Build And Deploy
    # This job runs on Linux
    runs-on: ubuntu-latest
    steps:
      # Checkout code from git repo branch configured above, under folder $GITHUB_WORKSPACE.
      - name: Checkout
        uses: actions/[email protected]
      # Sets up AWS profile.
      - name: Configure AWS credentials
        uses: aws-actions/[email protected]
        with:
          aws-access-key-id: ${{ secrets.TOOLS_ACCOUNT_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.TOOLS_ACCOUNT_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.REGION }}
          role-to-assume: ${{ secrets.CROSS_ACCOUNT_ROLE }}
          role-duration-seconds: 1200
          role-session-name: GitActionDeploymentSession
      # Installs CDK and other prerequisites
      - name: Prerequisite Installation
        run: |
          sudo npm install -g [email protected]
          cdk – version
          aws s3 ls
      # Build and Deploy CDK application
      - name: Build & Deploy
        run: |
          cd $GITHUB_WORKSPACE
          ls -a
          chmod 700 deploy.sh
          ./deploy.sh

For more information about triggering workflows, see Triggering a workflow with events.

We have configured a single job workflow for our use case that runs on ubuntu-latest and is triggered upon a code push to the master branch. When you create an empty repo, master branch becomes the default branch. The workflow has four steps:

  1. Check out the code from the repo, for which we use a standard Git action actions/[email protected]. The code is checked out into a folder defined by the variable $GITHUB_WORKSPACE, so it becomes the root location of our code.
  2. Configure AWS credentials using aws-actions/[email protected]. This action is configured as explained in the previous section.
  3. Install your prerequisites. In our use case, the only prerequisite we need is AWS CDK. Upon installing AWS CDK, we can do a quick test using the AWS Command Line Interface (AWS CLI) command aws s3 ls. If cross-account access was successfully established in the previous step of the workflow, this command should return a list of buckets in the target account.
  4. Navigate to root location of the code $GITHUB_WORKSPACE and run the deploy.sh script.

You can check in the code into the master branch of your repo. This should trigger the workflow, which you can monitor on the Actions tab of your repo. The commit message you provide is displayed for the respective run of the workflow.

Workflow for region us-east-1 Workflow for region us-west-2

You can choose the workflow link and monitor the log for each individual step of the workflow.

Git action workflow steps

In the target account, you should now see the CloudFormation stack cf-GitActionDemoApiStack in us-east-1 and us-west-2.

Lambda API stack in us-east-1 Lambda API stack in us-west-2

The API resource URL DocUploadRestApiResourceUrl is located on the Outputs tab of the stack. You can invoke your API by choosing this URL on the browser.

API Invocation Output

Clean up

To remove all the resources from the target and tools accounts, complete the following steps in their given order:

  1. Delete the CloudFormation stack cf-GitActionDemoApiStack from the target account. This step removes the Lambda and API Gateway resources and their associated IAM roles.
  2. Delete the CloudFormation stack cf-CrossAccountRolesStack from the target account. This removes the cross-account role and CloudFormation execution role you created.
  3. Go to the CDKToolkit stack in the target account and note the BucketName on the Output tab. Empty that bucket and then delete the stack.
  4. Delete the CloudFormation stack cf-GitActionDeploymentUserStack from tools account. This removes cross-account-deploy-user IAM user.
  5. Go to the CDKToolkit stack in the tools account and note the BucketName on the Output tab. Empty that bucket and then delete the stack.

Security considerations

Cross-account IAM roles are very powerful and need to be handled carefully. For this post, we strictly limited the cross-account IAM role to specific Amazon S3 and CloudFormation permissions. This makes sure that the cross-account role can only do those things. The actual creation of Lambda, API Gateway, and Amazon DynamoDB resources happens via the AWS CloudFormation IAM role, which AWS  CloudFormation assumes in the target AWS account.

Make sure that you use secrets to store your sensitive workflow configurations, as specified in the section Configuring secrets.

Conclusion

In this post we showed how you can leverage GitHub’s popular software development platform to securely deploy to AWS accounts and Regions using GitHub actions and AWS CDK.

Build your own GitHub Actions CI/CD workflow as shown in this post.

About the author

 

Damodar Shenvi Wagle is a Cloud Application Architect at AWS Professional Services. His areas of expertise include architecting serverless solutions, ci/cd and automation.