All posts by Nikunj Vaidya

Gaining operational insights with AIOps using Amazon DevOps Guru

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/gaining-operational-insights-with-aiops-using-amazon-devops-guru/

Amazon DevOps Guru offers a fully managed AIOps platform service that enables developers and operators to improve application availability and resolve operational issues faster. It minimizes manual effort by leveraging machine learning (ML) powered recommendations. Its ML models take advantage of AWS’s expertise in operating highly available applications for the world’s largest e-commerce business for over 20 years. DevOps Guru automatically detects operational issues, predicts impending resource exhaustion, details likely causes, and recommends remediation actions.

This post walks you through how to enable DevOps Guru for your account in a typical serverless environment and observe the insights and recommendations generated for various activities. These insights are generated for operational events that could pose a risk to your application availability. DevOps Guru uses AWS CloudFormation stacks as the application boundary to detect resource relationships and co-relate with deployment events.

Solution overview

The goal of this post is to demonstrate the insights generated for anomalies detected by DevOps Guru from DevOps operations. If you don’t have a test environment and want to build out infrastructure to test the generation of insights, then you can follow along through this post. However, if you have a test or production environment (preferably spawned from CloudFormation stacks), you can skip the first section and jump directly to section, Enabling DevOps Guru and injecting traffic.

The solution includes the following steps:

1. Deploy a serverless infrastructure.

2. Enable DevOps Guru and inject traffic.

3. Review the generated DevOps Guru insights.

4. Inject another failure to generate a new insight.

As depicted in the following diagram, we use a CloudFormation stack to create a serverless infrastructure, comprising of Amazon API Gateway, AWS Lambda, and Amazon DynamoDB, and inject HTTP requests at a high rate towards the API published to list records.

Serverless infrastructure monitored by DevOps Guru

When DevOps Guru is enabled to monitor your resources within the account, it uses a combination of vended Amazon CloudWatch metrics and specific patterns from its ML models to detect anomalies. When an anomaly is detected, it generates an insight with the recommendations.

The generation of insights is dependent upon several factors. Although this post provides a canned environment to reproduce insights, the results may vary depending upon traffic pattern, timings of traffic injection, and so on.

Prerequisites

To complete this tutorial, you should have access to an AWS Cloud9 environment and the AWS Command Line Interface (AWS CLI).

Deploying a serverless infrastructure

To deploy your serverless infrastructure, you complete the following high-level steps:

1.   Create your IDE environment.

2.   Launch the CloudFormation template to deploy the serverless infrastructure.

3.   Populate the DynamoDB table.

 

1. Creating your IDE environment

We recommend using AWS Cloud9 to create an environment to get access to the AWS CLI from a bash terminal. AWS Cloud9 is a browser-based IDE that provides a development environment in the cloud. While creating the new environment, ensure you choose Linux2 as the operating system. Alternatively, you can use your bash terminal in your favorite IDE and configure your AWS credentials in your terminal.

When access is available, run the following command to confirm that you can see the Amazon Simple Storage Service (Amazon S3) buckets in your account:

aws s3 ls

Install the following prerequisite packages and ensure you have Python3 installed:

sudo yum install jq -y

export AWS_REGION=$(curl -s \
169.254.169.254/latest/dynamic/instance-identity/document | jq -r '.region')

sudo pip3 install requests

Clone the git repository to download the required CloudFormation templates:

git clone https://github.com/aws-samples/amazon-devopsguru-samples
cd amazon-devopsguru-samples/generate-devopsguru-insights/

2. Launching the CloudFormation template to deploy your serverless infrastructure

To deploy your infrastructure, complete the following steps:

  • Run the CloudFormation template using the following command:
aws cloudformation create-stack --stack-name myServerless-Stack \
--template-body file:///$PWD/cfn-shops-monitoroper-code.yaml \
--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM

The AWS CloudFormation deployment creates an API Gateway, a DynamoDB table, and a Lambda function with sample code.

  • When it’s complete, go to the Outputs tab of the stack on the AWS CloudFormation console.
  • Record the links for the two APIs: one of them to list the table contents and other one to populate the contents.

3. Populating the DynamoDB table

Run the following commands (simply copy-paste) to populate the DynamoDB table. The below three commands will identify the name of the DynamoDB table from the CloudFormation stack and populate the name in the populate-shops-dynamodb-table.json file.

dynamoDBTableName=$(aws cloudformation list-stack-resources \
--stack-name myServerless-Stack | \
jq '.StackResourceSummaries[]|select(.ResourceType == "AWS::DynamoDB::Table").PhysicalResourceId' | tr -d '"')
sudo sed -i s/"<YOUR-DYNAMODB-TABLE-NAME>"/$dynamoDBTableName/g \
populate-shops-dynamodb-table.json
aws dynamodb batch-write-item \
--request-items file://populate-shops-dynamodb-table.json

The command gives the following output:

{
"UnprocessedItems": {}
}

This populates the DynamoDB table with a few sample records, which you can verify by accessing the ListRestApiEndpointMonitorOper API URL published on the Outputs tab of the CloudFormation stack. The following screenshot shows the output.

Screenshot showing the API output

Enabling DevOps Guru and injecting traffic

In this section, you complete the following high-level steps:

1.   Enable DevOps Guru for the CloudFormation stack.

2.   Wait for the serverless stack to complete.

3.   Update the stack.

4.   Inject HTTP requests into your API.

 

1. Enabling DevOps Guru for the CloudFormation stack

To enable DevOps Guru for CloudFormation, complete the following steps:

  • Run the CloudFormation template to enable DevOps Guru for this CloudFormation stack:
aws cloudformation create-stack \
--stack-name EnableDevOpsGuruForServerlessCfnStack \
--template-body file:///$PWD/EnableDevOpsGuruForServerlessCfnStack.yaml \
--parameters ParameterKey=CfnStackNames,ParameterValue=myServerless-Stack \
--region ${AWS_REGION}
  • When the stack is created, navigate to the Amazon DevOps Guru console.
  • Choose Settings.
  • Under CloudFormation stacks, locate myServerless-Stack.

If you don’t see it, your CloudFormation stack has not been successfully deployed. You may remove and redeploy the EnableDevOpsGuruForServerlessCfnStack stack.

Optionally, you can configure Amazon Simple Notification Service (Amazon SNS) topics or enable AWS Systems Manager integration to create OpsItem entries for every insight created by DevOps Guru. If you need to deploy as a stack set across multiple accounts and Regions, see Easily configure Amazon DevOps Guru across multiple accounts and Regions using AWS CloudFormation StackSets.

2. Waiting for baselining of resources

This is a necessary step to allow DevOps Guru to complete baselining the resources and benchmark the normal behavior. For our serverless stack with 3 resources, we recommend waiting for 2 hours before carrying out next steps. When enabled in a production environment, depending upon the number of resources selected to monitor, it can take up to 24 hours for it to complete baselining.

Note: Unlike many monitoring tools, DevOps Guru does not expect the dashboard to be continuously monitored and thus under normal system health, the dashboard would simply show zero’ed counters for the ongoing insights. It is only when an anomaly is detected, it will raise an alert and display an insight on the dashboard.

3. Updating the CloudFormation stack

When enough time has elapsed, we will make a configuration change to simulate a typical operational event. As shown below, update the CloudFormation template to change the read capacity for the DynamoDB table from 5 to 1.

CloudFormation showing read capacity settings to modify

Run the following command to deploy the updated CloudFormation template:

aws cloudformation update-stack --stack-name myServerless-Stack \
--template-body file:///$PWD/cfn-shops-monitoroper-code.yaml \
--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM

4. Injecting HTTP requests into your API

We now inject ingress traffic in the form of HTTP requests towards the ListRestApiEndpointMonitorOper API, either manually or using the Python script provided in the current directory (sendAPIRequest.py). Due to reduced read capacity, the traffic will oversubscribe the dynamodb tables, thus inducing an anomaly. However, before triggering the script, populate the url parameter with the correct API link for the  ListRestApiEndpointMonitorOper API, listed in the CloudFormation stack’s output tab.

After the script is saved, trigger the script using the following command:

python sendAPIRequest.py

Make sure you’re getting an output of status 200 with the records that we fed into the DynamoDB table (see the following screenshot). You may have to launch multiple tabs (preferably 4) of the terminal to run the script to inject a high rate of traffic.

Terminal showing script executing and injecting traffic to API

After approximately 10 minutes of the script running in a loop, an operational insight is generated in DevOps Guru.

 

Reviewing DevOps Guru insights

While these operations are being run, DevOps Guru monitors for anomalies, logs insights that provide details about the metrics indicating an anomaly, and prints actionable recommendations to mitigate the anomaly. In this section, we review some of these insights.

Under normal conditions, DevOps Guru dashboard will show the ongoing insights counter to be zero. It monitors a high number of metrics behind the scenes and offloads the operator from manually monitoring any counters or graphs. It raises an alert in the form of an insight, only when anomaly occurs.

The following screenshot shows an ongoing reactive insight for the specific CloudFormation stack. When you choose the insight, you see further details. The number under the Total resources analyzed last hour may vary, so for this post, you can ignore this number.

DevOps Guru dashboard showing an ongoing reactive insight

The insight is divided into four sections: Insight overview, Aggregated metrics, Relevant events, and Recommendations. Let’s take a closer look into these sections.

The following screenshot shows the Aggregated metrics section, where it shows metrics for all the resources that it detected anomalies in (DynamoDB, Lambda, and API Gateway). Note that depending upon your traffic pattern, lambda settings, baseline traffic, the list of metrics may vary. In the example below, the timeline shows that an anomaly for DynamoDB started first and was followed by API Gateway and Lambda. This helps us understand the cause and symptoms, and prioritize the right anomaly investigation.

The listing of metrics inside an Insight

Initially, you may see only two metrics listed, however, over time, it populates more metrics that showed anomalies. You can see the anomaly for DynamoDB started earlier than the anomalies for API Gateway and Lambda, thus indicating them as after effects. In addition to the information in the preceding screenshot, you may see Duration p90 and IntegrationLatency p90 (for Lambda and API Gateway respectively, due to increased backend latency) metrics also listed.

Now we check the Relevant events section, which lists potential triggers for the issue. The events listed here depend on the set of operations carried out on this CloudFormation stack’s resources in this Region. This makes it easy for the operator to be reminded of a change that may have caused this issue. The dots (representing events) that are near the Insight start portion of timeline are of particular interest.

Related Events shown inside the Insight

If you need to delve into any of these events, just click of any of these points, and it provides more details as shown below screenshot.

Delving into details of the related event listed in Insight

You can choose the link for an event to view specific details about the operational event (configuration change via CloudFormation update stack operation).

Now we move to the Recommendations section, which provides prescribed guidance for mitigating this anomaly. As seen in the following screenshot, it recommends to roll back the configuration change related to the read capacity for the DynamoDB table. It also lists specific metrics and the event as part of the recommendation for reference.

Recommendations provided inside the Insight

Going back to the metric section of the insight, when we select Graphed anomalies, it shows you the graphs of all related resources. Below screenshot shows a snippet showing anomaly for DynamoDB ReadThrottleEvents metrics. As seen in the below screenshot of the graph pattern, the read operations on the table are exceeding the provisioned throughput of read capacity. This clearly indicates an anomaly.

Graphed anomalies in DevOps Guru

Let’s navigate to the DynamoDB table and check our table configuration. Checking the table properties, we notice that our read capacity is reduced to 1. This is our root cause that led to this anomaly.

Checking the DynamoDB table capacity settings

If we change it to 5, we fix this anomaly. Alternatively, if the traffic is stopped, the anomaly moves to a Resolved state.

The ongoing reactive insight takes a few minutes after resolution to move to a Closed state.

Note: When the insight is active, you may see more metrics get populated over time as we detect further anomalies. When carrying out the preceding tests, if you don’t see all the metrics as listed in the screenshots, you may have to wait longer.

 

Injecting another failure to generate a new DevOps Guru insight

Let’s create a new failure and generate an insight for that.

1.   After you close the insight from the previous section, trigger the HTTP traffic generation loop from the AWS Cloud9 terminal.

We modify the Lambda functions’s resource-based policy by removing the permissions for API Gateway to access this function.

2.   On the Lambda console, open the function ScanFunctionMonitorOper.

3.   On the Permissions tab, access the policy.

Accessing the permissions tab for the Lambda

 

4.   Save a copy of the policy offline as a backup before making any changes.

5.   Note down the “Sid” values for the “AWS:SourceArn” that ends with prod/*/ and prod/*/*.

Checking the Resource-based policy for the Lambda

6.   Run the following command to remove the “Sid” JSON statements in your Cloud9 terminal:

aws lambda remove-permission --function-name ScanFunctionMonitorOper \
--statement-id <Sid-value-ending-with-prod/*/>

7.   Run the same command for the second Sid value:

aws lambda remove-permission --function-name ScanFunctionMonitorOper \
--statement-id <Sid-value-ending-with-prod/*/*>

You should see several 5XX errors, as in the following screenshot.

Terminal output now showing 500 errors for the script output

After less than 8 minutes, you should see a new ongoing reactive insight on the DevOps Guru dashboard.

Let’s take a closer look at the insight. The following screenshot shows the anomalous metric 5XXError Average of API Gateway and its duration. (This insight shows as closed because I had already restored permissions.)

Insight showing 5XX errors for API-Gateway and link to OpsItem

If you have configured to enable creating OpsItem in Systems Manager, you would see the link to OpsItem ID created in the insight, as shown above. This is an optional configuration, which will enable you to track the insights in the form of open tickets (OpsItems) in Systems Manager OpsCenter.

The recommendations provide guidance based upon the related events and anomalous metrics.

After the insight has been generated, reviewed, and verified, restore the permissions by running the following command:

aws lambda add-permission --function-name ScanFunctionMonitorOper  \
--statement-id APIGatewayProdPerm --action lambda:InvokeFunction \
--principal apigateway.amazonaws.com

If needed, you can insert the condition to point to the API Gateway ARN to allow only specific API Gateways to access the Lambda function.

 

Cleaning up

After you walk through this post, you should clean up and un-provision the resources to avoid incurring any further charges.

1.   To un-provision the CloudFormation stacks, on the AWS CloudFormation console, choose Stacks.

2.   Select each stack (EnableDevOpsGuruForServerlessCfnStack and myServerless-Stack) and choose Delete.

3.   Check to confirm that the DynamoDB table created with the stacks is cleaned up. If not, delete the table manually.

4.   Un-provision the AWS Cloud9 environment.

 

Conclusion

This post reviewed how DevOps Guru can continuously monitor the resources in your AWS account in a typical production environment. When it detects an anomaly, it generates an insight, which includes the vended CloudWatch metrics that breached the threshold, the CloudFormation stack in which the resource existed, relevant events that could be potential triggers, and actionable recommendations to mitigate the condition.

DevOps Guru generates insights that are relevant to you based upon the pre-trained machine-learning models, removing the undifferentiated heavy lifting of manually monitoring several events, metrics, and trends.

I hope this post was useful to you and that you would consider DevOps Guru for your production needs.

 

Easily configure Amazon DevOps Guru across multiple accounts and Regions using AWS CloudFormation StackSets

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/configure-devops-guru-multiple-accounts-regions-using-cfn-stacksets/

As applications become increasingly distributed and complex, operators need more automated practices to maintain application availability and reduce the time and effort spent on detecting, debugging, and resolving operational issues.

Enter Amazon DevOps Guru (preview).

Amazon DevOps Guru is a machine learning (ML) powered service that gives you a simpler way to improve an application’s availability and reduce expensive downtime. Without involving any complex configuration setup, DevOps Guru automatically ingests operational data in your AWS Cloud. When DevOps Guru identifies a critical issue, it automatically alerts you with a summary of related anomalies, the likely root cause, and context on when and where the issue occurred. DevOps Guru also, when possible, provides prescriptive recommendations on how to remediate the issue.

Using Amazon DevOps Guru is easy and doesn’t require you to have any ML expertise. To get started, you need to configure DevOps Guru and specify which AWS resources to analyze. If your applications are distributed across multiple AWS accounts and AWS Regions, you need to configure DevOps Guru for each account-Region combination. Though this may sound complex, it’s in fact very simple to do so using AWS CloudFormation StackSets. This post walks you through the steps to configure DevOps Guru across multiple AWS accounts or organizational units, using AWS CloudFormation StackSets.

 

Solution overview

The goal of this post is to provide you with sample templates to facilitate onboarding Amazon DevOps Guru across multiple AWS accounts. Instead of logging into each account and enabling DevOps Guru, you use AWS CloudFormation StackSets from the primary account to enable DevOps Guru across multiple accounts in a single AWS CloudFormation operation. When it’s enabled, DevOps Guru monitors your associated resources and provides you with detailed insights for anomalous behavior along with intelligent recommendations to mitigate and incorporate preventive measures.

We consider various options in this post for enabling Amazon DevOps Guru across multiple accounts and Regions:

  • All resources across multiple accounts and Regions
  • Resources from specific CloudFormation stacks across multiple accounts and Regions
  • For All resources in an organizational unit

In the following diagram, we launch the AWS CloudFormation StackSet from a primary account to enable Amazon DevOps Guru across two AWS accounts and carry out operations to generate insights. The StackSet uses a single CloudFormation template to configure DevOps Guru, and deploys it across multiple accounts and regions, as specified in the command.

Figure: Shows enabling of DevOps Guru using CloudFormation StackSets

Figure: Shows enabling of DevOps Guru using CloudFormation StackSets

When Amazon DevOps Guru is enabled to monitor your resources within the account, it uses a combination of vended Amazon CloudWatch metrics, AWS CloudTrail logs, and specific patterns from its ML models to detect an anomaly. When the anomaly is detected, it generates an insight with the recommendations.

Figure: Shows DevOps Guru generating Insights based upon ingested metrics

Figure: Shows DevOps Guru monitoring the resources and generating insights for anomalies detected

 

Prerequisites

To complete this post, you should have the following prerequisites:

  • Two AWS accounts. For this post, we use the account numbers 111111111111 (primary account) and 222222222222. We will carry out the CloudFormation operations and monitoring of the stacks from this primary account.
  • To use organizations instead of individual accounts, identify the organization unit (OU) ID that contains at least one AWS account.
  • Access to a bash environment, either using an AWS Cloud9 environment or your local terminal with the AWS Command Line Interface (AWS CLI) installed.
  • AWS Identity and Access Management (IAM) roles for AWS CloudFormation StackSets.
  • Knowledge of CloudFormation StackSets

 

(a) Using an AWS Cloud9 environment or AWS CLI terminal
We recommend using AWS Cloud9 to create an environment to get access to the AWS CLI from a bash terminal. Make sure you select Linux2 as the operating system for the AWS Cloud9 environment.

Alternatively, you may use your bash terminal in your favorite IDE and configure your AWS credentials in your terminal.

(b) Creating IAM roles

If you are using Organizations for account management, you would not need to create the IAM roles manually and instead use Organization based trusted access and SLRs. You may skip the sections (b), (c) and (d). If not using Organizations, please read further.

Before you can deploy AWS CloudFormation StackSets, you must have the following IAM roles:

  • AWSCloudFormationStackSetAdministrationRole
  • AWSCloudFormationStackSetExecutionRole

The IAM role AWSCloudFormationStackSetAdministrationRole should be created in the primary account whereas AWSCloudFormationStackSetExecutionRole role should be created in all the accounts where you would like to run the StackSets.

If you’re already using AWS CloudFormation StackSets, you should already have these roles in place. If not, complete the following steps to provision these roles.

(c) Creating the AWSCloudFormationStackSetAdministrationRole role
To create the AWSCloudFormationStackSetAdministrationRole role, sign in to your primary AWS account and go to the AWS Cloud9 terminal.

Execute the following command to download the file:

curl -O https://s3.amazonaws.com/cloudformation-stackset-sample-templates-us-east-1/AWSCloudFormationStackSetAdministrationRole.yml

Execute the following command to create the stack:

aws cloudformation create-stack \
--stack-name AdminRole \
--template-body file:///$PWD/AWSCloudFormationStackSetAdministrationRole.yml \
--capabilities CAPABILITY_NAMED_IAM \
--region us-east-1

(d) Creating the AWSCloudFormationStackSetExecutionRole role
You now create the role AWSCloudFormationStackSetExecutionRole in the primary account and other target accounts where you want to enable DevOps Guru. For this post, we create it for our two accounts and two Regions (us-east-1 and us-east-2).

Execute the following command to download the file:

curl -O https://s3.amazonaws.com/cloudformation-stackset-sample-templates-us-east-1/AWSCloudFormationStackSetExecutionRole.yml

Execute the following command to create the stack:

aws cloudformation create-stack \
--stack-name ExecutionRole \
--template-body file:///$PWD/AWSCloudFormationStackSetExecutionRole.yml \
--parameters ParameterKey=AdministratorAccountId,ParameterValue=111111111111 \
--capabilities CAPABILITY_NAMED_IAM \
--region us-east-1

Now that the roles are provisioned, you can use AWS CloudFormation StackSets in the next section.

 

Running AWS CloudFormation StackSets to enable DevOps Guru

With the required IAM roles in place, now you can deploy the stack sets to enable DevOps Guru across multiple accounts.

As a first step, go to your bash terminal and clone the GitHub repository to access the CloudFormation templates:

git clone https://github.com/aws-samples/amazon-devopsguru-samples
cd amazon-devopsguru-samples/enable-devopsguru-stacksets

 

(a) Configuring Amazon SNS topics for DevOps Guru to send notifications for operational insights

If you want to receive notifications for operational insights generated by Amazon DevOps Guru, you need to configure an Amazon Simple Notification Service (Amazon SNS) topic across multiple accounts. If you have already configured SNS topics and want to use them, identify the topic name and directly skip to the step to enable DevOps Guru.

Note for Central notification target: You may prefer to configure an SNS Topic in the central AWS account so that all Insight notifications are sent to a single target. In such a case, you would need to modify the central account SNS topic policy to allow other accounts to send notifications.

To create your stack set, enter the following command (provide an email for receiving insights):

aws cloudformation create-stack-set \
--stack-set-name CreateDevOpsGuruTopic \
--template-body file:///$PWD/CreateSNSTopic.yml \
--parameters ParameterKey=EmailAddress,ParameterValue=<[email protected]> \
--region us-east-1

Instantiate AWS CloudFormation StackSets instances across multiple accounts and multiple Regions (provide your account numbers and Regions as needed):

aws cloudformation create-stack-instances \
--stack-set-name CreateDevOpsGuruTopic \
--accounts '["111111111111","222222222222"]' \
--regions '["us-east-1","us-east-2"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

After running this command, the SNS topic devops-guru is created across both the accounts. Go to the email address specified and confirm the subscription by clicking the Confirm subscription link in each of the emails that you receive. Your SNS topic is now fully configured for DevOps Guru to use.

Figure: Shows creation of SNS topic to receive insights from DevOps Guru

Figure: Shows creation of SNS topic to receive insights from DevOps Guru

 

(b) Enabling DevOps Guru

Let us first examine the CloudFormation template format to enable DevOps Guru and configure it to send notifications over SNS topics. See the following code snippet:

Resources:
  DevOpsGuruMonitoring:
    Type: AWS::DevOpsGuru::ResourceCollection
    Properties:
      ResourceCollectionFilter:
        CloudFormation:
          StackNames: *

  DevOpsGuruNotification:
    Type: AWS::DevOpsGuru::NotificationChannel
    Properties:
      Config:
        Sns:
          TopicArn: arn:aws:sns:us-east-1:111111111111:SnsTopic

 

When the StackNames property is fed with a value of *, it enables DevOps Guru for all CloudFormation stacks. However, you can enable DevOps Guru for only specific CloudFormation stacks by providing the desired stack names as shown in the following code:

 

Resources:
  DevOpsGuruMonitoring:
    Type: AWS::DevOpsGuru::ResourceCollection
    Properties:
      ResourceCollectionFilter:
        CloudFormation:
          StackNames:
          - StackA
          - StackB

 

For the CloudFormation template in this post, we provide the names of the stacks using the parameter inputs. To enable the AWS CLI to accept a list of inputs, we need to configure the input type as CommaDelimitedList, instead of a base string. We also provide the parameter SnsTopicName, which the template substitutes into the TopicArn property.

See the following code:

AWSTemplateFormatVersion: 2010-09-09
Description: Enable Amazon DevOps Guru

Parameters:
  CfnStackNames:
    Type: CommaDelimitedList
    Description: Comma separated names of the CloudFormation Stacks for DevOps Guru to analyze.
    Default: "*"

  SnsTopicName:
    Type: String
    Description: Name of SNS Topic

Resources:
  DevOpsGuruMonitoring:
    Type: AWS::DevOpsGuru::ResourceCollection
    Properties:
      ResourceCollectionFilter:
        CloudFormation:
          StackNames: !Ref CfnStackNames

  DevOpsGuruNotification:
    Type: AWS::DevOpsGuru::NotificationChannel
    Properties:
      Config:
        Sns:
          TopicArn: !Sub arn:aws:sns:${AWS::Region}:${AWS::AccountId}:${SnsTopicName}

 

Now that we reviewed the CloudFormation syntax, we will use this template to implement the solution. For this post, we will consider three use cases for enabling Amazon DevOps Guru:

(i) For all resources across multiple accounts and Regions

(ii) For all resources from specific CloudFormation stacks across multiple accounts and Regions

(iii) For all resources in an organization

Let us review each of the above points in detail.

(i) Enabling DevOps Guru for all resources across multiple accounts and Regions

Note: Carry out the following steps in your primary AWS account.

You can use the CloudFormation template (EnableDevOpsGuruForAccount.yml) from the current directory, create a stack set, and then instantiate AWS CloudFormation StackSets instances across desired accounts and Regions.

The following command creates the stack set:

aws cloudformation create-stack-set \
--stack-set-name EnableDevOpsGuruForAccount \
--template-body file:///$PWD/EnableDevOpsGuruForAccount.yml \
--parameters ParameterKey=CfnStackNames,ParameterValue=* \
ParameterKey=SnsTopicName,ParameterValue=devops-guru \
--region us-east-1

The following command instantiates AWS CloudFormation StackSets instances across multiple accounts and Regions:

aws cloudformation create-stack-instances \
--stack-set-name EnableDevOpsGuruForAccount \
--accounts '["111111111111","222222222222"]' \
--regions '["us-east-1","us-east-2"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

 

The following screenshot of the AWS CloudFormation console in the primary account running StackSet, shows the stack set deployed in both accounts.

Figure: Screenshot for deployed StackSet and Stack instances

Figure: Screenshot for deployed StackSet and Stack instances

 

The following screenshot of the Amazon DevOps Guru console shows DevOps Guru is enabled to monitor all CloudFormation stacks.

Figure: Screenshot of DevOps Guru dashboard showing DevOps Guru enabled for all CloudFormation stacks

Figure: Screenshot of DevOps Guru dashboard showing DevOps Guru enabled for all CloudFormation stacks

 

(ii) Enabling DevOps Guru for specific CloudFormation stacks for individual accounts

Note: Carry out the following steps in your primary AWS account

In this use case, we want to enable Amazon DevOps Guru only for specific CloudFormation stacks for individual accounts. We use the AWS CloudFormation StackSets override parameters feature to rerun the stack set with specific values for CloudFormation stack names as parameter inputs. For more information, see Override parameters on stack instances.

If you haven’t created the stack instances for individual accounts, use the create-stack-instances AWS CLI command and pass the parameter overrides. If you have already created stack instances, update the existing stack instances using update-stack-instances and pass the parameter overrides. Replace the required account number, Regions, and stack names as needed.

In account 111111111111, create instances with the parameter override with the following command, where CloudFormation stacks STACK-NAME-1 and STACK-NAME-2 belong to this account in us-east-1 Region:

aws cloudformation create-stack-instances \
--stack-set-name  EnableDevOpsGuruForAccount \
--accounts '["111111111111"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-1>,<STACK-NAME-2>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

Update the instances with the following command:

aws cloudformation update-stack-instances \
--stack-set-name EnableDevOpsGuruForAccount \
--accounts '["111111111111"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-1>,<STACK-NAME-2>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

 

In account 222222222222, create instances with the parameter override with the following command, where CloudFormation stacks STACK-NAME-A and STACK-NAME-B belong to this account in the us-east-1 Region:

aws cloudformation create-stack-instances \
--stack-set-name  EnableDevOpsGuruForAccount \
--accounts '["222222222222"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-A>,<STACK-NAME-B>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

Update the instances with the following command:

aws cloudformation update-stack-instances \
--stack-set-name EnableDevOpsGuruForAccount \
--accounts '["222222222222"]' \
--parameter-overrides ParameterKey=CfnStackNames,ParameterValue=\"<STACK-NAME-A>,<STACK-NAME-B>\" \
--regions '["us-east-1"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

 

The following screenshot of the DevOps Guru console shows that DevOps Guru is enabled for only two CloudFormation stacks.

Figure: Screenshot of DevOps Guru dashboard showing DevOps Guru enabled for only two CloudFormation stacks

Figure: Screenshot of DevOps Guru dashboard with DevOps Guru enabled for two CloudFormation stacks

 

(iii) Enabling DevOps Guru for all resources in an organization

Note: Carry out the following steps in your primary AWS account

If you’re using AWS Organizations, you can take advantage of the AWS CloudFormation StackSets feature support for Organizations. This way, you don’t need to add or remove stacks when you add or remove accounts from the organization. For more information, see New: Use AWS CloudFormation StackSets for Multiple Accounts in an AWS Organization.

The following example shows the command line using multiple Regions to demonstrate the use. Update the OU as needed. If you need to use additional Regions, you may have to create an SNS topic in those Regions too.

To create a stack set for an OU and across multiple Regions, enter the following command:

aws cloudformation create-stack-set \
--stack-set-name EnableDevOpsGuruForAccount \
--template-body file:///$PWD/EnableDevOpsGuruForAccount.yml \
--parameters ParameterKey=CfnStackNames,ParameterValue=* \
ParameterKey=SnsTopicName,ParameterValue=devops-guru \
--region us-east-1 \
--permission-model SERVICE_MANAGED \
--auto-deployment Enabled=true,RetainStacksOnAccountRemoval=true

Instantiate AWS CloudFormation StackSets instances for an OU and across multiple Regions with the following command:

aws cloudformation create-stack-instances \
--stack-set-name  EnableDevOpsGuruForAccount \
--deployment-targets OrganizationalUnitIds='["<organizational-unit-id>"]' \
--regions '["us-east-1","us-east-2"]' \
--operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1

In this way, you can run CloudFormation StackSets to enable and configure DevOps Guru across multiple accounts, Regions, with simple and easy steps.

 

Reviewing DevOps Guru insights

Amazon DevOps Guru monitors for anomalies in the resources in the CloudFormation stacks that are enabled for monitoring. The following screenshot shows the initial dashboard.

Figure: Screenshot of DevOps Guru dashboard

Figure: Screenshot of DevOps Guru dashboard

On enabling DevOps Guru, it may take up to 24 hours to analyze the resources and baseline the normal behavior. When it detects an anomaly, it highlights the impacted CloudFormation stack, logs insights that provide details about the metrics indicating an anomaly, and prints actionable recommendations to mitigate the anomaly.

Figure: Screenshot of DevOps Guru dashboard showing ongoing reactive insight

Figure: Screenshot of DevOps Guru dashboard showing ongoing reactive insight

The following screenshot shows an example of an insight (which now has been resolved) that was generated for the increased latency for an ELB. The insight provides various sections in which it provides details about the metrics, the graphed anomaly along with the time duration, potential related events, and recommendations to mitigate and implement preventive measures.

Figure: Screenshot for an Insight generated about ELB Latency

Figure: Screenshot for an Insight generated about ELB Latency

 

Cleaning up

When you’re finished walking through this post, you should clean up or un-provision the resources to avoid incurring any further charges.

  1. On the AWS CloudFormation StackSets console, choose the stack set to delete.
  2. On the Actions menu, choose Delete stacks from StackSets.
  3. After you delete the stacks from individual accounts, delete the stack set by choosing Delete StackSet.
  4. Un-provision the environment for AWS Cloud9.

 

Conclusion

This post reviewed how to enable Amazon DevOps Guru using AWS CloudFormation StackSets across multiple AWS accounts or organizations to monitor the resources in existing CloudFormation stacks. Upon detecting an anomaly, DevOps Guru generates an insight that includes the vended CloudWatch metric, the CloudFormation stack in which the resource existed, and actionable recommendations.

We hope this post was useful to you to onboard DevOps Guru and that you try using it for your production needs.

 

About the Authors

Author's profile photo

 

Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing in the area of DevOps services. He builds technical content for the field enablement and offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality.

 

 

 

 

Nuatu Tseggai is a Cloud Infrastructure Architect at Amazon Web Services. He enjoys working with customers to design and build event-driven distributed systems that span multiple services.

 

Incorporating security in code-reviews using Amazon CodeGuru Reviewer

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/incorporating-security-in-code-reviews-using-amazon-codeguru-reviewer/

Today, software development practices are constantly evolving to empower developers with tools to maintain a high bar of code quality. Amazon CodeGuru Reviewer offers this capability by carrying out automated code-reviews for developers, based on the trained machine learning models that can detect complex defects and providing intelligent actionable recommendations to mitigate those defects. A quick overview of CodeGuru is covered in this blog post.

Security analysis is a critical part of a code review and CodeGuru Reviewer offers this capability with a new set of security detectors. These security detectors introduced in CodeGuru Reviewer are geared towards identifying security risks from the top 10 OWASP categories and ensures that your code follows best practices for AWS Key Management Service (AWS KMS), Amazon Elastic Compute Cloud (Amazon EC2) API, and common Java crypto and TLS/SSL libraries. As of today, CodeGuru security analysis supports Java language, thus we will take an example of a Java application.

In this post, we will walk through the on-boarding workflow to carry out the security analysis of the code repository and generate recommendations for a Java application.

 

Security workflow overview:

The new security workflow, introduced for CodeGuru reviewer, utilizes the source code and build artifacts to generate recommendations. The security detector evaluates build artifacts to generate security-related recommendations whereas other detectors continue to scan the source code to generate recommendations. With the use of build artifacts for evaluation, the detector can carry out a whole-program inter-procedural analysis to discover issues that are caused across your code (e.g., hardcoded credentials in one file that are passed to an API in another) and can reduce false-positives by checking if an execution path is valid or not. You must provide the source code .zip file as well as the build artifact .zip file for a complete analysis.

Customers can run a security scan when they create a repository analysis. CodeGuru Reviewer provides an additional option to get both code and security recommendations. As explained in the following sections, CodeGuru Reviewer will create an Amazon Simple Storage Service (Amazon S3) bucket in your AWS account for that region to upload or copy your source code and build artifacts for the analysis. This repository analysis option can be run on Java code from any repository.

 

Prerequisites

Prepare the source code and artifact zip files: If you do not have your Java code locally, download the source code that you want to evaluate for security and zip it. Similarly, if needed, download the build artifact .jar file for your source code and zip it. It will be required to upload the source code and build artifact as separate .zip files as per the instructions in subsequent sections. Thus even if it is a single file (eg. single .jar file), you will still need to zip it. Even if the .zip file includes multiple files, the right files will be discovered and analyzed by CodeGuru. For our sample test, we will use src.zip and jar.zip file, saved locally.

Creating an S3 bucket repository association:

This section summarizes the high-level steps to create the association of your S3 bucket repository.

1. On the CodeGuru console, choose Code reviews.

2. On the Repository analysis tab, choose Create repository analysis.

Screenshot of initiating the repository analysis

Figure: Screenshot of initiating the repository analysis

 

3. For the source code analysis, select Code and security recommendations.

4. For Repository name, enter a name for your repository.

5. Under Additional settings, for Code review name, enter a name for trackability purposes.

6. Choose Create S3 bucket and associate.

Screenshot to show selection of Security Code Analysis

Figure: Screenshot to show selection of Security Code Analysis

It takes a few seconds to create a new S3 bucket in the current Region. When it completes, you will see the below screen.

Screenshot for Create repository analysis showing S3 bucket created

Figure: Screenshot for Create repository analysis showing S3 bucket created

 

7. Choose Upload to the S3 bucket option and under that choose Upload source code zip file and select the zip file (src.zip) from your local machine to upload.

Screenshot of popup to upload code and artifacts from S3 bucket

Figure: Screenshot of popup to upload code and artifacts from S3 bucket

 

8. Similarly, choose Upload build artifacts zip file and select the zip file (jar.zip) from your local machine and upload.

 

Screenshot for Create repository analysis showing S3 paths populated

Figure: Screenshot for Create repository analysis showing S3 paths populated

 

Alternatively, you can always upload the source code and build artifacts as zip file from any of your existing S3 bucket as below.

9. Choose Browse S3 buckets for existing artifacts and upload from there as shown below:

 

Screenshot to upload code and artifacts from S3 bucket

Figure: Screenshot to upload code and artifacts from an existing S3 bucket

 

10. Now click Create repository analysis and trigger the code review.

A new pending entry is created as shown below.

 

Screenshot of code review in Pending state

Figure: Screenshot of code review in Pending state

After a few minutes, you would see the recommendations generate that would include security analysis too. In the below case, there are 10 recommendations generated.

Screenshot of repository analysis being completed

Figure: Screenshot of repository analysis being completed

 

For the subsequent code reviews, you can use the same repository and upload new files or create a new repository as shown below:

 

Screenshot of subsequent code review making repository selection

Figure: Screenshot of subsequent code review making repository selection

 

Recommendations

Apart from detecting the security risks from the top 10 OWASP categories, the security detector, detects the deep security issues by analyzing data flow across multiple methods, procedures, and files.

The recommendations generated in the area of security are labelled as Security. In the below example we see a recommendation to remove hard-coded credentials and a non-security-related recommendation about refactoring of code for better maintainability.

Screenshot of Recommendations generated

Figure: Screenshot of Recommendations generated

 

Below is another example of recommendations pointing out the potential resource-leak as well as a security issue pointing to a potential risk of path traversal attack.

Screenshot of deep security recommendations

Figure: More examples of deep security recommendations

 

As this blog is focused on on-boarding aspects, we will cover the explanation of recommendations in more detail in a separate blog.

Disassociation of Repository (optional):

The association of CodeGuru to the S3 bucket repository can be removed by following below steps. Navigate to the Repositories page, select the repository and choose Disassociate repository.

Screenshot of disassociating the S3 bucket repo with CodeGuru

Figure: Screenshot of disassociating the S3 bucket repo with CodeGuru

 

Conclusion

This post reviewed the support for on-boarding workflow to carry out the security analysis in CodeGuru Reviewer. We initiated a full repository analysis for the Java code using a separate UI workflow and generated recommendations.

We hope this post was useful and would enable you to conduct code analysis using Amazon CodeGuru Reviewer.

 

About the Author

Author's profile photo

 

Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing in the area of DevOps services. He builds technical content for the field enablement and offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality.

Hardening code reviews on GitHub Enterprise repositories with Amazon CodeGuru

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/hardening-code-reviews-on-github-enterprise-repositories-with-amazon-codeguru/

This post walks you through associating the GitHub Enterprise repository with Amazon CodeGuru Reviewer. This repository support is available for both self-hosted and cloud-hosted GitHub Enterprise options. In this post, we focus on associating CodeGuru with the repository on a self-hosted GitHub Enterprise Server.

CodeGuru Reviewer offers automated code reviews to catch difficult-to-find defects in the early stage of development. It is backed by machine learning models trained from millions of code reviews conducted within AWS and open-source projects. When the code repository is associated with CodeGuru, the creation of pull requests triggers CodeGuru to scan the code and, based on the analysis, provide you actionable recommendations. CodeGuru Reviewer currently identifies code quality issues in the following broad categories:

  • AWS best practices
  • Concurrency
  • Resource leaks
  • Sensitive information leaks
  • Code efficiency
  • Refactoring
  • Input validation

In short, CodeGuru equips your development team with the tools to maintain a high bar of coding standards in the software development process. For more information about configuring CodeGuru for automated code reviews and performance optimization, see Automated code reviews and application profiling with Amazon CodeGuru.

In this post, we discuss the following:

  • Support for GitHub Enterprise repositories: Cloud hosted and Self hosted
  • Associating CodeGuru with GitHub Enterprise Server:
    • Creating the repository provider association on CodeGuru
    • Setting up the host (authorizing AWS CodeStar to access and install the app on an endpoint) and creating the connection (providing access to selected repositories)
    • Completing the association of CodeGuru with the created connection
  • Generating a pull request
  • Cleaning up to avoid unnecessary charges

 

CodeGuru support for Cloud-hosted and self-hosted GitHub Enterprise repositories

GitHub Enterprise offers various ways to host repositories. When configuring CodeGuru to associate with GitHub Enterprise repositories, you can select GitHub for cloud hosted or GitHub Enterprise Server for self hosted. The cloud-hosted option refers to the GitHub cloud, whereas the self-hosted option refers to on-premises or the AWS Cloud.

Selecting the GitHub repository

For this post, we walk though configuring CodeGuru to associate with the self-hosted GitHub Enterprise Server.

We consider the following self-hosted options:

  • With a public endpoint – The Amazon Elastic Compute Cloud (Amazon EC2) compute instance hosting GitHub Enterprise Server is accessible from the internet.
  • With a private endpoint – This is a common scenario for an organization in which GitHub Enterprise Server is hosted on AWS Cloud as a private VPC endpoint and the developers securely access this endpoint from their corporate network or VPN. You could also host it on an on-premises server accessible via a specific VPC.

We revisit these scenarios later in this post when we configure the association with CodeGuru.

 

Before you associate GitHub Enterprise (GHE) repositories with CodeGuru, let us review our GHE server that is set up on our EC2 instance that will be hosting our repositories. The following screenshot shows the EC2 instance launched using the GitHub Enterprise AMI:

GitHub Enterprise Server running on EC2

After you instantiate GitHub Enterprise Server on Amazon EC2 using the desired AMI, confirm its reachability by choosing its DNS name. If it’s configured with a private endpoint or an on-premises server, test the reachability using the endpoint URL from the appropriate source location and logging in.

GitHub Enterprise Login

Log in to the account and check your repositories. The following screenshot shows your repositories listed in the navigation pane.

GitHub Enteprise Dashboard

In all these use cases, we recommend integrating Certificate Authority (CA) authorized certificates on GitHub Enterprise Server to enable a proper TLS handshake when accessing from a browser. If you’re using self-signed certificates on GitHub Enterprise Server, you may have to import those certificates in your browser to enable the TLS handshake and access the service.

 

Associating CodeGuru with GitHub Enterprise Server

This section summarizes the high-level steps to associate the self-hosted GitHub Enterprise Server code repository with CodeGuru Reviewer. To do so, we have to use the AWS CodeStar connections service, which offers a centralized place to create the association between a third-party service and an AWS service.

We first create an AWS CodeStar connection to GitHub Enterprise Server, then use this connection to list the repositories in that GitHub account and select the repository to associate with CodeGuru.

Creating the association

To start creating the association, complete the following steps:

  • On the CodeGuru console, left pane, choose Reviewer.
  • Choose Associated repositories.
  • Select GitHub Enterprise Server.
  • From the drop-down menu, check for any existing connections.
  • If you don’t have any connections, choose Create a GitHub Enterprise Server connection.

Create CodeStar connection

The Create a connection window appears.

  • For Connection name, enter a name.
  • For Choose a host, choose the search box. If no hosts are configured, it displays an informational box.
  • Choose Create host.

Create Connection

The host is configured to model the self-hosted GitHub Enterprise Server; for our use case, it’s hosted on our EC2 instance. This is configured only one time and later reused to create multiple connections in the same Region as the EC2 instance.

After you choose Create host, the Create host window appears.

  • For Host name, enter a name.
  • For Select a provider, choose GitHub Enterprise Server.
  • For Endpoint, enter your endpoint.
  • For VPC configuration, choose No VPC (we don’t need to configure a VPC for this use case).

The following screenshot shows an example of configuring for a self-hosted GitHub server that is accessible over the public internet.

Create CodeStar Host

For a GitHub Enterprise Server that isn’t accessible over the public internet, you need to choose Use a VPC and enter the following:

  • VPC ID – The VPC ID where GitHub Enterprise Server is located or where the on-premises GitHub Enterprise Server is accessible from (a VPC with reachability to the on-premises GitHub Enterprise Server)
  • Subnet ID – The subnets from the preceding VPC where GitHub Enterprise Server is located or where the on-premises GitHub Enterprise Server is accessible from
  • Security group ID – The security groups that allow CodeGuru Reviewer to access GitHub Enterprise Server in the preceding VPC.
  • TLS certificate – You don’t need to enter a TLS certificate if you’re using a certificate signed by a public CA on GitHub Enterprise Server. If you’re using a self-signed certificate or non-public certificate, enter the certificate. To obtain your certificate, complete the following:
    • Navigate to your endpoint URL in Firefox.
    • Choose the lock icon in the address field.
    • Choose Connection, More Information.
    • On the Security tab, choose View Certificate.
    • On the Details tab, choose Export.
    • Save it as a local file.
    • Open the file in your preferred text editor to locate the certificate and copy and paste the text.

 

  • Choose Create host.

Set up Host configuration

Set up Host configuration - 2

 

You should now see the Setup status change from VPC configuration initializing to Pending.

  • Choose Set up host.

Host Configured in Pending State

Setting up the host and creating the connection

When you complete the steps in the previous section, you’re redirected to the Create GitHub App page. You need administrator login credentials for GitHub Enterprise Server to allow the application installation.

  • After you log in with those credentials, for GitHub App name, enter a name.
  • Choose Create GitHub App.

GitHub Login

This step installs the app on GitHub Enterprise Server, and the host Setup status changes to Available.

GitHub Host in Available State

  • Choose Create connection. If the button isn’t available, complete the following:
    • In the navigation pane, choose Settings.
    • Choose Connections.
    • Choose Create connection.
  • On the Create a connection page, select GitHub Enterprise Server.
  • For Connection name, enter a name.
  • For Choose a host, enter your host.
  • Choose Connect to GitHub Enterprise Server.

Create Connection

  • In the window that appears, authorize the app installation.

Authorize CodeStar for GitHub

  • On the Connect to GitHub Enterprise Server page, choose Install a new app.

Install an App on GHE Server

  • In the window that appears, select to apply All repositories or Only select repositories in that organization.

Select Repositories to access for CodeStar

  • When you return to the Connect to GitHub Enterprise Server page, choose Connect.

Completing the connection creation

Completing the association of CodeGuru with the created connection

After you complete the steps in the preceding section, you return to the Associate repository page.

  • For Select source provider, choose GitHub Enterprise Server.
  • For Connect to GitHub Enterprise Server, choose your connection.
  • For Repository location, choose your repository.
  • Choose Associate.

Associating Repository with the connection

In less than 30 seconds, the repository status shows as Associated.

Associated State

Generating a pull request

You’re now ready to create a pull request for any changes to the code and trigger CodeGuru to scan the code and generate actionable recommendations.

 

In your repository, on the Code tab, choose Create pull request.

Create a Pull Request

You can see an active entry for the code review on the Code reviews page with the status Pending.

Code Review in Pending state

When it’s complete, you can see the recommendations on the Pull request tab.

Code Review in completed state

Cleaning up

When you’re finished testing, you should un-provision the following resources to avoid incurring further charges:

  • CodeGuru Reviewer – Remove the association of CodeGuru to the repository, so that any further pull request notifications don’t trigger CodeGuru to perform an automated code review
  • GitHub Enterprise Server – If hosted on an EC2 instance, stop the instance

 

Conclusion

This post reviewed the support of self-hosted GitHub Enterprise Server repositories for CodeGuru Reviewer. You can take advantage of these features to enhance your application development workflow.

 

About the Author

Author Photo

 

Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing in the area of DevOps services. He builds technical content for the field enablement and offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality.

 

 

Automated code reviews on Bitbucket repositories and other enhancements in Amazon CodeGuru

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/automated-code-reviews-on-bitbucket-repositories-and-other-enhancements-in-amazon-codeguru/

This post covers the support for the Atlassian Bitbucket Cloud source repository for Amazon CodeGuru Reviewer, which was recently announced. It also delves into new functionalities introduced to enhance the developer experience in CodeGuru Reviewer.

CodeGuru Reviewer is a machine learning-based service that scans your pull requests and gives you recommendations against your source code in Bitbucket with a description of what’s causing the issue and how to remediate it. CodeGuru Reviewer identifies code quality issues in five broad categories:

  • AWS best practices
  • Concurrency
  • Resource leaks
  • Sensitive information leaks
  • Common code bugs

You can also use CodeGuru Reviewer to provide code quality or AWS best practice recommendations when you migrate your code base to Java or adopt AWS services to achieve scale and robustness. The CodeGuru Reviewer recommendations offer unique capabilities in static code analysis, including the following:

  • Lower false positives
  • Difficult-to-identify issues like resource leaks and concurrency
  • Machine learning to evolve continuously from Amazon code bases
  • AWS best practices

In short, Amazon CodeGuru (Preview) equips your development team with the tools to maintain a high bar of coding standards in the software development process. For more information about configuring CodeGuru (Preview) for automated code reviews and performance optimization, see Automated code reviews and application profiling with Amazon CodeGuru.

 

In this blog, we will go over the following items:

  1. Using the Getting Started wizard.
  2. Associating the Bitbucket repository with the CodeGuru Reviewer and generating a pull request to trigger an automated code review.
  3. Using the new pull request code reviews dashboard to keep track of the history of pull requests and associated CodeGuru Reviewer recommendations
  4. Using supported APIs and AWS Command Line Interface (AWS CLI) to carry out CodeGuru Reviewer functions.

 

Using the Getting Started wizard

If you’re new to CodeGuru (Preview), you should follow the wizard guidance from the Getting started drop-down menu on the CodeGuru (Preview) console. This has been recently introduced to facilitate configuring the service.

Choose CodeGuru Profiler or CodeGuru Reviewer for configuration and follow the guided steps.

Screenshot of Getting Started Wizard

Associating a Bitbucket repository

This section summarizes the high level steps to associate the Bitbucket code repository with CodeGuru Reviewer. For more information, see What is Amazon CodeGuru Reviewer?

1.  On the CodeGuru (Preview) console, choose Reviewer.

2.  Choose Associated repositories.

3.  Choose Associate repository.

4.  Select Bitbucket.

5.  For Connect to Bitbucket, you can choose an existing connection from the drop-down menu or choose Create a Bitbucket connection.

 

 

6.  After choosing Create a Bitbucket connection, for Connection name, provide a name for your connection; for example, Bitbucket-Connection.

Screenshot of Create Connection Step1

7.  Choose Connect to Bitbucket Cloud.

A window opens to log in to Bitbucket.

8.  If you’re not already logged in, enter your Bitbucket login credentials.

9.  In the Bitbucket connection settings section, for Connection name, the name you entered in earlier step will be displayed.

10.  For Bitbucket Cloud apps, you can search for an existing app in the search text box or choose Install a new app.

Screenshot to Connect to Bitbucket

After you choose Install a new app, a pop-up window appears asking for authorization to grant access for AWS CodeStar.

11.  Choose Grant Access.

You now see a connection string populated in the Bitbucket Cloud apps field.

 

12.  Choose Connect.

You return to the earlier screen, which provides you with a drop-down menu of repositories from your Bitbucket account.

 

13.  Choose an appropriate repository and choose Associate.

 

You can see your repository is in the status Associating as shown below:

Screenshot for Reviewer connection with Bitbucket in Associating State

 

When the association is complete, the status shows as Associated. The following screenshot shows two repositories in the Associated state. This indicates that CodeGuru (Preview) is now listening for any pull request notifications from these repositories.

Screenshot showing in Associated state

 

Now you can go to the Bitbucket site and access your repository.

 

14.  On the Bitbucket website, create a pull request for the Bitbucket repository.

Screenshot for Pull-Request

 

CodeGuru (Preview) is notified of any pull requests created on that repository. This triggers the code review for the code referenced in the pull request. You can see generated recommendations on the Activity tab.

 

The following screenshot shows recommendations generated on the Bitbucket dashboard.

Screenshot for Recommendation

 

You can now take further actions to address the comments and merge the code.

 

Using the pull request code reviews dashboard

AWS introduced this dashboard based on early feedback requesting a centralized place to manage details about code review history. The pull request dashboard allows you to view CodeGuru Reviewer recommendations for all code reviews. This page lists all code reviews with accompanying information such as the status of the code review, the repository, the number of recommendations, and more.

PR Dashboard

 

Every code review is assigned a unique ARN that allows you to see the details of an individual code review, including any recommendations that may have surfaced.

 

The following screenshot shows the details of a code review.

PR Dashboard with Status

 

 

The following are key details:

  • Status – Confirms that the code review is complete. This is especially useful in use cases where there are no generated recommendations.
  • Metered lines of code – Refers to the lines of code scanned for the code request, excluding the lines without significant code (for example, commented code or lines with only opening or closing braces).
  • Pull request Id – Provides a link to navigate back to the code review page on the repository and review the complete code review activity.
  • Recommendations – Provides a text search capability to locate specific recommendations

 

The following screenshot shows an individual recommendation.

Individual Recommendations from PR Dashboard

 

You can give feedback on CodeGuru recommendations by choosing either the thumbs up or thumbs down icon below each recommendation. This gives you the opportunity to provide input about whether the recommendation was useful for you. These inputs enable AWS to evolve the service with more relevant recommendations.

 

Using the supported APIs and AWS CLI

The pull request code review dashboard includes the following APIs:

  • DescribeCodeReview
  • ListCodeReviews
  • ListRecommendations
  • PutRecommendationFeedback
  • DescribeRecommendationFeedback
  • ListRecommendationFeedback

In addition, you can use equivalent AWS CLI commands. To use the AWS CLI, you need to install the AWS CLI version 2. For more information about CodeGuru Reviewer operations, see codeguru-reviewer. For more information about CodeGuru Profiler operations, see codeguruprofiler.

 

The following are a few examples exercising the above API’s using aws cli’s:

Admin:~/environment $ aws codeguru-reviewer list-repository-associations
{
{
    "RepositoryAssociationSummaries": [
        {
            "AssociationArn": "arn:aws:codeguru-reviewer:<Region>:<AcctID>:association:11c1b6a3-638f-4a6c-bfc8-3e286785632a",
            "LastUpdatedTimeStamp": "2020-05-06T04:46:16.742000+00:00",
            "AssociationId": "11c1b6a3-638f-4a6c-bfc8-3e286785632a",
            "Name": "codeguruapp",
            "Owner": "nvaidya1",
            "ProviderType": "Bitbucket",
            "State": "Associated"
        },
        {
            "AssociationArn": "arn:aws:codeguru-reviewer:<Region>:<AcctID>:association:29ec5f42-34b4-448e-909e-76fc98bd8e59",
            "LastUpdatedTimeStamp": "2020-05-06T03:13:55.878000+00:00",
            "AssociationId": "29ec5f42-34b4-448e-909e-76fc98bd8e59",
            "Name": "MyJavaProject",
            "Owner": "nvaidya1",
            "ProviderType": "Bitbucket",
            "State": "Associated"
        }
    ]
}

 

Admin:~/environment $ aws codeguru-reviewer list-code-reviews --type PullRequest
{
    "CodeReviewSummaries": [
        {
            "Name": "BITBUCKET-codeguruapp-2-3099f39ad26a2d70e93d0288dfbd8ae301e925d5",
            "CodeReviewArn": "arn:aws:codeguru-reviewer:<Region>:<AcctID>:code-review:PullRequest-BITBUCKET-codeguruapp-2-3099f39ad26a2d70e93d0288dfbd8ae301e925d5",
            "RepositoryName": "codeguruapp",
            "Owner": "<Snip>",
            "ProviderType": "Bitbucket",
            "State": "Completed",
            "CreatedTimeStamp": "2020-05-06T04:47:43.868000+00:00",
            "LastUpdatedTimeStamp": "2020-05-06T04:51:47.492000+00:00",
            "Type": "PullRequest",
            "PullRequestId": "2",
            "MetricsSummary": {
                "MeteredLinesOfCodeCount": 74,
                "FindingsCount": 5
            }
        }
    ]
} 
<SNIP>

Admin:~/environment $ aws codeguru-reviewer list-recommendations --code-review-arn arn:aws:codeguru-reviewer:<Remaining-ARN-String>
{
    "RecommendationSummaries": [
        {
            "FilePath": "src/main/java/com/company/sample/application/EventHandler.java",
            "RecommendationId": "573efe3796b8b96f455591aa6eb4b675f77c95b9cf42f5a260ecc0dee0299e53",
            "StartLine": 170,
            "EndLine": 170,
            "Description": "This code is written so that the client cannot be reused across invocations of the Lambda function.\nTo improve the performance of the Lambda function, consider using static initialization/constructor, global/static variables and singletons. It allows to keep alive and reuse HTTP connections that were established during a previous invocation.\nLearn more about [best practices for working with AWS Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)."
        },
<SNIP>

Admin:~/environment $ aws codeguru-reviewer describe-code-review --code-review-arn arn:aws:codeguru-reviewer:<Remaining-ARN-String>
{
    "CodeReview": {
        "Name": "BITBUCKET-codeguruapp-2-3099f39ad26a2d70e93d0288dfbd8ae301e925d5",
        "CodeReviewArn": "arn:aws:codeguru-reviewer:<Region>:<AcctID>:code-review:PullRequest-BITBUCKET-codeguruapp-2-3099f39ad26a2d70e93d0288dfbd8ae301e925d5",
        "RepositoryName": "codeguruapp",
        "Owner": "<snip>",
        "ProviderType": "Bitbucket",
        "State": "Completed",
        "StateReason": "CodeGuru Reviewer successfully finished reviewing the pull request source code.",
        "CreatedTimeStamp": "2020-05-06T04:47:43.868000+00:00",
        "LastUpdatedTimeStamp": "2020-05-06T04:51:47.492000+00:00",
        "Type": "PullRequest",
        "PullRequestId": "2",
        "SourceCodeType": {
            "CommitDiff": {
                "SourceCommit": "3099f39ad26a2d70e93d0288dfbd8ae301e925d5",
                "DestinationCommit": "7338cd2fd99e6e663b3a312e80e5ca2b570a6891"
            }
        },
        "Metrics": {
            "MeteredLinesOfCodeCount": 74,
            "FindingsCount": 5
        }
    }
}

Cleaning up

When you’re finished testing, you should un-provision the service to avoid incurring further charges:

  • CodeGuru Reviewer – Remove the association of CodeGuru (Preview) to the repository, so that any further pull request notifications doesn’t trigger CodeGuru (Preview) to perform an automated code review.
  • CodeGuru Profiler – If configured, remove the profiling group.

 

Conclusion

This post reviewed CodeGuru (Preview) support for Bitbucket repositories for CodeGuru Reviewer. It also reviewed the pull request dashboard and supported APIs and AWS CLI functionalities. You can take advantage of these features to enhance your application development workflow.

 

Automating code reviews and application profiling with Amazon CodeGuru

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/automating-code-reviews-and-application-profiling-with-amazon-codeguru/

Amazon CodeGuru is a machine learning-based service released during re:Invent 2019 for automated code reviews and application performance recommendations. CodeGuru equips the development teams with the tools to maintain a high bar for coding standards in their software development process.

CodeGuru Reviewer helps developers avoid introducing issues that are difficult to detect, troubleshoot, reproduce, and root-cause. It also enables them to improve application performance. This not only improves the reliability of the software, but also cuts down the time spent chasing difficult issues like race conditions, slow resource leaks, thread safety issues, use of un-sanitized inputs, inappropriate handling of sensitive data, and application performance impact, to name a few.

CodeGuru is powered by machine learning, best practices, and hard-learned lessons across millions of code reviews and thousands of applications profiled on open source projects and internally at Amazon.

The service leverages machine-learning abilities to provide following two functionalities:

a) Reviewer: provides automated code reviews for static code analysis

b) Profiler: provides visibility into and recommendations about application performance during runtime

This blog post provides a short workshop to get a feel for both the above functionalities.

 

Solution overview

The following diagram illustrates a typical developer workflow in which the CodeGuru service is used in the code-review stage and the application performance-monitoring stage. The code reviewer is used for static code analysis backed with trained machine-learning models, and the profiler is used to monitor application performance when the code artifact is deployed and executed on the target compute.

Development Workflow

 

The following diagram depicts additional details to show the big picture in the overall schema of the CodeGuru workflow:

Big Picture Development Workflow
This blog workshop automates the deployment of a sample application from a GitHub link via an AWS CloudFormation template, including the dependencies needed. It also demonstrates the Reviewer functionality.

 

Pre-requisites

Follow these steps to get set up:

1. Set up your AWS Cloud9 environment and access the bash terminal, preferably in the us-east-1 region.

2. Ensure you have an individual GitHub account.

3. Configure an Amazon EC2 key-pair (preferably in the us-east-1 region) and make the .pem file available from the terminal being used.

 

CodeGuru Reviewer

This section demonstrates how to configure CodeGuru Reviewer functionality and associate it with the GitHub repository. Execute the following configuration steps:

Step 1: Fork the GitHub repository
First, log in to your GitHub account and navigate to this sample code. Choose Fork and wait for it to create a fork in your account, as shown in the following screenshot.

Figure shows how to fork a repository from GitHub

 

Step 2: Associate the GitHub repository
Log in to the CodeGuru dashboard and follow these steps:

1. Choose Reviewer from the left panel and choose Associate repository.

2. Choose GitHub and then choose Connect to GitHub.

3. Once authenticated and connection made, you can select the repository aws-codeguru-profiler-sample-application from the Repository location drop-down list and choose Associate, as shown in the following screenshot.

Associate repository with CodeGuru service

This associates the CodeGuru Reviewer with the specified repository and continues to listen for any pull-request events.

Associated repository with CodeGuru service

Step 3: Prepare your code
From your AWS Cloud9 terminal, clone the repository, create a new branch, using the following example commands:

git clone https://github.com/<your-userid>/aws-codeguru-profiler-sample-application.git
cd aws-codeguru-profiler-sample-application
git branch dev
git checkout dev
cd src/main/java/com/company/sample/application/

Open the file: CreateOrderThread.java and goto the line 63. Below line 63 which adds an order entry, insert the if statement under the comment to introduce an order entry check. Please indent the lines with spaces so they are well aligned as shown below.

SalesSystem.orders.put(orderDate, order);
//Check if the Order entered and present
if (SalesSystem.orders.containsKey(orderDate)) {
        System.out.println("New order verified to be present in hashmap: " + SalesSystem.orders.get(orderDate)); 
}
id++;

Once the above changes are introduced in the file, save and commit it to git and push it to the Repository.

git add .
git commit -s -m "Introducing new code that is potentially thread unsafe and inefficient"
cd ../../../../../../../
ls src/main/java/com/company/sample/application/

Now, upload the new branch to the GitHub repository using the following commands. Enter your credentials when asked to authenticate against your GitHub account:

git status
git push --set-upstream origin dev
ls

Step 4: Create a Pull request on GitHub:
In your GitHub account, you should see a new branch: dev.

1. Go to your GitHub account and choose the Pull requests tab.

2. Select New pull request.

3. Under Comparing Changes, select <userid>/aws-codeguru-profiler-sample-application as the source branch.

4. Select the options from the two drop-down lists that selects a merge proposal from the dev branch to the master branch, as shown in the following screenshot.

5. Review the code diffs shown. It should say that the diffs are mergeable (able to merge). Choose Create Pull request to complete the process.

Creating Pull request

This sends a Pull request notification to the CodeGuru service and is reflected on the CodeGuru dashboard, as shown in the following screenshot.

CodeGuru Dashboard

After a short time, a set of recommendations appears on the same GitHub page on which the Pull request was created.

The demo profiler configuration and recommendations shown on the dashboard are provided by default as a sample application profile. See the profiler section of this post for further discussion.

The following screenshot shows a recommendation generated about potential thread concurrency susceptibility:

 

CodeGuru Recommendations on GitHub

 

Another example below to show how the developer can provide feedback about recommendations using emojis:

How to provide feedback for CodeGuru recommendations

As you can see from the recommendations, not only are the code issues detected, but a detailed recommendation is also provided on how to fix the issues, along with a link to examples, and documentation, wherever applicable. For each of the recommendations, a developer can give feedback about whether the recommendation was useful or not with a simple emoji selection under Pick your reaction.

Please note that the CodeGuru service is used to identify difficult-to-find functional defects and not syntactical errors. Syntax errors should be flagged by the IDE and addressed at an early stage of development. CodeGuru is introduced at a later stage in a developer workflow, when the code is already developed, unit-tested, and ready for code-review.

 

CodeGuru Profiler

CodeGuru Profiler functionality focuses on searching for application performance optimizations, identifying your most “expensive” lines of code that take unnecessarily long times or higher-than-expected CPU cycles, for which there is a better/faster/cheaper alternative. It generates recommendations with actions you can take in order to reduce your CPU use, lower your compute costs, and overall improve your application’s performance. The profiler simplifies the troubleshooting and exploration of the application’s runtime behavior using visualizations. Examples of such issues include excessive recreation of expensive objects, expensive deserialization, usage of inefficient libraries, and excessive logging.

This post provides two sample application Demo profiles by default in the profiler section to demonstrate the visualization of CPU and latency characteristics of those profiles. This offers a quick and easy way to check the profiler output without onboarding an application. Additionally, there are recommendations shown for the {CodeGuru} DemoProfilingGroup-WithIssues application profile. However, if you would like to run a proof-of-concept with real application, please follow the procedure below.

The following steps launch a sample application on Amazon EC2 and configure Profiler to monitor the application performance from the CodeGuru service.

Step 1: Create a profiling group
Follow these steps to create a profiling group:

1. From the CodeGuru dashboard, choose Profiler from the left panel.

2. Under Profiling groups, select Create profiling group and type the name of your group. This workshop uses the name DemoProfilingGroup.

3. After typing the name, choose Create in the bottom right corner.

The output page shows you instructions on how to onboard the CodeGuru Profiler Agent library into your application, along with the necessary permissions required to successfully submit data to CodeGuru. This workshop uses the AWS CloudFormation template to automate the onboarding configuration and launch Amazon EC2 with the application and its dependencies.

Step 2: Run AWS Cloudformation to launch Amazon EC2 with the Java application:
This example runs an AWS CloudFormation template that does all the undifferentiated heavy lifting of launching an Amazon EC2 machine and installing JDK, Maven, and the sample demo application.

Once done, it configures the application to use a profiling group named DemoProfilingGroup, compiles the application, and executes it as a background process. This results in the sample demo application running in the region you choose, and submits profiling data to the CodeGuru Profiler Service under the DemoProfilingGroup profiling group created in the previous step.

To launch the AWS CloudFormation template that deploys the demo application, choose the following Launch Stack button, and fill in the Stack name, Key-pair name, and Profiling Group name.

Launch Button

Once the AWS CloudFormation deployment succeeds, log in to your terminal of choice and use ssh to connect to the Amazon EC2 machine. Check the running application using the following commands:

ssh -i '<path-to-keypair.pem-file>' [email protected]<ec2-ip-address>
java -version
mvn -v
ps -ef | grep SalesSystem  => This is the java application running in the background
tail /var/log/cloud-init-output.log  => You should see the following output in 10-15 minutes as INFO: Successfully reported profile

Once the CodeGuru agent is imported into the application, a separate profiler thread spawns when the application runs. It samples the application CPU and Latency characteristics and delivers them to the backend Profiler service for building the application profile.

Step 3: Check the Profiler flame-graphs:
Wait for 10-15 minutes for your profiling-group to become active (if not already) and for profiling data to be submitted and aggregated by the CodeGuru Profiler service.

Visit the Profiling Groups page and choose DemoProfilingGroup. You should see the following page showing your application’s profiling data in a visualization called a flame-graph, as shown in the screenshot below. Detailed explanation about flame-graphs and how to read them follow.

Profiler flame-graph visualization

Profiler extensively uses flame-graph visualizations to display your application’s profiling data since they’re a great way to convey issues once you understand how to read them.

The x-axis shows the stack profile population (collection of stack traces) sorted alphabetically (not chronologically), and the y-axis shows stack depth, counting from zero at the bottom. Each rectangle represents a stack frame. The wider a frame is is, the more often it was present in the stacks. The top edge shows what is on CPU, and beneath it is its ancestry. The colors are usually not significant (they’re picked randomly to differentiate frames).

As shown in the preceding screenshot, the stack traces for the three threads are shown, which are triggered by the code in the SalesSystem.java file.

1) createOrderThread.run

2) createIllegalOrderThread.run

3) listOrderThread.run

The flame-graph also depicts the stack depth and points out specific function names when you hover over that block. The marked areas in the flame-graph highlight the top block functions on-CPU and spikes in stack-trace. This may indicate an opportunity to optimize.

It is evident from the preceding diagram that significant CPU time is being used by an exception stack trace (leftmost). It’s also highlighted in the recommendation report as described in Step 4 below.

The exception is caused by trying to instantiate an Enum class giving it invalid String values. If you review the file CreateIllegalOrderThread.java, you should notice the constructors being called with illegal product names, which are defined in ProductName.java.

Step 4: Profiler Recommendations:
Apart from the real-time visualization of application performance described in the preceding section, a recommendation report (generated after a period of time) may appear, pointing out suspected inefficiencies to fix to improve the application performance. Once the recommendation appears, select the Recommendation link to see the details.

Each section in the Recommendations report can be expanded in order to get instructions on how to resolve the issue, or to examine several locations in which there were issues in your data, as shown in the following screenshot.

CodeGuru profiler recommendations

In the preceding example, the report includes an issue named Checking for Values not in enum, in which it conveys that more time (15.4%) was spent processing exceptions than expected (less than 1%). The reason for the exceptions is described in Step 3 and the resolution recommendations are provided in the report.

 

CodeGuru supportability:

CodeGuru currently supports native Java-based applications for the Reviewer and Profiler functionality. The Reviewer functionality currently supports AWS CodeCommit and all cloud-hosted non-enterprise versions of GitHub products, including Free/Pro/Team, as code repositories.

Amazon CodeGuru Profiler does not have any code repository dependence and works with Java applications hosted on Amazon EC2, containerized applications running on Amazon ECS and Amazon EKS, serverless applications running on AWS Fargate, and on-premises hosts with adequate AWS credentials.

 

Cleanup

At the end of this workshop, once the testing is completed, follow these steps to disable the service to avoid incurring any further charges.

1. Reviewer: Remove the association of the CodeGuru service to the repository, so that any further Pull-request notifications don’t trigger the CodeGuru service to perform an automated code-review.

2. Profiler: Remove the profiling group.

3. Amazon EC2 Compute: Go to the Amazon EC2 service, select the CodeGuru EC2 machine, and select the option to terminate the Amazon EC2 compute.

 

Conclusion

This post reviewed the CodeGuru service and implemented code examples for the Reviewer and Profiler functionalities. It described Reviewer functionality providing automated code-reviews and detailed guidance on fixes. The Profiler functionality enabled you to visualize your real-time application stack for granular inspection and generate recommendations that provided guidance on performance improvements.

I hope this post was informative and enabled you to onboard and test the service, as well as to leverage this service in your application development workflow.

 

About the Author

Author Photo

 

Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing in the area of DevOps services. He builds technical content for the field enablement and offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality.

 

 

Deploying applications at FINRA using AWS CodeDeploy

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/deploying-applications-at-finra-using-aws-codedeploy/

by Geethalaksmi Ramachandran (FINRA – Director, Application Engineering), Avinash Chukka (FINRA – Senior Application Engineer)

At FINRA, a financial regulatory organization that oversees the broker-dealer industry with market intelligence, we have been utilizing the AWS CodeDeploy services to deploy applications on the cloud as well as on on-premises production servers. This blog post provides insight into our operations and experience with CodeDeploy.

Migration overview

Since 2014, we have gone through a systematic effort to migrate over 100 applications from on-premises resources to the AWS Cloud for everything from case management to data ingestion and processing. The applications comprise multiple components or microservices as individually deployable units or a single monolith application with multiple shared components.

Most of the application components, running on Linux and Windows, were entirely redesigned, containerized, and gradually deployed from on-premises to cloud-based Amazon ECS clusters. Fifteen applications were to be deferred for migration or planned for retirement due to other dependencies. Those deferred applications are currently running on on-premises bare-metal servers hosted across 38 Windows servers and 15 Linux servers per environment with a total of 150 Windows servers and 60 Linux servers across all the environments.

The applications were deployed to the on-premises servers earlier using the XL Deploy tool from XebiaLabs. The tool has now been decommissioned and replaced by CodeDeploy to attain more reliability and consistency in deployments across various applications.

Infrastructure and Workflow overview

FINRA’s AWS Cloud infrastructure consists of Amazon EC2 instances, ECS, Amazon EMR clusters, and many resources from other AWS services. We host web applications in ECS clusters, running approximately 200 clusters on each environment and EC2 instances. The infrastructure uses AWS CloudFormation and AWS Java SDK as Infrastructure-as-Code (IaC).

The CI/CD pipeline comprises of:

  • A source stage (per branch) stored in a BitBucket repository
  • A build stage executed on Jenkins build slaves
  • A deploy stage involving deployment from:
    • AWS CloudFormation and SDK to ECS clusters
    • CodeDeploy to EC2 instances
    • CodeDeploy Service to on-premises servers across various development, quality assurance (QA), staging, and production environments.

Jenkins masters running on EC2 instances within an Auto Scaling group orchestrate the CI/CD pipeline. The master spawns the build slaves as ECS tasks to execute a build job. Once the image is built and containerized in the build stage, the build artifact is stored in Artifactory repos for shared common libraries or staged in S3 to be used in deployments. The Jenkins slave invokes the appropriate deployment service – AWS CloudFormation and SDK or CodeDeploy depending on the target server environment, as detailed in the preceding paragraphs. On completion of the deployment, the automated smoke tests are launched.

The following diagram depicts the CD workflow for the on-premises instances:

The Delivery pipeline deploys across various environments such as development, QA, user acceptance testing (UAT), and production. Approval gates control deployments to UAT and production.

CodeDeploy Operations

Our experience of utilizing CodeDeploy services has been “very smooth” since we moved from the XebiaLabs XL Deploy tool three months ago. The main factors that led FINRA to select CodeDeploy for our organization were:

  • Being able to use the same set of deployment tools between on-premises and cloud-based instances
  • Easy portability and reuse of scripts
  • Shared community knowledge rather than isolated expertise

The default deployment parameters were well-suited for our environments and didn’t require altering values or customization. Depending on the application being deployed, deployments can be carried out on cloud-based instances or on-premises instances. Cloud-based instances use AWS CloudFormation templates to trigger CodeDeploy; on-premises instances use AWS CLI-based scripts to trigger CodeDeploy.

The cloud-based deployments in production follow “blue-green” strategy for some of the applications, in which rollback is a critical requirement for minimal disruption. Other applications in cloud follow the “rolling updates” method, where as the on-premises servers in production are upgraded using “in-place” deployment method. The CodeDeploy agents running on on-premises servers are configured with roles to query for required artifacts stored on specific S3 buckets when deploying the package.

The applications’ deployment mappings to the instances are configured based on EC2 Auto Scaling groups in the cloud and based on tags for on-premises resources. Each component is logically mapped to a CodeDeploy deployment group. However, at one point, the maximum number of tags added to the CodeDeploy on-premises instance was restricted to a maximum tag limit of 10, but the instance needed 13 tags corresponding to 13 deployment groups.

We overcame this limitation by adding a common tenth tag on the on-premises instance and also on the remaining deployment groups (10-13) and stored the mapping of instances to deployment groups externally. The deployment script first looks up the mapping and proceeds with the deployment by validating if it matches the target server name, then runs deployments only on the matching target servers and skips deployments on the unmatched servers, as shown in the following diagram.

CodeDeploy offers the following benefits to FINRA:

  • Deployment configurations written as code: CodeDeploy configuration uses CloudFormation templates as Infrastructure-as-Code, which makes it easier to create and maintain.
  • Version controlled deployment code: AWS CloudFormation templates, deployment configuration, and deployment scripts are maintained in the source code repository and version-controlled.
  • Reusability: Most CodeDeploy resource provisioning code is reusable across all the on-premises instances and on different platforms, such as Linux (RHEL) and Windows.
  • Zero maintenance of deployment tool: As a managed service, CodeDeploy does not require maintenance and upgrade.
  • Secrets Management: CodeDeploy integrates with central secrets management systems, and externalizes environment configurations.

Monitoring

FINRA uses the in-house developed DevOps Dashboard to monitor the build and deploy stages, based upon a Grafana UI extracting CI and CD data from Jenkins.

The cloud instances and on-premises servers are configured with agents to stream the real-time logs to a central Splunk Server, where the logs are analyzed and health-monitored. Optionally, the deployment logs are forwarded to the functional owners via email attachments. These logs become critical for the troubleshooting activity in post-mortems of past events. Due to the restrictions on accessing the base instances across various functional teams, the above mechanism enables us to gain visibility into health of the CI/CD infrastructure.

Looking Forward

We plan to migrate the remaining on-premises applications to the cloud after necessary refactoring and retiring of the application dependencies over the next few years.

Our eventual goal is to move towards serverless technologies to eliminate server infrastructure management.

Conclusion

This post reviewed the Infrastructure at FINRA on both AWS Cloud and on-premises, the CI/CD pipeline, and the CodeDeploy workflow integration, as well as examining insights into CodeDeploy use.

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.