Tag Archives: Amazon CodeGuru

A new Spark plugin for CPU and memory profiling

Post Syndicated from Bo Xiong original https://aws.amazon.com/blogs/devops/a-new-spark-plugin-for-cpu-and-memory-profiling/

Introduction

Have you ever wondered if there are low-hanging optimization opportunities to improve the performance of a Spark app? Profiling gives you visibility into the runtime characteristics of a Spark app, helping you identify its bottlenecks and inefficiencies. We’re excited to announce the release of a new Spark plugin that enables profiling for JVM-based Spark apps via Amazon CodeGuru. The plugin is open-sourced on GitHub and published to Maven.

Walkthrough

This post shows how you can onboard this plugin in two steps in under 10 minutes.

  • Step 1: Create a profiling group in Amazon CodeGuru Profiler and grant permission to your Amazon EMR on EC2 role, so that profiler agents can emit metrics to CodeGuru. Detailed instructions can be found here, and a CLI sketch follows this list.
  • Step 2: Reference codeguru-profiler-for-spark when submitting your Spark job, along with PROFILING_CONTEXT and ENABLE_AMAZON_PROFILER defined.
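
For step 1, a minimal sketch using the AWS CLI might look like the following. The profiling group name matches the examples below, but the role name and account ID are illustrative; adjust them to your setup.

aws codeguruprofiler create-profiling-group \
  --profiling-group-name CodeGuru-Spark-Demo

aws iam put-role-policy \
  --role-name EMR_EC2_DefaultRole \
  --policy-name CodeGuruProfilerAgentPolicy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": [
        "codeguru-profiler:ConfigureAgent",
        "codeguru-profiler:PostAgentProfile"
      ],
      "Resource": "arn:aws:codeguru-profiler:us-east-1:111122223333:profilingGroup/CodeGuru-Spark-Demo"
    }]
  }'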

Prerequisites

Your app must be built against Spark 3 and run on Amazon EMR release 6.x or newer. It doesn’t matter whether you’re using Amazon EMR on Amazon Elastic Compute Cloud (Amazon EC2) or on Amazon Elastic Kubernetes Service (Amazon EKS).

Illustrative Example

For the purposes of illustration, consider the following example where profiling results are collected by the plugin and emitted to the “CodeGuru-Spark-Demo” profiling group.

spark-submit \
--master yarn \
--deploy-mode cluster \
--class «your main class» \
--packages software.amazon.profiler:codeguru-profiler-for-spark:1.0 \
--conf spark.plugins=software.amazon.profiler.AmazonProfilerPlugin \
--conf spark.executorEnv.PROFILING_CONTEXT="{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\"}" \
--conf spark.executorEnv.ENABLE_AMAZON_PROFILER=true \
--conf spark.dynamicAllocation.enabled=false \
«your application JAR»

An alternative way to specify PROFILING_CONTEXT and ENABLE_AMAZON_PROFILER is under the yarn-env.export classification for instance groups in the Amazon EMR web console. Note that PROFILING_CONTEXT, if configured in the web console, must additionally escape all of the commas, on top of the escaping required for the spark-submit command above.

[
  {
    "classification": "yarn-env",
    "properties": {},
    "configurations": [
      {
        "classification": "export",
        "properties": {
          "ENABLE_AMAZON_PROFILER": "true",
          "PROFILING_CONTEXT": "{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\"\\,\\\"driverEnabled\\\":\\\"true\\\"}"
        },
        "configurations": []
      }
    ]
  }
]

Once the job above is launched on Amazon EMR, profiling results should show up in your CodeGuru web console in about 10 minutes, similar to the following screenshot. Internally, it has helped us identify issues such as thread contention (revealed by the BLOCKED state in the latency flame graph) and the unnecessary creation of AWS Java clients (revealed by the CPU Hotspots view).

Go to your profiling group in the Amazon CodeGuru web console. Choose Visualize CPU to render a flame graph displaying CPU usage. Switch to the latency view to identify latency bottlenecks, and switch to the heap summary view to identify the objects consuming the most memory.

Troubleshooting

To help with troubleshooting, use the sample Spark app provided with the plugin to check that everything is set up correctly. Note that the profilingGroupName value specified in PROFILING_CONTEXT should match the name of the profiling group created in CodeGuru.

spark-submit \
--master yarn \
--deploy-mode cluster \
--class software.amazon.profiler.SampleSparkApp \
--packages software.amazon.profiler:codeguru-profiler-for-spark:1.0 \
--conf spark.plugins=software.amazon.profiler.AmazonProfilerPlugin \
--conf spark.executorEnv.PROFILING_CONTEXT="{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\"}" \
--conf spark.executorEnv.ENABLE_AMAZON_PROFILER=true \
--conf spark.yarn.appMasterEnv.PROFILING_CONTEXT="{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\",\\\"driverEnabled\\\":\\\"true\\\"}" \
--conf spark.yarn.appMasterEnv.ENABLE_AMAZON_PROFILER=true \
--conf spark.dynamicAllocation.enabled=false \
/usr/lib/hadoop-yarn/hadoop-yarn-server-tests.jar

Running the command above from the master node of your EMR cluster should produce logs similar to the following:

21/11/21 21:27:21 INFO Profiler: Starting the profiler : ProfilerParameters{profilingGroupName='CodeGuru-Spark-Demo', threadSupport=BasicThreadSupport (default), excludedThreads=[Signal Dispatcher, Attach Listener], shouldProfile=true, integrationMode='', memoryUsageLimit=104857600, heapSummaryEnabled=true, stackDepthLimit=1000, samplingInterval=PT1S, reportingInterval=PT5M, addProfilerOverheadAsSamples=true, minimumTimeForReporting=PT1M, dontReportIfSampledLessThanTimes=1}
21/11/21 21:27:21 INFO ProfilingCommandExecutor: Profiling scheduled, sampling rate is PT1S
...
21/11/21 21:27:23 INFO ProfilingCommand: New agent configuration received : AgentConfiguration(AgentParameters={MaxStackDepth=1000, MinimumTimeForReportingInMilliseconds=60000, SamplingIntervalInMilliseconds=1000, MemoryUsageLimitPercent=10, ReportingIntervalInMilliseconds=300000}, PeriodInSeconds=300, ShouldProfile=true)
21/11/21 21:32:23 INFO ProfilingCommand: Attempting to report profile data: start=2021-11-21T21:27:23.227Z end=2021-11-21T21:32:22.765Z force=false memoryRefresh=false numberOfTimesSampled=300
21/11/21 21:32:23 INFO javaClass: [HeapSummary] Processed 20 events.
21/11/21 21:32:24 INFO ProfilingCommand: Successfully reported profile

Note that the CodeGuru Profiler agent uses a reporting interval of five minutes, so any executor process that runs for less than five minutes won’t be reflected in the profiling results. If the wrong profiling group is specified, or it’s associated with the wrong EC2 role in CodeGuru, the log will show a message similar to “CodeGuruProfilerSDKClient: Exception while calling agent orchestration” along with a stack trace that includes a 403 status code. To rule out any network issues (e.g., your EMR job running in a VPC without an outbound gateway or with a misconfigured outbound security group), you can remote into an EMR host and ping the CodeGuru endpoint in your Region (e.g., ping codeguru-profiler.us-east-1.amazonaws.com).

Cleaning up

To avoid incurring future charges, you can delete the profiling group configured in CodeGuru and/or set the ENABLE_AMAZON_PROFILER environment variable to false.
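
If you prefer the AWS CLI, deleting the demo profiling group might look like this (using the name from the examples above):

aws codeguruprofiler delete-profiling-group --profiling-group-name CodeGuru-Spark-Demo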

Conclusion

In this post, we described how to onboard this plugin in two steps. Consider giving it a try for your Spark app. You can find the Maven artifacts here. If you have feature requests, bug reports, feedback of any kind, or would like to contribute, please head over to the GitHub repository.

Author:

Bo Xiong

Bo Xiong is a software engineer with Amazon Ads, leveraging big data technologies to process petabytes of data for billing and reporting. His main interests include performance tuning and optimization for Spark on Amazon EMR, and data mining for actionable business insights.

AWS Week in Review – May 9, 2022

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-week-in-review-may-9-2022/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Another week starts, and here’s a collection of the most significant AWS news from the previous seven days. This week is also the one-year anniversary of CloudFront Functions. It’s exciting to see what customers have built during this first year.

Last Week’s Launches
Here are some launches that caught my attention last week:

Amazon RDS supports PostgreSQL 14 with three levels of cascaded read replicas – That’s 5 replicas per instance, supporting a maximum of 155 read replicas per source instance (5 + 25 + 125 across the three levels) with up to 30X more read capacity. You can now build a more robust disaster recovery architecture with the capability to create Single-AZ or Multi-AZ cascaded read replica DB instances in the same Region or across Regions.

Amazon RDS on AWS Outposts storage auto scaling – AWS Outposts extends AWS infrastructure, services, APIs, and tools to virtually any datacenter. With Amazon RDS on AWS Outposts, you can deploy managed DB instances in your on-premises environments. Now, you can turn on storage auto scaling when you create or modify DB instances by selecting a checkbox and specifying the maximum database storage size.

Amazon CodeGuru Reviewer suppression of files and folders in code reviews – With CodeGuru Reviewer, you can use automated reasoning and machine learning to detect potential code defects that are difficult to find and get suggestions for improvements. Now, you can prevent CodeGuru Reviewer from generating unwanted findings on certain files like test files, autogenerated files, or files that have not been recently updated.

Amazon EKS console now supports all standard Kubernetes resources to simplify cluster management – To make it easy to visualize and troubleshoot your applications, you can now use the console to see all standard Kubernetes API resource types (such as service resources, configuration and storage resources, authorization resources, policy resources, and more) running on your Amazon EKS cluster. More info in the blog post Introducing Kubernetes Resource View in Amazon EKS console.

AWS AppConfig feature flag Lambda Extension support for Arm/Graviton2 processors – Using AWS AppConfig, you can create feature flags or other dynamic configuration and safely deploy updates. The AWS AppConfig Lambda Extension allows you to access this feature flag and dynamic configuration data in your Lambda functions. You can now use the AWS AppConfig Lambda Extension from Lambda functions using the Arm/Graviton2 architecture.

AWS Serverless Application Model (SAM) CLI now supports enabling AWS X-Ray tracing – With the AWS SAM CLI you can initialize, build, package, test on local and cloud, and deploy serverless applications. With AWS X-Ray, you have an end-to-end view of requests as they travel through your application, making them easier to monitor and troubleshoot. Now, you can enable tracing by simply adding a flag to the sam init command.

Amazon Kinesis Video Streams image extraction – With Amazon Kinesis Video Streams you can capture, process, and store media streams. Now, you can also request images via API calls or configure automatic image generation based on metadata tags in ingested video. For example, you can use this to generate thumbnails for playback applications or to have more data for your machine learning pipelines.

AWS GameKit supports Android, iOS, and MacOS games developed with Unreal Engine – With AWS GameKit, you can build AWS-powered game features directly from the Unreal Editor with just a few clicks. Now, the AWS GameKit plugin for Unreal Engine supports building games for the Win64, MacOS, Android, and iOS platforms.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates you might have missed:

🎂 One-year anniversary of CloudFront Functions – I can’t believe it’s been one year since we launched CloudFront Functions. Now, we have tens of thousands of developers actively using CloudFront Functions, with trillions of invocations per month. You can use CloudFront Functions for HTTP header manipulation, URL rewrites and redirects, cache key manipulations/normalization, access authorization, and more. See some examples in this repo. Let’s see what customers built with CloudFront Functions:

  • CloudFront Functions enables Formula 1 to authenticate users with more than 500K requests per second. The solution is using CloudFront Functions to evaluate if users have access to view the race livestream by validating a token in the request.
  • Cloudinary is a media management company that helps its customers deliver content such as videos and images to users worldwide. For them, Lambda@Edge remains an excellent solution for applications that require heavy compute operations, but lightweight operations that require high scalability can now be run using CloudFront Functions. With CloudFront Functions, Cloudinary and its customers are seeing significantly increased performance. For example, one of Cloudinary’s customers began using CloudFront Functions, and in about two weeks it was seeing 20–30 percent better response times. The customer also estimates that they will see 75 percent cost savings.
  • Based in Japan, DigitalCube is a web hosting provider for WordPress websites. Previously, DigitalCube spent several hours completing each of its update deployments. Now, they can deploy updates across thousands of distributions quickly. Using CloudFront Functions, they’ve reduced update deployment times from 4 hours to 2 minutes. In addition, faster updates and less maintenance work result in better quality throughout DigitalCube’s offerings. It’s now easier for them to test on AWS because they can run tests that affect thousands of distributions without having to scale internally or introduce downtime.
  • Amazon.com is using CloudFront Functions to change the way it delivers static assets to customers globally. CloudFront Functions allows them to experiment with hyper-personalization at scale and optimal latency performance. They have been working closely with the CloudFront team during product development, and they like how it is easy to create, test, and deploy custom code and implement business logic at the edge.

AWS open-source news and updates – A newsletter curated by my colleague Ricardo to bring you the latest open-source projects, posts, events, and more. Read the latest edition here.

Reduce log-storage costs by automating retention settings in Amazon CloudWatch – By default, CloudWatch Logs stores your log data indefinitely. This blog post shows how you can reduce log-storage costs by establishing a log-retention policy and applying it across all of your log groups.

Observability for AWS App Runner VPC networking – With X-Ray support in App Runner, you can quickly deploy web applications and APIs at any scale and add tracing without having to manage sidecars or agents. Here’s an example of how you can instrument your applications with the AWS Distro for OpenTelemetry (ADOT).

Upcoming AWS Events
It’s AWS Summits season and here are some virtual and in-person events that might be close to you:

You can now register for re:MARS to get fresh ideas on topics such as machine learning, automation, robotics, and space. The conference will be in person in Las Vegas, June 21–24.

That’s all from me for this week. Come back next Monday for another Week in Review!

Danilo

Use Amazon CodeGuru Profiler to monitor and optimize performance in Amazon Kinesis Data Analytics applications for Apache Flink

Post Syndicated from Praveen Panati original https://aws.amazon.com/blogs/big-data/use-amazon-codeguru-profiler-to-monitor-and-optimize-performance-in-amazon-kinesis-data-analytics-applications-for-apache-flink/

Amazon Kinesis Data Analytics makes it easy to transform and analyze streaming data and gain actionable insights in real time with Apache Flink. Apache Flink is an open-source framework and engine for processing data streams in real time. Kinesis Data Analytics reduces the complexity of building and managing Apache Flink applications using open-source libraries and integrating with other AWS services.

Kinesis Data Analytics is a fully managed service that takes care of everything required to run real-time streaming applications continuously and scale automatically to match the volume and throughput of your incoming data.

As you start building and deploying business-critical, highly scalable, real-time streaming applications, it’s important that you continuously monitor applications for health and performance, and optimize the application to meet the demands of your business.

With Amazon CodeGuru Profiler, developers and operations teams can analyze an application’s performance characteristics and identify bottlenecks in the application code by capturing metrics such as CPU and memory utilization. You can use these metrics and insights to identify the most expensive lines of code; optimize for performance; improve stability, latency, and throughput; and reduce operational cost.

In this post, we discuss some of the challenges of running streaming applications and how you can use Amazon Kinesis Data Analytics for Apache Flink to build reliable, scalable, and highly available streaming applications. We also demonstrate how to set up and use CodeGuru Profiler to monitor an application’s health and capture important metrics to optimize the performance of Kinesis Data Analytics for Apache Flink applications.

Challenges

Streaming applications are particularly complex in nature. The data is continuously generated from a variety of sources with varying amounts of throughput. It’s critical that the application infrastructure scales up and down according to these varying demands without becoming overloaded, and without running into operational issues that might result in downtime.

As such, it’s crucial to constantly monitor the application for health, and identify and troubleshoot the bottlenecks in the application configuration and application code to optimize the application and the underlying infrastructure to meet the demands while also reducing the operational costs.

What Kinesis Data Analytics for Apache Flink and CodeGuru Profiler do for you

With Kinesis Data Analytics for Apache Flink, you can use Java, Scala, and Python to process and analyze real-time streaming data using open-source libraries based on Apache Flink. Kinesis Data Analytics provides the underlying infrastructure for your Apache Flink applications. It handles core capabilities such as provisioning compute resources, parallel computation, automatic scaling, and application backups (implemented as checkpoints and snapshots) to rapidly create, test, deploy, and scale real-time data streaming applications using best practices. This allows developers to focus more on application development and less on Apache Flink infrastructure management.

With CodeGuru Profiler, you can quickly and easily monitor Kinesis Data Analytics for Apache Flink applications to:

  • Identify and troubleshoot CPU and memory issues using CPU and memory (heap summary) utilization metrics
  • Identify bottlenecks and the application’s most expensive lines of code
  • Optimize application performance (latency, throughput) and reduce infrastructure and operational costs

Solution overview

In this post, we use a sample Java application deployed as a Kinesis Data Analytics application for Apache Flink, which consumes the records from Amazon Kinesis Data Streams and uses Apache Flink operators to generate real-time actionable insights. We use this sample to understand and demonstrate how to integrate with CodeGuru Profiler to monitor the health and performance of your Kinesis Data Analytics applications.

The following diagram shows the solution components.

At a high level, the solution covers the following steps:

  1. Set up, configure, and deploy a sample Apache Flink Java application on Kinesis Data Analytics.
  2. Set up CodeGuru Profiler.
  3. Integrate the sample Apache Flink Java application with CodeGuru Profiler.
  4. Use CodeGuru Profiler to analyze, monitor, and optimize application performance.

Set up a sample Apache Flink Java application on Kinesis Data Analytics

Follow the instructions in the GitHub repo and deploy the sample application that includes source code as well as AWS CloudFormation templates to deploy the Kinesis Data Analytics for Apache Flink application.

For this post, I deploy the stack in the us-east-1 Region.

After you deploy the sample application, you can test the application by running the following commands, and providing the correct parameters for the Kinesis data stream and Region.

The Java application has already been downloaded to an EC2 instance that has been provisioned by AWS CloudFormation; you just need to connect to the instance and run the JAR file to start ingesting events into the stream.

$ ssh ec2-user@«Replay instance DNS name»

$ java -jar amazon-kinesis-replay-*.jar -streamName «Kinesis data stream name» -streamRegion «AWS region» -speedup 3600

Set up CodeGuru Profiler

Set up and configure CodeGuru Profiler using the AWS Management Console. For instructions, see Set up in the CodeGuru Profiler console.

For this post, I create a profiling group called flinkappdemo in the us-east-1 Region.

In the next section, I demonstrate how to integrate the sample Kinesis Data Analytics application with the profiling group.

Integrate the sample Apache Flink Java application with CodeGuru Profiler

Download the source code that you deployed earlier and complete the following steps to integrate CodeGuru Profiler with the Java application:

  1. Include the CodeGuru Profiler agent in your application by adding the following dependencies to your pom.xml file:
    <project xmlns="http://maven.apache.org/POM/4.0.0" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    ...
        <repositories>
            <repository>
                <id>codeguru-profiler</id>
                <name>codeguru-profiler</name>
                <url>https://d1osg35nybn3tt.cloudfront.net</url>
            </repository>
        </repositories>
        ... 
        <dependencies>
            <dependency>
                <groupId>com.amazonaws</groupId>
                <artifactId>codeguru-profiler-java-agent</artifactId>
                <version>1.2.1</version>
            </dependency>
        </dependencies>
    ...
    </project> 

  2. Add the CodeGuru Profiler agent configuration code to the Apache Flink Operators (functions), as shown in the following code.

Because multiple operators and operator instances can run on the same TaskManager JVM, and because one instance of the profiler can capture all events in a JVM, you just need to enable the profiler on an operator that is guaranteed to be present on all TaskManager JVMs. For this, you can pick the operator with the highest parallelism. In addition, you could instantiate the profiler as a singleton so that there is one instance per JVM; a sketch of this approach follows the code below.

public class CountByGeoHash implements WindowFunction<TripGeoHash, PickupCount, String, TimeWindow> {

  static {
    new Profiler.Builder()
            .profilingGroupName("flinkappdemo")
            .withHeapSummary(false) // optional - to start without heap profiling set to false or remove line
            .build()
            .start();
  }
  .....
}
public class TripDurationToAverageTripDuration implements WindowFunction<TripDuration, AverageTripDuration, Tuple2<String, String>, TimeWindow> {

  static {
    new Profiler.Builder()
            .profilingGroupName("flinkappdemo")
            .withHeapSummary(false) // optional - to start without heap profiling set to false or remove line
            .build()
            .start();
  }
  .....
}
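
As a rough sketch of the singleton approach mentioned above (the holder class name is hypothetical; the Profiler.Builder API is the same one used in the snippets above):

public final class ProfilerHolder {

  private static boolean started = false;

  // Start the profiler at most once per JVM, regardless of how many
  // operator instances call this from their static initializers.
  public static synchronized void ensureStarted() {
    if (!started) {
      new Profiler.Builder()
              .profilingGroupName("flinkappdemo")
              .build()
              .start();
      started = true;
    }
  }
}
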
  3. Build the application using the following command:
    mvn clean package

The preceding command packages the application into a JAR file.

  4. Copy and replace the JAR file in the Amazon Simple Storage Service (Amazon S3) bucket that was created as part of the CloudFormation stack.
  5. Choose Save changes to update the application.

This step allows the application to use the latest JAR file that contains the CodeGuru Profiler code to start profiling the application.
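
If you prefer the AWS CLI for step 4, copying the rebuilt JAR over the old one might look like the following (the bucket and JAR names are placeholders; use the ones from your CloudFormation stack):

aws s3 cp target/«your application JAR» s3://«artifact bucket»/«your application JAR»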

Use CodeGuru Profiler to analyze, monitor, and optimize application performance

Now that the application has been configured to use CodeGuru Profiler, you can use the metrics and visualizations to explore profiling data collected from the application.

Run the same commands you used when you set up your application to start ingesting data into the Kinesis data stream; this lets CodeGuru Profiler profile the application and gather metrics:

$ ssh ec2-user@«Replay instance DNS name»

$ java -jar amazon-kinesis-replay-*.jar -streamName «Kinesis data stream name» -streamRegion «AWS region» -speedup 3600

On the CodeGuru console, navigate to flinkappdemo on the Profiling groups page.

The summary page displays the status of your profiling group as well as the relevant metrics gathered while profiling the application.

In the following sections, we discuss the metrics and reports on this page in more detail.

CPU summary

Use this summary and the associated metrics CPU utilization and Time spent executing code to understand how much of the instance’s CPU resources are consumed by the application and how frequently the application’s JVM threads were in the RUNNABLE state. This helps you measure the application’s time spent running operations on the CPU so you can tune your application code and configuration.

With the CPU utilization metric, a low value (such as less than 10%) indicates your application doesn’t consume a large amount of the system CPU capacity. This means there could be an opportunity to scale in the application parallelism to reduce cost. A high value (over 90%) indicates your application is consuming a large amount of system CPU capacity. This means there is likely value in looking at your CPU profiles and recommendations for areas of optimization.

When examining the time spent running code, a high percentage (over 90%) indicates most of your application’s time is spent running operations on the CPU. A very low percentage (under 1%) indicates that most of your application was spent in other thread states (such as BLOCKED or WAITING) and there may be more value in looking at the latency visualization, which displays all non-idle thread states, instead of the CPU visualization.

For more information on understanding the CPU summary, see CPU summary.

Latency summary

Use this summary and the metrics Time spent blocked and Time spent waiting to understand which sections of the code are causing threads to block or wait, so you can tune your application code and configuration. For more information, see Latency summary.

The CPU summary and latency visualization can help you analyze the thread blocking and wait operations to further identify bottlenecks and tune your application’s performance and configuration.

Heap usage

Use this summary and the metrics Average heap usage and Peak heap usage to understand how much of your application’s maximum heap capacity is consumed by your application and to spot memory leaks. If the graph grows continuously over time, that could be an indication of a memory leak.

With the average heap usage metric, a high percentage (over 90%) could indicate that your application is close to running out of memory most of the time. If you wish to optimize this, the heap summary visualization shows you the object types consuming the most space on the heap. A low percentage (less than 10%) may indicate that your JVM is being provided much more memory than it actually requires and cost savings may be available by scaling in the application parallelism, although you should check the peak usage too.

Peak heap usage shows the highest percentage of memory consumed by your application seen by the CodeGuru Profiler agent. This is based on the same dataset as seen in the heap summary visualization. A high percentage (over 90%) could indicate that your application has high spikes of memory usage, especially if your average heap usage is low.

For more information on the heap summary, see Understanding the heap summary.

Anomalies and recommendation reports

CodeGuru Profiler uses machine learning to detect and alert on anomalies in your application profile and code. Use this to identify parts of the code for performance optimization and potential savings.

The issues identified during analysis are included in the recommendations report. Use this report to identify potential outages, latency, and other performance issues. For more information on how to work with anomalies and recommendations, see Working with anomalies and recommendation reports.

Visualizations

You can use visualizations associated with the preceding metrics to drill down further to identify what parts of the application configuration and application code are impacting the performance, and use these insights to improve and optimize application performance.

CodeGuru Profiler supports three types of visualizations (CPU, hotspot, and latency views) and a heap summary to display profiling data collected from applications.

Let’s explore the profiling data collected from the preceding steps to observe and monitor application performance.

CPU utilization

The following screenshot shows a snapshot of the application’s profiling data in a flame graph visualization. This view provides a bottom-up view of the application’s profiling data, with the X-axis showing the stack profile and the Y-axis showing the stack depth. Each rectangle represents a stack frame. This visualization can help you identify specific call stacks that lead to inefficient code by looking at the topmost blocks, which represent the functions running on the CPU. This may indicate an opportunity to optimize.

Recommendation report with opportunities to optimize the application

Use the recommendation report to identify and correlate the sections of the application code that can be improved to optimize the application performance. In our example, we can improve the application code by using StringBuilder instead of String.format, by reusing loggers rather than reinitializing them repeatedly, and by applying debug/trace logging selectively, as recommended in the following report.
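
As an illustrative sketch of those three recommendations (the class, method, and variable names here are hypothetical, not taken from the sample application):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TripProcessor {

    // Reuse one static logger instead of reinitializing it on every call.
    private static final Logger LOGGER = LoggerFactory.getLogger(TripProcessor.class);

    public String buildKey(String tripId, long durationMillis) {
        // Guard debug/trace logging so arguments aren't formatted needlessly,
        // and prefer parameterized logging over String.format.
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("Processed trip {} in {} ms", tripId, durationMillis);
        }

        // Use StringBuilder instead of String.format for hot-path string assembly.
        return new StringBuilder(tripId).append('#').append(durationMillis).toString();
    }
}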

Hotspot visualization

The hotspot visualization shows a top-down view of the application’s profiling data. The functions that consume the most CPU time are at the top of the visualization and have the widest block. You can use this view to investigate functions that are computationally expensive.

Latency visualization

In this mode, you can visualize frames with different thread states, which can help you identify functions that spent a lot of time being blocked on shared resources, or waiting for I/O or sleeping. You can use this view to identify threads that are waiting or dependent on other threads and use it to improve latency on all or parts of your application.

You can further analyze any frame in a visualization by selecting the frame, opening the context menu (right-click), and choosing Inspect.

Heap summary

This summary view shows how much heap space your application requires to store all objects required in memory after a garbage collection cycle. If this value continuously grows over time until it reaches total capacity, that could be an indication of a memory leak. If this value is very low compared to total capacity, you may be able to save money by reducing your system’s memory.

For more information on how to work and explore data with visualizations, refer to Working with visualizations and Exploring visualization data.

Clean up

To avoid ongoing charges, delete the resources you created in the previous steps. A CLI equivalent follows the list below.

  1. On the CodeGuru console, choose Profiling groups in the navigation pane.
  2. Select the flinkappdemo profiling group.
  3. On the Actions menu, choose Delete profiling group.
  4. On the AWS CloudFormation console, choose Stacks in the navigation pane.
  5. Select the stack you deployed (kinesis-analytics-taxi-consumer) and choose Delete.
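
If you prefer the AWS CLI, a rough equivalent of this cleanup (assuming the names used in this post) is:

aws codeguruprofiler delete-profiling-group --profiling-group-name flinkappdemo
aws cloudformation delete-stack --stack-name kinesis-analytics-taxi-consumer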

Summary

This post explained how to configure, build, deploy, and monitor real-time streaming Java applications using Kinesis Data Analytics applications for Apache Flink and CodeGuru. We also explained how you can use CodeGuru Profiler to collect runtime performance data and metrics that can help you monitor application health and optimize your application performance.

For more information, see Build and run streaming applications with Apache Flink and Amazon Kinesis Data Analytics for Java Applications and the Amazon Kinesis Data Analytics Developer Guide.

Several customers are now using CodeGuru Profiler to monitor and improve application performance, and you too can start monitoring your applications by following the instructions in the product documentation. Head over to the CodeGuru console to get started today!


About the Author

Praveen Panati is a Senior Solutions Architect at Amazon Web Services. He is passionate about cloud computing and works with AWS enterprise customers to architect, build, and scale cloud-based applications to achieve their business goals. Praveen’s area of expertise includes cloud computing, big data, streaming analytics, and software engineering.

Detecting security issues in logging with Amazon CodeGuru Reviewer

Post Syndicated from Brian Farnhill original https://aws.amazon.com/blogs/devops/detecting-security-issues-in-logging-with-amazon-codeguru-reviewer/

Amazon CodeGuru is a developer tool that provides intelligent recommendations for identifying security risks in code and improving code quality. To help you find potential issues related to logging of inputs that haven’t been sanitized, Amazon CodeGuru Reviewer now includes additional checks for both Python and Java. In this post, we discuss these updates and show examples of code that relate to these new detectors.

In December 2021, an issue was discovered relating to Apache’s popular Log4j Java-based logging utility (CVE-2021-44228). There are several resources available to help mitigate this issue (some of which are highlighted in a post on the AWS Public Sector blog). This issue has drawn attention to the importance of logging inputs in a way that is safe. To help developers understand where un-sanitized values are being logged, CodeGuru Reviewer can now generate findings that highlight these and make it easier to remediate them.

The new detectors and recommendations in CodeGuru Reviewer can detect findings in Java where Log4j is used, and in Python where the standard logging module is used. The following examples demonstrate how this works and what the recommendations look like.

Findings in Java

Consider the following Java sample that responds to a web request.

@RequestMapping("/example.htm")
public ModelAndView handleRequest(HttpServletRequest request, HttpServletResponse response) {
    ModelAndView result = new ModelAndView("success");
    String userId = request.getParameter("userId");
    result.addObject("userId", userId);

    // More logic to populate `result`.
     log.info("Successfully processed {} with user ID: {}.", request.getRequestURL(), userId);
    return result;
}

This simple example generates a result for the initial request, extracting the userId field from the request to do so. Before returning the result, the userId field is passed to the log.info statement. This presents a potential security issue, because the value of userId is not sanitized or changed in any way before it is logged. CodeGuru Reviewer can identify that the variable userId points to a value that needs to be sanitized before it is logged, as it comes from an HTTP request. All user inputs in a request (including query parameters, headers, body, and cookie values) should be checked before logging to ensure a malicious user hasn’t passed values that could compromise your logging mechanism.

CodeGuru Reviewer recommends sanitizing user-provided inputs before logging them to ensure log integrity. Let’s take a look at CodeGuru Reviewer’s findings for this issue.

A screenshot of the AWS Console that describes the log injection risk found by CodeGuru Reviewer

One option to remediate this risk is to add a sanitize() method that checks and modifies the value to remove known risks. The specific process for doing this will vary based on the values you expect and what is safe for your application and its processes. By logging the now-sanitized value, you have mitigated the risks that could impact your logging framework. The modified code sample below shows one example of how this could be addressed.

@RequestMapping("/example.htm")
public ModelAndView handleRequestSafely(HttpServletRequest request, HttpServletResponse response) {
    ModelAndView result = new ModelAndView("success");
    String userId = request.getParameter("userId");
    String sanitizedUserId = sanitize(userId);
    result.addObject("userId", sanitizedUserId);

    // More logic to populate `result`.
    log.info("Successfully processed {} with user ID: {}.", request.getRequestURL(), sanitizedUserId);
    return result;
}

private static String sanitize(String userId) {
    return userId.replaceAll("\\D", "");
}

The example now uses the sanitize() method, which makes a replaceAll() call with a regular expression to remove all non-digit characters. This example assumes the userId value should only contain digit characters, ensuring that any other characters that could be used to expose a vulnerability in the logging framework are removed first.
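
To make the sanitization behavior concrete, here is a hypothetical quick check (the attack string is illustrative of a URL-encoded CRLF log injection attempt):

// Digits pass through untouched.
assert sanitize("12345").equals("12345");
// The encoded CRLF payload is reduced to its digits only.
assert sanitize("123%0d%0aINFO+fake+entry").equals("12300");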

Findings in Python

Now consider the following Python code from a sample Flask project that handles a web request.

from flask import app, current_app, request

@app.route('/log')
def getUserInput():
    input = request.args.get('input')
    current_app.logger.info("User input: %s", input)

    # More logic to process user input.

In this example, the input variable is assigned the input query string value from a web request. Then, the Flask logger records its value as an info-level message. This has the same challenge as the Java example above. However, this time, rather than changing the value, we can inspect it and choose to log it only when it is in a format we expect. A simple example is where we expect only alphanumeric characters in the input variable. The isalnum() function can act as a simple test in this case. Here is an example of what this style of validation could look like.

from flask import app, current_app, request

@app.route('/log')
def safe_getUserInput():
    input = request.args.get('input')    
    if input.isalnum():
        current_app.logger.info("User input: %s", input)        
    else:
        current_app.logger.warning("Unexpected input detected")

Getting started

While implementing log sanitization is a long journey for many, it is a guardrail for maintaining your application’s log integrity. With CodeGuru Reviewer detecting log inputs that are neither sanitized nor validated, developers can use these recommendations as a guide to reduce the risk of log injection attacks. Additionally, you can provide feedback on recommendations in the CodeGuru Reviewer console or by commenting on the code in a pull request. This feedback helps improve the precision of CodeGuru Reviewer, so the recommendations you see get better over time.

To get started with CodeGuru Reviewer, you can use the AWS Free Tier at no cost. For the first 90 days, you can review up to 100K lines of code in onboarded repositories per AWS account. For more information, review the pricing page.

About the authors

Brian Farnhill

Brian Farnhill is a Software Development Engineer in the Australian Public Sector team. His background is in building solutions and helping customers improve DevOps tools and processes. When he isn’t working, you’ll find him either coding for fun or playing online games.

Jia Qin

Jia Qin is part of the Solutions Architect team in Malaysia. She loves developing on AWS, trying out new technology, and sharing her knowledge with customers. Outside of work, she enjoys taking walks and petting cats.

Top 2021 AWS Security service launches security professionals should review – Part 1

Post Syndicated from Ryan Holland original https://aws.amazon.com/blogs/security/top-2021-aws-security-service-launches-part-1/

Given the speed of Amazon Web Services (AWS) innovation, it can sometimes be challenging to keep up with AWS Security service and feature launches. To help you stay current, here’s an overview of some of the most important 2021 AWS Security launches that security professionals should be aware of. This is the first of two related posts; Part 2 will highlight some of the important 2021 launches that security professionals should be aware of across all AWS services.

Amazon GuardDuty

In 2021, the threat detection service Amazon GuardDuty expanded the internal AWS security intelligence it consumes to use more of the intel that AWS internal threat detection teams collect, including additional nation-state threat intelligence. Sharing more of the important intel that internal AWS teams collect lets you quickly improve your protection. GuardDuty also launched domain reputation modeling. These machine learning models take all the domain requests from across all of AWS, and feed them into a model that allows AWS to categorize previously unseen domains as highly likely to be malicious or benign based on their behavioral characteristics. In practice, AWS is seeing that these models often deliver high-fidelity threat detections, identifying malicious domains 7–14 days before they are identified and available on commercial threat feeds.

AWS also launched second generation anomaly detection for GuardDuty. Shortly after the original GuardDuty launch in 2017, AWS added additional anomaly detection for user behavior analytics and monitoring for unusual activity of AWS Identity and Access Management (IAM) users. After receiving customer feedback that the original feature was a little too noisy, and that it was difficult to understand why some findings were generated, the GuardDuty analytics team rebuilt this functionality on an entirely new machine learning model, considerably reducing the number of detections and generating a more accurate positive-detection rate. The new model also added additional context that security professionals (such as analysts) can use to understand why the model shows findings as suspicious or unusual.

Since its introduction, GuardDuty has detected when AWS EC2 Role credentials are used to call AWS APIs from IP addresses outside of AWS. Beginning in early 2022, GuardDuty now supports detection when credentials are used from other AWS accounts, inside the AWS network. This is a complex problem for customers to solve on their own, which is why the GuardDuty team added this enhancement. The solution considers that there are legitimate reasons why a source IP address that is communicating with AWS services APIs might be different than the Amazon Elastic Compute Cloud (Amazon EC2) instance IP address, or a NAT gateway associated with the instance’s VPC. The enhancement also considers complex network topologies that route traffic to one or multiple VPCs—for example, AWS Transit Gateway or AWS Direct Connect.

Our customers are increasingly running container workloads in production; helping to raise the security posture of these workloads became an AWS development priority in 2021. GuardDuty for EKS Protection is one recent feature that has resulted from this investment. This new GuardDuty feature monitors Amazon Elastic Kubernetes Service (Amazon EKS) cluster control plane activity by analyzing Kubernetes audit logs. GuardDuty is integrated with Amazon EKS, giving it direct access to the Kubernetes audit logs without requiring you to turn on or store these logs. Once a threat is detected, GuardDuty generates a security finding that includes container details such as pod ID, container image ID, and associated tags. See below for details on how the new Amazon Inspector is also helping to protect containers.

Amazon Inspector

At AWS re:Invent 2021, we launched the new Amazon Inspector, a vulnerability management service that continually scans AWS workloads for software vulnerabilities and unintended network exposure. The original Amazon Inspector was completely re-architected in this release to automate vulnerability management and to deliver near real-time findings to minimize the time needed to discover new vulnerabilities. This new Amazon Inspector has simple one-click enablement and multi-account support using AWS Organizations, similar to our other AWS Security services. This launch also introduces a more accurate vulnerability risk score, called the Inspector score. The Inspector score is a highly contextualized risk score that is generated for each finding by correlating Common Vulnerability and Exposures (CVE) metadata with environmental factors for resources such as network accessibility. This makes it easier for you to identify and prioritize your most critical vulnerabilities for immediate remediation. One of the most important new capabilities is that Amazon Inspector automatically discovers running EC2 instances and container images residing in Amazon Elastic Container Registry (Amazon ECR), at any scale, and immediately starts assessing them for known vulnerabilities. Now you can consolidate your vulnerability management solutions for both Amazon EC2 and Amazon ECR into one fully managed service.

AWS Security Hub

In addition to a significant number of smaller enhancements throughout 2021, in October AWS Security Hub, an AWS cloud security posture management service, addressed a top customer enhancement request by adding support for cross-Region finding aggregation. You can now view all your findings from all accounts and all selected Regions in a single console view, and act on them from an Amazon EventBridge feed in a single account and Region. Looking back at 2021, Security Hub added 72 additional best practice checks, four new AWS service integrations, and 13 new external partner integrations. A few of these integrations are Atlassian Jira Service Management, Forcepoint Cloud Security Gateway (CSG), and Amazon Macie. Security Hub also achieved FedRAMP High authorization to enable security posture management for high-impact workloads.

Amazon Macie

Based on customer feedback, data discovery tool Amazon Macie launched a number of enhancements in 2021. One new feature, which made it easier to manage Amazon Simple Storage Service (Amazon S3) buckets for sensitive data, was criteria-based bucket selection. This Macie feature allows you to define runtime criteria to determine which S3 buckets should be included in a sensitive data-discovery job. When a job runs, Macie identifies the S3 buckets that match your criteria, and automatically adds or removes them from the job’s scope. Before this feature, once a job was configured, it was immutable. Now, for example, you can create a policy where if a bucket becomes public in the future, it’s automatically added to the scan, and similarly, if a bucket is no longer public, it will no longer be included in the daily scan.

Originally Macie included all managed data identifiers available for all scans. However, customers wanted more surgical search criteria. For example, they didn’t want to be informed if there were exposed data types in a particular environment. In September 2021, Macie launched the ability to enable/disable managed data identifiers. This allows you to customize the data types you deem sensitive and would like Macie to alert on, in accordance with your organization’s data governance and privacy needs.

Amazon Detective

Amazon Detective is a service to analyze and visualize security findings and related data to rapidly get to the root cause of potential security issues. In January 2021, Amazon Detective added a convenient, time-saving integration that allows you to start security incident investigation workflows directly from the GuardDuty console. This new hyperlink pivot in the GuardDuty console takes findings directly from the GuardDuty console into the Detective console. Another time-saving capability added was the IP address drill down functionality. This new capability can be useful to security forensic teams performing incident investigations, because it helps quickly determine the communications that took place from an EC2 instance under investigation before, during, and after an event.

In December 2021, Detective added support for AWS Organizations to simplify management for security operations and investigations across all existing and future accounts in an organization. This launch allows new and existing Detective customers to onboard and centrally manage the Detective graph database for up to 1,200 AWS accounts.

AWS Key Management Service

In June 2021, AWS Key Management Service (AWS KMS) introduced multi-Region keys, a capability that lets you replicate keys from one AWS Region into another. With multi-Region keys, you can more easily move encrypted data between Regions without having to decrypt and re-encrypt with different keys for each Region. Multi-Region keys are supported for client-side encryption using direct AWS KMS API calls, or in a simplified manner with the AWS Encryption SDK and Amazon DynamoDB Encryption Client.
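
As a quick sketch with the AWS CLI, creating a multi-Region primary key and replicating it into another Region might look like this (the account ID and key ID are placeholders):

aws kms create-key --multi-region --region us-east-1
aws kms replicate-key \
  --key-id arn:aws:kms:us-east-1:111122223333:key/mrk-«key id» \
  --replica-region us-west-2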

AWS Secrets Manager

Last year was a busy year for AWS Secrets Manager, with four feature launches to make it easier to manage secrets at scale, not just for client applications, but also for platforms. In March 2021, Secrets Manager launched multi-Region secrets to automatically replicate secrets for multi-Region workloads. Also in March, Secrets Manager added three new rules to AWS Config, to help administrators verify that secrets in Secrets Manager are configured according to organizational requirements. Then in April 2021, Secrets Manager added a CSI driver plug-in, to make it easy to consume secrets from Amazon EKS by using Kubernetes’s standard Secrets Store interface. In November, Secrets Manager introduced a higher secret limit of 500,000 per account to simplify secrets management for independent software vendors (ISVs) that rely on unique secrets for a large number of end customers. Although launched in January 2022, it’s also worth mentioning Secrets Manager’s release of rotation windows to align automatic rotation of secrets with application maintenance windows.
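
As a rough sketch, replicating an existing secret to another Region with the AWS CLI might look like this (the secret name is illustrative):

aws secretsmanager replicate-secret-to-regions \
  --secret-id MyAppSecret \
  --add-replica-regions Region=us-west-2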

Amazon CodeGuru and Secrets Manager

In November 2021, AWS announced a new secrets detector feature in Amazon CodeGuru that searches your codebase for hardcoded secrets. Amazon CodeGuru is a developer tool powered by machine learning that provides intelligent recommendations to detect security vulnerabilities, improve code quality, and identify an application’s most expensive lines of code.

This new feature can pinpoint locations in your code with usernames and passwords; database connection strings, tokens, and API keys from AWS; and other service providers. When a secret is found in your code, CodeGuru Reviewer provides an actionable recommendation that links to AWS Secrets Manager, where developers can secure the secret with a point-and-click experience.

Looking ahead for 2022

AWS will continue to deliver experiences in 2022 that meet administrators where they govern, developers where they code, and applications where they run. A lot of customers are moving to container and serverless workloads; you can expect to see more work on this in 2022. You can also expect to see more work around integrations, like CodeGuru Secrets Detector identifying plaintext secrets in code (as noted previously).

To stay up-to-date in the year ahead on the latest product and feature launches and security use cases, be sure to read the Security service launch announcements. Additionally, stay tuned to the AWS Security Blog for Part 2 of this blog series, which will provide an overview of some of the important 2021 launches that security professionals should be aware of across all AWS services.

If you’re looking for more opportunities to learn about AWS security services, check out AWS re:Inforce, the AWS conference focused on cloud security, identity, privacy, and compliance, which will take place June 28-29 in Houston, Texas.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Ryan Holland

Ryan is a Senior Manager with GuardDuty Security Response. His team is responsible for ensuring GuardDuty provides the best security value to customers, including threat intelligence, behavioral analytics, and finding quality.

Author

Marta Taggart

Marta is a Seattle-native and Senior Product Marketing Manager in AWS Security Product Marketing, where she focuses on data protection services. Outside of work you’ll find her trying to convince Jack, her rescue dog, not to chase squirrels and crows (with limited success).

Automate code reviews with Amazon CodeGuru Reviewer

Post Syndicated from Dhiraj Thakur original https://aws.amazon.com/blogs/devops/automate-code-reviews-with-amazon-codeguru-reviewer/

A common problem in software development is accidentally or unintentionally merging code with bugs, defects, or security vulnerabilities into your main branch. Finding and mitigating these faulty lines of code deployed to the production environment can cause severe outages in running applications and can cost unnecessary time and effort to fix.

Amazon CodeGuru Reviewer tackles this issue using automated code reviews, which allows developers to fix the issue based on automated CodeGuru recommendations before the code moves to production.

This post demonstrates how to use CodeGuru for automated code reviews and uses an AWS CodeCommit approval process to set up a code approval governance model.

Solution overview

In this post, you create an end-to-end code approval workflow and add required approvers to your repository pull requests. This can help you identify and mitigate issues before they’re merged into your main branches.

Let’s discuss the core services highlighted in our solution. CodeGuru Reviewer is a machine learning-based service for automated code reviews and application performance recommendations. CodeCommit is a fully managed and secure source control repository service. It eliminates the need to scale infrastructure to support highly available and critical code repository systems. CodeCommit allows you to configure approval rules on pull requests. Approval rules act as a gatekeeper on your source code changes. Pull requests that fail to satisfy the required approvals can’t be merged into your main branch for production deployment.

The following diagram illustrates the architecture of this solution.

Architecture workflow: configure the CodeCommit repository, create a pull request and approval rule, run the workflow to test the code, review CodeGuru recommendations and make appropriate changes, then run the workflow again to confirm that the code is ready to be merged

The solution has three personas:

  • Repository admin – Sets up the code repository in CodeCommit
  • Developer – Develops the code and uses pull requests in the main branch to move the code to production
  • Code approver – Completes the code review based on the recommendations from CodeGuru and either approves the code or asks for fixes for the issue

The solution workflow contains the following steps:

  1. The repository admin sets up the workflow, including a code repository in CodeCommit for the development group, required access to check in their code to the dev branch, integration of the CodeCommit repository with CodeGuru, and approval details.
  2. Developers develop the code and check it in to the dev branch. This creates a pull request to merge the code into the main branch.
  3. CodeGuru analyzes the code and reports any issues, along with recommendations based on the code quality.
  4. The code approver analyzes the CodeGuru recommendations and provides comments for how to fix the issue in the code.
  5. The developers fix the issue based on the feedback they received from the code approver.
  6. The code approver analyzes the CodeGuru recommendations of the updated code. They approve the code to merge if everything is okay.
  7. The code gets merged in the main branch upon approval from all approvers.
  8. An AWS CodePipeline pipeline is triggered to move the code to the preproduction or production environment based on its configuration.

In the following sections, we walk you through configuring the CodeCommit repository and creating a pull request and approval rule. We then run the workflow to test the code, review recommendations and make appropriate changes, and run the workflow again to confirm that the code is ready to be merged.
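
As a rough sketch, an approval rule template like the one used in this workflow can also be created and attached with the AWS CLI (the template name, approval pool ARN, and account ID are illustrative):

aws codecommit create-approval-rule-template \
  --approval-rule-template-name require-one-approver \
  --approval-rule-template-content '{
    "Version": "2018-11-08",
    "Statements": [{
      "Type": "Approvers",
      "NumberOfApprovalsNeeded": 1,
      "ApprovalPoolMembers": ["arn:aws:sts::111122223333:assumed-role/CodeCommitReview/*"]
    }]
  }'

aws codecommit associate-approval-rule-template-with-repository \
  --approval-rule-template-name require-one-approver \
  --repository-name transaction_alert_repo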

Prerequisites

Before we get started, we create an AWS Cloud9 development environment, which we use to check in the Python code for this solution. The sample Python code for the exercise is available at the link. Download the .py files to a local folder.

Complete the following steps to set up the prerequisite resources:

  1. Set up your AWS Cloud9 environment and access the bash terminal, preferably in the us-east-1 Region.
  2. Create three AWS Identity and Access Management (IAM) users and their roles for the repository admin, developer, and approver by running the AWS CloudFormation template.

Configuring IAM roles and users

  1. Sign in to the AWS Management Console.
  2. Download ‘Persona_Users.yaml’ from GitHub.
  3. Navigate to AWS CloudFormation and choose Create stack, then With new resources (Standard).
  4. Choose Upload a template file and upload the file from your local machine.
  5. Enter a stack name such as ‘Automate-code-reviews-codeguru-blog’.
  6. Enter a temporary password for the IAM users.
  7. Choose Next, keeping all the other default options.
  8. Select I acknowledge that AWS CloudFormation might create IAM resources with custom names, then choose Create stack.

This template creates the three IAM users (repository admin, code approver, and developer) that are required at different steps while following this post.

Configure the CodeCommit repository

Let’s start with the CodeCommit repository. The repository works as the source control for the Java and Python code.

  1. Sign in to the AWS Management Console as the repository admin.
  2. On the CodeCommit console, choose Getting started in the navigation pane.
  3. Choose Create repository.

Create a new CodeCommit repository on the AWS Management Console

  1. For Repository name, enter transaction_alert_repo.
  2. Select Enable Amazon CodeGuru Reviewer for Java and Python – optional.
  3. Choose Create.

Create the CodeCommit repository named transaction_alert_repo and select Enable Amazon CodeGuru Reviewer for Java and Python

The repository is created.

  1. On the repository details page, choose Clone HTTPS on the Clone URL menu.

clone HTTPS link for CodeCommit repo transaction_alert_repo using clone URL menu

  1. Copy the URL to use in the next step to clone the repository in the development environment.

The HTTPS clone link for the CodeCommit repo transaction_alert_repo is available to copy

  1. On the CodeGuru console, choose Repositories in the navigation pane under Reviewer.

You can see our CodeCommit repository is associated with CodeGuru.

The CodeCommit repository is now associated with CodeGuru

  1. Sign in to the console as the developer.
  2. On the AWS Cloud9 console, clone the repository, using the URL that you copied in the previous step.

This action clones the repository and creates the transaction_alert_repo folder in the environment.

git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/transaction_alert_repo
cd transaction_alert_repo
echo "This is a test file" > README.md
git add -A
git commit -m "initial setup"
git push

Clone the CodeCommit repo to AWS Cloud9 using the git clone command; a README.md file is created locally and pushed back to the CodeCommit repo

  1. Check the file in CodeCommit to confirm that the README.md file is copied and available in the CodeCommit repository.

The README.md file is now pushed to the CodeCommit repo

  1. In the AWS Cloud9 environment, choose the transaction_alert_repo folder.
  2. On the File menu, choose Upload Local Files to upload the Python files from your local folder (which you downloaded earlier).

Upload the downloaded Python test files for this post from your local system to AWS Cloud9

  1. Choose Select files and upload read_file.py and read_rule.py.

Drag and drop the Python files onto the AWS Cloud9 upload UI

  1. You can see that both files are copied into the AWS Cloud9 environment under the transaction_alert_repo folder. Push them to a new dev branch:
git checkout -b dev
git add -A
git commit -m "initial import of files"
git push --set-upstream origin dev

The local Python files are pushed to the CodeCommit repo using the git push command

  1. Check the CodeCommit console to confirm that the read_file.py and read_rule.py files are available in the repository.

Check the CodeCommit console to verify these pushed files are available

Create a pull request

Now we create our pull request.

  1. On the CodeCommit console, navigate to your repository and choose Pull requests in the navigation pane.
  2. Choose Create pull request.

Create pull request for the new files added

  1. For Destination, choose master.
  2. For Source, choose dev.
  3. Choose Compare to see any conflict details in merging the request.

The pull request from dev to master shows no conflicts and is ready to create

  1. If the branches are mergeable, enter a title and description.
  2. Choose Create pull request.

The pull request is created and a CodeGuru code review is triggered

Create an approval rule

We now create an approval rule as the repository admin.

  1. Sign in to the console as the repository admin.
  2. On the CodeCommit console, navigate to the pull request you created.
  3. On the Approvals tab, choose Create approval rule.

Creating new Approval rule for any merge action

  1. For Rule name, enter Require an approval before merge.
  2. For Number of approvals needed, enter 1.
  3. Under Approval pool members, provide an IAM ARN value for the code approver.
  4. Choose Create.

Approval Rule mentions, requires an approval before merge
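
You can also create the same rule programmatically. The following is a minimal sketch using the AWS SDK for Python (Boto3); the pull request ID and the approver ARN are hypothetical placeholders to replace with your own values:

import json
import boto3

codecommit = boto3.client("codecommit")

# Rule content mirrors what the console form produces
rule_content = {
    "Version": "2018-11-08",
    "Statements": [{
        "Type": "Approvers",
        "NumberOfApprovalsNeeded": 1,
        # Hypothetical ARN; use the code approver created by the stack
        "ApprovalPoolMembers": ["arn:aws:iam::111122223333:user/CodeApprover"],
    }],
}

codecommit.create_pull_request_approval_rule(
    pullRequestId="1",  # hypothetical pull request ID
    approvalRuleName="Require an approval before merge",
    approvalRuleContent=json.dumps(rule_content),
)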

Review recommendations

We can now view any recommendations regarding our pull request code review.

  1. As the repository admin, on the CodeGuru console, choose Code reviews in the navigation pane.
  2. On the Pull request tab, confirm that the code review is completed, as it might take some time to process.
  3. To review recommendations, choose the completed code review.

Check the CodeGuru code review to see the available recommendations

You can now review the recommendation details, as shown in the following screenshot.

Review CodeGuru review recommendation details

  1. Sign in to the console as the code approver.
  2. Navigate to the pull request to view its details.

Check the pull request details, including its status and approval status

  1. On the Changes tab, confirm that the CodeGuru recommendations are available on the changed files.

confirm that the CodeGuru recommendation files are available

  1. Check the details of each recommendation and provide any comments in the New comment section.

The developer can see this comment as feedback from the approver to fix the issue.

  1. Choose Save.

On the CodeCommit console, the developer can see this comment as feedback from the approver to fix the issue

  1. Enter any overall comments regarding the changes and choose Save.

Enter any overall comments regarding the changes and choose save

  1. Sign in to the console as the developer.
  2. On the CodeCommit console, navigate to the pull request, select it, and choose Changes to review the approver feedback.

Choose Changes to review the approver feedback on the CodeCommit console

Make changes, rerun the code review, and merge the branches

Let’s say the developer makes the required changes in the code to address the issue and uploads the new code to the AWS Cloud9 environment. If CodeGuru doesn’t find additional issues, we can merge the branches.

  1. Run the following command to push the updated code to CodeCommit:
git add -A
git commit -m "code-fixed"
git push --set-upstream origin dev

The updated code is pushed to the CodeCommit repo using the git push command

  1. Sign in to the console as the approver.
  2. Navigate to the code review.

CodeGuru hasn’t found any issue in the updated code, so there are no recommendations.

CodeGuru hasn’t found any issues in the updated code, so there are no recommendations available this time

  1. On the CodeCommit console, you can verify the code and provide your approval comment.
  2. Choose Save.

Using the CodeCommit console, the code can now be verified for approval

  1. On the pull request details page, choose Approve.

New code is found with no conflict and can be approved

Now the developer can see on the CodeCommit console that the pull request is approved.

Code pull request is in Approved status

  1. Sign in to the console as the developer. On the pull request details page, choose Merge.

Approved code is ready to be merged

  1. Select your merge strategy. For this post, we select Fast forward merge.
  2. Choose Merge pull request.

Fast forward merge is used to merge the code

You can see a success message.

Success message is generated for successful merge

  1. On the CodeCommit console, choose Code in the navigation pane for your repository.
  2. Choose master from the branch list.

The read_file.py and read_rule.py files are available under the master branch.

The new files are also available in the master branch because of the successful merge

Clean up the resources

To avoid incurring future charges, remove the resources created by this solution: delete the CodeCommit repository, the AWS Cloud9 environment, and the CloudFormation stack that created the IAM users.

Conclusion

This post highlighted the benefits of CodeGuru automated code reviews. You created an end-to-end code approval workflow and added required approvers to your repository pull requests. This solution can help you identify and mitigate issues before they’re merged into your main branches.

You can get started from the CodeGuru console by integrating CodeGuru Reviewer with your supported CI/CD pipeline.

For more information about automating code reviews, check out the documentation.

About the Authors

Dhiraj Thakur

Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.

Akshay Goel

Akshay is a Cloud Support Associate with Amazon Web Services working closely with all AWS deployment services. He loves to play, test, create, modify, and simplify solutions to make tasks easy and interesting.

Sameer Goel

Sameer is a Sr. Solutions Architect in the Netherlands, who drives customer success by building prototypes on cutting-edge initiatives. Prior to joining AWS, Sameer graduated with a master’s degree from NEU Boston, with a concentration in data science. He enjoys building and experimenting with AI/ML projects on Raspberry Pi.

New for Amazon CodeGuru Reviewer – Detector Library and Security Detectors for Log-Injection Flaws

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-amazon-codeguru-reviewer-detector-library-and-security-detectors-for-log-injection-flaws/

Amazon CodeGuru Reviewer is a developer tool that detects security vulnerabilities in your code and provides intelligent recommendations to improve code quality. For example, CodeGuru Reviewer introduced Security Detectors for Java and Python code to identify security risks from the top ten Open Web Application Security Project (OWASP) categories and follow security best practices for AWS APIs and common crypto libraries. At re:Invent, CodeGuru Reviewer introduced a secrets detector to identify hardcoded secrets and suggest remediation steps to secure your secrets with AWS Secrets Manager. These capabilities help you find and remediate security issues before you deploy.

Today, I am happy to share two new features of CodeGuru Reviewer:

  • A new Detector Library describes in detail the detectors that CodeGuru Reviewer uses when looking for possible defects and includes code samples for both Java and Python.
  • New security detectors have been introduced for detecting log-injection flaws in Java and Python code, similar to what happened with the recent Apache Log4j vulnerability we described in this blog post.

Let’s see these new features in more detail.

Using the Detector Library
To help you understand more clearly which detectors CodeGuru Reviewer uses to review your code, we are now sharing a Detector Library where you can find detailed information and code samples.

These detectors help you build secure and efficient applications on AWS. In the Detector Library, you can find detailed information about CodeGuru Reviewer’s security and code quality detectors, including descriptions, their severity and potential impact on your application, and additional information that helps you mitigate risks.

Note that each detector looks for a wide range of code defects. We include one noncompliant and compliant code example for each detector. However, CodeGuru uses machine learning and automated reasoning to identify possible issues. For this reason, each detector can find a range of defects in addition to the explicit code example shown on the detector’s description page.

Let’s have a look at a few detectors. One detector is looking for insecure cross-origin resource sharing (CORS) policies that are too permissive and may lead to loading content from untrusted or malicious sources.

Detector Library screenshot.
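
As an illustration only, the following minimal Flask sketch shows the kind of overly permissive policy such a detector targets; the flask-cors extension and the origin shown are assumptions, not taken from the Detector Library page:

from flask import Flask
from flask_cors import CORS

app = Flask(__name__)

# Too permissive: a wildcard origin lets any site read responses
# CORS(app, origins="*")

# Safer: restrict cross-origin access to explicitly trusted origins
CORS(app, origins=["https://trusted.example.com"])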

Another detector checks for improper input validation that can enable attacks and lead to unwanted behavior.

Detector Library screenshot.

Specific detectors help you use the AWS SDK for Java and the AWS SDK for Python (Boto3) in your applications. For example, there are detectors that can detect hardcoded credentials, such as passwords and access keys, or inefficient polling of AWS resources.
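
To make the inefficient polling category concrete, here is a hedged Python sketch; the instance ID is a placeholder, and the waiter shown is the standard Boto3 alternative to a hand-rolled polling loop:

import time
import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0123456789abcdef0"  # hypothetical instance ID

# Inefficient: busy-polling DescribeInstances in a tight loop
state = None
while state != "running":
    response = ec2.describe_instances(InstanceIds=[instance_id])
    state = response["Reservations"][0]["Instances"][0]["State"]["Name"]
    time.sleep(1)

# Preferred: the SDK's built-in waiter polls with sensible delays
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])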

New Detectors for Log-Injection Flaws
Following the recent Apache Log4j vulnerability, we introduced in CodeGuru Reviewer new detectors that check if you’re logging anything that is not sanitized and possibly executable. These detectors cover the issue described in CWE-117: Improper Output Neutralization for Logs.

These detectors work with Java and Python code and, for Java, are not limited to the Log4j library. They don’t work by looking at the version of the libraries you use, but check what you are actually logging. In this way, they can protect you if similar bugs happen in the future.

Detector Library screenshot.

According to these detectors, user-provided inputs must be sanitized before they are logged. This prevents an attacker from using the input to break the integrity of your logs, forge log entries, or bypass log monitors.
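
To illustrate the idea, here is a minimal Python sketch; the sanitize_for_log helper is a hypothetical example of stripping characters that could forge log entries, not the detector's prescribed remediation:

import logging

logger = logging.getLogger(__name__)

def sanitize_for_log(value):
    # Remove line breaks so user input cannot inject extra log entries
    return value.replace("\r", "").replace("\n", "")

def handle_login(username):
    # Noncompliant: logging raw user input enables log injection
    # logger.info("Login attempt for user: %s", username)

    # Compliant: sanitize user-provided input before logging it
    logger.info("Login attempt for user: %s", sanitize_for_log(username))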

Availability and Pricing
These new features are available today in all AWS Regions where Amazon CodeGuru is offered. For more information, see the AWS Regional Services List.

The Detector Library is free to browse as part of the documentation. For the new detectors looking for log-injection flaws, standard pricing applies. See the CodeGuru pricing page for more information.

Start using Amazon CodeGuru Reviewer today to improve the security of your code.

Danilo

Amazon CodeGuru Reviewer Introduces Secrets Detector to Identify Hardcoded Secrets and Secure Them with AWS Secrets Manager

Post Syndicated from Alex Casalboni original https://aws.amazon.com/blogs/aws/codeguru-reviewer-secrets-detector-identify-hardcoded-secrets/

Amazon CodeGuru helps you improve code quality and automate code reviews by scanning and profiling your Java and Python applications. CodeGuru Reviewer can detect potential defects and bugs in your code. For example, it suggests improvements regarding security vulnerabilities, resource leaks, concurrency issues, incorrect input validation, and deviation from AWS best practices.

One of the most well-known security practices is the centralization and governance of secrets, such as passwords, API keys, and credentials in general. Like many other developers facing a strict deadline, I’ve often taken shortcuts when managing and consuming secrets in my code, using plaintext environment variables or hardcoding static secrets during local development, and then inadvertently committing them. Of course, I’ve always regretted it and wished there was an automated way to detect and secure these secrets across all my repositories.

Today, I’m happy to announce the new Amazon CodeGuru Reviewer Secrets Detector, an automated tool that helps developers detect secrets in source code or configuration files, such as passwords, API keys, SSH keys, and access tokens.

These new detectors use machine learning (ML) to identify hardcoded secrets as part of your code review process, ultimately helping you to ensure that all new code doesn’t contain hardcoded secrets before being merged and deployed. In addition to Java and Python code, secrets detectors also scan configuration and documentation files. CodeGuru Reviewer suggests remediation steps to secure your secrets with AWS Secrets Manager, a managed service that lets you securely and automatically store, rotate, manage, and retrieve credentials, API keys, and all sorts of secrets.

This new functionality is included as part of the CodeGuru Reviewer service at no additional cost and supports the most common API providers, such as AWS, Atlassian, Datadog, Databricks, GitHub, Hubspot, Mailchimp, Salesforce, SendGrid, Shopify, Slack, Stripe, Tableau, Telegram, and Twilio. Check out the full list here.

Secrets Detectors in Action
First, I select CodeGuru from the AWS Secrets Manager console. This new flow lets me associate a new repository and run a full repository analysis with the goal of identifying hardcoded secrets.

Associating a new repository only takes a few seconds. I connect my GitHub account, and then select a repository named hawkcd, which contains a few Java, C#, JavaScript, and configuration files.

A few minutes later, my full repository is successfully associated and the full scan is completed. I could also have a look at a demo repository analysis called DemoFullRepositoryAnalysisSecrets. You’ll find this demo in the CodeGuru console, under Full repository analysis, in your AWS Account.

I select the repository analysis and find 42 recommendations, including one recommendation for a hardcoded secret (you can filter recommendations by Type=Secrets). CodeGuru Reviewer identified a hardcoded AWS Access Key ID in a .travis.yml file.

The recommendation highlights the importance of storing these secrets securely, provides a link to learn more about the issue, and suggests rotating the identified secret to make sure that it can’t be reused by malicious actors in the future.

CodeGuru Reviewer lets me jump to the exact file and line of code where the secret appears, so that I can dive deeper, understand the context, verify the file history, and take action quickly.

Last but not least, the recommendation includes a Protect your credential button that lets me jump quickly to the AWS Secrets Manager console and create a new secret with the proper name and value.

I’m going to remove the plaintext secret from my source code and update my application to fetch the secret value from AWS Secrets Manager. In many cases, you can keep the current configuration structure and use existing parameters to store the secret’s name instead of the secret’s value.

Once the secret is securely stored, AWS Secrets Manager also provides me with code snippets that fetch my new secret in many programming languages using the AWS SDKs. These snippets let me save time and include the necessary SDK call, as well as the error handling, decryption, and decoding logic.
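
For reference, a minimal Boto3 sketch of that pattern looks like the following; the secret name and Region are hypothetical:

import json
import boto3
from botocore.exceptions import ClientError

def get_secret(secret_name, region="us-east-1"):
    client = boto3.client("secretsmanager", region_name=region)
    try:
        response = client.get_secret_value(SecretId=secret_name)
    except ClientError:
        # e.g. the secret does not exist or access is denied
        raise
    # String secrets are commonly stored as JSON key/value pairs
    return json.loads(response["SecretString"])

# credentials = get_secret("prod/MyApp/DbCredentials")  # hypothetical name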

I’ve shown you how to run a full repository analysis, and of course the same analysis can be performed continuously on every new pull request to help you prevent hardcoded secrets and other issues from being introduced in the future.

Available Today with CodeGuru Reviewer
CodeGuru Reviewer Secrets Detector is available in all regions where CodeGuru Reviewer is available, at no additional cost.

If you’re new to CodeGuru Reviewer, you can try it for free for 90 days with repositories up to 100,000 lines of code. Connecting your repositories and starting a full scan takes only a couple of minutes, whether your code is hosted on AWS CodeCommit, BitBucket, or GitHub. If you’re using GitHub, check out the GitHub Actions integration as well.

You can learn more about Secrets Detector in the technical documentation.

Alex

Detect Python and Java code security vulnerabilities with Amazon CodeGuru Reviewer

Post Syndicated from Haider Naqvi original https://aws.amazon.com/blogs/devops/detect-python-and-java-code-security-vulnerabilities-with-codeguru-reviewer/

with Aaron Friedman (Principal PM-T for xGuru services)

Amazon CodeGuru is a developer tool that uses machine learning and automated reasoning to catch hard to find defects and security vulnerabilities in application code. The purpose of this blog is to show how new CodeGuru Reviewer features help improve the security posture of your Python applications and highlight some of the specific categories of code vulnerabilities that CodeGuru Reviewer can detect. We will also cover newly expanded security capabilities for Java applications.

Amazon CodeGuru Reviewer can detect code vulnerabilities and provide actionable recommendations across dozens of the most common and impactful categories of code security issues, as classified by industry-recognized standards such as the Open Web Application Security Project (OWASP) top ten and the Common Weakness Enumeration (CWE). The following are some of the most severe code vulnerabilities that CodeGuru Reviewer can now help you detect and prevent:

  • Injection weaknesses typically appear in data-rich applications. Every year, hundreds of web servers are compromised using SQL Injection. An attacker can use this method to bypass access control and read or modify application data.
  • Path Traversal security issues occur when an application does not properly handle special elements within a provided path name. An attacker can use it to overwrite or delete critical files and expose sensitive data.
  • Null Pointer Dereference issues can occur due to simple programming errors, race conditions, and others. Null Pointer Dereference can cause availability issues in rare conditions. Attackers can use it to read and modify memory.
  • Weak or broken cryptography is a risk that may compromise the confidentiality and integrity of sensitive data.

Security vulnerabilities present in source code can result in application downtime, leaked data, lost revenue, and lost customer trust. Best practices and peer code reviews aren’t sufficient to prevent these issues. You need a systematic way of detecting and preventing vulnerabilities from being deployed to production. CodeGuru Reviewer Security Detectors can provide a scalable approach to DevSecOps, a mechanism that employs automation to address security issues early in the software development lifecycle. Security detectors automate the detection of hard-to-find security vulnerabilities in Java and now Python applications, and provide actionable recommendations to developers.

By baking security mechanisms into each step of the process, DevSecOps enables the development of secure software without sacrificing speed. However, false positives raised by Static Application Security Testing (SAST) tools must often be manually triaged, which works against this value. CodeGuru uses techniques from automated reasoning, and specifically, precise data-flow analysis, to enhance the precision of code analysis. CodeGuru therefore reports fewer false positives.

Many customers are already embracing open-source code analysis tools in their DevSecOps practices. However, integrating such software into a pipeline requires a heavy up-front lift, ongoing maintenance, and patience to configure. Furthering the utility of new security detectors in Amazon CodeGuru Reviewer, this update adds integrations with Bandit and Infer, two widely adopted open-source code analysis tools. In Java code bases, CodeGuru Reviewer now provides recommendations from Infer that detect null pointer dereferences, thread safety violations, and improper use of synchronization locks. And in Python code, the service detects instances of SQL injection, path traversal attacks, weak cryptography, or the use of compromised libraries. Security issues found and recommendations generated by these tools are shown in the console, in pull request comments, or through CI/CD integrations, alongside code recommendations generated by CodeGuru’s code quality and security detectors. Let’s dive deep and review some examples of code vulnerabilities that CodeGuru Reviewer can help detect.

Injection (Python)

Amazon CodeGuru Reviewer can help detect the most common injection vulnerabilities including SQL, XML, OS command, and LDAP types. For example, SQL injection occurs when SQL queries are constructed through string formatting. An attacker could manipulate the program inputs to modify the intent of the SQL query. The following Python snippet executes a SQL query constructed through string formatting and can be an attack vector:

import psycopg2
from flask import request

def removing_product():
    productId = request.args.get('productId')
    # Noncompliant: the query is built by concatenating raw user input
    query_str = 'DELETE FROM products WHERE productID = ' + productId
    return query_str

def sql_injection():
    connection = psycopg2.connect("dbname=test user=postgres")
    cur = connection.cursor()
    query = removing_product()
    cur.execute(query)

CodeGuru will flag a potential SQL injection using the Bandit security detector and make the following recommendation:

>> We detected a SQL command that might use unsanitized input. 
This can result in an SQL injection. To increase the security of your code, 
sanitize inputs before using them to form a query string.

To avoid this, the user should correct the code to use a parameter sanitization mechanism that guards against SQL injection as done below:

import psycopg2
from flask import request

def removing_product():
    # Compliant: user input is sanitized before the query string is formed
    productId = sanitize_input(request.args.get('productId'))
    query_str = 'DELETE FROM products WHERE productID = ' + productId
    return query_str

def sql_injection():
    connection = psycopg2.connect("dbname=test user=postgres")
    cur = connection.cursor()
    query = removing_product()
    cur.execute(query)

In the corrected code above, the user-supplied sanitize_input method takes care of sanitizing user inputs.
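
Where the database driver supports it, an even stronger defense is to bind user input as a query parameter instead of concatenating strings. A minimal psycopg2 sketch, reusing the table and parameter names from the example above:

import psycopg2
from flask import request

def remove_product_safely():
    product_id = request.args.get('productId')
    connection = psycopg2.connect("dbname=test user=postgres")
    cur = connection.cursor()
    # The driver quotes the bound value, so input cannot alter the query
    cur.execute("DELETE FROM products WHERE productID = %s", (product_id,))
    connection.commit()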

Path Traversal (Python)

When applications use user input to create a path to read or write local files, an attacker can manipulate the input to overwrite or delete critical files or expose sensitive data. These critical files might include source code, sensitive data, or application configuration information.

from flask import Flask, request

app = Flask(__name__)

@app.route('/someurl')
def path_traversal():
    # Noncompliant: the file name comes straight from the request
    file_name = request.args["file"]
    f = open("./{}".format(file_name))
    f.close()

In the above example, the file name is passed directly to the open API without checking or filtering its content.

CodeGuru’s recommendation:

>> Potentially untrusted inputs are used to access a file path.
To protect your code from a path traversal attack, verify that your inputs are
sanitized.

In response, the developer should sanitize the data before using it to create or open a file.

@app.route('/someurl')
def path_traversal():
    # Compliant: input is sanitized before the file is opened
    file_name = sanitize_data(request.args["file"])
    f = open("./{}".format(file_name))
    f.close()

In this modified code, the input file_name has been cleaned and filtered by the sanitize_data API call.
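
One possible implementation of such sanitization, shown here as a hedged sketch, is to resolve the requested path and verify that it stays inside an allowed base directory; the base directory and route name are assumptions, and the app and request objects come from the earlier snippet:

import os
from flask import abort

BASE_DIR = os.path.realpath("./files")  # hypothetical allowed directory

@app.route('/safeurl')
def read_file_safely():
    file_name = request.args["file"]
    full_path = os.path.realpath(os.path.join(BASE_DIR, file_name))
    # Reject traversal attempts such as ../../etc/passwd
    if not full_path.startswith(BASE_DIR + os.sep):
        abort(400)
    with open(full_path) as f:
        return f.read()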

Null Pointer Dereference (Java)

Infer detectors are a new addition that complement CodeGuru Reviewer’s native Java Security Detectors. Infer detectors, based on the Facebook Infer static analyzer, include rules to detect null pointer dereferences, thread safety violations, and improper use of synchronization locks. In particular, the null-pointer-dereference rule detects paths in the code that lead to null pointer exceptions in Java. Null pointer dereference is a very common pitfall in Java and is considered one of the 25 most dangerous software weaknesses.

The Infer null-pointer-dereference rule guards against unexpected null pointer exceptions by detecting locations in the code where pointers that could be null are dereferenced. CodeGuru augments the Infer analyzer with knowledge about the AWS APIs, which allows the security detectors to catch potential null pointer exceptions when using AWS APIs.

For example, the AWS DynamoDBMapper class provides a convenient abstraction for mapping Amazon DynamoDB tables to Java objects. However, developers should be aware that DynamoDB Mapper load operations can return a null pointer if the object was not found in the table. The following code snippet updates a record in a catalog using a DynamoDB Mapper:

DynamoDBMapper mapper = new DynamoDBMapper(client);
// Retrieve the item.
CatalogItem itemRetrieved = mapper.load(CatalogItem.class, 601);
// Update the item.
itemRetrieved.setISBN("622-2222222222");
itemRetrieved.setBookAuthors(
    new HashSet<String>(Arrays.asList("Author1", "Author3")));
mapper.save(itemRetrieved);

CodeGuru will protect against a potential null dereference by making the following recommendation:

object `itemRetrieved` last assigned on line 88 could be null
and is dereferenced at line 90.

In response, the developer should add a null check to prevent the null pointer dereference from occurring.

DynamoDBMapper mapper = new DynamoDBMapper(client);
// Retrieve the item.
CatalogItem itemRetrieved = mapper.load(CatalogItem.class, 601);
// Update the item.
if (itemRetrieved != null) {
    itemRetrieved.setISBN("622-2222222222");
    itemRetrieved.setBookAuthors(
        new HashSet<String>(Arrays.asList("Author1", "Author3")));
    mapper.save(itemRetrieved);
} else {
    throw new CatalogItemNotFoundException();
}

Weak or broken cryptography (Python)

Python security detectors support popular frameworks, along with built-in APIs such as cryptography and pycryptodome, to identify cipher-related vulnerabilities. As suggested in CWE-327, the use of a non-standard or inadequate key-length algorithm is dangerous because an attacker may be able to break the algorithm and compromise whatever data has been protected. In this example, `PBKDF2` is used with a weak algorithm, which may lead to cryptographic vulnerabilities.

from Crypto.Protocol.KDF import PBKDF2
from Crypto.Hash import SHA1
from Crypto.Random import get_random_bytes

def risky_crypto_algorithm(password):
    salt = get_random_bytes(16)
    keys = PBKDF2(password, salt, 64, count=1000000,
                  hmac_hash_module=SHA1)

SHA1 is used to create the PBKDF2 key; however, it is insecure and hence not recommended for PBKDF2. CodeGuru identifies the issue and makes the following recommendation:

>> The `PBKDF2` function is using a weak algorithm which might
lead to cryptographic vulnerabilities. We recommend that you use the
`SHA224`, `SHA256`, `SHA384`,`SHA512/224`, `SHA512/256`, `BLAKE2s`,
`BLAKE2b`, `SHAKE128`, `SHAKE256` algorithms.

In response, the developer should use the correct SHA algorithm to protect against potential cipher attacks.

from Crypto.Protocol.KDF import PBKDF2
from Crypto.Hash import SHA512
from Crypto.Random import get_random_bytes

def risky_crypto_algorithm(password):
    salt = get_random_bytes(16)
    keys = PBKDF2(password, salt, 64, count=1000000,
                  hmac_hash_module=SHA512)

This modified example uses the high-strength SHA512 algorithm.

Conclusion

This post reviewed Amazon CodeGuru Reviewer security detectors and how they automatically check your code for vulnerabilities and provide actionable recommendations in code reviews. We covered new capabilities for detecting issues in Python applications, as well as additional security features from Bandit and Infer. Together CodeGuru Reviewer’s security features provide a scalable approach for customers embracing DevSecOps, a mechanism that requires automation to address security issues earlier in the software development lifecycle. CodeGuru automates detection and helps prevent hard-to-find security vulnerabilities, accelerating DevSecOps processes for application development workflow.

You can get started from the CodeGuru console by running a full repository scan or integrating CodeGuru Reviewer with your supported CI/CD pipeline. Code analysis from Infer and Bandit is included as part of the standard CodeGuru Reviewer service.

For more information about automating code reviews and application profiling with Amazon CodeGuru, check out the AWS DevOps Blog. For more details on how to get started, visit the documentation.

Help Make BugBusting History at AWS re:Invent 2021

Post Syndicated from Sean M. Tracey original https://aws.amazon.com/blogs/aws/help-make-bugbusting-history-at-aws-reinvent-2021/

Earlier this year, we launched the AWS BugBust Challenge, the world’s first global competition to fix one million code bugs and reduce technical debt by over $100 million. As part of this endeavor, we are launching the first AWS BugBust re:Invent Challenge at this year’s AWS re:Invent conference, from 10 a.m. (PST) November 29 to 2 p.m. (PST) December 2, and in doing so, hope to create a new World Record for “Largest Bug Fixing Competition” as recognized by Guinness World Records.

To date, AWS BugBust events have been run internally by organizations that want to reduce the number of code bugs and the impact they have on their external customers. At these events, an event administrator from within the organization invites internal developers to collaborate in a shared AWS account via a unique link allowing them to participate in the challenge. While this benefits the organizations, it limits the reach of the event, as it focuses only on their internal bugs. To increase the impact that the AWS BugBust events have, at this year’s re:Invent we will open up the challenge to anybody with Java or Python knowledge to help fix open-source code bases.

Historically, finding bugs has been a labor-intensive challenge. 620 million developer hours are wasted each year searching for potential bugs in a code base and patching them before they can cause any trouble in production, or worse yet, fixing them on the fly when they’re already causing issues. The AWS BugBust re:Invent Challenge uses Amazon CodeGuru, an ML-powered developer tool that provides intelligent recommendations to improve code quality and identify an application’s most expensive lines of code. Rather than reacting to security and operational events caused by bugs, Amazon CodeGuru proactively highlights any potential issues in defined code bases as they’re being written and before they can make it into production. Using CodeGuru, developers can immediately begin picking off bugs and earning points for the challenge.

As part of this challenge, AWS will be including a myriad of open source projects that developers will be able to patch and contribute to throughout the event. Bugs can range from security issues, to duplicate code, to resource leaks and more. Once each bug has been submitted and CodeGuru determines that the issue is resolved, all of the patched software will be released back to the open source projects so that everyone can benefit from the combined efforts to squash software bugs.

The AWS BugBust re:Invent Challenge is open to all developers who have Python or Java knowledge, regardless of whether or not they’re attending re:Invent. There will be an array of prizes, from hoodies and fly swatters to Amazon Echo Dots, available to all who participate and meet certain milestones in the challenge. There’s also the coveted title of “Ultimate AWS BugBuster,” accompanied by a cash prize of $1500, for whoever earns the most points by squashing bugs during the event.

A screenshot of the AWS BugBust re:Invent challenge registration page

For those attending in-person, we have created the AWS BugBust Hub, a 500 square-foot space in the main exhibition hall. This space will give developers a place to join in with the challenge and track their progress on the AWS BugBust leadership board while maintaining appropriate social-distancing. In addition to the AWS BugBust Hub, there will be an AWS BugBust kiosk located within the event space where developers will be able to sign up to contribute toward the Largest Bug Fixing Challenge World Record attempt. Attendees will also be able to speak with Amazonians from the AWS BugBust SWAT team who will be able to answer questions about the event and provide product demos.

To take part in the AWS BugBust re:Invent Challenge, developers must have an AWS BugBust player account and a GitHub account. Pre-registration for the competition can be done online, or if you’re at re:Invent 2021 in-person you can register to participate at our AWS BugBust Hub or kiosk. If you’re not planning on joining us in person at re:Invent, you can still join us online, fix bugs and earn points to win prizes.

Building an InnerSource ecosystem using AWS DevOps tools

Post Syndicated from Debashish Chakrabarty original https://aws.amazon.com/blogs/devops/building-an-innersource-ecosystem-using-aws-devops-tools/

InnerSource is the term for the emerging practice of organizations adopting the open source methodology, albeit to develop proprietary software. This blog discusses the building of a model InnerSource ecosystem that leverages multiple AWS services, such as CodeBuild, CodeCommit, CodePipeline, CodeArtifact, and CodeGuru, along with other AWS services and open source tools.

What is InnerSource and why is it gaining traction?

Most software companies leverage open source software (OSS) in their products, as it is a great mechanism for standardizing software and bringing in cost effectiveness via the re-use of high quality, time-tested code. Some organizations may allow its use as-is, while others may utilize a vetting mechanism to ensure that the OSS adheres to the organization standards of security, quality, etc. This confidence in OSS stems from how these community projects are managed and sustained, as well as the culture of openness, collaboration, and creativity that they nurture.

Many organizations building closed source software are now trying to imitate these development principles and practices. This approach, which has been perhaps more discussed than adopted, is popularly called “InnerSource”. InnerSource serves as a great tool for collaborative software development within the organization perimeter, while keeping its concerns for IP & Legality in check. It provides collaboration and innovation avenues beyond the confines of organizational silos through knowledge and talent sharing. Organizations reap the benefits of better code quality and faster time-to-market, yet at only a fraction of the cost.

What constitutes an InnerSource ecosystem?

Infrastructure and processes that foster collaboration stand at the heart of an InnerSource ecosystem. These systems (refer to Figure 1) would typically include tools supporting features such as code hosting, peer reviews, Pull Request (PR) approval flow, issue tracking, documentation, communication & collaboration, continuous integration, and automated testing, among others. Another major component of this system is an entry portal that enables the employees to discover the InnerSource projects and join the community, beginning as ordinary users of the reusable code and later graduating to contributors and committers.

A typical InnerSource ecosystem

Figure 1: A typical InnerSource ecosystem

More to InnerSource than meets the eye

This blog focuses on detailing a technical solution for establishing the required tools for an InnerSource system primarily to enable a development workflow and infrastructure. But the secret sauce of an InnerSource initiative in an enterprise necessitates many other ingredients.

InnerSource Roles & Use Cases

Figure 2: InnerSource Roles & Use Cases

InnerSource thrives on community collaboration and a low entry barrier to enable adoptability. In turn, that demands a cultural makeover. While strategically deciding on the projects that can be inner sourced as well as the appropriate licensing model, enterprises should bootstrap the initiative with a seed product that draws the community, with maintainers and the first set of contributors. Many of these users would eventually be promoted, through a meritocracy-based system, to become the trusted committers.

Over a set period, the organization should plan to move from an infra specific model to a project specific model. In a Project-specific InnerSource model, the responsibility for a particular software asset is owned by a dedicated team funded by other business units. Whereas in the Infrastructure-based InnerSource model, the organization provides the necessary infrastructure to create the ecosystem with code & document repositories, communication tools, etc. This enables anybody in the organization to create a new InnerSource project, although each project initiator maintains their own projects. They could begin by establishing a community of practice, and designating a core team that would provide continuing support to the InnerSource projects’ internal customers. Having a team of dedicated resources would clearly indicate the organization’s long-term commitment to sustaining the initiative. The organization should promote this culture through regular boot camps, trainings, and a recognition program.

Lastly, the significance of having a modular architecture in the InnerSource projects cannot be overstated. This architecture helps developers understand the code better, as well as aids code reuse and parallel development, where multiple contributors could work on different code modules while avoiding conflicts during code merges.

A model InnerSource solution using AWS services

This blog discusses a solution that weaves various services together to create the necessary infrastructure for an InnerSource system. While it is not a full-blown solution, and it may lack some other components that an organization may desire in its own system, it can provide you with a good head start.

The ultimate goal of the model solution is to enable a developer workflow as depicted in Figure 3.

Typical developer workflow at InnerSource

Figure 3: Typical developer workflow at InnerSource

At the core of the InnerSource-verse is the distributed version control (AWS CodeCommit in our case). To maintain system transparency, openness, and participation, we must have a discovery mechanism where users could search for the projects and receive encouragement to contribute to the one they prefer (Step 1 in Figure 4).

Architecture diagram for the model InnerSource system

Figure 4: Architecture diagram for the model InnerSource system

For this purpose, the model solution utilizes an open source reference implementation of an InnerSource Portal. The portal indexes data from AWS CodeCommit by using a crawler, and it lists available projects with associated metadata, such as the skills required, number of active branches, and average number of commits. For CodeCommit, you can use the crawler implementation that we created in the open source code repo at https://github.com/aws-samples/codecommit-crawler-innersource.

The major portal feature is providing an option to contribute to a project by using a “Contribute” link. This can present a pop-up form to “apply as a contributor” (Step 2 in Figure 4), which when submitted sends an email (or creates a ticket) to the project maintainer/committer who can create an IAM (Step 3 in Figure 4) user with access to the particular repository. Note that the pop-up form functionality is built into the open source version of the portal. However, it would be trivial to add one with an associated action (send an email, cut a ticket, etc.).
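
As a sketch of what that provisioning step could look like, the following Boto3 snippet creates the contributor's IAM user and scopes an inline policy to the single repository; the user name, actions, and repository ARN are illustrative assumptions:

import json
import boto3

iam = boto3.client("iam")

def grant_contributor_access(user_name, repo_arn):
    # Create the contributor's IAM user
    iam.create_user(UserName=user_name)
    # Inline policy limited to the one InnerSource repository
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["codecommit:GitPull", "codecommit:GitPush",
                       "codecommit:CreatePullRequest"],
            "Resource": repo_arn,
        }],
    }
    iam.put_user_policy(
        UserName=user_name,
        PolicyName="InnerSourceRepoAccess",
        PolicyDocument=json.dumps(policy),
    )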

InnerSource portal indexes CodeCommit repos and provides a bird’s eye view

Figure 5: InnerSource portal indexes CodeCommit repos and provides a bird’s eye view

The contributor, upon receiving access, logs in to CodeCommit, clones the mainline branch of the InnerSource project (Step 4 in Figure 4) into a fix or feature branch, and starts altering/adding the code. Once completed, the contributor commits the code to the branch and raises a PR (Step 5 in Figure 4). A Pull Request is a mechanism to offer code to an existing repository, which is then peer-reviewed and tested before acceptance for inclusion.

The PR triggers a CodeGuru review (Step 6 in Figure 4) that adds the recommendations as comments on the PR. Furthermore, it triggers a CodeBuild process (Steps 7 to 10 in Figure 4) and logs the build result in the PR. At this point, the code can be peer reviewed by Trusted Committers or Owners of the project repository. The number of approvals would depend on the approval template rule configured in CodeCommit. The Committer(s) can approve the PR (Step 12 in Figure 4) and merge the code to the mainline branch – that is once they verify that the code serves its purpose, has passed the required tests, and doesn’t break the build. They could also rely on the approval vote from a sanity test conducted by a CodeBuild process. Optionally, a build process could deploy the latest mainline code (Step 14 in Figure 4) on the PR merge commit.

To maintain transparency in all communications related to progress, bugs, and feature requests to downstream users and contributors, a communication tool may be needed. This solution does not show integration with any Issue/Bug tracking tool out of the box. However, multiple of these tools are available at the AWS marketplace, with some offering forum and Wiki add-ons in order to elicit discussions. Standard project documentation can be kept within the repository by using the constructs of the README.md file to provide project mission details and the CONTRIBUTING.md file to guide the potential code contributors.

An overview of the AWS services used in the model solution

The model solution employs the following AWS services:

  • AWS CodeCommit: a fully managed source control service to host secure and highly scalable private Git repositories.
  • AWS CodeBuild: a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy.
  • AWS CodeDeploy: a service that automates code deployments to any instance, including EC2 instances and instances running on-premises.
  • Amazon CodeGuru: a developer tool providing intelligent recommendations to improve code quality and identify an application’s most expensive lines of code.
  • AWS CodePipeline: a fully managed continuous delivery service that helps automate release pipelines for fast and reliable application and infrastructure updates.
  • AWS CodeArtifact: a fully managed artifact repository service that makes it easy to securely store, publish, and share software packages used in the software development process.
  • Amazon S3: an object storage service that offers industry-leading scalability, data availability, security, and performance.
  • Amazon EC2: a web service providing secure, resizable compute capacity in the cloud. It is designed to ease web-scale computing for developers.
  • Amazon EventBridge: a serverless event bus that eases the building of event-driven applications at scale by using events generated from applications and AWS services.
  • AWS Lambda: a serverless compute service that lets you run code without provisioning or managing servers.

The journey of a thousand miles begins with a single step

InnerSource might not be the right fit for every organization, but is a great step for those wanting to encourage a culture of quality and innovation, as well as purge silos through enhanced collaboration. It requires backing from leadership to sponsor the engineering initiatives, as well as champion the establishment of an open and transparent culture granting autonomy to the developers across the org to contribute to projects outside of their teams. The organizations best-suited for InnerSource have already participated in open source initiatives, have engineering teams that are adept with CI/CD tools, and are willing to adopt OSS practices. They should start small with a pilot and build upon their successes.

Conclusion

Ever more enterprises are adopting the open source culture to develop proprietary software by establishing an InnerSource ecosystem. This instills innovation, transparency, and collaboration that result in cost-effective, quality software development. This blog discussed a model solution to build the developer workflow inside an InnerSource ecosystem, from project discovery to PR approval and deployment. Additional features, like an integrated issue tracker, real-time chat, and a wiki/forum, can further enrich this solution.

If you need helping hands, AWS Professional Services can help adapt and implement this model InnerSource solution in your enterprise. Moreover, our Advisory services can help establish the governance model to accelerate OSS culture adoption through Experience Based Acceleration (EBA) parties.

About the authors

Debashish Chakrabarty

Debashish Chakrabarty

Debashish is a Senior Engagement Manager at AWS Professional Services, India, managing complex DevOps, security, and modernization projects and helping ProServe customers accelerate their adoption of AWS services. He loves to stay connected to his technical roots. Outside of work, Debashish is a Hindi podcaster and blogger. He also loves binge-watching on Amazon Prime and spending time with family.

Akash Verma

Akash Verma

Akash works as a Cloud Consultant for AWS Professional Services, India. He enjoys learning new technologies and helping customers solve complex technical problems and drive business outcomes by providing solutions using AWS products and services. Outside of work, Akash loves to travel, interact with new people, and try different cuisines. He also enjoys gardening, watching stand-up comedy, and listening to poetry.

How Amazon CodeGuru Reviewer helps Gridium maintain a high quality codebase

Post Syndicated from Aaqib Bickiya original https://aws.amazon.com/blogs/devops/codeguru-gridium-maintain-codebase/

Gridium creates software that lets people run commercial buildings at a lower cost and with less energy. Currently, half of the world lives in cities. Soon, nearly 70% will, while buildings utilize 40% of the world’s electricity. In the U.S. alone, commercial real estate value tops one trillion dollars. Furthermore, much of this asset class is still run with spreadsheets, clipboards, and outdated software. Gridium’s software platform collects large amounts of operational data from commercial buildings like offices, medical providers, and corporate campuses. Then, our analytics identifies energy savings opportunities that we work with customers to actualize.

Gridium’s Challenge

Data streams from utility companies across the U.S. are an essential input for Gridium’s analytics. This data is processed and visualized so that our customers can garner new insights and drive smarter decisions. In order to integrate a new data source into our platform, we often utilize third-party contractors to write tools that ingest data feeds and prepare them for analysis.

Our team firmly emphasizes our codebase quality. We strive for our code to be stylistically consistent, easily testable, swiftly understood, and well-documented. We write code internally by using agreed-upon standards. This makes it easy for us to review and edit internal code using standard collaboration tools.

We work to ensure that code arriving from contractors meets similar standards, but enforcing external code quality can be difficult. Contractor-developed code will be written in the style of each contractor. Furthermore, reviewing and editing code written by contractors introduces new challenges beyond those of working with internal code. We have used tools like PyLint to help stylistically align code, but we wanted a method for uncovering more complex issues without burdening the team and undermining the benefits of outside contractors.

Adopting Amazon CodeGuru Reviewer

We began evaluating Amazon CodeGuru Reviewer in order to provide an additional review layer for code developed by contractors, and to find complex corrections within code that traditional tools might not catch.

Initially, we enabled the service only on external repositories and generated reviews on pull requests. CodeGuru immediately provided us with actionable recommendations for maintaining Gridium’s codebase quality.

CodeGuru exception handling correction

The example above demonstrates a useful recurring recommendation related to exception-handling. Strictly speaking, there is nothing wrong with utilizing a general Exception class whenever needed. However, utilizing general Exception classes in production can complicate error-handling functions and generate ambiguity when trying to debug or understand code.
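
A minimal Python sketch of the difference, with hypothetical function and field names, looks like this:

def parse_meter_reading(payload):
    # Noncompliant: a bare Exception hides the actual failure mode
    # try:
    #     return float(payload["reading"])
    # except Exception:
    #     return 0.0

    # Compliant: catch only the exceptions this code can actually raise
    try:
        return float(payload["reading"])
    except KeyError:
        raise ValueError("payload is missing the 'reading' field")
    except (TypeError, ValueError) as error:
        raise ValueError("invalid meter reading: {}".format(error))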

After a few weeks of utilizing CodeGuru Reviewer and witnessing its benefits, we determined that we wanted to use it for our entire codebase. Moreover, CodeGuru provided us with meaningful recommendations on our internal repositories.

CodeGuru suggestion for a highly dependent function

This example once again showcases CodeGuru’s ability to highlight subtle issues that aren’t necessarily bugs. Without diving through related libraries and functions, it would be difficult to find any optimizable areas, but CodeGuru found that this function could become problematic due to its dependency on twenty other functions. If any of the dependencies is updated, then it could break the entire function, thereby making debugging the root cause difficult.

The explanations following each code recommendation were essential in our quick adoption of CodeGuru. Simply showing what code to change would make it tough to follow through with a code correction. CodeGuru provided ample context, reasoning, and even metrics in some cases to explain and justify a correction.

Conclusion

Our development team thoroughly appreciates the extra review layer that CodeGuru provides. CodeGuru indicates areas that internal review might otherwise miss, especially in code written by external contractors.

In some cases, CodeGuru highlighted issues never before considered by the team. Examples of these issues include highly coupled functions, ambiguous exceptions, and outdated API calls. Each suggestion is accompanied with its context and reasoning so that our developers can independently judge if an edit should be made. Furthermore, CodeGuru was easily set up with our GitHub repositories. We enabled it within minutes through the console.

After familiarizing ourselves with the CodeGuru workflow, Gridium treats CodeGuru recommendations like suggestions that an internal reviewer would make. Both internal developers and third parties act on recommendations in order to improve code health and quality. Any CodeGuru suggestions not accepted by contractors are verified and implemented by an internal reviewer if necessary.

Over the twelve weeks that we have been utilizing Amazon CodeGuru, we have had 281 automated pull request reviews. These provided 104 recommendations resulting in fifty code corrections that we may not have made otherwise. Our monthly bill for CodeGuru usage is about $10.00. If we make only a single correction to our code over a whole month that we would have missed otherwise, then we can safely say that it was well worth the price!

About the Authors

Kimberly Nicholls is an Engineering Technical Lead at Gridium who loves to make data useful. She also enjoys reading books and spending time outside.

Adnan Bilwani is a Sr. Specialist-Builder Experience providing fully managed ML-based solutions to enhance your DevOps workflows.

Aaqib Bickiya is a Solutions Architect at Amazon Web Services. He helps customers in the Midwest build and grow their AWS environments.

Finding code inconsistencies using Amazon CodeGuru Reviewer

Post Syndicated from Hangqi Zhao original https://aws.amazon.com/blogs/devops/inconsistency-detection-in-amazon-codeguru-reviewer/

Here we are introducing the inconsistency detector for Java in Amazon CodeGuru Reviewer. CodeGuru Reviewer automatically analyzes pull requests (created in supported repositories such as AWS CodeCommit, GitHub, GitHub Enterprise, and Bitbucket) and generates recommendations for improving code quality. For more information, see Automating code reviews and application profiling with Amazon CodeGuru.

The Inconsistency Principle

Software is repetitive, so it’s possible to mine usage specifications from large code bases. While the mined specifications are often correct, specification violations are often not bugs. This is because code usage is contextual; therefore, contextual analysis is required to detect genuine violations. Moreover, naive enforcement of learned specifications leads to many false positives. The inconsistency principle is based on the following observation: while deviations from frequent behavior (or specifications) are often permissible across repositories or packages, deviations from frequent behavior inside of a given repository are worthy of investigation by developers, and these are often bugs or code smells.

Consider the following example. Assume that within a repository or package, we observe 10 occurrences of this statement:

private static final String stage = AppConfig.getDomain().toLowerCase();

And assume that there is one occurrence of the following statement in the same package:

private static final String newStage = AppConfig.getDomain();

The inconsistency can be described as follows: in this package, a call to toLowerCase() typically follows AppConfig.getDomain(), converting the string generated by AppConfig.getDomain() to its lower-case equivalent. In one occurrence, however, the call is missing, and this should be examined.

In summary, the inconsistency detector can learn from a single package (read: your own codebase) and can identify common code usage patterns in the code, such as common patterns of logging, throwing and handling exceptions, performing null checks, etc. Next, usage patterns inconsistent with frequent usage patterns are flagged. While the flagged violations are most often bugs or code smells, it is possible that in some cases the majority usage patterns are the anti-patterns and the flagged violations are actually correct. Mining and detection are conducted on-the-fly during code analysis, and neither the code nor the patterns are persisted. This ensures that the customer code privacy requirements are preserved.

Diverse Recommendations

Perhaps the most distinguishing feature of the inconsistency detector family is that it learns from the customer’s own code while conducting analysis on-the-fly, without persisting the code. This is rather different from pre-defined rules or dedicated ML algorithms that target specific code defects such as concurrency bugs, approaches that also persist learned patterns or models. As a result, the inconsistency principle-based detectors are much more diversified and cover a wide range of code problems.

Our approach aims to automatically detect inconsistent code pieces that are abnormal and deviant from common code patterns in the same package—assuming that the common behaviors are correct. This anomaly detection approach doesn’t require prior knowledge and minimizes the human effort to predefine rule sets, as it can automatically and implicitly “infer” rules, i.e., common code patterns, from the source code. While inconsistencies can be detected in the specific aspects of programs (such as synchronization, exceptions, logging, etc.), CodeGuru Reviewer can detect inconsistencies in multiple program aspects. The detections aren’t restricted to specific code defect types, but instead cover diverse code issue categories. These code defects may not necessarily be bugs, but they could also be code smells that should be avoided for better code maintenance and readability. CodeGuru Reviewer can detect 12 types of code inconsistencies in Java code. These detectors identify issues like typos, inconsistent messaging, inconsistent logging, inconsistent declarations, general missing API calls, missing null checks, missing preconditions, missing exceptions, etc. Now let’s look at several real code examples (they may not cover all types of findings).

Typo

Typos frequently appear in code messages, logs, paths, variable names, and other strings. Even simple typos can be difficult to find and can cause severe issues in your code. In a real code example below, the developer is setting a configuration file location.

context.setConfigLocation("com.amazom.daas.customerinfosearch.config");

The inconsistency detector identifies a typo in the path: amazom should be amazon. If the typo isn’t identified and fixed, the configuration location won’t be found. Accordingly, the developer fixed the typo. In some other cases, certain typos are specific to the customer’s code and may be difficult to find via pre-defined rules or vocabularies. In the example below, the developer sets a bean name as a string.

public static final String COOKIE_SELECTOR_BEAN_NAME = "com.amazon.um.webapp.config.CookiesSelectorService";

The inconsistency detector learns from the entire codebase and identifies that um should instead be uam, as uam appears in multiple similar cases in the code, thereby making the use of um abnormal. The developer responded “Cool!” to this finding and accordingly made the fix.
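For illustration, the fix simply swaps in the consistent package segment:

public static final String COOKIE_SELECTOR_BEAN_NAME = "com.amazon.uam.webapp.config.CookiesSelectorService";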

Inconsistent Logging

In the example below, the developer is logging an exception related to a missing field, ID_FIELD.

 log.info ("Missing {} from data {}", ID_FIELD, data); 

The inconsistency detector learns the patterns in the developer’s code and identifies that, in 47 other similar cases of the same codebase, the developer’s code uses a log level of error instead of info, as shown below.

 log.error ("Missing {} from data {}", ID_FIELD, data);

Utilizing the appropriate log level can improve the application debugging process and ensure that developers are aware of all potential issues within their applications. The message from CodeGuru Reviewer was: “The detector found 47 occurrences of code with logging similar to the selected lines. However, the logging level in the selected lines is not consistent with the other occurrences. We recommend you check if this is intentional. The following are some examples in your code of the expected level of logging in similar occurrences: log.error (“Missing {} from data {}”, ID_FIELD, data); …”

The developer acknowledges the recommendation to make the log level consistent, and they made the corresponding change to upgrade the level from info to error.

Apart from log levels, the inconsistency detector also identifies potentially missing log statements in your code, and it ensures that the log messages are consistent across the entire codebase.

Inconsistent Declaration

The inconsistency detector also identifies inconsistencies in variable or method declarations. A typical example is a missing final keyword for certain variables or functions. While a variable doesn’t always need to be declared final, the inconsistency detector can determine that certain variables should be final by inferring from other appearances of those variables. In the example below, the developer declares a method with the parameter request.

protected ExecutionContext setUpExecutionContext(ActionRequest request) {

Without prior knowledge, the inconsistency detector learns from 9 other similar occurrences in the same codebase and determines that request should be declared with the final keyword. The developer responded by saying “Good catch!”, and then made the change.
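For illustration, the corrected declaration simply adds the modifier:

protected ExecutionContext setUpExecutionContext(final ActionRequest request) {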

Missing API

A missing API call is a typical code bug, and identifying such bugs is highly valued by customers. The inconsistency detector can find these issues via the inconsistency principle. In the example below, the developer deletes stale entities in the document object.

private void deleteEntities(Document document) {
    ...
    for (Entity returnEntity : oldReturnCollection) {
        document.deleteEntity(returnEntity);
    }
}

The inconsistency detector learns from 4 other similar methods in the developer’s codebase and determines that the API document.deleteEntity() is strongly associated with the API document.acceptChanges(), which is usually called after document.deleteEntity() to ensure that the document can accept other changes, but which is missing in this case. Therefore, the inconsistency detector recommends adding the document.acceptChanges() call. Another human reviewer agreed with CodeGuru’s suggestion and also referred to their platform’s coding guidelines, which echo CodeGuru’s recommendation. The developer made the corresponding fix, as shown below.

private void deleteEntities(Document document) {
    ...
    for (Entity returnEntity : oldReturnCollection) {
        document.deleteEntity(returnEntity);
    }
    document.acceptChanges();
}

Null Check

Consider the following code snippet:

String ClientOverride = System.getProperty(MB_CLIENT_OVERRIDE_PROPERTY);
if (ClientOverride != null) {
      log.info("Found Client override {}", ClientOverride);
      config.setUnitTestOverride(ConfigKeys.Client, ClientOverride);
}
 
String ServerOverride = System.getProperty(MB_SERVER_OVERRIDE_PROPERTY);
if (ClientOverride != null) {
      log.info("Found Server override {}", ServerOverride);
      config.setUnitTestOverride(ConfigKeys.ServerMode, ServerOverride);
}

CodeGuru Reviewer indicates that the output generated by System.getProperty(MB_SERVER_OVERRIDE_PROPERTY) is missing a null check. A null check appears to be present on the next line; however, looking carefully, that check is not applied to the correct output. It is a copy-paste error: instead of checking ServerOverride, it checks ClientOverride. The developer fixed the code as the recommendation suggested.
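The corrected block applies the null check to the variable it actually guards:

String ServerOverride = System.getProperty(MB_SERVER_OVERRIDE_PROPERTY);
if (ServerOverride != null) {
      log.info("Found Server override {}", ServerOverride);
      config.setUnitTestOverride(ConfigKeys.ServerMode, ServerOverride);
}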

Exception Handling

The following example shows an inconsistency finding regarding exception handling.

Detection:

try {
    if (null != PriceString) {
        final Double Price = Double.parseDouble(PriceString);
        if (Price.isNaN() || Price <= 0 || Price.isInfinite()) {
            throw new InvalidParameterException(
                ExceptionUtil.getExceptionMessageInvalidParam(PRICE_PARAM));
        }
    }
} catch (NumberFormatException ex) {
    throw new InvalidParameterException(
        ExceptionUtil.getExceptionMessageInvalidParam(PRICE_PARAM));
}

Supporting Code Block:

try {
    final Double spotPrice = Double.parseDouble(launchTemplateOverrides.getSpotPrice());
    if (spotPrice.isNaN() || spotPrice <= 0 || spotPrice.isInfinite()) {
        throw new InvalidSpotFleetRequestConfigException(
            ExceptionUtil.getExceptionMessageInvalidParam(LAUNCH_TEMPLATE_OVERRIDE_SPOT_PRICE_PARAM));
    }
} catch (NumberFormatException ex) {
    throw new InvalidSpotFleetRequestConfigException(
        ExceptionUtil.getExceptionMessageInvalidParam(LAUNCH_TEMPLATE_OVERRIDE_SPOT_PRICE_PARAM));
}

There are 4 occurrences of handling an exception thrown by Double.parseDouble by catching NumberFormatException and throwing InvalidSpotFleetRequestConfigException(). The detected code snippet, despite a very similar context, simply uses InvalidParameterException(), which does not provide as much detail as the customized exception. The developer responded with “yup the detection sounds right”, and then made the revision.
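As a sketch of that revision (assuming the customized exception is the right fit for this call site as well), the detected snippet’s catch block would throw the more detailed exception:

} catch (NumberFormatException ex) {
    throw new InvalidSpotFleetRequestConfigException(
        ExceptionUtil.getExceptionMessageInvalidParam(PRICE_PARAM));
}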

Responses to the Inconsistency Detector Findings

Large repos are typically developed by more than one developer over a period of time, so consistency in code snippets that serve the same or similar purpose, in code style, and so on can easily be overlooked, especially in repos with months or years of development history. As a result, the inconsistency detector in CodeGuru Reviewer has flagged numerous findings and alerted developers within Amazon to hundreds of inconsistency issues before they hit production.

The inconsistency detections have seen a high developer acceptance rate, and developer feedback on the inconsistency detector has been very positive. Some feedback from the developers includes “will fix. Didn’t realize as this was just copied from somewhere else.”, “Oops!, I’ll correct it and use consistent message format.”, “Interesting. Now it’s learning from our own code. AI is trying to take our jobs haha”, and “lol yeah, this was pretty surprising”. Developers have also concurred that the findings are essential and must be fixed.

Conclusion

Inconsistency bugs are difficult to detect or replicate during testing, and they can impact the availability of production services. As a result, it’s important to automatically detect these bugs early in the software development lifecycle, such as during pull requests or code scans. The CodeGuru Reviewer inconsistency detector combines static code analysis algorithms with ML and data mining in order to surface only the high-confidence deviations. It has a high developer acceptance rate and has alerted developers within Amazon to hundreds of inconsistencies before they reach production.

Improve the performance of Lambda applications with Amazon CodeGuru Profiler

Post Syndicated from Vishaal Thanawala original https://aws.amazon.com/blogs/devops/improve-performance-of-lambda-applications-amazon-codeguru-profiler/

As businesses expand applications to reach more users and devices, they risk higher latencies that could degrade the customer experience. Slow applications can frustrate or even turn away customers. Developers therefore increasingly face the challenge of ensuring that their code runs as efficiently as possible, since application performance can make or break businesses.

Amazon CodeGuru Profiler helps developers improve their application speed by analyzing its runtime. CodeGuru Profiler analyzes single and multi-threaded applications and then generates visualizations to help developers understand the latency sources. Afterwards, CodeGuru Profiler provides recommendations to help resolve the root cause.

CodeGuru Profiler recently began providing recommendations for applications written in Python. Additionally, the new automated onboarding process for AWS Lambda functions makes it even easier to use CodeGuru Profiler with serverless applications built on AWS Lambda.

This post highlights these new features by explaining how to set up and utilize CodeGuru Profiler on an AWS Lambda function written in Python.

Prerequisites

This post focuses on improving the performance of an application built with AWS Lambda, so it’s important to understand which Lambda functions work best with CodeGuru Profiler. You will get the most out of CodeGuru Profiler on long-duration Lambda functions (>10 seconds) or frequently invoked, shorter-duration Lambda functions (~100 milliseconds). Because CodeGuru Profiler requires five minutes of runtime data before the Lambda container is recycled, very short-duration Lambda functions with an execution time of 1-10 milliseconds may not provide sufficient data for CodeGuru Profiler to generate meaningful results.

The automated CodeGuru Profiler onboarding process, which automatically creates the profiling group for you, supports Lambda functions running on Java 8 (Amazon Corretto), Java 11, and Python 3.8 runtimes. Additional runtimes, without the automated onboarding process, are supported and can be found in the Java documentation and the Python documentation.

Getting Started

Let’s quickly demonstrate the new Lambda onboarding process and the new Python recommendations. This example assumes that you have already created a Lambda function, so we will just walk through the process of turning on CodeGuru Profiler and viewing results. If you don’t already have a Lambda function created, you can create one by following these setup instructions. If you would like to replicate this example, the code we used can be found on GitHub.

1. On the AWS Lambda Console page, open your Lambda function. For this example, we’re using a function with a Python 3.8 runtime.

 This image shows the Lambda console page for the Lambda function we are referencing.

2. Navigate to the Configuration tab, go to the Monitoring and operations tools page, and click Edit on the right side of the page.

This image shows the instructions to open the page to turn on profiling

3. Scroll down to “Amazon CodeGuru Profiler” and click the button next to “Code profiling” to turn it on. After enabling Code profiling, click Save.

This image shows how to turn on CodeGuru Profiler

4. Verify that CodeGuru Profiler has been turned on within the Monitoring and operations tools page.

This image shows how to validate Code profiling has been turned on

That’s it! You can now navigate to CodeGuru Profiler within the AWS console and begin viewing results.

Viewing your results

CodeGuru Profiler requires 5 minutes of Lambda runtime data to generate results. After your Lambda function provides this runtime data (which may require multiple runs if your Lambda function has a short runtime), the results display on the “Profiling group” page in the CodeGuru Profiler console. The profiling group is given a default name (i.e., aws-lambda-<lambda-function-name>), and it takes approximately 15 minutes after CodeGuru Profiler receives the runtime data before it appears on this page.

This image shows where you can see the profiling group created after turning on CodeGuru Profiler on your Lambda function

After the profile appears, customers can view their profiling results by analyzing the flame graphs. Additionally, after approximately 1 hour, customers will receive their first recommendation set (if applicable). For more information on how to read the CodeGuru Profiler results, see Investigating performance issues with Amazon CodeGuru Profiler.

The images below show the two flame graphs (CPU utilization and latency) generated from profiling the Lambda function. Note that the highlighted horizontal bar (also referred to as a frame) in both images corresponds to one of the three frames that generate a recommendation. We’ll dive into the details of that recommendation in the following sections.

CPU Utilization Flame Graph:

This image shows the CPU Utilization flame graph for the Lambda function

Latency Flame Graph:

This image shows the Latency flame graph for the Lambda function

Here are the three recommendations generated from the above Lambda function:

This image shows the 3 recommendations generated by CodeGuru Profiler

Addressing a recommendation

Let’s dive further into an example recommendation. The first recommendation above notes that the Lambda function is spending more than the normal amount of runnable time (6.8% vs. <1%) creating AWS SDK service clients. It recommends ensuring that the function doesn’t unnecessarily create AWS SDK service clients, which wastes CPU time.

This image shows the detailed recommendation for 'Recreation of AWS SDK service clients'

Based on the suggested resolution step, we made a quick and easy code change, moving the client creation outside of the Lambda handler function. This ensures that we don’t create unnecessary AWS SDK clients. The code change below shows how we resolved the issue.

...
# Create the AWS SDK clients once, at module load time, so that they are
# reused across invocations instead of being recreated inside the handler.
s3_client = boto3.client('s3')
cw_client = boto3.client('cloudwatch')

def lambda_handler(event, context):
...

Reviewing your savings

After making each of the three changes recommended by CodeGuru Profiler, look at the new flame graphs to see how the changes impacted the application’s profile. You’ll notice below that we no longer see the previously wide frames for the boto3 clients, put_metric_data, or the logger in the S3 API call.

CPU Utilization Flame Graph:

This image shows the CPU Utilization flame graph after making the recommended changes

Latency Flame Graph:

This image shows the Latency flame graph after making the recommended changes

Moreover, we ran the Lambda function for one day (1,439 invocations) and viewed the results in Lambda Insights in order to understand our total savings. After every recommendation was addressed, this Lambda function (with 128 MB of memory and a 10-second timeout) used 10% less CPU time, and its maximum memory usage and network IO dropped, leading to a 30% drop in GB-s. Decreasing GB-s leads to a 30% lower duration bill for the Lambda function, as explained in AWS Lambda Pricing.
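To make the GB-s arithmetic concrete, here is a rough, hypothetical illustration (the post reports only the percentage change, not per-invocation durations): at 128 MB (0.125 GB), a 4-second invocation bills 0.125 GB × 4 s = 0.5 GB-s, while the same invocation trimmed to 2.8 seconds bills 0.35 GB-s, a 30% reduction that compounds across all 1,439 daily invocations.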

Latency (before and after CodeGuru Profiler):

The graph below displays the duration change while running the Lambda function for 1 day.

This image shows the duration change while running the Lambda function for 1 day.

Cost (before and after CodeGuru Profiler):

The graph below displays the function cost change while running the Lambda function for 1 day.

This image shows the function cost change while running the lambda function for 1 day.

Conclusion

This post showed how developers can easily onboard onto CodeGuru Profiler in order to improve the performance of their serverless applications built on AWS Lambda. Get started with CodeGuru Profiler by visiting the CodeGuru console.

All across Amazon, numerous teams have utilized CodeGuru Profiler’s technology to generate performance optimizations for customers. It has also reduced infrastructure costs, saving millions of dollars annually.

Onboard your Python applications onto CodeGuru Profiler by following the instructions in the documentation. If you’re interested in the Python agent, note that it is open-sourced on GitHub. For more demo applications using the Python agent, check out the GitHub repository for additional samples.

About the authors

Headshot of Vishaal

Vishaal is a Product Manager at Amazon Web Services. He enjoys getting to know his customers’ pain points and transforming them into innovative product solutions.

Headshot of Mirela

Mirela is a Software Development Engineer at Amazon Web Services on the CodeGuru Profiler team in London. Previously, she worked on Amazon.com teams, and she enjoys working on products that help customers improve their code and infrastructure.

Headshot of Parag

Parag is a Software Development Engineer at Amazon Web Services on the CodeGuru Profiler team in Seattle. Previously, he worked on Amazon.com’s catalog and selection team. He enjoys working on large-scale distributed systems and products that help customers build scalable and cost-effective applications.

Introducing new self-paced courses to improve Java and Python code quality with Amazon CodeGuru

Post Syndicated from Rafael Ramos original https://aws.amazon.com/blogs/devops/new-self-paced-courses-to-improve-java-and-python-code-quality-with-amazon-codeguru/

Amazon CodeGuru icon

During the software development lifecycle, organizations have adopted peer code reviews as a common practice to keep improving code quality and prevent bugs from reaching applications in production. Developers traditionally perform those code reviews manually, which causes bottlenecks and blocks releases while the peer review is pending. Besides impacting the teams’ agility, it’s a challenge to maintain a high bar for code reviews during the development workflow. This is especially challenging for less experienced developers, who have more difficulty identifying defects such as thread concurrency issues and resource leaks.

With Amazon CodeGuru Reviewer, developers have an automated code review tool that catches critical issues, security vulnerabilities, and hard-to-find bugs during application development. CodeGuru Reviewer is powered by pre-trained machine learning (ML) models built from millions of code reviews on thousands of open-source and Amazon repositories. It also provides recommendations on how to fix issues to improve code quality, and it reduces the time it takes to fix bugs before they reach customer-facing applications. Java and Python developers can simply add Amazon CodeGuru to their existing development pipeline to save time and reduce the cost and burden of bad code.

If you’re new to writing code or an experienced developer looking to automate code reviews, we’re excited to announce two new courses on CodeGuru Reviewer. These courses, developed by the AWS Training and Certification team, consist of guided walkthroughs, gaming elements, knowledge checks, and a final course assessment.

About the course

During these courses, you learn how to use CodeGuru Reviewer to automatically scan your code base, identify hard-to-find bugs and vulnerabilities, and get recommendations for fixing the bugs and security issues. The courses cover CodeGuru Reviewer’s main features, provide a peek into how CodeGuru finds code anomalies, describe how its ML models were built, and explain how to understand and apply its prescriptive guidance and recommendations. Besides helping to improve code quality, those recommendations are useful for new developers learning coding best practices, such as refactoring duplicated code, correctly implementing concurrency constructs, and avoiding resource leaks.

The CodeGuru courses are designed to be completed within a 2-week time frame. The courses comprise 60 minutes of videos, which include 15 main lectures. Four of the lectures are specific to Java, and four focus on Python. The courses also include exercises and assessments at the end of each week, to provide you with in-depth, hands-on practice in a lab environment.

Week 1

During the first week, you learn the basics of CodeGuru Reviewer, including how you can benefit from ML and automated reasoning to perform static code analysis and identify critical defects and deviations from coding best practices. You also learn what kind of actionable recommendations CodeGuru Reviewer provides, covering refactoring, resource leaks, potential race conditions, deadlocks, and security analysis. In addition, the course covers how to integrate this tool into your development workflow, such as your CI/CD pipeline.

Topics include:

  • What is Amazon CodeGuru?
  • How CodeGuru Reviewer is trained to provide intelligent recommendations
  • CodeGuru Reviewer recommendation categories
  • How to integrate CodeGuru Reviewer into your workflow

Week 2

Throughout the second week, you have the chance to explore CodeGuru Reviewer in more depth. With Java and Python code snippets, you have a more hands-on experience and dive into each recommendation category. You use these examples to learn how CodeGuru Reviewer looks for duplicated lines of code to suggest refactoring opportunities, how it detects code maintainability issues, and how it prevents resource leaks and concurrency bugs.

Topics include (for both Java and Python):

  • Common coding best practices
  • Resource leak prevention
  • Security analysis

Get started

Developed at the source, these new digital courses empower you to learn about CodeGuru from the experts at AWS whenever and wherever you want. Advance your skills and knowledge to build your future in the AWS Cloud. Enroll today.

Rafael Ramos

Rafael is a Solutions Architect at AWS, where he helps ISVs on their journey to the cloud. He spent over 13 years working as a software developer, and is passionate about DevOps and serverless. Outside of work, he enjoys playing tabletop RPG, cooking and running marathons.

Amazon CodeGuru Reviewer Updates: New Java Detectors and CI/CD Integration with GitHub Actions

Post Syndicated from Alex Casalboni original https://aws.amazon.com/blogs/aws/amazon_codeguru_reviewer_updates_new_java_detectors_and_cicd_integration_with_github_actions/

Amazon CodeGuru allows you to automate code reviews and improve code quality, and thanks to the new pricing model announced in April you can get started with a lower and fixed monthly rate based on the size of your repository (up to 90% less expensive). CodeGuru Reviewer helps you detect potential defects and bugs that are hard to find in your Java and Python applications, using the AWS Management Console, AWS SDKs, and AWS CLI.

Today, I’m happy to announce that CodeGuru Reviewer natively integrates with the tools that you use every day to package and deploy your code. This new CI/CD experience allows you to trigger code quality and security analysis as a step in your build process using GitHub Actions.

Although the CodeGuru Reviewer console still serves as an analysis hub for all your onboarded repositories, the new CI/CD experience allows you to integrate CodeGuru Reviewer more deeply with your favorite source code management and CI/CD tools.

And that’s not all! Today we’re also releasing 20 new security detectors for Java to help you identify even more issues related to security and AWS best practices.

A New CI/CD Experience for CodeGuru Reviewer
As a developer or development team, you push new code every day and want to identify security vulnerabilities early in the development cycle, ideally at every push. During a pull-request (PR) review, all of the CodeGuru recommendations appear as comments, as if you had another pair of eyes on the PR. These comments include useful links to help you resolve the problem.

When you push new code or schedule a code review, recommendations will appear in the Security > Code scanning alerts tab on GitHub.

Let’s see how to integrate CodeGuru Reviewer with GitHub Actions.

First of all, create a .yml file in your repository under .github/workflows/ (or update an existing action). This file will contain all of your action’s steps. Let’s go through the individual steps.

The first step is configuring your AWS credentials. You want to do this securely, without storing any credentials in your repository’s code, using the Configure AWS Credentials action. This action allows you to configure an IAM role that GitHub will use to interact with AWS services. This role requires a few permissions related to CodeGuru Reviewer and Amazon S3: you can attach the AmazonCodeGuruReviewerFullAccess managed policy to the action role, in addition to the s3:GetObject, s3:PutObject, and s3:ListBucket permissions, as sketched below.
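For illustration, a minimal inline policy granting those S3 permissions might look like the following sketch (assuming the codeguru-reviewer-myactions-bucket bucket used later in this post):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::codeguru-reviewer-myactions-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::codeguru-reviewer-myactions-bucket"
    }
  ]
}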

This first step will look as follows:

- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v1
  with:
    aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws-region: eu-west-1

The access key and secret key correspond to your IAM role and will be used to interact with CodeGuru Reviewer and Amazon S3.

Next, you add the CodeGuru Reviewer action and a final step to upload the results:

- name: Amazon CodeGuru Reviewer Scanner
  uses: aws-actions/codeguru-reviewer
  if: ${{ always() }} 
  with:
    build_path: target # build artifact(s) directory
    s3_bucket: 'codeguru-reviewer-myactions-bucket'  # S3 Bucket starting with "codeguru-reviewer-*"
- name: Upload review result
  if: ${{ always() }}
  uses: github/codeql-action/upload-sarif@v1
  with:
    sarif_file: codeguru-results.sarif.json

The CodeGuru Reviewer action requires two input parameters:

  • build_path: Where your build artifacts are in the repository.
  • s3_bucket: The name of an S3 bucket that you’ve created previously, used to upload the build artifacts and analysis results. It’s a customer-owned bucket so you have full control over access and permissions, in case you need to share its content with other systems.

Now, let’s put all the pieces together.

Your .yml file should look like this:

name: CodeGuru Reviewer GitHub Actions Integration
on: [pull_request, push, schedule]
jobs:
  CodeGuru-Reviewer-Actions:
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-2
      - name: Amazon CodeGuru Reviewer Scanner
        uses: aws-actions/codeguru-reviewer
        if: ${{ always() }}
        with:
          build_path: target # build artifact(s) directory
          s3_bucket: 'codeguru-reviewer-myactions-bucket'  # S3 Bucket starting with "codeguru-reviewer-*"
      - name: Upload review result
        if: ${{ always() }}
        uses: github/codeql-action/upload-sarif@v1
        with:
          sarif_file: codeguru-results.sarif.json

It’s important to remember that the S3 bucket name needs to start with codeguru-reviewer- and that these actions can be configured to run with the pull_request, push, or schedule triggers (check out the GitHub Actions documentation for the full list of events that trigger workflows). Also keep in mind that there are minor differences in how you configure GitHub-hosted runners and self-hosted runners, mainly in the credentials configuration step. For example, if you run your GitHub Actions in a self-hosted runner that already has access to AWS credentials, such as an EC2 instance, then you don’t need to provide any credentials to this action (check out the full documentation for self-hosted runners).

Now, when you push a change or open a PR, CodeGuru Reviewer will comment on your code changes with a few recommendations.

Or you can schedule a daily or weekly repository scan and check out the recommendations in the Security > Code scanning alerts tab.

New Security Detectors for Java
In December last year, we launched the Java Security Detectors for CodeGuru Reviewer to help you find and remediate potential security issues in your Java applications. These detectors are built with machine learning and automated reasoning techniques, trained on over 100,000 Amazon and open-source code repositories, and based on the decades of expertise of the AWS Application Security (AppSec) team.

For example, some of these detectors look for potential leaks of sensitive information or credentials through excessively verbose logging, through exception handling, and through passwords stored in plaintext in memory. The security detectors also help you identify several web application vulnerabilities, such as command injection, weak cryptography, weak hashing, LDAP injection, path traversal, missing secure cookie flags, SQL injection, XPATH injection, and XSS (cross-site scripting).
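To illustrate the kind of pattern such detectors target, here is a hypothetical example (the class and query are illustrative, not taken from the detectors themselves) contrasting an injectable query built by string concatenation with a parameterized one:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserLookup {
    // Vulnerable pattern a SQL injection detector would flag:
    // "SELECT * FROM users WHERE name = '" + name + "'"

    // A parameterized query keeps untrusted input out of the SQL text.
    public ResultSet findUser(Connection connection, String name) throws SQLException {
        PreparedStatement statement =
                connection.prepareStatement("SELECT * FROM users WHERE name = ?");
        statement.setString(1, name);
        return statement.executeQuery();
    }
}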

The new security detectors for Java can identify security issues with the Java Servlet APIs and web frameworks such as Spring. Some of the new detectors will also help you with security best practices for AWS APIs when using services such as Amazon S3, IAM, and AWS Lambda, as well as libraries and utilities such as Apache ActiveMQ, LDAP servers, SAML parsers, and password encoders.

Available Today at No Additional Cost
The new CI/CD integration and security detectors for Java are available today at no additional cost, excluding Amazon S3 storage, which can be estimated from the size of your build artifacts and the frequency of code reviews. Check out the CodeGuru Reviewer Action in the GitHub Marketplace and the Amazon CodeGuru pricing page to find pricing examples based on the new pricing model we launched last month.

We’re looking forward to hearing your feedback, launching more detectors to help you identify potential issues, and integrating with even more CI/CD tools in the future.

You can learn more about the CI/CD experience and configuration in the technical documentation.

Alex

New – AWS BugBust: It’s Game Over for Bugs

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/new-aws-bugbust-its-game-over-for-bugs/

Today, we are launching AWS BugBust, the world’s first global challenge to fix one million bugs and reduce technical debt by over $100 million.

You might have participated in a bug bash before. Many of the software companies where I’ve worked (including Amazon) run them in the weeks before launching a new product or service. AWS BugBust takes the concept of a bug bash to a new level.

AWS BugBust allows you to create and manage private events that will transform and gamify the process of finding and fixing bugs in your software. It includes automated code analysis, built-in leaderboards, custom challenges, and rewards. AWS BugBust fosters team building and introduces some friendly competition into improving code quality and application performance. What’s more, your developers can take part in the world’s largest code challenge, win fantastic prizes, and receive kudos from their peers.

Behind the scenes, AWS BugBust uses Amazon CodeGuru Reviewer and Amazon CodeGuru Profiler. These developer tools use machine learning and automated reasoning to find bugs in your applications. These bugs are then available for your developers to claim and fix. The more bugs a developer fixes, the more points the developer earns. A traditional bug bash requires developers to find and fix bugs manually. With AWS BugBust, developers get a list of bugs before the event begins so they can spend the entire event focused on fixing them.

Fix Your Bugs and Earn Points
As a developer, each time you fix a bug in a private event, points are allocated and added to the global leaderboard. Don’t worry: Only your handle (profile name) and points are displayed on the global leaderboard. Nobody can see your code or details about the bugs that you’ve fixed.

As developers reach significant individual milestones, they receive badges and collect exclusive prizes from AWS. For example, if they achieve 100 points, they win an AWS BugBust T-shirt; if they earn 2,000 points, they win an AWS BugBust Varsity Jacket. In addition, on September 30, 2021, the top 10 developers on the global leaderboard will receive a ticket to AWS re:Invent.

Create an Event
To show you how the challenge works, I’ll create a private AWS BugBust event. In the CodeGuru console, I choose Create BugBust event.

Under Step 1 - Rules and scoring, I see how many points are awarded for each type of bug fix. Profiling groups are used to determine performance improvements after the players submit their improved solutions.

In Step 2, I sign in to my player account. In Step 3, I add event details like name, description, and start and end time.

I also enter details about the first-, second-, and third-place prizes. This information will be displayed to players when they join the event.

After I have reviewed the details and created the event, my event dashboard displays essential information, and I can import work items and invite players.

I select the Import work items button, which takes me to the Import work items screen, where I choose Import bugs from CodeGuru Reviewer and profiling groups from CodeGuru Profiler. I choose a repository analysis from my account, and AWS BugBust imports all of the identified bugs for players to claim and fix. I also choose several profiling groups that will be used by AWS BugBust.

Now that my event is ready, I can invite players. Players can now sign into a player portal using their player accounts and start claiming and fixing bugs.

Things to Know
Amazon CodeGuru currently supports Python and Java. To compete in the global challenge, your project must be written in one of these languages.

Pricing
When you create your first AWS BugBust event, all costs incurred by the underlying usage of Amazon CodeGuru Reviewer and Amazon CodeGuru Profiler are free of charge for 30 days per AWS account. This 30-day free period applies even if you have already utilized the free tiers for Amazon CodeGuru Reviewer and Amazon CodeGuru Profiler. You can create multiple AWS BugBust events within the 30-day free trial period. After the free trial expires, you will be charged for Amazon CodeGuru Reviewer and Amazon CodeGuru Profiler based on your usage in the challenge. See the Amazon CodeGuru Pricing page for details.

Available Today
Starting today, you can create AWS BugBust events in the Amazon CodeGuru console in the US East (N. Virginia) Region. Start planning your AWS BugBust today.

— Martin

How Amazon CodeGuru Profiler helps Coursera create high-quality online learning experiences

Post Syndicated from Adnan Bilwani original https://aws.amazon.com/blogs/devops/coursera-codeguru-profiler/

This guest post was authored by Rana Ahsan and Chris Liu, Software Engineers at Coursera, and edited by Dave Sanderson and Adnan Bilwani, at AWS.

Coursera was launched in 2012 by two Stanford Computer Science professors, Daphne Koller and Andrew Ng, with a vision of providing universal access to world-class learning. It’s now one of the world’s leading online learning platforms with more than 77 million registered learners. Coursera partners with over 200 leading university and industry partners to offer a broad catalog of content and credentials, including Guided Projects, courses, Specializations, certificates, and accredited bachelor’s and master’s degrees. More than 6,000 institutions have used Coursera to transform their talent by upskilling and reskilling their employees, citizens, and students in the high-demand data science, technology, and business skills required to compete in today’s economy.

Coursera’s challenge

About a year and a half ago, we were debugging performance bottlenecks in our microservices: some requests were too slow and often resulted in server errors due to timeouts. Even after digging into the logs and looking through the existing performance metrics of the underlying systems, we couldn’t determine the cause or explain the behavior. Compounding the matter, we saw the issues across multiple applications, which raised the concern that the issue resided in code shared across applications.

To dig deeper into the problem, we started doing manual reviews of the code for the applications associated with these failures. This process turned out to be too laborious and time-consuming.

We also started building out our own debugging solution, utilizing shell scripts to capture a snapshot of the system and application state so that we could analyze the details of these bottlenecks. Although the scripts provided some valuable insights, we started to run into issues of scale: the process of deploying the tooling and collecting the data became tedious and cumbersome. At the same time, the more data we received back from our scripts, the more demand we saw for this level of detail.

Solution with Amazon CodeGuru

At this point, we decided to look into CodeGuru Profiler to determine whether it could provide the same insights as our tooling while reducing the administrative burden associated with deploying, aggregating, and analyzing the data.

Soon after deploying CodeGuru Profiler, we started to get valuable and actionable data about the issue. The profiler highlighted high CPU usage during the deserialization process used for application communications (see the following screenshot). This confirmed our suspicion that the issue was indeed in the shared code, but it was also an unexpected find, because JSON deserialization is widespread and hadn’t hit our radar as a potential candidate for the issue.

Flame graph highlighted high CPU usage

The deserialization performance was causing slow processing time for each request and increasing the overall communication latency. This led to performance degradation by causing the calls to slowly stack up over time until the request would time out. With CodeGuru Profiler, we were also able to establish a relationship between the size of the JSON document and the high CPU usage. As the document size increased, so did CPU usage. In some cases, this resulted in a resource starvation situation on the system, which in turn could cause other processes on the instance to crash, resulting in a sudden failure.

In addition to the performance and application instances failures, the high CPU usage during deserialization was quietly driving up costs. Because the issue was in shared code, the wasteful resource consumption was contributing to the need for additional instances to be added into the application cluster in order to meet throughput requirements.

Results

The details from CodeGuru Profiler enabled us to see and understand the underlying cause and move forward with a solution. Because the code is in the critical communication path for our microservices and shared across many applications, we decided to replace the JSON serialization with an alternative object serialization approach that was much more performant.

To get a deeper understanding of the performance impact of this change, we deployed the original code to a test environment. We used CodeGuru Profiler to evaluate the performance of the JSON and native object serialization. After we ran the same battery of tests against the new code, the deserialization CPU usage dropped between 30–40%.

Flame graph displayed lower CPU usage after changes

This resulted in a corresponding decrease in serialization-deserialization latency of 5–50% depending on response payload size and workload.

As we roll out this optimization across our fleet, we anticipate seeing significant improvements in the performance and stability of our microservices platform as well as a reduction in CPU utilization cost.

Conclusion

CodeGuru Profiler provides additional application insights and has helped us troubleshoot other issues in our applications, including resolving a case of flaky I/O errors and improving tail latency by recommending a different thread-pooling strategy.

CodeGuru Profiler was easy to maintain and scale across our fleet. It quickly identified these performance issues in our production environment and helped us verify and understand the impact of possible resolutions prior to releasing the code back into production.

About the authors

Rana Ahsan is a Staff Software Engineer at Coursera
Chris Liu is a Staff Software Engineer at Coursera
Adnan Bilwani is a Sr. Specialist Builders Experience at AWS
Dave Sanderson is a Sr. Technical Account Manager at AWS

Improving the CPU and latency performance of Amazon applications using AWS CodeGuru Profiler

Post Syndicated from Neha Gupta original https://aws.amazon.com/blogs/devops/improving-the-cpu-and-latency-performance-of-amazon-applications-using-aws-codeguru-profiler/

Amazon CodeGuru Profiler is a developer tool powered by machine learning (ML) that helps identify an application’s most expensive lines of code and provides intelligent recommendations for optimizing them. You can use it to identify application performance issues and to troubleshoot latency and CPU utilization issues in your application.

You can use CodeGuru Profiler to optimize performance for any application running on AWS Lambda, Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), AWS Fargate, or AWS Elastic Beanstalk, and on premises.

This post gives a high-level overview of how CodeGuru Profiler has reduced CPU usage and latency by approximately 50% and saved around $100,000 a year for a particular Amazon retail service.

Technical and business value of CodeGuru Profiler

CodeGuru Profiler is simple to use: just turn it on and start using it. You can keep it running in the background, then look into the CodeGuru Profiler findings and implement the relevant changes.

It’s fairly low cost. Unlike traditional tools that take up a lot of CPU and RAM, running CodeGuru Profiler adds less than 1% to an application’s total CPU usage and typically uses no more than 100 MB of memory.

You can run it in a pre-production environment to test changes to ensure no impact occurs on your application’s key metrics.

It automatically detects performance anomalies in application stack traces that start consuming more CPU or showing increased latency. It also provides visualizations and recommendations on how to fix performance issues, along with the estimated cost of running inefficient code. Detecting anomalies early prevents issues from escalating in production and helps you prioritize remediation, giving you enough time to fix an issue before it impacts your service’s availability and your customers’ experience.

How we used CodeGuru Profiler at Amazon

Amazon has onboarded many of its applications to CodeGuru Profiler, which has resulted in annual savings of millions of dollars and in latency improvements. In this post, we discuss how we used CodeGuru Profiler on an Amazon Prime service, where a simple code change resulted in savings of around $100,000 for the year.

Opportunity to improve

After a change to one of our data sources that caused its payload size to increase, we expected a slight increase to our service latency, but what we saw was higher than expected. Because CodeGuru Profiler is easy to integrate, we were able to quickly make and deploy the changes needed to get it running on our production environment.

After loading the profile in Amazon CodeGuru Profiler, it was immediately apparent from the visualization that a very large portion of the service’s CPU time was being taken up by Jackson deserialization (37% across the two call sites). It was also interesting that most of the blocking calls in the program (shown in blue) were happening in the jackson.databind method _createAndCacheValueDeserializer.

Flame graphs represent the relative amount of time that the CPU spends at each point in the call graph. The wider a frame is, the more CPU usage it corresponds to.

The following flame graph is from before the performance improvements were implemented.

The Flame Graph before the deployment

Looking at the source for _createAndCacheValueDeserializer confirmed that there was a synchronized block. From within it, _createAndCache2 was called, which did the actual adding to the cache. Adding to the cache was guarded by a boolean condition with a comment indicating that caching would only be enabled for custom serializers if @JsonCachable was set.

Solution

Checking the documentation for @JsonCachable confirmed that this annotation was the correct solution for this performance issue. After we deployed a quick change adding @JsonCachable to our four custom deserializers, we observed that no visible time was spent in _createAndCacheValueDeserializer.
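The change itself is small. The following is a minimal sketch, assuming the Jackson 1.x annotation from org.codehaus.jackson.map.annotate (the deserializer class and field are hypothetical, not the service’s actual code):

import java.io.IOException;

import org.codehaus.jackson.JsonParser;
import org.codehaus.jackson.map.DeserializationContext;
import org.codehaus.jackson.map.JsonDeserializer;
import org.codehaus.jackson.map.annotate.JsonCachable;

// With @JsonCachable present, Jackson caches this deserializer instance,
// so _createAndCacheValueDeserializer stops rebuilding it under a lock
// on every call.
@JsonCachable
public class PriceDeserializer extends JsonDeserializer<Double> {
    @Override
    public Double deserialize(JsonParser parser, DeserializationContext context)
            throws IOException {
        return Double.valueOf(parser.getText());
    }
}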

Results

Adding a one-line annotation in four different places made the code run twice as fast. Because the code held a lock while it recreated the same deserializers for every call, only one of the four CPU cores could be used, causing latency and inefficiency. Reusing the deserializers avoided the repeated work and saved us a lot of resources.

After the CodeGuru Profiler recommendations were implemented, the amount of CPU spent in Jackson reduced from 37% to 5% across the two call paths, and there was no visible blocking. With the removal of the blocking, we could run higher load on our hosts and reduce the fleet size, saving approximately $100,000 a year in Amazon EC2 costs, thereby resulting in overall savings.

The following flame graph shows performance after the deployment.

The Flame Graph after the deployment

Metrics

The following graph shows that CPU usage reduced by almost 50%. The blue line shows the CPU usage the week before we implemented CodeGuru Profiler recommendations, and green shows the dropped usage after deploying. We could later safely scale down the fleet to reduce costs, while still having better performance than prior to the change.

Average Fleet CPU Utilization

The following graph shows the server latency, which also dropped by almost 50%. The latency dropped from 100 milliseconds to 50 milliseconds as depicted in the initial portion of the graph. The orange line depicts p99, green p99.9, and blue p50 (mean latency).

Server Latency

Conclusion

With a few lines of changed code and a half-hour investigation, we removed the bottleneck, which lowered resource utilization and let us decrease the fleet size. We have seen many similar cases; in one instance, a change of literally six characters of inefficient code reduced CPU usage from 99% to 5%.

Across Amazon, CodeGuru Profiler has been used internally among various teams and resulted in millions of dollars of savings and performance optimization. You can use CodeGuru Profiler for quick insights into performance issues of your application. The more efficient the code and application is, the less costly it is to run. You can find potential savings for any application running in production and significantly reduce infrastructure costs using CodeGuru Profiler. Reducing fleet size, latency, and CPU usage is a major win.

About the Authors

Neha Gupta

Neha Gupta is a Solutions Architect at AWS and has 16 years of experience as a database architect and DBA. Apart from work, she’s outdoorsy and loves to dance.

Ian Clark

Ian is a Senior Software Engineer with the Last Mile organization at Amazon. In his spare time, he enjoys exploring the Vancouver area with his family.