Tag Archives: Automated reasoning

Resource leak detection in Amazon CodeGuru Reviewer

Post Syndicated from Pranav Garg original https://aws.amazon.com/blogs/devops/resource-leak-detection-in-amazon-codeguru/

This post discusses the resource leak detector for Java in Amazon CodeGuru Reviewer. CodeGuru Reviewer automatically analyzes pull requests (created in supported repositories such as AWS CodeCommit, GitHub, GitHub Enterprise, and Bitbucket) and generates recommendations for improving code quality. For more information, see Automating code reviews and application profiling with Amazon CodeGuru. This blog does not describe the resource leak detector for Python programs that is now available in preview.

What are resource leaks?

Resources are objects with a limited availability within a computing system. These typically include objects managed by the operating system, such as file handles, database connections, and network sockets. Because the number of such resources in a system is limited, an application must release them as soon as it has finished using them. Otherwise, the system runs out of resources and can't allocate new ones. The paradigm of acquiring a resource and releasing it is also followed by other categories of objects, such as metric wrappers and timers.

Resource leaks are bugs that arise when a program doesn’t release the resources it has acquired. Resource leaks can lead to resource exhaustion. In the worst case, they can cause the system to slow down or even crash.

Starting with Java 7, most classes holding resources implement the java.lang.AutoCloseable interface and provide a close() method to release them. However, a close() call in source code doesn’t guarantee that the resource is released along all program execution paths. For example, in the following sample code, resource r is acquired by calling its constructor and is closed along the path corresponding to the if branch, shown using green arrows. To ensure that the acquired resource doesn’t leak, you must also close r along the path corresponding to the else branch (the path shown using red arrows).

A resource must be closed along all execution paths to prevent resource leaks

Often, resource leaks manifest themselves along code paths that aren't frequently run, under heavy system load, or after the system has been running for a long time. As a result, such leaks are latent and can remain dormant in source code for long periods of time before manifesting themselves in production environments. This is the primary reason why resource leak bugs are difficult to detect or replicate during testing, and why automatically detecting them during pull requests and code scans is important.

Detecting resource leaks in CodeGuru Reviewer

For this post, we consider the following Java code snippet. In this code, method getConnection() attempts to create a connection in the connection pool associated with a data source. Typically, a connection pool limits the maximum number of connections that can remain open at any given time. As a result, you must close connections after their use so as to not exhaust this limit.

 1     private Connection getConnection(final BasicDataSource dataSource, ...)
               throws ValidateConnectionException, SQLException {
 2         boolean connectionAcquired = false;
 3         // Retrying three times to get the connection.
 4         for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
 5             Connection connection = dataSource.getConnection();
 6             // validateConnection may throw ValidateConnectionException
 7             if (! validateConnection(connection, ...)) {
 8                 // connection is invalid
 9                 DbUtils.closeQuietly(connection);
10             } else {
11                 // connection is established
12                 connectionAcquired = true;
13                 return connection;
14             }
15         }
16         return null;
17     }

At first glance, it seems that the method getConnection() doesn’t leak connection resources. If a valid connection is established in the connection pool (else branch on line 10 is taken), the method getConnection() returns it to the client for use (line 13). If the established connection is invalid (if branch on line 7 is taken), it’s closed in line 9 before another attempt is made to establish a connection.

However, method validateConnection() at line 7 can throw a ValidateConnectionException. If this exception is thrown after a connection is established at line 5, the connection is neither closed in this method nor is it returned upstream to the client to be closed later. Furthermore, if this exceptional code path runs frequently, for instance, if the validation logic throws on a specific recurring service request, each new request causes a connection to leak in the connection pool. Eventually, the client can’t acquire new connections to the data source, impacting the availability of the service.

A typical recommendation to prevent resource leak bugs is to declare the resource objects in a try-with-resources statement block. However, we can’t use try-with-resources to fix the preceding method because this method is required to return an open connection for use in the upstream client. The CodeGuru Reviewer recommendation for the preceding code snippet is as follows:

“Consider closing the following resource: connection. The resource is referenced at line 7. The resource is closed at line 9. The resource is returned at line 13. There are other execution paths that don’t close the resource or return it, for example, when validateConnection throws an exception. To prevent this resource leak, close connection along these other paths before you exit this method.”

As mentioned in the Reviewer recommendation, to prevent this resource leak, you must close the established connection when method validateConnection() throws an exception. This can be achieved by inserting the validation logic (lines 7–14) in a try block. In the finally block associated with this try, the connection must be closed by calling DbUtils.closeQuietly(connection) if connectionAcquired == false. The method getConnection() after this fix has been applied is as follows:

private Connection getConnection(final BasicDataSource dataSource, ...)
        throws ValidateConnectionException, SQLException {
    boolean connectionAcquired = false;
    // Retrying three times to get the connection.
    for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
        Connection connection = dataSource.getConnection();
        try {
            // validateConnection may throw ValidateConnectionException
            if (! validateConnection(connection, ...)) {
                // connection is invalid
            } else {
                // connection is established
                connectionAcquired = true;
                return connection;
            }
        } finally {
            // Close the connection along all paths, including exceptional
            // ones, unless it's being returned to the client.
            if (!connectionAcquired) {
                DbUtils.closeQuietly(connection);
            }
        }
    }
    return null;
}
As shown in this example, resource leaks in production services can be very disruptive. Furthermore, leaks that manifest along exceptional or less frequently run code paths can be hard to detect or replicate during testing and can remain dormant in the code for long periods of time before manifesting themselves in production environments. With the resource leak detector, you can detect such leaks on objects belonging to a large number of popular Java types such as file streams, database connections, network sockets, timers and metrics, etc.

Combining static code analysis with machine learning for accurate resource leak detection

In this section, we dive deep into the inner workings of the resource leak detector. The resource leak detector in CodeGuru Reviewer uses static analysis algorithms and techniques. Static analysis algorithms perform code analysis without running the code. These algorithms are generally prone to high false positives (the tool might report correct code as having a bug). If the number of these false positives is high, it can lead to alarm fatigue and low adoption of the tool. As a result, the resource leak detector in CodeGuru Reviewer prioritizes precision over recall: the findings we surface are resource leaks with high accuracy, though CodeGuru Reviewer could potentially miss some resource leak findings.

The main reason for false positives in static code analysis is incomplete information available to the analysis. CodeGuru Reviewer requires only the Java source files, not the project's external dependencies or build artifacts, which reduces the friction of performing automated code reviews. As a result, the static analysis only has access to the code in the source repository and not to its external dependencies. To compensate, the resource leak detector in CodeGuru Reviewer combines static code analysis with a machine learning (ML) model that reasons about external dependencies to provide accurate recommendations.

To understand the use of the ML model, consider again the code above for method getConnection() that had a resource leak. In the code snippet, a connection to the data source is established by calling BasicDataSource.getConnection() method, declared in the Apache Commons library. As mentioned earlier, we don’t require the source code of external dependencies like the Apache library for code analysis during pull requests. Without access to the code of external dependencies, a pure static analysis-driven technique doesn’t know whether the Connection object obtained at line 5 will leak, if not closed. Similarly, it doesn’t know that DbUtils.closeQuietly() is a library function that closes the connection argument passed to it at line 9. Our detector combines static code analysis with ML that learns patterns over such external function calls from a large number of available code repositories. As a result, our resource leak detector knows that the connection doesn’t leak along the following code path:

  • A connection is established on line 5
  • Method validateConnection() returns false at line 7
  • DbUtils.closeQuietly() is called on line 9

This suppresses the possible false warning. At the same time, the detector knows that there is a resource leak when the connection is established at line 5, and validateConnection() throws an exception at line 7 that isn’t caught.

When we run CodeGuru Reviewer on this code snippet, it surfaces only the second leak scenario and makes an appropriate recommendation to fix this bug.

The ML model used in the resource leak detector has been trained on a large number of internal Amazon and GitHub code repositories.

Responses to the resource leak findings

Although closing an open resource in code isn’t difficult, doing so properly along all program paths is important to prevent resource leaks. This can easily be overlooked, especially along exceptional or less frequently run paths. As a result, the resource leak detector in CodeGuru Reviewer has surfaced findings at a relatively high frequency and has alerted developers within Amazon to thousands of resource leaks before they reached production.

The resource leak findings have seen a high developer acceptance rate, and developer feedback towards the resource leak detector has been very positive. Some of the feedback from developers includes “Very cool, automated finding,” “Good bot :),” and “Oh man, this is cool.” Developers have also concurred that the findings are important and need to be fixed.


Resource leak bugs are difficult to detect or replicate during testing. They can impact the availability of production services. As a result, it’s important to automatically detect these bugs early on in the software development workflow, such as during pull requests or code scans. The resource leak detector in CodeGuru Reviewer combines static code analysis algorithms with ML to surface only the high confidence leaks. It has a high developer acceptance rate and has alerted developers within Amazon to thousands of leaks before those leaks hit production.

New! Streamline existing IAM Access Analyzer findings using archive rules

Post Syndicated from Andrea Nedic original https://aws.amazon.com/blogs/security/new-streamline-existing-iam-access-analyzer-findings-using-archive-rules/

AWS Identity and Access Management (IAM) Access Analyzer generates comprehensive findings to help you identify resources that grant public and cross-account access. Now, you can also apply archive rules to existing findings, so you can better manage findings and focus on the findings that need your attention most.

You can think of archive rules as similar to email rules. You define email rules to automatically organize emails. With IAM Access Analyzer, you can define archive rules to automatically mark findings as intended access. Now, those rules can apply to existing as well as new IAM Access Analyzer findings. This helps you focus on findings for potential unintended access to your resources. You can then easily track and resolve these findings by reducing access, helping you to work towards least privilege.

In this post, first I give a brief overview of IAM Access Analyzer. Then I show you an example of how to create an archive rule to automatically archive findings for intended access. Finally, I show you how to update an archive rule to mark existing active findings as intended.

IAM Access Analyzer overview

IAM Access Analyzer helps you determine which resources can be accessed publicly or from other accounts or organizations. IAM Access Analyzer determines this by mathematically analyzing access control policies attached to resources. This form of analysis—called automated reasoning—applies logic and mathematical inference to determine all possible access paths allowed by a resource policy. This is how IAM Access Analyzer uses provable security to deliver comprehensive findings for potential unintended bucket access. You can enable IAM Access Analyzer in the IAM console by creating an analyzer for an account or an organization. Once you’ve created your analyzer, you can review findings for resources that can be accessed publicly or from other AWS accounts or organizations.
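Beyond the console, you can also enable an analyzer programmatically. The following is a minimal sketch using the boto3 CreateAnalyzer API; the analyzer name and helper are illustrative, and the call itself requires appropriate IAM permissions.

```python
def build_analyzer_request(name, scope="ACCOUNT"):
    # CreateAnalyzer accepts type "ACCOUNT" or "ORGANIZATION"
    if scope not in ("ACCOUNT", "ORGANIZATION"):
        raise ValueError(f"unsupported analyzer type: {scope}")
    return {"analyzerName": name, "type": scope}

def create_analyzer(name="demo-analyzer", scope="ACCOUNT"):
    # boto3 is imported inside the function so the sketch can be read
    # and tested without AWS credentials; the call needs real access.
    import boto3
    client = boto3.client("accessanalyzer")
    return client.create_analyzer(**build_analyzer_request(name, scope))["arn"]
```

Use "ORGANIZATION" as the scope to analyze access across an entire AWS Organization instead of a single account.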

Create an archive rule to automatically archive findings for intended access

When you review findings and discover common patterns for intended access, you can create archive rules to automatically archive those findings. This helps you focus on findings for unintended access to your resources, just like email rules help streamline your inbox.

To create an archive rule

In the IAM console, choose Archive rules under Access Analyzer. Then, choose Create archive rule to display the Create archive rule page shown in Figure 1. There, you find the option to name the rule or use the name generated by default. In the Rule section, you define criteria to match properties of findings you want to archive. Just like email rules, you can add multiple criteria to the archive rule. You can define each criterion by selecting a finding property, an operator, and a value. To help ensure a rule doesn’t archive findings for public access, the criterion Public access is false is suggested by default.

Figure 1: IAM Access Analyzer create archive rule page where you add criteria to create a new archive rule


For example, I have a security audit role external to my account that I expect to have access to resources in my account. To mark that access as intended, I create a rule to archive all findings for Amazon S3 buckets in my account that can be accessed by the security audit role outside of the account. To do this, I include two criteria: Resource type matches S3 bucket, and the AWS Account value matches the security audit role ARN. Once I add these criteria, the Results section displays the list of existing active findings the archive rule matches, as shown in Figure 2.

Figure 2: A rule to archive all findings for S3 buckets in an account that can be accessed by the audit role outside of the account, with matching findings displayed


When you’re done adding criteria for your archive rule, select Create and archive active findings to archive new and existing findings based on the rule criteria. Alternatively, you can choose Create rule to create the rule for new findings only. In the preceding example, I chose Create and archive active findings to archive all findings—existing and new—that match the criteria.
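The same kind of rule can also be created programmatically. The sketch below assumes the boto3 CreateArchiveRule API and the standard finding-filter syntax (keys such as resourceType, principal.AWS, and isPublic); the role ARN, rule name, and helper function are placeholders.

```python
def build_archive_rule_filter(audit_role_arn):
    # Filter keys follow the Access Analyzer finding-filter syntax;
    # "principal.AWS" matches the external AWS principal in a finding.
    return {
        "resourceType": {"eq": ["AWS::S3::Bucket"]},
        "principal.AWS": {"eq": [audit_role_arn]},
        "isPublic": {"eq": ["false"]},  # never auto-archive public access
    }

def create_audit_archive_rule(analyzer_name, rule_name, audit_role_arn):
    import boto3  # deferred so the sketch runs without AWS credentials
    client = boto3.client("accessanalyzer")
    client.create_archive_rule(
        analyzerName=analyzer_name,
        ruleName=rule_name,
        filter=build_archive_rule_filter(audit_role_arn),
    )
```

Note that the API creates the rule for new findings; archiving existing matching findings is a separate step.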

Update an archive rule to mark existing findings as intended

You can also update an archive rule to archive existing findings retroactively and streamline your findings. To edit an archive rule, choose Archive rules under Access Analyzer, then select an existing rule and choose Edit. In the Edit archive rule page, update the archive rule criteria and review the list of existing active findings the archive rule applies to. When you save the archive rule, you can apply it retroactively to existing findings by choosing Save and archive active findings as shown in Figure 3. Otherwise, you can choose Save rule to update the rule and apply it to new findings only.

Note: You can also use the new IAM Access Analyzer API operation ApplyArchiveRule to retroactively apply an archive rule to existing findings that meet the archive rule criteria.
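As a rough sketch, calling that operation from boto3 (where it's named apply_archive_rule) might look like the following; the helper and its validation are illustrative.

```python
def build_apply_request(analyzer_arn, rule_name):
    # ApplyArchiveRule takes the analyzer's ARN (not its name) plus the rule name
    if not analyzer_arn.startswith("arn:"):
        raise ValueError("expected a full analyzer ARN")
    return {"analyzerArn": analyzer_arn, "ruleName": rule_name}

def apply_rule_retroactively(analyzer_arn, rule_name):
    import boto3  # deferred import so the sketch runs without AWS access
    client = boto3.client("accessanalyzer")
    client.apply_archive_rule(**build_apply_request(analyzer_arn, rule_name))
```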


Figure 3: IAM Access Analyzer edit archive rule page where you can apply the rule retroactively to existing findings by choosing Save and archive active findings


Get started

To turn on IAM Access Analyzer, open the IAM console. IAM Access Analyzer is available at no additional cost in the IAM console and through APIs in all commercial AWS Regions, AWS China Regions, and AWS GovCloud (US). To learn more about IAM Access Analyzer and which resources it supports, visit the feature page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS IAM forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Andrea Nedic

Andrea is a Sr. Tech Product Manager for AWS Identity and Access Management. She enjoys hearing from customers about how they build on AWS. Outside of work, Andrea likes to ski, dance, and be outdoors. She holds a PhD from Princeton University.

How to automatically archive expected IAM Access Analyzer findings

Post Syndicated from Josh Joy original https://aws.amazon.com/blogs/security/how-to-automatically-archive-expected-iam-access-analyzer-findings/

AWS Identity and Access Management (IAM) Access Analyzer continuously monitors your Amazon Web Services (AWS) resource-based policies for changes in order to identify resources that grant public or cross-account access from outside your AWS account or organization. Access Analyzer findings include detailed information that you can use to make an informed decision about whether access to the shared resource was intended or not. The findings information includes the affected AWS resource, the external principal that has access, the condition from the policy statement that grants the access, and the access level, such as read, write, or the ability to modify permissions.

In this blog post, we show you how to automatically archive Access Analyzer findings for expected events, such as authorized resource access. The benefit of automatically archiving expected findings is to help you reduce distraction from findings that don’t require action, enabling you to concentrate on remediating any unexpected access to your shared resources.

Access Analyzer provides you with the ability to archive findings that show intended cross-account sharing of your AWS resources. The AWS service-provided archive mechanism provides you with built-in archive rules that can automatically archive new findings that meet the criteria you define (such as directive controls). For example, your organizational access controls might allow your auditor to have read-only IAM role cross-account access from your security account into all of your accounts. In this security auditor scenario, you can define a built-in archive rule to automatically archive the findings related to the auditor cross-account IAM role that has authorized read-only access.

A limitation of the built-in archive rules is that they are static and based only on simple pattern matching. To build your own custom archiving logic, you can create an AWS Lambda function that listens to Amazon CloudWatch Events. Access Analyzer forwards all findings to CloudWatch Events, and you can easily configure a CloudWatch Events rule to trigger a Lambda function for each Access Analyzer finding. For example, if you want to look up the tags on a resource, you can make an AWS API call based on the Amazon Resource Name (ARN) for the resource in your Lambda function. As another example, you might want to compute an overall risk score based on the various parts of a finding and archive everything below a certain threshold score that you define.
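To make that wiring concrete, the following sketch creates such a CloudWatch Events rule from Python. The rule name is illustrative, the source and detail-type strings follow the event format Access Analyzer publishes, and the Lambda target still needs to be attached separately with put_targets plus a lambda:InvokeFunction permission.

```python
import json

# Event pattern matching Access Analyzer findings on the default event bus
FINDING_EVENT_PATTERN = {
    "source": ["aws.access-analyzer"],
    "detail-type": ["Access Analyzer Finding"],
}

def create_finding_rule(rule_name="access-analyzer-findings"):
    # Illustrative rule name; attach the Lambda function as a target
    # in a separate put_targets call.
    import boto3
    events = boto3.client("events")
    return events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(FINDING_EVENT_PATTERN),
    )["RuleArn"]
```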

In this blog post, we show you how to configure a built-in archive rule, how to add context enrichment for more complex rules, and how to trigger an alert for unintended findings. We first cover the scenario of the auditor role using a built-in archive rule. Then, we show how to perform automated archive remediation by using CloudWatch Events with AWS Step Functions to add context enrichment and automatically remediate the authorized sharing of a cross-account AWS Key Management Service (AWS KMS) key. Finally, we show how to trigger alerts for the unintended sharing of a public Amazon Simple Storage Service (Amazon S3) bucket.


The solution we give here assumes that you have Access Analyzer enabled in your AWS account. You can find more details about enabling Access Analyzer in the Getting Started guide for that feature. Access Analyzer is available at no additional cost in the IAM console and through APIs in all commercial AWS Regions. Access Analyzer is also available through APIs in the AWS GovCloud (US) Regions.

How to use the built-in archive rules

In our first example, there is a security auditor cross-account IAM role that can be assumed by security automation tools from the central security AWS account. We use the built-in archive rules to automatically archive cross-account findings related to the cross-account security auditor IAM role.

To create a built-in archive rule

  1. In the AWS Management Console, choose Identity and Access Management (IAM). On the dashboard, choose Access Analyzer, and then choose Archive rules.
  2. Choose the Create archive rule button.
    Figure 1: Create archive rule


  3. You can select archive rule criteria based on your use case. For this example, in the search box, choose AWS Account as the criterion, since we want to automatically archive findings for the security auditor account.
    Figure 2: Select archive rule criteria


  4. You can now enter the value for the selected criterion. In this case, for Criteria, choose AWS Account, choose the equals operator, and then enter the account ID of the security audit account.
  5. After you’ve entered your criteria, choose the Create archive rule button.
    Figure 3: Finish creating the archive rule


    You should see a message confirming that you’ve successfully created a new archive rule.

    Figure 4: Successful creation of a new archive rule


How to automatically archive expected findings

We now show you how to automatically archive expected findings by using a serverless workflow that you define by using AWS Step Functions. We show you how to leverage Step Functions to enrich an Access Analyzer finding, evaluate the finding against your customized rule engine logic, and finally either archive the finding or send a notification. A CloudWatch Event Rule will trigger the Step Functions workflow when Access Analyzer generates a new finding.

Solution architecture – serverless workflow

The CloudWatch event bus delivers the Access Analyzer findings to the Step Functions workflow. The Step Functions workflow responds to each Access Analyzer finding and either archives the finding for authorized access or sends an Amazon Simple Notification Service (Amazon SNS) email notification for an unauthorized access finding, as shown in Figure 5.

Figure 5: Solution architecture for automatic archiving


The Step Functions workflow enriches the finding and provides contextual information to the rules engine for evaluation, as shown in Figure 6. The Access Analyzer finding is either archived or generates an alert, based on the result of the rules engine evaluation and the associated risk level. If you’re interested in remediating the finding, you can learn more by watching the talk AWS re:Invent 2019: [NEW LAUNCH!] Dive Deep into IAM Access Analyzer (SEC309).

Figure 6: Finding analysis and archival


This example uses four Lambda functions. One function is for context enrichment, a second function is for rule evaluation logic, a third function is to archive expected findings, and finally a fourth function is to send a notification for findings that require investigation by your security operations team.

First, the enrichment Lambda function retrieves the tags associated with the AWS resource. The following code example retrieves the S3 bucket tags.

def lookup_s3_tags(resource_arn):
  # get_bucket_tagging expects the bucket name, so strip the ARN prefix
  bucket_name = resource_arn.split(":::")[-1]

  s3_client = boto3.client("s3")
  bucket_tags = s3_client.get_bucket_tagging(Bucket=bucket_name)["TagSet"]

  return bucket_tags

The Lambda function can perform additional enrichment beyond looking up tags, such as looking up the AWS KMS key alias, as shown in the next code example.

def additional_enrichment(resource_type, resource_arn):
  additional_context = {}

  if resource_type == "AWS::KMS::Key":
    kms_client = boto3.client("kms")
    aliases = kms_client.list_aliases(KeyId=resource_arn)["Aliases"]
    additional_context["key_aliases"] = [alias["AliasName"] for alias in aliases]

  return additional_context

Next, the evaluation rule Lambda function determines whether the finding is authorized and can be archived, or whether the finding is unauthorized and a notification needs to be generated. In this example, we first check whether the resource is shared publicly and immediately alert on any unexpected public sharing of a resource. In particular, we explicitly don’t want public sharing of resources that are tagged Confidential. Our example method checks whether the “Data Classification” tag is set to “Confidential” and, if so, returns False so that a notification is triggered.

Also, we allow cross-account sharing of a key in the development environment with the tag key “IsAllowedToShare” and tag value “true”, tag key “Environment” with tag value “development”, and a key alias of “DevelopmentKey”.

# Evaluate Risk Level
# Return True to raise alert if risk level exceeds threshold
# Return False to archive finding
def should_raise_alert(finding_details, tags, additional_context):
  if (
      finding_details.get("isPublic")
      and not is_allowed_public(finding_details, tags, additional_context)
  ):
    return True
  elif (
      tags.get("IsAllowedToShare") == "true"
      and tags.get("Environment") == "development"
      and "DevelopmentKey" in additional_context.get("key_aliases", [])
  ):
    return False

  return True

def is_allowed_public(finding_details, tags, additional_context):
  # customize your logic here
  # for example, if Data Classification is Confidential, return False for no public access
  if "Data Classification" in tags and tags["Data Classification"] == "Confidential":
    return False

  return True

# In the evaluation Lambda handler (wrapper name is illustrative),
# the result of the rule evaluation drives the workflow:
def evaluate(finding_details, tags, additional_context):
  if should_raise_alert(finding_details, tags, additional_context):
    return {"status": "NOTIFY"}
  return {"status": "ARCHIVE"}

We then use the Choice condition to trigger either the archive or notification step.

 next(sfn.Choice(self, "Archive?"). \
  when(sfn.Condition.string_equals("$.guid.status", "ARCHIVE"), archive_task). \
  when(sfn.Condition.string_equals("$.guid.status", "NOTIFY"), notification_task))

The archive Lambda step archives the Access Analyzer finding if a rule is successfully evaluated.

def archive_finding(finding_id, analyzer_arn):
  access_analyzer_client = boto3.client("accessanalyzer")
  # UpdateFindings marks the finding as archived so it no longer shows as active
  access_analyzer_client.update_findings(
      analyzerArn=analyzer_arn,
      ids=[finding_id],
      status="ARCHIVED"
  )

Otherwise, we raise an SNS notification because there is unauthorized resource sharing.

def send_notification(event, topic_arn):
  resource_type = event["detail"]["resourceType"]
  resource_arn = event["detail"]["resource"]

  sns_client = boto3.client('sns')
  sns_client.publish(
      TopicArn=topic_arn,
      Message=f"Alert {resource_type} {resource_arn} exceeds risk level.",
      Subject="Alert Access Analyzer Finding"
  )

Solution deployment

You can deploy the solution through either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).


Make sure that Access Analyzer is enabled in your AWS account. You can find an AWS CloudFormation template for doing so in the GitHub repository. It’s also possible for you to enable Access Analyzer across your organization by using the scripts for AWS CloudFormation StackSets found in the GitHub repository. See more details in the blog post Enabling AWS IAM Access Analyzer on AWS Control Tower accounts.

To deploy the solution by using the AWS Management Console

  1. In your security account, launch the template by choosing the following Launch Stack button.
  2. Provide the following parameter for the security account:
    EmailSubscriptionParameter: The email address to receive subscription notifications for any findings that exceed your defined risk level.

To deploy the solution by using the AWS CDK

Additionally, you can find the latest code on GitHub, where you can also contribute to the sample code. The following commands show how to deploy the solution by using the AWS Cloud Development Kit (AWS CDK). First, upload the Lambda assets to S3. Then, deploy the solution to your account.

cdk bootstrap

cdk deploy --parameters EmailSubscriptionParameter=YOUR_EMAIL_ADDRESS_HERE

To test the solution

  1. Create a cross-account KMS key. You should receive an email notification after several minutes.
  2. Create a cross-account KMS key with the tags IsAllowedToShare=true and Environment=development. Also, create a KMS key alias named alias/DevelopmentKey for this key. After a few seconds, you should see that the finding was automatically archived.
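The second test can be scripted. The sketch below is illustrative: the external account ID is a placeholder, and the key policy is a minimal cross-account grant. A real key policy must also grant your own principals administrative access, or the CreateKey lockout safety check will reject it.

```python
import json

def build_test_key_request(external_account_id):
    # Minimal cross-account key policy for the test; not production-ready
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCrossAccountUse",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{external_account_id}:root"},
            "Action": ["kms:Decrypt", "kms:DescribeKey"],
            "Resource": "*",
        }],
    }
    return {
        "Policy": json.dumps(policy),
        "Tags": [
            {"TagKey": "IsAllowedToShare", "TagValue": "true"},
            {"TagKey": "Environment", "TagValue": "development"},
        ],
    }

def create_test_key(external_account_id):
    import boto3  # deferred so the sketch runs without AWS credentials
    kms = boto3.client("kms")
    key_id = kms.create_key(**build_test_key_request(external_account_id))["KeyMetadata"]["KeyId"]
    # The alias the evaluation rule expects for auto-archiving
    kms.create_alias(AliasName="alias/DevelopmentKey", TargetKeyId=key_id)
    return key_id
```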


In this blog post, we showed you how IAM Access Analyzer can help you identify resources in your organization and accounts that are shared with an external identity. We explained how to automatically archive expected findings by using the built-in archive rules. Then, we walked you through how to automatically archive expected shared resources. We showed you how to create a serverless workflow that uses AWS Step Functions, which performs context enrichment and then automatically archives your findings for expected shared resources.

After you follow the steps in this blog post for automatic archiving, you will only receive Access Analyzer findings for unexpected AWS resource sharing. A good way to manage these unexpected Access Analyzer findings is with AWS Security Hub, alongside your other findings. Visit Getting started with AWS Security Hub to learn more. You can also see the blog post Automated Response and Remediation with AWS Security Hub for event patterns and remediation code examples.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Josh Joy

Josh is a Security Consultant with the AWS Global Security Practice, a part of our Worldwide Professional Services Organization. Josh helps customers improve their security posture as they migrate their most sensitive workloads to AWS. Josh enjoys diving deep and working backwards in order to help customers achieve positive outcomes.


Andrew Gacek

Andrew is a Principal Applied Scientist in the Automated Reasoning Group at Amazon. He designs analyses to ensure the safety and security of AWS customer configurations. Prior to joining Amazon, Andrew worked at Rockwell Collins where he used automated reasoning to verify aerospace applications. He holds a PhD in Computer Science from the University of Minnesota.

IAM Access Analyzer flags unintended access to S3 buckets shared through access points

Post Syndicated from Andrea Nedic original https://aws.amazon.com/blogs/security/iam-access-analyzer-flags-unintended-access-to-s3-buckets-shared-through-access-points/

Customers use Amazon Simple Storage Service (S3) buckets to store critical data and manage access to data at scale. With Amazon S3 Access Points, customers can easily manage shared data sets by creating separate access points for individual applications. Access points are unique hostnames attached to a bucket and customers can set distinct permissions using access point policies. To help you identify buckets that can be accessed publicly or from other AWS accounts or organizations, AWS Identity and Access Management (IAM) Access Analyzer mathematically analyzes resource policies. Now, Access Analyzer analyzes access point policies in addition to bucket policies and bucket ACLs. This helps you find unintended access to S3 buckets that use access points. Access Analyzer makes it easier to identify and remediate unintended public, cross-account, or cross-organization sharing of your S3 buckets that use access points. This enables you to restrict bucket access and adhere to the security best practice of least privilege.

In this post, first I review Access Analyzer and how to enable it. Then I walk through an example of how to use Access Analyzer to identify an S3 bucket that is shared through an access point. Finally, I show you how to view Access Analyzer bucket findings in the S3 Management Console.

IAM Access Analyzer overview

Access Analyzer helps you determine which resources can be accessed publicly or from other accounts or organizations. Access Analyzer determines this by mathematically analyzing access control policies attached to resources. This form of analysis, called automated reasoning, applies logic and mathematical inference to determine all possible access paths allowed by a resource policy. This is how IAM Access Analyzer uses provable security to deliver comprehensive findings for unintended bucket access. You can enable Access Analyzer by navigating to the IAM console. From there, select Access Analyzer to create an analyzer for an account or an organization.
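As a rough intuition for what this analysis decides, here is a toy sketch in Python. This is emphatically not how Access Analyzer works internally (it uses SMT-based semantic analysis rather than simple inspection), and every name below is invented for illustration. The sketch asks whether a bucket policy grants access to any principal outside a trusted account:

```python
# Toy model only -- NOT Access Analyzer's real algorithm. It illustrates
# the question being answered: "can anyone outside my account get in?"

def allows_external_access(policy: dict, trusted_account: str) -> bool:
    """Return True if any Allow statement grants access to a principal
    outside the trusted account (or to everyone via "*")."""
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            return True  # public access
        if not isinstance(principal, dict):
            continue
        arns = principal.get("AWS", [])
        if isinstance(arns, str):
            arns = [arns]
        for arn in arns:
            if arn == "*" or trusted_account not in arn:
                return True  # cross-account or public principal
    return False

# An illustrative bucket policy sharing the bucket with account 123456789012.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam::123456789012:root"]},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-test-bucket/*",
    }],
}

print(allows_external_access(policy, "111111111111"))  # True: shared cross-account
print(allows_external_access(policy, "123456789012"))  # False: same account
```

Real policies involve conditions, wildcards, NotPrincipal, and interactions between multiple policy types, which is why Access Analyzer relies on mathematical proof over all possible requests rather than inspection of individual fields.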

How to use IAM Access Analyzer to identify an S3 bucket shared through an access point

Once you’ve created your analyzer, you can view findings for resources that can be accessed publicly or from other AWS accounts or organizations. For your S3 bucket findings, the Shared through column indicates whether a bucket is shared through its S3 bucket policy, one of its access points, or the bucket ACL. Looking at the Shared through column in the image below, we see the first finding is shared through an Access point.

Figure 1: IAM Access Analyzer report of findings for resources shared outside of my account


If you use access points to manage bucket access and one of your buckets is shared through an access point, you will see the bucket finding indicate ‘Access Point’. In this example, I select the first finding to learn more. In the detail image below, you can see that the Shared through field lists the Amazon Resource Name (ARN) of the access point that grants access to the bucket, along with details of the resources and principals involved. If this access wasn’t your intent, you can review the access point details in the S3 console. There you can modify the access point policy to remove access.

Figure 2: IAM Access Analyzer finding details for a bucket shared through an access point

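A cross-account access point policy that would surface a finding like this might look roughly as follows (the account IDs, access point name, and Region are illustrative, not taken from the example above):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:root" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:us-east-1:111122223333:accesspoint/my-access-point/object/*"
    }
  ]
}
```

Removing or scoping down a statement like this in the S3 console removes the cross-account access, and the corresponding finding is resolved on the next analysis.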

How to use Access Analyzer for S3 to identify an S3 bucket shared through an access point

You can also view Access Analyzer findings for S3 buckets in the S3 Management Console with Access Analyzer for S3. This view reports S3 buckets that are configured to allow access to anyone on the internet or to other AWS accounts, including accounts outside of your AWS organization. For each public or shared bucket, Access Analyzer for S3 displays whether the bucket is shared through the bucket policy, access points, or the bucket ACL. In the example below, we see that the my-test-public-bucket bucket is set to public access through a bucket policy and the bucket ACL. Additionally, the my-test-bucket bucket is shared with other AWS accounts through a bucket policy and one or more access points. After you identify a bucket with unintended access using Access Analyzer for S3, you can Block Public Access to the bucket. Amazon S3 block public access settings override the bucket policies that are applied to the bucket. The settings also override the access point policies applied to the bucket’s access points.

Figure 3: Access Analyzer for S3 findings report in the S3 Management Console

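The Block Public Access settings mentioned above correspond to four flags; enabling all of them looks roughly like this (a sketch of the S3 public access block configuration applied per bucket or account-wide):

```json
{
  "BlockPublicAcls": true,
  "IgnorePublicAcls": true,
  "BlockPublicPolicy": true,
  "RestrictPublicBuckets": true
}
```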

Next steps

To turn on IAM Access Analyzer at no additional cost, head over to the IAM console. IAM Access Analyzer is available in the IAM console and through APIs in all commercial AWS Regions and AWS GovCloud (US). To learn more about IAM Access Analyzer, visit the feature page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS IAM Forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Andrea Nedic

Andrea is a Senior Technical Program Manager in the AWS Automated Reasoning Group. She enjoys hearing from customers about how they build on AWS. Outside of work, Andrea likes to ski, dance, and be outdoors. She holds a PhD from Princeton University.

Identify Unintended Resource Access with AWS Identity and Access Management (IAM) Access Analyzer

Post Syndicated from Brandon West original https://aws.amazon.com/blogs/aws/identify-unintended-resource-access-with-aws-identity-and-access-management-iam-access-analyzer/

Today I get to share my favorite kind of announcement. It’s the sort of thing that will improve security for just about everyone that builds on AWS, it can be turned on with almost no configuration, and it costs nothing to use. We’re launching a new, first-of-its-kind capability called AWS Identity and Access Management (IAM) Access Analyzer. IAM Access Analyzer mathematically analyzes access control policies attached to resources and determines which resources can be accessed publicly or from other accounts. It continuously monitors all policies for Amazon Simple Storage Service (S3) buckets, IAM roles, AWS Key Management Service (KMS) keys, AWS Lambda functions, and Amazon Simple Queue Service (SQS) queues. With IAM Access Analyzer, you have visibility into the aggregate impact of your access controls, so you can be confident your resources are protected from unintended access from outside of your account.

Let’s look at a couple of examples. An IAM Access Analyzer finding might indicate that an S3 bucket named my-bucket-1 is accessible to an AWS account with the ID 123456789012 when requests originate from a particular source IP address. Or IAM Access Analyzer may detect a KMS key policy that allows users from another account to delete the key, identifying a data loss risk you can fix by adjusting the policy. If the findings show intentional access paths, they can be archived.
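For the KMS example, a key policy statement that creates the data-loss risk described above could look roughly like this (the external account ID is illustrative):

```json
{
  "Sid": "AllowExternalAccountToScheduleKeyDeletion",
  "Effect": "Allow",
  "Principal": { "AWS": "arn:aws:iam::444455556666:root" },
  "Action": "kms:ScheduleKeyDeletion",
  "Resource": "*"
}
```

If account 444455556666 isn’t meant to have this power, tightening or removing the statement resolves the finding.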

So how does it work? Using the kind of math that shows up on unexpected final exams in my nightmares, IAM Access Analyzer evaluates your policies to determine how a given resource can be accessed. Critically, this analysis is not based on historical events or pattern matching or brute force tests. Instead, IAM Access Analyzer understands your policies semantically. All possible access paths are verified by mathematical proofs, and thousands of policies can be analyzed in a few seconds. This is done using a branch of computer science called automated reasoning. IAM Access Analyzer is the first service powered by automated reasoning available to builders everywhere, offering functionality unique to AWS. To start learning about automated reasoning, I highly recommend this short video explainer. If you are interested in diving a bit deeper, check out this re:Invent talk on automated reasoning from Byron Cook, Director of the AWS Automated Reasoning Group. And if you’re really interested in understanding the methodology, make yourself a nice cup of chamomile tea, grab a blanket, and get cozy with a copy of Semantic-based Automated Reasoning for AWS Access Policies using SMT.

Turning on IAM Access Analyzer is way less stressful than an unexpected nightmare final exam. There’s just one step. From the IAM Console, select Access analyzer from the menu on the left, then click Create analyzer.

Creating an Access Analyzer

Analyzers generate findings in the account from which they are created. Analyzers also work within the region defined when they are created, so create one in each region for which you’d like to see findings.

Once our analyzer is created, findings that show accessible resources appear in the Console. My account has a few findings that are worth looking into, such as KMS keys and IAM roles that are accessible by other accounts and federated users.

Viewing Access Analyzer Findings

I’m going to click on the first finding and take a look at the access policy for this KMS key.

An Access Analyzer Finding

From here we can see the open access paths and details about the resources and principals involved. I went over to the KMS console and confirmed that this is intended access, so I archived this particular finding.

All IAM Access Analyzer findings are visible in the IAM Console, and can also be accessed using the IAM Access Analyzer API. Findings related to S3 buckets can be viewed directly in the S3 Console. Bucket policies can then be updated right in the S3 Console, closing the open access pathway.

An Access Analyzer finding in S3

You can also see high-priority findings generated by IAM Access Analyzer in AWS Security Hub, ensuring a comprehensive, single source of truth for your compliance and security-focused team members. IAM Access Analyzer also integrates with CloudWatch Events, making it easy to automatically respond to or send alerts regarding findings through the use of custom rules.
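For example, a CloudWatch Events rule can match Access Analyzer findings with an event pattern along these lines (a sketch; verify the exact source and detail-type values against the service documentation):

```json
{
  "source": ["aws.access-analyzer"],
  "detail-type": ["Access Analyzer Finding"]
}
```

A rule with this pattern can target an SNS topic for alerts or a Lambda function for automated response.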

Now that you’ve seen how IAM Access Analyzer provides a comprehensive overview of cloud resource access, you should probably head over to IAM and turn it on. One of the great advantages of building in the cloud is that the infrastructure and tools continue to get stronger over time and IAM Access Analyzer is a great example. Did I mention that it’s free? Fire it up, then send me a tweet sharing some of the interesting things you find. As always, happy building!

— Brandon

AWS Security Profile: Byron Cook, Director of the AWS Automated Reasoning Group

Post Syndicated from Supriya Anand original https://aws.amazon.com/blogs/security/aws-security-profile-byron-cook-director-aws-automated-reasoning-group/


Byron Cook leads the AWS Automated Reasoning Group, which automates proof search in mathematical logic and builds tools that provide AWS customers with provable security. Byron has pushed boundaries in this field, delivered real-world applications in the cloud, and fostered a sense of community amongst its practitioners. In recognition of Byron’s contributions to cloud security and automated reasoning, the UK’s Royal Academy of Engineering elected him as one of 7 new Fellows in computing this year.

I recently sat down with Byron to discuss his new Fellowship, the work that it celebrates, and how he and his team continue to use automated reasoning in new ways to provide higher security assurance for customers in the AWS cloud.

Congratulations, Byron! Can you tell us a little bit about the Royal Academy of Engineering, and the significance of being a Fellow?

Thank you. I feel very honored! The Royal Academy of Engineering is focused on engineering in the broad sense; for example, aeronautical, biomedical, materials, etc. I’m one of only 7 Fellows elected this year that specialize in computing or logic, making the announcement really unique.

As for what the Royal Academy of Engineering is: the UK has Royal Academies for key disciplines such as music, drama, etc. The Royal Academies focus financial support and recognition on these fields, and give a location and common meeting place. The Royal Academy of Music, for example, is near Regent’s Park in West London. The Royal Academy of Engineering’s building is in Carlton Place, one of the most exclusive locations in central London near Pall Mall and St. James’ Park. I’ve been to a number of lectures and events in that space. For example, it’s where I spoke ten years ago when I was the recipient of the Roger Needham prize. Some examples of previously elected Fellows include Sir Frank Whittle, who invented the jet engine; radar pioneer Sir George MacFarlane; and Sir Tim Berners-Lee, who developed the world-wide web.

Can you tell us a little bit about why you were selected for the award?

The letter I received from the Royal Academy says it better than I could say myself:

“Byron Cook is a world-renowned leader in the field of formal verification. For over 20 years Byron has worked to bring this field from academic hypothesis to mechanised industrial reality. Byron has made major research contributions, built influential tools, led teams that operationalised formal verification activities, and helped establish connections between others that have dramatically accelerated growth of the area. Byron’s tools have been applied to a wide array of topics, e.g. biological systems, computer operating systems, programming languages, and security. Byron’s Automated Reasoning Group at Amazon is leading the field to even greater success”.

Formal verification is the one term here that may be foreign to you, so perhaps I should explain. Formal verification is the use of mathematical logic to prove properties of systems. Euclid, for example, used formal verification in ~300 BC to prove that the Pythagorean theorem holds for all possible right-angled triangles. Today we are using formal verification to prove things about all possible configurations that a computer program might reach. When I founded Amazon’s Automated Reasoning Group, I named it that because my ambition was to automate all of the reasoning performed during formal verification.

Can you give us a bit of detail about some of the “research contributions and tools” mentioned in the text from Royal Academy of Engineering?

Probably my best-known work before joining Amazon was on the Terminator tool. Terminator was designed to reason at compile-time about what a given computer program would eventually do when running in production. For example, “Will the program eventually halt?” This is the famous “Halting problem,” proved undecidable in the 1930s. The Terminator tool piloted a new approach to the problem which is popular now, based on the idea of incrementally improving the best guess for a proof based on failed proof attempts. This was the first known approach capable of scaling termination proving to industrial problems. My colleagues and I used Terminator to find bugs in device drivers that could cause operating systems to become unresponsive. We found many bugs in device drivers that ran keyboards, mice, network devices, and video cards. The Terminator tool was also the basis of BioModelAnalyzer. It turns out that there’s a connection between diseases like Leukemia and the Halting problem: Leukemia is a termination bug in the genetic-regulatory pathways in your blood. You can think of it in the same way you think of a device driver that’s stuck in an infinite loop, causing your computer to freeze. My tools helped answer fundamental questions that no tool could solve before. Several pharmaceutical companies use BioModelAnalyzer today to understand disease and find new treatment options. And these days, there is an annual international competition with many termination provers that are much better than the Terminator. I think that this is what the Royal Academy is talking about when they say I moved the area from “academic hypothesis to mechanized industrial reality.”

I have also worked on problems related to the question of P=NP, the most famous open problem in computing theory. From 2000-2006, I built tools that made NP feel equal to P in certain limited circumstances to try and understand the problem better. Then I focused on circumstances that aligned with important industrial problems, like proving the absence of bugs in microprocessors, flight control software, telecommunications systems, and railway control systems. These days the tools in this space are incredibly powerful. You should check out the software tools CVC4 or Z3.

And, of course, there’s my work with the Automated Reasoning Group, where I’ve built a team of domain experts that develop and apply formal verification tools to a wide variety of problems, helping make the cloud more secure. We have built tools that automatically reason about the semantics of policies, networks, cryptography, virtualization, etc. We reason about the implementation of Amazon Web Services (AWS) itself, and we’ve built tools that help customers prove the correctness of their AWS-based implementations.

Could you go into a bit more detail about how this work connects to Amazon and its customers?

AWS provides cloud services globally. Cloud is shorthand for on-demand access to IT resources such as compute, storage, and analytics via the Internet with pay-as-you-go pricing. AWS has a wide variety of customers, ranging from individuals to the largest enterprises, and practically all industries. My group develops mathematical proof tools that help make AWS more secure, and helps AWS customers understand how to build in the cloud more securely.

I first became an AWS customer myself when building BioModelAnalyzer. AWS allowed those of us working on this project to solve major scientific challenges (see this Nature Scientific Report for an example) using very large datacenters, but without having to buy the machines, maintain the machines, maintain the rooms that the machines would sit in, the A/C system that would keep them cool, etc. I was also able to easily provide our customers with access to the tool via the cloud, because it’s all over the internet. I just pointed people to the end-point on the internet and, presto, they were using the tool. About 5 years before developing BioModelAnalyzer, I was developing proof tools for device drivers and I gave a demo of the tool to my executive leadership. At the end of the demo, I was asked if 5,000 machines would help us do more proofs. Computationally, the answer was an obvious “yes,” but then I thought a minute about the amount of overhead required to manage a fleet of 5,000 machines and reluctantly replied “No, but thank you very much for the offer!” With AWS, it’s not even a question. Anyone with an Amazon account can provision 5,000 machines for practically nothing. In less than 5 minutes, you can be up and running and computing with thousands of machines.

What I love about working at AWS is that I can focus a very small team on proving the correctness of some aspect of AWS (for example, the cryptography) and, because of the size and importance of the customer base, we make much of the world meaningfully more secure. Just to name a few examples: s2n (the Amazon TLS implementation); the AWS Key Management Service (KMS), which allows customers to securely store crypto keys; and networking extensions to the IoT operating system Amazon FreeRTOS, which customers use to link cloud to IoT devices, such as robots in factories. We also focus on delivering service features that help customers prove the correctness of their AWS-based implementations. One example is Tiros, which powers a network reachability feature in Amazon Inspector. Another example is Zelkova, which powers features in services such as Amazon S3, AWS Config, and AWS IoT Device Defender.

When I think of mathematical logic I think of obscure theory and messy blackboards, not practical application. But it sounds like you’ve managed to balance the tension between theory and practical industrial problems?

I think that this is a common theme that great scientists don’t often talk about. Alan Turing, for example, did his best work during the war. John Snow, who made fundamental contributions to our understanding of germs and epidemics, did his greatest work while trying to figure out why people were dying in the streets of London. Christopher Strachey, one of the founders of our field, wrote:

“It has long been my personal view that the separation of practical and theoretical work is artificial and injurious. Much of the practical work done in computing, both in software and in hardware design, is unsound and clumsy because the people who do it have not any clear understanding of the fundamental design principles in their work. Most of the abstract mathematical and theoretical work is sterile because it has no point of contact with real computing.”

Throughout my career, I’ve been at the intersection of practical and theoretical. In the early days, this was driven by necessity: I had two children during my PhD and, frankly, I needed the money. But I soon realized that my deep connection to real engineering problems was an advantage and not a disadvantage, and I’ve tried through the rest of my career to stay in that hot spot of commercially applicable problems while tackling abstract mathematical topics.

What’s next for you? For the Automated Reasoning Group? For your scientific field?

The Royal Academy of Engineering kindly said that I’ve brought “this field from academic hypothesis to mechanized industrial reality.” That’s perhaps true, but we are very far from done: it’s not yet an industrial standard. The full power of automated reasoning is not yet available to everyone because today’s tools are either difficult to use or weak. The engineering challenge is to make them both powerful and easy to use. With that I believe that they’ll become a key part of every software engineer’s daily routine. What excites me is that I believe that Amazon has a lot to teach me about how to operationalize the impossible. That’s what Amazon has done over and over again. That’s why I’m at Amazon today. I want to see these proof techniques operating automatically at Amazon scale.

Provable security webpage
Lecture: Fundamentals for Provable Security at AWS
Lecture: The evolution of Provable Security at AWS
Lecture: Automating compliance verification using provable security
Lecture: Byron speaks about Terminator at University of Colorado

If you have feedback about this post, let us know in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

AWS Security Profile: Rustan Leino, Senior Principal Applied Scientist

Post Syndicated from Supriya Anand original https://aws.amazon.com/blogs/security/aws-security-profile-rustan-leino-senior-principal-applied-scientist/


I recently sat down with Rustan from the Automated Reasoning Group (ARG) at AWS to learn more about the prestigious Computer Aided Verification (CAV) Award that he received, and to understand the work that led to the prize. CAV is a top international conference on formal verification of software and hardware. It brings together experts in this field to discuss groundbreaking research and applications of formal verification in both academia and industry. Rustan received this award as a result of his work developing program-verification technology. Rustan and his team have taken his research and applied it in unique ways to protect AWS core infrastructure on which customers run their most sensitive applications. He shared details about his journey in the formal verification space, the significance of the CAV award, and how he plans to continue scaling formal verification for cloud security at AWS.

Congratulations on your CAV Award! Can you tell us a little bit about the significance of the award and why you received it?

Thanks! I am thrilled to jointly receive this award with Jean-Christophe Filliâtre, who works at the CNRS Research Laboratory in France. The CAV Award recognizes fundamental contributions to program verification, that is, the field of mathematically proving the correctness of software and hardware. Jean-Christophe and I were recognized for building intermediate verification languages (IVLs), which are a central building block of modern program verifiers.

It’s like this: the world relies on software, and the world relies on that software to function correctly. Software is written by software engineers using some programming language. If the engineers want to check, with mathematical precision, that a piece of software always does what it is intended to do, then they use a program verifier for the programming language at hand. IVLs have accelerated the building of program verifiers for many languages. So, IVLs aid the construction of program verifiers which, in turn, improve software quality that, in turn, makes technology more reliable for all.

What is your role at AWS? How are you applying technologies you’ve been recognized by CAV for at AWS?

I am building and applying proof tools to ensure the correctness and security of various critical components of AWS. This lets us deliver a better and safer experience for our customers. Several tools that we apply are based on IVLs. Among them are the SideTrail verifier for timing-based attacks, the VCC verifier for concurrent systems code, and the verification-aware programming language Dafny, all of which are built on my IVL named Boogie.

What does an automated program verification tool do?

An automated program verifier is a tool that checks if a program behaves as intended. More precisely, the verifier tries to construct a correctness proof that shows that the code meets the given specification. Specifications include things like “data at rest on disk drives is always encrypted,” or “the event-handler always eventually returns control back to the caller,” or “the API method returns a properly formatted buffer encrypted under the current session key.” If the verifier detects a discrepancy (that is, a bug), the developer responds by fixing the code. Sometimes, the verifier can’t determine what the answer is. In this case, the developer can respond by helping the tool with additional information, so-called proof hints, until the tool is able to complete the correctness proof or find another discrepancy.

For example, picture a developer who is writing a program. The program is like a letter written in a word processor, but the letter is written in a language that the computer can understand. For cloud security, say the program manages a set of data keys and takes requests to encrypt data under those keys. The developer writes down the intention that each encryption request must use a different key. This is the specification: the what.

Next, the developer writes code that instructs the computer how to respond to a request. The code separates the keys into two lists. An encryption request takes a key from the “not used” list, encrypts the given data, and then places the key on the “used” list.

To see that the code in this example meets the specification, it is crucial to understand the roles of the two lists. A program verifier might not figure this out by itself and would then indicate the part of the code it can’t verify, much like a spell-checker underlines spelling and grammar mistakes in a letter you write. To help the program verifier along, the developer provides a proof hint that says that the keys on the “not used” list have never been returned. The verifier checks that the proof hint is correct and then, using this hint, is able to construct the proof that the code meets the specification.
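The two-list example above can be sketched in Dafny, the verification-aware language discussed below. This is an illustrative fragment, not AWS code; the second precondition plays the role of the developer’s proof hint, recording that no key appears on both lists:

```dafny
// Illustrative sketch only (not AWS code): hand out a fresh key
// and record it as used.
method TakeFreshKey(notUsed: seq<int>, used: seq<int>)
  returns (k: int, remaining: seq<int>, usedOut: seq<int>)
  requires |notUsed| > 0
  // Proof hint: no key is on both lists.
  requires forall i, j :: 0 <= i < |notUsed| && 0 <= j < |used| ==> notUsed[i] != used[j]
  ensures k !in used            // specification: each request uses a fresh key
  ensures usedOut == used + [k] // ...and the key is recorded as used
{
  k := notUsed[0];
  remaining := notUsed[1..];
  usedOut := used + [k];
}
```

The verifier checks the body against the ensures clauses automatically; without the proof hint it would have no way to conclude that the key taken from the “not used” list is genuinely fresh.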

You’ve designed several verification tools in your career. Can you share how you’re using verification tools such as Dafny and Boogie to provide higher assurances for AWS infrastructure?

Dafny is a Java-like programming language that was designed with verification in mind. Whereas most programming languages only allow you to write code, Dafny allows you to write specifications and code at the same time. In addition, Dafny allows you to write proof hints (in fact, you can write entire proofs). Having specifications, code, and proofs in one language sets you up for an integrated verification experience. But this would remain an intellectual exercise without an automated program verifier. The Dafny language was designed alongside its automated program verifier. When you write a Dafny program, the verifier constantly runs in the background and points out mistakes as you go along, very much like the spell-checker underlines I alluded to. Internally, the Dafny verifier is based on the Boogie IVL.

At AWS, we’re currently using Dafny to write and prove a variety of security-critical libraries. For example: encryption libraries. Encryption is vital for keeping customer data safe, so it makes for a great place to focus energy on formal verification.

You spent time in scientific research roles before joining AWS. Has your experience at AWS caused you to see scientific challenges in a different way now?

I began my career in 1989 in the Microsoft Windows LAN Manager team. Based on my experiences helping network computers together, I became convinced that formally proving the correctness of programs was going to go from a “nice to have” to a “must have” in the future, because of the need for more security in a world where computers are so interconnected. At the time, the tools and techniques for proving programs correct were so rudimentary that the only safe harbor for this type of work was in esoteric research laboratories. Thus, that’s where I could be found. But these days, the tools are increasingly scalable and usable, so finally I made the jump back into development where I’m leading efforts to apply and operationalize this approach, and also to continue my research based on the problems that arise as we do so.

One of the challenges we had in the 1990s and 2000s was that few people knew how to use the tools, even if they did exist. Thus, while in research laboratories, an additional focus of mine has been on making tools that are so easy to use that they can be used in university education. Now, with dozens of universities using my tools and after several eye-opening successes with the Dafny language and verifier, I’m scaling these efforts up with development teams in AWS that can hire the students who are trained with Dafny.

I alluded to continuing research. There are still scientific challenges to make specifications more expressive and more concise, to design programming languages more streamlined for verification, and to make tools more automated, faster, and more predictable. But there’s an equally large challenge in influencing the software engineering process. The two are intimately linked, and cannot be teased apart. Only by changing the process can we hope for larger improvements in software engineering. Our application of formal verification at AWS is teaching us a lot about this challenge. We like to think we’re changing the software engineering world.

What are the next big challenges that we need to tackle in cloud security? How will automated reasoning play a role?

There is a lot of important software to verify. This excites me tremendously. As I see it, the only way we can scale is to distribute the verification effort beyond the verification community, and to get usable verification tools into the hands of software engineers. Tooling can help put the concerns of security engineers into everyday development. To meet this challenge, we need to provide appropriate training and we need to make tools as seamless as possible for engineers to use.

I hear your YouTube channel, Verification Corner, is loved by engineering students. What’s the next video you’ll be creating?

[Rustan laughs] Yes, Verification Corner has been a fun way for me to teach about verification and I receive appreciation from people around the world who have learned something from these videos. The episodes tend to focus on learning concepts of program verification. These concepts are important to all software engineers, and Verification Corner shows the concepts in the context of small (and sometimes beautiful) programs. Beyond learning the concepts in isolation, it’s also important to see the concepts in use in larger programs, to help engineers apply the concepts. I want to devote some future Verification Corner episodes to showing verification “in the trenches;” that is, the application of verification in larger, real-life (and sometimes not so beautiful) programs for cloud security, as we’re continuing to do at AWS.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Supriya Anand

Supriya is a Senior Digital Strategist at AWS.

AWS Security Profile: John Backes, Senior Software Development Engineer

Post Syndicated from Supriya Anand original https://aws.amazon.com/blogs/security/aws-security-profile-john-backes-senior-software-development-engineer/


AWS scientists and engineers believe in partnering closely with the academic and research community to drive innovation in a variety of areas of our business, including cloud security. One of the ways they do this is through participating in and sponsoring scientific conferences, where leaders in fields such as automated reasoning, artificial intelligence, and machine learning come together to discuss advancements in their field. The International Conference on Computer Aided Verification (CAV) is one such conference, sponsored and—this year—co-chaired by the AWS Automated Reasoning Group (ARG). CAV is dedicated to the advancement of the theory and practice of computer-aided formal analysis methods for hardware and software systems. This conference will take place next week, July 13-18, 2019 at The New School in New York City.

CAV covers the spectrum from theoretical results to concrete applications, with an emphasis on practical verification tools and the algorithms and techniques that are needed for their implementation. CAV also publishes scientific papers from the research community that it considers vital to continue spurring advances in hardware and software verification. One of the papers accepted this year, Reachability Analysis for AWS-based Networks, is authored by John Backes of AWS. I sat down with him to talk about the unique network-based analysis service, Tiros, that’s described in the paper and how it’s helping to set new standards for cloud network security.

Tell me about yourself: what made you decide to become a software engineer in the automated reasoning space?

It sounds cliché, but I have wanted to work with computers since I was a child. I was recently looking through my old school work, and I found an assignment from the second grade where I wrote about “What I wanted to be when I grow up.” I had drawn a crude picture of someone working on a computer and wrote “I want to be a computer programmer.” At university, I took a class on discrete mathematics where I learned about mathematical induction for the first time; it seemed like magic to me. I struggled a bit to develop proofs for the homework assignments and tests in the course. So the idea of writing a program to perform induction for me automatically became very compelling.

I decided to go to graduate school to do research related to proving the correctness of digital circuits. After graduating, I built automated reasoning tools for proving the correctness of software that controls airplanes and helicopters. I joined AWS because I wanted to prove properties about systems that are used by almost everyone.

I understand that your research paper on Tiros was recently published by CAV. What does the research paper cover?

Many influential papers in the space of automated reasoning have been published at CAV over the past three decades. We are publishing a paper at CAV 2019 about three different types of automated reasoning tools we used in the development of Tiros. It discusses the formal reasoning tools and techniques we used, which of them were able to scale, and which were not. The paper gives readers a blueprint for how they could build their own automated reasoning services on AWS.

What is Tiros? How is it being used in Amazon Inspector?

Tiros answers reachability questions about Amazon Virtual Private Cloud (Amazon VPC) networks. It allows customers to answer questions like “Which of my EC2 instances are reachable from the internet?” and “Is it possible for this Elastic Network Interface (ENI) to send traffic to that ENI?” Amazon Inspector uses Tiros to power its recently launched Network Reachability Rules package. Customers can use this rules package to produce findings about how traffic originating from outside their accounts can reach their Amazon EC2 instances (for example, via an internet gateway, elastic load balancer, or virtual private gateway) and via which ports. Inspector also makes suggestions about how to remediate findings that a customer would like to eliminate. For example, if a customer has an EC2 instance that has port 22 (commonly associated with SSH) open to the internet, Amazon Inspector will suggest what security group needs to be changed to eliminate this finding.

Why are networks difficult to understand? How is Tiros helping to solve that problem?

As customers add more components and open them up to access from more addresses, the number of possible paths that traffic can take through a network increases exponentially. It may be feasible to test all of the paths through a network with a dozen computers, but it would take longer than the heat death of the universe to test all possible paths of a network with hundreds of components (elastic load balancers, NAT gateways, network access control lists, EC2 instances, and so on). Tiros reasons about all possible network paths completely, using “symbolic methods,” where it does not send any packets but instead treats the network as a mathematical object. It does this by gathering information about how a VPC is configured using the describe APIs of relevant services. It takes this information and generates a set of logical constraints. It then proves properties about these sets of constraints using something called an SMT solver [Editor’s note: discussed below].

Tiros relies on the use of automated reasoning techniques and SMT solvers to provide customers with a better understanding of potential network vulnerabilities. Can you explain what these concepts are and how they’re being used in Tiros?

SMT stands for Satisfiability Modulo Theories. SMT solvers are general purpose software tools that solve a collection of mathematical constraints. The algorithms and heuristics that power these tools have been steadily improving over the past three decades. This means that if you can translate a problem into a form that can be solved by an SMT solver, then you can take advantage of highly optimized algorithms that have been continuously improved over decades. There are tutorials online about how to use SMT solvers to provide solutions to all sorts of interesting constraint problems. Another AWS service called Zelkova uses SMT solvers to answer questions about IAM policies. Tiros uses an SMT solver called MonoSAT to encode reachability constraints about VPC networks. The figure below shows how we encode constraints about what types of packets are allowed to flow from a subnet to an ENI:
[Figure: the logical constraints, from the CAV paper, governing packet flow between a subnet and an ENI]

This diagram is from the CAV paper. It illustrates the constraints that Tiros generates to reason about packets moving from subnets to ENIs. Informally, these constraints say that a packet is allowed to flow from an ENI out to its subnet’s route table if the source IP address of the packet is the same as the source IP address of the ENI. Likewise, a packet can flow from a subnet to an ENI if the destination IP address of the packet is the same as that of the ENI.

Tiros generates all sorts of constraints like this to represent the rules of routing in VPCs. If the SMT solver is able to find a solution to satisfy all of the constraints, then this corresponds to a valid path that a packet can flow through the VPC from some source to some destination. Someone using Tiros can then inspect these paths to determine the source of a potential network misconfiguration.

Is Tiros helping customers meet their compliance requirements? How?

Many customers need to meet compliance standards such as PCI, FedRAMP, and HIPAA. The requirements in these standards call for evidence of properly configured network controls. For example, Requirement 11 of the PCI DSS gives guidance to regularly perform penetration testing and network vulnerability scans. Customers can use Amazon Inspector to automatically schedule assessments on a regular cadence to generate evidence that they can use to help meet this requirement.

What do you tell your friends and family about what you do?

I tell them that AWS is responsible for the security of the cloud, and AWS customers are responsible for their security in the cloud. AWS refers to this concept as the Shared Responsibility Model. I explain that I work on a technology called Tiros that automatically produces mathematical proofs to enable AWS customers to build secure applications in the cloud.

What’s next for Tiros? For automated reasoning at AWS?

AWS is constantly adding new networking features. For example, we recently announced support for Direct Connect in Transit Gateway. Tiros is continuously updated to reason about these new services and features so customers who use the service can see new reachability results as they use new VPC features. Right now, we are really focused on how Tiros can be used to help customers with compliance. We plan to integrate Tiros results into other services to help produce evidence of compliance that customers can provide to auditors.


Provable security podcast: automated reasoning’s past, present, and future with Moshe Vardi

Post Syndicated from Supriya Anand original https://aws.amazon.com/blogs/security/provable-security-podcast-automated-reasonings-past-present-and-future-with-moshe-vardi/

AWS just released the first podcast of a new miniseries called Provable Security: Conversations on Next Gen Security. We published a podcast on provable security last fall, and, due to high customer interest, we decided to bring you a regular peek into this AWS initiative. This series will explore the unique intersection between academia and industry in the cloud security space. Specifically, the miniseries will cover how the traditionally academic field of automated reasoning is being applied at AWS at scale to help provide higher assurances for our customers, regulators, and the broader cloud industry. We’ll talk to individuals whose minds helped shape the history of automated reasoning, as well as learn from engineers and scientists who are applying automated reasoning to help solve pressing security and privacy challenges in the cloud.

This first interview is with Moshe Vardi, Karen Ostrum George Distinguished Service Professor in Computational Engineering and Director of the Ken Kennedy Institute for Information Technology. Moshe describes the history of logic, automated reasoning, and formal verification. He discusses modern day applications in software and how principles of automated reasoning underlie core aspects of computer science, such as databases. The podcast interview highlights Moshe’s standout contributions to the formal verification and automated reasoning space, as well as the number of awards he’s received for his work. You can learn more about Moshe Vardi here.

Byron Cook, Director of the AWS Automated Reasoning Group, interviews Moshe and will be featured throughout the miniseries. Byron is leading the provable security initiative at AWS, which is a collection of technologies that provide higher security assurance to customers by giving them a deeper understanding of their cloud architecture.

You can listen to or download the podcast above, or visit this link. We’ve also included links below to many of the technologies and references Moshe discusses in his interview.

We hope you enjoy the podcast and this new miniseries! If you have feedback, let us know in the Comments section below.

Automated reasoning public figures:

Automated techniques and algorithms:



Supriya Anand

Supriya is a Content Strategist at AWS working with the Automated Reasoning Group.

How AWS SideTrail verifies key AWS cryptography code

Post Syndicated from Daniel Schwartz-Narbonne original https://aws.amazon.com/blogs/security/how-aws-sidetrail-verifies-key-aws-cryptography-code/

We know you want to spend your time learning valuable new skills, building innovative software, and scaling up applications — not worrying about managing infrastructure. That’s why we’re always looking for ways to help you automate the management of AWS services, particularly when it comes to cloud security. With that in mind, we recently developed SideTrail, an open source program analysis tool that helps AWS developers verify key security properties of cryptographic implementations. In other words, the tool gives you assurances that secret information is kept secret. For example, developers at AWS have used this tool to verify that an AWS open source library, s2n, is protected against cryptographic side-channels. Let’s dive a little deeper into what this tool is and how it works.

SideTrail’s secret sauce is its automated reasoning capability, a mathematical proof-based technology that verifies the correctness of critical code and helps detect errors that could lead to data leaks. Automated reasoning mathematically checks every possible input and reports any input that could cause an error. Full detail about SideTrail can be found in a scientific publication from VSTTE, a prominent software verification conference.

Automated reasoning is also useful because it can help prevent timing side-channels, through which data may be inadvertently revealed. Automated reasoning is what enables SideTrail to detect the presence of these issues. If a code change causes a timing side-channel to be introduced, the mathematical proof will fail, and you’ll be notified about the flaw before the code is released.

Timing side-channels can allow attackers to learn sensitive information, such as encryption keys, by observing the timing of messages on a network. For example, let’s say you have some code that checks a password as follows:

for (i = 0; i < length; ++i) {
	if (password[i] != input[i]) {
		send("bad password");
		return;
	}
}
The amount of time until the reply message is received depends on which byte in the password is incorrect. As a result, an attacker can start guessing what the first character of the password is:

  • a*******
  • b*******
  • c*******

When the time to receive the error message changes, the first letter in the password is revealed. Repeating this process for each of the remaining characters turns a complex, exponential guessing challenge into a comparatively simple, linear one. SideTrail identifies these timing gaps, alerting developers to potential data leaks that could impact customers.
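To make the attack above concrete, here is a small illustrative Python sketch (not AWS code) that counts loop iterations as a stand-in for elapsed time, alongside the kind of balanced comparison that closes the leak:

```python
def leaky_compare(password, guess):
    """Early-exit comparison: returns (match, loop steps taken)."""
    steps = 0
    for p, g in zip(password, guess):
        steps += 1
        if p != g:
            return False, steps
    return len(password) == len(guess), steps

def constant_time_compare(a, b):
    """Balanced comparison: always scans every byte before answering."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= ord(x) ^ ord(y)
    return diff == 0

secret = "cat42"
# Guessing the first character: wrong guesses fail after 1 step, the
# correct one takes 2 -- an observable timing difference.
for first in "abc":
    matched, steps = leaky_compare(secret, first + "****")
    print(first, steps)  # a 1, b 1, c 2 -> 'c' is the first character
```

With `constant_time_compare`, every guess of the same length takes the same number of steps, so the timing observation tells the attacker nothing.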

Here’s how to get started with SideTrail:

  1. Download SideTrail from GitHub.
  2. Follow the instructions provided with the download to set up your environment.
  3. Annotate your code with a threat model. The threat model is used to determine:
    • Which values should be considered secret. For example, a cryptographic key would be considered a secret, whereas the IP address of a public server wouldn’t.
    • The minimum timing difference an attacker can observe.
  4. Compile your code using SideTrail and analyze the report (for full guidance on how to analyze your report, read the VSTTE paper).

How SideTrail proves code correctness

The prior example had a data leak detected by SideTrail. However, SideTrail can do more than just detect leaks caused by timing side-channels: it can prove that correct code won’t result in such leaks. One technique that developers can use to prevent data loss is to ensure that all paths in a program take the same amount of time to run, no matter what a secret value may be. SideTrail helps to prove that such code is correct. Consider the following code that implements this technique:

int example(int pub, int secret) {
	if (secret > pub) {
		pub -= secret;
	} else {
		pub += secret;
	}
	return pub;
}
Following the steps outlined in the prior section, a developer creates a threat model for this code by determining which values are public and which are secret. After annotating their code accordingly, they then compile it using SideTrail, which verifies that all paths in the code take the same amount of time. SideTrail verifies the code by leveraging the industry-standard Clang compiler to compile the code to LLVM bitcode. Working at the LLVM bitcode level allows SideTrail to work on the same optimized representation the compiler does. This is important because compiler optimizations can potentially affect the timing behavior of code, and hence the effectiveness of countermeasures used in that code. To maximize the effectiveness of SideTrail, developers can compile code for SideTrail using the same compiler configuration that’s used for released code.

The result of this process is a modified program that tracks the time required to execute each instruction. Here is how the result looks, inclusive of the annotations SideTrail includes to indicate the runtime of each operation. For more detail on the LLVM annotations, please read the VSTTE paper referenced earlier in this post.

int example(int pub, int secret, int *cost) {
	*cost += GT_OP_COST;
	if (secret > pub) {
		pub -= secret;
	} else {
		*cost += ADDITION_OP_COST;
		pub += secret;
	}
	return pub;
}

If you’re wondering how this works in practice, you can inspect the SideTrail model by examining the annotated LLVM bitcode. Guidance on inspecting the SideTrail model can be found in the VSTTE paper.

To check whether the above code has a timing side-channel, SideTrail makes two copies of the code, symbolically executes both copies, and then checks whether the runtime differs. SideTrail does this by generating a wrapper and then using an automated reasoning engine to mathematically verify the correctness of this test for all possible inputs. If the automated reasoning engine does this successfully, you can have confidence that the code does not contain a timing side-channel.


By using SideTrail, developers will be able to prove that there are no timing side channels present in their code, providing increased assurance against data loss. For example, Amazon s2n uses code-balancing countermeasures to mitigate this kind of attack. Through the use of SideTrail, we’ve proven the correctness of these countermeasures by running regression tests on every check-in to s2n, allowing AWS developers to build on top of this library with even more confidence, while delivering product and technology updates to customers with the highest security standards available.

More information about SideTrail and our other research can be found on our Provable Security hub. SideTrail is also available for public use as an open-source project.

If you have feedback about this blog post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Daniel Schwartz-Narbonne

Daniel is a Software Development Engineer in the AWS Automated Reasoning Group. Prior to joining Amazon, he earned a PhD at Princeton, where he developed a software framework to debug parallel programs. Then, at New York University, he designed a tool that automatically isolates and explains the cause of crashes in C programs. When he’s not working, you might find Daniel in the kitchen making dinner for his family, in a tent camping, or volunteering as an EMT with a local ambulance squad.

Podcast: AI tech named automated reasoning provides next-gen cloud security

Post Syndicated from Supriya Anand original https://aws.amazon.com/blogs/security/podcast-automated-reasoning-aws-next-gen-security-ai/

AWS just released a new podcast on how next generation security technology, backed by automated reasoning, is providing you higher levels of assurance for key components of your AWS architecture. Byron Cook, Director of the AWS Automated Reasoning Group, discusses how automated reasoning is embedded within AWS services and code and the tools customers can take advantage of today to achieve provable security.

Here’s a direct link to listen to it.

As the AWS cloud continues to grow, offering more services and features for you to architect your environment, AWS is working to ensure that the security of the cloud meets the pace of growth and your needs. To address the evolving threat landscape, AWS has made it easier to operate workloads securely in the cloud with a host of services and features that strengthen your security posture. Using automated reasoning, a branch of artificial intelligence, AWS is implementing next-generation security technology to help secure its platform. Automated reasoning at AWS helps customers verify and ensure continuous security of key components in the cloud, providing provable security—the highest assurance of cloud security.

Automated reasoning is powered by mathematical logic and consists of designing and implementing mechanized mathematical proofs that key components in the cloud are operating in alignment with customers’ intended security measures. With automated reasoning, AWS enables customers to detect entire classes of misconfigurations that could potentially expose vulnerable data. This relieves the customers’ burden of having to manually verify increasingly granular configurations for a complex organization, providing new levels of assurance that security verification scales with enterprise growth.

We hope you enjoy the podcast! If you have feedback about this blog post, submit comments in the Comments section below.


Daniel Schwartz-Narbonne shares how automated reasoning is helping achieve the provable security of AWS boot code

Post Syndicated from Supriya Anand original https://aws.amazon.com/blogs/security/automated-reasoning-provable-security-of-boot-code/

I recently sat down with Daniel Schwartz-Narbonne, a software development engineer in the Automated Reasoning Group (ARG) at AWS, to learn more about the groundbreaking work his team is doing in cloud security. The team uses automated reasoning, a technology based on mathematical logic, to prove that key components of the cloud are operating as intended. ARG recently hit a milestone by leveraging this technology to prove the memory safety of boot code components. Boot code is the foundation of the cloud. Proving the memory safety of boot code is akin to verifying that the foundation of your house is secure—it lets you build upon it without worry. Daniel shared details with the AWS Security Blog team about the project’s accomplishments and how it will help solve cloud security challenges.

Daniel Schwartz-Narbonne discusses how automated reasoning, a branch of AI tech, can help prove the security of boot code

Tell me about yourself: what made you decide to become a software engineer with the Automated Reasoning Group?

I wanted to become an engineer because I like building things and understanding how they work. I get satisfaction out of producing something tangible. I went into cloud security because I believe it’s a major area of opportunity in the computing industry right now. As the cloud continues to scale in response to customer demand, we’re going to need more and more automation around security to meet this demand.

I was offered the opportunity to work with ARG after I finished up my post-doc at NYU. Byron Cook, the director of ARG, was starting up a team with the mission of using formal reasoning methods to solve real-world problems in the cloud. Joining ARG was an opportunity for me to help pioneer the use of automated reasoning for cloud security.

How would you describe automated reasoning?

Automated reasoning uses mathematical analysis to understand what’s happening in a complex computer system. The technique takes a system and a question you might have about the system—like “is the system memory safe?”—and reformulates the question as a set of mathematical properties. Then it uses automated reasoning tools called “constraint solvers” to analyze these properties and provide an answer. We’re using this technology to provide higher levels of cloud security assurance for AWS customers via features that protect key components of the cloud, including IAM permissions, networking controls, verification for security protocols and source code of foundational software packages in use at AWS. Links to this work can be found at the bottom of this post.

What is the Boot Code Verification Project?

The Boot Code Verification Project is one of several ARG projects that apply automated reasoning techniques to the foundational elements of cloud security. In this case, we’re looking at boot code. Boot code is the first code that starts when you turn on a computer. It’s the foundation for all computer code, which makes its security critical. This is joint work with my ARG colleagues Michael Tautschnig and Mark Tuttle and with infrastructure engineers.

Why is boot code so difficult to secure?

Ensuring boot code security by using traditional techniques, such as penetration testing and unit testing, is hard. You can only achieve visibility into code execution via debug ports, which means you have almost no ability to single-step the boot code for debugging. You often can’t instrument the boot code, either, because this can break the build process: the increased size of the instrumented code may be larger than the size of the ROM targeted by the build process. Extracting the data collected by instrumentation is also difficult because the boot code has no access to a file system to record the data, and memory available for storing the data may be limited.

Our aim is to gain increased confidence in the correctness of the boot code by using automated reasoning, instead. Applying automated reasoning to boot code has challenges, however. A big one is that boot code directly interfaces with hardware. Hardware can, for example, modify the value of memory locations through the use of memory-mapped input/output (IO). We developed techniques for modeling the effect that hardware can have on executing boot code. One technique we successfully tried is using model checking to symbolically represent all the effects that hardware could have on the memory state of the boot code. This required close collaboration with our product teams to understand AWS data center hardware and then design and validate a model based on these specifications. To ensure future code revisions maintain the properties we have validated, our analysis is embedded into the continuous integration flow. In such a workflow, each change by the developers triggers automated verification.

We published the full technical details, including the process by which we were able to prove the memory safety of boot code, in Model Checking Boot Code from AWS Data Centers, a peer-reviewed scientific publication at the Computer-Aided Verification Conference, the leading academic conference on automated reasoning.

You mention model checking. Can you explain what that is?

A software model checker is a tool that examines every path through a computer program from every possible input. There are different kinds of model checkers, but our model checker is based on a constraint solver (called a SAT solver, or a Satisfiability solver) that can test whether a given set of constraints is satisfiable. To understand how it works, first remember that each line of a computer program describes a particular change in the state of the computer (for example, turning on the device). Our model checker describes each change as an equation that shows how the computer’s state has changed. If you describe each line of code in a program this way, the result is a set of equations that describes a set of constraints upon all the ways that the program can change the state of the computer. We hand these constraints and a question (“Is there a bug?”) to a constraint solver, which then determines if the computer can ever reach a state in which the question (“Is there a bug?”) is true.

What is memory safety? Why is it so crucial to prove the memory safety of boot code?

A proof of memory safety gives you assurance that certain security issues cannot arise. Memory safety states that every operation in a program can only write to the variables it has access to, within the bounds of those variables. A classic example is a buffer that stores data coming in from a message on the network. If the message is larger than the buffer in which it’s stored, then you’ll overwrite the buffer, as well as whatever comes after the buffer. If trusted data stored after the buffer is overwritten, then the value of this trusted data is under the control of the adversary inducing the buffer overflow—and your system’s security is now at risk.

Boot code is written in C, a language that does not have the dynamic run-time support for memory safety found in other programming languages. The Boot Code Verification Project uses automated reasoning technology to prove memory safety of the boot code for every possible input.

What has the Boot Code Verification Project accomplished?

We’ve achieved two major accomplishments. The first is the concrete proof we’ve delivered. We have demonstrated that for every boot configuration, device configuration, possible boot source, and second stage binary, AWS boot code is memory safe.

The second accomplishment is more forward-looking. We haven’t just validated a piece of code—we’ve validated a methodology for testing security critical C code at AWS. As we describe in our paper, completing this proof required us to make significant advances in program analysis tooling, ranging from the way we handle memory-mapped IO, to a more efficient symbolic implementation of memcpy, to new tooling that can analyze the unusual linker configurations used in boot code. We made the tooling easier to use, with AWS Batch scripts that allow automatic proof re-runs, and HTML-based reports that make it easy to dive in and understand code. We expect to build on these improvements as we continue to apply automated reasoning to the AWS cloud.

Is your work open source?

We use the model checker CBMC (C Bounded Model Checker), which is available on GitHub under the open source Berkeley Software Distribution license. AWS is committed to the open source community, and we have a number of other projects that you can also find on GitHub.

Overall, how does the Boot Code Verification Project benefit customers?

Customers ask how AWS secures their data. This project is a part of the answer, providing assurance for how AWS protects low-level code in customer data centers running on AWS. Because all systems and processes run on top of this code, customers need to know that measures are in place to keep it continuously safe.

We also believe that technology powered by automated reasoning has wider applicability. Our team has created tools like Zelkova, which we’ve embedded in a variety of AWS services to help customers validate their security-critical code. Because the Boot Code Verification Project is based on an existing open source project, wider applications of our methodology have also been documented in a variety of scientific publications that you can find on the AWS Provable Security page under “Insight papers.” We encourage customers to check out our resources and comment below!

Want more AWS Security news? Follow us on Twitter.


Supriya Anand

Supriya is a Content Strategist at AWS working with the Automated Reasoning Group.

How AWS uses automated reasoning to help you achieve security at scale

Post Syndicated from Andrew Gacek original https://aws.amazon.com/blogs/security/protect-sensitive-data-in-the-cloud-with-automated-reasoning-zelkova/

At AWS, we focus on achieving security at scale to diminish risks to your business. Fundamental to this approach is ensuring your policies are configured in a way that helps protect your data, and the Automated Reasoning Group (ARG), an advanced innovation team at AWS, is using automated reasoning to do it.

What is automated reasoning, you ask? It’s a method of formal verification that automatically generates and checks mathematical proofs which help to prove the correctness of systems; that is, fancy math that proves things are working as expected. If you want a deeper understanding of automated reasoning, check out this re:Invent session. While the applications of this methodology are vast, in this post I’ll explore one specific aspect: analyzing policies using an internal Amazon service named Zelkova.

What is Zelkova? How will it help me?

Zelkova uses automated reasoning to analyze policies and their future consequences. This includes AWS Identity and Access Management (IAM) policies, Amazon Simple Storage Service (S3) policies, and other resource policies. These policies dictate who can (or can’t) do what to which resources. Because Zelkova uses automated reasoning, you no longer need to think about what questions you need to ask about your policies. Using fancy math, as mentioned above, Zelkova will automatically derive the questions and answers you need to be asking about your policies, improving confidence in your security configurations.

How does it work?

Zelkova translates policies into precise mathematical language and then uses automated reasoning tools to check properties of the policies. These tools include automated reasoners called Satisfiability Modulo Theories (SMT) solvers, which use a mix of numbers, strings, regular expressions, dates, and IP addresses to prove and disprove logical formulas. Zelkova has a deep understanding of the semantics of the IAM policy language and builds upon a solid mathematical foundation. While tools like the IAM Policy Simulator let you test individual requests, Zelkova is able to use mathematics to talk about all possible requests. Other techniques guess and check, but Zelkova knows.

You may have noticed, as an example, the new “Public / Not public” checks in S3. These are powered by Zelkova:

Figure 1: the “Public/Not public” checks in S3

S3 uses Zelkova to check each bucket policy and warns you if an unauthorized user is able to read or write to your bucket. When a bucket is flagged as “Public”, there are some public requests that are allowed to access the bucket. However, when a bucket is flagged as “Not public”, all public requests are denied. Zelkova is able to make such statements because it has a precise mathematical representation of IAM policies. In fact, it creates a formula for each policy and proves a theorem about that formula.

Consider the following S3 bucket policy statement where my goal is to disallow a certain principal from accessing the bucket:

    "Effect": "Allow",
    "NotPrincipal": { "AWS": "111122223333" },
    "Action": "*",
    "Resource": "arn:aws:s3:::test-bucket"

Unfortunately, this policy statement does not capture my intentions. Instead, it allows access for everybody in the world who is not the given principal. This means almost everybody now has access to my bucket, including anonymous unauthorized users. Fortunately, as soon as I attach this policy, S3 flags my bucket as “Public”—warning me that there’s something wrong with the policy I wrote. How did it know?

Zelkova translates this policy into a mathematical formula:

(Resource = “arn:aws:s3:::test-bucket”) ∧ (Principal ≠ 111122223333)

Here, ∧ is the mathematical symbol for “and” which is true only when both its left and right side are true. Resource and Principal are variables just like you would use x and y in algebra class. The above formula is true exactly when my policy allows a request. The precise meaning of my policy has now been defined in the universal language of mathematics. The next step is to decide if this policy formula allows public access, but this is a hard problem. Now Zelkova really goes to work.
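To make the formula concrete, here is a minimal sketch (not Zelkova's implementation, which works symbolically) that encodes the Allow statement's meaning as an ordinary Python predicate. The principal names other than the account ID are hypothetical placeholders:

```python
def allows(resource: str, principal: str) -> bool:
    """True exactly when the NotPrincipal Allow statement matches a request.

    Mirrors the formula:
    (Resource = "arn:aws:s3:::test-bucket") AND (Principal != 111122223333)
    """
    return (resource == "arn:aws:s3:::test-bucket"
            and principal != "111122223333")

# The statement was meant to keep one principal out, but it grants access
# to everyone else, including anonymous, unauthenticated requests:
print(allows("arn:aws:s3:::test-bucket", "anonymous"))      # True
print(allows("arn:aws:s3:::test-bucket", "111122223333"))   # False
```

Evaluating the predicate on sample requests makes the bug visible: the one principal I tried to exclude is the only one denied.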

A counterintuitive trick sometimes used by mathematicians is to make a problem harder in order to make finding a solution easier. In this case, Zelkova solves the harder problem of comparing two policies against each other to decide which is more permissive. If P1 and P2 are policy formulas, suppose the formula P1 ⇒ P2 is true. The arrow symbol is an implication: whenever P1 is true, P2 must also be true. So whenever policy 1 accepts a request, policy 2 must also accept it, which means policy 2 is at least as permissive as policy 1. Suppose also that the converse formula P2 ⇒ P1 is not true. Then there is a request that makes P2 true and P1 false; that request is allowed by policy 2 and denied by policy 1. Combining these results, policy 1 is strictly less permissive than policy 2.

How does this solve the “Public / Not public” problem? Zelkova has a special policy that allows anonymous, unauthorized users to access an S3 resource. It compares your policy against this policy. If your policy is more permissive, then Zelkova says your policy allows public access. If you restrict access—for example, based on source VPC endpoint (aws:SourceVpce) or source IP address (aws:SourceIp)—then your policy is not more permissive than the special policy, and Zelkova says your policy does not allow public access.
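The public check can be sketched in miniature. Zelkova proves the implication symbolically over all possible requests; this illustration (with a hypothetical, finite request space) merely checks it over enumerated samples, which is the "guess and check" approach the real tool avoids:

```python
from itertools import product

# Hypothetical sample request space, for illustration only.
PRINCIPALS = ["anonymous", "111122223333", "444455556666"]
RESOURCES = ["arn:aws:s3:::test-bucket", "arn:aws:s3:::other-bucket"]

def my_policy(principal, resource):
    # The flawed NotPrincipal Allow statement from the example above.
    return (resource == "arn:aws:s3:::test-bucket"
            and principal != "111122223333")

def public_policy(principal, resource):
    # A stand-in for Zelkova's "special" policy: an anonymous,
    # unauthenticated user may access the resource.
    return principal == "anonymous" and resource == "arn:aws:s3:::test-bucket"

def at_least_as_permissive(p1, p2):
    """Check p1 => p2 over the sampled requests:
    every request p1 allows must also be allowed by p2."""
    return all(p2(pr, r)
               for pr, r in product(PRINCIPALS, RESOURCES)
               if p1(pr, r))

# My policy allows every request the public policy allows, so it is at
# least as permissive as the public policy: the bucket is "Public".
print(at_least_as_permissive(public_policy, my_policy))  # True
```

If my policy instead restricted access (for example by source IP), the implication would fail and the bucket would be reported as “Not public”.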

For all this to work, Zelkova uses SMT solvers. Using mathematical language, these tools take a formula and either prove it is true for all possible values of the variables, or they return a counterexample that makes the formula false.

To understand SMT solvers better, you can play with them yourself. For example, if asked to prove x + y > x, an SMT solver will quickly find a counterexample such as x=5, y=-1 (since 4 > 5 is false). To fix this, you could strengthen your formula to assume that y is positive:

(y > 0) ⇒ (x + y > x)

The SMT solver will now respond that your formula is true for all values of the variables x and y. It does this using the rules of algebra and logic. This same idea carries over into theories like strings. You can ask the SMT solver to prove the formula length(append(a,b)) > length(a) where a and b are string variables. It will find a counterexample such as a=”hello” and b=”” where b is the empty string. This time, you could fix your formula by changing from greater-than to greater-than-or-equal-to:

length(append(a, b)) ≥ length(a)

The SMT solver will now respond that the formula is true for all values of the variables a and b. Here, the solver has combined reasoning about strings (length, append) with reasoning about numbers (greater-than-or-equal-to). SMT solvers are designed for exactly this sort of theory composition.
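A toy counterexample search conveys the idea. Real SMT solvers such as Z3 prove formulas for all values symbolically; this sketch only enumerates a small range of integers, so "no counterexample found" here is merely suggestive, not a proof:

```python
def find_counterexample(formula, lo=-10, hi=10):
    """Return an (x, y) pair making the formula false,
    or None if none is found in the searched range."""
    for x in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            if not formula(x, y):
                return (x, y)
    return None

# x + y > x fails whenever y <= 0:
print(find_counterexample(lambda x, y: x + y > x))  # a pair with y <= 0

# The strengthened formula (y > 0) => (x + y > x), encoded with the
# equivalence (A => B) == (not A) or B, has no counterexample:
print(find_counterexample(lambda x, y: (not y > 0) or (x + y > x)))  # None
```

An SMT solver replaces the enumeration with algebraic and logical rules, which is what lets it conclude "true for all values" rather than "true for all values I tried".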

What about my original policy? Once I see that my bucket is public, I can fix my policy using an explicit “Deny”:

    "Effect": "Deny",
    "Principal": { "AWS": "111122223333" },
    "Action": "*",
    "Resource": "arn:aws:s3:::test-bucket"

With this policy statement attached, S3 correctly reports my bucket as “Not public”. Zelkova has translated this policy into a mathematical formula, compared it against a special policy, and proved that my policy is less permissive. Fancy math has proved that things are working (or in this case, not working) as expected.
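Why does the Deny statement make the bucket “Not public”? A heavily simplified sketch of IAM-style evaluation shows it (the statement dicts are hypothetical simplifications; Action matching and many other details are omitted). The key rules are that an explicit Deny always wins, and that a request is denied by default unless some statement allows it:

```python
def evaluate(statements, request_principal, resource):
    """Simplified IAM-style evaluation: default deny, explicit Deny wins."""
    allowed = False
    for s in statements:
        if s["Resource"] != resource:
            continue
        if "NotPrincipal" in s:
            matches = request_principal != s["NotPrincipal"]
        else:
            matches = request_principal == s["Principal"]
        if matches and s["Effect"] == "Deny":
            return False  # explicit deny overrides any allow
        if matches and s["Effect"] == "Allow":
            allowed = True
    return allowed  # nothing allowed the request: implicit default deny

deny_only = [{"Effect": "Deny",
              "Principal": "111122223333",
              "Resource": "arn:aws:s3:::test-bucket"}]

# With only the Deny statement attached, no statement allows anonymous
# requests, so they fall through to the implicit default deny:
print(evaluate(deny_only, "anonymous", "arn:aws:s3:::test-bucket"))  # False
```

Because no request from an anonymous user is allowed, the policy is less permissive than the special public-access policy, and Zelkova's comparison reports the bucket as “Not public”.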

Where else is Zelkova being used?

In addition to S3, several AWS services are using Zelkova:

We have also engaged with a number of enterprise and regulated customers who have adopted Zelkova for their use cases:

“Bridgewater, like many other security-conscious AWS customers, needs to quickly reason about the security posture of our AWS infrastructure, and an integral part of that posture is IAM policies. These govern permissions on everything from individual users, to S3 buckets, KMS keys, and even VPC endpoints, among many others. Bridgewater uses Zelkova to verify and provide assurances that our policies do not allow data exfiltration, misconfigurations, and many other malicious and accidental undesirable behaviors. Zelkova allows our security experts to encode their understanding once and then mechanically apply it to any relevant policies, avoiding error-prone and slow human reviews, while at the same time providing us high confidence in the correctness and security of our IAM policies.”
Dan Peebles, Lead Cloud Security Architect at Bridgewater Associates


AWS services such as S3 use Zelkova to precisely represent policies and prove that they are secure—improving confidence in your security configurations. Zelkova can make broad statements about all resource requests because it’s based on mathematics and proofs instead of heuristics, pattern matching, or simulation. The ubiquity of policies in AWS means that the value of Zelkova and its benefits will continue to grow as it serves to protect more customers every day.
