Tag Archives: Security, Identity & Compliance

2023 Canadian Centre for Cyber Security Assessment Summary report available with 20 additional services

Post Syndicated from Naranjan Goklani original https://aws.amazon.com/blogs/security/2023-canadian-centre-for-cyber-security-assessment-summary-report-available-with-20-additional-services/

At Amazon Web Services (AWS), we are committed to providing continued assurance to our customers through assessments, certifications, and attestations that support the adoption of current and new AWS services and features. We are pleased to announce the availability of the 2023 Canadian Centre for Cyber Security (CCCS) assessment summary report for AWS. This assessment covers a total of 150 AWS services and features in the Canada (Central) Region, including 20 newly added services and features. The assessment report is available for review and download on demand through AWS Artifact.

The full list of services in scope for the CCCS assessment is available on the Services in Scope page. The 20 new services and features are the following:

The CCCS is Canada’s authoritative source of cyber security expert guidance for the Canadian government, industry, and the general public. Public and commercial sector organizations across Canada rely on CCCS’s rigorous Cloud Service Provider (CSP) IT Security (ITS) assessment in their decision to use CSP services. In addition, CCCS’s ITS assessment process is a mandatory requirement for AWS to provide cloud services to Canadian federal government departments and agencies.  

The CCCS cloud service provider information technology security assessment process determines whether the Government of Canada (GC) ITS requirements for the CCCS Medium cloud security profile (previously referred to as GC's PROTECTED B/Medium Integrity/Medium Availability [PBMM] profile) are met, as described in ITSG-33 (IT security risk management: A lifecycle approach, Annex 3 – Security control catalogue). As of November 2023, 150 AWS services in the Canada (Central) Region have been assessed by CCCS and meet the requirements for the Medium cloud security profile. Meeting the Medium cloud security profile is required to host workloads that are classified up to and including Medium categorization. CCCS periodically assesses new or previously unassessed services and reassesses previously assessed AWS services to verify that they continue to meet the GC's requirements. CCCS prioritizes the assessment of new AWS services based on their availability in Canada and customer demand. The full list of AWS services that have been assessed by CCCS is available on our Services in Scope for CCCS Assessment page.

To learn more about the CCCS assessment or our other compliance and security programs, visit AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Naranjan Goklani

Naranjan is an Audit Lead for Canada. He has experience leading audits, attestations, certifications, and assessments across the Americas. Naranjan has more than 13 years of experience in risk management, security assurance, and performing technology audits. He previously worked in one of the Big 4 accounting firms and supported clients from the financial services, technology, retail, and utilities industries.

Implement an early feedback loop with AWS developer tools to shift security left

Post Syndicated from Barry Conway original https://aws.amazon.com/blogs/security/implement-an-early-feedback-loop-with-aws-developer-tools-to-shift-security-left/

Early-feedback loops exist to provide developers with ongoing feedback through automated checks. This enables developers to take early remedial action while increasing the efficiency of the code review process and, in turn, their productivity.

Early-feedback loops help provide confidence to reviewers that fundamental security and compliance requirements were validated before review. As part of this process, common expectations of code standards and quality can be established, while shifting governance mechanisms to the left.

In this post, we will show you how to use AWS developer tools to implement a shift-left approach to security that empowers your developers with early feedback loops within their development practices. You will use AWS CodeCommit to securely host Git repositories, AWS CodePipeline to automate continuous delivery pipelines, AWS CodeBuild to build and test code, and Amazon CodeGuru Reviewer to detect potential code defects.

Why the shift-left approach is important

Developers today are an integral part of organizations, building and maintaining the most critical customer-facing applications. Developers must have the knowledge, tools, and processes in place to help them identify potential security issues before they release a product to production.

This is why the shift-left approach is important. Shift left is the process of checking for vulnerabilities and issues in the earlier stages of software development. By following the shift-left process (which should be part of a wider application security review and threat modelling process), software teams can help prevent undetected security issues when they build an application. The modern DevSecOps workflow continues to shift left towards the developer and their practices with the aim to achieve the following:

  • Drive accountability among developers for the security of their code
  • Empower development teams to remediate issues up front and at their own pace
  • Improve risk management by enabling early visibility of potential security issues through early feedback loops

You can use AWS developer tools to help provide this continual early feedback for developers upon each commit of code.

Solution prerequisites

To follow along with this solution, make sure that you have the following prerequisites in place:

Make sure that you have a general working knowledge of the listed services and DevOps practices.

Solution overview

The following diagram illustrates the architecture of the solution.

Figure 1: Solution overview

We will show you how to set up a continuous integration and continuous delivery (CI/CD) pipeline by using AWS developer tools—CodeCommit, CodePipeline, CodeBuild, and CodeGuru—that you will integrate with the code repository to detect code security vulnerabilities. As shown in Figure 1, the solution has the following steps:

  1. The developer commits the new branch into the code repository.
  2. The developer creates a pull request to the main branch.
  3. Pull requests initiate two jobs: an Amazon CodeGuru Reviewer code scan and a CodeBuild job.
    1. CodeGuru Reviewer uses program analysis and machine learning to help detect potential defects in your Java and Python code, and provides recommendations to improve the code. CodeGuru Reviewer helps detect security vulnerabilities, secrets, resource leaks, concurrency issues, incorrect input validation, and deviation from best practices for using AWS APIs and SDKs.
    2. You can configure the CodeBuild deployment with third-party tools, such as Bandit for Python, to help detect security issues in your Python code.
  4. CodeGuru Reviewer or CodeBuild writes back the findings of the code scans to the pull request to provide a single common place for developers to review the findings that are relevant to their specific code updates.

The following table presents some other tools that you can integrate into the early-feedback toolchain, depending on the type of code or artefacts that you are evaluating:

Early feedback – security tools | Usage | License
cfn-guard, cfn-nag, cfn-lint | Infrastructure linting and validation | cfn-guard license, cfn-nag license, cfn-lint license
CodeGuru, Bandit | Python | Bandit license
CodeGuru | Java |
npm-audit, Dependabot | npm libraries | Dependabot license

When you deploy the solution in your AWS account, you can review how Bandit for Python has been built into the deployment pipeline by using AWS CodeBuild with a configured buildspec file, as shown in Figure 2. You can implement the other tools in the table by using a similar approach.

Figure 2: Bandit configured in CodeBuild
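
The following commands are a minimal sketch of how Bandit is commonly invoked in a CodeBuild build phase; the buildspec that the solution's CloudFormation template actually deploys may differ in its details.

# Install Bandit and scan the repository recursively.
# -lll limits the report to high-severity findings; Bandit exits with a
# non-zero code when findings are reported, which fails the build stage.
pip install bandit
bandit -r . -lll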

Walkthrough

To deploy the solution, you will complete the following steps:

  1. Deploy the solution by using a CloudFormation template
  2. Associate CodeGuru with a code repository
  3. Create a pull request to the code repository
  4. Review the code scan results in the pull request and address the findings

Deploy the solution

The first step is to deploy the required resources into your AWS environment by using CloudFormation.

To deploy the solution

  1. Choose the following Launch Stack button to deploy the solution’s CloudFormation template:

    Select this image to open a link that starts building the CloudFormation stack

    The solution deploys in the AWS US East (N. Virginia) Region (us-east-1) by default because each service listed in the Prerequisites section is available in this Region. To deploy the solution in a different Region, use the Region selector in the console navigation bar and make sure that the services required for this walkthrough are supported in your newly selected Region. For service availability by Region, see AWS Services by Region.

  2. On the Quick Create Stack screen, do the following:
    1. Leave the provided parameter defaults in place.
    2. Scroll to the bottom, and in the Capabilities section, select I acknowledge that AWS CloudFormation might create IAM resources with custom names.
    3. Choose Create Stack.
  3. When the CloudFormation stack creation has completed, open the AWS Cloud9 console.
  4. In the Environments table, for the provisioned shift-left-blog-cloud9-ide environment, choose Open, as shown in Figure 3.
    Figure 3: Cloud9 environments

  5. The provisioned Cloud9 environment opens in a new tab. Wait for Cloud9 to initialize the two sample code repositories: shift-left-sample-app-java and shift-left-sample-app-python, as shown in Figure 4. For this post, you will work only with the Python sample repository shift-left-sample-app-python, but the procedures we outline will also work for the Java repository.
    Figure 4: Cloud9 IDE

Associate CodeGuru Reviewer with a code repository

The next step is to associate the Python code repository with CodeGuru Reviewer. After you associate the repository, CodeGuru Reviewer analyzes and comments on issues that it finds when you create a pull request.

To associate CodeGuru Reviewer with a repository

  1. Open the CodeGuru console, and in the left navigation pane, under Reviewer, choose Repositories.
  2. In the Repositories section, choose Associate repository and run analysis.
  3. In the Associate repository section, do the following:
    1. For Select source provider, select AWS CodeCommit.
    2. For Repository location, select shift-left-sample-app-python.
  4. In the Run a repository analysis section, do the following, as shown in Figure 5:
    1. For Source branch, select main.
    2. For Code review name – optional, enter a name.
    3. For Tags – optional, leave the default settings.
    4. Choose Associate repository and run analysis.
      Figure 5: CodeGuru repository configuration

  5. CodeGuru initiates the Full repository analysis and the status is Pending, as shown in Figure 6. The full analysis takes about 5 minutes to complete. Wait for the status to change from Pending to Completed.
    Figure 6: CodeGuru full analysis pending

Create a pull request

The next step is to create a new branch and to push sample code to the repository by creating a pull request so that the code scan can be initiated by CodeGuru Reviewer and the CodeBuild job.

To create a new branch

  1. In the Cloud9 IDE, locate the terminal and create a new branch by running the following commands.
    cd ~/environment/shift-left-sample-app-python
    git checkout -b python-test

  2. Confirm that you are working from the new branch, which will be highlighted in the Cloud9 IDE terminal, as shown in Figure 7.
    git branch -v

    Figure 7: Cloud9 IDE terminal

To create a new file and push it to the code repository

  1. Create a new file called sample.py.
    touch sample.py

  2. Copy the following sample code, paste it into the sample.py file, and save the changes, as shown in Figure 8.
    import requests
    
    data = requests.get("https://www.example.org/", verify = False)
    print(data.status_code)

    Figure 8: Cloud9 IDE noncompliant code

  3. Commit the changes to the repository.
    git status
    git add -A
    git commit -m "shift left blog python sample app update"

    Note: If you receive a message to set your name and email address, you can ignore it because Git will automatically set these for you, and the Git commit will complete successfully.

  4. Push the changes to the code repository, as shown in Figure 9.
    git push origin python-test

    Figure 9: Git push

To create a new pull request

  1. Open the CodeCommit console and select the code repository called shift-left-sample-app-python.
  2. From the Branches dropdown, select the new branch that you created and pushed, as shown in Figure 10.
    Figure 10: CodeCommit branch selection

  3. In your new branch, select the file sample.py, confirm that the file has the changes that you made, and then choose Create pull request, as shown in Figure 11.
    Figure 11: CodeCommit pull request

    A notification appears stating that the new code updates can be merged.

  4. In the Source dropdown, choose the new branch python-test. In the Destination dropdown, choose the main branch where you intend to merge your code changes when the pull request is closed.
  5. To have CodeCommit run a comparison between the main branch and your new branch python-test, choose Compare. To see the differences between the two branches, choose the Changes tab at the bottom of the page. CodeCommit also assesses whether the two branches can be merged automatically when the pull request is closed.
  6. When you’re satisfied with the comparison results for the pull request, enter a Title and an optional Description, and then choose Create pull request. Your pull request appears in the list of pull requests for the CodeCommit repository, as shown in Figure 12.
    Figure 12: Pull request

The creation of this pull request has automatically started two separate code scans. The first is a CodeGuru Reviewer incremental code review, and the second is a CodeBuild job that uses Bandit to perform a security scan of the Python code.

Review code scan results and resolve detected security vulnerabilities

The next step is to review the code scan results to identify security vulnerabilities and the recommendations on how to fix them.

To review the code scan results

  1. Open the CodeGuru console, and in the left navigation pane, under Reviewer, select Code reviews.
  2. On the Incremental code reviews tab, make sure that you see a new code review item created for the preceding pull request.
    Figure 13: CodeGuru Code review

  3. After a few minutes, when CodeGuru completes the incremental analysis, choose the code review to review the CodeGuru recommendations on the pull request. Figure 14 shows the CodeGuru recommendations for our example.
    Figure 14: CodeGuru recommendations

  4. Open the CodeBuild console and select the CodeBuild job called shift-left-blog-pr-Python. In our example, this job should be in a Failed state.
  5. Open the CodeBuild run, and on the Build history tab, select the CodeBuild job that is in the Failed state. On the Build logs tab, scroll down until you see the following errors in the logs. Note that the severity of the finding is High, which is why the CodeBuild job failed. You can review the Bandit scanning options in the Bandit documentation.
    Test results:
    >> Issue: [B501:request_with_no_cert_validation] Call to requests with verify=False disabling SSL certificate checks, security issue.
       Severity: High   Confidence: High
       CWE: CWE-295 (https://cwe.mitre.org/data/definitions/295.html)
       More Info: https://bandit.readthedocs.io/en/1.7.5/plugins/b501_request_with_no_cert_validation.html
       Location: sample.py:3:7
    
    2   
    3   data = requests.get("https://www.example.org/", verify = False)
    4   print(data.status_code)

  6. Navigate to the CodeCommit console, and on the Activity tab of the pull request, review the CodeGuru recommendations. You can also review the results of the CodeBuild jobs that Bandit performed, as shown in Figure 15.
    Figure 15: CodeGuru recommendations and CodeBuild logs

This demonstrates how developers can see the results of security code scans directly alongside their code changes and associated pull requests, shifting the required security awareness left toward developers.

To resolve the detected security vulnerabilities

  1. In the Cloud9 IDE, navigate to the file sample.py in the Python sample repository, as shown in Figure 16.
    Figure 16: Cloud9 IDE sample.py

  2. Copy the following code and paste it in the sample.py file, overwriting the existing code. Save the update.
    import requests
    
    data = requests.get("https://www.example.org", timeout=5)
    print(data.status_code)

  3. Commit the changes by running the following commands.
    git status
    git add -A
    git commit -m "shift left python sample.py resolve security errors"
    git push origin python-test

  4. Open the CodeCommit console and choose the Activity tab on the pull request that you created earlier. You will see a banner indicating that the pull request was updated. You will also see new comments indicating that new code scans using CodeGuru and CodeBuild were initiated for the new pull request update.
  5. In the CodeGuru console, on the Incremental code reviews page, check that a new code scan has begun. When the scans are finished, review the results in the CodeGuru console and the CodeBuild build logs, as described previously. The previously detected security vulnerability should now be resolved.
  6. In the CodeCommit console, on the Activity tab, under Activity history, review the comments to verify that each of the code scans has a status of Passing, as shown in Figure 17.
    Figure 17: CodeCommit activity history

  7. Now that the security issue has been resolved, merge the pull request into the main branch of the code repository. Choose Merge, and under Merge strategy, select Fast Forward merge.

AWS account clean-up

Clean up the resources created by this solution to avoid incurring future charges.

To clean up your account

  1. Start by deleting the CloudFormation stacks for the Java and Python sample applications that you deployed. In the CloudFormation console, in the Stacks section, select one of these stacks and choose Delete; then select the other stack and choose Delete.
    Figure 18: Delete repository stack

  2. To initiate deletion of the Cloud9 CloudFormation stack, select it and choose Delete.
  3. Open the Amazon S3 console, and in the search box, enter shift-left to search for the S3 bucket that CodePipeline used.
    Figure 19: Select CodePipeline S3 bucket

  4. Select the S3 bucket, select all of the object folders in the bucket, and choose Delete. (Alternatively, you can empty the bucket from the AWS CLI, as shown in the sketch after this list.)
    Figure 20: Select CodePipeline S3 objects

  5. To confirm deletion of the objects, in the section Permanently delete objects?, enter permanently delete, and then choose Delete objects. A banner message that states Successfully deleted objects appears at the top confirming the object deletion.
  6. Navigate back to the CloudFormation console, select the stack named shift-left-blog, and choose Delete.
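
If you prefer the AWS CLI for steps 3 through 5, you can empty the pipeline bucket from the command line instead. This is a minimal sketch; the bucket name shown is a placeholder, so replace it with the name of the shift-left bucket in your account.

# Find the bucket that CodePipeline used
aws s3 ls | grep shift-left
# Permanently delete the objects in the bucket so that the stack can be deleted
aws s3 rm s3://<YOUR-SHIFT-LEFT-BUCKET-NAME> --recursive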

Conclusion

In this blog post, we showed you how to implement a solution that provides early feedback on code development through status comments on the CodeCommit pull request Activity tab, using Amazon CodeGuru Reviewer and CodeBuild to run automated code security scans when a pull request is created.

We configured CodeBuild with Bandit for Python to demonstrate how you can integrate third-party or open-source tools into the development cycle. You can use this approach to integrate other tools into the workflow.

Shifting security left early in the development cycle can help you identify potential security issues earlier and empower teams to remediate issues earlier, helping to prevent the need to refactor code towards the end of a build.

This solution provides a simple method that you can use to view and understand potential security issues with your newly developed code and thus enhances your awareness of the security requirements within your organization.

It’s simple to get started. Sign up for an AWS account, deploy the provided CloudFormation template through the Launch Stack button, commit your code, and start scanning for vulnerabilities.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Barry Conway

Barry is an Enterprise Solutions Architect with years of experience in the technology industry, bridging the gap between business and technology. Barry has helped banking, manufacturing, logistics, and retail organizations realize their business goals.

Deenadayaalan Thirugnanasambandam

Deenadayaalan is a Senior Practice manager at AWS. He provides prescriptive architectural guidance and consulting to help accelerate customers’ adoption of AWS.

Balamurugan Kumaran

Balamurugan is a Senior Cloud Architect at AWS. Over the years, Bala has architected and implemented highly available, scalable, and secure applications using AWS services for various enterprise customers.

Nitin Kumar

Nitin is a Senior Cloud Architect at AWS. He plays a pivotal role in driving organizational success by harnessing the power of technology. With a focus on enabling innovation through architectural guidance and consulting, he empowers customers to excel on the AWS Cloud. Outside of work, Nitin dedicates his time to crafting IoT devices for reef tanks.

Use scalable controls for AWS services accessing your resources

Post Syndicated from James Greenwood original https://aws.amazon.com/blogs/security/use-scalable-controls-for-aws-services-accessing-your-resources/

Sometimes you want to configure an AWS service to access your resource in another service. For example, you can configure AWS CloudTrail, a service that monitors account activity across your AWS infrastructure, to write log data to your bucket in Amazon Simple Storage Service (Amazon S3). When you do this, you want assurance that the service will only access your resource on your behalf—you don’t want an untrusted entity to be able to use the service to access your resource. Before today, you could achieve this by using the two AWS Identity and Access Management (IAM) condition keys, aws:SourceAccount and aws:SourceArn. You can use these condition keys to help make sure that a service accesses your resource only on behalf of specific accounts or resources that you trust. However, because these condition keys require you to specify individual accounts and resources, they can be difficult to manage at scale, especially in larger organizations.

Recently, IAM launched two new condition keys that can help you achieve this in a more scalable way that is simpler to manage within your organization:

  • aws:SourceOrgID — use this condition key to make sure that an AWS service can access your resources only when the request originates from a particular organization ID in AWS Organizations.
  • aws:SourceOrgPaths — use this condition key to make sure that an AWS service can access your resources only when the request originates from one or more organizational units (OUs) in your organization.

In this blog post, we describe how you can use the four available condition keys, including the two new ones, to help you control how AWS services access your resources.

Background

Imagine a scenario where you configure an AWS service to access your resource in another service. Let’s say you’re using Amazon CloudWatch to observe resources in your AWS environment, and you create an alarm that activates when certain conditions occur. When the alarm activates, you want it to publish messages to a topic that you create in Amazon Simple Notification Service (Amazon SNS) to generate notifications.

Figure 1 depicts this process.

Figure 1: Amazon CloudWatch publishing messages to an SNS topic

In this scenario, there’s a resource-based policy controlling access to your SNS topic. For CloudWatch to publish messages to it, you must configure the policy to allow access by CloudWatch. When you do this, you identify CloudWatch using an AWS service principal, in this case cloudwatch.amazonaws.com.

Cross-service access

This is an example of a common pattern known as cross-service access. With cross-service access, a calling service accesses your resource in a called service, and a resource-based policy attached to your resource grants access to the calling service. The calling service is identified using an AWS service principal in the form <SERVICE-NAME>.amazonaws.com, and it accesses your resource on behalf of an originating resource, such as a CloudWatch alarm.

Figure 2 shows cross-service access.

Figure 2: Cross-service access

When you configure cross-service access, you want to make sure that the calling service will access your resource only on your behalf. That means you want the originating resource to be controlled by someone whom you trust. If an untrusted entity creates their own CloudWatch alarm in their AWS environment, for example, then their alarm should not be able to publish messages to your SNS topic.

If an untrusted entity could use a calling service to access your resource on their behalf, it would be an example of what’s known as the confused deputy problem. The confused deputy problem is a security issue in which an entity that doesn’t have permission to perform an action coerces a more privileged entity (in this case, a calling service) to perform the action instead.

Use condition keys to help prevent cross-service confused deputy issues

AWS provides global condition keys to help you prevent cross-service confused deputy issues. You can use these condition keys to control how AWS services access your resources.

Before today, you could use the aws:SourceAccount or aws:SourceArn condition keys to make sure that a calling service accesses your resource only when the request originates from a specific account (with aws:SourceAccount) or a specific originating resource (with aws:SourceArn). However, there are situations where you might want to allow multiple resources or accounts to use a calling service to access your resource. For example, you might want to create many VPC flow logs in an organization that publish to a central S3 bucket. To achieve this using the aws:SourceAccount or aws:SourceArn condition keys, you must enumerate all the originating accounts or resources individually in your resource-based policies. This can be difficult to manage, especially in large organizations, and can potentially cause your resource-based policy documents to reach size limits.

Now, you can use the new aws:SourceOrgID or aws:SourceOrgPaths condition keys to make sure that a calling service accesses your resource only when the request originates from a specific organization (with aws:SourceOrgID) or a specific organizational unit (with aws:SourceOrgPaths). This helps avoid the need to update policies when accounts are added or removed, reduces the size of policy documents, and makes it simpler to create and review policy statements.

The following table summarizes the four condition keys that you can use to help prevent cross-service confused deputy issues. These keys work in a similar way, but with different levels of granularity.

Use case | Condition key | Value | Allowed operators | Single/multi valued | Example value
Allow a calling service to access your resource only on behalf of an organization that you trust. | aws:SourceOrgID | AWS organization ID of the resource making a cross-service access request | String operators | Single-valued key | o-a1b2c3d4e5
Allow a calling service to access your resource only on behalf of an organizational unit (OU) that you trust. | aws:SourceOrgPaths | Organization entity paths of the resource making a cross-service access request | Set operators and string operators | Multivalued key | o-a1b2c3d4e5/r-ab12/ou-ab12-11111111/ou-ab12-22222222/
Allow a calling service to access your resource only on behalf of an account that you trust. | aws:SourceAccount | AWS account ID of the resource making a cross-service access request | String operators | Single-valued key | 111122223333
Allow a calling service to access your resource only on behalf of a resource that you trust. | aws:SourceArn | Amazon Resource Name (ARN) of the resource making a cross-service access request | ARN operators (recommended) or string operators | Single-valued key | arn:aws:cloudwatch:eu-west-1:111122223333:alarm:myalarm

When to use the condition keys

AWS recommends that you use these condition keys in any resource-based policy statements that allow access by an AWS service, except where the relevant condition key is not yet supported by the service. To find out whether a condition key is supported by a particular service, see AWS global condition context keys in the AWS Identity and Access Management User Guide.

Note: Only use these condition keys in resource-based policies that allow access by an AWS service. Don’t use them in other use cases, including identity-based policies and service control policies (SCPs), where these condition keys won’t be populated.

Use condition keys for defense in depth

AWS services use a variety of mechanisms to help prevent cross-service confused deputy issues, and the details vary by service. For example, where a calling service accesses an S3 bucket, some services use S3 prefixes to help prevent confused deputy issues. For more information, see the relevant service documentation.

Where supported by the service, AWS recommends that you use the condition keys we describe in this post regardless of whether the service has another mechanism in place to help prevent cross-service confused deputy issues. This helps to make your intentions explicit, provide defense in depth, and guard against misconfigurations.

Example use cases

Let’s walk through some example use cases to learn how to use these condition keys in practice.

First, imagine you’re using Amazon Virtual Private Cloud (Amazon VPC) to manage logically isolated virtual networks. In Amazon VPC, you can configure flow logs, which capture information about your network traffic. Let’s say you want a flow log to write data into an S3 bucket for later analysis. This process is depicted in Figure 3.

Figure 3: Amazon VPC writing flow logs to an S3 bucket

This constitutes another cross-service access scenario. In this case, Amazon VPC is the calling service, Amazon S3 is the called service, the VPC flow log is the originating resource, and the S3 bucket is your resource in the called service.

To allow access, the resource-based policy for your S3 bucket (known as a bucket policy) must allow Amazon VPC to put objects there. The Principal element in this policy specifies the AWS service principal of the service that will access the resource, which for VPC flow logs is delivery.logs.amazonaws.com.

Initial policy without confused deputy prevention

The following is an initial version of the bucket policy that allows Amazon VPC to put objects in the bucket but doesn’t yet provide confused deputy prevention. We’re showing this policy for illustration purposes; don’t use it in its current form.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PARTIAL-EXAMPLE-DO-NOT-USE",
            "Effect": "Allow",
            "Principal": {
                "Service": "delivery.logs.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<DOC-EXAMPLE-BUCKET>/*"
        }
    ]
}

Note: For simplicity, we only show one of the policy statements that you need to allow VPC flow logs to write to a bucket. In a real-life bucket policy for flow logs, you need two policy statements: one allowing actions on the bucket, and one allowing actions on the bucket contents. These are described in Publish flow logs to Amazon S3. Both policy statements work in the same way with respect to confused deputy prevention.

This policy statement allows Amazon VPC to put objects in the bucket. However, it allows Amazon VPC to do that on behalf of any flow log in any account. There’s nothing in the policy to tell Amazon VPC that it should access this bucket only if the flow log belongs to a specific organization, OU, account, or resource that you trust.

Let’s now update the policy to help prevent cross-service confused deputy issues. For the rest of this post, the remaining policy samples provide confused deputy protection, but at different levels of granularity.

Specify a trusted organization

Continuing with the previous example, imagine that you now have an organization in AWS Organizations, and you want to create VPC flow logs in various accounts within your organization that publish to a central S3 bucket. You want Amazon VPC to put objects in the bucket only if the request originates from a flow log that resides in your organization.

You can achieve this by using the new aws:SourceOrgID condition key. In a cross-service access scenario, this condition key evaluates to the ID of the organization that the request came from. You can use this condition key in the Condition element of a resource-based policy to allow actions only if aws:SourceOrgID matches the ID of a specific organization, as shown in the following example. In your own policy, make sure to replace <DOC-EXAMPLE-BUCKET> and <MY-ORGANIZATION-ID> with your own information.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VPCLogsDeliveryWrite",
            "Effect": "Allow",
            "Principal": {
                "Service": "delivery.logs.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<DOC-EXAMPLE-BUCKET>/*",
            "Condition": {
                "StringEquals": {
                    "aws:SourceOrgID": "<MY-ORGANIZATION-ID>"
                }
            }
        }
    ]
}

The revised policy states that Amazon VPC can put objects in the bucket only if the request originates from a flow log in your organization. Now, if someone creates a flow log outside your organization and configures it to access your bucket, they will get an access denied error.

You can use aws:SourceOrgID in this way to allow a calling service to access your resource only if the request originates from a specific organization, as shown in Figure 4.

Figure 4: Specify a trusted organization using aws:SourceOrgID

Specify a trusted OU

What if you don’t want to trust your entire organization, but only part of it? Let’s consider a different scenario. Imagine that you want to send messages from Amazon SNS into a queue in Amazon Simple Queue Service (Amazon SQS) so they can be processed by consumers. This is depicted in Figure 5.

Figure 5: Amazon SNS sending messages to an SQS queue

Now imagine that you want your SQS queue to receive messages only if they originate from an SNS topic that resides in a specific organizational unit (OU) in your organization. For example, you might want to allow messages only if they originate from a production OU that is subject to change control.

You can achieve this by using the new aws:SourceOrgPaths condition key. As before, you use this condition key in a resource-based policy attached to your resource. In a cross-service access scenario, this condition key evaluates to the AWS Organizations entity path that the request came from. An entity path is a text representation of an entity within an organization.

You build an entity path for an OU by using the IDs of the organization, root, and all OUs in the path down to and including the OU. For example, consider the organizational structure shown in Figure 6.

Figure 6: Example organization structure

In this example, you can specify the Prod OU by using the following entity path:

o-a1b2c3d4e5/r-ab12/ou-ab12-11111111/ou-ab12-22222222/

For more information about how to construct an entity path, see Understand the AWS Organizations entity path.
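
If you want to look up the IDs that make up an entity path, one option is to query AWS Organizations from the AWS CLI. The following is a minimal sketch; the account ID is a placeholder, and you need permission to call these AWS Organizations APIs (typically from the management account or a delegated administrator).

# Get the organization ID, which is the first element of the entity path
aws organizations describe-organization --query Organization.Id --output text

# List the immediate parent (root or OU) of an account; repeat the call with
# each parent ID as the child to walk up to the root and assemble the path
aws organizations list-parents --child-id 111122223333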

Let’s now match the aws:SourceOrgPaths condition key against a specific entity path in the Condition element of a resource-based policy for an SQS queue. In your own policy, make sure to replace <MY-QUEUE-ARN> and <MY-ENTITY-PATH> with your own information.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Allow-SNS-SendMessage",
            "Effect": "Allow",
            "Principal": {
                "Service": "sns.amazonaws.com"
            },
            "Action": "sqs:SendMessage",
            "Resource": "<MY-QUEUE-ARN>",
            "Condition": {
                "Null": {
                    "aws:SourceOrgPaths": "false"
                },
                "ForAllValues:StringEquals": {
                    "aws:SourceOrgPaths": "<MY-ENTITY-PATH>"
                }
            }
        }
    ]
}

Note: aws:SourceOrgPaths is a multivalued condition key, which means it’s capable of having multiple values in the request context. At the time of writing, it contains a single entity path if the request originates from an account in an organization, and a null value if the request originates from an account that’s not in an organization. Because this key is multivalued, you need to use both a set operator and a string operator to compare values.

In this policy, there are two conditions in the Condition block. The first uses the Null condition operator and compares with a false value to confirm that the condition key’s value is not null. The second uses set operator ForAllValues, which returns true if every condition key value in the request matches at least one value in your policy condition, and string operator StringEquals, which requires an exact match with a value specified in your policy condition.

Note: The reason for the null check is that set operator ForAllValues returns true when a condition key resolves to null. With an Allow effect and the null check in place, access is denied if the request originates from an account that’s not in an organization.

With this policy applied to your SQS queue, Amazon SNS can send messages to your queue only if the message came from an SNS topic in a specific OU.

You can use aws:SourceOrgPaths in this way to allow a calling service to access your resource only if the request originates from a specific organizational unit, as shown in Figure 7.

Figure 7: Specify a trusted OU using aws:SourceOrgPaths

Specify a trusted OU and its children

In the previous example, we specified a trusted OU, but that didn’t include its child OUs. What if you want to include its children as well?

You can achieve this by replacing the string operator StringEquals with StringLike. This allows you to use wildcards in the entity path. Using the organization structure from the previous example, the following Condition evaluates to true only if the condition key value is not null and the request originates from the Prod OU or any of its child OUs.

"Condition": {
    "Null": {
        "aws:SourceOrgPaths": "false"
    },
    "ForAllValues:StringLike": {
        "aws:SourceOrgPaths": "o-a1b2c3d4e5/r-ab12/ou-ab12-11111111/ou-ab12-22222222/*"
    }
}

Specify a trusted account

If you want to be more granular, you can allow a service to access your resource only if the request originates from a specific account. You can achieve this by using the aws:SourceAccount condition key. In a cross-service access scenario, this condition key evaluates to the ID of the account that the request came from. 

The following Condition evaluates to true only if the request originates from the account that you specify in the policy. In your own policy, make sure to replace <MY-ACCOUNT-ID> with your own information.

"Condition": {
    "StringEquals": {
        "aws:SourceAccount": "<MY-ACCOUNT-ID>"
    }
}

You can use this condition element within a resource-based policy to allow a calling service to access your resource only if the request originates from a specific account, as shown in Figure 8.

Figure 8: Specify a trusted account using aws:SourceAccount

Specify a trusted resource

If you want to be even more granular, you can allow a service to access your resource only if the request originates from a specific resource. For example, you can allow Amazon SNS to send messages to your SQS queue only if the request originates from a specific topic within Amazon SNS.

You can achieve this by using the aws:SourceArn condition key. In a cross-service access scenario, this condition key evaluates to the Amazon Resource Name (ARN) of the originating resource. This provides the most granular form of cross-service confused deputy prevention.

The following Condition evaluates to true only if the request originates from the resource that you specify in the policy. In your own policy, make sure to replace <MY-RESOURCE-ARN> with your own information.

"Condition": {
    "ArnEquals": {
        "aws:SourceArn": "<MY-RESOURCE-ARN>"
    }
}

Note: AWS recommends that you use an ARN operator rather than a string operator when comparing ARNs. This example uses ArnEquals to match the condition key value against the ARN specified in the policy.

You can use this condition element within a resource-based policy to allow a calling service to access your resource only if the request comes from a specific originating resource, as shown in Figure 9.

Figure 9: Specify a trusted resource using aws:SourceArn

Specify multiple trusted resources, accounts, OUs, or organizations

The four condition keys allow you to specify multiple trusted entities by matching against an array of values. This allows you to specify multiple trusted resources, accounts, OUs, or organizations in your policies.
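
For example, the following condition element is a minimal sketch that allows the request only if it originates from one of two trusted accounts (the account IDs are placeholders). The same array syntax applies to the other condition keys, using the operators listed in the preceding table.

"Condition": {
    "StringEquals": {
        "aws:SourceAccount": [
            "111122223333",
            "444455556666"
        ]
    }
}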

Conclusion

In this post, you learned about cross-service access, in which an AWS service communicates with another AWS service to access your resource. You saw that it’s important to make sure that such services access your resources only on your behalf in order to help avoid cross-service confused deputy issues.

We showed you how to help prevent cross-service confused deputy issues by using the two new condition keys, aws:SourceOrgID and aws:SourceOrgPaths, as well as the other available condition keys, aws:SourceAccount and aws:SourceArn. You learned that you should use these condition keys in any resource-based policy statements that allow access by an AWS service, if the condition key is supported by the service. This helps make sure that a calling service can access your resource only when the request originates from a specific organization, OU, account, or resource that you trust.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS IAM re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

James Greenwood

James is a Principal Security Solutions Architect who helps AWS Financial Services customers meet their security and compliance objectives in the AWS Cloud. James has a background in identity and access management, authentication, credential management, and data protection with more than 20 years of experience in the financial services industry.

Sophia Yang

Sophia is a Senior Product Manager on the AWS Identity and Access Management (IAM) service. She is passionate about enabling customers to build and innovate in AWS in a secure manner.

Automate and enhance your code security with AI-powered services

Post Syndicated from Dylan Souvage original https://aws.amazon.com/blogs/security/automate-and-enhance-your-code-security-with-ai-powered-services/

Organizations are increasingly embracing a shift-left approach when it comes to security, actively integrating security considerations into their software development lifecycle (SDLC). This shift aligns seamlessly with modern software development practices such as DevSecOps and continuous integration and continuous deployment (CI/CD), making it a vital strategy in today’s rapidly evolving software development landscape. At its core, shift left promotes a security-as-code culture, where security becomes an integral part of the entire application lifecycle, starting from the initial design phase and extending all the way through to deployment. This proactive approach to security involves seamlessly integrating security measures into the CI/CD pipeline, enabling automated security testing and checks at every stage of development. Consequently, it accelerates the process of identifying and remediating security issues.

By identifying security vulnerabilities early in the development process, you can promptly address them, leading to significant reductions in the time and effort required for mitigation. Amazon Web Services (AWS) encourages this shift-left mindset, providing services that enable a seamless integration of security into your DevOps processes, fostering a more robust, secure, and efficient system. In this blog post we share how you can use Amazon CodeWhisperer, Amazon CodeGuru, and Amazon Inspector to automate and enhance code security.

CodeWhisperer is a versatile, artificial intelligence (AI)-powered code generation service that delivers real-time code recommendations. This innovative service plays a pivotal role in the shift-left strategy by automating the integration of crucial security best practices during the early stages of code development. CodeWhisperer is equipped to generate code in Python, Java, and JavaScript, effectively mitigating vulnerabilities outlined in the OWASP (Open Web Application Security Project) Top 10. It uses cryptographic libraries aligned with industry best practices, promoting robust security measures. Additionally, as you develop your code, CodeWhisperer scans for potential security vulnerabilities, offering actionable suggestions for remediation. This is achieved through generative AI, which creates code alternatives to replace identified vulnerable sections, enhancing the overall security posture of your applications.

Next, you can perform further vulnerability scanning of code repositories and supported integrated development environments (IDEs) with Amazon CodeGuru Security. CodeGuru Security is a static application security tool that uses machine learning to detect security policy violations and vulnerabilities. It provides recommendations for addressing security risks and generates metrics so you can track the security health of your applications. Examples of security vulnerabilities it can detect include resource leaks, hardcoded credentials, and cross-site scripting.

Finally, you can use Amazon Inspector to address vulnerabilities in workloads that are deployed. Amazon Inspector is a vulnerability management service that continually scans AWS workloads for software vulnerabilities and unintended network exposure. Amazon Inspector calculates a highly contextualized risk score for each finding by correlating common vulnerabilities and exposures (CVE) information with factors such as network access and exploitability. This score is used to prioritize the most critical vulnerabilities to improve remediation response efficiency. When started, it automatically discovers Amazon Elastic Compute Cloud (Amazon EC2) instances, container images residing in Amazon Elastic Container Registry (Amazon ECR), and AWS Lambda functions, at scale, and immediately starts assessing them for known vulnerabilities.

Figure 1: An architecture workflow of a developer's code workflow

Amazon CodeWhisperer 

CodeWhisperer is powered by a large language model (LLM) trained on billions of lines of code, including code owned by Amazon and open-source code. This makes it a highly effective AI coding companion that can generate real-time code suggestions in your IDE to help you quickly build secure software with prompts in natural language. CodeWhisperer can be used in four development environments: AWS Toolkit for JetBrains, AWS Toolkit for Visual Studio Code, AWS Lambda, and AWS Cloud9.

After you've installed the AWS Toolkit, there are two ways to authenticate to CodeWhisperer. The first is to authenticate as an individual developer by using AWS Builder ID, and the second is to authenticate to CodeWhisperer Professional by using AWS IAM Identity Center. Authenticating through IAM Identity Center means that your AWS administrator has set up CodeWhisperer Professional for your organization and provided you with a start URL. AWS administrators must have configured IAM Identity Center and granted users access to CodeWhisperer.

As you use CodeWhisperer it filters out code suggestions that include toxic phrases (profanity, hate speech, and so on) and suggestions that contain commonly known code structures that indicate bias. These filters help CodeWhisperer generate more inclusive and ethical code suggestions by proactively avoiding known problematic content. The goal is to make AI assistance more beneficial and safer for all developers.

CodeWhisperer can also scan your code to highlight and define security issues in real time. For example, using Python and JetBrains, if you write code that would write unencrypted AWS credentials to a log — a bad security practice — CodeWhisperer will raise an alert. Security scans operate at the project level, analyzing files within a user’s local project or workspace and then truncating them to create a payload for transmission to the server side.
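
As an illustration, the following hypothetical Python snippet (our example, not CodeWhisperer output) shows the kind of pattern such a scan is designed to flag: a secret written to a log in plaintext.

import logging

logging.basicConfig(level=logging.INFO)

def create_session(access_key_id, secret_access_key):
    # Writing a credential to a log in plaintext is the kind of issue a
    # security scan flags; log a non-sensitive identifier instead, or nothing at all
    logging.info("Creating session with secret key %s", secret_access_key)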

For an example of CodeWhisperer in action, see Security Scans. Figure 2 is a screenshot of a CodeWhisperer security scan.

Figure 2: CodeWhisperer performing a security scan in Visual Studio Code

Furthermore, the CodeWhisperer reference tracker detects whether a code suggestion might be similar to particular CodeWhisperer open source training data. The reference tracker can flag such suggestions with a repository URL and project license information or optionally filter them out. Using CodeWhisperer, you improve productivity while embracing the shift-left approach by implementing automated security best practices at one of the principal layers—code development.

CodeGuru Security

Amazon CodeGuru Security significantly bolsters code security by harnessing the power of machine learning to proactively pinpoint security policy violations and vulnerabilities. This intelligent tool conducts a thorough scan of your codebase and offers actionable recommendations to address identified issues. This approach helps make sure that potential security concerns are addressed early in the development lifecycle, contributing to an overall more robust application security posture.

CodeGuru Security relies on a set of security and code quality detectors crafted to identify security risks and policy violations. These detectors empower developers to spot and resolve potential issues efficiently.

CodeGuru Security supports manual scanning of existing code and automated integration with popular code repositories such as GitHub and GitLab. It establishes an automated security check pipeline through either AWS CodePipeline or Bitbucket Pipelines. Moreover, CodeGuru Security integrates with Amazon Inspector Lambda code scanning, enabling automated code scans for your Lambda functions.

Notably, CodeGuru Security doesn’t just uncover security vulnerabilities; it also offers insights to optimize code efficiency. It identifies areas where code improvements can be made, enhancing both security and performance aspects within your applications.

Initiating CodeGuru Security is a straightforward process, accessible through the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS SDKs, and multiple integrations. This allows you to run code scans, review recommendations, and implement necessary updates, fostering a continuous improvement cycle that bolsters the security stance of your applications.

Use Amazon CodeGuru to scan code directly and in a pipeline

Use the following steps to create a scan in CodeGuru to scan code directly and to integrate CodeGuru with AWS CodePipeline.

Note: You must provide sample code to scan.

Scan code directly

  1. Open the AWS Management Console using your organization management account and go to Amazon CodeGuru.
  2. In the navigation pane, select Security and then select Scans.
  3. Choose Create new scan to start your manual code scan.
    Figure 3: Scans overview

  4. On the Create Scan page:
    1. Choose Choose file to upload your code.

      Note: The file must be in .zip format and cannot exceed 5 GB.

    2. Enter a unique name to identify your scan.
    3. Choose Create scan.
      Figure 4: Create scan

  5. After you create the scan, the configured scan will automatically appear in the Scans table, where you see the Scan name, Status, Open findings, Date of last scan, and Revision number (you review these findings later in the Findings section of this post).
    Figure 5: Scan update

Automated scan using AWS CodePipeline integration

  1. Still in the CodeGuru console, in the navigation pane under Security, select Integrations. On the Integrations page, select Integration with AWS CodePipeline. This will allow you to have an automated security scan inside your CI/CD pipeline.
    Figure 6: CodeGuru integrations

  2. Next, choose Open template in CloudFormation to create a CodeBuild project to allow discovery of your repositories and run security scans.
    Figure 7: CodeGuru and CodePipeline integration

  3. The CloudFormation template is already entered. Select the acknowledge box, and then choose Create stack.
    Figure 8: CloudFormation quick create stack

  4. If you already have a pipeline integration, go to Step 2 and select CodePipeline console. If this is your first time using CodePipeline, this blog post explains how to integrate it with AWS CI/CD services.
    Figure 9: Integrate with AWS CodePipeline

  5. Choose Edit.
    Figure 10: CodePipeline with CodeGuru integration

  6. Choose Add stage.
    Figure 11: Add Stage in CodePipeline

  7. On the Edit action page:
    1. Enter a stage name.
    2. For the stage you just created, choose Add action group.
    3. For Action provider, select CodeBuild.
    4. For Input artifacts, select SourceArtifact.
    5. For Project name, select CodeGuruSecurity.
    6. Choose Done, and then choose Save.
    Figure 12: Add action group

Test CodeGuru Security

You have now created a security check stage for your CI/CD pipeline. To test the pipeline, choose Release change.

Figure 13: CodePipeline with successful security scan

If your code was successfully scanned, you will see Succeeded in the Most recent execution column for your pipeline.

Figure 14: CodePipeline dashboard with successful security scan

Findings

To analyze the findings of your scan, select Findings under Security. You will see the findings for your scans, whether run manually or through integrations. Each finding shows the vulnerability, the scan it belongs to, the severity level, the status (open or closed), the age, and the time of detection.

Figure 15: Findings inside CodeGuru security

Dashboard

To view a summary of the insights and findings from your scan, select Dashboard under Security. You will see a high-level summary of your findings and a vulnerability fix overview.

Figure 16: Findings inside CodeGuru dashboard

Amazon Inspector

Your journey with the shift-left model extends beyond code deployment. After scanning your code repositories and using tools like CodeWhisperer and CodeGuru Security to proactively reduce security risks before code commits to a repository, your code might still encounter potential vulnerabilities after being deployed to production. For instance, faulty software updates can introduce risks to your application. Continuous vigilance and monitoring after deployment are crucial.

This is where Amazon Inspector offers ongoing assessment throughout your resource lifecycle, automatically rescanning resources in response to changes. Amazon Inspector seamlessly complements the shift-left model by identifying vulnerabilities as your workload operates in a production environment.

Amazon Inspector continuously scans various components, including Amazon EC2, Lambda functions, and container workloads, seeking out software vulnerabilities and inadvertent network exposure. Its user-friendly features include enablement in a few clicks, continuous and automated scanning, and robust support for multi-account environments through AWS Organizations. After activation, it autonomously identifies workloads and presents real-time coverage details, consolidating findings across accounts and resources.

Distinguishing itself from traditional security scanning software, Amazon Inspector has minimal impact on your fleet’s performance. When vulnerabilities or open network paths are uncovered, it generates detailed findings, including comprehensive information about the vulnerability, the affected resource, and recommended remediation. When you address a finding appropriately, Amazon Inspector autonomously detects the remediation and closes the finding.

The findings you receive are prioritized according to a contextualized Inspector risk score, facilitating prompt analysis and allowing for automated remediation.

Additionally, Amazon Inspector provides robust management APIs for comprehensive programmatic access to the Amazon Inspector service and resources. You can also access detailed findings through Amazon EventBridge and seamlessly integrate them into AWS Security Hub for a comprehensive security overview.
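As an illustration of that programmatic access, here is a minimal boto3 sketch that lists active critical findings. The filter keys follow the Inspector2 ListFindings API; adjust them to match your own triage criteria.

```python
# Minimal sketch: list active critical Amazon Inspector findings with boto3.
import boto3

inspector = boto3.client("inspector2")

paginator = inspector.get_paginator("list_findings")
pages = paginator.paginate(
    filterCriteria={
        "findingStatus": [{"comparison": "EQUALS", "value": "ACTIVE"}],
        "severity": [{"comparison": "EQUALS", "value": "CRITICAL"}],
    }
)

for page in pages:
    for finding in page["findings"]:
        # Each finding carries severity, type, title, affected resources, and more.
        print(finding["severity"], finding["type"], finding["title"])
```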

Scan workloads with Amazon Inspector

Use the following steps to learn how to use Amazon Inspector to scan AWS workloads.

  1. Open the Amazon Inspector console in your AWS Organizations management account. In the navigation pane, select Activate Inspector.
  2. Under Delegated administrator, enter the account number for your desired account to grant it all the permissions required to manage Amazon Inspector for your organization. Consider using your Security Tooling account as delegated administrator for Amazon Inspector. Choose Delegate. Then, in the confirmation window, choose Delegate again. When you select a delegated administrator, Amazon Inspector is activated for that account. Now, choose Activate Inspector to activate the service in your management account.
    Figure 17: Set the delegated administrator account ID for Amazon Inspector

  3. You will see a green success message near the top of your browser window and the Amazon Inspector dashboard, showing a summary of data from the accounts.
    Figure 18: Amazon Inspector dashboard after activation

Explore Amazon Inspector

  1. From the Amazon Inspector console in your delegated administrator account, in the navigation pane, select Account management. Because you’re signed in as the delegated administrator, you can enable and disable Amazon Inspector in the other accounts that are part of your organization. You can also automatically enable Amazon Inspector for new member accounts.
    Figure 19: Amazon Inspector account management dashboard

  2. In the navigation pane, select Findings. Using the contextualized Amazon Inspector risk score, these findings are sorted into several severity ratings.
    1. The contextualized Amazon Inspector risk score is calculated by correlating CVE information with findings such as network access and exploitability.
    2. This score is used to derive severity of a finding and prioritize the most critical findings to improve remediation response efficiency.
    Figure 20: Findings in Amazon Inspector sorted by severity (default)

    When you enable Amazon Inspector, it automatically discovers all of your Amazon EC2 and Amazon ECR resources. It scans these workloads to detect vulnerabilities that pose risks to the security of your compute workloads. After the initial scan, Amazon Inspector continues to monitor your environment. It automatically scans new resources and re-scans existing resources when changes are detected. As vulnerabilities are remediated or resources are removed from service, Amazon Inspector automatically updates the associated security findings.

    To successfully scan EC2 instances, Amazon Inspector requires inventory collected by AWS Systems Manager through the Systems Manager Agent (SSM Agent), which is installed by default on many EC2 instances. If you find that some instances aren't being scanned by Amazon Inspector, this might be because they aren't managed by Systems Manager; a quick way to check is shown in the sketch after these steps.

  3. Select a finding's title to see the associated report.
    1. Each finding provides a description, severity rating, information about the affected resource, and additional details such as resource tags and how to remediate the reported vulnerability.
    2. Amazon Inspector stores active findings until they are closed by remediation. Findings that are closed are displayed for 30 days.
    Figure 21: Amazon Inspector findings report details
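As noted in step 2, Amazon Inspector relies on Systems Manager inventory for EC2 scanning. The following minimal boto3 sketch lists the instances Systems Manager currently manages, which can help you identify instances that Amazon Inspector won't be able to scan.

```python
# Minimal sketch: list EC2 instances currently managed by Systems Manager so
# you can compare against your fleet and spot unscanned instances.
import boto3

ssm = boto3.client("ssm")

managed_instance_ids = []
paginator = ssm.get_paginator("describe_instance_information")
for page in paginator.paginate():
    for info in page["InstanceInformationList"]:
        managed_instance_ids.append(info["InstanceId"])

print(f"{len(managed_instance_ids)} instances are managed by Systems Manager")
```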

Integrate CodeGuru Security with Amazon Inspector to scan Lambda functions

Amazon Inspector and CodeGuru Security work well together. CodeGuru Security is available through Amazon Inspector Lambda code scanning: after activating Lambda code scanning, you can configure automated code scans to be performed on your Lambda functions.

Use the following steps to configure Amazon CodeGuru Security with Amazon Inspector Lambda code scanning to evaluate Lambda functions.

  1. Open the Amazon Inspector console and select Account management from the navigation pane.
  2. Select the AWS account you want to activate Lambda code scanning in.
    Figure 22: Activating AWS Lambda code scanning from the Amazon Inspector Account management console

  3. Choose Activate and select AWS Lambda code scanning.

With Lambda code scanning activated, security findings for your Lambda function code will appear in the All findings section of Amazon Inspector.
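If you prefer to activate this programmatically from the delegated administrator account, the following is a minimal boto3 sketch using the Inspector2 Enable API. The member account ID is a placeholder, and Lambda standard scanning is included because code scanning builds on it.

```python
# Minimal sketch: activate Lambda standard scanning and Lambda code scanning
# for a member account. 111122223333 is a placeholder account ID; run this
# from the Amazon Inspector delegated administrator account.
import boto3

inspector = boto3.client("inspector2")

response = inspector.enable(
    accountIds=["111122223333"],
    resourceTypes=["LAMBDA", "LAMBDA_CODE"],
)
print(response["accounts"])
```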

Amazon Inspector plays a crucial role in maintaining the highest security standards for your resources. Whether you’re installing a new package on an EC2 instance, applying a software patch, or when a new CVE affecting a specific resource is disclosed, Amazon Inspector can assist with quick identification and remediation.

Conclusion

Incorporating security at every stage of the software development lifecycle is paramount and requires that security be a consideration from the outset. Shifting left enables security teams to reduce overall application security risks.

Using these AWS services — Amazon CodeWhisperer, Amazon CodeGuru, and Amazon Inspector — not only aids in early risk identification and mitigation, but also empowers your development and security teams, leading to more efficient and secure business outcomes.

For further reading, check out the AWS Well-Architected Security Pillar, the Generative AI on AWS page, and more blogs like this on the AWS Security Blog page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon CodeWhisperer re:Post forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Dylan Souvage

Dylan Souvage

Dylan is a Solutions Architect based in Toronto, Canada. Dylan loves working with customers to understand their business needs and enable them in their cloud journey. In his spare time, he enjoys going out in nature, going on long road trips, and traveling to warm, sunny places.

Temi Adebambo

Temi Adebambo

Temi is the Head of Security Solutions Architecture at AWS with extensive experience leading technical teams and delivering enterprise-wide technology transformations programs. He has assisted Fortune 500 corporations with Cloud Security Architecture, Cyber Risk Management, Compliance, IT Security strategy, and governance. He currently leads teams of Security Solutions Architects solving business problems on behalf of customers.

Caitlin McDonald

Caitlin McDonald

Caitlin is a Montreal-based Solutions Architect at AWS with a development background. Caitlin works with customers in French and English to accelerate innovation and advise them through technical challenges. In her spare time, she enjoys triathlons, hockey, and making food with friends!

Shivam Patel

Shivam Patel

Shivam is a Solutions Architect at AWS. He comes from a background in R&D and combines this with his business knowledge to solve complex problems faced by his customers. Shivam is most passionate about workloads in machine learning, robotics, IoT, and high-performance computing.

Wael Abboud

Wael Abboud

Wael is a Solutions Architect at AWS. He assists enterprise customers in implementing innovative technologies, leveraging his background integrating cellular networks and concentrating on 5G technologies during his 5 years in the telecom industry.

Building sensitive data remediation workflows in multi-account AWS environments

Post Syndicated from Darren Roback original https://aws.amazon.com/blogs/security/building-sensitive-data-remediation-workflows-in-multi-account-aws-environments/

The rapid growth of data has empowered organizations to develop better products, more personalized services, and deliver transformational outcomes for their customers. As organizations use Amazon Web Services (AWS) to modernize their data capabilities, they can sometimes find themselves with data spread across several AWS accounts, each aligned to distinct use cases and business units. This can present a challenge for security professionals, who need not only a mechanism to identify sensitive data types—such as protected health information (PHI), payment card industry (PCI), and personally identifiable information (PII), or organizational intellectual property—stored on AWS, but also the ability to automatically act upon these findings through custom logic that supports organizational policies and regulatory requirements.

In this blog post, we present a solution that provides you with visibility into sensitive data residing across a fleet of AWS accounts through a ChatOps-style notification mechanism using Microsoft Teams, which also provides contextual information needed to conduct security investigations. This solution also incorporates a decision logic mechanism to automatically quarantine sensitive data assets while they’re pending review, which can be tailored to meet unique organizational, industry, or regulatory environment requirements.

Prerequisites

Before you proceed, ensure that you have the following within your environment:

Assumptions

Things to know about the solution in this blog post:

Solution overview

The solution architecture and overall workflow are detailed in Figure 1 that follows.

Figure 1: Solution overview

Upon discovering sensitive data in member accounts, this solution selectively quarantines objects based on their finding severity and public status. This logic can be customized to evaluate additional details such as the finding type or number of sensitive data occurrences detected. The ability to adjust this workflow logic can provide a custom-tailored solution to meet a variety of industry use cases, helping you adhere to industry-specific security regulations and frameworks.

Figure 1 provides an overview of the components used in this solution, and for illustrative purposes we step through them here.

Automated scanning of buckets

Macie supports various scope options for sensitive data discovery jobs, including the use of bucket tags to determine in-scope buckets for scanning. Setting up automated scanning includes the use of an AWS Organizations tag policy to verify that the S3 buckets deployed in your chosen OUs conform to tagging requirements, an AWS Config rule to automatically check that tags have been applied correctly, an Amazon EventBridge rule and bus to receive compliance events from a fleet of member accounts, and an AWS Lambda function to notify administrators of compliance change events.

  1. An AWS Organizations tag policy verifies that the S3 buckets created in the in-scope AWS accounts have a tag structure that facilitates automated scanning and identification of the bucket owner. Specific tags enforced with this tag policy include RequireMacieScan : True|False and BucketOwner : <variable>. The tag policy is enforced at the OU level.
  2. A custom AWS Config rule evaluates whether these tags have been applied to all in-scope S3 buckets, and marks resources as compliant or not compliant.
  3. After every evaluation of S3 bucket tag compliance, AWS Config will send compliance events to an EventBridge event bus in the same AWS member account.
  4. An EventBridge rule is used to send compliance messages from member accounts to a centralized security account, which is used for operational administration of the solution.
  5. An EventBridge rule in the centralized security account is used to send all tag compliance events to a Lambda serverless function, which is used for notification.
  6. The tag compliance notification function receives the tag compliance events from EventBridge, parses the information, and posts it to a Microsoft Teams webhook. This provides you with details such as the affected bucket, compliance status, and bucket owner, along with a link to investigate the compliance event in AWS Config.
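For illustration, the member-account rule described in steps 3 and 4 could be expressed with boto3 as follows. This is a minimal sketch: the event bus ARN, role ARN, and rule name are placeholders, and in practice the solution's CloudFormation templates create these resources for you.

```python
# Minimal sketch: forward AWS Config compliance change events from a member
# account to the central event bus in the security account. ARNs are placeholders.
import json
import boto3

events = boto3.client("events")

CENTRAL_BUS_ARN = "arn:aws:events:us-east-1:222222222222:event-bus/security-central-bus"
TARGET_ROLE_ARN = "arn:aws:iam::111122223333:role/EventBridgeCrossAccountRole"

events.put_rule(
    Name="forward-s3-tag-compliance-events",
    State="ENABLED",
    EventPattern=json.dumps({
        "source": ["aws.config"],
        "detail-type": ["Config Rules Compliance Change"],
    }),
)

events.put_targets(
    Rule="forward-s3-tag-compliance-events",
    Targets=[{
        "Id": "central-security-bus",
        "Arn": CENTRAL_BUS_ARN,
        "RoleArn": TARGET_ROLE_ARN,  # lets EventBridge put events on the cross-account bus
    }],
)
```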

Detecting sensitive data

Macie is a key component of this solution and facilitates sensitive data discovery across a fleet of AWS accounts based on the RequireMacieScan tag value configured at the time of bucket creation. Setting up sensitive data detection includes using Macie for sensitive data discovery, an EventBridge rule and bus to receive these finding events from the Macie delegated administrator account, an AWS Step Functions state machine to process these findings, and a Lambda function to notify administrators of new findings.

  1. Macie is used to detect sensitive data in member account S3 buckets with job definitions based on bucket tags, and central reporting of findings through a Macie delegated administrator account (that is, the central security account).
  2. When new findings are detected, Macie will send finding events to an EventBridge event bus in the security account.
  3. An EventBridge rule is used to send all finding events to AWS Step Functions, which runs custom business logic to evaluate each finding and determine the next action to take on the affected object.
  4. In the default configuration, for findings that are of low or medium severity and in objects that are not publicly accessible, Step Functions sends finding event information to a Lambda serverless function, which is used for notification. You can alter this event processing logic in Step Functions to change which finding types initiate an automated quarantine of the object.
  5. The Macie finding notification function receives the finding events from Step Functions, parses this information, and posts it to a Microsoft Teams webhook. This presents you with details such as the affected bucket and AWS account, finding ID and severity, public status, and bucket owner, along with a link to investigate the finding in Macie.

Automated quarantine

As outlined in the previous section, this solution uses event processing decision logic in Step Functions to determine whether to quarantine a sensitive object in place. Setting up automated quarantine includes a Step Functions state machine for Macie finding processing logic, an Amazon DynamoDB table to track items moved to quarantine, a Lambda function to notify administrators of quarantine, and an Amazon API Gateway and second Step Functions state machine to facilitate remediation or removal of objects from quarantine.

  1. In the default configuration for findings that are of high severity, or in objects that are publicly accessible, Step Functions adds the affected object’s metadata to an Amazon DynamoDB table, which is used to track quarantine and remediation status at scale.
  2. Step Functions then quarantines the affected object, moving it to an automatically configured and secured prefix in the same S3 bucket while the finding is being investigated. Only select personnel (that is, cybersecurity) have access to the object.
  3. Step Functions then sends finding event information to a Lambda serverless function, which is used for notification.
  4. The Macie quarantine notification function receives the finding events from Step Functions, parses this information, and posts it to a Microsoft Teams webhook. This presents you with similar details to the Macie finding notification function, but also notes the object has been moved to quarantine, and provides a one-click option to release the object from quarantine.
  5. In the Microsoft Teams channel, which should only be accessible to qualified security professionals, DLP administrators select the option to release the object from quarantine. This invokes a REST API deployed on API Gateway.
  6. API Gateway invokes a release object workflow in Step Functions, which begins to process releasing the affected object from quarantine.
  7. Step Functions inspects the affected object ID received through API Gateway, and first retrieves details about this object from DynamoDB. Upon receiving these details, the object is removed from the quarantine tracking database.
  8. Step Functions then moves the affected object to its original location in Amazon S3, thereby making the object accessible again under the original bucket policy and IAM permissions.
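To make the quarantine mechanic in steps 1 through 3 concrete, the following is a minimal boto3 sketch of an in-place quarantine move. The bucket, key, and table names are placeholders; the deployed solution performs the equivalent actions from its Step Functions state machine.

```python
# Minimal sketch of an in-place quarantine move: record the object in DynamoDB,
# copy it under a quarantine/ prefix, then delete the original. Names are placeholders.
import boto3

s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("quarantine-tracking")  # hypothetical table

def quarantine_object(bucket: str, key: str, finding_id: str) -> None:
    quarantine_key = f"quarantine/{key}"

    # 1. Track the quarantined object so it can be released later.
    table.put_item(Item={
        "FindingId": finding_id,
        "Bucket": bucket,
        "OriginalKey": key,
        "QuarantineKey": quarantine_key,
    })

    # 2. Copy the object under the restricted quarantine prefix.
    s3.copy_object(
        Bucket=bucket,
        Key=quarantine_key,
        CopySource={"Bucket": bucket, "Key": key},
    )

    # 3. Remove the original so it is no longer reachable under existing policies.
    s3.delete_object(Bucket=bucket, Key=key)
```

Keeping the object in the same bucket (copy, then delete) is what allows the in-place quarantine strategy to scale across many accounts without a central quarantine bucket.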

Organization structure

As mentioned in the prerequisites, the solution uses a set of AWS accounts that have been joined to an organization, which is shown in Figure 2. While the logical structure of your AWS Organizations deployment can differ from what’s shown, for illustration purposes, we’re looking for sensitive data in the Development and Production AWS accounts, and the examples shown throughout the remainder of this blog post reflect that.

Figure 2: Organization structure

Deploy the solution

The overall deployment process of this solution has been decomposed into three AWS CloudFormation templates: stacks deployed to your management and security accounts, and a stack set deployed across your member accounts. Performing the deployment in this manner not only helps ensure that the solution is extended to member accounts created after the initial solution deployment, but also serves as an illustrative aid for the components deployed in each portion of the solution. An overview of the deployment process is as follows:

  1. Set up of Microsoft Teams webhooks to receive information from this solution.
  2. Deployment of a CloudFormation stack to the management account to configure the tag policy for this solution.
  3. Deployment of a CloudFormation stack set to member accounts to enable monitoring of S3 bucket tags and forwarding of tag compliance events to the security account.
  4. Deployment of a CloudFormation stack to the security account to configure the remainder of the solution that will facilitate sensitive data discovery, finding event processing, administrator notification, and release from quarantine functionality.
  5. Remaining manual configuration of quarantine remediation API authorization, enabling Macie, and specifying a Macie results location.

Set up Microsoft Teams

Before deploying the solution, you must create two incoming webhooks in a Microsoft Teams channel of your choice. Due to the sensitivity of the event information provided and the ability to release objects from quarantine, we recommend that this channel only be accessible to information security professionals. In addition, we recommend creating two distinct webhooks to distinguish tag compliance events from finding events; in the examples, the webhooks are named S3-Tag-Compliance-Notification and Macie-Finding-Notification, respectively. A complete walkthrough of creating an incoming webhook is out of scope for this blog post, but you can access Microsoft's documentation on creating incoming webhooks for an overview. After the webhooks have been created, save the URLs to use in the solution deployment process.

Configure the management account

The first step of the deployment process is to deploy a CloudFormation stack in your management account that creates an AWS Organizations tag policy and applies it to the OUs of your choice. Before performing this step, note the two OU IDs you will apply the policy to, as these will be captured as input parameters for the CloudFormation stack.

  1. Choose the following Launch Stack button to open the CloudFormation console pre-loaded with the template for this step:

    Launch Stack

    Note: The stack will launch in the N. Virginia (us-east-1) Region. To deploy this solution into other AWS Regions, change your regional selection in the CloudFormation console, and deploy it to the selected Region.

  2. For Stack name, enter ConfigureManagementAccount.
  3. For First AWS Organizations Unit ID, enter your first OU ID.
  4. For Second AWS Organizations Unit ID, enter your second OU ID.
  5. Choose Next.
    Figure 3: Management account stack details

After you’ve entered all details, launch the stack and wait until the stack has reached CREATE_COMPLETE status before proceeding. The deployment process will take 1–2 minutes.

Configure the member accounts

The next step of the deployment process is to deploy a CloudFormation Stack, which will initiate a StackSet deployment from your management account that’s scoped to the OUs of your choice. This stack set will enable AWS Config along with an AWS Config rule to evaluate Amazon S3 tag compliance and will deploy an EventBridge rule to send compliance events from AWS Config in your member accounts to a centralized event bus in your security account. If AWS Config has previously been enabled, select True for the AWS Config Status parameter to help prevent an overwrite of your existing settings. Prior to performing this setup, note the two AWS Organizations OU IDs you will deploy the stack set to. You will also be prompted to enter the AWS account ID and Region of your security account.

  1. Choose the following Launch Stack button to open the CloudFormation console pre-loaded with the template for this step:

    Launch Stack

    Note: The stack will launch in the N. Virginia (us-east-1) Region. To deploy this solution into other AWS Regions, change your regional selection in the CloudFormation console, and deploy it to the selected Region.

  2. For Stack name, enter DeployToMemberAccounts.
  3. For First AWS Organizations Unit ID, enter your first OU ID.
  4. For Second AWS Organizations Unit ID, enter your second OU ID.
  5. For Deployment Region, enter the Region you want to deploy the Stack set to.
  6. For AWS Config Status, accept the default value of false if you have not previously enabled AWS Config in your accounts.
  7. For Support all resource types, accept the default value of false.
  8. For Include global resource types, accept the default value of false.
  9. For List of resource types if not all supported, accept the default value of AWS::S3::Bucket.
  10. For Configuration delivery channel name, accept the default value of <Generated>.
  11. For Snapshot delivery frequency, accept the default value of 1 hour.
  12. For Security account ID, enter your security account ID.
  13. For Security account region, select the Region of your security account.
  14. Choose Next.
    Figure 4: Member account Stack details

After you’ve entered all details, launch the stack and wait until the stack has reached CREATE_COMPLETE status before proceeding. The deployment process will take 3–5 minutes.

Configure the security account

The next step of the deployment process involves deploying a CloudFormation stack in your security account that creates all the resources needed for at-scale sensitive data detection, automated quarantine, and security professional notification and response. This stack configures the following:

  • An S3 bucket and AWS Key Management Service (AWS KMS) keys for storage and encryption of discovery results in Macie.
  • Two rules in EventBridge for routing tag compliance and Macie finding events.
  • Two Step Functions state machines whose logic will be used for automated object quarantine and release.
  • Three Lambda functions for tag compliance, Macie findings, and quarantine notification.
  • A DynamoDB table for quarantine status tracking.
  • An API Gateway REST endpoint to facilitate the release of objects from quarantine.

Before performing this setup, note your AWS Organizations ID and two Microsoft Teams webhook URLs previously configured.

  1. Choose the following Launch Stack button to open the CloudFormation console pre-loaded with the template for this step:

    Launch Stack

    Note: The stack will launch in the N. Virginia (us-east-1) Region. To deploy this solution into other AWS Regions, change your regional selection in the CloudFormation console, and deploy it to the selected Region.

  2. For Stack name, enter ConfigureSecurityAccount.
  3. For AWS Org ID, enter your AWS Organizations ID.
  4. For Webhook URI for S3 tag compliance notifications, enter the first webhook URL you created in Microsoft Teams.
  5. For Webhook URI for Macie finding and quarantine notifications, enter the second webhook URL you created in Microsoft Teams.
  6. Choose Next.
    Figure 5: Security account stack details

After you’ve entered all details, launch the stack and wait until the stack has reached CREATE_COMPLETE status before proceeding. The deployment process will take 3–5 minutes.

Remaining configuration

While most of the solution is deployed automatically using CloudFormation, there are a few items that you must configure manually.

Configure Lambda API key environment variable

When the CloudFormation stack was deployed to the security account, CloudFormation created a new REST API for security professionals to release objects from quarantine. This API was configured with an API key to be used for authorization, and you must retrieve the value of this API key and set it as an environment variable in your MacieQuarantineNotification function, which also deployed in the security account. To retrieve the value of this API key, navigate to the REST API created in the security account, select API Keys, and retrieve the value of APIKey1. Next, navigate to the MacieQuarantineNotification function in the Lambda console, and set the ReleaseObjectApiKey environment variable to the value of your API key.
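If you'd rather script this manual step, the following is a minimal boto3 sketch that looks up the API key value and sets the ReleaseObjectApiKey environment variable. The API key and function names assume the defaults described above; confirm the exact resource names created in your account before running it.

```python
# Minimal sketch: retrieve the REST API key value and set it as the
# ReleaseObjectApiKey environment variable on the quarantine notification
# function. Names are assumptions based on the stack described above.
import boto3

apigw = boto3.client("apigateway")
lambda_client = boto3.client("lambda")

FUNCTION_NAME = "MacieQuarantineNotification"  # verify the deployed function name

# Look up the API key by name and include its secret value in the response.
key = apigw.get_api_keys(nameQuery="APIKey1", includeValues=True)["items"][0]

# Merge with any existing variables, because update_function_configuration
# replaces the whole Environment block.
current = lambda_client.get_function_configuration(FunctionName=FUNCTION_NAME)
env = current.get("Environment", {}).get("Variables", {})
env["ReleaseObjectApiKey"] = key["value"]

lambda_client.update_function_configuration(
    FunctionName=FUNCTION_NAME,
    Environment={"Variables": env},
)
```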

Enable Macie

Next, you must enable Macie to facilitate sensitive data discovery in selected accounts in your organization, and this process begins with the selection of a delegated administrator account (that is, the security account), followed by onboarding the member accounts you want to test with. See Integrating and configuring an organization in Amazon Macie for detailed instructions on enabling Macie in your organization.

Configure the Macie results bucket and KMS key

Macie creates an analysis record for each Amazon S3 object that it analyzes or attempts to analyze. This includes objects where Macie detected sensitive data, objects where sensitive data was not detected, and objects that Macie could not analyze. The CloudFormation stack deployed in the security account created an S3 bucket and KMS key for this purpose, and they are noted as MacieResultsS3BucketName and MacieResultsKmsKeyAlias in the CloudFormation stack output. Use these resources to configure the Macie results bucket and KMS key in the security account according to Storing and retaining sensitive data discovery results with Amazon Macie. Customization of the S3 bucket policy and KMS key policy has already been done for you as part of the ConfigureSecurityAccount CloudFormation template deployed earlier.

Validate the solution

With the solution fully deployed, you now need to deploy an S3 bucket with sample data to test the solution and review the findings.

Create a member account S3 bucket

In any of the member accounts onboarded into this solution as part of the Configure the member accounts step, deploy a new S3 bucket and the KMS key used to encrypt the bucket using the CloudFormation template that follows. Before performing this step, note the InvestigateMacieFindingsRole, StepFunctionsProcessFindingsRole, and StepFunctionsReleaseObjectRole outputs from the CloudFormation template deployed to the security account, as these will be captured as input parameters for the CloudFormation stack.

  1. Choose the following Launch Stack button to open the CloudFormation console pre-loaded with the template for this step:

    Launch Stack

    Note: The stack will launch in the N. Virginia (us-east-1) Region. To deploy this solution into other AWS Regions, change your regional selection in the CloudFormation console, and deploy it to the selected Region.

  2. For Stack name, enter DeployS3BucketKmsKey.
  3. For IAM Role used by Step Functions to process Macie findings, enter the ARN that was provided to you as the StepFunctionsProcessFindingsRole output from the Configure security account step.
  4. For IAM Role used by Step Functions to release objects from quarantine, enter the ARN that was provided to you as the StepFunctionsReleaseObjectRole output from the Configure security account step.
  5. For IAM Role used by security professionals to investigate Macie findings, enter the ARN that was provided to you as the InvestigateMacieFindingsRole output from the Configure security account step.
  6. For Department name of the bucket owner, enter any department or team name you want to designate as having ownership responsibility over the S3 bucket.
  7. Choose Next.
    Figure 6: Deploy S3 bucket and KMS key stack details

After you’ve entered all details, launch the stack and wait until the stack has reached CREATE_COMPLETE status before proceeding. The deployment process will take 3–5 minutes.

Monitor S3 bucket tag compliance

Shortly after the deployment of the new S3 bucket, you should see a message in your Microsoft Teams channel notifying you of the tag compliance status of the new bucket. This AWS Config rule is evaluated automatically any time an S3 resource event takes place, and the tag compliance event is sent to the centralized security account for notification purposes. While the notification shown in Figure 7 depicts a compliant S3 bucket, a bucket deployed without the required tags will be marked as NON_COMPLIANT, and security professionals can check the AWS Config compliance status directly in the AWS console for the member account.

Figure 7: S3 tag compliance notification

Upload sample data

Download this .zip file of sample data and upload the expanded files into the newly created S3 bucket. The sample files contain fictitious PII, including credit card information and social security numbers, and so will generate various Macie findings.

Note: All data in this blog post has been artificially created by AWS for demonstration purposes and has not been collected from any individual person. Similarly, such data does not, nor is it intended, to relate back to any individual person.

Configure a Macie discovery job

Configure a sensitive data discovery job in the Amazon Macie delegated administrator account (that is, the security account) according to Creating a sensitive data discovery job. When creating the job, specify tag-based bucket criteria instructing Macie to scan any bucket with a tag key of RequireMacieScan and a tag value of True. This instructs Macie to scan buckets matching this criterion across the accounts that have been onboarded into Macie.

Figure 8: Macie bucket criteria

On the discovery options page, specify a one-time job with a sampling depth of 100 percent. Further refine the job scope by adding the quarantine prefix to the exclusion criteria of the sensitive data discovery job.

Figure 9: Macie scan scope

Select the AWS_CREDENTIALS, CREDIT_CARD_NUMBER, CREDIT_CARD_SECURITY_CODE, and DATE_OF_BIRTH managed data identifiers and proceed to the review screen. On the review screen, ensure that the bucket name you created is listed under bucket matches, and launch the discovery job.
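For reference, an equivalent job can be created programmatically. The following is a minimal boto3 sketch based on the Macie2 CreateClassificationJob API; the job name is a placeholder, the parameter shapes should be verified against the current SDK documentation, and you can confirm the managed data identifier IDs with list_managed_data_identifiers before running it.

```python
# Minimal sketch: create a one-time Macie discovery job scoped to buckets tagged
# RequireMacieScan=True, excluding objects under the quarantine/ prefix.
# Verify parameter shapes and identifier IDs against the Macie2 SDK docs.
import boto3

macie = boto3.client("macie2")

macie.create_classification_job(
    name="sensitive-data-discovery-tagged-buckets",  # hypothetical job name
    jobType="ONE_TIME",
    samplingPercentage=100,
    managedDataIdentifierSelector="INCLUDE",
    managedDataIdentifierIds=[
        "AWS_CREDENTIALS",
        "CREDIT_CARD_NUMBER",
        "CREDIT_CARD_SECURITY_CODE",
        "DATE_OF_BIRTH",
    ],
    s3JobDefinition={
        # Scan any bucket tagged RequireMacieScan=True across onboarded accounts.
        "bucketCriteria": {
            "includes": {
                "and": [{
                    "tagCriterion": {
                        "comparator": "EQ",
                        "tagValues": [{"key": "RequireMacieScan", "value": "True"}],
                    }
                }]
            }
        },
        # Exclude objects already moved under the quarantine prefix.
        "scoping": {
            "excludes": {
                "and": [{
                    "simpleScopeTerm": {
                        "comparator": "STARTS_WITH",
                        "key": "OBJECT_KEY",
                        "values": ["quarantine/"],
                    }
                }]
            }
        },
    },
)
```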

Note: This solution also works with the Automated Sensitive Data Discovery feature in Amazon Macie. We recommend that you investigate this feature further for broad visibility into where sensitive data might reside within your Amazon S3 data estate. Regardless of the method you choose, you will be able to integrate the appropriate notification and quarantine solution.

Review and investigate findings

After a few minutes, the discovery job will complete and you should soon see four messages in your Microsoft Teams channel notifying you of the finding events created by Macie. One of these findings will be marked as medium severity, while the other three will be marked as high severity.

Review the medium severity finding, and recall the flow of event information from the solution overview section. Macie scanned a bucket in the member account and presented this finding in the Macie delegated administrator account. Macie then sent this finding event to EventBridge, which initiated a workflow run in Step Functions. Step Functions invoked customer-specified logic to evaluate the finding and determined that because the object isn’t publicly accessible, and because the finding isn’t high severity, it should only notify security professionals rather than quarantine the object in question. Several key pieces of information necessary for investigation are presented to the security team, along with a link to directly investigate the finding in the AWS console of the central security account.

Figure 10: Macie medium severity finding notification

Now review the high severity finding. The flow of event information in this scenario is identical to the medium severity finding, but in this case, Step Functions quarantined the object because the severity is high. The security team is again presented with an option to use the console to investigate further. The process to investigate this finding is a bit different due to the object being moved to a quarantine prefix. If security professionals want to view the original object in its entirety, they would assume the InvestigateMacieFindingsRole in the security account, which has cross-account access to the S3 bucket quarantine prefix in the in-scope member accounts. S3 buckets deployed in member accounts using the CloudFormation template listed above will have a special bucket policy that denies access to the quarantine prefix for any role other than the InvestigateMacieFindingsRole, StepFunctionsProcessFindingsRole, and StepFunctionsReleaseObjectRole. This makes sure that objects are truly quarantined and inaccessible while being investigated by security professionals.

Figure 11: Macie high severity finding notification

Unlike the previous example, the security team is also notified that an affected object was moved to quarantine, and is presented with a separate option to release the object from quarantine. Choosing Release Object from Quarantine runs an HTTP POST to the REST API transparently in the background, and the API responds with a SUCCEEDED or FAILED message indicating the result of the operation.

Figure 12: Release object from quarantine

The state machine uses decision logic based on the affected object's public status and the severity of the finding. Customers deploying this solution can choose to alter this logic or add further customization by editing the Step Functions state machine definition, either directly in the CloudFormation template or through the Step Functions Workflow Studio low-code interface available in the Step Functions console. For reference, the full event schema used by Macie can be found in the EventBridge event schema for Macie findings.

The logic of the Step Functions state machine used to process Macie finding events follows and is shown in Figure 13.

  1. An EventBridge rule invokes the Step Functions state machine as Macie findings are received.
  2. Step Functions parses Macie event data to determine the finding severity.
  3. If the finding is high severity or determined to be public, the affected object is then added to the quarantine tracking database in DynamoDB.
  4. After adding the object to quarantine tracking, the object is copied into the quarantine prefix within its S3 bucket.
  5. After being copied, the object is deleted from its original S3 location.
  6. After the object is deleted, the MacieQuarantineNotification function is invoked to alert you of the finding and quarantine status.
  7. If the finding is not high severity and not determined to be public, the MacieFindingNotification function is invoked to alert you of the finding.
    Figure 13: Process Macie findings workflow
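The decision made in steps 2, 3, and 7 can also be expressed as a short Python sketch, assuming the event detail follows the standard Macie finding schema (severity.description and resourcesAffected.s3Bucket.publicAccess). The deployed solution implements this logic as Step Functions choice states rather than code.

```python
# Minimal sketch of the quarantine decision, assuming the standard Macie
# finding schema carried in the EventBridge event detail.
def should_quarantine(finding_detail: dict) -> bool:
    severity = finding_detail["severity"]["description"]          # e.g. "Low", "Medium", "High"
    public_status = (
        finding_detail["resourcesAffected"]["s3Bucket"]
        ["publicAccess"]["effectivePermission"]                    # e.g. "PUBLIC", "NOT_PUBLIC"
    )
    return severity == "High" or public_status == "PUBLIC"

# Example: a non-public, medium severity finding is notified, not quarantined.
example = {
    "severity": {"description": "Medium"},
    "resourcesAffected": {"s3Bucket": {"publicAccess": {"effectivePermission": "NOT_PUBLIC"}}},
}
assert should_quarantine(example) is False
```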

Solution cleanup

To remove the solution and avoid incurring additional charges for the AWS resources used in this solution, perform the following steps.

Note: If you want to only suspend Macie, which will preserve your settings, resources, and findings, see Suspending or disabling Amazon Macie.

  1. Open the Macie console in your security account. Under Settings, choose Accounts. Select the checkboxes next to the member accounts onboarded previously, select the Actions dropdown, and select Disassociate Account. When that has completed, select the same accounts again, and choose Delete.
  2. Open the Macie console in your management account. Choose Get started, and remove your security account as the Macie delegated administrator.
  3. Open the Macie console in your security account, choose Settings, then choose Disable Macie.
  4. Open the S3 console in your security account. Remove all objects from the Macie results S3 bucket.
  5. Open the CloudFormation console in your security account. Select the ConfigureSecurityAccount stack and choose Delete.
  6. Open the Macie console in your member accounts. Under Settings, choose Disable Macie.
  7. Open the Amazon S3 console in your member accounts. Remove all sample data objects from the S3 bucket.
  8. Open the CloudFormation console in your member accounts. Select the DeployS3BucketKmsKey stack and choose Delete.
  9. Open the CloudFormation console in your management account. Select the DeployToMemberAccounts stack and choose Delete.
  10. Still in the CloudFormation console in your management account, select the ConfigureManagementAccount stack and choose Delete.

Summary

In this blog post, we demonstrated a solution to process—and act upon—sensitive data discovery findings surfaced by Macie through the incorporation of decision logic implemented in Step Functions. This logic provides granularity on quarantine decisions and extensibility you can use to customize automated finding response to suit your business and regulatory needs.

In addition, we demonstrated a mechanism to notify you of finding event information as it’s discovered, while providing you with the contextual details necessary for investigative purposes. Additional customization of the information presented in Microsoft Teams can be accomplished through parsing of the event payload. You can also customize the Lambda functions to interface with Slack, as demonstrated in this sample solution for Macie.

Finally, we demonstrated a solution that can operate at scale and will automatically be deployed to new member accounts in your organization. By using an in-place quarantine strategy instead of a centralized one, you can more easily manage this solution across potentially hundreds of AWS accounts and thousands of S3 buckets. By incorporating a global tracking database in DynamoDB, this solution can also be enhanced with a visual dashboard depicting quarantine insights.

If you have feedback about this post, let us know in the comments section. If you have questions about this post, start a new thread on AWS Security, Identity, & Compliance re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Darren Roback

Darren Roback

Darren is a Senior Solutions Architect with Amazon Web Services based in St. Louis, Missouri. He has a background in Security and Compliance, Serverless Event-Driven Architecture, and Enterprise Architecture. At AWS, Darren partners with customers to help them solve business challenges with AWS technology. Outside of work, Darren enjoys spending time in his shop working on woodworking projects.

Seth Malone

Seth Malone

Seth is a Solutions Architect with Amazon Web Services based in Missouri. He has experience maintaining secure workloads across healthcare, financial, and engineering firms. Seth enjoys the outdoors, and can be found on the water after building solutions with his customers.

Chuck Enstall

Chuck Enstall

Chuck is a Principal Solutions Architect with AWS based in St. Louis, MO, focusing on enterprise customers. He has decades of experience in security across multiple clouds, and holds numerous security certifications such as CISSP, CCSP, and others. As a former Navy veteran, Chuck enjoys mentoring other veterans as they explore possible careers within the security field.

AWS Speaker Profile: Zach Miller, Senior Worldwide Security Specialist Solutions Architect

Post Syndicated from Roger Park original https://aws.amazon.com/blogs/security/aws-speaker-profile-zach-miller-senior-worldwide-security-specialist-solutions-architect/

In the AWS Speaker Profile series, we interview Amazon Web Services (AWS) thought leaders who help keep our customers safe and secure. This interview features Zach Miller, Senior Worldwide Security Specialist SA and re:Invent 2023 presenter of Securely modernize payment applications with AWS and Centrally manage application secrets with AWS Secrets Manager. Zach shares thoughts on the data protection and cloud security landscape, his unique background, his upcoming re:Invent sessions, and more.


How long have you been at AWS?

I’ve been at AWS for more than four years, and I’ve enjoyed every minute of it! I started as a consultant in Professional Services, and I’ve been a Security Solutions Architect for around three years.

How do you explain your job to your non-tech friends?

Well, my mother doesn’t totally understand my role, and she’s been known to tell her friends that I’m the cable company technician that installs your internet modem and router. I usually tell my non-tech friends that I help AWS customers protect their sensitive data. If I mention cryptography, I typically only get asked questions about cryptocurrency—which I’m not qualified to answer. If someone asks what cryptography is, I usually say it’s protecting data by using mathematics.

How did you get started in data protection and cryptography? What about it piqued your interest?

I originally went to school to become a network engineer, but I discovered that moving data packets from point A to point B wasn't as interesting to me as securing those data packets. Early in my career, I was an intern at an insurance company, and I had a mentor who set up ethical hacking lessons for me—for example, I'd come into the office and he'd have a compromised workstation preconfigured. He'd ask me to do an investigation and determine how the workstation was compromised and what could be done to isolate it and collect evidence. Other times, I'd come in and find my desk cabinets were locked with a padlock, and he wanted me to pick the lock. Security is particularly interesting because it's an ever-evolving field, and I enjoy learning new things.

What’s been the most dramatic change you’ve seen in the data protection landscape?

One of the changes that I’ve been excited to see is an emphasis on encrypting everything. When I started my career, we’d often have discussions about encryption in the context of tradeoffs. If we needed to encrypt sensitive data, we’d have a conversation with application teams about the potential performance impact of encryption and decryption operations on their systems (for example, their databases), when to schedule downtime for the application to encrypt the data or rotate the encryption keys protecting the data, how to ensure the durability of their keys and make sure they didn’t lose data, and so on.

When I talk to customers about encryption on AWS today—of course, it’s still useful to talk about potential performance impact—but the conversation has largely shifted from “Should I encrypt this data?” to “How should I encrypt this data?” This is due to services such as AWS Key Management Service (AWS KMS) making it simpler for customers to manage encryption keys and encrypt and decrypt data in their applications with minimal performance impact or application downtime. AWS KMS has also made it simple to enable encryption of sensitive data—with over 120 AWS services integrated with AWS KMS, and services such as Amazon Simple Storage Service (Amazon S3) encrypting new S3 objects by default.

You are a frequent contributor to the AWS Security Blog. What were some of your recent posts about?

My last two posts covered how to use AWS Identity and Access Management (IAM) condition context keys to create enterprise controls for certificate management and how to use AWS Secrets Manager to securely manage and retrieve secrets in hybrid or multicloud workloads. I like writing posts that show customers how to use a new feature, or highlight a pattern that many customers ask about.

You are speaking in a couple of sessions at AWS re:Invent; what will your sessions focus on? What do you hope attendees will take away from your session?

I’m delivering two sessions at re:Invent this year. The first is a chalk talk, Centrally manage application secrets with AWS Secrets Manager (SEC221), that I’m delivering with Ritesh Desai, who is the General Manager of Secrets Manager. We’re discussing how you can securely store and manage secrets in your workloads inside and outside of AWS. We will highlight some recommended practices for managing secrets, and answer your questions about how Secrets Manager integrates with services such as AWS KMS to help protect application secrets.

The second session is also a chalk talk, Securely modernize payment applications with AWS (SEC326). I’m delivering this talk with Mark Cline, who is the Senior Product Manager of AWS Payment Cryptography. We will walk through an example scenario on creating a new payment processing application. We will discuss how to use AWS Payment Cryptography, as well as other services such as AWS Lambda, to build a simple architecture to help process and secure credit card payment data. We will also include common payment industry use cases such as tokenization of sensitive data, and how to include basic anti-fraud detection, in our example app.

What are you currently working on that you’re excited about?

My re:Invent sessions are definitely something that I’m excited about. Otherwise, I spend most of my time talking to customers about AWS Cryptography services such as AWS KMS, AWS Secrets Manager, and AWS Private Certificate Authority. I also lead a program at AWS that enables our subject matter experts to create and publish videos to demonstrate new features of AWS Security Services. I like helping people create videos, and I hope that our videos provide another mechanism for viewers who prefer information in a video format. Visual media can be more inclusive for customers with certain disabilities or for neurodiverse customers who find it challenging to focus on written text. Plus, you can consume videos differently than a blog post or text documentation. If you don’t have the time or desire to read a blog post or AWS public doc, you can listen to an instructional video while you work on other tasks, eat lunch, or take a break. I invite folks to check out the AWS Security Services Features Demo YouTube video playlist.

Is there something you wish customers would ask you about more often?

I always appreciate when customers provide candid feedback on our services. AWS is a customer-obsessed company, and we build our service roadmaps based on what our customers tell us they need. You should feel comfortable letting AWS know when something could be easier, more efficient, or less expensive. Many customers I’ve worked with have provided actionable feedback on our services and influenced service roadmaps, just by speaking up and sharing their experiences.

How about outside of work, any hobbies?

I have two toddlers that keep me pretty busy, so most of my hobbies are what they like to do. So I tend to spend a lot of time building elaborate toy train tracks, pushing my kids on the swings, and pretending to eat wooden toy food that they “cook” for me. Outside of that, I read a lot of fiction and indulge in binge-worthy TV.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Roger Park

Roger Park

Roger is a Senior Security Content Specialist at AWS Security focusing on data protection. He has worked in cybersecurity for almost ten years as a writer and content producer. In his spare time, he enjoys trying new cuisines, gardening, and collecting records.

Zach Miller

Zach Miller

Zach is a Senior Worldwide Security Specialist Solutions Architect at AWS. His background is in data protection and security architecture, focused on a variety of security domains, including cryptography, secrets management, and data classification. Today, he is focused on helping enterprise AWS customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

New – AWS Audit Manager now supports first third-party GRC integration

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/new-aws-audit-manager-now-supports-first-third-party-grc-integration/

Auditing is a continuous and ongoing process, and every audit includes the collection of evidence. The evidence gathered helps confirm the state of resources and is used to demonstrate that the customer's policies, procedures, and activities (controls) are in place, and that each control has been operational for a specified period of time. AWS Audit Manager already automates this evidence collection for AWS usage. However, large enterprise organizations that deploy their workloads across cloud, on-premises, or hybrid environments often manage this evidence data using a combination of third-party or homegrown tools, spreadsheets, and email.

Today we're excited to announce the integration of AWS Audit Manager with the third-party Governance, Risk, and Compliance (GRC) provider MetricStream CyberGRC, an AWS Partner with GRC capabilities. This integration allows enterprises to manage compliance across AWS, on-premises, and other cloud environments in a centralized GRC environment.

Before this announcement, Audit Manager operated only in the AWS context, allowing customers to collect compliance evidence for resources in AWS. They would then relay that information to their GRC systems external to AWS for additional aggregation and analysis. This process left customers without an automated way to monitor and evaluate all compliance data in one centralized location, resulting in delays to compliance outcomes.

The GRC integration with Audit Manager allows you to use audit evidence collected by Audit Manager directly in MetricStream CyberGRC. Audit Manager now receives the controls in scope from MetricStream CyberGRC, collects evidence around these controls, and exports the data related to the audit into MetricStream CyberGRC for aggregation and analysis. You will now have aggregated compliance data, real-time monitoring, and centralized reporting, which reduces compliance fatigue and improves stakeholder collaboration.

How It Works
Using Amazon Cognito User Pools, you’ll be onboarded into the multi-tenant instance of MetricStream CyberGRC.

Amazon Cognito User Pools diagram

Once onboarded, you'll be able to view AWS assets and frameworks inside MetricStream CyberGRC. You can then begin by choosing a suitable Audit Manager framework to define the relationships between your existing enterprise controls and AWS controls. After creating this one-time control mapping, you can define the accounts in scope to create an assessment that MetricStream CyberGRC will manage in AWS Audit Manager on your behalf. This assessment triggers AWS Audit Manager to collect evidence in the context of the mapped controls. As a result, you get a unified view of compliance evidence inside your GRC application. Any standard controls that you have in Audit Manager will be provided to MetricStream CyberGRC by using the GetControl API to facilitate the manual mapping process wherever automated mapping fails or does not suffice. The EvidenceFinder API will send bulk evidence from Audit Manager to MetricStream CyberGRC.
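For example, the GetControl operation is available through the AWS SDKs. The following is a minimal boto3 sketch that retrieves a single Audit Manager control; the control ID is a placeholder.

```python
# Minimal sketch: retrieve an Audit Manager control programmatically, the same
# GetControl operation the integration uses to support manual control mapping.
import boto3

auditmanager = boto3.client("auditmanager")

control = auditmanager.get_control(controlId="11111111-2222-3333-4444-555555555555")
print(control["control"]["name"], "-", control["control"].get("description", ""))
```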

Available Now
This feature is available today in the AWS Regions where both AWS Audit Manager and MetricStream CyberGRC are available. There are no additional AWS Audit Manager charges for using this integration. To use this integration, reach out to MetricStream for information about access and purchase of the MetricStream CyberGRC software.

As part of the AWS Free Tier, AWS Audit Manager offers a free tier for first-time customers. The free tier expires two calendar months after the first subscription. For more information, see AWS Audit Manager pricing. To learn more about the AWS Audit Manager integration with MetricStream CyberGRC, see the Audit Manager documentation.

Veliswa

AWS Security Profile: Tom Scholl, VP and Distinguished Engineer, AWS

Post Syndicated from Tom Scholl original https://aws.amazon.com/blogs/security/aws-security-profile-tom-scholl-vp-and-distinguished-engineer-aws/

Tom Scholl Main Image

In the AWS Security Profile series, we feature the people who work in Amazon Web Services (AWS) Security and help keep our customers safe and secure. This interview is with Tom Scholl, VP and Distinguished Engineer for AWS.

What do you do in your current role and how long have you been at AWS?

I’m currently a vice president and distinguished engineer in the infrastructure organization at AWS. My role includes working on the AWS global network backbone, as well as focusing on denial-of-service detection and mitigation systems. I’ve been with AWS for over 12 years.

What initially got you interested in networking and how did that lead you to the anti-abuse and distributed denial of service (DDoS) space?

My interest in large global network infrastructure started when I was a teenager in the 1990s. I remember reading a magazine at the time that cataloged all the large IP transit providers on the internet, complete with network topology maps and sizes of links. It inspired me to want to work on the engineering teams that supported that. Over time, I was fortunate enough to move from working at a small ISP to a telecom carrier where I was able to work on their point-of-presence (POP) and backbone designs. It was there that I learned about the internet peering ecosystem and started collaborating with network operators from around the globe.

For the last 20-plus years, DDoS has always been something I’ve had to deal with to some extent, namely from the networking lens of preventing network congestion through traffic engineering and capacity planning, as well as supporting the integration of DDoS traffic scrubbers into network infrastructure.

About three years ago, I became especially intrigued by the network abuse and DDoS space after using AWS network telemetry to observe the size of malicious events in the wild. I started to be interested in how mitigation could be improved, and how to break down the problem into smaller pieces to better understand the true sources of attack traffic. Instead of merely being an observer, I wanted to be a part of the solution and make it better. This required me to immerse myself into the domain, both from the perspective of learning the technical details and by getting hands-on and understanding the DDoS industry and environment as a whole. Part of how I did this was by engaging with my peers in the industry at other companies and absorbing years of knowledge from them.

How do you explain your job to your non-technical friends and family?

I try to explain both areas that I work on. First, that I help build the global network infrastructure that connects AWS and its customers to the rest of the world. I explain that for a home user to reach popular destinations hosted on AWS, data has to traverse a series of networks and physical cables that are interconnected so that the user’s home computer or mobile phone can send packets to another part of the world in less than a second. All that requires coordination with external networks, which have their own practices and policies on how they handle traffic. AWS has to navigate that complexity and build and operate our infrastructure with customer availability and security in mind. Second, when it comes to DDoS and network abuse, I explain that there are bad actors on the internet that use DDoS to cause impairment for a variety of reasons. It could be someone wanting to disrupt online gaming, video conferencing, or regular business operations for any given website or company. I work to prevent those events from causing any sort of impairment, and I trace attacks back to their sources so that the infrastructure launching them can be disrupted and made ineffective in the future.

Recently, you were awarded the J.D. Falk Award by the Messaging Malware Mobile Anti-Abuse Working Group (M3AAWG) for IP Spoofing Mitigation. Congratulations! Please tell us more about the efforts that led to this.

Basically, there are three main types of DDoS attacks we observe: botnet-based unicast floods, spoofed amplification/reflection attacks, and proxy-driven HTTP request floods. The amplification/reflection aspect is interesting because it requires DDoS infrastructure providers to acquire compute resources behind providers that permit IP spoofing. IP spoofing itself has a long history on the internet, with a request for comment/best current practice (RFC/BCP) first written back in 2000 recommending that providers prevent this from occurring. However, adoption of this practice is still spotty on the internet.

At NANOG76, there was a proposal that these sorts of spoofed attacks could be traced by network operators in the path of the pre-amplification/reflection traffic (before it bounced off the reflectors). I personally started getting involved in this effort about two years ago. AWS operates a large global network and has network telemetry data that would help me identify pre-amplification/reflection traffic entering our network. This would allow me to triangulate the source network generating this traffic. I then started engaging directly with the various networks that we connect to and providing them with timestamps, spoofed source IP addresses, and the specific protocols and ports involved with the traffic, hoping they could use their network telemetry to identify the customer generating it. From there, they’d engage with their customer to get the source shut down or, failing that, implement packet filters on their customer’s traffic to prevent spoofing.

Initially, only a few networks were capable of doing this well. This meant I had to spend a fair amount of energy in educating various networks around the globe on what spoofed traffic is, how to use their network telemetry to find it, and how to handle it. This was the most complicated and challenging part because this wasn’t on the radar of many networks out there. Up to this time, frontline network operations and abuse teams at various networks, including some very large ones, were not proficient in dealing with this.

The education I did included a variety of engagements, including sharing drawings with the day-in-the-life of a spoofed packet in a reflection attack, providing instructions on how to use their network telemetry tools, connecting them with their network telemetry vendors to help them, and even going so far as using other more exotic methods to identify which of their customers were spoofing and pointing out who they needed to analyze more deeply. In the end, it’s about getting other networks to be responsive and take action and, in the best cases, find spoofing on their own and act upon it.

Incredible! How did it feel accepting the award at the M3AAWG General Meeting in Brooklyn?

It was an honor to accept it and see some acknowledgement for the behind-the-scenes work that goes on to make the internet a better place.

What’s next for you in your work to suppress IP spoofing?

Continuing the tracing exercises and engaging with external providers, in particular the network providers that experience challenges in dealing with spoofing, so we can help improve their operations. Also, finding more effective ways to educate the hosting providers where IP spoofing is a common issue and getting them to implement proper default controls that don’t allow this behavior. Another aspect is being a force multiplier to enable others to spread the word and be part of the education process.

Looking ahead, what are some of your other goals for improving users’ online experiences and security?

Continually focusing on improving our DDoS defense strategies and working with customers to build tailored solutions that address some of their unique requirements. Across AWS, we have many services that are architected in different ways, so a key part of this is how do we raise the bar from a DDoS defense perspective across each of them. AWS customers also have their own unique architecture and protocols that can require developing new solutions to address their specific needs. On the disruption front, we will continue to focus on disrupting DDoS-as-a-service provider infrastructure beyond disrupting spoofing to disrupting botnets and the infrastructure associated with HTTP request floods.

With HTTP request floods being much more popular than byte-heavy and packet-heavy threat methods, it’s important to highlight the risks open proxies on the internet pose. Some of this emphasizes why there need to be some defaults in software packages to prevent misuse, in addition to network operators proactively identifying open proxies and taking appropriate action. Hosting providers should also recognize when their customer resources are communicating with large fleets of proxies and consider taking appropriate mitigations.

What are the most critical skills people need to be successful in network security?

I’m a huge proponent of being hands-on and diving into problems to truly understand how things are operating. Putting yourself outside your comfort zone, diving deep into the data to understand something, and translating that into outcomes and actions is something I highly encourage. After you immerse yourself in a particular domain, you can be much more effective at developing strategies and rapid prototyping to move forward. You can make incremental progress with small actions. You don’t have to wait for the perfect and complete solution to make some progress. I also encourage collaboration with others because there is incredible value in seeking out diverse opinions. There are resources out there to engage with, provided you’re willing to put in the work to learn and determine how you want to give back. The best people I’ve worked with don’t do it for public attention, blog posts, or social media status. They work in the background and don’t expect anything in return. They do it because of their desire to protect their customers and, where possible, the internet at large.

Lastly, if you had to pick an industry outside of security for your career, what would you be doing?

I’m over the maximum age allowed to start as an air traffic controller, so I suppose an air transport pilot or a locomotive engineer would be pretty neat.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Tom Scholl

Tom Scholl

Tom is Vice President and Distinguished Engineer at AWS.

Amanda Mahoney

Amanda Mahoney

Amanda is the public relations lead for the AWS portfolio of security, identity, and compliance services. She joined AWS in September 2022.

Writing IAM Policies: Grant Access to User-Specific Folders in an Amazon S3 Bucket

Post Syndicated from Dylan Souvage original https://aws.amazon.com/blogs/security/writing-iam-policies-grant-access-to-user-specific-folders-in-an-amazon-s3-bucket/

November 14, 2023: We’ve updated this post to use IAM Identity Center and follow updated IAM best practices.

In this post, we discuss the concept of folders in Amazon Simple Storage Service (Amazon S3) and how to use policies to restrict access to these folders. The idea is that by properly managing permissions, you can allow federated users to have full access to their respective folders and no access to the rest of the folders.

Overview

Imagine you have a team of developers named Adele, Bob, and David. Each of them has a dedicated folder in a shared S3 bucket, and they should only have access to their respective folders. These users are authenticated through AWS IAM Identity Center (successor to AWS Single Sign-On).

In this post, you’ll focus on David. You’ll walk through the process of setting up these permissions for David using IAM Identity Center and Amazon S3. Before you get started, let’s first discuss what is meant by folders in Amazon S3, because it’s not as straightforward as it might seem. To learn how to create a policy with folder-level permissions, you’ll walk through a scenario similar to what many people have done on existing file shares, where every IAM Identity Center user has access to only their own home folder. With folder-level permissions, you can granularly control who has access to which objects in a specific bucket.

You’ll be shown a policy that grants IAM Identity Center users access to the same Amazon S3 bucket so that they can use the AWS Management Console to store their information. The policy allows users in the company to upload or download files from their department’s folder, but not to access any other department’s folder in the bucket.

After the policy is explained, you’ll see how to create an individual policy for each IAM Identity Center user.

Throughout the rest of this post, you will use a policy, which will be associated with an IAM Identity Center user named David. Also, you must have already created an S3 bucket.

Note: S3 buckets have a global namespace and you must change the bucket name to a unique name.

For this blog post, you will need an S3 bucket with the following structure (the example bucket name for the rest of the blog is “my-new-company-123456789”):

/home/Adele/
/home/Bob/
/home/David/
/confidential/
/root-file.txt

Figure 1: Screenshot of the root of the my-new-company-123456789 bucket

Figure 1: Screenshot of the root of the my-new-company-123456789 bucket

Your S3 bucket structure should have two folders, home and confidential, with a file root-file.txt in the main bucket directory. Inside confidential you will have no items or folders. Inside home there should be three sub-folders: Adele, Bob, and David.

Figure 2: Screenshot of the home/ directory of the my-new-company-123456789 bucket

Figure 2: Screenshot of the home/ directory of the my-new-company-123456789 bucket

A brief lesson about Amazon S3 objects

Before explaining the policy, it’s important to review how Amazon S3 objects are named. This brief description isn’t comprehensive, but will help you understand how the policy works. If you already know about Amazon S3 objects and prefixes, skip ahead to Creating David in Identity Center.

Amazon S3 stores data in a flat structure; you create a bucket, and the bucket stores objects. S3 doesn’t have a hierarchy of sub-buckets or folders; however, tools like the console can emulate a folder hierarchy to present folders in a bucket by using the names of objects (also known as keys). When you create a folder in S3, S3 creates a 0-byte object with a key that references the folder name that you provided. For example, if you create a folder named photos in your bucket, the S3 console creates a 0-byte object with the key photos/. The console creates this object to support the idea of folders. The S3 console treats all objects that have a forward slash (/) character as the last (trailing) character in the key name as a folder (for example, examplekeyname/).

To give you an example, for an object that’s named home/common/shared.txt, the console will show the shared.txt file in the common folder in the home folder. The names of these folders (such as home/ or home/common/) are called prefixes, and prefixes like these are what you use to specify David’s department folder in his policy. By the way, the slash (/) in a prefix like home/ isn’t a reserved character — you could name an object (using the Amazon S3 API) with prefixes such as home:common:shared.txt or home-common-shared.txt. However, the convention is to use a slash as the delimiter, and the Amazon S3 console (but not S3 itself) treats the slash as a special character for showing objects in folders. For more information on organizing objects in the S3 console using folders, see Organizing objects in the Amazon S3 console by using folders.
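
To make this concrete, here is a minimal boto3 sketch showing how the console’s folder behavior maps to object keys. It’s illustrative only; the bucket name amzn-s3-demo-bucket is a placeholder and isn’t part of this walkthrough:

import boto3

s3 = boto3.client("s3")
bucket = "amzn-s3-demo-bucket"  # placeholder bucket name, not used elsewhere in this post

# Creating a "folder" is just creating a 0-byte object whose key ends in a slash
s3.put_object(Bucket=bucket, Key="photos/")

# An object "inside" that folder is simply a key that shares the photos/ prefix
s3.put_object(Bucket=bucket, Key="photos/cat.jpg", Body=b"example-bytes")

# Listing with a prefix and delimiter mimics browsing a folder: Contents holds the
# objects directly under photos/, and CommonPrefixes holds the emulated sub-folders
response = s3.list_objects_v2(Bucket=bucket, Prefix="photos/", Delimiter="/")
for obj in response.get("Contents", []):
    print("object:", obj["Key"])
for prefix in response.get("CommonPrefixes", []):
    print("sub-folder:", prefix["Prefix"])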

Creating David in Identity Center

IAM Identity Center helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. Identity Center is the recommended approach for workforce authentication and authorization on AWS for organizations of any size and type. Using Identity Center, you can create and manage user identities in AWS, or connect your existing identity source, including Microsoft Active Directory, Okta, Ping Identity, JumpCloud, Google Workspace, and Azure Active Directory (Azure AD). For further reading on IAM Identity Center, see the Identity Center getting started page.

Begin by setting up David as an IAM Identity Center user. To start, open the AWS Management Console and go to IAM Identity Center and create a user.

Note: The following steps are for Identity Center without System for Cross-domain Identity Management (SCIM) turned on; the Add user option won’t be available if SCIM is turned on.

  1. From the left pane of the Identity Center console, select Users, and then choose Add user.
    Figure 3: Screenshot of IAM Identity Center Users page.

    Figure 3: Screenshot of IAM Identity Center Users page.

  2. Enter David as the Username, enter an email address that you have access to as you will need this later to confirm your user, and then enter a First name, Last name, and Display name.
  3. Leave the rest as default and choose Add user.
  4. Select Users from the left navigation pane and verify you’ve created the user David.
    Figure 4: Screenshot of adding users to group in Identity Center.

    Figure 4: Screenshot of adding users to group in Identity Center.

  5. Now that you’ve verified that the user David has been created, use the left pane to navigate to Permission sets, then choose Create permission set.
    Figure 5: Screenshot of permission sets in Identity Center.

    Figure 5: Screenshot of permission sets in Identity Center.

  6. Select Custom permission set as your Permission set type, then choose Next.
    Figure 6: Screenshot of permission set types in Identity Center.

    Figure 6: Screenshot of permission set types in Identity Center.

David’s policy

This is David’s complete policy, which will be associated with an IAM Identity Center federated user named David by using the console. This policy grants David full console access to only his folder (/home/David) and no one else’s. While you could grant each user access to their own bucket, keep in mind that an AWS account can have up to 100 buckets by default. By creating home folders and granting the appropriate permissions, you can instead allow thousands of users to share a single bucket.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowUserToSeeBucketListInTheConsole",
      "Action": ["s3:ListAllMyBuckets", "s3:GetBucketLocation"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::*"]
    },
    {
      "Sid": "AllowRootAndHomeListingOfCompanyBucket",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my-new-company-123456789"],
      "Condition": {"StringEquals": {"s3:prefix": ["", "home/", "home/David"], "s3:delimiter": ["/"]}}
    },
    {
      "Sid": "AllowListingOfUserFolder",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my-new-company-123456789"],
      "Condition": {"StringLike": {"s3:prefix": ["home/David/*"]}}
    },
    {
      "Sid": "AllowAllS3ActionsInUserFolder",
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": ["arn:aws:s3:::my-new-company-123456789/home/David/*"]
    }
  ]
}
  1. Now, copy and paste the preceding IAM Policy into the inline policy editor. In this case, you use the JSON editor. For information on creating policies, see Creating IAM policies.
    Figure 7: Screenshot of the inline policy inside the permissions set in Identity Center.

    Figure 7: Screenshot of the inline policy inside the permissions set in Identity Center.

  2. Give your permission set a name and a description, then leave the rest at the default settings and choose Next.
  3. Verify that you’ve modified the policy to use the bucket name that you created earlier.
  4. After your permission set has been created, navigate to AWS accounts on the left navigation pane, then select Assign users or groups.
    Figure 8: Screenshot of the AWS accounts in Identity Center.

    Figure 8: Screenshot of the AWS accounts in Identity Center.

  5. Select the user David and choose Next.
    Figure 9: Screenshot of the AWS accounts in Identity Center.

    Figure 9: Screenshot of the AWS accounts in Identity Center.

  6. Select the permission set you created earlier, choose Next, leave the rest at the default settings and choose Submit.
    Figure 10: Screenshot of the permission sets in Identity Center.

    Figure 10: Screenshot of the permission sets in Identity Center.

    You’ve now created and attached the permissions required for David to view his S3 bucket folder, but not to view the objects in other users’ folders. You can verify this by signing in as David through the AWS access portal.

    Figure 11: Screenshot of the settings summary in Identity Center.

    Figure 11: Screenshot of the settings summary in Identity Center.

  7. Navigate to the dashboard in IAM Identity Center and go to the Settings summary, then choose the AWS access portal URL.
    Figure 12: Screenshot of David signing into the console via the Identity Center dashboard URL.

    Figure 12: Screenshot of David signing into the console via the Identity Center dashboard URL.

  8. Sign in as the user David with the one-time password you received earlier when creating David.
    Figure 13: Second screenshot of David signing into the console through the Identity Center dashboard URL.

    Figure 13: Second screenshot of David signing into the console through the Identity Center dashboard URL.

  9. Open the Amazon S3 console.
  10. Search for the bucket you created earlier.
    Figure 14: Screenshot of my-new-company-123456789 bucket in the AWS console.

    Figure 14: Screenshot of my-new-company-123456789 bucket in the AWS console.

  11. Navigate to David’s folder and verify that you have read and write access to the folder. If you navigate to other users’ folders, you’ll find that you don’t have access to the objects inside their folders.

David’s policy consists of four blocks; let’s look at each individually.

Block 1: Allow required Amazon S3 console permissions

Before you begin identifying the specific folders David can have access to, you must give him two permissions that are required for Amazon S3 console access: ListAllMyBuckets and GetBucketLocation.

   {
      "Sid": "AllowUserToSeeBucketListInTheConsole",
      "Action": ["s3:GetBucketLocation", "s3:ListAllMyBuckets"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::*"]
   }

The ListAllMyBuckets action grants David permission to list all the buckets in the AWS account, which is required for navigating to buckets in the Amazon S3 console (and as an aside, you currently can’t selectively filter out certain buckets, so users must have permission to list all buckets for console access). The console also does a GetBucketLocation call when users initially navigate to the Amazon S3 console, which is why David also requires permission for that action. Without these two actions, David will get an access denied error in the console.

Block 2: Allow listing objects in root and home folders

Although David should have access to only his home folder, he requires additional permissions so that he can navigate to his folder in the Amazon S3 console. David needs permission to list objects at the root level of the my-new-company-123456789 bucket and to the home/ folder. The following policy grants these permissions to David:

   {
      "Sid": "AllowRootAndHomeListingOfCompanyBucket",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my-new-company-123456789"],
      "Condition":{"StringEquals":{"s3:prefix":["","home/", "home/David"],"s3:delimiter":["/"]}}
   }

Without the ListBucket permission, David can’t navigate to his folder because he won’t have permissions to view the contents of the root and home folders. When David tries to use the console to view the contents of the my-new-company-123456789 bucket, the console will return an access denied error. Although this policy grants David permission to list all objects in the root and home folders, he won’t be able to view the contents of any files or folders except his own (you specify these permissions in the next block).

This block includes conditions, which let you limit under what conditions a request to AWS is valid. In this case, David can list objects in the my-new-company-123456789 bucket only when he requests objects without a prefix (objects at the root level) and objects with the home/ prefix (objects in the home folder). If David tries to navigate to other folders, such as confidential/, David is denied access. Additionally, David needs permissions to list prefix home/David to be able to use the search functionality of the console instead of scrolling down the list of users’ folders.

To set these root and home folder permissions, I used two conditions: s3:prefix and s3:delimiter. The s3:prefix condition specifies the folders that David has ListBucket permissions for. For example, David can list the following files and folders in the my-new-company-123456789 bucket:

/root-file.txt
/confidential/
/home/Adele/
/home/Bob/
/home/David/

But David cannot list files or subfolders in the confidential/, home/Adele/, or home/Bob/ folders.

Although the s3:delimiter condition isn’t required for console access, it’s still a good practice to include it in case David makes requests by using the API. As previously noted, the delimiter is a character—such as a slash (/)—that identifies the folder that an object is in. The delimiter is useful when you want to list objects as if they were in a file system. For example, let’s assume the my-new-company-123456789 bucket stored thousands of objects. If David includes the delimiter in his requests, he can limit the number of returned objects to just the names of files and subfolders in the folder he specified. Without the delimiter, in addition to every file in the folder he specified, David would get a list of all files in any subfolders.
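
If you want to observe this behavior outside the console, a quick boto3 sketch like the following (using the example bucket name from this post and credentials that allow listing) shows the difference the delimiter makes:

import boto3

s3 = boto3.client("s3")
bucket = "my-new-company-123456789"  # replace with your own unique bucket name

# With a delimiter, listing home/ returns only the immediate sub-folders
# (CommonPrefixes) plus any objects directly under home/
with_delimiter = s3.list_objects_v2(Bucket=bucket, Prefix="home/", Delimiter="/")
print([p["Prefix"] for p in with_delimiter.get("CommonPrefixes", [])])

# Without a delimiter, listing home/ returns every object under it,
# including the files inside each user's sub-folder
without_delimiter = s3.list_objects_v2(Bucket=bucket, Prefix="home/")
print([obj["Key"] for obj in without_delimiter.get("Contents", [])])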

Block 3: Allow listing objects in David’s folder

In addition to the root and home folders, David requires access to all objects in the home/David/ folder and any subfolders that he might create. Here’s a policy that allows this:

{
      "Sid": "AllowListingOfUserFolder",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my-new-company-123456789"],
      "Condition": {"StringLike": {"s3:prefix": ["home/David/*"]}}
    }

In the condition above, you use a StringLike expression in combination with the asterisk (*) to represent an object in David’s folder, where the asterisk acts as a wildcard. That way, David can list files and folders in his folder (home/David/). You couldn’t include this condition in the previous block (AllowRootAndHomeListingOfCompanyBucket) because it used the StringEquals expression, which would interpret the asterisk (*) as a literal character, not as a wildcard.

In the next section, the AllowAllS3ActionsInUserFolder block, you’ll see that the Resource element specifies my-new-company-123456789/home/David/*, which looks like the condition that I specified in this section. You might think that you can similarly use the Resource element to specify David’s folder in this block. However, the ListBucket action is a bucket-level operation, meaning the Resource element for the ListBucket action applies only to bucket names and doesn’t take folder names into account. So, to limit actions at the object level (files and folders), you must use conditions.

Block 4: Allow all Amazon S3 actions in David’s folder

Finally, you specify David’s actions (such as read, write, and delete permissions) and limit them to just his home folder, as shown in the following policy:

    {
      "Sid": "AllowAllS3ActionsInUserFolder",
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": ["arn:aws:s3:::my-new-company-123456789/home/David/*"]
    }

For the Action element, you specified s3:*, which means David has permission to do all Amazon S3 actions. In the Resource element, you specified David’s folder with an asterisk (*) (a wildcard) so that David can perform actions on the folder and inside the folder. For example, David has permission to change his folder’s storage class. David also has permission to upload files, delete files, and create subfolders in his folder (perform actions in the folder).

An easier way to manage policies with policy variables

In David’s folder-level policy you specified David’s home folder. If you wanted a similar policy for users like Bob and Adele, you’d have to create separate policies that specify their home folders. Instead of creating individual policies for each IAM Identity Center user, you can use policy variables and create a single policy that applies to multiple users (a group policy). Policy variables act as placeholders. When you make a request to a service in AWS, the placeholder is replaced by a value from the request when the policy is evaluated.

For example, you can use the previous policy and replace David’s user name with a variable that uses the requester’s user name through attributes and PrincipalTag as shown in the following policy (copy this policy to use in the procedure that follows):

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "AllowUserToSeeBucketListInTheConsole",
			"Action": [
				"s3:ListAllMyBuckets",
				"s3:GetBucketLocation"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::*"
			]
		},
		{
			"Sid": "AllowRootAndHomeListingOfCompanyBucket",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789"
			],
			"Condition": {
				"StringEquals": {
					"s3:prefix": [
						"",
						"home/",
						"home/${aws:PrincipalTag/userName}"
					],
					"s3:delimiter": [
						"/"
					]
				}
			}
		},
		{
			"Sid": "AllowListingOfUserFolder",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789"
			],
			"Condition": {
				"StringLike": {
					"s3:prefix": [
						"home/${aws:PrincipalTag/userName}/*"
					]
				}
			}
		},
		{
			"Sid": "AllowAllS3ActionsInUserFolder",
			"Effect": "Allow",
			"Action": [
				"s3:*"
			],
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789/home/${aws:PrincipalTag/userName}/*"
			]
		}
	]
}
  1. To implement this policy with variables, begin by opening the IAM Identity Center console using the main AWS admin account (ensuring you’re not signed in as David).
  2. Select Settings on the left-hand side, then select the Attributes for access control tab.
    Figure 15: Screenshot of Settings inside Identity Center.

    Figure 15: Screenshot of Settings inside Identity Center.

  3. Create a new attribute for access control, entering userName as the Key and ${path:userName} as the Value, then choose Save changes. This will add a session tag to your Identity Center user and allow you to use that tag in an IAM policy.
    Figure 16: Screenshot of managing attributes inside Identity Center settings.

    Figure 16: Screenshot of managing attributes inside Identity Center settings.

  4. To edit David’s permissions, go back to the IAM Identity Center console and select Permission sets.
    Figure 17: Screenshot of permission sets inside Identity Center with Davids-Permissions selected.

    Figure 17: Screenshot of permission sets inside Identity Center with Davids-Permissions selected.

  5. Select David’s permission set that you created previously.
  6. Select Inline policy and then choose Edit to update David’s policy by replacing it with the modified policy that you copied at the beginning of this section, which will resolve to David’s username.
    Figure 18: Screenshot of David’s policy inside his permission set inside Identity Center.

    Figure 18: Screenshot of David’s policy inside his permission set inside Identity Center.

You can validate that this is set up correctly by signing in as David through the Identity Center dashboard as you did before and verifying that you have access to the David folder and not to the Bob or Adele folders.

Figure 19: Screenshot of David’s S3 folder with access to a .jpg file inside.

Figure 19: Screenshot of David’s S3 folder with access to a .jpg file inside.

Whenever a user makes a request to AWS, the variable is replaced by the user name of whoever made the request. For example, when David makes a request, ${aws:PrincipalTag/userName} resolves to David; when Adele makes the request, ${aws:PrincipalTag/userName} resolves to Adele.

If this is the route you use to grant access, it’s important to control and limit who can set this userName tag on an IAM principal: anyone who can set the tag can effectively read and write to any of these bucket prefixes, so you must protect both the bucket prefixes and the ability to set the tags. For more information, see What is ABAC for AWS, and the Attribute-based access control User Guide.

Conclusion

By using Amazon S3 folders, you can follow the principle of least privilege and verify that the right users have access to what they need, and only to what they need.

See the following example policy that only allows API access to the buckets, and only allows for adding, deleting, restoring, and listing objects inside the folders:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAllS3ActionsInUserFolder",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:DeleteObjectTagging",
                "s3:DeleteObjectVersion",
                "s3:DeleteObjectVersionTagging",
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:GetObjectVersion",
                "s3:GetObjectVersionTagging",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:PutObjectTagging",
                "s3:PutObjectVersionTagging",
                "s3:RestoreObject"
            ],
            "Resource": [
		   "arn:aws:s3:::my-new-company-123456789",
                "arn:aws:s3:::my-new-company-123456789/home/${aws:PrincipalTag/userName}/*"
            ],
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "home/${aws:PrincipalTag/userName}/*"
                    ]
                }
            }
        }
    ]
}

We encourage you to think about what policies your users might need and restrict the access by only explicitly allowing what is needed.

There are additional resources available for learning about Amazon S3 folders and about IAM policies, and be sure to get involved at the AWS community forums.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Dylan Souvage

Dylan Souvage

Dylan is a Solutions Architect based in Toronto, Canada. Dylan loves working with customers to understand their business needs and enable them in their cloud journey. In his spare time, he enjoys going out in nature, going on long road trips, and traveling to warm, sunny places.

Abhra Sinha

Abhra Sinha

Abhra is a Toronto-based Senior Solutions Architect at AWS. Abhra enjoys being a trusted advisor to customers, working closely with them to solve their technical challenges and help build a secure scalable architecture on AWS. In his spare time, he enjoys Photography and exploring new restaurants.

Divyajeet Singh

Divyajeet Singh

Divyajeet (DJ) is a Sr. Solutions Architect at AWS Canada. He loves working with customers to help them solve their unique business challenges using the cloud. In his free time, he enjoys spending time with family and friends, and exploring new places.

Use private key JWT authentication between Amazon Cognito user pools and an OIDC IdP

Post Syndicated from Martin Pagel original https://aws.amazon.com/blogs/security/use-private-key-jwt-authentication-between-amazon-cognito-user-pools-and-an-oidc-idp/

With Amazon Cognito user pools, you can add user sign-up and sign-in features and control access to your web and mobile applications. You can enable your users who already have accounts with other identity providers (IdPs) to skip the sign-up step and sign in to your application by using an existing account through SAML 2.0 or OpenID Connect (OIDC). In this blog post, you will learn how to extend the authorization code grant between Cognito and an external OIDC IdP with private key JSON Web Token (JWT) client authentication.

For OIDC, Cognito uses the OAuth 2.0 authorization code grant flow as defined by the IETF in RFC 6749 Section 1.3.1. This flow can be broken down into two steps: user authentication and token request. When a user needs to authenticate through an external IdP, the Cognito user pool forwards the user to the IdP’s login endpoint. After successful authentication, the IdP sends back a response that includes an authorization code, which concludes the authentication step. The Cognito user pool now uses this code, together with a client secret for client authentication, to retrieve a JWT from the IdP. The JWT consists of an access token and an identity token. Cognito ingests that JWT, creates or updates the user in the user pool, and returns a JWT it has created for the client’s session, to the client. You can find a more detailed description of this flow in the Amazon Cognito documentation.

Although this flow sufficiently secures the requests between Cognito and the IdP for most customers, those in the public sector, healthcare, and finance sometimes need to integrate with IdPs that enforce additional security measures as part of their security requirements. In the past, this has come up in conversations at AWS when our customers needed to integrate Cognito with, for example, the HelseID (healthcare sector, Norway), login.gov (public sector, USA), or GOV.UK One Login (public sector, UK) IdPs. Customers who are using Okta, PingFederate, or similar IdPs and want additional security measures as part of their internal security requirements, might also find adding further security requirements desirable as part of their own policies.

The most common additional requirement is to replace the client secret with an assertion that consists of a private key JWT as a means of client authentication during token requests. This method is defined through a combination of RFC 7521 and RFC 7523. Instead of a symmetric key (the client secret), this method uses an asymmetric key-pair to sign a JWT with a private key. The IdP can then verify the token request by validating the signature of that JWT using the corresponding public key. This helps to eliminate the exposure of the client secret with every request, thereby reducing the risk of request forgery, depending on the quality of the key material that was used and how access to the private key is secured. Additionally, the JWT has an expiry time, which further constrains the risk of replay attacks to a narrow time window.

A Cognito user pool does not natively support private key JWT client authentication when integrating with an external IdP. However, you can still integrate Cognito user pools with IdPs that support or require private key JWT authentication by using Amazon API Gateway and AWS Lambda.

This blog post presents a high-level overview of how you can implement this solution. To learn more about the underlying code, how to configure the included services, and what the detailed request flow looks like, check out the Deploy a demo section later in this post. Keep in mind that this solution does not cover the request flow between your own application and a Cognito user pool, but only the communication between Cognito and the IdP.

Solution overview

Following the technical implementation details of the previously mentioned RFCs, the required request flow between a Cognito user pool and the external OIDC IdP can be broken down into four simplified steps, shown in Figure 1.

Figure 1: Simplified UML diagram of the target implementation for using a private key JWT during the authorization code grant

Figure 1: Simplified UML diagram of the target implementation for using a private key JWT during the authorization code grant

In this example, we’re using the Cognito user pool hosted UI—because it already provides OAuth 2.0-aligned IdP integration—and extending it with the private key JWT. Figure 1 illustrates the following steps:

  1. The hosted UI forwards the user client to the /authorize endpoint of the external OIDC IdP with an HTTP GET request.
  2. After the user successfully logs into the IdP, the IdP‘s response includes an authorization code.
  3. The hosted UI sends this code in an HTTP POST request to the IdP’s /token endpoint. By default, the hosted UI also adds a client secret for client authentication. To align with the private key JWT authentication method, you need to replace the client secret with a client assertion and specify the client assertion type, as highlighted in the diagram and further described later.
  4. The IdP validates the client assertion by using a pre-shared public key.
  5. The IdP issues the user’s JWT, which Cognito ingests to create or update the user in the user pool.

As mentioned earlier, token requests between a Cognito user pool and an external IdP do not natively support the required client assertion. However, you can redirect the token requests to, for example, an Amazon API Gateway, which invokes a Lambda function to extend the request with the new parameters. Because you need to sign the client assertion with a private key, you also need a secure location to store this key. For this, you can use AWS Secrets Manager, which helps you to secure the key from unauthorized use. With the required flow and additional services in mind, you can create the following architecture.

Figure 2: Architecture diagram with Amazon API Gateway and Lambda to process token requests between Cognito and the OIDC identity provider

Figure 2: Architecture diagram with Amazon API Gateway and Lambda to process token requests between Cognito and the OIDC identity provider

Let’s have a closer look at the individual components and the request flow that are shown in Figure 2.

When adding an OIDC IdP to a Cognito user pool, you configure endpoints for Authorization, UserInfo, Jwks_uri, and Token. Because the private key is required only for the token request flow, you can configure resources to redirect and process requests, as follows (the step numbers correspond to the step numbering in Figure 2):

  1. Configure the endpoints for Authorization, UserInfo, and Jwks_Uri with the ones from the IdP.
  2. Create an API Gateway with a dedicated route for token requests (for example, /token) and add it as the Token endpoint in the IdP configuration in Cognito.
  3. Integrate this route with a Lambda function: When Cognito calls the API endpoint, it will automatically invoke the function.

Together with the original request parameters, which include the authorization code, this function does the following:

  1. Retrieves the private key from Secrets Manager.
  2. Creates and signs the client assertion.
  3. Makes the token request to the IdP token endpoint.
  4. Receives the response from the IdP.
  5. Returns the response to the Cognito IdP response endpoint.

The details of the function logic can be broken down into the following:

  • Decode the body of the original request—this includes the authorization code that was acquired during the authorize flow.
    import base64
    
    encoded_message = event["body"]
    decoded_message = base64.b64decode(encoded_message)
    decoded_message = decoded_message.decode("utf-8")
    

  • Retrieve the private key from Secrets Manager by using the GetSecretValue API or SDK equivalent or by using the AWS Parameters and Secrets Lambda Extension.
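    A minimal boto3 sketch of that retrieval might look like the following; the secret name private-key-jwt-signing-key is a placeholder, and the secret is assumed to hold the key in JWK format:
    import boto3
    import json
    
    # Retrieve the private key (stored as a JWK); the secret name below is hypothetical
    secrets_client = boto3.client("secretsmanager")
    secret_response = secrets_client.get_secret_value(SecretId="private-key-jwt-signing-key")
    private_key_dict = json.loads(secret_response["SecretString"])
    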
  • Create and sign the JWT.
    import jwt # third party library – requires a Lambda Layer
    import time
    
    instance = jwt.JWT()
    private_key_jwt = instance.encode({
        "iss": <Issuer. Contains IdP client ID>,
        "sub": <Subject. Contains IdP client ID>,
        "aud": <Audience. IdP token endpoint>,
        "iat": int(time.time()),
        "exp": int(time.time()) + 300
    },
        {<YOUR RETRIEVED PRIVATE KEY IN JWK FORMAT>},
        alg='RS256',
        optional_headers = {"kid": private_key_dict["kid"]}
    )
    

  • Modify the original body and make the token request, including the original parameters for grant_type, code, and client_id, with added client_assertion_type and the client_assertion. (The following example HTTP request has line breaks and placeholders in angle brackets for better readability.)
    POST /token HTTP/1.1
      Scheme: https
      Host: your-idp.example.com
      Content-Type: application/x-www-form-urlencoded
    
      grant_type=authorization_code&
        code=<the authorization code>&
        client_id=<IdP client ID>&
        client_assertion_type=
        urn%3Aietf%3Aparams%3Aoauth%3Aclient-assertion-type%3Ajwt-bearer&
        client_assertion=<signed JWT>
    

  • Return the IdP’ s response.

Note that there is no client secret needed in this request. Instead, you add a client assertion type as urn:ietf:params:oauth:client-assertion-type:jwt-bearer, and the client assertion with the signed JWT.

If the request is successful, the IdP’s response includes a JWT with the access token and identity token. On returning the response via the Lambda function, Cognito ingests the JWT and creates or updates the user in the user pool. It then responds to the original authorize request of the user client by sending its own authorization code, which can be exchanged for a Cognito issued JWT in your own application.

Deploy a demo

To deploy an example of this solution, see our GitHub repository. You will find the prerequisites and deployment steps there, as well as additional in-depth information.

Additional considerations

To further optimize this solution, you should consider checking the event details in the Lambda function before fully processing the requests. This way, you can, for example, check that all required parameters are present and valid. One option to do that is to define a client secret when you create the IdP integration for the user pool. When Cognito sends the token request, it adds the client secret in the encoded body, so you can retrieve it and validate its value. If the validation fails, requests can be dropped early to improve exception handling and to prevent invalid requests from causing unnecessary function charges.
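
As a rough sketch of that early validation, the Lambda function could check the decoded body before doing any signing. The following is illustrative only; it assumes the expected client secret is made available to the function (for example, through an environment variable or a Secrets Manager lookup):

import base64
import hmac
import os
from urllib.parse import parse_qs

def is_valid_token_request(event):
    """Return True only if the request carries the expected parameters and client secret."""
    decoded_body = base64.b64decode(event["body"]).decode("utf-8")
    params = parse_qs(decoded_body)

    # Require the parameters Cognito normally sends with a token request
    for required in ("grant_type", "code", "client_id", "client_secret"):
        if required not in params:
            return False

    # EXPECTED_CLIENT_SECRET is an assumed environment variable for this sketch
    expected = os.environ.get("EXPECTED_CLIENT_SECRET", "")
    # Constant-time comparison avoids leaking information through timing differences
    return hmac.compare_digest(params["client_secret"][0], expected)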

In this example, we used Secrets Manager to store the private key. You can explore alternatives, such as AWS Systems Manager Parameter Store or AWS Key Management Service (AWS KMS). To retrieve the key from the Parameter Store, you can use the SDK or the AWS Parameters and Secrets Lambda Extension. With AWS KMS, you can both create and store the private key as well as derive a public key through the service’s APIs, and you can also use the signing API to sign the JWT in the Lambda function.
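
If you go the AWS KMS route, the signing step could call the KMS Sign API instead of handling the private key in the function. The following sketch is an assumption-laden illustration: the key alias is a placeholder, the key is assumed to be an asymmetric RSA signing key, and your IdP may additionally require a kid header that identifies the registered public key:

import base64
import json
import boto3

kms = boto3.client("kms")
KMS_KEY_ID = "alias/private-key-jwt"  # placeholder alias for an RSA sign/verify key

def b64url(data: bytes) -> str:
    # JWTs use unpadded base64url encoding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def sign_client_assertion(claims: dict) -> str:
    header = {"alg": "RS256", "typ": "JWT"}  # add a "kid" here if your IdP requires it
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())

    # The private key never leaves KMS; only the signature is returned
    response = kms.sign(
        KeyId=KMS_KEY_ID,
        Message=signing_input.encode("utf-8"),
        MessageType="RAW",
        SigningAlgorithm="RSASSA_PKCS1_V1_5_SHA_256",
    )
    return signing_input + "." + b64url(response["Signature"])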

Conclusion

By redirecting the IdP token endpoint in the Cognito user pool’s external OIDC IdP configuration to a route in an API Gateway, you can use Lambda functions to customize the request flow between Cognito and the IdP. In the example in this post, we showed how to change the client authentication mechanism during the token request from a client secret to a client assertion with a signed JWT (private key JWT). You can also apply the same proxy-like approach to customize the request flow even further—for example, by adding a Proof Key for Code Exchange (PKCE), for which you can find an example in the aws-samples GitHub repository.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post for Amazon Cognito User Pools or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Martin Pagel

Martin Pagel

Martin is a Solutions Architect in Norway who specializes in security and compliance, with a focus on identity. Before joining AWS, he walked in the shoes of a systems administrator and engineer, designing and implementing the technical side of a wide variety of compliance controls at companies in automotive, fintech, and healthcare. Today, he helps our customers across the Nordic region to achieve their security and compliance objectives.

Set up AWS Private Certificate Authority to issue certificates for use with IAM Roles Anywhere

Post Syndicated from Chris Sciarrino original https://aws.amazon.com/blogs/security/set-up-aws-private-certificate-authority-to-issue-certificates-for-use-with-iam-roles-anywhere/

Traditionally, applications or systems—defined as pieces of autonomous logic functioning without direct user interaction—have faced challenges associated with long-lived credentials such as access keys. In certain circumstances, long-lived credentials can increase operational overhead and the scope of impact in the event of an inadvertent disclosure.

To help mitigate these risks and follow the best practice of using short-term credentials, Amazon Web Services (AWS) introduced IAM Roles Anywhere, a feature of AWS Identity and Access Management (IAM). With the introduction of IAM Roles Anywhere, systems running outside of AWS can exchange X.509 certificates to assume an IAM role and receive temporary IAM credentials from AWS Security Token Service (AWS STS).

You can use IAM Roles Anywhere to help you implement a secure and manageable authentication method. It uses the same IAM policies and roles as within AWS, simplifying governance and policy management across hybrid cloud environments. Additionally, the certificates used in this process come with a built-in validity period defined when the certificate request is created, enhancing the security by providing a time-limited trust for the identities. Furthermore, customers in high security environments can optionally keep private keys for the certificates stored in PKCS #11-compatible hardware security modules for extra protection.

For organizations that lack an existing public key infrastructure (PKI), AWS Private Certificate Authority allows for the creation of a certificate hierarchy without the complexity of self-hosting a PKI.

With the introduction of IAM Roles Anywhere, there is now an accompanying requirement to manage certificates and their lifecycle. AWS Private CA is an AWS managed service that can issue X.509 certificates for hosts, which makes it ideal for use with IAM Roles Anywhere. However, AWS Private CA doesn’t natively deploy certificates to hosts.

Certificate deployment is an essential part of managing the certificate lifecycle for IAM Roles Anywhere, and its absence can lead to operational inefficiencies. Fortunately, there is a solution: by using AWS Systems Manager with its Run Command capability, you can automate issuing and renewing certificates from AWS Private CA. This simplifies the management of IAM Roles Anywhere at scale.

In this blog post, we walk you through an architectural pattern that uses AWS Private CA and Systems Manager to automate issuing and renewing X.509 certificates. This pattern smooths the integration of non-AWS hosts with IAM Roles Anywhere and can help you replace long-term credentials while reducing the operational complexity of IAM Roles Anywhere through certificate vending automation.

While IAM Roles Anywhere supports both Windows and Linux, this solution is designed for a Linux environment. Windows users integrating with Active Directory should check out the AWS Private CA Connector for Active Directory. By implementing this architectural pattern, you can distribute certificates to your non-AWS Linux hosts, thereby enabling them to use IAM Roles Anywhere. This approach can help you simplify certificate management tasks.

Architecture overview

Figure 1: Architecture overview

Figure 1: Architecture overview

The architectural pattern we propose (Figure 1) is composed of multiple stages, involving AWS services including Amazon EventBridge, AWS Lambda, Amazon DynamoDB, and Systems Manager.

  1. Amazon EventBridge Scheduler invokes a Lambda function called CertCheck twice daily.
  2. The Lambda function scans a DynamoDB table to identify instances that require certificate management. It specifically targets instances managed by Systems Manager, which the administrator populates into the table.
  3. CertCheck receives the information about instances that have no certificate and instances that require new certificates because their existing ones are close to expiry.
  4. Depending on the certificate’s expiration date for a particular instance, a second Lambda function called CertIssue is launched.
  5. CertIssue instructs Systems Manager to apply a run command on the instance.
  6. Run Command generates a certificate signing request (CSR) and a private key on the instance.
  7. Systems Manager retrieves the CSR; the private key remains securely on the instance.
  8. CertIssue then retrieves the CSR from Systems Manager.
  9. CertIssue uses the CSR to request a signed certificate from AWS Private CA.
  10. On successful certificate issuance, AWS Private CA creates an event through EventBridge that contains the ID of the newly issued certificate.
  11. This event subsequently invokes a third Lambda function called CertDeploy.
  12. CertDeploy retrieves the certificate from AWS Private CA and invokes Systems Manager to launch Run Command with the certificate data and updates the certificate’s expiration date in the DynamoDB table for future reference.
  13. Run Command conducts a brief test to verify the certificate’s functionality, and upon success, stores the signed certificate on the instance.
  14. The instance can then exchange the certificate for AWS credentials through IAM Roles Anywhere.

Additionally, on a certificate rotation failure, an Amazon Simple Notification Service (Amazon SNS) notification is delivered to an email address specified during the AWS CloudFormation deployment.

The solution enables periodic certificate rotation. If a certificate is nearing expiration, the process initiates the generation of a new private key and CSR, thus issuing a new certificate. Newly generated certificates, private keys, and CSRs replace the existing ones.

With certificates in place, they can be used by IAM Roles Anywhere to obtain short-term IAM credentials. For more details on setting up IAM Roles Anywhere, see the IAM Roles Anywhere User Guide.
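
For context on how a host consumes these certificates, the AWS CLI or SDK on the instance is typically configured to call the IAM Roles Anywhere signing helper through credential_process. The following profile is only a sketch that reuses the default paths from this solution’s CloudFormation parameters; the ARNs and file names are placeholders, and you should confirm the exact helper flags against the version of the signing helper you install:

[profile roles-anywhere]
credential_process = /root/aws_signing_helper credential-process --certificate /tmp/certificate.pem --private-key /tmp/private_key.pem --trust-anchor-arn <trust-anchor-arn> --profile-arn <profile-arn> --role-arn <role-arn>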

Costs

Although this solution offers significant benefits, it’s important to consider the associated costs before you deploy. To provide a cost estimate, managing certificates for 100 hosts would cost approximately $85 per month. However, for a larger deployment of 1,100 hosts with the Systems Manager advanced tier, the cost would be around $5,937 per month. These pricing estimates include the rotation of certificates six times a month.

AWS Private CA in short-lived mode incurs a monthly charge of $50, and each certificate issuance costs $0.058. Systems Manager Hybrid Activation standard has no additional cost for managing fewer than 1,000 hosts. If you have more than 1,000 hosts, the advanced plan must be used at an approximate cost of $5 per host per month. DynamoDB, Amazon SNS, and Lambda costs should be under $5 per month per service for under 1,000 hosts. For environments with over 1,000 hosts, it might be worthwhile to explore other options for machine-to-machine authentication or other ways of distributing certificates.

Please note that the estimated pricing mentioned here is specific to the us-east-1 AWS Region and can be calculated for other regions using the AWS Pricing Calculator.

Prerequisites

To follow along, you should already have Git and the AWS CLI installed on the system that you plan to deploy from, an existing S3 bucket for CloudFormation to upload the Lambda package to, and the Linux hosts that you want to manage with Systems Manager.

Enabling Systems Manager hybrid activation

To create a hybrid activation, follow these steps:

  1. Open the AWS Management Console for Systems Manager, go to Hybrid activations and choose Create an Activation.
    Figure 2: Hybrid activation page

    Figure 2: Hybrid activation page

  2. Enter a description [optional] for the activation and adjust the Instance limit value to the maximum you need, then choose Create activation.
    Figure 3: Create hybrid activation

  3. This gives you a green banner with an Activation Code and Activation ID. Make a note of these.
    Figure 4: Successful hybrid activation with activation code and ID

  4. Install the AWS Systems Manager Agent (SSM Agent) on the hosts to be managed. Follow the instructions for the appropriate operating system. In the example commands, replace <activation-code>, <activation-id>, and <region> with the activation code and ID from the previous step and your Region. Here is an example of commands to run for an Ubuntu host:
    mkdir /tmp/ssm
    
    curl https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/debian_amd64/amazon-ssm-agent.deb -o /tmp/ssm/amazon-ssm-agent.deb
    
    sudo dpkg -i /tmp/ssm/amazon-ssm-agent.deb
    
    sudo service amazon-ssm-agent stop
    
    sudo -E amazon-ssm-agent -register -code "<activation-code>" -id "<activation-id>" -region <region> 
    
    sudo service amazon-ssm-agent start
    

You should see a message confirming the instance was successfully registered with Systems Manager.

Note: If you receive errors during Systems Manager registration about the Region having invalid characters, verify that the Region is not in quotation marks.

Deploy with CloudFormation

We’ve created a Git repository with a CloudFormation template that sets up the aforementioned architecture. An existing S3 bucket is required for CloudFormation to upload the Lambda package.

To launch the CloudFormation stack:

  1. Clone the Git repository that contains the CloudFormation template and the Lambda function code.
    git clone https://github.com/aws-samples/aws-privateca-certificate-deployment-automator.git
    

  2. cd into the directory created by Git.
    cd aws-privateca-certificate-deployment-automator
    

  3. Launch the CloudFormation stack within the cloned Git directory using the cf_template.yaml file, replacing <DOC-EXAMPLE-BUCKET> with the name of your S3 bucket from the prerequisites.
    aws cloudformation package --template-file cf_template.yaml --output-template-file packaged.yaml --s3-bucket <DOC-EXAMPLE-BUCKET>
    

Note: Run these commands on the system that you plan to use to deploy the CloudFormation stack; it must have Git and the AWS CLI installed.

After the CloudFormation package command succeeds, run the CloudFormation deploy command. The template supports various parameters to change the paths where the certificates and keys are generated. Adjust the paths as needed with the parameter-overrides flag, but verify that they exist on the hosts. Replace the <email> placeholder with an email address that should receive alerts about rotation failures. The stack name must be in lowercase.

aws cloudformation deploy --template packaged.yaml --stack-name ssm-pca-stack --capabilities CAPABILITY_NAMED_IAM --parameter-overrides SNSSubscriberEmail=<email>

The available CloudFormation parameters are listed in the following table:

Parameter Default value Use
AWSSigningHelperPath /root Default path on the host for the AWS Signing Helper binary
CACertPath /tmp Default path on the host the CA certificate will be created in
CACertValidity 10 Default CA certificate length in years
CACommonName ca.example.com Default CA certificate common name
CACountry US Default CA certificate country code
CertPath /tmp Default path on the host the certificates will be created in
CSRPath /tmp Default path on the host the CSRs will be created in
KeyAlgorithm RSA_2048 Default algorithm used to create the CA private key
KeyPath /tmp Default path on the host the private keys will be created in
OrgName Example Corp Default CA certificate organization name
SigningAlgorithm SHA256WITHRSA Default CA signing algorithm for issued certificates

After the CloudFormation stack is ready, manually add the hosts requiring certificate management into the DynamoDB table.

You will also receive an email at the email address specified to accept the SNS topic subscription. Make sure to choose the Confirm Subscription link as shown in Figure 5.

Figure 5: SNS topic subscription confirmation

Add data to the DynamoDB table

  1. Open the AWS Systems Manager console and select Fleet Manager.
  2. Choose Managed Nodes and copy the Node ID value shown in Fleet Manager (see Figure 6); this is the host ID that you will use in a subsequent step.
    Figure 6: Systems Manager Node ID

  3. Open the DynamoDB console and select Dashboard and then Tables in the left navigation pane.
    Figure 7: DynamoDB menu

  4. Select the certificates table.
    Figure 8: DynamoDB tables

  5. Choose Explore table items and then choose Create item.
  6. Enter the node ID that you copied in step 2 as the value for the hostID attribute.
    Figure 9: DynamoDB table hostID attribute creation

Additional string attributes listed in the following table can be added to the item to specify paths for the certificates on a per-host basis. If these attributes aren't created, either the default paths or the overrides in the CloudFormation parameters will be used.

Additional supported attributes Use
certPath Path on the host the certificate will be created in
keyPath Path on the host the private key will be created in
signinghelperPath Path on the host for the AWS Signing Helper binary
cacertPath Path on the host the CA certificate will be created in
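If you manage many hosts, you can add these items programmatically instead of through the console. The following is a minimal boto3 sketch; it assumes the table is named certificates and uses hostID as the partition key (as shown in the console steps above), and the node ID and path values are placeholders.

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("certificates")  # table name assumed from the console steps above

table.put_item(
    Item={
        "hostID": "mi-0123456789abcdef0",   # node ID copied from Fleet Manager (placeholder value)
        "certPath": "/etc/pki/tls/certs",   # optional per-host override
        "keyPath": "/etc/pki/tls/private",  # optional per-host override
    }
)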

The CertCheck Lambda function created by the CloudFormation template runs twice daily to verify that the certificates for these hosts are kept up to date. If necessary, you can use the Lambda invoke command to run the Lambda function on-demand.

aws lambda invoke --function-name CertCheck-Trigger --cli-binary-format raw-in-base64-out response.json

The certificate expiration and serial number metadata are stored in the certificates DynamoDB table. Select the certificates table and choose Explore table items to view the data.

Figure 10: DynamoDB table item with certificate expiration and serial for a host

Validation

To validate successful certificate deployment, you should find four files in the location specified in the CloudFormation parameter or DynamoDB table attribute, as shown in the following table.

File Use Location
{host}.crt The certificate containing the public key, signed by AWS Private CA. certPath attribute in DynamoDB. Otherwise, default specified by the CertPath CF parameter.
ca_chain_certificate.crt The certificate chain including intermediates from AWS Private CA. cacertPath attribute in DynamoDB. Otherwise, default specified by the CACertPath CF parameter.
{host}.key The private key for the certificate. keyPath attribute in DynamoDB. Otherwise, default specified by the KeyPath CF parameter.
{host}.csr The CSR used to generate the signed certificate. Default specified by the CSRPath CF parameter.

These certificates can now be used to configure the host for IAM Roles Anywhere. See Obtaining temporary security credentials from AWS Identity and Access Management Roles Anywhere for using the signing helper tool provided by IAM Roles Anywhere. The signing helper must be installed on the instance for the validation to work. You can pass the location of the signing helper as a parameter to the CloudFormation template.

Note: As a security best practice, it's important to use permissions and ACLs to keep the private key secure and restrict access to it. The automation creates the private key with chmod 400 permissions. The chmod command changes the permissions of a file or directory; mode 400 allows only the file's owner to read the file and blocks writing, running, and access by anyone else.
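To spot-check a deployed certificate, you can also inspect it directly on the host. The following is a small sketch using the Python cryptography package; the file path and host name are placeholders, so substitute the certPath and host name from your own deployment.

from cryptography import x509

# Placeholder path; use the certPath and host name from your deployment.
with open("/tmp/myhost.example.com.crt", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

print("Subject:  ", cert.subject.rfc4514_string())
print("Not after:", cert.not_valid_after)             # expiration date
print("Serial:   ", format(cert.serial_number, "x"))  # serial number in hex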

Revoke a certificate

AWS Private CA also supports generating a certificate revocation list (CRL), which can be imported to IAM Roles Anywhere. The CloudFormation template automatically sets up the CRL process between AWS Private CA and IAM Roles Anywhere.

Figure 11: Certificate revocation process

Within 30 minutes after revocation, AWS Private CA generates a CRL file and uploads it to the CRL S3 bucket that was created by the CloudFormation template. Then, the CRLProcessor Lambda function receives a notification through EventBridge of the new CRL file and passes it to the IAM Roles Anywhere API.

To revoke a certificate, use the AWS CLI. In the following example, replace <certificate-authority-arn>, <certificate-serial>, and <revocation-reason> with your own information.

aws acm-pca revoke-certificate --certificate-authority-arn <certificate-authority-arn> --certificate-serial <certificate-serial> --revocation-reason <revocation-reason>

The AWS Private CA ARN can be found in the CloudFormation stack outputs under the name PCAARN. The certificate serial number for each host is listed in the DynamoDB table, as previously mentioned. The revocation reason can be one of the following values:

  • UNSPECIFIED
  • KEY_COMPROMISE
  • CERTIFICATE_AUTHORITY_COMPROMISE
  • AFFILIATION_CHANGED
  • SUPERSEDED
  • CESSATION_OF_OPERATION
  • PRIVILEGE_WITHDRAWN
  • A_A_COMPROMISE

Revoking a certificate won’t automatically generate a new certificate for the host. See Manually rotate certificates.

Manually rotate certificates

The certificates are set to expire weekly and are rotated the day of expiration. If you need to manually replace a certificate sooner, remove the expiration date for the host’s record in the DynamoDB table (see Figure 12). On the next run of the Lambda function, the lack of an expiration date will cause the certificate for that host to be replaced. To immediately renew a certificate or test the rotation function, remove the expiration date from the DynamoDB table and run the following Lambda invoke command. After the certificates have been rotated, the new expiration date will be listed in the table.

aws lambda invoke --function-name CertCheck-Trigger --cli-binary-format raw-in-base64-out response.json
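If you prefer to script this manual rotation, the following boto3 sketch performs the same two actions. The table name (certificates) and partition key (hostID) come from earlier in this post; the expiration attribute name is a placeholder, so check your table items for the exact name used by the solution.

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("certificates")

# Remove the stored expiration date so the next run treats the certificate as due for rotation.
# "expiration" is a placeholder attribute name; confirm it against your table items.
table.update_item(
    Key={"hostID": "mi-0123456789abcdef0"},
    UpdateExpression="REMOVE #exp",
    ExpressionAttributeNames={"#exp": "expiration"},
)

# Trigger the rotation check immediately instead of waiting for the twice-daily schedule.
boto3.client("lambda").invoke(FunctionName="CertCheck-Trigger")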

Conclusion

By using AWS IAM Roles Anywhere, systems outside of AWS can exchange short-term credentials in the form of X.509 certificates for AWS STS credentials. This can help you improve your security posture in a hybrid environment by reducing the use of long-term access keys as credentials.

For organizations without an existing enterprise PKI, the solution described in this post provides an automated method of generating and rotating certificates using AWS Private CA and AWS Systems Manager. We showed you how you can use Systems Manager to set up a non-AWS host with certificates for use with IAM Roles Anywhere and ensure they’re rotated regularly.

Deploy this solution today and move towards IAM Roles Anywhere to remove long-term credentials for programmatic access. For more information, see the IAM Roles Anywhere blog article or post your queries on AWS re:Post.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on IAM re:Post or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Chris Sciarrino

Chris is a Senior Solutions Architect and a member of the AWS security field community based in Toronto, Canada. He works with enterprise customers helping them design solutions on AWS. Outside of work, Chris enjoys spending his time hiking and skiing with friends and listening to audiobooks.

Ravikant Sharma

Ravikant is a Solutions Architect based in London. He specializes in cloud security and financial services. He helps Fintech and Web3 startups build and scale their business using AWS Cloud. Prior to AWS, he worked at Citi Singapore as Vice President – API management, where he played a pivotal role in implementing Open API security frameworks and Open Banking regulations.

Rahul Gautam

Rahul is a Security and Compliance Specialist Solutions Architect based in London. He helps customers in adopting AWS security services to meet and improve their security posture in the cloud. Before joining the SSA team, Rahul spent 5 years as a Cloud Support Engineer in AWS Premium Support. Outside of work, Rahul enjoys travelling as much as he can.

AWS KMS is now FIPS 140-2 Security Level 3. What does this mean for you?

Post Syndicated from Rushir Patel original https://aws.amazon.com/blogs/security/aws-kms-now-fips-140-2-level-3-what-does-this-mean-for-you/

AWS Key Management Service (AWS KMS) recently announced that its hardware security modules (HSMs) were given Federal Information Processing Standards (FIPS) 140-2 Security Level 3 certification from the U.S. National Institute of Standards and Technology (NIST). For organizations that rely on AWS cryptographic services, this higher security level validation has several benefits, including simpler set up and operation. In this post, we will share more details about the recent change in FIPS validation status for AWS KMS and explain the benefits to customers using AWS cryptographic services as a result of this change.

Background on NIST FIPS 140

The FIPS 140 framework provides guidelines and requirements for cryptographic modules that protect sensitive information. FIPS 140 is the industry standard in the US and Canada and is recognized around the world as providing authoritative certification and validation for the way that cryptographic modules are designed, implemented, and tested against NIST cryptographic security guidelines.

Organizations follow FIPS 140 to help ensure that their cryptographic security is aligned with government standards. FIPS 140 validation is also required in certain fields such as manufacturing, healthcare, and finance and is included in several industry and regulatory compliance frameworks, such as the Payment Card Industry Data Security Standard (PCI DSS), the Federal Risk and Authorization Management Program (FedRAMP), and the Health Information Trust Alliance (HITRUST) framework. FIPS 140 validation is recognized in many jurisdictions around the world, so organizations that operate globally can use FIPS 140 certification internationally.

For more information on FIPS Security Levels and requirements, see FIPS Pub 140-2: Security Requirements for Cryptographic Modules.

What FIPS 140-2 Security Level 3 means for AWS KMS and you

Until recently, AWS KMS had been validated at Security Level 2 overall and at Security Level 3 in the following four sub-categories:

  • Cryptographic module specification
  • Roles, services, and authentication
  • Physical security
  • Design assurance

The latest certification from NIST means that AWS KMS is now validated at Security Level 3 overall in each sub-category. As a result, AWS assumes more of the shared responsibility model, which will benefit customers for certain use cases. Security Level 3 certification can assist organizations seeking compliance with several industry and regulatory standards. Even though FIPS 140 validation is not expressly required in a number of regulatory regimes, maintaining stronger, easier-to-use encryption can be a powerful tool for complying with FedRAMP, U.S. Department of Defense (DOD) Approved Product List (APL), HIPAA, PCI, the European Union’s General Data Protection Regulation (GDPR), and the ISO 27001 standard for security management best practices and comprehensive security controls.

Customers who previously needed to meet compliance requirements for FIPS 140-2 Level 3 on AWS were required to use AWS CloudHSM, a single-tenant HSM solution that provides dedicated HSMs instead of managed service HSMs. Now, customers who were using CloudHSM to help meet their compliance obligations for Level 3 validation can use AWS KMS by itself for key generation and usage. Compared to CloudHSM, AWS KMS is typically lower cost and easier to set up and operate as a managed service, and using AWS KMS shifts the responsibility for creating and controlling encryption keys and operating HSMs from the customer to AWS. This allows you to focus resources on your core business instead of on undifferentiated HSM infrastructure management tasks.

AWS KMS uses FIPS 140-2 Level 3 validated HSMs to help protect your keys when you request the service to create keys on your behalf or when you import them. The HSMs in AWS KMS are designed so that no one, not even AWS employees, can retrieve your plaintext keys. Your plaintext keys are never written to disk and are only used in volatile memory of the HSMs while performing your requested cryptographic operation.
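Nothing changes in how you call AWS KMS to benefit from the Level 3 validation, because the validated HSMs sit behind the standard API. If you also want your API calls to terminate on AWS FIPS endpoints, you can point the SDK at the KMS FIPS endpoint for your Region. The following is a brief boto3 sketch; the endpoint shown is the documented FIPS endpoint for us-east-1, and the key description and plaintext are placeholders.

import boto3

# FIPS endpoint for us-east-1; see the AWS documentation for other Regions' FIPS endpoints.
kms = boto3.client(
    "kms",
    region_name="us-east-1",
    endpoint_url="https://kms-fips.us-east-1.amazonaws.com",
)

key = kms.create_key(Description="example key protected by AWS KMS HSMs")
result = kms.encrypt(KeyId=key["KeyMetadata"]["KeyId"], Plaintext=b"hello world")
print(result["CiphertextBlob"][:16])  # first bytes of the ciphertext blob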

The FIPS 140-2 Level 3 certified HSMs in AWS KMS are deployed in all AWS Regions, including the AWS GovCloud (US) Regions. The China (Beijing) and China (Ningxia) Regions do not support the FIPS 140-2 Cryptographic Module Validation Program. AWS KMS uses Office of the State Commercial Cryptography Administration (OSCCA) certified HSMs to protect KMS keys in China Regions. The certificate for the AWS KMS FIPS 140-2 Security Level 3 validation is available on the NIST Cryptographic Module Validation Program website.

As with many industry and regulatory frameworks, FIPS 140 is evolving. NIST approved and published a new updated version of the 140 standard, FIPS 140-3, which supersedes FIPS 140-2. The U.S. government has begun transitioning to the FIPS 140-3 cryptography standard, with NIST announcing that they will retire all FIPS 140-2 certificates on September 22, 2026. NIST recently validated AWS-LC under FIPS 140-3 and is currently in the process of evaluating AWS KMS and certain instance types of AWS CloudHSM under the FIPS 140-3 standard. To check the status of these evaluations, see the NIST Modules In Process List.

For more information on FIPS 140-3, see FIPS Pub 140-3: Security Requirements for Cryptographic Modules.

Legal Disclaimer

This document is provided for the purposes of information only; it is not legal advice, and should not be relied on as legal advice. Customers are responsible for making their own independent assessment of the information in this document. This document: (a) is for informational purposes only, (b) represents current AWS product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.

AWS encourages its customers to obtain appropriate advice on their implementation of privacy and data protection environments, and more generally, applicable laws and other obligations relevant to their business.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Rushir Patel

Rushir is a Senior Security Specialist at AWS, focused on data protection and cryptography services. His goal is to make complex topics simple for customers and help them adopt better security practices. Before joining AWS, he worked in security product management at IBM and Bank of America.

Rohit Panjala

Rohit is a Worldwide Security GTM Specialist at AWS, focused on data protection and cryptography services. He is responsible for developing and implementing go-to-market (GTM) strategies and sales plays and driving customer and partner engagements for AWS data protection services on a global scale. Before joining AWS, Rohit worked in security product management and electrical engineering roles.

How to improve your security incident response processes with Jupyter notebooks

Post Syndicated from Tim Manik original https://aws.amazon.com/blogs/security/how-to-improve-your-security-incident-response-processes-with-jupyter-notebooks/

Customers face a number of challenges to quickly and effectively respond to a security event. To start, it can be difficult to standardize how to respond to a particular security event, such as an Amazon GuardDuty finding. Additionally, silos can form with reliance on one security analyst who is designated to perform certain tasks, such as investigate all GuardDuty findings. Jupyter notebooks can help you address these challenges by simplifying both standardization and collaboration.

Jupyter Notebook is an open-source, web-based application to run and document code. Although Jupyter notebooks are most frequently used for data science and machine learning, you can also use them to more efficiently and effectively investigate and respond to security events.

In this blog post, we will show you how to use Jupyter Notebook to investigate a security event. With this solution, you can automate the tasks of gathering data, presenting the data, and providing procedures and next steps for the findings.

Benefits of using Jupyter notebooks for security incident response

The following are some ways that you can use Jupyter notebooks for security incident response:

  • Develop readable code for analysts – Within a notebook, you can combine markdown text and code cells to improve readability. Analysts can read context around the code cell, run the code cell, and analyze the results within the notebook.
  • Standardize analysis and response – You can reuse notebooks after the initial creation. This makes it simpler for you to standardize your incident response processes for how to respond to a certain type of security event. Additionally, you can use notebooks to achieve repeatable responses. You can rerun an entire notebook or a specific cell.
  • Collaborate and share incident response knowledge – After you create a Jupyter notebook, you can share it with peers to more seamlessly collaborate and share knowledge, which helps reduce silos and reliance on certain analysts.
  • Iterate on your incident response playbooks – Developing a security incident response program involves continuous iteration. With Jupyter notebooks, you can start small and iterate on what you have developed. You can keep Jupyter notebooks under source code control by using services such as AWS CodeCommit. This allows you to approve and track changes to your notebooks.

Architecture overview

Figure 1: Architecture for incident response analysis

The architecture shown in Figure 1 consists of the foundational services required to analyze and contain security incidents on AWS. You create and access the playbooks through the Jupyter console that is hosted on Amazon SageMaker. Within the playbooks, you run several Amazon Athena queries against AWS CloudTrail logs hosted in Amazon Simple Storage Service (Amazon S3).

Solution implementation

To deploy the solution, you will complete the following steps:

  1. Deploy a SageMaker notebook instance
  2. Create an Athena table for your CloudTrail trail
  3. Grant AWS Lake Formation access
  4. Access the Credential Compromise playbooks by using JupyterLab

Step 1: Deploy a SageMaker notebook instance

You will host your Jupyter notebooks on a SageMaker notebook instance. We chose to use SageMaker instead of running the notebooks locally because SageMaker provides flexible compute, seamless integration with CodeCommit and GitHub, temporary credentials through AWS Identity and Access Management (IAM) roles, and lower latency for Athena queries.

You can deploy the SageMaker notebook instance by using the AWS CloudFormation template from our jupyter-notebook-for-incident-response GitHub repository. We recommend that you deploy SageMaker in your security tooling account or an equivalent.

The CloudFormation template deploys the following resources:

  • A SageMaker notebook instance to run the analysis notebooks. Because this is a proof of concept (POC), the deployed SageMaker instance is the smallest instance type available. However, within an enterprise environment, you will likely need a larger instance type.
  • An AWS Key Management Service (AWS KMS) key to encrypt the SageMaker notebook instance and protect sensitive data.
  • An IAM role that grants the SageMaker notebook permissions to query CloudTrail, VPC Flow Logs, and other log sources.
  • An IAM role that allows access to the pre-signed URL of the SageMaker notebook from only an allowlisted IP range.
  • A VPC configured for SageMaker with an internet gateway, NAT gateway, and VPC endpoints to access required AWS services securely. The internet gateway and NAT gateway provide internet access to install external packages.
  • An S3 bucket to store results for your Athena log queries—you will reference the S3 bucket in the next step.

Step 2: Create an Athena table for your CloudTrail trail

The solution uses Athena to query CloudTrail logs, so you need to create an Athena table for CloudTrail.

There are two main ways to create an Athena table for CloudTrail:

  • Deploy the AWS Security Analytics Bootstrap, which creates prebuilt Athena tables for CloudTrail (and, optionally, VPC flow logs and DNS logs).
  • Use the CloudTrail console to create the Athena table directly from your trail.

For either of these methods, you need to provide the URI of an S3 bucket. For this blog post, use the URI of the S3 bucket that the CloudFormation template created in Step 1. To find the URI of the S3 bucket, see the Outputs section of the CloudFormation stack.

Step 3: Grant AWS Lake Formation access

If you don't use AWS Lake Formation in your AWS environment, skip to Step 4. Otherwise, continue with the following instructions, because Lake Formation manages data access control for your Athena tables.

To grant permission to the Security Log database

  1. Open the Lake Formation console.
  2. Select the database that you created in Step 2 for your security logs. If you used the Security Analytics Bootstrap, then the table name is either security_analysis or a custom name that you provided—you can find the name in the CloudFormation stack. If you created the Athena table by using the CloudTrail console, then the database is named default.
  3. From the Actions dropdown, select Grant.
  4. In Grant data permissions, select IAM users and roles.
  5. Find the IAM role used by the SageMaker Notebook instance.
  6. In Database permissions, select Describe and then Grant.

To grant permission to the Security Log CloudTrail table

  1. Open the Lake Formation console.
  2. Select the database that you created in Step 2.
  3. Choose View Tables.
  4. Select CloudTrail. If you created VPC flow log and DNS log tables, select those, too.
  5. From the Actions dropdown, select Grant.
  6. In Grant data permissions, select IAM users and roles.
  7. Find the IAM role used by the SageMaker notebook instance.
  8. In Table permissions, select Describe and then Grant.
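If you prefer to script these grants rather than use the console, the following boto3 sketch performs the equivalent database and table grants. The role ARN, database name, and table name are placeholders; use the SageMaker notebook role and the names you created in Step 2.

import boto3

lakeformation = boto3.client("lakeformation")

# Placeholder values; replace with your notebook role ARN and the database/table from Step 2.
notebook_role_arn = "arn:aws:iam::111122223333:role/SageMakerNotebookRole"
database_name = "security_analysis"

# Grant DESCRIBE on the database.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": notebook_role_arn},
    Resource={"Database": {"Name": database_name}},
    Permissions=["DESCRIBE"],
)

# Grant DESCRIBE on the CloudTrail table.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": notebook_role_arn},
    Resource={"Table": {"DatabaseName": database_name, "Name": "cloudtrail"}},
    Permissions=["DESCRIBE"],
)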

Step 4: Access the Credential Compromise playbooks by using JupyterLab

The CloudFormation template clones the jupyter-notebook-for-incident-response GitHub repo into your Jupyter workspace.

You can access JupyterLab hosted on your SageMaker notebook instance by following the steps in the Access Notebook Instances documentation.

Your folder structure should match that shown in Figure 2. The parent folder should be jupyter-notebook-for-incident-response, and the child folders should be playbooks and cfn-templates.

Figure 2: Folder structure after GitHub repo is cloned to the environment

Sample investigation of a spike in failed login attempts

In the following sections, you will use the Jupyter notebook that we created to investigate a scenario where failed login attempts have spiked. We designed this notebook to guide you through the process of gathering more information about the spike.

We discuss the important components of these notebooks so that you can use the framework to create your own playbooks. We encourage you to build on top of the playbook, and add additional queries and steps in the playbook to customize it for your organization’s specific business and security goals.

For this blog post, we will focus primarily on the analysis phase of incident response and walk you through how you can use Jupyter notebooks to help with this phase.

Before you get started with the following steps, open the credential-compromise-analysis.ipynb notebook in your JupyterLab environment.

How to import Python libraries and set environment variables

The notebooks require that you have the following Python libraries:

  • Boto3 – to interact with AWS services through API calls
  • Pandas – to visualize the data
  • PyAthena – to simplify the code to connect to Athena

To install the required Python libraries, in the Setup section of the notebook, under Load libraries, edit the variables in the two code cells as follows:

  • region – specify the AWS Region that you want your AWS API commands to run in (for example, us-east-1).
  • athena_bucket – specify the S3 bucket URI that is configured to store your Athena queries. You can find this information at Athena > Query Editor > Settings > Query result location.
  • db_name – specify the database used by Athena that contains your Athena table for CloudTrail.

Figure 3: Load the Python libraries in the notebook

This helps ensure that the subsequent code cells run against your environment.

Run each code cell by choosing the cell and pressing SHIFT+ENTER or by choosing the play button (▶) in the toolbar at the top of the console.
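As a rough illustration, the setup cells might look something like the following once you've filled in your own values; the bucket URI and database name shown here are placeholders.

import boto3
import pandas as pd
from pyathena import connect

# Placeholder values; set these to match your environment.
region = "us-east-1"                                 # Region for AWS API calls
athena_bucket = "s3://amzn-s3-demo-bucket/athena/"   # Athena query result location
db_name = "security_analysis"                        # database that contains your CloudTrail table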

How to set up the helper function for Athena

The Python query_results function, shown in the following figure, helps you query Athena tables. Run this code cell. You will use the query_results function later in the 2.0 IAM Investigation section of the notebook.

Figure 4: Code cell for the helper function to query with Athena
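The exact implementation ships with the notebook in the GitHub repository, but a minimal version of the helper might look like the following sketch, which uses PyAthena and pandas together with the variables set in the previous section.

def query_results(sql):
    """Run a SQL statement in Athena and return the results as a pandas DataFrame."""
    conn = connect(s3_staging_dir=athena_bucket, region_name=region)
    return pd.read_sql(sql, conn)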

Credential Compromise Analysis Notebook

The credential-compromise-analysis.ipynb notebook includes several prebuilt queries to help you start your investigation of a potentially compromised credential. In this post, we discuss three of these queries:

  • The first query provides a broad view by retrieving the CloudTrail events related to authorization failures. By reviewing these results, you get baseline information about where users and roles are attempting to access resources or take actions without having the proper permissions.
  • The second query narrows the focus by identifying the top five IAM entities (such as users, roles, and identities) that are causing most of the authorization failures. Frequent failures from specific entities often indicate that their credentials are compromised.
  • The third query zooms in on one of the suspicious entities from the previous query. It retrieves API activity and events initiated by that entity across AWS services and resources. Analyzing the actions performed by a suspicious entity can reveal whether valid permissions are being misused or whether the entity is systematically trying to access resources it doesn't have access to.

Investigate authorization failures

The notebook has markdown cells that describe the expected result of the query. The next cell contains the query statement, and the final cell calls the query_results function to run the query in Athena and display the results in tabular format.

In query 2.1, you query for specific error codes such as AccessDenied and filter for IAM entities by looking for useridentity.arn like '%iam%'. The notebook orders the entries by eventTime. If you want to look for specific IAM Identity Center entities, update the query to filter by useridentity.sessioncontext.sessionissuer.arn like '%sso.amazonaws.com%'.

This query retrieves a list of failed API calls to AWS services. From this list, you can gain additional insight into the context surrounding the spike in failed login attempts.
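If you want to adapt query 2.1 outside the prebuilt notebook, a cell along these lines is a reasonable starting point. The table name cloudtrail is an assumption, and the column names follow the standard CloudTrail Athena schema; adjust them to match the table you created in Step 2.

# Assumed table name ("cloudtrail") and standard CloudTrail Athena columns.
query_2_1 = f"""
SELECT eventtime,
       useridentity.arn AS principal_arn,
       eventsource,
       eventname,
       errorcode,
       sourceipaddress,
       useragent
FROM {db_name}.cloudtrail
WHERE errorcode = 'AccessDenied'
  AND useridentity.arn LIKE '%iam%'
ORDER BY eventtime DESC
LIMIT 100
"""

authorization_failures = query_results(query_2_1)
authorization_failures.head()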

When you investigate denied API access requests, carefully examine details such as the user identity, timestamp, source IP address, and other metadata. This information helps you determine if the event is a legitimate threat or a false positive. Here are some specific questions to ask:

  • Does the IP address originate from within your network, or is it external? Internal addresses might be less concerning.
  • Is the access attempt occurring during normal working hours for that user? Requests outside of normal times might warrant more scrutiny.
  • What resources or changes is the user trying to access or make? Attempts to modify sensitive data or systems might indicate malicious intent.

By thoroughly evaluating the context around denied API calls, you can more accurately assess the risk they pose and whether you need to take further action. You can use the specifics in the logs to go beyond just the fact that access was denied, and learn the story of who, when, and why.

As shown in the following figure, the queries in the notebook use the following structure.

  1. Markdown cell to explain the purpose of the query (the query statement).
  2. Code cell to run the query and display the query results.

In the figure, the first code cell stores the query statement. After that cell finishes, the next code cell runs the query and displays the results.

Figure 5: Run predefined Athena queries in JupyterLab

Figure 6 shows the output of the query that you ran in the 2.1 Investigation Authorization Failures section. It contains critical details for understanding the context around a denied API call:

  • The eventtime field shows the date and time that the request was completed.
  • The useridentity field reveals which IAM identity made a request.
  • The sourceipaddress field provides the IP address that the request was made from.
  • The useragent shows which client or app was used to make the call.

Figure 6: Results from the first investigative query

Figure 6 only shows a subset of the many details captured in CloudTrail logs. By scrolling to the right in the query output, you can view additional attributes that provide further context around the event. The CloudTrail record contents guide contains a comprehensive list of the fields included in the logs, along with descriptions of each attribute.

Often, you will need to search for more information to determine if remediation is necessary. For this reason, we have included additional queries to help you further examine the sequence of events leading up to the failed login attempt spike and after the spike occurred.

Triaging suspicious entities (Queries 2.2 and 2.3)

By running the second and third queries you can dig deeper into anomalous authorization failures. As shown in Figure 7, Query 2.2 provides the top five IAM entities with the most frequent access denials. This highlights the specific users, roles, and identities causing the most failures, which indicates potentially compromised credentials.

Query 2.3 takes the investigation further by isolating the activity from one suspicious entity. Retrieving the actions attempted by a single problematic user or role reveals useful context to determine if you need to revoke credentials. For example, is the entity probing resources that it shouldn’t have access to? Are there unusual API calls outside of normal hours? By scrutinizing an entity’s full history, you can make an informed decision on remediation.
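In the same spirit, a simplified version of the "top five entities" query could look like the following sketch (again assuming a table named cloudtrail); the query shipped in the notebook may differ in its filters and columns.

query_2_2 = f"""
SELECT useridentity.arn AS principal_arn,
       count(*) AS denied_calls
FROM {db_name}.cloudtrail
WHERE errorcode = 'AccessDenied'
GROUP BY useridentity.arn
ORDER BY denied_calls DESC
LIMIT 5
"""

query_results(query_2_2)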

Figure 7: Overview of queries 2.2 and 2.3

You can use these two queries together to triage authorization failures: query 2.2 identifies high-risk entities, and query 2.3 gathers the intelligence to drive your response. This progression from a macro view to a micro view is crucial for transforming signals into action.

Although log analysis relies on automation and queries to facilitate insights, human judgment is essential to interpret these signals and determine the appropriate response. You should discuss flagged events with stakeholders and resource owners to benefit from their domain expertise. You can export the results of your analysis by exporting your Jupyter notebook.

By collaborating with other people, you can gather contextual clues that might not be captured in the raw data. For example, an owner might confirm that a suspicious login time is expected for users in a certain time zone. By pairing automated detection with human perspectives, you can accurately assess risk and decide whether credential revocation or other remediation is truly warranted. Technical signals alone can't dictate whether remediation is necessary; the human element provides pivotal context.

Build your own queries

In addition to the existing queries, you can write your own queries and include them in your copy of the credential-compromise-analysis.ipynb notebook. The AWS Security Analytics Bootstrap contains a library of common Athena queries for CloudTrail. We recommend that you review these queries before you start to build your own. The key takeaway is that these notebooks are highly customizable; you can use the Jupyter Notebook application to meet the specific incident response requirements of your organization.

Contain compromised IAM entities

If the investigation reveals that a compromised IAM entity requires containment, follow these steps to revoke access:

  • For federated users, revoke their active AWS sessions according to the guidance in How to revoke federated users’ active AWS sessions. This uses IAM policies and AWS Organizations service control policies (SCPs) to revoke access to assumed roles.
  • Avoid using long-lived IAM credentials such as access keys. Instead, use temporary credentials through IAM roles. However, if you detect a compromised access key, immediately rotate or deactivate it by following the guidance in What to Do If You Inadvertently Expose an AWS Access Key (a minimal sketch of deactivating a key follows this list). Review the permissions granted to the compromised IAM entity and consider if these permissions should be reduced after access is restored. Overly permissive policies might have enabled broader access for the threat actor.
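The following is the minimal boto3 sketch referenced in the list above for deactivating an exposed access key; the user name and access key ID are placeholders.

import boto3

iam = boto3.client("iam")

# Placeholder identifiers for the compromised credential.
iam.update_access_key(
    UserName="compromised-user",
    AccessKeyId="AKIAIOSFODNN7EXAMPLE",
    Status="Inactive",
)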

Going forward, implement least privilege access and monitor authorization activity to detect suspicious behavior. By quickly containing compromised entities and proactively improving IAM hygiene, you can minimize the adversaries’ access duration and prevent further unauthorized access.

Additional considerations

In addition to querying CloudTrail, you can use Athena to query other logs, such as VPC Flow Logs and Amazon Route 53 DNS logs. You can also use Amazon Security Lake, which is generally available, to automatically centralize security data from AWS environments, SaaS providers, on-premises environments, and cloud sources into a purpose-built data lake stored in your account. To better understand which logs to collect and analyze as part of your incident response process, see Logging strategies for security incident response.

We recommend that you understand the playbook implementation described in this blog post before you expand the scope of your incident response solution. Automating the running of queries and the containment steps are two elements to consider as you plan the next iterations of your incident response processes.

Conclusion

In this blog post, we showed how you can use Jupyter notebooks to simplify and standardize your incident response processes. You reviewed how to respond to a potential credential compromise incident using a Jupyter notebook style playbook. You also saw how this helps reduce the time to resolution and standardize the analysis and response. Finally, we presented several artifacts and recommendations showing how you can tailor this solution to meet your organization’s specific security needs. You can use this framework to evolve your incident response process.

Further resources

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Tim Manik

Tim is a Solutions Architect at AWS working with enterprise customers in North America. He specializes in cybersecurity and AI/ML. When he’s not working, you can find Tim exploring new hiking trails, swimming, or playing the guitar.

Daria Pshonkina

Daria is a Solutions Architect at AWS supporting enterprise customers that started their business journey in the cloud. Her specialization is in security. Outside of work, she enjoys outdoor activities, travelling, and spending quality time with friends and family.

Bryant Pickford

Bryant is a Security Specialist Solutions Architect within the Prototyping Security team under the Worldwide Specialist Organization. He has experience in incident response, threat detection, and perimeter security. In his free time, he likes to DJ, make music, and skydive.

AWS Weekly Roundup—Reserve GPU capacity for short ML workloads, Finch is GA, and more—November 6, 2023

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-reserve-gpu-capacity-for-short-ml-workloads-finch-is-ga-and-more-november-6-2023/

The year is coming to an end, and there are only 50 days until Christmas and 21 days to AWS re:Invent! If you are in Las Vegas, come and say hi to me. I will be around the Serverlesspresso booth most of the time.

Last week’s launches
Here are some launches that got my attention during the previous week.

Amazon EC2 – Amazon EC2 announced Capacity Blocks for ML. This means that you can now reserve GPU compute capacity for your short-duration ML workloads. Learn more about this launch on the feature page and announcement blog post.

Finch – Finch is now generally available. Finch is an open source tool for local container development on macOS (using Intel or Apple Silicon). It provides a command line developer tool for building, running, and publishing Linux containers on macOS. Learn more about Finch in this blog post written by Phil Estes or on the Finch website.

AWS X-Ray – AWS X-Ray now supports W3C format trace IDs for distributed tracing. AWS X-Ray supports trace IDs generated through OpenTelemetry or any other framework that conforms to the W3C Trace Context specification.

Amazon Translate – Amazon Translate introduces a brevity customization to reduce translation output length. This is a new feature that you can enable in your real-time translations where you need a shorter translation to meet caption size limits. This translation is not literal, but it will preserve the underlying message.

AWS IAM – IAM increased the actions last accessed to 60 more services. This functionality is very useful when fine-tuning the permissions of the roles, identifying unused permissions, and granting the least amount of permissions that your roles need.

AWS IAM Access Analyzer – IAM Access Analyzer policy generator expanded support to identify over 200 AWS services to help you create fine-grained policies based on your AWS CloudTrail access activity.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Some other news and blog posts that you may have missed:

AWS Compute Blog – Daniel Wirjo and Justin Plock wrote a very interesting article about how you can send and receive webhooks on AWS using different AWS serverless services. This is a good read if you are working with webhooks on your application, as it not only shows you how to build these solutions but also what considerations you should have when building them.

AWS Storage Blog – Bimal Gajjar and Andrew Peace wrote a very useful blog post about how to handle event ordering and duplicate events with Amazon S3 Event Notifications. This is a common challenge for many customers.

Amazon Science Blog – David Fan wrote an article about how to build better foundation models for video representation. This article is based on a paper that Prime Video presented at a conference about this topic.

The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in several languages. Check out the ones in French, German, Italian, and Spanish.

AWS open-source news and updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Ecuador (November 7), Mexico (November 11), Montevideo (November 14), Central Asia (Kazakhstan, Uzbekistan, Kyrgyzstan, and Mongolia on November 17–18), and Guatemala (November 18).

AWS re:Invent (November 27–December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Browse the session catalog and attendee guides and check out the highlights for generative artificial intelligence (AI).

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Marcia

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Build an entitlement service for business applications using Amazon Verified Permissions

Post Syndicated from Abdul Qadir original https://aws.amazon.com/blogs/security/build-an-entitlement-service-for-business-applications-using-amazon-verified-permissions/

Amazon Verified Permissions is designed to simplify the process of managing permissions within an application. In this blog post, we aim to help customers understand how this service can be applied to several business use cases.

Companies typically use custom entitlement logic embedded in their business applications. This is the most common approach, and it involves writing custom code to manage user access permissions. We’ll explore the common challenges faced by application developers and access administrators when handling user access permissions in an application and how Verified Permissions can help you solve these challenges. We’ll provide an integration guide for incorporating Verified Permissions into an entitlement service, specifically for use cases such as payment management. Finally, we’ll discuss the advantages of using a granular, adaptable, and externally managed access control system.

This blog post will provide a comprehensive and centralized approach to managing access policies, reducing administrative overhead, and empowering line-of-business users to define, administer, and enforce application entitlement policies.

Challenges of building an entitlement system

Entitlements refer to the rules that determine what each user can or cannot do within an application. Figure 1 shows the architecture of a common entitlement system, with components embedded in applications and entitlements stored in multiple data stores.

Figure 1: Typical entitlement system

Creating your own permissions management system can be resource-intensive, requiring time and expertise to ensure its effectiveness. Enterprises face many issues when building a custom entitlement management system, such as complexity, security risks, performance, and lack of scalability. Let’s delve into these issues in detail.

  • Data complexity – Entitlement decisions are often based on complex data relationships, such as user roles, group membership, and product permissions. Managing this complexity can be challenging, especially in a large organization with a lot of users, groups, and products.
  • Compliance and security – Building an entitlement system requires careful consideration of compliance regulations and security best practices. You need to protect user data, implement secure communication protocols, and handle potential security vulnerabilities.
  • Scalability – Permissions management systems must scale to handle a large number of users and transactions. This can be a challenge, especially if the service is used to control access to critical resources.
  • Performance and availability – Entitlement services need to be performant, because they are often used to make real-time decisions. Additionally, they need to be reliable and consistent, so that users can be confident that their entitlements are accurate.

Architecting an entitlement service using Amazon Verified Permissions

Amazon Verified Permissions is a scalable permissions management and fine-grained authorization service that helps you build and modernize applications without relying heavily on coding authorization within your applications.

Let’s discuss how you can use Verified Permissions to manage entitlements.

Creating and deploying policies

Verified Permissions uses Cedar, a policy language that allows developers to express permissions as policies that permit users or forbid them from doing certain tasks. A central policy-based authorization system gives developers a consistent way to define and manage fine-grained authorization across applications, simplifies changing permission rules without a need to change code, and improves visibility by moving permissions out of the code.

By using Verified Permissions, you can create specific permission policies that incorporate characteristics of role-based access control (RBAC) and attribute-based access control (ABAC). This approach enables you to implement granular controls while prioritizing the principle of least privilege.

Use case 1: Mary, who works as a clerk, can submit and view payments. Her role within the payment management system allows for multiple actions, and the policy for this role can be defined as follows.

permit (
    principal,
    action in [
        PaymentManager::Action::"SubmitPayment",
        PaymentManager::Action::"UpdatePayment",
        PaymentManager::Action::"ListPayment"
    ],
    resource
)
when { principal.role == "clerk" };

In contrast, Shirley is an auditor, with access that only allows her to list payments. The policy for this role is as follows.

permit (
    principal,
    action in [PaymentManager::Action::"ListPayment"],
    resource
)
when { principal.role == "auditor" };

The payment system will pass the principal, action, resource, and the entity data to Verified Permissions. If the user information is not explicitly defined within the application, the payment system must retrieve it from data stores such as an identity provider or database.

Following that, Verified Permissions assembles and evaluates the relevant policies (those that affect the calling principal and the resource in question) to decide whether the action should be permitted or denied. The decision is then returned to the application, which enforces it.
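To make this concrete, the following is a brief boto3 sketch of an is_authorized call for the payment example. The policy store ID, entity type names, and resource ID are placeholders; they must match the schema that you define in your own policy store.

import boto3

avp = boto3.client("verifiedpermissions")

# Placeholder policy store ID and entity identifiers; align these with your schema.
response = avp.is_authorized(
    policyStoreId="ps-0123456789abcdef",
    principal={"entityType": "PaymentManager::User", "entityId": "Mary"},
    action={"actionType": "PaymentManager::Action", "actionId": "SubmitPayment"},
    resource={"entityType": "PaymentManager::Payment", "entityId": "payment-001"},
    entities={
        "entityList": [
            {
                "identifier": {"entityType": "PaymentManager::User", "entityId": "Mary"},
                "attributes": {"role": {"string": "clerk"}},  # attribute used in the policy condition
            }
        ]
    },
)

print(response["decision"])  # ALLOW or DENY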

As you can see in Figure 2, Mary has access to submit a payment because she has the role of “clerk” and the policy shown earlier permits this action.

Figure 2: Using the test bench to test if Mary can submit payment

Shirley can’t submit a payment based on her role as an “auditor” and the action is denied, as shown in Figure 3.

Figure 3: Using the test bench to test if Shirley can submit payment

However, she can list the payments, as the policy shown earlier permits this action, as shown in Figure 4.

Figure 4: Using the test bench to test if Shirley can list payments

Use case 2: Using the payment system application, CFO Jane delegates access for a high-value account, 111222333, to John, VP of Finance, during her vacation by creating a policy from a template. This gives John permission to approve payments on the account without Jane’s direct presence.

Policy template for approving payment: Figure 5 shows a sample policy template to approve payment. Policies created by using this template, like the one following, will provide the principal with the ability to approve payments for the resource.

permit (
    principal == ?principal,
    action in [PaymentManager::Action::"ApprovePayment"],
    resource == ?resource
);

Figure 5: Creating a policy template

Create the policy from the template: Figure 6 shows the policy created by using the preceding template. The parameters that you have to pass are the principal and resource information. For this use case, the principal is “John” and the resource is the account “111222333”, enabling John to approve payment for the account. (AWS recommends using a universally unique identifier (UUID) for the principal, but “John” is used in this blog post to make it more readable.)

Figure 6: Creating a policy from template

Evaluate the policy: As expected, John is granted access to approve payment for the account 111222333, as shown in Figure 7.

Figure 7: Using the test bench to test if John can approve payment

Building an entitlement service with Verified Permissions

Verified Permissions enables you to build an entitlement service by externalizing authorization and centralizing policy management and administration. It allows you to tailor access control to your specific application requirements while leveraging the underlying entitlement management provided by Verified Permissions.

Integrating an existing entitlement service with Verified Permissions

Let’s look at how you can integrate an existing entitlement service with Verified Permissions, as shown in Figure 8. In this diagram, the underlying implementation of the entitlement service uses the standard enterprise technology stack. Amazon DynamoDB is used to store the user and role information.

Figure 8: Integrating an entitlement service with Verified Permissions

Here’s an approach you can use to seamlessly integrate your existing entitlement service with Verified Permissions:

  1. Identify permissions: Begin by assessing your existing entitlement service to identify the permissions it currently uses, different roles, actions, and resources. Compile a detailed list of the permissions along with their respective purposes.
  2. Formulate policies: Map the permissions identified for each use case in the previous step into policies. You can use both inline policies and policy templates. In the AWS Management Console, use the Verified Permissions test bench to evaluate the policies you’ve drafted.
  3. Create policies: Depending on your business needs, create one or more policy stores within Verified Permissions. Create the policies within these policy stores. This is a one-time task and we recommend using automation to accomplish it.
  4. Update entitlement service: Use your entitlement service's existing interface to create logic that transforms the current request payload into the format that a Verified Permissions authorization request expects. You might need to identify and incorporate missing parameters into the existing interfaces. Apply the same transformation logic to the response payload. Refer to this documentation for the Verified Permissions authorization request and response format.
  5. Integrate with Verified Permissions: Use the Verified Permissions API or an AWS SDK to integrate the entitlement service with Verified Permissions. This involves tasks such as fetching the user role from Amazon DynamoDB, making authorization requests to Verified Permissions, and processing the resulting responses.
  6. Testing: Thoroughly test your service after making the permission changes. Verify that all functionalities are working as expected and that the policies in Verified Permissions are being utilized correctly.
  7. Deployment: After your service passes the review process, roll out the updated entitlement service along with the integrated Verified Permissions functionality.
  8. Monitor and maintain: Following deployment, continuously monitor the performance and gather feedback. Be prepared to make further adjustments if necessary.
  9. Documentation and support: Provide comprehensive documentation for developers who will use your entitlement service. Clearly explain the available endpoints, the request and response formats, and the authorization requirements.
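
The following sketch illustrates steps 4 and 5 from the preceding list. The isEntitled() wrapper, entity types, environment variable, and request fields are hypothetical placeholders for whatever your existing entitlement service interface provides.

import {
    VerifiedPermissionsClient,
    IsAuthorizedCommand,
} from "@aws-sdk/client-verifiedpermissions";

const avp = new VerifiedPermissionsClient();

// Hypothetical wrapper inside your entitlement service: transform your existing request
// payload into the principal, action, and resource shape that Verified Permissions expects.
export async function isEntitled(entitlementRequest) {
    const { decision } = await avp.send(new IsAuthorizedCommand({
        policyStoreId: process.env.POLICY_STORE_ID,
        principal: { entityType: "PaymentApp::User", entityId: entitlementRequest.userId },
        action: { actionType: "PaymentApp::Action", actionId: entitlementRequest.action },
        resource: { entityType: "PaymentApp::Account", entityId: entitlementRequest.accountId },
    }));
    // Transform the Verified Permissions decision back into your service's response format.
    return decision === "ALLOW";
}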

You can use a similar approach to integrate your existing entitlement service with other third-party permission management systems.

Building a new entitlement service in AWS using Amazon Verified Permissions

The reference architecture in Figure 9 shows how to build a new entitlement service using Verified Permissions. AWS customers already use Amazon Cognito for simple, fast authentication. With Amazon Verified Permissions, customers can also add simple, fast authorization to their applications by adding user profile attributes to the identity token generated by Amazon Cognito.

Figure 9: Entitlement service using Verified Permissions

Figure 9: Entitlement service using Verified Permissions

The workflow in the diagram is as follows:

  1. The user signs in to the application by using Amazon Cognito.
  2. If the authentication is successful, the pre-token generation Lambda function will be invoked.
  3. You can use the pre-token generation Lambda function to customize an identity token before Amazon Cognito generates it. In this case, the trigger is used to add the user profile attributes as new claims in the identity token.
    1. The user profile attributes are retrieved from Amazon DynamoDB.
    2. The attributes are then added as new claims in the identity token.
  4. After the user is signed in, they request access to the protected resource in the application through Amazon API Gateway.
  5. Amazon API Gateway initiates an authorization check using a Lambda authorizer. A Lambda authorizer is a feature of API Gateway that allows you to implement a custom authorization scheme using the identity token generated by Amazon Cognito.
  6. The Lambda authorizer validates, decodes, and retrieves the user profile attributes from the identity token.
  7. The Lambda authorizer calls the Verified Permissions authorization API and passes the principal, action, resource, and user profile attributes as entities (see the sketch after this list).
  8. Based on the decision returned by Verified Permissions, the user is permitted or denied access to the resource.
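
The following sketch shows what steps 5 through 8 might look like inside a TOKEN-type Lambda authorizer. It assumes the aws-jwt-verify library for validating the identity token; the entity types, action ID, claim names, and environment variables are placeholders, and production code would also need error handling and caching.

import { CognitoJwtVerifier } from "aws-jwt-verify";
import {
    VerifiedPermissionsClient,
    IsAuthorizedCommand,
} from "@aws-sdk/client-verifiedpermissions";

// Placeholder configuration values supplied through environment variables.
const verifier = CognitoJwtVerifier.create({
    userPoolId: process.env.USER_POOL_ID,
    tokenUse: "id",
    clientId: process.env.APP_CLIENT_ID,
});
const avp = new VerifiedPermissionsClient();

export const handler = async (event) => {
    // Validate and decode the identity token (step 6).
    const claims = await verifier.verify(event.authorizationToken);

    // Call Verified Permissions with the principal, action, resource, and profile attributes (step 7).
    const { decision } = await avp.send(new IsAuthorizedCommand({
        policyStoreId: process.env.POLICY_STORE_ID,
        principal: { entityType: "App::User", entityId: claims.sub },
        action: { actionType: "App::Action", actionId: "ViewResource" },
        resource: { entityType: "App::Resource", entityId: event.methodArn },
        entities: {
            entityList: [{
                identifier: { entityType: "App::User", entityId: claims.sub },
                attributes: { department: { string: claims["custom:department"] ?? "" } },
            }],
        },
    }));

    // Permit or deny the API Gateway request based on the decision (step 8).
    return {
        principalId: claims.sub,
        policyDocument: {
            Version: "2012-10-17",
            Statement: [{
                Action: "execute-api:Invoke",
                Effect: decision === "ALLOW" ? "Allow" : "Deny",
                Resource: event.methodArn,
            }],
        },
    };
};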

Common pitfalls of using an entitlement service

Entitlement services can be tricky, but there are a few common mistakes you can avoid to make them more secure and simpler to use:

  • Entitlement service misconfigurations can create security vulnerabilities and lead to data breaches. It is important to carefully configure the entitlement service and to regularly review policies to verify that they are correct and up-to-date.
  • When you first start using an entitlement service, it’s easy to give users too many permissions. This can make your application less secure and harder to manage. It’s important to give users only the permissions they need to do their jobs.
  • Users need to be trained on how to use the entitlement service correctly, especially when it comes to requesting and managing permissions. If users don’t know how to do these tasks appropriately, they could make mistakes that could leave your system vulnerable.

Conclusion

Amazon Verified Permissions is a comprehensive solution for businesses looking to manage granular access control, flexible authorization, and externalized access control. With this service, organizations can quickly and conveniently apply new policies across their environment, streamlining user management processes and helping to improve overall security. This post has highlighted the many benefits of using Verified Permissions for entitlement management within an application. We hope it has been helpful in understanding how you can apply this service to your business use cases.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Abdul Qadir

Abdul Qadir

Abdul is a Solutions Architect based in New York. He designs and architects solutions for independent software vendor (ISV) customers and helps customers in their cloud journeys. He’s been working in the financial and insurance industries and helping companies with digital transformation and modernizing their legacy systems.

Arun Sivaraman

Arun Sivaraman

Arun is a Boston-based Solutions Architect. He enjoys working with customers to create innovative solutions and supporting their digital transformation.

How to create an AMI hardening pipeline and automate updates to your ECS instance fleet

Post Syndicated from Nima Fotouhi original https://aws.amazon.com/blogs/security/how-to-create-an-ami-hardening-pipeline-and-automate-updates-to-your-ecs-instance-fleet/

Amazon Elastic Container Service (Amazon ECS) is a comprehensive managed container orchestrator that simplifies the deployment, maintenance, and scalability of container-based applications. With Amazon ECS, you can deploy your containerized application as a standalone task, or run a task as part of a service in your cluster. The Amazon ECS infrastructure for tasks includes Amazon Elastic Compute Cloud (Amazon EC2) instances in the AWS Cloud, serverless (AWS Fargate) in the AWS Cloud, or on-premises virtual machines (VMs) or servers. You can enable auto-scaling for Amazon ECS capacity providers when using EC2 instances, allowing your infrastructure to dynamically adjust based on workload demands. You define the infrastructure type or the capacity providers where you deploy your tasks or services.

You can choose EC2 instances as the computing resources for your ECS cluster, which allows you to control your cluster’s underlying infrastructure, including the size of EC2 instances, the instance operating system, and extra security controls required by a compliance framework. AWS recommends that you use Amazon ECS-optimized Amazon Machine Images (AMIs), which are set up with the requirements and recommendations to efficiently run your container workloads on Amazon Linux instances. We recommend that you refresh your container instances fleet with the latest ECS-optimized AMIs to include the latest bug fixes and feature updates. However, managing and updating your container instance fleet might become complex as your Amazon ECS workload grows.

In this blog post, I will show you how to create a workflow to enhance Amazon ECS-optimized AMIs by using the CIS Docker Benchmark and automatically updating your EC2 instances in your ECS cluster with the newly created AMIs.

Overview of CIS Docker Benchmark

The CIS Docker Benchmark provides prescriptive guidance for establishing a secure configuration posture for a Docker container engine, container host, container images, and build files. The CIS Docker Benchmark has seven sections about Docker and container security:

  1. Host configuration
  2. Docker daemon configuration
  3. Docker daemon configuration files
  4. Container images and build file configuration
  5. Container runtime configuration
  6. Docker security operations
  7. Docker swarm configuration

The solution described in this post covers sections 1, 2, and 3 of the CIS Docker Benchmark, including security recommendations to prepare the host machine used for Amazon ECS workloads, securing the behavior of the Docker daemon (server), and securing Docker-related files and directory permissions and ownerships. However, the solution doesn’t implement all of the controls listed in these three sections. For a complete list of controls implemented, see the solution’s repository.

Solution overview

EC2 Image Builder is a fully managed AWS service, designed to simplify the process of creating, handling, and implementing server images that are custom, secure, and consistently updated. For this solution, you will deploy an EC2 Image Builder pipeline to apply the CIS Docker Benchmarks to an Amazon ECS-optimized AMI and use the created AMI to refresh the Amazon ECS instance fleet. This solution is customizable, so you can select the security controls to harden your base AMI. You can also specify cluster tags during CloudFormation template deployment; these tags will filter the ECS clusters that you have included in the Amazon EC2 instance refresh process. I’ve provided an AWS CloudFormation template that you can use to provision the necessary resources.

Figure 1: Amazon ECS instance refresh workflow

Figure 1: Amazon ECS instance refresh workflow

As shown in Figure 1, the solution involves the following steps:

  1. EC2 Image Builder
    1. The AMI image pipeline downloads the ansible playbook from the S3 bucket, and runs it against the base image.
    2. The pipeline publishes the hardened AMI.
    3. The pipeline validates the benchmarks applied to the base image and publishes the results to a test results S3 bucket. It also invokes Amazon Inspector to run a vulnerability scan on the published image.
  2. State machine initiation
    1. When the AMI is successfully published, the pipeline publishes a message to the AMI status SNS topic. The SNS topic invokes the State machine initiation Lambda function.
    2. The State machine initiation Lambda function extracts the image ID of the published AMI and uses it as the input to initiate the state machine. (A sketch of this function appears after this list.)
  3. State machine
    1. The first state gathers information related to Amazon ECS clusters, including the capacity providers for the EC2 auto scaling group. It creates a new launch template version with the hardened AMI image ID for the EC2 auto scaling group.
    2. The second state uses the new launch template to initiate an instance refresh for the EC2 auto scaling group.
  4. Instance refresh status update
    1. The instance refresh rule selects the auto scaling group instance refresh events (failure, success, and cancellation events) and sends them to the Instance refresh status SNS topic.
    2. The Instance refresh status SNS topic sends an email on the instance refresh status to subscribers.
  5. Image update reminder
    1. A weekly scheduled rule invokes the Image update reminder Lambda function.
    2. The Image update reminder Lambda function retrieves the value for LatestECSOptimizedAMI from the CloudFormation template, and extracts the last modified date of the Amazon ECS-optimized AMI used as the base image in the EC2 Image Builder pipeline. It compares the last modified date of the AMI with the creation date of the latest AMI published by the pipeline. If a new base image is available, it publishes a message to the image update reminder SNS topic.
    3. The Image update reminder SNS topic sends a message to subscribers notifying them of a new base image. You need to create a new version of your image recipe to update it with the new AMI.
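
The following sketch shows one way the State machine initiation Lambda function might extract the AMI ID and start the state machine. The shape of the EC2 Image Builder notification payload assumed here is not guaranteed; inspect the message that your pipeline publishes and adjust the parsing accordingly.

import { SFNClient, StartExecutionCommand } from "@aws-sdk/client-sfn";

const sfn = new SFNClient();

export const handler = async (event) => {
    // Parse the SNS message published by the EC2 Image Builder pipeline
    // (the outputResources.amis path is an assumption; verify it against your notifications).
    const message = JSON.parse(event.Records[0].Sns.Message);
    const amiId = message.outputResources?.amis?.[0]?.image;

    if (!amiId) {
        console.log("No AMI ID found in the notification; skipping");
        return;
    }

    // Start the instance refresh state machine with the hardened AMI ID as input.
    await sfn.send(new StartExecutionCommand({
        stateMachineArn: process.env.STATE_MACHINE_ARN,
        input: JSON.stringify({ amiId }),
    }));
};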

Prerequisites

To follow along with this walkthrough, make sure that you have the following prerequisites in place:

Walkthrough

To deploy the solution, complete the following steps.

Step 1: Download or clone the repository

The first step is to download or clone the solution’s repository.

To download the repository

  1. Go to the main page of the repository on GitHub.
  2. Choose Code, and then choose Download ZIP.

To clone the repository

  1. Make sure that you have Git installed.
  2. Run the following command in your terminal:

git clone https://github.com/aws-samples/ecs-image-hardening-and-instance-refresh.git

Step 2: Create an S3 bucket

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. An S3 bucket is a container for objects stored on Amazon S3. For this walkthrough, you need to create an S3 bucket and copy the content of the ansible folder to your newly created bucket. Make a note of your S3 bucket name because you will need it in the next step.

Step 3: Create the CloudFormation stack

In this step, you deploy the solution’s resources by creating a CloudFormation stack using the provided CloudFormation template. Sign in to your account and choose an AWS Region where you want to create the stack. Make sure that the Region you choose supports the services used by this solution. To create the stack, follow the steps in Creating a stack on the AWS CloudFormation console. Note that you need to provide values for the parameters defined in the template to deploy the stack. The following table lists the parameters that you need to provide.

Parameter – Description
AnsiblePlaybookArguments – ansible-playbook command arguments
AnsiblePlaybookBucket – S3 bucket name containing ansible playbook
CloudFormationUpdaterEventBridgeRuleState – Amazon EventBridge rule that invokes the Lambda function that checks for a new version of the EC2 Image Builder parent image
ClusterTags – Tags in JSON format to filter the ECS clusters that you want to update
ComponentName – Name of the EC2 Image Builder component
DistributionConfigurationName – Name of the EC2 Image Builder distribution configuration
EnableImageScanning – Choose whether or not to enable Amazon Inspector image scanning
ImagePipelineName – Name of the EC2 Image Builder pipeline
InfrastructureConfigurationName – Name of the EC2 Image Builder infrastructure configuration
InstanceType – EC2 Image Builder infrastructure configuration EC2 instance type
LatestECSOptimizedAMI – ECS-optimized AMI parameter name; for more info, see Retrieving Amazon ECS-optimized AMI metadata
libDockerVolumeSize – Container partition size in gigabytes (GB)
libDockerVolumeType – Container partition volume type
RecipeName – Name of the EC2 Image Builder recipe
RootVolumeSize – AMI root partition volume size in GB
RootVolumeType – AMI root partition volume type

Step 4: Set up Amazon SNS topic subscribers

Amazon Simple Notification Service (Amazon SNS) is a web service that coordinates and manages the delivery or sending of messages to subscribing endpoints or clients. An Amazon SNS topic is a logical access point that acts as a communication channel.

The solution in this post creates three Amazon SNS topics to keep you informed of each step of the process. The following is a list of the topics that the solution creates and their purpose.

  • AMI status topic – a message is published to this topic upon successful creation of an AMI.
  • Image update reminder topic – a message is published to this topic if a newer version of the base Amazon ECS-optimized AMI is published by AWS.
  • Instance refresh status topic – a message is published to this topic each time that an ECS cluster capacity provider gets an instance fleet refresh.

You need to manually modify the subscriptions for each topic to receive messages published to that topic. You can do this through the console steps that follow, or programmatically, as shown in the sketch after the steps.

To modify the subscriptions for the topics created by the CloudFormation template

  1. Sign in to the Amazon SNS console.
  2. In the left navigation pane, choose Subscriptions.
  3. On the Subscriptions page, choose Create subscription.
  4. On the Create subscription page, in the Details section, do the following:
    • For Topic ARN, choose the Amazon Resource Name (ARN) of one of the topics that the CloudFormation template created.
    • For Protocol, choose Email.
    • For Endpoint, enter the endpoint value. In our example, this is an email address, such as the email address of a distribution list.
    • Choose Create subscription.
  5. Repeat the preceding steps for the other two topics.
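
If you prefer to subscribe programmatically rather than through the console, the following sketch shows the equivalent call with the AWS SDK for JavaScript (v3). The topic ARN and email address are placeholders.

import { SNSClient, SubscribeCommand } from "@aws-sdk/client-sns";

const sns = new SNSClient();

// Replace the topic ARN and endpoint with your own values; repeat for each of the three topics.
await sns.send(new SubscribeCommand({
    TopicArn: "arn:aws:sns:us-east-1:111122223333:AmiStatusTopic",
    Protocol: "email",
    Endpoint: "security-team@example.com",
}));
// The endpoint owner must confirm the subscription from the email that Amazon SNS sends.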

Step 5: Run the pipeline

The EC2 Image Builder pipeline that the solution creates consists of an image recipe with one component, an infrastructure configuration, and a distribution configuration. I’ve set up the image recipe to create an AMI, select a base image, choose components, and define block device mapping. There’s only one component where building and testing steps are defined. For the building step, the solution creates a separate partition for /var/lib/docker and mounts it to a dedicated device specified in the image recipe. It then applies the CIS Docker Benchmark ansible playbook and cleans up the unnecessary files and folders. In the test step, the solution runs Amazon Inspector, a continuous assessment service that scans your AWS workloads for software vulnerabilities and unintended network exposure, and Docker Bench for Security. Optionally, you can create your own components and associate them with the image recipe to make further modifications on the base image.

You will need to manually run the pipeline by using either the AWS Management Console or the AWS CLI.

To run the pipeline (console)

  1. Open the EC2 Image Builder console.
  2. From the pipeline details page, choose the name of your pipeline.
  3. From the Actions menu at the top of the page, select Run pipeline.

To run the pipeline (AWS CLI)

  1. Make sure that you have properly configured your AWS CLI.
  2. Run the following command. Replace <pipeline region> with your own information.

aws imagebuilder list-image-pipelines --region <pipeline region>

  3. From the list of pipelines, find the pipeline named ECSAnsiblePipeline and note the pipeline ARN, which you will use in the next step.
  4. Run the pipeline. Make sure to replace <pipeline arn> and <region> with your own information.

aws imagebuilder start-image-pipeline-execution --image-pipeline-arn <pipeline arn> --region <region>

The following is a process overview of the image hardening and instance refresh:

  1. Image hardening – when you start the pipeline, EC2 Image Builder creates the required infrastructure to build your AMI, applies the ansible playbook (CIS Docker Benchmark) to the base AMI, and publishes the hardened AMI. A message is published to the AMI status topic as well.
  2. Image testing – after publishing the AMI, EC2 Image Builder scans the newly created AMI with Amazon Inspector and reports the findings back. It also runs Docker Bench for Security to verify the changes that the ansible playbook made to the base AMI and publishes the results to an S3 bucket.
  3. State machine initiation – after a new AMI is successfully published, the AMI status topic invokes the State machine initiation Lambda function. The Lambda function invokes the instance refresh state machine and passes on the AMI info.
  4. Instance refresh – the instance refresh state machine has two steps:
    1. Gather cluster information – a Lambda function gathers information regarding EC2 capacity providers and their associated auto scaling groups. For each auto scaling group, it creates a new launch template and includes the hardened AMI information. When you create the CloudFormation stack, if you pass a tag or a list of tags, only clusters with matching tags are processed in this step.
    2. Auto scaling group instance refresh – the state machine uses the output of the first Lambda function (first state) and starts instance refresh for auto scaling groups in parallel (second state). An EventBridge rule publishes a message to the Instance refresh status topic upon successful refresh of each auto scaling group. A sketch of these two API calls follows this list.
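
The following sketch approximates these two states with the AWS SDK for JavaScript (v3). The refreshCapacityProvider() helper and the 90 percent minimum healthy percentage are illustrative choices, not the solution's actual implementation.

import { EC2Client, CreateLaunchTemplateVersionCommand } from "@aws-sdk/client-ec2";
import { AutoScalingClient, StartInstanceRefreshCommand } from "@aws-sdk/client-auto-scaling";

const ec2 = new EC2Client();
const autoscaling = new AutoScalingClient();

// Hypothetical helper: add a launch template version with the hardened AMI,
// then refresh the instances in the associated auto scaling group.
export async function refreshCapacityProvider(launchTemplateId, autoScalingGroupName, amiId) {
    await ec2.send(new CreateLaunchTemplateVersionCommand({
        LaunchTemplateId: launchTemplateId,
        SourceVersion: "$Latest",
        LaunchTemplateData: { ImageId: amiId },
    }));

    await autoscaling.send(new StartInstanceRefreshCommand({
        AutoScalingGroupName: autoScalingGroupName,
        DesiredConfiguration: {
            LaunchTemplate: { LaunchTemplateId: launchTemplateId, Version: "$Latest" },
        },
        Preferences: { MinHealthyPercentage: 90 },
    }));
}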

This solution also creates an EventBridge rule that is invoked weekly. This rule invokes the Image update reminder Lambda function, and notifies you if a new version of your base AMI has been published by AWS so that you can run the pipeline and update your hardened AMI.
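
The following sketch shows how the weekly reminder check might be implemented. The SSM parameter name, AMI name filter, and topic ARN are placeholders, and the comparison is a simplified approximation of the solution's Lambda function.

import { SSMClient, GetParameterCommand } from "@aws-sdk/client-ssm";
import { EC2Client, DescribeImagesCommand } from "@aws-sdk/client-ec2";
import { SNSClient, PublishCommand } from "@aws-sdk/client-sns";

const ssm = new SSMClient();
const ec2 = new EC2Client();
const sns = new SNSClient();

export const handler = async () => {
    // Last modified date of the ECS-optimized AMI parameter used as the pipeline's base image.
    const { Parameter } = await ssm.send(new GetParameterCommand({
        Name: process.env.LATEST_ECS_OPTIMIZED_AMI_PARAMETER,
    }));
    const baseImageModified = Parameter.LastModifiedDate;

    // Creation date of the most recent AMI published by the pipeline
    // (the name filter is a placeholder; match it to your recipe's output AMI name).
    const { Images } = await ec2.send(new DescribeImagesCommand({
        Owners: ["self"],
        Filters: [{ Name: "name", Values: ["ecs-hardened-ami*"] }],
    }));
    const latestHardened = Images
        .map((image) => new Date(image.CreationDate))
        .sort((a, b) => b - a)[0];

    // Notify subscribers if the base image is newer than the latest hardened AMI.
    if (!latestHardened || baseImageModified > latestHardened) {
        await sns.send(new PublishCommand({
            TopicArn: process.env.IMAGE_UPDATE_REMINDER_TOPIC_ARN,
            Message: "A newer ECS-optimized base AMI is available. Update your image recipe and rerun the pipeline.",
        }));
    }
};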

Conclusion

In this blog post, you learned how to create a workflow to harden Amazon ECS-optimized AMIs by using the CIS Docker Benchmark and to automate the refresh of EC2 instances in your ECS clusters. This automated workflow has several advantages. First, it helps ensure a consistent and standardized process for image hardening, reducing potential human errors and inconsistencies. By automating the entire process, you can apply security and compliance standards across your instances. Second, the tight integration with AWS Step Functions enables smooth, orchestrated updates to the ECS cluster instances, enhancing the reliability and predictability of deployments. This automation also reduces manual intervention, helping you achieve time savings so that your teams can focus on more value-driven tasks. Moreover, this systematic approach helps to enhance the security posture of your Amazon ECS workloads because you can address vulnerabilities rapidly and systematically, helping to keep the environment resilient against potential threats.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Nima Fotouhi

Nima Fotouhi

Nima is a Security Consultant at AWS. He’s a builder with a passion for infrastructure as code (IaC) and policy as code (PaC) and helps customers build secure infrastructure on AWS. In his spare time, he loves to hit the slopes and go snowboarding.

How to use chaos engineering in incident response

Post Syndicated from Kevin Low original https://aws.amazon.com/blogs/security/how-to-use-chaos-engineering-in-incident-response/

Simulations, tests, and game days are critical parts of preparing and verifying incident response processes. Customers often face challenges getting started and building their incident response function as the applications they build become increasingly complex. In this post, we will introduce the concept of chaos engineering and how you can use it to accelerate your incident response preparation and testing processes.

Why chaos engineering?

Chaos engineering is a formalized approach that uses fault injection experiments to create real-world conditions needed to understand how your system will react to unknowns and build confidence in the system’s resiliency and security.

Modern applications can have multiple components, including web, API, application, and data persistence layers. To respond to potential security events, you must understand the failure scenarios across each component and their downstream impacts. One challenge is that creating incident response processes and playbooks for components in a silo doesn’t consider known unknowns—how these components interact with each other—and can’t reveal unknown unknowns such as second-order effects during a security event.

As an example, consider the personalization microservice shown in Figure 1.

The microservice relies on two Amazon Elastic Compute Cloud (Amazon EC2) instances that are deployed in an auto scaling group across two Availability Zones. An upstream data collection microservice sends data for the personalization microservice to process. In addition, a downstream website microservice takes the personalized data and displays it to customers.

Figure 1: Architecture of the personalization microservice

Figure 1: Architecture of the personalization microservice

Now imagine that unexpected activity occurred on an EC2 instance. The instance started to query a domain name that’s associated with cryptocurrency-related activity. A first set of unknowns already emerges:

  • Can your detective controls detect the activity on the instance?
  • How long do they take to do so?
  • How long does it take your security team to be notified?
  • Does the security team know what to do?
  • Does the notification have all the information that the team needs to respond?
  • Is there an existing automated response to other stakeholders?

Security professionals may not consider all of these questions when building and designing their threat detection and incident response capabilities.

In our example, Amazon GuardDuty is able to detect the unexpected activity and generates the CryptoCurrency:EC2/BitcoinTool.B!DNS finding within 15 minutes. The security team takes a snapshot for further forensics before the instance is isolated, as shown in Figure 2.

Figure 2: Architecture after GuardDuty detects unexpected activity and the security team isolates the EC2 instance

Figure 2: Architecture after GuardDuty detects unexpected activity and the security team isolates the EC2 instance

Although this might seem like an adequate response in isolation, it leads to more questions.

From a security perspective:

  • What other logs do we need for further investigation?
  • Do we know if the credentials need to be rotated and what impact that will have on the workload?
  • Should other parts of the system be replaced or restarted?

From an operational perspective:

  • Do any of the (manual or automated) incident response processes impact the performance of the workload?
  • Can the remaining instance handle the traffic before the auto scaling group creates another instance?
  • If there is increased latency or failure of the microservice, how will the data collection and website microservices react to it?

Creating detection and incident response plans in isolation doesn’t consider the second order effects that could have an impact on the integrity and availability of the system.

How can chaos engineering help?

Chaos engineering is a formalized process that can help solve this problem. It creates failure in a controlled environment with well-defined experiments to generate data on system behavior during a simulated event. You can use this data to improve incident response processes and make proactive changes that improve the security of your workloads. By using chaos engineering, developer and security teams can reveal additional unknowns and understand areas of opportunity to improve incident response processes and workload availability.

Chaos engineering has five phases—steady state, hypothesis, run experiment, verify, and improve—which we’ll discuss in more detail next.

Steady state 

The first phase involves an understanding of the behavior and configuration of the system under normal conditions. Instead of focusing on the internal attributes of the system, you should focus on an output metric or indicator that ties operational metrics and customer experience together. Including these output metrics in your hypothesis helps you collect data on security events and understand how these events and your response to them impact business outcomes.

Returning to our earlier example, this could be the latency when a user attempts to retrieve personalized information. This output is critical to the customer experience and relies on multiple operational metrics.

In addition, two key metrics in incident response are time to detect (TTD) and time to remediate (TTR). These metrics help capture how effectively your team has responded to the security event.

By defining your steady state, you can detect deviations from that state and determine if your system has fully returned to the known good state. You should identify the relevant metrics to measure your system and make these metrics simple for engineers to consume.

Using AWS, you can collect logs from the different services that you use in a workload, such as Amazon VPC Flow Logs, Amazon CloudWatch log groups, and AWS CloudTrail. For more details about the different log sources, see Logging strategies for security incident response.

Hypothesis

After you understand the steady state behavior, you can write a hypothesis about it. Security hypotheses can take the following form:

When _________ happens, ________ system will notify the team within _______ and the application’s metric _________ will remain at ________.

It can be challenging to decide what should happen. Chaos engineering recommends that you choose real-world events that are likely to occur and that will impact the user experience. Get your team to brainstorm. For security issues, this is an ideal time to use your threat model as the starting point for discussions. Starting with one of your identified threats and then running experiments based on that threat can help you test both your processes and automation.

After you’ve chosen your component, decide which variable to influence or what could happen in your complex system. For example, a misconfigured Amazon Simple Storage Service (Amazon S3) bucket or an open database port could lead to unintended exposure of customer data. A software flaw in your application could lead to the misuse of resources by an unauthorized user.

Here are a few examples of hypotheses:

  • If port 22 permits unrestricted access on a security group, AWS Config will detect it, run an automation to remove the security group rule (a sketch of such a remediation follows this list), and notify the security team through Slack within 5 minutes, and the application’s latency will remain at 0.005 seconds.
  • If malware is run on an EC2 instance, Amazon GuardDuty will detect it within 15 minutes and notify the security team. Remediation playbooks will not affect the application’s error rate of 1 error for every thousand requests.
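
As a concrete illustration of the automation referenced in the first hypothesis, the following sketch removes an open SSH rule from a security group. The removeOpenSshRule() helper is hypothetical; how it is invoked (for example, through an AWS Config remediation action) depends on your environment.

import { EC2Client, RevokeSecurityGroupIngressCommand } from "@aws-sdk/client-ec2";

const ec2 = new EC2Client();

// Hypothetical remediation: revoke the rule that allows SSH from anywhere.
export async function removeOpenSshRule(securityGroupId) {
    await ec2.send(new RevokeSecurityGroupIngressCommand({
        GroupId: securityGroupId,
        IpPermissions: [{
            IpProtocol: "tcp",
            FromPort: 22,
            ToPort: 22,
            IpRanges: [{ CidrIp: "0.0.0.0/0" }],
        }],
    }));
}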

Design and run the experiment 

The next phase is to run the experiment. You don’t need to run experiments in production right away. A great place to get started with chaos engineering is the staging environment. One benefit of the AWS Cloud is that you can configure your staging environment to be identical to production. This increases the value of using an approach like chaos engineering before you get to production. By running experiments in staging, you can see how your system will likely react in production while earning trust within your organization.

As you gain confidence, you can begin running experiments in production. Because you configured staging to be identical to production, the risk of this transition is mitigated.

You can use AWS Fault Injection Simulator (FIS), our fully managed service for running fault injection experiments. FIS supports multiple fault injection actions, such as injecting API errors, restarting instances, running scripts on instances, disrupting network connectivity, and more. For the full list, see the FIS actions reference.

Although FIS doesn’t support security-related actions out of the box, you can use FIS to run AWS Systems Manager Automation documents that can run AWS APIs and scripts to simulate security events. To learn how to set up FIS to run a Systems Manager document that turns off bucket-level block public access for a randomly-selected S3 bucket, see the workshop Chaos Kitty – Gamifying Incident Response with Chaos Engineering. To learn how to set up FIS to run experiments that simulate events such as an RDP brute force event, lateral movement, cryptocurrency mining, and DNS data exfiltration, see the workshop Validating security guardrails with Chaos Engineering.

During this phase, you must understand the scope of impact of your experiment and work to minimize it. If an Amazon CloudWatch alarm goes into an alarm state, FIS can automatically stop the experiment. You should have a plan to return the environment to the steady state if the experiment has an unintended impact.

As you run your experiment, remember to document the key metrics and human responses, such as whether incident responders were confident, knew where to find the correct resources, or were aware of the escalation points.

Learn and verify 

The next step is to analyze and document the data to understand what happened. Lessons learned during the experiment are critical and should promote a culture of support instead of blame.

Here are some questions that you should address:

  1. What happened?
  2. What was the impact on our customers?
  3. What did we learn? Did we have enough information in the notification to investigate?
  4. What could have reduced our time to detect or time to remediate by 50 percent?
  5. Can we apply this to other similar systems?
  6. How can we improve our incident response processes?
  7. What human steps in the process can we automate?

Here are a few examples of things you might learn from your chaos engineering experiments:

  • After port 22 was opened, AWS Config detected the misconfigured security group within 2 minutes. However, the notification system was misconfigured, and the security team wasn’t notified. During the five minutes that port 22 was opened, the EC2 instance received 22 attempts to connect to it from unknown IP addresses.
  • After a cryptocurrency mining script was run on an EC2 instance, GuardDuty detected the activity, generated a finding within 10 minutes, and notified the security team.
  • The security team’s remediation actions—terminating the instance—led to increased application latency beyond the SLA of 0.05 seconds.

Improve and fix it

Use your learnings to improve the workload. It’s vital that you get leadership alignment and support to prioritize the remediation of findings from chaos experiments or other testing scenarios. Examples include improving incident response playbooks, creating new forms of automation, or creating preventative controls to prevent the event from happening. For more guidance on playbooks, see these sample incident response playbooks and the workshop Building an AWS incident response runbook using Jupyter notebooks and CloudTrail Lake.

Preparation is important in incident response. As you improve your processes, run more experiments to collect additional data and continue to iteratively improve. Automate chaos experiments on new environments or applications with minimal user traffic before directing the majority of traffic to it. As you use chaos engineering approaches to prepare incident response processes, your detection and incident response capabilities should improve.

Conclusion

It’s vital to be prepared when a security event happens. In this blog post, you learned about the five phases of chaos engineering—steady state, hypothesis, design and run the experiment, learn and verify, and improve and fix—and how you can use them to accelerate your incident response preparation and testing processes. For more information on chaos engineering, see the following resources. Choose a workload and run an experiment on it to verify and improve your incident response processes today.

Additional resources

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Kevin Low

Kevin Low

Kevin is a Security Solutions Architect who helps customers of all sizes across ASEAN build securely. He is passionate about integrating resilience and security and has a keen interest in chaos engineering. Outside of work, he loves spending time with his wife and dog, a poodle called Noodle.

Approaches for migrating users to Amazon Cognito user pools

Post Syndicated from Edward Sun original https://aws.amazon.com/blogs/security/approaches-for-migrating-users-to-amazon-cognito-user-pools/

Update: An earlier version of this post was published on September 14, 2017, on the Front-End Web and Mobile Blog.


Amazon Cognito user pools offer a fully managed OpenID Connect (OIDC) identity provider so you can quickly add authentication and control access to your mobile app or web application. User pools scale to millions of users and add layers of additional features for security, identity federation, app integration, and customization of the user experience. Amazon Cognito is available in regions around the globe, processing over 100 billion authentications each month. You can take advantage of security features when using user pools in Cognito, such as email and phone number verification, multi-factor authentication, and advanced security features such as compromised credentials detection and adaptive authentication.

Many customers ask about the best way to migrate their existing users to Amazon Cognito user pools. In this blog post, we describe several different recommended approaches and provide step-by-step instructions on how to implement them.

Key considerations

The main consideration when migrating users across identity providers is maintaining a consistent end-user experience. Ideally, users can continue to use their existing passwords so that their experience is seamless. However, security best practices dictate that passwords should never be stored directly as cleartext in a user store. Instead, passwords are used to compute cryptographic hashes and verifiers that can later be used to verify submitted passwords. This means that you cannot securely export passwords in cleartext form from an existing user store and import them into a Cognito user pool. You might ask your users to choose a new password during the migration. Or, if you want to retain the existing passwords, you need to retain access to the existing hashes and verifiers, at least during the migration period.

A secondary consideration is the migration timeline. For example, do you need a faster migration timeline because your current identity store’s license is expiring? Or do you prefer a slow and steady migration because you are modernizing your current application, and it takes time to connect your existing systems to the new identity provider?

The following two methods define our recommended approaches for migrating existing users into a user pool:

  • Bulk user import – Export your existing users into a comma-separated (.csv) file, and then upload this .csv file to import users into a user pool. Your desired user attributes (except passwords) can be included and mapped to attributes in the target user pool. This approach requires users to reset their passwords when they sign in with Cognito. You can choose to migrate your existing user store entirely in a single import job or split users into multiple jobs for parallel or incremental processing.
  • Just-in-time user migration – Migrate users just in time into a Cognito user pool as they sign in to your mobile or web app. This approach allows users to retain their current passwords, because the migration process captures and verifies the password during the sign-in process, seamlessly migrating them to the Cognito user pool.

In the following sections, we describe the bulk user import and just-in-time user migration methods in more detail and then walk through the steps of each approach.

Bulk user import

You perform bulk import of users into an Amazon Cognito user pool by uploading a .csv file that contains user profile data, including usernames, email addresses, phone numbers, and other attributes. You can download a template .csv file for your user pool from Cognito, with a user schema structured in the template header.

Following is an example of performing bulk user import.

To create an import job

  1. Open the Cognito user pool console and select the target user pool for migration.
  2. On the Users tab, navigate to the Import users section, and choose Create import job.
    Figure 1: Create import job

    Figure 1: Create import job

  3. In the Create import job dialog box, download the template.csv file for user import.
  4. Export your existing user data from your existing user directory or user store into the .csv file.
  5. Match the user attribute types with column headings in the template. Each user must have an email address or a phone number that is marked as verified in the .csv file, in order to receive the password reset confirmation code.

    Figure 2: Configure import job

    Figure 2: Configure import job

  6. Go back to the Create import job dialog box (as shown in Figure 2) and do the following:
    1. Enter a Job name.
    2. Choose to Create a new IAM role or Use an existing IAM role. This role grants Amazon Cognito permission to write to Amazon CloudWatch Logs in your account, so that Cognito can provide logs for successful imports and errors for skipped or failed transactions.
    3. Upload the .csv file that you have prepared, and choose Create and start job.

Depending on the size of the .csv file, the job can run for minutes or hours, and you can follow the status from that same page in the Amazon Cognito console.

Figure 3: Check import job status

Figure 3: Check import job status

Cognito runs through the import job and imports users with a RESET_REQUIRED state. When users attempt to sign in, Cognito will return PasswordResetRequiredException from the sign-in API, and the app should direct the user into the ForgotPassword flow.

Figure 4: View imported user

Figure 4: View imported user

The bulk import approach can also be used continuously to incrementally import users. You can set up an Extract-Transform-Load (ETL) batch job process to extract incremental changes to your existing user directories, such as the new sign-ups on the existing systems before you switch over to a Cognito user pool. Your batch job will transform the changes into a .csv file to map user attribute schemas, and load the .csv file as a Cognito import job through the CreateUserImportJob CLI or SDK operation. Then start the import job through the StartUserImportJob CLI or SDK operation. For more information, see Importing users into user pools in the Amazon Cognito Developer Guide.
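
The following sketch shows what the load step of such an ETL process might look like with the AWS SDK for JavaScript (v3). The job name, user pool ID, and IAM role ARN are placeholders.

import {
    CognitoIdentityProviderClient,
    CreateUserImportJobCommand,
    StartUserImportJobCommand,
} from "@aws-sdk/client-cognito-identity-provider";

const cognito = new CognitoIdentityProviderClient();

// Create the import job; the role grants Cognito permission to write to CloudWatch Logs.
const { UserImportJob } = await cognito.send(new CreateUserImportJobCommand({
    JobName: "incremental-import-batch-1",
    UserPoolId: "us-east-1_EXAMPLE",
    CloudWatchLogsRoleArn: "arn:aws:iam::111122223333:role/CognitoUserImportRole",
}));

// Upload your .csv file to UserImportJob.PreSignedUrl (for example, with an HTTP PUT)
// before starting the job.

await cognito.send(new StartUserImportJobCommand({
    UserPoolId: "us-east-1_EXAMPLE",
    JobId: UserImportJob.JobId,
}));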

Just-in-time user migration

The just-in-time (JIT) user migration method involves first attempting to sign in the user through the Amazon Cognito user pool. Then, if the user doesn’t exist in the Cognito user pool, Cognito calls your Migrate User Lambda trigger and sends the username and password to the Lambda trigger to sign the user in through the existing user store. If successful, the Migrate User Lambda trigger will also fetch user attributes and return them to Cognito. Then Cognito silently creates the user in the user pool with user attributes, as well as salts and password verifiers from the user-provided password. With the Migrate User Lambda trigger, your client app can start to use the Cognito user pool to sign in users who have already been migrated, and continue migrating users who are signing in for the first time towards the user pool. This just-in-time migration approach helps to create a seamless authentication experience for your users.

Cognito, by default, uses the USER_SRP_AUTH authentication flow with the Secure Remote Password (SRP) protocol. This flow doesn’t involve sending the password across the network, but rather allows the client to exchange a cryptographic proof with the Cognito service to prove the client’s knowledge of the password. For JIT user migration, Cognito needs to verify the username and password against the existing user store. Therefore, you need to enable a different Cognito authentication flow. You can choose to use either the USER_PASSWORD_AUTH flow for client-side authentication or the ADMIN_USER_PASSWORD_AUTH flow for server-side authentication. This will allow the password to be sent to Cognito over an encrypted TLS connection, and allow Cognito to pass the information to the Lambda function to perform user authentication against the original user store.
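
For example, a client-side sign-in call that uses the USER_PASSWORD_AUTH flow might look like the following sketch. The app client ID and credentials are placeholders, and the flow must be enabled on your app client.

import {
    CognitoIdentityProviderClient,
    InitiateAuthCommand,
} from "@aws-sdk/client-cognito-identity-provider";

const cognito = new CognitoIdentityProviderClient();

// The password is sent to Cognito over TLS so that it can be passed to the
// Migrate User Lambda trigger if the user doesn't exist in the user pool yet.
const response = await cognito.send(new InitiateAuthCommand({
    AuthFlow: "USER_PASSWORD_AUTH",
    ClientId: "1example23456789",
    AuthParameters: {
        USERNAME: "nikki_wolf",
        PASSWORD: "ExamplePassword1$",
    },
}));
console.log(response.AuthenticationResult?.IdToken);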

This JIT approach might not be compatible with existing identity providers that have multi-factor authentication (MFA) enabled, because the Lambda function cannot support multiple rounds of challenges. If the existing identity provider requires MFA, you might consider the alternative JIT migration approach discussed later in this blog post.

Figure 5 illustrates the steps for the JIT sign-in flow. The mobile or web app first tries to sign in the user in the user pool. If the user isn’t already in the user pool, Cognito handles user authentication and invokes the Migrate User Lambda trigger to migrate the user. This flow keeps the logic in the app simple and allows the app to use the Amazon Cognito SDK to sign in users in the standard way. The migration logic takes place in the Lambda function in the backend.

Figure 5: JIT migration user authentication flow

Figure 5: JIT migration user authentication flow

The flow in Figure 5 starts in the mobile or web app, which attempts to sign in the user by using the AWS SDK. If the user doesn’t exist in the user pool, the migration attempt starts. Cognito calls the Migrate User Lambda trigger with triggerSource set to UserMigration_Authentication, and passes the user’s username and password in the request in order to attempt to migrate the user.

This approach also works in the forgot password flow shown in Figure 6, where the user has forgotten their password and hasn’t been migrated yet. In this case, once the user makes a “Forgot Password” request, your mobile or web app will send a forgot password request to Cognito. Cognito invokes your Migrate User Lambda trigger with triggerSource set to UserMigration_ForgotPassword, and passes the username in the request in order to attempt user lookup, migrate the user profile, and facilitate the password reset process.

Figure 6: JIT migration forgot password flow

Figure 6: JIT migration forgot password flow

Just-in-time user migration sample code

In this section, we show sample source code for the overall structure of a Migrate User Lambda trigger. We will fill in the commented sections with additional code, shown later in this section. When you set up your own Lambda function, configure a Lambda execution role to grant permissions for CloudWatch Logs.

const handler = async (event) => {
    if (event.triggerSource == "UserMigration_Authentication") {
        //***********************************************************************
        // Attempt to sign in the user or verify the password with existing identity store
        // (shown in Section A – Migrate User of this post)
        //***********************************************************************
    }
    else if (event.triggerSource == "UserMigration_ForgotPassword") {
       //***********************************************************************
       // Attempt to look up the user in your existing identity store
       // (shown in Section B – Forgot Password of this post)
       //***********************************************************************
    }
    return event;
};
export { handler };

In the migration flow, the Lambda trigger will sign in the user and verify the user’s password in the existing user store. That may involve a sign-in attempt against your existing user store or a check of the password against a stored hash. You need to customize this step based on your existing setup. You can also create a function to fetch user attributes that you want to migrate. If your existing user store conforms to the OIDC specification, you can parse the ID Token claims to retrieve the user’s attributes. The following example shows how to set the username and attributes for the migrated user.

// Section A – Migrate User
if (event.triggerSource == "UserMigration_Authentication") {
// Attempt to sign in the user or verify the password with the existing user store.
// Add an authenticateUser() function based on your existing user store setup.
    const user = await authenticateUser(event.userName, event.request.password);
    if (user) {
        // Migrating user attributes from the source user store. You can migrate additional attributes as needed.
        event.response.userAttributes = {
            // Setting username and email address
            username: event.userName,
            email: user.emailAddress,
            email_verified: "true",
        };
        // Setting user status to CONFIRMED to autoconfirm users so they can sign in to the user pool
        event.response.finalUserStatus = "CONFIRMED";
        // Setting messageAction to SUPPRESS to decline to send the welcome message that Cognito usually sends to new users
        event.response.messageAction = "SUPPRESS";
    }
}

The user, along with their attributes, is now migrated from the existing user store to the user pool. Users will also be redirected to your application with the authorization code or JSON Web Tokens, depending on the OAuth 2.0 grant types you configured in the user pool.

Let’s look at the forgot password flow. Your Lambda function calls the existing user store and migrates other attributes in the user’s profile first, and then Lambda sets user attributes in the response to the Cognito user pool. Cognito initiates the ForgotPassword flow and sends a confirmation code to the user to confirm the password reset process. The user needs to have a verified email address or phone number migrated from the existing user store to receive the forgot password confirmation code. The following sample code demonstrates how to complete the ForgotPassword flow.

// Section B – Forgot Password
else if (event.triggerSource == "UserMigration_ForgotPassword") {
        // Look up the user in your existing user store service.
        // Add a lookupUser() function based on your existing user store setup.
        const lookupResult = await lookupUser(event.userName);
        if (lookupResult) {
            // Setting user attributes from the source user store
            event.response.userAttributes = {
                username: event.userName,
                // Required to set verified communication to receive password recovery code
                email: lookupResult.emailAddress,
                email_verified: "true",
            };
            event.response.finalUserStatus = "RESET_REQUIRED";
            event.response.messageAction = "SUPPRESS";
        }
    }

Just-in-time user migration – alternative approach

Using the Migrate User Lambda trigger, we showed the JIT migration approach where the app switches to use the Cognito user pool at the beginning of the migration period, to interface with the user for signing in and migrating them from the existing user store. An alternative JIT approach is to maintain the existing systems and user store, but to silently create each user in the Cognito user pool in a backend process as users sign in, then switch over to use Cognito after enough users have been migrated.

Figure 7: JIT migration alternative approach with backend process

Figure 7: JIT migration alternative approach with backend process

Figure 7 shows this alternative approach in depth. When an end user signs in successfully in your mobile or web app, the backend migration process is initiated. This backend process first calls the Cognito admin API operation, AdminCreateUser, to create users and map user attributes in the destination user pool. The user will be created with a temporary password and be placed in FORCE_CHANGE_PASSWORD status. If you capture the user password during the sign-in process, you can also migrate the password by setting it permanently for the newly created user in the Cognito user pool using the AdminSetUserPassword API operation. This operation will also set the user status to CONFIRMED to allow the user to sign in to Cognito using the existing password.

Following is a code example for the AdminCreateUser function using the AWS SDK for JavaScript.

import {
    CognitoIdentityProviderClient,
    AdminCreateUserCommand,
} from "@aws-sdk/client-cognito-identity-provider";

var params = {
    MessageAction: "SUPPRESS",
    UserAttributes: [{
        Name: "name",
        Value: "Nikki Wolf"
    },
    {
        Name: "email",
        Value: "[email protected]"
    },
    {
        Name: "email_verified",
        Value: "True"
    }
    ],
    UserPoolId: "us-east-1_EXAMPLE",
    Username: "nikki_wolf"
};
const cognito = new CognitoIdentityProviderClient();
const createUserCommand  = new AdminCreateUserCommand(params);
await cognito.send(createUserCommand);

The following is a code example for the AdminSetUserPassword function.

import {
    CognitoIdentityProviderClient,
    AdminSetUserPasswordCommand,
} from "@aws-sdk/client-cognito-identity-provider";

var params = {
    UserPoolId: 'us-east-1_EXAMPLE' ,
    Username: 'nikki_wolf' ,
    Password: 'ExamplePassword1$' ,
    Permanent: true
};
const cognito = new CognitoIdentityProviderClient();
const setUserPasswordCommand = new AdminSetUserPasswordCommand(params);
await cognito.send(setUserPasswordCommand);

This alternative approach does not require the app to update its authentication codebase until a majority of users are migrated, but you need to propagate user attribute changes and new user signups from the existing systems to Cognito. If you are capturing and migrating passwords, you should also build a similar logic to capture password changes in existing systems and set the new password in the user pool to keep it synchronized until you perform a full switchover from the existing identity store to the Cognito user pool.

Summary and best practices

In this post, we described our two recommended approaches for migrating users into an Amazon Cognito user pool. You can decide which approach is best suited for your use case. The bulk method is simpler to implement, but it doesn’t preserve user passwords like the just-in-time migration does. The just-in-time migration is transparent to users and mitigates the potential attrition of users that can occur when users need to reset their passwords.

You could also consider a hybrid approach, where you first apply JIT migration as users are actively signing in to your app, and perform bulk import for the remaining less-active users. This hybrid approach helps provide a good experience for your active user communities, while being able to decommission existing user stores in a manageable timeline because you don’t need to wait for every user to sign in and be migrated through JIT migration.

We hope you can use these explanations and code samples to set up the most suitable approach for your migration project.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Edward Sun

Edward Sun

Edward is a Security Specialist Solutions Architect focused on identity and access management. He loves helping customers throughout their cloud transformation journey with architecture design, security best practices, migration, and cost optimizations. Outside of work, Edward enjoys hiking, golfing, and cheering for his alma mater, the Georgia Bulldogs.

How to share security telemetry per OU using Amazon Security Lake and AWS Lake Formation

Post Syndicated from Chris Lamont-Smith original https://aws.amazon.com/blogs/security/how-to-share-security-telemetry-per-ou-using-amazon-security-lake-and-aws-lake-formation/

This is the final part of a three-part series on visualizing security data using Amazon Security Lake and Amazon QuickSight. In part 1, Aggregating, searching, and visualizing log data from distributed sources with Amazon Athena and Amazon QuickSight, you learned how you can visualize metrics and logs centrally with QuickSight and AWS Lake Formation irrespective of the service or tool generating them. In part 2, How to visualize Amazon Security Lake findings with Amazon QuickSight, you learned how to integrate Amazon Athena with Security Lake and create visualizations with QuickSight of the data and events captured by Security Lake.

For companies where security administration and ownership are distributed across a single organization in AWS Organizations, it’s important to have a mechanism for securely sharing and visualizing security data. This can be achieved by enriching data within Security Lake with organizational unit (OU) structure and account tags and using AWS Lake Formation to securely share data across your organization on a per-OU basis. Users can then analyze and visualize security data of only those AWS accounts in the OU that they have been granted access to. Enriching the data enables users to effectively filter information using business-specific criteria, minimizing distractions and enabling them to concentrate on key priorities.

Distributed security ownership

It’s not unusual to find security ownership distributed across an organization in AWS Organizations. Take for example a parent company with legal entities operating under it, which are responsible for the security posture of the AWS accounts within their lines of business. Not only is each entity accountable for managing and reporting on security within its area, it must not be able to view the security data of other entities within the same organization.

In this post, we discuss a common example of distributing dashboards on a per-OU basis for visualizing security posture measured by the AWS Foundational Security Best Practices (FSBP) standard as part of AWS Security Hub. In this post, you learn how to use a simple tool published on AWS Samples to extract OU and account tags from your organization and automatically create row-level security policies to share Security Lake data to AWS accounts you specify. At the end, you will have an aggregated dataset of Security Hub findings enriched with AWS account metadata that you can use as a basis for building QuickSight dashboards.

Although this post focuses on sharing Security Hub data through Security Lake, the same steps can be performed to share any data—including Security Hub findings in Amazon S3—according to OU. You need to ensure any tables you want to share contain an AWS account ID column and that the tables are managed by Lake Formation.

Prerequisites

This solution assumes you have:

  • Followed the previous posts in this series and understand how Security Lake, Lake Formation, and QuickSight work together.
  • Enabled Security Lake across your organization and have set up a delegated administrator account.
  • Configured Security Hub across your organization and have enabled the AWS FSBP standard.

Example organization

AnyCorp Inc, a fictional organization, wants to provide security compliance dashboards to its two subsidiaries, ExampleCorpEast and ExampleCorpWest, so that each only has access to data for their respective companies.

Each subsidiary has an OU under AnyCorp’s organization as well as multiple nested OUs for each line of business they operate. ExampleCorpEast and ExampleCorpWest have their own security teams and each operates a security tooling AWS account and uses QuickSight for visibility of security compliance data. AnyCorp has implemented Security Lake to centralize the collection and availability of security data across their organization and has enabled Security Hub and the AWS FSBP standard across every AWS account.

Figure 1: Overview of AnyCorp Inc OU structure and AWS accounts


Note: Although this post describes a fictional OU structure to demonstrate the grouping and distribution of security data, you can substitute your specific OU and AWS account details and achieve the same results.

Logical architecture

Figure 2: Logical overview of solution components

The solution includes the following core components:

  • An AWS Lambda function is deployed into the Security Lake delegated administrator account (Account A); it extracts the AWS account metadata used to group Security Lake data and manages secure sharing through Lake Formation.
  • Lake Formation implements row-level security using data filters to restrict access to Security Lake data to only records from AWS accounts in a particular OU. Lake Formation also manages the grants that allow consumer AWS accounts access to the filtered data (a sketch of such a filter and grant follows this list).
  • An Amazon Simple Storage Service (Amazon S3) bucket is used to store metadata tables that the solution uses. Apache Iceberg tables are used to allow record-level updates in S3.
  • QuickSight is configured within each data consumer AWS account (Account B) and is used to visualize the data for the AWS accounts within an OU.
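To illustrate the row-level security mechanism, here is a minimal boto3 sketch of creating a Lake Formation data cell filter and granting it to a consumer account. The account IDs, database, table, filter name, and row filter column are placeholders, and the solution’s Lambda function may structure these calls differently.

import boto3

lf = boto3.client("lakeformation")

CATALOG_ID = "111122223333"         # Security Lake delegated administrator (Account A)
CONSUMER_ACCOUNT = "444455556666"   # data consumer (Account B)
DATABASE = "amazon_security_lake_glue_db_ap_southeast_2"
TABLE = "amazon_security_lake_table_ap_southeast_2_sh_findings_1_0"
OU_ACCOUNT_IDS = ["111111111111", "222222222222"]  # accounts in the OU being shared

# Row filter that only returns findings from accounts in the OU; the column name
# follows the OCSF field referenced later in this post and is shown for illustration
filter_name = "examplecorpwest_findings"
expression = "cloud.account_uid IN ({})".format(
    ", ".join(f"'{a}'" for a in OU_ACCOUNT_IDS)
)

lf.create_data_cells_filter(
    TableData={
        "TableCatalogId": CATALOG_ID,
        "DatabaseName": DATABASE,
        "TableName": TABLE,
        "Name": filter_name,
        "RowFilter": {"FilterExpression": expression},
        "ColumnWildcard": {},  # expose all columns; restrict rows only
    }
)

# Grant the consumer account SELECT access scoped to the filter
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": CONSUMER_ACCOUNT},
    Resource={
        "DataCellsFilter": {
            "TableCatalogId": CATALOG_ID,
            "DatabaseName": DATABASE,
            "TableName": TABLE,
            "Name": filter_name,
        }
    },
    Permissions=["SELECT"],
)

The design point is that the consumer account is never granted access to the underlying table itself; its SELECT permission is scoped to the filter, so only rows from the assigned OU are returned.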

Deploy the solution

You can deploy the solution through either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

To deploy the solution using the AWS Management Console, follow these steps:

  1. Download the CloudFormation template.
  2. In your Amazon Security Lake delegated administrator account (Account A), navigate to create a new AWS CloudFormation stack.
  3. Under Specify a template, choose Upload a template file and upload the file downloaded in the previous step. Then choose Next.
  4. Enter RowLevelSecurityLakeStack as the stack name.

    The table names used by Security Lake include AWS Region identifiers that you might need to change depending on the Region you’re using Security Lake in. Edit the following parameters if required and then choose Next.

    • MetadataDatabase: the name you want to give the metadata database.
      • Default: aws_account_metadata_db
    • SecurityLakeDB: the Security Lake database as registered by Security Lake.
      • Default: amazon_security_lake_glue_db_ap_southeast_2
    • SecurityLakeTable: the Security Lake table you want to share.
      • Default: amazon_security_lake_table_ap_southeast_2_sh_findings_1_0
  5. On the Configure stack options screen, leave all other values as default and choose Next.
  6. On the next screen, navigate to the bottom of the page and select the checkbox next to I acknowledge that AWS CloudFormation might create IAM resources. Choose Submit.

The solution takes about 5 minutes to deploy.

To deploy the solution using the AWS CDK, follow these steps:

  1. Download the code from the row-level-security-lake GitHub repository, where you can also contribute to the sample code. The CDK initializes your environment and uploads the Lambda assets to Amazon S3. Then, deploy the solution to your account.
  2. For a CDK deployment, you can edit the same Region identifier parameters discussed in the CloudFormation deployment option by editing the cdk.context.json file and changing the metadata_database, security_lake_db, and security_lake_table values if required.
  3. While you’re authenticated in the Security Lake delegated administrator account, you can bootstrap the account and deploy the solution by running the following commands:
    cdk bootstrap
    cdk deploy

Configuring the solution in the Security Lake delegated administrator account

After the solution has been successfully deployed, you can review the OUs discovered within your organization and specify which consumer AWS accounts (Account B) you want to share OU data with.

To specify AWS accounts to share OU security data with, follow these steps:

  1. While in the Security Lake delegated administrator account (Account A), go to the Lake Formation console.
  2. To view and update the metadata discovered by the Lambda function, you first must grant yourself access to the tables where it’s stored. Select the radio button for aws_account_metadata_db. Then, under the Action dropdown menu, select Grant.
    Figure 3: Creating a grant for your IAM role

  3. On the Grant data permissions page, under Principals, select the IAM users and roles dropdown and select the IAM role that you are currently logged in as.
  4. Under LF-Tags or catalog resources, select the Tables dropdown and select All tables.
    Figure 4: Choosing All Tables for the grant

  5. Under Table permissions, select Select, Insert, and Alter. These permissions let you view and update the data in the tables.
  6. Leave all other options as default and choose Grant.
  7. Now go to the Amazon Athena console.
    Note: To use Athena for queries you must configure an S3 bucket to store query results. If this is the first time Athena is being used in your account, you will receive a message saying that you need to configure an S3 bucket. To do this, select the Edit settings button in the blue information notice and follow the instructions.

  8. On the left side, select aws_account_metadata_db as the Database. You will see aws_account_metadata and ou_groups as tables within the database.
    Figure 5: List of tables under the aws_accounts_metadata_db database

  9. To view the OUs available within your organization, paste the following query into the Athena query editor window and choose Run.
    SELECT * FROM "aws_account_metadata_db"."ou_groups"

  10. Next, you must specify an AWS account you want to share an OU’s data with. Run the following SQL query in Athena and replace <AWS account Id> and <OU to assign> with values from your organization:
    UPDATE "aws_account_metadata_db"."ou_groups"
    SET consumer_aws_account_id = '<AWS account Id>'
    WHERE ou = '<OU to assign>' 

    In the example organization, all ExampleCorpWest security data is shared with AWS account 123456789012 (Account B) using the following SQL query:

    UPDATE "aws_account_metadata_db"."ou_groups"
    SET consumer_aws_account_id = '123456789012'
    WHERE ou = 'OU=root,OU=ExampleCorpWest'

    Note: You must specify the full OU path beginning with OU=root.

  11. Repeat this process for each OU you want to assign different AWS accounts to.
    Note: You can only assign one AWS account ID to each OU group.

  12. You can confirm that the changes have been applied by running the earlier SELECT query again:
    SELECT * FROM "aws_account_metadata_db"."ou_groups"

You should see the AWS account ID you specified next to your OU.

Figure 6: Consumer AWS account listed against ExampleCorpWest OU

Invoke the Lambda function manually

By default, the Lambda function is scheduled to run hourly to monitor for changes to AWS account metadata and to update Lake Formation sharing permissions (grants) if needed. To perform the remaining steps in this post without having to wait for the hourly run, you must manually invoke the Lambda function.

To invoke the Lambda function manually, follow these steps:

  1. Open the AWS Lambda console.
  2. Select the RowLevelSecurityLakeStack-* Lambda function.
  3. Under Code source, choose Test.
  4. The Lambda function doesn’t take any parameters. Enter rl-sec-lake-test as the Event name and leave all other options as the default. Choose Save.
  5. Choose Test again. The Lambda function will take approximately 5 minutes to complete in an environment with less than 100 AWS accounts.
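If you prefer to trigger the run from a script instead of the console steps above, a minimal boto3 sketch such as the following would also work; the function name is a placeholder for the name that the RowLevelSecurityLakeStack stack actually deploys.

import json
import boto3

lambda_client = boto3.client("lambda")

# Placeholder: replace with the function name deployed by the stack
function_name = "RowLevelSecurityLakeStack-ExampleFunctionName"

# Asynchronous invocation: the function can run for several minutes, so don't block on the response
response = lambda_client.invoke(
    FunctionName=function_name,
    InvocationType="Event",
    Payload=json.dumps({}).encode(),  # the function takes no parameters
)

# A 202 status code means the invocation was accepted; check the function's CloudWatch logs for progress
print("Status code:", response["StatusCode"])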

After the Lambda function has finished, you can review the data cell filters and grants that have been created in Lake Formation to securely share Security Lake data with your consumer AWS account (Account B).

To review the data filters and grants, follow these steps:

  1. Open the Lake Formation console.
  2. In the navigation pane, select Data filters under Data catalog to see a list of the data cell filters that have been created for each OU that you assigned a consumer AWS account to. One filter is created per table, and each consumer AWS account is granted restricted access to the aws_account_metadata table and the aggregated Security Lake table.
    Figure 7: Viewing data filters in Lake Formation

  3. Select one of the filters in the list and choose Edit. The Edit data filter window displays information about the filter, such as the database and table it’s applied to, as well as the Row filter expression that enforces row-level security to return only rows where the AWS account ID is in the OU the filter applies to. Choose Cancel to close the window.
    Figure 8: Details of a data filter showing row filter expression

  4. To see how the filters are used to grant restricted access to your tables, select Data lake permissions under Permissions in the navigation pane. In the search bar under Data permissions, enter the AWS account ID for your consumer AWS account (Account B) and press Enter. You will see a list of all the grants applied to that AWS account. Scroll to the right to see a column titled Resource that lists the names of the data cell filters you saw in the previous step.
    Figure 9: Grants to the data consumer account for data filters

You can now move on to setting up the consumer AWS account.

Configuring QuickSight in the consumer AWS account (Account B)

Now that you’ve configured everything in the Security Lake delegated administrator account (Account A), you can configure QuickSight in the consumer account (Account B).

To confirm you can access shared tables, follow these steps:

  1. Sign in to your consumer AWS account (also known as Account B).
  2. Follow the same steps as outlined in the second post in this series to accept the AWS Resource Access Manager invitation, create a new database, and create resource links for the aws_account_metadata and amazon_security_lake_table_<region>_sh_findings_1_0 tables that have been shared with your consumer AWS account. Make sure you create resource links for both shared tables. When done, return to this post and continue with step 3.
  3. [Optional] After the resource links have been created, test that you’re able to query the data: select the radio button next to the aws_account_metadata resource link, choose Actions, and then choose View data under Table. This takes you to the Athena query editor, where you can now run queries on the shared tables.
    Figure 10: Selecting View data in Lake Formation to open Athena

    Note: To use Athena for queries you must configure an S3 bucket to store query results. If this is the first time using Athena in your account, you will receive a message saying that you need to configure an S3 bucket. To do this, choose Edit settings in the blue information notice and follow the instructions.

  4. In the Editor configuration, select AwsDataCatalog from the Data source options. The Database should be the database you created in the previous steps, for example security_lake_visualization. After selecting the database, copy the following SQL query, paste it into your Athena query editor, and choose Run. You will only see rows of account information from the OU you previously shared.
    SELECT * FROM "security_lake_visualization"."aws_account_metadata"

  5. Next, to enrich your Security Lake data with the AWS account metadata, you need to create an Athena view that joins the datasets and filters the results to return only findings from the AWS Foundational Security Best Practices standard. You can do this by copying the following query and running it in the Athena query editor.
    CREATE OR REPLACE VIEW "security_hub_fsbps_joined_view" AS
    WITH
      security_hub AS (
       SELECT *
       FROM
         "security_lake_visualization"."amazon_security_lake_table_ap_southeast_2_sh_findings_1_0"
       WHERE (metadata.product.feature.uid LIKE 'aws-foundational-security-best-practices%')
    ) 
    SELECT
      amm.*
    , security_hub.*
    FROM
      (security_hub
    INNER JOIN "security_lake_visualization"."aws_account_metadata" amm ON (security_hub.cloud.account_uid = amm.id))

The SQL above uses a subquery (a common table expression) to find only those findings in the Security Lake table that are from the AWS FSBP standard and then joins those rows with the aws_account_metadata table based on the AWS account ID. You can see that it has created a new view, listed under Views, containing enriched security data that you can import as a dataset in QuickSight.

Figure 11: Additional view added to the security_lake_visualization database

Configuring QuickSight

To perform the initial steps to set up QuickSight in the consumer AWS account, you can follow the steps listed in the second post in this series. You must also provide the following grants to your QuickSight user:

  • Type: GRANT; Resource: security_hub_fsbps_joined_view; Permissions: SELECT
  • Type: GRANT; Resource: aws_metadata_db (resource link); Permissions: DESCRIBE
  • Type: GRANT; Resource: amazon_security_lake_table_<region>_sh_findings_1_0 (resource link); Permissions: DESCRIBE
  • Type: GRANT ON TARGET; Resource: aws_metadata_db (resource link); Permissions: SELECT
  • Type: GRANT ON TARGET; Resource: amazon_security_lake_table_<region>_sh_findings_1_0 (resource link); Permissions: SELECT
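If you prefer to script these grants rather than create them in the Lake Formation console, the following boto3 sketch shows the general shape of a grant to a QuickSight principal. It assumes Lake Formation accepts your QuickSight user ARN as a principal (as used elsewhere in this series); the ARN, database, and table names are placeholders for your own values.

import boto3

lf = boto3.client("lakeformation")

# Placeholder QuickSight user ARN for the owner of the datasets
quicksight_user = (
    "arn:aws:quicksight:ap-southeast-2:444455556666:user/default/quicksight-admin"
)

# SELECT on the enriched view created earlier
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": quicksight_user},
    Resource={
        "Table": {
            "DatabaseName": "security_lake_visualization",
            "Name": "security_hub_fsbps_joined_view",
        }
    },
    Permissions=["SELECT"],
)

# DESCRIBE on a resource link; repeat for each resource link, and grant SELECT on the
# shared target tables (the GRANT ON TARGET rows above) in the same way.
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": quicksight_user},
    Resource={
        "Table": {
            "DatabaseName": "security_lake_visualization",
            "Name": "aws_account_metadata",
        }
    },
    Permissions=["DESCRIBE"],
)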

To create a new dataset in QuickSight, follow these steps:

  1. After your QuickSight user has the necessary permissions, open the QuickSight console and verify that you’re in the same Region where Lake Formation is sharing the data.
  2. Add your data by choosing Datasets from the navigation pane and then selecting New dataset. To create a new dataset from new data sources, select Athena.
  3. Enter a data source name, for example security_lake_visualization, and leave the Athena workgroup as [ primary ]. Then choose Create data source.
  4. The next step is to select the tables to build your dashboards. On the Choose your table prompt, for Catalog, select AwsDataCatalog. For Database, select the database you created in the previous steps, for example security_lake_visualization. For Table, select the security_hub_fsbps_joined_view you created previously and choose Edit/Preview data.
    Figure 12: Choosing the joined dataset in QuickSight

  5. You will be taken to a screen where you can preview the data in your dataset.
    Figure 13: Previewing data in QuickSight

  6. After you confirm you’re able to preview the data from the view, select the SPICE radio button in the bottom left of the screen and then choose PUBLISH & VISUALIZE.
  7. You can now create analyses and dashboards from Security Hub AWS FSBP standard findings per OU and filter data based on business dimensions available to you through OU structure and account tags.
    Figure 14: QuickSight dashboard showing only ExampleCorpWest OU data and incorporating business dimensions

Clean up the resources

To clean up the resources that you created for this example:

  1. Sign in to the Security Lake delegated admin account and delete the CloudFormation stack by either:
    • Using the CloudFormation console to delete the stack, or
    • Using the AWS CDK to run cdk destroy in your terminal. Follow the instructions and enter y when prompted to delete the stack.
  2. Remove any data filters you created by navigating to Data filters within Lake Formation, selecting each one, and choosing Delete.

Conclusion

In this final post of the series on visualizing Security Lake data with QuickSight, we introduced you to a tool, available from AWS Samples, that extracts OU structure and account metadata from your organization so that you can securely share Security Lake data on a per-OU basis across your organization. You learned how to enrich Security Lake data with account metadata and use it to create row-level security controls in Lake Formation. You were then able to address a common example of distributing dashboards of security posture, measured by the AWS Foundational Security Best Practices standard as part of AWS Security Hub, on a per-OU basis.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Chris Lamont-Smith

Chris Lamont-Smith

Chris is a Senior Security Consultant working in the Security, Risk and Compliance team for AWS ProServe based out of Perth, Australia. He enjoys working in the area where security and data analytics intersect, and is passionate about helping customers gain actionable insights from their security data. When Chris isn’t working, he is out camping or off-roading with his family in the Australian bush.

Aggregating, searching, and visualizing log data from distributed sources with Amazon Athena and Amazon QuickSight

Post Syndicated from Pratima Singh original https://aws.amazon.com/blogs/security/aggregating-searching-and-visualizing-log-data-from-distributed-sources-with-amazon-athena-and-amazon-quicksight/

Customers using Amazon Web Services (AWS) can use a range of native and third-party tools to build workloads based on their specific use cases. Logs and metrics are foundational components in building effective insights into the health of your IT environment. In a distributed and agile AWS environment, customers need a centralized and holistic solution to visualize the health and security posture of their infrastructure.

You can effectively categorize the members of the teams involved using the following roles:

  1. Executive stakeholder: Owns and operates the workload with their support staff and has total financial and risk accountability.
  2. Data custodian: Aggregates related data sources while managing cost, access, and compliance.
  3. Operator or analyst: Uses security tooling to monitor, assess, and respond to related events such as service disruptions.

In this blog post, we focus on the data custodian role. We show you how you can visualize metrics and logs centrally with Amazon QuickSight irrespective of the service or tool generating them. We use Amazon Simple Storage Service (Amazon S3) for storage, AWS Glue for cataloguing, and Amazon Athena for querying the data and creating structured query language (SQL) views for QuickSight to consume.

Target architecture

This post guides you towards building a target architecture in line with the AWS Well-Architected Framework. The tiered and multi-account target architecture, shown in Figure 1, uses account-level isolation to separate responsibilities across the various roles identified above and makes access management more defined and specific to those roles. The workload accounts generate the telemetry around the applications and infrastructure. The data custodian account is where the data lake is deployed and collects the telemetry. The operator account is where the queries and visualizations are created.

Throughout the post, I mention AWS services that reduce the operational overhead in one or more stages of the architecture.

Figure 1: Data visualization architecture

Ingestion

Irrespective of the technology choices, applications and infrastructure configurations should generate metrics and logs that report on resource health and security. The format of the logs depends on which tool and which part of the stack is generating the logs. For example, the format of log data generated by application code can capture bespoke and additional metadata deemed useful from a workload perspective as compared to access logs generated by proxies or load balancers. For more information on types of logs and effective logging strategies, see Logging strategies for security incident response.

Amazon S3 is a scalable, highly available, durable, and secure object storage service that you will use as the storage layer. To build a solution that captures events agnostic of the source, you must forward data as a stream to the S3 bucket. Based on the architecture, there are multiple tools you can use to capture and stream data into S3 buckets. Some tools support integration with S3 and directly stream data to S3. Resources like servers and virtual machines need forwarding agents such as Amazon Kinesis Agent, Amazon CloudWatch agent, or Fluent Bit.

Amazon Kinesis Data Streams provides a scalable data streaming environment. Using on-demand capacity mode eliminates the need for capacity provisioning and capacity management for streaming workloads. For log data and metric collection, you should use on-demand capacity mode, because log data generation can be unpredictable depending on the requests that are being handled by the environment. Amazon Kinesis Data Firehose can convert the format of your input data from JSON to Apache Parquet before storing the data in Amazon S3. Parquet is naturally compressed, and using Parquet native partitioning and compression allows for faster queries compared to JSON formatted objects.
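As a small illustration of the ingestion layer, the following sketch creates an on-demand Kinesis data stream that forwarding agents could write to; the stream name is a placeholder, and a Firehose delivery stream (not shown here) would typically sit between the stream and Amazon S3 to handle the Parquet conversion.

import boto3

kinesis = boto3.client("kinesis")

# On-demand capacity mode: no shard provisioning or capacity management required
kinesis.create_stream(
    StreamName="security-telemetry-stream",  # placeholder name
    StreamModeDetails={"StreamMode": "ON_DEMAND"},
)

# Wait until the stream is active before agents start writing to it
waiter = kinesis.get_waiter("stream_exists")
waiter.wait(StreamName="security-telemetry-stream")
print("Stream is active")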

Scalable data lake

Use AWS Lake Formation to build, secure, and manage the data lake that stores log and metric data in S3 buckets. We recommend using tag-based access control and named resources to share the data in your data store across accounts to build visualizations. Data custodians should configure access to the relevant datasets for the operators, who can use Athena to perform complex queries and build compelling data visualizations with QuickSight, as shown in Figure 2. For cross-account permissions, see Use Amazon Athena and Amazon QuickSight in a cross-account environment. You can also use Amazon DataZone to build additional governance and share data at scale within your organization. Note that the data lake is distinct from the Log Archive bucket and account described in Organizing Your AWS Environment Using Multiple Accounts.
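For example, a tag-based grant could be scripted roughly as follows; the LF-Tag key and values, account ID, and permissions are illustrative assumptions rather than a prescribed configuration.

import boto3

lf = boto3.client("lakeformation")

# Placeholder: an LF-Tag that data custodians attach to shareable log and metric tables
lf.create_lf_tag(TagKey="data-domain", TagValues=["observability"])

# Share all tables carrying the tag with the operator account, with grant option
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "444455556666"},  # operator account
    Resource={
        "LFTagPolicy": {
            "ResourceType": "TABLE",
            "Expression": [{"TagKey": "data-domain", "TagValues": ["observability"]}],
        }
    },
    Permissions=["SELECT", "DESCRIBE"],
    PermissionsWithGrantOption=["SELECT", "DESCRIBE"],
)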

Figure 2: Account structure

Amazon Security Lake

Amazon Security Lake is a fully managed security data lake service. You can use Security Lake to automatically centralize security data from AWS environments, SaaS providers, on-premises, and third-party sources into a purpose-built data lake that’s stored in your AWS account. Using Security Lake reduces the operational effort involved in building a scalable data lake, as the service automates the configuration and orchestration for the data lake with Lake Formation. Security Lake automatically transforms logs into a standard schema—the Open Cybersecurity Schema Framework (OCSF) — and parses them into a standard directory structure, which allows for faster queries. For more information, see How to visualize Amazon Security Lake findings with Amazon QuickSight.

Querying and visualization

Figure 3: Data sharing overview

After you’ve configured cross-account permissions, you can use Athena as the data source to create a dataset in QuickSight, as shown in Figure 3. You start by signing up for a QuickSight subscription. There are multiple ways to sign in to QuickSight; this post uses AWS Identity and Access Management (IAM) for access. To use QuickSight with Athena and Lake Formation, you first must authorize connections through Lake Formation. After permissions are in place, you can add datasets. You should verify that you’re using QuickSight in the same AWS Region as the Region where Lake Formation is sharing the data. You can do this by checking the Region in the QuickSight URL.

You can start with basic queries and visualizations as described in Query logs in S3 with Athena and Create a QuickSight visualization. Depending on the nature and origin of the logs and metrics that you want to query, you can use the examples published in Running SQL queries using Amazon Athena. To build custom analytics, you can create views with Athena. Views in Athena are logical tables that you can use to query a subset of data. Views help you to hide complexity and minimize maintenance when querying large tables. Use views as a source for new datasets to build specific health analytics and dashboards.
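As an example of creating a view programmatically, the following sketch runs a CREATE VIEW statement through the Athena API; the database, table, columns, and S3 output location are placeholders for your own log tables.

import time
import boto3

athena = boto3.client("athena")

# Placeholder view over a hypothetical access-log table
create_view_sql = """
CREATE OR REPLACE VIEW web_errors_view AS
SELECT request_time, client_ip, status_code, request_url
FROM logs_db.alb_access_logs
WHERE status_code >= 500
"""

query_id = athena.start_query_execution(
    QueryString=create_view_sql,
    QueryExecutionContext={"Database": "logs_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/queries/"},
)["QueryExecutionId"]

# Poll until the statement finishes
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        print("Query", state)
        break
    time.sleep(2)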

You can also use Amazon QuickSight Q to get started on your analytics journey. Powered by machine learning, Q uses natural language processing to provide insights into the datasets. After the dataset is configured, you can use Q to give you suggestions for questions to ask about the data. Q understands business language and generates results based on relevant phrases detected in the questions. For more information, see Working with Amazon QuickSight Q topics.

Conclusion

Logs and metrics offer insights into the health of your applications and infrastructure. It’s essential to build visibility into the health of your IT environment so that you can understand what good health looks like and identify outliers in your data. These outliers can be used to identify thresholds and feed into your incident response workflow to help identify security issues. This post helps you build out a scalable centralized visualization environment irrespective of the source of log and metric data.

This post is part 1 of a series that helps you dive deeper into the security analytics use case. In part 2, How to visualize Amazon Security Lake findings with Amazon QuickSight, you will learn how you can use Security Lake to reduce the operational overhead involved in building a scalable data lake and centralizing log data from SaaS providers, on-premises, AWS, and third-party sources into a purpose-built data lake. You will also learn how you can integrate Athena with Security Lake and create visualizations with QuickSight of the data and events captured by Security Lake.

Part 3, How to share security telemetry per Organizational Unit using Amazon Security Lake and AWS Lake Formation, dives deeper into how you can query security posture using AWS Security Hub findings integrated with Security Lake. You will also use the capabilities of Athena and QuickSight to visualize security posture in a distributed environment.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Pratima Singh

Pratima Singh

Pratima is a Security Specialist Solutions Architect with Amazon Web Services based out of Sydney, Australia. She is a security enthusiast who enjoys helping customers find innovative solutions to complex business challenges. Outside of work, Pratima enjoys going on long drives and spending time with her family at the beach.