Implement tenant isolation for Amazon S3 and Aurora PostgreSQL by using ABAC

2021-07-13 Ashutosh Upadhyay

Post Syndicated from Ashutosh Upadhyay original https://aws.amazon.com/blogs/security/implement-tenant-isolation-for-amazon-s3-and-aurora-postgresql-by-using-abac/

In software as a service (SaaS) systems, which are designed to be used by multiple customers, isolating tenant data is a fundamental responsibility for SaaS providers. The practice of isolation of data in a multi-tenant application platform is called tenant isolation. In this post, we describe an approach you can use to achieve tenant isolation in Amazon Simple Storage Service (Amazon S3) and Amazon Aurora PostgreSQL-Compatible Edition databases by implementing attribute-based access control (ABAC). You can also adapt the same approach to achieve tenant isolation in other AWS services.

ABAC in Amazon Web Services (AWS), which uses tags to store attributes, offers advantages over the traditional role-based access control (RBAC) model. You can use fewer permissions policies, update your access control more efficiently as you grow, and last but not least, apply granular permissions for various AWS services. These granular permissions help you to implement an effective and coherent tenant isolation strategy for your customers and clients. Using the ABAC model helps you scale your permissions and simplify the management of granular policies. The ABAC model reduces the time and effort it takes to maintain policies that allow access to only the required resources.

The solution we present here uses the pool model of data partitioning. The pool model helps you avoid the higher costs of duplicated resources for each tenant and the specialized infrastructure code required to set up and maintain those copies.

Solution overview

In a typical customer environment where this solution is implemented, the tenant request for access might land at Amazon API Gateway, together with the tenant identifier, and API Gateway in turn calls an AWS Lambda function. The Lambda function operates with a basic Lambda execution role that also has permissions to assume the tenant roles. As the request progresses, the Lambda function assumes the tenant role and makes the necessary calls to Amazon S3 or to an Aurora PostgreSQL-Compatible database. This solution helps you to achieve tenant isolation for objects stored in Amazon S3 and data elements stored in an Aurora PostgreSQL-Compatible database cluster.

Figure 1 shows the tenant isolation architecture for both Amazon S3 and Amazon Aurora PostgreSQL-Compatible databases.

Figure 1: Tenant isolation architecture diagram

As shown in the numbered diagram steps, the workflow for Amazon S3 tenant isolation is as follows:

AWS Lambda sends an AWS Security Token Service (AWS STS) assume role request to AWS Identity and Access Management (IAM).
IAM validates the request and returns the tenant role.
Lambda sends a request to Amazon S3 with the assumed role.
Amazon S3 sends the response back to Lambda.

The diagram also shows the workflow steps for tenant isolation for Aurora PostgreSQL-Compatible databases, as follows:

Lambda sends an STS assume role request to IAM.
IAM validates the request and returns the tenant role.
Lambda sends a request to IAM for database authorization.
IAM validates the request and returns the database password token.
Lambda sends a request to the Aurora PostgreSQL-Compatible database with the database user and password token.
Aurora PostgreSQL-Compatible database returns the response to Lambda.

Prerequisites

For this walkthrough, you should have the following prerequisites:

An AWS account for your workload.
An Amazon S3 bucket.
An Aurora PostgreSQL-Compatible cluster with a database created.

Note: Make sure to note down the default master database user and password, and make sure that you can connect to the database from your desktop or from another server (for example, from Amazon Elastic Compute Cloud (Amazon EC2) instances).
A security group and inbound rules that are set up to allow an inbound PostgreSQL TCP connection (Port 5432) from Lambda functions. This solution uses regular non-VPC Lambda functions, and therefore the security group of the Aurora PostgreSQL-Compatible database cluster should allow an inbound PostgreSQL TCP connection (Port 5432) from anywhere (0.0.0.0/0).

Make sure that you’ve completed the prerequisites before proceeding with the next steps.

Deploy the solution

The following sections describe how to create the IAM roles, IAM policies, and Lambda functions that are required for the solution. These steps also include guidelines on the changes that you’ll need to make to the prerequisite components Amazon S3 and the Aurora PostgreSQL-Compatible database cluster.

Step 1: Create the IAM policies

In this step, you create two IAM policies with the required permissions for Amazon S3 and the Aurora PostgreSQL database.

To create the IAM policies

Open the AWS Management Console.
Choose IAM, choose Policies, and then choose Create policy.

Use the following JSON policy document to create the policy. Replace the placeholder bucket name sts-ti-demo-<111122223333> with the name of the S3 bucket in your account.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "arn:aws:s3:::sts-ti-demo-<111122223333>/${aws:PrincipalTag/s3_home}/*"
        }
    ]
}

Save the policy with the name sts-ti-demo-s3-access-policy.

Figure 2: Create the IAM policy for Amazon S3 (sts-ti-demo-s3-access-policy)
Open the AWS Management Console.
Choose IAM, choose Policies, and then choose Create policy.
Use the following JSON policy document to create a second policy. This policy grants an IAM role permission to connect to an Aurora PostgreSQL-Compatible database through a database user that is IAM authenticated. Replace the placeholders with the appropriate Region, account number, and cluster resource ID of the Aurora PostgreSQL-Compatible database cluster, respectively.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "rds-db:connect"
            ],
            "Resource": [
                "arn:aws:rds-db:<us-west-2>:<111122223333>:dbuser:<cluster-ZTISAAAABBBBCCCCDDDDEEEEL4>/${aws:PrincipalTag/dbuser}"
            ]
        }
    ]
}

Save the policy with the name sts-ti-demo-dbuser-policy.

Figure 3: Create the IAM policy for Aurora PostgreSQL database (sts-ti-demo-dbuser-policy)

Note: Make sure that you use the cluster resource ID for the clustered database. However, if you intend to adapt this solution for your Aurora PostgreSQL-Compatible non-clustered database, you should use the instance resource ID instead.
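
To make the role of the ${aws:PrincipalTag/...} policy variable more concrete, the following is a minimal sketch that imitates how the two Resource values above resolve for a tagged role. It is a toy model and not IAM's policy evaluation engine: the helper names resolve_resource and statement_matches are hypothetical, the account number and cluster resource ID are the sample placeholders from this post, and wildcard matching is approximated with Python's fnmatch.

```python
import re
from fnmatch import fnmatchcase

# Matches the ${aws:PrincipalTag/<key>} policy variable used in both policies.
_PRINCIPAL_TAG = re.compile(r"\$\{aws:PrincipalTag/([^}]+)\}")


def resolve_resource(resource_pattern, principal_tags):
    """Substitute each ${aws:PrincipalTag/<key>} with the role's tag value.

    Returns None when a referenced tag is missing, so that the statement
    cannot match (a simplification made for this sketch).
    """
    try:
        return _PRINCIPAL_TAG.sub(lambda m: principal_tags[m.group(1)], resource_pattern)
    except KeyError:
        return None


def statement_matches(resource_pattern, principal_tags, requested_arn):
    """Return True if the requested ARN matches the resolved resource pattern."""
    resolved = resolve_resource(resource_pattern, principal_tags)
    return resolved is not None and fnmatchcase(requested_arn, resolved)


# Sample values only: the account number and cluster ID are placeholders.
S3_RESOURCE = "arn:aws:s3:::sts-ti-demo-111122223333/${aws:PrincipalTag/s3_home}/*"
DB_RESOURCE = ("arn:aws:rds-db:us-west-2:111122223333:dbuser:"
               "cluster-ZTISAAAABBBBCCCCDDDDEEEEL4/${aws:PrincipalTag/dbuser}")

tenant1_tags = {"s3_home": "tenant1_home", "dbuser": "tenant1_dbuser"}

own_object = "arn:aws:s3:::sts-ti-demo-111122223333/tenant1_home/tenant.info-tenant1_home"
other_object = "arn:aws:s3:::sts-ti-demo-111122223333/tenant2_home/tenant.info-tenant2_home"

print(statement_matches(S3_RESOURCE, tenant1_tags, own_object))
print(statement_matches(S3_RESOURCE, tenant1_tags, other_object))
print(resolve_resource(DB_RESOURCE, tenant1_tags))
```

Because the tag value is substituted into the ARN, a single policy document can serve every tenant role; only the tags on the role differ.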

Step 2: Create the IAM roles

In this step, you create two IAM roles for the two different tenants, and also apply the necessary permissions and tags.

To create the IAM roles

In the IAM console, choose Roles, and then choose Create role.
On the Trusted entities page, choose the EC2 service as the trusted entity.
On the Permissions policies page, select sts-ti-demo-s3-access-policy and sts-ti-demo-dbuser-policy.
On the Tags page, add two tags with the following keys and values.

Tag key	Tag value
s3_home	tenant1_home
dbuser	tenant1_dbuser
On the Review screen, name the role assumeRole-tenant1, and then choose Save.
In the IAM console, choose Roles, and then choose Create role.
On the Trusted entities page, choose the EC2 service as the trusted entity.
On the Permissions policies page, select sts-ti-demo-s3-access-policy and sts-ti-demo-dbuser-policy.
On the Tags page, add two tags with the following keys and values.

Tag key	Tag value
s3_home	tenant2_home
dbuser	tenant2_dbuser
On the Review screen, name the role assumeRole-tenant2, and then choose Save.
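
If you prefer to script these console steps, the following is a minimal sketch of the same role setup. It assumes an IAM client object (for example, one created with boto3.client("iam")) is passed in, and it uses the create_role and attach_role_policy calls with the role names, tags, and policy names from this post. The function name create_tenant_roles and the TENANT_TAGS structure are hypothetical, and the initial trust policy mirrors the EC2 trusted entity chosen in the console.

```python
import json

# Tag values from the tables above; policy names from Step 1.
TENANT_TAGS = {
    "tenant1": {"s3_home": "tenant1_home", "dbuser": "tenant1_dbuser"},
    "tenant2": {"s3_home": "tenant2_home", "dbuser": "tenant2_dbuser"},
}
POLICY_NAMES = ["sts-ti-demo-s3-access-policy", "sts-ti-demo-dbuser-policy"]

# Initial trust policy with the EC2 service as the trusted entity.
EC2_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}


def create_tenant_roles(iam_client, account_id):
    """Create one tagged role per tenant and attach both Step 1 policies."""
    created = []
    for tenant, tags in TENANT_TAGS.items():
        role_name = "assumeRole-" + tenant
        iam_client.create_role(
            RoleName=role_name,
            AssumeRolePolicyDocument=json.dumps(EC2_TRUST_POLICY),
            Tags=[{"Key": key, "Value": value} for key, value in tags.items()],
        )
        for policy_name in POLICY_NAMES:
            iam_client.attach_role_policy(
                RoleName=role_name,
                PolicyArn="arn:aws:iam::{}:policy/{}".format(account_id, policy_name),
            )
        created.append(role_name)
    return created
```

Passing the client in as a parameter keeps the sketch easy to exercise with a stand-in object before you run it against a real account.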

Step 3: Create and apply the IAM policies for the tenants

In this step, you create a policy and a role for the Lambda functions. You then establish a trust relationship between the two tenant roles that you created in Step 2 and the role that you created for the Lambda functions.

To create and apply the IAM policies for tenant1

In the IAM console, choose Policies, and then choose Create policy.

Use the following JSON policy document to create the policy. Replace the placeholder <111122223333> with your AWS account number.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": [
                "arn:aws:iam::<111122223333>:role/assumeRole-tenant1",
                "arn:aws:iam::<111122223333>:role/assumeRole-tenant2"
            ]
        }
    ]
}

Save the policy with the name sts-ti-demo-assumerole-policy.
In the IAM console, choose Roles, and then choose Create role.
On the Trusted entities page, select the Lambda service as the trusted entity.
On the Permissions policies page, select sts-ti-demo-assumerole-policy and AWSLambdaBasicExecutionRole.
On the review screen, name the role sts-ti-demo-lambda-role, and then choose Save.
In the IAM console, go to Roles, and enter assumeRole-tenant1 in the search box.
Select the assumeRole-tenant1 role and go to the Trust relationships tab.

Choose Edit the trust relationship, and replace the existing value with the following JSON document. Replace the placeholder <111122223333> with your AWS account number, and choose Update trust policy to save the policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<111122223333>:role/sts-ti-demo-lambda-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
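
As the number of tenants grows, you might generate these two documents instead of editing them by hand. The following is a minimal sketch under that assumption; the function names build_assume_role_policy and build_trust_policy are hypothetical, while the role naming scheme (assumeRole-<tenant>) and the Lambda role name come from this post.

```python
import json


def build_assume_role_policy(account_id, tenants):
    """Permissions policy that lets the Lambda role assume each tenant role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": [
                "arn:aws:iam::{}:role/assumeRole-{}".format(account_id, tenant)
                for tenant in tenants
            ],
        }],
    }


def build_trust_policy(account_id, lambda_role_name="sts-ti-demo-lambda-role"):
    """Trust policy that allows only the Lambda execution role to assume a tenant role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::{}:role/{}".format(account_id, lambda_role_name)
            },
            "Action": "sts:AssumeRole",
        }],
    }


# 111122223333 is the sample account number used throughout this post.
print(json.dumps(build_assume_role_policy("111122223333", ["tenant1", "tenant2"]), indent=2))
print(json.dumps(build_trust_policy("111122223333"), indent=2))
```

The two documents work as a pair: the permissions policy allows the Lambda role to request the tenant role, and the trust policy on the tenant role accepts that request.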

To verify that the policies are applied correctly for tenant1

In the IAM console, go to Roles, and enter assumeRole-tenant1 in the search box. Select the assumeRole-tenant1 role and on the Permissions tab, verify that sts-ti-demo-dbuser-policy and sts-ti-demo-s3-access-policy appear in the list of policies, as shown in Figure 4.

Figure 4: The assumeRole-tenant1 Permissions tab

On the Trust relationships tab, verify that sts-ti-demo-lambda-role appears under Trusted entities, as shown in Figure 5.

Figure 5: The assumeRole-tenant1 Trust relationships tab

On the Tags tab, verify that the following tags appear, as shown in Figure 6.

Tag key	Tag value
dbuser	tenant1_dbuser
s3_home	tenant1_home

Figure 6: The assumeRole-tenant1 Tags tab

To create and apply the IAM policies for tenant2

In the IAM console, go to Roles, and enter assumeRole-tenant2 in the search box.
Select the assumeRole-tenant2 role and go to the Trust relationships tab.

Edit the trust relationship, replacing the existing value with the following JSON document. Replace the placeholder <111122223333> with your AWS account number.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<111122223333>:role/sts-ti-demo-lambda-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Choose Update trust policy to save the policy.

To verify that the policies are applied correctly for tenant2

In the IAM console, go to Roles, and enter assumeRole-tenant2 in the search box. Select the assumeRole-tenant2 role and on the Permissions tab, verify that sts-ti-demo-dbuser-policy and sts-ti-demo-s3-access-policy appear in the list of policies, as you did for tenant1. On the Trust relationships tab, verify that sts-ti-demo-lambda-role appears under Trusted entities.

On the Tags tab, verify that the following tags appear, as shown in Figure 7.

Tag key	Tag value
dbuser	tenant2_dbuser
s3_home	tenant2_home

Figure 7: The assumeRole-tenant2 Tags tab

Step 4: Set up an Amazon S3 bucket

Next, you’ll set up an S3 bucket that you’ll use as part of this solution. You can either create a new S3 bucket or re-purpose an existing one. The following steps show you how to create two user homes (that is, S3 prefixes, which are also known as folders) in the S3 bucket.

In the AWS Management Console, go to Amazon S3 and select the S3 bucket you want to use.
Create two prefixes (folders) with the names tenant1_home and tenant2_home.
Place two test objects with the names tenant.info-tenant1_home and tenant.info-tenant2_home in the prefixes that you just created, respectively.

Step 5: Set up test objects in Aurora PostgreSQL-Compatible database

In this step, you create a table in Aurora PostgreSQL-Compatible Edition, insert tenant metadata, create a row level security (RLS) policy, create tenant users, and grant permission for testing purposes.

To set up Aurora PostgreSQL-Compatible

Connect to Aurora PostgreSQL-Compatible through a client of your choice, using the master database user and password that you obtained at the time of cluster creation.

Run the following commands to create a table for testing purposes and to insert a couple of testing records.

CREATE TABLE tenant_metadata (
    tenant_id VARCHAR(30) PRIMARY KEY,
    email     VARCHAR(50) UNIQUE,
    status    VARCHAR(10) CHECK (status IN ('active', 'suspended', 'disabled')),
    tier      VARCHAR(10) CHECK (tier IN ('gold', 'silver', 'bronze')));

-- Replace the placeholder email addresses with your own values.
INSERT INTO tenant_metadata (tenant_id, email, status, tier)
VALUES ('tenant1_dbuser','<tenant1-email-address>','active','gold');
INSERT INTO tenant_metadata (tenant_id, email, status, tier)
VALUES ('tenant2_dbuser','<tenant2-email-address>','suspended','silver');
ALTER TABLE tenant_metadata ENABLE ROW LEVEL SECURITY;

Run the following command to query the newly created database table.
```
SELECT * FROM tenant_metadata;
```
Figure 8: The tenant_metadata table content

Run the following command to create the row level security policy.

CREATE POLICY tenant_isolation_policy ON tenant_metadata
USING (tenant_id = current_user);
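
The effect of this policy can be pictured as a filter that PostgreSQL applies to every row on behalf of the connected user. The following toy model is only an illustration of that predicate in plain Python, not how the database implements it; the function name visible_rows is hypothetical, and the email column is left out for brevity.

```python
# Rows mirroring the two records inserted into tenant_metadata
# (email column omitted for brevity).
TENANT_METADATA = [
    {"tenant_id": "tenant1_dbuser", "status": "active", "tier": "gold"},
    {"tenant_id": "tenant2_dbuser", "status": "suspended", "tier": "silver"},
]


def visible_rows(rows, current_user):
    """Apply the USING (tenant_id = current_user) predicate to every row."""
    return [row for row in rows if row["tenant_id"] == current_user]


print(visible_rows(TENANT_METADATA, "tenant1_dbuser"))
print(visible_rows(TENANT_METADATA, "tenant3_dbuser"))
```

Note that in PostgreSQL the table owner is not subject to row level security by default, which is why the earlier SELECT statement run as the master database user returns both rows.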

Run the following commands to establish two tenant users and grant them the necessary permissions.

CREATE USER tenant1_dbuser WITH LOGIN;
CREATE USER tenant2_dbuser WITH LOGIN;
GRANT rds_iam TO tenant1_dbuser;
GRANT rds_iam TO tenant2_dbuser;

GRANT select, insert, update, delete ON tenant_metadata to tenant1_dbuser, tenant2_dbuser;

Run the following commands to verify the newly created tenant users.

SELECT usename AS role_name,
  CASE
     WHEN usesuper AND usecreatedb THEN
       CAST('superuser, create database' AS pg_catalog.text)
     WHEN usesuper THEN
        CAST('superuser' AS pg_catalog.text)
     WHEN usecreatedb THEN
        CAST('create database' AS pg_catalog.text)
     ELSE
        CAST('' AS pg_catalog.text)
  END role_attributes
FROM pg_catalog.pg_user
WHERE usename LIKE ('tenant%')
ORDER BY role_name desc;

Figure 9: Verify the newly created tenant users output

Step 6: Set up the AWS Lambda functions

Next, you’ll create two Lambda functions for Amazon S3 and Aurora PostgreSQL-Compatible. You also need to create a Lambda layer for the Python package PG8000.

To set up the Lambda function for Amazon S3

Navigate to the Lambda console, and choose Create function.
Choose Author from scratch. For Function name, enter sts-ti-demo-s3-lambda.
For Runtime, choose Python 3.7.
Change the default execution role to Use an existing role, and then select sts-ti-demo-lambda-role from the drop-down list.
Keep Advanced settings as the default value, and then choose Create function.

Copy the following Python code into the lambda_function.py file that is created in your Lambda function.

import json
import os
import time

import boto3

def lambda_handler(event, context):
    bucket_name     =   os.environ['s3_bucket_name']

    try:
        login_tenant_id =   event['login_tenant_id']
        data_tenant_id  =   event['s3_tenant_home']
    except (KeyError, TypeError):
        return {
            'statusCode': 400,
            'body': 'Error in reading parameters'
        }

    prefix_of_role  =   'assumeRole'
    file_name       =   'tenant.info' + '-' + data_tenant_id

    # create an STS client object that represents a live connection to the STS service
    sts_client = boto3.client('sts')
    account_of_role = sts_client.get_caller_identity()['Account']
    role_to_assume  =   'arn:aws:iam::' + account_of_role + ':role/' + prefix_of_role + '-' + login_tenant_id

    # Call the assume_role method of the STSConnection object and pass the role
    # ARN and a role session name.
    RoleSessionName = 'AssumeRoleSession' + str(time.time()).split(".")[0] + str(time.time()).split(".")[1]
    try:
        assumed_role_object = sts_client.assume_role(
            RoleArn         = role_to_assume, 
            RoleSessionName = RoleSessionName, 
            DurationSeconds = 900) #15 minutes

    except:
        return {
            'statusCode': 400,
            'body': 'Error in assuming the role ' + role_to_assume + ' in account ' + account_of_role
        }

    # From the response that contains the assumed role, get the temporary 
    # credentials that can be used to make subsequent API calls
    credentials=assumed_role_object['Credentials']
    
    # Use the temporary credentials that AssumeRole returns to make a connection to Amazon S3  
    s3_resource=boto3.resource(
        's3',
        aws_access_key_id=credentials['AccessKeyId'],
        aws_secret_access_key=credentials['SecretAccessKey'],
        aws_session_token=credentials['SessionToken']
    )

    try:
        obj = s3_resource.Object(bucket_name, data_tenant_id + "/" + file_name)
        return {
            'statusCode': 200,
            # Decode the bytes to text so that the response body is JSON serializable
            'body': obj.get()['Body'].read().decode('utf-8')
        }
    except Exception:
        return {
            'statusCode': 400,
            'body': 'error in reading s3://' + bucket_name + '/' + data_tenant_id + '/' + file_name
        }

Under Basic settings, edit Timeout to increase the timeout to 29 seconds.
Edit Environment variables to add a key called s3_bucket_name, with the value set to the name of your S3 bucket.
Configure a new test event with the following JSON document, and save it as testEvent.
```
{
  "login_tenant_id": "tenant1",
  "s3_tenant_home": "tenant1_home"
}
```
Choose Test to test the Lambda function with the newly created test event testEvent. You should see status code 200, and the body of the results should contain the data for tenant1.

Figure 10: The result of running the sts-ti-demo-s3-lambda function
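
The function above builds the role session name from two separate calls to time.time(), so the two halves of the suffix come from slightly different clock readings. As one possible alternative, the following sketch uses a single nanosecond reading and checks the result against the 2 to 64 character set that AWS STS accepts for RoleSessionName. The helper name make_session_name is hypothetical and is not part of the original function.

```python
import re
import time

# STS accepts 2-64 characters from this set for RoleSessionName.
_SESSION_NAME_RE = re.compile(r"^[\w+=,.@-]{2,64}$")


def make_session_name(prefix="AssumeRoleSession"):
    """Build a role session name from one nanosecond timestamp reading."""
    name = prefix + str(time.time_ns())
    if not _SESSION_NAME_RE.match(name):
        raise ValueError("invalid role session name: " + name)
    return name


print(make_session_name())
```

The session name appears in AWS CloudTrail entries for calls made with the assumed role, so a prefix that includes the tenant identifier can make later troubleshooting easier.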

Next, create another Lambda function for Aurora PostgreSQL-Compatible. To do this, you first need to create a new Lambda layer.

To set up the Lambda layer

Use the following commands to create a .zip file for Python package pg8000.

Note: This example was created on an Amazon EC2 instance running the Amazon Linux 2 Amazon Machine Image (AMI). If you don’t have the Python 3 or pip3 packages installed, the first two of the following commands install them. If you’re using another version of Linux, use the equivalent commands for its package manager.
```
sudo yum update -y 
sudo yum install python3 
# Install into python/ so that every compatible Python runtime can find the package
sudo pip3 install pg8000 -t build/python/
cd build 
sudo zip -r pg8000.zip python/
```
Download the pg8000.zip file you just created to your local desktop machine or into an S3 bucket location.
Navigate to the Lambda console, choose Layers, and then choose Create layer.
For Name, enter pgdb, and then upload pg8000.zip from your local desktop machine or from the S3 bucket location.

Note: For more details, see the AWS documentation for creating and sharing Lambda layers.
For Compatible runtimes, choose python3.6, python3.7, and python3.8, and then choose Create.
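
If the zip command isn't available on your build machine, the archive can also be produced from Python. The following is a minimal sketch that assumes pg8000 has already been installed somewhere under a python/ folder inside the build directory; the function name build_layer_zip is hypothetical.

```python
import os
import shutil


def build_layer_zip(build_dir, archive_name="pg8000"):
    """Zip the python/ folder inside build_dir into <archive_name>.zip.

    Lambda extracts a layer under /opt, and Python runtimes look for
    packages in the top-level python/ folder of the archive.
    """
    build_dir = os.path.abspath(build_dir)
    if not os.path.isdir(os.path.join(build_dir, "python")):
        raise FileNotFoundError("expected a python/ folder in " + build_dir)
    # base_dir keeps the python/ prefix on every path inside the archive.
    return shutil.make_archive(
        os.path.join(build_dir, archive_name),
        "zip",
        root_dir=build_dir,
        base_dir="python",
    )
```

Calling build_layer_zip("build") returns the path of the resulting pg8000.zip file, which you can then upload as described in the preceding steps.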

To set up the Lambda function with the newly created Lambda layer

In the Lambda console, choose Function, and then choose Create function.
Choose Author from scratch. For Function name, enter sts-ti-demo-pgdb-lambda.
For Runtime, choose Python 3.7.
Change the default execution role to Use an existing role, and then select sts-ti-demo-lambda-role from the drop-down list.
Keep Advanced settings as the default value, and then choose Create function.
Choose Layers, and then choose Add a layer.
Choose Custom layer, select pgdb with Version 1 from the drop-down list, and then choose Add.

Copy the following Python code into the lambda_function.py file that was created in your Lambda function.

import boto3
import pg8000
import os
import time
import ssl

connection = None
assumed_role_object = None
rds_client = None

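def assume_role(event):
    global assumed_role_object
    # Defined up front so that the except block can log it even if an early step fails
    role_to_assume = None
    try: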
        RolePrefix  = os.environ.get("RolePrefix")
        LoginTenant = event['login_tenant_id']
    
        # create an STS client object that represents a live connection to the STS service
        sts_client      = boto3.client('sts')
        # Prepare input parameters
        role_to_assume  = 'arn:aws:iam::' + sts_client.get_caller_identity()['Account'] + ':role/' + RolePrefix + '-' + LoginTenant
        RoleSessionName = 'AssumeRoleSession' + str(time.time()).split(".")[0] + str(time.time()).split(".")[1]
    
        # Call the assume_role method of the STSConnection object and pass the role ARN and a role session name.
        assumed_role_object = sts_client.assume_role(
            RoleArn         =   role_to_assume, 
            RoleSessionName =   RoleSessionName,
            DurationSeconds =   900) #15 minutes 
        
        return assumed_role_object['Credentials']
    except Exception as e:
        print({'Role assumption failed!': {'role': role_to_assume, 'Exception': 'Failed due to :{0}'.format(str(e))}})
        return None

def get_connection(event):
    global rds_client
    creds = assume_role(event)

    try:
        # create an RDS client using assumed credentials
        rds_client = boto3.client('rds',
            aws_access_key_id       = creds['AccessKeyId'],
            aws_secret_access_key   = creds['SecretAccessKey'],
            aws_session_token       = creds['SessionToken'])

        # Read the environment variables and event parameters
        DBEndPoint   = os.environ.get('DBEndPoint')
        DatabaseName = os.environ.get('DatabaseName')
        DBUserName   = event['dbuser']

        # Generates an auth token used to connect to a database with IAM credentials.
        # Set Region to the AWS Region of your database cluster (us-west-2 in this example).
        pwd = rds_client.generate_db_auth_token(
            DBHostname=DBEndPoint, Port=5432, DBUsername=DBUserName, Region='us-west-2'
        )

        ssl_context             = ssl.SSLContext()
        ssl_context.verify_mode = ssl.CERT_REQUIRED
        ssl_context.load_verify_locations('rds-ca-2019-root.pem')

        # create a database connection
        conn = pg8000.connect(
            host        =   DBEndPoint,
            user        =   DBUserName,
            database    =   DatabaseName,
            password    =   pwd,
            ssl_context =   ssl_context)
        
        return conn
    except Exception as e:
        print ({'Database connection failed!': {'Exception': "Failed due to :{0}".format(str(e))}})
        return None

def execute_sql(connection, query):
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        columns = [str(desc[0]) for desc in cursor.description]
        results = []
        for res in cursor:
            results.append(dict(zip(columns, res)))
        cursor.close()
        return results
    except Exception as e:
        print ({'Execute SQL failed!': {'Exception': "Failed due to :{0}".format(str(e))}})
        return None


def lambda_handler(event, context):
    global connection
    try:
        connection = get_connection(event)
        if connection is None:
            return {'statusCode': 400, "body": "Error in database connection!"}

        response = {'statusCode':200, 'body': {
            'db & user': execute_sql(connection, 'SELECT CURRENT_DATABASE(), CURRENT_USER'), \
            'data from tenant_metadata': execute_sql(connection, 'SELECT * FROM tenant_metadata')}}
        return response
    except Exception as e:
        print({'Unhandled error in Lambda handler!': {'Exception': "Failed due to :{0}".format(str(e))}})
        try:
            connection.close()
        except Exception:
            pass
        connection = None
        return {'statusCode': 400, 'statusDesc': 'Error!', 'body': 'Unhandled error in Lambda Handler.'}

Add a certificate file called rds-ca-2019-root.pem into the Lambda project root by downloading it from https://s3.amazonaws.com/rds-downloads/rds-ca-2019-root.pem.
Under Basic settings, edit Timeout to increase the timeout to 29 seconds.
Edit Environment variables to add the following keys and values.

Key	Value
DBEndPoint	Enter the database cluster endpoint URL
DatabaseName	Enter the database name
RolePrefix	assumeRole

Figure 11: Example of environment variables display
Configure a new test event with the following JSON document, and save it as testEvent.
```
{
  "login_tenant_id": "tenant1",
  "dbuser": "tenant1_dbuser"
}
```
Choose Test to test the Lambda function with the newly created test event testEvent. You should see status code 200, and the body of the results should contain the data for tenant1.

Figure 12: The result of running the sts-ti-demo-pgdb-lambda function
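
The execute_sql function in this Lambda function turns each result row into a dictionary keyed by column name, using the cursor's description attribute. The following sketch isolates that technique. It uses Python's built-in sqlite3 module as a stand-in for pg8000 only so that it runs without a database cluster; the helper name rows_as_dicts and the reduced two-column table are assumptions made for the illustration.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Pair each row with the column names from cursor.description."""
    columns = [desc[0] for desc in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


# sqlite3 stands in for pg8000 here only so the sketch runs without a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tenant_metadata (tenant_id TEXT PRIMARY KEY, tier TEXT)")
conn.execute("INSERT INTO tenant_metadata VALUES ('tenant1_dbuser', 'gold')")

cur = conn.cursor()
cur.execute("SELECT tenant_id, tier FROM tenant_metadata")
result = rows_as_dicts(cur)
cur.close()
conn.close()

print(result)
```

Returning dictionaries rather than tuples keeps the Lambda response self-describing, which is convenient when the result is serialized to JSON.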

Step 7: Perform negative testing of tenant isolation

You already performed positive tests of tenant isolation during the Lambda function creation steps. However, it’s also important to perform some negative tests to verify the robustness of the tenant isolation controls.

To perform negative tests of tenant isolation

In the Lambda console, navigate to the sts-ti-demo-s3-lambda function. Update the test event to the following, to mimic a scenario where tenant1 attempts to access other tenants’ objects.
```
{
  "login_tenant_id": "tenant1",
  "s3_tenant_home": "tenant2_home"
}
```
Choose Test to test the Lambda function with the updated test event. You should see status code 400, and the body of the results should contain an error message.

Figure 13: The results of running the sts-ti-demo-s3-lambda function (negative test)
Navigate to the sts-ti-demo-pgdb-lambda function and update the test event to the following, to mimic a scenario where tenant1 attempts to access other tenants’ data elements.
```
{
  "login_tenant_id": "tenant1",
  "dbuser": "tenant2_dbuser"
}
```
Choose Test to test the Lambda function with the updated test event. You should see status code 400, and the body of the results should contain an error message.

Figure 14: The results of running the sts-ti-demo-pgdb-lambda function (negative test)
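
Once the manual tests pass, you might want to repeat them automatically whenever the policies change. The following is a minimal sketch of such a check, using the positive and negative S3 test events from this post. The names S3_CASES and run_isolation_checks are hypothetical, and invoke is assumed to be any callable that takes a test event and returns the function's response as a dictionary.

```python
# Each case: (description, event, expected status code), taken from the
# positive and negative test events in this walkthrough.
S3_CASES = [
    ("tenant1 reads its own prefix",
     {"login_tenant_id": "tenant1", "s3_tenant_home": "tenant1_home"}, 200),
    ("tenant1 reads tenant2's prefix",
     {"login_tenant_id": "tenant1", "s3_tenant_home": "tenant2_home"}, 400),
]


def run_isolation_checks(invoke, cases):
    """Call invoke(event) for each case and collect the cases that fail."""
    failures = []
    for description, event, expected in cases:
        actual = invoke(event).get("statusCode")
        if actual != expected:
            failures.append((description, expected, actual))
    return failures
```

An empty list means that every case behaved as expected. A non-empty list for a negative case is the important signal, because it means that one tenant was able to reach another tenant's data.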

Cleaning up

To de-clutter your environment, remove the roles, policies, Lambda functions, Lambda layers, Amazon S3 prefixes, database users, and the database table that you created as part of this exercise. You can choose to delete the S3 bucket, as well as the Aurora PostgreSQL-Compatible database cluster that we mentioned in the Prerequisites section, to avoid incurring future charges.

Update the security group of the Aurora PostgreSQL-Compatible database cluster to remove the inbound rule that you added to allow a PostgreSQL TCP connection (Port 5432) from anywhere (0.0.0.0/0).

Conclusion

By taking advantage of attribute-based access control (ABAC) in IAM, you can more efficiently implement tenant isolation in SaaS applications. The solution we presented here helps to achieve tenant isolation in Amazon S3 and Aurora PostgreSQL-Compatible databases by using ABAC with the pool model of data partitioning.

If you run into any issues, you can use Amazon CloudWatch and AWS CloudTrail to troubleshoot. If you have feedback about this post, submit comments in the Comments section below.

How to create auto-suppression rules in AWS Security Hub

2021-07-12 BK Das

Post Syndicated from BK Das original https://aws.amazon.com/blogs/security/how-to-create-auto-suppression-rules-in-aws-security-hub/

AWS Security Hub gives you a comprehensive view of your security alerts and security posture across your AWS accounts. With Security Hub, you have a single place that aggregates, organizes, and prioritizes your security alerts, or findings, from multiple AWS services. Security Hub lets you assign workflow statuses to these findings, which are NEW, NOTIFIED, SUPPRESSED, or RESOLVED. These statuses allow you to categorize which findings are open and need your attention.

In this blog post, we show how you can create automated suppression rules for specific types of findings in AWS Security Hub, such as ones that are an accepted risk by design, or have a compensating control. By automatically suppressing these findings that don’t require follow-up action from your security team, you can concentrate on investigating and remediating findings that are not yet resolved.

As an example of a finding that you may want to suppress, suppose that your development environment doesn’t need to have Amazon Virtual Private Cloud (VPC) Flow Logs enabled because it does not contain any sensitive data (that is, it is an accepted risk). However, your production environment must have VPC Flow Logs enabled. You can use this solution to automatically suppress the development environment findings regarding VPC Flow Logs not being enabled. Then, you can focus on responding to and remediating findings regarding the production environment VPC Flow Logs that are not enabled.

This solution uses an Amazon EventBridge rule to evaluate Security Hub findings based on predefined filters. An AWS Lambda function is the target of the rule, and is triggered to perform the suppression. The Lambda function calls the Security Hub BatchUpdateFindings API action to set the finding of interest to the SUPPRESSED status.

Prerequisites

This solution assumes that you have Security Hub and AWS Config enabled in your administrator and member AWS accounts. AWS Config is required to execute the rules that will generate the findings. You will also need to enable the AWS Foundational Security Best Practices standard, because the examples in this post rely on those findings. You should ensure that you have configured your administrator account to aggregate your Security Hub findings from across your AWS accounts.

Solution overview

In Security Hub, the status of an investigation of a finding is tracked using the workflow status attribute. The workflow status for new findings is initially set to NEW. You can change the workflow status of a finding either by selecting it in the AWS Security Hub console, or by automating the change of workflow status by using AWS CLI or Security Hub API. After the owner of the finding’s resource is notified to take action, you can set the workflow status to NOTIFIED. After a finding is remediated, you can set the workflow status to RESOLVED. If the finding is not a concern for your given environment and does not require any action, then you can set the workflow status to SUPPRESSED.

In this solution, we show you how to automatically set the workflow status to SUPPRESSED for expected findings, by using EventBridge event patterns that trigger on Security Hub findings that match your defined criteria. The event pattern can match on fields of the findings such as account number, AWS Region, and Amazon Resource Names (ARNs). The Lambda function triggers on findings that match all defined criteria, and then sets the workflow status to SUPPRESSED for all matched findings using the BatchUpdateFindings Security Hub API action.

Solution architecture

Figure 1: Solution architecture overview

Figure 1 shows the administrator account aggregating the Security Hub findings from the member accounts.

Security Hub generates findings in the member accounts, then forwards the findings to the administrator account to be evaluated.
In the administrator account, Security Hub evaluates every finding (whether generated or forwarded) against EventBridge rules.
If a finding satisfies any of the defined EventBridge rule conditions, EventBridge triggers a Lambda function in the same Region. The EventBridge event bus delivers the finding to the Lambda function.
The Lambda function in the administrator account performs the finding suppression evaluation, and sets the Security Hub workflow status of the finding to SUPPRESSED.

This architecture uses one Lambda function per Region. You can group together multiple suppression rules into the same EventBridge pattern when they apply to the same group of AWS accounts. You can also configure multiple separate EventBridge event patterns when a suppression rule shouldn’t apply to an account.
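
The suppression step that the Lambda function performs can be sketched as follows. This is a minimal illustration, not the code published with this solution: the helper names, the note text, and the UpdatedBy value are assumptions made for this example. The only AWS call it relies on is the Security Hub BatchUpdateFindings action, which identifies each finding by its Id and ProductArn.

```python
"""Sketch of a suppression Lambda handler (illustrative, not the published code)."""


def build_identifiers(event):
    """Collect the Id/ProductArn pair that BatchUpdateFindings needs per finding."""
    findings = event.get("detail", {}).get("findings", [])
    return [{"Id": f["Id"], "ProductArn": f["ProductArn"]} for f in findings]


def suppress_findings(event, securityhub_client):
    """Set the workflow status of every finding in the event to SUPPRESSED."""
    identifiers = build_identifiers(event)
    if not identifiers:
        return None
    return securityhub_client.batch_update_findings(
        FindingIdentifiers=identifiers,
        Workflow={"Status": "SUPPRESSED"},
        # The note text and UpdatedBy value are placeholders chosen for this sketch.
        Note={"Text": "Suppressed by automation rule", "UpdatedBy": "suppression-lambda"},
    )


def lambda_handler(event, context):
    # Imported lazily so the helpers above can be exercised without AWS access.
    import boto3

    return suppress_findings(event, boto3.client("securityhub"))
```

Passing the client in as a parameter keeps the suppression logic testable with a stub, while the handler itself stays a thin wrapper.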

Implementation

First, we show how to write the EventBridge event pattern. You use the CDK to define the event rule and pattern. The following example code will suppress Security Hub findings that originate in the development accounts for VPC flow logs that aren’t enabled. The solution will filter new findings only.

In the following example, replace <account-id-1> and <account-id-2> with your own information.

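# Imports used by the CDK snippets in this section
from aws_cdk import aws_events as events
from aws_cdk import aws_events_targets as lambda_targets

# Match new EC2.6 (VPC flow logs) findings that originate in the listed accounts
event_pattern_obj = events.EventPattern(
    source=["aws.securityhub"],
    detail_type=["Security Hub Findings - Imported"],
    detail={
        "findings": {
            "GeneratorId": [
                "aws-foundational-security-best-practices/v/1.0.0/EC2.6"
            ],
            "AwsAccountId": [
                "<account-id-1>",
                "<account-id-2>"
            ],
            "Workflow": {
                "Status": [
                    "NEW"
                ]
            }
        }
    }
)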

Second, you define the EventBridge rule that will match on the defined pattern.

vpc_flow_log_dev_account_event_rule = events.Rule(
    self,
    'vpc-flow-logs-development-account-eventbridge-rule',
    description='VPC flow logs in development account finding suppression',
    rule_name='vpc-flow-logs-development-account-sechub-rule',
    event_pattern=event_pattern_obj
)

Finally, the EventBridge rule triggers the suppression Lambda function.

vpc_flow_log_dev_account_event_rule.add_target(lambda_targets.LambdaFunction(security_hub_suppression_lambda))
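
To reason about which findings an event pattern like this one selects, it can help to try it against sample findings locally. The following is a deliberately simplified matcher, written for this illustration only: it handles exact-value matching and arrays, and ignores the other operators that EventBridge supports. The account IDs are placeholder values.

```python
def matches(pattern, value):
    """Simplified EventBridge-style matching (exact values only)."""
    if isinstance(value, list):
        # For arrays in the event, one matching element is enough.
        return any(matches(pattern, item) for item in value)
    if isinstance(pattern, dict):
        return isinstance(value, dict) and all(
            key in value and matches(sub, value[key]) for key, sub in pattern.items()
        )
    # A leaf in the pattern is a list of allowed values.
    return value in pattern


# Same shape as the "detail" section of the pattern, with placeholder account IDs.
DETAIL_PATTERN = {
    "findings": {
        "GeneratorId": ["aws-foundational-security-best-practices/v/1.0.0/EC2.6"],
        "AwsAccountId": ["111111111111", "222222222222"],
        "Workflow": {"Status": ["NEW"]},
    }
}

sample_finding = {
    "GeneratorId": "aws-foundational-security-best-practices/v/1.0.0/EC2.6",
    "AwsAccountId": "111111111111",
    "Workflow": {"Status": "NEW"},
}

print(matches(DETAIL_PATTERN, {"findings": [sample_finding]}))
```

A finding from an account that is not in the list, or one whose workflow status is no longer NEW, does not match and is therefore left untouched.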

Solution deployment

You can deploy the solution through either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

To deploy the solution by using the AWS Management Console

In your security account, launch the template by choosing the following Launch Stack button.

To deploy the solution by using the AWS CDK

You can find the latest code on GitHub, where you can also contribute to the sample code. The following commands show how to deploy the solution by using the AWS CDK. First, the CDK initializes your environment and uploads the Lambda assets to Amazon Simple Storage Service (Amazon S3). Then, you can deploy the solution to your account. For <account_ids>, specify the account number, or a comma-separated list of account numbers, that you want the suppression rule to apply to.

cdk bootstrap

cdk deploy sechub-finding-suppression --parameters GeneratorIds=<generator_ids> --parameters AccountNumbers=<account_ids>

To test the solution

Create a VPC that does not have flow logs enabled. We have included a test VPC that you can deploy with the following command:
```
cdk deploy vpc-test-suppression
```
Verify that the Security Hub finding EC2.6 has been suppressed in the parent account and the target account. You might need to wait a few minutes for the AWS Config recorder to detect the newly created resource and then to manually trigger the following AWS Config rule:
```
securityhub-vpc-flow-logs-enabled-* 
```
After verifying the suppression, delete the test VPC you created to test the suppression rule:
```
cdk destroy vpc-test-suppression
```
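
If you prefer to trigger the AWS Config rule from a script rather than the console, the step can be sketched as follows. This is an illustration under assumptions: the function name and the client-injection style are ours, and the rule name suffix differs per account, which is why the sketch looks the rule up by prefix. It uses the AWS Config DescribeConfigRules and StartConfigRulesEvaluation actions.

```python
def start_rule_evaluation(config_client, prefix="securityhub-vpc-flow-logs-enabled-"):
    """Find AWS Config rules whose name starts with prefix and re-evaluate them."""
    names, token = [], None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = config_client.describe_config_rules(**kwargs)
        names += [
            rule["ConfigRuleName"]
            for rule in page.get("ConfigRules", [])
            if rule["ConfigRuleName"].startswith(prefix)
        ]
        token = page.get("NextToken")
        if not token:
            break
    if names:
        # StartConfigRulesEvaluation accepts up to 25 rule names per call.
        config_client.start_config_rules_evaluation(ConfigRuleNames=names[:25])
    return names


if __name__ == "__main__":
    import boto3  # imported here so the function above can be tested with a stub

    print(start_rule_evaluation(boto3.client("config")))
```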

Next steps

You can configure EventBridge rules and patterns to suppress all of your findings that are accepted risk, by design, or that have a compensating control. For example, if you are performing IAM authentication by using Amazon RDS Proxy, you could consider suppressing the control [RDS.10] IAM authentication should be configured for RDS instances. You can also consider creating event patterns that filter based on resource tags, such as filtering VPCs based on tags rather than account numbers for [EC2.6] VPC flow logging should be enabled in all VPCs.
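
As a rough sketch of the tag-based idea, the following helper builds the detail section of an event pattern that filters on resource tags instead of (or in addition to) account numbers. It assumes that the findings you want to match carry the resource's tags in the Resources[].Tags map of the finding; the helper name and the Environment=development tag are hypothetical and chosen only for this example.

```python
def build_detail_pattern(generator_ids, account_ids=None, resource_tags=None):
    """Build the 'detail' part of an EventBridge pattern for new findings."""
    findings = {
        "GeneratorId": list(generator_ids),
        "Workflow": {"Status": ["NEW"]},
    }
    if account_ids:
        findings["AwsAccountId"] = list(account_ids)
    if resource_tags:
        # Each tag key maps to the list of values accepted for that key.
        findings["Resources"] = {
            "Tags": {key: [value] for key, value in resource_tags.items()}
        }
    return {"findings": findings}


# Hypothetical tag used for illustration: suppress EC2.6 for VPCs tagged as development.
tag_based_pattern = build_detail_pattern(
    ["aws-foundational-security-best-practices/v/1.0.0/EC2.6"],
    resource_tags={"Environment": "development"},
)
```

The resulting dictionary can be passed as the detail argument of the event pattern in the same way as the account-based example earlier in this post.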

Summary

In this blog post, we showed how you can automatically suppress specific findings by using the Security Hub BatchUpdateFindings API action. We showed you how to configure EventBridge patterns and rules in order to trigger a Lambda function that calls this API action to suppress your expected findings. After you follow the steps in this blog post for automatic Security Hub suppression, your console view in Security Hub will only show findings that are not suppressed.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Configure SAML single sign-on for Kibana with AD FS on Amazon Elasticsearch Service

2021-07-09 Sajeev Attiyil Bhaskaran

Post Syndicated from Sajeev Attiyil Bhaskaran original https://aws.amazon.com/blogs/security/configure-saml-single-sign-on-for-kibana-with-ad-fs-on-amazon-elasticsearch-service/

It’s a common use case for customers to integrate identity providers (IdPs) with Amazon Elasticsearch Service (Amazon ES) to achieve single sign-on (SSO) with Kibana. This integration makes it possible for users to leverage their existing identity credentials and offers administrators a single source of truth for user and permissions management. In this blog post, we’ll discuss how you can configure Security Assertion Markup Language (SAML) authentication for Kibana by using Amazon ES and Microsoft Active Directory Federation Services (AD FS).

Amazon ES now natively supports SSO authentication that uses the SAML protocol. With SAML authentication for Kibana, users can integrate directly with their existing third-party IdPs, such as Okta, Ping Identity, OneLogin, Auth0, AD FS, AWS Single Sign-on, and Azure Active Directory. SAML authentication for Kibana is powered by Open Distro for Elasticsearch, an Apache 2.0-licensed distribution of Elasticsearch, and is available to all Amazon ES customers who have enabled fine-grained access controls.

When you set up SAML authentication with Kibana, you can configure authentication that uses either service provider (SP)-initiated SSO or IdP-initiated SSO. The SP-initiated SSO flow occurs when a user directly accesses any SAML-configured Kibana endpoint, at which time Amazon ES redirects the user to their IdP for authentication, followed by a redirect back to Amazon ES after successful authentication. An IdP-initiated SSO flow typically occurs when a user chooses a link that first initiates the sign-in flow at the IdP, skipping the redirect between Amazon ES and the IdP. This blog post will focus on the SAML SP-initiated SSO flow.

Prerequisites

To complete this walkthrough, you must have the following:

A virtual private cloud (VPC)-based Amazon ES version 6.7 or later with fine-grained access control enabled.
AD FS installed and configured with at least one user and one group.
A browser with network connectivity to AD FS, Amazon ES, and Kibana.

Solution overview

For the solution presented in this post, you use your existing AD FS as an IdP for the user’s authentication. The SAML federation uses a claim-based authentication model in which user attributes (in this case stored in Active Directory) are passed from the IdP (AD FS) to the SP (Kibana).

Let’s walk through how a user would use the SAML protocol to access Amazon ES Kibana (the SP) while using AD FS as the IdP. In Figure 1, the user authentication request comes from an on-premises network, which is connected to Amazon VPC through a VPN connection (the connection could also be over AWS Direct Connect). The Amazon ES domain and AD FS are created in the same VPC.

Figure 1: A high-level view of a SAML transaction between Amazon ES and AD FS

The initial sign-in flow is as follows:

Open a browser on the on-premises computer and navigate to the Kibana endpoint for your Amazon ES domain in the VPC.
Amazon ES generates a SAML authentication request for the user and redirects it back to the browser.
The browser redirects the SAML authentication request to AD FS.
AD FS parses the SAML request and prompts the user to enter credentials.
1. The user enters credentials, and AD FS authenticates the user with Active Directory.
2. After successful authentication, AD FS generates a SAML response and returns the encoded SAML response to the browser. The SAML response contains the destination (the Assertion Consumer Service (ACS) URL), the authentication response issuer (the AD FS entity ID URL), the digital signature, and the claim (which user is authenticated with AD FS, the user’s NameID, the group, the attribute used in SAML assertions, and so on).
The browser sends the SAML response to the Kibana ACS URL, and then Kibana redirects to Amazon ES.
Amazon ES validates the SAML response. If all the validations pass, you are redirected to the Kibana front page. Authorization is performed by Kibana based on the role mapped to the user. The role mapping is performed based on attributes of the SAML assertion being consumed by Kibana and Amazon ES.
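
When troubleshooting this flow, it is often useful to look inside the SAML authentication request that the browser carries to AD FS. The sketch below assumes the request travels as a SAMLRequest query parameter using the SAML HTTP-Redirect binding, where the XML is compressed with raw DEFLATE, base64-encoded, and then URL-encoded. The function names and the sample request are illustrative only.

```python
import base64
import zlib
from urllib.parse import quote, unquote


def encode_redirect_request(authn_request_xml: str) -> str:
    """DEFLATE (raw), base64-encode, then URL-encode the request XML."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits = raw DEFLATE, no zlib header
    deflated = compressor.compress(authn_request_xml.encode("utf-8")) + compressor.flush()
    return quote(base64.b64encode(deflated), safe="")


def decode_redirect_request(saml_request_param: str) -> str:
    """Reverse the steps above to recover the request XML for inspection."""
    raw = base64.b64decode(unquote(saml_request_param))
    return zlib.decompress(raw, -15).decode("utf-8")


# Minimal stand-in for a real AuthnRequest, used only to show the round trip.
sample_request = (
    '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" ID="_example"/>'
)
encoded = encode_redirect_request(sample_request)
print(decode_redirect_request(encoded) == sample_request)
```

You can paste a SAMLRequest value captured from the browser's network tools into decode_redirect_request to check, for example, which issuer and ACS URL the request names.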

Deploy the solution

Now let’s walk through the steps to set up SAML authentication for Kibana single sign-on by using Amazon ES and Microsoft AD FS.

Enable SAML for Amazon Elasticsearch Service

The first step in the configuration setup process is to enable SAML authentication in the Amazon ES domain.

To enable SAML for Amazon ES

Sign in to the Amazon ES console and choose any existing Amazon ES domain that meets the criteria described in the Prerequisites section of this post.
Under Actions, select Modify Authentication.
Select the Enable SAML authentication check box.

Figure 2: Enable SAML authentication

When you enable SAML, it automatically creates and displays the different URLs that are required to configure SAML support in your IdP.

Figure 3: URLs for configuring the IdP
Look under Configure your Identity Provider (IdP), and note down the URL values for Service provider entity ID and SP-initiated SSO URL.

Set up and configure AD FS

During the SAML authentication process, the browser receives the SAML assertion token from AD FS and forwards it to the SP. In order to pass the claims to the Amazon ES domain, AD FS (the claims provider) and the Amazon ES domain (the relying party) have to establish a trust between them. Then you define the rules for what type of claims AD FS needs to send to the Amazon ES domain. The Amazon ES domain authorizes the user with internal security roles or backend roles, according to the claims in the token.

To configure Amazon ES as a relying party in AD FS

Sign in to the AD FS server. In Server Manager, choose Tools, and then choose AD FS Management.
In the AD FS management console, open the context (right-click) menu for Relying Party Trust, and then choose Add Relying Party Trust.

Figure 4: Set up a relying party trust
In the Add Relying Party Trust Wizard, select Claims aware, and then choose Start.

Figure 5: Create a claims aware application
On the Select Data Source page, choose Enter data about the relying party manually, and then choose Next.

Figure 6: Enter data about the relying party manually
On the Specify Display Name page, type in the display name of your choice for the relying party, and then choose Next. Choose Next again to move past the Configure Certificate screen. (Configuring a token encryption certificate is optional and at the time of writing, Amazon ES doesn’t support SAML token encryption.)

Figure 7: Provide a display name for the relying party
On the Configure URL page, do the following steps.
1. Choose the Enable support for the SAML 2.0 WebSSO protocol check box.
2. In the URL field, add the SP-initiated SSO URL that you noted when you enabled SAML authentication in Amazon ES earlier.
3. Choose Next.
  
  Figure 8: Enable SAML support and provide the SP-initiated SSO URL
On the Configure Identifiers page, do the following:
1. For Relying party trust identifier, provide the service provider entity ID that you noted when you enabled SAML authentication in Amazon ES.
2. Choose Add, and then choose Next.
Figure 9: Provide the service provider entity ID
On the Choose Access Control Policy page, choose the appropriate access for your domain. Depending on your requirements, choose one of these options:
- Choose Permit Specific Group to restrict access to one or more groups in your Active Directory domain based on the Active Directory group.
- Choose Permit Everyone to allow all Active Directory domain users to access Kibana.
Note: This step only provides access for the users to authenticate into Kibana. You have not yet set up Open Distro security roles and permissions.

Figure 10: Choose an access control policy
On the Ready to Add Trust page, choose Next, and then choose Close.

Now you’ve finished adding Amazon ES as a relying party trust.

Configure claim issuance rules for the relying party

During the authentication process, AD FS sends user attributes (claims) to the relying party. With claim rules, you define what claims AD FS can send to the Amazon ES domain. In the following procedure, you create two claim rules: one sends the incoming Windows account name as the Name ID, and the other sends Active Directory groups as roles.

To configure claim issuance rules

On the Relying Party Trusts page, right-click the relying party trust (in this case, AWS_ES_Kibana) and choose Edit Claim Issuance Policy.

Figure 11: Edit the claim issuance policy
Configure the claim rule to send the Windows account name as the Name ID, using these steps.
1. In the Edit Claim Issuance Policy dialog box, choose Add Rule. The Add Transform Claim Rule Wizard opens.
2. For Rule Type, choose Transform an Incoming Claim, and then choose Next.
3. On the Configure Rule page, enter the following information:
  - Claim rule name: NameId
  - Incoming claim type: Windows account name
  - Outgoing claim type: Name ID
  - Outgoing name ID format: Unspecified
  - Pass through all claim values: Select this option
4. Choose Finish.
Figure 12: Set the claim rule for Name ID
Configure Active Directory groups to send as roles, using the following steps.
1. In the Edit Claim Issuance Policy dialog box, choose Add Rule. The Add Transform Claim Rule Wizard opens.
2. For Rule Type, choose Send LDAP Attributes as Claims, and then choose Next.
3. On the Configure Rule page, enter or choose the following settings:
  - Claim rule name: Send-Groups-as-Roles
  - Attribute store: Active Directory
  - LDAP attribute: Token-Groups – Unqualified Names (to select the group name)
  - Outgoing claim type: Roles (the value for Roles should match the Roles Key that you will set in the Configure SAML in the Amazon ES domain step later in this process)
4. Choose Finish.
  
  Figure 13: Set claim rule for Active Directory groups as Roles

The configuration of AD FS is now complete and you can download the SAML metadata file from AD FS. The SAML metadata is in XML format and is needed to configure SAML in the Amazon ES domain. The AD FS metadata file (the IdP metadata) can be accessed from the following link (replace <AD FS FQDN> with the domain name of your AD FS server). Copy the XML and note down the value of entityID from the XML, as shown in Figure 14. You will need this information in the next steps.

https://<AD FS FQDN>/FederationMetadata/2007-06/FederationMetadata.xml

Figure 14: The value of entityID in the XML file
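
If you would rather read the entityID programmatically than by eye, a short sketch like the following works on the downloaded metadata. It relies only on the fact that the entityID is an attribute of the root element of the metadata document; the sample XML and its host name are placeholders, not values from a real AD FS server.

```python
import xml.etree.ElementTree as ET


def read_entity_id(metadata_xml):
    """Return the entityID attribute of the metadata root element."""
    root = ET.fromstring(metadata_xml)
    entity_id = root.get("entityID")
    if entity_id is None:
        raise ValueError("no entityID attribute on the metadata root element")
    return entity_id


# Trimmed-down placeholder metadata; a real file contains many more elements.
sample_metadata = (
    '<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" '
    'entityID="http://adfs.example.com/adfs/services/trust"/>'
)
print(read_entity_id(sample_metadata))
```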

Configure SAML in the Amazon ES domain

Next, you configure SAML settings in the Amazon Elasticsearch Service console. You need to import the IdP metadata, configure the IdP entity ID, configure the backend role, and set up the Roles key.

To configure SAML settings in the Amazon ES domain

1. Sign in to the Amazon Elasticsearch Service console. On the Actions menu, choose Modify authentication.
2. Import the IdP metadata, using the following steps.
  1. Choose Import IdP metadata, and then choose Metadata from IdP.
  2. Paste the contents of the FederationMetadata XML file (the IdP metadata) that you copied earlier in the Add or edit metadata field. You can also choose the Import from XML file button if you have the metadata file on the local disk.
    
    Figure 15: The imported identity provider metadata
3. Copy and paste the value of entityID from the XML file to the IdP entity ID field, if that field isn’t autofilled.
4. For SAML manager backend role (the console may refer to this as master backend role), enter the name of the group you created in AD FS as part of the prerequisites for this post. In this walkthrough, we set the name of the group as admins, and therefore the backend role is admins.

Optionally, you can also provide the user name instead of the backend role.

5. Set up the Roles key, using the following steps.
  1. Under Optional SAML settings, for Roles key, enter Roles. This value must match the value for Outgoing claim type, which you set when you configured claims rules earlier.
    
    Figure 16: Set the Roles key
  2. Leave the Subject key field empty to use the NameID element of the SAML assertion for the user name. Keep the defaults for everything else, and then choose Submit.
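
To make the role of the Roles key and the NameID element concrete, the following sketch reads both from a SAML assertion. It is an illustration only: it performs no signature or condition validation (which a real service provider must do), and the sample assertion is a hand-written placeholder rather than output from AD FS.

```python
import xml.etree.ElementTree as ET

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def read_user_and_roles(assertion_xml, roles_key="Roles"):
    """Return (NameID text, list of values of the attribute named roles_key)."""
    root = ET.fromstring(assertion_xml)
    name_id = root.find(".//saml:Subject/saml:NameID", NS)
    roles = [
        value.text
        for attr in root.iterfind(".//saml:Attribute", NS)
        if attr.get("Name") == roles_key
        for value in attr.iterfind("saml:AttributeValue", NS)
    ]
    return (name_id.text if name_id is not None else None), roles


# Placeholder assertion: one user in one group.
sample_assertion = """
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Subject><saml:NameID>user1</saml:NameID></saml:Subject>
  <saml:AttributeStatement>
    <saml:Attribute Name="Roles">
      <saml:AttributeValue>admins</saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>
"""
print(read_user_and_roles(sample_assertion))
```

If the Roles key configured in Amazon ES does not match the attribute name that AD FS sends, the list of roles comes back empty, which is the situation in which a user authenticates successfully but has no role mapped.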

It can take a few minutes to update the SAML settings and for the domain to come back to the active state.

Congratulations! You’ve completed all the SP and IdP configurations.

Sign in to Kibana

When the domain comes back to the active state, choose the Kibana URL in the Amazon ES console. You will be redirected to the AD FS sign-in page for authentication. Provide the user name and password for any of the users in the admins group. The example in Figure 17 uses the credentials for the user user1@example.com, who is a member of the admins group.

Figure 17: The AD FS sign-in screen with user credentials

AD FS authenticates the user and redirects the page to Kibana. If the user has at least one role mapped, you go to the Kibana home page, as shown in Figure 18. In this walkthrough, you mapped the AD FS group admins as a backend role to the manager user. Internally, the Open Distro security plugin maps the backend role admins to the security roles all_access and security_manager. Therefore, the Active Directory user in the admins group is authorized with the privileges of the manager user in the domain. For more granular access, you can create different AD FS groups and map the group names (backend roles) to internal security roles by using Role Mappings in Kibana.

Figure 18: The AD FS user user1@example.com is successfully logged in to Kibana
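
Role mappings can also be managed through the Open Distro security REST API instead of the Kibana UI. The sketch below only builds the request (method, path, and JSON body) for mapping backend roles to a security role; it does not send anything. The security role readall and the group name analysts are examples chosen for illustration, and you would still need to send the request to your domain endpoint with credentials that are allowed to manage security settings.

```python
import json


def build_role_mapping_request(security_role, backend_roles):
    """Return (method, path, body) for a role-mapping request."""
    path = f"_opendistro/_security/api/rolesmapping/{security_role}"
    body = json.dumps({"backend_roles": sorted(backend_roles)})
    return "PUT", path, body


# Example: map a hypothetical AD FS group "analysts" to the "readall" security role.
method, path, body = build_role_mapping_request("readall", ["analysts"])
print(method, path, body)
```

Because a PUT request of this kind is expected to write the whole mapping for the role, include every backend role that should remain mapped, not only the new one.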

Note: At the time of writing for this blog post, if you specify the <SingleLogoutService /> details in the AD FS metadata XML, when you sign out from Kibana, Kibana will call AD FS directly and try to sign the user out. This doesn’t work currently, because AD FS expects the sign-out request to be signed with a certificate that Amazon ES doesn’t currently support. If you remove <SingleLogoutService /> from the metadata XML file, Amazon ES will use its own internal sign-out mechanism and sign the user out on the Amazon ES side. No calls will be made to AD FS for signing out.
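
Removing the SingleLogoutService elements can be done by hand in a text editor, or with a small script such as the following sketch. It is an illustration under assumptions: the function name is ours, the sample metadata is a trimmed placeholder, and re-serializing the XML changes its byte content, so any signature embedded in the metadata document no longer matches the edited file.

```python
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def strip_single_logout(metadata_xml):
    """Return the metadata XML without any SingleLogoutService elements."""
    ET.register_namespace("", MD_NS)  # keep the metadata namespace as the default on output
    root = ET.fromstring(metadata_xml)
    # Materialize the element list first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == f"{{{MD_NS}}}SingleLogoutService":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Placeholder metadata with one sign-out and one sign-in endpoint.
sample_metadata_with_slo = (
    f'<EntityDescriptor xmlns="{MD_NS}" entityID="http://adfs.example.com/adfs/services/trust">'
    "<IDPSSODescriptor>"
    '<SingleLogoutService Location="https://adfs.example.com/adfs/ls/"/>'
    '<SingleSignOnService Location="https://adfs.example.com/adfs/ls/"/>'
    "</IDPSSODescriptor>"
    "</EntityDescriptor>"
)
print(strip_single_logout(sample_metadata_with_slo))
```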

Conclusion

In this post, we covered setting up SAML authentication for Kibana single sign-on by using Amazon ES and Microsoft AD FS. The integration of IdPs with your Amazon ES domain provides a powerful way to control fine-grained access to your Kibana endpoint and integrate with existing identity lifecycle processes for create/update/delete operations, which reduces the operational overhead required to manage users.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon Elasticsearch Service forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

ERGO Breaks New Frontiers for Insurance with AI Factory on AWS

2021-07-07 Piotr Klesta

Post Syndicated from Piotr Klesta original https://aws.amazon.com/blogs/architecture/ergo-breaks-new-frontiers-for-insurance-with-ai-factory-on-aws/

This post is co-authored with Piotr Klesta, Robert Meisner and Lukasz Luszczynski of ERGO

Artificial intelligence (AI) and related technologies are already finding applications in our homes, cars, industries, and offices. The insurance business is no exception to this. When AI is implemented correctly, it adds a major competitive advantage. It enhances the decision-making process, improves efficiency in operations, and provides hassle-free customer assistance.

At ERGO Group, we realized early on that innovation using AI required more flexibility in data integration than most of our legacy data architectures allowed. Our internal governance, data privacy processes, and IT security requirements posed additional challenges towards integration. We had to resolve these issues in order to use AI at the enterprise level, and allow for sensitive data to be used in a cloud environment.

We aimed for a central system that introduces ‘intelligence’ into other core application systems, and thus into ERGO’s business processes. This platform would support the process of development, training, and testing of complex AI models, in addition to creating more operational efficiency. The goal of the platform is to take the undifferentiated heavy lifting away from our data teams so that they focus on what they do best – harness data insights.

Building ERGO AI Factory to power AI use cases

Our quest for this central system led to the creation of AI Factory built on AWS Cloud. ERGO AI Factory is a compliant platform for running production-ready AI use cases. It also provides a flexible model development and testing environment. Let’s look at some of the capabilities and services we offer to our advanced analytics teams.

Figure 1: AI Factory imperatives

Compliance: Enforcing security measures (for example, authentication, encryption, and least privilege) was one of our top priorities for the platform. We worked closely with the security teams to meet strict domain and geo-specific compliance requirements.
Data governance: Data lineage and deep metadata extraction are important because they support proper data governance and auditability. They also allow our users to navigate a complex data landscape. Our data ingestion frameworks include a mixture of third party and AWS services to capture and catalog both technical and business metadata.
Data storage and access: AI Factory stores data in Amazon Simple Storage Service (S3) in a secure and compliant manner. Access rights are only granted to individuals working on the corresponding projects. Roles are defined in Active Directory.
Automated data pipelines: We sought to provide a flexible and robust data integration solution. An ETL pipeline using Apache Spark, Apache Airflow, and Kubernetes pods is central to our data ingestion. We use this for AI model development and subsequent data preparation for operationalization and model integration.
Monitoring and security: AI Factory relies on open-source cloud monitoring solutions like Grafana to detect security threats and anomalies. It does this by collecting service and application logs, tracking metrics, and generating alarms.
Feedback loop: We store model inputs/outputs and use BI tools, such as Amazon QuickSight, to track the behavior and performance of productive AI models. It’s important to share such information with our business partners so we can build their trust and confidence with AI.
Developer-friendly environment: Creating AI models is possible in a notebook-style or integrated development environment. Because our data teams use a variety of machine learning (ML) frameworks and libraries, we keep our platform extensible and our framework agnostic. We support Python/R, Apache Spark, PyTorch and TensorFlow, and more. All this is bolstered by CI/CD processes that accelerate delivery and reduce errors.
Business process integration: AI Factory offers services to integrate ML models into existing business processes. We focus on standardizing processes and close collaboration with business and technical stakeholders. Our overarching goal is to operationalize the AI model in the shortest possible timeframe, while preserving high quality and security standards.

AI Factory architecture

So far, we have looked at the functional building blocks of the AI Factory. Let’s take an architectural view of the platform using a five-step workflow:

Figure 2: AI Factory high-level architecture

Data ingestion environment: We use this environment to ingest data from the prominent on-premises ERGO data sources. We can schedule batch or delta transfers of data to various cloud destinations using multiple Kubernetes-hosted microservices. Once ingested, data is persisted and cataloged in ERGO’s data lake on Amazon S3. It is prepared for processing by the upstream environments.
Model development environment: This environment is used primarily by data scientists and data engineers. We use Amazon EMR and Amazon SageMaker extensively for data preparation, data wrangling, experimentation with predictive models, and development through rapid iterations.
Model operationalization environment: Trained models with satisfactory KPIs are promoted from the model development to the operationalization environment. This is where we integrate AI models in business processes. The team focuses on launching and optimizing the operation of services and algorithms.
- Integration with ERGO business processes is achieved using Kubernetes-hosted ‘Model Service.’ This allows us to infuse AI models provided by data scientists in existing business processes.
- An essential part of model operationalization is to continuously monitor the quality of the deployed ML models using the ‘feedback loop service.’
Model insights environment: This environment is used for displaying information about platform performance, processes, and analytical data. Data scientists use its services to check for unexpected bias or performance drifts that the model could exhibit. Feedback coming from the business through the ‘feedback loop service’ allows them to identify problems fast and retrain the model.
Shared services: Though shown as the fifth step of the workflow, the shared services environment supports almost every step in the process. It provides common, shared components between different parts of the platform managing CI/CD and orchestration processes within the AI factory. Additional services like platform logging and monitoring, authentication, and metadata management are also delivered from the shared services environment.

A binding theme across the various subplatforms is that all provisioning and deployment activities are automated using Infrastructure as Code (IaC) practices. This reduces the potential for human error, provides architectural flexibility, and greatly speeds up software development and our infrastructure-related operations.

All components of the AI factory are run in the AWS Cloud and can be scaled and adapted as needed. The connection between model development and operationalization happens at well-defined interfaces to prevent unnecessary coupling of components.

Lessons learned

Security first

Align with security early and often
Understand all the regulatory obligations and document them as critical, non-functional requirements

Modular approach

Combine modern data science technology and professional IT with a cross-functional, agile way of working
Apply loosely coupled services with an API-first approach

Data governance

Tracking technical metadata is important but not sufficient; you need business attributes too
Determine data ownership in operational systems to map upstream data governance workflows
Establish solutions to data masking as the data moves across sub-platforms
Define access rights and permissions boundaries among various personas

FinOps strategy

Carefully track platform cost
Assign owners responsible for monitoring and cost improvements
Provide regular feedback to platform stakeholders on usage patterns and associated expenses

Working with our AWS team

Establish cadence for architecture review and new feature updates
Plan cloud training and enablement

The future for the AI factory

The creation of the AI Factory was an essential building block of ERGO’s strategy. Now we are ready to embrace the next chapter in our advanced analytics journey.

We plan to focus on important use cases that will deliver the highest business value. We want to make the AI Factory available to ERGO’s international subsidiaries. We are also enhancing and scaling its capabilities. We are creating an ‘analytical content hub’ based on automated text extraction, improving speech to text, and developing translation processes for all unstructured and semistructured data using AWS AI services.

Vertical Integration Strategy Powered by Amazon EventBridge

2021-07-03 Tiago Oliveira

Post Syndicated from Tiago Oliveira original https://aws.amazon.com/blogs/architecture/vertical-integration-strategy-powered-by-amazon-eventbridge/

Over the past few years, midsize and large enterprises have adopted vertical integration as part of their strategy to optimize operations and profitability. Vertical integration consists of separating different stages of the production line from other related departments, such as marketing and logistics. Enterprises implement such a strategy to gain full control of their value chain: from the raw material production to the assembly lines and end consumer.

To achieve operational efficiency, enterprises must keep a level of independence between departments. However, this can lead to unstandardized operations and communication issues. Moreover, with this kind of autonomy for independent and dynamic verticals, the enterprise may lose some measure of visibility and control. As a result, it becomes challenging to generate a basic report from multiple departments. This blog post provides a high-level solution to integrate your different business verticals, using an event-driven architecture on top of Amazon EventBridge.

Event-driven architecture

Event-driven architecture is an architectural pattern to model communication between services while decoupling applications from each other. Applications scale and fail independently, and a central event bus facilitates the communication between the services in the enterprise. Instead of a particular application sending a request directly to another, it produces an event. The central event router captures it and forwards the message to the proper destinations.

For instance, when a customer places a new order on the retail website, the application sends the event to the event bus. The event bus then sends the message to the ERP system and the fulfillment center for dispatch. In this scenario, we call the application that sends the event an event publisher, and the applications that receive the event event consumers.

Because all messages are going through the central event bus, there is clear independence between the applications within the enterprise. Here are some benefits:

Application independence occurs even if they belong to the same business workflow
You can plug in more event consumers to receive the same event type
You can add a data lake to receive all new order events from the retail website
You can receive all the events from the payment system and the customer relations department

This ensures you can integrate independent departments, increase overall visibility, and make sense of specific processes happening in the organization using the right tools.
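
The publisher and consumer roles described above can be illustrated with a tiny in-memory event bus. This is a teaching sketch only: it delivers events synchronously inside one process, whereas a managed event bus delivers them asynchronously between independent applications. The class, event type, and consumer names are invented for this example.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-memory bus: publishers and consumers only know the bus."""

    def __init__(self):
        self._consumers = defaultdict(list)

    def subscribe(self, event_type, consumer):
        """Register a callable that receives every event of event_type."""
        self._consumers[event_type].append(consumer)

    def publish(self, event_type, payload):
        """Deliver the payload to each consumer; return how many received it."""
        consumers = self._consumers[event_type]
        for consumer in consumers:
            consumer(payload)
        return len(consumers)


bus = EventBus()
erp_inbox, fulfillment_inbox, data_lake = [], [], []

# The retail website (publisher) never references these consumers directly.
bus.subscribe("OrderPlaced", erp_inbox.append)
bus.subscribe("OrderPlaced", fulfillment_inbox.append)
bus.subscribe("OrderPlaced", data_lake.append)  # plugged in later without touching the publisher

delivered = bus.publish("OrderPlaced", {"orderId": "1001", "total": 59.90})
print(delivered, len(data_lake))
```

Adding the data lake as a third consumer required no change to the publisher, which is the independence the list above describes.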

Implementing event-driven architecture with Amazon EventBridge

Each vertical organically generates lifecycle events. Enterprises can use the event-driven architecture paradigm to make the information flow between the departments by asynchronously exchanging events through the event bus. This way, each department can react to events generated by other departments and initiate processes or actions depending on its business needs.

Such an approach creates a dynamic and flexible choreography between the different participants, which is unique to the enterprise. Such choreography can be followed and monitored using analytics and fine-grained event data collected on the data lake. Read Using AWS X-Ray tracing with Amazon EventBridge to learn how to debug and analyze this kind of distributed application.

Figure 1. Architecture diagram depicting enterprise vertical integration with Amazon EventBridge

In Figure 1, Amazon EventBridge works as the central event bus, the core component of this event-driven architecture. Through Amazon EventBridge, each participant publishes its own lifecycle events and receives the events published by the other participants. Amazon EventBridge routes events by using rules. Each rule selects events by matching them against an event pattern and defines up to five targets for the events it selects. By setting up rules, you determine where each kind of event is sent, so consumers can react in real time while publishers and consumers stay decoupled.
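To make the rule and event pattern idea concrete, here is a minimal sketch. The bus name, source, and detail fields are assumptions made up for illustration, and `matchesPattern` is a deliberately simplified stand-in for EventBridge's own matching, which supports much more (nested detail fields, prefixes, numeric ranges).

```javascript
// Hypothetical order event, shaped like one entry of an EventBridge PutEvents request.
// The bus name, source, and detail fields are assumptions for illustration.
const orderEvent = {
  EventBusName: 'enterprise-bus',
  Source: 'retail.website',
  DetailType: 'OrderPlaced',
  Detail: JSON.stringify({ orderId: 'o-1001', total: 42.5 }),
};

// Event pattern for a rule that selects new-order events from the retail website.
const newOrderPattern = {
  source: ['retail.website'],
  'detail-type': ['OrderPlaced'],
};

// Simplified matching: every field named in the pattern must list the event's value.
// This is NOT the full EventBridge algorithm, only the exact-match case.
function matchesPattern(entry, pattern) {
  const envelope = { source: entry.Source, 'detail-type': entry.DetailType };
  return Object.entries(pattern).every(
    ([field, allowed]) => allowed.includes(envelope[field])
  );
}

console.log(matchesPattern(orderEvent, newOrderPattern)); // true
```

A second rule with a different pattern (for example, one that matches every event and targets the data lake) could receive the same event without the publisher knowing about it.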

In addition to initiating the heavy routing and distribution of events, Amazon EventBridge can also give real-time insights into how the business runs. Using metrics automatically sent to Amazon CloudWatch, it is possible to see which kinds of events are arriving, and at which rate. You can also see how those events are distributed across the registered targets, and any failures that occur during this distribution. Every event can also be archived using the Amazon EventBridge events archiving feature.

Amazon Simple Storage Service (Amazon S3) is the backend storage, or data lake, for all the events that have ever transited via the event bus. With Amazon S3, customers have a cost-efficient storage service at any scale, designed for 99.999999999% (11 nines) of durability. To help customers manage and secure their data, S3 provides features such as Amazon S3 Lifecycle to optimize costs and S3 Object Lock to store objects using a write-once-read-many (WORM) model. Using services like Amazon Athena, Amazon Redshift, and Amazon EMR, you can transform, correlate, and aggregate the events stored in S3 to generate insights on the business. The Amazon S3 data lake can also be the input to a data warehouse, machine learning models, and real-time analytics. Learn more about how to use Amazon S3 as the data lake storage.

A critical feature of this solution is the ability to run complex queries on top of the data lake. Amazon API Gateway provides a single flexible and elastic API entry point to retrieve data from the data lake. It can also publish events directly to the event bus. For complex queries, Amazon API Gateway can be integrated with an AWS Lambda function that coordinates the execution of standard SQL queries using Amazon Athena as the query engine. You can read about a fully functional example of such an API called athena-express.
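As a rough sketch of what such a Lambda function might hand to Athena, the following builds the request parameters for a StartQueryExecution call. The query name, table, columns, database, and output location are all hypothetical. Restricting callers to a fixed set of named queries is one possible design choice that avoids passing raw SQL through the API.

```javascript
// Queries the API is willing to run, keyed by a short name.
// The table and column names are hypothetical.
const ALLOWED_QUERIES = {
  eventsByType:
    'SELECT detail_type, count(*) AS events FROM events GROUP BY detail_type',
};

// Build the parameters for Athena's StartQueryExecution API.
// Only the parameter names are taken from the Athena API; the values are examples.
function buildAthenaRequest(queryName, database, outputLocation) {
  if (!Object.prototype.hasOwnProperty.call(ALLOWED_QUERIES, queryName)) {
    throw new Error('Unknown query: ' + queryName);
  }
  return {
    QueryString: ALLOWED_QUERIES[queryName],
    QueryExecutionContext: { Database: database },
    ResultConfiguration: { OutputLocation: outputLocation },
  };
}

const request = buildAthenaRequest(
  'eventsByType',
  'enterprise_events',              // hypothetical database name
  's3://example-athena-results/'    // hypothetical results bucket
);
console.log(request.QueryExecutionContext.Database); // enterprise_events
```

The Lambda function would pass an object like this to the Athena client, then poll for completion and return the results to API Gateway.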

After collecting data from multiple departments, third-party entities, and shop floors, you can use the data to derive business value using cross-organization dashboards. In this way, you can increase visibility over the different entities and make sense of the data from the distributed systems. Even though this design allows you to use your favorite BI tool, we are using Amazon QuickSight for this solution. For example, with QuickSight, you can author your interactive dashboards, which include machine learning-powered insights. Those dashboards can then connect the marketing campaigns data with the sales data. You can measure how effective those campaigns were and forecast the demand on the production lines.

Conclusion

In this blog post, we showed you how to use Amazon EventBridge as an event bus to allow event-driven architectures. This architecture pattern streamlines the adoption of vertical integration. Enterprises can decouple IT systems from each other while retaining visibility into the data they generate. Integrating those systems can happen asynchronously using a choreography approach instead of having an orchestrator as a central component. There are technical challenges in implementing this kind of solution, such as maintaining consistency in distributed applications and in transactions that span multiple microservices. Refer to the saga pattern for microservices-based architecture, and how to implement it using AWS Step Functions.

With a data lake in place to collect all the data produced by IT systems, you can create BI dashboards that provide a holistic view of multiple departments. Moreover, it allows organizations to get better insights into their valuable data and explore other use cases, such as machine learning. To support the data lake creation and management, refer to AWS Lake Formation and a series of other blog posts.

To learn more about Amazon EventBridge from a hands-on perspective, refer to this EventBridge workshop.

How to monitor and track failed logins for your AWS Managed Microsoft AD

2021-07-02 Tekena Orugbani

Post Syndicated from Tekena Orugbani original https://aws.amazon.com/blogs/security/how-to-monitor-and-track-failed-logins-for-your-aws-managed-microsoft-ad/

AWS Directory Service for Microsoft Active Directory provides customers with the ability to review security logs on their AWS Managed Microsoft AD domain controllers by either using a domain management Amazon Elastic Compute Cloud (Amazon EC2) instance or by forwarding domain controller security event logs to Amazon CloudWatch Logs.

You can further improve visibility by monitoring Windows login activities on your AWS Managed Microsoft AD domain-joined EC2 instances, and in this blog post, I show you how. Monitoring and tracking Windows security events on your AWS Managed Microsoft AD domain-joined instances can reveal unexpected activities on your domain-joined EC2 instances so that you can take proactive remediating action.

For example, every time there is an unsuccessful attempt to log in to a domain-joined EC2 instance or on-premises server by using an AWS Managed Microsoft AD user or a local account, an “Audit Failure” Windows security event with ID 4625 is recorded on the EC2 instance itself. The event data includes details of the account name, workstation name, and source network address. Unsuccessful attempts to log in to non–domain-joined EC2 instances and servers are handled the same way. You can track and monitor these events on an ongoing basis across your fleet of Windows EC2 instances by using the solution described here.
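For illustration, the sketch below pulls the fields mentioned above out of an XML-rendered 4625 event. The sample event is heavily trimmed and its values are invented. The regular-expression approach is an assumption chosen to keep the example dependency-free, and a real implementation would more likely use an XML parser.

```javascript
// Trimmed, invented sample of a Windows Security event rendered as XML.
const sampleXml = [
  '<Event><System><EventID>4625</EventID><Computer>web-01.corp.example.com</Computer></System>',
  "<EventData><Data Name='TargetUserName'>jdoe</Data>",
  "<Data Name='WorkstationName'>WS-0042</Data>",
  "<Data Name='IpAddress'>10.0.0.25</Data></EventData></Event>",
].join('');

// Return the account, workstation, and source address of a failed login,
// or null when the event is not a 4625 event.
function parseFailedLogin(xml) {
  const eventId = (xml.match(/<EventID[^>]*>(\d+)<\/EventID>/) || [])[1];
  if (eventId !== '4625') {
    return null;
  }
  const field = (name) => {
    const m = xml.match(new RegExp("<Data Name=['\"]" + name + "['\"]>([^<]*)</Data>"));
    return m ? m[1] : null;
  };
  return {
    accountName: field('TargetUserName'),
    workstationName: field('WorkstationName'),
    sourceAddress: field('IpAddress'),
  };
}

console.log(parseFailedLogin(sampleXml));
```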

Solution overview

Figure 1 shows the workflow for the solution.

Figure 1: Solution architecture

The workflow steps are as follows:

An Amazon CloudWatch agent that is running on the EC2 instances sends the Windows security event logs to Amazon CloudWatch.
CloudWatch filters the logs based on the filter you specify. When the configured threshold is met, CloudWatch posts an alert to an SNS topic.
Amazon Simple Notification Service (Amazon SNS) invokes an AWS Lambda function.
The Lambda function scans through the events and determines which EC2 instance(s) generated the security events at a frequency that satisfies the configured threshold. It discards any other instances listed in the events that don’t meet the specified criteria. The function sends an email to the configured email address with a high-level description of the event logs and the instance(s) that generated them.
Amazon Simple Email Service (Amazon SES) delivers the email to the specified mailbox.

Note: Although this example uses email notification via Amazon SES to monitor failed logins, there are opportunities to extend the solution. For example, you can integrate with a Security Information and Event Management (SIEM) tool that may potentially be integrated with a ticketing service and/or some automation or incident response process when a set threshold for failed logins is breached.

Prerequisites

Before you deploy the solution, you must complete the following steps:

Create AWS Identity and Access Management (IAM) roles for use with the CloudWatch agent
Sign up for Amazon SES
Verify the sender and recipient email addresses that you’ll use to send and receive email notifications

Deploy the solution

The solution I present here involves four main steps:

Install and configure the CloudWatch agent for your EC2 instances.
Create a metric filter in CloudWatch.
Create a CloudWatch alarm based on the metric filter and add SNS notification.
Create a Lambda function and subscribe the function to the SNS topic.

Step 1: Install and configure the CloudWatch agent for all your EC2 instances

The first step is to create an AWS Systems Manager parameter to contain the JSON configuration for the CloudWatch agent that runs on the EC2 instances. You’ll then use Systems Manager Run Command to install the CloudWatch agent on the instances and to apply the configuration in the Parameter Store to the CloudWatch agent.

To install and configure the CloudWatch agent

Open the AWS Systems Manager console and in the navigation pane, choose Parameter Store to create a new Systems Manager parameter.
Give your parameter a name. In my example, I named my parameter AmazonCloudWatch-Windows.
For Tier, choose Standard. For Type, choose String. For Data type, choose Text.
For the value of the parameter, enter the following JSON configuration and choose Create Parameter.

Note: This JSON configuration creates a log group in CloudWatch with the name /aws/SecurityAuditLogs. If you would prefer to use another log group name, you can modify the JSON configuration. Also, if you already have a Systems Manager parameter named AmazonCloudWatch-Windows, you can use any other name of your choice.
```
{
  "logs": {
    "logs_collected": {
      "windows_events": {
        "collect_list": [
          {
            "event_format": "xml",
            "event_levels": [
              "VERBOSE",
              "INFORMATION",
              "WARNING",
              "ERROR",
              "CRITICAL"
            ],
            "event_name": "Security",
            "log_group_name": "/aws/SecurityAuditLogs",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  },
  "metrics": {
    "metrics_collected": {
      "statsd": {
        "metrics_aggregation_interval": 60,
        "metrics_collection_interval": 10,
        "service_address": ":8125"
      }
    }
  }
}
```
The Parameter details page should look similar to the following.

Figure 2: Create the Systems Manager parameter for the CloudWatch agent
Next, you’ll use Run Command to install and configure the CloudWatch agent. In the navigation pane, choose Run Command.
On the Run a command page, in the search box, enter Document name prefix: Equals: AWS-ConfigureAWSPackage. Press Enter and select the document that appears.
Under Command parameters, for Name, enter AmazonCloudWatchAgent.

Figure 3: Install the CloudWatch agent on the instances
Under Targets, specify your EC2 instances based on their tags, or choose them manually, and then choose Run.
To configure the CloudWatch agent, choose Run Command again. On the Run a command screen, enter Document name prefix: Equals: AmazonCloudWatch-ManageAgent. Press Enter and select the document that appears.
Under Command parameters, for Optional Configuration Location, enter the name of the Systems Manager parameter you created earlier. In my example, I used the name AmazonCloudWatch-Windows. Keep the defaults for the other settings.

Figure 4: Configure the CloudWatch agent on the instances
Under Targets, specify your EC2 instances based on their tags, or choose them manually, and then choose Run.

Step 2: Create a metric filter in CloudWatch

After the completion of the tasks in Step 1, your EC2 instances should now be sending logs to a log group in Amazon CloudWatch called /aws/SecurityAuditLogs. The log group should have log streams named after the EC2 instances that are sending the logs to CloudWatch. The next step is to create a metric filter to filter the noise from the logs.

To create a metric filter

Open the CloudWatch console and in the left navigation menu, choose Log Groups.
Select the check box next to the /aws/SecurityAuditLogs log group, choose Actions, and then choose Create metric filter.
On the Define pattern page, enter Audit Failure, keep the defaults for the other settings, and then choose Next.
Enter values for Filter name, Metric namespace, Metric name, and Metric value, and then choose Next to create the metric filter.

Figure 5: Create a CloudWatch metric filter
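If you prefer to script this step, the console settings correspond roughly to the parameters of the CloudWatch Logs PutMetricFilter API. The sketch below only builds that parameter object. The metric name and namespace are hypothetical, and the filter name is the one used later in this post.

```javascript
// Build parameters for the CloudWatch Logs PutMetricFilter API.
// metricName and metricNamespace are hypothetical example values.
function buildMetricFilterParams(logGroupName) {
  return {
    logGroupName: logGroupName,
    filterName: 'WindowsSecurityAuditFailures',
    // Entered as in the console step above. Unquoted terms match events that
    // contain both words; quote the phrase to require an exact match.
    filterPattern: 'Audit Failure',
    metricTransformations: [
      {
        metricName: 'AuditFailureCount',
        metricNamespace: 'WindowsSecurity',
        metricValue: '1', // each matching log event adds 1 to the metric
      },
    ],
  };
}

console.log(JSON.stringify(buildMetricFilterParams('/aws/SecurityAuditLogs'), null, 2));
```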

Step 3: Create a CloudWatch alarm based on the metric filter and add SNS notification

In this step, you set a threshold for how many “Audit Failure” events you want to allow within a period of time before triggering an alarm.

To create the CloudWatch alarm and add SNS notification

Open the Amazon Simple Notification Service console and in the left navigation menu, choose Topics.
Choose Create topic, and then choose Standard.
Provide a name for your topic, and then choose Create topic. In my example, I named the topic WindowsSecurityLogsAlarmNotifications.
Open the CloudWatch console, choose Log groups, and select the /aws/SecurityAuditLogs log group.
Choose the Metric filters tab, select the check box next to the WindowsSecurityAuditFailures filter you just created, and choose Create alarm.
On the Specify metric and conditions page, set the parameters as follows:
1. For Statistic, choose Sample count.
2. For Period, choose 5 minutes.
3. For Threshold type, choose Static.
4. For Define the alarm condition, choose Greater > threshold.
5. For Define the threshold value, specify the threshold number of failed login attempts that will cause a notification to be sent.
  
  Note: In my example, I’ve specified to be notified after five failed login attempts. You should determine the appropriate threshold to use, based on your organization’s security policies.
Figure 6: Create a CloudWatch alarm
On the Configure actions page, choose Next.
Choose In alarm, choose Select an existing SNS topic, and then select the SNS topic you created earlier in this procedure.
Specify a name for the alarm, and then choose Create Alarm.
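The alarm settings chosen above map onto the parameters of the CloudWatch PutMetricAlarm API approximately as follows. This is a sketch only: the alarm name, metric name, namespace, and topic ARN are placeholders, and a single evaluation period is an assumption because the procedure does not state one.

```javascript
// Build parameters for the CloudWatch PutMetricAlarm API.
// All names and the ARN below are placeholders for illustration.
function buildAlarmParams(snsTopicArn, threshold) {
  return {
    AlarmName: 'WindowsSecurityAuditFailuresAlarm',
    Namespace: 'WindowsSecurity',
    MetricName: 'AuditFailureCount',
    Statistic: 'SampleCount',                   // Statistic: Sample count
    Period: 300,                                // Period: 5 minutes, in seconds
    EvaluationPeriods: 1,                       // assumption, not stated in the procedure
    Threshold: threshold,                       // static threshold
    ComparisonOperator: 'GreaterThanThreshold', // Greater > threshold
    AlarmActions: [snsTopicArn],                // notify the SNS topic when in alarm
  };
}

const alarmParams = buildAlarmParams(
  'arn:aws:sns:us-east-1:111122223333:WindowsSecurityLogsAlarmNotifications',
  5
);
console.log(alarmParams.Period * alarmParams.EvaluationPeriods); // 300
```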

Step 4: Create a Lambda function and subscribe the function to the SNS topic

CloudWatch alarm messages are predefined, can’t be modified, and don’t provide details based on CloudWatch streams. Additionally, a CloudWatch alarm will trigger when a combination of failed login attempts on two or more instances meets the threshold. For instance, in my example, when there are three failed attempts on one instance and two failed attempts on a second instance all within a 5-minute period, a CloudWatch alarm will be triggered.

The purpose of the Lambda function that you’ll create in this step is to validate whether the triggered alarms meet the specified threshold on a per-instance basis before the function sends an email notification to the designated email address. When a CloudWatch alarm is triggered, the function reads through the CloudWatch logs and filters the logs based on CloudWatch log streams that meet the specified threshold for the alarm. If no individual CloudWatch log stream (that is, no individual instance or server) meets the threshold, the function won’t send a notification. The function only sends a notification if it determines that one or more instances have each met the specified threshold. The function also provides more information about the failed login attempts when it does send you an email.
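The per-instance check can be illustrated in isolation. The following sketch is not the function deployed in this step. It is a stripped-down version of the counting logic, run against invented log events, using the three-plus-two scenario described above.

```javascript
// Count matching log events per log stream (one stream per instance) and
// return the streams whose own count is greater than the threshold.
function instancesOverThreshold(events, threshold) {
  const counts = {};
  for (const logEvent of events) {
    counts[logEvent.logStreamName] = (counts[logEvent.logStreamName] || 0) + 1;
  }
  return Object.keys(counts).filter((id) => counts[id] > threshold);
}

// Three failures on one instance and two on another (hypothetical instance IDs).
const sampleEvents = [
  ...Array(3).fill({ logStreamName: 'i-0aaa' }),
  ...Array(2).fill({ logStreamName: 'i-0bbb' }),
];
console.log(instancesOverThreshold(sampleEvents, 5)); // []

// Three more failures on the first instance push it over the threshold.
const moreEvents = [...sampleEvents, ...Array(3).fill({ logStreamName: 'i-0aaa' })];
console.log(instancesOverThreshold(moreEvents, 5)); // [ 'i-0aaa' ]
```

In the first case the combined count may be enough to put the alarm into the ALARM state, but no single instance qualifies, so no email would be sent.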

To create the Lambda function and subscribe it to the SNS topic

Open the AWS Lambda console and choose Create function.
Choose Author from scratch, and provide a name for your function. Under Runtime, select Node.js 14.x, and then choose Create function.

Double-click index.js, replace the code with the following code, and then choose Deploy.

var aws = require('aws-sdk');
var cwl = new aws.CloudWatchLogs();
var ses = new aws.SES();
// Per-instance threshold, read from the Lambda environment and converted to a number.
const alarmThreshold = Number(process.env.ALARM_THRESHOLD);

exports.handler = function(event, context) {
    // The SNS record carries the CloudWatch alarm notification as a JSON string.
    var message = JSON.parse(event.Records[0].Sns.Message);
    var requestParams = {
        metricName: message.Trigger.MetricName,
        metricNamespace: message.Trigger.Namespace
    };
    // Look up the metric filter behind the alarm to learn its log group and filter pattern.
    cwl.describeMetricFilters(requestParams, function(err, data) {
        if (err) console.error('Error is:', err);
        else {
            console.log('Metric Filter data is:', data);
            getInstanceIdsAndSendEmail(message, data);
        }
    });
};

function getInstanceIdsAndSendEmail(message, metricFilterData) {
    // Search the same time window that the alarm evaluated.
    var timestamp = Date.parse(message.StateChangeTime);
    var offset = message.Trigger.Period * message.Trigger.EvaluationPeriods * 1000;
    var metricFilter = metricFilterData.metricFilters[0];
    var dictInstances = {};
    var instancesFinalList = [];
    var paramsForInstanceId = {
        'logGroupName' : metricFilter.logGroupName,
        'filterPattern' : metricFilter.filterPattern ? metricFilter.filterPattern : "",
        'startTime' : timestamp - offset,
        'endTime' : timestamp
    };
    cwl.filterLogEvents(paramsForInstanceId, function (err, data) {
        if (err) {
            console.error('Filtering failure:', err);
        } else {
            // Log streams are named after instance IDs, so count matching events per stream.
            data.events.forEach(function (logEvent) {
                var instId = logEvent.logStreamName;
                dictInstances[instId] = dictInstances[instId] ? dictInstances[instId] + 1 : 1;
            });
            console.log('Instance(s) and number of audit failure occurrences:', dictInstances);
            // Keep only the instances whose own count exceeds the threshold.
            for (var [key, val] of Object.entries(dictInstances)) {
                if (val > alarmThreshold) {
                    instancesFinalList.push(key);
                }
            }
            console.log('Instance(s) with failure audit that exceed the threshold:', instancesFinalList);
            // Stop here when no single instance exceeded the threshold on its own.
            if (instancesFinalList.length === 0) {
                console.log('No instance exceeded the threshold; no email sent.');
                return;
            }
            getLogsAndSendEmail(message, metricFilterData, instancesFinalList);
        }
    });
}

function getLogsAndSendEmail(message, metricFilterData, logStreamNames_Instance) {
    var timestamp = Date.parse(message.StateChangeTime);
    var offset = message.Trigger.Period * message.Trigger.EvaluationPeriods * 1000;
    var metricFilter = metricFilterData.metricFilters[0];

    // Fetch the matching events again, limited to the instances that exceeded the threshold
    var paramsForEmail = {
        'logGroupName' : metricFilter.logGroupName,
        'filterPattern' : metricFilter.filterPattern ? metricFilter.filterPattern : "",
        'startTime' : timestamp - offset,
        'endTime' : timestamp,
        'logStreamNames' : logStreamNames_Instance
    };
    cwl.filterLogEvents(paramsForEmail, function (err, data) {
        if (err) {
            console.error('Filtering failure:', err);
        } else {
            console.log("===SENDING EMAIL===");
            ses.sendEmail(generateEmailContent(data, message), function(err, data) {
                if (err) console.error(err);
                else {
                    console.log("===EMAIL SENT===");
                    console.log(data);
                }
            });
        }
    });
}

function generateEmailContent(data, message) {
    var events = data.events;
    let senderEmail = process.env.SENDER_EMAIL;
    let recipientEmail = process.env.RECIPIENT_EMAIL.split(",");
    console.log('Recipient is: ', recipientEmail);
    console.log('Events are:', events);
    var style = '<style> pre {color: red;} </style>';
    var logData = '<br/>Logs:<br/>' + style;
    for (var i in events) {
        logData += '<pre>Instance:' + JSON.stringify(events[i]['logStreamName'])  + '</pre>';
        logData += '<pre>Message:' + JSON.stringify(events[i]['message']) + '</pre><br/>';
    }
    var date = new Date(message.StateChangeTime);
    var text = 'Alarm Name: ' + '<b>' + message.AlarmName + '</b><br/>' + 
               'Message: ' + 'There has been an unusually high number of Windows Security Audit Failure events for the instance(s) with details below. Please review the event logs <br/>' +
               'Account ID: ' + message.AWSAccountId + '<br/>'+
               'Region: ' + message.Region + '<br/>'+
               'Alarm Time: ' + date.toString() + '<br/>'+
               logData;
    var subject = 'Alarm Triggered - ' + message.AlarmName;
    var emailContent = {
        Destination: {
            ToAddresses: recipientEmail
        },

        Message: {
            Body: {
                Html: {
                    Data: text
                }
            },
            Subject: {
                Data: subject
            }
        },
        Source: senderEmail
    };
    return emailContent;
}
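
Both filterLogEvents calls in the function search the same time window, which is derived from the alarm notification. The sketch below isolates that calculation so you can check it against a sample message. The message is a hand-written subset of an alarm notification with invented values, and `alarmWindow` is a helper name introduced here for illustration.

```javascript
// Compute the log search window from a CloudWatch alarm notification:
// it ends at the state change and spans Period * EvaluationPeriods seconds.
function alarmWindow(message) {
  const endTime = Date.parse(message.StateChangeTime);
  const lookbackMs = message.Trigger.Period * message.Trigger.EvaluationPeriods * 1000;
  return { startTime: endTime - lookbackMs, endTime: endTime };
}

// Hand-written subset of an alarm notification; all values are invented.
const sampleMessage = {
  AlarmName: 'WindowsSecurityAuditFailuresAlarm',
  NewStateValue: 'ALARM',
  StateChangeTime: '2021-07-02T10:05:00.000Z',
  Trigger: {
    MetricName: 'AuditFailureCount',
    Namespace: 'WindowsSecurity',
    Period: 300,
    EvaluationPeriods: 1,
  },
};

const searchWindow = alarmWindow(sampleMessage);
console.log((searchWindow.endTime - searchWindow.startTime) / 1000); // 300
console.log(new Date(searchWindow.startTime).toISOString()); // 2021-07-02T10:00:00.000Z
```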

Choose Add trigger, and in the drop-down list, choose SNS.
Under SNS topic, select the SNS topic you created in Step 3, and then choose Add.

Figure 7: Create the AWS Lambda function
Choose the Configuration tab, and then choose Environment variables. Choose Edit to add the environment variables for ALARM_THRESHOLD, RECIPIENT_EMAIL, and SENDER_EMAIL, and then choose Save.

Figure 8: The Lambda environment variables

Note: The variables’ keys must be set exactly as ALARM_THRESHOLD, RECIPIENT_EMAIL, and SENDER_EMAIL, because otherwise the code will fail. For the recipient, you can specify a single email or multiple email addresses that are separated by commas, as shown in Figure 8, provided that the emails are verified as specified in the Prerequisites section.
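
One way to catch a misnamed or empty variable early is to validate the environment before doing any work. The following is a minimal sketch of that idea. The `loadConfig` helper is hypothetical and is not part of the function above.

```javascript
// Read and validate the three environment variables the function depends on.
// Throws a descriptive error instead of failing later with an undefined value.
function loadConfig(env) {
  const rawThreshold = (env.ALARM_THRESHOLD || '').trim();
  const threshold = Number(rawThreshold);
  if (rawThreshold === '' || !Number.isFinite(threshold)) {
    throw new Error('ALARM_THRESHOLD must be set to a number');
  }
  // Accept "a@example.com, b@example.com" by trimming around the commas.
  const recipients = (env.RECIPIENT_EMAIL || '')
    .split(',')
    .map((address) => address.trim())
    .filter(Boolean);
  if (recipients.length === 0) {
    throw new Error('RECIPIENT_EMAIL must list at least one address');
  }
  if (!env.SENDER_EMAIL) {
    throw new Error('SENDER_EMAIL must be set');
  }
  return { threshold: threshold, sender: env.SENDER_EMAIL, recipients: recipients };
}

const config = loadConfig({
  ALARM_THRESHOLD: '5',
  SENDER_EMAIL: 'alerts@example.com',
  RECIPIENT_EMAIL: 'admin1@example.com, admin2@example.com',
});
console.log(config.recipients.length); // 2
```

In a Lambda function you would call `loadConfig(process.env)` at the top of the handler.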

Next, create an IAM policy, which you’ll attach to a role that will be assumed by the Lambda function. This policy provides permissions to perform the DescribeMetricFilters, FilterLogEvents, and SendEmail API calls that are necessary for the function to work. It also provides permissions to create a log group and log stream in CloudWatch for the Lambda function, so that you can review the logs if the Lambda function fails to run properly.

To create the IAM policy

Sign in to the IAM console, and in the navigation bar, choose Policies.
In the content pane, choose Create policy, and then choose JSON.

Replace the content with the following policy. Make sure to replace the placeholders with the ARN of the Lambda function, the ARNs for log group creation, and the ARN of the SES verified email address that you use as the sender.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ses:SendEmail",
            "Resource": "<arn-of-verified-ses-email-sender>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:DescribeMetricFilters"
            ],
            "Resource": "<arn-for-CloudWatch-log-groups>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:FilterLogEvents"
            ],
            "Resource": "<arn-of-CloudWatch-log-group-created-in-step-1>"

        },
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "<arn-for-CloudWatch-log-groups>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "<arn-of-lambda-function:*>"
            ]
        }
    ]
}

Here is how it appears in my example. Note that FilterSendSecurityEvents is the name of my Lambda function and /aws/SecurityAuditLogs is the name of the log group I created in Step 1.

Figure 9: Example policy for the IAM role to be attached to the Lambda function

Choose Review policy, specify a name and a description for the policy, and then choose Create policy.

Next, create an IAM role and attach this policy.

To create the IAM role and attach the policy

In the IAM console navigation bar, choose Roles, and then choose Create role.
Under Choose the service that will use this role, choose Lambda, and then choose Next: Permissions.
On the next page, select the policy you just created, and then choose Next: Tags. Add an optional tag, and then choose Next: Review.
Specify a name and description for the role, and then choose Create role.
To attach this role to the Lambda function, go to the AWS Lambda console. Navigate to the Lambda function, and choose Configuration.
Choose Permissions, and then under Execution role, choose Edit.
On the Edit basic settings page, under Existing role, select the role you just created, and then choose Save.

And that’s it! You will now be notified whenever there are “Audit Failure” events that reach the threshold you set on a per-instance basis for your AWS Managed Microsoft AD domain-joined instances. If you installed and configured the CloudWatch agent on non–domain-joined instances in Step 1, then you’ll also get notifications for “Audit Failure” events that are generated by failed login attempts that use local accounts.

Conclusion

In this post, I showed you how you can proactively track and monitor Windows security audit failures across your AWS Managed Microsoft AD domain-joined EC2 instances. This helps provide greater visibility into Windows login activities for administrators, so that they can take action to maintain the security of their server fleet. This solution can also be extended to potentially trigger an automation workflow or incident response process in the event of unexpected events.

Although this blog has specifically targeted AWS Managed Microsoft AD domain-joined instances, the procedure here also applies to standalone EC2 instances or on-premises servers that are configured to send logs to CloudWatch.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Directory Service forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

How to integrate third-party IdP using developer authenticated identities

2021-07-01 Andrew Lee

Post Syndicated from Andrew Lee original https://aws.amazon.com/blogs/security/how-to-integrate-third-party-idp-using-developer-authenticated-identities/

Amazon Cognito identity pools enable you to create and manage unique identifiers for your users and provide temporary, limited-privilege credentials to your application to access AWS resources. Currently, there are several out-of-the-box external identity providers (IdPs) to integrate with Amazon Cognito identity pools, including Facebook, Google, and Apple. If your application’s primary users use another social media network such as Snapchat and you would like to make it easier for them to authenticate with your application, you need to use developer authenticated identities and interface with that third-party IdP. This blog post describes what is needed from the third-party IdP, how to build a scalable backend for authentication, and how to access AWS services from the client.

As an example, this post will use Snapchat’s Login Kit to integrate with Amazon Cognito. The overall authentication flow for the integration is shown in Figure 1.

Figure 1: Overall authentication flow of integration

Prerequisites

The following are the prerequisites for integrating third-party IdP using developer authenticated identities.

The client SDK to authenticate with third-party IdP. This will handle client authentication and access token retrieval. In the example in this post, we use Snapchat’s Login Kit.
A method to authenticate access tokens that are retrieved from the third-party IdP. For this blog post, I am using an endpoint provided by Snapchat which will retrieve user data by passing in access tokens. A successful query of user data indicates the access token is valid.
Developer authenticated identities (identity pool) configured in Amazon Cognito. You will need to note the identity pool ID and the developer provider name you specify.

Client SDK

Follow the third-party client SDK instructions for implementing authentication in your application. Snapchat’s Login Kit provides an SDK to mount a login button in your app, which allows your users to authenticate with their Snapchat account credentials. After a user clicks the login button, they are redirected to Snapchat to log in. After successfully logging in, they are redirected back to your application with an access token. The handleResponseCallback is where you can implement an API call to your developer backend, passing the access token in order to retrieve credentials from Amazon Cognito to access AWS services. The following code example mounts a login button on your application, to allow the user to authenticate with Snapchat and retrieve an access token.

var loginButtonIconId = "<HTML div id>";
// Mount Login Button
snap.loginkit.mountButton(loginButtonIconId, {
    clientId: "<Snapchat Client Id>",
    redirectURI: "<Developer backend url>",
    scopeList: [
        "user.display_name",
        "user.bitmoji.avatar",
        "user.external_id",
    ],
    handleResponseCallback: function (token) {
        // <IMPLEMENT API CALL TO DEVELOPER BACKEND PASSING SNAPCHAT ACCESS TOKEN>
    }
});

Developer backend

The developer backend is responsible for authenticating access tokens from the third-party IdP and exchanging them for an OpenID Connect token that can be used to access AWS services. For this example, you will use Amazon API Gateway with AWS Lambda with the IAM permissions to call getOpenIdTokenForDeveloperIdentity.

The following is a code example to authenticate access tokens with Snapchat.

let result = await axios({
    method: 'post',
    url: 'https://kit.snapchat.com/v1/me',
    headers: {'Content-Type': 'application/json', 
             'Authorization': 'Bearer ' + body.access_token},
    data: {"query":"{me{displayName bitmoji{avatar} externalId}}"}
});

After successful authentication, call getOpenIdTokenForDeveloperIdentity with the identity pool ID and the logins map. The logins map maps the developer provider name to the external ID returned by the IdP. The call returns an OpenID Connect token and the Amazon Cognito identifier (identity ID), which can be sent to the application. The identity ID and token can then be used to access AWS services. The following is a code example to retrieve the identity ID and OpenID Connect token after authenticating with Snapchat.

if(result.status == 200) {
    returnbody = JSON.stringify(await cognitoidentity.getOpenIdTokenForDeveloperIdentity({
        IdentityPoolId: '<Identity Pool Id>',
        Logins: {
            '<Developer provider name>': result.data.data.me.externalId,  
        }
    }).promise());
}
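
On the client, the identity ID and token returned by the backend are typically exchanged for temporary AWS credentials through the Amazon Cognito GetCredentialsForIdentity API. The sketch below only builds the request parameters. The `buildCredentialsRequest` helper and the sample values are hypothetical. For developer authenticated identities, the logins map on this call uses the fixed key cognito-identity.amazonaws.com rather than the developer provider name.

```javascript
// Build parameters for Cognito's GetCredentialsForIdentity API from the
// { IdentityId, Token } object returned by getOpenIdTokenForDeveloperIdentity.
function buildCredentialsRequest(backendResponse) {
  if (!backendResponse || !backendResponse.IdentityId || !backendResponse.Token) {
    throw new Error('Backend response must contain IdentityId and Token');
  }
  return {
    IdentityId: backendResponse.IdentityId,
    Logins: {
      // Fixed key for tokens issued by the developer authenticated flow.
      'cognito-identity.amazonaws.com': backendResponse.Token,
    },
  };
}

// Invented sample values standing in for the backend's response.
const credentialsRequest = buildCredentialsRequest({
  IdentityId: 'us-east-1:11111111-2222-3333-4444-555555555555',
  Token: 'example-openid-connect-token',
});
console.log(Object.keys(credentialsRequest.Logins)); // [ 'cognito-identity.amazonaws.com' ]
```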

Considerations

The following are considerations about Amazon Cognito identity pools that you should keep in mind when building your solution.

The identity ID returned by getOpenIdTokenForDeveloperIdentity is mapped to the external ID provided in the logins map. This mapping is stored in Amazon Cognito. You can then use the identity ID to identify who is calling AWS services, which is especially useful for auditing purposes.
This solution is suitable for multi-regional deployments. All that is required is that you copy the identity pool to another AWS Region. Please note that the identity ID will be different in each Region, but that should not affect functionality.

Conclusion

By using developer authenticated identities, you can integrate your application with Amazon Cognito and a third-party IdP with the proper prerequisites. For more examples of using developer authenticated identities, see Developer Authenticated Identities (Identity Pools) in the Amazon Cognito Developer Guide. If you have feedback about this post, submit comments in the Comments section below or start a new thread on one of our forums.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

How Banks Can Use AWS to Meet Compliance

2021-06-29 Jiwan Panjiker

Post Syndicated from Jiwan Panjiker original https://aws.amazon.com/blogs/architecture/how-banks-can-use-aws-to-meet-compliance/

Since the 2008 financial crisis, banking supervisory institutions such as the Basel Committee on Banking Supervision (BCBS) have strengthened regulations. There is now increased oversight over the financial services industry. For banks, making the necessary changes to comply with these rules is a challenging, multi-year effort.

Basel IV, a massive update to existing rules, is due for implementation in January 2023. Basel IV standardizes the approach to calculating credit risk, increases the impact of risk-weighted assets (RWAs) and emphasizes data transparency.

Given the complexity of data, modeling, and numerous assumptions that have to be made, compliance under Basel IV implementation will be challenging. Standardization omits nuances unique to your business, which can drive up costs, but violating guidelines will result in steep penalties.

This post will address these challenges by outlining a mechanism that facilitates a healthy, data-driven dialogue between banks and regulators to better achieve compliance objectives. The reference architecture will focus on enabling fast, iterative releases with the help of serverless AWS services.

There are four key actions to take in order to support this mechanism:

Automate data management
Establish a continuous integration/continuous delivery (CI/CD) pipeline
Enable fast, point-in-time audit replays
Set up proactive monitoring and notifications

Automate data management

Due to frequent merger activity, banks typically consist of a web of integrated systems and siloed business units, making it difficult to consolidate data. Under Basel IV guidelines, auditors want banks to provide detailed data in a presentable way.

You can tackle this first challenge by establishing a data pipeline as shown in Figure 1. Take inventory of each data source as it is incorporated into the pipeline. Identify the critical internal and external data sources that will be used to populate the initial landing area. Amazon Simple Storage Service (Amazon S3) is a great choice for this.

Figure 1. Data pipeline that cleans, processes, and segments data

Amazon S3 is a highly available, durable service that is a popular data lake solution. Amazon S3 offers write once, read many (WORM) storage capabilities like S3 Glacier Vault Lock and S3 Object Lock to protect the integrity of your archived data in accordance with U.S. SEC and FINRA rules.

Basel IV regulations also require banks to use many attributes to develop accurate credit risk models. The attributes can be a mix of datasets such as financial statements, internal balanced scorecards, macro-economic data, and credit ratings. The risk models themselves can also be segmented by portfolio types, industry segments, asset types and much more.

You can split data into different domains and designate data owners with separate S3 buckets. Credit risk model developers, analysts, and data scientists can then use the structure of the S3 buckets to pull together relevant datasets. They can then store the outputs in S3 buckets.

To support fast, automated data retrieval, store object metadata in a highly scalable, queryable database. You can set up Amazon S3 so that an event initiates a function that populates Amazon DynamoDB. Developers can use AWS Lambda to write these functions using popular languages like Python.
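As an illustration of the event-driven metadata capture described above, the following sketch maps an Amazon S3 event notification to items that a Lambda function could write to DynamoDB. It is written in JavaScript rather than Python to match the other code in this archive. The attribute names (objectId, sizeBytes, and so on) and the sample bucket are assumptions, and the actual write to DynamoDB is left out so that the example stays self-contained.

```javascript
// Hypothetical mapping from one S3 event record to a metadata item.
function toMetadataItem(record) {
  const { bucket, object } = record.s3;
  // Object keys in S3 event notifications are URL-encoded, with spaces as '+'.
  const key = decodeURIComponent(object.key.replace(/\+/g, ' '));
  return {
    objectId: `${bucket.name}/${key}`, // assumed partition key
    bucket: bucket.name,
    key,
    sizeBytes: object.size,
    eTag: object.eTag,
    eventTime: record.eventTime,
  };
}

// Converts a whole S3 event (which can carry several records) into items.
function toMetadataItems(event) {
  return event.Records.map(toMetadataItem);
}

// Sample event trimmed to the fields used above; values are placeholders.
const sampleEvent = {
  Records: [
    {
      eventTime: '2021-06-29T12:00:00.000Z',
      s3: {
        bucket: { name: 'example-credit-risk-landing' },
        object: { key: 'credit-risk/2021+Q2/ratings.csv', size: 2048, eTag: 'abc123' },
      },
    },
  ],
};

const metadataItems = toMetadataItems(sampleEvent);
console.log(metadataItems);
```

In a Lambda handler, each item would then be written to the metadata table with a put operation from the DynamoDB client.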

With AWS Glue, you can automate extract, transform, and load (ETL) processes to clean and move data to the different S3 buckets. AWS Glue can also support data operations by automatically cataloging your various data sources.

Taking a structured approach will simplify data governance and transparency as the business continues to grow and operate.

Establish a CI/CD pipeline

Adopt tools that machine learning teams can use to build a streamlined CI/CD solution as demonstrated in Figure 2.

Figure 2. An end-to-end machine learning development and deployment pipeline

Using tightly integrated AWS services, your teams can minimize time spent managing tools and deployment processes, and instead, focus on tuning the models and analyzing the results.

Amazon SageMaker brings together a powerful set of machine learning capabilities on the AWS Cloud. It helps data scientists and engineers build insightful models. Figure 2 depicts the high-level architecture and shows how Amazon SageMaker Pipelines helps teams orchestrate the automation and deployment processes.

The core of the pipeline uses a set of AWS deployment services so that your teams can collaborate and review effectively. With AWS CodeCommit, your teams can set up a Git-based repository to store and version models for data processing, training, and evaluation. The repository can also store the AWS CloudFormation code and configuration files used for deployment. You can use AWS CodePipeline and AWS CodeBuild to create and update a model endpoint based on the approved/reviewed changes.

Updates detected in the AWS CodeCommit repository start the pipeline, and a deployment is initiated whenever a new model version is added to the Model Registry. Amazon S3 can be used to store generated model artifacts, historical data, and models.

Enable fast, point-in-time audit replays

Figure 3. Containers offer a lightweight, powerful solution to run audits using historical assets

One of the main themes of Basel IV is transparency. Figure 3 illustrates a solution to build trust with regulators by allowing them to verify and understand modeling activity.

A lightweight application is hosted in AWS Fargate and enables auditors to re-run Basel credit risk models under specified conditions. With AWS Fargate, you don’t need to manually manage instances or container orchestration. Configure the CPU or memory specifications at the task level and set guidelines around scalability for your service. Your tasks then scale up and down automatically, based on demand, and will optimize cost efficiency and availability.

Figure 3 shows the following:

The application takes inputs such as date, release version, and model type.
It then queries DynamoDB with this information.
The query will return the data necessary to retrieve model artifacts from previous CI/CD deployments and relevant datasets from historical S3 buckets.
Using this information, it can spin up as many containers as needed to run the model.
It then stores the outputs in a separate S3 bucket.
Auditors will have a detailed trace of all the attributes, assumptions, and data that went into the modeling effort. To streamline this process, the app can also compare the outputs of the historical runs to the recent replay and highlight any significant deviations.
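The lookup and comparison steps above can be sketched as follows. This is a minimal illustration and not the application from Figure 3: the table name ModelReleases, the attribute names, the metric names, and the 1 percent tolerance are all assumptions. The query parameters use the document-style format that the DynamoDB document client accepts.

```javascript
// Builds DynamoDB query parameters from the auditor's inputs.
// Table and attribute names are hypothetical.
function buildReleaseQuery({ modelType, releaseVersion, runDate }) {
  return {
    TableName: 'ModelReleases',
    KeyConditionExpression: '#modelType = :modelType AND #releaseVersion = :releaseVersion',
    FilterExpression: '#runDate = :runDate',
    ExpressionAttributeNames: {
      '#modelType': 'modelType',
      '#releaseVersion': 'releaseVersion',
      '#runDate': 'runDate',
    },
    ExpressionAttributeValues: {
      ':modelType': modelType,
      ':releaseVersion': releaseVersion,
      ':runDate': runDate,
    },
  };
}

// Returns the names of metrics whose replayed value differs from the
// historical value by more than the relative tolerance.
function findDeviations(historical, replay, tolerance = 0.01) {
  return Object.keys(historical).filter((metric) => {
    const expected = historical[metric];
    const actual = replay[metric];
    if (typeof actual !== 'number') return true; // a missing metric counts as a deviation
    const scale = Math.max(Math.abs(expected), Number.EPSILON);
    return Math.abs(actual - expected) / scale > tolerance;
  });
}

const auditQuery = buildReleaseQuery({
  modelType: 'credit-risk',
  releaseVersion: 'v1.4.0',
  runDate: '2021-03-31',
});
const deviations = findDeviations(
  { rwa: 100, probabilityOfDefault: 0.02 },  // outputs stored from the original run
  { rwa: 100.5, probabilityOfDefault: 0.03 } // outputs from the replay
);
console.log(auditQuery, deviations);
```

A relative tolerance is used here so that metrics on very different scales can share one threshold; an auditor-facing tool might prefer per-metric limits.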

Though internal models will be de-emphasized under Basel IV, banks will continue to run internal models as a benchmark against the broader standards. Schedule AWS Fargate tasks to run these models regularly to capitalize on highly performant compute services while minimizing costs.

Set up proactive monitoring and notifications

Figure 4. Scheduled jobs can send out notifications using Amazon SNS when certain thresholds are breached

The last principle is based around establishing an early warning system, enabling banks to take on a more proactive role in maintaining compliance.

With automated monitoring and notifications, banks will be able to respond quickly to potential concerns. For instance, there can be a daily scheduled job that launches containers and runs the models against the latest data. If any thresholds are breached, alerts can be sent out via SMS or email. Operational teams can be subscribed to certain message topics using Amazon Simple Notification Service (SNS). They can then respond before actual compliance issues emerge.
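A scheduled job of this kind could compare the latest model outputs against configured limits and prepare a notification when a limit is exceeded. The following sketch shows one way to do that. The metric names, limits, and topic ARN are placeholders chosen for illustration, and the object returned by buildAlert follows the parameter names of the Amazon SNS Publish operation (TopicArn, Subject, Message).

```javascript
// Returns one entry per metric that exceeds its configured limit.
function findBreaches(metrics, thresholds) {
  return Object.entries(thresholds)
    .filter(([name, limit]) => metrics[name] > limit)
    .map(([name, limit]) => ({ name, value: metrics[name], limit }));
}

// Builds Amazon SNS Publish parameters, or null when nothing was breached.
function buildAlert(topicArn, breaches) {
  if (breaches.length === 0) return null;
  return {
    TopicArn: topicArn,
    Subject: `Compliance alert: ${breaches.length} threshold(s) breached`,
    Message: breaches
      .map((b) => `${b.name} is ${b.value}, above the limit of ${b.limit}`)
      .join('\n'),
  };
}

// Hypothetical metrics and limits for a daily run.
const breaches = findBreaches(
  { rwaDensity: 0.62, expectedLossRatio: 0.01 },
  { rwaDensity: 0.6, expectedLossRatio: 0.02 }
);
const alert = buildAlert(
  'arn:aws:sns:us-east-1:111122223333:compliance-alerts', // placeholder ARN
  breaches
);
console.log(alert);
```

Teams subscribed to the topic by email or SMS would receive the message once the parameters are passed to the SNS client.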

Conclusion

With a Well-Architected approach, AWS helps you control your data, deploy new features, and embrace a serverless approach. This frees you to innovate quickly and address regulatory challenges.

You can iterate with new AWS services and bring machine learning to bear on various streams of data to identify high-impact pools of value. You can get a clearer picture of the data to make it easier to identify areas where you can reduce RWAs. Using Amazon S3, you can turn on AWS analytics services such as Amazon QuickSight and Amazon Athena to visualize the data. You’ll be able to fulfill reporting requirements such as those found in regulatory studies like CCAR, DFAST, CECL, and IFRS 9.

For more information about establishing a data pipeline, read Lake House Formation Architecture. It is a powerful pattern that combines a few concepts that will help bring your data together cohesively. To set up a robust CI/CD pipeline, explore the AWS Serverless CI/CD Reference Architecture.

Integrate AWS Network Firewall with your ISV Firewall Rulesets

2021-06-29 Mony Kiem

Post Syndicated from Mony Kiem original https://aws.amazon.com/blogs/architecture/integrate-aws-network-firewall-with-your-isv-firewall-rulesets/

You may have requirements to leverage on-premises firewall technology in AWS by using your existing firewall implementation. As you move these workloads to AWS or launch new ones, you may replicate your existing on-premises firewall architecture. In this case, you can run partner appliances such as Palo Alto and Fortinet firewall appliances on Amazon EC2 instances.

Ensure that the firewall and intrusion prevention system (IPS) rules that protect your on-premises data center will also protect your Amazon Virtual Private Cloud (VPC). These rules must be frequently updated to ensure protection against the latest security threats. Many enterprises do not want to manage multiple rulesets across their entire hybrid architecture.

AWS Network Firewall takes on this undifferentiated heavy lifting by providing a managed service that runs a fleet of firewall appliances and handles everything from patching to security updates. It uses the free and open-source intrusion prevention system (IPS), Suricata, for stateful inspection. Suricata is a network threat detection engine capable of real-time intrusion detection (IDS). It also provides inline intrusion prevention (IPS), network security monitoring (NSM), and offline packet capture processing. Customers can now import their existing IPS rules from firewall provider software, provided that the rules adhere to the open-source Suricata format. This enables a network security model for your hybrid architecture that minimizes operational overhead while achieving consistent protection.
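To make the import path more concrete, the following sketch wraps two Suricata-format rules in the parameters of the AWS Network Firewall CreateRuleGroup operation. The rules themselves, the rule group name, and the capacity value are illustrative assumptions and do not come from any partner ruleset.

```javascript
// Example Suricata-format rules (illustrative only).
const suricataRules = [
  'drop tcp any any -> any 23 (msg:"Block Telnet"; sid:1000001; rev:1;)',
  'alert http any any -> any any (msg:"HTTP request observed"; sid:1000002; rev:1;)',
].join('\n');

// Builds CreateRuleGroup parameters for a stateful rule group whose
// rules are supplied as a Suricata-compatible string.
function buildStatefulRuleGroupParams(name, capacity, rulesString) {
  return {
    RuleGroupName: name,
    Type: 'STATEFUL',
    Capacity: capacity,
    RuleGroup: {
      RulesSource: { RulesString: rulesString },
    },
  };
}

const ruleGroupParams = buildStatefulRuleGroupParams(
  'imported-ips-rules', // hypothetical name
  100,                  // hypothetical capacity
  suricataRules
);
console.log(JSON.stringify(ruleGroupParams, null, 2));
```

Partner integrations automate this step, so a hand-built rule group like this is mainly useful for testing or for rules you maintain yourself.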

Overview of AWS services used

The following are AWS services that are used in our solution. These are the fundamental building blocks of a hybrid architecture on AWS.

AWS Network Firewall (ANFW): a stateful, managed, network firewall and intrusion detection and prevention service. You can filter network traffic at your VPC using AWS Network Firewall. AWS Network Firewall pricing is based on the number of firewalls deployed and the amount of traffic inspected. There are no upfront commitments, and you pay only for what you use.
AWS Transit Gateway (TGW): a network transit hub that you use to interconnect your virtual private clouds (VPCs) and on-premises networks. Transit Gateway enables customers to connect thousands of VPCs. You can attach all your hybrid connectivity (VPN and Direct Connect connections) to a single Transit Gateway. This enables you to consolidate and control your organization’s entire AWS routing configuration in one place.
AWS Direct Connect, AWS Site-to-Site VPN, and Amazon VPC are other core components of this hybrid architecture.

Hybrid architecture with centralized network inspection

The example architecture in Figure 1 depicts the deployment model of a centralized network security architecture. It shows all inbound and outbound traffic flowing through a single VPC for inspection. The centralized inspection architecture incorporates the use of AWS Network Firewall deployed in an inspection VPC. All traffic is routed from other VPCs through AWS Transit Gateway (TGW). The threat intelligence rulesets are managed by a partner integration solution and can be automatically imported into AWS Network Firewall. This will allow you to use the same ruleset that is deployed on-premises. It will reduce inconsistent and manual processes to maintain and update the rules.

Figure 1. Centralized inspection architecture with AWS Network Firewall and imported rules

The partner integration with AWS Network Firewall (ANFW) will work for both a centralized and a distributed inspection architecture. The AWS Network Firewall service will house the rulesets, and you only need to deploy a firewall endpoint in the Availability Zone of your VPC. In the centralized architecture deployment, all traffic originating from the attached VPCs is routed to the TGW. On the TGW route table, all traffic is routed to the inspection VPC attachment ID. In the inspection VPC, the route table associated with the subnet where the TGW elastic network interface (ENI) is created has a default route via the ANFW endpoint, and return traffic from the ANFW endpoint is routed back to the TGW. If your firewall endpoint is deployed across multiple Availability Zones (AZs), use the TGW appliance mode to maintain traffic flow symmetry. This ensures that return traffic is processed in the same AZ. For further details on how to set up your network routing, reference the Deployment models for AWS Network Firewall blog post.
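Appliance mode is a setting on the Transit Gateway attachment of the inspection VPC. As a small sketch, the following builds the parameters for the Amazon EC2 ModifyTransitGatewayVpcAttachment operation with appliance mode enabled. The attachment ID is a placeholder, and the helper name is an assumption.

```javascript
// Builds parameters for the EC2 ModifyTransitGatewayVpcAttachment operation
// so that both directions of a flow use the same Availability Zone.
function buildApplianceModeParams(attachmentId) {
  if (!attachmentId.startsWith('tgw-attach-')) {
    throw new Error('Expected a Transit Gateway attachment ID (tgw-attach-...)');
  }
  return {
    TransitGatewayAttachmentId: attachmentId,
    Options: { ApplianceModeSupport: 'enable' },
  };
}

// Placeholder ID for the inspection VPC attachment.
const applianceModeParams = buildApplianceModeParams('tgw-attach-0123456789abcdef0');
console.log(applianceModeParams);
```

Only the inspection VPC attachment needs this setting; the attachments of the spoke VPCs can keep their defaults.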

AWS Network Firewall partner integrations

Figure 1 depicts two partner integrations, Trend Micro and Fortinet. View the complete and latest list of partner integrations with AWS Network Firewall.

If you are already a user of Trend Micro for your threat intelligence, you can leverage this deployment model to standardize your hybrid cloud security. Trend Micro enables you to deploy your AWS managed network infrastructure and pair it with a partner-supported threat intelligence. This focuses on detecting and disrupting malware in your environments. You just need to enable the Sharing capability on Trend Micro Cloud One. For further information, see these detailed instructions.

For existing users of Fortinet that are using its managed IPS rulesets, you can automatically deploy updated IPS rulesets to AWS Network Firewall. This will ensure consistent protection across your application landscape. For more details on this integration, visit the partner page.

Getting started with AWS Network Firewall

You can get started with this pattern through the following high-level steps, with links to detailed instructions along the way.

Determine your current networking architecture and cross reference it with the different deployment models supported by AWS Network Firewall. You can learn more about your different options in the blog Deployment models for AWS Network Firewall. The deployment model will determine how you set up your route tables and where you will deploy your AWS Network Firewall endpoint.
Visit the AWS Network Firewall Partners page to confirm your provider’s integration with ANFW and follow the integration instructions from the partner’s documentation.
Get started with AWS Network Firewall by visiting the Amazon VPC Console to create or import your firewall rules. You can group them into policies and apply them to the VPCs you want to protect per the developer guide.
To start inspecting traffic, deploy your Network Firewall endpoint in your inspection VPC.

Conclusion

You may need to operate a hybrid architecture using the same firewall and IPS rules for both your on-premises and cloud networks. For implementing these rules in the cloud, you can run partner firewall appliances on EC2 instances. This model of operation requires some heavy lifting.

Instead, you can set up AWS Network Firewall quickly, and not worry about deploying and managing any infrastructure. AWS Network Firewall automatically scales with your organization’s network traffic. AWS provides a flexible rules engine that enables you to define firewall rules to control this traffic. To simplify how organizations determine what rules to define, Fortinet and Trend Micro have made managed rulesets available through AWS Marketplace offerings. These can be deployed to your environment with a few clicks. These partners remove complexity for security teams so they can easily create and maintain rules to take full advantage of AWS Network Firewall.

Customize requests and responses with AWS WAF

2021-06-21 Kaustubh Phatak

Post Syndicated from Kaustubh Phatak original https://aws.amazon.com/blogs/security/customize-requests-and-responses-with-aws-waf/

In March 2021, AWS introduced support for custom responses and request header insertion with AWS WAF. This blog post will demonstrate how you can use these new features to customize your AWS WAF solution to improve the user experience and security posture of your applications.

HTTP response codes are standard responses sent by a server in response to a client request. When AWS WAF blocks a request, the default response code sent back to the client is HTTP 403 (Forbidden). The HTTP 403 response code is associated with a default error page built by the web server engine. This page is typically generic and not user-friendly. With the Custom Response feature, AWS WAF now allows you to modify the status code from HTTP 403 to HTTP 2xx, 3xx, 4xx, and 5xx, and to return a custom body when the request is blocked by AWS WAF. Because these custom responses are unique to AWS WAF, they also allow you to distinguish requests blocked by AWS WAF from those rejected by your server.

When inspected HTTP requests are allowed by AWS WAF, the request is passed through to the associated resource. Now you have the ability to insert custom HTTP request headers for each rule inside your web access control list (web ACL) set to allow or count, and you can create additional logic with your application by tagging these requests with the headers.

We will be outlining three different use cases to show how you can use these AWS WAF features.

Use case 1: Custom response code

In this example, you will use the custom response code feature to redirect a viewer request to a different webpage. You use HTTP 3xx response codes to redirect the incoming request, and use the HTTP header Location to specify the website URL for redirection. Figure 1 shows an overview of this workflow.

Figure 1: Overview of using custom response code to redirect the request

Figure 1 illustrates the following steps:

AWS WAF has a rate-based rule to allow 100 requests every 5 minutes.
A user sends multiple requests and breaches the threshold of the AWS WAF rate-based rule.
AWS WAF blocks any further requests from the user.
The AWS WAF custom response code feature modifies the response code from HTTP 403 to HTTP 302 – Temporary Redirect with a Location header specifying the redirected URL.

Configure the AWS WAF web ACL and rule for custom response code

To create an Application Load Balancer and associate it to AWS WAF

Follow the steps to configure a load balancer and a listener to create an internet-facing load balancer in the N. Virginia AWS Region.
After the load balancer is created, open the AWS WAF console.
In the navigation pane, choose Web ACLs, and then choose Create web ACL in the US East (N. Virginia) Region.
For Name, enter the name that you want to use to identify this web ACL.
For Resource type, choose the Application Load Balancer that you created in Step 1 and choose Add.
Choose Next.
Choose Add rules and then choose Add my own rules and rule groups.
For Name, enter the name that you want to use to identify this rule.
For Rule type, choose Rate-based rule.
For Rate limit, enter 100.
Under Actions, keep the default action of Block and enable Custom response.
Enter the response code as 302.
Under Response headers, add a new custom header with Key as Location and Value as example.com
Choose Add rule.
Continue to choose Next to reach the summary page, and then choose Create new web ACL.

After the web ACL is created, you should see the web ACL configuration as shown in Figure 2.

Figure 2: Custom Response – Web ACL configuration

Now, the setup is complete. You have a web ACL with a rate-based rule configured to redirect blocked requests to a different URL. To verify that the setup is working as expected, you can enable and analyze the AWS WAF logs for a test user that is sending more than 100 requests in a period of 5 minutes.
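For readers who prefer to define the rule as JSON rather than through the console, the following sketch builds an equivalent rate-based rule in the AWS WAFv2 rule format. The rule name, priority, and metric name are assumptions, and the redirect target is written as a full placeholder URL (https://example.com/) because a Location header normally carries a complete URL.

```javascript
// Builds a WAFv2 rate-based rule whose Block action returns HTTP 302
// with a Location header instead of the default HTTP 403.
function buildRedirectRateRule({ name, priority, limit, redirectUrl }) {
  return {
    Name: name,
    Priority: priority,
    Statement: {
      // Counts requests per source IP over a 5-minute window.
      RateBasedStatement: { Limit: limit, AggregateKeyType: 'IP' },
    },
    Action: {
      Block: {
        CustomResponse: {
          ResponseCode: 302,
          ResponseHeaders: [{ Name: 'Location', Value: redirectUrl }],
        },
      },
    },
    VisibilityConfig: {
      SampledRequestsEnabled: true,
      CloudWatchMetricsEnabled: true,
      MetricName: name,
    },
  };
}

const redirectRule = buildRedirectRateRule({
  name: 'rate-limit-redirect',        // hypothetical rule name
  priority: 0,
  limit: 100,
  redirectUrl: 'https://example.com/', // placeholder redirect target
});
console.log(JSON.stringify(redirectRule, null, 2));
```

The object can be placed in the Rules array of a web ACL definition, for example when creating the web ACL through the API or an infrastructure template.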

In Figure 3, you can see the custom response code of 302 being sent to the test user instance.

Figure 3: Verifying the AWS WAF logs for custom response

In the example in Figure 3, we tested our configuration by having a user send more than 100 requests from a PC to trigger a block. To verify the Location header, we analyzed the network traffic by using the developer tools of the browser. As you can see in Figure 4, the response includes the custom header Location with the configured redirect URL.

Figure 4: Verifying response in the browser tools for custom response

Use case 2: Custom error page

In this example, you will use the AWS WAF custom error page to route the request to a different error page, rather than the default web server error pages. As you can see in Figure 5, the workflow is similar to use case 1.

Figure 5: Overview of using custom error page to redirect the request

Figure 5 shows the following steps:

AWS WAF has a rate-based rule to allow 100 requests every 5 minutes.
A user sends multiple requests and breaches the threshold of the AWS WAF rate-based rule.
AWS WAF blocks any further requests from the user.
AWS WAF custom response code feature modifies the response code to HTTP 307 – Temporary Redirect and responds with a custom error page with the message Too Many Requests.

To configure the AWS WAF web ACL and rule for custom error page

In the AWS WAF console, in the navigation pane, choose Web ACLs, and then choose the web ACL that you created in use case 1.
On the Rules tab, choose Add rules and then choose Add my own rules and rule groups.
For Name, enter the name that you want to use to identify this rule.
For Rule type, choose Rate-based rule.
For Rate limit, enter 100.
Under Actions, keep the default action of Block and enable Custom response.
For the response code, enter 307.
For Choose how you would like to specify the response body, select Create a custom response body.
A pop-up box will open. Enter a name for the Response body object name.
For Content type, you can select JSON, HTML, or Plain Text. In this example, we select Plain Text.
For Response body, enter any sample text. In this example, we enter This is a sample custom error page. Then choose Save.
Choose Add Rule.
For Set rule priority, move your new rule to the top so that this rule is processed first.

Figure 6 shows a summary of the rate-based rule created for use case 2.

Figure 6: Custom error page – Web ACL configuration

Now, the setup is complete. You have a web ACL with a rate-based rule configured to respond to blocked requests with a custom error page. To verify that the setup is working as expected, you can analyze the AWS WAF logs for a test user that is sending more than 100 requests in a period of 5 minutes. Figure 7 shows the custom response code of 307 being sent to our example test user instance.
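In the AWS WAFv2 JSON format, a custom response body is defined once at the web ACL level and then referenced from the rule by its key. The sketch below shows that relationship for this use case. The body key, rule name, and priority are assumptions; the body text and response code follow the console steps above.

```javascript
// Builds the two pieces needed for a custom error page:
// the named response body (web ACL level) and the rule that references it.
function buildCustomErrorPage({ ruleName, priority, limit, bodyKey, bodyText }) {
  return {
    customResponseBodies: {
      // ContentType can be TEXT_PLAIN, TEXT_HTML, or APPLICATION_JSON.
      [bodyKey]: { ContentType: 'TEXT_PLAIN', Content: bodyText },
    },
    rule: {
      Name: ruleName,
      Priority: priority,
      Statement: {
        RateBasedStatement: { Limit: limit, AggregateKeyType: 'IP' },
      },
      Action: {
        Block: {
          CustomResponse: { ResponseCode: 307, CustomResponseBodyKey: bodyKey },
        },
      },
      VisibilityConfig: {
        SampledRequestsEnabled: true,
        CloudWatchMetricsEnabled: true,
        MetricName: ruleName,
      },
    },
  };
}

const errorPage = buildCustomErrorPage({
  ruleName: 'rate-limit-error-page', // hypothetical rule name
  priority: 0,
  limit: 100,
  bodyKey: 'sample-error-page',      // hypothetical body key
  bodyText: 'This is a sample custom error page.',
});
console.log(JSON.stringify(errorPage, null, 2));
```

In a web ACL definition, customResponseBodies corresponds to the CustomResponseBodies property and rule is one entry in the Rules array.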

Figure 7: Verifying responseCodeSent in the AWS WAF logs

When you access the load balancer URL from your browser, you should see the custom error page similar to Figure 8.

Figure 8: Verifying response using the browser

Use case 3: Header insertion for request tagging

This example demonstrates the AWS WAF header insertion capability to route the request based on geolocation. You will use the header country-check to notify the Application Load Balancer to route the request to a different target group, by using the Application Load Balancer advanced routing feature.

Figure 9: Overview of using request header insertion to tag the request to be processed downstream

Figure 9 shows the following steps:

User sends request to the Application Load Balancer that is attached with AWS WAF.
AWS WAF applies a geographic location rule in Count mode that matches requests from unexpected countries without blocking them.
AWS WAF adds a custom HTTP request header to tag this request.
An Application Load Balancer listener rule is configured to route requests based on this header.
Request tagged by AWS WAF with the custom header is routed to a separate target group.

To add a geographical location rule for request header insertion

In the AWS WAF console, in the navigation pane, choose Web ACLs, and then choose the web ACL that you created in use case 1.
On the Rules tab, choose Add rules and then choose Add my own rules and rule groups.
For Name, enter the name that you want to use to identify this rule.
For Rule type, choose Regular rule.
For If a request, select doesn’t match the statement (NOT).
For Inspect, select Originates from a country in.
In this example, normal traffic originates from United States; so under Country codes, select United States – US.
For IP address to use to determine the country of origin, choose Source IP Address.
For Action, choose Count. This will allow requests to be logged and tagged while processing other rules that follow.
Expand Custom request, choose Add new custom header. For Key, choose country-check and for Value, choose true.

Note: Custom request headers are prefixed with x-amzn-waf-, so the application receives this header as x-amzn-waf-country-check.
Choose Save rule.
For Set rule priority, move your new rule to the top so that this rule is processed first.
Choose Save.

Figure 10: Header insertion – Web ACL configuration

For this use case, you set up a geographical location rule to check for requests that originate from countries outside of the normal traffic flow of your application (in this example, the United States). You do not want to block the requests right away, but instead tag the requests that match this AWS WAF rule for further validation downstream by the application logic. To route the tagged requests differently, you use the Application Load Balancer advanced request routing feature to send traffic tagged by AWS WAF to a different target group.
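The two halves of this setup can be sketched as JSON: the AWS WAF rule that counts and tags requests from outside the United States, and the Application Load Balancer listener rule that forwards tagged requests to a separate target group. The rule name, priorities, and ARNs below are placeholders. AWS WAF adds the x-amzn-waf- prefix to inserted headers, so the listener rule matches on x-amzn-waf-country-check.

```javascript
// WAFv2 rule: match requests NOT from the listed countries, count them,
// and insert a custom request header.
function buildGeoTagRule(name, priority, allowedCountryCodes) {
  return {
    Name: name,
    Priority: priority,
    Statement: {
      NotStatement: {
        Statement: { GeoMatchStatement: { CountryCodes: allowedCountryCodes } },
      },
    },
    Action: {
      Count: {
        CustomRequestHandling: {
          InsertHeaders: [{ Name: 'country-check', Value: 'true' }],
        },
      },
    },
    VisibilityConfig: {
      SampledRequestsEnabled: true,
      CloudWatchMetricsEnabled: true,
      MetricName: name,
    },
  };
}

// Elastic Load Balancing CreateRule parameters: forward requests that
// carry the prefixed header to a dedicated target group.
function buildTaggedTrafficListenerRule(listenerArn, targetGroupArn, priority) {
  return {
    ListenerArn: listenerArn,
    Priority: priority,
    Conditions: [
      {
        Field: 'http-header',
        HttpHeaderConfig: {
          HttpHeaderName: 'x-amzn-waf-country-check',
          Values: ['true'],
        },
      },
    ],
    Actions: [{ Type: 'forward', TargetGroupArn: targetGroupArn }],
  };
}

const geoTagRule = buildGeoTagRule('geo-country-check', 0, ['US']);
const listenerRule = buildTaggedTrafficListenerRule(
  'arn:aws:elasticloadbalancing:us-east-1:111122223333:listener/app/example/abc/def',    // placeholder
  'arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/validation/0123abcd', // placeholder
  10
);
console.log(JSON.stringify({ geoTagRule, listenerRule }, null, 2));
```

Because the WAF rule uses the Count action, evaluation continues with the remaining rules in the web ACL, and the header is only delivered if the request is ultimately allowed.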

You can verify the header inserted by the rule by enabling AWS WAF full logs and looking at the requestHeadersInserted log field, as shown in Figure 11.
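The same check can be scripted against exported log records. The following sketch assumes that requestHeadersInserted is a list of objects with name and value properties; that shape, the sample record, and the helper name are assumptions, so compare them with your own log output before relying on this.

```javascript
// Returns true if the log record shows that AWS WAF inserted the given header.
// Assumes requestHeadersInserted is an array of { name, value } objects.
function hasInsertedHeader(logRecord, headerName, expectedValue) {
  const inserted = logRecord.requestHeadersInserted || [];
  return inserted.some(
    (header) =>
      header.name.toLowerCase() === headerName.toLowerCase() &&
      header.value === expectedValue
  );
}

// Hypothetical log line, trimmed to the fields used here.
const sampleLogLine = JSON.stringify({
  action: 'ALLOW',
  requestHeadersInserted: [{ name: 'x-amzn-waf-country-check', value: 'true' }],
});

const tagged = hasInsertedHeader(
  JSON.parse(sampleLogLine),
  'x-amzn-waf-country-check',
  'true'
);
console.log(tagged);
```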

Figure 11: Verifying the AWS WAF logs for header insertion

Conclusion

AWS WAF provides the ability to create a custom response for blocked requests by changing the status code and response body. The header insertion capability allows you to tag requests allowed by AWS WAF for your application to perform another action.

In this post, we showed you three basic use cases to demonstrate how you can create a better user experience by redirecting users to another location or returning a custom error page instead of the default denied page. We also showed you how to create custom AWS WAF rules that tag requests so that your application logic can see that they have been inspected and make decisions based on this information.

If you’re new to AWS WAF, see Getting started with AWS WAF.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS WAF forum or contact AWS Support.


Approaches to meeting Australian Government gateway requirements on AWS

2021-06-14 John Hildebrandt

Post Syndicated from John Hildebrandt original https://aws.amazon.com/blogs/security/approaches-to-meeting-australian-government-gateway-requirements-on-aws/

Australian Commonwealth Government agencies are subject to specific requirements set by the Protective Security Policy Framework (PSPF) for securing connectivity between systems that are running sensitive workloads, and for accessing less trusted environments, such as the internet. These agencies have often met the requirements by using some form of approved gateway solution that provides network-based security controls.

This post examines the types of controls you need to provide a gateway that can meet Australian Government requirements defined in the Protective Security Policy Framework (PSPF) and the challenges of using traditional deployment models to support cloud-based solutions. PSPF requirements are mandatory for non-corporate Commonwealth entities, and represent better practice for corporate Commonwealth entities, wholly-owned Commonwealth companies, and state and territory agencies. We discuss the ability to deploy gateway-style solutions in the cloud, and show how you can meet the majority of gateway requirements by using standard cloud architectures plus services. We provide guidance on deploying gateway solutions in the AWS Cloud, and highlight services that can support such deployments. Finally, we provide an illustrative AWS web architecture pattern to show how to meet the majority of gateway requirements through Well-Architected use of services.

Australian Government gateway requirements

The Australian Government Protective Security Policy Framework (PSPF) highlights the requirement to use secure internet gateways (SIGs) and references the Australian Information Security Manual (ISM) control framework to guide agencies. The ISM has a chapter on gateways, which includes the following recommendations for gateway architecture and operations:

Provide a central control point for traffic in and out of the system.
Inspect and filter traffic.
Log and monitor traffic and gateway operation to a secure location. Use appropriate security event alerting.
Use secure administration practices, including multi-factor authentication (MFA) access control, minimum privilege, separation of roles, and network segregation.
Perform appropriate authentication and authorization of users, traffic, and equipment. Use MFA when possible.
Use demilitarized zone (DMZ) patterns to limit access to internal networks.
Test security controls regularly.
Set up firewalls between security domains and public network infrastructure.

Since the PSPF references the ISM, the agency should apply the overall ISM framework to meet ISM requirements such as governance and security patching for the environment. The ISM is a risk-based framework, and the risk posture of the workload and organization should inform how to assess the controls. For example, requirements for authentication of users might be relaxed for a public-facing website.

In traditional on-premises environments, some Australian Government agencies have mandated centrally assessed and managed gateway capabilities in order to drive economies of scale across multiple government agencies. However, the PSPF does provide the option for gateways used only by a single government agency to undertake their own risk-based assessment for the single agency gateway solution.

Other government agencies also have specific requirements to connect with cloud providers. For example, the U.S. Government Office of Management and Budget (OMB) mandates that U.S. government users access the cloud through a specific agency connection.

Connecting to the cloud through on-premises gateways

Given the existence of centrally managed off-cloud gateways, one approach by customers has been to continue to use these off-cloud gateways and then connect to AWS through the on-premises gateway environment by using AWS Direct Connect, as shown in Figure 1.

Figure 1: Connecting to the AWS Cloud through an agency gateway and then through AWS Direct Connect

Although this approach does work, and makes use of existing gateway capability, it has a number of downsides:

A potential single point of failure: If the on-premises gateway capability is unavailable, the agency can lose connectivity to the cloud-based solution.
Bandwidth limitations: The agency is limited by the capacity of the gateway, which might not have been developed with dynamically scalable and bandwidth-intensive cloud-based workloads in mind.
Latency issues: The requirement to traverse multiple network hops, in addition to the gateway, will introduce additional latency. This can be particularly problematic with architectures that involve API communications being sent back and forth across the gateway environment.
Castle-and-moat thinking: Relying only on the gateway as the security boundary can discourage agencies from using and recognizing the cloud-based security controls that are available.

Some of these challenges are discussed in the context of US Trusted Internet Connection (TIC) programs in this whitepaper.

Moving gateways to the cloud

In response to the limitations discussed in the last section, both customers and AWS Partners have built gateway solutions on AWS to meet gateway requirements while remaining fully within the cloud environment. See this type of solution in Figure 2.

Figure 2: Moving the gateway to the AWS Cloud

With this approach, you can fully leverage the scalable bandwidth that is available from the AWS environment, and you can also reduce latency issues, particularly when multiple hops to and from the gateway are required. This blog post describes a pilot program in the US that combines AWS services and AWS Marketplace technologies to provide a cloud-based gateway.

You can use AWS Transit Gateway (released after the referenced pilot program) to provide the option to centralize such a gateway capability within an organization. This makes it possible to utilize the gateway across multiple cloud solutions that are running in their own virtual private clouds (VPCs) and accounts. This approach also facilitates the principle of the gateway being the central control point for traffic flowing in and out. For more information on using AWS Transit Gateway with security appliances, see the Appliance VPC topic in the Amazon VPC documentation.

More recently, AWS has released additional services and features that can assist with delivering government gateway requirements.

Elastic Load Balancing Gateway Load Balancer provides the capability to deploy third-party network appliances in a scalable fashion. With this capability, you can leverage existing investment in licensing, use familiar tooling, reuse intellectual property (IP) such as rule sets, and reuse skills, because staff are already trained in configuring and managing the chosen device. You have one gateway for distributing traffic across multiple virtual appliances, while scaling the appliances up and down based on demand. This reduces the potential points of failure in your network and increases availability. Gateway Load Balancer is a straightforward way to use third-party network appliances from industry leaders in the cloud. You benefit from the features of these devices, while Gateway Load Balancer makes them automatically scalable and easier to deploy. You can find an AWS Partner with Gateway Load Balancer expertise on the AWS Marketplace. For more information on combining Transit Gateway and Gateway Load Balancer for a centralized inspection architecture, see this blog post. The post shows a centralized architecture for East-West (VPC-to-VPC) and North-South (internet or on-premises bound) traffic inspection, plus processing.

To further simplify this area for customers, AWS has introduced the AWS Network Firewall service. Network Firewall is a managed service that you can use to deploy essential network protections for your VPCs. The service is simple to set up and scales automatically with your network traffic so you don’t have to worry about deploying and managing any infrastructure. You can combine Network Firewall with Transit Gateway to set up centralized inspection architecture models, such as those described in this blog post.

Reviewing a typical web architecture in the cloud

In the last section, you saw that SIG patterns can be created in the cloud. Now we can put that in context with the layered security controls that are implemented in a typical web application deployment. Consider a web application hosted on Amazon Elastic Compute Cloud (Amazon EC2) instances, as shown in Figure 3, within the context of other services that will support the architecture.

Figure 3: Security controls in a web application hosted on EC2

Although this example doesn’t include a traditional SIG-type infrastructure that inspects and controls traffic before it’s sent to the AWS Cloud, the architecture has many of the technical controls that are called for in SIG solutions as a result of using the AWS Well-Architected Framework. We’ll now step through some of these services to highlight the relevant security functionality that each provides.

Network control services

Amazon Virtual Private Cloud (Amazon VPC) is a service you can use to launch AWS resources in a logically isolated virtual network that you define. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways. Amazon VPC lets you use multiple layers of security, including security groups and network access control lists (network ACLs), to help control access to Amazon EC2 instances in each subnet. Security groups act as a firewall for associated EC2 instances, controlling both inbound and outbound traffic at the instance level. A network ACL is an optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets. You might set up network ACLs with rules similar to your security groups to add an additional layer of security to your VPC. Read about the specific differences between security groups and network ACLs.

Having this level of control throughout the application architecture has advantages over relying only on a central, border-style gateway pattern, because security groups for each tier of the application architecture can be locked down to only those ports and sources required for that layer. For example, in the architecture shown in Figure 3, only the application load balancer security group would allow web traffic (ports 80, 443) from the internet. The web-tier-layer security group would only accept traffic from the load-balancer layer, and the database-layer security group would only accept traffic from the web tier.
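
The tier-by-tier lockdown described above can be sketched as a small model. This is a minimal illustration, not an AWS API call: the tier names, the web-tier port 8080, and the database port 5432 are assumptions chosen for the example, and `is_allowed` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class IngressRule:
    source: str            # "internet" or the name of another tier's security group
    ports: FrozenSet[int]  # TCP ports accepted from that source


# Hypothetical tiers and ports that mirror the layering in Figure 3.
SECURITY_GROUPS: Dict[str, List[IngressRule]] = {
    "load-balancer": [IngressRule("internet", frozenset({80, 443}))],
    "web-tier": [IngressRule("load-balancer", frozenset({8080}))],
    "database": [IngressRule("web-tier", frozenset({5432}))],
}


def is_allowed(source: str, destination: str, port: int) -> bool:
    """Return True if an ingress rule on the destination tier admits the traffic."""
    return any(
        rule.source == source and port in rule.ports
        for rule in SECURITY_GROUPS.get(destination, [])
    )
```

In this model, internet traffic reaches only the load balancer tier, and each deeper tier accepts traffic only from the tier directly in front of it.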

If you need to provide a central point of control with this model, you can use AWS Firewall Manager, which simplifies the administration and maintenance of your VPC security groups across multiple accounts and resources. With Firewall Manager, you can configure and audit your security groups for your organization using a single, central administrator account. Firewall Manager automatically applies rules and protections across your accounts and resources, even as you add new resources. Firewall Manager is particularly useful when you want to protect your entire organization, or if you frequently add new resources that you want to protect via a central administrator account.

To support separation of management plane activities from data plane aspects in workloads, agencies can use multiple elastic network interface patterns on EC2 instances to provide a separate management network path.

Edge protection services

In the example in Figure 3, several services are used to provide edge-based protections in front of the web application. AWS Shield is a managed distributed denial of service (DDoS) protection service that safeguards applications that are running on AWS. AWS Shield provides always-on detection and automatic inline mitigations that minimize application downtime and latency, so there’s no need to engage AWS Support to benefit from DDoS protection. There are two tiers of AWS Shield: Standard and Advanced. When you use Shield Advanced, you can apply protections at the Amazon CloudFront, Amazon EC2, and application load balancer layers. Shield Advanced also gives you 24/7 access to the AWS DDoS Response Team (DRT).

AWS WAF is a web application firewall that helps protect your web applications or APIs against common web exploits that can affect availability, compromise security, or consume excessive resources. AWS WAF gives you control over how traffic reaches your applications by enabling you to create security rules that block common attack patterns, such as SQL injection or cross-site scripting, and rules that filter out specific traffic patterns that you define. Again, you can apply this protection at both the Amazon CloudFront and application load balancer layers in our illustrated solution. Agencies can also use managed rules for WAF to benefit from rules developed and maintained by AWS Marketplace sellers.

Amazon CloudFront is a fast content delivery network (CDN) service. CloudFront seamlessly integrates with AWS Shield, AWS WAF, and Amazon Route 53 to help protect against multiple types of unauthorized access, including network and application layer DDoS attacks.

Logging and monitoring services

The example application in Figure 3 shows several services that provide logging and monitoring of network traffic, application activity, infrastructure, and AWS API usage.

At the VPC level, the VPC Flow Logs feature provides you with the ability to capture information about the IP traffic going to and from network interfaces in your VPC. Flow log data can be published to Amazon CloudWatch Logs or Amazon Simple Storage Service (Amazon S3). Traffic Mirroring is a feature that you can use in a VPC to capture traffic if needed for inspection. This allows agencies to implement full packet capture on a continuous basis, or in response to a specific event within the application.
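
As an illustration of working with flow log data, the following sketch parses one record in the default flow log format and checks whether the traffic was rejected. It assumes the default (version 2) space-separated field order; custom formats would need a different field list. The function names are hypothetical.

```python
from typing import Dict

# Field order of the default (version 2) flow log record format.
DEFAULT_FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_log_record(line: str) -> Dict[str, str]:
    """Split a default-format flow log line into named fields."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(
            f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}"
        )
    return dict(zip(DEFAULT_FIELDS, values))


def is_rejected(record: Dict[str, str]) -> bool:
    """True when a security group or network ACL rejected the traffic."""
    return record["action"] == "REJECT"


# Illustrative record: rejected RDP (TCP port 3389) traffic.
SAMPLE = (
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 "
    "49761 3389 6 20 4249 1418530010 1418530070 REJECT OK"
)
```

A filter like `is_rejected` is one simple starting point for alerting on unexpected connection attempts.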

Amazon CloudWatch provides a monitoring service with alarms and analytics. In the example application, AWS WAF can also be configured to log activity as described in the AWS WAF Developer Guide.

AWS Config provides a timeline view of the configuration of the environment. You can also define rules to provide alerts and remediation when the environment moves away from the desired configuration.

AWS CloudTrail is a service that you can use to handle governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity that is related to actions across your AWS infrastructure.

Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior to protect your AWS accounts. GuardDuty analyzes tens of billions of events across multiple AWS data sources, such as AWS CloudTrail event logs, Amazon VPC Flow Logs, and DNS logs. This blog post highlights a third-party assessment of GuardDuty that compares its performance to other intrusion detection systems (IDS).

Route 53 Resolver Query Logging lets you log the DNS queries that originate in your VPCs. With query logging turned on, you can see which domain names have been queried, the AWS resources from which the queries originated—including source IP and instance ID—and the responses that were received.

With Route 53 Resolver DNS Firewall, you can filter and regulate outbound DNS traffic for your VPCs. To do this, you create reusable collections of filtering rules in DNS Firewall rule groups, associate the rule groups to your VPC, and then monitor activity in DNS Firewall logs and metrics. Based on the activity, you can adjust the behavior of DNS Firewall accordingly.
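
The idea of evaluating DNS queries against prioritized rule groups can be sketched as follows. This is a simplified model rather than the service's implementation: it assumes rules are checked in ascending priority order, that a leading `*.` matches subdomains only, and that unmatched queries are allowed. The rule tuples and domain names are hypothetical.

```python
from typing import List, Tuple

Rule = Tuple[int, str, str]  # (priority, domain pattern, action)


def domain_matches(pattern: str, name: str) -> bool:
    """Compare a query name to a pattern, ignoring case and a trailing dot."""
    pattern = pattern.lower().rstrip(".")
    name = name.lower().rstrip(".")
    if pattern.startswith("*."):
        # "*.example.com" matches subdomains, not "example.com" itself.
        return name.endswith(pattern[1:])
    return name == pattern


def evaluate(rules: List[Rule], query_name: str) -> str:
    """Return the action of the first matching rule, lowest priority number first."""
    for _priority, pattern, action in sorted(rules):
        if domain_matches(pattern, query_name):
            return action
    return "ALLOW"  # Assumption for this sketch: unmatched queries pass through.
```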

Mapping services to control areas

Based on the above description of the use of additional services, we can summarize which services contribute to the control and recommendation areas in the gateway chapter in the Australian ISM framework.

Control and recommendation areas	Contributing services
Inspect and filter traffic	AWS WAF, VPC Traffic Mirroring
Central control point	Infrastructure as code, AWS Firewall Manager
Authentication and authorization (MFA)	AWS Identity and Access Management (IAM), solution and application IAM, VPC security groups
Logging and monitoring	Amazon CloudWatch, AWS CloudTrail, AWS Config, Amazon VPC (flow logs and mirroring), load balancer logs, Amazon CloudFront logs, Amazon GuardDuty, Route 53 Resolver Query Logging
Secure administration (MFA)	IAM, directory federation (if used)
DMZ patterns	VPC subnet layout, security groups, network ACLs
Firewalls	VPC security groups, network ACLs, AWS WAF, Route 53 Resolver DNS Firewall
Web proxy; site and content filtering and scanning	AWS WAF, Firewall Manager

Note that the listed AWS services might not provide all relevant controls in each area, and it is part of the customer’s risk assessment and design to determine what additional controls might need to be implemented.

As you can see, many of the recommended practices and controls from the Australian Government gateway requirements are already encompassed in a typical Well-Architected solution. The implementing agency has two options. It can continue to place such a solution behind a gateway that runs either within or outside of AWS, leveraging the gateway controls that are inherent in the application architecture as additional layers of defense. Alternatively, the agency can conduct a risk assessment to understand which gateway controls can be supplied by means of the application architecture to reduce the gateway control requirements at any gateway layer in front of the application.

Summary

In this blog post, we’ve discussed the requirements for Australian Government gateways, which provide network controls to secure workloads. We’ve outlined the downsides of using traditional on-premises solutions and illustrated how services such as AWS Transit Gateway, Elastic Load Balancing Gateway Load Balancer, and AWS Network Firewall facilitate moving gateway solutions into the cloud. These are services you can evaluate against your network control requirements. Finally, we reviewed a typical web architecture running in the AWS Cloud with associated services to illustrate how many of the typical gateway controls can be met by using a standard Well-Architected approach.


Hackathons with AWS Cloud9: Collaboration simplified for your next big idea

2021-06-12 Mahesh Biradar

Post Syndicated from Mahesh Biradar original https://aws.amazon.com/blogs/devops/hackathons-with-aws-cloud9-collaboration-simplified-for-your-next-big-idea/

Many organizations host ideation events to innovate and prototype new ideas faster. These events usually run for a short duration and involve collaboration between members of participating teams. By the end of the event, a successful demonstration of a working prototype is expected and the winner or the next steps are determined. Therefore, it’s important to build a working proof of concept quickly, and to do that teams need to be able to share the code and get peer reviewed in real time.

In this post, you see how AWS Cloud9 can help teams collaborate, pair program, and track each other’s inputs in real time for a successful hackathon experience.

AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug code from any machine with just a browser. A shared environment is an AWS Cloud9 development environment that multiple users have been invited to participate in, where they can edit or view its shared resources.

Pair programming and mob programming are development approaches in which two or more developers collaborate simultaneously to design, code, or test solutions. At the core is the premise that two or more people collaborate on the same code at the same time, which allows for real-time code review and can result in higher quality software.

Hackathons are one of the best ways to collaboratively solve problems, often with code. Cross-functional two-pizza teams compete with limited resources under time constraints to solve a challenging business problem. Several companies have adopted the concept of hackathons to foster a culture of innovation, providing a platform for developers to showcase their creativity and acquire new skills. Teams are either provided a roster of ideas to choose from or come up with their own new idea.

Solution overview

In this post, you create an AWS Cloud9 environment shared with three AWS Identity and Access Management (IAM) users (the hackathon team). You also see how this team can code together to develop a sample serverless application using an AWS Serverless Application Model (AWS SAM) template.

The following diagram illustrates the deployment architecture.

Figure 1: Solution overview

Prerequisites

To complete the steps in this post, you need an AWS account with administrator privileges.

Set up the environment

To start setting up your environment, complete the following steps:

Create an AWS Cloud9 environment in your AWS account.
Create and attach an instance profile to AWS Cloud9 to call AWS services from an environment. For more information, see Create and store permanent access credentials in an environment.
On the AWS Cloud9 console, select the environment you just created and choose View details.

Figure 2: Cloud9 View details

Note the environment ID from the Environment ARN value; we use this ID in a later step.

Figure 3: Environment ARN
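
If you prefer to extract the environment ID programmatically, a minimal sketch follows. It assumes the ARN has the shape `arn:<partition>:cloud9:<region>:<account>:environment:<id>`; the function name and the account number in the usage example are hypothetical.

```python
def environment_id_from_arn(arn: str) -> str:
    """Return the trailing environment ID from an AWS Cloud9 environment ARN."""
    parts = arn.split(":")
    # Assumed shape: arn:<partition>:cloud9:<region>:<account>:environment:<id>
    if len(parts) != 7 or parts[2] != "cloud9" or parts[5] != "environment":
        raise ValueError(f"not a Cloud9 environment ARN: {arn!r}")
    return parts[6]


# Example with a placeholder account number and the sample ID used in this post.
EXAMPLE_ARN = (
    "arn:aws:cloud9:us-east-1:123456789012:"
    "environment:877f86c3bb80418aabc9956580436e9a"
)
```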

In your AWS Cloud9 terminal, create the file usersetup.sh with the following contents:

#USAGE: 
#STEP 1: Execute following command within Cloud9 terminal to retrieve environment id
# aws cloud9 list-environments
#STEP 2: Execute following command by providing appropriate parameters: -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3 
# sh usersetup.sh -e 877f86c3bb80418aabc9956580436e9a -u User1,User2
function usage() {
  echo "USAGE: sh usersetup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3"
}
while getopts ":e:u:" opt; do
  case $opt in
    e)  if ! aws cloud9 describe-environment-status --environment-id "$OPTARG" 2>&1 >/dev/null; then
          echo "Please provide valid cloud9 environmentid."
          usage
          exit 1
        fi
        environmentId="$OPTARG" ;;
    u)  if [ "$OPTARG" == "" ]; then
          echo "Please provide comma separated list of usernames."
          usage
          exit 1
        fi
        users="$OPTARG" ;;
    \?) echo "Incorrect arguments."
        usage
        exit 1;;
  esac
done
if [ "$OPTIND" -lt 5 ]; then
  echo "Missing required arguments."
  usage
  exit 1
fi
IFS=',' read -ra userNames <<< "$users"
groupName='HackathonUsers'
groupPolicy='arn:aws:iam::aws:policy/AdministratorAccess'
userArns=()
function createUsers() {
    userList=""    
    if aws iam get-group --group-name $groupName  > /dev/null 2>&1; then
      echo "$groupName group already exists."  
    else
      if aws iam create-group --group-name $groupName 2>&1 >/dev/null; then
        echo "Created user group - $groupName."  
      else
        echo "Error creating user group - $groupName."  
        exit 1
      fi
    fi
    if aws iam attach-group-policy --policy-arn $groupPolicy --group-name $groupName; then
      echo "Attached group policy."  
    else
      echo "Error attaching group policy to - $groupName."  
      exit 1
    fi
    
    for userName in "${userNames[@]}" ; do 
        
        randomPwd=`aws secretsmanager get-random-password \
        --require-each-included-type \
        --password-length 20 \
        --no-include-space \
        --output text`
    
        userList="$userList"$'\n'"Username: $userName, Password: $randomPwd"
        
        userArn=`aws iam create-user \
        --user-name $userName \
        --query 'User.Arn' | sed -e 's/\/.*\///g' | tr -d '"'`
        
        userArns+=( $userArn )
      
        aws iam wait user-exists \
        --user-name $userName
        
        echo "Successfully created user $userName."
        
        aws iam create-login-profile \
        --user-name $userName \
        --password $randomPwd \
        --password-reset-required 2>&1 >/dev/null
        
        aws iam add-user-to-group \
        --user-name $userName \
        --group-name $groupName
    done
    echo "Waiting for users profile setup..."
    sleep 8
    
    for arn in "${userArns[@]}" ; do 
      aws cloud9 create-environment-membership \
        --environment-id $environmentId \
        --user-arn $arn \
        --permissions read-write 2>&1 >/dev/null
    done
    echo "Following users have been created and added to $groupName group."
    echo "$userList"
}
createUsers
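
For readers who want to adapt the script, the argument handling can be sketched in Python. This is an illustration only: it mirrors the `-e` and `-u` options and the comma-separated user list, but it does not call AWS to validate the environment ID as the shell script does. The function name is hypothetical.

```python
import argparse
from typing import List, Tuple


def parse_args(argv: List[str]) -> Tuple[str, List[str]]:
    """Parse -e ENVIRONMENTID and -u USERNAME1,USERNAME2,... like usersetup.sh."""
    parser = argparse.ArgumentParser(prog="usersetup")
    parser.add_argument("-e", dest="environment_id", required=True,
                        help="AWS Cloud9 environment ID")
    parser.add_argument("-u", dest="users", required=True,
                        help="comma-separated list of user names")
    args = parser.parse_args(argv)
    # Split on commas and drop empty entries such as a trailing comma.
    user_names = [name.strip() for name in args.users.split(",") if name.strip()]
    if not user_names:
        parser.error("Please provide comma separated list of usernames.")
    return args.environment_id, user_names
```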

Run the following command, replacing these parameters:
1. ENVIRONMENTID – The environment ID you saved earlier
2. USERNAME1, USERNAME2… – A comma-separated list of users. In this example, we use three users.

sh usersetup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3

The script creates the following resources:
- The number of IAM users that you defined
- The IAM user group HackathonUsers, which has administrator access and contains the users created in the previous step
- These users are assigned a random password, which they must change the first time they sign in.
- User passwords can be shared with your team from the AWS Cloud9 terminal output.

Instruct your team to sign in to the AWS Cloud9 console and open the shared environment by choosing Shared with you.

Figure 4: Shared environments

Run the create-repository command, specifying a unique name, optional description, and optional tags:

aws codecommit create-repository --repository-name hackathon-repo --repository-description "Hackathon repository" --tags Team=hackathon

Note the cloneUrlHttp value from the output; we use this in a later step.

Figure 5: CodeCommit repo URL

The environment is now ready for the hackathon team to start coding.
Instruct your team members to open the shared environment from the AWS Cloud9 dashboard.
For demo purposes, you can quickly create a sample Python-based Hello World application using the AWS SAM CLI.

Run the following commands to commit the files to the local repo:

cd hackathon-repo
git config --global init.defaultBranch main
git init
git add .
git commit -m "Initial commit"

Run the following command to push the local repo to AWS CodeCommit by replacing CLONE_URL_HTTP with the cloneUrlHttp value you noted earlier:
```
git push <CLONE_URL_HTTP> --all
```

For a sample collaboration scenario, watch the video Collaboration with Cloud9.

Clean up

The cleanup script deletes the IAM users and user group that the setup script created and removes the users from the shared environment. Make a local copy of any files you want to save.

Create a file named cleanup.sh with the following content:

#USAGE: 
#STEP 1: Execute following command within Cloud9 terminal to retrieve environment id
# aws cloud9 list-environments
#STEP 2: Execute following command by providing appropriate parameters: -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3 
# sh cleanup.sh -e 877f86c3bb80418aabc9956580436e9a -u User1,User2
function usage() {
  echo "USAGE: sh cleanup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3"
}
while getopts ":e:u:" opt; do
  case $opt in
    e)  if ! aws cloud9 describe-environment-status --environment-id "$OPTARG" 2>&1 >/dev/null; then
          echo "Please provide valid cloud9 environmentid."
          usage
          exit 1
        fi
        environmentId="$OPTARG" ;;
    u)  if [ "$OPTARG" == "" ]; then
          echo "Please provide comma separated list of usernames."
          usage
          exit 1
        fi
        users="$OPTARG" ;;
    \?) echo "Incorrect arguments."
        usage
        exit 1;;
  esac
done
if [ "$OPTIND" -lt 5 ]; then
  echo "Missing required arguments."
  usage
  exit 1
fi
IFS=',' read -ra userNames <<< "$users"
groupName='HackathonUsers'
groupPolicy='arn:aws:iam::aws:policy/AdministratorAccess'
function cleanUp() {
    echo "Starting cleanup..."
    groupExists=false
    if aws iam get-group --group-name $groupName  > /dev/null 2>&1; then
      groupExists=true
    else
      echo "$groupName does not exist."  
    fi
    
    for userName in "${userNames[@]}" ; do 
        if ! aws iam get-user --user-name $userName >/dev/null 2>&1; then
          echo "$userName does not exist."  
        else
          userArn=$(aws iam get-user \
          --user-name $userName \
          --query 'User.Arn' | tr -d '"') 
          
          if $groupExists ; then 
            aws iam remove-user-from-group \
            --user-name $userName \
            --group-name $groupName
          fi
  
          aws iam delete-login-profile \
          --user-name $userName 
  
          if aws iam delete-user --user-name $userName ; then
            echo "Successfully deleted $userName"
          fi
          
          aws cloud9 delete-environment-membership \
          --environment-id $environmentId --user-arn $userArn
          
        fi
    done
    if $groupExists ; then 
      aws iam detach-group-policy \
      --group-name $groupName \
      --policy-arn $groupPolicy
  
      if aws iam delete-group --group-name $groupName ; then
        echo "Successfully deleted $groupName user group"
      fi
    fi
    
    echo "Cleanup complete."
}
cleanUp

Run the script, passing the same parameters you passed to the setup script:
```
sh cleanup.sh -e ENVIRONMENTID -u USERNAME1,USERNAME2,USERNAME3
```
Delete the CodeCommit repository by running the following commands in the root directory with the appropriate repository name:
```
aws codecommit delete-repository --repository-name hackathon-repo
rm -rf hackathon-repo
```
You can delete the AWS Cloud9 environment when the event is over.

Conclusion

In this post, you saw how to use an AWS Cloud9 IDE to collaborate as a team and code together to develop a working prototype. For organizations looking to host hackathon events, these tools can be a powerful way to deliver a rich user experience. For more information about AWS Cloud9 capabilities, see the AWS Cloud9 User Guide. If you plan on using AWS Cloud9 for an ongoing collaboration, refer to the best practices for sharing environments in Working with shared environment in AWS Cloud9.

About the authors

	Mahesh Biradar is a Solutions Architect at AWS. He is a DevOps enthusiast and enjoys helping customers implement cost-effective architectures that scale.
	Guy Savoie is a Senior Solutions Architect at AWS working with SMB customers, primarily in Florida. In his role as a technical advisor, he focuses on unlocking business value through outcome based innovation.
	Ramesh Chidirala is a Solutions Architect focused on SMB customers in the Central region. He is passionate about helping customers solve challenging technical problems with AWS and achieve their desired business outcomes.

Creating a notification workflow from sensitive data discover with Amazon Macie, Amazon EventBridge, AWS Lambda, and Slack

2021-06-10 Bruno Silviera

Post Syndicated from Bruno Silviera original https://aws.amazon.com/blogs/security/creating-a-notification-workflow-from-sensitive-data-discover-with-amazon-macie-amazon-eventbridge-aws-lambda-and-slack/

Following the example of the EU in implementing the General Data Protection Regulation (GDPR), many countries are implementing similar data protection laws. In response, many companies are forming teams that are responsible for data protection. Considering the volume of information that companies maintain, it’s essential that these teams are alerted when sensitive data is at risk.

This post shows how to deploy a solution that uses Amazon Macie to discover sensitive data. The solution uses Amazon EventBridge and AWS Lambda to automatically notify your company’s designated data protection team via a Slack channel when Macie discovers sensitive data that needs to be protected.

The challenge

Let’s imagine that you’re part of a team that’s responsible for classifying your organization’s data, but the data structure isn’t documented. Amazon Macie lets you run a scheduled classification job that examines your data, and you want to notify the data protection team when there’s new sensitive data to classify. Let’s build a solution to automatically notify the data protection team.

Solution overview

To be scalable and cost-effective, this solution uses serverless technologies and managed AWS services, including:

Macie – A fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in Amazon Web Services (AWS).
EventBridge – A serverless event bus that connects application data from your apps, SaaS, and AWS services. EventBridge can respond to specific events or run according to a schedule. The solution presented in this post uses EventBridge to initiate a custom Lambda function in response to a specific event.
Lambda – Runs code in response to events such as changes in data, changes in application state, or user actions. In this solution, a Lambda function is initiated by EventBridge.

Solution architecture

The architecture workflow is shown in Figure 1 and includes the following steps:

Macie runs a classification job and publishes its findings to EventBridge as a JSON object.
The EventBridge rule captures the findings and invokes a Lambda function as a target.
The Lambda function parses the JSON object. The function then sends a custom message to a Slack channel with the sensitive data finding for the data protection team to evaluate and respond to.

Figure 1: Solution architecture workflow
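
The workflow above can be sketched as a small Lambda function. This is a minimal illustration under stated assumptions, not the function that the template deploys: the environment variable name `SLACK_WEBHOOK_URL` and the helper names are hypothetical, and the finding fields read here (`type`, `severity.description`, `resourcesAffected.s3Bucket.name`, `resourcesAffected.s3Object.key`) are assumed from the Macie finding format.

```python
import json
import os
import urllib.request
from typing import Any, Dict


def build_slack_message(event: Dict[str, Any]) -> Dict[str, str]:
    """Turn a Macie finding event from EventBridge into a Slack webhook payload."""
    detail = event.get("detail", {})
    resources = detail.get("resourcesAffected", {})
    bucket = resources.get("s3Bucket", {}).get("name", "unknown-bucket")
    key = resources.get("s3Object", {}).get("key", "unknown-object")
    severity = detail.get("severity", {}).get("description", "Unknown")
    finding_type = detail.get("type", "unknown type")
    text = (
        f"Macie finding ({severity}): {finding_type}\n"
        f"Object: s3://{bucket}/{key}"
    )
    return {"text": text}


def post_to_slack(webhook_url: str, message: Dict[str, str]) -> int:
    """POST the JSON payload to a Slack incoming webhook; return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Entry point: parse the finding and forward a summary to Slack."""
    # SLACK_WEBHOOK_URL is a hypothetical environment variable name.
    status = post_to_slack(os.environ["SLACK_WEBHOOK_URL"], build_slack_message(event))
    return {"status": status}
```

Keeping the message construction separate from the HTTP call makes it possible to test the parsing logic without contacting Slack.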

Set up Slack

For this solution, you need a Slack workspace and an incoming webhook. The workspace must be in place before you create the webhook.

Create a Slack workspace

If you already have a Slack workspace in your environment, you can skip ahead to creating the webhook.

If you don’t have a Slack workspace, follow the steps in Create a Slack Workspace to create one.

Create an incoming webhook in Slack API

Go to your Slack API.
Choose Start Building to create an app.
Enter the following details for your app:
- App Name – macie-to-slack.
- Development Slack Workspace – Choose the Slack workspace—either an existing workspace or one you created for this solution—to receive the Macie findings.
Choose the Create App button.
In the left menu, choose Incoming Webhooks.
At the Activate Incoming Webhooks screen, move the slider from OFF to ON.
Scroll down and choose Add New Webhook to Workspace.
In the screen asking where your app should post, enter the name of the Slack channel from your workspace that you want to send notifications to and choose Authorize.
On the next screen, scroll down to the Webhook URL section. Make a note of the URL to use later.

Deploy the CloudFormation template with the solution

The deployment of the CloudFormation template automatically creates the following resources:

A Lambda function whose name begins with macie-to-slack-lambdafindingsToSlack-.
An EventBridge rule named MacieFindingsToSlack.
An IAM role named MacieFindingsToSlackkRole.
A permission to invoke the Lambda function named LambdaInvokePermission.
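
The template's rule definition isn't reproduced in this post. As a rough sketch of how such a rule selects events, the following assumes a pattern that matches on the `source` and `detail-type` values that Macie uses when it publishes findings. The `matches` helper is a simplified stand-in for EventBridge pattern matching and handles only top-level fields.

```python
from typing import Any, Dict, List

# Assumed pattern: Macie publishes findings with these source and detail-type values.
EVENT_PATTERN: Dict[str, List[str]] = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
}


def matches(pattern: Dict[str, List[str]], event: Dict[str, Any]) -> bool:
    """Simplified matcher: each pattern key must appear in the event with a listed value."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())
```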

Note: Before you proceed, make sure you’re deploying the template to the same Region in which your production Macie is running.

To deploy the CloudFormation template

Download the YAML template to your computer.

Note: To save the template, you can right-click the Raw button at the top of the code and then select Save link as if you’re using Chrome, or the equivalent in your browser. This file is used in Step 4.
Open CloudFormation in the AWS Management Console.
On the Welcome page, choose Create stack and then choose With new resources.
On Step 1 — Specify template, choose Upload a template file, select Choose file and then select the file template.yaml (the file extension might be .YML), then choose Next.
On Step 2 — Specify stack details:
1. Enter macie-to-slack as the Stack name.
2. At the Slack Incoming Web Hook URL, paste the webhook URL you copied earlier.
3. At Slack channel, enter the name of the channel in your workspace that will receive the alerts and choose Next.
Figure 2: Defining stack details
On Step 3 – Configure Stack options, you can leave the default settings, or change them for your environment. Choose Next to continue.
At the bottom of Step 4 – Review, select I acknowledge that AWS CloudFormation might create IAM resources, and choose Create stack.

Figure 3: Confirmation before stack creation
Wait for the stack to reach status CREATE_COMPLETE.
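
If you prefer to script the console steps above, the same deployment can be sketched with the AWS SDK for Python (boto3). This is an illustration under stated assumptions: the parameter keys `SlackWebhookUrl` and `SlackChannel` are hypothetical placeholders, so check the template's Parameters section for the real names before using it.

```python
def build_create_stack_request(template_body: str, webhook_url: str, channel: str) -> dict:
    """Assemble the arguments for CloudFormation's CreateStack call."""
    return {
        "StackName": "macie-to-slack",
        "TemplateBody": template_body,
        "Parameters": [
            # Hypothetical parameter keys; use the names defined in the template.
            {"ParameterKey": "SlackWebhookUrl", "ParameterValue": webhook_url},
            {"ParameterKey": "SlackChannel", "ParameterValue": channel},
        ],
        # Equivalent of the console acknowledgement that the stack might
        # create IAM resources. Templates that set explicit IAM resource
        # names need CAPABILITY_NAMED_IAM instead.
        "Capabilities": ["CAPABILITY_IAM"],
    }


def deploy(template_path: str, webhook_url: str, channel: str) -> None:
    """Create the stack and block until it reaches CREATE_COMPLETE."""
    import boto3  # imported lazily so the builder works without boto3 installed

    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()

    cloudformation = boto3.client("cloudformation")
    request = build_create_stack_request(template_body, webhook_url, channel)
    cloudformation.create_stack(**request)
    waiter = cloudformation.get_waiter("stack_create_complete")
    waiter.wait(StackName=request["StackName"])
```

Run `deploy` with credentials for the same Region where Macie is running, as noted earlier.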

Running the solution

At this point, you’ve deployed the solution and your resources are created.

To test the solution, you can schedule a Macie job targeting a bucket that contains a file with sensitive information that Macie can detect.

Note: You can check the Amazon Macie documentation to see the list of supported managed data identifiers.

When the Macie job is complete, any findings are sent to the Slack channel.

Figure 4: Macie finding delivered to Slack channel

Select the link in the message sent to the Slack channel to open that finding in the Macie console, as shown in Figure 5.

Figure 5: Finding details

And you’re done!

Now your Macie finding results are delivered to your Slack channel where they can be easily monitored, reducing response time and risk exposure.

If you deployed this for testing purposes, or want to clean this up and move to your production account, you can delete the CloudFormation stack:

Open the CloudFormation console.
Select the stack and choose Delete.

Conclusion

In this blog post we walked through the steps to configure a notification workflow using Macie, Lambda, and EventBridge to send sensitive data findings to your data protection team via a Slack channel.

Your data protection team will appreciate the timely notifications of sensitive data findings, giving you the ability to focus on creating controls to improve data security and compliance with regulations related to protection and treatment of personal data.

For more information about data privacy on AWS, see Data Privacy FAQ.

Choosing a Well-Architected CI/CD approach: Open-source software and AWS Services

2021-06-10 Brian Carlson

Post Syndicated from Brian Carlson original https://aws.amazon.com/blogs/devops/choosing-well-architected-ci-cd-open-source-software-aws-services/

This series of posts discusses making informed decisions when choosing to implement open-source tools on AWS services, adopt managed AWS services to satisfy the same needs, or use a combination of both.

We look at key considerations for evaluating open-source software and AWS services using the perspectives of a startup company and a mature company as examples. You can use these two different points of view to compare to your own organization. To make this investigation easier we will use Continuous Integration (CI) and Continuous Delivery (CD) capabilities as the target of our investigation.

In two related posts, we follow two AWS customers, Iponweb and BigHat Biosciences, as they share their CI/CD journeys, their perspectives, the decisions they made, and why. To end the series, we explore an example reference architecture showing the benefits AWS provides regardless of your emphasis on open-source tools or managed AWS services.

Why CI/CD?

Until your creations are in the hands of your customers, investment in development has provided no return. The faster valuable changes enter production, the greater positive impact you can have on your customer. In today’s highly competitive world, the ability to frequently and consistently deliver value is a competitive advantage. The Operational Excellence (OE) pillar of the AWS Well-Architected Framework recognizes this impact and focuses on the capabilities of CI/CD in two dedicated sections.

The concepts in CI/CD originate from software engineering but apply equally to any form of content. The goal is to support development, integration, testing, deployment, and delivery to production. Examples include making changes to an application, updating your machine learning (ML) models, changing your multimedia assets, or referring to the AWS Well-Architected Framework.

Adopting CI/CD and the best practices from the Operational Excellence pillar can help you address risks in your environment, and limit errors from manual processes. More importantly, they help free your teams from the related manual processes, so they can focus on satisfying customer needs, differentiating your organization, and accelerating the flow of valuable changes into production.

How do you decide what you need?

The first question in the Operational Excellence pillar is about understanding needs and making informed decisions. To help you frame your own decision-making process, we explore key considerations from the perspective of a fictional startup company and a fictional mature company. In our two related posts, we explore these same considerations with Iponweb and BigHat.

The key considerations include:

Functional requirements – Providing specific features and capabilities that deliver value to your customers.
Non-functional requirements – Enabling the safe, effective, and efficient delivery of the functional requirements. Non-functional requirements include security, reliability, performance, and cost requirements.
- Without security, you can’t earn customer trust. If your customers can’t trust you, you won’t have customers.
- Without reliability you aren’t available to serve your customers. If you can’t serve your customers, you won’t have customers.
- Performance is focused on timely and efficient delivery of value, not delivering as fast as possible.
- Cost is focused on optimizing the value received for the resources spent (for example, money, time, or effort), not minimizing expense.
Operational requirements – Enabling you to effectively and efficiently support, maintain, sustain, and improve the delivery of value to your customers. When you “Design with Ops in Mind,” you’re enabling effective and efficient support for your business outcomes.

These non-feature-related key considerations are why Operational Excellence, Security, Reliability, Performance Efficiency, and Cost Optimization are the five pillars of the AWS Well-Architected Framework.

The startup company

Any startup begins as a small team of inspired people working together to realize the unique solution they believe solves an unsolved problem.

For our fictional small team, everyone knows each other personally and all speak frequently. We share processes and procedures in discussions, and everyone knows what needs to be done. Our team members bring their expertise and dedicate it, and the majority of their work time, to delivering our solution. The results of our efforts inform changes we make to support our next iteration.

However, our manual activities are error-prone and inconsistencies exist in the way we do them. Performing these tasks takes time away from delivering our solution. When errors occur, they have the potential to disrupt everyone’s progress.

We have capital available to make some investments. We would prefer to bring in more team members who can contribute directly to developing our solution. We need to iterate faster if we are to achieve a broadly viable product in time to qualify for our next round of funding. We need to decide what investments to make.

Goals – Reach the next milestone and secure funding to continue development
Needs – Reduce or eliminate the manual processes and associated errors
Priority – Rapid iteration
CI/CD emphasis – Baseline CI/CD capabilities and non-functional requirements are emphasized over a rich feature set

The mature company

Our second fictional company is a large and mature organization operating in a mature market segment. We’re focused on consistent, quality customer experiences to serve and retain our customers.

Our size limits the personal relationships between our service and development teams. The process to make requests, and the interfaces between teams and their systems, are well documented and understood.

However, the systems we have implemented over time, as needs were identified and addressed, aren’t well documented. Our existing tool chain includes some in-house scripting and both supported and unsupported versions of open-source tools. There are limited opportunities for us to acquire new customers.

When conditions change and new features are desired, we want to be able to implement and deploy those features as fast as possible. If we can differentiate our services, however briefly, we may be able to win customers away from our competitors. Our other path to improved profitability is to evolve our processes, maximizing integration and efficiencies, and capturing cost reductions.

Goals – Differentiate ourselves in the marketplace with desired new features
Needs – Address the risks of poorly documented systems and unsupported software
Priority – Evolve efficiency
CI/CD emphasis – Rich feature set and integrations are emphasized over improving the existing non-functional capabilities

Open-source tools on AWS vs. AWS services

The choice between open-source tools and AWS services is not binary. You can select the combination of solutions that provides the greatest value. You can implement open-source tools for their specific benefits where they outweigh the costs and operational burden, using underlying AWS services like Amazon Elastic Compute Cloud (Amazon EC2) to host them. You can then use AWS managed services, like AWS CodeBuild, for the undifferentiated features you need, without additional cost or operational burden.

Feature Set

Our fictional organizations both want to accelerate the flow of beneficial changes into production and are evaluating CI/CD alternatives to support that outcome. Our startup company wants a working solution—basic capabilities, author/code, build, and deploy, so that they can focus on development. Our mature company is seeking every advantage—a rich feature set, extensive opportunities for customization, integration capabilities, and fine-grained control.

Open-source tools

Open-source tools often excel at meeting functional requirements. When a new functionality, capability, or integration is desired, any developer can implement it for themselves, and then contribute their code back to the project. As the user community for an open-source project expands, the number of use cases and identified features grows, and so does the number of potential solutions and potential contributors. Developers are using these tools to support their efforts and implement new features that provide value to them.

However, features may be released in unsupported versions and then later added to the supported feature set. Non-functional requirements take time and are less appealing because they don’t typically bring immediate value to the product. Non-functional capabilities may lag behind the feature set.

Consider the following:

Open-source tools may have more features and existing integrations to other tools
The pace of feature set delivery may be extremely rapid
The features delivered are those desired and created by the active members of the community
You are free to implement the features your company desires
There is no commitment to long-term support for the project or any given feature
You can implement open-source tools on multiple cloud providers or on premises
If the project is abandoned, you’re responsible for maintaining your implementation

AWS services

AWS services are driven by customer needs. Services and features are supported by dedicated teams. These customer-obsessed teams focus on all customer needs, with security being their top priority. Both functional and non-functional requirements are addressed with an emphasis on enabling customer outcomes while minimizing the effort they expend to achieve them.

Consider the following:

The pace of delivery of feature sets is consistent
The feature roadmap is driven by customer need and customer requests
The AWS service team is dedicated to support of the service
AWS services are available on the AWS Cloud and on premises through AWS Outposts

Cost Optimization

Why are we discussing cost after the feature set? Security and reliability are fundamentally more important. Leadership naturally gravitates to following the operational excellence best practice of evaluating trade-offs. Having looked at the potential benefits from the feature set, the next question is typically, “What is this going to cost?” Leadership defines the priorities and allocates the resources necessary (capital, time, effort). We review cost optimization second so that leadership can make a comparison of the expected benefits between CI/CD investments, and investments in other efforts, so they can make an informed decision.

Our organizations are both cost conscious. Our startup is working with finite capital and time. In contrast, our mature company can plan to make investments over time and budget for the needed capital. Early investment in a robust and feature-rich CI/CD tool chain could provide significant advantages towards the startup’s long-term success, but if the startup fails early, the value of that investment will never be realized. The mature company can afford to realize the value of their investment over time and can make targeted investments to address specific short-term needs.

Open-source tools

Open-source software doesn’t have to be purchased, but there are costs to adopt. Open-source tools require appropriate skills in order to be implemented, and to perform management and maintenance activities. Those skills must be gained through dedicated training of team members, team member self-study, or by hiring new team members with the existing skills. The availability of skilled practitioners of open-source tools varies with how popular a tool is and how long it has had an active community. Loss of skilled team members includes the loss of their institutional knowledge and intimacy with the implementation. Skills must be maintained with changes to the tools and as team members join or leave. Time is required from skilled team members to support management and maintenance activities. If commercial support for the tool is desired, it may be available through third-parties at an additional cost.

The time to value of an open-source implementation includes the time to implement and configure the resources and software. Additional value may be realized through investment of time configuring or implementing desired integrations and capabilities. There may be existing community-supported integrations or capabilities that reduce the level of effort to achieve these.

Consider the following:

No cost to acquire the software.
The availability of skilled practitioners of open-source tools may be lower. Cost (capital and time) to acquire, establish, or maintain skill sets may be higher.
There is an ongoing cost to maintain the team member skills necessary to support the open-source tools.
There is an ongoing cost of time for team members to perform management and maintenance activities.
Additional commercial support for open-source tools may be available at additional cost.
Time to value includes implementation and configuration of resources and the open-source software. There may be more predefined community integrations.

AWS services

AWS services are provided pay-as-you-go with no required upfront costs. As of August 2020, more than 400,000 individuals hold active AWS Certifications, a number that grew more than 85% between August 2019 and August 2020.

Time to value for AWS services is extremely short and limited to the time to instantiate or configure the service for your use. Additional value may be realized through the investment of time configuring or implementing desired integrations. Predefined integrations for AWS services are added as part of the service development roadmap. However, there may be fewer existing integrations to reduce your level of effort.

Consider the following:

No cost to acquire the software; AWS services are pay-as-you-go for use.
AWS skill sets are broadly available. Cost (capital and time) to acquire, establish, or maintain skill sets may be lower.
AWS services are fully managed, and service teams are responsible for the operation of the services.
Time to value is limited to the time to instantiate or configure the service. There may be fewer predefined integrations.
Additional support for AWS services is available through AWS Support. Cost for support varies based on level of support and your AWS utilization.

Open-source tools on AWS services

Open-source tools on AWS services don’t impact these cost considerations. Migration off of either of these solutions is similarly not differentiated. In either case, you have to invest time in replacing the integrations and customizations you wish to maintain.

Security

Both organizations are concerned about reputation and customer trust. They both want to act to protect their information systems and are focusing on confidentiality and integrity of data. They both take security very seriously. Our startup wants to be secure by default and wants to trust the vendor to address vulnerabilities within the service. Our mature company has dedicated resources that focus on security, and the company practices defense in depth across internal organizations.

The startup and the mature company both want to know whether a choice is safe and secure, and whether they can validate the security of their choice. They also want to understand their responsibilities and the shared responsibility model that applies.

Open-source tools

Open-source tools are the product of the contributors and may contain flaws or vulnerabilities. The entire community has access to the code to test and validate. There are frequently many eyes evaluating the security of the tools. A company or individual may perform a validation for themselves. However, there may be limited guidance on secure configurations. Controls in the implementer’s environment may reduce potential risk.

Consider the following:

You’re responsible for the security of the open-source software you implement
You control the security of your data within your open-source implementation
You can validate the security of the code and act as desired

AWS services

AWS service teams make security their highest priority and are able to respond rapidly when flaws are identified. There is robust guidance provided to support configuring AWS services securely.

Consider the following:

AWS is responsible for the security of the cloud and the underlying services
You are responsible for the security of your data in the cloud and how you configure AWS services
You must rely on the AWS service team to validate the security of the code

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation and the configuration of the AWS services it consumes. AWS is responsible for the security of the AWS Cloud and the managed AWS services.

Reliability

Everyone wants reliable capabilities. What varies between companies is their appetite for risk, and how much they can tolerate the impact of non-availability. The startup emphasized the need for their systems to be available to support their rapid iterations. The mature company is operating with some existing reliability risks, including unsupported open-source tools and in-house scripts.

The startup and the mature company both want to understand the expected reliability of a choice, meaning what percentage of the time it is expected to be available. They both want to know if a choice is designed for high availability and will remain available even if a portion of the systems fails or is in a degraded state. They both want to understand the durability of their data, how to perform backups of their data, and how to perform recovery in the event of a failure.

Both companies need to determine what is an acceptable outage duration, commonly referred to as a Recovery Time Objective (RTO), and for what quantity of elapsed time it is acceptable to lose transactions (including committing changes), commonly referred to as Recovery Point Objective (RPO). They need to evaluate if they can achieve their RTO and RPO objectives with each of the choices they are considering.
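
As a rough illustration of that evaluation, the sketch below checks a candidate backup and recovery setup against RTO and RPO targets. The class, the function, and all numbers are hypothetical. The sketch makes a simplifying assumption that the worst-case data loss equals the interval between backups and the outage duration equals the restore time.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    backup_interval_minutes: float  # time between successful backups
    restore_minutes: float  # time to restore service from the latest backup


def meets_objectives(plan: RecoveryPlan, rto_minutes: float, rpo_minutes: float) -> dict:
    """Compare a plan's worst-case outage and data loss with the RTO and RPO."""
    # Worst case: the failure happens just before the next backup would run.
    worst_case_data_loss = plan.backup_interval_minutes
    return {
        "rto_met": plan.restore_minutes <= rto_minutes,
        "rpo_met": worst_case_data_loss <= rpo_minutes,
    }


# Hypothetical numbers: nightly backups, a 90-minute restore,
# a 2-hour RTO, and a 1-hour RPO.
nightly = RecoveryPlan(backup_interval_minutes=24 * 60, restore_minutes=90)
print(meets_objectives(nightly, rto_minutes=120, rpo_minutes=60))
```

In this example the restore time fits the RTO, but nightly backups can lose up to a day of changes, so the one-hour RPO is not met. Either the backup frequency or the objective would have to change.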

Open-source tools

Open-source reliability is dependent upon the effectiveness of the company’s implementation, the underlying resources supporting the implementation, and the reliability of the open-source software. Open-source tools are the product of the contributors and may or may not incorporate high availability features. Depending on the implementation and tool, there may be a requirement for downtime for specific management or maintenance activities. The ability to support RTO and RPO depends on the teams supporting the company system, the implementation, and the mechanisms implemented for backup and recovery.

Consider the following:

You are responsible for implementing your open-source software to satisfy your reliability needs and high availability needs
Open-source tools may have downtime requirements to support specific management or maintenance activities
You are responsible for defining, implementing, and testing the backup and recovery mechanisms and procedures
You are responsible for the satisfaction of your RTO and RPO in the event of a failure of your open-source system

AWS services

AWS services are designed to support customer availability needs. As managed services, the service teams are responsible for maintaining the health of the services.

Consider the following:

AWS services are fully managed and service teams are responsible for the health of the services.
AWS services are designed and implemented to support customer reliability requirements. AWS CodeCommit is specifically designed for high availability. AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy are all engineered to exceed 99.9% availability.
CodeCommit, CodeBuild, CodePipeline, and CodeDeploy use highly durable services, including Amazon S3 and Amazon DynamoDB, to store customer data redundantly across multiple facilities.
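
To put an availability percentage in context when comparing it with your own tolerance for downtime, it can help to convert it into minutes. The following sketch does that arithmetic; the function name and the 30-day period are assumptions for illustration, and the result says nothing about how any actual downtime would be distributed.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Return the downtime, in minutes, that a given availability level permits."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


# 99.9% availability over a 30-day period leaves roughly 43 minutes of downtime.
print(round(allowed_downtime_minutes(99.9), 1))
```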

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation (including data durability, backup, and recovery) and the configuration of the AWS services it consumes. AWS is responsible for the health of the AWS Cloud and the managed services.

Performance

What defines timely and efficient delivery of value varies between our two companies. Each is looking for results before an engineer becomes idled by having to wait for results. The startup iterates rapidly based on the results of each prior iteration. There is limited other activity for our startup engineer to perform before they have to wait on actionable results. Our mature company is more likely to have an outstanding backlog or improvements that can be acted upon while changes move through the pipeline.

Open-source tools

Open-source performance is defined by the resources upon which it is deployed. Open-source tools that can scale out can dynamically improve their performance when resource constrained. Performance can also be improved by scaling up, which is required when performance is constrained by resources and scaling out isn’t supported. The performance of open-source tools may be constrained by characteristics of how they were implemented in code or the libraries they use. If this is the case, the code is available for community or implementer-created improvements to address the limitation.

Consider the following:

You are responsible for managing the performance of your open-source tools
The performance of open-source tools may be constrained by the resources they are implemented upon; their system, resource, and software configuration; and the code and libraries present within the tools

AWS services

AWS services are designed to be highly scalable. CodeCommit has a highly scalable architecture, and CodeBuild scales up and down dynamically to meet your build volume. CodePipeline allows you to run actions in parallel in order to increase your workflow speeds.

Consider the following:

AWS services are fully managed, and service teams are responsible for the performance of the services.
AWS services are designed to scale automatically.
Your configuration of the services you consume can affect the performance of those services.
AWS service quotas exist to prevent unexpected costs. You can make changes to service quotas that may affect performance and costs.

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations; the customer is responsible for the open-source implementation (including the selection and configuration of the AWS Cloud resources) and the configuration of the AWS services it consumes. AWS is responsible for the performance of the AWS Cloud and the managed AWS services.

Operations

Our startup company wants to limit its operations burden as much as possible in order to focus on development efforts. Our mature company has an established and robust operations capability. In both cases, they perform the management and maintenance activities necessary to support their needs.

Open-source tools

Open-source tools are supported by their volunteer communities. That support is voluntary, without any obligation or commitment from the users. If either company adopts open-source tools, they’re responsible for the management and maintenance of the system. If they want additional support with an obligation and commitment to support their implementation, third parties may provide commercial support at additional cost.

Consider the following:

You are responsible for supporting your implementation.
The open-source community may provide volunteer support for the software.
There is no commitment to support the software by the open-source community.
There may be less documentation, or accepted best practices, available to support open-source tools.
Early adoption of open-source tools, or the use of development builds, includes the chance of encountering unidentified edge cases and unanticipated issues.
The complexity of an implementation and its integrations may increase the difficulty to support open-source tools. The time to identify contributing factors may be extended by the complexity during an incident. Maintaining a set of skilled team members with deep understanding of your implementation may help mitigate this risk.
You may be able to acquire commercial support through a third party.

AWS services

AWS services are committed to providing long-term support for their customers.

Consider the following:

There is long-term commitment from AWS to support the service
As a managed service, the service team maintains current documentation
Additional levels of support are available through AWS Support
Support for AWS is available through partners and third parties

Open-source tools on AWS services

Open-source tools on AWS services combine these considerations. The company is responsible for operating the open-source tools (for example, software configuration changes, updates, patching, and responding to faults). AWS is responsible for the operation of the AWS Cloud and the managed AWS services.

Conclusion

In this post, we discussed how to make informed decisions when choosing to implement open-source tools on AWS services, adopt managed AWS services, or use a combination of both. To do so, you must examine your organization and evaluate the benefits and risks.

Examine your organization

You can make an informed decision about the capabilities you adopt. The insight you need can be gained by examining your organization to identify your goals, needs, and priorities, and discovering what your current emphasis is. Ask the following questions:

What is your organization trying to accomplish and why?
How large is your organization and how is it structured?
How are roles and responsibilities distributed across teams?
How well defined and understood are your processes and procedures?
How do you manage development, testing, delivery, and deployment today?
What are the major challenges your organization faces?
What are the challenges you face managing development?
What problems are you trying to solve with CI/CD tools?
What do you want to achieve with CI/CD tools?

Evaluate benefits and risk

Armed with that knowledge, the next step is to explore the trade-offs between open-source options and managed AWS services. Then evaluate the benefits and risks in terms of the key considerations:

Features
Cost
Security
Reliability
Performance
Operations
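
One way to make this evaluation explicit is a weighted scoring matrix over the six considerations. The sketch below is only an illustration: the weights and scores are hypothetical placeholders that you would replace with values reflecting your own goals, needs, and priorities.

```python
CONSIDERATIONS = ["features", "cost", "security", "reliability", "performance", "operations"]


def weighted_score(weights: dict, scores: dict) -> float:
    """Return the weighted average of an option's scores across all considerations."""
    total_weight = sum(weights[name] for name in CONSIDERATIONS)
    weighted_sum = sum(weights[name] * scores[name] for name in CONSIDERATIONS)
    return weighted_sum / total_weight


# Hypothetical weights for an organization that values low operational burden.
weights = {"features": 1, "cost": 2, "security": 3, "reliability": 3, "performance": 1, "operations": 3}

# Hypothetical 1 to 5 scores for two options under evaluation.
options = {
    "option_a": {"features": 5, "cost": 3, "security": 3, "reliability": 3, "performance": 4, "operations": 2},
    "option_b": {"features": 3, "cost": 4, "security": 4, "reliability": 4, "performance": 4, "operations": 5},
}

for name, scores in options.items():
    print(name, round(weighted_score(weights, scores), 2))
```

The numbers matter less than the discussion needed to agree on them. Writing down the weights forces the question of what your use case and needs actually are.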

When asked “What is the correct answer?” the answer should never be “It depends.” We need to change the question to “What is our use case and what are our needs?” The answer will emerge from there.

Make an informed decision

A Well-Architected solution can include open-source tools, AWS Services, or any combination of both! A Well-Architected choice is an informed decision that evaluates trade-offs, balances benefits and risks, satisfies your requirements, and most importantly supports the achievement of your business outcomes.

Read the other posts in this series and take this journey with BigHat Biosciences and Iponweb as they share their perspectives, the decisions they made, and why.

Resources

Want to learn more? Check out the following CI/CD and developer tools on AWS:

Continuous integration (CI)
Continuous delivery (CD)
AWS Developer Tools

For more information about the AWS Well-Architected Framework, refer to the following whitepapers:

AWS Well-Architected Framework
AWS Well-Architected Operational Excellence pillar
AWS Well-Architected Security pillar
AWS Well-Architected Reliability pillar
AWS Well-Architected Performance Efficiency pillar
AWS Well-Architected Cost Optimization pillar

Author bio

Brian is the global Operational Excellence lead for the AWS Well-Architected program. Formerly the technical lead for an international network, Brian works with customers and partners researching the operations best practices with the greatest positive impact and produces guidance to help you achieve your goals.

How to replicate secrets in AWS Secrets Manager to multiple Regions

2021-03-04 Fatima Ahmed

Post Syndicated from Fatima Ahmed original https://aws.amazon.com/blogs/security/how-to-replicate-secrets-aws-secrets-manager-multiple-regions/

On March 3, 2021, we launched a new feature for AWS Secrets Manager that makes it possible for you to replicate secrets across multiple AWS Regions. You can give your multi-Region applications access to replicated secrets in the required Regions and rely on Secrets Manager to keep the replicas in sync with the primary secret. In scenarios such as disaster recovery, you can read replicated secrets from your recovery Regions, even if your primary Region is unavailable. In this blog post, I show you how to automatically replicate a secret and access it from the recovery Region to support a disaster recovery plan.

With Secrets Manager, you can store, retrieve, manage, and rotate your secrets, including database credentials, API keys, and other secrets. When you create a secret using Secrets Manager, it’s created and managed in a Region of your choosing. Although scoping secrets to a Region is a security best practice, there are scenarios such as disaster recovery and cross-Regional redundancy that require replication of secrets across Regions. Secrets Manager now makes it possible for you to easily replicate your secrets to one or more Regions to support these scenarios.

With this new feature, you can create Regional read replicas for your secrets. When you create a new secret or edit an existing secret, you can specify the Regions where your secrets need to be replicated. Secrets Manager will securely create the read replicas for each secret and its associated metadata, eliminating the need to maintain a complex solution for this functionality. Any update made to the primary secret, such as a secret value updated through automatic rotation, will be automatically propagated by Secrets Manager to the replica secrets, making it easier to manage the life cycle of multi-Region secrets.

Note: Each replica secret is billed as a separate secret. For more details on pricing, see the AWS Secrets Manager pricing page.

Architecture overview

Suppose that your organization has a requirement to set up a disaster recovery plan. In this example, us-east-1 is the designated primary Region, where you have an application running on a simple AWS Lambda function (for the example in this blog post, I’m using Python 3). You also have an Amazon Relational Database Service (Amazon RDS) – MySQL DB instance running in the us-east-1 Region, and you’re using Secrets Manager to store the database credentials as a secret. Your application retrieves the secret from Secrets Manager to access the database. As part of the disaster recovery strategy, you set up us-west-2 as the designated recovery Region, where you’ve replicated your application, the DB instance, and the database secret.

To elaborate, the solution architecture consists of:

A primary Region for creating the secret, in this case us-east-1 (N. Virginia).
A replica Region for replicating the secret, in this case us-west-2 (Oregon).
An Amazon RDS – MySQL DB instance that is running in the primary Region and configured for replication to the replica Region. To set up read replicas or cross-Region replicas for Amazon RDS, see Working with read replicas.
A secret created in Secrets Manager and configured for replication for the replica Region.
AWS Lambda functions (running on Python 3) deployed in the primary and replica Regions acting as clients to the MySQL DBs.

This architecture is illustrated in Figure 1.

Figure 1: Architecture overview for a multi-Region secret replication with the primary Region active

In the primary Region, us-east-1, the Lambda function uses the credentials stored in the secret to access the database, as indicated by the following steps in Figure 1:

The Lambda function sends a request to Secrets Manager to retrieve the secret value by using the GetSecretValue API call. Secrets Manager retrieves the secret value for the Lambda function.
The Lambda function uses the secret value to connect to the database in order to read/write data.

The replicated secret in us-west-2 points to the primary DB instance in us-east-1. This is because when Secrets Manager replicates the secret, it replicates the secret value and all the associated metadata, such as the database endpoint. The database endpoint details are stored within the secret because Secrets Manager uses this information to connect to the database and rotate the secret if it is configured for automatic rotation. The Lambda function can also use the database endpoint details in the secret to connect to the database.

To simplify database failover during disaster recovery, as I’ll cover later in the post, you can configure an Amazon Route 53 CNAME record for the database endpoint in the primary Region. The database host associated with the secret is configured with the database CNAME record. When the primary Region is operating normally, the CNAME record points to the database endpoint in the primary Region. The requests to the database CNAME are routed to the DB instance in the primary Region, as shown in Figure 1.

During disaster recovery, you can fail over to the replica Region, us-west-2, so that your application running in this Region can access the Amazon RDS read replica in us-west-2 by using the secret stored in the same Region. As part of your failover script, the database CNAME record should also be updated to point to the database endpoint in us-west-2. Because the host stored in the secret is the database CNAME rather than a Region-specific endpoint, your application in us-west-2 can now use the replicated secret, unchanged, to access the database read replica in this Region. Figure 2 illustrates this disaster recovery scenario.

Figure 2: Architecture overview for a multi-Region secret replication with the replica Region active
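The CNAME update in the failover script can be sketched with the AWS SDK for Python (boto3). This is a minimal sketch, not the author's script: it assumes the Route 53 `change_resource_record_sets` call with an `UPSERT` action, and the helper names, the 60-second TTL, the hosted zone ID, the record name, and the endpoint are all hypothetical placeholders. The Route 53 client is passed in so that the change batch can be inspected before anything is sent.

```python
def build_cname_upsert(record_name, target_endpoint, ttl=60):
    """Build a Route 53 change batch that points a CNAME at a new endpoint."""
    return {
        "Comment": "Fail over database CNAME to the replica Region",
        "Changes": [
            {
                # UPSERT creates the record if it is missing, otherwise updates it.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": target_endpoint}],
                },
            }
        ],
    }


def fail_over_database_cname(route53, hosted_zone_id, record_name, target_endpoint):
    """Send the change batch by using a boto3 Route 53 client."""
    return route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch=build_cname_upsert(record_name, target_endpoint),
    )


# Example usage (hypothetical values; assumes boto3 and credentials are configured):
# import boto3
# fail_over_database_cname(
#     boto3.client("route53"),
#     "Z0000000EXAMPLE",
#     "db.example.internal.",
#     "mydb-replica.example.us-west-2.rds.amazonaws.com",
# )
```

A short TTL is used here only so that clients pick up the new endpoint quickly; choose a value that suits your own recovery objectives.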

Prerequisites

Before you start the procedure described in this blog post, complete the following steps:

Configure an Amazon RDS DB instance in the primary Region, with replication configured in the replica Region.
Configure a Route 53 CNAME record for the database endpoint in the primary Region.
Configure the Lambda function to connect with the Amazon RDS database and Secrets Manager by following the procedure in this blog post.
Sign in to the AWS Management Console using a role that has SecretsManagerReadWrite permissions in the primary and replica Regions.

Enable replication for secrets stored in Secrets Manager

In this section, I walk you through the process of enabling replication in Secrets Manager for:

A new secret that is created for your Amazon RDS database credentials
An existing secret that is not configured for replication

For the first scenario, I show you the steps to create a secret in Secrets Manager in the primary Region (us-east-1) and enable replication for the replica Region (us-west-2).

To create a secret with replication enabled

In the AWS Management Console, navigate to the Secrets Manager console in the primary Region (N. Virginia).
Choose Store a new secret.
On the Store a new secret screen, enter the Amazon RDS database credentials that will be used to connect with the Amazon RDS DB instance. Select the encryption key and the Amazon RDS DB instance, and then choose Next.
Enter the secret name of your choice, and then enter a description. You can also optionally add tags and resource permissions to the secret.
Under Replicate Secret – optional, choose Replicate secret to other regions.

Figure 3: Replicate a secret to other Regions
For AWS Region, choose the replica Region, US West (Oregon) us-west-2. For Encryption Key, choose Default to store your secret in the replica Region. Then choose Next.

Figure 4: Configure secret replication
In the Configure Rotation section, you can choose whether to enable rotation. For this example, I chose not to enable rotation, so I selected Disable automatic rotation. However, if you want to enable rotation, you can do so by following the steps in Enabling rotation for an Amazon RDS database secret in the Secrets Manager User Guide. When you enable rotation in the primary Region, any changes to the secret from the rotation process are also replicated to the replica Region. After you’ve configured the rotation settings, choose Next.
On the Review screen, you can see the summary of the secret configuration, including the secret replication configuration.

Figure 5: Review the secret before storing
At the bottom of the screen, choose Store.
At the top of the next screen, you’ll see two banners that provide status on:
- The creation of the secret in the primary Region
- The replication of the secret in the replica Region
After the creation and replication of the secret is successful, the banners will provide you with confirmation details.

At this point, you’ve created a secret in the primary Region (us-east-1) and enabled replication in a replica Region (us-west-2). You can now use this secret in the replica Region as well as the primary Region.
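The same result can be sketched with the AWS SDK for Python (boto3) instead of the console. This is a minimal sketch, assuming the Secrets Manager `create_secret` call and its `AddReplicaRegions` parameter; the helper names, the secret name, and the credential values are hypothetical. The client is passed in so that the request can be built and reviewed without calling AWS.

```python
import json


def build_create_secret_request(name, credentials, replica_regions):
    """Build keyword arguments for create_secret with replication enabled."""
    return {
        "Name": name,
        "SecretString": json.dumps(credentials),
        # One entry per replica Region. Leaving out KmsKeyId uses the default
        # key (aws/secretsmanager) in that Region.
        "AddReplicaRegions": [{"Region": region} for region in replica_regions],
    }


def create_replicated_secret(client, name, credentials, replica_regions):
    """Create the secret in the client's Region and replicate it."""
    request = build_create_secret_request(name, credentials, replica_regions)
    return client.create_secret(**request)


# Example usage (hypothetical values; assumes boto3 and credentials are configured):
# import boto3
# client = boto3.client("secretsmanager", region_name="us-east-1")
# create_replicated_secret(
#     client,
#     "example/mysql-credentials",
#     {"username": "admin", "password": "example-only", "host": "db.example.internal"},
#     ["us-west-2"],
# )
```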

Now suppose that you have a secret created in the primary Region (us-east-1) that hasn’t been configured for replication. You can also configure replication for this existing secret by using the following procedure.

To enable multi-Region replication for existing secrets

In the Secrets Manager console, choose the secret name. At the top of the screen, choose Replicate secret to other regions.

Figure 6: Enable replication for existing secrets

This opens a pop-up screen where you can configure the replica Region and the encryption key for encrypting the secret in the replica Region.
Choose the AWS Region and encryption key for the replica Region, and then choose Complete adding region(s).

Figure 7: Configure replication for existing secrets

This starts the process of replicating the secret from the primary Region to the replica Region.
Scroll down to the Replicate Secret section. You can see that the replication to the us-west-2 Region is in progress.

Figure 8: Review progress for secret replication

After the replication is successful, you can look under Replication status to review the replication details that you’ve configured for your secret. You can also choose to replicate your secret to more Regions by choosing Add more regions.

Figure 9: Successful secret replication to a replica Region
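For an existing secret, the console procedure above can be sketched in Python as follows. This is a minimal sketch, assuming the boto3 Secrets Manager calls `replicate_secret_to_regions` and `describe_secret`; the helper names and the secret name in the usage comment are hypothetical.

```python
def replicate_existing_secret(client, secret_id, regions):
    """Add replica Regions to a secret that already exists."""
    return client.replicate_secret_to_regions(
        SecretId=secret_id,
        AddReplicaRegions=[{"Region": region} for region in regions],
    )


def replication_status(client, secret_id):
    """Return a {region: status} mapping taken from describe_secret."""
    response = client.describe_secret(SecretId=secret_id)
    return {
        entry["Region"]: entry["Status"]
        for entry in response.get("ReplicationStatus", [])
    }


# Example usage (hypothetical secret name; assumes boto3 and credentials):
# import boto3
# client = boto3.client("secretsmanager", region_name="us-east-1")
# replicate_existing_secret(client, "example/mysql-credentials", ["us-west-2"])
# print(replication_status(client, "example/mysql-credentials"))
```

The status reported for each Region is expected to move from `InProgress` to `InSync`, which corresponds to the progress and success views in the console.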

Update the secret with the CNAME record

Next, you can update the host value in your secret to the CNAME record of the DB instance endpoint. This will make it possible for you to use the secret in the replica Region without making changes to the replica secret. In the event of a failover to the replica Region, you can simply update the CNAME record to the DB instance endpoint in the replica Region as a part of your failover script.

To update the secret with the CNAME record

Navigate to the Secrets Manager console, and choose the secret that you have set up for replication.
In the Secret value section, choose Retrieve secret value, and then choose Edit.
Update the secret value for the host with the CNAME record, and then choose Save.

Figure 10: Edit the secret value
After you choose Save, you’ll see a banner at the top of the screen with a message that indicates that the secret was successfully edited. Because the secret is set up for replication, you can also review the status of the synchronization of your secret to the replica Region after you updated the secret. To do so, scroll down to the Replicate Secret section and look under Region Replication Status.

Figure 11: Successful secret replication for a modified secret
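The edit to the host value can also be sketched in code. This is a minimal sketch, assuming the secret value is a JSON document with a `host` key (as it is for secrets created for Amazon RDS credentials) and using the boto3 `get_secret_value` and `put_secret_value` calls; the helper names and the CNAME in the usage comment are hypothetical.

```python
import json


def with_host(secret_string, cname):
    """Return the secret JSON with its host replaced by the CNAME record."""
    secret = json.loads(secret_string)
    secret["host"] = cname
    return json.dumps(secret)


def update_secret_host(client, secret_id, cname):
    """Read the current value in the primary Region and store the new host."""
    current = client.get_secret_value(SecretId=secret_id)["SecretString"]
    return client.put_secret_value(
        SecretId=secret_id,
        SecretString=with_host(current, cname),
    )


# Example usage (hypothetical values; assumes boto3 and credentials):
# import boto3
# client = boto3.client("secretsmanager", region_name="us-east-1")
# update_secret_host(client, "example/mysql-credentials", "db.example.internal")
```

Only the `host` field changes; the other fields are written back as they were, and Secrets Manager then propagates the new value to the replica.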

Access replicated secrets from the replica Region

Now that you’ve configured the secret for replication in the primary Region, you can access the secret from the replica Region. Here I demonstrate how to access a replicated secret from a simple Lambda function that is deployed in the replica Region (us-west-2).

To access the secret from the replica Region

From the AWS Management Console, navigate to the Secrets Manager console in the replica Region (Oregon) and view the secret that you configured for replication in the primary Region (N. Virginia).

Figure 12: View secrets that are configured for replication in the replica Region
Choose the secret name and review the details that were replicated from the primary Region. A secret that is configured for replication will display a banner at the top of the screen stating the replication details.

Figure 13: The replication status banner
Under Secret Details, you can see the secret’s ARN. You can use the secret’s ARN to retrieve the secret value from the Lambda function or application that is deployed in your replica Region (Oregon). Make a note of the ARN.

Figure 14: View secret details

During a disaster recovery scenario when the primary Region isn’t available, you can update the CNAME record to point to the DB instance endpoint in us-west-2 as part of your failover script. For this example, my application that is deployed in the replica Region is configured to use the replicated secret’s ARN.

Let’s suppose your sample Lambda function defines the secret name and the Region in the environment variables. The REGION_NAME environment variable contains the name of the replica Region; in this example, us-west-2. The SECRET_NAME environment variable is the ARN of your replicated secret in the replica Region, which you noted earlier.

Figure 15: Environment variables for the Lambda function

In the replica Region, you can now refer to the secret’s ARN and Region in your Lambda function code to retrieve the secret value for connecting to the database. The following sample Lambda function code snippet uses the secret_name and region_name variables to retrieve the secret’s ARN and the replica Region values stored in the environment variables.

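import os

import boto3
from botocore.exceptions import ClientError

# SECRET_NAME holds the ARN of the replicated secret; REGION_NAME is the replica Region.
secret_name = os.environ['SECRET_NAME']
region_name = os.environ['REGION_NAME']

def openConnection():
    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )
    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError as e:
        if e.response['Error']['Code'] == 'DecryptionFailureException':
            # Secrets Manager can't decrypt the secret with the provided KMS key.
            raise
        # The original snippet ends here; re-raising other errors is an assumption.
        raise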

Alternatively, you can simply use the Python 3 sample code for the replicated secret to retrieve the secret value from the Lambda function in the replica Region. You can review the provided sample code by navigating to the secret details in the console, as shown in Figure 16.

Figure 16: Python 3 sample code for the replicated secret
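After the secret value is retrieved, the Lambda function still has to turn it into database connection settings. The following is a minimal sketch of that step, assuming the secret is a JSON document with `username`, `password`, `host`, and an optional `port` key; the function name, the returned key names, and the fallback to the default MySQL port 3306 are assumptions made for illustration.

```python
import json


def connection_settings(secret_string):
    """Map the secret's JSON fields to generic database connection settings."""
    secret = json.loads(secret_string)
    return {
        # The host is expected to be the database CNAME, so the same secret
        # works in the primary and the replica Region.
        "host": secret["host"],
        "port": int(secret.get("port", 3306)),
        "user": secret["username"],
        "password": secret["password"],
    }


# Example usage inside the Lambda function (names from the snippet above):
# settings = connection_settings(get_secret_value_response["SecretString"])
```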

Summary

When you plan for disaster recovery, you can configure replication of your secrets in Secrets Manager to provide redundancy for your secrets. This feature reduces the overhead of deploying and maintaining additional configuration for secret replication and retrieval across AWS Regions. In this post, I showed you how to create a secret and configure it for multi-Region replication. I also demonstrated how you can configure replication for existing secrets across multiple Regions.

I showed you how to use secrets from the replica Region and configure a sample Lambda function to retrieve a secret value. When replication is configured for secrets, you can use this technique to retrieve the secrets in the replica Region in a similar way as you would in the primary Region.

You can start using this feature through the AWS Secrets Manager console, AWS Command Line Interface (AWS CLI), AWS SDK, or AWS CloudFormation. To learn more about this feature, see the AWS Secrets Manager documentation. If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Secrets Manager forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

TLS 1.2 will be required for all AWS FIPS endpoints beginning March 31, 2021

2021-03-01 Janelle Hopper

Post Syndicated from Janelle Hopper original https://aws.amazon.com/blogs/security/tls-1-2-required-for-aws-fips-endpoints/

To help you meet your compliance needs, we’re updating all AWS Federal Information Processing Standard (FIPS) endpoints to a minimum of Transport Layer Security (TLS) 1.2. We have already updated over 40 services to require TLS 1.2, removing support for TLS 1.0 and TLS 1.1. Beginning March 31, 2021, client applications that cannot support TLS 1.2 will experience connection failures. To avoid an interruption in service, we encourage you to act now to ensure that you connect to AWS FIPS endpoints at TLS version 1.2. This change does not affect non-FIPS AWS endpoints.

Amazon Web Services (AWS) continues to notify impacted customers directly via their Personal Health Dashboard and email. However, if you’re connecting anonymously to AWS shared resources, such as through a public Amazon Simple Storage Service (Amazon S3) bucket, then you would not have received a notification, as we cannot identify anonymous connections.

Why are you removing TLS 1.0 and TLS 1.1 support from FIPS endpoints?

At AWS, we’re continually expanding the scope of our compliance programs to meet the needs of customers who want to use our services for sensitive and regulated workloads. Compliance programs, including FedRAMP, require a minimum level of TLS 1.2. To help you meet compliance requirements, we’re updating all AWS FIPS endpoints to a minimum of TLS version 1.2 across all AWS Regions. Following this update, you will not be able to use TLS 1.0 and TLS 1.1 for connections to FIPS endpoints.

How can I detect if I am using TLS 1.0 or TLS 1.1?

To detect the use of TLS 1.0 or 1.1, we recommend that you perform code, network, or log analysis. If you are using an AWS Software Development Kit (AWS SDK) or Command Line Interface (CLI), we have provided hyperlinks to detailed guidance in our previous TLS blog post about how to examine your client application code and properly configure the TLS version used.

When the application source code is unavailable, you can use a network tool, such as TCPDump (Linux) or Wireshark (Linux or Windows), to analyze your network traffic to find the TLS versions you’re using when connecting to AWS endpoints. For a detailed example of using these tools, see the example below.

If you’re using Amazon S3, you can also use your access logs to view the TLS connection information for these services and identify client connections that are not at TLS 1.2.

What is the most common use of TLS 1.0 or TLS 1.1?

The most common client applications that use TLS 1.0 or 1.1 are Microsoft .NET Framework versions earlier than 4.6.2. If you use the .NET Framework, please confirm you are using version 4.6.2 or later. For information on how to update and configure .NET Framework to support TLS 1.2, see How to enable TLS 1.2 on clients.

How do I know if I am using an AWS FIPS endpoint?

All AWS services offer TLS 1.2 encrypted endpoints that you can use for all API calls. Some AWS services also offer FIPS 140-2 endpoints for customers who need to use FIPS-validated cryptographic libraries to connect to AWS services. You can check our list of all AWS FIPS endpoints and compare the list to your application code, configuration repositories, DNS logs, or other network logs.

EXAMPLE: TLS version detection using a packet capture

To capture the packets, multiple online sources, such as this article, provide guidance for setting up TCPDump on a Linux operating system. On a Windows operating system, Wireshark can capture packets directly, and it can also analyze packets that were captured with TCPDump.

In this example, we assume there is a client application with the local IP address 10.25.35.243 that is making API calls to the CloudWatch FIPS API endpoint in the AWS GovCloud (US-West) Region. To analyze the traffic, first we look up the endpoint URL in the AWS FIPS endpoint list. In our example, the endpoint URL is monitoring.us-gov-west-1.amazonaws.com. Then we use NSLookup to find the IP addresses used by this FIPS endpoint.

Figure 1: Use NSLookup to find the IP addresses used by this FIPS endpoint

Wireshark is then used to open the captured packets, and filter to just the packets with the relevant IP address. This can be done automatically by selecting one of the packets in the upper section, and then right-clicking to use the Conversation filter/IPv4 option.

After the results are filtered to only the relevant IP addresses, the next step is to find the packet whose description in the Info column is Client Hello. In the lower packet details area, expand the Transport Layer Security section to find the version, which in this example is set to TLS 1.0 (0x0301). This indicates that the client only supports TLS 1.0 and must be modified to support a TLS 1.2 connection.

Figure 2: After the conversation filter has been applied, select the Client Hello packet in the top pane. Expand the Transport Layer Security section in the lower pane to view the packet details and the TLS version.

Figure 3 shows what it looks like after the client has been updated to support TLS 1.2. This second packet capture confirms we are sending TLS 1.2 (0x0303) in the Client Hello packet.

Figure 3: The client TLS has been updated to support TLS 1.2
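The version values shown in the Client Hello packet can be interpreted with a small lookup, and a Python client can be configured to refuse anything older than TLS 1.2. The following is a minimal sketch that uses only the standard library `ssl` module; the function names are hypothetical. Note that TLS 1.3 clients also place 0x0303 in this field for compatibility, so 0x0303 here means that the client offers TLS 1.2 or later.

```python
import ssl

# Protocol version codes as they appear in the Client Hello version field.
TLS_VERSION_NAMES = {
    0x0300: "SSL 3.0",
    0x0301: "TLS 1.0",
    0x0302: "TLS 1.1",
    0x0303: "TLS 1.2",
    0x0304: "TLS 1.3",
}


def describe_client_hello_version(code):
    """Return (name, meets_tls12_minimum) for a version code from a capture."""
    name = TLS_VERSION_NAMES.get(code, "unknown")
    return name, code in (0x0303, 0x0304)


def tls12_context():
    """Create a client context that will not negotiate below TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


# Example usage (requires network access; the endpoint is the one from this example):
# import socket
# host = "monitoring.us-gov-west-1.amazonaws.com"
# with socket.create_connection((host, 443)) as sock:
#     with tls12_context().wrap_socket(sock, server_hostname=host) as tls:
#         print(tls.version())
```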

Is there more assistance available?

If you have any questions or issues, you can start a new thread on one of the AWS forums, or contact AWS Support or your technical account manager (TAM). The AWS support tiers cover development and production issues for AWS products and services, along with other key stack components. AWS Support doesn’t include code development for client applications.

Additionally, you can use AWS IQ to find, securely collaborate with, and pay AWS-certified third-party experts for on-demand assistance to update your TLS client components. Visit the AWS IQ page for information about how to submit a request, get responses from experts, and choose the expert with the right skills and experience. Log in to your console and select Get Started with AWS IQ to start a request.

If you have feedback about this post, submit comments in the Comments section below.

How to set up a recurring Security Hub summary email

2021-02-24 Justin Criswell

Post Syndicated from Justin Criswell original https://aws.amazon.com/blogs/security/how-to-set-up-a-recurring-security-hub-summary-email/

AWS Security Hub provides a comprehensive view of your security posture in Amazon Web Services (AWS) and helps you check your environment against security standards and best practices. In this post, we’ll show you how to set up weekly email notifications using Security Hub to provide account owners with a summary of the existing security findings to prioritize, new findings, and links to the Security Hub console for more information.

When you enable Security Hub, it collects and consolidates findings from AWS security services that you’re using, such as intrusion detection findings from Amazon GuardDuty, vulnerability scans from Amazon Inspector, Amazon Simple Storage Service (Amazon S3) bucket policy findings from Amazon Macie, publicly accessible and cross-account resources from IAM Access Analyzer, and resources lacking AWS WAF coverage from AWS Firewall Manager. Security Hub also consolidates findings from integrated AWS Partner Network (APN) security solutions.

Cloud security processes can differ from traditional on-premises security in that security is often decentralized in the cloud. With traditional on-premises security operations, security alerts are typically routed to centralized security teams operating out of security operations centers (SOCs). With cloud security operations, it’s often the application builders or DevOps engineers who are best situated to triage, investigate, and remediate the security alerts. This integration of security into DevOps processes is referred to as DevSecOps, and as part of this approach, centralized security teams look for additional ways to proactively engage application account owners in improving the security posture of AWS accounts.

This solution uses Security Hub custom insights, AWS Lambda, and the Security Hub API. A custom insight is a collection of findings that are aggregated by a grouping attribute, such as severity or status. Insights help you identify common security issues that might require remediation action. Security Hub includes several managed insights, or you can create your own custom insights. Amazon SNS topic subscribers will receive an email, similar to the one shown in Figure 1, that summarizes the results of the Security Hub custom insights.

Figure 1: Example email with a summary of security findings for an account

Solution overview

This solution assumes that Security Hub is enabled in your AWS account. If it isn’t enabled, set up the service so that you can start seeing a comprehensive view of security findings across your AWS accounts.

A recurring Security Hub summary email provides recipients with a proactive communication that summarizes the security posture and any recent improvements within their AWS accounts. The email message contains the following sections:

AWS Foundational Security Best Practices findings by status
AWS Foundational Security Best Practices findings by severity
Amazon GuardDuty findings by severity
AWS Identity and Access Management (IAM) Access Analyzer findings by severity
Unresolved findings by severity
New findings in the last seven days by security product
Top 10 resource types with the most findings

Here’s how the solution works:

Seven Security Hub custom insights are created when you first deploy the solution.
An Amazon CloudWatch time-based event invokes a Lambda function for processing.
The Lambda function gets the results of the custom insights from Security Hub, formats the results for email, and sends a message to Amazon SNS.
Amazon SNS sends the email notification to the address you provided during deployment.
The email includes the summary and links to the Security Hub UI so that the recipient can follow the remediation workflow.

Figure 2 shows the solution workflow.

Figure 2: Solution overview, deployed through AWS CloudFormation
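The core of the workflow above (read the insight results, format them, publish to Amazon SNS) can be sketched as follows. This is a minimal sketch and not the code shipped in sec-hub-email.zip: it assumes the boto3 Security Hub `get_insight_results` call and the SNS `publish` call, and the function names, the subject line, and the plain-text layout are hypothetical. The clients are passed in as arguments.

```python
def format_insight(title, insight_results):
    """Render one insight as a title followed by 'value: count' lines."""
    lines = [title]
    for value in insight_results["ResultValues"]:
        lines.append(f"  {value['GroupByAttributeValue']}: {value['Count']}")
    return "\n".join(lines)


def send_summary(securityhub, sns, topic_arn, insights):
    """Query each (title, insight_arn) pair and publish one summary message."""
    sections = []
    for title, insight_arn in insights:
        response = securityhub.get_insight_results(InsightArn=insight_arn)
        sections.append(format_insight(title, response["InsightResults"]))
    message = "\n\n".join(sections)
    sns.publish(TopicArn=topic_arn, Subject="Security Hub summary", Message=message)
    return message


# Example usage (hypothetical ARNs; assumes boto3 and credentials):
# import boto3
# send_summary(
#     boto3.client("securityhub"),
#     boto3.client("sns"),
#     "arn:aws:sns:us-east-1:111122223333:SecurityHubRecurringSummary",
#     [("Unresolved findings by severity", "arn:aws:securityhub:...")],
# )
```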

Security Hub custom insight

The finding results presented in the email are summarized by Security Hub custom insights. A Security Hub insight is a collection of related findings. Each insight is defined by a group by statement and optional filters. The group by statement indicates how to group the matching findings, and identifies the type of item that the insight applies to. For example, if an insight is grouped by resource identifier, then the insight produces a list of resource identifiers. The optional filters narrow down the matching findings for the insight. For example, you might want to see only the findings from specific providers or findings associated with specific types of resources. Figure 3 shows the seven custom insights that are created as part of deploying this solution.

Figure 3: Custom insights created by the solution

Sample custom insight

Security Hub offers several built-in managed (default) insights. You can’t modify or delete managed insights. You can view the custom insights created as part of this solution in the Security Hub console under Insights, by selecting the Custom Insights filter. From the email, follow the link for “Summary Email – 02 – Failed AWS Foundational Security Best Practices” to see the summarized finding counts, as well as graphs with related data, as shown in Figure 4.

Figure 4: Detail view of the email titled “Summary Email – 02 – Failed AWS Foundational Security Best Practices”

Let’s evaluate the filters that create this custom insight:

Filter setting	Filter results
Type is “Software and Configuration Checks/Industry and Regulatory Standards/AWS-Foundational-Security-Best-Practices”	Captures all current and future findings created by the security standard AWS Foundational Security Best Practices.
Status is FAILED	Captures findings where the compliance status of the resource doesn’t pass the assessment.
Workflow Status is not SUPPRESSED	Captures findings where Security Hub users haven’t updated the finding to the SUPPRESSED status.
Record State is ACTIVE	Captures findings that represent the latest assessment of the resource. Security Hub automatically archives control-based findings if the associated resource is deleted, the resource does not exist, or the control is disabled.
Group by SeverityLabel	Creates the insight and populates the counts.
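The filter table above can be expressed as a Security Hub `create_insight` request. This is a minimal sketch that assumes the table rows map to the `Type`, `ComplianceStatus`, `WorkflowStatus`, and `RecordState` filter fields of the API; the function name is hypothetical, and the request that the solution's CloudFormation template actually sends may differ.

```python
FSBP_TYPE = (
    "Software and Configuration Checks/Industry and Regulatory Standards/"
    "AWS-Foundational-Security-Best-Practices"
)


def failed_fsbp_insight_request(name):
    """Build create_insight arguments for failed best-practices findings."""
    return {
        "Name": name,
        "Filters": {
            "Type": [{"Value": FSBP_TYPE, "Comparison": "EQUALS"}],
            "ComplianceStatus": [{"Value": "FAILED", "Comparison": "EQUALS"}],
            # Keep every finding that has not been suppressed.
            "WorkflowStatus": [{"Value": "SUPPRESSED", "Comparison": "NOT_EQUALS"}],
            "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
        },
        # Produces one count per severity label.
        "GroupByAttribute": "SeverityLabel",
    }


# Example usage (assumes boto3 and credentials):
# import boto3
# request = failed_fsbp_insight_request(
#     "Summary Email - 02 - Failed AWS Foundational Security Best Practices"
# )
# boto3.client("securityhub").create_insight(**request)
```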

Solution artifacts

The solution provided with this blog post consists of two files:

An AWS CloudFormation template named security-hub-email-summary-cf-template.json.
A zip file named sec-hub-email.zip for the Lambda function that generates the Security Hub summary email.

In addition to the Security Hub custom insights as discussed in the previous section, the solution also deploys the following artifacts:

An Amazon Simple Notification Service (Amazon SNS) topic named SecurityHubRecurringSummary and an email subscription to the topic.

Figure 5: SNS topic created by the solution

The email address that subscribes to the topic is captured through a CloudFormation template input parameter. The subscriber is notified by email to confirm the subscription, and after confirmation, the subscription to the SNS topic is created.

Figure 6: SNS email subscription
Two Lambda functions:
1. A Lambda function named *-CustomInsightsFunction-* is used only by the CloudFormation template to create the custom insights.
2. A Lambda function named SendSecurityHubSummaryEmail queries the custom insights from the Security Hub API and uses the insights’ data to create the summary email message. The function then sends the email message to the SNS topic.
  
  Figure 7: Example of Lambda functions created by the solution

Two IAM roles for the Lambda functions provide the following rights, respectively:

The minimum rights required to create insights and to create CloudWatch log groups and logs.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "securityhub:CreateInsight"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

The minimum rights required to query Security Hub insights and to send email messages to the SNS topic named SecurityHubRecurringSummary.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "sns:Publish",
            "Resource": "arn:aws:sns:[REGION]:[ACCOUNT-ID]:SecurityHubRecurringSummary",
            "Effect": "Allow"
        }
    ]
}

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "securityhub:Get*",
                "securityhub:List*",
                "securityhub:Describe*"
            ],
            "Resource": "*"
        }
    ]
}

A CloudWatch scheduled event named SecurityHubSummaryEmailSchedule for invoking the Lambda function that generates the summary email. The default schedule is every Monday at 8:00 AM GMT. This schedule can be overridden by using a CloudFormation input parameter. Learn more about creating Cron expressions.

Figure 8: Example of CloudWatch schedule created by the solution
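A weekly schedule such as the default can be written as a six-field CloudWatch Events cron expression (minutes, hours, day of month, month, day of week, year), evaluated in UTC. The following is a minimal sketch of a helper that builds such an expression; the function name is hypothetical, and the exact default string used in the template is not shown in this post, so treat the output as one valid way to express "every Monday at 8:00 AM GMT".

```python
VALID_DAYS = ("SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT")


def weekly_cron(day_of_week="MON", hour=8, minute=0):
    """Build a cron expression that fires once a week at the given UTC time."""
    if day_of_week not in VALID_DAYS:
        raise ValueError(f"day_of_week must be one of {VALID_DAYS}")
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Day of month is '?' because the day of week is specified instead.
    return f"cron({minute} {hour} ? * {day_of_week} *)"


# Example usage:
# weekly_cron()                # every Monday at 08:00 UTC
# weekly_cron("FRI", 16, 30)   # every Friday at 16:30 UTC
```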

Deploy the solution

The following steps demonstrate the deployment of this solution in a single AWS account and Region. Repeat these steps in each of the AWS accounts that are active with Security Hub, so that the respective application owners can receive the relevant data from their accounts.

To deploy the solution

Download the CloudFormation template security-hub-email-summary-cf-template.json and the .zip file sec-hub-email.zip from https://github.com/aws-samples/aws-security-hub-summary-email.
Copy security-hub-email-summary-cf-template.json and sec-hub-email.zip to an S3 bucket within your target AWS account and Region. Copy the object URL for the CloudFormation template .json file.
On the AWS Management Console, open the service CloudFormation. Choose Create Stack with new resources.

Figure 9: Create stack with new resources
Under Specify template, in the Amazon S3 URL textbox, enter the S3 object URL for the file security-hub-email-summary-cf-template.json that you uploaded in step 2.

Figure 10: Specify S3 URL for CloudFormation template
Choose Next. On the next page, under Stack name, enter a name for the stack.

Figure 11: Enter stack name
On the same page, enter values for the input parameters. These are the input parameters that are required for this CloudFormation template:
1. S3 Bucket Name: The S3 bucket where the .zip file for the Lambda function (sec-hub-email.zip) is stored.
2. S3 key name (with prefixes): The S3 key name (with prefixes) for the .zip file for the Lambda function.
3. Email address: The email address of the subscriber to the Security Hub summary email.
4. CloudWatch Cron Expression: The Cron expression for scheduling the Security Hub summary email. The default is every Monday 8:00 AM GMT. Learn more about creating Cron expressions.
5. Additional Footer Text: Text that will appear at the bottom of the email message. This can be useful to guide the recipient on next steps or provide internal resource links. This is an optional parameter; leave it blank for no text.
Figure 12: Enter CloudFormation parameters
Choose Next.
Keep all defaults in the screens that follow, and choose Next.
Select the check box I acknowledge that AWS CloudFormation might create IAM resources, and then choose Create stack.

Test the solution

You can send a test email after the deployment is complete. To do this, navigate to the Lambda console and locate the Lambda function named SendSecurityHubSummaryEmail. Perform a manual invocation with any event payload to receive an email within a few minutes. You can repeat this procedure as many times as you wish.
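
If you prefer to script the test instead of using the console, the manual invocation might look like the following sketch. It assumes you pass in a Lambda client, for example one created with boto3.client("lambda") (boto3 and AWS credentials are not included here), and the helper name send_test_email is hypothetical.

```python
import json

FUNCTION_NAME = "SendSecurityHubSummaryEmail"


def send_test_email(lambda_client, payload=None):
    """Invoke the summary function synchronously and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=FUNCTION_NAME,
        InvocationType="RequestResponse",
        # Per the post, any event payload works; an empty JSON object is used here.
        Payload=json.dumps(payload or {}).encode("utf-8"),
    )
    return response["StatusCode"]
```

A status code of 200 only indicates that the invocation completed; the email itself can still take a few minutes to arrive.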

Conclusion

We’ve outlined an approach for rapidly building a solution for sending a weekly summary of the security posture of your AWS account as evaluated by Security Hub. This solution makes it easier for you to be diligent in reviewing any outstanding findings and to remediate findings in a timely way based on their severity. You can extend the solution in many ways, including:

Add links in the footer text to remediation workflows, such as creating a ticket in ServiceNow or in any security information and event management (SIEM) system that you use.
Add links to internal wikis for workflows like organizational exceptions to vulnerabilities or other internal processes.
Extend the solution by modifying the custom insights content, email content, and delivery frequency.

To learn more about how to set up and customize Security Hub, see these additional blog posts.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the AWS Security Hub forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

How to continuously audit and limit security groups with AWS Firewall Manager

2021-02-18 Jesse Lepich

Post Syndicated from Jesse Lepich original https://aws.amazon.com/blogs/security/how-to-continuously-audit-and-limit-security-groups-with-aws-firewall-manager/

At AWS re:Invent 2019 and in a subsequent blog post, Stephen Schmidt, Chief Information Security Officer for Amazon Web Services (AWS), laid out the top 10 security items that AWS customers should pay special attention to if they want to improve their security posture. High on the list is the need to manage your network security and virtual private cloud (VPC) security groups. In this blog post, we’ll look at how you can use AWS Firewall Manager to address item number 4 on Stephen’s list: “Limit Security Groups.”

One fundamental security measure is to restrict network access to a server or service when connecting it to a network. In an on-premises scenario, you would use a firewall or similar technology to restrict network access to only approved IPs, ports, and protocols. When you migrate existing workloads or launch new workloads in AWS, the same basic security measures should be applied. Security groups, network access control lists, and AWS Network Firewall provide network security functionality in AWS. In this post, we’ll summarize the main use cases for managing security groups with Firewall Manager, and then we’ll take a step-by-step look at how you can configure Firewall Manager to manage protection of high-risk applications, such as Remote Desktop Protocol (RDP) and Secure Shell (SSH).

What are security groups?

Security groups are a powerful tool provided by AWS for use in enforcing network security and access control to your AWS resources and Amazon Elastic Compute Cloud (Amazon EC2) instances. Security groups provide stateful Layer 3/Layer 4 filtering for EC2 interfaces.

There are some things you need to know about configuring security groups:

A security group with no inbound rules denies all inbound traffic.
You need to create rules in order to allow traffic to flow.
You cannot create an explicit deny rule with a security group.
There are separate inbound and outbound rules for each security group.
Security groups are assigned to an EC2 instance (more precisely, to its network interfaces), similar to a host-based firewall, and not to the subnet or VPC. By default, you can assign up to five security groups to each network interface.
Security group rules can reference IP addresses, subnets, or another security group.
Security groups can be reused across different instances. This means that you don’t have to create long complex rulesets when dealing with multiple subnets.
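
The allow-only behavior described in this list can be modeled in a few lines. The following is a simplified sketch, not how AWS implements security groups: the InboundRule type and is_inbound_allowed function are hypothetical, and only CIDR sources are modeled (not references to other security groups).

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    protocol: str   # "tcp", "udp", or "-1" for all protocols
    from_port: int
    to_port: int
    cidr: str       # source range, for example "10.0.0.0/16"


def is_inbound_allowed(rules, protocol, port, source_ip):
    """Return True only if at least one rule allows the traffic."""
    source = ipaddress.ip_address(source_ip)
    for rule in rules:
        protocol_ok = rule.protocol in ("-1", protocol)
        port_ok = rule.protocol == "-1" or rule.from_port <= port <= rule.to_port
        if protocol_ok and port_ok and source in ipaddress.ip_network(rule.cidr):
            return True
    # There are no deny rules: anything not explicitly allowed is dropped.
    return False
```

With an empty rule list the function always returns False, which mirrors the first point in the list: a security group with no inbound rules denies all inbound traffic.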

Best practices for security groups

AWS recommends that you follow these best practices when you work with security groups.

Remove unused or unattached security groups
Large numbers of unused or unattached security groups create confusion and invite misconfiguration. Remove any unused security groups. (PCI.EC2.3)

Limit modification to authorized roles only
AWS Identity and Access Management (IAM) roles with access can modify security groups. Limit the number of roles that have authorization to change security groups. (PCI DSS 7.2.1)

Monitor the creation or deletion of security groups
This best practice works hand in hand with the first two; you should always monitor for the attempted creation, modification, and deletion of security groups. (CIS AWS Foundations 3.10)

Don’t ignore the outbound or egress rules
Limit outbound access to only the subnets that are required. For example, in a three-tier web application, the app layer likely shouldn’t have unrestricted access to the internet, so configure the security group to allow access to only those hosts or subnets needed for correct functioning of the application. (PCI DSS 1.3.4)

Limit the ingress or inbound port ranges that are accessible
Limit the ports that are open in a security group to only those that are necessary for the application to function correctly. Large open port ranges can expose vulnerable services or allow unintended access to them. This is especially important with high-risk applications. (CIS AWS Foundations 4.1, 4.2) (PCI DSS 1.2.1, 1.3.2)

Maintaining these best practices manually can be a challenge in large-scale AWS environments, or where developers and application owners might be deploying new applications often. Organizations can address this challenge by providing centrally configured guardrails. At AWS, we view security as an enabler to development velocity, making it possible for developers to move applications into production very quickly, but with the correct safeguards in place automatically.

Manage security groups with Firewall Manager

Firewall Manager is a security management service that you can use to centrally configure and manage firewall rules across your accounts and applications in AWS Organizations. As new applications are created, Firewall Manager makes it easier to bring them into compliance by enforcing a common set of baseline security rules and ensuring that overly permissive rules generate compliance findings or are automatically removed. With Firewall Manager, you have a single service to build firewall rules, create security policies, and enforce rules and policies in a consistent, hierarchical way across your entire infrastructure. Learn more about the Firewall Manager prerequisites.

The security group capabilities of Firewall Manager fall into three broad categories:

Create and apply baseline security groups to AWS accounts and resources.
Audit and clean up unused or redundant security groups.
Audit and control security group rules to identify rules that are too permissive and high risk.

In the following sections, we’re going to show how you can use Firewall Manager to audit and limit security groups by identifying rules that are too permissive and expose high-risk applications to external threats.

Use Firewall Manager to help protect high-risk applications

In this example, we’ll show how customers can use Firewall Manager to improve their security posture by automatically limiting access to high-risk applications, such as RDP, SSH, and SMB, from anywhere on the internet. All too often, access to these applications is left open to the internet, where unauthorized parties can find them using automated scanning tools. It has become increasingly important for customers to reduce their risk surface, because these types of attacks require less and less technical skill to carry out. In many cases, the overly permissive access begins as a temporary setting for testing, and then is inadvertently left open over the long term. With a simple-to-configure policy, Firewall Manager can find and even automatically fix this issue across all of your AWS accounts.

Let’s jump right into configuring Firewall Manager for this use case, where you’ll inventory where public IP addresses are allowed to access high-risk applications. Once you’ve evaluated all the occurrences, then you’ll automatically remediate them.

To use Firewall Manager to limit access to high-risk applications

Sign in to the AWS Management Console using the Firewall Manager administrator account, then navigate to Firewall Manager in the Console and choose Security policies.
Specify the correct AWS Region your policy should be deployed to, and then choose Create policy.

Figure 1: Create Firewall Manager policy
Under Policy type, choose Security group. Under Security group policy type, choose Auditing and enforcement of security group rules. Then confirm the Region is correct and choose Next.

Figure 2: Firewall Manager policy type and Region
Enter a policy name. Under Policy options, choose Configure managed audit policy rules. Under Policy rules, choose Inbound Rules, and then turn on the Audit high risk applications action.

Figure 3: Firewall Manager managed audit policy
Next, choose Applications that can only access local CIDR ranges, and then choose Add application list. As Figure 4 shows, this setting looks for resources that allow address ranges outside the RFC 1918 private ranges (that is, publicly routable internet IP addresses) to connect to them. By listing these applications, you can focus on your highest risk scenarios (accessibility to these high-risk applications from the internet) first. As an information security practitioner, you always want to maximize your limited time and focus on the highest risk items first. Firewall Manager makes this easier to do at scale across all AWS resources.

Figure 4: Firewall Manager audit high risk applications setting
Under Add application list, choose Add an existing list. Then select FMS-Default-Public-Access-Apps-Denied, and choose Add application list. The default managed list includes SSH, RDP, NFS, SMB, and NetBIOS, but you can also create your own custom application lists in Firewall Manager.

Figure 5: Firewall Manager list of applications denied public access
Under Policy action, choose Identify resources that don’t comply with the policy rules, but don’t auto remediate, and then choose Next. This is where you can choose whether to have Firewall Manager provide alerts only, or to alert and automatically remove the specific risky security group rules. We recommend that customers start this process by only identifying noncompliant resources so that they can understand the full impact of eventually setting the auto remediation policy action.

Figure 6: Firewall Manager policy action
Under AWS accounts this policy applies to, choose Include all accounts under my AWS organization. Under Resource type, select all of the resource types. Under Resources, choose Include all resources that match the selected resource type to define the scope of this policy (what the policy will apply to), and then choose Next. This scope will give you a broad view of all resources that have high-risk applications exposed to the internet, but if you wanted, you could be much more targeted with how you apply your security policies using the other available scope options here. For now, let’s keep the scope broad so you can get a comprehensive view of your risk surface.

Figure 7: Firewall Manager policy scope
If you choose to, you can apply a tag to this specific Firewall Manager security policy for tracking and documentation purposes. Then choose Next.

Figure 8: Firewall Manager policy tags
The final page gives an overview of all the configuration settings so you can review and verify the correct configuration. Once you’re done reviewing the policy, choose Create policy to deploy this policy.

Figure 9: Review and create policy in Firewall Manager
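
To illustrate the kind of check this policy performs, the following sketch flags security group rules that expose high-risk applications to sources outside the RFC 1918 private ranges. It is a simplified model, not Firewall Manager's implementation: the port numbers in HIGH_RISK_PORTS are assumptions for illustration (the actual contents of the managed application list are defined by Firewall Manager), and the rule dictionaries are a hypothetical shape.

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Assumed port ranges, for illustration only.
HIGH_RISK_PORTS = {
    "SSH": (22, 22),
    "RDP": (3389, 3389),
    "NFS": (2049, 2049),
    "SMB": (445, 445),
    "NetBIOS": (137, 139),
}


def is_local_cidr(cidr):
    """Return True if the range lies entirely inside an RFC 1918 private range."""
    network = ipaddress.ip_network(cidr)
    return network.version == 4 and any(network.subnet_of(p) for p in RFC1918)


def exposed_apps(rule):
    """List the high-risk applications that a rule opens to non-local sources."""
    if is_local_cidr(rule["cidr"]):
        return []
    return [
        app
        for app, (low, high) in HIGH_RISK_PORTS.items()
        if rule["from_port"] <= high and low <= rule["to_port"]
    ]
```

For example, a rule that allows 0.0.0.0/0 on port 3389 is reported as exposing RDP, while the same port opened only to 10.1.0.0/16 is not reported.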

Now that you’ve created your Firewall Manager policy, you need to wait five minutes for Firewall Manager to inventory all of your AWS accounts and resources as it looks for noncompliant high-risk applications exposed to the internet.

Review policy findings to understand the risk surface

There are two main ways to review details about resources that are noncompliant with the Firewall Manager security policy you created: you can use Firewall Manager itself, or you can also use AWS Security Hub, since Firewall Manager sends all findings to Security Hub by default. Security Hub is a central location you can use to view findings from many security tools, including both native AWS security tools and third-party security tools. Security Hub can help you further focus your time in the highest value areas by, for example, showing you which resources have the largest number of security findings associated with them, and therefore represent a higher risk that should be addressed first. We won’t cover Security Hub here, but it’s helpful to know that Firewall Manager integrates with Security Hub.

Now that you’ve configured your Firewall Manager security policy and it has had time to inventory your environment to help identify noncompliant resources, you can review what Firewall Manager has found by viewing the Firewall Manager security policy.

To review policy findings

On the Security policies page in the Firewall Manager console, you can see an overview of the policy you just created. You can see that the policy isn’t set to auto remediate yet, and that there are seven accounts that have noncompliant resources in them.

Figure 10: Firewall Manager policy result overview

To view the specific details of each noncompliant resource, choose the name of your security policy. A list of accounts with noncompliant resources will be displayed.

Figure 11: Firewall Manager noncompliant accounts

Choose an account number to get more details about that account. Now you can see a list of noncompliant resources.

Figure 12: Firewall Manager noncompliant resources

To get further details regarding why a resource is noncompliant, choose the Resource ID. This will show you the specific noncompliant security group rule.

Here you can see that this security group resource violates the Firewall Manager security policy that you created because it allows a source of 0.0.0.0/0 (any) to access TCP/3389 (RDP).

Figure 13: Firewall Manager noncompliant security group rule

The recommended action is to remove this noncompliant rule from the security group. You can do that manually. Alternatively, once you’ve reviewed all the findings and have a good understanding of all of the noncompliant resources, you can edit your existing “Protect high risk applications from the Internet” Firewall Manager security policy and set the policy action to Auto remediate non-compliant resources. This causes Firewall Manager to attempt to force compliance across all these resources automatically using its service-linked role. This level of automation can help security teams make sure that their organization’s resources aren’t being accidentally exposed to high-risk scenarios.

Use Firewall Manager to address other security group use cases

Firewall Manager has many other security group–related capabilities that we didn’t cover here. You can learn more about those here. This post was focused on helping customers start today to address high-risk scenarios that they may inadvertently have in their AWS environment. Firewall Manager can help you get continuous visibility into these scenarios, as well as automatically remediate them, even if these scenarios occur in the future. Here’s a quick overview of other use cases Firewall Manager can help you with. Keep in mind that these rules can be set to alert you only, or alert and auto remediate:

Deploy pre-approved security groups to AWS accounts and automatically associate them with resources
Deny the use of “ALL” protocol in security group rules, instead requiring that a specific protocol be selected
Deny the use of port ranges greater than n in security group rules
Deny the use of Classless Inter-Domain Routing (CIDR) ranges less than n in security group rules
Specify a list of applications that can be accessible from anywhere across the internet (and deny access to all other applications)
Identify security groups that are unused for n days
Identify redundant security groups
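
Several of the content audit rules in this list are simple threshold checks. The sketch below shows one way to express three of them. The function name audit_rule, the rule shape, and the default thresholds are all hypothetical, and "CIDR ranges less than n" is interpreted here as a prefix length shorter than n (that is, a broader range), which is an assumption.

```python
import ipaddress


def audit_rule(rule, max_port_range=100, min_prefix_len=16):
    """Return a list of findings for one security group rule."""
    findings = []
    if rule["protocol"] == "-1":
        findings.append("uses the ALL protocol")
    elif rule["to_port"] - rule["from_port"] + 1 > max_port_range:
        findings.append(f"port range is wider than {max_port_range}")
    if ipaddress.ip_network(rule["cidr"]).prefixlen < min_prefix_len:
        findings.append(f"CIDR prefix is shorter than /{min_prefix_len}")
    return findings
```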

Firewall Manager has received many significant feature enhancements over the last year, but we’re not done yet. We have a robust roadmap of features we’re actively working on that will continue to make it easier for AWS customers to achieve security compliance of their resources.

Conclusion

In this post, we explored how Firewall Manager can help you more easily manage the VPC security groups in your AWS environments from a single central tool. Specifically, we showed how Firewall Manager can assist in implementing Stephen Schmidt’s best practice #4, “Limit Security Groups.” We focused on exactly how you can configure Firewall Manager to evaluate and get visibility into your external-facing risk surface of high-risk applications such as SSH, RDP, and SMB, and how you can use Firewall Manager to automatically remediate out-of-compliance security groups. We also summarized the other security group–related capabilities of Firewall Manager so that you can see there are many more use cases you can address with Firewall Manager. We encourage you to start using Firewall Manager today to protect your applications.

To learn more, see these AWS Security Blog posts on Firewall Manager.

If you have feedback about this post, submit comments in the Comments section below.

Use new account assignment APIs for AWS SSO to automate multi-account access

2021-02-08 Akhil Aendapally

Post Syndicated from Akhil Aendapally original https://aws.amazon.com/blogs/security/use-new-account-assignment-apis-for-aws-sso-to-automate-multi-account-access/

In this blog post, we’ll show how you can programmatically assign and audit access to multiple AWS accounts for your AWS Single Sign-On (SSO) users and groups, using the AWS Command Line Interface (AWS CLI) and AWS CloudFormation.

With AWS SSO, you can centrally manage access and user permissions to all of your accounts in AWS Organizations. You can assign user permissions based on common job functions, customize them to meet your specific security requirements, and assign the permissions to users or groups in the specific accounts where they need access. You can create, read, update, and delete permission sets in one place to have consistent role policies across your entire organization. You can then provide access by assigning permission sets to multiple users and groups in multiple accounts all in a single operation.

AWS SSO recently added new account assignment APIs and AWS CloudFormation support to automate access assignment across AWS Organizations accounts. This release addressed feedback from our customers with multi-account environments who wanted to adopt AWS SSO, but faced challenges related to managing AWS account permissions. To automate the previously manual process and save your administration time, you can now use the new AWS SSO account assignment APIs, or AWS CloudFormation templates, to programmatically manage AWS account permission sets in multi-account environments.

With the AWS SSO account assignment APIs, you can now build your own automation to assign access for your users and groups to AWS accounts. You can also gain insights into who has access to which permission sets in which accounts across your entire AWS Organizations structure. With the account assignment APIs, your automation system can programmatically retrieve permission sets for audit and governance purposes, as shown in Figure 1.

Figure 1: Automating multi-account access with the AWS SSO API and AWS CloudFormation

Overview

In this walkthrough, we’ll illustrate how to create permission sets, assign permission sets to users and groups in AWS SSO, and grant access for users and groups to multiple AWS accounts by using the AWS Command Line Interface (AWS CLI) and AWS CloudFormation.

To grant user permissions to AWS resources with AWS SSO, you use permission sets. A permission set is a collection of AWS Identity and Access Management (IAM) policies. Permission sets can contain up to 10 AWS managed policies and a single custom policy stored in AWS SSO.

A policy is an object that defines a user’s permissions. Policies contain statements that represent individual access controls (allow or deny) for various tasks. This determines what tasks users can or cannot perform within the AWS account. AWS evaluates these policies when an IAM principal (a user or role) makes a request.

When you provision a permission set in the AWS account, AWS SSO creates a corresponding IAM role on that account, with a trust policy that allows users to assume the role through AWS SSO. With AWS SSO, you can assign more than one permission set to a user in the specific AWS account. Users who have multiple permission sets must choose one when they sign in through the user portal or the AWS CLI. Users will see these as IAM roles.

To learn more about IAM policies, see Policies and permissions in IAM. To learn more about permission sets, see Permission Sets.

Assume you have a company, Example.com, which has three AWS accounts: an organization management account (ExampleOrgMaster), a development account (ExampleOrgDev), and a test account (ExampleOrgTest). Example.com uses AWS Organizations to manage these accounts and has already enabled AWS SSO.

Example.com has the IT security lead, Frank Infosec, who needs PowerUserAccess to the test account (ExampleOrgTest) and SecurityAudit access to the development account (ExampleOrgDev). Alice Developer, the developer, needs full access to Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3) through the development account (ExampleOrgDev). We’ll show you how to assign and audit the access for Alice and Frank centrally with AWS SSO, using the AWS CLI.

The flow includes the following steps:

Create three permission sets:
- PowerUserAccess, with the PowerUserAccess policy attached.
- AuditAccess, with the SecurityAudit policy attached.
- EC2-S3-FullAccess, with the AmazonEC2FullAccess and AmazonS3FullAccess policies attached.
Assign permission sets to the AWS account and AWS SSO users:
- Assign the PowerUserAccess and AuditAccess permission sets to Frank Infosec, to provide the required access to the ExampleOrgTest and ExampleOrgDev accounts, respectively.
- Assign the EC2-S3-FullAccess permission set to Alice Developer, to provide the required permissions to the ExampleOrgDev account.
Retrieve the assigned permissions by using the account assignment APIs for audit and governance purposes.
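
The target state of this flow can be summarized as data. The following sketch restates the Example.com scenario as two small structures and a lookup helper; the names PERMISSION_SETS, ASSIGNMENTS, and policies_for are hypothetical and exist only to make the mapping explicit.

```python
# Permission set name -> AWS managed policies attached to it.
PERMISSION_SETS = {
    "PowerUserAccess": ["PowerUserAccess"],
    "AuditAccess": ["SecurityAudit"],
    "EC2-S3-FullAccess": ["AmazonEC2FullAccess", "AmazonS3FullAccess"],
}

# (user, account, permission set) assignments from the scenario.
ASSIGNMENTS = [
    ("Frank Infosec", "ExampleOrgTest", "PowerUserAccess"),
    ("Frank Infosec", "ExampleOrgDev", "AuditAccess"),
    ("Alice Developer", "ExampleOrgDev", "EC2-S3-FullAccess"),
]


def policies_for(user, account):
    """Return the managed policies a user receives in an account."""
    policies = []
    for assigned_user, assigned_account, permission_set in ASSIGNMENTS:
        if (assigned_user, assigned_account) == (user, account):
            policies.extend(PERMISSION_SETS[permission_set])
    return policies
```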

Note: AWS SSO permission sets can contain either AWS managed policies or custom policies that are stored in AWS SSO. In this blog post, we attach AWS managed policies to the AWS SSO permission sets for simplicity. To help secure your AWS resources, follow the standard security advice of granting least privilege access by using an AWS SSO custom policy when you create a permission set.

Figure 2: AWS Organizations accounts access for Alice and Frank

To help simplify administration of access permissions, we recommend that you assign access directly to groups rather than to individual users. With groups, you can grant or deny permissions to groups of users, rather than having to apply those permissions to each individual. For simplicity, in this blog you’ll assign permissions directly to the users.

Prerequisites

Before you start this walkthrough, complete these steps:

Identify the AWS accounts to which you want to grant AWS SSO access, and add them to your organization. To learn more, see Managing the AWS accounts in your organization.
Get the permissions that are required to use the AWS SSO console. To learn more, see Permissions Required to Use the AWS SSO Console.
Sign in to the AWS Management Console for the AWS Organizations management account with AWS SSO administrator credentials. To learn more about AWS Organizations and the management account, see AWS Organizations FAQs.
Enable AWS SSO for your AWS Organizations structure. To learn more, see Enable AWS SSO.
Have your users and groups provisioned in AWS SSO. You can manage your users and groups in the AWS SSO internal identity store, connect AWS SSO to your Microsoft Active Directory, or integrate with an external identity provider by using SAML 2.0 and SCIM 2.0. To learn more about AWS SSO identity store options, see Manage Your Identity Source.

Use the AWS SSO API from the AWS CLI

In order to call the AWS SSO account assignment API by using the AWS CLI, you need to install and configure AWS CLI v2. For more information about AWS CLI installation and configuration, see Installing the AWS CLI and Configuring the AWS CLI.

Step 1: Create permission sets

In this step, you learn how to create the EC2-S3-FullAccess, AuditAccess, and PowerUserAccess permission sets in AWS SSO from the AWS CLI.

Before you create the permission sets, run the following command to get the Amazon Resource Name (ARN) of the AWS SSO instance and the Identity Store ID, which you will need later in the process when you create and assign permission sets to AWS accounts and users or groups.

aws sso-admin list-instances

Figure 3 shows the results of running the command.

Figure 3: AWS SSO list instances
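
If you script these steps, you can read both values from the command's JSON output rather than copying them by hand. The following is a minimal sketch; the helper name parse_instances is hypothetical, and the sample ARN and identity store ID are placeholders, not real values.

```python
import json


def parse_instances(cli_output):
    """Return (instance ARN, identity store ID) pairs from list-instances output."""
    data = json.loads(cli_output)
    return [
        (instance["InstanceArn"], instance["IdentityStoreId"])
        for instance in data["Instances"]
    ]


# Placeholder output in the shape returned by "aws sso-admin list-instances".
SAMPLE_OUTPUT = """
{
    "Instances": [
        {
            "InstanceArn": "arn:aws:sso:::instance/ssoins-EXAMPLE",
            "IdentityStoreId": "d-EXAMPLE"
        }
    ]
}
"""

if __name__ == "__main__":
    for instance_arn, identity_store_id in parse_instances(SAMPLE_OUTPUT):
        print(instance_arn, identity_store_id)
```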

Next, create the permission sets for the security team (Frank) and the dev team (Alice), as follows.

Permission set for Alice Developer (EC2-S3-FullAccess)

Run the following command to create the EC2-S3-FullAccess permission set for Alice, as shown in Figure 4.

aws sso-admin create-permission-set --instance-arn '<Instance ARN>' --name 'EC2-S3-FullAccess' --description 'EC2 and S3 access for developers'

Figure 4: Creating the permission set EC2-S3-FullAccess

Permission set for Frank Infosec (AuditAccess)

Run the following command to create the AuditAccess permission set for Frank, as shown in Figure 5.

aws sso-admin create-permission-set --instance-arn '<Instance ARN>' --name 'AuditAccess' --description 'Audit Access for security team on ExampleOrgDev account'

Figure 5: Creating the permission set AuditAccess

Permission set for Frank Infosec (PowerUserAccess)

Run the following command to create the PowerUserAccess permission set for Frank, as shown in Figure 6.

aws sso-admin create-permission-set --instance-arn '<Instance ARN>' --name 'PowerUserAccess' --description 'Power User Access for security team on ExampleOrgTest account'

Figure 6: Creating the permission set PowerUserAccess

Copy the permission set ARN from these responses, which you will need when you attach the managed policies.
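
The three commands differ only in name and description, so they can be generated from a small table. This sketch builds the argument lists and extracts the permission set ARN from a response; the helper names are hypothetical, and the commands are only printed here, not run. To execute one, you could pass the list to subprocess.run with an installed and configured AWS CLI.

```python
import json
import shlex


def create_permission_set_argv(instance_arn, name, description):
    """Build the argument list for one create-permission-set call."""
    return [
        "aws", "sso-admin", "create-permission-set",
        "--instance-arn", instance_arn,
        "--name", name,
        "--description", description,
    ]


def permission_set_arn(cli_output):
    """Extract the permission set ARN from a create-permission-set response."""
    return json.loads(cli_output)["PermissionSet"]["PermissionSetArn"]


if __name__ == "__main__":
    descriptions = {
        "EC2-S3-FullAccess": "EC2 and S3 access for developers",
        "AuditAccess": "Audit Access for security team on ExampleOrgDev account",
    }
    for name, description in descriptions.items():
        argv = create_permission_set_argv("<Instance ARN>", name, description)
        print(shlex.join(argv))
```

Passing arguments as a list avoids shell quoting problems with descriptions that contain spaces.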

Step 2: Assign policies to permission sets

In this step, you learn how to assign managed policies to the permission sets that you created in step 1.

Attach policies to the EC2-S3-FullAccess permission set

Run the following command to attach the AmazonEC2FullAccess AWS managed policy to the EC2-S3-FullAccess permission set, as shown in Figure 7.

aws sso-admin attach-managed-policy-to-permission-set --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>' --managed-policy-arn 'arn:aws:iam::aws:policy/AmazonEC2FullAccess'

Figure 7: Attaching the AWS managed policy AmazonEC2FullAccess to the EC2-S3-FullAccess permission set

Run the following command to attach the AmazonS3FullAccess AWS managed policy to the EC2-S3-FullAccess permission set, as shown in Figure 8.

aws sso-admin attach-managed-policy-to-permission-set --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>' --managed-policy-arn 'arn:aws:iam::aws:policy/AmazonS3FullAccess'

Figure 8: Attaching the AWS managed policy AmazonS3FullAccess to the EC2-S3-FullAccess permission set

Attach a policy to the AuditAccess permission set

Run the following command to attach the SecurityAudit managed policy to the AuditAccess permission set that you created earlier, as shown in Figure 9.

aws sso-admin attach-managed-policy-to-permission-set --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>' --managed-policy-arn 'arn:aws:iam::aws:policy/SecurityAudit'

Figure 9: Attaching the AWS managed policy SecurityAudit to the AuditAccess permission set

Attach a policy to the PowerUserAccess permission set

The following command is similar to the previous command; it attaches the PowerUserAccess managed policy to the PowerUserAccess permission set, as shown in Figure 10.

aws sso-admin attach-managed-policy-to-permission-set --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>' --managed-policy-arn 'arn:aws:iam::aws:policy/PowerUserAccess'

Figure 10: Attaching AWS managed policy PowerUserAccess to the PowerUserAccess permission set
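
The same attachments can be made from code. The sketch below assumes you pass in an AWS SSO admin client, for example one created with boto3.client("sso-admin") (boto3 and credentials are not included here); the helper name attach_policies is hypothetical.

```python
MANAGED_POLICY_PREFIX = "arn:aws:iam::aws:policy/"


def attach_policies(sso_admin, instance_arn, permission_set_arn, policy_names):
    """Attach each named AWS managed policy to one permission set."""
    attached = []
    for name in policy_names:
        policy_arn = MANAGED_POLICY_PREFIX + name
        sso_admin.attach_managed_policy_to_permission_set(
            InstanceArn=instance_arn,
            PermissionSetArn=permission_set_arn,
            ManagedPolicyArn=policy_arn,
        )
        attached.append(policy_arn)
    return attached
```

For the EC2-S3-FullAccess permission set, policy_names would be ["AmazonEC2FullAccess", "AmazonS3FullAccess"].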

In the next step, you assign users (Frank Infosec and Alice Developer) to their respective permission sets and assign permission sets to accounts.

Step 3: Assign permission sets to users and groups and grant access to AWS accounts

In this step, you assign the AWS SSO permission sets you created to users and groups and AWS accounts, to grant the required access for these users and groups on respective AWS accounts.

To assign access to an AWS account for a user or group, using a permission set you already created, you need the following:

The principal ID (the ID for the user or group)
The AWS account ID to which you need to assign this permission set

To obtain a user’s or group’s principal ID (UserID or GroupID), you need to use the AWS SSO Identity Store API. The AWS SSO Identity Store service enables you to retrieve all of your identities (users and groups) from AWS SSO. See AWS SSO Identity Store API for more details.

Use the first two commands shown here to get the principal ID for the two users, Alice (Alice’s user name is [email protected]) and Frank (Frank’s user name is [email protected]).

Alice’s user ID

Run the following command to get Alice’s user ID, as shown in Figure 11.

aws identitystore list-users --identity-store-id '<Identity Store ID>' --filter AttributePath='UserName',AttributeValue='[email protected]'

Figure 11: Retrieving Alice’s user ID

Frank’s user ID

Run the following command to get Frank’s user ID, as shown in Figure 12.

aws identitystore list-users --identity-store-id '<Identity Store ID>' --filter AttributePath='UserName',AttributeValue='[email protected]'

Figure 12: Retrieving Frank’s user ID

Note: To get the principal ID for a group, use the following command.
aws identitystore list-groups --identity-store-id '<Identity Store ID>' --filter AttributePath='DisplayName',AttributeValue='<Group Name>'

Assign the EC2-S3-FullAccess permission set to Alice in the ExampleOrgDev account

Run the following command to assign Alice access to the ExampleOrgDev account using the EC2-S3-FullAccess permission set. This will give Alice full access to Amazon EC2 and S3 services in the ExampleOrgDev account.

Note: When you call the CreateAccountAssignment API, AWS SSO automatically provisions the specified permission set on the account in the form of an IAM policy attached to the AWS SSO–created IAM role. This role is immutable: it’s fully managed by AWS SSO, and it cannot be deleted or changed by the user even if the user has full administrative rights on the account. If the permission set is subsequently updated, the corresponding IAM policies attached to roles in your accounts won’t be updated automatically. In this case, you will need to call ProvisionPermissionSet to propagate these updates.

aws sso-admin create-account-assignment --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>' --principal-id '<user/group ID>' --principal-type '<USER/GROUP>' --target-id '<AWS Account ID>' --target-type AWS_ACCOUNT

Figure 13: Assigning the EC2-S3-FullAccess permission set to Alice on the ExampleOrgDev account

Assign the AuditAccess permission set to Frank Infosec in the ExampleOrgDev account

Run the following command to assign Frank access to the ExampleOrgDev account using the AuditAccess permission set.

aws sso-admin create-account-assignment --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>' --principal-id '<user/group ID>' --principal-type '<USER/GROUP>' --target-id '<AWS Account ID>' --target-type AWS_ACCOUNT

Figure 14: Assigning the AuditAccess permission set to Frank on the ExampleOrgDev account

Assign the PowerUserAccess permission set to Frank Infosec in the ExampleOrgTest account

Run the following command to assign Frank access to the ExampleOrgTest account using the PowerUserAccess permission set.

aws sso-admin create-account-assignment --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>' --principal-id '<user/group ID>' --principal-type '<USER/GROUP>' --target-id '<AWS Account ID>' --target-type AWS_ACCOUNT

Figure 15: Assigning the PowerUserAccess permission set to Frank on the ExampleOrgTest account
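
The three assignment commands above differ only in the values substituted for the placeholders. As a rough illustration, the following Java sketch assembles the argument list for each call so the mapping of principal, permission set, and account is explicit. The class and method names are hypothetical, the angle-bracket values are placeholders that you would replace with your own ARNs and IDs, and the sketch only prints the commands rather than running them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AssignmentCommands {
    // Builds the argument list for one create-account-assignment call,
    // using only the flags shown in the commands above.
    static List<String> createAccountAssignment(String instanceArn, String permissionSetArn,
            String principalId, String principalType, String accountId) {
        return Arrays.asList(
                "aws", "sso-admin", "create-account-assignment",
                "--instance-arn", instanceArn,
                "--permission-set-arn", permissionSetArn,
                "--principal-id", principalId,
                "--principal-type", principalType,
                "--target-id", accountId,
                "--target-type", "AWS_ACCOUNT");
    }

    public static void main(String[] args) {
        String instanceArn = "<Instance ARN>"; // placeholder
        List<List<String>> commands = new ArrayList<>();
        // Alice: EC2-S3-FullAccess on ExampleOrgDev
        commands.add(createAccountAssignment(instanceArn, "<EC2-S3-FullAccess ARN>",
                "<Alice user ID>", "USER", "<ExampleOrgDev account ID>"));
        // Frank: AuditAccess on ExampleOrgDev
        commands.add(createAccountAssignment(instanceArn, "<AuditAccess ARN>",
                "<Frank user ID>", "USER", "<ExampleOrgDev account ID>"));
        // Frank: PowerUserAccess on ExampleOrgTest
        commands.add(createAccountAssignment(instanceArn, "<PowerUserAccess ARN>",
                "<Frank user ID>", "USER", "<ExampleOrgTest account ID>"));

        for (List<String> command : commands) {
            System.out.println(String.join(" ", command));
        }
    }
}
```

Keeping the arguments as a list rather than one string avoids shell quoting issues if you later pass them to java.lang.ProcessBuilder.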

To view the permission sets provisioned on the AWS account, run the following command, as shown in Figure 16.

aws sso-admin list-permission-sets-provisioned-to-account --instance-arn '<Instance ARN>' --account-id '<AWS Account ID>'

Figure 16: View the permission sets (AuditAccess and EC2-S3-FullAccess) assigned to the ExampleOrgDev account

To review the created resources in the AWS Management Console, navigate to the AWS SSO console. In the list of permission sets on the AWS accounts tab, choose the EC2-S3-FullAccess permission set. Under AWS managed policies, the policies attached to the permission set are listed, as shown in Figure 17.

Figure 17: Review the permission set in the AWS SSO console

To see the AWS accounts where the EC2-S3-FullAccess permission set is currently provisioned, navigate to the AWS accounts tab, as shown in Figure 18.

Figure 18: Review permission set account assignment in the AWS SSO console

Step 4: Audit access

In this step, you learn how to audit access assigned to your users and groups by using the AWS SSO account assignment API. In this example, you’ll start from a permission set, review the permissions (AWS-managed policies or a custom policy) attached to the permission set, get the users and groups associated with the permission set, and see which AWS accounts the permission set is provisioned to.

List the IAM managed policies for the permission set

Run the following command to list the IAM managed policies that are attached to a specified permission set, as shown in Figure 19.

aws sso-admin list-managed-policies-in-permission-set --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>'

Figure 19: View the managed policies attached to the permission set

List the assignee of the AWS account with the permission set

Run the following command to list the assignee (the user or group with the respective principal ID) of the specified AWS account with the specified permission set, as shown in Figure 20.

aws sso-admin list-account-assignments --instance-arn '<Instance ARN>' --account-id '<Account ID>' --permission-set-arn '<Permission Set ARN>'

Figure 20: View the permission set and the user or group attached to the AWS account

List the accounts to which the permission set is provisioned

Run the following command to list the accounts that are associated with a specific permission set, as shown in Figure 21.

aws sso-admin list-accounts-for-provisioned-permission-set --instance-arn '<Instance ARN>' --permission-set-arn '<Permission Set ARN>'

Figure 21: View AWS accounts to which the permission set is provisioned

In this section of the post, we’ve illustrated how to create a permission set, assign a managed policy to the permission set, and grant access for AWS SSO users or groups to AWS accounts by using this permission set. In the next section, we’ll show you how to do the same using AWS CloudFormation.

Use the AWS SSO API through AWS CloudFormation

In this section, you learn how to use CloudFormation templates to automate the creation of permission sets, attach managed policies, and use permission sets to assign access for a particular user or group to AWS accounts.

Sign in to your AWS Management Console and create a CloudFormation stack by using the following CloudFormation template. For more information on how to create a CloudFormation stack, see Creating a stack on the AWS CloudFormation console.

//start of Template//
{
    "AWSTemplateFormatVersion": "2010-09-09",
  
    "Description": "AWS CloudFormation template to automate multi-account access with AWS Single Sign-On (Entitlement APIs): Create permission sets, assign access for AWS SSO users and groups to AWS accounts using permission sets. Before you use this template, we assume you have enabled AWS SSO for your AWS Organization, added the AWS accounts to which you want to grant AWS SSO access to your organization, signed in to the AWS Management Console with your AWS Organizations management account credentials, and have the required permissions to use the AWS SSO console.",
  
    "Parameters": {
      "InstanceARN" : {
        "Type" : "String",
        "AllowedPattern": "arn:aws:sso:::instance/(sso)?ins-[a-zA-Z0-9-.]{16}",
        "Description" : "Enter AWS SSO InstanceARN. Ex: arn:aws:sso:::instance/ssoins-xxxxxxxxxxxxxxxx",
        "ConstraintDescription": "must be the name of an existing AWS SSO InstanceARN associated with the management account."
      },
      "ExampleOrgDevAccountId" : {
        "Type" : "String",
        "AllowedPattern": "\\d{12}",
        "Description" : "Enter 12-digit Developer AWS Account ID. Ex: 123456789012"
        },
      "ExampleOrgTestAccountId" : {
        "Type" : "String",
        "AllowedPattern": "\\d{12}",
        "Description" : "Enter 12-digit AWS Account ID. Ex: 123456789012"
        },
      "AliceDeveloperUserId" : {
        "Type" : "String",
        "AllowedPattern": "^([0-9a-f]{10}-|)[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}$",
        "Description" : "Enter Developer UserId. Ex: 926703446b-f10fac16-ab5b-45c3-86c1-xxxxxxxxxxxx"
        },
        "FrankInfosecUserId" : {
            "Type" : "String",
            "AllowedPattern": "^([0-9a-f]{10}-|)[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}$",
            "Description" : "Enter Test UserId. Ex: 926703446b-f10fac16-ab5b-45c3-86c1-xxxxxxxxxxxx"
            }
    },
    "Resources": {
        "EC2S3Access": {
            "Type" : "AWS::SSO::PermissionSet",
            "Properties" : {
                "Description" : "EC2 and S3 access for developers",
                "InstanceArn" : {
                    "Ref": "InstanceARN"
                },
                "ManagedPolicies" : ["arn:aws:iam::aws:policy/AmazonEC2FullAccess","arn:aws:iam::aws:policy/AmazonS3FullAccess"],
                "Name" : "EC2-S3-FullAccess",
                "Tags" : [ {
                    "Key": "Name",
                    "Value": "EC2S3Access"
                 } ]
              }
        },  
        "SecurityAuditAccess": {
            "Type" : "AWS::SSO::PermissionSet",
            "Properties" : {
                "Description" : "Audit Access for Infosec team",
                "InstanceArn" : {
                    "Ref": "InstanceARN"
                },
                "ManagedPolicies" : [ "arn:aws:iam::aws:policy/SecurityAudit" ],
                "Name" : "AuditAccess",
                "Tags" : [ {
                    "Key": "Name",
                    "Value": "SecurityAuditAccess"
                 } ]
              }
        },    
        "PowerUserAccess": {
            "Type" : "AWS::SSO::PermissionSet",
            "Properties" : {
                "Description" : "Power User Access for Infosec team",
                "InstanceArn" : {
                    "Ref": "InstanceARN"
                },
                "ManagedPolicies" : [ "arn:aws:iam::aws:policy/PowerUserAccess"],
                "Name" : "PowerUserAccess",
                "Tags" : [ {
                    "Key": "Name",
                    "Value": "PowerUserAccess"
                 } ]
              }      
        },
        "EC2S3userAssignment": {
            "Type" : "AWS::SSO::Assignment",
            "Properties" : {
                "InstanceArn" : {
                    "Ref": "InstanceARN"
                },
                "PermissionSetArn" : {
                    "Fn::GetAtt": [
                        "EC2S3Access",
                        "PermissionSetArn"
                     ]
                },
                "PrincipalId" : {
                    "Ref": "AliceDeveloperUserId"
                },
                "PrincipalType" : "USER",
                "TargetId" : {
                    "Ref": "ExampleOrgDevAccountId"
                },
                "TargetType" : "AWS_ACCOUNT"
              }
          },
          "SecurityAudituserAssignment": {
            "Type" : "AWS::SSO::Assignment",
            "Properties" : {
                "InstanceArn" : {
                    "Ref": "InstanceARN"
                },
                "PermissionSetArn" : {
                    "Fn::GetAtt": [
                        "SecurityAuditAccess",
                        "PermissionSetArn"
                     ]
                },
                "PrincipalId" : {
                    "Ref": "FrankInfosecUserId"
                },
                "PrincipalType" : "USER",
                "TargetId" : {
                    "Ref": "ExampleOrgDevAccountId"
                },
                "TargetType" : "AWS_ACCOUNT"
              }
          },
          "PowerUserAssignment": {
            "Type" : "AWS::SSO::Assignment",
            "Properties" : {
                "InstanceArn" : {
                    "Ref": "InstanceARN"
                },
                "PermissionSetArn" : {
                    "Fn::GetAtt": [
                        "PowerUserAccess",
                        "PermissionSetArn"
                     ]
                },
                "PrincipalId" : {
                    "Ref": "FrankInfosecUserId"
                },
                "PrincipalType" : "USER",
                "TargetId" : {
                    "Ref": "ExampleOrgTestAccountId"
                },
                "TargetType" : "AWS_ACCOUNT"
              }
          }
    }
}
//End of Template//

When you create the stack, provide the following information for setting the example permission sets for Frank Infosec and Alice Developer, as shown in Figure 22:

The Alice Developer and Frank Infosec user IDs
The ExampleOrgDev and ExampleOrgTest account IDs
The AWS SSO instance ARN

Then launch the CloudFormation stack.

Figure 22: User inputs to launch the CloudFormation template
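
Before launching the stack, it can help to check your inputs against the AllowedPattern expressions in the template's Parameters section, because CloudFormation rejects a parameter value that the pattern does not fully match. The following Java sketch copies those three patterns and tests sample values with java.util.regex. The class name, method name, and sample values are hypothetical and only for illustration.

```java
import java.util.regex.Pattern;

class StackParameterCheck {
    // Patterns copied from the AllowedPattern entries in the template above.
    static final Pattern INSTANCE_ARN =
            Pattern.compile("arn:aws:sso:::instance/(sso)?ins-[a-zA-Z0-9-.]{16}");
    static final Pattern ACCOUNT_ID = Pattern.compile("\\d{12}");
    static final Pattern USER_ID = Pattern.compile(
            "^([0-9a-f]{10}-|)[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}"
            + "-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}$");

    // matches() succeeds only if the whole value fits the pattern.
    static boolean isValid(Pattern pattern, String value) {
        return value != null && pattern.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Made-up sample values, not real identifiers.
        String instanceArn = "arn:aws:sso:::instance/ssoins-0123456789abcdef";
        String accountId = "123456789012";
        String userId = "926703446b-f10fac16-ab5b-45c3-86c1-0123456789ab";

        System.out.println("InstanceARN ok: " + isValid(INSTANCE_ARN, instanceArn));
        System.out.println("Account ID ok:  " + isValid(ACCOUNT_ID, accountId));
        System.out.println("User ID ok:     " + isValid(USER_ID, userId));
    }
}
```

Note that the example IDs in the template descriptions end in a run of x characters. Those are placeholders, and because x is not a hexadecimal digit, a value copied verbatim from the description fails the user ID pattern.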

AWS CloudFormation creates the resources that are shown in Figure 23.

Figure 23: Resources created from the CloudFormation stack

Cleanup

To delete the resources you created by using the AWS CLI, use these commands.

Run the following command to delete the account assignment.

aws sso-admin delete-account-assignment --instance-arn '<Instance ARN>' --target-id '<AWS Account ID>' --target-type 'AWS_ACCOUNT' --permission-set-arn '<PermissionSet ARN>' --principal-type '<USER/GROUP>' --principal-id '<user/group ID>'

After the account assignment is deleted, run the following command to delete the permission set.

aws sso-admin delete-permission-set --instance-arn '<Instance ARN>' --permission-set-arn '<PermissionSet ARN>'

To delete the resources that you created by using the CloudFormation template, go to the AWS CloudFormation console. Select the stack you created, and then choose Delete. Deleting the CloudFormation stack cleans up the resources that were created.

Summary

In this blog post, we showed how to use the AWS SSO account assignment API to automate the deployment of permission sets, how to add managed policies to permission sets, and how to assign access for AWS SSO users and groups to AWS accounts by using specified permission sets.

To learn more about the AWS SSO APIs available for you, see the AWS Single Sign-On API Reference Guide.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS SSO forum or contact AWS Support.


Resource leak detection in Amazon CodeGuru Reviewer

2021-01-14 Pranav Garg

Post Syndicated from Pranav Garg original https://aws.amazon.com/blogs/devops/resource-leak-detection-in-amazon-codeguru/

This post discusses the resource leak detector for Java in Amazon CodeGuru Reviewer. CodeGuru Reviewer automatically analyzes pull requests (created in supported repositories such as AWS CodeCommit, GitHub, GitHub Enterprise, and Bitbucket) and generates recommendations for improving code quality. For more information, see Automating code reviews and application profiling with Amazon CodeGuru. This blog does not describe the resource leak detector for Python programs that is now available in preview.

What are resource leaks?

Resources are objects with a limited availability within a computing system. These typically include objects managed by the operating system, such as file handles, database connections, and network sockets. Because the number of such resources in a system is limited, an application must release them as soon as it no longer needs them. Otherwise, you will run out of resources and you won’t be able to allocate new ones. The paradigm of acquiring a resource and releasing it is also followed by other categories of objects such as metric wrappers and timers.

Resource leaks are bugs that arise when a program doesn’t release the resources it has acquired. Resource leaks can lead to resource exhaustion. In the worst case, they can cause the system to slow down or even crash.

Starting with Java 7, most classes holding resources implement the java.lang.AutoCloseable interface and provide a close() method to release them. However, a close() call in source code doesn’t guarantee that the resource is released along all program execution paths. For example, in the following sample code, resource r is acquired by calling its constructor and is closed along the path corresponding to the if branch, shown using green arrows. To ensure that the acquired resource doesn’t leak, you must also close r along the path corresponding to the else branch (the path shown using red arrows).

A resource must be closed along all execution paths to prevent resource leaks
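
The shape described above can be sketched in code. The following is a minimal illustration, not taken from the original figure: TrackedResource is a hypothetical AutoCloseable that counts how many instances are still open, leaky() closes r only on the if branch, and fixed() uses a try-with-resources statement so that r is closed on every path.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource that counts how many instances are still open.
class TrackedResource implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();

    TrackedResource() {
        OPEN.incrementAndGet();      // resource acquired
    }

    void use(String path) {
        // Placeholder for real work with the resource.
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();      // resource released
    }
}

class BranchLeakDemo {
    // Leaky: r is closed only when the if branch is taken.
    static void leaky(boolean flag) {
        TrackedResource r = new TrackedResource();
        if (flag) {
            r.use("if branch");
            r.close();
        } else {
            r.use("else branch");    // r is never closed on this path
        }
    }

    // Fixed: try-with-resources closes r on every path, including exceptions.
    static void fixed(boolean flag) {
        try (TrackedResource r = new TrackedResource()) {
            if (flag) {
                r.use("if branch");
            } else {
                r.use("else branch");
            }
        }
    }

    public static void main(String[] args) {
        leaky(true);
        leaky(false);
        System.out.println("open after leaky: " + TrackedResource.OPEN.get());

        TrackedResource.OPEN.set(0);
        fixed(true);
        fixed(false);
        System.out.println("open after fixed: " + TrackedResource.OPEN.get());
    }
}
```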

Often, resource leaks manifest themselves along code paths that aren’t frequently run, or under a heavy system load, or after the system has been running for a long time. As a result, such leaks are latent and can remain dormant in source code for long periods of time before manifesting themselves in production environments. This is the primary reason why resource leak bugs are difficult to detect or replicate during testing, and why automatically detecting these bugs during pull requests and code scans is important.

Detecting resource leaks in CodeGuru Reviewer

For this post, we consider the following Java code snippet. In this code, method getConnection() attempts to create a connection in the connection pool associated with a data source. Typically, a connection pool limits the maximum number of connections that can remain open at any given time. As a result, you must close connections after their use so as to not exhaust this limit.

 1     private Connection getConnection(final BasicDataSource dataSource, ...)
               throws ValidateConnectionException, SQLException {
 2         boolean connectionAcquired = false;
 3         // Retrying three times to get the connection.
 4         for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
 5             Connection connection = dataSource.getConnection();
 6             // validateConnection may throw ValidateConnectionException
 7             if (! validateConnection(connection, ...)) {
 8                 // connection is invalid
 9                 DbUtils.closeQuietly(connection);
10             } else {
11                 // connection is established
12                 connectionAcquired = true;
13                 return connection;
14             }
15         }
16         return null;
17     }

At first glance, it seems that the method getConnection() doesn’t leak connection resources. If a valid connection is established in the connection pool (else branch on line 10 is taken), the method getConnection() returns it to the client for use (line 13). If the connection established is invalid (if branch on line 7 is taken), it’s closed in line 9 before another attempt is made to establish a connection.

However, method validateConnection() at line 7 can throw a ValidateConnectionException. If this exception is thrown after a connection is established at line 5, the connection is neither closed in this method nor is it returned upstream to the client to be closed later. Furthermore, if this exceptional code path runs frequently, for instance, if the validation logic throws on a specific recurring service request, each new request causes a connection to leak in the connection pool. Eventually, the client can’t acquire new connections to the data source, impacting the availability of the service.
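
To make the exhaustion scenario concrete, here is a small simulation under stated assumptions. TinyPool and PooledConnection are hypothetical stand-ins for a real connection pool (a semaphore models the maximum number of open connections), and validate() always throws to mimic a recurring failing request. None of these names come from the code under review.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a connection pool with a fixed number of slots.
class TinyPool {
    private final Semaphore slots;

    TinyPool(int maxOpen) {
        slots = new Semaphore(maxOpen);
    }

    // Hands out a connection, or throws if every slot is already taken.
    PooledConnection acquire() {
        if (!slots.tryAcquire()) {
            throw new IllegalStateException("pool exhausted");
        }
        return new PooledConnection(slots);
    }

    int available() {
        return slots.availablePermits();
    }
}

class PooledConnection implements AutoCloseable {
    private final Semaphore slots;
    private boolean closed;

    PooledConnection(Semaphore slots) {
        this.slots = slots;
    }

    @Override
    public void close() {
        if (!closed) {               // closing twice frees the slot only once
            closed = true;
            slots.release();
        }
    }
}

class ExhaustionDemo {
    // Same shape as the leak: validation throws after the connection is
    // acquired, and nothing on that path closes it.
    static void handleRequestLeaky(TinyPool pool) {
        PooledConnection connection = pool.acquire();
        validate(connection);
        connection.close();          // skipped whenever validate() throws
    }

    // Always fails in this sketch, to mimic a recurring bad request.
    static void validate(PooledConnection connection) {
        throw new IllegalArgumentException("validation failed");
    }

    public static void main(String[] args) {
        TinyPool pool = new TinyPool(3);
        for (int request = 1; request <= 4; request++) {
            try {
                handleRequestLeaky(pool);
            } catch (RuntimeException e) {
                System.out.println("request " + request + ": " + e.getMessage()
                        + ", free slots = " + pool.available());
            }
        }
    }
}
```

With a pool of three slots, the first three requests each fail validation and leak one slot. The fourth request can no longer acquire a connection at all.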

A typical recommendation to prevent resource leak bugs is to declare the resource objects in a try-with-resources statement block. However, we can’t use try-with-resources to fix the preceding method because this method is required to return an open connection for use in the upstream client. The CodeGuru Reviewer recommendation for the preceding code snippet is as follows:

“Consider closing the following resource: connection. The resource is referenced at line 7. The resource is closed at line 9. The resource is returned at line 13. There are other execution paths that don’t close the resource or return it, for example, when validateConnection throws an exception. To prevent this resource leak, close connection along these other paths before you exit this method.”

As mentioned in the Reviewer recommendation, to prevent this resource leak, you must close the established connection when method validateConnection() throws an exception. This can be achieved by inserting the validation logic (lines 7–14) in a try block. In the finally block associated with this try, the connection must be closed by calling DbUtils.closeQuietly(connection) if connectionAcquired == false. The method getConnection() after this fix has been applied is as follows:

private Connection getConnection(final BasicDataSource dataSource, ...) 
        throws ValidateConnectionException, SQLException {
    boolean connectionAcquired = false;
    // Retrying three times to get the connection.
    for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
        Connection connection = dataSource.getConnection();
        try {
            // validateConnection may throw ValidateConnectionException
            if (! validateConnection(connection, ...)) {
                // connection is invalid
                DbUtils.closeQuietly(connection);
            } else {
                // connection is established
                connectionAcquired = true;
                return connection;
            }
        } finally {
            if (!connectionAcquired) {
                DbUtils.closeQuietly(connection);
            }
        }
    }
    return null;
}
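
Because the listing above elides some arguments, it can't be compiled as is. The following self-contained variant is a sketch for experimenting with the exception path, not the original code: FakeConnection, open(), and the failValidation switch are hypothetical stand-ins for the real connection, data source, and validation logic. It also differs from the listing in two small ways: the connectionAcquired flag is scoped to each attempt, and the connection is closed only in the finally block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for java.sql.Connection that records whether it was closed.
class FakeConnection {
    boolean closed;

    void close() {
        closed = true;
    }
}

class ValidateConnectionException extends Exception {
    ValidateConnectionException(String message) {
        super(message);
    }
}

class GetConnectionSketch {
    static final int CONNECTION_RETRIES = 3;
    // Every connection handed out, so callers can check for leaks afterwards.
    static final List<FakeConnection> HANDED_OUT = new ArrayList<>();

    static FakeConnection open() {
        FakeConnection connection = new FakeConnection();
        HANDED_OUT.add(connection);
        return connection;
    }

    // Hypothetical validator: throws on request, otherwise reports "valid".
    static boolean validateConnection(FakeConnection connection, boolean fail)
            throws ValidateConnectionException {
        if (fail) {
            throw new ValidateConnectionException("validation failed");
        }
        return true;
    }

    static FakeConnection getConnection(boolean failValidation)
            throws ValidateConnectionException {
        for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
            boolean connectionAcquired = false;
            FakeConnection connection = open();
            try {
                if (validateConnection(connection, failValidation)) {
                    connectionAcquired = true;
                    return connection;   // the caller now owns the open connection
                }
            } finally {
                // Runs on the invalid path and on the exception path alike.
                if (!connectionAcquired) {
                    connection.close();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        try {
            getConnection(true);
        } catch (ValidateConnectionException e) {
            System.out.println("closed on exception path: " + HANDED_OUT.get(0).closed);
        }
    }
}
```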

As shown in this example, resource leaks in production services can be very disruptive. Furthermore, leaks that manifest along exceptional or less frequently run code paths can be hard to detect or replicate during testing and can remain dormant in the code for long periods of time before manifesting themselves in production environments. With the resource leak detector, you can detect such leaks on objects belonging to a large number of popular Java types, such as file streams, database connections, network sockets, timers, and metrics.

Combining static code analysis with machine learning for accurate resource leak detection

In this section, we dive deep into the inner workings of the resource leak detector. The resource leak detector in CodeGuru Reviewer uses static analysis algorithms and techniques. Static analysis algorithms analyze code without running it. These algorithms are generally prone to a high rate of false positives (the tool might report correct code as having a bug). A high false positive rate can lead to alarm fatigue and low adoption of the tool. As a result, the resource leak detector in CodeGuru Reviewer prioritizes precision over recall: the findings we surface are resource leaks with high accuracy, though CodeGuru Reviewer could potentially miss some resource leaks.

The main reason for false positives in static code analysis is incomplete information available to the analysis. CodeGuru Reviewer requires only the Java source files and doesn’t require all dependencies or the build artifacts. Not requiring the external dependencies or the build artifacts reduces the friction to perform automated code reviews. As a result, static analysis only has access to the code in the source repository and doesn’t have access to its external dependencies. The resource leak detector in CodeGuru Reviewer combines static code analysis with a machine learning (ML) model. This ML model is used to reason about external dependencies to provide accurate recommendations.

To understand the use of the ML model, consider again the code above for method getConnection() that had a resource leak. In the code snippet, a connection to the data source is established by calling BasicDataSource.getConnection() method, declared in the Apache Commons library. As mentioned earlier, we don’t require the source code of external dependencies like the Apache library for code analysis during pull requests. Without access to the code of external dependencies, a pure static analysis-driven technique doesn’t know whether the Connection object obtained at line 5 will leak, if not closed. Similarly, it doesn’t know that DbUtils.closeQuietly() is a library function that closes the connection argument passed to it at line 9. Our detector combines static code analysis with ML that learns patterns over such external function calls from a large number of available code repositories. As a result, our resource leak detector knows that the connection doesn’t leak along the following code path:

A connection is established on line 5
Method validateConnection() returns false at line 7
DbUtils.closeQuietly() is called on line 9

This suppresses the possible false warning. At the same time, the detector knows that there is a resource leak when the connection is established at line 5, and validateConnection() throws an exception at line 7 that isn’t caught.

When we run CodeGuru Reviewer on this code snippet, it surfaces only the second leak scenario and makes an appropriate recommendation to fix this bug.

The ML model used in the resource leak detector has been trained on a large number of internal Amazon and GitHub code repositories.

Responses to the resource leak findings

Although closing an open resource in code isn’t difficult, doing so properly along all program paths is important to prevent resource leaks. This can easily be overlooked, especially along exceptional or less frequently run paths. As a result, the resource leak detector in CodeGuru Reviewer has found resource leaks at a relatively high frequency, and has alerted developers within Amazon to thousands of resource leaks before they hit production.

The resource leak detections have witnessed a high developer acceptance rate, and developer feedback towards the resource leak detector has been very positive. Some of the feedback from developers includes “Very cool, automated finding,” “Good bot :),” and “Oh man, this is cool.” Developers have also concurred that the findings are important and need to be fixed.

Conclusion

Resource leak bugs are difficult to detect or replicate during testing. They can impact the availability of production services. As a result, it’s important to automatically detect these bugs early on in the software development workflow, such as during pull requests or code scans. The resource leak detector in CodeGuru Reviewer combines static code analysis algorithms with ML to surface only the high confidence leaks. It has a high developer acceptance rate and has alerted developers within Amazon to thousands of leaks before those leaks hit production.

Key	Value
DBEndPoint	Enter the database cluster endpoint URL
DatabaseName	Enter the database name
RolePrefix	assumeRole