Tag Archives: Security, Identity & Compliance

Automatically detect and block low-volume network floods

2023-09-07 Bryan Van Hook

Post Syndicated from Bryan Van Hook original https://aws.amazon.com/blogs/security/automatically-detect-and-block-low-volume-network-floods/

In this blog post, I show you how to deploy a solution that uses AWS Lambda to automatically manage the lifecycle of Amazon VPC Network Access Control List (ACL) rules to mitigate network floods detected using Amazon CloudWatch Logs Insights and Amazon Timestream.

Application teams should consider the impact unexpected traffic floods can have on an application’s availability. Internet-facing applications can be susceptible to traffic that some distributed denial of service (DDoS) mitigation systems can’t detect. For example, hit-and-run events are a popular approach that use short-lived floods that reoccur at random intervals. Each burst is small enough to go unnoticed by mitigation systems, but still occur often enough and are large enough to be disruptive. Automatically detecting and blocking temporary sources of invalid traffic, combined with other best practices, can strengthen the resiliency of your applications and maintain customer trust.

Use resilient architectures

AWS customers can use prescriptive guidance to improve DDoS resiliency by reviewing the AWS Best Practices for DDoS Resiliency. It describes a DDoS-resilient reference architecture as a guide to help you protect your application’s availability.

The best practices above address the needs of most AWS customers; however, in this blog we cover a few outlier examples that fall outside normal guidance. Here are a few examples that might describe your situation:

You need to operate functionality that isn’t yet fully supported by an AWS managed service that takes on the responsibility of DDoS mitigation.
Migrating to an AWS managed service such as Amazon Route 53 isn’t immediately possible and you need an interim solution that mitigates risks.
Network ingress must be allowed from a wide public IP space that can’t be restricted.
You’re using public IP addresses assigned from the Amazon pool of public IPv4 addresses (which can’t be protected by AWS Shield) rather than Elastic IP addresses.
The application’s technology stack has limited or no support for horizontal scaling to absorb traffic floods.
Your HTTP workload sits behind a Network Load Balancer and can’t be protected by AWS WAF.
Network floods are disruptive but not significant enough (too infrequent or too low volume) to be detected by your managed DDoS mitigation systems.

For these situations, VPC network ACLs can be used to deny invalid traffic. Normally, the limit on rules per network ACL makes them unsuitable for handling truly distributed network floods. However, they can be effective at mitigating network floods that aren’t distributed enough or large enough to be detected by DDoS mitigation systems.

Given the dynamic nature of network traffic and the limited size of network ACLs, it helps to automate the lifecycle of network ACL rules. In the following sections, I show you a solution that uses network ACL rules to automatically detect and block infrastructure layer traffic within 2–5 minutes and automatically removes the rules when they’re no longer needed.

Detecting anomalies in network traffic

You need a way to block disruptive traffic while not impacting legitimate traffic. Anomaly detection can isolate the right traffic to block. Every workload is unique, so you need a way to automatically detect anomalies in the workload’s traffic pattern. You can determine what is normal (a baseline) and then detect statistical anomalies that deviate from the baseline. This baseline can change over time, so it needs to be calculated based on a rolling window of recent activity.

Z-scores are a common way to detect anomalies in time-series data. The process for creating a Z-score is to first calculate the average and standard deviation (a measure of how much the values are spread out) across all values over a span of time. Then for each value in the time window calculate the Z-score as follows:

Z-score = (value – average) / standard deviation

A Z-score exceeding 3.0 indicates the value is an outlier that is greater than 99.7 percent of all other values.

To calculate the Z-score for detecting network anomalies, you need to establish a time series for network traffic. This solution uses VPC flow logs to capture information about the IP traffic in your VPC. Each VPC flow log record provides a packet count that’s aggregated over a time interval. Each flow log record aggregates the number of packets over an interval of 60 seconds or less. There isn’t a consistent time boundary for each log record. This means raw flow log records aren’t a predictable way to build a time series. To address this, the solution processes flow logs into packet bins for time series values. A packet bin is the number of packets sent by a unique source IP address within a specific time window. A source IP address is considered an anomaly if any of its packet bins over the past hour exceed the Z-score threshold (default is 3.0).

When overall traffic levels are low, there might be source IP addresses with a high Z-score that aren’t a risk. To mitigate against false positives, source IP addresses are only considered to be an anomaly if the packet bin exceeds a minimum threshold (default is 12,000 packets).

Let’s review the overall solution architecture.

Solution overview

This solution, shown in Figure 1, uses VPC flow logs to capture information about the traffic reaching the network interfaces in your public subnets. CloudWatch Logs Insights queries are used to summarize the most recent IP traffic into packet bins that are stored in Timestream. The time series table is queried to identify source IP addresses responsible for traffic that meets the anomaly threshold. Anomalous source IP addresses are published to an Amazon Simple Notification Service (Amazon SNS) topic. A Lambda function receives the SNS message and decides how to update the network ACL.

Figure 1: Automating the detection and mitigation of traffic floods using network ACLs

How it works

The numbered steps that follow correspond to the numbers in Figure 1.

Capture VPC flow logs. Your VPC is configured to stream flow logs to CloudWatch Logs. To minimize cost, the flow logs are limited to particular subnets and only include log fields required by the CloudWatch query. When protecting an endpoint that spans multiple subnets (such as a Network Load Balancer using multiple availability zones), each subnet shares the same network ACL and is configured with a flow log that shares the same CloudWatch log group.
Scheduled flow log analysis. Amazon EventBridge starts an AWS Step Functions state machine on a time interval (60 seconds by default). The state machine starts a Lambda function immediately, and then again after 30 seconds. The Lambda function performs steps 3–6.
Summarize recent network traffic. The Lambda function runs a CloudWatch Logs Insights query. The query scans the most recent flow logs (5-minute window) to summarize packet frequency grouped by source IP. These groupings are called packet bins, where each bin represents the number of packets sent by a source IP within a given minute of time.
Update time series database. A time series database in Timestream is updated with the most recent packet bins.
Use statistical analysis to detect abusive source IPs. A Timestream query is used to perform several calculations. The query calculates the average bin size over the past hour, along with the standard deviation. These two values are then used to calculate the maximum Z-score for all source IPs over the past hour. This means an abusive IP will remain flagged for one hour even if it stopped sending traffic. Z-scores are sorted so that the most abusive source IPs are prioritized. If a source IP meets these two criteria, it is considered abusive.
1. Maximum Z-score exceeds a threshold (defaults to 3.0).
2. Packet bin exceeds a threshold (defaults to 12,000). This avoids flagging source IPs during periods of overall low traffic when there is no need to block traffic.
Publish anomalous source IPs. Publish a message to an Amazon SNS topic with a list of anomalous source IPs. The function also publishes CloudWatch metrics to help you track the number of unique and abusive source IPs over time. At this point, the flow log summarizer function has finished its job until the next time it’s invoked from EventBridge.
Receive anomalous source IPs. The network ACL updater function is subscribed to the SNS topic. It receives the list of anomalous source IPs.
Update the network ACL. The network ACL updater function uses two network ACLs called blue and green. This verifies that the active rules remain in place while updating the rules in the inactive network ACL. When the inactive network ACL rules are updated, the function swaps network ACLs on each subnet. By default, each network ACL has a limit of 20 rules. If the number of anomalous source IPs exceeds the network ACL limit, the source IPs with the highest Z-score are prioritized. CloudWatch metrics are provided to help you track the number of source IPs blocked, and how many source IPs couldn’t be blocked due to network ACL limits.

Prerequisites

This solution assumes you have one or more public subnets used to operate an internet-facing endpoint.

Deploy the solution

Follow these steps to deploy and validate the solution.

Download the latest release from GitHub.
Upload the AWS CloudFormation templates and Python code to an S3 bucket.
Gather the information needed for the CloudFormation template parameters.
Create the CloudFormation stack.
Monitor traffic mitigation activity using the CloudWatch dashboard.

Let’s review the steps I followed in my environment.

Step 1. Download the latest release

I create a new directory on my computer named auto-nacl-deploy. I review the releases on GitHub and choose the latest version. I download auto-nacl.zip into the auto-nacl-deploy directory. Now it’s time to stage this code in Amazon Simple Storage Service (Amazon S3).

Figure 2: Save auto-nacl.zip to the auto-nacl-deploy directory

Step 2. Upload the CloudFormation templates and Python code to an S3 bucket

I extract the auto-nacl.zip file into my auto-nacl-deploy directory.

Figure 3: Expand auto-nacl.zip into the auto-nacl-deploy directory

The template.yaml file is used to create a CloudFormation stack with four nested stacks. You copy all files to an S3 bucket prior to creating the stacks.

To stage these files in Amazon S3, use an existing bucket or create a new one. For this example, I used an existing S3 bucket named auto-nacl-us-east-1. Using the Amazon S3 console, I created a folder named artifacts and then uploaded the extracted files to it. My bucket now looks like Figure 4.

Figure 4: Upload the extracted files to Amazon S3

Step 3. Gather information needed for the CloudFormation template parameters

There are six parameters required by the CloudFormation template.

Template parameter	Parameter description
VpcId	The ID of the VPC that runs your application.
SubnetIds	A comma-delimited list of public subnet IDs used by your endpoint.
ListenerPort	The IP port number for your endpoint’s listener.
ListenerProtocol	The Internet Protocol (TCP or UDP) used by your endpoint.
SourceCodeS3Bucket	The S3 bucket that contains the files you uploaded in Step 2. This bucket must be in the same AWS Region as the CloudFormation stack.
SourceCodeS3Prefix	The S3 prefix (folder) of the files you uploaded in Step 2.

For the VpcId parameter, I use the VPC console to find the VPC ID for my application.

Figure 5: Find the VPC ID

For the SubnetIds parameter, I use the VPC console to find the subnet IDs for my application. My VPC has public and private subnets. For this solution, you only need the public subnets.

Figure 6: Find the subnet IDs

My application uses a Network Load Balancer that listens on port 80 to handle TCP traffic. I use 80 for ListenerPort and TCP for ListenerProtocol.

The next two parameters are based on the Amazon S3 location I used earlier. I use auto-nacl-us-east-1 for SourceCodeS3Bucket and artifacts for SourceCodeS3Prefix.

Step 4. Create the CloudFormation stack

I use the CloudFormation console to create a stack. The Amazon S3 URL format is https://<bucket>.s3.<region>.amazonaws.com/<prefix>/template.yaml. I enter the Amazon S3 URL for my environment, then choose Next.

Figure 7: Specify the CloudFormation template

I enter a name for my stack (for example, auto-nacl-1) along with the parameter values I gathered in Step 3. I leave all optional parameters as they are, then choose Next.

Figure 8: Provide the required parameters

I review the stack options, then scroll to the bottom and choose Next.

Figure 9: Review the default stack options

I scroll down to the Capabilities section and acknowledge the capabilities required by CloudFormation, then choose Submit.

Figure 10: Acknowledge the capabilities required by CloudFormation

I wait for the stack to reach CREATE_COMPLETE status. It takes 10–15 minutes to create all of the nested stacks.

Figure 11: Wait for the stacks to complete

Step 5. Monitor traffic mitigation activity using the CloudWatch dashboard

After the CloudFormation stacks are complete, I navigate to the CloudWatch console to open the dashboard. In my environment, the dashboard is named auto-nacl-1-MitigationDashboard-YS697LIEHKGJ.

Figure 12: Find the CloudWatch dashboard

Initially, the dashboard, shown in Figure 13, has little information to display. After an hour, I can see the following metrics from my sample environment:

The Network Traffic graph shows how many packets are allowed and rejected by network ACL rules. No anomalies have been detected yet, so this only shows allowed traffic.
The All Source IPs graph shows how many total unique source IP addresses are sending traffic.
The Anomalous Source Networks graph shows how many anomalous source networks are being blocked by network ACL rules (or not blocked due to network ACL rule limit). This graph is blank unless anomalies have been detected in the last hour.
The Anomalous Source IPs graph shows how many anomalous source IP addresses are being blocked (or not blocked) by network ACL rules. This graph is blank unless anomalies have been detected in the last hour.
The Packet Statistics graph can help you determine if the sensitivity should be adjusted. This graph shows the average packets-per-minute and the associated standard deviation over the past hour. It also shows the anomaly threshold, which represents the minimum number of packets-per-minute for a source IP address to be considered an anomaly. The anomaly threshold is calculated based on the CloudFormation parameter MinZScore.
anomaly threshold = (MinZScore * standard deviation) + average

Increasing the MinZScore parameter raises the threshold and reduces sensitivity. You can also adjust the CloudFormation parameter MinPacketsPerBin to mitigate against blocking traffic during periods of low volume, even if a source IP address exceeds the minimum Z-score.
The Blocked IPs grid shows which source IP addresses are being blocked during each hour, along with the corresponding packet bin size and Z-score. This grid is blank unless anomalies have been detected in the last hour.

Figure 13: Observe the dashboard after one hour

Let’s review a scenario to see what happens when my endpoint sees two waves of anomalous traffic.

By default, my network ACL allows a maximum of 20 inbound rules. The two default rules count toward this limit, so I only have room for 18 more inbound rules. My application sees a spike of network traffic from 20 unique source IP addresses. When the traffic spike begins, the anomaly is detected in less than five minutes. Network ACL rules are created to block the top 18 source IP addresses (sorted by Z-score). Traffic is blocked for about 5 minutes until the flood subsides. The rules remain in place for 1 hour by default. When the same 20 source IP addresses send another traffic flood a few minutes later, most traffic is immediately blocked. Some traffic is still allowed from two source IP addresses that can’t be blocked due to the limit of 18 rules.

Figure 14: Observe traffic blocked from anomalous source IP addresses

Customize the solution

You can customize the behavior of this solution to fit your use case.

Block many IP addresses per network ACL rule. To enable blocking more source IP addresses than your network ACL rule limit, change the CloudFormation parameter NaclRuleNetworkMask (default is 32). This sets the network mask used in network ACL rules and lets you block IP address ranges instead of individual IP addresses. By default, the IP address 192.0.2.1 is blocked by a network ACL rule for 192.0.2.1/32. Setting this parameter to 24 results in a network ACL rule that blocks 192.0.2.0/24. As a reminder, address ranges that are too wide might result in blocking legitimate traffic.
Only block source IPs that exceed a packet volume threshold. Use the CloudFormation parameter MinPacketsPerBin (default is 12,000) to set the minimum packets per minute. This mitigates against blocking source IPs (even if their Z-score is high) during periods of overall low traffic when there is no need to block traffic.
Adjust the sensitivity of anomaly detection. Use the CloudFormation parameter MinZScore to set the minimum Z-score for a source IP to be considered an anomaly. The default is 3.0, which only blocks source IPs with packet volume that exceeds 99.7 percent of all other source IPs.
Exclude trusted source IPs from anomaly detection. Specify an allow list object in Amazon S3 that contains a list of IP addresses or CIDRs that you want to exclude from network ACL rules. The network ACL updater function reads the allow list every time it handles an SNS message.

Limitations

As covered in the preceding sections, this solution has a few limitations to be aware of:

CloudWatch Logs queries can only return up to 10,000 records. This means the traffic baseline can only be calculated based on the observation of 10,000 unique source IP addresses per minute.
The traffic baseline is based on a rolling 1-hour window. You might need to increase this if a 1-hour window results in a baseline that allows false positives. For example, you might need a longer baseline window if your service normally handles abrupt spikes that occur hourly or daily.
By default, a network ACL can only hold 20 inbound rules. This includes the default allow and deny rules, so there’s room for 18 deny rules. You can increase this limit from 20 to 40 with a support case; however, it means that a maximum of 18 (or 38) source IP addresses can be blocked at one time.
The speed of anomaly detection is dependent on how quickly VPC flow logs are delivered to CloudWatch. This usually takes 2–4 minutes but can take over 6 minutes.

Cost considerations

CloudWatch Logs Insights queries are the main element of cost for this solution. See CloudWatch pricing for more information. The cost is about 7.70 USD per GB of flow logs generated per month.

To optimize the cost of CloudWatch queries, the VPC flow log record format only includes the fields required for anomaly detection. The CloudWatch log group is configured with a retention of 1 day. You can tune your cost by adjusting the anomaly detector function to run less frequently (the default is twice per minute). The tradeoff is that the network ACL rules won’t be updated as frequently. This can lead to the solution taking longer to mitigate a traffic flood.

Conclusion

Maintaining high availability and responsiveness is important to keeping the trust of your customers. The solution described above can help you automatically mitigate a variety of network floods that can impact the availability of your application even if you’ve followed all the applicable best practices for DDoS resiliency. There are limitations to this solution, but it can quickly detect and mitigate disruptive sources of traffic in a cost-effective manner. Your feedback is important. You can share comments below and report issues on GitHub.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

AWS achieves ISO/IEC 20000-1:2018 certification for AWS Asia Pacific (Mumbai) and (Hyderabad) Regions

2023-09-07 Airish Mariano

Post Syndicated from Airish Mariano original https://aws.amazon.com/blogs/security/aws-achieves-iso-iec-20000-12018-certification-for-aws-asia-pacific-mumbai-and-hyderabad-regions/

Amazon Web Services (AWS) is proud to announce the successful completion of the ISO/IEC 20000-1:2018 certification for the AWS Asia Pacific (Mumbai) and (Hyderabad) Regions in India.

The scope of the ISO/IEC 20000-1:2018 certification is limited to the IT Service Management System (ITSMS) of AWS India Data Center (DC) Operations that supports the delivery of Security Operations Center (SOC) and Network Operation Center (NOC) managed services.

ISO/IEC 20000-1 is a service management system (SMS) standard that specifies requirements for establishing, implementing, maintaining, and continually improving an SMS. An SMS supports the management of the service lifecycle, including the planning, design, transition, delivery, and improvement of services, which meet agreed upon requirements and deliver value for customers, users, and the organization that delivers the services.

The ISO/IEC 20000-1 certification provides an assurance that the AWS Data Center operations in India support the delivery of SOC and NOC managed services, in accordance with the ISO/IEC 20000-1 guidance and in line with the requirements of the Ministry of Electronics and Information Technology (MeitY), government of India.

An independent third-party auditor assessed AWS. Customers can download the latest ISO/IEC 20000-1:2018 certificate on AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

AWS is committed to bringing new services into the scope of its compliance programs to help you meet your architectural, business, and regulatory needs. If you have questions about the ISO/IEC 20000-1:2018 certification, contact your AWS account team.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Build streaming data pipelines with Amazon MSK Serverless and IAM authentication

2023-09-06 Marvin Gersho

Post Syndicated from Marvin Gersho original https://aws.amazon.com/blogs/big-data/build-streaming-data-pipelines-with-amazon-msk-serverless-and-iam-authentication/

Currently, MSK Serverless only directly supports IAM for authentication using Java. This example shows how to use this mechanism. Additionally, it provides a pattern creating a proxy that can easily be integrated into solutions built in languages other than Java.

The rising trend in today’s tech landscape is the use of streaming data and event-oriented structures. They are being applied in numerous ways, including monitoring website traffic, tracking industrial Internet of Things (IoT) devices, analyzing video game player behavior, and managing data for cutting-edge analytics systems.

Apache Kafka, a top-tier open-source tool, is making waves in this domain. It’s widely adopted by numerous users for building fast and efficient data pipelines, analyzing streaming data, merging data from different sources, and supporting essential applications.

Amazon’s serverless Apache Kafka offering, Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless, is attracting a lot of interest. It’s appreciated for its user-friendly approach, ability to scale automatically, and cost-saving benefits over other Kafka solutions. However, a hurdle encountered by many users is the requirement of MSK Serverless to use AWS Identity and Access Management (IAM) access control. At the time of writing, the Amazon MSK library for IAM is exclusive to Kafka libraries in Java, creating a challenge for users of other programming languages. In this post, we aim to address this issue and present how you can use Amazon API Gateway and AWS Lambda to navigate around this obstacle.

SASL/SCRAM authentication vs. IAM authentication

Compared to the traditional authentication methods like Salted Challenge Response Authentication Mechanism (SCRAM), the IAM extension into Apache Kafka through MSK Serverless provides a lot of benefits. Before we delve into those, it’s important to understand what SASL/SCRAM authentication is. Essentially, it’s a traditional method used to confirm a user’s identity before giving them access to a system. This process requires users or clients to provide a user name and password, which the system then cross-checks against stored credentials (for example, via AWS Secrets Manager) to decide whether or not access should be granted.

Compared to this approach, IAM simplifies permission management across AWS environments, enables the creation and strict enforcement of detailed permissions and policies, and uses temporary credentials rather than the typical user name and password authentication. Another benefit of using IAM is that you can use IAM for both authentication and authorization. If you use SASL/SCRAM, you have to additionally manage ACLs via a separate mechanism. In IAM, you can use the IAM policy attached to the IAM principal to define the fine-grained access control for that IAM principal. All of these improvements make the IAM integration a more efficient and secure solution for most use cases.

However, for applications not built in Java, utilizing MSK Serverless becomes tricky. The standard SASL/SCRAM authentication isn’t available, and non-Java Kafka libraries don’t have a way to use IAM access control. This calls for an alternative approach to connect to MSK Serverless clusters.

But there’s an alternative pattern. Without having to rewrite your existing application in Java, you can employ API Gateway and Lambda as a proxy in front of a cluster. They can handle API requests and relay them to Kafka topics instantly. API Gateway takes in producer requests and channels them to a Lambda function, written in Java using the Amazon MSK IAM library. It then communicates with the MSK Serverless Kafka topic using IAM access control. After the cluster receives the message, it can be further processed within the MSK Serverless setup.

You can also utilize Lambda on the consumer side of MSK Serverless topics, bypassing the Java requirement on the consumer side. You can do this by setting Amazon MSK as an event source for a Lambda function. When the Lambda function is triggered, the data sent to the function includes an array of records from the Kafka topic—no need for direct contact with Amazon MSK.

Solution overview

This example walks you through how to build a serverless real-time stream producer application using API Gateway and Lambda.

For testing, this post includes a sample AWS Cloud Development Kit (AWS CDK) application. This creates a demo environment, including an MSK Serverless cluster, three Lambda functions, and an API Gateway that consumes the messages from the Kafka topic.

The following diagram shows the architecture of the resulting application including its data flows.

The data flow contains the following steps:

The infrastructure is defined in an AWS CDK application. By running this application, a set of AWS CloudFormation templates is created.
AWS CloudFormation creates all infrastructure components, including a Lambda function that runs during the deployment process to create a topic in the MSK Serverless cluster and to retrieve the authentication endpoint needed for the producer Lambda function. On destruction of the CloudFormation stack, the same Lambda function gets triggered again to delete the topic from the cluster.
An external application calls an API Gateway endpoint.
API Gateway forwards the request to a Lambda function.
The Lambda function acts as a Kafka producer and pushes the message to a Kafka topic using IAM authentication.
The Lambda event source mapping mechanism triggers the Lambda consumer function and forwards the message to it.
The Lambda consumer function logs the data to Amazon CloudWatch.

Note that we don’t need to worry about Availability Zones. MSK Serverless automatically replicates the data across multiple Availability Zones to ensure high availability of the data.

The demo additionally shows how to use Lambda Powertools for Java to streamline logging and tracing and the IAM authenticator for the simple authentication process outlined in the introduction.

The following sections take you through the steps to deploy, test, and observe the example application.

Prerequisites

The example has the following prerequisites:

An AWS account. If you haven’t signed up, complete the following steps:
- Create an account. For instructions, see Sign Up For AWS.
- Create an IAM user. For instructions, see Create IAM User.
The following software installed on your development machine, or use an AWS Cloud9 environment, which comes with all requirements preinstalled:
- Java Development Kit 17 or higher (for example, Amazon Corretto 17, OpenJDK 17)
- Python version 3.11 or higher
- Apache Maven version 3.8.4 or higher
- Docker version 24.0.2 or higher
- Node.js v18.0.0
- AWS CLI 2.12.1 or higher
- AWS CDK 2.89.0 or higher
Appropriate AWS credentials for interacting with resources in your AWS account.

Deploy the solution

Complete the following steps to deploy the solution:

Clone the project GitHub repository and change the directory to subfolder serverless-kafka-iac:

git clone https://github.com/aws-samples/apigateway-lambda-msk-serverless-integration
cd apigateway-lambda-msk-serverless-integration/serverless-kafka-iac

Configure environment variables:

export CDK_DEFAULT_ACCOUNT=$(aws sts get-caller-identity --query 'Account' --output text)
export CDK_DEFAULT_REGION=$(aws configure get region)

Prepare the virtual Python environment:

python3 -m venv .venv

source .venv/bin/activate

pip3 install -r requirements.txt

Bootstrap your account for AWS CDK usage:

cdk bootstrap aws://$CDK_DEFAULT_ACCOUNT/$CDK_DEFAULT_REGION

Run cdk synth to build the code and test the requirements (ensure docker daemon is running on your machine):

cdk synth

Run cdk deploy to deploy the code to your AWS account:

cdk deploy --all

Test the solution

To test the solution, we generate messages for the Kafka topics by sending calls through the API Gateway from our development machine or AWS Cloud9 environment. We then go to the CloudWatch console to observe incoming messages in the log files of the Lambda consumer function.

Open a terminal on your development machine to test the API with the Python script provided under /serverless_kafka_iac/test_api.py:

python3 test-api.py

On the Lambda console, open the Lambda function named ServerlessKafkaConsumer.

On the Monitor tab, choose View CloudWatch logs to access the logs of the Lambda function.

Choose the latest log stream to access the log files of the last run.

You can review the log entry of the received Kafka messages in the log of the Lambda function.

Trace a request

All components integrate with AWS X-Ray. With AWS X-Ray, you can trace the entire application, which is useful to identify bottlenecks when load testing. You can also trace method runs at the Java method level.

Lambda Powertools for Java allows you to shortcut this process by adding the @Trace annotation to a method to see traces on the method level in X-Ray.

To trace a request end to end, complete the following steps:

On the CloudWatch console, choose Service map in the navigation pane.
Select a component to investigate (for example, the Lambda function where you deployed the Kafka producer).
Choose View traces.

Choose a single Lambda method invocation and investigate further at the Java method level.

Implement a Kafka producer in Lambda

Kafka natively supports Java. To stay open, cloud native, and without third-party dependencies, the producer is written in that language. Currently, the IAM authenticator is only available to Java. In this example, the Lambda handler receives a message from an API Gateway source and pushes this message to an MSK topic called messages.

Typically, Kafka producers are long-living and pushing a message to a Kafka topic is an asynchronous process. Because Lambda is ephemeral, you must enforce a full flush of a submitted message until the Lambda function ends by calling producer.flush():

// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: MIT-0
package software.amazon.samples.kafka.lambda;
 
// This class is part of the AWS samples package and specifically deals with Kafka integration in a Lambda function.
// It serves as a simple API Gateway to Kafka Proxy, accepting requests and forwarding them to a Kafka topic.
public class SimpleApiGatewayKafkaProxy implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {
 
    // Specifies the name of the Kafka topic where the messages will be sent
    public static final String TOPIC_NAME = "messages";
 
    // Logger instance for logging events of this class
    private static final Logger log = LogManager.getLogger(SimpleApiGatewayKafkaProxy.class);
    
    // Factory to create properties for Kafka Producer
    public KafkaProducerPropertiesFactory kafkaProducerProperties = new KafkaProducerPropertiesFactoryImpl();
    
    // Instance of KafkaProducer
    private KafkaProducer<String, String>[KT1]  producer;
 
    // Overridden method from the RequestHandler interface to handle incoming API Gateway proxy events
    @Override
    @Tracing
    @Logging(logEvent = true)
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent input, Context context) {
        
        // Creating a response object to send back 
        APIGatewayProxyResponseEvent response = createEmptyResponse();
        try {
            // Extracting the message from the request body
            String message = getMessageBody(input);
 
            // Create a Kafka producer
            KafkaProducer<String, String> producer = createProducer();
 
            // Creating a record with topic name, request ID as key and message as value 
            ProducerRecord<String, String> record = new ProducerRecord<String, String>(TOPIC_NAME, context.getAwsRequestId(), message);
 
            // Sending the record to Kafka topic and getting the metadata of the record
            Future<RecordMetadata>[KT2]  send = producer.send(record);
            producer.flush();
 
            // Retrieve metadata about the sent record
            RecordMetadata metadata = send.get();
 
            // Logging the partition where the message was sent
            log.info(String.format("Message was send to partition %s", metadata.partition()));
 
            // If the message was successfully sent, return a 200 status code
            return response.withStatusCode(200).withBody("Message successfully pushed to kafka");
        } catch (Exception e) {
            // In case of exception, log the error message and return a 500 status code
            log.error(e.getMessage(), e);
            return response.withBody(e.getMessage()).withStatusCode(500);
        }
    }
 
    // Creates a Kafka producer if it doesn't already exist
    @Tracing
    private KafkaProducer<String, String> createProducer() {
        if (producer == null) {
            log.info("Connecting to kafka cluster");
            producer = new KafkaProducer<String, String>(kafkaProducerProperties.getProducerProperties());
        }
        return producer;
    }
 
    // Extracts the message from the request body. If it's base64 encoded, it's decoded first.
    private String getMessageBody(APIGatewayProxyRequestEvent input) {
        String body = input.getBody();
 
        if (input.getIsBase64Encoded()) {
            body = decode(body);
        }
        return body;
    }
 
    // Creates an empty API Gateway proxy response event with predefined headers.
    private APIGatewayProxyResponseEvent createEmptyResponse() {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "application/json");
        headers.put("X-Custom-Header", "application/json");
        APIGatewayProxyResponseEvent response = new APIGatewayProxyResponseEvent().withHeaders(headers);
        return response;
    }
}

Connect to Amazon MSK using IAM authentication

This post uses IAM authentication to connect to the respective Kafka cluster. For information about how to configure the producer for connectivity, refer to IAM access control.

Because you configure the cluster via IAM, grant Connect and WriteData permissions to the producer so that it can push messages to Kafka:

{
    “Version”: “2012-10-17”,
    “Statement”: [
        {            
            “Effect”: “Allow”,
            “Action”: [
                “kafka-cluster:Connect”
            ],
            “Resource”: “arn:aws:kafka:region:account-id:cluster/cluster-name/cluster-uuid “
        }
    ]
}
 
 
{
    “Version”: “2012-10-17”,
    “Statement”: [
        {            
            “Effect”: “Allow”,
            “Action”: [
                “kafka-cluster:Connect”,
                “kafka-cluster: DescribeTopic”,
            ],
            “Resource”: “arn:aws:kafka:region:account-id:topic/cluster-name/cluster-uuid/topic-name“
        }
    ]
}

This shows the Kafka excerpt of the IAM policy, which must be applied to the Kafka producer. When using IAM authentication, be aware of the current limits of IAM Kafka authentication, which affect the number of concurrent connections and IAM requests for a producer. Refer to Amazon MSK quota and follow the recommendation for authentication backoff in the producer client:

        Map<String, String> configuration = Map.of(
                “key.serializer”, “org.apache.kafka.common.serialization.StringSerializer”,
                “value.serializer”, “org.apache.kafka.common.serialization.StringSerializer”,
                “bootstrap.servers”, getBootstrapServer(),
                “security.protocol”, “SASL_SSL”,
                “sasl.mechanism”, “AWS_MSK_IAM”,
                “sasl.jaas.config”, “software.amazon.msk.auth.iam.IAMLoginModule required;”,
                “sasl.client.callback.handler.class”,
				“software.amazon.msk.auth.iam.IAMClientCallbackHandler”,
                “connections.max.idle.ms”, “60”,
                “reconnect.backoff.ms”, “1000”
        );

Additional considerations

Each MSK Serverless cluster can handle 100 requests per second. To reduce IAM authentication requests from the Kafka producer, place it outside of the handler. For frequent calls, there is a chance that Lambda reuses the previously created class instance and only reruns the handler.

For bursting workloads with a high number of concurrent API Gateway requests, this can lead to dropped messages. Although this might be tolerable for some workloads, for others this might not be the case.

In these cases, you can extend the architecture with a buffering technology like Amazon Simple Queue Service (Amazon SQS) or Amazon Kinesis Data Streams between API Gateway and Lambda.

To reduce latency, reduce cold start times for Java by changing the tiered compilation level to 1, as described in Optimizing AWS Lambda function performance for Java. Provisioned concurrency ensures that polling Lambda functions don’t need to warm up before requests arrive.

Conclusion

In this post, we showed how to create a serverless integration Lambda function between API Gateway and MSK Serverless as a way to do IAM authentication when your producer is not written in Java. You also learned about the native integration of Lambda and Amazon MSK on the consumer side. Additionally, we showed how to deploy such an integration with the AWS CDK.

The general pattern is suitable for many use cases where you want to use IAM authentication but your producers or consumers are not written in Java, but you still want to take advantage of the benefits of MSK Serverless, like its ability to scale up and down with unpredictable or spikey workloads or its little to no operational overhead of running Apache Kafka.

You can also use MSK Serverless to reduce operational complexity by automating provisioning and the management of capacity needs, including the need to constantly monitor brokers and storage.

For more serverless learning resources, visit Serverless Land.

For more information on MSK Serverless, check out the following:

About the Authors

Philipp Klose is a Global Solutions Architect at AWS based in Munich. He works with enterprise FSI customers and helps them solve business problems by architecting serverless platforms. In this free time, Philipp spends time with his family and enjoys every geek hobby possible.

Daniel Wessendorf is a Global Solutions Architect at AWS based in Munich. He works with enterprise FSI customers and is primarily specialized in machine learning and data architectures. In his free time, he enjoys swimming, hiking, skiing, and spending quality time with his family.

Marvin Gersho is a Senior Solutions Architect at AWS based in New York City. He works with a wide range of startup customers. He previously worked for many years in engineering leadership and hands-on application development, and now focuses on helping customers architect secure and scalable workloads on AWS with a minimum of operational overhead. In his free time, Marvin enjoys cycling and strategy board games.

Nathan Lichtenstein is a Senior Solutions Architect at AWS based in New York City. Primarily working with startups, he ensures his customers build smart on AWS, delivering creative solutions to their complex technical challenges. Nathan has worked in cloud and network architecture in the media, financial services, and retail spaces. Outside of work, he can often be found at a Broadway theater.

How to enforce DNS name constraints in AWS Private CA

2023-09-06 Isaiah Schisler

Post Syndicated from Isaiah Schisler original https://aws.amazon.com/blogs/security/how-to-enforce-dns-name-constraints-in-aws-private-ca/

In March 2022, AWS announced support for custom certificate extensions, including name constraints, using AWS Certificate Manager (ACM) Private Certificate Authority (CA). Defining DNS name constraints with your subordinate CA can help establish guardrails to improve public key infrastructure (PKI) security and mitigate certificate misuse. For example, you can set a DNS name constraint that restricts the CA from issuing certificates to a resource that is using a specific domain name. Certificate requests from resources using an unauthorized domain name will be rejected by your CA and won’t be issued a certificate.

In this blog post, I’ll walk you step-by-step through the process of applying DNS name constraints to a subordinate CA by using the AWS Private CA service.

Prerequisites

You need to have the following prerequisite tools, services, and permissions in place before following the steps presented within this post:

AWS Identity and Access Management (IAM) permissions with full access to AWS Certificate Manager and AWS Private CA. The corresponding AWS managed policies are named AWSCertificateManagerFullAccess and AWSCertificateManagerPrivateCAFullAccess.
AWS Command Line Interface (AWS CLI) 2.9.13 or later installed.
Python 3.7.15 or later installed.
Python’s package manager, pip, 20.2.2 or later installed.
An existing deployment of AWS Private CA with a root and subordinate CA.
The Amazon Resource Names (ARN) for your root and subordinate CAs.
The command-line JSON processor jq.
The Git command-line tool.
All of the examples in this blog post are provided for the us-west-2 AWS Region. You will need to make sure that you have access to resources in your desired Region and specify the Region in the example commands.

Retrieve the solution code

Our GitHub repository contains the Python code that you need in order to replicate the steps presented in this post. There are two methods for cloning the repository provided, HTTPS or SSH. Select the method that you prefer.

To clone the solution repository using HTTPS

Run the following command in your terminal.

git clone https://github.com/aws-samples/aws-private-ca-enforce-dns-name-constraints.git

To clone the solution repository using SSH

Run the following command in your terminal.

git clone [email protected]:aws-samples/aws-private-ca-enforce-dns-name-constraints.git

Set up your Python environment

Creating a Python virtual environment will allow you to run this solution in a fresh environment without impacting your existing Python packages. This will prevent the solution from interfering with dependencies that your other Python scripts may have. The virtual environment has its own set of Python packages installed. Read the official Python documentation on virtual environments for more information on their purpose and functionality.

To create a Python virtual environment

Create a new directory for the Python virtual environment in your home path.
```
mkdir ~/python-venv-for-aws-private-ca-name-constraints
```
Create a Python virtual environment using the directory that you just created.
```
python -m venv ~/python-venv-for-aws-private-ca-name-constraints
```

Activate the Python virtual environment.

source ~/python-venv-for-aws-private-ca-name-constraints/bin/activate

Upgrade pip to the latest version.
```
python -m pip install --upgrade pip
```

To install the required Python packages

Navigate to the solution source directory. Make sure to replace <~/github> with your information.
```
cd <~/github>/aws-private-ca-name-constraints/src/
```
Install the necessary Python packages and dependencies. Make sure to replace <~/github> with your information.
```
pip install -r <~/github>/aws-private-ca-name-constraints/src/requirements.txt
```

Generate the API passthrough file with encoded name constraints

This step allows you to define the permitted and excluded DNS name constraints to apply to your subordinate CA. Read the documentation on name constraints in RFC 5280 for more information on their usage and functionality.

The Python encoder provided in this solution accepts two arguments for the permitted and excluded name constraints. The -p argument is used to provide the permitted subtrees, and the -e argument is used to provide the excluded subtrees. Use commas without spaces to separate multiple entries. For example: -p .dev.example.com,.test.example.com -e .prod.dev.example.com,.amazon.com.

To encode your name constraints

Run the following command, and update <~/github> with your information and provide your desired name constraints for the permitted (-p) and excluded (-e) arguments.
```
python <~/github>/aws-private-ca-name-constraints/src/name-constraints-encoder.py -p <.dev.example.com,.test.example.com> -e <.prod.dev.example.com>
```
If the command runs successfully, you will see the message “Successfully Encoded Name Constraints” and the name of the generated API passthrough JSON file. The output of Permitted Subtrees will show the domain names you passed with the -p argument, and Excluded Subtrees will show the domain names you passed with the -e argument in the previous step.

Figure 1: Command line output example for name-constraints-encoder.py
Use the following command to display the contents of the API passthrough file generated by the Python encoder.
```
cat <~/github>/aws-private-ca-name-constraints/src/api_passthrough_config.json | jq .
```
The contents of api_passthrough_config.json will look similar to the following screenshot. The JSON object will have an ObjectIdentifier key and value of 2.5.29.30, which represents the name constraints OID from the Global OID database. The base64-encoded Value represents the permitted and excluded name constraints you provided to the Python encoder earlier.

Figure 2: Viewing contents of api_passthrough_config.json

Generate a CSR from your subordinate CA

You must generate a certificate signing request (CSR) from the subordinate CA to which you intend to have the name constraints applied. Otherwise, you might encounter errors when you attempt to install the new certificate with name constraints.

To generate the CSR

Update and run the following command with your subordinate CA ARN and Region. The ARN is something that uniquely identifies AWS resources, similar to how your home address tells the mail person where to deliver the mail. In this case, the ARN is the unique identifier for your subordinate CA that tells the command which subordinate CA it’s interacting with.
```
aws acm-pca get-certificate-authority-csr \
--certificate-authority-arn <arn:aws:acm-pca:us-west-2:111111111111:certificate-authority/cdd22222-2222-2f22-bb2e-222f222222ab> \
--output text \
--region <us-west-2> > ca.csr 
```

View your subordinate CA’s CSR.

openssl req -text -noout -verify -in ca.csr

The following screenshot provides an example output for a CSR. Your CSR details will be different; however, you should see something similar. Look for verify OK in the output and make sure that the Subject details match your subordinate CA. The subject details will provide the country, state, and city. The details will also likely contain your organization’s name, organizational unit or department name, and a common name for the subordinate CA.

Figure 3: Reviewing CSR content using openssl

Use the root CA to issue a new certificate with the name constraints custom extension

This post uses a two-tiered certificate authority architecture for simplicity. However, you can use the steps in this post with a more complex multi-level CA architecture. The name constraints certificate will be generated by the root CA and applied to the intermediary CA.

To issue and download a certificate with name constraints

Run the following command, making sure to update the argument values in red italics with your information. Make sure that the certificate-authority-arn is that of your root CA.
- Note that the provided template-arn instructs the root CA to use the api_passthrough_config.json file that you created earlier to generate the certificate with the name constraints custom extension. If you use a different template, the new certificate might not be created as you intended.
- Also, note that the validity period provided in this example is 5 years or 1825 days. The validity period for your subordinate CA must be less than that of your root CA.
```
aws acm-pca issue-certificate \
--certificate-authority-arn <arn:aws:acm-pca:us-west-2:111111111111:certificate-authority/111f1111-ba1b-1111-b11d-11ce1a11afae> \
--csr fileb://ca.csr \
--signing-algorithm <SHA256WITHRSA> \
--template-arn arn:aws:acm-pca:::template/SubordinateCACertificate_PathLen0_APIPassthrough/V1 \
--api-passthrough file://api_passthrough_config.json \
--validity Value=<1825>,Type=<DAYS> \
--region <us-west-2>
```
If the issue-certificate command is successful, the output will provide the ARN of the new certificate that is issued by the root CA. Copy the certificate ARN, because it will be used in the following command.

Figure 4: Issuing a certificate with name constraints from the root CA using the AWS CLI

To download the new certificate, run the following command. Make sure to update the placeholders in red italics with your root CA’s certificate-authority-arn, the certificate-arn you obtained from the previous step, and your region.

aws acm-pca get-certificate \
--certificate-authority-arn <arn:aws:acm-pca:us-west-2:111111111111:certificate-authority/111f1111-ba1b-1111-b11d-11ce1a11afae> \
--certificate-arn <arn:aws:acm-pca:us-west-2:11111111111:certificate-authority/111f1111-ba1b-1111-b11d-11ce1a11afae/certificate/c555ced55c5a55aaa5f555e5555fd5f5> \
--region <us-west-2> \
--output json > cert.json

Separate the certificate and certificate chain into two separate files by running the following commands. The new subordinate CA certificate is saved as cert.pem and the certificate chain is saved as cert_chain.pem.
```
cat cert.json | jq -r .Certificate > cert.pem 
cat cert.json | jq -r .CertificateChain > cert_chain.pem
```
Verify that the certificate and certificate chain are valid and configured as expected.
```
openssl x509 -in cert.pem -text -noout
openssl x509 -in cert_chain.pem -text -noout
```
The x509v3 Name Constraints portion of cert.pem should match the permitted and excluded name constraints you provided to the Python encoder earlier.

Figure 5: Verifying the X509v3 name constraints in the newly issued certificate using openssl

Install the name constraints certificate on the subordinate CA

In this section, you will install the name constraints certificate on your subordinate CA. Note that this will replace the existing certificate installed on the subordinate CA. The name constraints will go into effect as soon as the new certificate is installed.

To install the name constraints certificate

Run the following command with your subordinate CA’s certificate-authority-arn and path to the cert.pem and cert_chain.pem files you created earlier.

aws acm-pca import-certificate-authority-certificate \
--certificate-authority-arn <arn:aws:acm-pca:us-west-2:111111111111:certificate-authority/111f1111-ba1b-1111-b11d-11ce1a11afae> \
--certificate fileb://cert.pem \
--certificate-chain fileb://cert_chain.pem

Run the following command with your subordinate CA’s certificate-authority-arn and region to get the CA’s status.

aws acm-pca describe-certificate-authority \
--certificate-authority-arn <arn:aws:acm-pca:us-west-2:111111111111:certificate-authority/cdd22222-2222-2f22-bb2e-222f222222ab> \
--region <us-west-2> \
--output json

The output from the previous command will be similar to the following screenshot. The CertificateAuthorityConfiguration and highlighted NotBefore and NotAfter fields in the output should match the name constraints certificate.

Figure 6: Verifying subordinate CA details using the AWS CLI

Test the name constraints

Now that your subordinate CA has the new certificate installed, you can test to see if the name constraints are being enforced based on the certificate you installed in the previous section.

To request a certificate from your subordinate CA and test the applied name constraints

To request a new certificate, update and run the following command with your subordinate CA’s certificate-authority-arn, region, and desired certificate subject in the domain-name argument.

aws acm request-certificate \
--certificate-authority-arn <arn:aws:acm-pca:us-west-2:111111111111:certificate-authority/cdd22222-2222-2f22-bb2e-222f222222ab> \
--region <us-west-2> \
--domain-name <app.prod.dev.example.com>

If the request-certificate command is successful, it will output a certificate ARN. Take note of this ARN, because you will need it in the next step.

Update and run the following command with the certificate-arn from the previous step and your region to get the status of the certificate request.

aws acm describe-certificate \
--certificate-arn <arn:aws:acm:us-west-2:11111111111:certificate/f11aa1dc-1111-1d1f-1afd-4cb11111b111> \
--region <us-west-2>

You will see output similar to the following screenshot if the requested certificate domain name was not permitted by the name constraints applied to your subordinate CA. In this example, a certificate for app.prod.dev.example.com was rejected. The Status shows “FAILED” and the FailureReason indicates “PCA_NAME_CONSTRAINTS_VALIDATION”.

Figure 7: Verifying the status of the certificate request using the AWS CLI describe-certificate command

Conclusion

In this blog post, you learned how to apply and test DNS name constraints in AWS Private CA. For additional information on this topic, review the AWS documentation on understanding certificate templates and instructions on how to issue a certificate with custom extensions using an APIPassthrough template. If you prefer to use code in Java language format, see Activate a subordinate CA with the NameConstraints extension.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security, Identity, & Compliance re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Reduce the security and compliance risks of messaging apps with AWS Wickr

2023-09-05 Anne Grahn

Post Syndicated from Anne Grahn original https://aws.amazon.com/blogs/security/reduce-the-security-and-compliance-risks-of-messaging-apps-with-aws-wickr/

Effective collaboration is central to business success, and employees today depend heavily on messaging tools. An estimated 3.09 billion mobile phone users access messaging applications (apps) to communicate, and this figure is projected to grow to 3.51 billion users in 2025.

This post highlights the risks associated with messaging apps and describes how you can use enterprise solutions — such as AWS Wickr — that combine end-to-end encryption with data retention to drive positive security and business outcomes.

The business risks of messaging apps

Evolving threats, flexible work models, and a growing patchwork of data protection and privacy regulations have made maintaining secure and compliant enterprise messaging a challenge.

The use of third-party apps for business-related messages on both corporate and personal devices can make it more difficult to verify that data is being adequately protected and retained. This can lead to business risk, particularly in industries with unique record-keeping requirements. Organizations in the financial services industry, for example, are subject to rules that include Securities and Exchange Commission (SEC) Rule 17a-4 and Financial Industry Regulatory Authority (FINRA) Rule 3120, which require them to preserve all pertinent electronic communications.

A recent Gartner report on the viability of mobile bring-your-own-device (BYOD) programs noted, “It is now logical to assume that most financial services organizations with mobile BYOD programs for regulated employees could be fined due to a lack of compliance with electronic communications regulations.”

In the public sector, U.S. government agencies are subject to records requests under the Freedom of Information Act (FOIA) and various state sunshine statutes. For these organizations, effectively retaining business messages is about more than supporting security and compliance—it’s about maintaining public trust.

Securing enterprise messaging

Enterprise-grade messaging apps can help you protect communications from unauthorized access and facilitate desired business outcomes.

Security — Critical security protocols protect messages and files that contain sensitive and proprietary data — such as personally identifiable information, protected health information, financial records, and intellectual property — in transit and at rest to decrease the likelihood of a security incident.

Control — Administrative controls allow you to add, remove, and invite users, and organize them into security groups with restricted access to features and content at their level. Passwords can be reset and profiles can be deleted remotely, helping you reduce the risk of data exposure stemming from a lost or stolen device.

Compliance — Information can be preserved in a customer-controlled data store to help meet requirements such as those that fall under the Federal Records Act (FRA) and National Archives and Records Administration (NARA), as well as SEC Rule 17a-4 and Sarbanes-Oxley (SOX).

Marrying encryption with data retention

Enterprise solutions bring end-to-end encryption and data retention together in support of a comprehensive approach to secure messaging that balances people, process, and technology.

End-to-end encryption

Many messaging apps offer some form of encryption, but not all of them use end-to-end encryption. End-to-end encryption is a secure communication method that protects data from unauthorized access, interception, or tampering as it travels from one endpoint to another.

In end-to-end encryption, encryption and decryption take place locally, on the device. Every call, message, and file is encrypted with unique keys and remains indecipherable in transit. Unauthorized parties cannot access communication content because they don’t have the keys required to decrypt the data.

Encryption in transit compared to end-to-end encryption

Encryption in transit encrypts data over a network from one point to another (typically between one client and one server); data might remain stored in plaintext at the source and destination storage systems. End-to-end encryption combines encryption in transit and encryption at rest to secure data at all times, from being generated and leaving the sender’s device, to arriving at the recipient’s device and being decrypted.

“Messaging is a critical tool for any organization, and end-to-end encryption is the security technology that provides organizations with the confidence they need to rely on it.” — CJ Moses, CISO and VP of Security Engineering at AWS

Data retention

While data retention is often thought of as being incompatible with end-to-end encryption, leading enterprise-grade messaging apps offer both, giving you the option to configure a data store of your choice to retain conversations without exposing them to outside parties. No one other than the intended recipients and your organization has access to the message content, giving you full control over your data.

How AWS can help

AWS Wickr is an end-to-end encrypted messaging and collaboration service that was built from the ground up with features designed to help you keep internal and external communications secure, private, and compliant. Wickr protects one-to-one and group messaging, voice and video calling, file sharing, screen sharing, and location sharing with 256-bit Advanced Encryption Standard (AES) encryption, and provides data retention capabilities.

Figure 1: How Wickr works

With Wickr, each message gets a unique AES private encryption key, and a unique Elliptic-curve Diffie–Hellman (ECDH) public key to negotiate the key exchange with recipients. Message content — including text, files, audio, or video — is encrypted on the sending device (your iPhone, for example) using the message-specific AES key. This key is then exchanged via the ECDH key exchange mechanism, so that only intended recipients can decrypt the message.

“As former employees of federal law enforcement, the intelligence community, and the military, Qintel understands the need for enterprise-federated, secure communication messaging capabilities. When searching for our company’s messaging application we evaluated the market thoroughly and while there are some excellent capabilities available, none of them offer the enterprise security and administrative flexibility that Wickr does.”
— Bill Schambura, CEO at Qintel

Wickr network administrators can configure and apply data retention to both internal and external communications in a Wickr network. This includes conversations with guest users, external teams, and other partner networks, so you can retain messages and files sent to and from the organization to help meet internal, legal, and regulatory requirements.

Figure 2: Data retention process

Data retention is implemented as an always-on recipient that is added to conversations, not unlike the blind carbon copy (BCC) feature in email. The data-retention process participates in the key exchange, allowing it to decrypt messages. The process can run anywhere: on-premises, on an Amazon Elastic Compute Cloud (Amazon EC2) instance, or at a location of your choice.

Wickr is a Health Insurance Portability and Accountability Act of 1996 (HIPAA)-eligible service, helping healthcare organizations and medical providers to conduct secure telehealth visits, send messages and files that contain protected health information, and facilitate real-time patient care.

Wickr networks can be created through the AWS Management Console, and workflows can be automated with Wickr bots. Wickr is currently available in the AWS US East (Northern Virginia), AWS GovCloud (US-West), AWS Canada (Central), and AWS Europe (London) Regions.

Keep your messages safe

Employees will continue to use messaging apps to chat with friends and family, and boost productivity at work. While many of these apps can introduce risks if not used properly in business settings, Wickr combines end-to-end encryption with data-retention capabilities to help you achieve security and compliance goals. Incorporating Wickr into a comprehensive approach to secure enterprise messaging that includes clear policies and security awareness training can help you to accelerate collaboration, while protecting your organization’s data.

To learn more and get started, visit the AWS Wickr webpage, or contact us.

Want more AWS Security news? Follow us on Twitter.

Establishing a data perimeter on AWS: Allow access to company data only from expected networks

2023-09-05 Laura Reith

Post Syndicated from Laura Reith original https://aws.amazon.com/blogs/security/establishing-a-data-perimeter-on-aws-allow-access-to-company-data-only-from-expected-networks/

A key part of protecting your organization’s non-public, sensitive data is to understand who can access it and from where. One of the common requirements is to restrict access to authorized users from known locations. To accomplish this, you should be familiar with the expected network access patterns and establish organization-wide guardrails to limit access to known networks. Additionally, you should verify that the credentials associated with your AWS Identity and Access Management (IAM) principals are only usable within these expected networks. On Amazon Web Services (AWS), you can use the network perimeter to apply network coarse-grained controls on your resources and principals. In this fourth blog post of the Establishing a data perimeter on AWS series, we explore the benefits and implementation considerations of defining your network perimeter.

The network perimeter is a set of coarse-grained controls that help you verify that your identities and resources can only be used from expected networks.

To achieve these security objectives, you first must define what expected networks means for your organization. Expected networks usually include approved networks your employees and applications use to access your resources, such as your corporate IP CIDR range and your VPCs. There are also scenarios where you need to permit access from networks of AWS services acting on your behalf or networks of trusted third-party partners that you integrate with. You should consider all intended data access patterns when you create the definition of expected networks. Other networks are considered unexpected and shouldn’t be allowed access.

Security risks addressed by the network perimeter

The network perimeter helps address the following security risks:

Unintended information disclosure through credential use from non-corporate networks

It’s important to consider the security implications of having developers with preconfigured access stored on their laptops. For example, let’s say that to access an application, a developer uses a command line interface (CLI) to assume a role and uses the temporary credentials to work on a new feature. The developer continues their work at a coffee shop that has great public Wi-Fi while their credentials are still valid. Accessing data through a non-corporate network means that they are potentially bypassing their company’s security controls, which might lead to the unintended disclosure of sensitive corporate data in a public space.

Unintended data access through stolen credentials

Organizations are prioritizing protection from credential theft risks, as threat actors can use stolen credentials to gain access to sensitive data. For example, a developer could mistakenly share credentials from an Amazon EC2 instance CLI access over email. After credentials are obtained, a threat actor can use them to access your resources and potentially exfiltrate your corporate data, possibly leading to reputational risk.

Figure 1 outlines an undesirable access pattern: using an employee corporate credential to access corporate resources (in this example, an Amazon Simple Storage Service (Amazon S3) bucket) from a non-corporate network.

Figure 1: Unintended access to your S3 bucket from outside the corporate network

Implementing the network perimeter

During the network perimeter implementation, you use IAM policies and global condition keys to help you control access to your resources based on which network the API request is coming from. IAM allows you to enforce the origin of a request by making an API call using both identity policies and resource policies.

The following two policies help you control both your principals and resources to verify that the request is coming from your expected network:

Service control policies (SCPs) are policies you can use to manage the maximum available permissions for your principals. SCPs help you verify that your accounts stay within your organization’s access control guidelines.
Resource based policies are policies that are attached to resources in each AWS account. With resource based policies, you can specify who has access to the resource and what actions they can perform on it. For a list of services that support resource based policies, see AWS services that work with IAM.

With the help of these two policy types, you can enforce the control objectives using the following IAM global condition keys:

aws:SourceIp: You can use this condition key to create a policy that only allows request from a specific IP CIDR range. For example, this key helps you define your expected networks as your corporate IP CIDR range.
aws:SourceVpc: This condition key helps you check whether the request comes from the list of VPCs that you specified in the policy. In a policy, this condition key is used to only allow access to an S3 bucket if the VPC where the request originated matches the VPC ID listed in your policy.
aws:SourceVpce: You can use this condition key to check if the request came from one of the VPC endpoints specified in your policy. Adding this key to your policy helps you restrict access to API calls that originate from VPC endpoints that belong to your organization.
aws:ViaAWSService: You can use this key to write a policy to allow an AWS service that uses your credentials to make calls on your behalf. For example, when you upload an object to Amazon S3 with server-side encryption with AWS Key Management Service (AWS KMS) on, S3 needs to encrypt the data on your behalf. To do this, S3 makes a subsequent request to AWS KMS to generate a data key to encrypt the object. The call that S3 makes to AWS KMS is signed with your credentials and originates outside of your network.
aws:PrincipalIsAWSService: This condition key helps you write a policy to allow AWS service principals to access your resources. For example, when you create an AWS CloudTrail trail with an S3 bucket as a destination, CloudTrail uses a service principal, cloudtrail.amazonaws.com, to publish logs to your S3 bucket. The API call from CloudTrail comes from the service network.

The following table summarizes the relationship between the control objectives and the capabilities used to implement the network perimeter.

Control objective	Implemented by using	Primary IAM capability
My resources can only be accessed from expected networks.	Resource-based policies	aws:SourceIp aws:SourceVpc aws:SourceVpce aws:ViaAWSService aws:PrincipalIsAWSService
My identities can access resources only from expected networks.	SCPs	aws:SourceIp aws:SourceVpc aws:SourceVpce aws:ViaAWSService

My resources can only be accessed from expected networks

Start by implementing the network perimeter on your resources using resource based policies. The perimeter should be applied to all resources that support resource- based policies in each AWS account. With this type of policy, you can define which networks can be used to access the resources, helping prevent access to your company resources in case of valid credentials being used from non-corporate networks.

The following is an example of a resource-based policy for an S3 bucket that limits access only to expected networks using the aws:SourceIp, aws:SourceVpc, aws:PrincipalIsAWSService, and aws:ViaAWSService condition keys. Replace <my-data-bucket>, <my-corporate-cidr>, and <my-vpc> with your information.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceNetworkPerimeter",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::<my-data-bucket>",
        "arn:aws:s3:::<my-data-bucket>/*"
      ],
      "Condition": {
        "NotIpAddressIfExists": {
          "aws:SourceIp": "<my-corporate-cidr>"
        },
        "StringNotEqualsIfExists": {
          "aws:SourceVpc": "<my-vpc>"
        },
        "BoolIfExists": {
          "aws:PrincipalIsAWSService": "false",
          "aws:ViaAWSService": "false"
        }
      }
    }
  ]
}

The Deny statement in the preceding policy has four condition keys where all conditions must resolve to true to invoke the Deny effect. Use the IfExists condition operator to clearly state that each of these conditions will still resolve to true if the key is not present on the request context.

This policy will deny Amazon S3 actions unless requested from your corporate CIDR range (NotIpAddressIfExists with aws:SourceIp), or from your VPC (StringNotEqualsIfExists with aws:SourceVpc). Notice that aws:SourceVpc and aws:SourceVpce are only present on the request if the call was made through a VPC endpoint. So, you could also use the aws:SourceVpce condition key in the policy above, however this would mean listing every VPC endpoint in your environment. Since the number of VPC endpoints is greater than the number of VPCs, this example uses the aws:SourceVpc condition key.

This policy also creates a conditional exception for Amazon S3 actions requested by a service principal (BoolIfExists with aws:PrincipalIsAWSService), such as CloudTrail writing events to your S3 bucket, or by an AWS service on your behalf (BoolIfExists with aws:ViaAWSService), such as S3 calling AWS KMS to encrypt or decrypt an object.

Extending the network perimeter on resource

There are cases where you need to extend your perimeter to include AWS services that access your resources from outside your network. For example, if you’re replicating objects using S3 bucket replication, the calls to Amazon S3 originate from the service network outside of your VPC, using a service role. Another case where you need to extend your perimeter is if you integrate with trusted third-party partners that need access to your resources. If you’re using services with the described access pattern in your AWS environment or need to provide access to trusted partners, the policy EnforceNetworkPerimeter that you applied on your S3 bucket in the previous section will deny access to the resource.

In this section, you learn how to extend your network perimeter to include networks of AWS services using service roles to access your resources and trusted third-party partners.

AWS services that use service roles and service-linked roles to access resources on your behalf

Service roles are assumed by AWS services to perform actions on your behalf. An IAM administrator can create, change, and delete a service role from within IAM; this role exists within your AWS account and has an ARN like arn:aws:iam::<AccountNumber>:role/<RoleName>. A key difference between a service-linked role (SLR) and a service role is that the SLR is linked to a specific AWS service and you can view but not edit the permissions and trust policy of the role. An example is AWS Identity and Access Management Access Analyzer using an SLR to analyze resource metadata. To account for this access pattern, you can exempt roles on the service-linked role dedicated path arn:aws:iam::<AccountNumber>:role/aws-service-role/*, and for service roles, you can tag the role with the tag network-perimeter-exception set to true.

If you are exempting service roles in your policy based on a tag-value, you must also include a policy to enforce the identity perimeter on your resource as shown in this sample policy. This helps verify that only identities from your organization can access the resource and cannot circumvent your network perimeter controls with network-perimeter-exception tag.

Partners accessing your resources from their own networks

There might be situations where your company needs to grant access to trusted third parties. For example, providing a trusted partner access to data stored in your S3 bucket. You can account for this type of access by using the aws:PrincipalAccount condition key set to the account ID provided by your partner.

The following is an example of a resource-based policy for an S3 bucket that incorporates the two access patterns described above. Replace <my-data-bucket>, <my-corporate-cidr>, <my-vpc>, <third-party-account-a>, <third-party-account-b>, and <my-account-id> with your information.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EnforceNetworkPerimeter",
            "Principal": "*",
            "Action": "s3:*",
            "Effect": "Deny",
            "Resource": [
              "arn:aws:s3:::<my-data-bucket>",
              "arn:aws:s3:::<my-data-bucket>/*"
            ],
            "Condition": {
                "NotIpAddressIfExists": {
                  "aws:SourceIp": "<my-corporate-cidr>"
                },
                "StringNotEqualsIfExists": {
                    "aws:SourceVpc": "<my-vpc>",
       "aws:PrincipalTag/network-perimeter-exception": "true",
                    "aws:PrincipalAccount": [
                        "<third-party-account-a>",
                        "<third-party-account-b>"
                    ]
                },
                "BoolIfExists": {
                    "aws:PrincipalIsAWSService": "false",
                    "aws:ViaAWSService": "false"
                },
                "ArnNotLikeIfExists": {
                    "aws:PrincipalArn": "arn:aws:iam::<my-account-id>:role/aws-service-role/*"
                }
            }
        }
    ]
}

There are four condition operators in the policy above, and you need all four of them to resolve to true to invoke the Deny effect. Therefore, this policy only allows access to Amazon S3 from expected networks defined as: your corporate IP CIDR range (NotIpAddressIfExists and aws:SourceIp), your VPC (StringNotEqualsIfExists and aws:SourceVpc), networks of AWS service principals (aws:PrincipalIsAWSService), or an AWS service acting on your behalf (aws:ViaAWSService). It also allows access to networks of trusted third-party accounts (StringNotEqualsIfExists and aws:PrincipalAccount: <third-party-account-a>), and AWS services using an SLR to access your resources (ArnNotLikeIfExists and aws:PrincipalArn).

My identities can access resources only from expected networks

Applying the network perimeter on identity can be more challenging because you need to consider not only calls made directly by your principals, but also calls made by AWS services acting on your behalf. As described in access pattern 3 Intermediate IAM roles for data access in this blog post, many AWS services assume an AWS service role you created to perform actions on your behalf. The complicating factor is that even if the service supports VPC-based access to your data — for example AWS Glue jobs can be deployed within your VPC to access data in your S3 buckets — the service might also use the service role to make other API calls outside of your VPC. For example, with AWS Glue jobs, the service uses the service role to deploy elastic network interfaces (ENIs) in your VPC. However, these calls to create ENIs in your VPC are made from the AWS Glue managed network and not from within your expected network. A broad network restriction in your SCP for all your identities might prevent the AWS service from performing tasks on your behalf.

Therefore, the recommended approach is to only apply the perimeter to identities that represent the highest risk of inappropriate use based on other compensating controls that might exist in your environment. These are identities whose credentials can be obtained and misused by threat actors. For example, if you allow your developers access to the Amazon Elastic Compute Cloud (Amazon EC2) CLI, a developer can obtain credentials from the Amazon EC2 instance profile and use the credentials to access your resources from their own network.

To summarize, to enforce your network perimeter based on identity, evaluate your organization’s security posture and what compensating controls are in place. Then, according to this evaluation, identify which service roles or human roles have the highest risk of inappropriate use, and enforce the network perimeter on those identities by tagging them with data-perimeter-include set to true.

The following policy shows the use of tags to enforce the network perimeter on specific identities. Replace <my-corporate-cidr>, and <my-vpc> with your own information.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceNetworkPerimeter",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "BoolIfExists": {
          "aws:ViaAWSService": "false"
        },
        "NotIpAddressIfExists": {
          "aws:SourceIp": [
            "<my-corporate-cidr>"
          ]
        },
        "StringNotEqualsIfExists": {
          "aws:SourceVpc": [
            "<my-vpc>"
          ]
        }, 
       "ArnNotLikeIfExists": {
          "aws:PrincipalArn": [
            "arn:aws:iam::*:role/aws:ec2-infrastructure"
          ]
        },
        "StringEquals": {
          "aws:PrincipalTag/data-perimeter-include": "true"
        }
      }
    }
  ]
}

The above policy statement uses the Deny effect to limit access to expected networks for identities with the tag data-perimeter-include attached to them (StringEquals and aws:PrincipalTag/data-perimeter-include set to true). This policy will deny access to those identities unless the request is done by an AWS service on your behalf (aws:ViaAWSService), is coming from your corporate CIDR range (NotIpAddressIfExists and aws:SourceIp), or is coming from your VPCs (StringNotEqualsIfExists with the aws:SourceVpc).

Amazon EC2 also uses a special service role, also known as infrastructure role, to decrypt Amazon Elastic Block Store (Amazon EBS). When you mount an encrypted Amazon EBS volume to an EC2 instance, EC2 calls AWS KMS to decrypt the data key that was used to encrypt the volume. The call to AWS KMS is signed by an IAM role, arn:aws:iam::*:role/aws:ec2-infrastructure, which is created in your account by EC2. For this use case, as you can see on the preceding policy, you can use the aws:PrincipalArn condition key to exclude this role from the perimeter.

IAM policy samples

This GitHub repository contains policy examples that illustrate how to implement network perimeter controls. The policy samples don’t represent a complete list of valid access patterns and are for reference only. They’re intended for you to tailor and extend to suit the needs of your environment. Make sure you thoroughly test the provided example policies before implementing them in your production environment.

Conclusion

In this blog post you learned about the elements needed to build the network perimeter, including policy examples and strategies on how to extend that perimeter. You now also know different access patterns used by AWS services that act on your behalf, how to evaluate those access patterns, and how to take a risk-based approach to apply the perimeter based on identities in your organization.

For additional learning opportunities, see the Data perimeters on AWS. This information resource provides additional materials such as a data perimeter workshop, blog posts, whitepapers, and webinar sessions.

Want more AWS Security news? Follow us on Twitter.

Accelerating JVM cryptography with Amazon Corretto Crypto Provider 2

2023-09-01 Will Childs-Klein

Post Syndicated from Will Childs-Klein original https://aws.amazon.com/blogs/security/accelerating-jvm-cryptography-with-amazon-corretto-crypto-provider-2/

Earlier this year, Amazon Web Services (AWS) released Amazon Corretto Crypto Provider (ACCP) 2, a cryptography provider built by AWS for Java virtual machine (JVM) applications. ACCP 2 delivers comprehensive performance enhancements, with some algorithms (such as elliptic curve key generation) seeing a greater than 13-fold improvement over ACCP 1. The new release also brings official support for the AWS Graviton family of processors. In this post, I’ll discuss a use case for ACCP, then review performance benchmarks to illustrate the performance gains. Finally, I’ll show you how to get started using ACCP 2 in applications today.

This release changes the backing cryptography library for ACCP from OpenSSL (used in ACCP 1) to the AWS open source cryptography library, AWS libcrypto (AWS-LC). AWS-LC has extensive formal verification, as well as traditional testing, to assure the correctness of cryptography that it provides. While AWS-LC and OpenSSL are largely compatible, there are some behavioral differences that required the ACCP major version increment to 2.

The move to AWS-LC also allows ACCP to leverage performance optimizations in AWS-LC for modern processors. I’ll illustrate the ACCP 2 performance enhancements through the use case of establishing a secure communications channel with Transport Layer Security version 1.3 (TLS 1.3). Specifically, I’ll examine cryptographic components of the connection’s initial phase, known as the handshake. TLS handshake latency particularly matters for large web service providers, but reducing the time it takes to perform various cryptographic operations is an operational win for any cryptography-intensive workload.

TLS 1.3 requires ephemeral key agreement, which means that a new key pair is generated and exchanged for every connection. During the TLS handshake, each party generates an ephemeral elliptic curve key pair, exchanges public keys using Elliptic Curve Diffie-Hellman (ECDH), and agrees on a shared secret. Finally, the client authenticates the server by verifying the Elliptic Curve Digital Signature Algorithm (ECDSA) signature in the certificate presented by the server after key exchange. All of this needs to happen before you can send data securely over the connection, so these operations directly impact handshake latency and must be fast.

Figure 1 shows benchmarks for the three elliptic curve algorithms that implement the TLS 1.3 handshake: elliptic curve key generation (up to 1,298% latency improvement in ACCP 2.0 over ACCP 1.6), ECDH key agreement (up to 858% latency improvement), and ECDSA digital signature verification (up to 260% latency improvement). These algorithms were benchmarked over three common elliptic curves with different key sizes on both ACCP 1 and ACCP 2. The choice of elliptic curve determines the size of the key used or generated by the algorithm, and key size correlates to performance. The following benchmarks were measured under the Amazon Corretto 11 JDK on a c7g.large instance running Amazon Linux with a Graviton 3 processor.

Figure 1: Percentage improvement of ACCP 2.0 over 1.6 performance benchmarks on c7g.large Amazon Linux Graviton 3

The performance improvements due to the optimization of secp384r1 in AWS-LC are particularly noteworthy.

Getting started

Whether you’re introducing ACCP to your project or upgrading from ACCP 1, start the onboarding process for ACCP 2 by updating your dependency manager configuration in your development or testing environment. The Maven and Gradle examples below assume that you’re using linux on an ARM64 processor. If you’re using an x86 processor, substitute linux-x86_64 for linux-aarch64. After you’ve performed this update, sync your application’s dependencies and install ACCP in your JVM process. ACCP can be installed either by specifying our recommended security.properties file in your JVM invocation or programmatically at runtime. The following sections provide more details about all of these steps.

After ACCP has been installed, the Java Cryptography Architecture (JCA) will look for cryptographic implementations in ACCP first before moving on to other providers. So, as long as your application and dependencies obtain algorithms supported by ACCP from the JCA, your application should gain the benefits of ACCP 2 without further configuration or code changes.

Maven

If you’re using Maven to manage dependencies, add or update the following dependency configuration in your pom.xml file.

<dependency>
  <groupId>software.amazon.cryptools</groupId>
  <artifactId>AmazonCorrettoCryptoProvider</artifactId>
  <version>[2.0,3.0)</version>
  <classifier>linux-aarch64</classifier>
</dependency>

Gradle

For Gradle, add or update the following dependency in your build.gradle file.

dependencies {
    implementation 'software.amazon.cryptools:AmazonCorrettoCryptoProvider:2.+:linux-aarch64'
}

Install through security properties

After updating your dependency manager, you’ll need to install ACCP. You can install ACCP using security properties as described in our GitHub repository. This installation method is a good option for users who have control over their JVM invocation.

Install programmatically

If you don’t have control over your JVM invocation, you can install ACCP programmatically. For Java applications, add the following code to your application’s initialization logic (optionally performing a health check).

com.amazon.corretto.crypto.provider.AmazonCorrettoCryptoProvider.install();
com.amazon.corretto.crypto.provider.AmazonCorrettoCryptoProvider.INSTANCE.assertHealthy();

Migrating from ACCP 1 to ACCP 2

Although the migration path to version 2 is straightforward for most ACCP 1 users, ACCP 2 ends support for some outdated algorithms: a finite field Diffie-Hellman key agreement, finite field DSA signatures, and a National Institute of Standards and Technology (NIST)-specified random number generator. The removal of these algorithms is not backwards compatible, so you’ll need to check your code for their usage and, if you do find usage, either migrate to more modern algorithms provided by ACCP 2 or obtain implementations from a different provider, such as one of the default providers that ships with the JDK.

Check your code

Search for unsupported algorithms in your application code by their JCA names:

DH: Finite-field Diffie-Hellman key agreement
DSA: Finite-field Digital Signature Algorithm
NIST800-90A/AES-CTR-256: NIST-specified random number generator

Use ACCP 2 supported algorithms

Where possible, use these supported algorithms in your application code:

ECDH for key agreement instead of DH
ECDSA or RSA for signatures instead of DSA
Default SecureRandom instead of NIST800-90A/AES-CTR-256

If your use case requires the now-unsupported algorithms, check whether any of those algorithms are explicitly requested from ACCP.

If ACCP is not explicitly named as the provider, then you should be able to transparently fall back to another provider without a code change.
If ACCP is explicitly named as the provider, then remove that provider specification and register a different provider that offers the algorithm. This will allow the JCA to obtain an implementation from another registered provider without breaking backwards compatibility in your application.

Test your code

Some behavioral differences exist between ACCP 2 and other providers, including ACCP 1 (backed by OpenSSL). After onboarding or migrating, it’s important that you test your application code thoroughly to identify potential incompatibilities between cryptography providers.

Conclusion

Integrate ACCP 2 into your application today to benefit from AWS-LC’s security assurance and performance improvements. For a full list of changes, see the ACCP CHANGELOG on GitHub. Linux builds of ACCP 2 are now available on Maven Central for aarch64 and x86-64 processor architectures. If you encounter any issues with your integration, or have any feature suggestions, please reach out to us on GitHub by filing an issue.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Discover the benefits of AWS WAF advanced rate-based rules

2023-09-01 Rodrigo Ferroni

Post Syndicated from Rodrigo Ferroni original https://aws.amazon.com/blogs/security/discover-the-benefits-of-aws-waf-advanced-rate-based-rules/

In 2017, AWS announced the release of Rate-based Rules for AWS WAF, a new rule type that helps protect websites and APIs from application-level threats such as distributed denial of service (DDoS) attacks, brute force log-in attempts, and bad bots. Rate-based rules track the rate of requests for each originating IP address and invokes a rule action on IPs with rates that exceed a set limit.

While rate-based rules are useful to detect and mitigate a broad variety of bad actors, threats have evolved to bypass request-rate limit rules. For example, one bypass technique is to send a high volumes of requests by spreading them across thousands of unique IP addresses.

In May 2023, AWS announced AWS WAF enhancements to the existing rate-based rules feature that you can use to create more dynamic and intelligent rules by using additional HTTP request attributes for request rate limiting. For example, you can now choose from the following predefined keys to configure your rules: label namespace, header, cookie, query parameter, query string, HTTP method, URI path and source IP Address or IP Address in a header. Additionally, you can combine up to five composite keys as parameters for stronger rule development. These rule definition enhancements help improve perimeter security measures against sophisticated application-layer DDoS attacks using AWS WAF. For more information about the supported request attributes, see Rate-based rule statement in the AWS WAF Developer Guide.

In this blog post, you will learn more about these new AWS WAF feature enhancements and how you can use alternative request attributes to create more robust and granular sets of rules. In addition, you’ll learn how to combine keys to create a composite aggregation key to uniquely identify a specific combination of elements to improve rate tracking.

Getting started

Configuring advanced rate-based rules is similar to configuring simple rate-based rules. The process starts with creating a new custom rule of type rate-based rule, entering the rate limit value, selecting custom keys, choosing the key from the request aggregation key dropdown menu, and adding additional composite keys by choosing Add a request aggregation key as shown in Figure 1.

Figure 1: Creating an advanced rate-based rule with two aggregation keys

For existing rules, you can update those rate-based rules to use the new functionality by editing them. For example, you can add a header to be aggregated with the source IP address, as shown in Figure 2. Note that previously created rules will not be modified.

Figure 2: Add a second key to an existing rate-based rule

You still can set the same rule action, such as block, count, captcha, or challenge. Optionally, you can continue applying a scope-down statement to limit rule action. For example, you can limit the scope to a certain application path or requests with a specified header. You can scope down the inspection criteria so that only certain requests are counted towards rate limiting, and use certain keys to aggregate those requests together. A technique would be to count only requests that have /api at the start of the URI, and aggregate them based on their SessionId cookie value.

Target use cases

Now that you’re familiar with the foundations of advanced rate-based rules, let’s explore how they can improve your security posture using the following use cases:

Enhanced Application (Layer 7) DDoS protection
Improved API security
Enriched request throttling

Use case 1: Enhance Layer 7 DDoS mitigation

The first use case that you might find beneficial is to enhance Layer 7 DDoS mitigation. An HTTP request flood is the most common vector of DDoS attacks. This attack type aims to affect application availability by exhausting available resources to run the application.

Before the release of these enhancements to AWS WAF rules, rules were limited by aggregating requests based on the IP address from the request origin or configured to use a forwarded IP address in an HTTP header such as X-Forwarded-For. Now you can create a more robust rate-based rule to help protect your web application from DDoS attacks by tracking requests based on a different key or a combination of keys. Let’s examine some examples.

To help detect pervasive bots, such as scrapers, scanners, and crawlers, or common bots that are distributed across many unique IP addresses, a rule can look for static request data like a custom header — for example, User-Agent.

Key 1: Custom header (User-Agent)

{
  "Name": "test-rbr",
  "Priority": 0,
  "Statement": {
    "RateBasedStatement": {
      "Limit": 2000,
      "AggregateKeyType": "CUSTOM_KEYS",
      "CustomKeys": [
        {
          "Header": {
            "Name": "User-Agent",
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "NONE"
              }
            ]
          }
        }
      ]
    }
  },
  "Action": {
    "Block": {}
  },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "test-rbr"
  }
}

To help you decide what unique key to use, you can analyze AWS WAF logs. For more information, review Examples 2 and 3 in the blog post Analyzing AWS WAF Logs in Amazon CloudWatch Logs.

To uniquely identity users behind a NAT gateway, you can use a cookie in addition to an IP address. Before the aggregation keys feature, it was difficult to identify users who connected from a single IP address. Now, you can use the session cookie to aggregate requests by their session identifier and IP address.

Note that for Layer 7 DDoS mitigation, tracking by session ID in cookies can be circumvented, because bots might send random values or not send any cookie at all. It’s a good idea to keep an IP-based blanket rate-limiting rule to block offending IP addresses that reach a certain high rate, regardless of their request attributes. In that case, the keys would look like:

Key 1: Session cookie
Key 2: IP address

You can reduce false positives when using AWS Managed Rules (AMR) IP reputation lists by rate limiting based on their label namespace. Labelling functionality is a powerful feature that allows you to map the requests that match a specific pattern and apply custom rules to them. In this case, you can match the label namespace provided by the AMR IP reputation list that includes AWSManagedIPDDoSList, which is a list of IP addresses that have been identified as actively engaging in DDoS activities.

You might want to be cautious about using this group list in block mode, because there’s a chance of blocking legitimate users. To mitigate this, use the list in count mode and create an advanced rate-based rule to aggregate all requests with the label namespace awswaf:managed:aws:amazon-ip-list:, targeting captcha as the rule action. This lets you reduce false positives without compromising security. Applying captcha as an action for the rule reduces serving captcha to all users and instead only applies it when the rate of requests exceeds the defined limit. The key for this rule would be:

Labels (AMR IP reputation lists).

Use case 2: API security

In this second use case, you learn how to use an advanced rate-based rule to improve the security of an API. Protecting an API with rate-limiting rules helps ensure that requests aren’t being sent too frequently in a short amount of time. Reducing the risk from misusing an API helps to ensure that only legitimate requests are handled and not denied due to an overload of requests.

Now, you can create advanced rate-based rules that track API requests based on two aggregation keys. For example, HTTP method to differentiate between GET, POST, and other requests in combination with a custom header like Authorization to match a JSON Web Token (JWT). JWTs are not decrypted by AWS WAF, and AWS WAF only aggregates requests with the same token. This can help to ensure that a token is not being used maliciously or to bypass rate-limiting rules. An additional benefit of this configuration is that requests with no authorization headers are being aggregated together towards the rate limiting threshold. The keys for this use case are:

Key 1: HTTP method
Key 2: Custom header (Authorization)

In addition, you can configure a rule to block and add a custom response when the requests limit is reached. For example, by returning HTTP error code 429 (too many requests) with a Retry-After header indicating the requester should wait 900 seconds (15 minutes) before making a new request.

{
  "Name": "test-rbr",
  "Priority": 0,
  "Statement": {
    "RateBasedStatement": {
      "Limit": 600,
      "AggregateKeyType": "CUSTOM_KEYS",
      "CustomKeys": [
        {
          "HTTPMethod": {}
        },
        {
          "Header": {
            "Name": "Authorization",
            "TextTransformations": [
              {
                "Priority": 0,
                "Type": "NONE"
              }
            ]
          }
        }
      ]
    }
  },
  "Action": {
    "Block": {
      "CustomResponse": {
        "ResponseCode": 429,
        "ResponseHeaders": [
          {
            "Name": "Retry-After",
            "Value": "900"
          }
        ]
      }
    }
  },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "test-rbr"
  }
}

Use case 3: Implement request throttling

There are many situations where throttling should be considered. For example, if you want to maintain the performance of a service API by providing fair usage for all users, you can have different rate limits based on the type or purpose of the API, such as mutable or non-mutable requests. To achieve this, you can create two advanced rate-based rules using aggregation keys like IP address, combined with an HTTP request parameter for either mutable or non-mutable that indicates the type of request. Each rule will have its own HTTP request parameter, and you can set different maximum values for the rate limit. The keys for this use case are:

Key 1: HTTP request parameter
Key 2: IP address

Another example where throttling can be helpful is for a multi-tenant application where you want to track requests made by each tenant’s users. Let’s say you have a free tier but also a paying subscription model for which you want to allow a higher request rate. For this use case, it’s recommended to use two different URI paths to verify that the two tenants are kept separated. Additionally, it is advised to still use a custom header or query string parameter to differentiate between the two tenants, such as a tenant-id header or parameter that contains a unique identifier for each tenant. To implement this type of throttling using advanced rate-based rules, you can create two rules using an IP address in combination with the custom header as aggregation keys. Each rule can have its own maximum value for rate limiting, as well as a scope-down statement that matches requests for each URI path. The keys and scope-down statement for this use case are:

Key 1: Custom header (tenant-id)
Key 2: IP address
Scope down statement (URI path)

As a third example, you can rate-limit web applications based on the total number of requests that can be handled. For this use case, you can use the new Count all as aggregation option. The option counts and rate-limits the requests that match the rule’s scope-down statement, which is required for this type of aggregation. One option is to scope down and inspect the URI path to target a specific functionality like a /history-search page. An option when you need to control how many requests go to a specific domain is to scope down a single header to a specific host, creating one rule for a.example.com and another rule for b.example.com.

Request Aggregation: Count all
Scope down statement (URI path | Single header)

For these examples, you can block with a custom response when the requests exceed the limit. For example, by returning the same HTTP error code and header, but adding a custom response body with a message like “You have reached the maximum number of requests allowed.”

Logging

The AWS WAF logs now include additional information about request keys used for request-rate tracking and the values of matched request keys. In addition to the existing IP or Forwarded_IP values, you can see the updated log fields limitKey and customValue, where the limitKey field now shows either CustomKeys for custom aggregate key settings or Constant for count all requests. CustomValues shows an array of keys, names, and values.

Figure 3: Example log output for the advanced rate-based rule showing updated limitKey and customValues fields

As mentioned in the first use case, to get more detailed information about the traffic that’s analyzed by the web ACL, consider enabling logging. If you choose to enable Amazon CloudWatch Logs as the log destination, you can use CloudWatch Logs Insights and advanced queries to interactively search and analyze logs.

For example, you can use the following query to get the request information that matches rate-based rules, including the updated keys and values, directly from the AWS WAF console.

| fields terminatingRuleId as RuleName
| filter terminatingRuleType ="RATE_BASED" 
| parse @message ',"customValues":[*],' as customKeys
| limit 100

Figure 4 shows the CloudWatch Log Insights query and the logs output including custom keys, names, and values fields.

Figure 4: The CloudWatch Log Insights query and the logs output

Pricing

There is no additional cost for using advanced rate-base rules; standard AWS WAF pricing applies when you use this feature. For AWS WAF pricing information, see AWS WAF Pricing. You only need to be aware that using aggregation keys will increase AWS WAF web ACL capacity units (WCU) usage for the rule. WCU usage is calculated based on how many keys you want to use for rate limiting. The current model of 2 WCUs plus any additional WCUs for a nested statement is being updated to 2 WCUs as a base, and 30 WCUs for each custom aggregation key that you specify. For example, if you want to create aggregation keys with an IP address in combination with a session cookie, this will use 62 WCUs, and aggregation keys with an IP address, session cookie, and customer header will use 92 WCUs. For more details about the WCU-based cost structure, visit Rate-based rule statement in the AWS WAF Developer Guide.

Conclusion

In this blog post, you learned about AWS WAF enhancements to existing rate-based rules that now support request parameters in addition to IP addresses. Additionally, these enhancements allow you to create composite keys based on up to five request parameters. This new capability allows you to be either more coarse in aggregating requests (such as all the requests that have an IP reputation label associated with them) or finer (such as aggregate requests for a specific session ID, not its IP address).

For more rule examples that include JSON rule configuration, visit Rate-based rule examples in the AWS WAF Developer Guide.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Embracing our broad responsibility for securing digital infrastructure in the European Union

2023-08-31 Frank Adelmann

Post Syndicated from Frank Adelmann original https://aws.amazon.com/blogs/security/embracing-our-broad-responsibility-for-securing-digital-infrastructure-in-the-european-union/

Over the past few decades, digital technologies have brought tremendous benefits to our societies, governments, businesses, and everyday lives. However, the more we depend on them for critical applications, the more we must do so securely. The increasing reliance on these systems comes with a broad responsibility for society, companies, and governments.

At Amazon Web Services (AWS), every employee, regardless of their role, works to verify that security is an integral component of every facet of the business (see Security at AWS). This goes hand-in-hand with new cybersecurity-related regulations, such as the Directive on Measures for a High Common Level of Cybersecurity Across the Union (NIS 2), formally adopted by the European Parliament and the Counsel of the European Union (EU) in December 2022. NIS 2 will be transposed into the national laws of the EU Member States by October 2024, and aims to strengthen cybersecurity across the EU.

AWS is excited to help customers become more resilient, and we look forward to even closer cooperation with national cybersecurity authorities to raise the bar on cybersecurity across Europe. Building society’s trust in the online environment is key to harnessing the power of innovation for social and economic development. It’s also one of our core Leadership Principles: Success and scale bring broad responsibility.

Compliance with NIS 2

NIS 2 seeks to ensure that entities mitigate the risks posed by cyber threats, minimize the impact of incidents, and protect the continuity of essential and important services in the EU.

Besides increased cooperation between authorities and support for enhanced information sharing amongst covered entities, NIS 2 includes minimum requirements for cybersecurity risk management measures and reporting obligations, which are applicable to a broad range of AWS customers based on their sector. Examples of sectors that must comply with NIS 2 requirements are energy, transport, health, public administration, and digital infrastructures. For the full list of covered sectors, see Annexes I and II of NIS 2. Generally, the NIS 2 Directive applies to a wider pool of entities than those currently covered by the NIS Directive, including medium-sized enterprises, as defined in Article 2 of the Annex to Recommendation 2003/361/EC (over 50 employees or an annual turnover over €10 million).

In several countries, aspects of the AWS service offerings are already part of the national critical infrastructure. For example, in Germany, Amazon Elastic Compute Cloud (Amazon EC2) and Amazon CloudFront are in scope for the KRITIS regulation. For several years, AWS has fulfilled its obligations to secure these services, run audits related to national critical infrastructure, and have established channels for exchanging security information with the German Federal Office for Information Security (BSI) KRITIS office. AWS is also part of the UP KRITIS initiative, a cooperative effort between industry and the German Government to set industry standards.

AWS will continue to support customers in implementing resilient solutions, in accordance with the shared responsibility model. Compliance efforts within AWS will include implementing the requirements of the act and setting out technical and methodological requirements for cloud computing service providers, to be published by the European Commission, as foreseen in Article 21 of NIS 2.

AWS cybersecurity risk management – Current status

Even before the introduction of NIS 2, AWS has been helping customers improve their resilience and incident response capacities. Our core infrastructure is designed to satisfy the security requirements of the military, global banks, and other highly sensitive organizations.

AWS provides information and communication technology services and building blocks that businesses, public authorities, universities, and individuals use to become more secure, innovative, and responsive to their own needs and the needs of their customers. Security and compliance remain a shared responsibility between AWS and the customer. We make sure that the AWS cloud infrastructure complies with applicable regulatory requirements and good practices for cloud providers, and customers remain responsible for building compliant workloads in the cloud.

In total, AWS supports or has obtained over 143 security standards compliance certifications and attestations around the globe, such as ISO 27001, ISO 22301, ISO 20000, ISO 27017, and System and Organization Controls (SOC) 2. The following are some examples of European certifications and attestations that we’ve achieved:

C5 — provides a wide-ranging control framework for establishing and evidencing the security of cloud operations in Germany.
ENS High — comprises principles for adequate protection applicable to government agencies and public organizations in Spain.
HDS — demonstrates an adequate framework for technical and governance measures to secure and protect personal health data, governed by French law.
Pinakes — provides a rating framework intended to manage and monitor the cybersecurity controls of service providers upon which Spanish financial entities depend.

These and other AWS Compliance Programs help customers understand the robust controls in place at AWS to help ensure the security and compliance of the cloud. Through dedicated teams, we’re prepared to provide assurance about the approach that AWS has taken to operational resilience and to help customers achieve assurance about the security and resiliency of their workloads. AWS Artifact provides on-demand access to these security and compliance reports and many more.

For security in the cloud, it’s crucial for our customers to make security by design and security by default central tenets of product development. To begin with, customers can use the AWS Well-Architected tool to help build secure, high-performing, resilient, and efficient infrastructure for a variety of applications and workloads. Customers that use the AWS Cloud Adoption Framework (AWS CAF) can improve cloud readiness by identifying and prioritizing transformation opportunities. These foundational resources help customers secure regulated workloads. AWS Security Hub provides customers with a comprehensive view of their security state on AWS and helps them check their environments against industry standards and good practices.

With regards to the cybersecurity risk management measures and reporting obligations that NIS 2 mandates, existing AWS service offerings can help customers fulfill their part of the shared responsibility model and comply with future national implementations of NIS 2. For example, customers can use Amazon GuardDuty to detect a set of specific threats to AWS accounts and watch out for malicious activity. Amazon CloudWatch helps customers monitor the state of their AWS resources. With AWS Config, customers can continually assess, audit, and evaluate the configurations and relationships of selected resources on AWS, on premises, and on other clouds. Furthermore, AWS Whitepapers, such as the AWS Security Incident Response Guide, help customers understand, implement, and manage fundamental security concepts in their cloud architecture.

NIS 2 foresees the development and implementation of comprehensive cybersecurity awareness training programs for management bodies and employees. At AWS, we provide various training programs at no cost to the public to increase awareness on cybersecurity, such as the Amazon cybersecurity awareness training, AWS Cloud Security Learning, AWS re/Start, and AWS Ramp-Up Guides.

AWS cooperation with authorities

At Amazon, we strive to be the world’s most customer-centric company. For AWS Security Assurance, that means having teams that continuously engage with authorities to understand and exceed regulatory and customer obligations on behalf of customers. This is just one way that we raise the security bar in Europe. At the same time, we recommend that national regulators carefully assess potentially conflicting, overlapping, or contradictory measures.

We also cooperate with cybersecurity agencies around the globe because we recognize the importance of their role in keeping the world safe. To that end, we have built the Global Cybersecurity Program (GCSP) to provide agencies with a direct and consistent line of communication to the AWS Security team. Two examples of GCSP members are the Dutch National Cyber Security Centrum (NCSC-NL), with whom we signed a cooperation in May 2023, and the Italian National Cybersecurity Agency (ACN). Together, we will work on cybersecurity initiatives and strengthen the cybersecurity posture across the EU. With the war in Ukraine, we have experienced how important such a collaboration can be. AWS has played an important role in helping Ukraine’s government maintain continuity and provide critical services to citizens since the onset of the war.

The way forward

At AWS, we will continue to provide key stakeholders with greater insights into how we help customers tackle their most challenging cybersecurity issues and provide opportunities to deep dive into what we’re building. We very much look forward to continuing our work with authorities, agencies and, most importantly, our customers to provide for the best solutions and raise the bar on cybersecurity and resilience across the EU and globally.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Two real-life examples of why limiting permissions works: Lessons from AWS CIRT

2023-08-31 Richard Billington

Post Syndicated from Richard Billington original https://aws.amazon.com/blogs/security/two-real-life-examples-of-why-limiting-permissions-works-lessons-from-aws-cirt/

Welcome to another blog post from the AWS Customer Incident Response Team (CIRT)! For this post, we’re looking at two events that the team was involved in from the viewpoint of a regularly discussed but sometimes misunderstood subject, least privilege. Specifically, we consider the idea that the benefit of reducing permissions in real-life use cases does not always require using the absolute minimum set of privileges. Instead, you need to weigh the cost and effort of creating and maintaining privileges against the risk reduction that is achieved, to make sure that your permissions are appropriate for your needs.

To quote VP and Distinguished Engineer at Amazon Security, Eric Brandwine, “Least privilege equals maximum effort.” This is the idea that creating and maintaining the smallest possible set of privileges needed to perform a given task will require the largest amount of effort, especially as customer needs and service features change over time. However, the correlation between effort and permission reduction is not linear. So, the question you should be asking is: How do you balance the effort of privilege reduction with the benefits it provides?

Unfortunately, this is not an easy question to answer. You need to consider the likelihood of an unwanted issue happening, the impact if that issue did happen, and the cost and effort to prevent (or detect and recover from) that issue. You also need to factor requirements such as your business goals and regulatory requirements into your decision process. Of course, you won’t need to consider just one potential issue, but many. Often it can be useful to start with a rough set of permissions and refine them down as you develop your knowledge of what security level is required. You can also use service control policies (SCPs) to provide a set of permission guardrails if you’re using AWS Organizations. In this post, we tell two real-world stories where limiting AWS Identity and Access Management (IAM) permissions worked by limiting the impact of a security event, but where the permission set did not involve maximum effort.

Story 1: On the hunt for credentials

In this AWS CIRT story, we see how a threat actor was unable to achieve their goal because the access they obtained — a database administrator’s — did not have the IAM permissions they were after.

Background and AWS CIRT engagement

A customer came to us after they discovered unauthorized activity in their on-premises systems and in some of their AWS accounts. They had incident response capability and were looking for an additional set of hands with AWS knowledge to help them with their investigation. This helped to free up the customer’s staff to focus on the on-premises analysis.

Before our engagement, the customer had already performed initial containment activities. This included rotating credentials, revoking temporary credentials, and isolating impacted systems. They also had a good idea of which federated user accounts had been accessed by the threat actor.

The key part of every AWS CIRT engagement is the customer’s ask. Everything our team does falls on the customer side of the AWS Shared Responsibility Model, so we want to make sure that we are aligned to the customer’s desired outcome. The ask was clear—review the potential unauthorized federated users’ access, and investigate the unwanted AWS actions that were taken by those users during the known timeframe. To get a better idea of what was “unwanted,” we talked to the customer to understand at a high level what a typical day would entail for these users, to get some context around what sort of actions would be expected. The users were primarily focused on working with Amazon Relational Database Service (RDS).

Analysis and findings

For this part of the story, we’ll focus on a single federated user whose apparent actions we investigated, because the other federated users had not been impersonated by the threat actor in a meaningful way. We knew the approximate start and end dates to focus on and had discovered that the threat actor had performed a number of unwanted actions.

After reviewing the actions, it was clear that the threat actor had performed a console sign-in on three separate occasions within a 2-hour window. Each time, the threat actor attempted to perform a subset of the following actions:

CreateAccessKey — Create a new AWS secret access key
CreateLoginProfile — Create a console password for a given IAM user
UpdateAccessKey — Change an access key from inactive to active (or vice versa)
DeleteAccessKey — Delete an access key pair
PutRolePolicy — Add or update an inline IAM policy
CreatePolicyVersion — Create a new version of an existing managed policy

Note: This list includes only the actions that are displayed as readOnly = false in AWS CloudTrail, because these actions are often (but not always) the more impactful ones, with the potential to change the AWS environment.

This is the point where the benefit of permission restriction became clear. As soon as this list was compiled, we noticed that two fields were present for all of the actions listed:

"errorCode": "Client.UnauthorizedOperation",
"errorMessage": "You are not authorized to perform this operation. [rest of message]"

As this reveals, every single non-readOnly action that was attempted by the threat actor was denied because the federated user account did not have the required IAM permissions.

Customer communication and result

After we confirmed the findings, we had a call with the customer to discuss the results. As you can imagine, they were happy that the results showed no material impact to their data, and said no further investigation or actions were required at that time.

What were the IAM permissions the federated user had, which prevented the set of actions the threat actor attempted?

The answer did not actually involve the absolute minimal set of permissions required by the user to do their job. It’s simply that the federated user had a role that didn’t have an Allow statement for the IAM actions the threat actor tried — their job did not require them. Without an explicit Allow statement, the IAM actions attempted were denied because IAM policies are Deny by default. In this instance, simply not having the desired IAM permissions meant that the threat actor wasn’t able to achieve their goal, and stopped using the access. We’ll never know what their goal actually was, but trying to create access keys, passwords, and update policies means that a fair guess would be that they were attempting to create another way to access that AWS account.

Story 2: More instances for crypto mining

In this AWS CIRT story, we see how a threat actor’s inability to create additional Amazon Elastic Compute Cloud (Amazon EC2) instances resulted in the threat actor leaving without achieving their goal.

Background and AWS CIRT engagement

Our second story involves a customer who had an AWS account they were using to test some new third-party software that uses Amazon Elastic Container Service (Amazon ECS). This customer had Amazon GuardDuty turned on, and found that they were getting GuardDuty alerts for CryptoCurrency:EC2/BitcoinTool related findings.

Because this account was new and currently only used for testing their software, the customer saw that the detection was related to the Amazon ECS cluster and decided to delete all the resources in the account and rebuild. Not too long after doing this, they received a similar GuardDuty alert for the new Amazon ECS cluster they had set up. The second finding resulted in the customer’s security team and AWS being brought in to try to understand what was causing this. The customer’s security team was focused on reviewing the tasks that were being run on the cluster, while AWS CIRT reviewed the AWS account actions and provided insight about the GuardDuty finding and what could have caused it.

Analysis and findings

Working with the customer, it wasn’t long before we discovered that the 3^rd party Amazon ECS task definition that the customer was using, was unintentionally exposing a web interface to the internet. That interface allowed unauthenticated users to run commands on the system. This explained why the same alert was also received shortly after the new install had been completed.

This is where the story takes a turn for the better. The AWS CIRT analysis of the AWS CloudTrail logs found that there were a number of attempts to use the credentials of the Task IAM role that was associated with the Amazon ECS task. The majority of actions were attempting to launch multiple Amazon EC2 instances via RunInstances calls. Every one of these actions, along with the other actions attempted, failed with either a Client.UnauthorizedOperation or an AccessDenied error message.

Next, we worked with the customer to understand the permissions provided by the Task IAM role. Once again, the permissions could have been limited more tightly. However, this time the goal of the threat actor — running a number of Amazon EC2 instances (most likely for surreptitious crypto mining) — did not align with the policy given to the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Action": "s3:*",
          "Resource": "*"
        }
    ]
}

AWS CIRT recommended creating policies to restrict the allowed actions further, providing specific resources where possible, and potentially also adding in some conditions to limit other aspects of the access (such as the two Condition keys launched recently to limit where Amazon EC2 instance credentials can be used from). However, simply having the policy limit access to Amazon Simple Storage Service (Amazon S3) meant that the threat actor decided to leave with just the one Amazon ECS task running crypto mining rather than a larger number of Amazon EC2 instances.

Customer communication and result

After reporting these findings to the customer, there were two clear next steps: First, remove the now unwanted and untrusted Amazon ECS resource from their AWS account. Second, review and re-architect the Amazon ECS task so that access to the web interface was only provided to appropriate users. As part of that re-architecting, an Amazon S3 policy similar to “Allows read and write access to objects in an S3 bucket” was recommended. This separates Amazon S3 bucket actions from Amazon S3 object actions. When applications have a need to read and write objects in Amazon S3, they don’t normally have a need to create or delete entire buckets (or versioning on those buckets).

Some tools to help

We’ve just looked at how limiting privileges helped during two different security events. Now, let’s consider what can help you decide how to reduce your IAM permissions to an appropriate level. There are a number of resources that talk about different approaches:

The first approach is to use Access Analyzer to help generate IAM policies based on access activity from log data. This can then be refined further with the addition of Condition elements as desired. We already have a couple of blog posts about that exact topic:

The second approach is similar, and that is to reduce policy scope based on the last-accessed information:

Review last accessed information to identify unused EC2, IAM, and Lambda permissions and tighten access for your IAM roles

The third approach is a manual method of creating and refining policies to reduce the amount of work required. For this, you can begin with an appropriate AWS managed IAM policy or an AWS provided example policy as a starting point. Following this, you can add or remove Actions, Resources, and Conditions — using wildcards as desired — to balance your effort and permission reduction.

An example of balancing effort and permission is in the IAM tutorial Create and attach your first customer managed policy. In it, the authors create a policy that uses iam:Get* and iam:List:* in the Actions section. Although not all iam:Get* and iam:List:* Actions may be required, this is a good way to group similar Actions together while minimizing Actions that allow unwanted access — for example, iam:Create* or iam:Delete*. Another example of this balancing was mentioned earlier relating to Amazon S3, allowing access to create, delete, and read objects, but not to change the configuration of the bucket those objects are in.

In addition to limiting permissions, we also recommend that you set up appropriate detection and response capability. This will enable you to know when an issue has occurred and provide the tools to contain and recover from the issue. Further details about performing incident response in an AWS account can be found in the AWS Security Incident Response Guide.

There are also two services that were used to help in the stories we presented here — Amazon GuardDuty and AWS CloudTrail. GuardDuty is a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity. It’s a great way to monitor for unwanted activity within your AWS accounts. CloudTrail records account activity across your AWS infrastructure and provides the logs that were used for the analysis that AWS CIRT performed for both these stories. Making sure that both of these are set up correctly is a great first step towards improving your threat detection and incident response capability in AWS.

Conclusion

In this post, we looked at two examples where limiting privilege provided positive results during a security event. In the second case, the policy used should probably have restricted permissions further, but even as it stood, it was an effective preventative control in stopping the unauthorized user from achieving their assumed goal.

Hopefully these stories will provide new insight into the way your organization thinks about setting permissions, while taking into account the effort of creating the permissions. These stories are a good example of how starting a journey towards least privilege can help stop unauthorized users. Neither of the scenarios had policies that were least privilege, but the policies were restrictive enough that the unauthorized users were prevented from achieving their goals this time, resulting in minimal impact to the customers. However in both cases AWS CIRT recommended further reducing the scope of the IAM policies being used.

Finally, we provided a few ways to go about reducing permissions—first, by using tools to assist with policy creation, and second, by editing existing policies so they better fit your specific needs. You can get started by checking your existing policies against what Access Analyzer would recommend, by looking for and removing overly permissive wildcard characters (*) in some of your existing IAM policies, or by implementing and refining your SCPs.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

161 AWS services achieve HITRUST certification

2023-08-30 Mark Weech

Post Syndicated from Mark Weech original https://aws.amazon.com/blogs/security/161-aws-services-achieve-hitrust-certification/

The Amazon Web Services (AWS) HITRUST Compliance Team is excited to announce that 161 AWS services have been certified for the HITRUST CSF version 11.0.1 for the 2023 cycle. The full list of AWS services, which were audited by a third-party assessor and certified under the HITRUST CSF, is now available on our Services in Scope by Compliance Program page. You can view and download our HITRUST CSF certification at any time on demand through AWS Artifact.

The HITRUST CSF has been widely adopted by leading organizations in a variety of industries in their approach to security and privacy. Visit the HITRUST website for more information. HITRUST certification allows you, as an AWS customer, to tailor your security control baselines specific to your architecture and assessment scope, and inherit certification for those controls so they don’t have to be tested as a component of your HITRUST assessment. Because cloud-based controls don’t have to be retested, AWS customers enjoy savings in both time and cost for their own HITRUST assessment certification needs.

AWS HITRUST CSF certification is available for customer inheritance with an updated Shared Responsibility Matrix version 1.4.1

As an added benefit to our customers, organizations no longer have to assess inherited controls for their HITRUST validated assessment, because AWS already has! Our customers can deploy business solutions into the AWS cloud environment and inherit our HITRUST CSF certification for those controls applicable to their cloud architecture for services that are in-scope of the AWS HITRUST assessment. A detailed listing of controls and corresponding inheritance values can be found on the HITRUST website.

The AWS HITRUST Inheritance Program supports the latest version of HITRUST controls (v11.1), and is excited to announce the availability of the latest Shared Responsibility Matrix (SRM) version 1.4.1. As an added benefit, the AWS HITRUST Inheritance Program also supports the control inheritance of AWS cloud-based workloads for new HITRUST e1 and i1 assessment types, as well as the validated r2-type assessments offered through HITRUST. The SRM is also backward-compatible to earlier versions of the HITRUST CSF from v9.1 through v11.

Additionally, through the AWS HITRUST Inheritance Program, AWS is a member of the Health 3^rd Party Trust Initiative (Health3PT), a consortium of the largest US-based healthcare systems that is proactively committed to reducing third-party information security risk with more reliable and efficient assurances. You can find additional information at https://health3pt.org.

As always, we value your feedback and questions and are committed to helping you achieve and maintain the highest standard of security and compliance. Feel free to contact the team through AWS Compliance Contact Us.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Spring 2023 SOC reports now available in Spanish

2023-08-30 Andrew Najjar

Post Syndicated from Andrew Najjar original https://aws.amazon.com/blogs/security/spring-2023-soc-reports-now-available-in-spanish/

Spanish version »

We continue to listen to our customers, regulators, and stakeholders to understand their needs regarding audit, assurance, certification, and attestation programs at Amazon Web Services (AWS). We’re pleased to announce that Spring 2023 System and Organization Controls (SOC) 1, SOC 2, and SOC 3 reports are now available in Spanish. These translated reports will help drive greater engagement and alignment with customer and regulatory requirements across Latin America and Spain.

The Spanish language version of the reports don’t contain the independent opinions issued by the auditors or the control test results, but you can find this information in the English language version. Stakeholders should use the English version as a complement to the Spanish version.

Spanish-translated SOC reports are available to customers through AWS Artifact. Spanish-translated SOC reports will be published twice a year, in alignment with the Fall and Spring reporting cycles.

We value your feedback and questions—feel free to reach out to our team or give feedback about this post through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us onTwitter.

Spanish version

Los informes SOC de Primavera de 2023 ahora están disponibles en español

Seguimos escuchando a nuestros clientes, reguladores y partes interesadas para comprender sus necesidades en relación con los programas de auditoría, garantía, certificación y atestación en Amazon Web Services (AWS). Nos complace anunciar que los informes SOC 1, SOC 2 y SOC 3 de AWS de Primavera de 2023 ya están disponibles en español. Estos informes traducidos ayudarán a impulsar un mayor compromiso y alineación con los requisitos regulatorios y de los clientes en las regiones de América Latina y España.

La versión en inglés de los informes debe tenerse en cuenta en relación con la opinión independiente emitida por los auditores y los resultados de las pruebas de controles, como complemento de las versiones en español.

Los informes SOC traducidos en español están disponibles en AWS Artifact. Los informes SOC traducidos en español se publicarán dos veces al año según los ciclos de informes de Otoño y Primavera.

Valoramos sus comentarios y preguntas; no dude en ponerse en contacto con nuestro equipo o enviarnos sus comentarios sobre esta publicación a través de nuestra página Contáctenos.

Si tienes comentarios sobre esta publicación, envíalos en la sección Comentarios a continuación.

¿Desea obtener más noticias sobre seguridad de AWS? Síguenos en Twitter.

Validate IAM policies by using IAM Policy Validator for AWS CloudFormation and GitHub Actions

2023-08-30 Mitch Beaumont

Post Syndicated from Mitch Beaumont original https://aws.amazon.com/blogs/security/validate-iam-policies-by-using-iam-policy-validator-for-aws-cloudformation-and-github-actions/

In this blog post, I’ll show you how to automate the validation of AWS Identity and Access Management (IAM) policies by using a combination of the IAM Policy Validator for AWS CloudFormation (cfn-policy-validator) and GitHub Actions. Policy validation is an approach that is designed to minimize the deployment of unwanted IAM identity-based and resource-based policies to your Amazon Web Services (AWS) environments.

With GitHub Actions, you can automate, customize, and run software development workflows directly within a repository. Workflows are defined using YAML and are stored alongside your code. I’ll discuss the specifics of how you can set up and use GitHub actions within a repository in the sections that follow.

The cfn-policy-validator tool is a command-line tool that takes an AWS CloudFormation template, finds and parses the IAM policies that are attached to IAM roles, users, groups, and resources, and then runs the policies through IAM Access Analyzer policy checks. Implementing IAM policy validation checks at the time of code check-in helps shift security to the left (closer to the developer) and shortens the time between when developers commit code and when they get feedback on their work.

Let’s walk through an example that checks the policies that are attached to an IAM role in a CloudFormation template. In this example, the cfn-policy-validator tool will find that the trust policy attached to the IAM role allows the role to be assumed by external principals. This configuration could lead to unintended access to your resources and data, which is a security risk.

Prerequisites

To complete this example, you will need the following:

A GitHub account
An AWS account, and an identity within that account that has permissions to create the IAM roles and resources used in this example

Step 1: Create a repository that will host the CloudFormation template to be validated

To begin with, you need to create a GitHub repository to host the CloudFormation template that is going to be validated by the cfn-policy-validator tool.

To create a repository:

Open a browser and go to https://github.com.
In the upper-right corner of the page, in the drop-down menu, choose New repository. For Repository name, enter a short, memorable name for your repository.
(Optional) Add a description of your repository.
Choose either the option Public (the repository is accessible to everyone on the internet) or Private (the repository is accessible only to people access is explicitly shared with).
Choose Initialize this repository with: Add a README file.
Choose Create repository. Make a note of the repository’s name.

Step 2: Clone the repository locally

Now that the repository has been created, clone it locally and add a CloudFormation template.

To clone the repository locally and add a CloudFormation template:

Open the command-line tool of your choice.
Use the following command to clone the new repository locally. Make sure to replace <GitHubOrg> and <RepositoryName> with your own values.
```
git clone [email protected]:<GitHubOrg>/<RepositoryName>.git
```
Change in to the directory that contains the locally-cloned repository.
```
cd <RepositoryName>
```
Now that the repository is locally cloned, populate the locally-cloned repository with the following sample CloudFormation template. This template creates a single IAM role that allows a principal to assume the role to perform the S3:GetObject action.

Use the following command to create the sample CloudFormation template file.

WARNING: This sample role and policy should not be used in production. Using a wildcard in the principal element of a role’s trust policy would allow any IAM principal in any account to assume the role.

cat << EOF > sample-role.yaml

AWSTemplateFormatVersion: "2010-09-09"
Description: Base stack to create a simple role
Resources:
  SampleIamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              AWS: "*"
            Action: ["sts:AssumeRole"]
      Path: /      
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Resource: "*"
                Effect: Allow
                Action:
                  - s3:GetObject
EOF

Notice that AssumeRolePolicyDocument refers to a trust policy that includes a wildcard value in the principal element. This means that the role could potentially be assumed by an external identity, and that’s a risk you want to know about.

Step 3: Vend temporary AWS credentials for GitHub Actions workflows

In order for the cfn-policy-validator tool that’s running in the GitHub Actions workflow to use the IAM Access Analyzer API, the GitHub Actions workflow needs a set of temporary AWS credentials. The AWS Credentials for GitHub Actions action helps address this requirement. This action implements the AWS SDK credential resolution chain and exports environment variables for other actions to use in a workflow. Environment variable exports are detected by the cfn-policy-validator tool.

AWS Credentials for GitHub Actions supports four methods for fetching credentials from AWS, but the recommended approach is to use GitHub’s OpenID Connect (OIDC) provider in conjunction with a configured IAM identity provider endpoint.

To configure an IAM identity provider endpoint for use in conjunction with GitHub’s OIDC provider:

Open the AWS Management Console and navigate to IAM.
In the left-hand menu, choose Identity providers, and then choose Add provider.
For Provider type, choose OpenID Connect.
For Provider URL, enter
https://token.actions.githubusercontent.com
Choose Get thumbprint.
For Audiences, enter sts.amazonaws.com
Choose Add provider to complete the setup.

At this point, make a note of the OIDC provider name. You’ll need this information in the next step.

After it’s configured, the IAM identity provider endpoint should look similar to the following:

Figure 1: IAM Identity provider details

Step 4: Create an IAM role with permissions to call the IAM Access Analyzer API

In this step, you will create an IAM role that can be assumed by the GitHub Actions workflow and that provides the necessary permissions to run the cfn-policy-validator tool.

To create the IAM role:

In the IAM console, in the left-hand menu, choose Roles, and then choose Create role.
For Trust entity type, choose Web identity.
In the Provider list, choose the new GitHub OIDC provider that you created in the earlier step. For Audience, select sts.amazonaws.com from the list.
Choose Next.
On the Add permission page, choose Create policy.

Choose JSON, and enter the following policy:


    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
              "iam:GetPolicy",
              "iam:GetPolicyVersion",
              "access-analyzer:ListAnalyzers",
              "access-analyzer:ValidatePolicy",
              "access-analyzer:CreateAccessPreview",
              "access-analyzer:GetAccessPreview",
              "access-analyzer:ListAccessPreviewFindings",
              "access-analyzer:CreateAnalyzer",
              "s3:ListAllMyBuckets",
              "cloudformation:ListExports",
              "ssm:GetParameter"
            ],
            "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "iam:CreateServiceLinkedRole",
          "Resource": "*",
          "Condition": {
            "StringEquals": {
              "iam:AWSServiceName": "access-analyzer.amazonaws.com"
            }
          }
        } 
    ]
}

After you’ve attached the new policy, choose Next.

Note: For a full explanation of each of these actions and a CloudFormation template example that you can use to create this role, see the IAM Policy Validator for AWS CloudFormation GitHub project.
Give the role a name, and scroll down to look at Step 1: Select trusted entities.
The default policy you just created allows GitHub Actions from organizations or repositories outside of your control to assume the role. To align with the IAM best practice of granting least privilege, let’s scope it down further to only allow a specific GitHub organization and the repository that you created earlier to assume it.

Replace the policy to look like the following, but don’t forget to replace {AWSAccountID}, {GitHubOrg} and {RepositoryName} with your own values.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::{AWSAccountID}:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                },
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": "repo:${GitHubOrg}/${RepositoryName}:*"
                }
            }
        }
    ]
}

For information on best practices for configuring a role for the GitHub OIDC provider, see Creating a role for web identity or OpenID Connect Federation (console).

Checkpoint

At this point, you’ve created and configured the following resources:

A GitHub repository that has been locally cloned and filled with a sample CloudFormation template.
An IAM identity provider endpoint for use in conjunction with GitHub’s OIDC provider.
A role that can be assumed by GitHub actions, and a set of associated permissions that allow the role to make requests to IAM Access Analyzer to validate policies.

Step 5: Create a definition for the GitHub Actions workflow

The workflow runs steps on hosted runners. For this example, we are going to use Ubuntu as the operating system for the hosted runners. The workflow runs the following steps on the runner:

The workflow checks out the CloudFormation template by using the community actions/checkout action.
The workflow then uses the aws-actions/configure-aws-credentials GitHub action to request a set of credentials through the IAM identity provider endpoint and the IAM role that you created earlier.
The workflow installs the cfn-policy-validator tool by using the python package manager, PIP.
The workflow runs a validation against the CloudFormation template by using the cfn-policy-validator tool.

The workflow is defined in a YAML document. In order for GitHub Actions to pick up the workflow, you need to place the definition file in a specific location within the repository: .github/workflows/main.yml. Note the “.” prefix in the directory name, indicating that this is a hidden directory.

To create the workflow:

Use the following command to create the folder structure within the locally cloned repository:
```
mkdir -p .github/workflows
```

Create the sample workflow definition file in the .github/workflows directory. Make sure to replace <AWSAccountID> and <AWSRegion> with your own information.

cat << EOF > .github/workflows/main.yml
name: cfn-policy-validator-workflow

on: push

permissions:
  id-token: write
  contents: read

jobs: 
  cfn-iam-policy-validation: 
    name: iam-policy-validation
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::<AWSAccountID>:role/github-actions-access-analyzer-role
          aws-region: <AWSRegion>
          role-session-name: GitHubSessionName
        
      - name: Install cfn-policy-validator
        run: pip install cfn-policy-validator

      - name: Validate templates
        run: cfn-policy-validator validate --template-path ./sample-role-test.yaml --region <AWSRegion>
EOF

Step 6: Test the setup

Now that everything has been set up and configured, it’s time to test.

To test the workflow and validate the IAM policy:

Add and commit the changes to the local repository.

git add .
git commit -m ‘added sample cloudformation template and workflow definition’

Push the local changes to the remote GitHub repository.
```
git push
```
After the changes are pushed to the remote repository, go back to https://github.com and open the repository that you created earlier. In the top-right corner of the repository window, there is a small orange indicator, as shown in Figure 2. This shows that your GitHub Actions workflow is running.

Figure 2: GitHub repository window with the orange workflow indicator

Because the sample CloudFormation template used a wildcard value “*” in the principal element of the policy as described in the section Step 2: Clone the repository locally, the orange indicator turns to a red x (shown in Figure 3), which signals that something failed in the workflow.

Figure 3: GitHub repository window with the red cross workflow indicator
Choose the red x to see more information about the workflow’s status, as shown in Figure 4.

Figure 4: Pop-up displayed after choosing the workflow indicator
Choose Details to review the workflow logs.
In this example, the Validate templates step in the workflow has failed. A closer inspection shows that there is a blocking finding with the CloudFormation template. As shown in Figure 5, the finding is labelled as EXTERNAL_PRINCIPAL and has a description of Trust policy allows access from external principals.

Figure 5: Details logs from the workflow showing the blocking finding

To remediate this blocking finding, you need to update the principal element of the trust policy to include a principal from your AWS account (considered a zone of trust). The resources and principals within your account comprises of the zone of trust for the cfn-policy-validator tool. In the initial version of sample-role.yaml, the IAM roles trust policy used a wildcard in the Principal element. This allowed principals outside of your control to assume the associated role, which caused the cfn-policy-validator tool to generate a blocking finding.

In this case, the intent is that principals within the current AWS account (zone of trust) should be able to assume this role. To achieve this result, replace the wildcard value with the account principal by following the remaining steps.

Open sample-role.yaml by using your preferred text editor, such as nano.

nano sample-role.yaml

Replace the wildcard value in the principal element with the account principal arn:aws:iam::<AccountID>:root. Make sure to replace <AWSAccountID> with your own AWS account ID.

AWSTemplateFormatVersion: "2010-09-09"
Description: Base stack to create a simple role
Resources:
  SampleIamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              AWS: "arn:aws:iam::<AccountID>:root"
            Action: ["sts:AssumeRole"]
      Path: /      
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Resource: "*"
                Effect: Allow
                Action:
                  - s3:GetObject

Add the updated file, commit the changes, and push the updates to the remote GitHub repository.

git add sample-role.yaml
git commit -m ‘replacing wildcard principal with account principal’
git push

After the changes have been pushed to the remote repository, go back to https://github.com and open the repository. The orange indicator in the top right of the window should change to a green tick (check mark), as shown in Figure 6.

Figure 6: GitHub repository window with the green tick workflow indicator

This indicates that no blocking findings were identified, as shown in Figure 7.

Figure 7: Detailed logs from the workflow showing no more blocking findings

Conclusion

In this post, I showed you how to automate IAM policy validation by using GitHub Actions and the IAM Policy Validator for CloudFormation. Although the example was a simple one, it demonstrates the benefits of automating security testing at the start of the development lifecycle. This is often referred to as shifting security left. Identifying misconfigurations early and automatically supports an iterative, fail-fast model of continuous development and testing. Ultimately, this enables teams to make security an inherent part of a system’s design and architecture and can speed up product development workflows.

In addition to the example I covered today, IAM Policy Validator for CloudFormation can validate IAM policies by using a range of IAM Access Analyzer policy checks. For more information about these policy checks, see Access Analyzer reference policy checks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Generate machine learning insights for Amazon Security Lake data using Amazon SageMaker

2023-08-29 Jonathan Nguyen

Post Syndicated from Jonathan Nguyen original https://aws.amazon.com/blogs/security/generate-machine-learning-insights-for-amazon-security-lake-data-using-amazon-sagemaker/

Amazon Security Lake automatically centralizes the collection of security-related logs and events from integrated AWS and third-party services. With the increasing amount of security data available, it can be challenging knowing what data to focus on and which tools to use. You can use native AWS services such as Amazon QuickSight, Amazon OpenSearch, and Amazon SageMaker Studio to visualize, analyze, and interactively identify different areas of interest to focus on, and prioritize efforts to increase your AWS security posture.

In this post, we go over how to generate machine learning insights for Security Lake using SageMaker Studio. SageMaker Studio is a web integrated development environment (IDE) for machine learning that provides tools for data scientists to prepare, build, train, and deploy machine learning models. With this solution, you can quickly deploy a base set of Python notebooks focusing on AWS Security Hub findings in Security Lake, which can also be expanded to incorporate other AWS sources or custom data sources in Security Lake. After you’ve run the notebooks, you can use the results to help you identify and focus on areas of interest related to security within your AWS environment. As a result, you might implement additional guardrails or create custom detectors to alert on suspicious activity.

Prerequisites

Specify a delegated administrator account to manage the Security Lake configuration for all member accounts within your organization.
Security Lake has been enabled in the delegated administrator AWS account.
As part of the solution in this post, we focus on Security Hub as a data source. AWS Security Hub must be enabled for your AWS Organizations. When enabling Security Lake, select All log and event sources to include AWS Security Hub findings.
Configure subscriber query access to Security Lake. Security Lake uses AWS Lake Formation cross-account table sharing to support subscriber query access. Accept the resource share request in the subscriber AWS account in AWS Resource Access Manager (AWS RAM). Subscribers with query access can query the data that Security Lake collects. These subscribers query Lake Formation tables in an Amazon Simple Storage Service (Amazon S3) bucket with Security Lake data using services such as Amazon Athena.

Solution overview

Figure 1 that follows depicts the architecture of the solution.

Figure 1 SageMaker machine learning insights architecture for Security Lake

The deployment builds the architecture by completing the following steps:

A Security Lake is set up in an AWS account with supported log sources — such as Amazon VPC Flow Logs, AWS Security Hub, AWS CloudTrail, and Amazon Route53 — configured.
Subscriber query access is created from the Security Lake AWS account to a subscriber AWS account.

Note: See Prerequisite #4 for more information.
The AWS RAM resource share request must be accepted in the subscriber AWS account where this solution is deployed.

Note: See Prerequisite #4 for more information.
A resource link database in Lake Formation is created in the subscriber AWS account and grants access for the Athena tables in the Security Lake AWS account.
VPC is provisioned for SageMaker with IGW, NAT GW, and VPC endpoints for the AWS services used in the solution. IGW and NAT are required to install external open-source packages.
A SageMaker Domain for SageMaker Studio is created in VPCOnly mode with a single SageMaker user profile that is tied to a dedicated AWS Identity and Access Management (IAM) role.
A dedicated IAM role is created to restrict access to create and access the presigned URL for the SageMaker Domain from a specific CIDR for accessing the SageMaker notebook.
An AWS CodeCommit repository containing Python notebooks is used for the AI and ML workflow by the SageMaker user-profile.
An Athena workgroup is created for the Security Lake queries with an S3 bucket for output location (access logging configured for the output bucket).

Deploy the solution

You can deploy the SageMaker solution by using either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

Option 1: Deploy the solution with AWS CloudFormation using the console

Use the console to sign in to your subscriber AWS account and then choose the Launch Stack button to open the AWS CloudFormation console pre-loaded with the template for this solution. It takes approximately 10 minutes for the CloudFormation stack to complete.

Option 2: Deploy the solution by using the AWS CDK

You can find the latest code for the SageMaker solution in the SageMaker machine learning insights GitHub repository, where you can also contribute to the sample code. For instructions and more information on using the AWS CDK, see Get Started with AWS CDK.

To deploy the solution by using the AWS CDK

To build the app when navigating to the project’s root folder, use the following commands:
```
npm install -g aws-cdk-lib
npm install
```
Update IAM_role_assumption_for_sagemaker_presigned_url and security_lake_aws_account default values in source/lib/sagemaker_domain.ts with their respective appropriate values.
Run the following commands in your terminal while authenticated in your subscriber AWS account. Be sure to replace <INSERT_AWS_ACCOUNT> with your account number and replace <INSERT_REGION> with the AWS Region that you want the solution deployed to.
```
cdk bootstrap aws://<INSERT_AWS_ACCOUNT>/<INSERT_REGION>
cdk deploy
```

Post deployment steps

Now that you’ve deployed the SageMaker solution, you must grant the SageMaker user profile in the subscriber AWS account query access to your Security Lake. You can Grant permission for the SageMaker user profile to Security Lake in Lake Formation in the subscriber AWS account.

Grant permission to the Security Lake database

Copy the SageMaker user-profile Amazon resource name (ARN) arn:aws:iam::<account-id>:role/sagemaker-user-profile-for-security-lake
Go to Lake Formation in the console.
Select the amazon_security_lake_glue_db_us_east_1 database.
From the Actions Dropdown, select Grant.
In Grant Data Permissions, select SAML Users and Groups.
Paste the SageMaker user profile ARN from Step 1.
In Database Permissions, select Describe and then Grant.

Grant permission to Security Lake – Security Hub table

Copy the SageMaker user-profile ARN arn:aws:iam:<account-id>:role/sagemaker-user-profile-for-security-lake
Go to Lake Formation in the console.
Select the amazon_security_lake_glue_db_us_east_1 database.
Choose View Tables.
Select the amazon_security_lake_table_us_east_1_sh_findings_1_0 table.
From Actions Dropdown, select Grant.
In Grant Data Permissions, select SAML Users and Groups.
Paste the SageMaker user-profile ARN from Step 1.
In Table Permissions, select Describe and then Grant.

Launch your SageMaker Studio application

Now that you have granted permissions for a SageMaker user-profile, we can move on to launching the SageMaker application associated to that user-profile.

Navigate to the SageMaker Studio domain in the console.
Select the SageMaker domain security-lake-ml-insights-<account-id>.
Select the SageMaker user profile sagemaker-user-profile-for-security-lake.
Select the Launch drop-down and select Studio

Figure 2: SageMaker domain user-profile AWS console screen

Clone Python notebooks

You’ll work primarily in the SageMaker user profile to create a data-science app to work in. As part of the solution deployment, we’ve created Python notebooks in CodeCommit that you will need to clone.

To clone the Python notebooks

Navigate to CloudFormation in the console.
In the Stacks section, select the SageMakerDomainStack.
Select to the Outputs tab/
Copy the value for sagemakernotebookmlinsightsrepositoryURL. (For example: https://git-codecommit.us-east-1.amazonaws.com/v1/repos/sagemaker_ml_insights_repo)
Go back to your SageMaker app.
In Studio, in the left sidebar, choose the Git icon (identified by a diamond with two branches), then choose Clone a Repository.

Figure 3: SageMaker clone CodeCommit repository
Paste the CodeCommit repository link from Step 4 under the Git repository URL (git). After you paste the URL, select Clone “https://git-codecommit.us-east-1.amazonaws.com/v1/repos/sagemaker_ml_insights_repo”, then select Clone.

NOTE: If you don’t select from the auto-populated drop-down, SageMaker won’t be able to clone the repository.

Figure 4: SageMaker clone CodeCommit URL

Generating machine learning insights using SageMaker Studio

You’ve successfully pulled the base set of Python notebooks into your SageMaker app and they can be accessed at sagemaker_ml_insights_repo/notebooks/tsat/. The notebooks provide you with a starting point for running machine learning analysis using Security Lake data. These notebooks can be expanded to existing native or custom data sources being sent to Security Lake.

Figure 5: SageMaker cloned Python notebooks

Notebook #1 – Environment setup

The 0.0-tsat-environ-setup notebook handles the installation of the required libraries and dependencies needed for the subsequent notebooks within this blog. For our notebooks, we use an open-source Python library called Kats, which is a lightweight, generalizable framework to perform time series analysis.

Select the 0.0-tsat-environ-setup.ipynb notebook for the environment setup.

Note: If you have already provisioned a kernel, you can skip steps 2 and 3.
In the right-hand corner, select No Kernel
In the Set up notebook environment pop-up, leave the defaults and choose Select.

Figure 6: SageMaker application environment settings
After the kernel has successfully started, choose the Terminal icon to open the image terminal.

Figure 7: SageMaker application terminal
To install open-source packages from https instead of http, you must update the sources.list file. After the terminal opens, send the following commands:
```
cd /etc/apt
sed -i 's/http:/https:/g' sources.list
```
Go back to the 0.0-tsat-environ-setup.ipynb notebook and select the Run drop-down and select Run All Cells. Alternatively, you can run each cell independently, but it’s not required. Grab a coffee! This step will take about 10 minutes.

IMPORTANT: If you complete the installation out of order or update the requirements.txt file, you might not be able to successfully install Kats and you will need to rebuild your environment by using a net-new SageMaker user profile.
After installing all the prerequisites, check the Kats version to determine if it was successfully installed.

Figure 8: Kats installation verification
Install PyAthena (Python DB API client for Amazon Athena) which is used to query your data in Security Lake.

You’ve successfully set up the SageMaker app environment! You can now load the appropriate dataset and create a time series.

Notebook #2 – Load data

The 0.1-load-data notebook establishes the Athena connection to query data in Security Lake and creates the resulting time series dataset. The time series dataset will be used for subsequent notebooks to identify trends, outliers, and change points.

Select the 0.1-load-data.ipynb notebook.
If you deployed the solution outside of us-east-1, update the con details to the appropriate Region. In this example, we’re focusing on Security Hub data within Security Lake. If you want to change the underlying data source, you can update the TABLE value.

Figure 9: SageMaker notebook load Security Lake data settings
In the Query section, there’s an Athena query to pull specific data from Security Hub, this can be expanded as needed to a subset or can include all products within Security Hub. The query below pulls Security Hub information after 01:00:00 1/1/2022 from the products listed in productname.

Figure 10: SageMaker notebook Athena query
After the values have been updated, you can create your time series dataset. For this notebook, we recommend running each cell individually instead of running all cells at once so you can get a bit more familiar with the process. Select the first cell and choose the Run icon.

Figure 11: SageMaker run Python notebook code
Follow the same process as Step 4 for the subsequent cells.

Note: If you encounter any issues with querying the table, make sure you completed the post-deployment step for Grant permission to Security Lake – Security Hub table.

You’ve successfully loaded your data and created a timeseries! You can now move on to generating machine learning insights from your timeseries.

Notebook #3 – Trend detector

The 1.1-trend-detector.ipynb notebook handles trend detection in your data. Trend represents a directional change in the level of a time series. This directional change can be either upward (increase in level) or downward (decrease in level). Trend detection helps detect a change while ignoring the noise from natural variability. Each environment is different, and trends help us identify where to look more closely to determine why a trend is positive or negative.

Select 1.1-trend-detector.ipynb notebook for trend detection.
Slopes are created to identify the relationship between x (time) and y (counts).

Figure 12: SageMaker notebook slope view
If the counts are increasing with time, then it’s considered a positive slope and the reverse is considered a negative slope. A positive slope isn’t necessarily a good thing because in an ideal state we would expect counts of a finding type to come down with time.

Figure 13: SageMaker notebook trend view
Now you can plot the top five positive and negative trends to identify the top movers.

Figure 14: SageMaker notebook trend results view

Notebook #4 – Outlier detection

The 1.2-outlier-detection.ipynb notebook handles outlier detection. This notebook does a seasonal decomposition of the input time series, with additive or multiplicative decomposition as specified (default is additive). It uses a residual time series by either removing only trend or both trend and seasonality if the seasonality is strong. The intent is to discover useful, abnormal, and irregular patterns within data sets, allowing you to pinpoint areas of interest.

To start, it detects points in the residual that are over 5 times the inter-quartile range.
Inter-quartile range (IQR) is the difference between the seventy-fifth and twenty-fifth percentiles of residuals or the spread of data within the middle two quartiles of the entire dataset. IQR is useful in detecting the presence of outliers by looking at values that might lie outside of the middle two quartiles.
The IQR multiplier controls the sensitivity of the range and decision of identifying outliers. By using a larger value for the iqr_mult_thresh parameter in OutlierDetector, outliers would be considered data points, while a smaller value would identify data points as outliers.

Note: If you don’t have enough data, decrease iqr_mult_thresh to a lower value (for example iqr_mult_thresh=3).

Figure 15: SageMaker notebook outlier setting
Along with outlier detection plots, investigation SQL will be displayed as well, which can help with further investigation of the outliers.
In the diagram that follows, you can see that there are several outliers in the number of findings, related to failed AWS Firewall Manager policies, which have been identified by the vertical red lines within the line graph. These are outliers because they deviate from the normal behavior and number of findings on a day-to-day basis. When you see outliers, you can look at the resources that might have caused an unusual increase in Firewall Manager policy findings. Depending on the findings, it could be related to an overly permissive or noncompliant security group or a misconfigured AWS WAF rule group.

Figure 16: SageMaker notebook outlier results view

Notebook #5 – Change point detection

The 1.3-changepoint-detector.ipynb notebook handles the change point detection. Change point detection is a method to detect changes in a time series that persist over time, such as a change in the mean value. To detect a baseline to identify when several changes might have occurred from that point. Change points occur when there’s an increase or decrease to the average number of findings within a data set.

Along with identifying change points within the data set, the investigation SQL is generated to further investigate the specific change point if applicable.
In the following diagram, you can see there’s a change point decrease after July 27, 2022, with confidence of 99.9 percent. It’s important to note that change points differ from outliers, which are sudden changes in the data set observed. This diagram means there was some change in the environment that resulted in an overall decrease in the number of findings for S3 buckets with block public access being disabled. The change could be the result of an update to the CI/CD pipelines provisioning S3 buckets or automation to enable all S3 buckets to block public access. Conversely, if you saw a change point that resulted in an increase, it could mean that there was a change that resulted in a larger number of S3 buckets with a block public access configuration consistently being disabled.

Figure 17: SageMaker changepoint detector view

By now, you should be familiar with the set up and deployment for SageMaker Studio and how you can use Python notebooks to generate machine learning insights for your Security Lake data. You can take what you’ve learned and start to curate specific datasets and data sources within Security Lake, create a time series, detect trends, and identify outliers and change points. By doing so, you can answer a variety of security-related questions such as:

CloudTrail
Is there a large volume of Amazon S3 download or copy commands to an external resource? Are you seeing a large volume of S3 delete object commands? Is it possible there’s a ransomware event going on?
VPC Flow Logs
Is there an increase in the number of requests from your VPC to external IPs? Is there an increase in the number of requests from your VPC to your on-premises CIDR? Is there a possibility of internal or external data exfiltration occurring?
Route53
Which resources are making DNS requests that they haven’t typically made within the last 30–45 days? When did it start? Is there a potential command and control session occurring on an Amazon Elastic Compute Cloud (Amazon EC2) instance?

It’s important to note that this isn’t a solution to replace Amazon GuardDuty, which uses foundational data sources to detect communication with known malicious domains and IP addresses and identify anomalous behavior, or Amazon Detective, which provides customers with prebuilt data aggregations, summaries, and visualizations to help security teams conduct faster and more effective investigations. One of the main benefits of using Security Lake and SageMaker Studio is the ability to interactively create and tailor machine learning insights specific to your AWS environment and workloads.

Clean up

If you deployed the SageMaker machine learning insights solution by using the Launch Stack button in the AWS Management Console or the CloudFormation template sagemaker_ml_insights_cfn, do the following to clean up:

In the CloudFormation console for the account and Region where you deployed the solution, choose the SageMakerML stack.
Choose the option to Delete the stack.

If you deployed the solution by using the AWS CDK, run the command cdk destroy.

Conclusion

Amazon Security Lake gives you the ability to normalize and centrally store your security data from various log sources to help you analyze, visualize, and correlate appropriate security logs. You can then use this data to increase your overall security posture by implementing additional security guardrails or take appropriate remediation actions within your AWS environment.

In this blog post, you learned how you can use SageMaker to generate machine learning insights for your Security Hub findings in Security Lake. Although the example solution focuses on a single data source within Security Lake, you can expand the notebooks to incorporate other native or custom data sources in Security Lake.

There are many different use-cases for Security Lake that can be tailored to fit your AWS environment. Take a look at this blog post to learn how you can ingest, transform and deliver Security Lake data to Amazon OpenSearch to help your security operations team quickly analyze security data within your AWS environment. In supported Regions, new Security Lake account holders can try the service free for 15 days and gain access to its features.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Improve your security investigations with Detective finding groups visualizations

2023-08-29 Rich Vorwaller

Post Syndicated from Rich Vorwaller original https://aws.amazon.com/blogs/security/improve-your-security-investigations-with-detective-finding-groups-visualizations/

At AWS, we often hear from customers that they want expanded security coverage for the multiple services that they use on AWS. However, alert fatigue is a common challenge that customers face as we introduce new security protections. The challenge becomes how to operationalize, identify, and prioritize alerts that represent real risk.

In this post, we highlight recent enhancements to Amazon Detective finding groups visualizations. We show you how Detective automatically consolidates multiple security findings into a single security event—called finding groups—and how finding group visualizations help reduce noise and prioritize findings that present true risk. We incorporate additional services like Amazon GuardDuty, Amazon Inspector, and AWS Security Hub to highlight how effective findings groups is at consolidating findings for different AWS security services.

Overview of solution

This post uses several different services. The purpose is twofold: to show how you can enable these services for broader protection, and to show how Detective can help you investigate findings from multiple services without spending a lot of time sifting through logs or querying multiple data sources to find the root cause of a security event. These are the services and their use cases:

GuardDuty – a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity. If potential malicious activity, such as anomalous behavior, credential exfiltration, or command and control (C2) infrastructure communication is detected, GuardDuty generates detailed security findings that you can use for visibility and remediation. Recently, GuardDuty released the following threat detections for specific services that we’ll show you how to enable for this walkthrough: GuardDuty RDS Protection, EKS Runtime Monitoring, and Lambda Protection.
Amazon Inspector – an automated vulnerability management service that continually scans your AWS workloads for software vulnerabilities and unintended network exposure. Like GuardDuty, Amazon Inspector sends a finding for alerting and remediation when it detects a software vulnerability or a compute instance that’s publicly available.
Security Hub – a cloud security posture management service that performs automated, continuous security best practice checks against your AWS resources to help you identify misconfigurations, and aggregates your security findings from integrated AWS security services.
Detective – a security service that helps you investigate potential security issues. It does this by collecting log data from AWS CloudTrail, Amazon Virtual Private Cloud (Amazon VPC) flow logs, and other services. Detective then uses machine learning, statistical analysis, and graph theory to build a linked set of data called a security behavior graph that you can use to conduct faster and more efficient security investigations.

The following diagram shows how each service delivers findings along with log sources to Detective.

Figure 1: Amazon Detective log source diagram

Enable the required services

If you’ve already enabled the services needed for this post—GuardDuty, Amazon Inspector, Security Hub, and Detective—skip to the next section. For instructions on how to enable these services, see the following resources:

Each of these services offers a free 30-day trial and provides estimates on charges after your trial expires. You can also use the AWS Pricing Calculator to get an estimate.

To enable the services across multiple accounts, consider using a delegated administrator account in AWS Organizations. With a delegated administrator account, you can automatically enable services for multiple accounts and manage settings for each account in your organization. You can view other accounts in the organization and add them as member accounts, making central management simpler. For instructions on how to enable the services with AWS Organizations, see the following resources:

Enable GuardDuty protections

The next step is to enable the latest detections in GuardDuty and learn how Detective can identify multiple threats that are related to a single security event.

If you’ve already enabled the different GuardDuty protection plans, skip to the next section. If you recently enabled GuardDuty, the protections plans are enabled by default, except for EKS Runtime Monitoring, which is a two-step process.

For the next steps, we use the delegated administrator account in GuardDuty to make sure that the protection plans are enabled for each AWS account. When you use GuardDuty (or Security Hub, Detective, and Inspector) with AWS Organizations, you can designate an account to be the delegated administrator. This is helpful so that you can configure these security services for multiple accounts at the same time. For instructions on how to enable a delegated administrator account for GuardDuty, see Managing GuardDuty accounts with AWS Organizations.

To enable EKS Protection

Sign in to the GuardDuty console using the delegated administrator account, choose Protection plans, and then choose EKS Protection.
In the Delegated administrator section, choose Edit and then choose Enable for each scope or protection. For this post, select EKS Audit Log Monitoring, EKS Runtime Monitoring, and Manage agent automatically, as shown in Figure 2. For more information on each feature, see the following resources:
- EKS Audit Log Monitoring
- EKS Runtime Monitoring
To enable these protections for current accounts, in the Active member accounts section, choose Edit and Enable for each scope of protection.
To enable these protections for new accounts, in the New account default configuration section, choose Edit and Enable for each scope of protection.

To enable RDS Protection

The next step is to enable RDS Protection. GuardDuty RDS Protection works by analysing RDS login activity for potential threats to your Amazon Aurora databases (MySQL-Compatible Edition and Aurora PostgreSQL-Compatible Editions). Using this feature, you can identify potentially suspicious login behavior and then use Detective to investigate CloudTrail logs, VPC flow logs, and other useful information around those events.

Navigate to the RDS Protection menu and under Delegated administrator (this account), select Enable and Confirm.
In the Enabled for section, select Enable all if you want RDS Protection enabled on all of your accounts. If you want to select a specific account, choose Manage Accounts and then select the accounts for which you want to enable RDS Protection. With the accounts selected, choose Edit Protection Plans, RDS Login Activity, and Enable for X selected account.
(Optional) For new accounts, turn on Auto-enable RDS Login Activity Monitoring for new member accounts as they join your organization.

Figure 2: Enable EKS Runtime Monitoring

To enable Lambda Protection

The final step is to enable Lambda Protection. Lambda Protection helps detect potential security threats during the invocation of AWS Lambda functions. By monitoring network activity logs, GuardDuty can generate findings when Lambda functions are involved with malicious activity, such as communicating with command and control servers.

Navigate to the Lambda Protection menu and under Delegated administrator (this account), select Enable and Confirm.
In the Enabled for section, select Enable all if you want Lambda Protection enabled on all of your accounts. If you want to select a specific account, choose Manage Accounts and select the accounts for which you want to enable RDS Protection. With the accounts selected, choose Edit Protection Plans, Lambda Network Activity Monitoring, and Enable for X selected account.
(Optional) For new accounts, turn on Auto-enable Lambda Network Activity Monitoring for new member accounts as they join your organization.

Figure 4: Enable Lambda Network Activity Monitoring

Now that you’ve enabled these new protections, GuardDuty will start monitoring EKS audit logs, EKS runtime activity, RDS login activity, and Lambda network activity. If GuardDuty detects suspicious or malicious activity for these log sources or services, it will generate a finding for the activity, which you can review in the GuardDuty console. In addition, you can automatically forward these findings to Security Hub for consolidation, and to Detective for security investigation.

Detective data sources

If you have Security Hub and other AWS security services such as GuardDuty or Amazon Inspector enabled, findings from these services are forwarded to Security Hub. With the exception of sensitive data findings from Amazon Macie, you’re automatically opted in to other AWS service integrations when you enable Security Hub. For the full list of services that forward findings to Security Hub, see Available AWS service integrations.

With each service enabled and forwarding findings to Security Hub, the next step is to enable the data source in Detective called AWS security findings, which are the findings forwarded to Security Hub. Again, we’re going to use the delegated administrator account for these steps to make sure that AWS security findings are being ingested for your accounts.

To enable AWS security findings

Sign in to the Detective console using the delegated administrator account and navigate to Settings and then General.
Choose Optional source packages, Edit, select AWS security findings, and then choose Save.

Figure 5: Enable AWS security findings

When you enable Detective, it immediately starts creating a security behavior graph for AWS security findings to build a linked dataset between findings and entities, such as RDS login activity from Aurora databases, EKS runtime activity, and suspicious network activity for Lambda functions. For GuardDuty to detect potential threats that affect your database instances, it first needs to undertake a learning period of up to two weeks to establish a baseline of normal behavior. For more information, see How RDS Protection uses RDS login activity monitoring. For the other protections, after suspicious activity is detected, you can start to see findings in both GuardDuty and Security Hub consoles. This is where you can start using Detective to better understand which findings are connected and where to prioritize your investigations.

Detective behavior graph

As Detective ingests data from GuardDuty, Amazon Inspector, and Security Hub, as well as CloudTrail logs, VPC flow logs, and Amazon Elastic Kubernetes Service (Amazon EKS) audit logs, it builds a behavior graph database. Graph databases are purpose-built to store and navigate relationships. Relationships are first-class citizens in graph databases, which means that they’re not computed out-of-band or by interfering with relationships through querying foreign keys. Because Detective stores information on relationships in your graph database, you can effectively answer questions such as “are these security findings related?”. In Detective, you can use the search menu and profile panels to view these connections, but a quicker way to see this information is by using finding groups visualizations.

Finding groups visualizations

Finding groups extract additional information out of the behavior graph to highlight findings that are highly connected. Detective does this by running several machine learning algorithms across your behavior graph to identify related findings and then statically weighs the relationships between those findings and entities. The result is a finding group that shows GuardDuty and Amazon Inspector findings that are connected, along with entities like Amazon Elastic Compute Cloud (Amazon EC2) instances, AWS accounts, and AWS Identity and Access Management (IAM) roles and sessions that were impacted by these findings. With finding groups, you can more quickly understand the relationships between multiple findings and their causes because you don’t need to connect the dots on your own. Detective automatically does this and presents a visualization so that you can see the relationships between various entities and findings.

Enhanced visualizations

Recently, we released several enhancements to finding groups visualizations to aid your understanding of security connections and root causes. These enhancements include:

Dynamic legend – the legend now shows icons for entities that you have in the finding group instead of showing all available entities. This helps reduce noise to only those entities that are relevant to your investigation.
Aggregated evidence and finding icons – these icons provide a count of similar evidence and findings. Instead of seeing the same finding or evidence repeated multiple times, you’ll see one icon with a counter to help reduce noise.
More descriptive side panel information – when you choose a finding or entity, the side panel shows additional information, such as the service that identified the finding and the finding title, in addition to the finding type, to help you understand the action that invoked the finding.
Label titles – you can now turn on or off titles for entities and findings in the visualization so that you don’t have to choose each to get a summary of what the different icons mean.

To use the finding groups visualization

Open the Detective console, choose Summary, and then choose View all finding groups.
Choose the title of an available finding group and scroll down to Visualization.
Under the Select layout menu, choose one of the layouts available, or choose and drag each icon to rearrange the layout according to how you’d like to see connections.
For a complete list of involved entities and involved findings, scroll down below the visualization.

Figure 6 shows an example of how you can use finding groups visualization to help identify the root cause of findings quickly. In this example, an IAM role was connected to newly observed geolocations, multiple GuardDuty findings detected malicious API calls, and there were newly observed user agents from the IAM session. The visualization can give you high confidence that the IAM role is compromised. It also provides other entities that you can search against, such as the IP address, S3 bucket, or new user agents.

Figure 6: Finding groups visualization

Now that you have the new GuardDuty protections enabled along with the data source of AWS security findings, you can use finding groups to more quickly visualize which IAM sessions have had multiple findings associated with unauthorized access, or which EC2 instances are publicly exposed with a software vulnerability and active GuardDuty finding—these patterns can help you determine if there is an actual risk.

Conclusion

In this blog post, you learned how to enable new GuardDuty protections and use Detective, finding groups, and visualizations to better identify, operationalize, and prioritize AWS security findings that represent real risk. We also highlighted the new enhancements to visualizations that can help reduce noise and provide summaries of detailed information to help reduce the time it takes to triage findings. If you’d like to see an investigation scenario using Detective, watch the video Amazon Detective Security Scenario Investigation.

If you have feedback about this post, submit comments in the Comments section below. You can also start a new thread on Amazon Detective re:Post or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

AWS Digital Sovereignty Pledge: Announcing new dedicated infrastructure options

2023-08-23 Matt Garman

Post Syndicated from Matt Garman original https://aws.amazon.com/blogs/security/aws-digital-sovereignty-pledge-announcing-new-dedicated-infrastructure-options/

At AWS, we’re committed to helping our customers meet digital sovereignty requirements. Last year, I announced the AWS Digital Sovereignty Pledge, our commitment to offering all AWS customers the most advanced set of sovereignty controls and features available in the cloud. Our approach is to continue to make AWS sovereign-by-design—as it has been from day one.

I promised that our pledge was just the start, and that we would continue to innovate to meet the needs of our customers. As part of our promise, we pledged to invest in an ambitious roadmap of capabilities on data residency, granular access restriction, encryption, and resilience. Today, I’d like to update you on another milestone on our journey to continue to help our customers address their sovereignty needs.

Further control over the location of your data

Customers have always controlled the location of their data with AWS. For example, in Europe, customers have the choice to deploy their data into any of eight existing AWS Regions. These AWS Regions provide the broadest set of cloud services and features, enabling our customers to run the majority of their workloads. Customers can also use AWS Local Zones, a type of infrastructure deployment that makes AWS services available in more places, to help meet latency and data residency requirements, without having to deploy self-managed infrastructure. Customers who must comply with data residency regulations can choose to run their workloads in specific geographic locations where AWS Regions and Local Zones are available.

Announcing AWS Dedicated Local Zones

Our public sector and regulated industry customers have told us they want dedicated infrastructure for their most critical workloads to help meet regulatory or other compliance requirements. Many of these customers manage their own infrastructure on premises for workloads that require isolation. This forgoes the performance, innovation, elasticity, scalability, and resiliency benefits of the cloud.

To help our customers address these needs, I’m excited to announce AWS Dedicated Local Zones. Dedicated Local Zones are a type of AWS infrastructure that is fully managed by AWS, built for exclusive use by a customer or community, and placed in a customer-specified location or data center to help comply with regulatory requirements. Dedicated Local Zones can be operated by local AWS personnel and offer the same benefits of Local Zones, such as elasticity, scalability, and pay-as-you-go pricing, with added security and governance features. These features include data access monitoring and audit programs, controls to limit infrastructure access to customer-selected AWS accounts, and options to enforce security clearance or other criteria on local AWS operating personnel. With Dedicated Local Zones, we work with customers to configure their own Local Zones with the services and capabilities they need to meet their regulatory requirements.

AWS Dedicated Local Zones meet the same high AWS security standards that apply to AWS Regions and Local Zones. They also come with the same AWS Nitro System that powers all modern Amazon Elastic Compute Cloud (Amazon EC2) instances to help ensure confidentiality and integrity of customer data. With Dedicated Local Zones, customers can use the multitenancy features of the cloud to efficiently enable adoption across multiple AWS accounts created by a customer’s community of agencies and business units, and reduce the operational overhead of managing on-premises infrastructure. Customers can deploy multiple Dedicated Local Zones for resiliency and simplify their applications’ architecture by using consistent AWS infrastructure, APIs, and tools across different classifications of applications running in AWS Regions and Dedicated Local Zones. AWS services, such as Amazon EC2, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Block Store (Amazon EBS), Elastic Load Balancing (ELB), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), and AWS Direct Connect, will be available in Dedicated Local Zones.

Innovating with the Singapore Government’s Smart Nation and Digital Government Group

At AWS, we work closely with customers to understand their requirements for their most critical workloads. Our work with the Singapore Government’s Smart Nation and Digital Government Group (SNDGG) to build a Smart Nation for their citizens and businesses illustrates this approach. This group spearheads Singapore’s digital government transformation and development of the public sector’s engineering capabilities. SNDGG is the first customer to deploy Dedicated Local Zones.

“AWS is a strategic partner and has been since the beginning of our cloud journey. SNDGG collaborated with AWS to define and build Dedicated Local Zones to help us meet our stringent data isolation and security requirements, enabling Singapore to run more sensitive workloads in the cloud securely,” said Chan Cheow Hoe, Government Chief Digital Technology Officer of Singapore. “In addition to helping the Singapore government meet its cybersecurity requirements, the Dedicated Local Zones enable us to offer its agencies a seamless and consistent cloud experience.”

Our commitments to our customers

We remain committed to helping our customers meet evolving sovereignty requirements. We continue to innovate sovereignty features, controls, and assurances globally with AWS, without compromising on the full power of AWS.

To get started, you can visit the Dedicated Local Zones website, where you can contact AWS specialists to learn more about how Dedicated Local Zones can be configured to meet your regulatory needs.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

How we designed Cedar to be intuitive to use, fast, and safe

2023-08-21 Emina Torlak

Post Syndicated from Emina Torlak original https://aws.amazon.com/blogs/security/how-we-designed-cedar-to-be-intuitive-to-use-fast-and-safe/

This post is a deep dive into the design of Cedar, an open source language for writing and evaluating authorization policies. Using Cedar, you can control access to your application’s resources in a modular and reusable way. You write Cedar policies that express your application’s permissions, and the application uses Cedar’s authorization engine to decide which access requests to allow. This decouples access control from the application logic, letting you write, update, audit, and reuse authorization policies independently of application code.

Cedar’s authorization engine is built to a high standard of performance and correctness. Application developers report typical authorization latencies of less than 1 ms, even with hundreds of policies. The resulting authorization decision — Allow or Deny — is provably correct, thanks to the use of verification-guided development. This high standard means your application can use Cedar with confidence, just like Amazon Web Services (AWS) does as part of the Amazon Verified Permissions and AWS Verified Access services.

Cedar’s design is based on three core tenets: usability, speed, and safety. Cedar policies are intuitive to read because they’re defined using your application’s vocabulary—for example, photos organized into albums for a photo-sharing application. Cedar’s policy structure reflects common authorization use cases and enables fast evaluation. Cedar’s semantics are intuitive and safer by default: policies combine to allow or deny access according to rules you already know from AWS Identity and Access Management (IAM).

This post shows how Cedar’s authorization semantics, data model, and policy syntax work together to make the Cedar language intuitive to use, fast, and safe. We cover each of these in turn and highlight how their design reflects our tenets.

The Cedar authorization semantics: Default deny, forbid wins, no ordering

We show how Cedar works on an example application for sharing photos, called PhotoFlash, illustrated in Figure 1.

Figure 1: An example PhotoFlash account. User Jane has two photos, four albums, and three user groups

PhotoFlash lets users like Jane upload photos to the cloud, tag them, and organize them into albums. Jane can also share photos with others, for example, letting her friends view photos in her trips album. PhotoFlash provides a point-and-click interface for users to share access, and then stores the resulting permissions as Cedar policies.

When a user attempts to perform an action on a resource (for example, view a photo), PhotoFlash calls the Cedar authorization engine to determine whether access is allowed. The authorizer evaluates the stored policies against the request and application-specific data (such as a photo’s tags) and returns Allow or Deny. If it returns Allow, PhotoFlash proceeds with the action. If it returns Deny, PhotoFlash reports that the action is not permitted.

Let’s look at some policies and see how Cedar evaluates them to authorize requests safely and simply.

Default deny

To let Jane’s friends view photos in her trips album, PhotoFlash generates and stores the following Cedar permit policy:

// Policy A: Jane's friends can view photos in Jane's trips album.
permit(
  principal in Group::"jane/friends", 
  action == Action::"viewPhoto",
  resource in Album::"jane/trips");

Cedar policies define who (the principal) can do what (the action) on what asset (the resource). This policy allows the principal (a PhotoFlash User) in Jane’s friends group to view the resources (a Photo) in Jane’s trips album.

Cedar’s authorizer grants access only if a request satisfies a specific permit policy. This semantics is default deny: Requests that don’t satisfy any permit policy are denied.

Given only our example Policy A, the authorizer will allow Alice to view Jane’s flower.jpg photo. Alice’s request satisfies Policy A because Alice is one of Jane’s friends (see Figure 1). But the authorizer will deny John’s request to view this photo. That’s because John isn’t one of Jane’s friends, and there is no other permit that grants John access to Jane’s photos.

Forbid wins

While PhotoFlash allows individual users to choose their own permissions, it also enforces system-wide security rules.

For example, PhotoFlash wants to prevent users from performing actions on resources that are owned by someone else and tagged as private. If a user (Jane) accidentally permits someone else (Alice) to view a private photo (receipt.jpg), PhotoFlash wants to override the user-defined permission and deny the request.

In Cedar, such guardrails are expressed as forbid policies:

// Policy B: Users can't perform any actions on private resources they don't own.
forbid(principal, action, resource)
when {
  resource.tags.contains("private") &&
  !(resource in principal.account)
};

This PhotoFlash policy says that a principal is forbidden from taking an action on a resource when the resource is tagged as private and isn’t contained in the principal’s account.

Cedar’s authorizer makes sure that forbids override permits. If a request satisfies a forbid policy, it’s denied regardless of what permissions are satisfied.

For example, the authorizer will deny Alice’s request to view Jane’s receipt.jpg photo. This request satisfies Policy A because Alice is one of Jane’s friends. But it also satisfies the guardrail in Policy B because the photo is tagged as private. The guardrail wins, and the request is denied.

No ordering

Cedar’s authorization decisions are independent of the order the policies are evaluated in. Whether the authorizer evaluates Policy A first and then Policy B, or the other way around, doesn’t matter. As you’ll see later, the Cedar language design ensures that policies can be evaluated in any order to reach the same authorization decision. To understand the combined meaning of multiple Cedar policies, you need only remember that access is allowed if the request satisfies a permit policy and there are no applicable forbid policies.

Safe by default and intuitive

We’ve proved (using automated reasoning) that Cedar’s authorizer satisfies the default deny, forbids override permits, and order independence properties. These properties help make Cedar’s behavior safe by default and intuitive. Amazon IAM has the same properties. Cedar builds on more than a decade of IAM experience by formalizing and enforcing these properties as parts of its design.

Now that we’ve seen how Cedar authorizes requests, let’s look at how its data model and syntax support writing policies that are quick to read and evaluate.

The Cedar data model: entities with attributes, arranged in a hierarchy

Cedar policies are defined in terms of a vocabulary specific to your application. For example, PhotoFlash organizes photos into albums and users into groups while a task management application organizes tasks into lists. You reflect this vocabulary into Cedar’s data model, which organizes entities into a hierarchy. Entities correspond to objects within your application, such as photos and users. The hierarchy reflects grouping of entities, such as nesting of photos into albums. Think of it as a directed-acyclic graph. Figure 2 shows the entity hierarchy for PhotoFlash that matches Figure 1.

Figure 2: An example hierarchy for PhotoFlash, matching the illustration in Figure 1

Entities are stored objects that serve as principals, resources, and actions in Cedar policies. Policies refer to these objects using entity references, such as Album::”jane/art”.

Policies use the in operator to check if the hierarchy relates two entities. For example, Photo::”flower.jpg” in Account::”jane” is true for the hierarchy in Figure 2, but Photo::”flower.jpg” in Album::”jane/conference” is not. PhotoFlash can persist the entity hierarchy in a dedicated entity store, or compute the relevant parts as needed for an authorization request.

Each entity also has a record that maps named attributes to values. An attribute stores a Cedar value: an entity reference, record, string, 64-bit integer, boolean, or a set of values. For example, Photo::”flower.jpg” has attributes describing the photo’s metadata, such as tags, which is a set of strings, and raw, which is an entity reference to another Photo. Cedar supports a small collection of operators that can be applied to values; these operators are carefully chosen to enable efficient evaluation.

Built-in support for role and attribute-based access control

If the concepts you’ve seen so far seem familiar, that’s not surprising. Cedar’s data model is designed to allow you to implement time-tested access control models, including role-based and attribute-based access control (RBAC and ABAC). The entity hierarchy and the in operator support RBAC-style roles as groups, while entity records and the . operator let you express ABAC-style permissions using per-object attributes.

The Cedar syntax: Structured, loop-free, and stateless

Cedar uses a simple, structured syntax for writing policies. This structure makes Cedar policies simple to understand and fast to authorize at scale. Let’s see how by taking a closer look at Cedar’s syntax.

Structure for readability and scalable authorization

Figure 3 illustrates the structure of Cedar policies: an effect and scope, optionally followed by one or more conditions.

The effect of a policy is to either permit or forbid access. The scope can use equality (==) or membership (in) constraints to restrict the principals, actions, and resources to which the policy applies. Policy conditions are expressions that further restrict when the policy applies.

This structure makes policies straightforward to read and understand: The scope expresses an RBAC rule, and the conditions express ABAC rules. For example, PhotoFlash Policy A has no conditions and expresses a single RBAC rule. Policy B has an open (unconstrained) scope and expresses a single ABAC rule. A quick glance is enough to see if a policy is just an RBAC rule, just an ABAC rule, or a mix of both.

Figure 3: Cedar policy structure, illustrated on PhotoFlash Policy A and B

Scopes also enable scalable authorization for large policy stores through policy slicing. This is a property of Cedar that lets applications authorize a request against a subset of stored policies, supporting real-time decisions even for stores with thousands of policies. With slicing, an application needs to pass a policy to the authorizer only when the request’s principal and resource are descendants of the principal and resource entities specified in the policy’s scope. For example, PhotoFlash needs to include Policy A only for requests that involve the descendants of Group::”jane/friends” and Album::”jane/trips”. But Policy B must be included for all requests because of its open scope.

No loops or state for fast evaluation and intuitive decisions

Policy conditions are Boolean-valued expressions. The Cedar expression language has a familiar syntax that includes if-then-else expressions, short-circuiting Boolean operators (!, &&, ||), and basic operations on Cedar values. Notably, there is no way to express looping or to change the application state (for example, mutate an attribute).

Cedar excludes loops to bound authorization latency. With no loops or costly built-in operators, Cedar policies terminate in O(n²) steps in the worst case (when conditions contain certain set operations), or O(n) in the common case.

Cedar also excludes stateful operations for performance and understandability. Since policies can’t change the application state, their evaluation can be parallelized for better performance, and you can reason about them in any order to see what accesses are allowed.

Learn more

In this post, we explored how Cedar’s design supports intuitive, fast, and safe authorization. With Cedar, your application’s access control rules become standalone policies that are clear, auditable, and reusable. You enforce these policies by calling Cedar’s authorizer to decide quickly and safely which requests are allowed. To learn more, see how to use Cedar to secure your app, and how we built Cedar to a high standard of assurance. You can also visit the Cedar website and blog, try it out in the Cedar playground, and join us on Cedar’s Slack channel.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

AWS launched a Landing Zone for the Baseline Informatiebeveiliging Overheid (BIO) and is issued a certificate for the BIO Thema-uitwerking Clouddiensten

2023-08-21 Eric Washington

Post Syndicated from Eric Washington original https://aws.amazon.com/blogs/security/aws-launched-a-landing-zone-for-the-baseline-informatiebeveiliging-overheid-bio-and-is-issued-a-certificate-for-the-bio-thema-uitwerking-clouddiensten/

We’re pleased to announce that we’ve launched a Landing Zone for the Baseline Informatiebeveiliging Overheid (BIO) framework to support our Dutch customers in their compliance needs with the BIO framework.

We also demonstrated compliance with the BIO Thema-uitwerking Clouddiensten. This alignment with the BIO Thema-uitwerking Clouddiensten requirements demonstrates our continuous commitment to adhere to the heightened expectations for cloud service providers.

Amazon Web Services (AWS) customers across the Dutch public sector can use AWS certified services with confidence, knowing that the AWS services listed in the certificate adhere to the strict requirements imposed on the consumption of cloud-based services.

Baseline Informatiebeveiliging Overheid

The BIO framework is an information security framework that the four layers of the Dutch public sector are required to adhere to. This means that it’s mandatory for the Dutch central government, all provinces, municipalities, and regional water authorities to be compliant with the BIO framework.

To support AWS customers in demonstrating their compliance with the BIO framework, AWS developed a Landing Zone for the BIO framework. This Landing Zone for the BIO framework is a pre-configured AWS environment that includes a subset of the technical requirements of the BIO framework. It’s a helpful tool that provides a starting point from which customers can further build their own AWS environment.

For more information regarding the Landing Zone for the BIO framework, see the AWS Reference Guide for Dutch BIO Framework and BIO Theme-elaboration Cloud Services in AWS Artifact. You can also reach out to your AWS account team or contact AWS through the Contact Us page.

Baseline Informatiebeveiliging Overheid Thema-uitwerking Clouddiensten

In addition to the BIO framework, there’s another information security framework designed specifically for the use of cloud services. It is called BIO Thema-uitwerking Clouddiensten. The BIO Thema-uitwerking Clouddiensten is a guidance document for Dutch cloud service consumers to help them formulate controls and objectives when using cloud services. Consumers can view it as an additional control framework on top of the BIO framework.

AWS was evaluated by the monitoring body, EY CertifyPoint, in February 2023, and it was determined that AWS successfully demonstrated compliance for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), and Amazon Relational Database Service (Amazon RDS) services. The Certificate of Compliance illustrating the compliance status of AWS and the assessment summary report from EY CertifyPoint are available on AWS Artifact. The certificate is available in Dutch and English.

For more information regarding the BIO Thema-uitwerking Clouddiensten, see the AWS Reference Guide for Dutch BIO Framework and BIO Theme-elaboration Cloud Services in AWS Artifact. You can also reach out to your AWS account team or contact AWS through the Contact Us page.

AWS strives to continuously bring services into scope of its compliance programs to help you meet your architectural and regulatory needs.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

How AWS built the Security Guardians program, a mechanism to distribute security ownership

2023-08-18 Ana Malhotra

Post Syndicated from Ana Malhotra original https://aws.amazon.com/blogs/security/how-aws-built-the-security-guardians-program-a-mechanism-to-distribute-security-ownership/

Product security teams play a critical role to help ensure that new services, products, and features are built and shipped securely to customers. However, since security teams are in the product launch path, they can form a bottleneck if organizations struggle to scale their security teams to support their growing product development teams. In this post, we will share how Amazon Web Services (AWS) developed a mechanism to scale security processes and expertise by distributing security ownership between security teams and development teams. This mechanism has many names in the industry — Security Champions, Security Advocates, and others — and it’s often part of a shift-left approach to security. At AWS, we call this mechanism Security Guardians.

In many organizations, there are fewer security professionals than product developers. Our experience is that it takes much more time to hire a security professional than other technical job roles, and research conducted by (ISC)² shows that the cybersecurity industry is short 3.4 million workers. When product development teams continue to grow at a faster rate than security teams, the disparity between security professionals and product developers continues to increase as well. Although most businesses understand the importance of security, frustration and tensions can arise when it becomes a bottleneck for the business and its ability to serve customers.

At AWS, we require the teams that build products to undergo an independent security review with an AWS application security engineer before launching. This is a mechanism to verify that new services, features, solutions, vendor applications, and hardware meet our high security bar. This intensive process impacts how quickly product teams can ship to customers. As shown in Figure 1, we found that as the product teams scaled, so did the problem: there were more products being built than the security teams could review and approve for launch. Because security reviews are required and non-negotiable, this could potentially lead to delays in the shipping of products and features.

Figure 1: More products are being developed than can be reviewed and shipped

How AWS builds a culture of security

Because of its size and scale, many customers look to AWS to understand how we scale our own security teams. To tell our story and provide insight, let’s take a look at the culture of security at AWS.

Security is a business priority

At AWS, security is a business priority. Business leaders prioritize building products and services that are designed to be secure, and they consider security to be an enabler of the business rather than an obstacle.

Leaders also strive to create a safe environment by encouraging employees to identify and escalate potential security issues. Escalation is the process of making sure that the right people know about the problem at the right time. Escalation encompasses “Dive Deep”, which is one of our corporate values at Amazon, because it requires owners and leaders to dive into the details of the issue. If you don’t know the details, you can’t make good decisions about what’s going on and how to run your business effectively.

This aspect of the culture goes beyond intention — it’s embedded in our organizational structure:

CISOs and IT leaders play a key role in demystifying what security and compliance represent for the business. At AWS, we made an intentional choice for the security team to report directly to the CEO. The goal was to build security into the structural fabric of how AWS makes decisions, and every week our security team spends time with AWS leadership to ensure we’re making the right choices on tactical and strategic security issues.

– Stephen Schmidt, Chief Security Officer, Amazon, on Building a Culture of Security

Everyone owns security

Because our leadership supports security, it’s understood within AWS that security is everyone’s job. Security teams and product development teams work together to help ensure that products are built and shipped securely. Despite this collaboration, the product teams own the security of their product. They are responsible for making sure that security controls are built into the product and that customers have the tools they need to use the product securely.

On the other hand, central security teams are responsible for helping developers to build securely and verifying that security requirements are met before launch. They provide guidance to help developers understand what security controls to build, provide tools to make it simpler for developers to implement and test controls, provide support in threat modeling activities, use mechanisms to help ensure that customers’ security expectations are met before launch, and so on.

This responsibility model highlights how security ownership is distributed between the security and product development teams. At AWS, we learned that without this distribution, security doesn’t scale. Regardless of the number of security experts we hire, product teams always grow faster. Although the culture around security and the need to distribute ownership is now well understood, without the right mechanisms in place, this model would have collapsed.

Mechanisms compared to good intentions

Mechanisms are the final pillar of AWS culture that has allowed us to successfully distribute security across our organization. A mechanism is a complete process, or virtuous cycle, that reinforces and improves itself as it operates. As shown in Figure 2, a mechanism takes controllable inputs and transforms them into ongoing outputs to address a recurring business challenge. At AWS, the business challenge that we’re facing is that security teams create bottlenecks for the business. The culture of security at AWS provides support to help address this challenge, but we needed a mechanism to actually do it.

Figure 2: AWS sees mechanisms as a complete process, or virtuous cycle

“Often, when we find a recurring problem, something that happens over and over again, we pull the team together, ask them to try harder, do better – essentially, we ask for good intentions. This rarely works… When you are asking for good intentions, you are not asking for a change… because people already had good intentions. But if good intentions don’t work, what does? Mechanisms work.

– Jeff Bezos, February 1, 2008 All Hands.

At AWS, we’ve learned that we can help solve the challenge of scaling security by distributing security ownership with a mechanism we call the Security Guardians program. Like other mechanisms, it has inputs and outputs, and transforms over time.

AWS distributes security ownership with the Security Guardians program

At AWS, the Security Guardians program trains, develops, and empowers developers to be security ambassadors, or Guardians, within the product teams. At a high level, Guardians make sure that security considerations for a product are made earlier and more often, helping their peers build and ship their product faster. They also work closely with the central security team to help ensure that the security bar at AWS is rising and the Security Guardians program is improving over time. As shown in Figure 3, embedding security expertise within the product teams helps products with Guardian involvement move through security review faster.

Figure 3: Security expertise is embedded in the product teams by Guardians

Guardians are informed, security-minded product builders who volunteer to be consistent champions of security on their teams and are deeply familiar with the security processes and tools. They provide security guidance throughout the development lifecycle and are stakeholders in the security of the products being shipped, helping their teams make informed decisions that lead to more secure, on-time launches. Guardians are the security points-of-contact for their product teams.

In this distributed security ownership model, accountability for product security sits with the product development teams. However, the Guardians are responsible for performing the first evaluation of a development team’s security review submission. They confirm the quality and completeness of the new service’s resources, design documents, threat model, automated findings, and penetration test readiness. The development teams, supported by the Guardian, submit their security review to AWS Application Security (AppSec) engineers for the final pre-launch review.

In practice, as part of this development journey, Guardians help ensure that security considerations are made early, when teams are assessing customer requests and the feature or product design. This can be done by starting the threat modeling processes. Next, they work to make sure that mitigations identified during threat modeling are developed. Guardians also play an active role in software testing, including security scans such as static application security testing (SAST) and dynamic application security testing (DAST). To close out the security review, security engineers work with Guardians to make sure that findings are resolved and the product is ready to ship.

Figure 4: Expedited security review process supported by Guardians

Guardians are, after all, Amazonians. Therefore, Guardians exemplify a number of the Amazon Leadership Principles and often have the following characteristics:

They are exemplary practitioners for security ownership and empower their teams to own the security of their service.
They hold a high security bar and exercise strong security judgement, don’t accept quick or easy answers, and drive continuous improvement.
They advocate for security needs in internal discussions with the product team.
They are thoughtful yet assertive to make customer security a top priority on their team.
They maintain and showcase their security knowledge to their peers, continuously building knowledge from many different sources to gain perspective and to stay up to date on the constantly evolving threat landscape.
They aren’t afraid to have their work independently validated by the central security team.

Expected outcomes

AWS has benefited greatly from the Security Guardians program. We’ve had 22.5 percent fewer medium and high severity security findings generated during the security review process and have taken about 26.9 percent less time to review a new service or feature. This data demonstrates that with Guardians involved we’re identifying fewer issues late in the process, reducing remediation work, and as a result securely shipping services faster for our customers. To help both builders and Guardians improve over time, our security review tool captures feedback from security engineers on their inputs. This helps ensure that our security ownership mechanism reinforces and improves itself over time.

AWS and other organizations have benefited from this mechanism because it generates specialized security resources and distributes security knowledge that scales without needing to hire additional staff.

A program such as this could help your business build and ship faster, as it has for AWS, while maintaining an appropriately high security bar that rises over time. By training builders to be security practitioners and advocates within your development cycle, you can increase the chances of identifying risks and security findings earlier. These findings, earlier in the development lifecycle, can reduce the likelihood of having to patch security bugs or even start over after the product has already been built. We also believe that a consistent security experience for your product teams is an important aspect of successfully distributing your security ownership. An experience with less confusion and friction will help build trust between the product and security teams.

To learn more about building positive security culture for your business, watch this spotlight interview with Stephen Schmidt, Chief Security Officer, Amazon.

If you’re an AWS customer and want to learn more about how AWS built the Security Guardians program, reach out to your local AWS solutions architect or account manager for more information.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

How to Connect Your On-Premises Active Directory to AWS Using AD Connector

2023-08-17 Jeremy Cowan

Post Syndicated from Jeremy Cowan original https://aws.amazon.com/blogs/security/how-to-connect-your-on-premises-active-directory-to-aws-using-ad-connector/

August 17, 2023: We updated the instructions and screenshots in this post to align with changes to the AWS Management Console.

April 25, 2023: We’ve updated this blog post to include more security learning resources.

AD Connector is designed to give you an easy way to establish a trusted relationship between your Active Directory and AWS. When AD Connector is configured, the trust allows you to:

Sign in to AWS applications such as Amazon WorkSpaces, Amazon WorkDocs, and Amazon WorkMail by using your Active Directory credentials.
Seamlessly join Windows instances to your Active Directory domain either through the Amazon EC2 launch wizard or programmatically through the EC2 Simple System Manager (SSM) API.
Provide federated sign-in to the AWS Management Console by mapping Active Directory identities to AWS Identity and Access Management (IAM) roles.

AD Connector cannot be used with your custom applications, as it is only used for secure AWS integration for the three use-cases mentioned above. Custom applications relying on your on-premises Active Directory should communicate with your domain controllers directly or utilize AWS Managed Microsoft AD rather than integrating with AD Connector. To learn more about which AWS Directory Service solution works best for your organization, see the service documentation.

With AD Connector, you can streamline identity management by extending your user identities from Active Directory. It also enables you to reuse your existing Active Directory security policies such as password expiration, password history, and account lockout policies. Also, your users will no longer need to remember yet another user name and password combination. Since AD Connector doesn’t rely on complex directory synchronization technologies or Active Directory Federation Services (AD FS), you can forego the added cost and complexity of hosting a SAML-based federation infrastructure. In sum, AD Connector helps foster a hybrid environment by allowing you to leverage your existing on-premises investments to control different facets of AWS.

This blog post will show you how AD Connector works as well as walk through how to enable federated console access, assign users to roles, and seamlessly join an EC2 instance to an Active Directory domain.

AD Connector – Under the Hood

AD Connector is a dual Availability Zone proxy service that connects AWS apps to your on-premises directory. AD Connector forwards sign-in requests to your Active Directory domain controllers for authentication and provides the ability for applications to query the directory for data. When you configure AD Connector, you provide it with service account credentials that are securely stored by AWS. This account is used by AWS to enable seamless domain join, single sign-on (SSO), and AWS Applications (WorkSpaces, WorkDocs, and WorkMail) functionality. Given AD Connector’s role as a proxy, it does not store or cache user credentials. Rather, authentication, lookup, and management requests are handled by your Active Directory.

In order to create an AD Connector, you must also provide a pair of DNS IP addresses during setup. These are used by AD Connector to retrieve Service (SRV) DNS records to locate the nearest domain controllers to route requests to. The AD connector proxy instances use an algorithm similar to the Active Directory domain controller locator process to decide which domain controllers to connect to for LDAP and Kerberos requests.

For authentication to AWS applications and the AWS Management Console, you can configure an access URL from the AWS Directory Service console. This access URL is in the format of https://<alias>.awsapps.com and provides a publicly accessible sign-in page. You can visit https://<alias>.awsapps.com/workdocs to sign in to WorkDocs, and https://<alias>.awsapps.com/console to sign in to the AWS Management Console. The following image shows the sign-in page for the AWS Management Console.

Figure 1: Login

For added security you can enable multi-factor authentication (MFA) for AD Connector, but you’ll need to have an existing RADIUS infrastructure in your on-premises network set up to leverage this feature. See AD Connector – Multi-factor Authentication Prerequisites for more information about requirements and configuration. With MFA enabled with AD Connector, the sign-in page hosted at your access URL will prompt users for an MFA code in addition to their standard sign-in credentials.

AD Connector comes in two sizes: small and large. A large AD Connector runs on more powerful compute resources and is more expensive than a small AD Connector. Depending on the volume of traffic to be proxied by AD Connector, you’ll want to select the appropriate size for your needs.

Figure 2: Directory size

AD Connector is highly available, meaning underlying hosts are deployed across multiple Availability Zones in the region you deploy. In the event of host-level failure, Directory Service will promptly replace failed hosts. Directory Service also applies performance and security updates automatically to AD Connector.

The following diagram illustrates the authentication flow and network path when you enable AWS Management Console access:

A user opens the secure custom sign-in page and supplies their Active Directory user name and password.
The authentication request is sent over SSL to AD Connector.
AD Connector performs LDAP authentication to Active Directory.

Note: AD Connector locates the nearest domain controllers by querying the SRV DNS records for the domain.
After the user has been authenticated, AD Connector calls the STS AssumeRole method to get temporary security credentials for that user. Using those temporary security credentials, AD Connector constructs a sign-in URL that users use to access the console.

Note: If a user is mapped to multiple roles, the user will be presented with a choice at sign-in as to which role they want to assume. The user session is valid for 1 hour.

Figure 3: Authentication flow and network path

Before getting started with configuring AD Connector for federated AWS Management Console access, be sure you’ve read and understand the prerequisites for AD Connector. For example, as shown in Figure 3 there must be a VPN or Direct Connect circuit in place between your VPC and your on-premises environment. Your domain also has to be running at Windows 2003 functional level or later. Also, various ports have to be opened between your VPC and your on-premises environment to allow AD Connector to communicate with your on-premises directory.

Configuring AD Connector for federated AWS Management Console access

Enable console access

To allow users to sign in with their Active Directory credentials, you need to explicitly enable console access. You can do this by opening the Directory Service console and clicking the Directory ID name (Figure 4).

This opens the Directory Details page, where you’ll find a dropdown menu on the Apps & Services tab to enable the directory for AWS Management Console access.

Figure 4: Directories

Choose the Application management tab as seen in Figure 5.

Figure 5: Application Management

Scroll down to AWS Management Console as shown in Figure 6, and choose Enable from the Actions dropdown list.

Figure 6: Enable console access

After enabling console access, you’re ready to start configuring roles and associating Active Directory users and groups with those roles.

Follow these steps to create a new role. When you create a new role through the Directory Service console, AD Connector automatically adds a trust relationship to Directory Service. The following code example shows the IAM trust policy for the role, after a role is created.

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Sid": "",
       "Effect": "Allow",
       "Principal": {
         "Service": "ds.amazonaws.com"
       },
       "Action": "sts:AssumeRole",
       "Condition": {
         "StringEquals": {
           "sts:externalid": "482242153642"
	  }
	}
     }
   ]
}

Assign users to roles

Now that AD Connector is configured and you’ve created a role, your next job is to assign users or groups to those IAM roles. Role mapping is what governs what resources a user has access to within AWS. To do this you’ll need to do the following steps:

Open the Directory Service console and navigate to the AWS Management Console section.
In the search bar, type the name of the role you just created.
Select the role that you just created by choosing the name under the IAM role field.
Choose Add, and enter the name to be added to find users or groups for this role.
Choose Add, and the user or group is now assigned to the role.

When you’re finished, you should see the name of the user or group along with the corresponding ID for that object. It is also important to note that this list can be used to remove users or groups from the role. The next time the user signs in to the AWS Management Console from the custom sign-in page, they will be signed in under the EC2ReadOnly security role.

Seamlessly join an instance to an Active Directory domain

Another advantage to using AD Connector is the ability to seamlessly join Windows (EC2) instances to your Active Directory domain. This allows you to join a Windows Server to the domain while the instance is being provisioned instead of using a script or doing it manually. This section of this blog post will explain the steps necessary to enable this feature in your environment and how the service works.

Step 1: Create a role

Until recently you had to manually create an IAM policy to allow an EC2 instance to access the SSM, an AWS service that allows you to configure Windows instances while they’re running and on first launch. Now, there’s a managed policy called AmazonEC2RoleforSSM that you can use instead. The role you are about to create will be assigned to an EC2 instance when it’s provisioned, which will grant it permission to access the SSM service.

To create the role:

Open the IAM console.
Click Roles in the navigation pane.
Click Create Role.
Type a name for your role in the Role Name field.
Under AWS Service Roles, select Amazon EC2 and then click Select.
On the Attach Policy page, select AmazonEC2RoleforSSM and then click Next Step.
On the Review page, click Create Role.

If you click the role you created, you’ll see a trust policy for EC2, which looks like the following code example.

{
     "Version": "2012-10-17",
     "Statement": [
       {
         "Sid": "",
         "Effect": "Allow",
         "Principal": {
           "Service": "ec2.amazonaws.com"
         },
         "Action": "sts:AssumeRole"
       }
     ]
}

Step 2: Create a new Windows instance from the EC2 console

With this role in place, you can now join a Windows instance to your domain via the EC2 launch wizard. For a detailed explanation about how to do this, see Joining a Domain Using the Amazon EC2 Launch Wizard.

If you’re instantiating a new instance from the API, however, you will need to create an SSM configuration document and upload it to the SSM service beforehand. We’ll step through that process next.

Note: The instance will require internet access to communicate with the SSM service.

Figure 7: Configure instance details

When you create a new Windows instance from the EC2 launch wizard as shown in Figure 7, the wizard automatically creates the SSM configuration document from the information stored in AD Connector. Presently, the EC2 launch wizard doesn’t allow you to specify which organizational unit (OU) you want to deploy the member server into.

Step 3: Create an SSM document (for seamlessly joining a server to the domain through the AWS API)

If you want to provision new Windows instances from the AWS CLI or API or you want to specify the target OU for your instances, you will need to create an SSM configuration document. The configuration document is a JSON file that contains various parameters used to configure your instances. The following code example is a configuration document for joining a domain.

{
	"schemaVersion": "1.0",
	"description": "Sample configuration to join an instance to a domain",
	"runtimeConfig": {
	   "aws:domainJoin": {
	       "properties": {
	          "directoryId": "d-1234567890",
	          "directoryName": "test.example.com",
	          "directoryOU": "OU=test,DC=example,DC=com",
	          "dnsIpAddresses": [
	             "198.51.100.1",
	             "198.51.100.2"
	          ]
	       }
	   }
	}
}

In this configuration document:

directoryId is the ID for the AD Connector you created earlier.
directoryName is the name of the domain (for example, examplecompany.com).
directoryOU is the OU for the domain.
dnsIpAddresses are the IP addresses for the DNS servers you specified when you created the AD Connector.

For additional information, see aws:domainJoin. When you’re finished creating the file, save it as a JSON file.

Note: The name of the file has to be at least 1 character and at most 64 characters in length.

Step 4: Upload the configuration document to SSM

This step requires that the user have permission to use SSM to configure an instance. If you don’t have a policy that includes these rights, create a new policy by using the following JSON, and assign it to an IAM user or group.

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Effect": "Allow",
       "Action": "ssm:*",
       "Resource": "*"
     }
   ]
}

After you’ve signed in with a user that associates with the SSM IAM policy you created, run the following command from the AWS CLI.

aws ssm create-document ‐‐content file://path/to/myconfigfile.json ‐‐name "My_Custom_Config_File"

Note: On Linux/Mac systems, you need to add a “/” at the beginning of the path (for example, file:///Users/username/temp).

This command uploads the configuration document you created to the SSM service, allowing you to reference it when creating a new Windows instance from either the AWS CLI or the EC2 launch wizard.

Conclusion

This blog post has shown you how you can simplify account management by federating with your Active Directory for AWS Management Console access. The post also explored how you can enable hybrid IT by using AD Connector to seamlessly join Windows instances to your Active Directory domain. Armed with this information you can create a trust between your Active Directory and AWS. In addition, you now have a quick and simple way to enable single sign-on without needing to replicate identities or deploy additional infrastructure on premises.

We’d love to hear more about how you are using Directory Service, and welcome any feedback about how we can improve the experience. You can post comments below, or visit the Directory Service forum to post comments and questions.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Directory Service knowledge Center re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.