The new IRAP report includes an additional six AWS services, as well as the new AWS Local Zone in Perth, that are now assessed at the PROTECTED level under IRAP. This brings the total number of services assessed at the PROTECTED level to 145.
The following are the six newly assessed services:
AWS has developed an IRAP documentation pack to assist Australian government agencies and their partners to plan, architect, and assess risk for their workloads when they use AWS Cloud services.
The IRAP pack on AWS Artifact also includes newly updated versions of the AWS Consumer Guide and the whitepaper Reference Architectures for ISM PROTECTED Workloads in the AWS Cloud.
Reach out to your AWS representatives to let us know which additional services you would like to see in scope for upcoming IRAP assessments. We strive to bring more services into scope at the PROTECTED level under IRAP to support your requirements.
AWS Identity and Access Management (IAM) Roles Anywhere enables workloads that run outside of Amazon Web Services (AWS), such as servers, containers, and applications, to use X.509 digital certificates to obtain temporary AWS credentials and access AWS resources, the same way that you use IAM roles for workloads on AWS. Now, IAM Roles Anywhere allows you to use PKCS #11–compatible cryptographic modules to help you securely store private keys associated with your end-entity X.509 certificates.
Cryptographic modules allow you to generate non-exportable asymmetric keys in the module hardware. The cryptographic module exposes high-level functions, such as encrypt, decrypt, and sign, through an interface such as PKCS #11. Using a cryptographic module with IAM Roles Anywhere helps to ensure that the private keys associated with your end-entity X.509 certificates remain in the module and cannot be accessed or copied to the system.
In this post, I will show how you can use PKCS #11–compatible cryptographic modules, such as YubiKey 5 Series and Thales ID smart cards, with your on-premises servers to securely store private keys. I’ll also show how to use those private keys and certificates to obtain temporary credentials for the AWS Command Line Interface (AWS CLI) and AWS SDKs.
Cryptographic modules use cases
IAM Roles Anywhere reduces the need to manage long-term AWS credentials for workloads running outside of AWS, which helps improve your security posture. IAM Roles Anywhere has now added support for PKCS #11–compatible cryptographic modules to the credential helper tool, so organizations that already use these modules (such as defense, government, or large enterprises) can benefit from storing their private keys on their security devices. This mitigates the risk of storing the private keys as files on servers, where they can be accessed or copied by unauthorized users.
Note: If your organization does not implement PKCS #11–compatible modules, IAM Roles Anywhere credential helper supports OS certificate stores (Keychain Access for macOS and Cryptography API: Next Generation (CNG) for Windows) to help protect your certificates and private keys.
Solution overview
This authentication flow is shown in Figure 1 and is described in the following sections.
Figure 1: Authentication flow using crypto modules with IAM Roles Anywhere
How it works
As a prerequisite, you must first create a trust anchor and profile within IAM Roles Anywhere. The trust anchor establishes trust between IAM Roles Anywhere and your public key infrastructure (PKI) certificate authority (CA), and the profile specifies which roles IAM Roles Anywhere assumes and what your workloads can do with the temporary credentials. A trust anchor is a reference to either AWS Private Certificate Authority (AWS Private CA) or an external CA certificate. For this walkthrough, you will use AWS Private CA.
The one-time initialization process (step “0 – Module initialization” in Figure 1) works as follows:
You first generate the non-exportable private key within the secure container of the cryptographic module.
You then create the X.509 certificate that will bind an identity to a public key:
Create a certificate signing request (CSR).
Submit the CSR to the AWS Private CA.
Obtain the certificate signed by the CA in order to establish trust.
The certificate is then imported into the cryptographic module so that it travels with the module and is simple to locate when the module is connected to the server.
After initialization is done, the module is connected to the server, which can then interact with the AWS CLI and AWS SDK without long-term credentials stored on a disk.
To obtain temporary security credentials from IAM Roles Anywhere:
The server will use the credential helper tool that IAM Roles Anywhere provides. The credential helper works with the credential_process feature of the AWS CLI to provide credentials that can be used by the CLI and the language SDKs. The helper manages the process of creating a signature with the private key.
The credential helper tool calls the IAM Roles Anywhere endpoint to obtain temporary credentials, which are issued in a standard JSON format to IAM Roles Anywhere clients through the CreateSession API action.
The server uses the temporary credentials for programmatic access to AWS services.
Alternatively, you can use the update or serve commands instead of credential-process. The update command will be used as a long-running process that will renew the temporary credentials 5 minutes before the expiration time and replace them in the AWS credentials file. The serve command will be used to vend temporary credentials through an endpoint running on the local host using the same URIs and request headers as IMDSv2 (Instance Metadata Service Version 2).
Supported modules
The credential helper tool for IAM Roles Anywhere supports most devices that are compatible with PKCS #11. The PKCS #11 standard specifies an API for devices that hold cryptographic information and perform cryptographic functions such as signature and encryption.
I will showcase how to use a YubiKey 5 Series device, a multi-protocol security key that supports Personal Identity Verification (PIV) through PKCS #11. I am using the YubiKey 5 Series for demonstration purposes because it is widely available (you can purchase it at the Yubico store or Amazon.com) and is used by some of the world’s largest companies to provide one-time passwords (OTP), Fast IDentity Online (FIDO), and PIV smart card interfaces for multi-factor authentication. For a production server, we recommend using server-specific PKCS #11–compatible hardware security modules (HSMs), such as the YubiHSM 2, Luna PCIe HSM, or Trusted Platform Modules (TPMs) available on your servers.
Note: The implementation might differ with other modules, because some of these come with their own proprietary tools and drivers.
Implement the solution: Module initialization
You need to have the following prerequisites in order to initialize the module:
The Security Officer (SO) should have permissions to use the AWS Private CA service
Following are the high-level steps for initializing the YubiKey device and generating the certificate that is signed by AWS Private Certificate Authority (AWS Private CA). Note that you could also use your own public key infrastructure (PKI) and register it with IAM Roles Anywhere.
To initialize the module and generate a certificate
Verify that the YubiKey PIV interface is enabled, because some organizations might disable interfaces that are not being used. To do so, run the YubiKey Manager CLI, as follows:
ykman info
The output should look like the following, with the PIV interface enabled for USB.
Figure 2: YubiKey Manager CLI showing that the PIV interface is enabled
Use the YubiKey Manager CLI to generate a new RSA2048 private key on the security module in slot 9a and store the associated public key in a file. Different slots are available on the YubiKey; we will use slot 9a, which is intended for PIV authentication. The following command generates an asymmetric key pair: the private key is generated on the YubiKey, and the public key is saved as a file. Enter the YubiKey management key when prompted.
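A minimal sketch of that command follows (flag names can vary slightly between YubiKey Manager versions, and the output file name pub.pem is an assumption for this walkthrough):

ykman piv keys generate --algorithm RSA2048 9a pub.pem    # private key stays on the YubiKey; public key is written to pub.pem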
Create a certificate signing request (CSR) based on the public key and specify the subject that will identify your server. Enter the user PIN code when prompted.
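A hedged example of the CSR command, assuming the public key file pub.pem from the previous step and a placeholder subject:

ykman piv certificates request --subject "CN=server1-demo" 9a pub.pem server1.csr    # the CSR is signed with the private key held in slot 9a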
Import the certificate file on the module by using the YubiKey Manager CLI or through the YubiKey Manager UI. Enter the YubiKey management key to proceed.
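One way to obtain the signed certificate from AWS Private CA and load it onto the device is sketched below; the ARNs, file names, and validity period are placeholders, and the issue-certificate call assumes the default end-entity template:

aws acm-pca issue-certificate --certificate-authority-arn <private-ca-arn> --csr fileb://server1.csr --signing-algorithm SHA256WITHRSA --validity Value=365,Type=DAYS
# wait a few seconds for issuance, then retrieve the certificate body
aws acm-pca get-certificate --certificate-authority-arn <private-ca-arn> --certificate-arn <issued-certificate-arn> --query Certificate --output text > server1.pem
ykman piv certificates import 9a server1.pem    # prompts for the YubiKey management key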
The security module is now initialized and can be plugged into the server.
Configuration to use the security module for programmatic access
The following steps will demonstrate how to configure the server to interact with the AWS CLI and AWS SDKs by using the private key stored on the YubiKey or PKCS #11–compatible device.
Install the p11-kit package. Most providers (including opensc) will ship with a p11-kit “module” file that makes them discoverable. Users shouldn’t need to specify the PKCS #11 “provider” library when using the credential helper, because we use p11-kit by default.
If your device library is not supported by p11-kit, you can install that library separately.
Verify the content of the YubiKey by using the following command:
ykman --device 123456 piv info
The output should look like the following.
Figure 3: YubiKey Manager CLI output for the PIV information
This command provides the general status of the PIV application and content in the different slots such as the certificates installed.
Use the credential helper command with the security module. The command will require at least:
The ARN of the trust anchor
The ARN of the target role to assume
The ARN of the profile to pull policies from
The certificate and/or key identifiers in the form of a PKCS #11 URI
You can use the certificate flag to search which slot on the security module contains the private key associated with the user certificate.
To specify an object stored in a cryptographic module, you should use the PKCS #11 URI that is defined in RFC7512. The attributes in the identifier string are a set of search criteria used to filter a set of objects. See a recommended method of locating objects in PKCS #11.
In the following example, we search for an object of type certificate, with the object label as “Certificate for Digital Signature”, in slot 1. The pin-value attribute allows you to directly use the pin to log into the cryptographic device.
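Such a URI might look like the following (the PIN value 123456 is a placeholder):

pkcs11:type=cert;object=Certificate%20for%20Digital%20Signature;slot-id=1?pin-value=123456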
From the folder where you have installed the credential helper tool, use the following command. Because we only have one certificate on the device, we can limit the filter to the certificate type in our PKCS #11 URI.
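A hedged example of the credential helper invocation follows; the ARNs and PIN are placeholders, and the PKCS #11 URI filters only on the certificate type as described above:

./aws_signing_helper credential-process \
    --certificate "pkcs11:type=cert?pin-value=123456" \
    --trust-anchor-arn arn:aws:rolesanywhere:us-east-1:111122223333:trust-anchor/<trust-anchor-id> \
    --profile-arn arn:aws:rolesanywhere:us-east-1:111122223333:profile/<profile-id> \
    --role-arn arn:aws:iam::111122223333:role/<role-name>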
If everything is configured correctly, the credential helper tool will return a JSON that contains the credentials, as follows. The PIN code will be requested if you haven’t specified it in the command.
Please enter your user PIN:
{
"Version":1,
"AccessKeyId": <String>,
"SecretAccessKey": <String>,
"SessionToken": <String>,
"Expiration": <Timestamp>
}
To use temporary security credentials with AWS SDKs and the AWS CLI, you can configure the credential helper tool as a credential process. For more information, see Source credentials with an external process. The following example shows a config file (usually in ~/.aws/config) that sets the helper tool as the credential process.
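A sketch of such a config file, assuming the credential helper binary is installed at /opt/aws/aws_signing_helper and using placeholder ARNs:

[profile rolesanywhere-yubikey]
credential_process = /opt/aws/aws_signing_helper credential-process --certificate "pkcs11:type=cert?pin-value=123456" --trust-anchor-arn <trust-anchor-arn> --profile-arn <profile-arn> --role-arn <role-arn>
region = us-east-1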
You can provide the PIN as part of the credential command with the option pin-value=<PIN> so that the user input is not required.
If you prefer not to store your PIN in the config file, you can remove the attribute pin-value. In that case, you will be prompted to enter the PIN for every CLI command.
You can use the serve and update commands of the credential helper mentioned in the solution overview to manage credential rotation for unattended workloads. After the successful use of the PIN, the credential helper will store it in memory for the duration of the process and not ask for it anymore.
Auditability and fine-grained access
You can audit the activity of servers that are assuming roles through IAM Roles Anywhere. IAM Roles Anywhere is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or an AWS service in IAM Roles Anywhere.
In the CloudTrail console, under Event history, for Lookup attributes, filter by Event source and enter rolesanywhere.amazonaws.com in the text box. You will find all the API calls that relate to IAM Roles Anywhere, including the CreateSession call, which returns temporary security credentials for workloads that have been authenticated with IAM Roles Anywhere to access AWS resources.
Figure 4: CloudTrail Events filtered on the “IAM Roles Anywhere” event source
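If you prefer the AWS CLI, a roughly equivalent query looks like the following (output omitted here):

aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventSource,AttributeValue=rolesanywhere.amazonaws.com \
    --max-results 20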
When you review the CreateSession event record details, you can find the assumed role ID in the form of <PrincipalID>:<serverCertificateSerial>, as in the following example:
Figure 5: Details of the CreateSession event in the CloudTrail console showing which role is being assumed
If you want to identify API calls made by a server, for Lookup attributes, filter by User name, and enter the serverCertificateSerial value from the previous step in the textbox.
Figure 6: CloudTrail console events filtered by the username associated to our certificate on the security module
The API calls to AWS services made with the temporary credentials acquired through IAM Roles Anywhere will contain the identity of the server that made the call in the SourceIdentity field. For example, the EC2 DescribeInstances API call provides the following details:
Figure 7: The event record in the CloudTrail console for the EC2 describe instances call, with details on the assumed role and certificate CN.
Additionally, you can include conditions in the identity policy for the IAM role to apply fine-grained access control, which lets you specify which server in a group of servers can perform a given action.
To apply access control per server within the same IAM Roles Anywhere profile
In the IAM Roles Anywhere console, select the profile used by the group of servers, then select one of the roles that is being assumed.
Apply the following policy, which will allow only the server with CN=server1-demo to list all buckets by using the condition on aws:SourceIdentity.
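A minimal sketch of that policy follows. The exact aws:SourceIdentity value depends on the subject of the certificate presented by the server, so verify it against the CreateSession events in CloudTrail before relying on it:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowListBucketsForServer1",
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:SourceIdentity": "CN=server1-demo"
                }
            }
        }
    ]
}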
In this blog post, I’ve demonstrated how you can use the YubiKey 5 Series (or any PKCS #11 cryptographic module) to securely store the private keys for the X.509 certificates used with IAM Roles Anywhere. I’ve also highlighted how you can use AWS CloudTrail to audit API actions performed by the roles assumed by the servers.
The Agence Française de la Santé Numérique (ASIP Santé), the French governmental agency for health, introduced the HDS certification to strengthen the security and protection of personal health data. By achieving this certification, AWS demonstrates our commitment to adhere to the heightened expectations for cloud service providers.
The following 20 Regions are in scope for this certification:
US East (Ohio)
US East (Northern Virginia)
US West (Northern California)
US West (Oregon)
Asia Pacific (Jakarta)
Asia Pacific (Seoul)
Asia Pacific (Mumbai)
Asia Pacific (Singapore)
Asia Pacific (Sydney)
Asia Pacific (Tokyo)
Canada (Central)
Europe (Frankfurt)
Europe (Ireland)
Europe (London)
Europe (Milan)
Europe (Paris)
Europe (Stockholm)
Europe (Zurich)
Middle East (UAE)
South America (São Paulo)
The HDS certification demonstrates that AWS provides a framework for technical and governance measures that secure and protect personal health data, governed by French law. Our customers who handle personal health data can continue to manage their workloads in HDS-certified Regions with confidence.
For up-to-date information, including when additional Regions are added, see the AWS Compliance Programs page and choose HDS.
AWS strives to continuously meet your architectural and regulatory needs. If you have questions or feedback about HDS compliance, reach out to your AWS account team.
To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.
AWS Management Console Private Access is an advanced security feature to help you control access to the AWS Management Console. In this post, I will show you how this feature works, share current limitations, and provide AWS CloudFormation templates that you can use to automate the deployment. AWS Management Console Private Access is useful when you want to restrict users from signing in to unknown AWS accounts from within your network. With this feature, you can limit access to the console only to a specified set of known accounts when the traffic originates from within your network.
For enterprise customers, users typically access the console from devices that are connected to a corporate network, either directly or through a virtual private network (VPN). With network connectivity to the console, users can authenticate into an account with valid credentials, including third-party accounts and personal accounts. For enterprise customers with stringent network access controls, this feature provides a way to control which accounts can be accessed from on-premises networks.
How AWS Management Console Private Access works
AWS PrivateLink now supports the AWS Management Console, which means that you can create Virtual Private Cloud (VPC) endpoints in your VPC for the console. You can then use DNS forwarding to conditionally route users’ browser traffic to the VPC endpoints from on-premises and define endpoint policies that allow or deny access to specific accounts, organizations, or organizational units (OUs). To privately reach the endpoints, you must have a hybrid network connection between on-premises and AWS over AWS Direct Connect or AWS Site-to-Site VPN.
When you conditionally forward DNS queries for the zone aws.amazon.com from on-premises to an Amazon Route 53 Resolver inbound endpoint within the VPC, Route 53 will prefer the private hosted zone for aws.amazon.com to resolve the queries. The private hosted zone makes it simple to centrally manage records for the console in the AWS US East (N. Virginia) Region (us-east-1) as well as other Regions.
Configure a VPC endpoint for the console
To configure VPC endpoints for the console, you must complete the following steps:
Create interface VPC endpoints in a VPC in the US East (N. Virginia) Region for the console and sign-in services. Repeat for other desired Regions. You must create VPC endpoints in the US East (N. Virginia) Region because the default DNS name for the console resolves to this Region. Specify the accounts, organizations, or OUs that should be allowed or denied in the endpoint policies, as shown in the example policy that follows these steps. For instructions on how to create interface VPC endpoints, see Access an AWS service using an interface VPC endpoint.
Create a Route 53 Resolver inbound endpoint in a VPC and note the IP addresses for the elastic network interfaces of the endpoint. Forward DNS queries for the console from on-premises to these IP addresses. For instructions on how to configure Route 53 Resolver, see Getting started with Route 53 Resolver.
Create a Route 53 private hosted zone with records for the console and sign-in subdomains. For the full list of records needed, see DNS configuration for AWS Management Console and AWS Sign-In. Then associate the private hosted zone with the same VPC that has the Resolver inbound endpoint. For instructions on how to create a private hosted zone, see Creating a private hosted zone.
Conditionally forward DNS queries for aws.amazon.com to the IP addresses of the Resolver inbound endpoint.
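As an example of the endpoint policy referenced in the first step, the following sketch allows console access only from the listed accounts; the account IDs are placeholders, and you could instead use the aws:PrincipalOrgID condition key to allow an entire organization:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:PrincipalAccount": ["111122223333", "444455556666"]
                }
            }
        }
    ]
}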
How to access Regions other than US East (N. Virginia)
To access the console for another supported Region using AWS Management Console Private Access, complete the following steps:
Create the console and sign-in VPC endpoints in a VPC in that Region.
Create resource records for <region>.console.aws.amazon.com and <region>.signin.aws.amazon.com in the private hosted zone, with values that target the respective VPC endpoints in that Region. Replace <region> with the region code (for example, us-west-2).
For increased resiliency, you can also configure a second Resolver inbound endpoint in a Region other than US East (N. Virginia) (us-east-1). On-premises DNS resolvers can use both endpoints for resilient DNS resolution to the private hosted zone.
Automate deployment of AWS Management Console Private Access
I created an AWS CloudFormation template that you can use to deploy the required resources in the US East (N. Virginia) Region (us-east-1). To get the template, go to console-endpoint-use1.yaml. The CloudFormation stack deploys the required VPC endpoints, Route 53 Resolver inbound endpoint, and private hosted zone with required records.
I also created a CloudFormation template that you can use to deploy the required resources in other Regions where private access to the console is required. To get the template, go to console-endpoint-non-use1.yaml.
Cost considerations
When you configure AWS Management Console Private Access, you will incur charges. You can use the following information to estimate these charges:
PrivateLink pricing is based on the number of hours that the VPC endpoints remain provisioned. In the US East (N. Virginia) Region, this is $0.01 per VPC endpoint per Availability Zone per hour.
Data processing is charged at $0.01 per gigabyte (GB) of data processed through the VPC endpoints in the US East (N. Virginia) Region.
The Route 53 Resolver inbound endpoint is charged per IP (elastic network interface) per hour. In the US East (N. Virginia) Region, this is $0.125 per IP address per hour. See Route 53 pricing.
DNS queries to the inbound endpoint are charged at $0.40 per million queries.
The Route 53 hosted zone is charged at $0.50 per hosted zone per month. To allow testing, AWS won’t charge you for a hosted zone that you delete within 12 hours of creation.
Based on this pricing model, the cost of configuring AWS Management Console Private Access in the US East (N. Virginia) Region in two Availability Zones is approximately $212.20 per month for the deployed resources. DNS queries and data processing charges are additional based on actual usage. You can also apply this pricing model to help estimate the cost to configure in additional supported Regions. Route 53 is a global service, so you only have to create the private hosted zone once along with the resources in the US East (N. Virginia) Region.
Limitations and considerations
Before you get started with AWS Management Console Private Access, make sure to review the following limitations and considerations:
You can use this feature to restrict access to specific accounts from customer networks by forwarding DNS queries to the VPC endpoints. This feature doesn’t prevent users from accessing the console directly from the internet by using the console’s public endpoints from devices that aren’t on the corporate network.
The following subdomains aren’t currently supported by this feature and won’t be accessible through private access:
docs.aws.amazon.com
health.aws.amazon.com
status.aws.amazon.com
After a user completes authentication and accesses the console with private access, when they navigate to an individual service console, for example Amazon Elastic Compute Cloud (Amazon EC2), they must have network connectivity to the service’s API endpoint, such as ec2.amazonaws.com. This is needed for the console to make API calls such as ec2:DescribeInstances to display resource details in the service console.
Conclusion
In this blog post, I outlined how you can configure the console through AWS Management Console Private Access to restrict access to AWS accounts from on-premises, how the feature works, and how to configure it for multiple Regions. I also provided CloudFormation templates that you can use to automate the configuration of this feature. Finally, I shared information on costs and some limitations that you should consider before you configure private access to the console.
Distributed denial of service (DDoS) events occur when a threat actor sends traffic floods from multiple sources to disrupt the availability of a targeted application. DDoS simulation testing uses a controlled DDoS event to allow the owner of an application to assess the application’s resilience and practice event response. DDoS simulation testing is permitted on Amazon Web Services (AWS), subject to Testing policy terms and conditions. In this blog post, we help you understand when it’s appropriate to perform a DDoS simulation test on an application running on AWS, and what options you have for running the test.
DDoS protection at AWS
Security is the top priority at AWS. AWS services include basic DDoS protection as a standard feature to help protect customers from the most common and frequently occurring infrastructure (layer 3 and 4) DDoS events, such as SYN/UDP floods, reflection attacks, and others. While this protection is designed to protect the availability of AWS infrastructure, your application might require more nuanced protections that consider your traffic patterns and integrate with your internal reporting and incident response processes. If you need more nuanced protection, then you should consider subscribing to AWS Shield Advanced in addition to the native resiliency offered by the AWS services you use.
AWS Shield Advanced is a managed service that helps you protect your application against external threats, like DDoS events, volumetric bots, and vulnerability exploitation attempts. When you subscribe to Shield Advanced and add protection to your resources, Shield Advanced provides expanded DDoS event protection for those resources. With advanced protections enabled on your resources, you get tailored detection based on the traffic patterns of your application, assistance with protecting against Layer 7 DDoS events, access to 24×7 specialized support from the Shield Response Team (SRT), access to centralized management of security policies through AWS Firewall Manager, and cost protections to help safeguard against scaling charges resulting from DDoS-related usage spikes. You can also configure AWS WAF (a web application firewall) to integrate with Shield Advanced to create custom layer 7 firewall rules and enable automatic application layer DDoS mitigation.
Acceptable DDoS simulation use cases on AWS
AWS is constantly learning and innovating by delivering new DDoS protection capabilities, which are explained in the DDoS Best Practices whitepaper. This whitepaper provides an overview of DDoS events and the choices that you can make when building on AWS to help you architect your application to absorb or mitigate volumetric events. If your application is architected according to our best practices, then a DDoS simulation test might not be necessary, because these architectures have been through rigorous internal AWS testing and verified as best practices for customers to use.
Using DDoS simulations to explore the limits of AWS infrastructure isn’t a good use case for these tests. Similarly, validating whether AWS is effectively protecting its side of the shared responsibility model isn’t a good test motive. Further, using AWS resources as a source to simulate a DDoS attack on other AWS resources isn’t encouraged. Load tests are performed to gain reliable information on application performance under stress, and these are different from DDoS tests. For more information, see the Amazon Elastic Compute Cloud (Amazon EC2) testing policy and penetration testing. DDoS simulation tests are typically run by application owners who have a security compliance requirement from a regulator or who want to test the effectiveness of their DDoS mitigation strategies.
DDoS simulation tests at AWS
AWS offers two options for running DDoS simulation tests. They are:
A simulated DDoS attack in production traffic with an authorized pre-approved AWS Partner.
A synthetic simulated DDoS attack with the SRT, also referred to as a firedrill.
The motivation for DDoS testing varies from application to application and these engagements don’t offer the same value to all customers. Establishing clear motives for the test can help you choose the right option. If you want to test your incident response strategy, we recommend scheduling a firedrill with our SRT. If you want to test the Shield Advanced features or test application resiliency, we recommend that you work with an AWS approved partner.
DDoS simulation testing with an AWS Partner
AWS DDoS test partners are authorized to conduct DDoS simulation tests on customers’ behalf without prior approval from AWS. Customers can currently contact the following partners to set up these paid engagements:
Before contacting the partners, customers must agree to the terms and conditions for DDoS simulation tests. The application must be well-architected prior to DDoS simulation testing, as described in the AWS DDoS Best Practices whitepaper. AWS DDoS test partners that want to perform DDoS simulation tests that don’t comply with the technical restrictions set forth in our public DDoS testing policy, or other DDoS test vendors that aren’t approved, can request approval to perform DDoS simulation tests by submitting the DDoS Simulation Testing form at least 14 days before the proposed test date. For questions, please send an email to [email protected].
After choosing a test partner, customers go through various phases of testing. Typically, the first phase involves a discovery discussion, where the customer defines clear goals, assembles technical details, and defines the test schedule with the partner. In the next phase, partners run multiple simulations based on agreed attack vectors, duration, diversity of the attack vectors, and other factors. These tests are usually carried out by slowly ramping up traffic levels from low levels to desired high levels with an ability for an emergency stop. The final stage involves reporting, discussing observed gaps, identifying actionable tasks, and driving those tasks to completion.
These engagements are typically long-term, paid contracts that are planned over months and carried out over weeks, with results analyzed over time. These tests and reports are beneficial to customers who need to evaluate detection and mitigation capabilities on a large scale. If you’re an application owner and want to evaluate the DDoS resiliency of your application, practice event response with real traffic, or have a DDoS compliance or regulation requirement, we recommend this type of engagement. These tests aren’t recommended if you want to learn the volumetric breaking points of the AWS network or understand when AWS starts to throttle requests. AWS services are designed to scale, and when certain dynamic volume thresholds are exceeded, AWS detection systems will be invoked to block traffic. Lastly, it’s critical to distinguish between these tests and stress tests, in which meaningful packets are sent to the application to assess its behavior.
DDoS firedrill testing with the Shield Response Team
The Shield Advanced service offers additional assistance through the SRT, which can also help with testing incident response workflows. Customers can contact the SRT and request firedrill testing. Firedrill testing is a type of synthetic test that doesn’t generate real volumetric traffic but does post a Shield event to the requesting customer’s account.
These tests are available for customers who are already on-boarded to Shield Advanced and want to test their Amazon CloudWatch alarms by invoking a DDoSDetected metric, or test their proactive engagement setup or their custom incident response strategy. Because this event isn’t based on real traffic, the customer won’t see traffic generated on their account or see logs that drive helpful reports.
These tests are intended to generate associated Shield Advanced metrics and post a DDoS event for a customer resource. For example, SRT can post a 14 Gbps UDP mock attack on a protected resource for about 15 minutes and customers can test their response capability during such an event.
Note: Not all attack vectors and AWS resource types are supported for a firedrill. Shield Advanced onboarded customers can contact AWS Support teams to request assistance with running a firedrill or understand more about them.
Conclusion
DDoS simulations and incident response testing on AWS through the SRT or an AWS Partner are useful in improving application security controls, identifying Shield Advanced misconfigurations, optimizing existing detection systems, and improving incident readiness. The goal of these engagements is to help you build a DDoS resilient architecture to protect your application’s availability. However, these engagements don’t offer the same value to all customers. Most customers can obtain similar benefits by following AWS Best Practices for DDoS Resiliency. AWS recommends architecting your application according to DDoS best practices and fine tuning AWS Shield Advanced out-of-the-box offerings to your application needs to improve security posture.
Application teams should consider the impact that unexpected traffic floods can have on an application’s availability. Internet-facing applications can be susceptible to traffic that some distributed denial of service (DDoS) mitigation systems can’t detect. For example, hit-and-run events are a popular approach that uses short-lived floods recurring at random intervals. Each burst is small enough to go unnoticed by mitigation systems, but the bursts occur often enough and are large enough to be disruptive. Automatically detecting and blocking temporary sources of invalid traffic, combined with other best practices, can strengthen the resiliency of your applications and maintain customer trust.
Use resilient architectures
AWS customers can use prescriptive guidance to improve DDoS resiliency by reviewing the AWS Best Practices for DDoS Resiliency. It describes a DDoS-resilient reference architecture as a guide to help you protect your application’s availability.
The preceding best practices address the needs of most AWS customers; however, this post covers a few outlier situations that fall outside normal guidance. The following examples might describe your situation:
You need to operate functionality that isn’t yet fully supported by an AWS managed service that takes on the responsibility of DDoS mitigation.
Migrating to an AWS managed service such as Amazon Route 53 isn’t immediately possible and you need an interim solution that mitigates risks.
Network ingress must be allowed from a wide public IP space that can’t be restricted.
Network floods are disruptive but not significant enough (too infrequent or too low volume) to be detected by your managed DDoS mitigation systems.
For these situations, VPC network ACLs can be used to deny invalid traffic. Normally, the limit on rules per network ACL makes them unsuitable for handling truly distributed network floods. However, they can be effective at mitigating network floods that aren’t distributed enough or large enough to be detected by DDoS mitigation systems.
Given the dynamic nature of network traffic and the limited size of network ACLs, it helps to automate the lifecycle of network ACL rules. In the following sections, I show you a solution that uses network ACL rules to automatically detect and block infrastructure layer traffic within 2–5 minutes and automatically removes the rules when they’re no longer needed.
Detecting anomalies in network traffic
You need a way to block disruptive traffic while not impacting legitimate traffic. Anomaly detection can isolate the right traffic to block. Every workload is unique, so you need a way to automatically detect anomalies in the workload’s traffic pattern. You can determine what is normal (a baseline) and then detect statistical anomalies that deviate from the baseline. This baseline can change over time, so it needs to be calculated based on a rolling window of recent activity.
Z-scores are a common way to detect anomalies in time-series data. The process for creating a Z-score is to first calculate the average and standard deviation (a measure of how much the values are spread out) across all values over a span of time. Then for each value in the time window calculate the Z-score as follows:
Z-score = (value – average) / standard deviation
A Z-score exceeding 3.0 indicates the value is an outlier that is greater than 99.7 percent of all other values.
To calculate the Z-score for detecting network anomalies, you need to establish a time series for network traffic. This solution uses VPC flow logs to capture information about the IP traffic in your VPC. Each flow log record provides a packet count that’s aggregated over an interval of 60 seconds or less, and there isn’t a consistent time boundary between records, so raw flow log records aren’t a predictable way to build a time series. To address this, the solution processes flow logs into packet bins for time series values. A packet bin is the number of packets sent by a unique source IP address within a specific time window. A source IP address is considered an anomaly if any of its packet bins over the past hour exceed the Z-score threshold (default is 3.0).
When overall traffic levels are low, there might be source IP addresses with a high Z-score that aren’t a risk. To mitigate against false positives, source IP addresses are only considered to be an anomaly if the packet bin exceeds a minimum threshold (default is 12,000 packets).
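The deployed solution performs this calculation in a Timestream query, but the following illustrative Python sketch applies the same logic to an in-memory list of packet bins; the function name and default thresholds are assumptions that mirror the defaults described above:

from statistics import mean, stdev

def find_anomalous_ips(packet_bins, min_z_score=3.0, min_packets_per_bin=12000):
    # packet_bins: list of (source_ip, packet_count) tuples covering the past hour.
    counts = [count for _, count in packet_bins]
    if len(counts) < 2:
        return []
    avg = mean(counts)
    sd = stdev(counts)
    if sd == 0:
        return []
    anomalies = {}
    for ip, count in packet_bins:
        z = (count - avg) / sd
        # Flag only bins that are statistical outliers and large in absolute terms.
        if z >= min_z_score and count >= min_packets_per_bin:
            anomalies[ip] = max(z, anomalies.get(ip, 0.0))
    # Highest Z-scores first, mirroring how the solution prioritizes limited network ACL rules.
    return sorted(anomalies.items(), key=lambda kv: kv[1], reverse=True)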
Let’s review the overall solution architecture.
Solution overview
This solution, shown in Figure 1, uses VPC flow logs to capture information about the traffic reaching the network interfaces in your public subnets. CloudWatch Logs Insights queries are used to summarize the most recent IP traffic into packet bins that are stored in Timestream. The time series table is queried to identify source IP addresses responsible for traffic that meets the anomaly threshold. Anomalous source IP addresses are published to an Amazon Simple Notification Service (Amazon SNS) topic. A Lambda function receives the SNS message and decides how to update the network ACL.
Figure 1: Automating the detection and mitigation of traffic floods using network ACLs
How it works
The numbered steps that follow correspond to the numbers in Figure 1.
Capture VPC flow logs. Your VPC is configured to stream flow logs to CloudWatch Logs. To minimize cost, the flow logs are limited to particular subnets and only include log fields required by the CloudWatch query. When protecting an endpoint that spans multiple subnets (such as a Network Load Balancer using multiple availability zones), each subnet shares the same network ACL and is configured with a flow log that shares the same CloudWatch log group.
Scheduled flow log analysis. Amazon EventBridge starts an AWS Step Functions state machine on a time interval (60 seconds by default). The state machine starts a Lambda function immediately, and then again after 30 seconds. The Lambda function performs steps 3–6.
Summarize recent network traffic. The Lambda function runs a CloudWatch Logs Insights query. The query scans the most recent flow logs (5-minute window) to summarize packet frequency grouped by source IP. These groupings are called packet bins, where each bin represents the number of packets sent by a source IP within a given minute of time.
Update time series database. A time series database in Timestream is updated with the most recent packet bins.
Use statistical analysis to detect abusive source IPs. A Timestream query is used to perform several calculations. The query calculates the average bin size over the past hour, along with the standard deviation. These two values are then used to calculate the maximum Z-score for all source IPs over the past hour, which means an abusive IP remains flagged for one hour even if it stops sending traffic. Z-scores are sorted so that the most abusive source IPs are prioritized. A source IP is considered abusive if it meets both of the following criteria:
Maximum Z-score exceeds a threshold (defaults to 3.0).
Packet bin exceeds a threshold (defaults to 12,000). This avoids flagging source IPs during periods of overall low traffic when there is no need to block traffic.
Publish anomalous source IPs. Publish a message to an Amazon SNS topic with a list of anomalous source IPs. The function also publishes CloudWatch metrics to help you track the number of unique and abusive source IPs over time. At this point, the flow log summarizer function has finished its job until the next time it’s invoked from EventBridge.
Receive anomalous source IPs. The network ACL updater function is subscribed to the SNS topic. It receives the list of anomalous source IPs.
Update the network ACL. The network ACL updater function uses two network ACLs called blue and green. This approach keeps the active rules in place while the rules in the inactive network ACL are updated. When the inactive network ACL rules are updated, the function swaps network ACLs on each subnet. By default, each network ACL has a limit of 20 rules. If the number of anomalous source IPs exceeds the network ACL limit, the source IPs with the highest Z-scores are prioritized. CloudWatch metrics are provided to help you track the number of source IPs blocked, and how many source IPs couldn’t be blocked due to network ACL limits.
Prerequisites
This solution assumes you have one or more public subnets used to operate an internet-facing endpoint.
Deploy the solution
Follow these steps to deploy and validate the solution.
Download the latest release from GitHub.
Upload the AWS CloudFormation templates and Python code to an S3 bucket.
Gather the information needed for the CloudFormation template parameters.
Create the CloudFormation stack.
Monitor traffic mitigation activity using the CloudWatch dashboard.
Let’s review the steps I followed in my environment.
Step 1. Download the latest release
I create a new directory on my computer named auto-nacl-deploy. I review the releases on GitHub and choose the latest version. I download auto-nacl.zip into the auto-nacl-deploy directory. Now it’s time to stage this code in Amazon Simple Storage Service (Amazon S3).
Figure 2: Save auto-nacl.zip to the auto-nacl-deploy directory
Step 2. Upload the CloudFormation templates and Python code to an S3 bucket
I extract the auto-nacl.zip file into my auto-nacl-deploy directory.
Figure 3: Expand auto-nacl.zip into the auto-nacl-deploy directory
The template.yaml file is used to create a CloudFormation stack with four nested stacks. You copy all files to an S3 bucket prior to creating the stacks.
To stage these files in Amazon S3, use an existing bucket or create a new one. For this example, I used an existing S3 bucket named auto-nacl-us-east-1. Using the Amazon S3 console, I created a folder named artifacts and then uploaded the extracted files to it. My bucket now looks like Figure 4.
Figure 4: Upload the extracted files to Amazon S3
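If you prefer the AWS CLI to the console, a roughly equivalent upload (assuming the extracted files sit directly in the auto-nacl-deploy directory) is:

aws s3 cp ~/auto-nacl-deploy/ s3://auto-nacl-us-east-1/artifacts/ --recursive --exclude "auto-nacl.zip"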
Step 3. Gather information needed for the CloudFormation template parameters
There are six parameters required by the CloudFormation template.
VpcId: The ID of the VPC that runs your application.
SubnetIds: A comma-delimited list of public subnet IDs used by your endpoint.
ListenerPort: The port number of your endpoint’s listener.
ListenerProtocol: The Internet Protocol (TCP or UDP) used by your endpoint.
SourceCodeS3Bucket: The S3 bucket that contains the files you uploaded in Step 2. This bucket must be in the same AWS Region as the CloudFormation stack.
SourceCodeS3Prefix: The S3 prefix (folder) of the files you uploaded in Step 2.
For the VpcId parameter, I use the VPC console to find the VPC ID for my application.
Figure 5: Find the VPC ID
For the SubnetIds parameter, I use the VPC console to find the subnet IDs for my application. My VPC has public and private subnets. For this solution, you only need the public subnets.
Figure 6: Find the subnet IDs
My application uses a Network Load Balancer that listens on port 80 to handle TCP traffic. I use 80 for ListenerPort and TCP for ListenerProtocol.
The next two parameters are based on the Amazon S3 location I used earlier. I use auto-nacl-us-east-1 for SourceCodeS3Bucket and artifacts for SourceCodeS3Prefix.
Step 4. Create the CloudFormation stack
I use the CloudFormation console to create a stack. The Amazon S3 URL format is https://<bucket>.s3.<region>.amazonaws.com/<prefix>/template.yaml. I enter the Amazon S3 URL for my environment, then choose Next.
Figure 7: Specify the CloudFormation template
I enter a name for my stack (for example, auto-nacl-1) along with the parameter values I gathered in Step 3. I leave all optional parameters as they are, then choose Next.
Figure 8: Provide the required parameters
I review the stack options, then scroll to the bottom and choose Next.
Figure 9: Review the default stack options
I scroll down to the Capabilities section and acknowledge the capabilities required by CloudFormation, then choose Submit.
Figure 10: Acknowledge the capabilities required by CloudFormation
I wait for the stack to reach CREATE_COMPLETE status. It takes 10–15 minutes to create all of the nested stacks.
Figure 11: Wait for the stacks to complete
Step 5. Monitor traffic mitigation activity using the CloudWatch dashboard
After the CloudFormation stacks are complete, I navigate to the CloudWatch console to open the dashboard. In my environment, the dashboard is named auto-nacl-1-MitigationDashboard-YS697LIEHKGJ.
Figure 12: Find the CloudWatch dashboard
Initially, the dashboard, shown in Figure 13, has little information to display. After an hour, I can see the following metrics from my sample environment:
The Network Traffic graph shows how many packets are allowed and rejected by network ACL rules. No anomalies have been detected yet, so this only shows allowed traffic.
The All Source IPs graph shows how many total unique source IP addresses are sending traffic.
The Anomalous Source Networks graph shows how many anomalous source networks are being blocked by network ACL rules (or not blocked due to network ACL rule limit). This graph is blank unless anomalies have been detected in the last hour.
The Anomalous Source IPs graph shows how many anomalous source IP addresses are being blocked (or not blocked) by network ACL rules. This graph is blank unless anomalies have been detected in the last hour.
The Packet Statistics graph can help you determine if the sensitivity should be adjusted. This graph shows the average packets-per-minute and the associated standard deviation over the past hour. It also shows the anomaly threshold, which represents the minimum number of packets-per-minute for a source IP address to be considered an anomaly. The anomaly threshold is calculated based on the CloudFormation parameter MinZScore.
anomaly threshold = (MinZScore * standard deviation) + average
Increasing the MinZScore parameter raises the threshold and reduces sensitivity. You can also adjust the CloudFormation parameter MinPacketsPerBin to mitigate against blocking traffic during periods of low volume, even if a source IP address exceeds the minimum Z-score.
The Blocked IPs grid shows which source IP addresses are being blocked during each hour, along with the corresponding packet bin size and Z-score. This grid is blank unless anomalies have been detected in the last hour.
Figure 13: Observe the dashboard after one hour
Let’s review a scenario to see what happens when my endpoint sees two waves of anomalous traffic.
By default, my network ACL allows a maximum of 20 inbound rules. The two default rules count toward this limit, so I only have room for 18 more inbound rules. My application sees a spike of network traffic from 20 unique source IP addresses. When the traffic spike begins, the anomaly is detected in less than five minutes. Network ACL rules are created to block the top 18 source IP addresses (sorted by Z-score). Traffic is blocked for about 5 minutes until the flood subsides. The rules remain in place for 1 hour by default. When the same 20 source IP addresses send another traffic flood a few minutes later, most traffic is immediately blocked. Some traffic is still allowed from two source IP addresses that can’t be blocked due to the limit of 18 rules.
Figure 14: Observe traffic blocked from anomalous source IP addresses
Customize the solution
You can customize the behavior of this solution to fit your use case.
Block many IP addresses per network ACL rule. To enable blocking more source IP addresses than your network ACL rule limit, change the CloudFormation parameter NaclRuleNetworkMask (default is 32). This sets the network mask used in network ACL rules and lets you block IP address ranges instead of individual IP addresses. By default, the IP address 192.0.2.1 is blocked by a network ACL rule for 192.0.2.1/32. Setting this parameter to 24 results in a network ACL rule that blocks 192.0.2.0/24. As a reminder, address ranges that are too wide might result in blocking legitimate traffic.
Only block source IPs that exceed a packet volume threshold. Use the CloudFormation parameter MinPacketsPerBin (default is 12,000) to set the minimum packets per minute. This mitigates against blocking source IPs (even if their Z-score is high) during periods of overall low traffic when there is no need to block traffic.
Adjust the sensitivity of anomaly detection. Use the CloudFormation parameter MinZScore to set the minimum Z-score for a source IP to be considered an anomaly. The default is 3.0, which only blocks source IPs with packet volume that exceeds 99.7 percent of all other source IPs.
Exclude trusted source IPs from anomaly detection. Specify an allow list object in Amazon S3 that contains a list of IP addresses or CIDRs that you want to exclude from network ACL rules. The network ACL updater function reads the allow list every time it handles an SNS message.
Limitations
As covered in the preceding sections, this solution has a few limitations to be aware of:
CloudWatch Logs queries can only return up to 10,000 records. This means the traffic baseline can only be calculated based on the observation of 10,000 unique source IP addresses per minute.
The traffic baseline is based on a rolling 1-hour window. You might need to increase this if a 1-hour window results in a baseline that allows false positives. For example, you might need a longer baseline window if your service normally handles abrupt spikes that occur hourly or daily.
By default, a network ACL can only hold 20 inbound rules. This includes the default allow and deny rules, so there’s room for 18 deny rules. You can increase this limit from 20 to 40 with a support case; however, it means that a maximum of 18 (or 38) source IP addresses can be blocked at one time.
The speed of anomaly detection is dependent on how quickly VPC flow logs are delivered to CloudWatch. This usually takes 2–4 minutes but can take over 6 minutes.
Cost considerations
CloudWatch Logs Insights queries are the main element of cost for this solution. See CloudWatch pricing for more information. The cost is about 7.70 USD per GB of flow logs generated per month.
To optimize the cost of CloudWatch queries, the VPC flow log record format only includes the fields required for anomaly detection. The CloudWatch log group is configured with a retention of 1 day. You can tune your cost by adjusting the anomaly detector function to run less frequently (the default is twice per minute). The tradeoff is that the network ACL rules won’t be updated as frequently. This can lead to the solution taking longer to mitigate a traffic flood.
Conclusion
Maintaining high availability and responsiveness is important to keeping the trust of your customers. The solution described above can help you automatically mitigate a variety of network floods that can impact the availability of your application even if you’ve followed all the applicable best practices for DDoS resiliency. There are limitations to this solution, but it can quickly detect and mitigate disruptive sources of traffic in a cost-effective manner. Your feedback is important. You can share comments below and report issues on GitHub.
Amazon Web Services (AWS) is proud to announce the successful completion of the ISO/IEC 20000-1:2018 certification for the AWS Asia Pacific (Mumbai) and (Hyderabad) Regions in India.
The scope of the ISO/IEC 20000-1:2018 certification is limited to the IT Service Management System (ITSMS) of AWS India Data Center (DC) Operations that supports the delivery of Security Operations Center (SOC) and Network Operation Center (NOC) managed services.
ISO/IEC 20000-1 is a service management system (SMS) standard that specifies requirements for establishing, implementing, maintaining, and continually improving an SMS. An SMS supports the management of the service lifecycle, including the planning, design, transition, delivery, and improvement of services, which meet agreed upon requirements and deliver value for customers, users, and the organization that delivers the services.
The ISO/IEC 20000-1 certification provides an assurance that the AWS Data Center operations in India support the delivery of SOC and NOC managed services, in accordance with the ISO/IEC 20000-1 guidance and in line with the requirements of the Ministry of Electronics and Information Technology (MeitY), government of India.
AWS is committed to bringing new services into the scope of its compliance programs to help you meet your architectural, business, and regulatory needs. If you have questions about the ISO/IEC 20000-1:2018 certification, contact your AWS account team.
In March 2022, AWS announced support for custom certificate extensions, including name constraints, using AWS Certificate Manager (ACM) Private Certificate Authority (CA). Defining DNS name constraints with your subordinate CA can help establish guardrails to improve public key infrastructure (PKI) security and mitigate certificate misuse. For example, you can set a DNS name constraint that restricts the CA from issuing certificates to a resource that is using a specific domain name. Certificate requests from resources using an unauthorized domain name will be rejected by your CA and won’t be issued a certificate.
In this blog post, I’ll walk you step-by-step through the process of applying DNS name constraints to a subordinate CA by using the AWS Private CA service.
Prerequisites
You need to have the following prerequisite tools, services, and permissions in place before following the steps presented within this post:
All of the examples in this blog post are provided for the us-west-2 AWS Region. You will need to make sure that you have access to resources in your desired Region and specify the Region in the example commands.
Retrieve the solution code
Our GitHub repository contains the Python code that you need in order to replicate the steps presented in this post. There are two methods provided for cloning the repository: HTTPS or SSH. Select the method that you prefer.
Creating a Python virtual environment will allow you to run this solution in a fresh environment without impacting your existing Python packages. This will prevent the solution from interfering with dependencies that your other Python scripts may have. The virtual environment has its own set of Python packages installed. Read the official Python documentation on virtual environments for more information on their purpose and functionality.
To create a Python virtual environment
Create a new directory for the Python virtual environment in your home path.
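If you prefer to stay inside Python, the following is a minimal sketch (not the repository’s exact commands) that creates the environment with the standard library venv module; the directory name is illustrative.

```python
# Create a virtual environment in your home path using the stdlib venv module.
import venv
from pathlib import Path

env_dir = Path.home() / "name-constraints-venv"   # illustrative directory name
venv.EnvBuilder(with_pip=True).create(env_dir)    # creates the environment with pip available
print(f"Activate it with: source {env_dir}/bin/activate")
```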
Generate the API passthrough file with encoded name constraints
This step allows you to define the permitted and excluded DNS name constraints to apply to your subordinate CA. Read the documentation on name constraints in RFC 5280 for more information on their usage and functionality.
The Python encoder provided in this solution accepts two arguments for the permitted and excluded name constraints. The -p argument is used to provide the permitted subtrees, and the -e argument is used to provide the excluded subtrees. Use commas without spaces to separate multiple entries. For example: -p .dev.example.com,.test.example.com -e .prod.dev.example.com,.amazon.com.
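For illustration, the following sketch shows roughly what such an encoder produces, using the cryptography package (version 36 or later is assumed for ExtensionType.public_bytes). It mirrors, but is not, the repository’s name-constraints-encoder.py, and the domain names are the ones from the example above.

```python
# DER-encode the name constraints, base64 them, and wrap them in an API passthrough document.
import base64
import json
from cryptography import x509

permitted = [".dev.example.com", ".test.example.com"]
excluded = [".prod.dev.example.com", ".amazon.com"]

constraints = x509.NameConstraints(
    permitted_subtrees=[x509.DNSName(name) for name in permitted],
    excluded_subtrees=[x509.DNSName(name) for name in excluded],
)

api_passthrough = {
    "Extensions": {
        "CustomExtensions": [
            {
                "ObjectIdentifier": "2.5.29.30",  # name constraints OID
                "Value": base64.b64encode(constraints.public_bytes()).decode(),
                "Critical": True,
            }
        ]
    }
}

with open("api_passthrough_config.json", "w") as f:
    json.dump(api_passthrough, f, indent=2)
```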
To encode your name constraints
Run the following command, and update <~/github> with your information and provide your desired name constraints for the permitted (-p) and excluded (-e) arguments.
If the command runs successfully, you will see the message “Successfully Encoded Name Constraints” and the name of the generated API passthrough JSON file. The output of Permitted Subtrees will show the domain names you passed with the -p argument, and Excluded Subtrees will show the domain names you passed with the -e argument in the previous step.
Figure 1: Command line output example for name-constraints-encoder.py
Use the following command to display the contents of the API passthrough file generated by the Python encoder.
The contents of api_passthrough_config.json will look similar to the following screenshot. The JSON object will have an ObjectIdentifier key and value of 2.5.29.30, which represents the name constraints OID from the Global OID database. The base64-encoded Value represents the permitted and excluded name constraints you provided to the Python encoder earlier.
Figure 2: Viewing contents of api_passthrough_config.json
Generate a CSR from your subordinate CA
You must generate a certificate signing request (CSR) from the subordinate CA to which you intend to have the name constraints applied. Otherwise, you might encounter errors when you attempt to install the new certificate with name constraints.
To generate the CSR
Update and run the following command with your subordinate CA ARN and Region. An Amazon Resource Name (ARN) uniquely identifies an AWS resource; in this case, the ARN tells the command which subordinate CA to generate the CSR from.
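If you script these steps, a hedged boto3 equivalent of this CLI step looks like the following; the ARN and output file name are placeholders.

```python
# Retrieve the CSR from the subordinate CA so that the root CA can sign it.
import boto3

pca = boto3.client("acm-pca", region_name="us-west-2")
csr = pca.get_certificate_authority_csr(
    CertificateAuthorityArn="<subordinate-ca-arn>"
)["Csr"]

with open("subordinate_ca.csr", "w") as f:
    f.write(csr)
```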
The following screenshot provides an example output for a CSR. Your CSR details will be different; however, you should see something similar. Look for verify OK in the output and make sure that the Subject details match your subordinate CA. The subject details will provide the country, state, and city. The details will also likely contain your organization’s name, organizational unit or department name, and a common name for the subordinate CA.
Figure 3: Reviewing CSR content using openssl
Use the root CA to issue a new certificate with the name constraints custom extension
This post uses a two-tiered certificate authority architecture for simplicity. However, you can use the steps in this post with a more complex multi-level CA architecture. The name constraints certificate will be generated by the root CA and applied to the subordinate CA.
To issue and download a certificate with name constraints
Run the following command, making sure to update the argument values in red italics with your information. Make sure that the certificate-authority-arn is that of your root CA.
Note that the provided template-arn instructs the root CA to use the api_passthrough_config.json file that you created earlier to generate the certificate with the name constraints custom extension. If you use a different template, the new certificate might not be created as you intended.
Also, note that the validity period provided in this example is 5 years or 1825 days. The validity period for your subordinate CA must be less than that of your root CA.
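A sketch of the equivalent issue-certificate call in boto3 follows; the template ARN, signing algorithm, and file names are assumptions that you should adjust to your environment.

```python
# Issue the subordinate CA certificate from the root CA, passing the name
# constraints through the API passthrough document created earlier.
import json
import boto3

pca = boto3.client("acm-pca", region_name="us-west-2")

with open("api_passthrough_config.json") as f:
    api_passthrough = json.load(f)
with open("subordinate_ca.csr", "rb") as f:
    csr = f.read()

response = pca.issue_certificate(
    CertificateAuthorityArn="<root-ca-arn>",
    Csr=csr,
    SigningAlgorithm="SHA256WITHRSA",  # assumption: RSA-backed root CA
    # Assumed API passthrough subordinate CA template; confirm the template
    # name in the AWS Private CA documentation for your setup.
    TemplateArn="arn:aws:acm-pca:::template/SubordinateCACertificate_PathLen0_APIPassthrough/V1",
    ApiPassthrough=api_passthrough,
    Validity={"Value": 1825, "Type": "DAYS"},  # 5 years, shorter than the root CA
)
certificate_arn = response["CertificateArn"]
print(certificate_arn)
```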
If the issue-certificate command is successful, the output will provide the ARN of the new certificate that is issued by the root CA. Copy the certificate ARN, because it will be used in the following command.
Figure 4: Issuing a certificate with name constraints from the root CA using the AWS CLI
To download the new certificate, run the following command. Make sure to update the placeholders in red italics with your root CA’s certificate-authority-arn, the certificate-arn you obtained from the previous step, and your region.
Separate the certificate and certificate chain into two separate files by running the following commands. The new subordinate CA certificate is saved as cert.pem and the certificate chain is saved as cert_chain.pem.
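A boto3 sketch of the download-and-split step is shown below; the waiter absorbs the short delay while the certificate is issued, and the ARNs are placeholders.

```python
# Download the issued certificate and write the certificate and chain to
# the cert.pem and cert_chain.pem files referenced above.
import boto3

pca = boto3.client("acm-pca", region_name="us-west-2")
pca.get_waiter("certificate_issued").wait(
    CertificateAuthorityArn="<root-ca-arn>",
    CertificateArn="<certificate-arn>",
)
issued = pca.get_certificate(
    CertificateAuthorityArn="<root-ca-arn>",
    CertificateArn="<certificate-arn>",
)

with open("cert.pem", "w") as f:
    f.write(issued["Certificate"])
with open("cert_chain.pem", "w") as f:
    f.write(issued["CertificateChain"])
```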
The x509v3 Name Constraints portion of cert.pem should match the permitted and excluded name constraints you provided to the Python encoder earlier.
Figure 5: Verifying the X509v3 name constraints in the newly issued certificate using openssl
Install the name constraints certificate on the subordinate CA
In this section, you will install the name constraints certificate on your subordinate CA. Note that this will replace the existing certificate installed on the subordinate CA. The name constraints will go into effect as soon as the new certificate is installed.
To install the name constraints certificate
Run the following command with your subordinate CA’s certificate-authority-arn and path to the cert.pem and cert_chain.pem files you created earlier.
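A boto3 sketch of the install step, followed by a describe call to check the new NotBefore and NotAfter values (the ARN is a placeholder):

```python
# Install the name constraints certificate on the subordinate CA, then
# describe the CA to confirm the new certificate details.
import boto3

pca = boto3.client("acm-pca", region_name="us-west-2")

with open("cert.pem", "rb") as f:
    certificate = f.read()
with open("cert_chain.pem", "rb") as f:
    certificate_chain = f.read()

pca.import_certificate_authority_certificate(
    CertificateAuthorityArn="<subordinate-ca-arn>",
    Certificate=certificate,
    CertificateChain=certificate_chain,
)

details = pca.describe_certificate_authority(
    CertificateAuthorityArn="<subordinate-ca-arn>"
)["CertificateAuthority"]
print(details["NotBefore"], details["NotAfter"])
```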
The output from the previous command will be similar to the following screenshot. The CertificateAuthorityConfiguration and highlighted NotBefore and NotAfter fields in the output should match the name constraints certificate.
Figure 6: Verifying subordinate CA details using the AWS CLI
Test the name constraints
Now that your subordinate CA has the new certificate installed, you can test to see if the name constraints are being enforced based on the certificate you installed in the previous section.
To request a certificate from your subordinate CA and test the applied name constraints
To request a new certificate, update and run the following command with your subordinate CA’s certificate-authority-arn, region, and desired certificate subject in the domain-name argument.
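A boto3 sketch of the test request is shown here; the ARN is a placeholder, and the domain name intentionally falls within the excluded subtree from the earlier example.

```python
# Request a test certificate from the subordinate CA and check its status.
import boto3

acm = boto3.client("acm", region_name="us-west-2")
request = acm.request_certificate(
    DomainName="app.prod.dev.example.com",      # excluded by the example constraints
    CertificateAuthorityArn="<subordinate-ca-arn>",
)

# Re-run describe_certificate until the status is no longer pending.
details = acm.describe_certificate(CertificateArn=request["CertificateArn"])
print(details["Certificate"]["Status"], details["Certificate"].get("FailureReason"))
```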
You will see output similar to the following screenshot if the requested certificate domain name was not permitted by the name constraints applied to your subordinate CA. In this example, a certificate for app.prod.dev.example.com was rejected. The Status shows “FAILED” and the FailureReason indicates “PCA_NAME_CONSTRAINTS_VALIDATION”.
Figure 7: Verifying the status of the certificate request using the AWS CLI describe-certificate command
Effective collaboration is central to business success, and employees today depend heavily on messaging tools. An estimated 3.09 billion mobile phone users access messaging applications (apps) to communicate, and this figure is projected to grow to 3.51 billion users in 2025.
This post highlights the risks associated with messaging apps and describes how you can use enterprise solutions — such as AWS Wickr — that combine end-to-end encryption with data retention to drive positive security and business outcomes.
The business risks of messaging apps
Evolving threats, flexible work models, and a growing patchwork of data protection and privacy regulations have made maintaining secure and compliant enterprise messaging a challenge.
The use of third-party apps for business-related messages on both corporate and personal devices can make it more difficult to verify that data is being adequately protected and retained. This can lead to business risk, particularly in industries with unique record-keeping requirements. Organizations in the financial services industry, for example, are subject to rules that include Securities and Exchange Commission (SEC) Rule 17a-4 and Financial Industry Regulatory Authority (FINRA) Rule 3120, which require them to preserve all pertinent electronic communications.
A recent Gartner report on the viability of mobile bring-your-own-device (BYOD) programs noted, “It is now logical to assume that most financial services organizations with mobile BYOD programs for regulated employees could be fined due to a lack of compliance with electronic communications regulations.”
In the public sector, U.S. government agencies are subject to records requests under the Freedom of Information Act (FOIA) and various state sunshine statutes. For these organizations, effectively retaining business messages is about more than supporting security and compliance—it’s about maintaining public trust.
Securing enterprise messaging
Enterprise-grade messaging apps can help you protect communications from unauthorized access and facilitate desired business outcomes.
Security — Critical security protocols protect messages and files that contain sensitive and proprietary data — such as personally identifiable information, protected health information, financial records, and intellectual property — in transit and at rest to decrease the likelihood of a security incident.
Control — Administrative controls allow you to add, remove, and invite users, and organize them into security groups with restricted access to features and content at their level. Passwords can be reset and profiles can be deleted remotely, helping you reduce the risk of data exposure stemming from a lost or stolen device.
Compliance — Information can be preserved in a customer-controlled data store to help meet requirements such as those that fall under the Federal Records Act (FRA) and National Archives and Records Administration (NARA), as well as SEC Rule 17a-4 and Sarbanes-Oxley (SOX).
Marrying encryption with data retention
Enterprise solutions bring end-to-end encryption and data retention together in support of a comprehensive approach to secure messaging that balances people, process, and technology.
End-to-end encryption
Many messaging apps offer some form of encryption, but not all of them use end-to-end encryption. End-to-end encryption is a secure communication method that protects data from unauthorized access, interception, or tampering as it travels from one endpoint to another.
In end-to-end encryption, encryption and decryption take place locally, on the device. Every call, message, and file is encrypted with unique keys and remains indecipherable in transit. Unauthorized parties cannot access communication content because they don’t have the keys required to decrypt the data.
Encryption in transit compared to end-to-end encryption
Encryption in transit encrypts data over a network from one point to another (typically between one client and one server); data might remain stored in plaintext at the source and destination storage systems. End-to-end encryption combines encryption in transit and encryption at rest to secure data at all times, from being generated and leaving the sender’s device, to arriving at the recipient’s device and being decrypted.
“Messaging is a critical tool for any organization, and end-to-end encryption is the security technology that provides organizations with the confidence they need to rely on it.” — CJ Moses, CISO and VP of Security Engineering at AWS
Data retention
While data retention is often thought of as being incompatible with end-to-end encryption, leading enterprise-grade messaging apps offer both, giving you the option to configure a data store of your choice to retain conversations without exposing them to outside parties. No one other than the intended recipients and your organization has access to the message content, giving you full control over your data.
How AWS can help
AWS Wickr is an end-to-end encrypted messaging and collaboration service that was built from the ground up with features designed to help you keep internal and external communications secure, private, and compliant. Wickr protects one-to-one and group messaging, voice and video calling, file sharing, screen sharing, and location sharing with 256-bit Advanced Encryption Standard (AES) encryption, and provides data retention capabilities.
Figure 1: How Wickr works
With Wickr, each message gets a unique AES private encryption key, and a unique Elliptic-curve Diffie–Hellman (ECDH) public key to negotiate the key exchange with recipients. Message content — including text, files, audio, or video — is encrypted on the sending device (your iPhone, for example) using the message-specific AES key. This key is then exchanged via the ECDH key exchange mechanism, so that only intended recipients can decrypt the message.
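As a purely generic illustration of that hybrid pattern (not Wickr’s implementation, which also includes authentication, key ratcheting, and other protections), the following sketch generates a fresh AES-256 key per message and shares it through an ECDH-derived wrapping key, using the Python cryptography package.

```python
# Generic hybrid encryption sketch: per-message AES key, wrapped with an ECDH-derived key.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

recipient_key = ec.generate_private_key(ec.SECP256R1())  # recipient's key pair
sender_ephemeral = ec.generate_private_key(ec.SECP256R1())  # sender's ephemeral key pair

# Derive a key-encryption key from the ECDH shared secret.
shared_secret = sender_ephemeral.exchange(ec.ECDH(), recipient_key.public_key())
kek = HKDF(algorithm=hashes.SHA256(), length=32, salt=None, info=b"message-kek").derive(shared_secret)

# Encrypt the message with a unique per-message AES-256 key, then wrap that key.
message_key = AESGCM.generate_key(bit_length=256)
msg_nonce, key_nonce = os.urandom(12), os.urandom(12)
ciphertext = AESGCM(message_key).encrypt(msg_nonce, b"hello from the sending device", None)
wrapped_key = AESGCM(kek).encrypt(key_nonce, message_key, None)
```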
“As former employees of federal law enforcement, the intelligence community, and the military, Qintel understands the need for enterprise-federated, secure communication messaging capabilities. When searching for our company’s messaging application we evaluated the market thoroughly and while there are some excellent capabilities available, none of them offer the enterprise security and administrative flexibility that Wickr does.” — Bill Schambura, CEO at Qintel
Wickr network administrators can configure and apply data retention to both internal and external communications in a Wickr network. This includes conversations with guest users, external teams, and other partner networks, so you can retain messages and files sent to and from the organization to help meet internal, legal, and regulatory requirements.
Figure 2: Data retention process
Data retention is implemented as an always-on recipient that is added to conversations, not unlike the blind carbon copy (BCC) feature in email. The data-retention process participates in the key exchange, allowing it to decrypt messages. The process can run anywhere: on-premises, on an Amazon Elastic Compute Cloud (Amazon EC2) instance, or at a location of your choice.
Wickr networks can be created through the AWS Management Console, and workflows can be automated with Wickr bots. Wickr is currently available in the AWS US East (Northern Virginia), AWS GovCloud (US-West), AWS Canada (Central), and AWS Europe (London) Regions.
Keep your messages safe
Employees will continue to use messaging apps to chat with friends and family, and boost productivity at work. While many of these apps can introduce risks if not used properly in business settings, Wickr combines end-to-end encryption with data-retention capabilities to help you achieve security and compliance goals. Incorporating Wickr into a comprehensive approach to secure enterprise messaging that includes clear policies and security awareness training can help you to accelerate collaboration, while protecting your organization’s data.
A key part of protecting your organization’s non-public, sensitive data is to understand who can access it and from where. One of the common requirements is to restrict access to authorized users from known locations. To accomplish this, you should be familiar with the expected network access patterns and establish organization-wide guardrails to limit access to known networks. Additionally, you should verify that the credentials associated with your AWS Identity and Access Management (IAM) principals are only usable within these expected networks. On Amazon Web Services (AWS), you can use the network perimeter to apply coarse-grained network controls to your resources and principals. In this fourth blog post of the Establishing a data perimeter on AWS series, we explore the benefits and implementation considerations of defining your network perimeter.
The network perimeter is a set of coarse-grained controls that help you verify that your identities and resources can only be used from expected networks.
To achieve these security objectives, you first must define what expected networks means for your organization. Expected networks usually include approved networks your employees and applications use to access your resources, such as your corporate IP CIDR range and your VPCs. There are also scenarios where you need to permit access from networks of AWS services acting on your behalf or networks of trusted third-party partners that you integrate with. You should consider all intended data access patterns when you create the definition of expected networks. Other networks are considered unexpected and shouldn’t be allowed access.
Security risks addressed by the network perimeter
The network perimeter helps address the following security risks:
Unintended information disclosure through credential use from non-corporate networks
It’s important to consider the security implications of having developers with preconfigured access stored on their laptops. For example, let’s say that to access an application, a developer uses a command line interface (CLI) to assume a role and uses the temporary credentials to work on a new feature. The developer continues their work at a coffee shop that has great public Wi-Fi while their credentials are still valid. Accessing data through a non-corporate network means that they are potentially bypassing their company’s security controls, which might lead to the unintended disclosure of sensitive corporate data in a public space.
Unintended data access through stolen credentials
Organizations are prioritizing protection from credential theft risks, as threat actors can use stolen credentials to gain access to sensitive data. For example, a developer could mistakenly share credentials from an Amazon EC2 instance CLI access over email. After credentials are obtained, a threat actor can use them to access your resources and potentially exfiltrate your corporate data, possibly leading to reputational risk.
Figure 1 outlines an undesirable access pattern: using an employee corporate credential to access corporate resources (in this example, an Amazon Simple Storage Service (Amazon S3) bucket) from a non-corporate network.
Figure 1: Unintended access to your S3 bucket from outside the corporate network
Implementing the network perimeter
During the network perimeter implementation, you use IAM policies and global condition keys to help you control access to your resources based on which network the API request is coming from. IAM allows you to enforce the network origin of an API request by using both identity policies and resource policies.
The following two policies help you control both your principals and resources to verify that the request is coming from your expected network:
Service control policies (SCPs) are policies you can use to manage the maximum available permissions for your principals. SCPs help you verify that your accounts stay within your organization’s access control guidelines.
Resource-based policies are policies that are attached to resources in each AWS account. With resource-based policies, you can specify who has access to the resource and what actions they can perform on it. For a list of services that support resource-based policies, see AWS services that work with IAM.
With the help of these two policy types, you can enforce the control objectives using the following IAM global condition keys:
aws:SourceIp: You can use this condition key to create a policy that only allows requests from a specific IP CIDR range. For example, this key helps you define your expected networks as your corporate IP CIDR range.
aws:SourceVpc: This condition key helps you check whether the request comes from the list of VPCs that you specified in the policy. In a policy, this condition key is used to only allow access to an S3 bucket if the VPC where the request originated matches the VPC ID listed in your policy.
aws:SourceVpce: You can use this condition key to check if the request came from one of the VPC endpoints specified in your policy. Adding this key to your policy helps you restrict access to API calls that originate from VPC endpoints that belong to your organization.
aws:ViaAWSService: You can use this key to write a policy to allow an AWS service that uses your credentials to make calls on your behalf. For example, when you upload an object to Amazon S3 with server-side encryption with AWS Key Management Service (AWS KMS) on, S3 needs to encrypt the data on your behalf. To do this, S3 makes a subsequent request to AWS KMS to generate a data key to encrypt the object. The call that S3 makes to AWS KMS is signed with your credentials and originates outside of your network.
aws:PrincipalIsAWSService: This condition key helps you write a policy to allow AWS service principals to access your resources. For example, when you create an AWS CloudTrail trail with an S3 bucket as a destination, CloudTrail uses a service principal, cloudtrail.amazonaws.com, to publish logs to your S3 bucket. The API call from CloudTrail comes from the service network.
The following table summarizes the relationship between the control objectives and the capabilities used to implement the network perimeter.
Control objective | Implemented by using | Primary IAM capability
My resources can only be accessed from expected networks. | Resource-based policies | aws:SourceIp, aws:SourceVpc, aws:ViaAWSService, aws:PrincipalIsAWSService
My identities can access resources only from expected networks. | Service control policies (SCPs) | aws:SourceIp, aws:SourceVpc, aws:ViaAWSService, aws:PrincipalArn
My resources can only be accessed from expected networks
Start by implementing the network perimeter on your resources using resource-based policies. The perimeter should be applied to all resources that support resource-based policies in each AWS account. With this type of policy, you can define which networks can be used to access the resources, helping prevent access to your company resources if valid credentials are used from non-corporate networks.
The following is an example of a resource-based policy for an S3 bucket that limits access only to expected networks using the aws:SourceIp, aws:SourceVpc, aws:PrincipalIsAWSService, and aws:ViaAWSService condition keys. Replace <my-data-bucket>, <my-corporate-cidr>, and <my-vpc> with your information.
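What follows is a sketch of such a policy, reconstructed to match the description in the next paragraphs rather than copied from an official AWS example, expressed as a Python dict and attached with boto3.

```python
# Attach a network perimeter policy to an S3 bucket. Placeholder values must be replaced.
import json
import boto3

bucket = "<my-data-bucket>"
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EnforceNetworkPerimeter",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {
                # Deny only when the request is NOT from the corporate CIDR range...
                "NotIpAddressIfExists": {"aws:SourceIp": ["<my-corporate-cidr>"]},
                # ...and NOT from your VPC...
                "StringNotEqualsIfExists": {"aws:SourceVpc": ["<my-vpc>"]},
                # ...and NOT made by an AWS service principal or an AWS service on your behalf.
                "BoolIfExists": {
                    "aws:PrincipalIsAWSService": "false",
                    "aws:ViaAWSService": "false",
                },
            },
        }
    ],
}

boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```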
The Deny statement in the preceding policy has four condition keys where all conditions must resolve to true to invoke the Deny effect. Use the IfExists condition operator to clearly state that each of these conditions will still resolve to true if the key is not present on the request context.
This policy will deny Amazon S3 actions unless requested from your corporate CIDR range (NotIpAddressIfExists with aws:SourceIp), or from your VPC (StringNotEqualsIfExists with aws:SourceVpc). Notice that aws:SourceVpc and aws:SourceVpce are only present on the request if the call was made through a VPC endpoint. So, you could also use the aws:SourceVpce condition key in the preceding policy; however, this would mean listing every VPC endpoint in your environment. Because the number of VPC endpoints is greater than the number of VPCs, this example uses the aws:SourceVpc condition key.
This policy also creates a conditional exception for Amazon S3 actions requested by a service principal (BoolIfExists with aws:PrincipalIsAWSService), such as CloudTrail writing events to your S3 bucket, or by an AWS service on your behalf (BoolIfExists with aws:ViaAWSService), such as S3 calling AWS KMS to encrypt or decrypt an object.
Extending the network perimeter on resources
There are cases where you need to extend your perimeter to include AWS services that access your resources from outside your network. For example, if you’re replicating objects using S3 bucket replication, the calls to Amazon S3 originate from the service network outside of your VPC, using a service role. Another case where you need to extend your perimeter is if you integrate with trusted third-party partners that need access to your resources. If you’re using services with the described access pattern in your AWS environment or need to provide access to trusted partners, the policy EnforceNetworkPerimeter that you applied on your S3 bucket in the previous section will deny access to the resource.
In this section, you learn how to extend your network perimeter to include networks of AWS services using service roles to access your resources and trusted third-party partners.
AWS services that use service roles and service-linked roles to access resources on your behalf
Service roles are assumed by AWS services to perform actions on your behalf. An IAM administrator can create, change, and delete a service role from within IAM; this role exists within your AWS account and has an ARN like arn:aws:iam::<AccountNumber>:role/<RoleName>. A key difference between a service-linked role (SLR) and a service role is that the SLR is linked to a specific AWS service and you can view but not edit the permissions and trust policy of the role. An example is AWS Identity and Access Management Access Analyzer using an SLR to analyze resource metadata. To account for this access pattern, you can exempt roles on the service-linked role dedicated path arn:aws:iam::<AccountNumber>:role/aws-service-role/*, and for service roles, you can tag the role with the tag network-perimeter-exception set to true.
If you are exempting service roles in your policy based on a tag value, you must also include a policy to enforce the identity perimeter on your resource as shown in this sample policy. This helps verify that only identities from your organization can access the resource and cannot circumvent your network perimeter controls with the network-perimeter-exception tag.
Partners accessing your resources from their own networks
There might be situations where your company needs to grant access to trusted third parties. For example, providing a trusted partner access to data stored in your S3 bucket. You can account for this type of access by using the aws:PrincipalAccount condition key set to the account ID provided by your partner.
The following is an example of a resource-based policy for an S3 bucket that incorporates the two access patterns described above. Replace <my-data-bucket>, <my-corporate-cidr>, <my-vpc>, <third-party-account-a>, <third-party-account-b>, and <my-account-id> with your information.
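Below is a sketch of what the extended policy can look like, again a reconstruction that matches the description that follows rather than an official example, with the service-role tag exception and the service-linked role path exemption included.

```python
# Extended network perimeter policy: adds trusted partner accounts, the
# network-perimeter-exception tag, and the service-linked role path.
import json
import boto3

bucket = "<my-data-bucket>"
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EnforceNetworkPerimeter",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {
                "NotIpAddressIfExists": {"aws:SourceIp": ["<my-corporate-cidr>"]},
                "StringNotEqualsIfExists": {
                    "aws:SourceVpc": ["<my-vpc>"],
                    # Exempt tagged service roles and trusted partner accounts.
                    "aws:PrincipalTag/network-perimeter-exception": "true",
                    "aws:PrincipalAccount": [
                        "<third-party-account-a>",
                        "<third-party-account-b>",
                    ],
                },
                "BoolIfExists": {
                    "aws:PrincipalIsAWSService": "false",
                    "aws:ViaAWSService": "false",
                },
                # Exempt service-linked roles on their dedicated path.
                "ArnNotLikeIfExists": {
                    "aws:PrincipalArn": "arn:aws:iam::<my-account-id>:role/aws-service-role/*"
                },
            },
        }
    ],
}

boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```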
There are four condition operators in the policy above, and you need all four of them to resolve to true to invoke the Deny effect. Therefore, this policy only allows access to Amazon S3 from expected networks defined as: your corporate IP CIDR range (NotIpAddressIfExists and aws:SourceIp), your VPC (StringNotEqualsIfExists and aws:SourceVpc), networks of AWS service principals (aws:PrincipalIsAWSService), or an AWS service acting on your behalf (aws:ViaAWSService). It also allows access to networks of trusted third-party accounts (StringNotEqualsIfExists and aws:PrincipalAccount:<third-party-account-a>), and AWS services using an SLR to access your resources (ArnNotLikeIfExists and aws:PrincipalArn).
My identities can access resources only from expected networks
Applying the network perimeter on identity can be more challenging because you need to consider not only calls made directly by your principals, but also calls made by AWS services acting on your behalf. As described in access pattern 3, Intermediate IAM roles for data access, in this blog post, many AWS services assume an AWS service role that you created to perform actions on your behalf. The complicating factor is that even if the service supports VPC-based access to your data (for example, AWS Glue jobs can be deployed within your VPC to access data in your S3 buckets), the service might also use the service role to make other API calls outside of your VPC. For example, with AWS Glue jobs, the service uses the service role to deploy elastic network interfaces (ENIs) in your VPC. However, these calls to create ENIs in your VPC are made from the AWS Glue managed network and not from within your expected network. A broad network restriction in your SCP for all your identities might prevent the AWS service from performing tasks on your behalf.
Therefore, the recommended approach is to only apply the perimeter to identities that represent the highest risk of inappropriate use based on other compensating controls that might exist in your environment. These are identities whose credentials can be obtained and misused by threat actors. For example, if you allow your developers access to the Amazon Elastic Compute Cloud (Amazon EC2) CLI, a developer can obtain credentials from the Amazon EC2 instance profile and use the credentials to access your resources from their own network.
To summarize, to enforce your network perimeter based on identity, evaluate your organization’s security posture and what compensating controls are in place. Then, according to this evaluation, identify which service roles or human roles have the highest risk of inappropriate use, and enforce the network perimeter on those identities by tagging them with data-perimeter-include set to true.
The following policy shows the use of tags to enforce the network perimeter on specific identities. Replace <my-corporate-cidr>, and <my-vpc> with your own information.
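A sketch of such an SCP is shown here, a reconstruction consistent with the description that follows rather than an official example; test it carefully before use. It is defined as a Python dict and created through AWS Organizations.

```python
# SCP sketch: deny access from unexpected networks for identities tagged
# with data-perimeter-include=true.
import json
import boto3

scp_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EnforceNetworkPerimeterOnIdentities",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                # Only applies to identities explicitly opted in with this tag.
                "StringEquals": {"aws:PrincipalTag/data-perimeter-include": "true"},
                # Exceptions: AWS services calling on your behalf...
                "BoolIfExists": {"aws:ViaAWSService": "false"},
                # ...requests from the corporate CIDR range or your VPCs...
                "NotIpAddressIfExists": {"aws:SourceIp": ["<my-corporate-cidr>"]},
                "StringNotEqualsIfExists": {"aws:SourceVpc": ["<my-vpc>"]},
                # ...and the Amazon EC2 infrastructure role used for EBS decryption.
                "ArnNotLikeIfExists": {
                    "aws:PrincipalArn": ["arn:aws:iam::*:role/aws:ec2-infrastructure"]
                },
            },
        }
    ],
}

boto3.client("organizations").create_policy(
    Name="NetworkPerimeterOnIdentities",
    Description="Network perimeter for tagged identities",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp_document),
)
```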
The preceding policy statement uses the Deny effect to limit access to expected networks for identities with the data-perimeter-include tag attached to them (StringEquals with aws:PrincipalTag/data-perimeter-include set to true). This policy will deny access to those identities unless the request is made by an AWS service on your behalf (aws:ViaAWSService), comes from your corporate CIDR range (NotIpAddressIfExists with aws:SourceIp), or comes from your VPCs (StringNotEqualsIfExists with aws:SourceVpc).
Amazon EC2 also uses a special service role, also known as the infrastructure role, to decrypt Amazon Elastic Block Store (Amazon EBS) volumes. When you mount an encrypted Amazon EBS volume to an EC2 instance, EC2 calls AWS KMS to decrypt the data key that was used to encrypt the volume. The call to AWS KMS is signed by an IAM role, arn:aws:iam::*:role/aws:ec2-infrastructure, which is created in your account by EC2. For this use case, as you can see in the preceding policy, you can use the aws:PrincipalArn condition key to exclude this role from the perimeter.
IAM policy samples
This GitHub repository contains policy examples that illustrate how to implement network perimeter controls. The policy samples don’t represent a complete list of valid access patterns and are for reference only. They’re intended for you to tailor and extend to suit the needs of your environment. Make sure you thoroughly test the provided example policies before implementing them in your production environment.
Conclusion
In this blog post you learned about the elements needed to build the network perimeter, including policy examples and strategies on how to extend that perimeter. You now also know different access patterns used by AWS services that act on your behalf, how to evaluate those access patterns, and how to take a risk-based approach to apply the perimeter based on identities in your organization.
For additional learning opportunities, see Data perimeters on AWS. This information resource provides additional materials such as a data perimeter workshop, blog posts, whitepapers, and webinar sessions.
Earlier this year, Amazon Web Services (AWS) released Amazon Corretto Crypto Provider (ACCP) 2, a cryptography provider built by AWS for Java virtual machine (JVM) applications. ACCP 2 delivers comprehensive performance enhancements, with some algorithms (such as elliptic curve key generation) seeing a greater than 13-fold improvement over ACCP 1. The new release also brings official support for the AWS Graviton family of processors. In this post, I’ll discuss a use case for ACCP, then review performance benchmarks to illustrate the performance gains. Finally, I’ll show you how to get started using ACCP 2 in applications today.
This release changes the backing cryptography library for ACCP from OpenSSL (used in ACCP 1) to the AWS open source cryptography library, AWS libcrypto (AWS-LC). AWS-LC has extensive formal verification, as well as traditional testing, to assure the correctness of cryptography that it provides. While AWS-LC and OpenSSL are largely compatible, there are some behavioral differences that required the ACCP major version increment to 2.
The move to AWS-LC also allows ACCP to leverage performance optimizations in AWS-LC for modern processors. I’ll illustrate the ACCP 2 performance enhancements through the use case of establishing a secure communications channel with Transport Layer Security version 1.3 (TLS 1.3). Specifically, I’ll examine cryptographic components of the connection’s initial phase, known as the handshake. TLS handshake latency particularly matters for large web service providers, but reducing the time it takes to perform various cryptographic operations is an operational win for any cryptography-intensive workload.
TLS 1.3 requires ephemeral key agreement, which means that a new key pair is generated and exchanged for every connection. During the TLS handshake, each party generates an ephemeral elliptic curve key pair, exchanges public keys using Elliptic Curve Diffie-Hellman (ECDH), and agrees on a shared secret. Finally, the client authenticates the server by verifying the Elliptic Curve Digital Signature Algorithm (ECDSA) signature in the certificate presented by the server after key exchange. All of this needs to happen before you can send data securely over the connection, so these operations directly impact handshake latency and must be fast.
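To make those operations concrete, here is a small illustration of the same three primitives using the Python cryptography package; it is shown in Python only for brevity, since ACCP itself is consumed through the JCA in Java, and the curves and transcript value are illustrative.

```python
# Ephemeral key generation, ECDH key agreement, and ECDSA signature verification,
# the three operations benchmarked in Figure 1.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# 1. Each side generates an ephemeral elliptic curve key pair.
client_key = ec.generate_private_key(ec.SECP256R1())
server_key = ec.generate_private_key(ec.SECP256R1())

# 2. ECDH: both sides derive the same shared secret from the exchanged public keys.
client_secret = client_key.exchange(ec.ECDH(), server_key.public_key())
server_secret = server_key.exchange(ec.ECDH(), client_key.public_key())
assert client_secret == server_secret

# 3. ECDSA: the server signs the handshake transcript with its certificate key;
#    the client verifies the signature (raises on failure).
cert_key = ec.generate_private_key(ec.SECP384R1())
transcript = b"illustrative handshake transcript hash"
signature = cert_key.sign(transcript, ec.ECDSA(hashes.SHA384()))
cert_key.public_key().verify(signature, transcript, ec.ECDSA(hashes.SHA384()))
```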
Figure 1 shows benchmarks for the three elliptic curve algorithms that implement the TLS 1.3 handshake: elliptic curve key generation (up to 1,298% latency improvement in ACCP 2.0 over ACCP 1.6), ECDH key agreement (up to 858% latency improvement), and ECDSA digital signature verification (up to 260% latency improvement). These algorithms were benchmarked over three common elliptic curves with different key sizes on both ACCP 1 and ACCP 2. The choice of elliptic curve determines the size of the key used or generated by the algorithm, and key size correlates to performance. The following benchmarks were measured under the Amazon Corretto 11 JDK on a c7g.large instance running Amazon Linux with a Graviton 3 processor.
Figure 1: Percentage improvement of ACCP 2.0 over 1.6 performance benchmarks on c7g.large Amazon Linux Graviton 3
The performance improvements due to the optimization of secp384r1 in AWS-LC are particularly noteworthy.
Getting started
Whether you’re introducing ACCP to your project or upgrading from ACCP 1, start the onboarding process for ACCP 2 by updating your dependency manager configuration in your development or testing environment. The Maven and Gradle examples below assume that you’re using Linux on an ARM64 processor. If you’re using an x86 processor, substitute linux-x86_64 for linux-aarch64. After you’ve performed this update, sync your application’s dependencies and install ACCP in your JVM process. ACCP can be installed either by specifying our recommended security.properties file in your JVM invocation or programmatically at runtime. The following sections provide more details about all of these steps.
After ACCP has been installed, the Java Cryptography Architecture (JCA) will look for cryptographic implementations in ACCP first before moving on to other providers. So, as long as your application and dependencies obtain algorithms supported by ACCP from the JCA, your application should gain the benefits of ACCP 2 without further configuration or code changes.
Maven
If you’re using Maven to manage dependencies, add or update the following dependency configuration in your pom.xml file.
After updating your dependency manager, you’ll need to install ACCP. You can install ACCP using security properties as described in our GitHub repository. This installation method is a good option for users who have control over their JVM invocation.
Install programmatically
If you don’t have control over your JVM invocation, you can install ACCP programmatically. For Java applications, add the following code to your application’s initialization logic (optionally performing a health check).
Although the migration path to version 2 is straightforward for most ACCP 1 users, ACCP 2 ends support for some outdated algorithms: finite-field Diffie-Hellman key agreement, finite-field DSA signatures, and a National Institute of Standards and Technology (NIST)-specified random number generator. The removal of these algorithms is not backwards compatible, so you’ll need to check your code for their usage and, if you do find usage, either migrate to more modern algorithms provided by ACCP 2 or obtain implementations from a different provider, such as one of the default providers that ships with the JDK.
Check your code
Search for unsupported algorithms in your application code by their JCA names:
DH: Finite-field Diffie-Hellman key agreement
DSA: Finite-field Digital Signature Algorithm
NIST800-90A/AES-CTR-256: NIST-specified random number generator
Use ACCP 2 supported algorithms
Where possible, use these supported algorithms in your application code:
ECDH for key agreement instead of DH
ECDSA or RSA for signatures instead of DSA
Default SecureRandom instead of NIST800-90A/AES-CTR-256
If your use case requires the now-unsupported algorithms, check whether any of those algorithms are explicitly requested from ACCP.
If ACCP is not explicitly named as the provider, then you should be able to transparently fall back to another provider without a code change.
If ACCP is explicitly named as the provider, then remove that provider specification and register a different provider that offers the algorithm. This will allow the JCA to obtain an implementation from another registered provider without breaking backwards compatibility in your application.
Test your code
Some behavioral differences exist between ACCP 2 and other providers, including ACCP 1 (backed by OpenSSL). After onboarding or migrating, it’s important that you test your application code thoroughly to identify potential incompatibilities between cryptography providers.
Conclusion
Integrate ACCP 2 into your application today to benefit from AWS-LC’s security assurance and performance improvements. For a full list of changes, see the ACCP CHANGELOG on GitHub. Linux builds of ACCP 2 are now available on Maven Central for aarch64 and x86-64 processor architectures. If you encounter any issues with your integration, or have any feature suggestions, please reach out to us on GitHub by filing an issue.
If you have feedback about this post, submit comments in the Comments section below.
Want more AWS Security news? Follow us on Twitter.
In 2017, AWS announced the release of Rate-based Rules for AWS WAF, a new rule type that helps protect websites and APIs from application-level threats such as distributed denial of service (DDoS) attacks, brute force log-in attempts, and bad bots. Rate-based rules track the rate of requests for each originating IP address and invoke a rule action on IPs with rates that exceed a set limit.
While rate-based rules are useful to detect and mitigate a broad variety of bad actors, threats have evolved to bypass request-rate limit rules. For example, one bypass technique is to send a high volume of requests by spreading them across thousands of unique IP addresses.
In May 2023, AWS announced AWS WAF enhancements to the existing rate-based rules feature that you can use to create more dynamic and intelligent rules by using additional HTTP request attributes for request rate limiting. For example, you can now choose from the following predefined keys to configure your rules: label namespace, header, cookie, query parameter, query string, HTTP method, URI path, and source IP address or an IP address in a header. Additionally, you can combine up to five keys into a composite aggregation key for stronger rule development. These rule definition enhancements help improve perimeter security measures against sophisticated application-layer DDoS attacks using AWS WAF. For more information about the supported request attributes, see Rate-based rule statement in the AWS WAF Developer Guide.
In this blog post, you will learn more about these new AWS WAF feature enhancements and how you can use alternative request attributes to create more robust and granular sets of rules. In addition, you’ll learn how to combine keys to create a composite aggregation key to uniquely identify a specific combination of elements to improve rate tracking.
Getting started
Configuring advanced rate-based rules is similar to configuring simple rate-based rules. The process starts with creating a new custom rule of type rate-based rule, entering the rate limit value, selecting custom keys, choosing the key from the request aggregation key dropdown menu, and adding additional composite keys by choosing Add a request aggregation key as shown in Figure 1.
Figure 1: Creating an advanced rate-based rule with two aggregation keys
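To make the configuration concrete, the following is a sketch of what an equivalent rule could look like in a web ACL definition managed with boto3; the rule name, limit, and cookie name are illustrative values, not values from this post.

```python
# A rate-based rule entry for the Rules list of a wafv2 create_web_acl or
# update_web_acl call, aggregating on a session cookie combined with the source IP.
advanced_rate_rule = {
    "Name": "advanced-rate-limit",
    "Priority": 1,
    "Action": {"Block": {}},
    "VisibilityConfig": {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "advanced-rate-limit",
    },
    "Statement": {
        "RateBasedStatement": {
            "Limit": 1000,                      # requests per evaluation window
            "AggregateKeyType": "CUSTOM_KEYS",  # enables composite aggregation keys
            "CustomKeys": [
                {
                    "Cookie": {
                        "Name": "session-id",   # illustrative cookie name
                        "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                    }
                },
                {"IP": {}},                     # combined with the source IP address
            ],
        }
    },
}
```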
For existing rules, you can update those rate-based rules to use the new functionality by editing them. For example, you can add a header to be aggregated with the source IP address, as shown in Figure 2. Note that previously created rules are not modified automatically; you need to edit them to use the new aggregation keys.
Figure 2: Add a second key to an existing rate-based rule
You still can set the same rule action, such as block, count, captcha, or challenge. Optionally, you can continue applying a scope-down statement to limit rule action. For example, you can limit the scope to a certain application path or requests with a specified header. You can scope down the inspection criteria so that only certain requests are counted towards rate limiting, and use certain keys to aggregate those requests together. A technique would be to count only requests that have /api at the start of the URI, and aggregate them based on their SessionId cookie value.
Target use cases
Now that you’re familiar with the foundations of advanced rate-based rules, let’s explore how they can improve your security posture using the following use cases:
Enhanced Application (Layer 7) DDoS protection
Improved API security
Enriched request throttling
Use case 1: Enhance Layer 7 DDoS mitigation
The first use case that you might find beneficial is to enhance Layer 7 DDoS mitigation. An HTTP request flood is the most common vector of DDoS attacks. This attack type aims to affect application availability by exhausting available resources to run the application.
Before the release of these enhancements to AWS WAF rules, rate-based rules could only aggregate requests based on the IP address of the request origin, or be configured to use a forwarded IP address in an HTTP header such as X-Forwarded-For. Now you can create a more robust rate-based rule to help protect your web application from DDoS attacks by tracking requests based on a different key or a combination of keys. Let’s examine some examples.
To help detect pervasive bots, such as scrapers, scanners, and crawlers, or common bots that are distributed across many unique IP addresses, a rule can look for static request data such as a header value, for example, User-Agent.
To uniquely identify users behind a NAT gateway, you can use a cookie in addition to an IP address. Before the aggregation keys feature, it was difficult to distinguish between users who connect from the same IP address. Now, you can use the session cookie to aggregate requests by their session identifier and IP address.
Note that for Layer 7 DDoS mitigation, tracking by session ID in cookies can be circumvented, because bots might send random values or not send any cookie at all. It’s a good idea to keep an IP-based blanket rate-limiting rule to block offending IP addresses that reach a certain high rate, regardless of their request attributes. In that case, the keys would look like:
Key 1: Session cookie
Key 2: IP address
You can reduce false positives when using AWS Managed Rules (AMR) IP reputation lists by rate limiting based on their label namespace. Labelling functionality is a powerful feature that allows you to map the requests that match a specific pattern and apply custom rules to them. In this case, you can match the label namespace provided by the AMR IP reputation list that includes AWSManagedIPDDoSList, which is a list of IP addresses that have been identified as actively engaging in DDoS activities.
You might want to be cautious about using this group list in block mode, because there’s a chance of blocking legitimate users. To mitigate this, use the list in count mode and create an advanced rate-based rule to aggregate all requests with the label namespace awswaf:managed:aws:amazon-ip-list:, with CAPTCHA as the rule action. This lets you reduce false positives without compromising security. Applying CAPTCHA as the rule action avoids serving a CAPTCHA challenge to all users and instead only presents it when the rate of requests exceeds the defined limit. The key for this rule would be:
Labels (AMR IP reputation lists).
Use case 2: API security
In this second use case, you learn how to use an advanced rate-based rule to improve the security of an API. Protecting an API with rate-limiting rules helps ensure that requests aren’t being sent too frequently in a short amount of time. Reducing the risk from misusing an API helps to ensure that only legitimate requests are handled and not denied due to an overload of requests.
Now, you can create advanced rate-based rules that track API requests based on two aggregation keys. For example, you can combine the HTTP method (to differentiate between GET, POST, and other requests) with a header such as Authorization that carries a JSON Web Token (JWT). AWS WAF doesn’t decode JWTs; it only aggregates requests that present the same token. This can help to ensure that a token is not being used maliciously or to bypass rate-limiting rules. An additional benefit of this configuration is that requests with no Authorization header are aggregated together toward the rate-limiting threshold. The keys for this use case are:
Key 1: HTTP method
Key 2: Custom header (Authorization)
In addition, you can configure a rule to block and return a custom response when the request limit is reached. For example, by returning HTTP error code 429 (too many requests) with a Retry-After header indicating that the requester should wait 900 seconds (15 minutes) before making a new request.
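As one hedged sketch, the Block action of the rule shown earlier could be replaced with a custom response like the following; the header and wait values are illustrative.

```python
# A Block action with a custom response, suitable for the "Action" field of a
# rate-based rule definition.
block_with_retry_after = {
    "Block": {
        "CustomResponse": {
            "ResponseCode": 429,  # Too Many Requests
            "ResponseHeaders": [
                {"Name": "Retry-After", "Value": "900"}  # ask clients to wait 15 minutes
            ],
        }
    }
}
```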
Use case 3: Enriched request throttling
There are many situations where throttling should be considered. For example, if you want to maintain the performance of a service API by providing fair usage for all users, you can have different rate limits based on the type or purpose of the API, such as mutable or non-mutable requests. To achieve this, you can create two advanced rate-based rules using aggregation keys like IP address, combined with an HTTP request parameter for either mutable or non-mutable that indicates the type of request. Each rule will have its own HTTP request parameter, and you can set different maximum values for the rate limit. The keys for this use case are:
Key 1: HTTP request parameter
Key 2: IP address
Another example where throttling can be helpful is for a multi-tenant application where you want to track requests made by each tenant’s users. Let’s say you have a free tier but also a paying subscription model for which you want to allow a higher request rate. For this use case, it’s recommended to use two different URI paths to verify that the two tenants are kept separated. Additionally, it is advised to still use a custom header or query string parameter to differentiate between the two tenants, such as a tenant-id header or parameter that contains a unique identifier for each tenant. To implement this type of throttling using advanced rate-based rules, you can create two rules using an IP address in combination with the custom header as aggregation keys. Each rule can have its own maximum value for rate limiting, as well as a scope-down statement that matches requests for each URI path. The keys and scope-down statement for this use case are:
Key 1: Custom header (tenant-id)
Key 2: IP address
Scope down statement (URI path)
As a third example, you can rate-limit web applications based on the total number of requests that can be handled. For this use case, you can use the new Count all aggregation option. The option counts and rate-limits the requests that match the rule’s scope-down statement, which is required for this type of aggregation. One option is to scope down and inspect the URI path to target a specific functionality like a /history-search page. Another option, when you need to control how many requests go to a specific domain, is to scope down on a single header to a specific host, creating one rule for a.example.com and another rule for b.example.com.
Request Aggregation: Count all
Scope down statement (URI path | Single header)
For these examples, you can block with a custom response when the requests exceed the limit. For example, by returning the same HTTP error code and header, but adding a custom response body with a message like “You have reached the maximum number of requests allowed.”
Logging
The AWS WAF logs now include additional information about the request keys used for request-rate tracking and the values of the matched request keys. In addition to the existing IP or Forwarded_IP values, you can see the updated log fields limitKey and customValues: the limitKey field now shows either CustomKeys for custom aggregation key settings or Constant for count-all requests, and customValues shows an array of keys, names, and values.
Figure 3: Example log output for the advanced rate-based rule showing updated limitKey and customValues fields
As mentioned in the first use case, to get more detailed information about the traffic that’s analyzed by the web ACL, consider enabling logging. If you choose to enable Amazon CloudWatch Logs as the log destination, you can use CloudWatch Logs Insights and advanced queries to interactively search and analyze logs.
For example, you can use the following query to get the request information that matches rate-based rules, including the updated keys and values, directly from the AWS WAF console.
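The exact console query isn’t reproduced here; as one hedged example, you could run a similar query with boto3 against your WAF log group, where the log group name and the filtered field names are assumptions based on the standard AWS WAF log format.

```python
# Run a CloudWatch Logs Insights query for requests terminated by rate-based rules.
import time
import boto3

logs = boto3.client("logs")
query = """
fields @timestamp, httpRequest.clientIp, terminatingRuleId
| filter terminatingRuleType = "RATE_BASED"
| sort @timestamp desc
| limit 20
"""

start = logs.start_query(
    logGroupName="aws-waf-logs-example",  # hypothetical log group name
    startTime=int(time.time()) - 3600,    # last hour
    endTime=int(time.time()),
    queryString=query,
)

# Poll until the query finishes, then print the matching log records.
while True:
    results = logs.get_query_results(queryId=start["queryId"])
    if results["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)
print(results["results"])
```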
Figure 4 shows the CloudWatch Logs Insights query and the logs output, including the custom keys, names, and values fields.
Figure 4: The CloudWatch Logs Insights query and the logs output
Pricing
There is no additional cost for using advanced rate-based rules; standard AWS WAF pricing applies when you use this feature. For AWS WAF pricing information, see AWS WAF Pricing. You only need to be aware that using aggregation keys will increase AWS WAF web ACL capacity units (WCU) usage for the rule. WCU usage is calculated based on how many keys you want to use for rate limiting. The current model of 2 WCUs plus any additional WCUs for a nested statement is being updated to 2 WCUs as a base, and 30 WCUs for each custom aggregation key that you specify. For example, if you want to create aggregation keys with an IP address in combination with a session cookie, this will use 62 WCUs, and aggregation keys with an IP address, session cookie, and custom header will use 92 WCUs. For more details about the WCU-based cost structure, visit Rate-based rule statement in the AWS WAF Developer Guide.
Conclusion
In this blog post, you learned about AWS WAF enhancements to existing rate-based rules that now support request parameters in addition to IP addresses. Additionally, these enhancements allow you to create composite keys based on up to five request parameters. This new capability allows you to be either more coarse in aggregating requests (such as all the requests that have an IP reputation label associated with them) or finer (such as aggregating requests by a specific session ID rather than by its IP address).
For more rule examples that include JSON rule configuration, visit Rate-based rule examples in the AWS WAF Developer Guide.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
Over the past few decades, digital technologies have brought tremendous benefits to our societies, governments, businesses, and everyday lives. However, the more we depend on them for critical applications, the more we must do so securely. The increasing reliance on these systems comes with a broad responsibility for society, companies, and governments.
At Amazon Web Services (AWS), every employee, regardless of their role, works to verify that security is an integral component of every facet of the business (see Security at AWS). This goes hand-in-hand with new cybersecurity-related regulations, such as the Directive on Measures for a High Common Level of Cybersecurity Across the Union (NIS 2), formally adopted by the European Parliament and the Council of the European Union (EU) in December 2022. NIS 2 will be transposed into the national laws of the EU Member States by October 2024, and aims to strengthen cybersecurity across the EU.
AWS is excited to help customers become more resilient, and we look forward to even closer cooperation with national cybersecurity authorities to raise the bar on cybersecurity across Europe. Building society’s trust in the online environment is key to harnessing the power of innovation for social and economic development. It’s also one of our core Leadership Principles: Success and scale bring broad responsibility.
Compliance with NIS 2
NIS 2 seeks to ensure that entities mitigate the risks posed by cyber threats, minimize the impact of incidents, and protect the continuity of essential and important services in the EU.
Besides increased cooperation between authorities and support for enhanced information sharing amongst covered entities, NIS 2 includes minimum requirements for cybersecurity risk management measures and reporting obligations, which are applicable to a broad range of AWS customers based on their sector. Examples of sectors that must comply with NIS 2 requirements are energy, transport, health, public administration, and digital infrastructures. For the full list of covered sectors, see Annexes I and II of NIS 2. Generally, the NIS 2 Directive applies to a wider pool of entities than those currently covered by the NIS Directive, including medium-sized enterprises, as defined in Article 2 of the Annex to Recommendation 2003/361/EC (over 50 employees or an annual turnover over €10 million).
In several countries, aspects of the AWS service offerings are already part of the national critical infrastructure. For example, in Germany, Amazon Elastic Compute Cloud (Amazon EC2) and Amazon CloudFront are in scope for the KRITIS regulation. For several years, AWS has fulfilled its obligations to secure these services, has run audits related to national critical infrastructure, and has established channels for exchanging security information with the German Federal Office for Information Security (BSI) KRITIS office. AWS is also part of the UP KRITIS initiative, a cooperative effort between industry and the German Government to set industry standards.
AWS will continue to support customers in implementing resilient solutions, in accordance with the shared responsibility model. Compliance efforts within AWS will include implementing the requirements of the act and setting out technical and methodological requirements for cloud computing service providers, to be published by the European Commission, as foreseen in Article 21 of NIS 2.
AWS cybersecurity risk management – Current status
Even before the introduction of NIS 2, AWS has been helping customers improve their resilience and incident response capacities. Our core infrastructure is designed to satisfy the security requirements of the military, global banks, and other highly sensitive organizations.
AWS provides information and communication technology services and building blocks that businesses, public authorities, universities, and individuals use to become more secure, innovative, and responsive to their own needs and the needs of their customers. Security and compliance remain a shared responsibility between AWS and the customer. We make sure that the AWS cloud infrastructure complies with applicable regulatory requirements and good practices for cloud providers, and customers remain responsible for building compliant workloads in the cloud.
In total, AWS supports or has obtained over 143 security standards compliance certifications and attestations around the globe, such as ISO 27001, ISO 22301, ISO 20000, ISO 27017, and System and Organization Controls (SOC) 2. The following are some examples of European certifications and attestations that we’ve achieved:
C5 — provides a wide-ranging control framework for establishing and evidencing the security of cloud operations in Germany.
ENS High — comprises principles for adequate protection applicable to government agencies and public organizations in Spain.
HDS — demonstrates an adequate framework for technical and governance measures to secure and protect personal health data, governed by French law.
Pinakes — provides a rating framework intended to manage and monitor the cybersecurity controls of service providers upon which Spanish financial entities depend.
These and other AWS Compliance Programs help customers understand the robust controls in place at AWS to help ensure the security and compliance of the cloud. Through dedicated teams, we’re prepared to provide assurance about the approach that AWS has taken to operational resilience and to help customers achieve assurance about the security and resiliency of their workloads. AWS Artifact provides on-demand access to these security and compliance reports and many more.
For security in the cloud, it’s crucial for our customers to make security by design and security by default central tenets of product development. To begin with, customers can use the AWS Well-Architected tool to help build secure, high-performing, resilient, and efficient infrastructure for a variety of applications and workloads. Customers that use the AWS Cloud Adoption Framework (AWS CAF) can improve cloud readiness by identifying and prioritizing transformation opportunities. These foundational resources help customers secure regulated workloads. AWS Security Hub provides customers with a comprehensive view of their security state on AWS and helps them check their environments against industry standards and good practices.
With regard to the cybersecurity risk management measures and reporting obligations that NIS 2 mandates, existing AWS service offerings can help customers fulfill their part of the shared responsibility model and comply with future national implementations of NIS 2. For example, customers can use Amazon GuardDuty to detect a set of specific threats to AWS accounts and monitor for malicious activity. Amazon CloudWatch helps customers monitor the state of their AWS resources. With AWS Config, customers can continually assess, audit, and evaluate the configurations and relationships of selected resources on AWS, on premises, and on other clouds. Furthermore, AWS whitepapers, such as the AWS Security Incident Response Guide, help customers understand, implement, and manage fundamental security concepts in their cloud architecture.
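As a minimal illustration of putting two of those services to work programmatically (a sketch only; it assumes a GuardDuty detector already exists in the account and Region and that AWS Config is recording), you could pull active GuardDuty findings and summarize Config rule compliance with boto3:

```python
import boto3

guardduty = boto3.client("guardduty")
config = boto3.client("config")

# Assumes at least one GuardDuty detector exists in this account and Region.
detector_id = guardduty.list_detectors()["DetectorIds"][0]
active = guardduty.list_findings(
    DetectorId=detector_id,
    FindingCriteria={"Criterion": {"service.archived": {"Eq": ["false"]}}},
)["FindingIds"]
print(f"Active GuardDuty findings: {len(active)}")

# Quick compliance summary across AWS Config rules.
for rule in config.describe_compliance_by_config_rule()["ComplianceByConfigRules"]:
    print(rule["ConfigRuleName"], rule["Compliance"]["ComplianceType"])
```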
At Amazon, we strive to be the world’s most customer-centric company. For AWS Security Assurance, that means having teams that continuously engage with authorities to understand and exceed regulatory and customer obligations on behalf of customers. This is just one way that we raise the security bar in Europe. At the same time, we recommend that national regulators carefully assess potentially conflicting, overlapping, or contradictory measures.
We also cooperate with cybersecurity agencies around the globe because we recognize the importance of their role in keeping the world safe. To that end, we have built the Global Cybersecurity Program (GCSP) to provide agencies with a direct and consistent line of communication to the AWS Security team. Two examples of GCSP members are the Dutch National Cyber Security Centrum (NCSC-NL), with whom we signed a cooperation agreement in May 2023, and the Italian National Cybersecurity Agency (ACN). Together, we will work on cybersecurity initiatives and strengthen the cybersecurity posture across the EU. The war in Ukraine has shown how important such collaboration can be: AWS has played an important role in helping Ukraine’s government maintain continuity and provide critical services to citizens since the onset of the war.
The way forward
At AWS, we will continue to provide key stakeholders with greater insights into how we help customers tackle their most challenging cybersecurity issues and provide opportunities to deep dive into what we’re building. We very much look forward to continuing our work with authorities, agencies and, most importantly, our customers to provide for the best solutions and raise the bar on cybersecurity and resilience across the EU and globally.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
Welcome to another blog post from the AWS Customer Incident Response Team (CIRT)! For this post, we’re looking at two events that the team was involved in, from the viewpoint of a regularly discussed but sometimes misunderstood subject: least privilege. Specifically, we consider the idea that getting the benefit of reduced permissions in real-life use cases does not always require using the absolute minimum set of privileges. Instead, you need to weigh the cost and effort of creating and maintaining privileges against the risk reduction that is achieved, to make sure that your permissions are appropriate for your needs.
To quote VP and Distinguished Engineer at Amazon Security, Eric Brandwine, “Least privilege equals maximum effort.” This is the idea that creating and maintaining the smallest possible set of privileges needed to perform a given task will require the largest amount of effort, especially as customer needs and service features change over time. However, the correlation between effort and permission reduction is not linear. So, the question you should be asking is: How do you balance the effort of privilege reduction with the benefits it provides?
Unfortunately, this is not an easy question to answer. You need to consider the likelihood of an unwanted issue happening, the impact if that issue did happen, and the cost and effort to prevent (or detect and recover from) that issue. You also need to factor considerations such as your business goals and regulatory requirements into your decision process. Of course, you won’t need to consider just one potential issue, but many. Often it can be useful to start with a rough set of permissions and refine them as you develop your knowledge of what security level is required. You can also use service control policies (SCPs) to provide a set of permission guardrails if you’re using AWS Organizations. In this post, we tell two real-world stories where limiting AWS Identity and Access Management (IAM) permissions worked by limiting the impact of a security event, but where the permission set did not involve maximum effort.
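For example, a minimal sketch of registering such a guardrail with AWS Organizations might look like the following; the policy content and target ID are illustrative placeholders, not a recommendation for your environment:

```python
import json
import boto3

# Illustrative guardrail only: deny a couple of high-risk actions org-wide.
scp_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyHighRiskActions",
            "Effect": "Deny",
            "Action": ["organizations:LeaveOrganization", "cloudtrail:StopLogging"],
            "Resource": "*",
        }
    ],
}

orgs = boto3.client("organizations")
policy = orgs.create_policy(
    Name="example-guardrail-scp",
    Description="Example permission guardrail",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp_document),
)
# Attach the SCP to an OU or account; TargetId is a placeholder.
orgs.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="<ou-or-account-id>",
)
```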
Story 1: On the hunt for credentials
In this AWS CIRT story, we see how a threat actor was unable to achieve their goal because the access they obtained — a database administrator’s — did not have the IAM permissions they were after.
Background and AWS CIRT engagement
A customer came to us after they discovered unauthorized activity in their on-premises systems and in some of their AWS accounts. They had incident response capability and were looking for an additional set of hands with AWS knowledge to help them with their investigation. This helped to free up the customer’s staff to focus on the on-premises analysis.
Before our engagement, the customer had already performed initial containment activities. This included rotating credentials, revoking temporary credentials, and isolating impacted systems. They also had a good idea of which federated user accounts had been accessed by the threat actor.
The key part of every AWS CIRT engagement is the customer’s ask. Everything our team does falls on the customer side of the AWS Shared Responsibility Model, so we want to make sure that we are aligned to the customer’s desired outcome. The ask was clear—review the potential unauthorized federated users’ access, and investigate the unwanted AWS actions that were taken by those users during the known timeframe. To get a better idea of what was “unwanted,” we talked to the customer to understand at a high level what a typical day would entail for these users, to get some context around what sort of actions would be expected. The users were primarily focused on working with Amazon Relational Database Service (RDS).
Analysis and findings
For this part of the story, we’ll focus on a single federated user whose apparent actions we investigated, because the other federated users had not been impersonated by the threat actor in a meaningful way. We knew the approximate start and end dates to focus on and had discovered that the threat actor had performed a number of unwanted actions.
After reviewing the actions, it was clear that the threat actor had performed a console sign-in on three separate occasions within a 2-hour window. Each time, the threat actor attempted to perform a subset of the following actions:
Note: This list includes only the actions that are displayed as readOnly = false in AWS CloudTrail, because these actions are often (but not always) the more impactful ones, with the potential to change the AWS environment.
This is the point where the benefit of permission restriction became clear. As soon as this list was compiled, we noticed that two fields were present for all of the actions listed:
"errorCode": "Client.UnauthorizedOperation",
"errorMessage": "You are not authorized to perform this operation. [rest of message]"
As this reveals, every single non-readOnly action that was attempted by the threat actor was denied because the federated user account did not have the required IAM permissions.
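A minimal boto3 sketch of that kind of log review (the username and time window are placeholders, and this is an illustration rather than AWS CIRT's actual tooling) filters a federated user's CloudTrail events for authorization errors:

```python
import json
from datetime import datetime, timezone

import boto3

cloudtrail = boto3.client("cloudtrail")

# Placeholder federated user name and investigation window.
paginator = cloudtrail.get_paginator("lookup_events")
pages = paginator.paginate(
    LookupAttributes=[{"AttributeKey": "Username", "AttributeValue": "<federated-user>"}],
    StartTime=datetime(2023, 1, 1, tzinfo=timezone.utc),
    EndTime=datetime(2023, 1, 2, tzinfo=timezone.utc),
)

for page in pages:
    for event in page["Events"]:
        detail = json.loads(event["CloudTrailEvent"])
        if detail.get("errorCode") == "Client.UnauthorizedOperation":
            print(detail["eventTime"], detail["eventSource"], detail["eventName"])
```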
Customer communication and result
After we confirmed the findings, we had a call with the customer to discuss the results. As you can imagine, they were happy that the results showed no material impact to their data, and said no further investigation or actions were required at that time.
So what IAM permissions did the federated user have that prevented the set of actions the threat actor attempted?
The answer did not actually involve the absolute minimal set of permissions required by the user to do their job. It’s simply that the federated user had a role that didn’t have an Allow statement for the IAM actions the threat actor tried — their job did not require them. Without an explicit Allow statement, the attempted IAM actions were denied because IAM policies are Deny by default. In this instance, simply not having the desired IAM permissions meant that the threat actor wasn’t able to achieve their goal, and they stopped using the access. We’ll never know what their goal actually was, but the attempts to create access keys and passwords and to update policies suggest that they were probably trying to create another way to access that AWS account.
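To illustrate the deny-by-default point, here is a sketch of what such a role policy might have looked like; this is not the customer's actual policy, but it shows how the absence of an Allow statement for IAM actions results in an implicit deny:

```python
# Illustrative only -- not the customer's actual policy.
database_admin_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowRdsDayToDayWork",
            "Effect": "Allow",
            "Action": [
                "rds:Describe*",
                "rds:List*",
                "rds:StartDBInstance",
                "rds:StopDBInstance",
            ],
            "Resource": "*",
        }
        # No statement allows iam:CreateAccessKey, iam:CreateLoginProfile,
        # or iam:PutUserPolicy, so those calls fail with an implicit deny.
    ],
}
```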
Story 2: More instances for crypto mining
In this AWS CIRT story, we see how a threat actor’s inability to create additional Amazon Elastic Compute Cloud (Amazon EC2) instances resulted in the threat actor leaving without achieving their goal.
Because this account was new and currently only used for testing their software, the customer saw that the detection was related to the Amazon ECS cluster and decided to delete all the resources in the account and rebuild. Not too long after doing this, they received a similar GuardDuty alert for the new Amazon ECS cluster they had set up. The second finding resulted in the customer’s security team and AWS being brought in to try to understand what was causing this. The customer’s security team was focused on reviewing the tasks that were being run on the cluster, while AWS CIRT reviewed the AWS account actions and provided insight about the GuardDuty finding and what could have caused it.
Analysis and findings
Working with the customer, it wasn’t long before we discovered that the third-party Amazon ECS task definition that the customer was using was unintentionally exposing a web interface to the internet. That interface allowed unauthenticated users to run commands on the system. This explained why the same alert was also received shortly after the new install had been completed.
This is where the story takes a turn for the better. The AWS CIRT analysis of the AWS CloudTrail logs found that there were a number of attempts to use the credentials of the Task IAM role that was associated with the Amazon ECS task. The majority of actions were attempting to launch multiple Amazon EC2 instances via RunInstances calls. Every one of these actions, along with the other actions attempted, failed with either a Client.UnauthorizedOperation or an AccessDenied error message.
Next, we worked with the customer to understand the permissions provided by the Task IAM role. Once again, the permissions could have been limited more tightly. However, this time the goal of the threat actor — running a number of Amazon EC2 instances (most likely for surreptitious crypto mining) — did not align with the policy given to the role:
AWS CIRT recommended creating policies to restrict the allowed actions further, providing specific resources where possible, and potentially also adding in some conditions to limit other aspects of the access (such as the two Condition keys launched recently to limit where Amazon EC2 instance credentials can be used from). However, simply having the policy limit access to Amazon Simple Storage Service (Amazon S3) meant that the threat actor decided to leave with just the one Amazon ECS task running crypto mining rather than a larger number of Amazon EC2 instances.
Customer communication and result
After reporting these findings to the customer, there were two clear next steps: First, remove the now unwanted and untrusted Amazon ECS resource from their AWS account. Second, review and re-architect the Amazon ECS task so that access to the web interface was only provided to appropriate users. As part of that re-architecting, an Amazon S3 policy similar to “Allows read and write access to objects in an S3 bucket” was recommended. This separates Amazon S3 bucket actions from Amazon S3 object actions. When applications have a need to read and write objects in Amazon S3, they don’t normally have a need to create or delete entire buckets (or versioning on those buckets).
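A sketch of that separation (the bucket name is a placeholder) grants object-level read and write while leaving bucket-level configuration actions out entirely:

```python
# Illustrative policy: object-level access only; no bucket-level
# configuration actions such as s3:DeleteBucket or s3:PutBucketVersioning.
s3_object_access_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadWriteObjectsOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::<bucket-name>/*",
        }
    ],
}
```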
Some tools to help
We’ve just looked at how limiting privileges helped during two different security events. Now, let’s consider what can help you decide how to reduce your IAM permissions to an appropriate level. There are a number of resources that talk about different approaches:
The first approach is to use IAM Access Analyzer to help generate IAM policies based on access activity from log data. These can then be refined further with the addition of Condition elements as desired. We already have a couple of blog posts about that exact topic:
The third approach is a manual method of creating and refining policies to reduce the amount of work required. For this, you can begin with an appropriate AWS managed IAM policy or an AWS provided example policy as a starting point. Following this, you can add or remove Actions, Resources, and Conditions — using wildcards as desired — to balance your effort and permission reduction.
An example of balancing effort and permission is in the IAM tutorial Create and attach your first customer managed policy. In it, the authors create a policy that uses iam:Get* and iam:List* in the Actions section. Although not all iam:Get* and iam:List* Actions may be required, this is a good way to group similar Actions together while minimizing Actions that allow unwanted access — for example, iam:Create* or iam:Delete*. Another example of this balancing was mentioned earlier relating to Amazon S3, allowing access to create, delete, and read objects, but not to change the configuration of the bucket those objects are in.
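Returning to the tutorial's example, the iam:Get* and iam:List* grouping can be expressed as a policy document along these lines (a sketch in the spirit of that tutorial, not a copy of it):

```python
# Read-only IAM access grouped with wildcards; create and delete actions
# are deliberately absent and therefore implicitly denied.
iam_read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["iam:Get*", "iam:List*"], "Resource": "*"}
    ],
}
```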
In addition to limiting permissions, we also recommend that you set up appropriate detection and response capability. This will enable you to know when an issue has occurred and provide the tools to contain and recover from the issue. Further details about performing incident response in an AWS account can be found in the AWS Security Incident Response Guide.
There are also two services that were used to help in the stories we presented here — Amazon GuardDuty and AWS CloudTrail. GuardDuty is a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity. It’s a great way to monitor for unwanted activity within your AWS accounts. CloudTrail records account activity across your AWS infrastructure and provides the logs that were used for the analysis that AWS CIRT performed for both these stories. Making sure that both of these are set up correctly is a great first step towards improving your threat detection and incident response capability in AWS.
Conclusion
In this post, we looked at two examples where limiting privilege provided positive results during a security event. In the second case, the policy used should probably have restricted permissions further, but even as it stood, it was an effective preventative control in stopping the unauthorized user from achieving their assumed goal.
Hopefully these stories will provide new insight into the way your organization thinks about setting permissions, while taking into account the effort of creating the permissions. These stories are a good example of how starting a journey towards least privilege can help stop unauthorized users. Neither of the scenarios had policies that were least privilege, but the policies were restrictive enough that the unauthorized users were prevented from achieving their goals this time, resulting in minimal impact to the customers. However, in both cases, AWS CIRT recommended further reducing the scope of the IAM policies being used.
Finally, we provided a few ways to go about reducing permissions—first, by using tools to assist with policy creation, and second, by editing existing policies so they better fit your specific needs. You can get started by checking your existing policies against what Access Analyzer would recommend, by looking for and removing overly permissive wildcard characters (*) in some of your existing IAM policies, or by implementing and refining your SCPs.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
The Amazon Web Services (AWS) HITRUST Compliance Team is excited to announce that 161 AWS services have been certified for the HITRUST CSF version 11.0.1 for the 2023 cycle. The full list of AWS services, which were audited by a third-party assessor and certified under the HITRUST CSF, is now available on our Services in Scope by Compliance Program page. You can view and download our HITRUST CSF certification at any time on demand through AWS Artifact.
The HITRUST CSF has been widely adopted by leading organizations in a variety of industries in their approach to security and privacy. Visit the HITRUST website for more information. HITRUST certification allows you, as an AWS customer, to tailor your security control baselines specific to your architecture and assessment scope, and inherit certification for those controls so they don’t have to be tested as a component of your HITRUST assessment. Because cloud-based controls don’t have to be retested, AWS customers enjoy savings in both time and cost for their own HITRUST assessment certification needs.
AWS HITRUST CSF certification is available for customer inheritance with an updated Shared Responsibility Matrix version 1.4.1
As an added benefit to our customers, organizations no longer have to assess inherited controls for their HITRUST validated assessment, because AWS already has! Our customers can deploy business solutions into the AWS Cloud environment and inherit our HITRUST CSF certification for those controls applicable to their cloud architecture, for services that are in scope of the AWS HITRUST assessment. A detailed listing of controls and corresponding inheritance values can be found on the HITRUST website.
The AWS HITRUST Inheritance Program supports the latest version of HITRUST controls (v11.1), and is excited to announce the availability of the latest Shared Responsibility Matrix (SRM) version 1.4.1. As an added benefit, the AWS HITRUST Inheritance Program also supports the control inheritance of AWS cloud-based workloads for new HITRUST e1 and i1 assessment types, as well as the validated r2-type assessments offered through HITRUST. The SRM is also backward-compatible to earlier versions of the HITRUST CSF from v9.1 through v11.
Additionally, through the AWS HITRUST Inheritance Program, AWS is a member of the Health 3rd Party Trust Initiative (Health3PT), a consortium of the largest US-based healthcare systems that is proactively committed to reducing third-party information security risk with more reliable and efficient assurances. You can find additional information at https://health3pt.org.
As always, we value your feedback and questions and are committed to helping you achieve and maintain the highest standard of security and compliance. Feel free to contact the team through AWS Compliance Contact Us.
Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.
We continue to listen to our customers, regulators, and stakeholders to understand their needs regarding audit, assurance, certification, and attestation programs at Amazon Web Services (AWS). We’re pleased to announce that Spring 2023 System and Organization Controls (SOC) 1, SOC 2, and SOC 3 reports are now available in Spanish. These translated reports will help drive greater engagement and alignment with customer and regulatory requirements across Latin America and Spain.
The Spanish-language versions of the reports don’t contain the independent opinions issued by the auditors or the control test results, but you can find this information in the English-language versions. Stakeholders should use the English version as a complement to the Spanish version.
Spanish-translated SOC reports are available to customers through AWS Artifact. Spanish-translated SOC reports will be published twice a year, in alignment with the Fall and Spring reporting cycles.
We value your feedback and questions—feel free to reach out to our team or give feedback about this post through the Contact Us page.
If you have feedback about this post, submit comments in the Comments section below.
Spring 2023 SOC reports are now available in Spanish
We continue to listen to our customers, regulators, and stakeholders to understand their needs regarding audit, assurance, certification, and attestation programs at Amazon Web Services (AWS). We’re pleased to announce that the AWS Spring 2023 SOC 1, SOC 2, and SOC 3 reports are now available in Spanish. These translated reports will help drive greater engagement and alignment with customer and regulatory requirements across Latin America and Spain.
The English version of the reports should be consulted for the independent opinion issued by the auditors and the control test results, as a complement to the Spanish versions.
Spanish-translated SOC reports are available to customers through AWS Artifact. Spanish-translated SOC reports will be published twice a year, in alignment with the Fall and Spring reporting cycles.
We value your feedback and questions; feel free to reach out to our team or give feedback about this post through our Contact Us page.
If you have feedback about this post, submit comments in the Comments section below.
Want more AWS Security news? Follow us on Twitter.
With GitHub Actions, you can automate, customize, and run software development workflows directly within a repository. Workflows are defined using YAML and are stored alongside your code. I’ll discuss the specifics of how you can set up and use GitHub Actions within a repository in the sections that follow.
The cfn-policy-validator tool is a command-line tool that takes an AWS CloudFormation template, finds and parses the IAM policies that are attached to IAM roles, users, groups, and resources, and then runs those policies through IAM Access Analyzer policy checks. Implementing IAM policy validation checks at the time of code check-in helps shift security to the left (closer to the developer) and shortens the time between when developers commit code and when they get feedback on their work.
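Under the hood, the tool calls IAM Access Analyzer. As a rough illustration of that underlying API, and not of the tool's internals, you can run the same kind of check on a single policy document with boto3:

```python
import json
import boto3

analyzer = boto3.client("accessanalyzer")

# Placeholder identity policy to check; in the workflow, cfn-policy-validator
# extracts these documents from your CloudFormation template for you.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"}],
}

response = analyzer.validate_policy(
    policyDocument=json.dumps(policy_document),
    policyType="IDENTITY_POLICY",
)
for finding in response["findings"]:
    print(finding["findingType"], finding["issueCode"], finding["findingDetails"])
```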
Let’s walk through an example that checks the policies that are attached to an IAM role in a CloudFormation template. In this example, the cfn-policy-validator tool will find that the trust policy attached to the IAM role allows the role to be assumed by external principals. This configuration could lead to unintended access to your resources and data, which is a security risk.
Prerequisites
To complete this example, you will need the following:
A GitHub account
An AWS account, and an identity within that account that has permissions to create the IAM roles and resources used in this example
Step 1: Create a repository that will host the CloudFormation template to be validated
To begin with, you need to create a GitHub repository to host the CloudFormation template that is going to be validated by the cfn-policy-validator tool.
In the upper-right corner of the page, in the drop-down menu, choose New repository. For Repository name, enter a short, memorable name for your repository.
(Optional) Add a description of your repository.
Choose either Public (the repository is accessible to everyone on the internet) or Private (the repository is accessible only to people you explicitly share access with).
Under Initialize this repository with, select Add a README file.
Choose Create repository. Make a note of the repository’s name.
Step 2: Clone the repository locally
Now that the repository has been created, clone it locally and add a CloudFormation template.
To clone the repository locally and add a CloudFormation template:
Open the command-line tool of your choice.
Use the following command to clone the new repository locally. Make sure to replace <GitHubOrg> and <RepositoryName> with your own values.
Change into the directory that contains the locally cloned repository.
cd <RepositoryName>
Now that the repository is locally cloned, populate it with the following sample CloudFormation template. This template creates a single IAM role that allows a principal to assume the role to perform the s3:GetObject action.
Use the following command to create the sample CloudFormation template file.
WARNING: This sample role and policy should not be used in production. Using a wildcard in the principal element of a role’s trust policy would allow any IAM principal in any account to assume the role.
Notice that AssumeRolePolicyDocument refers to a trust policy that includes a wildcard value in the principal element. This means that the role could potentially be assumed by an external identity, and that’s a risk you want to know about.
Step 3: Vend temporary AWS credentials for GitHub Actions workflows
In order for the cfn-policy-validator tool that’s running in the GitHub Actions workflow to use the IAM Access Analyzer API, the GitHub Actions workflow needs a set of temporary AWS credentials. The AWS Credentials for GitHub Actions action helps address this requirement. This action implements the AWS SDK credential resolution chain and exports environment variables for other actions to use in a workflow. Environment variable exports are detected by the cfn-policy-validator tool.
AWS Credentials for GitHub Actions supports four methods for fetching credentials from AWS, but the recommended approach is to use GitHub’s OpenID Connect (OIDC) provider in conjunction with a configured IAM identity provider endpoint.
To configure an IAM identity provider endpoint for use in conjunction with GitHub’s OIDC provider:
In the IAM console, in the left-hand menu, choose Identity providers, and then choose Add provider.
For Provider type, choose OpenID Connect.
For Provider URL, enter https://token.actions.githubusercontent.com
Choose Get thumbprint.
For Audiences, enter sts.amazonaws.com
Choose Add provider to complete the setup.
At this point, make a note of the OIDC provider name. You’ll need this information in the next step.
After it’s configured, the IAM identity provider endpoint should look similar to the following:
Figure 1: IAM Identity provider details
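If you prefer to script the provider setup instead of clicking through the console, a minimal boto3 sketch looks like the following; the thumbprint value is a placeholder for the certificate thumbprint that the console's Get thumbprint step computes for you:

```python
import boto3

iam = boto3.client("iam")

response = iam.create_open_id_connect_provider(
    Url="https://token.actions.githubusercontent.com",
    ClientIDList=["sts.amazonaws.com"],
    ThumbprintList=["<certificate-thumbprint>"],  # placeholder value
)
print(response["OpenIDConnectProviderArn"])
```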
Step 4: Create an IAM role with permissions to call the IAM Access Analyzer API
In this step, you will create an IAM role that can be assumed by the GitHub Actions workflow and that provides the necessary permissions to run the cfn-policy-validator tool.
To create the IAM role:
In the IAM console, in the left-hand menu, choose Roles, and then choose Create role.
For Trust entity type, choose Web identity.
In the Provider list, choose the new GitHub OIDC provider that you created in the earlier step. For Audience, select sts.amazonaws.com from the list.
After you’ve attached the new policy, choose Next.
Note: For a full explanation of each of these actions and a CloudFormation template example that you can use to create this role, see the IAM Policy Validator for AWS CloudFormation GitHub project.
The default policy you just created allows GitHub Actions from organizations or repositories outside of your control to assume the role. To align with the IAM best practice of granting least privilege, let’s scope it down further to only allow a specific GitHub organization and the repository that you created earlier to assume it.
Replace the policy to look like the following, but don’t forget to replace {AWSAccountID}, {GitHubOrg} and {RepositoryName} with your own values.
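As a sketch of what such a scoped trust policy can look like, expressed here as a Python dict and applied with boto3 (it follows GitHub's documented OIDC condition keys; the exact statement in the project's example may differ):

```python
import json
import boto3

account_id = "<AWSAccountID>"
github_org = "<GitHubOrg>"
repository = "<RepositoryName>"

scoped_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {"token.actions.githubusercontent.com:aud": "sts.amazonaws.com"},
                # Only workflows in this organization and repository can assume the role.
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": f"repo:{github_org}/{repository}:*"
                },
            },
        }
    ],
}

# Role name is a placeholder for the role created in Step 4.
boto3.client("iam").update_assume_role_policy(
    RoleName="<role-name>", PolicyDocument=json.dumps(scoped_trust_policy)
)
```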
At this point, you’ve created and configured the following resources:
A GitHub repository that has been locally cloned and filled with a sample CloudFormation template.
An IAM identity provider endpoint for use in conjunction with GitHub’s OIDC provider.
A role that can be assumed by GitHub Actions, and a set of associated permissions that allow the role to make requests to IAM Access Analyzer to validate policies.
Step 5: Create a definition for the GitHub Actions workflow
The workflow runs steps on hosted runners. For this example, we are going to use Ubuntu as the operating system for the hosted runners. The workflow runs the following steps on the runner:
The workflow checks out the CloudFormation template by using the community actions/checkout action.
The workflow then uses the aws-actions/configure-aws-credentials GitHub action to request a set of credentials through the IAM identity provider endpoint and the IAM role that you created earlier.
The workflow runs a validation against the CloudFormation template by using the cfn-policy-validator tool.
The workflow is defined in a YAML document. In order for GitHub Actions to pick up the workflow, you need to place the definition file in a specific location within the repository: .github/workflows/main.yml. Note the “.” prefix in the directory name, indicating that this is a hidden directory.
To create the workflow:
Use the following command to create the folder structure within the locally cloned repository:
mkdir -p .github/workflows
Create the sample workflow definition file in the .github/workflows directory. Make sure to replace <AWSAccountID> and <AWSRegion> with your own information.
Push the local changes to the remote GitHub repository.
git push
After the changes are pushed to the remote repository, go back to https://github.com and open the repository that you created earlier. In the top-right corner of the repository window, there is a small orange indicator, as shown in Figure 2. This shows that your GitHub Actions workflow is running.
Figure 2: GitHub repository window with the orange workflow indicator
Because the sample CloudFormation template used a wildcard value “*” in the principal element of the policy, as described in Step 2: Clone the repository locally, the orange indicator turns to a red x (shown in Figure 3), which signals that something failed in the workflow.
Figure 3: GitHub repository window with the red cross workflow indicator
Choose the red x to see more information about the workflow’s status, as shown in Figure 4.
Figure 4: Pop-up displayed after choosing the workflow indicator
Choose Details to review the workflow logs.
In this example, the Validate templates step in the workflow has failed. A closer inspection shows that there is a blocking finding with the CloudFormation template. As shown in Figure 5, the finding is labeled EXTERNAL_PRINCIPAL and has a description of Trust policy allows access from external principals.
Figure 5: Details logs from the workflow showing the blocking finding
To remediate this blocking finding, you need to update the principal element of the trust policy to include a principal from your AWS account (considered a zone of trust). The resources and principals within your account comprise the zone of trust for the cfn-policy-validator tool. In the initial version of sample-role.yaml, the IAM role’s trust policy used a wildcard in the Principal element. This allowed principals outside of your control to assume the associated role, which caused the cfn-policy-validator tool to generate a blocking finding.
In this case, the intent is that principals within the current AWS account (zone of trust) should be able to assume this role. To achieve this result, replace the wildcard value with the account principal by following the remaining steps.
Open sample-role.yaml by using your preferred text editor, such as nano.
nano sample-role.yaml
Replace the wildcard value in the principal element with the account principal arn:aws:iam::<AWSAccountID>:root. Make sure to replace <AWSAccountID> with your own AWS account ID.
Add the updated file, commit the changes, and push the updates to the remote GitHub repository.
git add sample-role.yaml
git commit -m 'replacing wildcard principal with account principal'
git push
After the changes have been pushed to the remote repository, go back to https://github.com and open the repository. The orange indicator in the top right of the window should change to a green tick (check mark), as shown in Figure 6.
Figure 6: GitHub repository window with the green tick workflow indicator
This indicates that no blocking findings were identified, as shown in Figure 7.
Figure 7: Detailed logs from the workflow showing no more blocking findings
Conclusion
In this post, I showed you how to automate IAM policy validation by using GitHub Actions and the IAM Policy Validator for CloudFormation. Although the example was a simple one, it demonstrates the benefits of automating security testing at the start of the development lifecycle. This is often referred to as shifting security left. Identifying misconfigurations early and automatically supports an iterative, fail-fast model of continuous development and testing. Ultimately, this enables teams to make security an inherent part of a system’s design and architecture and can speed up product development workflows.
In addition to the example I covered today, IAM Policy Validator for CloudFormation can validate IAM policies by using a range of IAM Access Analyzer policy checks. For more information about these policy checks, see Access Analyzer reference policy checks.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
Amazon Security Lake automatically centralizes the collection of security-related logs and events from integrated AWS and third-party services. With the increasing amount of security data available, it can be challenging to know which data to focus on and which tools to use. You can use native AWS services such as Amazon QuickSight, Amazon OpenSearch Service, and Amazon SageMaker Studio to visualize, analyze, and interactively identify different areas of interest to focus on, and prioritize efforts to increase your AWS security posture.
In this post, we go over how to generate machine learning insights for Security Lake using SageMaker Studio. SageMaker Studio is a web integrated development environment (IDE) for machine learning that provides tools for data scientists to prepare, build, train, and deploy machine learning models. With this solution, you can quickly deploy a base set of Python notebooks focusing on AWS Security Hub findings in Security Lake, which can also be expanded to incorporate other AWS sources or custom data sources in Security Lake. After you’ve run the notebooks, you can use the results to help you identify and focus on areas of interest related to security within your AWS environment. As a result, you might implement additional guardrails or create custom detectors to alert on suspicious activity.
Prerequisites
Specify a delegated administrator account to manage the Security Lake configuration for all member accounts within your organization.
Security Lake has been enabled in the delegated administrator AWS account.
As part of the solution in this post, we focus on Security Hub as a data source. AWS Security Hub must be enabled for your AWS Organizations. When enabling Security Lake, select All log and event sources to include AWS Security Hub findings.
Figure 1 that follows depicts the architecture of the solution.
Figure 1 SageMaker machine learning insights architecture for Security Lake
The deployment builds the architecture by completing the following steps:
Security Lake is set up in an AWS account with supported log sources — such as Amazon VPC Flow Logs, AWS Security Hub, AWS CloudTrail, and Amazon Route 53 — configured.
Subscriber query access is created from the Security Lake AWS account to a subscriber AWS account.
Note: See Prerequisite #4 for more information.
The AWS RAM resource share request must be accepted in the subscriber AWS account where this solution is deployed.
Note: See Prerequisite #4 for more information.
A resource link database in Lake Formation is created in the subscriber AWS account and grants access for the Athena tables in the Security Lake AWS account.
A VPC is provisioned for SageMaker with an internet gateway (IGW), a NAT gateway, and VPC endpoints for the AWS services used in the solution. The IGW and NAT gateway are required to install external open-source packages.
A dedicated IAM role is created to restrict the creation of and access to the presigned URL for the SageMaker domain to a specific CIDR range, which is used to access the SageMaker notebook.
An AWS CodeCommit repository containing Python notebooks is used for the AI/ML workflow by the SageMaker user profile.
An Athena workgroup is created for the Security Lake queries with an S3 bucket for output location (access logging configured for the output bucket).
Option 1: Deploy the solution with AWS CloudFormation using the console
Use the console to sign in to your subscriber AWS account and then choose the Launch Stack button to open the AWS CloudFormation console pre-loaded with the template for this solution. It takes approximately 10 minutes for the CloudFormation stack to complete.
Option 2: Deploy the solution by using the AWS CDK
To build the app, navigate to the project’s root folder and use the following commands:
npm install -g aws-cdk
npm install
Update IAM_role_assumption_for_sagemaker_presigned_url and security_lake_aws_account default values in source/lib/sagemaker_domain.ts with their respective appropriate values.
Run the following commands in your terminal while authenticated in your subscriber AWS account. Be sure to replace <INSERT_AWS_ACCOUNT> with your account number and replace <INSERT_REGION> with the AWS Region that you want the solution deployed to.
Now that you’ve deployed the SageMaker solution, you must grant the SageMaker user profile in the subscriber AWS account query access to your Security Lake. You can Grant permission for the SageMaker user profile to Security Lake in Lake Formation in the subscriber AWS account.
Grant permission to the Security Lake database
Copy the SageMaker user profile Amazon Resource Name (ARN): arn:aws:iam::<account-id>:role/sagemaker-user-profile-for-security-lake
Go to Lake Formation in the console.
Select the amazon_security_lake_glue_db_us_east_1 database.
From the Actions Dropdown, select Grant.
In Grant Data Permissions, select SAML Users and Groups.
Paste the SageMaker user profile ARN from Step 1.
In Database Permissions, select Describe and then Grant.
Grant permission to Security Lake – Security Hub table
Copy the SageMaker user profile ARN: arn:aws:iam::<account-id>:role/sagemaker-user-profile-for-security-lake
Go to Lake Formation in the console.
Select the amazon_security_lake_glue_db_us_east_1 database.
Choose View Tables.
Select the amazon_security_lake_table_us_east_1_sh_findings_1_0 table.
From Actions Dropdown, select Grant.
In Grant Data Permissions, select SAML Users and Groups.
Paste the SageMaker user-profile ARN from Step 1.
In Table Permissions, select Describe and then Grant.
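Both grants can also be scripted from the subscriber account. A minimal boto3 sketch, assuming the same database and table names used in the console steps above:

```python
import boto3

lakeformation = boto3.client("lakeformation", region_name="us-east-1")
principal = {
    "DataLakePrincipalIdentifier": "arn:aws:iam::<account-id>:role/sagemaker-user-profile-for-security-lake"
}

# Describe access on the Security Lake resource link database.
lakeformation.grant_permissions(
    Principal=principal,
    Resource={"Database": {"Name": "amazon_security_lake_glue_db_us_east_1"}},
    Permissions=["DESCRIBE"],
)

# Describe access on the Security Hub findings table.
lakeformation.grant_permissions(
    Principal=principal,
    Resource={
        "Table": {
            "DatabaseName": "amazon_security_lake_glue_db_us_east_1",
            "Name": "amazon_security_lake_table_us_east_1_sh_findings_1_0",
        }
    },
    Permissions=["DESCRIBE"],
)
```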
Launch your SageMaker Studio application
Now that you have granted permissions to the SageMaker user profile, you can launch the SageMaker application associated with that user profile.
You’ll work primarily in the SageMaker user profile to create a data science app. As part of the solution deployment, we’ve created Python notebooks in CodeCommit that you will need to clone.
In the Stacks section, select the SageMakerDomainStack.
Select the Outputs tab.
Copy the value for sagemakernotebookmlinsightsrepositoryURL. (For example: https://git-codecommit.us-east-1.amazonaws.com/v1/repos/sagemaker_ml_insights_repo)
Go back to your SageMaker app.
In Studio, in the left sidebar, choose the Git icon (identified by a diamond with two branches), then choose Clone a Repository.
Figure 3: SageMaker clone CodeCommit repository
Paste the CodeCommit repository link from Step 4 under the Git repository URL (git). After you paste the URL, select Clone “https://git-codecommit.us-east-1.amazonaws.com/v1/repos/sagemaker_ml_insights_repo”, then select Clone.
NOTE: If you don’t select from the auto-populated drop-down, SageMaker won’t be able to clone the repository.
Figure 4: SageMaker clone CodeCommit URL
Generating machine learning insights using SageMaker Studio
You’ve successfully pulled the base set of Python notebooks into your SageMaker app and they can be accessed at sagemaker_ml_insights_repo/notebooks/tsat/. The notebooks provide you with a starting point for running machine learning analysis using Security Lake data. These notebooks can be expanded to existing native or custom data sources being sent to Security Lake.
Figure 5: SageMaker cloned Python notebooks
Notebook #1 – Environment setup
The 0.0-tsat-environ-setup notebook handles the installation of the required libraries and dependencies needed for the subsequent notebooks within this blog. For our notebooks, we use an open-source Python library called Kats, which is a lightweight, generalizable framework to perform time series analysis.
Select the 0.0-tsat-environ-setup.ipynb notebook for the environment setup.
Note: If you have already provisioned a kernel, you can skip steps 2 and 3.
In the right-hand corner, select No Kernel
In the Set up notebook environment pop-up, leave the defaults and choose Select.
After the kernel has successfully started, choose the Terminal icon to open the image terminal.
Figure 7: SageMaker application terminal
To install open-source packages from https instead of http, you must update the sources.list file. After the terminal opens, run the following commands:
cd /etc/apt
sed -i 's/http:/https:/g' sources.list
Go back to the 0.0-tsat-environ-setup.ipynb notebook, choose the Run drop-down, and select Run All Cells. Alternatively, you can run each cell independently, but it’s not required. Grab a coffee! This step will take about 10 minutes.
IMPORTANT: If you complete the installation out of order or update the requirements.txt file, you might not be able to successfully install Kats and you will need to rebuild your environment by using a net-new SageMaker user profile.
After installing all the prerequisites, check the Kats version to determine if it was successfully installed.
Figure 8: Kats installation verification
Install PyAthena (a Python DB API client for Amazon Athena), which is used to query your data in Security Lake.
You’ve successfully set up the SageMaker app environment! You can now load the appropriate dataset and create a time series.
Notebook #2 – Load data
The 0.1-load-data notebook establishes the Athena connection to query data in Security Lake and creates the resulting time series dataset. The time series dataset will be used for subsequent notebooks to identify trends, outliers, and change points.
Select the 0.1-load-data.ipynb notebook.
If you deployed the solution outside of us-east-1, update the con details to the appropriate Region. In this example, we’re focusing on Security Hub data within Security Lake. If you want to change the underlying data source, you can update the TABLE value.
Figure 9: SageMaker notebook load Security Lake data settings
In the Query section, there’s an Athena query to pull specific data from Security Hub. This can be expanded as needed to a subset of products or can include all products within Security Hub. The query below pulls Security Hub information after 01:00:00 on 1/1/2022 from the products listed in productname.
Figure 10: SageMaker notebook Athena query
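As a simplified sketch of what the notebook's connection and query cells do (the staging bucket is a placeholder, and the column names assume the default OCSF Security Hub findings schema, so your notebook's query may differ):

```python
from pyathena import connect

# Placeholder Athena output location; use the workgroup bucket created by the solution.
con = connect(
    s3_staging_dir="s3://<athena-query-results-bucket>/",
    region_name="us-east-1",
)

# Daily Security Hub finding counts from the Security Lake table. Column names
# assume the default OCSF schema and may need adjusting for your environment.
query = """
SELECT date(from_unixtime(time / 1000)) AS day,
       count(*) AS findings
FROM "amazon_security_lake_glue_db_us_east_1"."amazon_security_lake_table_us_east_1_sh_findings_1_0"
GROUP BY 1
ORDER BY 1
"""

cursor = con.cursor()
cursor.execute(query)
for day, findings in cursor.fetchall():
    print(day, findings)
```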
After the values have been updated, you can create your time series dataset. For this notebook, we recommend running each cell individually instead of running all cells at once so you can get a bit more familiar with the process. Select the first cell and choose the Run icon.
Figure 11: SageMaker run Python notebook code
Follow the same process as Step 4 for the subsequent cells.
You’ve successfully loaded your data and created a time series! You can now move on to generating machine learning insights from your time series.
Notebook #3 – Trend detector
The 1.1-trend-detector.ipynb notebook handles trend detection in your data. Trend represents a directional change in the level of a time series. This directional change can be either upward (increase in level) or downward (decrease in level). Trend detection helps detect a change while ignoring the noise from natural variability. Each environment is different, and trends help us identify where to look more closely to determine why a trend is positive or negative.
Select 1.1-trend-detector.ipynb notebook for trend detection.
Slopes are created to identify the relationship between x (time) and y (counts).
Figure 12: SageMaker notebook slope view
If the counts are increasing with time, then it’s considered a positive slope and the reverse is considered a negative slope. A positive slope isn’t necessarily a good thing because in an ideal state we would expect counts of a finding type to come down with time.
Figure 13: SageMaker notebook trend view
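As a conceptual illustration of the slope idea, and not the notebook's exact code, a least-squares fit over hypothetical daily counts for one finding type gives the direction and size of the trend:

```python
import numpy as np

# Hypothetical daily counts of a single finding type.
daily_counts = np.array([12, 15, 14, 18, 21, 19, 25, 27, 26, 30], dtype=float)
days = np.arange(len(daily_counts), dtype=float)

# Degree-1 least-squares fit: the first coefficient is the slope.
slope, intercept = np.polyfit(days, daily_counts, 1)
direction = "positive (counts increasing)" if slope > 0 else "negative (counts decreasing)"
print(f"slope = {slope:.2f} findings/day -> {direction}")
```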
Now you can plot the top five positive and negative trends to identify the top movers.
Figure 14: SageMaker notebook trend results view
Notebook #4 – Outlier detection
The 1.2-outlier-detection.ipynb notebook handles outlier detection. This notebook does a seasonal decomposition of the input time series, with additive or multiplicative decomposition as specified (the default is additive). It then works on the residual time series, removing only the trend, or both the trend and seasonality if the seasonality is strong. The intent is to discover useful, abnormal, and irregular patterns within data sets, allowing you to pinpoint areas of interest.
To start, it detects points in the residual that are over 5 times the inter-quartile range.
Inter-quartile range (IQR) is the difference between the seventy-fifth and twenty-fifth percentiles of residuals or the spread of data within the middle two quartiles of the entire dataset. IQR is useful in detecting the presence of outliers by looking at values that might lie outside of the middle two quartiles.
The IQR multiplier controls the sensitivity of the range and the decision to identify outliers. A larger value for the iqr_mult_thresh parameter in OutlierDetector treats more points as ordinary data, while a smaller value flags more data points as outliers.
Note: If you don’t have enough data, decrease iqr_mult_thresh to a lower value (for example iqr_mult_thresh=3).
Figure 15: SageMaker notebook outlier setting
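As a conceptual illustration of the IQR test itself, rather than the notebook's Kats-based implementation, flagging residual points that fall outside the multiplier-scaled inter-quartile range looks like this:

```python
import numpy as np

# Hypothetical residuals after removing trend and seasonality from the time series.
residuals = np.array([0.2, -0.1, 0.3, 0.1, -0.2, 5.8, 0.0, -0.3, 0.2, -6.1])
iqr_mult_thresh = 5  # larger value -> fewer points flagged as outliers

q1, q3 = np.percentile(residuals, [25, 75])
iqr = q3 - q1
lower, upper = q1 - iqr_mult_thresh * iqr, q3 + iqr_mult_thresh * iqr

outlier_idx = np.where((residuals < lower) | (residuals > upper))[0]
print("outlier positions:", outlier_idx, "values:", residuals[outlier_idx])
```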
Along with outlier detection plots, investigation SQL will be displayed as well, which can help with further investigation of the outliers.
In the diagram that follows, you can see that there are several outliers in the number of findings, related to failed AWS Firewall Manager policies, which have been identified by the vertical red lines within the line graph. These are outliers because they deviate from the normal behavior and number of findings on a day-to-day basis. When you see outliers, you can look at the resources that might have caused an unusual increase in Firewall Manager policy findings. Depending on the findings, it could be related to an overly permissive or noncompliant security group or a misconfigured AWS WAF rule group.
The 1.3-changepoint-detector.ipynb notebook handles change point detection. Change point detection is a method to detect changes in a time series that persist over time, such as a change in the mean value. The notebook detects a baseline and identifies the points from which such changes might have occurred. Change points occur when there’s an increase or decrease in the average number of findings within a data set.
Along with identifying change points within the data set, the investigation SQL is generated to further investigate the specific change point if applicable.
In the following diagram, you can see there’s a change point decrease after July 27, 2022, with confidence of 99.9 percent. It’s important to note that change points differ from outliers, which are sudden changes in the data set observed. This diagram means there was some change in the environment that resulted in an overall decrease in the number of findings for S3 buckets with block public access being disabled. The change could be the result of an update to the CI/CD pipelines provisioning S3 buckets or automation to enable all S3 buckets to block public access. Conversely, if you saw a change point that resulted in an increase, it could mean that there was a change that resulted in a larger number of S3 buckets with a block public access configuration consistently being disabled.
Figure 17: SageMaker changepoint detector view
By now, you should be familiar with the setup and deployment of SageMaker Studio and how you can use Python notebooks to generate machine learning insights for your Security Lake data. You can take what you’ve learned and start to curate specific datasets and data sources within Security Lake, create a time series, detect trends, and identify outliers and change points. By doing so, you can answer a variety of security-related questions, such as:
CloudTrail
Is there a large volume of Amazon S3 download or copy commands to an external resource? Are you seeing a large volume of S3 delete object commands? Is it possible there’s a ransomware event going on?
VPC Flow Logs
Is there an increase in the number of requests from your VPC to external IPs? Is there an increase in the number of requests from your VPC to your on-premises CIDR? Is there a possibility of internal or external data exfiltration occurring?
Route53
Which resources are making DNS requests that they haven’t typically made within the last 30–45 days? When did it start? Is there a potential command and control session occurring on an Amazon Elastic Compute Cloud (Amazon EC2) instance?
It’s important to note that this isn’t a solution to replace Amazon GuardDuty, which uses foundational data sources to detect communication with known malicious domains and IP addresses and identify anomalous behavior, or Amazon Detective, which provides customers with prebuilt data aggregations, summaries, and visualizations to help security teams conduct faster and more effective investigations. One of the main benefits of using Security Lake and SageMaker Studio is the ability to interactively create and tailor machine learning insights specific to your AWS environment and workloads.
Clean up
If you deployed the SageMaker machine learning insights solution by using the Launch Stack button in the AWS Management Console or the CloudFormation template sagemaker_ml_insights_cfn, do the following to clean up:
In the CloudFormation console for the account and Region where you deployed the solution, choose the SageMakerML stack.
Choose the option to Delete the stack.
If you deployed the solution by using the AWS CDK, run the command cdk destroy.
Conclusion
Amazon Security Lake gives you the ability to normalize and centrally store your security data from various log sources to help you analyze, visualize, and correlate appropriate security logs. You can then use this data to increase your overall security posture by implementing additional security guardrails or take appropriate remediation actions within your AWS environment.
In this blog post, you learned how you can use SageMaker to generate machine learning insights for your Security Hub findings in Security Lake. Although the example solution focuses on a single data source within Security Lake, you can expand the notebooks to incorporate other native or custom data sources in Security Lake.
There are many different use cases for Security Lake that can be tailored to fit your AWS environment. Take a look at this blog post to learn how you can ingest, transform, and deliver Security Lake data to Amazon OpenSearch Service to help your security operations team quickly analyze security data within your AWS environment. In supported Regions, new Security Lake account holders can try the service free for 15 days and gain access to its features.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
At AWS, we often hear from customers that they want expanded security coverage for the multiple services that they use on AWS. However, alert fatigue is a common challenge that customers face as we introduce new security protections. The challenge becomes how to operationalize, identify, and prioritize alerts that represent real risk.
In this post, we highlight recent enhancements to Amazon Detective finding groups visualizations. We show you how Detective automatically consolidates multiple security findings into a single security event—called a finding group—and how finding group visualizations help reduce noise and prioritize findings that present true risk. We incorporate additional services like Amazon GuardDuty, Amazon Inspector, and AWS Security Hub to highlight how effective finding groups are at consolidating findings from different AWS security services.
Overview of solution
This post uses several different services. The purpose is twofold: to show how you can enable these services for broader protection, and to show how Detective can help you investigate findings from multiple services without spending a lot of time sifting through logs or querying multiple data sources to find the root cause of a security event. These are the services and their use cases:
GuardDuty – a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity. If potential malicious activity, such as anomalous behavior, credential exfiltration, or command and control (C2) infrastructure communication is detected, GuardDuty generates detailed security findings that you can use for visibility and remediation. Recently, GuardDuty released the following threat detections for specific services that we’ll show you how to enable for this walkthrough: GuardDuty RDS Protection, EKS Runtime Monitoring, and Lambda Protection.
Amazon Inspector – an automated vulnerability management service that continually scans your AWS workloads for software vulnerabilities and unintended network exposure. Like GuardDuty, Amazon Inspector sends a finding for alerting and remediation when it detects a software vulnerability or a compute instance that’s publicly available.
Security Hub – a cloud security posture management service that performs automated, continuous security best practice checks against your AWS resources to help you identify misconfigurations, and aggregates your security findings from integrated AWS security services.
Detective – a security service that helps you investigate potential security issues. It does this by collecting log data from AWS CloudTrail, Amazon Virtual Private Cloud (Amazon VPC) flow logs, and other services. Detective then uses machine learning, statistical analysis, and graph theory to build a linked set of data called a security behavior graph that you can use to conduct faster and more efficient security investigations.
The following diagram shows how each service delivers findings along with log sources to Detective.
Figure 1: Amazon Detective log source diagram
Enable the required services
If you’ve already enabled the services needed for this post—GuardDuty, Amazon Inspector, Security Hub, and Detective—skip to the next section. For instructions on how to enable these services, see the following resources:
Each of these services offers a free 30-day trial and provides estimates on charges after your trial expires. You can also use the AWS Pricing Calculator to get an estimate.
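If you prefer to enable these services programmatically rather than through the console, the following is a minimal boto3 sketch, assuming you run it with administrative credentials in the account you want to protect and that the services are not already enabled. It shows one reasonable way to wire the calls together, not the only one.

```python
# Minimal sketch: enable GuardDuty, Amazon Inspector, Security Hub, and Detective
# in the current account. Assumes credentials with permissions for each service.
import boto3

guardduty = boto3.client("guardduty")
inspector2 = boto3.client("inspector2")
securityhub = boto3.client("securityhub")
detective = boto3.client("detective")

# GuardDuty: create a detector if this account doesn't have one yet
if not guardduty.list_detectors()["DetectorIds"]:
    guardduty.create_detector(Enable=True)

# Amazon Inspector: turn on EC2, ECR, and Lambda scanning
inspector2.enable(resourceTypes=["EC2", "ECR", "LAMBDA"])

# Security Hub: enable with the default standards
# (raises ResourceConflictException if Security Hub is already enabled)
securityhub.enable_security_hub(EnableDefaultStandards=True)

# Detective: create a behavior graph if this account doesn't have one yet
if not detective.list_graphs()["GraphList"]:
    detective.create_graph()
```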
To enable the services across multiple accounts, consider using a delegated administrator account in AWS Organizations. With a delegated administrator account, you can automatically enable services for multiple accounts and manage settings for each account in your organization. You can view other accounts in the organization and add them as member accounts, making central management simpler. For instructions on how to enable the services with AWS Organizations, see the following resources:
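As a rough companion sketch, the following calls designate a delegated administrator for each service when run from the AWS Organizations management account. The account ID 111122223333 is a placeholder, and the parameter names differ slightly between services, so verify them against the current SDK documentation before use.

```python
# Sketch: designate a delegated administrator for each security service.
# Run from the Organizations management account; 111122223333 is a placeholder.
import boto3

ADMIN_ACCOUNT_ID = "111122223333"

boto3.client("guardduty").enable_organization_admin_account(AdminAccountId=ADMIN_ACCOUNT_ID)
boto3.client("securityhub").enable_organization_admin_account(AdminAccountId=ADMIN_ACCOUNT_ID)
boto3.client("detective").enable_organization_admin_account(AccountId=ADMIN_ACCOUNT_ID)
boto3.client("inspector2").enable_delegated_admin_account(delegatedAdminAccountId=ADMIN_ACCOUNT_ID)
```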
The next step is to enable the latest detections in GuardDuty and learn how Detective can identify multiple threats that are related to a single security event.
If you’ve already enabled the different GuardDuty protection plans, skip to the next section. If you recently enabled GuardDuty, the protection plans are enabled by default, except for EKS Runtime Monitoring, which is a two-step process.
For the next steps, we use the delegated administrator account in GuardDuty to make sure that the protection plans are enabled for each AWS account. When you use GuardDuty (or Security Hub, Detective, and Inspector) with AWS Organizations, you can designate an account to be the delegated administrator. This is helpful so that you can configure these security services for multiple accounts at the same time. For instructions on how to enable a delegated administrator account for GuardDuty, see Managing GuardDuty accounts with AWS Organizations.
To enable EKS Protection
Sign in to the GuardDuty console using the delegated administrator account, choose Protection plans, and then choose EKS Protection.
In the Delegated administrator section, choose Edit and then choose Enable for each scope of protection. For this post, select EKS Audit Log Monitoring, EKS Runtime Monitoring, and Manage agent automatically, as shown in Figure 2. For more information on each feature, see the following resources:
To enable these protections for current accounts, in the Active member accounts section, choose Edit and Enable for each scope of protection.
To enable these protections for new accounts, in the New account default configuration section, choose Edit and Enable for each scope of protection.
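If you would rather script these EKS Protection settings, the following boto3 sketch enables EKS Audit Log Monitoring and EKS Runtime Monitoring with the managed agent on the delegated administrator’s detector. The feature names (EKS_AUDIT_LOGS, EKS_RUNTIME_MONITORING, EKS_ADDON_MANAGEMENT) reflect the GuardDuty API at the time of writing; confirm them against the latest API reference.

```python
# Sketch: enable EKS Audit Log Monitoring and EKS Runtime Monitoring (with the
# automatically managed agent) on the delegated administrator's detector.
import boto3

guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]  # assumes one detector

guardduty.update_detector(
    DetectorId=detector_id,
    Features=[
        {"Name": "EKS_AUDIT_LOGS", "Status": "ENABLED"},
        {
            "Name": "EKS_RUNTIME_MONITORING",
            "Status": "ENABLED",
            "AdditionalConfiguration": [
                # Corresponds to "Manage agent automatically" in the console
                {"Name": "EKS_ADDON_MANAGEMENT", "Status": "ENABLED"},
            ],
        },
    ],
)
```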
To enable RDS Protection
The next step is to enable RDS Protection. GuardDuty RDS Protection works by analyzing RDS login activity for potential threats to your Amazon Aurora databases (Aurora MySQL-Compatible Edition and Aurora PostgreSQL-Compatible Edition). Using this feature, you can identify potentially suspicious login behavior and then use Detective to investigate CloudTrail logs, VPC flow logs, and other useful information around those events.
Navigate to the RDS Protection menu and under Delegated administrator (this account), select Enable and Confirm.
In the Enabled for section, select Enable all if you want RDS Protection enabled on all of your accounts. If you want to select a specific account, choose Manage Accounts and then select the accounts for which you want to enable RDS Protection. With the accounts selected, choose Edit Protection Plans, RDS Login Activity, and Enable for X selected account.
(Optional) For new accounts, turn on Auto-enable RDS Login Activity Monitoring for new member accounts as they join your organization.
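The same setting can be applied to existing member accounts with the UpdateMemberDetectors API. The sketch below assumes the RDS_LOGIN_EVENTS feature name used by the GuardDuty API, and the member account IDs are placeholders.

```python
# Sketch: enable RDS login activity monitoring for selected member accounts,
# run from the GuardDuty delegated administrator account.
import boto3

guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]

member_account_ids = ["111122223333", "444455556666"]  # placeholders

guardduty.update_member_detectors(
    DetectorId=detector_id,
    AccountIds=member_account_ids,
    Features=[{"Name": "RDS_LOGIN_EVENTS", "Status": "ENABLED"}],
)
```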
Figure 2: Enable EKS Runtime Monitoring
To enable Lambda Protection
The final step is to enable Lambda Protection. Lambda Protection helps detect potential security threats during the invocation of AWS Lambda functions. By monitoring network activity logs, GuardDuty can generate findings when Lambda functions are involved with malicious activity, such as communicating with command and control servers.
Navigate to the Lambda Protection menu and under Delegated administrator (this account), select Enable and Confirm.
In the Enabled for section, select Enable all if you want Lambda Protection enabled on all of your accounts. If you want to select a specific account, choose Manage Accounts and select the accounts for which you want to enable Lambda Protection. With the accounts selected, choose Edit Protection Plans, Lambda Network Activity Monitoring, and Enable for X selected account.
(Optional) For new accounts, turn on Auto-enable Lambda Network Activity Monitoring for new member accounts as they join your organization.
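The optional auto-enable setting for new accounts can also be scripted through the organization configuration. This sketch assumes the LAMBDA_NETWORK_LOGS feature name and the NEW auto-enable value; check both against the current GuardDuty API reference.

```python
# Sketch: auto-enable Lambda network activity monitoring for member accounts
# that join the organization later (run as the delegated administrator).
import boto3

guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]

guardduty.update_organization_configuration(
    DetectorId=detector_id,
    AutoEnableOrganizationMembers="NEW",  # apply defaults only to new accounts
    Features=[{"Name": "LAMBDA_NETWORK_LOGS", "AutoEnable": "NEW"}],
)
```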
Now that you’ve enabled these new protections, GuardDuty will start monitoring EKS audit logs, EKS runtime activity, RDS login activity, and Lambda network activity. If GuardDuty detects suspicious or malicious activity for these log sources or services, it will generate a finding for the activity, which you can review in the GuardDuty console. In addition, you can automatically forward these findings to Security Hub for consolidation, and to Detective for security investigation.
Detective data sources
If you have Security Hub and other AWS security services such as GuardDuty or Amazon Inspector enabled, findings from these services are forwarded to Security Hub. With the exception of sensitive data findings from Amazon Macie, you’re automatically opted in to other AWS service integrations when you enable Security Hub. For the full list of services that forward findings to Security Hub, see Available AWS service integrations.
With each service enabled and forwarding findings to Security Hub, the next step is to enable the Detective data source package called AWS security findings, which ingests the findings that are forwarded to Security Hub. Again, we’re going to use the delegated administrator account for these steps to make sure that AWS security findings are being ingested for your accounts.
To enable AWS security findings
Sign in to the Detective console using the delegated administrator account and navigate to Settings and then General.
Choose Optional source packages, Edit, select AWS security findings, and then choose Save.
Figure 5: Enable AWS security findings
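If you manage Detective with the SDK instead of the console, the same optional source package can be turned on with the UpdateDatasourcePackages API. The package identifier ASFF_SECURITYHUB_FINDING is what the Detective API uses for AWS security findings at the time of writing; verify it before relying on this sketch.

```python
# Sketch: enable the "AWS security findings" optional source package on the
# Detective behavior graph, run in the Detective delegated administrator account.
import boto3

detective = boto3.client("detective")
graph_arn = detective.list_graphs()["GraphList"][0]["Arn"]  # assumes one graph

detective.update_datasource_packages(
    GraphArn=graph_arn,
    DatasourcePackages=["ASFF_SECURITYHUB_FINDING"],
)
```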
When you enable Detective, it immediately starts creating a security behavior graph for AWS security findings to build a linked dataset between findings and entities, such as RDS login activity from Aurora databases, EKS runtime activity, and suspicious network activity for Lambda functions. For GuardDuty to detect potential threats that affect your database instances, it first needs to undertake a learning period of up to two weeks to establish a baseline of normal behavior. For more information, see How RDS Protection uses RDS login activity monitoring. For the other protections, after suspicious activity is detected, you can start to see findings in both GuardDuty and Security Hub consoles. This is where you can start using Detective to better understand which findings are connected and where to prioritize your investigations.
Detective behavior graph
As Detective ingests data from GuardDuty, Amazon Inspector, and Security Hub, as well as CloudTrail logs, VPC flow logs, and Amazon Elastic Kubernetes Service (Amazon EKS) audit logs, it builds a behavior graph database. Graph databases are purpose-built to store and navigate relationships. Relationships are first-class citizens in graph databases, which means that they’re not computed out of band or inferred at query time by joining on foreign keys. Because Detective stores information on relationships in your graph database, you can effectively answer questions such as “are these security findings related?” In Detective, you can use the search menu and profile panels to view these connections, but a quicker way to see this information is by using finding groups visualizations.
Finding groups visualizations
Finding groups extract additional information out of the behavior graph to highlight findings that are highly connected. Detective does this by running several machine learning algorithms across your behavior graph to identify related findings and then statistically weighs the relationships between those findings and entities. The result is a finding group that shows GuardDuty and Amazon Inspector findings that are connected, along with entities like Amazon Elastic Compute Cloud (Amazon EC2) instances, AWS accounts, and AWS Identity and Access Management (IAM) roles and sessions that were impacted by these findings. With finding groups, you can more quickly understand the relationships between multiple findings and their causes because you don’t need to connect the dots on your own. Detective automatically does this and presents a visualization so that you can see the relationships between various entities and findings.
Enhanced visualizations
Recently, we released several enhancements to finding groups visualizations to aid your understanding of security connections and root causes. These enhancements include:
Dynamic legend – the legend now shows icons for entities that you have in the finding group instead of showing all available entities. This helps reduce noise to only those entities that are relevant to your investigation.
Aggregated evidence and finding icons – these icons provide a count of similar evidence and findings. Instead of seeing the same finding or evidence repeated multiple times, you’ll see one icon with a counter to help reduce noise.
More descriptive side panel information – when you choose a finding or entity, the side panel shows additional information, such as the service that identified the finding and the finding title, in addition to the finding type, to help you understand the action that invoked the finding.
Label titles – you can now turn on or off titles for entities and findings in the visualization so that you don’t have to choose each to get a summary of what the different icons mean.
To use the finding groups visualization
Open the Detective console, choose Summary, and then choose View all finding groups.
Choose the title of an available finding group and scroll down to Visualization.
Under the Select layout menu, choose one of the layouts available, or choose and drag each icon to rearrange the layout according to how you’d like to see connections.
For a complete list of involved entities and involved findings, scroll down below the visualization.
Figure 6 shows an example of how you can use finding groups visualization to help identify the root cause of findings quickly. In this example, an IAM role was connected to newly observed geolocations, multiple GuardDuty findings detected malicious API calls, and there were newly observed user agents from the IAM session. The visualization can give you high confidence that the IAM role is compromised. It also provides other entities that you can search against, such as the IP address, S3 bucket, or new user agents.
Figure 6: Finding groups visualization
Now that you have the new GuardDuty protections enabled along with the data source of AWS security findings, you can use finding groups to more quickly visualize which IAM sessions have had multiple findings associated with unauthorized access, or which EC2 instances are publicly exposed with a software vulnerability and active GuardDuty finding—these patterns can help you determine if there is an actual risk.
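Outside of the Detective console, you can also pull the active findings for a suspect resource directly from Security Hub, for example to attach them to a ticket after a finding group points you at a compromised role. This is a hypothetical illustration: the IAM role ARN is a placeholder, and the filters shown are standard Security Hub finding filters rather than anything specific to finding groups.

```python
# Sketch: list active, unresolved Security Hub findings for a suspect resource
# (a placeholder IAM role ARN) to cross-check what a finding group surfaced.
import boto3

securityhub = boto3.client("securityhub")

suspect_arn = "arn:aws:iam::111122223333:role/ExampleSuspectRole"  # placeholder

response = securityhub.get_findings(
    Filters={
        "ResourceId": [{"Value": suspect_arn, "Comparison": "EQUALS"}],
        "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
        "WorkflowStatus": [{"Value": "NEW", "Comparison": "EQUALS"}],
    },
    MaxResults=50,
)

for finding in response["Findings"]:
    print(finding.get("ProductName", finding["ProductArn"]), "-", finding["Title"])
```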
Conclusion
In this blog post, you learned how to enable new GuardDuty protections and use Detective, finding groups, and visualizations to better identify, operationalize, and prioritize AWS security findings that represent real risk. We also highlighted the new enhancements to visualizations that can help reduce noise and provide summaries of detailed information to help reduce the time it takes to triage findings. If you’d like to see an investigation scenario using Detective, watch the video Amazon Detective Security Scenario Investigation.
At AWS, we’re committed to helping our customers meet digital sovereignty requirements. Last year, I announced the AWS Digital Sovereignty Pledge, our commitment to offering all AWS customers the most advanced set of sovereignty controls and features available in the cloud. Our approach is to continue to make AWS sovereign-by-design—as it has been from day one.
I promised that our pledge was just the start, and that we would continue to innovate to meet the needs of our customers. As part of our promise, we pledged to invest in an ambitious roadmap of capabilities on data residency, granular access restriction, encryption, and resilience. Today, I’d like to update you on another milestone on our journey to continue to help our customers address their sovereignty needs.
Further control over the location of your data
Customers have always controlled the location of their data with AWS. For example, in Europe, customers have the choice to deploy their data into any of eight existing AWS Regions. These AWS Regions provide the broadest set of cloud services and features, enabling our customers to run the majority of their workloads. Customers can also use AWS Local Zones, a type of infrastructure deployment that makes AWS services available in more places, to help meet latency and data residency requirements, without having to deploy self-managed infrastructure. Customers who must comply with data residency regulations can choose to run their workloads in specific geographic locations where AWS Regions and Local Zones are available.
Announcing AWS Dedicated Local Zones
Our public sector and regulated industry customers have told us they want dedicated infrastructure for their most critical workloads to help meet regulatory or other compliance requirements. Many of these customers manage their own infrastructure on premises for workloads that require isolation, which means forgoing the performance, innovation, elasticity, scalability, and resiliency benefits of the cloud.
To help our customers address these needs, I’m excited to announce AWS Dedicated Local Zones. Dedicated Local Zones are a type of AWS infrastructure that is fully managed by AWS, built for exclusive use by a customer or community, and placed in a customer-specified location or data center to help comply with regulatory requirements. Dedicated Local Zones can be operated by local AWS personnel and offer the same benefits of Local Zones, such as elasticity, scalability, and pay-as-you-go pricing, with added security and governance features. These features include data access monitoring and audit programs, controls to limit infrastructure access to customer-selected AWS accounts, and options to enforce security clearance or other criteria on local AWS operating personnel. With Dedicated Local Zones, we work with customers to configure their own Local Zones with the services and capabilities they need to meet their regulatory requirements.
Innovating with the Singapore Government’s Smart Nation and Digital Government Group
At AWS, we work closely with customers to understand their requirements for their most critical workloads. Our work with the Singapore Government’s Smart Nation and Digital Government Group (SNDGG) to build a Smart Nation for their citizens and businesses illustrates this approach. This group spearheads Singapore’s digital government transformation and development of the public sector’s engineering capabilities. SNDGG is the first customer to deploy Dedicated Local Zones.
“AWS is a strategic partner and has been since the beginning of our cloud journey. SNDGG collaborated with AWS to define and build Dedicated Local Zones to help us meet our stringent data isolation and security requirements, enabling Singapore to run more sensitive workloads in the cloud securely,” said Chan Cheow Hoe, Government Chief Digital Technology Officer of Singapore. “In addition to helping the Singapore government meet its cybersecurity requirements, the Dedicated Local Zones enable us to offer its agencies a seamless and consistent cloud experience.”
Our commitments to our customers
We remain committed to helping our customers meet evolving sovereignty requirements. We continue to innovate sovereignty features, controls, and assurances globally with AWS, without compromising on the full power of AWS.
To get started, you can visit the Dedicated Local Zones website, where you can contact AWS specialists to learn more about how Dedicated Local Zones can be configured to meet your regulatory needs.