All posts by Marshall Jones

Testing and evaluating GuardDuty detections

2025-01-28 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/testing-and-evaluating-guardduty-detections/

Amazon GuardDuty is a threat detection service that continuously monitors, analyzes, and processes Amazon Web Services (AWS) data sources and logs in your AWS environment. GuardDuty uses threat intelligence feeds, such as lists of malicious IP addresses and domains, file hashes, and machine learning (ML) models to identify suspicious and potentially malicious activity in your AWS environment. When GuardDuty identifies a potential security issue, it creates a GuardDuty finding that gives you information about what the potential security issue is, the resources involved, and contextualized information that’s key to remediating the issue. GuardDuty helps you monitor for the latest threats by continually expanding threat detection to emerging and common threats.

Whether you’re new to GuardDuty or are a long-time user, it’s recommended that you understand the different GuardDuty finding types and finding details and practice responding to them as suggested in the security pillar of AWS Well-Architected.

In this blog post, I dive deep into an open source tool for testing GuardDuty findings and then walk through three examples of how you can use this tool to test and improve your response to GuardDuty findings.

Overview

If you want to learn more about GuardDuty, you can read about the finding types in this AWS documentation. However, customers often want realistic findings in their environment to understand what a finding looks like and to practice responding hands on. While you can use GuardDuty to create sample findings in your environment, these findings are approximations populated with placeholder values and look different from real findings. Additionally, you cannot practice remediation with these findings because they’re not tied to actual resources in your account. This can be helpful if you only want to see what details are in a finding, but if you want to practice a real-world scenario, these sample findings might not be adequate.

To address this use case and provide customers with a secure and reliable way to test the threat detection capabilities of GuardDuty, the GuardDuty service team launched an open source project called GuardDuty Tester. The GuardDuty Tester creates infrastructure in your environment to simulate different security issues so that you can test GuardDuty findings that mirror actual security issues that you might encounter, such as crypto mining or a reverse shell being created on an Amazon Elastic Compute Cloud (Amazon EC2) instance. The GuardDuty Tester was originally released in 2018 as an AWS CloudFormation template and was focused more on testing investigation workflows than on a wide range of finding types. AWS has since released an updated version that uses the AWS Cloud Development Kit (AWS CDK) to make the infrastructure code easier to read and expanded the test coverage to over 100 unique finding types and resource combinations.

The ability to create findings across different resource types such as Amazon EC2, Amazon Simple Storage Service (Amazon S3), and Amazon Elastic Kubernetes Service (Amazon EKS) is a valuable resource for your security team, allowing them to simulate various types of threats with isolated infrastructure so that you don’t need to compromise your deployed workloads to improve response actions and techniques. Remember that the GuardDuty Tester doesn’t cover every possible scenario, but is instead focused on threat intelligence and rules-based findings. Anomaly-based findings, which require learning about how you operate your environment, aren’t included in the GuardDuty Tester.

Getting started with the GuardDuty Tester

The GuardDuty Tester is deployed by using the AWS CDK to create the required infrastructure and scripts to generate the GuardDuty findings. For safety, AWS recommends that you deploy the GuardDuty Tester in a nonproduction environment in an account that’s used specifically for this purpose. This way, your security team can differentiate between test GuardDuty findings and findings for other workloads that they’re monitoring.

In this post, I won’t walk through configuring the GuardDuty Tester because this is already documented in the GuardDuty documentation. Instead, I will go over what you need to know about the GuardDuty Tester and some of the benefits.

Figure 1 shows the GuardDuty Tester architecture, which includes the resources necessary to create GuardDuty findings for various protection plans such as Amazon S3 buckets, Amazon EC2 instances, and an Amazon EKS cluster. The tester also deploys a dedicated GuardDuty Tester instance where you will run the scripts needed to create the GuardDuty findings.

Figure 1: GuardDuty Tester architecture

The GuardDuty Tester provides key features including:

A wide range of threat scenario simulations: Resources that the GuardDuty Tester can create findings for include Amazon S3, AWS Identity and Access Management (IAM), Amazon Elastic Container Service (Amazon ECS) for both Amazon EC2 and AWS Fargate hosted workloads, Amazon EKS, and AWS Lambda and covers over 105 threat scenarios. This includes GuardDuty runtime monitoring as well as other GuardDuty protection plans.
Access through AWS Systems Manager: The GuardDuty Tester provides secure access by using Systems Manager to minimize open ports to the internet and allowing access only through Systems Manager.
Modular scripts: With an expanded library of tests available, the GuardDuty Tester accepts user parameters to set the scope of the tests to run, which gives you greater flexibility for different testing scenarios.

Setting up the GuardDuty Tester environment is straightforward and requires only a few commands. As outlined in the documentation and the README file in the repository, there are a number of prerequisites to set up the stack. These prerequisites include Python 3+, git, the AWS Command Line Interface (AWS CLI), AWS Systems Manager Session Manager plugin, npm, Docker, and a subscription to Kali Linux image for Amazon EC2. You will have to subscribe to the Kali Linux instance in AWS Marketplace, but will be charged for the instance only while the GuardDuty Tester is deployed. After these prerequisites are met, you can clone the repository, install the packages, and deploy the GuardDuty Tester to your AWS account.

Deploying the GuardDuty Tester can take 20–30 minutes, but if you’re following along with this post, I assume that you have deployed the GuardDuty Tester into your environment and have started your Systems Manager session as stated on Part A of Step 3 – Run tester scripts in the GuardDuty documentation. Now, I will dive into the first testing example.

Manual investigation

The first test use case is about getting familiar with what GuardDuty findings look like and the details that a finding gives you. This might be one of your first steps after turning on GuardDuty, or this might be an activity that you perform to help new team members understand GuardDuty findings.

To start a manual investigation:

Run the following command in your Systems Manager session to view the GuardDuty Tester options.
```
Python3 guardduty_tester.py --help
```
Run the following command in your Systems Manager session to create the first test finding.
```
Python3 guardduty_tester.py - -ec2 - -runtime only - -tactics impact
```
Before creating the findings, the GuardDuty Tester prompts you to confirm that it’s allowed to change GuardDuty settings in the environment. For example, if you’ve chosen to create findings related to the GuardDuty runtime monitoring feature but don’t have this feature enabled, the GuardDuty Tester will enable it for the tests and then disable it after testing is complete.

Note: This will start the 30-day trial of the enabled features in this account, in this AWS Region, even if the feature is disabled after testing is complete. More information about GuardDuty pricing and free trials can be found on the GuardDuty pricing page.
After choosing y which indicates “yes”, the GuardDuty Tester reports the number of domain reputation findings it’s expecting. Figure 2 shows an example of the expected findings. You can learn more about domain reputation findings in the GuardDuty finding documentation.

Figure 2: Generated GuardDuty findings in the console
After the GuardDuty Tester is finished, wait a few minutes and then go to the AWS Management Console for GuardDuty to see the findings. In this example, there are four new GuardDuty findings as expected from step 4 and shown in Figure 3. With the findings generated, you can start your manual investigation.

Figure 3: GuardDuty finding details

In the preceding figure, you can see some of the finding details presented—such as the action type and the process information—that can help you quickly identify what trigger started the suspicious communication. From here, I encourage you to use this finding to practice your runbooks for investigation and response. For example, you might start with validating and triaging the finding before moving into evidence collection and remediation. If you don’t have incident response runbooks already built, you can use this finding as an example to get started. There are multiple open source examples such as AWS incident response playbooks and AWS customer response playbook. A runbook will help your team evaluate the information provided in the GuardDuty finding and understand what else they need to know about your specific environment to properly respond to the finding. For example, in the finding, you will have resource and actor information but not things such as who is the account owner or point of contact for security for that account.

Creating alerts

The next use case highlights how to create alerts based on GuardDuty findings. When setting up alerting automation with tools such as Amazon Simple Notification Service (Amazon SNS) and Slack, you should create a finding using the GuardDuty Tester to test that you’ve configured your alert correctly. See Creating custom responses to GuardDuty findings for information about creating alerts with either of these tools. Figure 4 shows a sample EventBridge rule that will send GuardDuty findings to SNS.

Figure 4: EventBridge rule to send GuardDuty findings to SNS

For this post, I assume that you’ve already configured an Amazon EventBridge rule and Amazon SNS alert.

To test alerts:

Run the following command in your Systems Manager session to create a privileged container finding.
```
Python3 guardduty_tester.py - -finding ‘PrivilegedEscalation:Kubernetes/PrivilegedContainer’
```
Shortly after creating this finding, you should see an SNS alert based on the finding type.

Figure 5: SNS notification from a GuardDuty finding

If you’ve configured the alert correctly, you will see an email similar to Figure 5. The email demonstrates that SNS notifications were successfully configured and tested using the GuardDuty Tester. If this is a new finding, you will receive this SNS notification shortly after the GuardDuty Tester generates the finding, but if this is an updated finding, then the timing will be based on the notification frequency configured in the account.

There are many ways that customers consume GuardDuty findings in their environments. Whether you’re using Amazon SNS or another mechanism such as a chat application, ticketing system, or a security information and event management (SIEM) solution, you can use this example of an EventBridge rule and the GuardDuty Tester to test out your notification pipeline.

Automated response

For the third use case, I show you how to create an automated action based on a GuardDuty finding. In this example, I create a finding based on an EC2 instance connecting to a Bitcoin mining domain, then based on this finding, I use Lambda to tag the instance to assist with identification during that investigation steps that follow. Although this is a simple example, it shows you what you can do by combining EventBridge rules and Lambda functions. If you want to create an automated response for GuardDuty runtime monitoring findings that requires making a host-level modification, you can use EventBridge rules with AWS Systems Manager Run Command to run commands locally on a host to remediate a security issue.

Start by creating a Lambda function that will take a GuardDuty event delivered by EventBridge, pull out the instance ID information, and then use that as a parameter in the create_tags API call. See the following example code.

import json
import boto3
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    try:
        # Extract the necessary information from the GuardDuty finding
        instance_id = event['detail']['resource']['instanceDetails']['instanceId']
        account_id = event['detail']['accountId']
        region = event['detail']['region']

        # Create an EC2 client
        ec2 = boto3.client('ec2', region_name=region)

        # Add the "infected" and "cryptomining" tag value pair to the instance
        ec2.create_tags(
            Resources=[instance_id],
            Tags=[
                {
                    'Key': 'infected',
                    'Value': 'cryptomining'
                }
            ]
        )

        logger.info(f"Tagged instance {instance_id} with 'infected=cryptomining' in account {account_id} and region {region}")
        return {
            'statusCode': 200,
            'body': 'Instance tagged successfully'
        }
    except Exception as e:
        logger.error(f"Error tagging instance {instance_id}: {str(e)}")
        return {
            'statusCode': 500,
            'body': f"Error tagging instance: {str(e)}"
        }

Next, I create an EventBridge rule specific to the Bitcoin mining finding that I want to test, shown in Figure 6. The target is the Lambda function that I just created.

Figure 6: EventBridge rule for crypto mining GuardDuty finding

Now that the EventBridge rule is in place with the Lambda function as the target, I can use the GuardDuty Tester to trigger a Bitcoin mining finding and test my solution with the following command.

Python3 guardduty_tester.py - - finding ‘CryptoCurrency:EC2/BitcoinTool.B!DNS’

After the finding is generated, I go to my EC2 instance, where there’s a new instance tag with a key of infected and a value of cryptomining, shown in Figure 7.

Figure 7: Updated tags after automated response

Although this is a general example, you can use the same approach across various actions that you might take in response to a GuardDuty finding and then test them using the GuardDuty Tester. Examples include using Lambda to add logic in AWS WAF, a network access control list (network ACL), or AWS Network Firewall to block suspicious traffic, or use Systems Manager Run Command to end a malicious process that’s running on a host.

Conclusion

The updated GuardDuty Tester represents a significant advancement in helping organizations validate and gain confidence in GuardDuty threat detection. The GuardDuty Tester now provides more comprehensive coverage of GuardDuty runtime monitoring and protection plans across various AWS services.

By using the GuardDuty Tester and following the use cases in this post, you can proactively assess your threat detection readiness, identify potential gaps, and implement necessary measures to help you fortify your AWS environments against evolving cyber threats.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

AWS re:Invent 2024: Security, identity, and compliance recap

2025-01-13 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/aws-reinvent-2024-security-identity-and-compliance-recap/

AWS re:Invent 2024 was held in Las Vegas December 2–6, with over 54,000 attendees participating in more than 2,300 sessions and hands-on labs. The conference was a hub of innovation and learning hosted by AWS for the global cloud computing community.

In this blog post, we cover on-demand sessions and major security, identity, and compliance announcements that were unveiled leading up to and during the conference. Whether you missed the event or want to revisit the key takeaways, we’ve compiled the essential information for you to provide a comprehensive overview of the latest developments in AWS security, identity, and compliance. This year’s event put best practices for zero trust, generative AI–driven security, identity, and access management, DevSecOps, network and infrastructure security, data protection, and threat detection and incident response at the forefront.

Key announcements

For identity and access management, we launched multiple new features that can help you scale permissions management across your AWS Organizations.

Resource control policies (RCPs) – RCPs are a new type of organization policy that can be used to centrally create and enforce preventative controls on AWS resources in your organization. Using RCPs, you can centrally set the maximum available permissions to your AWS resources as you scale your workloads on AWS.
Centrally manage root access – With central management for root access, you now have a capability to centrally manage your root credentials, simplify auditing of credentials, and perform tightly scoped privileged tasks across your AWS member accounts managed using AWS Organizations.
Declarative policies – Declarative policies simplify the way you enforce durable intent, such as baseline configurations for AWS services within your organization.

Amazon Cognito announced four new features:

Feature tiers – Amazon Cognito launched two user pool feature tiers: Essentials and Plus. The Essentials tier offers comprehensive and flexible user authentication and access control features, helping you to implement secure, scalable, and customized sign-up and sign-in experiences. The Plus tier offers threat protection capabilities against suspicious sign-ins for customers who have elevated security needs for their applications.
Developer-focused console – Amazon Cognito now offers a streamlined getting-started experience featuring a quick wizard and use case–specific recommendations. This approach helps you set up configurations and reach your end users faster and more efficiently than ever before.
Managed Login – This feature is a fully managed, hosted sign-in and sign-up experience that you can personalize to align with your company or application branding. Managed Login helps you offload the undifferentiated heavy lifting of designing and maintaining custom implementations such as passwordless authentication and localization.
Passwordless authentication – With passwordless authentication, you can secure user access to your application with passkeys, email, and text messages. If your users choose to use passkeys to sign in, they can do so using a built-in authenticator, such as Touch ID on Apple MacBooks and Windows Hello facial recognition on PCs.

To discover security issues in your environment, Amazon GuardDuty launched Extended Threat Detection, a capability that you can use to identify sophisticated, multi-stage threats targeting your AWS accounts, workloads, and data. You can now use new threat sequence findings that cover multiple resources and data sources over an extensive time period, allowing you to spend less time on first-level analysis and more time responding to critical severity threats to minimize business impact.

Amazon OpenSearch Service now offers a zero-ETL integration with Amazon Security Lake, enabling you to query and analyze security data in-place directly through OpenSearch Service. This integration allows you to efficiently explore voluminous data sources that were previously cost-prohibitive to analyze, helping you streamline security investigations and obtain comprehensive visibility of your security landscape. With the flexibility to selectively ingest data and without the need to manage complex data pipelines, you can now focus on effective security operations while potentially lowering your analytics costs.

AWS Security Incident Response is a new service that helps you respond to security issues in your environment. This new service combines the power of automated monitoring and investigation, accelerated communication and coordination, and direct 24/7 access to the AWS Customer Incident Response Team to quickly prepare for, respond to, and recover from security events.

In the zero trust space, AWS Verified Access AWS Verified Accessand Amazon VPC Lattice both launched support for accessing non-HTTPS resources. Verified Access enables you to provide secure, VPN-less access to your corporate applications over protocols such as TCP, SSH, and RDP. With the launch of VPC Resources for Amazon VPC Lattice, you can now access your application dependencies through a VPC Lattice service network. You’re able to connect to your application dependencies that are hosted in different VPCs, accounts, and on-premises using additional protocols, including TLS, HTTP, HTTPS, and now TCP. Watch the on demand session to learn how you can enable zero trust access over non-HTTP(S) protocols by using AWS Verified Access.

Amazon Route 53 Resolver DNS Firewall launched an advanced firewall rule that has a new set of capabilities that you can use to monitor and block suspicious DNS traffic associated with advanced DNS threats.

Amazon Virtual Private Cloud launched block public access, which is a one-click declarative control that admins can implement centrally to authoritatively block internet traffic for each of their VPCs.

As more and more customers deploy generative AI workloads into production, it’s important to have proper security controls. Amazon Bedrock launched two new features to help with this:

Automated Reasoning checks – Automated Reasoning checks help detect hallucinations and provide a verifiable proof that a large language model response is accurate. With Automated Reasoning checks, domain experts can more straightforwardly build specifications called Automated Reasoning Policies that encapsulate their knowledge in fields such as operational workflows and HR policies. Users of Amazon Bedrock Guardrails can validate generated content against an Automated Reasoning Policy to identify inaccuracies and unstated assumptions, and explain why statements are accurate in a verifiable way.
Multimodal toxicity detection (Preview) – Amazon Bedrock Guardrails now supports multimodal toxicity detection for image content, enabling organizations to apply content filters to images. This capability, now in public preview, removes the heavy lifting required to build your own safeguards for image data or spend cycles with manual evaluation that can be error-prone and tedious.

AWS has continued to work closely with partners to help drive customer success. There were three new partner programs launched at AWS re:Invent:

AI Security category – The AI Security category in the AWS Security competency helps you identify AWS Partners with deep experience securing AI environments and defending AI workloads against advanced threats. Partners in this category are validated for their capabilities in areas such as prevention of sensitive data disclosure, prevention of injection threats, security posture management, and implementing responsible AI filtering.
AWS Security Incident Response Specialization – Today, AWS customers rely on various third-party tools and services to support their internal security incident response capabilities. To better help both customers and partners, AWS introduced AWS Security Incident Response, a new service that helps you prepare for, respond to, and recover from security events. Alongside approved AWS Partners, AWS Security Incident Response monitors, investigates, and escalates triaged security findings from Amazon GuardDuty and other threat detection tools through AWS Security Hub. Security Incident Response is designed to identify and escalate only high-priority incidents.
Amazon Security Lake Ready Specialization – This specialization recognizes AWS Partners who have technically validated their software solutions to integrate with Amazon Security Lake and demonstrated successful customer deployments. These solutions have been technically validated by AWS Partner Solutions Architects for their sound architecture and proven customer success.

Experience content on demand

If you were unable to join us in person or you want to watch a session again, you can view the sessions that are available on demand. Catch the CEO Keynote with Matt Garman to learn how AWS is reinventing foundational building blocks in addition to developing brand-new experiences, all to empower AWS customers and partners with what they need to build a better future. You can also replay additional re:Invent 2024 keynotes.

Watch the Security Innovation Talk, with AWS CISO Chris Betz, to hear how the latest AWS innovations are helping customers move fast and stay secure. Learn how AWS empowers organizations to confidently integrate and automate security into their products, services, and processes so security teams can focus their time on work that brings the highest value to the business. Chris also shares how AWS is helping to make the internet more secure by scaling security innovation and investing in the security community.

Stream any of the AWS security, identity, and compliance breakout sessions and new launch talks on demand to learn about the following key topics, and more:

Consider joining us for more in-person security learning opportunities by saving the date for AWS re:Inforce 2025, which will take place June 16–18 in Philadelphia, Pennsylvania. We look forward to seeing you there!

If you want to discuss how these new announcements can help your organization improve its security posture, AWS is here to help. Contact your AWS account team today.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security, Identity, & Compliance re:Post or contact AWS Support.

Detect threats to your data stored in RDS databases by using GuardDuty

2023-05-10 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/detect-threats-to-your-data-stored-in-rds-databases-by-using-guardduty/

With Amazon Relational Database Service (Amazon RDS), you can set up, operate, and scale a relational database in the AWS Cloud. Amazon RDS provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks.

If you use Amazon RDS for your workloads, you can now use Amazon GuardDuty RDS Protection to help detect threats to your data stored in Amazon Aurora databases. GuardDuty is a continuous security monitoring service that can help you identify and prioritize potential threats in your AWS environment. By analyzing and profiling RDS login activity to your Aurora databases, GuardDuty can detect threats, such as high severity brute force events, suspicious logins, access from Tor, and access by known threat actors.

In this post, we will provide an overview of how to get started with RDS Protection, dive into its finding types, and walk you through examples of how to investigate and remediate findings.

Overview of RDS Protection

RDS Protection in GuardDuty analyzes and profiles Amazon RDS login activity to identify potential threats to your data stored in Aurora databases by using a combination of threat intelligence and machine learning. At launch, RDS Protection supports Aurora MySQL versions 2.10.2 and 3.2.1 or later and Aurora PostgreSQL versions 10.17, 11.12, 12.7, 13.3, and 14.3 or later. An updated list of the supported engines and versions is available in the GuardDuty documentation. RDS Protection doesn’t require additional infrastructure, and you don’t need to configure, collect, or store RDS logs in your own account. RDS Protection is also designed to have no impact on the performance of your database instances so that you don’t have to worry about compromising performance to better secure your data stored in Amazon RDS.

When RDS Protection detects a suspicious or anomalous login attempt that indicates a potential threat to your database instance, GuardDuty generates a finding with details to help you quickly identify relevant information to assist in remediation. RDS Protection findings include details on both anomalous and normal login activity in addition to information such as database instance details, database user details, action information, and actor information. These findings are available to you in the GuardDuty console, AWS Command Line Interface (AWS CLI), and API, and all GuardDuty findings are sent to Amazon EventBridge and AWS Security Hub, giving you options to respond by sending alerts to chat or ticketing systems, or by using AWS Lambda and AWS Systems Manager for automatic remediation.

Enable RDS Protection

Getting started with RDS Protection is simple, and you can do it with just a few steps in the console. Both new and existing GuardDuty customers can take advantage of the GuardDuty RDS Protection 30-day free trial. You can turn RDS Protection on or off for each of your accounts in supported AWS Regions. If you already use GuardDuty, you will need to enable RDS Protection either in the console or CLI, or through the API. You will have the option to enable it in the account that you are currently in, or if you are using a GuardDuty delegated administrator account (as shown in Figure 1), you can enable it for all accounts in your AWS Organizations organization. You’ll also have the ability to auto-enable. The auto-enable feature helps ensure that RDS Protection is enabled for each new account added to your organization, without the need for you to configure anything in each member account. If you are turning on GuardDuty for the first time, RDS Protection is enabled by default.

Figure 1: GuardDuty RDS Protection enablement page

Investigate RDS Protection findings

After GuardDuty generates a finding, you will need to analyze the finding so that you understand the potential impact to your environment. We recommend that you familiarize yourself with the GuardDuty finding types. Understanding GuardDuty finding types can help you understand the types of activity that GuardDuty is looking for and help you prepare for how to respond if they occur in your environment.

As adversaries become more sophisticated, it becomes even more important for you to align to a common framework to understand the tactics, techniques, and procedures (TTPs) behind an individual event. GuardDuty aligns findings using the MITRE ATT&CK framework, which is a globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. GuardDuty findings have a specific finding format that helps you understand the details of each finding. You can examine the Threat Purpose section of the GuardDuty finding types to see finding types associated with various MITRE ATT&CK tactics, including CredentialAccess and Discovery. This can help you identify and understand the type of activity associated with a finding.

For example, consider two finding types that seem similar: CredentialAccess:RDS/MaliciousIPCaller.SuccessfulLogin and Discovery:RDS/MaliciousIPCaller. The difference between them is the ThreatPurpose aspect, located at the beginning of the finding type. GuardDuty has determined that both are involved with MaliciousIPCaller, and the difference is the intent of the activity associated with each finding. CredentialAccess SuccessfulLogin indicates that there was a successful login to your RDS database from a known malicious IP address. Discovery indicates that a threat actor opened a connection to the database, but didn’t attempt to authenticate. This indicates scanning behavior, but it might not be targeted at RDS instances. For more information, see GuardDuty RDS Protection finding types.

GuardDuty uses threat intelligence and machine learning to continually monitor and identify potential threats in your environment. To understand how to investigate RDS Protection finding types, you need to understand the details of a finding type that are derived from machine learning. As shown in Figure 2, RDS Protection finding types have two sections: one that shows the unusual behavior and one that shows the normal, historical behavior. To determine this, GuardDuty uses machine learning models to evaluate API requests to your account and identify anomalous events that are associated with tactics used by adversaries. The machine learning model tracks various factors of the API request, such as the user that made the request, the location the request was made from, and the specific API that was requested. It also looks at information such as successfulLoginCount, failedLoginCount, and incompleteConnectionCount for anomalies based on login activity. For more information about anomalous activity in GuardDuty findings, see Anomalous behavior.

Figure 2: GuardDuty finding details showing unusual and historical behavior sections

With RDS Protection, you now have an additional mechanism to gain insight into your Amazon RDS databases across your accounts to continuously monitor for suspicious activity. RDS Protection can alert you to suspicious activity in Amazon RDS, such as a potentially suspicious or anomalous login attempt, unusual pattern in a series of successful, failed, or incomplete login attempts, and unauthorized access to your database instance from a previously unseen internal or external actor. With this new feature, GuardDuty also extends support for finding types that you might already be familiar with that also apply to RDS databases. These finding types include calls to an RDS database API from a Tor node, or calls to an RDS database from a known malicious IP address, which can indicate that there are interactions with your RDS database from sources that are associated with known malicious activity.

Remediate RDS Protection findings

In this section, we describe two RDS Protection findings and how you can investigate and remediate them. Understanding how to remediate these findings can help you maintain the integrity of your database. We share recommendations that focus specifically on security groups, network access control lists (network ACLs), and firewall rules.

CredentialAccess:RDS/AnomalousBehavior.SuccessfulLogin

The CredentialAccess:RDS/AnomalousBehavior.SuccessfulLogin finding informs you that an anomalous successful login was observed on an RDS database in your AWS environment. It might indicate that a previous unseen user logged in to an RDS database for the first time. A common scenario involves an internal user logging in to a database that is accessed programmatically by applications and not by individual users. A potential malicious actor might have compromised and accessed the role on your RDS database. The default Severity for this finding varies, depending on the anomalous behavior associated with the finding.

Figure 3 shows an example of this finding.

Figure 3: Finding of an anomalous behavior successful login

How to remediate

If the activity is unexpected for the associated database, AWS recommends that you change the password of the associated database user, and review available audit logs for activity that the user performed. Medium and high severity findings might indicate an overly permissive access policy to the database, and user credentials might have been exposed or compromised. We recommend that you place the database in a private virtual private cloud (VPC), and limit the security group rules to allow traffic only from necessary sources. For more information, see Remediating potentially compromised database with successful login events.

We recommend that you take the following steps to remediate this finding:

Remediation step 1: Identify the affected database and user

Identify the affected database and user and confirm whether the behavior is expected or unexpected by looking through the GuardDuty finding details, which provide the name of the affected database instance and the corresponding user details. Use the findings to confirm if the behavior is expected or not—for example, the findings might help you identify a user who logs in to their database instance after a long time has passed; a user who logs in to their database instance only occasionally, such as a financial analyst who logs in each quarter; or a suspicious actor who is involved in a successful login attempt that isn’t authorized and potentially compromises the database instance. Review the IP address of the finding. Public IP addresses might signify overly permissive access if it’s not a known network associated with your account.

Figure 4: Finding with details showing Amazon RDS database instance and user details

If the behavior is unexpected, complete the following steps:

Remediation step 2: Restrict database instance credential access

Restrict database instance access for the suspected accounts and the source of the login activity. For more information, see Remediating potentially compromised credentials and Restrict network access. You can identify the user in the RDS DB user details section within the finding panel in the console, or within the resource.rdsDbUserDetails of the findings JSON. These user details include user name, application used, database accessed, SSL version, and authentication method.

To revoke access or rotate passwords for specific users that are involved in the finding, see Security with Amazon Aurora MySQL or Security with Amazon Aurora PostgreSQL. To securely store and automatically rotate the secrets for RDS databases, use AWS Secrets Manager. For more information, see AWS Secrets Manager tutorials. To manage database users’ access without the need for passwords, use IAM database authentication. For more information, see Security best practices for Amazon RDS.

The following CLI command is an example of how to revoke access to a user in a MySQL database. If the behavior is unexpected, you can revoke the privileges while you assess if the user is malicious.

REVOKE CONNECTION_ADMIN ON *.* FROM 'fakeadmin'@'%';

You can revoke privileges from the user, but when taking this action, you should make sure that the user isn’t vital to your system and that revoking permissions won’t break your production or development application. The following CLI command is an example of how to revoke privileges from a user:

REVOKE ALL PRIVILEGES ON *.* FROM 'fakeadmin'@'%';

If you know that the user isn’t necessary for your database or application to function, then you can remove the user from the system. To make sure that your security team can run forensics, check your company’s incident response policy. If you need help getting started with incident response, see AWS sample incident response playbooks. The following CLI command is an example of how to remove a user:

DROP USER 'fakeadmin'@'%';

Let’s say that you find the behavior unexpected, but the user turns out to be the application user, and making a change to the database credential will break your application. You can use AWS Systems Manager to help in this scenario, in which the affected RDS user is the account that is tied to your application. In many cases, a password rotation can break your application, depending on how you connect. If you rotate the password without notifying your application, the application might require additional cascading changes. You could lose connectivity to your application because the credentials that your application is using to connect to your database didn’t change, and now you are experiencing an outage that will remain until you update the credentials. Systems Manager can tie into your application code to automatically update the rotated database credentials in your application. For more information, see Rotate Amazon RDS database credentials automatically with AWS Secrets Manager.

The following figure shows a CLI command to get a secret from Secrets Manager — for this example, we assume the secret is compromised.

Figure 5: Example compromised credentials

The following figures shows that we have a new set of credentials that replace our old credentials, as indicated by “CreatedDate”.

Figure 6: Example remediated credentials

Remediation step 3: Assess the impact and determine what information was accessed

If available, review the audit logs to identify which information might have been accessed. For more information, see Monitoring events, logs, and streams in an Amazon Aurora DB cluster. Determine if sensitive or protected information was accessed or modified.

Remediation step 4: Restrict database instance network access

To learn how to restrict IP access on a security group, see Control traffic to resources using security groups. You can identify the user in the RDS DB user details section within the finding panel in the console, or within the resource.rdsDbUserDetails of the findings JSON. These user details include user name, application used, database accessed, SSL version, and authentication method.

Remediation step 5: Perform root-cause analysis and determine the steps that potentially led to this activity

Implementing a lessons-learned framework and methodology can help improve your incident response capabilities and also help prevent the incident from recurring. By learning from each incident, you can help avoid repeating the same mistakes, exposures, or misconfigurations, which can both improve your security posture and reduce the time lost to preventable situations. To learn more about post-incident activity, see AWS Security Incident Response Guide.

You can set up an alert to be notified when an activity modifies a networking policy and creates an insecure state by using AWS Config and Amazon Simple Notification Service (Amazon SNS). You can use an EventBridge rule with a custom event pattern and an input transformer to match an AWS Config evaluation rule output as NON_COMPLIANT. Then, you can route the response to an Amazon SNS topic. For more information, see How can I be notified when an AWS resource is non-compliant using AWS Config? or Firewall policies in AWS Network Firewall.

CredentialAccess:RDS/AnomalousBehavior.SuccessfulBruteForce

The CredentialAccess:RDS/AnomalousBehavior.successfulBruteForce finding informs you that an anomalous login occurred that is indicative of a successful brute force event, as observed on an RDS database in your AWS environment. Before the anomalous successful login, a consistent pattern of unusual failed login attempts was observed. This indicates that the user and password associated with the RDS database in your account might have been compromised, and a potentially malicious actor might have accessed the RDS database. The Severity of this finding is high. Figure 7 shows an example of this finding.

Figure 7: Example of an anomalous successful brute force finding

How to remediate

This activity indicates that database credentials might have been exposed or compromised. We recommend that you change the password of the associated database user, and review available audit logs for activity performed by the potentially compromised user. A consistent pattern of unusual failed login attempts indicates an overly permissive access policy to the database, or that the database might also have been publicly exposed. AWS recommends that you place the database in a private VPC, and limit the security group rules to allow traffic only from necessary sources. For more information, see Remediating potentially compromised database with successful login events.

We recommend that you take the following steps to remediate this finding

Remediation step 1: Identify the affected database and user

The generated GuardDuty finding provides the name of the affected database instance and the corresponding user details. For more information, see Finding details.

Figure 8: Finding details showing Amazon RDS database instance and user details

Remediation step 2: Identify the source of the failed login attempts

In the generated GuardDuty finding, you can find the IP address, and if it was a public connection, the ASN organization in the Actor section of the finding panel. An autonomous system is a group of one or more IP prefixes (lists of IP addresses accessible on a network) run by one or more network operators that maintain a single, clearly-defined routing policy. Network operators need autonomous system numbers (ASNs) to control routing within their networks and to exchange routing information with other internet service providers.

Figure 9: Action and actor details related to GuardDuty brute force finding

Remediation step 3: Confirm that the behavior is unexpected

Examine if this activity represents an attempt to gain additional unauthorized access to the database instance as follows:

If the source is internal to your network, examine if an application is misconfigured and attempting a connection repeatedly.
If this is an external actor, examine whether the corresponding database instance is public facing or is misconfigured and thus allowing potential malicious actors to attempt to log in with common user names.

If the behavior is unexpected, complete the following steps:

Remediation step 4: Restrict database instance access

As discussed previously for the CredentialAccess:RDS/AnomalousBehavior.SuccessfulLogin finding, you can restrict access to the database through credentials or network access:

For information about how to restrict a specific user’s database access, see Restrict database instance credential access.
For information about how to restrict network access to the database, see Restrict database instance network access.

Remediation step 5: Perform root-cause analysis and determine the steps that potentially led to this activity

By learning from each incident, you can help avoid repeating the same mistakes, exposures, or misconfigurations, which can both improve your security posture and reduce time lost to preventable situations.

Conclusion

In this post, you learned about the new GuardDuty RDS Protection feature and how to understand, operationalize, and respond to the new findings. You can enable this feature through the GuardDuty console, CLI, or APIs to start monitoring your Amazon RDS workloads today.

If you’ve created EventBridge rules to send findings from GuardDuty to a target, make sure that you’ve configured your rules to deliver the newly added findings. After you enable GuardDuty findings, consider creating IR playbooks, doing tabletops and AWS gamedays, and mapping out what you want to automate. For more information, see the AWS Security Incident Response Guide and AWS Incident Response Playbook resources. To gain hands-on experience with different AWS Security services, see AWS Activation Days. The Activation Days workshops begin with hands-on work with different services in sandbox accounts, and then take you through the steps to deploy them across your organization.

To make it more efficient for you to operate securely on AWS, we are committed to continually improving GuardDuty, and we value your feedback. If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

How to investigate and take action on security issues in Amazon EKS clusters with Amazon Detective – Part 2

2022-12-05 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/how-to-investigate-and-take-action-on-security-issues-in-amazon-eks-clusters-with-amazon-detective-part-2/

In part 1 of this of this two-part series, How to detect security issues in Amazon EKS cluster using Amazon GuardDuty, we walked through a real-world observed security issue in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster and saw how Amazon GuardDuty detected each phase by following MITRE ATT&CK tactics.

In this blog post, we’ll walk you through investigative techniques to use with Amazon Detective, paired with the GuardDuty EKS and malware findings from the security issue. After we have identified impacted resources through our investigation, we’ll provide example remediation tactics and preventative controls to address and help prevent security issues in EKS clusters.

Amazon Detective can help you investigate security issues and related resources in your account. Detective provides EKS coverage that you can enable within your accounts. When this coverage is enabled, Detective can help investigate and remediate potentially unauthorized EKS activity that results from misconfiguration of the control plane nodes or application. Although GuardDuty is not a prerequisite to enable Detective, it is recommended that you enable GuardDuty to enhance the visualization capabilities in Detective with GuardDuty findings.

Prerequisites

You must have the following services enabled in your AWS account to generate and investigate findings associated with EKS security events in a similar manner as outlined in this blog. If you do not have GuardDuty enabled, you can still investigate with Detective, but in a limited capacity.

Amazon GuardDuty, along with these features of GuardDuty:
- Kubernetes Protection
- Malware Protection
Amazon Detective, along with this feature of Detective:
- EKS audit logs

Investigate with Amazon Detective

In the five phases we walked through in part 1, we discussed GuardDuty findings and MITRE ATT&CK tactics that can help you detect and understand each phase of the unauthorized activity, from the initial misconfiguration to the impact on our application when the EKS cluster is used for crypto mining.

The next recommended step is to investigate the EKS cluster and any associated resources. Amazon Detective can help you to investigate whether there was any other related unauthorized activity in the environment. We will walk through Detective capabilities for visualizing and gathering important information to effectively respond to the security issue. If you’re interested in creating detailed incident response playbooks for your security team to follow in your own environment, refer to these sample AWS incident response playbooks.

Depending on your scenario, there are various resources you can use to start your investigation, such as Security Hub findings, GuardDuty findings, related Kubernetes subjects, or an AWS account’s AWS CloudTrail activity. For our walkthrough, we’ll start our investigation from the GuardDuty finding and use the EKS cluster resource to pivot to the Detective console, as shown in Figure 7. Although we initially focus on the EKS cluster, you could start from any entities that are supported in the Detective behavior graph structure in the Amazon Detective User Guide. For example, we could start directly with the Kubernetes subject system:anonymous and find activity associated with the anonymous user.

Figure 7: Example Detective popup from GuardDuty finding for EKS cluster

We’ll now go over the information that you would need to gather from Detective in order to investigate the example security issue.

To investigate EKS cluster findings with Detective

In the GuardDuty console, navigate to an individual finding and hover over Investigate with Detective. Choose one of the specific resources to start. In the image below, we selected the EKS cluster resource to investigate with Detective. You will need to gather some preliminary information about the IAM roles associated with the EKS cluster.
- Questions: When was the cluster created? What IAM role created the cluster? What IAM role is assigned to the cluster?
- Why it matters: If you are an incident responder, these details can potentially help you identify the owner of the cluster and help you determine what IAM principals are involved.
- What next: Start looking into each IAM principal’s activity, as seen in CloudTrail, to investigate whether the IAM entity itself is potentially compromised or what other resources may have been impacted.
Figure 8: Detective summary page for EKS cluster metadata details
Next, on the EKS cluster overview page, you can see the container details associated with the cluster.
- Question: What are some of the other container details for the cluster? Does anything look out of the ordinary? Is it using a public image? Is it missing a network policy?
- Why it matters: Based on the architecture related to this cluster, you might be able to use this information to determine whether there are unauthorized containers. The contents of unauthorized containers will depend on your organization but typically consist of public images or unauthorized RBAC, pod security policies, or network policy configurations. It’s important to keep in mind that when you look at data in Detective, the scope time is very important. When you pivot from a GuardDuty finding, the scope time will be set to the first time the GuardDuty finding was seen to the last time the finding was seen. The container details reflect the containers that were running during the selected scope time. Changing the scope time might change the containers that are listed in the table shown in Figure 9.
- What next: Information found on this page can help to highlight unauthorized resources or configurations that will need to be remediated. You will also need to look at how these resources were initially created and if there are missing guardrails that should have been created during the provisioning of the cluster.
Figure 9: Detective summary page for EKS container metadata details
Finally, you will see associated security findings with this specific EKS cluster, similar to Figure 10, at the bottom of the EKS cluster overview page in Detective.
- Question: Are there any other security findings associated with this cluster that I previously was not aware of?
- Why it matters: In our example scenario, we walked through the findings that were initially detected and the events that unfolded from those findings. After further investigation, you might see other findings that were not part of the original investigation. This can occur if your security team is only investigating specific findings or severity values. The finding for PrivilegeEscalation:Kubernetes/PrivilegedContainer informs you that a privileged container was launched on your Kubernetes cluster by using an image that has never before been used to launch privileged containers in your cluster. A privileged container has root level access to the host. The other finding, Persistence:Kubernetes/ContainerWithSensitiveMount, informs you that a container was launched with a configuration that included a sensitive host path with write access in the volumeMounts section. This makes the sensitive host path accessible and writable from inside the container. Any finding associated to the suspicious or compromised cluster is valuable because it provides additional insight into what the unauthorized entity was trying to accomplish after the initial detection.
- What next: With Detective, you might want to continue your investigation by selecting each of these findings and reviewing all details related to the finding. Depending on the findings, you could bring in additional team members to help investigate further. For this example, we will move on to the next step.
Figure 10: Example Detective summary of security findings associated with the EKS cluster
Shift from the EKS cluster overview section to the Kubernetes API activity section, similar to Figure 11 below. This will give you the opportunity to dig into the API activity associated with this cluster.
1. Question: What other Kubernetes API activity was attempted from the cluster? Which API calls were successful? Which API calls failed? What was the unauthorized user trying to do?
2. Why it matters: It’s important to determine which actions were successfully invoked by the unauthorized user so that appropriate remediation actions can be taken. You can look at trends of successful and failed API calls, and can even search by Subject, IP address, or Kubernetes API call.
3. What next: You might want to look at all cluster role binding from days before the first GuardDuty finding was seen to determine if there was any other suspicious activity you should be investigating regarding the cluster.
Figure 11: Example Detective summary page for Kubernetes API activity on the EKS cluster
Next, you will want to look at the Newly observed Kubernetes API calls section, similar to Figure 12 below.
- Question: What are some of the more recent Kubernetes API calls? What are they trying to access right now and are they successful? Do I need to start taking action for other resources outside of EKS?
- Why it matters: This data shows Kubernetes subjects who were observed issuing API calls to this cluster for the first time during our scope time. Detective provides you this information by keeping a baseline of the activity associated with supported AWS resources. This can help you more quickly determine whether activity might be suspicious and worth looking into. In our example, we used the search functionality to look at API calls associated with the built-in Kubernetes secrets management. A common way to start your search is to see if an unauthorized user has successfully accessed any secrets, which can help you determine what information you might want to search in the overall API call volume section discussed in step 4.
- What next: If the unauthorized user has successfully accessed any secret, those secrets should be marked as compromised, and they should be rotated immediately.
Figure 12: Example Detective summary for newly observed Kubernetes API calls from the EKS cluster
You can also consider the following question when you look at the Newly observed Kubernetes API calls section.
- Question: Has the IP address associated with the finding been communicating with any other resources in our environment, and if so, what are the details of that communication?
- Why it matters: To answer this question, you can use Detective’s search functionality and the ability to use wild cards to search for IP addresses with the same first three octets. Also note that you can use CIDR notation to search, as well. Based on the results in the example in Figure 13, you can see that there are a number of related IP addresses associated with the environment. With this information, you now can look at the traffic associated with these different IPs and what resources they were communicating with.
Figure 13: Example Detective results page from a query against IP addresses associated with the EKS cluster
You can select one of the IP addresses in the search results to get more information related to it, similar to Figure 14 below.
1. Question: What was the first time an IP address was observed in the environment? When was the last time it was observed?
2. Why it matters: You can use this information to start isolating where unauthorized activity is coming from and what actions are being taken. You can also start creating a time series of unauthorized activity and scope.
3. What next: You can repeat some of the previous investigation steps for each IP address, like looking at the different tabs to review New behavior, Resource interaction, and Kubernetes activity.
Figure 14: Example Detective results page for specific IP address and associated metadata details

In summary, we began our investigation with a GuardDuty finding about an anonymous API request that was successful in using system:anonymous on one of our EKS clusters. We then used Detective to investigate and visualize activity associated with that EKS cluster, such as volume of successful or unsuccessful API requests, where and when those actions were attempted and other security findings associated with the resource. Once we have completed the investigation, we can confirm scope and impact of the security event and start moving towards taking action.

Remediation techniques for Amazon EKS

In this section, we will focus on how to remediate the security issue in our example. Your actions will vary based on your organization and the resources affected. It’s important to note that these actions will impact the EKS cluster and associated workloads, and should accordingly be performed by or coordinated with the cluster operator.

Before you take action on the EKS cluster, you will need to preserve forensic artifacts and evidence for the impacted EKS resources. The order of operations for these actions matters, because you want to get all the data from forensic artifacts in order to determine the overall impact to the resources affected. If you quarantine resources before you capture forensic artifacts, there is a risk that running processes will be interrupted or that the malware attempts to destroy resources that are valuable to a forensics investigation, to cover its tracks.

To preserve forensic evidence

Enable termination protection on the impacted worker node and change the shutdown behavior to Stop.
Label the offending pod or node with a label indicating that it is part of an active investigation.
Cordon the worker node.
Capture both volatile (temporary memory) and non-volatile (Amazon EBS snapshots) artifacts on the worker node.

Now that you have the forensic evidence, you can start to quarantine your EKS resources to restrict unauthorized network communication. The main objective is to prevent the affected EKS pods from communicating with internal resources or exfiltrating data externally.

To quarantine EKS resources

Isolate the pod by creating a network policy that denies ingress and egress traffic to the pod.
Attach a security group to the host and remove inbound and outbound rules. Take this action if you believe the underlying host has been compromised.
Depending on existing inbound and outbound rules on the security group, the connections will either be tracked or untracked. Applying an isolation security group will drop untracked connections. For tracked connections, new connections with the host will not be allowed from the isolation security group, but existing tracked connections will not be interrupted.

Important: This action will affect all containers running on the host.
Attach a deny rule for the EKS resources in a network access control list (network ACL). Because network ACLs are stateless firewalls, all connections will be interrupted, whether they are tracked or untracked connections.

Important: This action will affect all subnets using the network ACL and all resources within those subnets.

At this point, the affected EKS resources are quarantined, but the cluster is still configured to allow anonymous, unauthenticated access. You will need to remove all unauthorized permissions that were created or added.

To remove unauthorized permissions

Update the RBAC configuration to remove system:anonymous access.
Revoke temporary security credentials that are assigned to the pod or worker node, if necessary. You can also remove the IAM role associated with the EKS resources.

Note: Removing IAM policies or attaching IAM policies to restrict permissions will affect the resources that are using the IAM role.
Remove any unauthorized ClusterRoleBinding created by the system:anonymous user.
Redeploy the compromised pod or workload resource.

The actions taken so far primarily target the EKS resource, but based on our Detective investigation, there are other actions you might need to take. Because secrets were involved that could be used outside of the EKS cluster, those secrets will need to be rotated wherever they are referenced. Detective will also suggest additional areas where you can investigate and remediate additional unauthorized activity in your AWS account.

It is important that your team go through game days or run-throughs for investigating and responding to different scenarios in order to make sure the team is prepared. You can run through the EKS security workshop to get your security team more familiar with remediation for EKS.

For more information about responding to EKS cluster related security issues, refer to GuardDuty EKS remediation in the GuardDuty User Guide and the EKS Best Practices Guide.

Preventative controls for EKS

This section covers several preventative controls that you can use to protect EKS clusters.

How can I prevent external access to the EKS cluster?

To help prevent external access to your EKS clusters, limit the exposure of your API server. You can achieve that in two ways:

Set the API server endpoint access to Private. This will effectively forbid anyone outside of the VPC to send Kubernetes API requests to your EKS cluster.
Set an IP address allow list for the EKS cluster public access endpoint.

How can I prevent giving admin access to the EKS cluster?

To help prevent an EKS cluster user from granting any type of access to anonymous or unauthenticated users, you can set up a ValidatingAdmissionWebhook. This is a special type of Kubernetes admission controller that can be configured in the Kubernetes API. (To learn how to build serverless admission webhooks, see the blog post Building serverless admission webhooks for Kubernetes with AWS SAM.)

The ValidatingAdmissionWebhook will deny a Kubernetes API request that matches all of the following checks:

The request is creating or modifying a ClusterRoleBinding or RoleBinding.
The subjects section contains either of the following:
- The user system:anonymous
- The group system:unauthenticated

How can I prevent malicious images from being deployed?

Now that you have set controls to prevent external access to the EKS cluster and prevent granting access to anonymous users, you can focus on preventing the deployment of potentially malicious images.

Malicious container images can have different origins, including:

Images stored in public or unauthorized registries
Images replacing the ones that are stored in authorized registries
Authorized images that contain software with existing or newly discovered vulnerabilities

You can address these sources of malicious images by doing the following:

Use admission controllers to verify that images meet your organization’s requirements, including for the image origin. You can also refer to this this blog post to implement a solution with a webhook and admission controllers.
Enable tag immutability in your registry, a control that prevents an actor from maliciously replacing container images without changing the image’s tags. Additionally, you can enable an AWS Config rule to check tag immutability
Configure another ValidatingAdmissionWebhook that will only accept images if they meet all of the following criteria.
1. Images that come from approved registries.
2. Images that pass the vulnerability scan during deployment time.
3. Images that are signed by a trusted party. Amazon Elastic Container Registry (Amazon ECR) is working on a product enhancement to store image signatures. Currently, you can use an open-source cosign tool to verify and store image signatures.
  
  Note: These criteria can vary based on your use case and internal security and compliance standards.

The above controls will help prevent the deployment of a vulnerable, unauthorized, or potentially malicious container image.

How can I prevent lateral movement inside the cluster?

To prevent lateral movement inside the cluster, it is recommended to use network policies, as follows:

Enforce Kubernetes network policies to enforce ingress and egress controls within the cluster. You can implement these policies by following the steps in the Securing your cluster with network policies EKS workshop.

It’s important to note that you could use security groups for the same purpose, but pod security groups should only be used if the cluster is compromised and when you want to control the traffic between a pod and a resource that resides in the VPC, not inter-pod traffic.

In this section, we’ve reviewed different preventative controls that could have helped mitigate our example security incident. With the first preventative control, we could have prevented external actors from connecting to the API server. The second control could have prevented granting access to anonymous users. The third control could have prevented the deployment of an unauthorized or vulnerable container image. Finally, the fourth control could have helped limit the impact of the deployed vulnerable images to only the pods where the images were deployed, making it harder to laterally move to other pods in the cluster.

Conclusion

In this post, we walked you through how to investigate an EKS cluster related security issue with Amazon Detective. We also provided some recommended remediation and preventative controls to put in place for the EKS cluster specific security issues. When pairing GuardDuty’s ability for continuous threat detection and monitoring with Detective’s organization and visualization capabilities, you enable your security team to conduct faster and more effective investigation. By providing the security team the ability quickly view an organized set of data associated with security events within your AWS account, you reduce the overall Mean Time to Respond (MTTR).

Now that you understand the investigative capabilities with Detective, it’s time to try things out! It is important that you provide a mechanism for your security team to practice detection, investigation, and remediation techniques using security incident response simulations. By periodically running simulations, your security team will be prepared to quickly respond to possible security events. You can find more detailed incident response playbooks that can assist you in preparing for events in your environment, see these sample AWS incident response playbooks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a thread on Amazon GuardDuty re:Post.

Want more AWS Security news? Follow us on Twitter.

How to detect security issues in Amazon EKS clusters using Amazon GuardDuty – Part 1

2022-11-22 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/how-to-detect-security-issues-in-amazon-eks-clusters-using-amazon-guardduty-part-1/

In this two-part blog post, we’ll discuss how to detect and investigate security issues in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster with Amazon GuardDuty and Amazon Detective.

Amazon Elastic Kubernetes Service (Amazon EKS) is a managed service that you can use to run and scale container workloads by using Kubernetes in the AWS Cloud, which can help increase the speed of deployment and portability of modern applications. Amazon EKS provides secure, managed Kubernetes clusters on the AWS control plane by default. Kubernetes configurations such as pod security policies, runtime security, and network policies and configurations are specific for your organization’s use-case and securing them adequately would be a customer’s responsibility within AWS’ shared responsibility model.

Amazon GuardDuty can help you continuously monitor and detect suspicious activity related to AWS resources in your account. GuardDuty for EKS protection is a feature that you can enable within your accounts. When this feature is enabled, GuardDuty can help detect potentially unauthorized EKS activity resulting from misconfiguration of the control plane nodes or application.

In this post, we’ll walk through the events leading up to a real-world security issue that occurred due to EKS cluster misconfiguration, discuss how those misconfigurations could be used by a malicious actor, and how Amazon GuardDuty monitors and identifies suspicious activity throughout the EKS security event. In part 2 of the post, we’ll cover Amazon Detective investigation capabilities, possible remediation techniques, and preventative controls for EKS cluster related security issues.

Prerequisites

You must have AWS GuardDuty enabled in your AWS account in order to monitor and generate findings associated with an EKS cluster related security issue in your environment.

Amazon GuardDuty, along with these features of GuardDuty:
- Kubernetes Protection
- Malware Protection

EKS security issue walkthrough

Before jumping into the security issue, it is important to understand how the AWS shared responsibility model applies to the Amazon EKS managed service. AWS is responsible for the EKS managed Kubernetes control plane and the infrastructure to deliver EKS in a secure and reliable manner. You have the ability to configure EKS and how it interacts with other applications and services, where you are responsible for making sure that secure configurations are being used.

The following scenario is based on a real-world observed event, where a malicious actor used Kubernetes compromise tactics and techniques to expose and access an EKS cluster. We use this example to show how you can use AWS security services to identify and investigate each step of this security event. For a security event in your own environment, the order of operations and the investigative and remediation techniques used might be different. The scenario is broken down into the following phases and associated MITRE ATT&CK tactics:

Phase 1 – EKS cluster misconfiguration
Phase 2 (Discovery) – Discovery of vulnerable EKS clusters
Phase 3 (Initial Access) – Credential access to obtain Kubernetes secrets
Phase 4 (Persistence) – Impact to persist unauthorized access to the cluster
Phase 5 (Impact) – Impact to manipulate resources for unauthorized activity

Phase 1 – EKS cluster misconfiguration

By default, when you provision an EKS cluster, the API cluster endpoint is set to public, meaning that it can be accessed from the internet. Despite being accessible from the internet, the endpoint is still considered secure because it requires all API requests to be authenticated by AWS Identity and Access Management (IAM) and then authorized by Kubernetes role-based access control (RBAC). Also, the entity (user or role) that creates the EKS cluster is automatically granted system:masters permissions, which allows the entity to modify the EKS cluster’s RBAC configuration.

This example scenario starts with a developer who has access to administer EKS clusters in an AWS account. The developer wants to work from their home network and doesn’t want to connect to their enterprise VPN for IAM role federation. They configure an EKS cluster API without setting up the proper authentication and authorization components. Instead, the developer grants explicit access to the system:anonymous user in the cluster’s RBAC configuration. (Alternatively, an unauthorized RBAC configuration could be introduced into your environment after a developer unknowingly installs a malicious helm chart from the internet without reviewing or inspecting it first.)

In Kubernetes anonymous requests, unauthenticated and unrejected HTTP requests are treated as anonymous access and are identified as a system:anonymous user belonging to a system:unauthenticated group. This means that any entity on the internet can access the cluster and make API requests that are permitted by the role. There aren’t many legitimate use cases for this type of activity, because it’s considered a best practice to use RBAC instead. Anonymous requests are primarily used for setting up health endpoints and custom authentication.

By monitoring EKS audit logs, GuardDuty identifies this activity and generates the finding Policy:Kubernetes/AnonymousAccessGranted, as shown in Figure 1. This finding informs you that a user on your Kubernetes cluster successfully created a ClusterRoleBinding or RoleBinding to bind the user system:anonymous to a role. This action enables unauthenticated access to the API operations permitted by the role.

Figure 1: Example GuardDuty finding for Kubernetes anonymous access granted

Phase 2 (Discovery) – Discovery of vulnerable EKS clusters

Port scanning is a method that malicious actors use to determine if resources are publicly exposed, with open ports and known vulnerabilities. As an increasing number of open-source tools allows users to search for endpoints connected to the internet, finding these endpoints has become even easier. Security teams can use these open-source tools to their advantage by proactively scanning for and identifying externally exposed resources in their organization.

This brings us to the discovery phase of our misconfigured EKS cluster. The discovery phase is defined by MITRE as follows: “Discovery consists of techniques an adversary may use to gain knowledge about the system and internal network. These techniques help adversaries observe the environment and orient themselves before deciding how to act.”

By granting system:anonymous access to the EKS cluster in our example, the developer allowed requests from any public unauthenticated source. This can result in external web crawlers probing the cluster API, which can often happen within seconds of the system:anonymous access being granted. GuardDuty identifies this activity and generates the finding Discovery:Kubernetes/SuccessfulAnonymousAccess, as shown in Figure 2. This finding informs you that an API operation to discover resources in a cluster was successfully invoked by the system:anonymous user. Remember, all API calls made by system:anonymous are unauthenticated, in addition to /healthz and /version calls that are always unauthenticated regardless of the user identity, and any entity can make use of this user within the EKS cluster.

In the screenshot, under the Action section in the finding details, you can see that the anonymous user made a get request to “/”. This is a generic request that is not specific to a Kubernetes cluster, which may indicate that the crawler is not specifically targeting Kubernetes clusters. You can further see that the Status code is 200, indicating that the request was successful. If this activity is malicious, then the actor is now aware that there is an exposed resource.

Figure 2: Example GuardDuty finding for Kubernetes successful anonymous access

Phase 3 (Initial Access) – Credential access to obtain Kubernetes secrets

Next, in this phase, you might start observing more targeted API calls for establishing initial access from unauthorized users. MITRE defines initial access as “techniques that use various entry vectors to gain their initial foothold within a network. Techniques used to gain a foothold include targeted spearphishing and exploiting weaknesses on public-facing web servers. Footholds gained through initial access may allow for continued access, like valid accounts and use of external remote services, or may be limited-use due to changing passwords.”

In our example, the malicious actor has established initial access for the EKS cluster which is evident in the next GuardDuty finding, CredentialAccess:Kubernetes/SuccessfulAnonymousAccess, as shown in Figure 3. This finding informs you that an API call to access credentials or secrets was successfully invoked by the system:anonymous user. The observed API call is commonly associated with the credential access tactic where an adversary is attempting to collect passwords, usernames, and access keys for a Kubernetes cluster.

You can see that in this GuardDuty finding, in the Action section, the Request uri is targeted at a Kubernetes cluster, specifically /api/v1/namespaces/kube-system/secrets. This request seems to be targeting the secrets management capabilities that are built into Kubernetes. You can find more information about this secrets management capability in the Kubernetes documentation.

Figure 3: Example GuardDuty finding for Kubernetes successful credential access from anonymous user

Phase 4 (Persistence) – Impact to persist unauthorized access to the cluster

The next phase of this scenario is likely to be an impact in the EKS cluster to enable persistence by the malicious actor. MITRE defines impact as “techniques that adversaries use to disrupt availability or compromise integrity by manipulating business and operational processes.” Following the MITRE definitions, “Persistence consists of techniques that adversaries use to keep access to systems across restarts, changed credentials, and other interruptions that could cut off their access. Techniques used for persistence include any access, action, or configuration changes that let them maintain their foothold on systems, such as replacing or hijacking legitimate code or adding startup code.”

In the GuardDuty finding Impact:Kubernetes/SuccessfulAnonymousAccess, shown in Figure 4, you can see the Kubernetes user details and Action sections that indicate that a successful Kubernetes API call was made to create a ClusterRoleBinding by the system:anonymous username. This finding informs you that a write API operation to tamper with resources was successfully invoked by the system:anonymous user. The observed API call is commonly associated with the impact stage of an attack, when an adversary is tampering with resources in your cluster. This activity shows that the system:anonymous user has now created their own role to enable persistent access the EKS cluster. If the user is malicious, they can now access the cluster even if access is removed in the RBAC configuration for the system:anonymous user.

Figure 4 Example GuardDuty finding for Kubernetes successful credential change by anonymous user

Phase 5 (Impact) – Impact to manipulate resources for unauthorized activity

The fifth phase of this scenario is where the unauthorized user is likely to focus on impact techniques in order to use the access for malicious purpose. MITRE says of the impact phase: “Techniques used for impact can include destroying or tampering with data. In some cases, business processes can look fine, but may have been altered to benefit the adversaries’ goals. These techniques might be used by adversaries to follow through on their end goal or to provide cover for a confidentiality breach.” Typically, once a malicious actor has access into a system, they will introduce malware to the system to manipulate the compromised resource and possibly also other resources.

With the introduction of GuardDuty Malware Protection, when an Amazon Elastic Compute Cloud (Amazon EC2) or container-related GuardDuty finding that indicates potentially suspicious activity is generated, an agentless scan on the volumes will initiate and detect the presence of malware. Existing GuardDuty customers need to enable Malware Protection, and for new customers this feature is on by default when they enable GuardDuty for the first time. Malware Protection comes with a 30-day free trial for both existing and new GuardDuty customers. You can see a list of findings that initiates a malware scan in the GuardDuty User Guide.

In this example, the malicious actor now uses access to the cluster to perform unauthorized cryptocurrency mining. GuardDuty monitors the DNS requests from the EC2 instances used to host the EKS cluster. This allows GuardDuty to identify a DNS request made to a domain name associated with a cryptocurrency mining pool, and generate the finding CryptoCurrency:EC2/BitcoinTool.B!DNS, as shown in Figure 5.

Figure 5: Example GuardDuty finding for EC2 instance querying bitcoin domain name

Because this is an EC2 related GuardDuty finding and GuardDuty Malware Protection is enabled in the account, GuardDuty then conducts an agentless scan on the volumes of the EC2 instance to detect malware. If the scan results in a successful detection of one or more malicious files, another GuardDuty finding for Execution:EC2/MaliciousFile is generated, as shown in Figure 6.

Figure 6: Example GuardDuty finding for detection of a malicious file on EC2

The first GuardDuty finding detects crypto mining activity, while the proceeding malware protection finding provides context on the malware associated with this activity. This context is very valuable for the incident response process.

Conclusion

In this post, we walked you through each of the five phases where we outlined how an initial misconfiguration could result in a malicious actor gaining control of EKS resources within an AWS account and how GuardDuty is able to continually monitor and detect the progression of the security event. As previously stated, this is just one example where a misconfiguration in an EKS cluster could result in a security event.

Now that you have a good understanding of GuardDuty capabilities to continuously monitor and detect EKS security events, you will need to establish processes and procedures to enable your security team to investigate these events. You can enable Amazon Detective to help accelerate your security team’s mean time to respond (MTTR) by providing an efficient mechanism to analyze, investigate, and identify the root cause of security events. Follow along in part 2 of this series, How to investigate and take action on an Amazon EKS cluster related security issue with Amazon Detective, where we’ll cover techniques you can use with Amazon Detective to identify impacted EKS resources in your AWS account, possible remediation actions to take on the cluster, and preventative controls you can implement.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a thread on Amazon GuardDuty re:Post.

Want more AWS Security news? Follow us on Twitter.

How to use new Amazon GuardDuty EKS Protection findings

2022-05-06 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/how-to-use-new-amazon-guardduty-eks-protection-findings/

If you run container workloads that use Amazon Elastic Kubernetes Service (Amazon EKS), Amazon GuardDuty now has added support that will help you better protect these workloads from potential threats. Amazon GuardDuty EKS Protection can help detect threats related to user and application activity that is captured in Kubernetes audit logs. Newly-added Kubernetes threat detections include Amazon EKS clusters that are accessed by known malicious actors or from Tor nodes, API operations performed by anonymous users that might indicate a misconfiguration, and misconfigurations that can result in unauthorized access to Amazon EKS clusters. By using machine learning (ML) models, GuardDuty can identify patterns consistent with privilege-escalation techniques, such as a suspicious launch of a container with root-level access to the underlying Amazon Elastic Compute Cloud (Amazon EC2) host. In this post, we give you an overview of the new GuardDuty EKS Protection feature; show you examples of new finding details; and help you understand, operationalize, and respond to these new findings.

Amazon GuardDuty is an automated threat detection service that continuously monitors for suspicious activity and potentially unauthorized behavior to help protect your AWS accounts, Amazon EC2 workloads, data stored in Amazon Simple Storage Service (S3), and now Amazon EKS workloads.

If you are already a GuardDuty customer, you can enable GuardDuty EKS Protection and efficiently navigate the console to begin to use this feature. Your delegated administrator accounts can enable this for existing member accounts and determine if new AWS accounts in an organization will be automatically enrolled. If you are new to GuardDuty, the EKS Protection feature is included as part of the service’s 30-day trial period. As part of the 30-day trial period, you can take full advantage of this new feature and gain insight into your Amazon EKS workloads.

Overview of GuardDuty EKS Protection

GuardDuty EKS Protection enables GuardDuty to detect suspicious activities and potential compromises of your EKS clusters by analyzing Kubernetes audit logs. Kubernetes audit logs provide a security relevant, chronological set of records documenting the sequence of events from individual users, administrators, or system components that have affected your cluster. Audit logs can help answer questions such as: What happened? When did it happen? Who initiated it? GuardDuty EKS Protection analyzes Kubernetes audit logs from your Amazon EKS clusters, both new and existing, without the need to configure EKS control plane logging in your environment. GuardDuty collects these Kubernetes audit logs in addition to AWS CloudTrail, Amazon Virtual Private Cloud (Amazon VPC) flow logs, DNS queries, and Amazon S3 data events. GuardDuty EKS Protection performs analysis and looks for suspicious activity without the need for agents or adding resource constraints to your environment.

To detect threats using Kubernetes audit logs, GuardDuty uses a combination of machine learning, anomaly detection, and integrated threat intelligence to identify and prioritize potential threats. These findings primarily align to five root causes including compromised container images, configuration issues, Kubernetes user compromise, pod compromise, and node compromise. An example of a configuration issue is granting unnecessary privileges to the anonymous user by misconfiguring role-based access control (RBAC), which may inadvertently allow anonymous and unauthenticated calls to the Kubernetes API. A Kubernetes user compromise example could be a bad actor using stolen credentials to deploy containers with insecure settings, to use for a variety of activities from command and control to crypto-mining.

After a threat is detected, GuardDuty generates a security finding that includes container details such as the pod ID, container image ID, and tags associated with the Amazon EKS cluster. These finding details assist you with understanding the root cause which you can use to identify basic steps to remediate findings specific to EKS clusters. For example, your response to a finding or group of findings associated with a compromised Kubernetes user might begin with revoking access. For more information, see Remediating Kubernetes security issues discovered by GuardDuty in the Amazon GuardDuty User Guide.

Understanding new GuardDuty EKS Protection findings

As adversaries continue to become more sophisticated, it becomes even more important for you to align to a common framework to understand the tactics, techniques, and procedures (TTPs) behind an individual event. GuardDuty aligns findings using the MITRE ATT&CK framework, which is a globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. GuardDuty findings have a specific finding format that helps you understand details of each finding. If you examine the ThreatPurpose portion in the GuardDuty EKS Protection finding types, you see there are finding types associated with various MITRE ATT&CK tactics, including CredentialAccess, DefenseEvasion, Discovery, Impact, Persistence, and PrivilegeEscalation. This can help you identify and understand the type of activity associated with a finding.

For example, look at two different finding types that seem similar: Impact:Kubernetes/SuccessfulAnonymousAccess and Discovery:Kubernetes/SuccessfulAnonymousAccess. You can see the difference is the ThreatPurpose at the beginning. They are both involved with successful anonymous access, and the difference is the intent of the activity associated with each finding. GuardDuty has determined based on the API or request URI invoked, that in this example, the activity seen on one finding aligns with the Impact tactic whereas the other finding aligns with the Discovery tactic

With GuardDuty EKS Protection, you now have an additional mechanism to gain insight into your EKS clusters across your accounts to look for suspicious activity. You can be alerted to Kubernetes-specific suspicious activity including: allowing administrator access to the default service account, exposing a Kubernetes dashboard, and launching a container with sensitive host paths. With this new feature, GuardDuty is also able to extend support for finding types that you might already be familiar with that also apply to Amazon EKS workloads. These finding types include calls to a Kubernetes cluster API from a Tor node, or calls to a Kubernetes cluster from a known malicious IP address, which can indicate that there are interactions with your Kubernetes clusters from sources that are commonly associated with malicious actors.

Responding to GuardDuty EKS Protection findings

This section gives an overview of three new GuardDuty EKS Protection findings, how to prevent them, and how to investigate and respond if they happen in your environment. The patterns shown can also act as a guide for how to prevent, investigate, and respond to other GuardDuty EKS Protection findings.

Discovery:Kubernetes/SuccessfulAnonymousAccess

Finding documentation: Discovery:Kubernetes/SuccessfulAnonymousAccess

Severity: Medium

Overview: This finding (as shown in Figure 1) informs you that an API operation was successfully invoked by the system:anonymous user. API calls made by system:anonymous are unauthenticated. The observed API is commonly associated with the discovery stage of an attack when an adversary is gathering information on your Kubernetes cluster. This activity indicates that anonymous or unauthenticated access is permitted on the API action reported in the finding, and may be permitted on other actions. These API calls are possible because of a misconfiguration of the system:anonymous user or system:unauthenticated group.

Preventative measures: AWS recommends that you disable unnecessary anonymous authentication. For instructions, see Review and revoke unnecessary anonymous access in the Amazon EKS Best Practices Guides. It is important to note that Kubernetes versions older than 1.14 granted system:discovery and system:basic-user roles to system:anonymous user by default, and these permissions remain in place after updating unless you explicitly change them.

How to remediate: To respond to this finding, it is important to first identify the details of the activity, for example what cluster is involved? Who is the owner of this cluster? This information will assist you with the remediation steps that follow, to review and revoke unnecessary permissions, and also help you determine a root cause.

Figure 1: GuardDuty Console showing Discovery:Kubernetes/SuccessfulAnonymousAccess finding type

Remediation step 1: Examine permissions

The first step is to examine the permissions that have been granted to the system:anonymous user, and determine what permissions are needed. To accomplish this, you need to first understand what permissions the system:anonymous user has. You can use an rbac-lookup tool to list the Kubernetes roles and cluster roles bound to users, service accounts, and groups. An alternative method can be found at this GitHub page.

./rbac-lookup | grep -P 'system:(anonymous)|(unauthenticated)'
system:anonymous               cluster-wide        ClusterRole/system:discovery
system:unauthenticated         cluster-wide        ClusterRole/system:discovery
system:unauthenticated         cluster-wide        ClusterRole/system:public-info-viewer

Remediation step 2: Disassociate groups

Next, you disassociate the system:unauthenticated group from system:discovery and system:basic-user ClusterRoles, which you do by editing the ClusterRoleBinding. Make sure to not remove system:unauthenticated from the system:public-info-viewer cluster role binding, because that will prevent the Network Load Balancer from performing health checks against the API server. For more information, see Network Load Balancer in the AWS Load Balancer Controller Guide and Identity and Access Management in Amazon EKS Best Practices Guide.

To disassociate the appropriate groups

Run the command kubectl edit clusterrolebindings system:discovery. This command will open the current definition of system:discovery ClusterRoleBinding in your editor as shown in the sample .yaml configuration file:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this # file will be reopened with the relevant failures.
#
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: "2021-06-17T20:50:49Z"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:discovery
  resourceVersion: "24502985"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/system%3Adiscovery
  uid: b7936268-5043-431a-a0e1-171a423abeb6
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:discovery
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:unauthenticated

Delete the entry for system:unauthenticated group, which is highlighted in bold in the subjects section.
Repeat the same steps for system:basic-user ClusterRoleBinding.

If there is no reason that the system:anonymous user should be used in your environment, AWS recommends that you set up automatic response and remediation steps 1-3. For more information about the system:anonymous user, see Identity and Access Management in Amazon EKS Best Practices Guide.

PrivilegeEscalation:Kubernetes/PrivilegedContainer

Finding documentation: PrivilegeEscalation:Kubernetes/PrivilegedContainer

Severity: Medium

Overview: This finding (as shown in Figure 2) informs you that a privileged container was launched on your Kubernetes cluster using an image that has never before been used to launch privileged containers in your cluster. A privileged container has root level access on the host. Adversaries commonly launch privileged containers to perform privilege escalation to gain access and compromise the underlying host.

Preventative measures: Create and enforce policy-as-code (PAC) or Pod Security Standards (PSS) that require that pods be created as non-privileged. For more information, see Pod Security in the in Amazon EKS Best Practices Guide.

How to remediate: To respond to this finding, it is important to first identify the details of the activity and begin to answer questions that will help determine what happened. For example, what pod or workload was launched? Who was the user that launched this pod or workload? What cluster is involved?

Figure 2: GuardDuty Console showing PrivilegeEscalation:Kubernetes/PrivilegedContainer finding type

If this privileged container launch is unexpected, the credentials of the user identity used to launch the container may be compromised. You should then focus on remediating and reviewing access to your cluster, and remediating the user. To do this, follow the procedure in the Remediating a compromised Kubernetes user section of this post. Next, you should identify compromised pods using the procedure in the Identifying and remediating compromised pods section of this post.

If you know what specific circumstances a privileged container can be deployed in your environment, for example only in a specific namespace, it is likely you can automatically remediate any GuardDuty EKS Protection finding associated with a privileged container in any other namespace. For more information about automated response activities, see Incident response and forensics in the Amazon EKS Best Practices Guide.

Persistence:Kubernetes/ContainerWithSensitiveMount

Finding documentation: Persistence:Kubernetes/ContainerWithSensitiveMount

Severity: Medium

Overview: This finding (as shown in Figure 3) informs you that a container was launched with a configuration that included a sensitive host path with write access in the volumeMounts section. This makes the sensitive host path accessible and writable from inside the container. This technique is commonly used by adversaries to gain access to the host’s filesystem.

Preventative Measures: Create and enforce policy-as-code (PAC) or Pod Security Standards (PSS) that use the allowedHostPaths control to only allow required host paths for use in volumes and preferably with read-only access. For more information, see Pod Security in the Amazon EKS Best Practices Guide.

Figure 3: GuardDuty Console showing Persistence:Kubernetes/ContainerWithSensitiveMount finding type

If the container launched is unexpected, the credentials of the user identity used to launch the container may be compromised. You should then focus on remediating and reviewing access to your cluster and remediating the user. To do this, follow the procedure in the next section, Remediating a compromised Kubernetes user.

If you can determine what containers should and should not be launched with writable hostPath mounts, then you can create automatic response and remediation for this use case. For example, you might want to revoke temporary security credentials assigned to the pod or worker node. For more information about revoking temporary security credentials and other response and remediation actions, see Incident response and forensics in the Amazon EKS Best Practices Guide.

Remediating a compromised Kubernetes user

If the compromised user has privileges to read secrets of one or more namespaces, rotate all of the affected secrets. For more information about the different types of secrets, see Secrets in the Kubernetes documentation. If the user has write privileges, AWS recommends auditing all changes made by the user in question. You can accomplish this by querying audit logs, if you have enabled EKS control plane logging on your EKS cluster. If you do not currently have logging enabled, follow the instructions for Enabling and disabling control plane logs in the Amazon EKS User Guide. Amazon EKS stores these control plane logs in Amazon CloudWatch Logs in your account. You can use CloudWatch Logs Insights to list all the mutating changes that the compromised user has made.

Remediation step 1: Identify the user

All actions performed on a Kubernetes cluster has an associated identity. GuardDuty EKS Protection findings report details of the Kubernetes user identity that the malicious actor may have compromised. You can find details of the user identity in the GuardDuty console under the Kubernetes user details section in the finding details, or in the finding JSON under the resources.eksClusterDetails.kubernetesDetails.kubernetesUserDetails section. These user details include username, UID, and groups that the user belongs to.

Remediation step 2: Identify changes

Identify the changes made by the attacker associated with the compromised user identity by using the code example below to query CloudWatch Logs Insights, replacing the placeholders with your values.

fields @timestamp, @message
| filter user.username == <username> 
| filter verb == "create" or verb == "update" or verb == "patch"
| filter responseStatus.code >= 200 and responseStatus.code <= 300
| filter @timestamp >= <approximate start time of the attack in epoch milliseconds>

For example:

fields @timestamp, @message
| filter user.username == "kubernetes-admin" 
| filter verb == "create" or verb == "update" or verb == "patch"
| filter responseStatus.code >= 200 and responseStatus.code <= 300
| filter @timestamp >= 1628279482312

An EKS cluster can have multiple types of user identities, for example the kubernetes-admin user, aws-auth ConfigMap defined user, and so on. You will need to take actions appropriate for the user type to properly revoke its access. For more information, see Remediating compromised Kubernetes users in the Amazon GuardDuty User Guide.
(Optional) If the compromised user identity had extensive privileges and you determine that the attacker made extensive changes to the cluster, you should consider isolating the pod, followed by creating a new clean cluster and redeploying your applications to the new cluster. For instructions to isolate and redeploy EKS pods, see Isolate the Pod by creating a Network Policy that denies all ingress and egress traffic to the pod in the Amazon EKS Best Practices Guide.

Identifying and remediating compromised pods

If a GuardDuty EKS Protection finding is caused by activity related to a specific pod, the value of the finding JSON resource.kubernetesDetails.kubernetesWorkloadDetails.type field is pod. The finding includes the name of the pod and namespace in the resource.kubernetesDetails.kubernetesWorkloadDetails.name and resource.kubernetesDetails.kubernetesWorkloadDetails.namespace fields, which uniquely identify the pod.

In other cases, such as when a service account or a Kubernetes workload name is in the resource.kubernetesDetails.kubernetesUserDetails, you can follow the instructions in the Sample incident response plan to identify compromised pods using different pieces of information available in the GuardDuty EKS Protection findings.

After you have identified compromised pods, to remediate, use the instructions to isolate the pods, rotate the credentials, and gather data for forensic analysis in Isolate the Pod by creating a Network Policy that denies all ingress and egress traffic to the pod in the Amazon EKS Best Practices Guide.

Conclusion

In this post, you learned the details of the new Amazon GuardDuty EKS Protection feature, and Kubernetes audit logs, and you saw examples for how to understand, operationalize, and respond to these new findings. You can enable this feature through the GuardDuty Console or APIs to start monitoring your Amazon EKS clusters today. If you have created Amazon EventBridge Rules to send findings from GuardDuty to a target, then ensure that your rules are configured to deliver these newly added findings.

AWS is committed to continually improving GuardDuty, to make it more efficient for you to operate securely in AWS. At AWS, customer feedback drives change, so we encourage you to continue providing feedback. If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Best practices for cross-Region aggregation of security findings

2022-01-19 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/best-practices-for-cross-region-aggregation-of-security-findings/

AWS Security Hub enables customers to have a centralized view into the security posture across their AWS environment by aggregating your security alerts from various AWS services and partner products in a standardized format so that you can more easily take action on them. To facilitate that central view, Security Hub allows you to designate an aggregation Region, which links some or all Regions to a single aggregated Region in a delegated administrator AWS account. All your findings across all of your accounts and all of your linked Regions will be processed by Security Hub in this one Region. With this feature, you can take advantage of many configurations when ingesting findings into Security Hub, that will benefit you operationally and provide cost savings.

This blog post provides you with a set of best practices when using Security Hub across multiple Regions. After implementing the recommendations in this blog post, you’ll have an optimized and centralized view of Security Hub findings from all integrated AWS services and partner products across all Regions in a single AWS account and Region.

Enable cross-Region aggregation

To enable cross-Region aggregation in Security Hub, you must first enable finding aggregation in Security Hub from the Region that will become the aggregation Region. You cannot use a Region that is disabled by default as your aggregation Region. For a list of Regions that are disabled by default, see Enabling a Region in the AWS General Reference.

You can enable AWS Security Hub finding aggregation using either the console or CLI. You must enable finding aggregation from the Region that will be the aggregation Region.

To enable Security Hub finding aggregation from the console

To enable AWS Security Hub finding aggregation using the AWS console:

Start by navigating to the AWS Security Hub console and select Settings on the left side of the screen. Once on the settings page, choose the Regions tab.

Figure 1. Enabling finding aggregation

Check the checkbox to Link future Regions. As AWS releases new Regions, their results will automatically be aggregated into your designated Region. If this checkbox is not checked, any new Region that is released will not aggregate Security Hub findings to the aggregation Region.

To enable Security Hub finding aggregation using the CLI

Alternatively, you can enable AWS Security Hub finding aggregation using the CLI by using the following command:

aws securityhub create-finding-aggregator –region <aggregation Region> –region-linking-mode ALL_REGIONS | ALL_REGIONS_EXCEPT_SPECIFIED | SPECIFIED_REGIONS –regions <Region list>

Here’s a sample CLI command to enable AWS Security Hub finding aggregation:

aws securityhub create-finding-aggregator –region us-east-1 –region-linking-mode SPECIFIED_REGIONS –regions us-west-1,us-west-2

For more details around AWS Security Hub cross-region aggregation, see Aggregating findings across regions.

Consolidating downstream SIEM and ticketing integrations

Security Hub findings for all AWS accounts in your environment should be integrated into a Security Information and Event Management (SIEM) solution, such as Amazon OpenSearch Service or an APN partner SIEM, or a standardized ticketing system such as JIRA or ServiceNow.

You should send all Security Hub findings to a SIEM or ticketing solution from a single aggregation point to simplify operational overhead. Although integration architectures vary, as an example, this might mean configuring an Amazon EventBridge rule to parse and send findings to AWS Lambda or Amazon Kinesis for a custom integration point with the SIEM or ticketing solution.

You should to configure this integration point in a single delegated administrator account across all member AWS accounts and aggregated Regions. You should avoid having multiple integration points between each Security Hub Region and your SIEM or ticketing solution to avoid unnecessary operational overhead and costs of managing multiple integration points and resources required to stream findings to your SIEM.

Collecting Security Hub findings in a SIEM or ticketing solution can help you correlate findings across many other logs sources. For example, you might use a SIEM solution to analyze operating system logs from an Amazon Elastic Compute Cloud (Amazon EC2) instance to correlate with GuardDuty findings collected by Security Hub to investigate suspicious activity. You could also use ServiceNow or JIRA to create an automated, bidirectional integration between these ticketing solutions that keeps your Security Hub findings and issues in sync.

Auto-archive GuardDuty findings associated with global resources

Amazon GuardDuty creates findings associated with AWS IAM resources. IAM resources are global resources, which means that they are not Region-specific. If GuardDuty generates a finding for an IAM API call that is not Region-specific, such as ListGroups (for example, PenTest:IAMUser/KaliLinux) that finding is created in all GuardDuty Regions and ingested into Security Hub in every Region. You want to implement suppression rules in GuardDuty so that you don’t have multiple copies of this finding in your Security Hub delegated administrator account finding aggregation Region.

To implement AWS GuardDuty suppression rules (Console)

To reduce the duplication of findings in Security Hub, suppress global GuardDuty findings in all Regions except the Security Hub aggregation Region. For example, if you are aggregating Security Hub findings in us-east-1 and your environment uses all commercial AWS Regions in the United States, you would add a suppression rule in GuardDuty in us-east-2, us-west-1, and us-west-2.

To create AWS GuardDuty suppression rules using the AWS console:

Navigate to the GuardDuty console and select the Findings link on the left side of the screen.

Figure 2. Creating GuardDuty suppression rules

Filter to search for the findings you want to suppress, and click Save / edit in the search bar.
Enter a name and description for the suppression rule and save it.

To implement AWS GuardDuty suppression rules (CLI)

Alternatively, you can create AWS GuardDuty suppression rules using the CreateFilter API via CLI.

Create a JSON file with your desired suppression filter criteria for the suppression rule.
The following CLI command will test your filter criteria for AWS GuardDuty findings that will be suppressed:

aws guardduty list-findings –detector-id 12abc34d567e8fa901bc2d34e56789f0 –finding-criteria file://criteria.json

The following CLI command will create a filter for AWS GuardDuty findings that will be suppressed:

aws guardduty create-filter –action ARCHIVE –detector-id 12abc34d567e8fa901bc2d34e56789f0 –name yourfiltername –finding-criteria file://criteria.json

For more details for creating AWS GuardDuty suppression rules, see Creating AWS GuardDuty suppression rules.

Reduce AWS Config cost by recording global resources in one Region

Like GuardDuty, AWS Config also records supported types of global resources, which are not tied to a specific Region and can be used in all Regions. The global resource types that AWS Config supports are IAM users, groups, roles, and customer managed policies. The configuration details for a specific global resource are the same in all Regions. If you have AWS Security Hub AWS Foundational Best Practices enabled, the feature has certain checks for global resources in AWS Config that you need to disable in all Regions except the aggregated Region.

Customize AWS Config for global resources

If you customize AWS Config in multiple Regions to record global resources, AWS Config creates multiple configuration items each time a global resource changes, one configuration item for each Region. Costs for each configuration item can be found on AWS Config pricing. These configuration items will contain identical data. To prevent duplicate configuration items, consider customizing AWS Config in only one Region to record global resources, unless you want those configuration items to be available in multiple Regions. See this blog post for a comprehensive list of additional AWS Config best practices.

To customize AWS Config for global resources (Console)

Follow the steps below to change the AWS Config global resource configuration in the AWS Console.

Navigate to the AWS Config console and select Settings on the left side of the screen
Click Edit in the top right corner
Uncheck the Include global resources checkbox.
Repeat these steps for each Region AWS Config is enabled, except the Region where you would like to track global resources.

Figure 3. AWS Config global resource setting

To customize AWS Config for global resources (CLI)

Alternatively, you can disable the global resource tracking in AWS Config using the CLI.

aws configservice put-configuration-recorder –configuration-recorder name=default,roleARN=arn:aws:iam::123456789012:role/config-role –recording-group allSupported=true,includeGlobalResourceTypes=false

If you have deployed AWS Config using these CloudFormation templates, you would set the IncludeGlobalResourceTypes to False under the AWS::Config::ConfigurationRecorder for the Regions you do not want to track global resources, and set the value to True in the aggregated Region where you would like to use to track global resources. You can use the CloudFormation StackSets multiple AWS Region deployment feature to deploy the CloudFormation template in all AWS Regions where AWS Config is enabled.

For more details for AWS Config global resources, see Selecting AWS Config resources to record.

Disable AWS Security Hub AWS Foundational Best Practices periodic controls associated with global resources

AWS Security Hub AWS Foundational Best Practices perform checks against the resources in your AWS environment utilizing AWS Config rules. After you have disabled the AWS Config global resources in all Regions except for the Region that runs global recording, disable the Security Hub controls that deal with global resources as shown in Figure 5 below.

You can disable AWS Security Hub controls relating to global resources using the console or CLI.

To disable AWS Security Hub controls (Console)

Follow the steps below to disable Security Hub controls that deal with global resources in the AWS Console.

Navigate to the Security Hub console and select Security Standards on the left side of the screen.
Click on the AWS Foundation Security Best Practices v.1.0.0 security standard.
Then use the filter box to search for IAM. Now you should be able to see security controls IAM.1-IAM.7, which are Security Hub global controls.

Figure 4. Security Hub global controls

Click on each control and select Disable in the top right corner
After you have disabled resources, add a reason for disabling and choose Disable.

Figure 5. Disabling Security Hub control

To disable AWS Security Hub controls (CLI)

Alternatively, you can disable Security Hub controls that deal with global resources using the CLI.

aws securityhub update-standards-control –standards-control-arn <control ARN> –control-status “DISABLED” –disabled-reason <description of reason to disable>

This sample CLI command disables Security Hub controls that deal with global resources:

aws securityhub update-standards-control –standards-control-arn “arn:aws:securityhub:us-east-1:123456789012:control/aws-foundational-security-best-practices/v/1.0.0/ACM.1” –control-status “DISABLED” –disabled-reason “Not applicable for my service”

You can also follow instructions to implement a solution to disable specific Security Hub controls for multiple AWS accounts.

Be sure to only disable the Security Hub controls in the Regions where global recording is also disabled. Verify the Security Hub controls associated with global resources are enabled in the same Region where AWS Config global resources are enabled.

After you have completed disabling these controls and recording of global resources, proceed to disable the [Config.1] AWS Config should be enabled control. This specific control requires recording of global resources in order to pass, which is not required to have enabled in multiple Regions.

For more details for AWS Security Hub controls, see Disabling and enabling individual AWS Security Hub controls .

Implement automatic remediation from a central Region

Once findings are consolidated and ingested into Security Hub across all your organization’s AWS accounts, you should implement auto-remediation where possible, including everything from resource misconfigurations to automated quarantine of infected EC2 instances. Security Hub provides multiple ways to achieve this through end-to-end automation with EventBridge or through human-triggered automation with Security Hub Custom Actions. You can deploy automatic remediation solutions in a single Region to perform cross-Region remediation. This helps you deploy fewer resources, saving money and operational overhead. For more information on how to enable the solution for Security Hub Automated Response and Remediation, see this blog post.

If you have automation currently in place, it’s important to understand how findings from multiple Regions triggering your automation might be affected. For example, you might have a Lambda function that remediates problems with S3 buckets, where it assumes it is being invoked in the same Region as the S3 bucket it needs to remediate. With cross-Region aggregation, your Lambda might need to make a cross-Region AWS SDK call. The Lambda function will run in the Region where the aggregation occurs, but the bucket could be in another Region, so you might have to adjust your function to handle that situation. Also, the role associated with the Lambda function could have its privileges limited to a single Region. If you intend the same function to work in all Regions, you might need change the IAM policy for the IAM role used by the Lambda. Make sure to check Service Control Policies in AWS Organizations, if you use them, because they can also deny actions in one Region while allowing them in another Region.

When enabling cross-Region finding aggregation, you’ll need to understand how any automatic remediation that might be in place today could be affected. Be sure to test your remediation functions on resources in various Regions, to be sure remediation works in all Regions you monitor.

Conclusion

This blog post highlights configurations you can take advantage of to reduce operational overhead and provide cost savings by using cross-Region finding aggregation in Security Hub. The examples given apply to the majority of AWS environments, and are meant to be action items you can use to improve the overall security and operational effectiveness of your AWS environment.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the re:Post forum.

Want more AWS Security news? Follow us on Twitter.

Using AWS security services to protect against, detect, and respond to the Log4j vulnerability

2021-12-16 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/using-aws-security-services-to-protect-against-detect-and-respond-to-the-log4j-vulnerability/

January 7, 2022: The blog post has been updated to include using Network ACL rules to block potential log4j-related outbound traffic.

January 4, 2022: The blog post has been updated to suggest using WAF rules when correct HTTP Host Header FQDN value is not provided in the request.

December 31, 2021: We made a minor update to the second paragraph in the Amazon Route 53 Resolver DNS Firewall section.

December 29, 2021: A paragraph under the Detect section has been added to provide guidance on validating if log4j exists in an environment.

December 23, 2021: The GuardDuty section has been updated to describe new threat labels added to specific finding to give log4j context.

December 21, 2021: The post includes more info about Route 53 Resolver DNS query logging.

December 20, 2021: The post has been updated to include Amazon Route 53 Resolver DNS Firewall info.

December 17, 2021: The post has been updated to include using Athena to query VPC flow logs.

December 16, 2021: The Respond section of the post has been updated to include IMDSv2 and container mitigation info.

This blog post was first published on December 15, 2021.

Overview

In this post we will provide guidance to help customers who are responding to the recently disclosed log4j vulnerability. This covers what you can do to limit the risk of the vulnerability, how you can try to identify if you are susceptible to the issue, and then what you can do to update your infrastructure with the appropriate patches.

The log4j vulnerability (CVE-2021-44228, CVE-2021-45046) is a critical vulnerability (CVSS 3.1 base score of 10.0) in the ubiquitous logging platform Apache Log4j. This vulnerability allows an attacker to perform a remote code execution on the vulnerable platform. Version 2 of log4j, between versions 2.0-beta-9 and 2.15.0, is affected.

The vulnerability uses the Java Naming and Directory Interface (JNDI) which is used by a Java program to find data, typically through a directory, commonly a LDAP directory in the case of this vulnerability.

Figure 1, below, highlights the log4j JNDI attack flow.

Figure 1. Log4j attack progression. Source: GovCERT.ch, the Computer Emergency Response Team (GovCERT) of the Swiss government

As an immediate response, follow this blog and use the tool designed to hotpatch a running JVM using any log4j 2.0+. Steve Schmidt, Chief Information Security Officer for AWS, also discussed this hotpatch.

Protect

You can use multiple AWS services to help limit your risk/exposure from the log4j vulnerability. You can build a layered control approach, and/or pick and choose the controls identified below to help limit your exposure.

AWS WAF

Use AWS Web Application Firewall, following AWS Managed Rules for AWS WAF, to help protect your Amazon CloudFront distribution, Amazon API Gateway REST API, Application Load Balancer, or AWS AppSync GraphQL API resources.

AWSManagedRulesKnownBadInputsRuleSet esp. the Log4JRCE rule which helps inspects the request for the presence of the Log4j vulnerability. Example patterns include ${jndi:ldap://example.com/}.
AWSManagedRulesAnonymousIpList esp. the AnonymousIPList rule which helps inspect IP addresses of sources known to anonymize client information.
AWSManagedRulesCommonRuleSet, esp. the SizeRestrictions_BODY rule to verify that the request body size is at most 8 KB (8,192 bytes).

You should also consider implementing WAF rules that deny access, if the correct HTTP Host Header FQDN value is not provided in the request. This can help reduce the likelihood of scanners that are scanning the internet IP address space from reaching your resources protected by WAF via a request with an incorrect Host Header, like an IP address instead of an FQDN. It’s also possible to use custom Application Load Balancer listener rules to achieve this.

If you’re using AWS WAF Classic, you will need to migrate to AWS WAF or create custom regex match conditions.

Have multiple accounts? Follow these instructions to use AWS Firewall Manager to deploy AWS WAF rules centrally across your AWS organization.

Amazon Route 53 Resolver DNS Firewall

You can use Route 53 Resolver DNS Firewall, following AWS Managed Domain Lists, to help proactively protect resources with outbound public DNS resolution. We recommend associating Route 53 Resolver DNS Firewall with a rule configured to block domains on the AWSManagedDomainsMalwareDomainList, which has been updated in all supported AWS regions with domains identified as hosting malware used in conjunction with the log4j vulnerability. AWS will continue to deliver domain updates for Route 53 Resolver DNS Firewall through this list.

Also, you should consider blocking outbound port 53 to prevent the use of external untrusted DNS servers. This helps force all DNS queries through DNS Firewall and ensures DNS traffic is visible for GuardDuty inspection. Using DNS Firewall to block DNS resolution of certain country code top-level domains (ccTLD) that your VPC resources have no legitimate reason to connect out to, may also help. Examples of ccTLDs you may want to block may be included in the known log4j callback domains IOCs.

We also recommend that you enable DNS query logging, which allows you to identify and audit potentially impacted resources within your VPC, by inspecting the DNS logs for the presence of blocked outbound queries due to the log4j vulnerability, or to other known malicious destinations. DNS query logging is also useful in helping identify EC2 instances vulnerable to log4j that are responding to active log4j scans, which may be originating from malicious actors or from legitimate security researchers. In either case, instances responding to these scans potentially have the log4j vulnerability and should be addressed. GreyNoise is monitoring for log4j scans and sharing the callback domains here. Some notable domains customers may want to examine log activity for, but not necessarily block, are: *interact.sh, *leakix.net, *canarytokens.com, *dnslog.cn, *.dnsbin.net, and *cyberwar.nl. It is very likely that instances resolving these domains are vulnerable to log4j.

AWS Network Firewall

Customers can use Suricata-compatible IDS/IPS rules in AWS Network Firewall to deploy network-based detection and protection. While Suricata doesn’t have a protocol detector for LDAP, it is possible to detect these LDAP calls with Suricata. Open-source Suricata rules addressing Log4j are available from corelight, NCC Group, from ET Labs, and from CrowdStrike. These rules can help identify scanning, as well as post exploitation of the log4j vulnerability. Because there is a large amount of benign scanning happening now, we recommend customers focus their time first on potential post-exploitation activities, such as outbound LDAP traffic from their VPC to untrusted internet destinations.

We also recommend customers consider implementing outbound port/protocol enforcement rules that monitor or prevent instances of protocols like LDAP from using non-standard LDAP ports such as 53, 80, 123, and 443. Monitoring or preventing usage of port 1389 outbound may be particularly helpful in identifying systems that have been triggered by internet scanners to make command and control calls outbound. We also recommend that systems without a legitimate business need to initiate network calls out to the internet not be given that ability by default. Outbound network traffic filtering and monitoring is not only very helpful with log4j, but with identifying other classes of vulnerabilities too.

Network Access Control Lists

Customers may be able to use Network Access Control List rules (NACLs) to block some of the known log4j-related outbound ports to help limit further compromise of successfully exploited systems. We recommend customers consider blocking ports 1389, 1388, 1234, 12344, 9999, 8085, 1343 outbound. As NACLs block traffic at the subnet level, careful consideration should be given to ensure any new rules do not block legitimate communications using these outbound ports across internal subnets. Blocking ports 389 and 88 outbound can also be helpful in mitigating log4j, but those ports are commonly used for legitimate applications, especially in a Windows Active Directory environment. See the VPC flow logs section below to get details on how you can validate any ports being considered.

Use IMDSv2

Through the early days of the log4j vulnerability researchers have noted that, once a host has been compromised with the initial JDNI vulnerability, attackers sometimes try to harvest credentials from the host and send those out via some mechanism such as LDAP, HTTP, or DNS lookups. We recommend customers use IAM roles instead of long-term access keys, and not store sensitive information such as credentials in environment variables. Customers can also leverage AWS Secrets Manager to store and automatically rotate database credentials instead of storing long-term database credentials in a host’s environment variables. See prescriptive guidance here and here on how to implement Secrets Manager in your environment.

To help guard against such attacks in AWS when EC2 Roles may be in use — and to help keep all IMDS data private for that matter — customers should consider requiring the use of Instance MetaData Service version 2 (IMDSv2). Since IMDSv2 is enabled by default, you can require its use by disabling IMDSv1 (which is also enabled by default). With IMDSv2, requests are protected by an initial interaction in which the calling process must first obtain a session token with an HTTP PUT, and subsequent requests must contain the token in an HTTP header. This makes it much more difficult for attackers to harvest credentials or any other data from the IMDS. For more information about using IMDSv2, please refer to this blog and documentation. While all recent AWS SDKs and tools support IMDSv2, as with any potentially non-backwards compatible change, test this change on representative systems before deploying it broadly.

Detect

This post has covered how to potentially limit the ability to exploit this vulnerability. Next, we’ll shift our focus to which AWS services can help to detect whether this vulnerability exists in your environment.

Figure 2. Log4j finding in the Inspector console

Amazon Inspector

As shown in Figure 2, the Amazon Inspector team has created coverage for identifying the existence of this vulnerability in your Amazon EC2 instances and Amazon Elastic Container Registry Images (Amazon ECR). With the new Amazon Inspector, scanning is automated and continual. Continual scanning is driven by events such as new software packages, new instances, and new common vulnerability and exposure (CVEs) being published.

For example, once the Inspector team added support for the log4j vulnerability (CVE-2021-44228 & CVE-2021-45046), Inspector immediately began looking for this vulnerability for all supported AWS Systems Manager managed instances where Log4j was installed via OS package managers and where this package was present in Maven-compatible Amazon ECR container images. If this vulnerability is present, findings will begin appearing without any manual action. If you are using Inspector Classic, you will need to ensure you are running an assessment against all of your Amazon EC2 instances. You can follow this documentation to ensure you are creating an assessment target for all of your Amazon EC2 instances. Here are further details on container scanning updates in Amazon ECR private registries.

GuardDuty

In addition to finding the presence of this vulnerability through Inspector, the Amazon GuardDuty team has also begun adding indicators of compromise associated with exploiting the Log4j vulnerability, and will continue to do so. GuardDuty will monitor for attempts to reach known-bad IP addresses or DNS entries, and can also find post-exploit activity through anomaly-based behavioral findings. For example, if an Amazon EC2 instance starts communicating on unusual ports, GuardDuty would detect this activity and create the finding Behavior:EC2/NetworkPortUnusual. This activity is not limited to the NetworkPortUnusual finding, though. GuardDuty has a number of different findings associated with post exploit activity, such as credential compromise, that might be seen in response to a compromised AWS resource. For a list of GuardDuty findings, please refer to this GuardDuty documentation.

To further help you identify and prioritize issues related to CVE-2021-44228 and CVE-2021-45046, the GuardDuty team has added threat labels to the finding detail for the following finding types:

Backdoor:EC2/C&CActivity.B
If the IP queried is Log4j-related, then fields of the associated finding will include the following values:

service.additionalInfo.threatListName = Amazon
service.additionalInfo.threatName = Log4j Related

Backdoor:EC2/C&CActivity.B!DNS
If the domain name queried is Log4j-related, then the fields of the associated finding will include the following values:

service.additionalInfo.threatListName = Amazon
service.additionalInfo.threatName = Log4j Related

Behavior:EC2/NetworkPortUnusual
If the EC2 instance communicated on port 389 or port 1389, then the associated finding severity will be modified to High, and the finding fields will include the following value:

service.additionalInfo.context = Possible Log4j callback

Figure 3. GuardDuty finding with log4j threat labels

Security Hub

Many customers today also use AWS Security Hub with Inspector and GuardDuty to aggregate alerts and enable automatic remediation and response. In the short term, we recommend that you use Security Hub to set up alerting through AWS Chatbot, Amazon Simple Notification Service, or a ticketing system for visibility when Inspector finds this vulnerability in your environment. In the long term, we recommend you use Security Hub to enable automatic remediation and response for security alerts when appropriate. Here are ideas on how to setup automatic remediation and response with Security Hub.

VPC flow logs

Customers can use Athena or CloudWatch Logs Insights queries against their VPC flow logs to help identify VPC resources associated with log4j post exploitation outbound network activity. Version 5 of VPC flow logs is particularly useful, because it includes the “flow-direction” field. We recommend customers start by paying special attention to outbound network calls using destination port 1389 since outbound usage of that port is less common in legitimate applications. Customers should also investigate outbound network calls using destination ports 1388, 1234, 12344, 9999, 8085, 1343, 389, and 88 to untrusted internet destination IP addresses. Free-tier IP reputation services, such as VirusTotal, GreyNoise, NOC.org, and ipinfo.io, can provide helpful insights related to public IP addresses found in the logged activity.

Note: If you have a Microsoft Active Directory environment in the captured VPC flow logs being queried, you might see false positives due to its use of port 389.

Validation with open-source tools

With the evolving nature of the different log4j vulnerabilities, it’s important to validate that upgrades, patches, and mitigations in your environment are indeed working to mitigate potential exploitation of the log4j vulnerability. You can use open-source tools, such as aws_public_ips, to get a list of all your current public IP addresses for an AWS Account, and then actively scan those IPs with log4j-scan using a DNS Canary Token to get notification of which systems still have the log4j vulnerability and can be exploited. We recommend that you run this scan periodically over the next few weeks to validate that any mitigations are still in place, and no new systems are vulnerable to the log4j issue.

Respond

The first two sections have discussed ways to help prevent potential exploitation attempts, and how to detect the presence of the vulnerability and potential exploitation attempts. In this section, we will focus on steps that you can take to mitigate this vulnerability. As we noted in the overview, the immediate response recommended is to follow this blog and use the tool designed to hotpatch a running JVM using any log4j 2.0+. Steve Schmidt, Chief Information Security Officer for AWS, also discussed this hotpatch.

Figure 4. Systems Manager Patch Manager patch baseline approving critical patches immediately

AWS Patch Manager

If you use AWS Systems Manager Patch Manager, and you have critical patches set to install immediately in your patch baseline, your EC2 instances will already have the patch. It is important to note that you’re not done at this point. Next, you will need to update the class path wherever the library is used in your application code, to ensure you are using the most up-to-date version. You can use AWS Patch Manager to patch managed nodes in a hybrid environment. See here for further implementation details.

Container mitigation

To install the hotpatch noted in the overview onto EKS cluster worker nodes AWS has developed an RPM that performs a JVM-level hotpatch which disables JNDI lookups from the log4j2 library. The Apache Log4j2 node agent is an open-source project built by the Kubernetes team at AWS. To learn more about how to install this node agent, please visit the this Github page.

Once identified, ECR container images will need to be updated to use the patched log4j version. Downstream, you will need to ensure that any containers built with a vulnerable ECR container image are updated to use the new image as soon as possible. This can vary depending on the service you are using to deploy these images. For example, if you are using Amazon Elastic Container Service (Amazon ECS), you might want to update the service to force a new deployment, which will pull down the image using the new log4j version. Check the documentation that supports the method you use to deploy containers.

If you’re running Java-based applications on Windows containers, follow Microsoft’s guidance here.

We recommend you vend new application credentials and revoke existing credentials immediately after patching.

Mitigation strategies if you can’t upgrade

In case you either can’t upgrade to a patched version, which disables access to JDNI by default, or if you are still determining your strategy for how you are going to patch your environment, you can mitigate this vulnerability by changing your log4j configuration. To implement this mitigation in releases >=2.10, you will need to remove the JndiLookup class from the classpath: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class.

For a more comprehensive list about mitigation steps for specific versions, refer to the Apache website.

Conclusion

In this blog post, we outlined key AWS security services that enable you to adopt a layered approach to help protect against, detect, and respond to your risk from the log4j vulnerability. We urge you to continue to monitor our security bulletins; we will continue updating our bulletins with our remediation efforts for our side of the shared-responsibility model.

Given the criticality of this vulnerability, we urge you to pay close attention to the vulnerability, and appropriately prioritize implementing the controls highlighted in this blog.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Correlate security findings with AWS Security Hub and Amazon EventBridge

2021-10-25 Marshall Jones

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/correlate-security-findings-with-aws-security-hub-and-amazon-eventbridge/

In this blog post, we’ll walk you through deploying a solution to correlate specific AWS Security Hub findings from multiple AWS services that are related to a single AWS resource, which indicates an increased possibility that a security incident has happened.

AWS Security Hub ingests findings from multiple AWS services, including Amazon GuardDuty, Amazon Inspector, Amazon Macie, AWS Firewall Manager, AWS Identity and Access Management (IAM) Access Analyzer, and AWS Systems Manager Patch Manager. Findings from each service are normalized into the AWS Security Finding Format (ASFF), so that you can review findings in a standardized format and take action quickly. You can use AWS Security Hub to provide a single view of all security-related findings, where you can set up alerting, automatic remediation, and ingestion into third-party incident management systems for specific findings.

Although Security Hub can ingest a vast number of integrations and findings, it cannot create correlation rules like a Security Information and Event Management (SIEM) tool can. However, you can create such rules using EventBridge. It’s important to take a closer look when multiple AWS security services generate findings for a single resource, because this potentially indicates elevated risk. Depending on your environment, the initial number of findings in AWS Security Hub findings may be high, so you may need to prioritize which findings require immediate action. AWS Security Hub natively gives you the ability to filter findings by resource, account, and many other details. With the solution in this post, when one of these correlated sets of findings is detected, a new finding is created and pushed to AWS Security Hub by using the Security Hub BatchImportFindings API operation. You can then respond to these new security incident-oriented findings through ticketing, chat, or incident management systems.

Prerequisites

This solution requires that you have AWS Security Hub enabled in your AWS account. In addition to AWS Security Hub, the following services must be enabled and integrated to AWS Security Hub:

Solution overview

In this solution, you will use a combination of AWS Security Hub, Amazon EventBridge, AWS Lambda, and Amazon DynamoDB to ingest and correlate specific findings that indicate a higher likelihood of a security incident. Each correlation is focused on multiple specific AWS security service findings for a single AWS resource.

The following list shows the correlated findings that are detected by this solution. The Description section for each finding correlation provides context for that correlation, the Remediation section provides general recommendations for remediation, and the Prevention/Detection section provides guidance to either prevent or detect one or more findings within the correlation. With the code provided, you can also add more correlations than those listed here by modifying the Cloud Development Kit (CDK) code and AWS Lambda code. The Solution workflow section breaks down the flow of the solution. If you choose to implement automatic remediation, each finding correlation will be created with the following AWS Security Hub Finding Format (ASFF) fields:

- Severity: CRITICAL
- ProductArn: arn:aws:securityhub:<REGION>:<AWS_ACCOUNT_ID>:product/<AWS_ACCOUNT_ID>/default

These correlated findings are created as part of this solution:

Any Amazon GuardDuty Backdoor findings and three critical common vulnerabilities and exposures (CVEs) from Amazon Inspector that are associated with the same Amazon Elastic Compute Cloud (Amazon EC2) instance.
- Description: Amazon Inspector has found at least three critical CVEs on the EC2 instance. CVEs indicate that the EC2 instance is currently vulnerable or exposed. The EC2 instance is also performing backdoor activities. The combination of these two findings is a stronger indication of an elevated security incident.
- Remediation: It’s recommended that you isolate the EC2 instance and follow standard protocol to triage the EC2 instance to verify if the instance has been compromised. If the instance has been compromised, follow your standard Incident Response process for post-instance compromise and forensics. Redeploy a backup of the EC2 instance by using an up-to-date hardened Amazon Machine Image (AMI) or apply all security-related patches to the redeployed EC2 instance.
- Prevention/Detection: To mitigate or prevent an Amazon EC2 instance from missing critical security updates, consider using Amazon Systems Manager Patch Manager to automate installing security-related patching for managed instances. Alternatively, you can provide developers up-to-date hardened Amazon Machine Images (AMI) by using Amazon EC2 Image Builder. For detection, you can set the AMI property called ‘DeprecationTime’ to indicate when the AMI will become outdated and respond accordingly.
An Amazon Macie sensitive data finding and an Amazon GuardDuty S3 exfiltration finding for the same Amazon Simple Storage Service (Amazon S3) bucket.
- Description: Amazon Macie has scanned an Amazon S3 bucket and found a possible match for sensitive data. Amazon GuardDuty has detected a possible exfiltration finding for the same Amazon S3 bucket. The combination of these findings indicates a higher risk security incident.
- Remediation: It’s recommended that you review the source IP and/or IAM principal that is making the S3 object reads against the S3 bucket. If the source IP and/or IAM principal is not authorized to access sensitive data within the S3 bucket, follow your standard Incident Response process for post-compromise plan for S3 exfiltration. For example, you can restrict an IAM principal’s permissions, revoke existing credentials or unauthorized sessions, restricting access via the Amazon S3 bucket policy, or using the Amazon S3 Block Public Access feature.
- Prevention/Detection: To mitigate or prevent exposure of sensitive data within Amazon S3, ensure the Amazon S3 buckets are using least-privilege bucket policies and are not publicly accessible. Alternatively, you can use the Amazon S3 Block Public Access feature. Review your AWS environment to make sure you are following Amazon S3 security best practices. For detection, you can use Amazon Config to track and auto-remediate Amazon S3 buckets that do not have logging and encryption enabled or publicly accessible.
AWS Security Hub detects an EC2 instance with a public IP and unrestricted VPC Security Group; Amazon GuardDuty unusual network traffic behavior finding; and Amazon GuardDuty brute force finding.
- Description: AWS Security Hub has detected an EC2 instance that has a public IP address attached and a VPC Security Group that allows traffic for ports outside of ports 80 and 443. Amazon GuardDuty has also determined that the EC2 instance has multiple brute force attempts and is communicating with a remote host on an unusual port that the EC2 instance has not previously used for network communication. The correlation of these lower-severity findings indicates a higher-severity security incident.
- Remediation: It’s recommended that you isolate the EC2 instance and follow standard protocol to triage the EC2 instance to verify if the instance has been compromised. If the instance has been compromised, follow your standard Incident Response process for post-instance compromise and forensics.
- Prevention/Detection: To mitigate or prevent these events from occurring within your AWS environment, determine whether the EC2 instance requires a public-facing IP address and review the VPC Security Group(s) has only the required rules configured. Review your AWS environment to make sure you are following Amazon EC2 best practices. For detection, consider implementing AWS Firewall Manager to continuously audit and limit VPC Security Groups.

The solution workflow, shown in Figure 1, is as follows:

Security Hub ingests findings from integrated AWS security services.
An EventBridge rule is invoked based on Security Hub findings in GuardDuty, Macie, Amazon Inspector, and Security Hub security standards.
The EventBridge rule invokes a Lambda function to store the Security Hub finding, which is passed via EventBridge, in a DynamoDB table for further analysis.
After the new findings are stored in DynamoDB, another Lambda function is invoked by using Dynamo StreamSets and a time-to-live (TTL) set to delete finding entries that are older than 30 days.
The second Lambda function looks at the resource associated with the new finding entry in the DynamoDB table. The Lambda function checks for specific Security Hub findings that are associated with the same resource.

Figure 1: Architecture diagram describing the flow of the solution

Solution deployment

You can deploy the solution through either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

To deploy the solution by using the AWS Management Console

In your account, launch the AWS CloudFormation template by choosing the following Launch Stack button. It will take approximately 10 minutes for the CloudFormation stack to complete.

To deploy the solution by using the AWS CDK

You can find the latest code in the aws-security GitHub repository where you can also contribute to the sample code. The following commands show how to deploy the solution by using the AWS CDK. First, the CDK initializes your environment and uploads the AWS Lambda assets to Amazon S3. Then, you can deploy the solution to your account. For <INSERT_AWS_ACCOUNT>, specify the account number, and for <INSERT_REGION>, specify the AWS Region that you want the solution deployed to.

cdk bootstrap aws://<INSERT_AWS_ACCOUNT>/<INSERT_REGION>

cdk deploy

Conclusion

In this blog post, we walked through a solution to use AWS services, including Amazon EventBridge, AWS Lambda, and Amazon DynamoDB, to correlate AWS Security Hub findings from multiple different AWS security services. The solution provides a framework to prioritize specific sets of findings that indicate a higher likelihood that a security incident has occurred, so that you can prioritize and improve your security response.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the AWS Security Hub forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.