Tag Archives: Detective

Improve your security investigations with Detective finding groups visualizations

Post Syndicated from Rich Vorwaller original https://aws.amazon.com/blogs/security/improve-your-security-investigations-with-detective-finding-groups-visualizations/

At AWS, we often hear from customers that they want expanded security coverage for the multiple services that they use on AWS. However, alert fatigue is a common challenge that customers face as we introduce new security protections. The challenge becomes how to operationalize, identify, and prioritize alerts that represent real risk.

In this post, we highlight recent enhancements to Amazon Detective finding groups visualizations. We show you how Detective automatically consolidates multiple security findings into a single security event—called finding groups—and how finding group visualizations help reduce noise and prioritize findings that present true risk. We incorporate additional services like Amazon GuardDuty, Amazon Inspector, and AWS Security Hub to highlight how effective findings groups is at consolidating findings for different AWS security services.

Overview of solution

This post uses several different services. The purpose is twofold: to show how you can enable these services for broader protection, and to show how Detective can help you investigate findings from multiple services without spending a lot of time sifting through logs or querying multiple data sources to find the root cause of a security event. These are the services and their use cases:

  • GuardDuty – a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity. If potential malicious activity, such as anomalous behavior, credential exfiltration, or command and control (C2) infrastructure communication is detected, GuardDuty generates detailed security findings that you can use for visibility and remediation. Recently, GuardDuty released the following threat detections for specific services that we’ll show you how to enable for this walkthrough: GuardDuty RDS Protection, EKS Runtime Monitoring, and Lambda Protection.
  • Amazon Inspector – an automated vulnerability management service that continually scans your AWS workloads for software vulnerabilities and unintended network exposure. Like GuardDuty, Amazon Inspector sends a finding for alerting and remediation when it detects a software vulnerability or a compute instance that’s publicly available.
  • Security Hub – a cloud security posture management service that performs automated, continuous security best practice checks against your AWS resources to help you identify misconfigurations, and aggregates your security findings from integrated AWS security services.
  • Detective – a security service that helps you investigate potential security issues. It does this by collecting log data from AWS CloudTrail, Amazon Virtual Private Cloud (Amazon VPC) flow logs, and other services. Detective then uses machine learning, statistical analysis, and graph theory to build a linked set of data called a security behavior graph that you can use to conduct faster and more efficient security investigations.

The following diagram shows how each service delivers findings along with log sources to Detective.

Figure 1: Amazon Detective log source diagram

Figure 1: Amazon Detective log source diagram

Enable the required services

If you’ve already enabled the services needed for this post—GuardDuty, Amazon Inspector, Security Hub, and Detective—skip to the next section. For instructions on how to enable these services, see the following resources:

Each of these services offers a free 30-day trial and provides estimates on charges after your trial expires. You can also use the AWS Pricing Calculator to get an estimate.

To enable the services across multiple accounts, consider using a delegated administrator account in AWS Organizations. With a delegated administrator account, you can automatically enable services for multiple accounts and manage settings for each account in your organization. You can view other accounts in the organization and add them as member accounts, making central management simpler. For instructions on how to enable the services with AWS Organizations, see the following resources:

Enable GuardDuty protections

The next step is to enable the latest detections in GuardDuty and learn how Detective can identify multiple threats that are related to a single security event.

If you’ve already enabled the different GuardDuty protection plans, skip to the next section. If you recently enabled GuardDuty, the protections plans are enabled by default, except for EKS Runtime Monitoring, which is a two-step process.

For the next steps, we use the delegated administrator account in GuardDuty to make sure that the protection plans are enabled for each AWS account. When you use GuardDuty (or Security Hub, Detective, and Inspector) with AWS Organizations, you can designate an account to be the delegated administrator. This is helpful so that you can configure these security services for multiple accounts at the same time. For instructions on how to enable a delegated administrator account for GuardDuty, see Managing GuardDuty accounts with AWS Organizations.

To enable EKS Protection

  1. Sign in to the GuardDuty console using the delegated administrator account, choose Protection plans, and then choose EKS Protection.
  2. In the Delegated administrator section, choose Edit and then choose Enable for each scope or protection. For this post, select EKS Audit Log Monitoring, EKS Runtime Monitoring, and Manage agent automatically, as shown in Figure 2. For more information on each feature, see the following resources:
  3. To enable these protections for current accounts, in the Active member accounts section, choose Edit and Enable for each scope of protection.
  4. To enable these protections for new accounts, in the New account default configuration section, choose Edit and Enable for each scope of protection.

To enable RDS Protection

The next step is to enable RDS Protection. GuardDuty RDS Protection works by analysing RDS login activity for potential threats to your Amazon Aurora databases (MySQL-Compatible Edition and Aurora PostgreSQL-Compatible Editions). Using this feature, you can identify potentially suspicious login behavior and then use Detective to investigate CloudTrail logs, VPC flow logs, and other useful information around those events.

  1. Navigate to the RDS Protection menu and under Delegated administrator (this account), select Enable and Confirm.
  2. In the Enabled for section, select Enable all if you want RDS Protection enabled on all of your accounts. If you want to select a specific account, choose Manage Accounts and then select the accounts for which you want to enable RDS Protection. With the accounts selected, choose Edit Protection Plans, RDS Login Activity, and Enable for X selected account.
  3. (Optional) For new accounts, turn on Auto-enable RDS Login Activity Monitoring for new member accounts as they join your organization.
Figure 2: Enable EKS Runtime Monitoring

Figure 2: Enable EKS Runtime Monitoring

To enable Lambda Protection

The final step is to enable Lambda Protection. Lambda Protection helps detect potential security threats during the invocation of AWS Lambda functions. By monitoring network activity logs, GuardDuty can generate findings when Lambda functions are involved with malicious activity, such as communicating with command and control servers.

  1. Navigate to the Lambda Protection menu and under Delegated administrator (this account), select Enable and Confirm.
  2. In the Enabled for section, select Enable all if you want Lambda Protection enabled on all of your accounts. If you want to select a specific account, choose Manage Accounts and select the accounts for which you want to enable RDS Protection. With the accounts selected, choose Edit Protection Plans, Lambda Network Activity Monitoring, and Enable for X selected account.
  3. (Optional) For new accounts, turn on Auto-enable Lambda Network Activity Monitoring for new member accounts as they join your organization.
Figure 4: Enable Lambda Network Activity Monitoring

Figure 4: Enable Lambda Network Activity Monitoring

Now that you’ve enabled these new protections, GuardDuty will start monitoring EKS audit logs, EKS runtime activity, RDS login activity, and Lambda network activity. If GuardDuty detects suspicious or malicious activity for these log sources or services, it will generate a finding for the activity, which you can review in the GuardDuty console. In addition, you can automatically forward these findings to Security Hub for consolidation, and to Detective for security investigation.

Detective data sources

If you have Security Hub and other AWS security services such as GuardDuty or Amazon Inspector enabled, findings from these services are forwarded to Security Hub. With the exception of sensitive data findings from Amazon Macie, you’re automatically opted in to other AWS service integrations when you enable Security Hub. For the full list of services that forward findings to Security Hub, see Available AWS service integrations.

With each service enabled and forwarding findings to Security Hub, the next step is to enable the data source in Detective called AWS security findings, which are the findings forwarded to Security Hub. Again, we’re going to use the delegated administrator account for these steps to make sure that AWS security findings are being ingested for your accounts.

To enable AWS security findings

  1. Sign in to the Detective console using the delegated administrator account and navigate to Settings and then General.
  2. Choose Optional source packages, Edit, select AWS security findings, and then choose Save.
    Figure 5: Enable AWS security findings

    Figure 5: Enable AWS security findings

When you enable Detective, it immediately starts creating a security behavior graph for AWS security findings to build a linked dataset between findings and entities, such as RDS login activity from Aurora databases, EKS runtime activity, and suspicious network activity for Lambda functions. For GuardDuty to detect potential threats that affect your database instances, it first needs to undertake a learning period of up to two weeks to establish a baseline of normal behavior. For more information, see How RDS Protection uses RDS login activity monitoring. For the other protections, after suspicious activity is detected, you can start to see findings in both GuardDuty and Security Hub consoles. This is where you can start using Detective to better understand which findings are connected and where to prioritize your investigations.

Detective behavior graph

As Detective ingests data from GuardDuty, Amazon Inspector, and Security Hub, as well as CloudTrail logs, VPC flow logs, and Amazon Elastic Kubernetes Service (Amazon EKS) audit logs, it builds a behavior graph database. Graph databases are purpose-built to store and navigate relationships. Relationships are first-class citizens in graph databases, which means that they’re not computed out-of-band or by interfering with relationships through querying foreign keys. Because Detective stores information on relationships in your graph database, you can effectively answer questions such as “are these security findings related?”. In Detective, you can use the search menu and profile panels to view these connections, but a quicker way to see this information is by using finding groups visualizations.

Finding groups visualizations

Finding groups extract additional information out of the behavior graph to highlight findings that are highly connected. Detective does this by running several machine learning algorithms across your behavior graph to identify related findings and then statically weighs the relationships between those findings and entities. The result is a finding group that shows GuardDuty and Amazon Inspector findings that are connected, along with entities like Amazon Elastic Compute Cloud (Amazon EC2) instances, AWS accounts, and AWS Identity and Access Management (IAM) roles and sessions that were impacted by these findings. With finding groups, you can more quickly understand the relationships between multiple findings and their causes because you don’t need to connect the dots on your own. Detective automatically does this and presents a visualization so that you can see the relationships between various entities and findings.

Enhanced visualizations

Recently, we released several enhancements to finding groups visualizations to aid your understanding of security connections and root causes. These enhancements include:

  • Dynamic legend – the legend now shows icons for entities that you have in the finding group instead of showing all available entities. This helps reduce noise to only those entities that are relevant to your investigation.
  • Aggregated evidence and finding icons – these icons provide a count of similar evidence and findings. Instead of seeing the same finding or evidence repeated multiple times, you’ll see one icon with a counter to help reduce noise.
  • More descriptive side panel information – when you choose a finding or entity, the side panel shows additional information, such as the service that identified the finding and the finding title, in addition to the finding type, to help you understand the action that invoked the finding.
  • Label titles – you can now turn on or off titles for entities and findings in the visualization so that you don’t have to choose each to get a summary of what the different icons mean.

To use the finding groups visualization

  1. Open the Detective console, choose Summary, and then choose View all finding groups.
  2. Choose the title of an available finding group and scroll down to Visualization.
  3. Under the Select layout menu, choose one of the layouts available, or choose and drag each icon to rearrange the layout according to how you’d like to see connections.
  4. For a complete list of involved entities and involved findings, scroll down below the visualization.

Figure 6 shows an example of how you can use finding groups visualization to help identify the root cause of findings quickly. In this example, an IAM role was connected to newly observed geolocations, multiple GuardDuty findings detected malicious API calls, and there were newly observed user agents from the IAM session. The visualization can give you high confidence that the IAM role is compromised. It also provides other entities that you can search against, such as the IP address, S3 bucket, or new user agents.

Figure 6: Finding groups visualization

Figure 6: Finding groups visualization

Now that you have the new GuardDuty protections enabled along with the data source of AWS security findings, you can use finding groups to more quickly visualize which IAM sessions have had multiple findings associated with unauthorized access, or which EC2 instances are publicly exposed with a software vulnerability and active GuardDuty finding—these patterns can help you determine if there is an actual risk.

Conclusion

In this blog post, you learned how to enable new GuardDuty protections and use Detective, finding groups, and visualizations to better identify, operationalize, and prioritize AWS security findings that represent real risk. We also highlighted the new enhancements to visualizations that can help reduce noise and provide summaries of detailed information to help reduce the time it takes to triage findings. If you’d like to see an investigation scenario using Detective, watch the video Amazon Detective Security Scenario Investigation.

If you have feedback about this post, submit comments in the Comments section below. You can also start a new thread on Amazon Detective re:Post or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Rich Vorwaller

Rich Vorwaller

Rich is a Principal Product Manager of Amazon Detective. He came to AWS with a passion for walking backwards from customer security problems. AWS is a great place for innovation, and Rich is excited to dive deep on how customers are using AWS to strengthen their security posture in the cloud. In his spare time, Rich loves to read, travel, and perform a little bit of amateur stand-up comedy.

Nicholas Doropoulos

Nicholas Doropoulos

Nicholas is an AWS Cloud Security Engineer, a bestselling Udemy instructor, and a subject matter expert in AWS Shield, Amazon GuardDuty, AWS IAM, and AWS Certificate Manager. Outside work, he enjoys spending time with his wife and their beautiful baby son.

How to investigate and take action on security issues in Amazon EKS clusters with Amazon Detective – Part 2

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/how-to-investigate-and-take-action-on-security-issues-in-amazon-eks-clusters-with-amazon-detective-part-2/

In part 1 of this of this two-part series, How to detect security issues in Amazon EKS cluster using Amazon GuardDuty, we walked through a real-world observed security issue in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster and saw how Amazon GuardDuty detected each phase by following MITRE ATT&CK tactics.

In this blog post, we’ll walk you through investigative techniques to use with Amazon Detective, paired with the GuardDuty EKS and malware findings from the security issue. After we have identified impacted resources through our investigation, we’ll provide example remediation tactics and preventative controls to address and help prevent security issues in EKS clusters.

Amazon Detective can help you investigate security issues and related resources in your account. Detective provides EKS coverage that you can enable within your accounts. When this coverage is enabled, Detective can help investigate and remediate potentially unauthorized EKS activity that results from misconfiguration of the control plane nodes or application. Although GuardDuty is not a prerequisite to enable Detective, it is recommended that you enable GuardDuty to enhance the visualization capabilities in Detective with GuardDuty findings.

Prerequisites

You must have the following services enabled in your AWS account to generate and investigate findings associated with EKS security events in a similar manner as outlined in this blog. If you do not have GuardDuty enabled, you can still investigate with Detective, but in a limited capacity.

Investigate with Amazon Detective

In the five phases we walked through in part 1, we discussed GuardDuty findings and MITRE ATT&CK tactics that can help you detect and understand each phase of the unauthorized activity, from the initial misconfiguration to the impact on our application when the EKS cluster is used for crypto mining.

The next recommended step is to investigate the EKS cluster and any associated resources. Amazon Detective can help you to investigate whether there was any other related unauthorized activity in the environment. We will walk through Detective capabilities for visualizing and gathering important information to effectively respond to the security issue. If you’re interested in creating detailed incident response playbooks for your security team to follow in your own environment, refer to these sample AWS incident response playbooks.

Depending on your scenario, there are various resources you can use to start your investigation, such as Security Hub findings, GuardDuty findings, related Kubernetes subjects, or an AWS account’s AWS CloudTrail activity. For our walkthrough, we’ll start our investigation from the GuardDuty finding and use the EKS cluster resource to pivot to the Detective console, as shown in Figure 7. Although we initially focus on the EKS cluster, you could start from any entities that are supported in the Detective behavior graph structure in the Amazon Detective User Guide. For example, we could start directly with the Kubernetes subject system:anonymous and find activity associated with the anonymous user.

Figure 7: Example Detective popup from GuardDuty finding for EKS cluster

Figure 7: Example Detective popup from GuardDuty finding for EKS cluster

We’ll now go over the information that you would need to gather from Detective in order to investigate the example security issue.

To investigate EKS cluster findings with Detective

  1. In the GuardDuty console, navigate to an individual finding and hover over Investigate with Detective. Choose one of the specific resources to start. In the image below, we selected the EKS cluster resource to investigate with Detective. You will need to gather some preliminary information about the IAM roles associated with the EKS cluster.
    • Questions: When was the cluster created? What IAM role created the cluster? What IAM role is assigned to the cluster?
    • Why it matters: If you are an incident responder, these details can potentially help you identify the owner of the cluster and help you determine what IAM principals are involved.
    • What next: Start looking into each IAM principal’s activity, as seen in CloudTrail, to investigate whether the IAM entity itself is potentially compromised or what other resources may have been impacted.
    Figure 8: Detective summary page for EKS cluster metadata details

    Figure 8: Detective summary page for EKS cluster metadata details

  2. Next, on the EKS cluster overview page, you can see the container details associated with the cluster.
    • Question: What are some of the other container details for the cluster? Does anything look out of the ordinary? Is it using a public image? Is it missing a network policy?
    • Why it matters: Based on the architecture related to this cluster, you might be able to use this information to determine whether there are unauthorized containers. The contents of unauthorized containers will depend on your organization but typically consist of public images or unauthorized RBAC, pod security policies, or network policy configurations. It’s important to keep in mind that when you look at data in Detective, the scope time is very important. When you pivot from a GuardDuty finding, the scope time will be set to the first time the GuardDuty finding was seen to the last time the finding was seen. The container details reflect the containers that were running during the selected scope time. Changing the scope time might change the containers that are listed in the table shown in Figure 9.
    • What next: Information found on this page can help to highlight unauthorized resources or configurations that will need to be remediated. You will also need to look at how these resources were initially created and if there are missing guardrails that should have been created during the provisioning of the cluster.
    Figure 9: Detective summary page for EKS container metadata details

    Figure 9: Detective summary page for EKS container metadata details

  3. Finally, you will see associated security findings with this specific EKS cluster, similar to Figure 10, at the bottom of the EKS cluster overview page in Detective.
    • Question: Are there any other security findings associated with this cluster that I previously was not aware of?
    • Why it matters: In our example scenario, we walked through the findings that were initially detected and the events that unfolded from those findings. After further investigation, you might see other findings that were not part of the original investigation. This can occur if your security team is only investigating specific findings or severity values. The finding for PrivilegeEscalation:Kubernetes/PrivilegedContainer informs you that a privileged container was launched on your Kubernetes cluster by using an image that has never before been used to launch privileged containers in your cluster. A privileged container has root level access to the host. The other finding, Persistence:Kubernetes/ContainerWithSensitiveMount, informs you that a container was launched with a configuration that included a sensitive host path with write access in the volumeMounts section. This makes the sensitive host path accessible and writable from inside the container. Any finding associated to the suspicious or compromised cluster is valuable because it provides additional insight into what the unauthorized entity was trying to accomplish after the initial detection.
    • What next: With Detective, you might want to continue your investigation by selecting each of these findings and reviewing all details related to the finding. Depending on the findings, you could bring in additional team members to help investigate further. For this example, we will move on to the next step.
    Figure 10: Example Detective summary of security findings associated with the EKS cluster

    Figure 10: Example Detective summary of security findings associated with the EKS cluster

  4. Shift from the EKS cluster overview section to the Kubernetes API activity section, similar to Figure 11 below. This will give you the opportunity to dig into the API activity associated with this cluster.
    1. Question: What other Kubernetes API activity was attempted from the cluster? Which API calls were successful? Which API calls failed? What was the unauthorized user trying to do?
    2. Why it matters: It’s important to determine which actions were successfully invoked by the unauthorized user so that appropriate remediation actions can be taken. You can look at trends of successful and failed API calls, and can even search by Subject, IP address, or Kubernetes API call.
    3. What next: You might want to look at all cluster role binding from days before the first GuardDuty finding was seen to determine if there was any other suspicious activity you should be investigating regarding the cluster.
    Figure 11: Example Detective summary page for Kubernetes API activity on the EKS cluster

    Figure 11: Example Detective summary page for Kubernetes API activity on the EKS cluster

  5. Next, you will want to look at the Newly observed Kubernetes API calls section, similar to Figure 12 below.
    • Question: What are some of the more recent Kubernetes API calls? What are they trying to access right now and are they successful? Do I need to start taking action for other resources outside of EKS?
    • Why it matters: This data shows Kubernetes subjects who were observed issuing API calls to this cluster for the first time during our scope time. Detective provides you this information by keeping a baseline of the activity associated with supported AWS resources. This can help you more quickly determine whether activity might be suspicious and worth looking into. In our example, we used the search functionality to look at API calls associated with the built-in Kubernetes secrets management. A common way to start your search is to see if an unauthorized user has successfully accessed any secrets, which can help you determine what information you might want to search in the overall API call volume section discussed in step 4.
    • What next: If the unauthorized user has successfully accessed any secret, those secrets should be marked as compromised, and they should be rotated immediately.
    Figure 12: Example Detective summary for newly observed Kubernetes API calls from the EKS cluster

    Figure 12: Example Detective summary for newly observed Kubernetes API calls from the EKS cluster

  6. You can also consider the following question when you look at the Newly observed Kubernetes API calls section.
    • Question: Has the IP address associated with the finding been communicating with any other resources in our environment, and if so, what are the details of that communication?
    • Why it matters: To answer this question, you can use Detective’s search functionality and the ability to use wild cards to search for IP addresses with the same first three octets. Also note that you can use CIDR notation to search, as well. Based on the results in the example in Figure 13, you can see that there are a number of related IP addresses associated with the environment. With this information, you now can look at the traffic associated with these different IPs and what resources they were communicating with.
    Figure 13: Example Detective results page from a query against IP addresses associated with the EKS cluster

    Figure 13: Example Detective results page from a query against IP addresses associated with the EKS cluster

  7. You can select one of the IP addresses in the search results to get more information related to it, similar to Figure 14 below.
    1. Question: What was the first time an IP address was observed in the environment? When was the last time it was observed?
    2. Why it matters: You can use this information to start isolating where unauthorized activity is coming from and what actions are being taken. You can also start creating a time series of unauthorized activity and scope.
    3. What next: You can repeat some of the previous investigation steps for each IP address, like looking at the different tabs to review New behavior, Resource interaction, and Kubernetes activity.
    Figure 14: Example Detective results page for specific IP address and associated metadata details

    Figure 14: Example Detective results page for specific IP address and associated metadata details

In summary, we began our investigation with a GuardDuty finding about an anonymous API request that was successful in using system:anonymous on one of our EKS clusters. We then used Detective to investigate and visualize activity associated with that EKS cluster, such as volume of successful or unsuccessful API requests, where and when those actions were attempted and other security findings associated with the resource. Once we have completed the investigation, we can confirm scope and impact of the security event and start moving towards taking action.

Remediation techniques for Amazon EKS

In this section, we will focus on how to remediate the security issue in our example. Your actions will vary based on your organization and the resources affected. It’s important to note that these actions will impact the EKS cluster and associated workloads, and should accordingly be performed by or coordinated with the cluster operator.

Before you take action on the EKS cluster, you will need to preserve forensic artifacts and evidence for the impacted EKS resources. The order of operations for these actions matters, because you want to get all the data from forensic artifacts in order to determine the overall impact to the resources affected. If you quarantine resources before you capture forensic artifacts, there is a risk that running processes will be interrupted or that the malware attempts to destroy resources that are valuable to a forensics investigation, to cover its tracks.

To preserve forensic evidence

  1. Enable termination protection on the impacted worker node and change the shutdown behavior to Stop.
  2. Label the offending pod or node with a label indicating that it is part of an active investigation.
  3. Cordon the worker node.
  4. Capture both volatile (temporary memory) and non-volatile (Amazon EBS snapshots) artifacts on the worker node.

Now that you have the forensic evidence, you can start to quarantine your EKS resources to restrict unauthorized network communication. The main objective is to prevent the affected EKS pods from communicating with internal resources or exfiltrating data externally.

To quarantine EKS resources

  1. Isolate the pod by creating a network policy that denies ingress and egress traffic to the pod.
  2. Attach a security group to the host and remove inbound and outbound rules. Take this action if you believe the underlying host has been compromised.

    Depending on existing inbound and outbound rules on the security group, the connections will either be tracked or untracked. Applying an isolation security group will drop untracked connections. For tracked connections, new connections with the host will not be allowed from the isolation security group, but existing tracked connections will not be interrupted.

    Important: This action will affect all containers running on the host.

  3. Attach a deny rule for the EKS resources in a network access control list (network ACL). Because network ACLs are stateless firewalls, all connections will be interrupted, whether they are tracked or untracked connections.

    Important: This action will affect all subnets using the network ACL and all resources within those subnets.

At this point, the affected EKS resources are quarantined, but the cluster is still configured to allow anonymous, unauthenticated access. You will need to remove all unauthorized permissions that were created or added.

To remove unauthorized permissions

  1. Update the RBAC configuration to remove system:anonymous access.
  2. Revoke temporary security credentials that are assigned to the pod or worker node, if necessary. You can also remove the IAM role associated with the EKS resources.

    Note: Removing IAM policies or attaching IAM policies to restrict permissions will affect the resources that are using the IAM role.

  3. Remove any unauthorized ClusterRoleBinding created by the system:anonymous user.
  4. Redeploy the compromised pod or workload resource.

The actions taken so far primarily target the EKS resource, but based on our Detective investigation, there are other actions you might need to take. Because secrets were involved that could be used outside of the EKS cluster, those secrets will need to be rotated wherever they are referenced. Detective will also suggest additional areas where you can investigate and remediate additional unauthorized activity in your AWS account.

It is important that your team go through game days or run-throughs for investigating and responding to different scenarios in order to make sure the team is prepared. You can run through the EKS security workshop to get your security team more familiar with remediation for EKS.

For more information about responding to EKS cluster related security issues, refer to GuardDuty EKS remediation in the GuardDuty User Guide and the EKS Best Practices Guide.

Preventative controls for EKS

This section covers several preventative controls that you can use to protect EKS clusters.

How can I prevent external access to the EKS cluster?

To help prevent external access to your EKS clusters, limit the exposure of your API server. You can achieve that in two ways:

  1. Set the API server endpoint access to Private. This will effectively forbid anyone outside of the VPC to send Kubernetes API requests to your EKS cluster.
  2. Set an IP address allow list for the EKS cluster public access endpoint.

How can I prevent giving admin access to the EKS cluster?

To help prevent an EKS cluster user from granting any type of access to anonymous or unauthenticated users, you can set up a ValidatingAdmissionWebhook. This is a special type of Kubernetes admission controller that can be configured in the Kubernetes API. (To learn how to build serverless admission webhooks, see the blog post Building serverless admission webhooks for Kubernetes with AWS SAM.)

The ValidatingAdmissionWebhook will deny a Kubernetes API request that matches all of the following checks:

  1. The request is creating or modifying a ClusterRoleBinding or RoleBinding.
  2. The subjects section contains either of the following:
    • The user system:anonymous
    • The group system:unauthenticated

How can I prevent malicious images from being deployed?

Now that you have set controls to prevent external access to the EKS cluster and prevent granting access to anonymous users, you can focus on preventing the deployment of potentially malicious images.

Malicious container images can have different origins, including:

  1. Images stored in public or unauthorized registries
  2. Images replacing the ones that are stored in authorized registries
  3. Authorized images that contain software with existing or newly discovered vulnerabilities

You can address these sources of malicious images by doing the following:

  1. Use admission controllers to verify that images meet your organization’s requirements, including for the image origin. You can also refer to this this blog post to implement a solution with a webhook and admission controllers.
  2. Enable tag immutability in your registry, a control that prevents an actor from maliciously replacing container images without changing the image’s tags. Additionally, you can enable an AWS Config rule to check tag immutability
  3. Configure another ValidatingAdmissionWebhook that will only accept images if they meet all of the following criteria.
    1. Images that come from approved registries.
    2. Images that pass the vulnerability scan during deployment time.
    3. Images that are signed by a trusted party. Amazon Elastic Container Registry (Amazon ECR) is working on a product enhancement to store image signatures. Currently, you can use an open-source cosign tool to verify and store image signatures.

      Note: These criteria can vary based on your use case and internal security and compliance standards.

The above controls will help prevent the deployment of a vulnerable, unauthorized, or potentially malicious container image.

How can I prevent lateral movement inside the cluster?

To prevent lateral movement inside the cluster, it is recommended to use network policies, as follows:

  • Enforce Kubernetes network policies to enforce ingress and egress controls within the cluster. You can implement these policies by following the steps in the Securing your cluster with network policies EKS workshop.

It’s important to note that you could use security groups for the same purpose, but pod security groups should only be used if the cluster is compromised and when you want to control the traffic between a pod and a resource that resides in the VPC, not inter-pod traffic.

In this section, we’ve reviewed different preventative controls that could have helped mitigate our example security incident. With the first preventative control, we could have prevented external actors from connecting to the API server. The second control could have prevented granting access to anonymous users. The third control could have prevented the deployment of an unauthorized or vulnerable container image. Finally, the fourth control could have helped limit the impact of the deployed vulnerable images to only the pods where the images were deployed, making it harder to laterally move to other pods in the cluster.

Conclusion

In this post, we walked you through how to investigate an EKS cluster related security issue with Amazon Detective. We also provided some recommended remediation and preventative controls to put in place for the EKS cluster specific security issues. When pairing GuardDuty’s ability for continuous threat detection and monitoring with Detective’s organization and visualization capabilities, you enable your security team to conduct faster and more effective investigation. By providing the security team the ability quickly view an organized set of data associated with security events within your AWS account, you reduce the overall Mean Time to Respond (MTTR).

Now that you understand the investigative capabilities with Detective, it’s time to try things out! It is important that you provide a mechanism for your security team to practice detection, investigation, and remediation techniques using security incident response simulations. By periodically running simulations, your security team will be prepared to quickly respond to possible security events. You can find more detailed incident response playbooks that can assist you in preparing for events in your environment, see these sample AWS incident response playbooks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a thread on Amazon GuardDuty re:Post.

Want more AWS Security news? Follow us on Twitter.

Author

Marshall Jones

Marshall is a worldwide senior security specialist solutions architect at AWS. His background is in AWS consulting and security architecture, focused on a variety of security domains including edge, threat detection, and compliance. Today, he helps enterprise customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

Jonathan Nguyen

Jonathan Nguyen

Jonathan is a shared delivery team senior security consultant at AWS. His background is in AWS security, with a focus on threat detection and incident response. He helps enterprise customers develop a comprehensive AWS security strategy, deploy security solutions at scale, and train customers on AWS security best practices.

Manuel Martinez Arizmendi

Manuel Martinez Arizmendi

Manuel works a Security Engineer at Amazon Detective providing new security investigation capabilities to AWS customers. Based on Boston,MA and originally from Madrid, Spain, when he’s not at work, he enjoys playing and watching soccer, playing videogames, and hanging out with his friends.

How to detect security issues in Amazon EKS clusters using Amazon GuardDuty – Part 1

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/how-to-detect-security-issues-in-amazon-eks-clusters-using-amazon-guardduty-part-1/

In this two-part blog post, we’ll discuss how to detect and investigate security issues in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster with Amazon GuardDuty and Amazon Detective.

Amazon Elastic Kubernetes Service (Amazon EKS) is a managed service that you can use to run and scale container workloads by using Kubernetes in the AWS Cloud, which can help increase the speed of deployment and portability of modern applications. Amazon EKS provides secure, managed Kubernetes clusters on the AWS control plane by default. Kubernetes configurations such as pod security policies, runtime security, and network policies and configurations are specific for your organization’s use-case and securing them adequately would be a customer’s responsibility within AWS’ shared responsibility model.

Amazon GuardDuty can help you continuously monitor and detect suspicious activity related to AWS resources in your account. GuardDuty for EKS protection is a feature that you can enable within your accounts. When this feature is enabled, GuardDuty can help detect potentially unauthorized EKS activity resulting from misconfiguration of the control plane nodes or application.

In this post, we’ll walk through the events leading up to a real-world security issue that occurred due to EKS cluster misconfiguration, discuss how those misconfigurations could be used by a malicious actor, and how Amazon GuardDuty monitors and identifies suspicious activity throughout the EKS security event. In part 2 of the post, we’ll cover Amazon Detective investigation capabilities, possible remediation techniques, and preventative controls for EKS cluster related security issues.

Prerequisites

You must have AWS GuardDuty enabled in your AWS account in order to monitor and generate findings associated with an EKS cluster related security issue in your environment.

EKS security issue walkthrough

Before jumping into the security issue, it is important to understand how the AWS shared responsibility model applies to the Amazon EKS managed service. AWS is responsible for the EKS managed Kubernetes control plane and the infrastructure to deliver EKS in a secure and reliable manner. You have the ability to configure EKS and how it interacts with other applications and services, where you are responsible for making sure that secure configurations are being used.

The following scenario is based on a real-world observed event, where a malicious actor used Kubernetes compromise tactics and techniques to expose and access an EKS cluster. We use this example to show how you can use AWS security services to identify and investigate each step of this security event. For a security event in your own environment, the order of operations and the investigative and remediation techniques used might be different. The scenario is broken down into the following phases and associated MITRE ATT&CK tactics:

  • Phase 1 – EKS cluster misconfiguration
  • Phase 2 (Discovery) – Discovery of vulnerable EKS clusters
  • Phase 3 (Initial Access) – Credential access to obtain Kubernetes secrets
  • Phase 4 (Persistence) – Impact to persist unauthorized access to the cluster
  • Phase 5 (Impact) – Impact to manipulate resources for unauthorized activity

Phase 1 – EKS cluster misconfiguration

By default, when you provision an EKS cluster, the API cluster endpoint is set to public, meaning that it can be accessed from the internet. Despite being accessible from the internet, the endpoint is still considered secure because it requires all API requests to be authenticated by AWS Identity and Access Management (IAM) and then authorized by Kubernetes role-based access control (RBAC). Also, the entity (user or role) that creates the EKS cluster is automatically granted system:masters permissions, which allows the entity to modify the EKS cluster’s RBAC configuration.

This example scenario starts with a developer who has access to administer EKS clusters in an AWS account. The developer wants to work from their home network and doesn’t want to connect to their enterprise VPN for IAM role federation. They configure an EKS cluster API without setting up the proper authentication and authorization components. Instead, the developer grants explicit access to the system:anonymous user in the cluster’s RBAC configuration. (Alternatively, an unauthorized RBAC configuration could be introduced into your environment after a developer unknowingly installs a malicious helm chart from the internet without reviewing or inspecting it first.)

In Kubernetes anonymous requests, unauthenticated and unrejected HTTP requests are treated as anonymous access and are identified as a system:anonymous user belonging to a system:unauthenticated group. This means that any entity on the internet can access the cluster and make API requests that are permitted by the role. There aren’t many legitimate use cases for this type of activity, because it’s considered a best practice to use RBAC instead. Anonymous requests are primarily used for setting up health endpoints and custom authentication.

By monitoring EKS audit logs, GuardDuty identifies this activity and generates the finding Policy:Kubernetes/AnonymousAccessGranted, as shown in Figure 1. This finding informs you that a user on your Kubernetes cluster successfully created a ClusterRoleBinding or RoleBinding to bind the user system:anonymous to a role. This action enables unauthenticated access to the API operations permitted by the role.

Figure 1: Example GuardDuty finding for Kubernetes anonymous access granted

Figure 1: Example GuardDuty finding for Kubernetes anonymous access granted

Phase 2 (Discovery) – Discovery of vulnerable EKS clusters

Port scanning is a method that malicious actors use to determine if resources are publicly exposed, with open ports and known vulnerabilities. As an increasing number of open-source tools allows users to search for endpoints connected to the internet, finding these endpoints has become even easier. Security teams can use these open-source tools to their advantage by proactively scanning for and identifying externally exposed resources in their organization.

This brings us to the discovery phase of our misconfigured EKS cluster. The discovery phase is defined by MITRE as follows: “Discovery consists of techniques an adversary may use to gain knowledge about the system and internal network. These techniques help adversaries observe the environment and orient themselves before deciding how to act.”

By granting system:anonymous access to the EKS cluster in our example, the developer allowed requests from any public unauthenticated source. This can result in external web crawlers probing the cluster API, which can often happen within seconds of the system:anonymous access being granted. GuardDuty identifies this activity and generates the finding Discovery:Kubernetes/SuccessfulAnonymousAccess, as shown in Figure 2. This finding informs you that an API operation to discover resources in a cluster was successfully invoked by the system:anonymous user. Remember, all API calls made by system:anonymous are unauthenticated, in addition to /healthz and /version calls that are always unauthenticated regardless of the user identity, and any entity can make use of this user within the EKS cluster.

In the screenshot, under the Action section in the finding details, you can see that the anonymous user made a get request to “/”. This is a generic request that is not specific to a Kubernetes cluster, which may indicate that the crawler is not specifically targeting Kubernetes clusters. You can further see that the Status code is 200, indicating that the request was successful. If this activity is malicious, then the actor is now aware that there is an exposed resource.

Figure 2: Example GuardDuty finding for Kubernetes successful anonymous access

Figure 2: Example GuardDuty finding for Kubernetes successful anonymous access

Phase 3 (Initial Access) – Credential access to obtain Kubernetes secrets

Next, in this phase, you might start observing more targeted API calls for establishing initial access from unauthorized users. MITRE defines initial access as “techniques that use various entry vectors to gain their initial foothold within a network. Techniques used to gain a foothold include targeted spearphishing and exploiting weaknesses on public-facing web servers. Footholds gained through initial access may allow for continued access, like valid accounts and use of external remote services, or may be limited-use due to changing passwords.”

In our example, the malicious actor has established initial access for the EKS cluster which is evident in the next GuardDuty finding, CredentialAccess:Kubernetes/SuccessfulAnonymousAccess, as shown in Figure 3. This finding informs you that an API call to access credentials or secrets was successfully invoked by the system:anonymous user. The observed API call is commonly associated with the credential access tactic where an adversary is attempting to collect passwords, usernames, and access keys for a Kubernetes cluster.

You can see that in this GuardDuty finding, in the Action section, the Request uri is targeted at a Kubernetes cluster, specifically /api/v1/namespaces/kube-system/secrets. This request seems to be targeting the secrets management capabilities that are built into Kubernetes. You can find more information about this secrets management capability in the Kubernetes documentation.

Figure 3: Example GuardDuty finding for Kubernetes successful credential access from anonymous user

Figure 3: Example GuardDuty finding for Kubernetes successful credential access from anonymous user

Phase 4 (Persistence) – Impact to persist unauthorized access to the cluster

The next phase of this scenario is likely to be an impact in the EKS cluster to enable persistence by the malicious actor. MITRE defines impact as “techniques that adversaries use to disrupt availability or compromise integrity by manipulating business and operational processes.” Following the MITRE definitions, “Persistence consists of techniques that adversaries use to keep access to systems across restarts, changed credentials, and other interruptions that could cut off their access. Techniques used for persistence include any access, action, or configuration changes that let them maintain their foothold on systems, such as replacing or hijacking legitimate code or adding startup code.”

In the GuardDuty finding Impact:Kubernetes/SuccessfulAnonymousAccess, shown in Figure 4, you can see the Kubernetes user details and Action sections that indicate that a successful Kubernetes API call was made to create a ClusterRoleBinding by the system:anonymous username. This finding informs you that a write API operation to tamper with resources was successfully invoked by the system:anonymous user. The observed API call is commonly associated with the impact stage of an attack, when an adversary is tampering with resources in your cluster. This activity shows that the system:anonymous user has now created their own role to enable persistent access the EKS cluster. If the user is malicious, they can now access the cluster even if access is removed in the RBAC configuration for the system:anonymous user.

Figure 4 Example GuardDuty finding for Kubernetes successful credential change by anonymous user

Figure 4 Example GuardDuty finding for Kubernetes successful credential change by anonymous user

Phase 5 (Impact) – Impact to manipulate resources for unauthorized activity

The fifth phase of this scenario is where the unauthorized user is likely to focus on impact techniques in order to use the access for malicious purpose. MITRE says of the impact phase: “Techniques used for impact can include destroying or tampering with data. In some cases, business processes can look fine, but may have been altered to benefit the adversaries’ goals. These techniques might be used by adversaries to follow through on their end goal or to provide cover for a confidentiality breach.” Typically, once a malicious actor has access into a system, they will introduce malware to the system to manipulate the compromised resource and possibly also other resources.

With the introduction of GuardDuty Malware Protection, when an Amazon Elastic Compute Cloud (Amazon EC2) or container-related GuardDuty finding that indicates potentially suspicious activity is generated, an agentless scan on the volumes will initiate and detect the presence of malware. Existing GuardDuty customers need to enable Malware Protection, and for new customers this feature is on by default when they enable GuardDuty for the first time. Malware Protection comes with a 30-day free trial for both existing and new GuardDuty customers. You can see a list of findings that initiates a malware scan in the GuardDuty User Guide.

In this example, the malicious actor now uses access to the cluster to perform unauthorized cryptocurrency mining. GuardDuty monitors the DNS requests from the EC2 instances used to host the EKS cluster. This allows GuardDuty to identify a DNS request made to a domain name associated with a cryptocurrency mining pool, and generate the finding CryptoCurrency:EC2/BitcoinTool.B!DNS, as shown in Figure 5.

Figure 5: Example GuardDuty finding for EC2 instance querying bitcoin domain name

Figure 5: Example GuardDuty finding for EC2 instance querying bitcoin domain name

Because this is an EC2 related GuardDuty finding and GuardDuty Malware Protection is enabled in the account, GuardDuty then conducts an agentless scan on the volumes of the EC2 instance to detect malware. If the scan results in a successful detection of one or more malicious files, another GuardDuty finding for Execution:EC2/MaliciousFile is generated, as shown in Figure 6.

Figure 6: Example GuardDuty finding for detection of a malicious file on EC2

Figure 6: Example GuardDuty finding for detection of a malicious file on EC2

The first GuardDuty finding detects crypto mining activity, while the proceeding malware protection finding provides context on the malware associated with this activity. This context is very valuable for the incident response process.

Conclusion

In this post, we walked you through each of the five phases where we outlined how an initial misconfiguration could result in a malicious actor gaining control of EKS resources within an AWS account and how GuardDuty is able to continually monitor and detect the progression of the security event. As previously stated, this is just one example where a misconfiguration in an EKS cluster could result in a security event.

Now that you have a good understanding of GuardDuty capabilities to continuously monitor and detect EKS security events, you will need to establish processes and procedures to enable your security team to investigate these events. You can enable Amazon Detective to help accelerate your security team’s mean time to respond (MTTR) by providing an efficient mechanism to analyze, investigate, and identify the root cause of security events. Follow along in part 2 of this series, How to investigate and take action on an Amazon EKS cluster related security issue with Amazon Detective, where we’ll cover techniques you can use with Amazon Detective to identify impacted EKS resources in your AWS account, possible remediation actions to take on the cluster, and preventative controls you can implement.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a thread on Amazon GuardDuty re:Post.

Want more AWS Security news? Follow us on Twitter.

Author

Marshall Jones

Marshall is a worldwide senior security specialist solutions architect at AWS. His background is in AWS consulting and security architecture, focused on a variety of security domains including edge, threat detection, and compliance. Today, he helps enterprise customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

Jonathan Nguyen

Jonathan Nguyen

Jonathan is a shared delivery team senior security consultant at AWS. His background is in AWS security, with a focus on threat detection and incident response. He helps enterprise customers develop a comprehensive AWS security strategy, deploy security solutions at scale, and train customers on AWS security best practices.

Manuel Martinez Arizmendi

Manuel Martinez Arizmendi

Manuel works a Security Engineer at Amazon Detective providing new security investigation capabilities to AWS customers. Based on Boston,MA and originally from Madrid, Spain, when he’s not at work, he enjoys playing and watching soccer, playing videogames, and hanging out with his friends.