Tag Archives: SecOps

Evolving cyber threats demand new security approaches – The benefits of a unified and global IT/OT SOC

2023-10-30 Stuart Gregg

Post Syndicated from Stuart Gregg original https://aws.amazon.com/blogs/security/evolving-cyber-threats-demand-new-security-approaches-the-benefits-of-a-unified-and-global-it-ot-soc/

In this blog post, we discuss some of the benefits and considerations organizations should think through when looking at a unified and global information technology and operational technology (IT/OT) security operations center (SOC). Although this post focuses on the IT/OT convergence within the SOC, you can use the concepts and ideas discussed here when thinking about other environments such as hybrid and multi-cloud, Industrial Internet of Things (IIoT), and so on.

The scope of assets has vastly expanded as organizations transition to remote work, and from increased interconnectivity through the Internet of Things (IoT) and edge devices coming online from around the globe, such as cyber physical systems. For many organizations, the IT and OT SOCs were separate, but there is a strong argument for convergence, which provides better context for the business outcomes of being able to respond to unexpected activity. In the ten security golden rules for IIoT solutions, AWS recommends deploying security audit and monitoring mechanisms across OT and IIoT environments, collecting security logs, and analyzing them using security information and event management (SIEM) tools within a SOC. SOCs are used to monitor, detect, and respond; this has traditionally been done separately for each environment. In this blog post, we explore the benefits and potential trade-offs of the convergence of these environments for the SOC. Although organizations should carefully consider the points raised throughout this blog post, the benefits of a unified SOC outweigh the potential trade-offs—visibility into the full threat chain propagating from one environment to another is critical for organizations as daily operations become more connected across IT and OT.

Traditional IT SOC

Traditionally, the SOC was responsible for security monitoring, analysis, and incident management of the entire IT environment within an organization—whether on-premises or in a hybrid architecture. This traditional approach has worked well for many years and ensures the SOC has the visibility to effectively protect the IT environment from evolving threats.

Note: Organizations should be aware of the considerations for security operations in the cloud which are discussed in this blog post.

Traditional OT SOC

Traditionally, OT, IT, and cloud teams have worked on separate sides of the air gap as described in the Purdue model. This can result in siloed OT, IIoT, and cloud security monitoring solutions, creating potential gaps in coverage or missing context that could otherwise have improved the response capability. To realize the full benefits of IT/OT convergence, IIoT, IT and OT must collaborate effectively to provide a broad perspective and the most effective defense. The convergence trend applies to newly connected devices and to how security and operations work together.

As organizations explore how industrial digital transformation can give them a competitive advantage, they’re using IoT, cloud computing, artificial intelligence and machine learning (AI/ML), and other digital technologies. This increases the potential threat surface that organizations must protect and requires a broad, integrated, and automated defense-in-depth security approach delivered through a unified and global SOC.

Without full visibility and control of traffic entering and exiting OT networks, the operations function might not be able to get full context or information that can be used to identify unexpected events. If a control system or connected assets such as programmable logic controllers (PLCs), operator workstations, or safety systems are compromised, threat actors could damage critical infrastructure and services or compromise data in IT systems. Even in cases where the OT system isn’t directly impacted, the secondary impacts can result in OT networks being shut down due to safety concerns over the ability to operate and monitor OT networks.

The SOC helps improve security and compliance by consolidating key security personnel and event data in a centralized location. Building a SOC is significant because it requires a substantial upfront and ongoing investment in people, processes, and technology. However, the value of an improved security posture is of great consideration compared to the costs.

In many OT organizations, operators and engineering teams may not be used to focusing on security; in some cases, organizations set up an OT SOC that’s independent from their IT SOC. Many of the capabilities, strategies, and technologies developed for enterprise and IT SOCs apply directly to the OT environment, such as security operations (SecOps) and standard operating procedures (SOPs). While there are clearly OT-specific considerations, the SOC model is a good starting point for a converged IT/OT cybersecurity approach. In addition, technologies such as a SIEM can help OT organizations monitor their environment with less effort and time to deliver maximum return on investment. For example, by bringing IT and OT security data into a SIEM, IT and OT stakeholders share access to the information needed to complete security work.

Benefits of a unified SOC

A unified SOC offers numerous benefits for organizations. It provides broad visibility across the entire IT and OT environments, enabling coordinated threat detection, faster incident response, and immediate sharing of indicators of compromise (IoCs) between environments. This allows for better understanding of threat paths and origins.

Consolidating data from IT and OT environments in a unified SOC can bring economies of scale with opportunities for discounted data ingestion and retention. Furthermore, managing a unified SOC can reduce overhead by centralizing data retention requirements, access models, and technical capabilities such as automation and machine learning.

Operational key performance indicators (KPIs) developed within one environment can be used to enhance another, promoting operational efficiency such as reducing mean time to detect security events (MTTD). A unified SOC enables integrated and unified security, operations, and performance, which supports comprehensive protection and visibility across technologies, locations, and deployments. Sharing lessons learned between IT and OT environments improves overall operational efficiency and security posture. A unified SOC also helps organizations adhere to regulatory requirements in a single place, streamlining compliance efforts and operational oversight.

By using a security data lake and advanced technologies like AI/ML, organizations can build resilient business operations, enhancing their detection and response to security threats.

Creating cross-functional teams of IT and OT subject matter experts (SMEs) help bridge the cultural divide and foster collaboration, enabling the development of a unified security strategy. Implementing an integrated and unified SOC can improve the maturity of industrial control systems (ICS) for IT and OT cybersecurity programs, bridging the gap between the domains and enhancing overall security capabilities.

Considerations for a unified SOC

There are several important aspects of a unified SOC for organizations to consider.

First, the separation of duty is crucial in a unified SOC environment. It’s essential to verify that specific duties are assigned to individuals based on their expertise and job function, allowing the most appropriate specialists to work on security events for their respective environments. Additionally, the sensitivity of data must be carefully managed. Robust access and permissions management is necessary to restrict access to specific types of data, maintaining that only authorized analysts can access and handle sensitive information. You should implement a clear AWS Identity and Access Management (IAM) strategy following security best practices across your organization to verify that the separation of duties is enforced.

Another critical consideration is the potential disruption to operations during the unification of IT and OT environments. To promote a smooth transition, careful planning is required to minimize any loss of data, visibility, or disruptions to standard operations. It’s crucial to recognize the differences in IT and OT security. The unique nature of OT environments and their close ties to physical infrastructure require tailored cybersecurity strategies and tools that address the distinct missions, challenges, and threats faced by industrial organizations. A copy-and-paste approach from IT cybersecurity programs will not suffice.

Furthermore, the level of cybersecurity maturity often varies between IT and OT domains. Investment in cybersecurity measures might differ, resulting in OT cybersecurity being relatively less mature compared to IT cybersecurity. This discrepancy should be considered when designing and implementing a unified SOC. Baselining the technology stack from each environment, defining clear goals and carefully architecting the solution can help ensure this discrepancy has been accounted for. After the solution has moved into the proof-of-concept (PoC) phase, you can start to testing for readiness to move the convergence to production.

You also must address the cultural divide between IT and OT teams. Lack of alignment between an organization’s cybersecurity policies and procedures with ICS and OT security objectives can impact the ability to secure both environments effectively. Bridging this divide through collaboration and clear communication is essential. This has been discussed in more detail in the post on managing organizational transformation for successful IT/OT convergence.

Unified IT/OT SOC deployment:

Figure 1 shows the deployment that would be expected in a unified IT/OT SOC. This is a high-level view of a unified SOC. In part 2 of this post, we will provide prescriptive guidance on how to design and build a unified and global SOC on AWS using AWS services and AWS Partner Network (APN) solutions.

Figure 1: Unified IT/OT SOC architecture

The parts of the IT/OT unified SOC are the following:

Environment: There are multiple environments, including a traditional IT on-premises organization, OT environment, cloud environment, and so on. Each environment represents a collection of security events and log sources from assets.

Data lake: A centralized place for data collection, normalization, and enrichment to verify that raw data from the different environments is standardized into a common scheme. The data lake should support data retention and archiving for long term storage.

Visualize: The SOC includes multiple dashboards based on organizational and operational needs. Dashboards can cover scenarios for multiple environments including data flows between IT and OT environments. There are also specific dashboards for the individual environments to cover each stakeholder’s needs. Data should be indexed in a way that allows humans and machines to query the data to monitor for security and performance issues.

Security analytics: Security analytics are used to aggregate and analyze security signals and generate higher fidelity alerts and to contextualize OT signals against concurrent IT signals and against threat intelligence from reputable sources.

Detect, alert, and respond: Alerts can be set up for events of interest based on data across both individual and multiple environments. Machine learning should be used to help identify threat paths and events of interest across the data.

Conclusion

Throughout this blog post, we’ve talked through the convergence of IT and OT environments from the perspective of optimizing your security operations. We looked at the benefits and considerations of designing and implementing a unified SOC.

Visibility into the full threat chain propagating from one environment to another is critical for organizations as daily operations become more connected across IT and OT. A unified SOC is the nerve center for incident detection and response and can be one of the most critical components in improving your organization’s security posture and cyber resilience.

If unification is your organization’s goal, you must fully consider what this means and design a plan for what a unified SOC will look like in practice. Running a small proof of concept and migrating in steps often helps with this process.

In the next blog post, we will provide prescriptive guidance on how to design and build a unified and global SOC using AWS services and AWS Partner Network (APN) solutions.

Learn more:

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Showcasing SecOps Metrics That Matter

2023-07-06 Rapid7

Post Syndicated from Rapid7 original https://blog.rapid7.com/2023/07/06/showcasing-secops-metrics-that-matter/

Showcasing SecOps Metrics That Matter

This year, new rules from the Security and Exchange Commission (SEC) about board-level expertise, risk management, and public disclosures will take effect. The European Union is updating its regulations, as well. To meet these new requirements, organizations will need to explain to shareholders exactly how they assess cyber risk, describe security policies, and prove a significant level of board oversight.

In this climate, security leaders will be expected to advise the C-suite on SecOps activities. As a security professional, this can be a challenge. It’s also an opportunity to shape the structure and execution of business and go-to-market decisions.

Our latest ebook, Presenting Upward: How to Showcase SecOps Metrics That Matter offers practical and actionable advice on how to present security metrics in a language execs understand.

About those metrics

Cybersecurity metrics are essential to understand where you’re succeeding and where you may need to make changes.

Some examples include:

Number and disposition of security incidents: You have no control of this, but it gives execs insight into the risk they face. There’s an attack every 39 seconds somewhere. What’s life like in your security operation?

Mean time-to-detection (MTTD): This metric gives insight into both efficacy of tools and coverage of data (is the detection coming from a reported incident vs. a tool, etc.).

Mean time-to-respond (MTTR): This also gives insight into your ability to respond and whether your tools and processes meet your threats and use cases.

Cost-per-incident: This gives you insight into efficiency of process, tooling, and also potential staffing shortcomings (like the number of people or specific skills).

There are many other metrics you may need to track to understand your cybersecurity readiness. Good metrics will differ for every organization, depending on your risks, needs, compliance requirements, desired business outcomes, security maturity, and more.

Stories + metrics = success

Generally speaking, executives don’t usually want to get too deep in the weeds. So, your ability to present metrics in a way they understand is critical to achieve cybersecurity goals.

Execs typically want answers to questions like:

What are our risks, and how are we addressing them?
How secure are we compared to similar organizations?
Are we budgeting the right amount for cybersecurity?
Where do we have opportunities for efficiencies or consolidation?
How are we addressing that thing in the news?

So, when presenting to execs it’s essential to put metrics into context. One way to do this is to craft a narrative that brings metrics to life. Stories often have more of an impact than facts and figures alone. This isn’t anecdotal; neuroscience has shown that when we are presented with a story, we understand the information more deeply, remember longer, and are more likely to factor what it taught us into future decisions.
For more tips on crafting an effective narrative, and much more, download Presenting Upward: How to Showcase SecOps Metrics That Matter now.

AWS Security Hub launches a new capability for automating actions to update findings

2023-06-13 Stuart Gregg

Post Syndicated from Stuart Gregg original https://aws.amazon.com/blogs/security/aws-security-hub-launches-a-new-capability-for-automating-actions-to-update-findings/

If you’ve had discussions with a security organization recently, there’s a high probability that the word automation has come up. As organizations scale and consume the benefits the cloud has to offer, it’s important to factor in and understand how the additional cloud footprint will affect operations. Automation is a key enabler for efficient operations and can help drive down the number of repetitive tasks that the operational teams have to perform.

Alert fatigue is caused when humans work on the same repetitive tasks day in and day out and also have a large volume of alerts that need to be addressed. The repetitive nature of these tasks can cause analysts to become numb to the importance of the task or make errors due to manual processing. This can lead to misclassification of security alerts or higher-severity alerts being overlooked due to investigation times. Automation is key here to reduce the number of repetitive tasks and give analysts time to focus on other areas of importance.

In this blog post, we’ll walk you through new capabilities within AWS Security Hub that you can use to take automated actions to update findings. We’ll show you some example scenarios that use this capability and set you up with the knowledge you need to get started with creating automation rules.

Automation rules in Security Hub

AWS Security Hub is available globally and is designed to give you a comprehensive view of your security posture across your AWS accounts. With Security Hub, you have a single place that aggregates, organizes, and prioritizes your security alerts, or findings, from multiple AWS services, including Amazon GuardDuty, Amazon Inspector, Amazon Macie, AWS Firewall Manager, AWS Systems Manager Patch Manager, AWS Config, AWS Health, and AWS Identity and Access Management (IAM) Access Analyzer, as well as from over 65 AWS Partner Network (APN) solutions.

Previously, Security Hub could take automated actions on findings, but this involved going to the Amazon EventBridge console or API, creating an EventBridge rule, and then building an AWS Lambda function, an AWS Systems Manager Automation runbook, or an AWS Step Functions step as the target of that rule. If you wanted to set up these automated actions in the administrator account and home AWS Region and run them in member accounts and in linked Regions, you would also need to deploy the correct IAM permissions to enable the actions to run across accounts and Regions. After setting up the automation flow, you would need to maintain the EventBridge rule, Lambda function, and IAM roles. Such maintenance could include upgrading the Lambda versions, verifying operational efficiency, and checking that everything is running as expected.

With Security Hub, you can now use rules to automatically update various fields in findings that match defined criteria. This allows you to automatically suppress findings, update findings’ severities according to organizational policies, change findings’ workflow status, and add notes. As findings are ingested, automation rules look for findings that meet defined criteria and update the specified fields in findings that meet the criteria. For example, a user can create a rule that automatically sets the finding’s severity to “Critical” if the finding account ID is of a known business-critical account. A user could also automatically suppress findings for a specific control in an account where the finding represents an accepted risk.

With automation rules, Security Hub provides you a simplified way to build automations directly from the Security Hub console and API. This reduces repetitive work for cloud security and DevOps engineers and can reduce the mean time to response.

Use cases

In this section, we’ve put together some examples of how Security Hub automation rules can help you. There’s a lot of flexibility in how you can use the rules, and we expect there will be many variations that your organization will use when contextual information about security risk has been added.

Scenario 1: Elevate finding severity for specific controls based on account IDs

Security Hub offers protection by using hundreds of security controls that create findings that have a severity associated with them. Sometimes, you might want to elevate that severity according to your organizational policies or according to the context of the finding, such as the account it relates to. With automation rules, you can now automatically elevate the severity for specific controls when they are in a specific account.

For example, the AWS Foundational Security Best Practices control GuardDuty.1 has a “High” severity by default. But you might consider such a finding to have “Critical” severity if it occurs in one of your top production accounts. To change the severity automatically, you can choose GeneratorId as a criteria and check that it’s equal to aws-foundational-security-best-practices/v/1.0.0/GuardDuty.1, and also add AwsAccountId as a criteria and check that it’s equal to YOUR_ACCOUNT_IDs. Then, add an action to update the severity to “Critical,” and add a note to the person who will look at the finding that reads “Urgent — look into these production accounts.”

You can set up this automation rule through the AWS CLI, the console, the Security Hub API, or the AWS SDK for Python (Boto3), as follows.

To set up the automation rule for Scenario 1 (AWS CLI)

In the AWS CLI, run the following command to create a new automation rule with a specific Amazon Resource Name (ARN). Note the different modifiable parameters:
- Rule-name — The name of the rule that will be created.
- Rule-status — An optional parameter. Specify whether you want Security Hub to activate and start applying the rule to findings after creation. If no value is specified, the default value is ENABLED. A value of DISABLED means that the rule will be paused after creation.
- Rule-order — Provide the processing order for the rule. Security Hub applies rules with a lower numerical value for this parameter first.
- Criteria — Provide the criteria that you want Security Hub to use to filter your findings. The rule action will be applied to findings that match the criteria. For a list of supported criteria, see Criteria and actions for automation rules. In this example, the criteria are placeholders and should be replaced.
- Actions — Provide the actions that you want Security Hub to take when there’s a match between a finding and your defined criteria. For a list of supported actions, see Criteria and actions for automation rules. In this example, the actions are placeholders and should be replaced.
aws securityhub create-automation-rule \—rule-name "Elevate severity for findings in production accounts - GuardDuty.1" \—rule-status "ENABLED"" \—rule-order 1 \—description "Elevate severity for findings in production accounts - GuardDuty.1" \—criteria '{"GeneratorId": [{"Value": "aws-foundational-security-best-practices/v/1.0.0/GuardDuty.1","Comparison": "EQUALS"}, "AwsAccountId": [{"Value": "<111122223333>","Comparison": "EQUALS"},]}' \—actions '[{"Type": "FINDING_FIELDS_UPDATE","FindingFieldsUpdate": {"Severity": {"Label": "CRITICAL"},"Note": {"Text": "Urgent – look into these production accounts","UpdatedBy": "sechub-automation"}}}]' \—region us-east-1

To set up the automation rule for Scenario 1 (console)

Open the Security Hub console, and in the left navigation pane, choose Automations.

Figure 1: Automation rules in the Security Hub console
Choose Create rule, and then choose Create a custom rule to get started with creating a rule of your choice. Add a rule name and description.

Figure 2: Create a new custom rule
Under Criteria, add the following information.
- Key 1
  - Key = GeneratorID
  - Operator = EQUALS
  - Value = aws-foundational-security-best-practices/v/1.0.0/GuardDuty.1
- Key 2
  - Key = AwsAccountId
  - Operator = EQUALS
  - Value = Your AWS account ID
Figure 3: Information added for the rule criteria
You can preview which findings will match the criteria by looking in the preview section.

Figure 4: Preview section
Next, under Automated action, specify which finding value to update automatically when findings match your criteria.

Figure 5: Automated action to be taken against the findings that match the criteria
For Rule status, choose Enabled, and then choose Create rule.

Figure 6: Set the rule status to Enabled
After you choose Create rule, you will see the newly created rule within the Automations portal.

Figure 7: Newly created rule within the Security Hub Automations page

Note: In figure 7, you can see multiple automation rules. When you create automation rules, you assign each rule an order number. This determines the order in which Security Hub applies your automation rules. This becomes important when multiple rules apply to the same finding or finding field. When multiple rule actions apply to the same finding field, the rule with the highest numerical value for rule order is applied last and has the ultimate effect on that field.

Additionally, if your preferred deployment method is to use the API or AWS SDK for Python (Boto3), we have information on how you can use these means of deployment in our public documentation.

Scenario 2: Change the finding severity to high if a resource is important, based on resource tags

Imagine a situation where you have findings associated to a wide range of resources. Typically, organizations will attempt to prioritize which findings to remediate first. You can achieve this prioritization through Security Hub and the contextual fields that are available for you to use — for example, by using the severity of the finding or the account ID the resource is sitting in. You might also have your own prioritization based on other factors. You could add this additional context to findings by using a tagging strategy. With automation rules, you can now automatically elevate the severity for specific findings based on the tag value associated to the resource.

For example, if a finding comes into Security Hub with the severity rating “Medium,” but the resource in question is critical to the business and has the tag production associated to it, you could automatically raise the severity rating to “High.”

Note: This will work only for findings where there is a resource tag associated with the finding.

Scenario 3: Suppress GuardDuty findings with a severity of “Informational”

GuardDuty provides an overarching view of the state of threats to deployed resources in your organization’s cloud environment. After evaluation, GuardDuty produces findings related to these threats. The findings produced by GuardDuty have different severities, to help organizations with prioritization. Some of these findings will be given an “Informational” severity. “Informational” indicates that no issue was found and the content of the finding is purely to give information. After you have evaluated the context of the finding, you might want to suppress any additional findings that match the same criteria.

For example, you might want to set up a rule so that new findings with the generator ID that produced “Informational” findings are suppressed, keeping only the findings that need action.

Templates

When you create a new rule, you can also choose to create a rule from a template. These templates are regularly updated with use cases that are applicable for many customers.

To set up an automation rule by using a template from the console

In the Security Hub console, choose Automations, and then choose Create rule.
Choose Create a rule from a template to get started with creating a rule of your choice.
Select a rule template from the drop-down menu.

Figure 8: Select an automation rule template
(Optional) If necessary, modify the Rule, Criteria, and Automated action sections.
For Rule status, choose whether you want the rule to be enabled or disabled after it’s created.
(Optional) Expand the Additional settings section. Choose Ignore subsequent rules for findings that match these criteria if you want this rule to be the last rule applied to findings that match the rule criteria.
(Optional) For Tags, add tags as key-value pairs to help you identify the rule.
Choose Create rule.

Multi-Region deployment

For organizations that operate in multiple AWS Regions, we’ve provided a solution that you can use to replicate rules created in your central Security Hub admin account into these additional Regions. You can find the sample code for this solution in our GitHub repo.

Conclusion

In this blog post, we’ve discussed the importance of automation and its ability to help organizations scale operations within the cloud. We’ve introduced a new capability in AWS Security Hub, automation rules, that can help reduce the repetitive tasks your operational teams may be facing, and we’ve showcased some example use cases to get you started. Start using automation rules in your environment today. We’re excited to see what use cases you will solve with this feature and as always, are happy to receive any feedback.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security, Identity, & Compliance re:Post or contact AWS Support.

Considerations for the security operations center in the cloud: deployment using AWS security services

2023-03-01 Stuart Gregg

Post Syndicated from Stuart Gregg original https://aws.amazon.com/blogs/security/considerations-for-the-security-operations-center-in-the-cloud-deployment-using-aws-security-services/

Welcome back. If you’re joining this series for the first time, we recommend that you read the first blog post in this series, Considerations for security operations in the cloud, for some context on what we will discuss and deploy in this blog post. In the earlier post, we talked through the different operating models (centralized, decentralized, or hybrid) that you can deploy for a Security Operations Center (SOC) function when you operate in the cloud. We covered the advantages of each model and some of the potential drawbacks you might see when you start to scale up operations within the cloud.

This post will focus on the Amazon Web Services (AWS) native security service, AWS Security Hub, that you can use to deploy in different SOC operating models. AWS Security Hub is a cloud security posture management service that SOC teams can use to perform security best practice checks and aggregate alerts. AWS Security Hub accepts findings from multiple sources, whether native to AWS, from the pre-built integrations, or from your own sources converted into the AWS Security Finding Format (ASFF). The data collected in Security Hub facilitates response and remediation actions.

Although the models we describe here use services that are native to AWS, the reference architectures that correspond to each operating model can be applied to a variety of deployments, including multi-cloud and traditional on-premises deployments. The majority of this post will focus on the decentralized and hybrid models—the centralized model is well documented and has reference architectures already available for you today.

Each organization is different, and no one operating model will fit everyone. You should choose the model that works best for your organizational landscape, with an understanding that the landscape will change and evolve over time. Using feedback loops and being open to change is important to help you meet the continued needs of your business. Additional factors to consider include, but are not limited to: staff skills, compliance requirements, previous operating model, and budget.

The centralized model

The centralized operating model for the SOC is well documented and frequently discussed, both at AWS and in the security community. According to AWS best practices, typically you designate a central security tooling account that is dedicated to operating security services, monitoring AWS accounts, and automating security alerting and response. The security tooling account serves as the administrator account for security services that are managed in an administrator/member structure across your AWS accounts. The key objectives for establishing a security tooling account are the following:

Provide a dedicated enclave with controlled access for managing security guardrails, monitoring, and response.
Maintain the appropriate centralized security infrastructure to monitor security operations data and maintain traceability across the security lifecycle.

Figure 1 demonstrates the variety of AWS security services that you can deploy in the central security account. For example, Security Hub within the security tooling account can act as the administrator to enable Security Hub in the member accounts, as well as view findings, view insights, and set security standards across member accounts, which can help simplify security posture management across your existing and future accounts.

Figure 1: Reference architecture for the security tooling account in a centralized model

As mentioned earlier, you can enable Security Hub to administer and enable member accounts. This is achieved by using AWS Organizations and the delegated administrator functionality. In addition, you can use Security Hub cross-Region aggregation within the delegated administrator account to aggregate findings, finding updates, insights, control compliance statuses, and security scores from multiple Regions to a single aggregation Region. You can then manage this data from the aggregation Region. Figure 2 shows the reference architecture for this functionality.

Figure 2: Reference architecture for Security Hub in the delegated administrator model

The AWS Security Reference Architecture (AWS SRA) is a great starting point for establishing the centralized security operations model. The AWS SRA is a holistic set of guidelines for deploying the full complement of AWS security services in a multi-account environment. You can use it to help design, implement, and manage AWS security services so that they align with AWS best practices. The AWS SRA’s Security Hub Organization solution provides deployable templates and examples that automate the process of enabling Security Hub by delegating administration to an account and configuring Security Hub for the existing and future AWS Organizations accounts.

The decentralized and hybrid models

As mentioned in Considerations for security operations in the cloud, the decentralized and hybrid SOC models provide many benefits for organizations. The flexibility of these operating models allows organizational units (OUs) to control how they deal with security-related incidents while still having organization-wide visibility into security posture. This flexibility is important as organizations start to scale up activities within the cloud.

The reference architecture in Figure 3 shows how the benefits we discussed in our earlier blog post can be architected in the decentralized and hybrid operating models in the AWS Cloud.

Figure 3: Reference architecture for the decentralized and hybrid operating models in AWS

The key features of this architecture are as follows:

The organization root account is separate, according to AWS Organizations best practices. By using service control policies (SCPs), the root account can still achieve a level of governance across the business.
Dedicated accounts have been created for each OU for the Security Hub administration. The model we will use for this deployment is the invite model. In this reference architecture and as an example, we’re using Amazon GuardDuty to flow findings into Security Hub. When you use this model, each OU can manage findings for that OU. This gives you flexibility to work from the Security Hub admin with full visibility of the OU and accounts associated with that OU, or to work in each member account and view findings for that account only.
(Optional, for use with the hybrid model) Each OU’s Security Hub member accounts first send events to their Security Hub admin account. The Security Hub admin account will then send events for that OU to the local Amazon EventBridge bus. You can then set up rules to forward events to a central EventBridge bus in a dedicated AWS account. In the architecture in Figure 3, this account is named SecAnalytics. This step will follow a similar flow as the one described in this AWS Cloud Operations & Migrations blog post.
(Optional, for use with the hybrid model) After the OUs have sent data to the central bus, you can use a capability similar to the one in this AWS Architecture Blog post to start organizing the findings and gain organization-wide visibility. The solution in the earlier post used Amazon QuickSight to visualize the data, but you can use another tool or pre-existing data pipeline.

Items 3 and 4 labeled with (Optional) are capabilities that enable the hybrid model; these are not required if you only want to enable the decentralized model.

Considerations for all deployments

Keep the following considerations in mind for all deployments:

Steady state operations should be considered for whichever model you deploy in. For the centralized model, you can use functionality within AWS Organizations to automatically enable Security Hub for accounts within the organization. In the decentralized and hybrid models, you will need to build out this capability or use a similar capability as described in this repo.
Alert fatigue happens when humans work on the same repetitive tasks’ day in and day out. To help reduce this, within the reference architecture and solution overview, we’ve added the capability described in this Security Blog post to automatically suppress findings based on criteria set by you. For the centralized model, you can add this capability in the delegated admin account for Security Hub. For the decentralized and hybrid models, we recommend that you put the auto-suppression capability in the Security Hub admin account, and then centralize the rules for suppression for that OU at the Security Hub admin level. This will reduce the overhead for deploying suppression rules multiple times and give a single location where rules are placed for that OU.
Context is key. Within the reference architecture and solution overview for decentralized and hybrid deployments, we’ve added the capability described in this Security Blog post. This capability will add additional context, such as the account name, the OU associated with the account, security contact information, and account tags. This information is pulled from AWS Organizations to enrich Security Hub findings. This additional context can also be used in the centralized model.

Deploy the decentralized and hybrid models

In this section, we’ll walk you through the deployment that reflects the reference architecture for the decentralized and hybrid models. Figure 4 shows the solution architecture, including the solution that needs to be deployed in the Security Hub admin account and in the aggregation Region for each business unit within the organization. The solution provides the capability to suppress Security Hub findings, enrich the findings, and propagate findings to central security accounts.

Figure 4: Reference architecture for the decentralized and hybrid deployment

The solution architecture consists of the following:

An EventBridge rule to invoke a Lambda function (Suppression Lambda) as the target to suppress any findings based on specific generator IDs within specific member accounts.

Note: The Security Hub Generator IDs and AWS Account IDs in the EventBridge rule are left as placeholders so that you can fill based on your needs.
An EventBridge rule to invoke a Lambda function (Enrichment Lambda) as the target to enrich the findings with AWS account and OU related metadata, along with alternate contact information to better prioritize the findings. The API calls to AWS Organizations and AWS account management services are optimized by caching the metadata in an Amazon DynamoDB table with a time-to-live (TTL) value of 24 hours.
An EventBridge rule to post the enriched findings that were not suppressed to a custom EventBridge event bus in the organization-level Security Tooling/SecAnalytics account.

Prerequisites

The following are the prerequisites for this deployment:

AWS Organizations is utilized across the business. In this scenario, AWS Organizations will be used to group AWS accounts into OUs, as well as to provide enrichment data for Security Hub findings.
Alternative contacts for AWS accounts have been filled out with the most up-to-date information. This is a best practice recommendation. This information will be used for enrichment of the Security Hub findings.
Your organization already has a pipeline in place for indexing Security Hub findings and visualizing them.
Security Hub is set up in the invite model. OU-level Security Hub accounts have been invited and accepted to be managed by the OU-level Security Hub admin account.
The grouping of findings across multiple OU-level Security Hub admin accounts uses Amazon EventBridge to forward events to a centralized bus. You should have the event bus set up ready for this deployment.

Deploy the solution

This solution deployment consists of two parts:

Create an IAM role in your Organizations management account that allows BU-level Security Hub admin to access account metadata, as described in the Create the IAM role procedure that follows.
Deploy the Enrichment Lambda function, the Suppression Lambda function, and the associated EventBridge event rules within the BU-level Security Hub administrator account.

Create the IAM role

Follow the instructions in Creating a role to delegate permissions to an IAM user to create an IAM role by using the IAM console, AWS Command Line Interface (AWS CLI), or AWS API. Create the role in the AWS Organizations management account with the role name as account-contact-readonly, based on the following trust and permission policy templates. You will need the account ID of your BU-level Security Hub administrator account.

The IAM trust policy allows the Security Hub administrator account to assume the role in your Organizations management account.

Note: The following trust policy shows only one BU Security admin account. You will need to add all BU Security admin accounts to the trust policy.

IAM role trust policy

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Effect": "Allow",
       "Principal": {
         "AWS": "arn:aws:iam::<BU SecHubAdmin Account ID>:root"
       },
       "Action": "sts:AssumeRole",
       "Condition": {}
     }
   ]
 }

Note: Replace <BU SecHubAdmin Account ID> with the account ID of your decentralized BU-level Security Hub administrator account. After the solution is deployed, you should update the principal in the preceding trust policy to use the new IAM role created for the solution.

IAM permission policy

{
     "Version": "2012-10-17",
     "Statement": [
         {
            "Action": "Account:GetAlternateContact",
            "Resource": [
                "arn:aws:account::<Org Management Account ID>:account/o-*/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "organizations:DescribeAccount",
                "organizations:ListTagsForResource",
                "organizations:DescribeOrganizationalUnit",
                "organizations:ListParents"
            ],
            "Resource": [
                "arn:aws:organizations::<Org Management Account ID>:account/o-*/*",
                "arn:aws:organizations::<Org Management Account ID>:ou/o-*/ou-*"
            ],
            "Effect": "Allow"
        }
     ]
 }

The IAM permission policy allows the Security Hub administrator account to look up the alternate contact information for the member accounts.

Make a note of the role Amazon Resource Name (ARN) for the IAM role, which will be similar to this format:
arn:aws:iam::<Org Management Account ID>:role/account-contact-readonly.

You will need this ARN when you deploy the solution in the next procedure.

Use AWS CloudFormation to create the IAM role

Alternatively, you can use the CloudFormation template we provide in our GitHub repository to create the role in the management account. The IAM role ARN is available in the Outputs section of the created CloudFormation stack.

Deploy the solution to your BU-level Security Hub administrator account

After you have the IAM role created, you can deploy the solution either from the AWS Management Console, or from our GitHub repository by using the AWS SAM CLI.

Note: If you’ve designated an aggregation Region within the BU-level Security Hub administrator account, you can deploy this solution only in the aggregation Region. Otherwise, you need to deploy this solution separately in each Region of the BU-level Security Hub administrator account where Security Hub is enabled.

To deploy the solution by using the AWS Management Console

In your Security Hub administrator account, launch the template by choosing the following Launch Stack button, which creates the stack the in us-east-1 Region.

Note: If your Security Hub aggregation Region is different than us-east-1 or you want to deploy the solution in a different AWS Region, you can deploy the solution from the GitHub repository described in the next section.
On the Quick create stack page, for Stack name, enter a unique stack name for this account; for example, aws-security-hub-decentralized-deployment-stack.

Figure 5: Quick create CloudFormation stack for the solution
For SecurityToolingAccountEventBus, provide the EventBus ARN in the security tooling account to post the Security Hub findings from the BU-level Security Hub administrator account.
For OrgManagementAccountContactRole, enter the role ARN of the role you created previously in the Create IAM role procedure.
Choose Create stack.
After the stack is created, go to the Resources tab and take note of the name of the IAM role that was created.
Update the principal element of the IAM role trust policy that you previously created in the Organizations management account in the Create the IAM role procedure, replacing the existing value with the role name you noted down.

Figure 6: Update the principal element for the role trust policy

To deploy the solution from our GitHub repository and AWS SAM CLI

Install the AWS SAM CLI.
Download or clone the GitHub repository by using the following commands.
git clone https://github.com/aws-samples/aws-securityhub-decentralized-operations-solution.git cd aws-securityhub-decentralized-operations-solution
Update the content of the profile.txt file with the profile name you want to use for the deployment.
To create a new bucket for deployment artifacts, run create-bucket.sh by specifying the Region as argument.
$ ./create-bucket.sh us-east-1
Deploy the solution to the account by running the deploy.sh script by specifying the Region as argument.
$ ./deploy.sh us-east-1
After the stack is created, go to the Resources tab and take note of the name of the IAM role that was created.
Update the principal element of the IAM role trust policy that you previously created in the Organizations management account in the Create the IAM role procedure, replacing it with the role name you noted down.
"AWS": "arn:aws:iam::<BU SH Delegated Account ID>: role/<Role Name>"

Note: The EventBridge rule to invoke the findings suppression Lambda function uses placeholders for the generator IDs and AWS account IDs. You need to update the EventBridge rule to meet your specific organizational requirements.

Further enhancements and conclusion

Beyond what is described in the decentralized and hybrid models, you can extend the solution to include the following aspects to meet your security operational needs:

In Considerations for security operations in the cloud, we spoke about the role of ChatOps. AWS Chatbot can enable OUs to set up rules to post notifications directly into chat rooms such as Amazon Chime or Slack. You can define rules to send only certain severity notifications or findings that are important to your OU to the chat room.
SCPs give organizations a level of control and governance. See this blog post for some best practices for deploying SCPs, as well as example policies that could be beneficial for your organization in any model you operate in.
We’ve performed testing of the decentralized and hybrid models in the reference architecture within one AWS Region. Although we don’t see any reason why this solution would not work in multiple Regions, if you do operate in multiple Regions you would need to deploy the CloudFormation template in each Region that you operate in. At this stage, you can keep findings within a Region or choose to centralize across multiple Regions by sending to the single central bus in Amazon EventBridge—the flexibility is yours.
The decentralized and hybrid models can also be extended if you operate in multiple organizations in AWS Organizations or have standalone accounts outside of your organization that you want to monitor. Interesting use cases could be in mergers and acquisitions scenarios, when newly acquired accounts need to be monitored to understand their posture before bringing them fully into the organization.

Throughout this two-part blog series, we’ve explored the role of the Security Operations Center (SOC) function, both traditionally in an on-premises environment and in the cloud. We’ve explored different operating models, from the traditional centralized deployment to the decentralized and hybrid models. We’ve also demonstrated, with reference architectures and deployable solutions, how you can achieve the different operating models in the AWS Cloud by using native AWS services. In the end, you should choose the model that works best for your environment and the security landscape you work in.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Considerations for security operations in the cloud

2022-11-18 Stuart Gregg

Post Syndicated from Stuart Gregg original https://aws.amazon.com/blogs/security/considerations-for-security-operations-in-the-cloud/

Cybersecurity teams are often made up of different functions. Typically, these can include Governance, Risk & Compliance (GRC), Security Architecture, Assurance, and Security Operations, to name a few. Each function has its own specific tasks, but works towards a common goal—to partner with the rest of the business and help teams ship and run workloads securely.

In this blog post, I’ll focus on the role of the security operations (SecOps) function, and in particular, the considerations that you should look at when choosing the most suitable operating model for your enterprise and environment. This becomes particularly important when your organization starts to adapt and operate more workloads in the cloud.

Operational teams that manage business processes are the backbone of organizations—they pave the way for efficient running of a business and provide a solid understanding of which day-to-day processes are effective. Typically, these processes are defined within standard operating procedures (SOPs), also known as runbooks or playbooks, and business functions are centralized around them—think Human Resources, Accounting, IT, and so on. This is also true for cybersecurity and SecOps, which typically has operational oversight of security for the entire organization.

Teams adopt an operating model that inherently leans toward a delegated ownership of security when scaling and developing workloads in the cloud. The emergence of this type of delegation might cause you to re-evaluate your currently supported model, and when you do this, it’s important to understand what outcome you are trying to get to. You want to be able to quickly respond to and resolve security issues. You want to help application teams own their own security decisions. You also want to have centralized visibility of the security posture of your organization. This last objective is key to being able to identify where there are opportunities for improvement in tooling or processes that can improve the operation of multiple teams.

Three ways of designing the operating model for SecOps are as follows:

Centralized – A more traditional model where SecOps is responsible for identifying and remediating security events across the business. This can also include reviewing general security posture findings for the business, such as patching and security configuration issues.
Decentralized – Responsibility for responding to and remediating security events across the business has been delegated to the application owners and individual business units, and there is no central operations function. Typically, there will still be an overarching security governance function that takes more of a policy or principles view.
Hybrid – A mix of both approaches, where SecOps still has a level of responsibility and ownership for identifying and orchestrating the response to security events, while the responsibility for remediation is owned by the application owners and individual business units.

As you can see from these descriptions, the main distinction between the different models is in the team that is responsible for remediation and response. I’ll discuss the benefits and considerations of each model throughout this blog post.

The strategies and operating models that I talk about throughout this blog post will focus on the role of SecOps and organizations that operate in the cloud. It’s worth noting that these operating models don’t apply to any particular technology or cloud provider. Each model has its own benefits and challenges to consider; overall, you should aim to adopt an operating model that gets to the best business outcome, while managing risk and providing a path for continuous improvement.

Background: the centralized model

As you might expect, the most familiar and well-understood operating model for SecOps is a centralized one. Traditionally, SecOps has developed gradually from internal security staff who have a very good understanding of the mostly static on-premises infrastructure and corporate assets, such as employee laptops, servers, and databases.

Centralizing in this way provides organizations with a familiar operating model and structure. Over time, operating in this model across an industry has allowed teams to develop reliable SOPs for common security events. Analysts who deal with these incidents have a good understanding of the infrastructure, the environment, and the steps that are needed to resolve incidents. Every incident gives opportunities to update the SOPs and to share this knowledge and the lessons learned with the wider industry. This continuous feedback cycle has provided benefits to SecOps teams for many years.

When security issues occur, understanding the division of responsibility between the various teams in this model is extremely important for quick resolution and remediation. The Responsibility Assignment Matrix, also known as the RACI model, has defined roles—Responsible, Accountable, Consulted, and Informed. Utilizing a model like this will help align each employee, department, and business unit so that they are aware of their role and contact points when incidents do occur, and can use defined playbooks to quickly act upon incidents.

The pressure can be high during a security event, and incidents that involve production systems carry additional weight. Typically, in a centralized model, security events flow into a central queue that a security analyst will monitor. A common approach is the Security Operations Center (SOC), where events from multiple sources are displayed on screens and also trigger activity in the queue. Security incidents are acted upon by an experienced team that is well versed in SOPs and understands the importance of time sensitivity when dealing with such incidents. Additionally, a centralized SecOps team usually operates in a 24/7 model, which might be achieved by having teams in multiple time zones or with help from an MSSP (Managed Security Service Provider). Whichever strategy is followed, having experienced security analysts deal with security incidents is a great benefit, because experience helps to ensure efficient and thorough remediation of issues.

So, with context and background set—how does a centralized SOC look and feel when it operates in the cloud, and what are its challenges?

Centralized SOC in the cloud: the advantages

Cloud providers offer many solutions and capabilities for SOCs that operate in a centralized model. For example, you can monitor your organization’s cloud security posture as a whole, which allows for key performance indicator (KPI) benchmarking, both internally and industry wide. This can then help your organization target security initiatives, training, and awareness on lower-scoring areas.

Security orchestration, automation, and response (SOAR) is a phrase commonly used across the security industry, and the cloud unlocks this capability. Combining both native and third-party security services and solutions with automation facilitates quick resolution of security incidents. The use of SOAR means that only incidents that need human intervention are actually reviewed by the analysts. After investigation, if automation can be introduced on that alert, it’s quickly applied. Having a central place for automating alerts helps the organization to have a consistent and structured approach to the response for security events and gives analysts more time to focus on activities like threat hunting.

Additionally, such threat-hunting operations require a central security data lake or similar technology. As a result, the SecOps team helps to drive the centralization of data across the business, which is a traditional cybersecurity function.

Centralized SOC in the cloud: organizational considerations

Some KPIs that a traditional SOC would typically use are time to detect (TTD), time to acknowledge (TTA), and time to resolve (TTR). These have been good metrics that SecOps managers can use to understand and benchmark how well the SecOps team is performing, both internally and against industry benchmarks. As your organization starts to take advantage of the breadth and depth available within the cloud, how does this change the KPIs that you need to track? As stated earlier, the cloud makes it easier to track KPIs through increased visibility of your cloud footprint—although you should evaluate traditional KPIs to understand whether they still make sense to use. Some additional KPIs that should be considered are metrics that show increasing automation, reduction in human access, and the overall improvement in security posture.

Organizations should consider scaling factors for operational processes and capability in the centralized SOC model. Once benefits from adopting the cloud have been realized, organizations typically expand and scale up their cloud footprint aggressively. For a centralized SecOps team, this could cause a challenging battle between the wider business, which wants to expand, and the SOC, which needs the ability to fully understand and respond to issues in the environment. For example, most organizations will put together small proof of concepts (POCs) to showcase new architectures and their benefits, and these POCs may become available as blueprints for the wider organization to consume. When new blueprints are implemented, the centralized SecOps team should implement and rely on its automation capabilities to verify that the correct alerting, monitoring, and operational processes are in place.

Decentralization: all ownership with the application teams

Moving or designing workloads in the cloud provides organizations with many benefits, such as increased speed and agility, built-in native security, and the ability to launch globally in minutes. When looking at the decentralized model, business units should incorporate practices into their development pipelines to benefit from the security capabilities of the cloud. This is sometimes referred to as a shift left or DevSecOps approach—essentially building security best practices into every part of the development process, and as early as possible.

Placing the ownership of the SecOps function on the business units and application owners can provide some benefits. One immediate benefit is that the teams that create applications and architectures have first-hand knowledge and contextual awareness of their products. This knowledge is critical when security events occur, because understanding the expected behavior and information flows of workloads helps with quick remediation and resolution of issues. Having teams work on security incidents in the ways that best fit their operational processes can also increase speed of remediation.

Decentralization: organizational considerations

When considering the decentralized approach, there are some organizational considerations that you should be aware of:

Dedicated security analysts within a central SecOps function deal with security incidents day in and day out; they study the industry, have a keen eye on upcoming threats, and are also well versed in high-pressure situations. By decentralizing, you might lose the consistent, level-headed experience they offer during a security incident. Embedding security champions who have industry experience into each business unit can help ensure that security is considered throughout the development lifecycle and that incidents are resolved as quickly as possible.

Contextual information and root cause analysis from past incidents are vital data points. Having a centralized SecOps team makes it much simpler to get a broad view of the security issues affecting the whole organization, which improves the ability to take a signal from one business unit and apply that to other parts of the organization to understand if they are also vulnerable, and to help protect the organization in the future.

Decentralizing the SecOps responsibility completely can cause you to lose these benefits. As mentioned earlier, effective communication and an environment to share data is key to verifying that lessons learned are shared across business units—one way of achieving this effective knowledge sharing could be to set up a Cloud Center of Excellence (CCoE). The CCoE helps with broad information sharing, but the minimization of team hand-offs provided by a centralized SecOps function is a strong organizational mechanism to drive consistency.

Traditionally, in the centralized model, the SOC has 24/7 coverage of applications and critical business functions, which can require a large security staff. The need for 24/7 operations still exists in a decentralized model, and having to provide that capability in each application team or business unit can increase costs while making it more difficult to share information. In a decentralized model, having greater levels of automation across organizational processes can help reduce the number of humans needed for 24/7 coverage.

Blending the models: the hybrid approach

Most organizations end up using a hybrid operating model in one way or another. This model combines the benefits of the centralized and decentralized models, with clear responsibility and division of ownership between the business units and the central SecOps function.

This best-of-both-worlds scenario can be summarized by the statement “global monitoring, local response.” This means that the SecOps team and wider cybersecurity function guides the entire organization with security best practices and guardrails while also maintaining visibility for reporting, compliance, and understanding the security posture of the organization as a whole. Meanwhile, local business units have the tools, knowledge, and expertise available to confidently own remediation of security events for their applications.

In this hybrid model, you split delegation of ownership into two parts. First, the operational capability for security is centrally owned. This centrally owned capability builds upon the partnership between the application teams and the security organization, via the CCoE. This gives the benefits of consistency, tooling expertise, and lessons learned from past security incidents. Second, the resolution of day-to-day security events and security posture findings is delegated to the business units. This empowers the people closest to the business problem to own service improvement in ways that best suit that team’s way of working, whether that’s through ChatOps and automation, or through the tools available in the cloud. Examples of the types of events you might want to delegate for resolution are items such as patching, configuration issues, and workload-specific security events. It’s important to provide these teams with a well-defined escalation route to the central security organization for issues that require specialist security knowledge, such as forensics or other investigations.

A RACI is particularly important when you operate in this hybrid model. Making sure that there is a clear set of responsibilities between the business units and the SecOps team is crucial to avoid confusion when security incidents occur.

Conclusion

The cloud has the ability to unlock new capabilities for your organization. Increased security, speed, and agility and are just some of the benefits you can gain when you move workloads to the cloud. The traditional centralized SecOps model offers a consistent approach to security detection and response for your organization. Decentralization of the response provides application teams with direct exposure to the consequences of their design decisions, which can speed up improvement. The hybrid model, where application teams are responsible for the resolution of issues, can improve the time to fix issues while freeing up SecOps to continue their works. The hybrid operating model compliments the capabilities of the cloud, and enables application owners and business units to work in ways that best suit them while maintaining a high bar for security across the organization.

Whichever operating model and strategy you decide to embark on, it’s important to remember the core principles that you should aim for:

Enable effective risk management across the business
Drive security awareness and embed security champions where possible
When you scale, maintain organization-wide visibility of security events
Help application owners and business units to work in ways that work best for them
Work with application owners and business units to understand the cyber landscape

The cloud offers many benefits for your organization, and your security organization is there to help teams ship and operate securely. This confidence will lead to realized productivity and continued innovation—which is good for both internal teams and your customers.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Automate Amazon EC2 instance isolation by using tags

2021-03-01 Jose Obando

Post Syndicated from Jose Obando original https://aws.amazon.com/blogs/security/automate-amazon-ec2-instance-isolation-by-using-tags/

Containment is a crucial part of an overall Incident Response Strategy, as this practice allows time for responders to perform forensics, eradication and recovery during an Incident. There are many different approaches to containment. In this post, we will be focusing on isolation—the ability to keep multiple targets separated so that each target only sees and affects itself—as a containment strategy.

I’ll show you how to automate isolation of an Amazon Elastic Compute Cloud (Amazon EC2) instance by using an AWS Lambda function that’s triggered by tag changes on the instance, as reported by Amazon CloudWatch Events.

CloudWatch Event Rules deliver a near real-time stream of system events that describe changes in Amazon Web Services (AWS) resources. See also Amazon EventBridge.

Preparing for an incident is important as outlined in the Security Pillar of the AWS Well-Architected Framework.

Out of the 7 Design Principles for Security in the Cloud, as per the Well-Architected Framework, this solution will cover the following:

Enable traceability: Monitor, alert, and audit actions and changes to your environment in real time. Integrate log and metric collection with systems to automatically investigate and take action.
Automate security best practices: Automated software-based security mechanisms can improve your ability to securely scale more rapidly and cost-effectively. Create secure architectures, including through the implementation of controls that can be defined and managed by AWS as code in version-controlled templates.
Prepare for security events: Prepare for an incident by implementing incident management and investigation policy and processes that align to your organizational requirements. Run incident response simulations and use tools with automation to help increase your speed for detection, investigation, and recovery.

After detecting an event in the Detection phase and analyzing in the Analysis phase, you can automate the process of logically isolating an instance from a Virtual Private Cloud (VPC) in Amazon Web Services (AWS).

In this blog post, I describe how to automate EC2 instance isolation by using the tagging feature that a responder can use to identify instances that need to be isolated. A Lambda function then uses AWS API calls to isolate the instances by performing the actions described in the following sections.

Use cases

Your organization can use automated EC2 instance isolation for scenarios like these:

A security analyst wants to automate EC2 instance isolation in order to respond to security events in a timely manner.
A security manager wants to provide their first responders with a way to quickly react to security incidents without providing too much access to higher security features.

High-level architecture and design

The example solution in this blog post uses a CloudWatch Events rule to trigger a Lambda function. The CloudWatch Events rule is triggered when a tag is applied to an EC2 instance. The Lambda code triggers further actions based on the contents of the event. Note that the CloudFormation template includes the appropriate permissions to run the function.

The event flow is shown in Figure 1 and works as follows:

The EC2 instance is tagged.
The CloudWatch Events rule filters the event.
The Lambda function is invoked.
If the criteria are met, the isolation workflow begins.

When the Lambda function is invoked and the criteria are met, these actions are performed:

Checks for IAM instance profile associations.
If the instance is associated to a role, the Lambda function disassociates that role.
Applies the isolation role that you defined during CloudFormation stack creation.
Checks the VPC where the EC2 instance resides.
- If there is no isolation security group in the VPC (if the VPC is new, for example), the function creates one.
Applies the isolation security group.

Note: If you had a security group with an open (0.0.0.0/0) outbound rule, and you apply this Isolation security group, your existing SSH connections to the instance are immediately dropped. On the other hand, if you have a narrower inbound rule that initially allows the SSH connection, the existing connection will not be broken by changing the group. This is known as Connection Tracking.

Figure 1: High-level diagram showing event flow

For the deployment method, we will be using an AWS CloudFormation Template. AWS CloudFormation gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles, by treating infrastructure as code.

The AWS CloudFormation template that I provide here deploys the following resources:

An EC2 instance role for isolation – this is attached to the EC2 Instance to prevent further communication with other AWS Services thus limiting the attack surface to your overall infrastructure.
An Amazon CloudWatch Events rule – this is used to detect changes to an AWS EC2 resource, in this case a “tag change event”. We will use this as a trigger to our Lambda function.
An AWS Identity and Access Management (IAM) role for Lambda functions – this is what the Lambda function will use to execute the workflow.
A Lambda function for automation – this function is where all the decision logic sits, once triggered it will follow a set of steps described in the following section.
Lambda function permissions – this is used by the Lambda function to execute.
An IAM instance profile – this is a container for an IAM role that you can use to pass role information to an EC2 instance.

Supporting functions within the Lambda function

Let’s dive deeper into each supporting function inside the Lambda code.

The following function identifies the virtual private cloud (VPC) ID for a given instance. This is needed to identify which security groups are present in the VPC.

def identifyInstanceVpcId(instanceId):
    instanceReservations = ec2Client.describe_instances(InstanceIds=[instanceId])['Reservations']
    for instanceReservation in instanceReservations:
        instancesDescription = instanceReservation['Instances']
        for instance in instancesDescription:
            return instance['VpcId']

The following function modifies the security group of an EC2 instance.

def modifyInstanceAttribute(instanceId,securityGroupId):
    response = ec2Client.modify_instance_attribute(
        Groups=[securityGroupId],
        InstanceId=instanceId)

The following function creates a security group on a VPC that blocks all egress access to the security group.

def createSecurityGroup(groupName, descriptionString, vpcId):
    resource = boto3.resource('ec2')
    securityGroupId = resource.create_security_group(GroupName=groupName, Description=descriptionString, VpcId=vpcId)
    securityGroupId.revoke_egress(IpPermissions= [{'IpProtocol': '-1','IpRanges': [{'CidrIp': '0.0.0.0/0'}],'Ipv6Ranges': [],'PrefixListIds': [],'UserIdGroupPairs': []}])
    return securityGroupId.

Deploy the solution

To deploy the solution provided in this blog post, first download the CloudFormation template, and then set up a CloudFormation stack that specifies the tags that are used to trigger the automated process.

Download the CloudFormation template

To get started, download the CloudFormation template from Amazon S3. Alternatively, you can launch the CloudFormation template by selecting the following Launch Stack button:

Deploy the CloudFormation stack

Start by uploading the CloudFormation template to your AWS account.

To upload the template

From the AWS Management Console, open the CloudFormation console.
Choose Create Stack.
Choose With new resources (standard).
Choose Upload a template file.
Select Choose File, and then select the YAML file that you just downloaded.

Figure 2: CloudFormation stack creation

Specify stack details

You can leave the default values for the stack as long as there aren’t any resources provisioned already with the same name, such as an IAM role. For example, if left with default values an IAM role named “SecurityIsolation-IAMRole” will be created. Otherwise, the naming convention is fully customizable from this screen and you can enter your choice of name for the CloudFormation stack, and modify the parameters as you see fit. Figure 3 shows the parameters that you can set.

The Evaluation Parameters section defines the tag key and value that will initiate the automated response. Keep in mind that these values are case-sensitive.

Figure 3: CloudFormation stack parameters

Choose Next until you reach the final screen, shown in Figure 4, where you acknowledge that an IAM role is created and you trust the source of this template. Select the check box next to the statement I acknowledge that AWS CloudFormation might create IAM resources with custom names, and then choose Create Stack.

Figure 4: CloudFormation IAM notification

After you complete these steps, the following resources will be provisioned, as shown in Figure 5:

EC2IsolationRole
EC2TagChangeEvent
IAMRoleForLambdaFunction
IsolationLambdaFunction
IsolationLambdaFunctionInvokePermissions
RootInstanceProfile

Figure 5: CloudFormation created resources

Testing

To start your automation, tag an EC2 instance using the tag defined during the CloudFormation setup. If you’re using the Amazon EC2 console, you can apply tags to resources by using the Tags tab on the relevant resource screen, or you can use the Tags screen, the AWS CLI or an AWS SDK. A detailed walkthrough for each approach can be found in the Amazon EC2 Documentation page.

Reverting Changes

If you need to remove the restrictions applied by this workflow, complete the following steps.

From the EC2 dashboard, in the Instances section, check the box next to the instance you want to modify.

Figure 6: Select the instance to modify
In the top right, select Actions, choose Instance settings, and then choose Modify IAM role.

Figure 7: Choose Actions > Instance settings > Modify IAM role
Under IAM role, choose the IAM role to attach to your instance, and then select Save.

Figure 8: Choose the IAM role to attach
Select Actions, choose Networking, and then choose Change security groups.

Figure 9: Choose Actions > Networking > Change security groups
Under Associated security groups, select Remove and add a different security group with the access you want to grant to this instance.

Summary

Using the CloudFormation template provided in this blog post, a Security Operations Center analyst could have only tagging privileges and isolate an EC2 instance based on this tag. Or a security service such as Amazon GuardDuty could trigger a lambda to apply the tag as part of a workflow. This means the Security Operations Center analyst wouldn’t need administrative privileges over the EC2 service.

This solution creates an isolation security group for any new VPCs that don’t have one already. The security group would still follow the naming convention defined during the CloudFormation stack launch, but won’t be part of the provisioned resources. If you decide to delete the stack, manual cleanup would be required to remove these security groups.

This solution terminates established inbound Secure Shell (SSH) sessions that are associated to the instance, and isolates the instance from new connections either inbound or outbound. For outbound connections that are already established (for example, reverse shell), you either need to shut down the network interface card (NIC) at the operating system (OS) level, restart the instance network stack at the OS level, terminate the established connections, or apply a network access control list (network ACL).

For more information, see the following documentation:

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Shifting Security Right: How Cloud-Based SecOps Can Speed Processes While Maintaining Integrity

2021-01-04 Aaron Wells

Post Syndicated from Aaron Wells original https://blog.rapid7.com/2021/01/04/shifting-security-right-how-cloud-based-secops-can-speed-processes-while-maintaining-integrity/

Shifting Security Right: How Cloud-Based SecOps Can Speed Processes While Maintaining Integrity

When it comes to offloading security controls to the cloud, it may seem counterintuitive to the notion of “securing” things. But, when we consider the efficiency to be gained by shifting right with some security controls, it makes sense to send more granular, ground-up responsibilities to a trusted managed services cloud partner. This could help to increase development-and-deployment velocity, without compromising the integrity of your bespoke process.

Building a true DevSecOps ecosystem is probably a common goal for most teams. However, uncommonality most often enters the picture in the forms of both technical and organizational roadblocks. Let’s take a look at some key insights from a 2020 SANS Institute survey on current industry efforts to more closely integrate DevOps and SecOps—and how you can plot your best path forward.

NEVER MISS A BLOG

Get the latest stories, expertise, and news about security today.

The security landscape

In more traditional environments, security teams often feel they’ve been left behind by the pace of DevOps. Vulnerabilities are introduced faster than SecOps can likely find them. The shift is with teams that are building continuous delivery frameworks, with compliance checks at every stage of the game. It becomes a matter of defending the environment as it’s being built.

Currently, about 74% of organizations are deploying changes more than once per month, according to SANS. Often, these are weekly or daily instances. So, velocity is increasing, primarily out of a need to get customers what they need, faster. Traditional change approvals and security controls are becoming more guardrail-style checks. The challenge, however, lies in optimizing the process and keeping it as secure as possible.

Increasing cloud adoption

From a security perspective, transitioning to a cloud provider’s responsibility model can better match the pace of DevOps and increase delivery speed. When both of these velocities are increasing, albeit responsibly, that’s better for business.

Cloud-hosted VM platforms allow teams to spin up processes more quickly compared to a traditional setup.
Adoption is accelerating for cloud-hosted container services and serverless platforms because providers are doing more provisioning, patching, and upgrading for many existing execution environments.
More organizations are running on cloud-hosted VMs versus container services and serverless platforms, but that could change because the latter two options allow you to further reduce your responsibility model.

Multi-cloud motivations

About 92% of organizations run on at least one public cloud provider. But for about 60% of those companies, the main motivations behind spreading services out between multiple providers are not quite as technical as one might imagine.

Mergers and acquisitions can cause obvious complexity, as companies link up and potentially run similar processes in different cloud environments like AWS, Azure, or GCP. There are also decision-makers and teams that prioritize a task-based approach and pick the best environment to get a particular job done. The benefits of a multi-cloud environment could then become drawbacks, as security becomes more difficult to plan for and understand. And no one wants complexity in an approach that is essentially supposed to offload responsibilities and make things easier.

Risk doesn’t translate for SecOps

As more DevOps teams increase their use of JavaScript, traditional security controls don’t support the popular format as well as other legacy languages. In this situation, there is greater risk. However, an older web app that hasn’t been updated in a while could be the tip of the iceberg in terms of the technical debt sitting out there.

Apps built on older languages like Java, .NET, and C++ could leave exposures open as teams roll over to newer languages. So, this situation also presents risk. Security teams may not even be aware they’re in the dark about vulnerabilities those legacy apps present, as they try to keep pace with DevOps.

The future of shifting left

When it comes to security testing phases, there’s still a heavy tendency toward QA. More is being done to integrate those protocols in the process, but the sea change of baking testing into earlier phases largely has yet to occur.

Over the next decade, teams will likely adopt more cloud-based integration tools like AWS CodePipeline, Microsoft Azure DevOps, GitHub Actions, and GitLab CI. In these instances, the cloud provider is managing more for you, minimizing attack surfaces and providing more built-in security. GitHub and GitLab, in particular, are trending toward greater baked-in security.
Jenkins has been the continuous integration tool of choice for about the last decade. However, the 24/7 nature of running on-premises or in the cloud to manage builds, releases, and patches can increase the attack surface.
When it comes to container orchestration tools, cloud-managed services like AWS Fargate and Azure Container are beginning to pull even with cloud-hosted services like Docker and Kubernetes. It’s becoming more attractive to outsource control-point and hardening responsibilities, so that security can shift further left into containers; it simplifies testing and helps ease deployment.

The future of shifting right

Security-testing responsibility lies with actual security teams about 65% of the time. Yet, managing corrective actions lies with development teams about 63% of the time, according to SANS. These numbers indicate largely siloed actions blocking the path to a true DevSecOps approach.

The biggest success measurement of DevSecOps is the time it takes to fix an issue. Aligning teams to tackle an issue in a speedy manner can make or break. Additionally, identifying post-deployment issues can help to improve shift-left controls to prevent those issues from ever escaping into production.

A 100% cross-functional effort most likely will not be achieved by every organization. However, moving closer to this goal could help strengthen teams, boost morale, and feed back key learnings to ultimately increase the speed of success.

In conclusion

Ironically, the biggest challenge of all isn’t technical in nature. Red tape within organizations can present challenges like lack of buy-in from management, insufficient budget (open-source tools can help here!), and siloed efforts. Additionally, a shortage of skilled workers could reinforce the same old decision-making patterns at those management levels.

When it comes to closely aligning teams and getting more time back to innovate, it’s often a cyclical dance of shifting right to improve your efforts in shifting left. For example, can you move further right into the cloud rather than building do-it-yourself, comprehensive solutions to security? Offloading could help to create more controls for enforcing security in tandem with DevOps.

No one wants to compromise the integrity of deploying on time, particularly as it relates to customers and your company’s bottom line. Co-sponsored by Rapid7, this recent SANS webinar presents an in-depth look at key statistics from a recent survey of companies and their advancements—or lack thereof—in DevSecOps.

For more insights, access the full 2020 SANS Institute survey on Extending DevSecOps Security Controls into the Cloud.

How to visualize multi-account Amazon Inspector findings with Amazon Elasticsearch Service

2020-12-23 Moumita Saha

Post Syndicated from Moumita Saha original https://aws.amazon.com/blogs/security/how-to-visualize-multi-account-amazon-inspector-findings-with-amazon-elasticsearch-service/

Amazon Inspector helps to improve the security and compliance of your applications that are deployed on Amazon Web Services (AWS). It automatically assesses Amazon Elastic Compute Cloud (Amazon EC2) instances and applications on those instances. From that assessment, it generates findings related to exposure, potential vulnerabilities, and deviations from best practices.

You can use the findings from Amazon Inspector as part of a vulnerability management program for your Amazon EC2 fleet across multiple AWS Regions in multiple accounts. The ability to rank and efficiently respond to potential security issues reduces the time that potential vulnerabilities remain unresolved. This can be accelerated within a single pane of glass for all the accounts in your AWS environment.

Following AWS best practices, in a secure multi-account AWS environment, you can provision (using AWS Control Tower) a group of accounts—known as core accounts, for governing other accounts within the environment. One of the core accounts may be used as a central security account, which you can designate for governing the security and compliance posture across all accounts in your environment. Another core account is a centralized logging account, which you can provision and designate for central storage of log data.

In this blog post, I show you how to:

Use Amazon Inspector, a fully managed security assessment service, to generate security findings.
Gather findings from multiple Regions across multiple accounts using Amazon Simple Notification Service (Amazon SNS) and Amazon Simple Queue Service (Amazon SQS).
Use AWS Lambda to send the findings to a central security account for deeper analysis and reporting.

In this solution, we send the findings to two services inside the central security account:

Amazon Elasticsearch Service (Amazon ES) – A search and analytics engine used for log analytics, real-time application monitoring, and clickstream analytics.
Amazon Simple Storage Service (Amazon S3) – A highly scalable, reliable, fast, and inexpensive data storage service of AWS.

Solution overview

Overall architecture

The flow of events to implement the solution is shown in Figure 1 and described in the following process flow.

Figure 1: Solution overview architecture

Process flow

The flow of this architecture is divided into two types of processes—a one-time process and a scheduled process. The AWS resources that are part of the one-time process are triggered the first time an Amazon Inspector assessment template is created in each Region of each application account. The AWS resources of the scheduled process are triggered at a designated interval of Amazon Inspector scan in each Region of each application account.

One-time process

An event-based Amazon CloudWatch rule in each Region of every application account triggers a regional AWS Lambda function when an Amazon Inspector assessment template is created for the first time in that Region.

Note: In order to restrict this event to trigger the Lambda function only the first time an assessment template is created, you must use a specific user-defined tag to trigger the Attach Inspector template to SNS Lambda function for only one Amazon Inspector template per Region. For more information on tags, see the Tagging AWS resources documentation.
The Lambda function attaches the Amazon Inspector assessment template (created in application accounts) to the cross-account Amazon SNS topic (created in the security account). The function, the template, and the topic are all in the same AWS Region.

Note: This step is needed because Amazon Inspector templates can only be attached to SNS topics in the same account via the AWS Management Console or AWS Command Line Interface (AWS CLI).

Scheduled process

A scheduled Amazon CloudWatch Event in every Region of the application accounts starts the Amazon Inspector scan at a scheduled time interval, which you can configure.
An Amazon Inspector agent conducts the scan on the EC2 instances of the Region where the assessment template is created and sends any findings to Amazon Inspector.
Once the findings are generated, Amazon Inspector notifies the Amazon SNS topic of the security account in the same Region.
The Amazon SNS topics from each Region of the central security account receive notifications of Amazon Inspector findings from all application accounts. The SNS topics then send the notifications to a central Amazon SQS queue in the primary Region of the security account.
The Amazon SQS queue triggers the Send findings Lambda function (as shown in Figure 1) of the security account.

Note: Each Amazon SQS message represents one Amazon Inspector finding.
The Send findings Lambda function assumes a cross-account role to fetch the following information from all application accounts:
1. Finding details from the Amazon Inspector API.
2. Additional Amazon EC2 attributes—VPC, subnet, security group, and IP address—from EC2 instances with potential vulnerabilities.
The Lambda function then sends all the gathered data to a central S3 bucket and a domain in Amazon ES—both in the central security account.

These Amazon Inspector findings, along with additional attributes on the scanned instances, can be used for further analysis and visualization via Kibana—a data visualization dashboard for Amazon ES. Storing a copy of these findings in an S3 bucket gives you the opportunity to forward the findings data to outside monitoring tools that don’t support direct data ingestion from AWS Lambda.

Prerequisites

The following resources must be set up before you can implement this solution:

A multi-account structure. To learn how to set up a multi-account structure, see Setting up AWS Control Tower and AWS Landing zone.
Amazon Inspector agents must be installed on all EC2 instances. See Installing Amazon Inspector agents to learn how to set up Amazon Inspector agents on EC2 instances. Additionally, keep note of all the Regions where you install the Amazon Inspector agent.
An Amazon ES domain with Kibana authentication. See Getting started with Amazon Elasticsearch Service and Use Amazon Cognito for Kibana access control.
An S3 bucket for centralized storage of Amazon Inspector findings.
An S3 bucket for storage of the Lambda source code for the solution.

Set up Amazon Inspector with Amazon ES and S3

Follow these steps to set up centralized Amazon Inspector findings with Amazon ES and Amazon S3:

Upload the solution ZIP file to the S3 bucket used for Lambda code storage.
Collect the input parameters for AWS CloudFormation deployment.
Deploy the base template into the central security account.
Deploy the second template in the primary Region of all application accounts to create global resources.
Deploy the third template in all Regions of all application accounts.

Step 1: Upload the solution ZIP file to the S3 bucket used for Lambda code storage

From GitHub, download the file Inspector-to-S3ES-crossAcnt.zip.
Upload the ZIP file to the S3 bucket you created in the central security account for Lambda code storage. This code is used to create the Lambda function in the first CloudFormation stack set of the solution.

Step 2: Collect input parameters for AWS CloudFormation deployment

In this solution, you deploy three AWS CloudFormation stack sets in succession. Each stack set should be created in the primary Region of the central security account. Underlying stacks are deployed across the central security account and in all the application accounts where the Amazon Inspector scan is performed. You can learn more in Working with AWS CloudFormation StackSets.

Before you proceed to the stack set deployment, you must collect the input parameters for the first stack set: Central-SecurityAcnt-BaseTemplate.yaml.

To collect input parameters for AWS CloudFormation deployment

Fetch the account ID (CentralSecurityAccountID) of the AWS account where the stack set will be created and deployed. You can use the steps in Finding your AWS account ID to help you find the account ID.
Values for the ES domain parameters can be fetched from the Amazon ES console.
1. Open the Amazon ES Management Console and select the Region where the Amazon ES domain exists.
2. Select the domain name to view the domain details.
3. The value for ElasticsearchDomainName is displayed on the top left corner of the domain details.
4. On the Overview tab in the domain details window, select and copy the URL value of the Endpoint to use as the ElasticsearchEndpoint parameter of the template. Make sure to exclude the https:// at the beginning of the URL.
  
  Figure 2: Details of the Amazon ES domain for fetching parameter values
Get the values for the S3 bucket parameters from the Amazon S3 console.
1. Open the Amazon S3 Management Console.
2. Copy the name of the S3 bucket that you created for centralized storage of Amazon Inspector findings. Save this bucket name for the LoggingS3Bucket parameter value of the Central-SecurityAcnt-BaseTemplate.yaml template.
3. Select the S3 bucket used for source code storage. Select the bucket name and copy the name of this bucket for the LambdaSourceCodeS3Bucket parameter of the template.
  
  Figure 3: The S3 bucket where Lambda code is uploaded
On the bucket details page, select the source code ZIP file name that you previously uploaded to the bucket. In the detail page of the ZIP file, choose the Overview tab, and then copy the value in the Key field to use as the value for the LambdaCodeS3Key parameter of the template.

Figure 4: Details of the Lambda code ZIP file uploaded in Amazon S3 showing the key prefix value

Note: All of the other input parameter values of the template are entered automatically, but you can change them during stack set creation if necessary.

Step 3: Deploy the base template into the central security account

Now that you’ve collected the input parameters, you’re ready to deploy the base template that will create the necessary resources for this solution implementation in the central security account.

Prerequisites for CloudFormation stack set deployment

There are two permission modes that you can choose from for deploying a stack set in AWS CloudFormation. If you’re using AWS Organizations and have all features enabled, you can use the service-managed permissions; otherwise, self-managed permissions mode is recommended. To deploy this solution, you’ll use self-managed permissions mode. To run stack sets in self-managed permissions mode, your administrator account and the target accounts must have two IAM roles—AWSCloudFormationStackSetAdministrationRole and AWSCloudFormationStackSetExecutionRole—as prerequisites. In this solution, the administrator account is the central security account and the target accounts are application accounts. You can use the following CloudFormation templates to create the necessary IAM roles:

AWSCloudFormationStackSetAdministrationRole – Creates the IAM role only in the central security account.
AWSCloudFormationStackSetExecutionRole – Creates the IAM role in all accounts.

To deploy the base template

Download the base template (Central-SecurityAcnt-BaseTemplate.yaml) from GitHub.
Open the AWS CloudFormation Management Console and select the Region where all the stack sets will be created for deployment. This should be the primary Region of your environment.
Select Create StackSet.
1. In the Create StackSet window, select Template is ready and then select Upload a template file.
2. Under Upload a template file, select Choose file and select the Central-SecurityAcnt-BaseTemplate.yaml template that you downloaded earlier.
3. Choose Next.
Add stack set details.
1. Enter a name for the stack set in StackSet name.
2. Under Parameters, most of the values are pre-populated except the values you collected in the previous procedure for CentralSecurityAccountID, ElasticsearchDomainName, ElasticsearchEndpoint, LoggingS3Bucket, LambdaSourceCodeS3Bucket, and LambdaCodeS3Key.
3. After all the values are populated, choose Next.
Configure StackSet options.
1. (Optional) Add tags as described in the prerequisites to apply to the resources in the stack set that these rules will be deployed to. Tagging is a recommended best practice, because it enables you to add metadata information to resources during their creation.
2. Under Permissions, choose the Self service permissions mode to be used for deploying the stack set, and then select the AWSCloudFormationStackSetAdministrationRole from the dropdown list.
  
  Figure 5: Permission mode to be selected for stack set deployment
3. Choose Next.
Add the account and Region details where the template will be deployed.
1. Under Deployment locations, select Deploy stacks in accounts. Under Account numbers, enter the account ID of the security account that you collected earlier.
  
  Figure 6: Values to be provided during the deployment of the first stack set
2. Under Specify regions, select all the Regions where the stacks will be created. This should be the list of Regions where you installed the Amazon Inspector agent. Keep note of this list of Regions to use in the deployment of the third template in an upcoming step.
  - Though an Amazon Inspector scan is performed in all the application accounts, the regional Amazon SNS topics that send scan finding notifications are created in the central security account. Therefore, this template is created in all the Regions where Amazon Inspector will notify SNS. The template has the logic needed to handle the creation of specific AWS resources only in the primary Region, even though the template executes in many Regions.
  - The order in which Regions are selected under Specify regions defines the order in which the stack is deployed in the Regions. So you must make sure that the primary Region of your deployment is the first one specified under Specify regions, followed by the other Regions of stack set deployment. This is required because global resources are created using one Region—ideally the primary Region—and so stack deployment in that Region should be done before deployment to other Regions in order to avoid any build dependencies.
    
    Figure 7: Showing the order of specifying the Regions of stack set deployment
Review the template settings and select the check box to acknowledge the Capabilities section. This is required if your deployment template creates IAM resources. You can learn more at Controlling access with AWS Identity and Access Management.

Figure 8: Acknowledge IAM resources creation by AWS CloudFormation
Choose Submit to deploy the stack set.

Step 4: Deploy the second template in the primary Region of all application accounts to create global resources

This template creates the global resources required for sending Amazon Inspector findings to Amazon ES and Amazon S3.

To deploy the second template

Download the template (ApplicationAcnts-RolesTemplate.yaml) from GitHub and use it to create the second CloudFormation stack set in the primary Region of the central security account.
To deploy the template, follow the steps used to deploy the base template (described in the previous section) through Configure StackSet options.
In Set deployment options, do the following:
1. Under Account numbers, enter the account IDs of your application accounts as comma-separated values. You can use the steps in Finding your AWS account ID to help you gather the account IDs.
2. Under Specify regions, select only your primary Region.
  
  Figure 9: Select account numbers and specify Regions
The remaining steps are the same as for the base template deployment.

Step 5: Deploy the third template in all Regions of all application accounts

This template creates the resources in each Region of all application accounts needed for scheduled scanning of EC2 instances using Amazon Inspector. Notifications are sent to the SNS topics of each Region of the central security account.

To deploy the third template

Download the template InspectorRun-SetupTemplate.yaml from GitHub and create the final AWS CloudFormation stack set. Similar to the previous stack sets, this one should also be created in the central security account.
For deployment, follow the same steps you used to deploy the base template through Configure StackSet options.
In Set deployment options:
1. Under Account numbers, enter the same account IDs of your application accounts (comma-separated values) as you did for the second template deployment.
2. Under Specify regions, select all the Regions where you installed the Amazon Inspector agent.
  
  Note: This list of Regions should be the same as the Regions where you deployed the base template.
The remaining steps are the same as for the second template deployment.

Test the solution and delivery of the findings

After successful deployment of the architecture, to test the solution you can wait until the next scheduled Amazon Inspector scan or you can use the following steps to run the Amazon Inspector scan manually.

To run the Amazon Inspector scan manually for testing the solution

In any one of the application accounts, go to any Region where the Amazon Inspector scan was performed.
Open the Amazon Inspector console.
In the left navigation menu, select Assessment templates to see the available assessments.
Choose the assessment template that was created by the third template.
Choose Run to start the assessment immediately.
When the run is complete, Last run status changes from Collecting data to Analysis Complete.

Figure 10: Amazon Inspector assessment run
You can see the recent scan findings in the Amazon Inspector console by selecting Assessment runs from the left navigation menu.

Figure 11: The assessment run indicates total findings from the last Amazon Inspector run in this Region
In the left navigation menu, select Findings to see details of each finding, or use the steps in the following section to verify the delivery of findings to the central security account.

Test the delivery of the Amazon Inspector findings

This solution delivers the Amazon Inspector findings to two AWS services—Amazon ES and Amazon S3—in the primary Region of the central security account. You can either use Kibana to view the findings sent to Amazon ES or you can use the findings sent to Amazon S3 and forward them to the security monitoring software of your preference for further analysis.

To check whether the findings are delivered to Amazon ES

Open the Amazon ES Management Console and select the Region where the Amazon ES domain is located.
Select the domain name to view the domain details.
On the domain details page, select the Kibana URL.

Figure 12: Amazon ES domain details page
Log in to Kibana using your preferred authentication method as set up in the prerequisites.
1. In the left panel, select Discover.
2. In the Discover window, select a Region to view the total number of findings in that Region.
  
  Figure 13: The total findings in Kibana for the chosen Region of an application account

To check whether the findings are delivered to Amazon S3

Open the Amazon S3 Management Console.
Select the S3 bucket that you created for storing Amazon Inspector findings.
Select the bucket name to view the bucket details. The total number of findings for the chosen Region is at the top right corner of the Overview tab.

Figure 14: The total security findings as stored in an S3 bucket for us-east-1 Region

Visualization in Kibana

The data sent to the Amazon ES index can be used to create visualizations in Kibana that make it easier to identify potential security gaps and plan the remediation accordingly.

You can use Kibana to create a dashboard that gives an overview of the potential vulnerabilities identified in different instances of different AWS accounts. Figure 15 shows an example of such a dashboard. The dashboard can help you rank the need for remediation based on criteria such as:

The category of vulnerability
The most impacted AWS accounts
EC2 instances that need immediate attention

Figure 15: A sample Kibana dashboard showing findings from Amazon Inspector

You can build additional panels to visualize details of the vulnerability findings identified by Amazon Inspector, such as the CVE ID of the security vulnerability, its description, and recommendations on how to remove the vulnerabilities.

Figure 16: A sample Kibana dashboard panel listing the top identified vulnerabilities and their details

Conclusion

By using this solution to combine Amazon Inspector, Amazon SNS topics, Amazon SQS queues, Lambda functions, an Amazon ES domain, and S3 buckets, you can centrally analyze and monitor the vulnerability posture of EC2 instances across your AWS environment, including multiple Regions across multiple AWS accounts. This solution is built following least privilege access through AWS IAM roles and policies to help secure the cross-account architecture.

In this blog post, you learned how to send the findings directly to Amazon ES for visualization in Kibana. These visualizations can be used to build dashboards that security analysts can use for centralized monitoring. Better monitoring capability helps analysts to identify potentially vulnerable assets and perform remediation activities to improve security of your applications in AWS and their underlying assets. This solution also demonstrates how to store the findings from Amazon Inspector in an S3 bucket, which makes it easier for you to use those findings to create visualizations in your preferred security monitoring software.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

How to deploy the AWS Solution for Security Hub Automated Response and Remediation

2020-11-20 Ramesh Venkataraman

Post Syndicated from Ramesh Venkataraman original https://aws.amazon.com/blogs/security/how-to-deploy-the-aws-solution-for-security-hub-automated-response-and-remediation/

In this blog post I show you how to deploy the Amazon Web Services (AWS) Solution for Security Hub Automated Response and Remediation. The first installment of this series was about how to create playbooks using Amazon CloudWatch Events, AWS Lambda functions, and AWS Security Hub custom actions that you can run manually based on triggers from Security Hub in a specific account. That solution requires an analyst to directly trigger an action using Security Hub custom actions and doesn’t work for customers who want to set up fully automated remediation based on findings across one or more accounts from their Security Hub master account.

The solution described in this post automates the cross-account response and remediation lifecycle from executing the remediation action to resolving the findings in Security Hub and notifying users of the remediation via Amazon Simple Notification Service (Amazon SNS). You can also deploy these automated playbooks as custom actions in Security Hub, which allows analysts to run them on-demand against specific findings. You can deploy these remediations as custom actions or as fully automated remediations.

Currently, the solution includes 10 playbooks aligned to the controls in the Center for Internet Security (CIS) AWS Foundations Benchmark standard in Security Hub, but playbooks for other standards such as AWS Foundational Security Best Practices (FSBP) will be added in the future.

Solution overview

Figure 1 shows the flow of events in the solution described in the following text.

Figure 1: Flow of events

Detect

Security Hub gives you a comprehensive view of your security alerts and security posture across your AWS accounts and automatically detects deviations from defined security standards and best practices.

Security Hub also collects findings from various AWS services and supported third-party partner products to consolidate security detection data across your accounts.

Ingest

All of the findings from Security Hub are automatically sent to CloudWatch Events and Amazon EventBridge and you can set up CloudWatch Events and EventBridge rules to be invoked on specific findings. You can also send findings to CloudWatch Events and EventBridge on demand via Security Hub custom actions.

Remediate

The CloudWatch Event and EventBridge rules can have AWS Lambda functions, AWS Systems Manager automation documents, or AWS Step Functions workflows as the targets of the rules. This solution uses automation documents and Lambda functions as response and remediation playbooks. Using cross-account AWS Identity and Access Management (IAM) roles, the playbook performs the tasks to remediate the findings using the AWS API when a rule is invoked.

Log

The playbook logs the results to the Amazon CloudWatch log group for the solution, sends a notification to an Amazon Simple Notification Service (Amazon SNS) topic, and updates the Security Hub finding. An audit trail of actions taken is maintained in the finding notes. The finding is updated as RESOLVED after the remediation is run. The security finding notes are updated to reflect the remediation performed.

Here are the steps to deploy the solution from this GitHub project.

In the Security Hub master account, you deploy the AWS CloudFormation template, which creates an AWS Service Catalog product along with some other resources. For a full set of what resources are deployed as part of an AWS CloudFormation stack deployment, you can find the full set of deployed resources in the Resources section of the deployed AWS CloudFormation stack. The solution uses the AWS Service Catalog to have the remediations available as a product that can be deployed after granting the users the required permissions to launch the product.
Add an IAM role that has administrator access to the AWS Service Catalog portfolio.
Deploy the CIS playbook from the AWS Service Catalog product list using the IAM role you added in the previous step.
Deploy the AWS Security Hub Automated Response and Remediation template in the master account in addition to the member accounts. This template establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. Use AWS CloudFormation StackSets in the master account to have a centralized deployment approach across the master account and multiple member accounts.

Deployment steps for automated response and remediation

This section reviews the steps to implement the solution, including screenshots of the solution launched from an AWS account.

Launch AWS CloudFormation stack on the master account

As part of this AWS CloudFormation stack deployment, you create custom actions to configure Security Hub to send findings to CloudWatch Events. Lambda functions are used to provide remediation in response to actions sent to CloudWatch Events.

Note: In this solution, you create custom actions for the CIS standards. There will be more custom actions added for other security standards in the future.

To launch the AWS CloudFormation stack

Deploy the AWS CloudFormation template in the Security Hub master account. In your AWS console, select CloudFormation and choose Create new stack and enter the S3 URL.
Select Next to move to the Specify stack details tab, and then enter a Stack name as shown in Figure 2. In this example, I named the stack SO0111-SHARR, but you can use any name you want.

Figure 2: Creating a CloudFormation stack
Creating the stack automatically launches it, creating 21 new resources using AWS CloudFormation, as shown in Figure 3.

Figure 3: Resources launched with AWS CloudFormation
An Amazon SNS topic is automatically created from the AWS CloudFormation stack.
When you create a subscription, you’re prompted to enter an endpoint for receiving email notifications from Amazon SNS as shown in Figure 4. To subscribe to that topic that was created using CloudFormation, you must confirm the subscription from the email address you used to receive notifications.

Figure 4: Subscribing to Amazon SNS topic

Enable Security Hub

You should already have enabled Security Hub and AWS Config services on your master account and the associated member accounts. If you haven’t, you can refer to the documentation for setting up Security Hub on your master and member accounts. Figure 5 shows an AWS account that doesn’t have Security Hub enabled.

Figure 5: Enabling Security Hub for first time

AWS Service Catalog product deployment

In this section, you use the AWS Service Catalog to deploy Service Catalog products.

To use the AWS Service Catalog for product deployment

In the same master account, add roles that have administrator access and can deploy AWS Service Catalog products. To do this, from Services in the AWS Management Console, choose AWS Service Catalog. In AWS Service Catalog, select Administration, and then navigate to Portfolio details and select Groups, roles, and users as shown in Figure 6.

Figure 6: AWS Service Catalog product
After adding the role, you can see the products available for that role. You can switch roles on the console to assume the role that you granted access to for the product you added from the AWS Service Catalog. Select the three dots near the product name, and then select Launch product to launch the product, as shown in Figure 7.

Figure 7: Launch the product
While launching the product, you can choose from the parameters to either enable or disable the automated remediation. Even if you do not enable fully automated remediation, you can still invoke a remediation action in the Security Hub console using a custom action. By default, it’s disabled, as highlighted in Figure 8.

Figure 8: Enable or disable automated remediation
After launching the product, it can take from 3 to 5 minutes to deploy. When the product is deployed, it creates a new CloudFormation stack with a status of CREATE_COMPLETE as part of the provisioned product in the AWS CloudFormation console.

AssumeRole Lambda functions

Deploy the template that establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. You must deploy this template in the master account in addition to any member accounts. Choose CloudFormation and create a new stack. In Specify stack details, go to Parameters and specify the Master account number as shown in Figure 9.

Figure 9: Deploy AssumeRole Lambda function

Test the automated remediation

Now that you’ve completed the steps to deploy the solution, you can test it to be sure that it works as expected.

To test the automated remediation

To test the solution, verify that there are 10 actions listed in Custom actions tab in the Security Hub master account. From the Security Hub master account, open the Security Hub console and select Settings and then Custom actions. You should see 10 actions, as shown in Figure 10.

Figure 10: Custom actions deployed
Make sure you have member accounts available for testing the solution. If not, you can add member accounts to the master account as described in Adding and inviting member accounts.
For testing purposes, you can use CIS 1.5 standard, which is to require that the IAM password policy requires at least one uppercase letter. Check the existing settings by navigating to IAM, and then to Account Settings. Under Password policy, you should see that there is no password policy set, as shown in Figure 11.

Figure 11: Password policy not set
To check the security settings, go to the Security Hub console and select Security standards. Choose CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 from the list to see the Findings. You will see the Status as Failed. This means that the password policy to require at least one uppercase letter hasn’t been applied to either the master or the member account, as shown in Figure 12.

Figure 12: CIS 1.5 finding
Select CIS 1.5 – 1.11 from Actions on the top right dropdown of the Findings section from the previous step. You should see a notification with the heading Successfully sent findings to Amazon CloudWatch Events as shown in Figure 13.

Figure 13: Sending findings to CloudWatch Events
Return to Findings by selecting Security standards and then choosing CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 to review Findings and verify that the Workflow status of CIS 1.5 is RESOLVED, as shown in Figure 14.

Figure 14: Resolved findings
After the remediation runs, you can verify that the Password policy is set on the master and the member accounts. To verify that the password policy is set, navigate to IAM, and then to Account Settings. Under Password policy, you should see that the account uses a password policy, as shown in Figure 15.

Figure 15: Password policy set
To check the CloudWatch logs for the Lambda function, in the console, go to Services, and then select Lambda and choose the Lambda function and within the Lambda function, select View logs in CloudWatch. You can see the details of the function being run, including updating the password policy on both the master account and the member account, as shown in Figure 16.

Figure 16: Lambda function log

Conclusion

In this post, you deployed the AWS Solution for Security Hub Automated Response and Remediation using Lambda and CloudWatch Events rules to remediate non-compliant CIS-related controls. With this solution, you can ensure that users in member accounts stay compliant with the CIS AWS Foundations Benchmark by automatically invoking guardrails whenever services move out of compliance. New or updated playbooks will be added to the existing AWS Service Catalog portfolio as they’re developed. You can choose when to take advantage of these new or updated playbooks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.