Tag Archives: Intermediate (200)

Resource leak detection in Amazon CodeGuru Reviewer

Post Syndicated from Pranav Garg original https://aws.amazon.com/blogs/devops/resource-leak-detection-in-amazon-codeguru/

This post discusses the resource leak detector for Java in Amazon CodeGuru Reviewer. CodeGuru Reviewer automatically analyzes pull requests (created in supported repositories such as AWS CodeCommit, GitHub, GitHub Enterprise, and Bitbucket) and generates recommendations for improving code quality. For more information, see Automating code reviews and application profiling with Amazon CodeGuru. This post does not cover the resource leak detector for Python programs, which is now available in preview.

What are resource leaks?

Resources are objects with limited availability within a computing system. These typically include objects managed by the operating system, such as file handles, database connections, and network sockets. Because the number of such resources in a system is limited, an application must release them as soon as it has finished using them. Otherwise, the system runs out of resources and can’t allocate new ones. The paradigm of acquiring a resource and releasing it is also followed by other categories of objects, such as metric wrappers and timers.

Resource leaks are bugs that arise when a program doesn’t release the resources it has acquired. Resource leaks can lead to resource exhaustion. In the worst case, they can cause the system to slow down or even crash.

Starting with Java 7, most classes holding resources implement the java.lang.AutoCloseable interface and provide a close() method to release them. However, a close() call in source code doesn’t guarantee that the resource is released along all program execution paths. For example, in the following sample code, resource r is acquired by calling its constructor and is closed along the path corresponding to the if branch, shown using green arrows. To ensure that the acquired resource doesn’t leak, you must also close r along the path corresponding to the else branch (the path shown using red arrows).

A resource must be closed along all execution paths to prevent resource leaks
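
To make this concrete, here’s a minimal, hypothetical Java sketch of the same if/else shape (a file stream stands in for the resource r; this is illustrative code, not taken from CodeGuru):

import java.io.FileInputStream;
import java.io.IOException;

public class LeakExample {
    // Reads the first byte of a file.
    // The stream is closed along the path where the file is empty (the if branch),
    // but not along the path that returns the byte (the else branch), so the
    // underlying file handle leaks along that second path.
    static int readFirstByte(String path) throws IOException {
        FileInputStream in = new FileInputStream(path); // resource acquired
        int firstByte = in.read();
        if (firstByte == -1) {
            in.close();       // resource released along this path only
            return -1;
        } else {
            return firstByte; // resource leaks along this path
        }
    }
}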

Often, resource leaks manifest themselves along code paths that aren’t frequently run, or under a heavy system load, or after the system has been running for a long time. As a result, such leaks are latent and can remain dormant in source code for long periods of time before manifesting themselves in production environments. This is the primary reason why resource leak bugs are difficult to detect or replicate during testing, and why automatically detecting these bugs during pull requests and code scans is important.

Detecting resource leaks in CodeGuru Reviewer

For this post, we consider the following Java code snippet. In this code, method getConnection() attempts to create a connection in the connection pool associated with a data source. Typically, a connection pool limits the maximum number of connections that can remain open at any given time. As a result, you must close connections after their use so as not to exhaust this limit.

 1     private Connection getConnection(final BasicDataSource dataSource, ...)
               throws ValidateConnectionException, SQLException {
 2         boolean connectionAcquired = false;
 3         // Retrying three times to get the connection.
 4         for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
 5             Connection connection = dataSource.getConnection();
 6             // validateConnection may throw ValidateConnectionException
 7             if (! validateConnection(connection, ...)) {
 8                 // connection is invalid
 9                 DbUtils.closeQuietly(connection);
10             } else {
11                 // connection is established
12                 connectionAcquired = true;
13                 return connection;
14             }
15         }
16         return null;
17     }

At first glance, it seems that the method getConnection() doesn’t leak connection resources. If a valid connection is established in the connection pool (else branch on line 10 is taken), the method getConnection() returns it to the client for use (line 13). If the connection established is invalid (if branch on line 7 is taken), it’s closed in line 9 before another attempt is made to establish a connection.

However, method validateConnection() at line 7 can throw a ValidateConnectionException. If this exception is thrown after a connection is established at line 5, the connection is neither closed in this method nor is it returned upstream to the client to be closed later. Furthermore, if this exceptional code path runs frequently, for instance, if the validation logic throws on a specific recurring service request, each new request causes a connection to leak in the connection pool. Eventually, the client can’t acquire new connections to the data source, impacting the availability of the service.

A typical recommendation to prevent resource leak bugs is to declare the resource objects in a try-with-resources statement block. However, we can’t use try-with-resources to fix the preceding method because this method is required to return an open connection for use in the upstream client. The CodeGuru Reviewer recommendation for the preceding code snippet is as follows:

“Consider closing the following resource: connection. The resource is referenced at line 7. The resource is closed at line 9. The resource is returned at line 13. There are other execution paths that don’t close the resource or return it, for example, when validateConnection throws an exception. To prevent this resource leak, close connection along these other paths before you exit this method.”

As mentioned in the Reviewer recommendation, to prevent this resource leak, you must close the established connection when method validateConnection() throws an exception. This can be achieved by inserting the validation logic (lines 7–14) in a try block. In the finally block associated with this try, the connection must be closed by calling DbUtils.closeQuietly(connection) if connectionAcquired == false. The method getConnection() after this fix has been applied is as follows:

private Connection getConnection(final BasicDataSource dataSource, ...) 
        throws ValidateConnectionException, SQLException {
    boolean connectionAcquired = false;
    // Retrying three times to get the connection.
    for (int attempt = 0; attempt < CONNECTION_RETRIES; ++attempt) {
        Connection connection = dataSource.getConnection();
        try {
            // validateConnection may throw ValidateConnectionException
            if (! validateConnection(connection, ...)) {
                // connection is invalid
                DbUtils.closeQuietly(connection);
            } else {
                // connection is established
                connectionAcquired = true;
                return connection;
            }
        } finally {
            if (!connectionAcquired) {
                DbUtils.closeQuietly(connection);
            }
        }
    }
    return null;
}

As shown in this example, resource leaks in production services can be very disruptive. Furthermore, leaks that manifest along exceptional or less frequently run code paths can be hard to detect or replicate during testing and can remain dormant in the code for long periods of time before manifesting themselves in production environments. With the resource leak detector, you can detect such leaks on objects belonging to a large number of popular Java types such as file streams, database connections, network sockets, timers and metrics, etc.
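
For resources that don’t need to outlive the method that creates them, the try-with-resources construct mentioned earlier is the simplest way to guarantee release along every path, including exceptional ones. Here’s a brief, hypothetical sketch (illustrative only, not code from the post):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class TryWithResourcesExample {
    // BufferedReader is AutoCloseable, so try-with-resources closes it on every
    // exit path: whether readLine() returns normally or throws an IOException.
    static String readFirstLine(String path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            return reader.readLine();
        }
    }
}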

Combining static code analysis with machine learning for accurate resource leak detection

In this section, we dive deep into the inner workings of the resource leak detector. The resource leak detector in CodeGuru Reviewer uses static analysis algorithms and techniques. Static analysis algorithms perform code analysis without running the code. These algorithms are generally prone to high false positives (the tool might report correct code as having a bug). If the number of these false positives is high, it can lead to alarm fatigue and low adoption of the tool. As a result, the resource leak detector in CodeGuru Reviewer prioritizes precision over recall: the findings we surface are resource leaks with high accuracy, though CodeGuru Reviewer could potentially miss some resource leaks.

The main reason for false positives in static code analysis is incomplete information available to the analysis. CodeGuru Reviewer requires only the Java source files and doesn’t require all dependencies or the build artifacts. Not requiring the external dependencies or the build artifacts reduces the friction to perform automated code reviews. As a result, static analysis only has access to the code in the source repository and doesn’t have access to its external dependencies. The resource leak detector in CodeGuru Reviewer combines static code analysis with a machine learning (ML) model. This ML model is used to reason about external dependencies to provide accurate recommendations.

To understand the use of the ML model, consider again the code above for method getConnection() that had a resource leak. In the code snippet, a connection to the data source is established by calling BasicDataSource.getConnection() method, declared in the Apache Commons library. As mentioned earlier, we don’t require the source code of external dependencies like the Apache library for code analysis during pull requests. Without access to the code of external dependencies, a pure static analysis-driven technique doesn’t know whether the Connection object obtained at line 5 will leak, if not closed. Similarly, it doesn’t know that DbUtils.closeQuietly() is a library function that closes the connection argument passed to it at line 9. Our detector combines static code analysis with ML that learns patterns over such external function calls from a large number of available code repositories. As a result, our resource leak detector knows that the connection doesn’t leak along the following code path:

  • A connection is established on line 5
  • Method validateConnection() returns false at line 7
  • DbUtils.closeQuietly() is called on line 9

This suppresses the possible false warning. At the same time, the detector knows that there is a resource leak when the connection is established at line 5, and validateConnection() throws an exception at line 7 that isn’t caught.

When we run CodeGuru Reviewer on this code snippet, it surfaces only the second leak scenario and makes an appropriate recommendation to fix this bug.

The ML model used in the resource leak detector has been trained on a large number of internal Amazon and GitHub code repositories.

Responses to the resource leak findings

Although closing an open resource in code isn’t difficult, doing so properly along all program paths is important to prevent resource leaks. This can easily be overlooked, especially along exceptional or less frequently run paths. As a result, the resource leak detector in CodeGuru Reviewer finds such leaks at a relatively high frequency, and has alerted developers within Amazon to thousands of resource leaks before they reached production.

The resource leak detections have witnessed a high developer acceptance rate, and developer feedback towards the resource leak detector has been very positive. Some of the feedback from developers includes “Very cool, automated finding,” “Good bot :),” and “Oh man, this is cool.” Developers have also concurred that the findings are important and need to be fixed.

Conclusion

Resource leak bugs are difficult to detect or replicate during testing. They can impact the availability of production services. As a result, it’s important to automatically detect these bugs early on in the software development workflow, such as during pull requests or code scans. The resource leak detector in CodeGuru Reviewer combines static code analysis algorithms with ML to surface only the high confidence leaks. It has a high developer acceptance rate and has alerted developers within Amazon to thousands of leaks before those leaks hit production.

How to approach threat modeling

Post Syndicated from Darran Boyd original https://aws.amazon.com/blogs/security/how-to-approach-threat-modeling/

In this post, I’ll provide my tips on how to integrate threat modeling into your organization’s application development lifecycle. There are many great guides on how to perform the procedural parts of threat modeling, and I’ll briefly touch on these and their methodologies. However, the main aim of this post is to augment the existing guidance with some additional tips on how to handle the people and process components of your threat modeling approach, which in my experience go a long way toward improving the security outcomes, security ownership, speed to market, and general happiness of all involved. I’ll also provide some guidance specific to when you’re using Amazon Web Services (AWS).

Let’s start with a primer on threat modeling.

Why use threat modeling

IT systems are complex, and are becoming increasingly complex and capable over time, delivering more business value and increased customer satisfaction and engagement. This means that IT design decisions need to account for an ever-increasing number of use cases, and be made in a way that mitigates potential security threats that may lead to business-impacting outcomes, including unauthorized access to data, denial of service, and resource misuse.

This complexity and the number of use-case permutations typically make it ineffective to use unstructured approaches to find and mitigate threats. Instead, you need a systematic approach to enumerate the potential threats to the workload, devise mitigations, and prioritize those mitigations to make sure that the limited resources of your organization have the maximum impact in improving the overall security posture of the workload. Threat modeling is designed to provide this systematic approach, with the aim of finding and addressing issues early in the design process, when mitigations have a low relative cost compared to later in the lifecycle.

The AWS Well-Architected Framework calls out threat modeling as a specific best practice within the Security Pillar, under the area of foundational security, under the question SEC 1: How do you securely operate your workload?:

“Identify and prioritize risks using a threat model: Use a threat model to identify and maintain an up-to-date register of potential threats. Prioritize your threats and adapt your security controls to prevent, detect, and respond. Revisit and maintain this in the context of the evolving security landscape.”

Threat modeling is most effective when done at the workload (or workload feature) level, in order to ensure that all context is available for assessment. AWS Well-Architected defines a workload as:

“A set of components that together deliver business value. The workload is usually the level of detail that business and technology leaders communicate about. Examples of workloads are marketing websites, e-commerce websites, the back-ends for a mobile app, analytic platforms, etc. Workloads vary in levels of architectural complexity, from static websites to architectures with multiple data stores and many components.”

The core steps of threat modeling

In my experience, all threat modeling approaches are similar; at a high level, they follow these broad steps:

  1. Identify assets, actors, entry points, components, use cases, and trust levels, and include these in a design diagram.
  2. Identify a list of threats.
  3. Per threat, identify mitigations, which may include security control implementations.
  4. Create and review a risk matrix to determine if the threat is adequately mitigated.

To go deeper into the general practices associated with these steps, I would suggest that you read the SAFECode Tactical Threat Modeling whitepaper and the Open Web Application Security Project (OWASP) Threat Modeling Cheat Sheet. These guides are great resources for you to consider when adopting a particular approach. They also reference a number of tools and methodologies that are helpful to accelerate the threat modeling process, including creating threat model diagrams with the OWASP Threat Dragon project and determining possible threats with the OWASP Top 10, OWASP Application Security Verification Standard (ASVS) and STRIDE. You may choose to adopt some combination of these, or create your own.

When to do threat modeling

Threat modeling is a design-time activity. It’s typical that during the design phase you would go beyond creating a diagram of your architecture, and that you may also be building in a non-production environment—and these activities are performed to inform and develop your production design. Because threat modeling is a design-time activity, it occurs before code review, code analysis (static or dynamic), and penetration testing; these all come later in the security lifecycle.

Always consider potential threats when designing your workload from the earliest phases—typically when people are still on the whiteboard (whether physical or virtual). Threat modeling should be performed during the design phase of a given workload feature or feature change, as these changes may introduce new threats that require you to update your threat model.

Threat modeling tips

Ultimately, threat modeling requires thought, brainstorming, collaboration, and communication. The aim is to bridge the gap between application development, operations, business, and security. There is no shortcut to success. However, there are things I’ve observed that have meaningful impacts on the adoption and success of threat modeling—I’ll be covering these areas in the following sections.

1. Assemble the right team

Threat modeling is a “team sport,” because it requires the knowledge and skill set of a diverse team where all inputs can be viewed as equal in value. For all listed personas in this section, the suggested mindset is to start from your end-customers’ expectations, and work backwards. Think about what your customers expect from this workload or workload feature, both in terms of its security properties and maintaining a balance of functionality and usability.

I recommend that the following perspectives be covered by the team, noting that a single individual can bring more than one of these perspectives to the table:

The Business persona – First, to keep things grounded, you’ll want someone who represents the business outcomes of the workload or feature that is part of the threat modeling process. This person should have an intimate understanding of the functional and non-functional requirements of the workload—and their job is to make sure that these requirements aren’t unduly impacted by any proposed mitigations to address threats. This means that if a proposed security control (that is, a mitigation) renders an application requirement unusable or overly degraded, then further work is required to come to the right balance of security and functionality.

The Developer persona – This is someone who understands the current proposed design for the workload feature, and has had the most depth of involvement in the design decisions made to date. They were involved in design brainstorming or whiteboarding sessions leading up to this point, when they would typically have been thinking about threats to the design and possible mitigations to include. In cases where you aren’t developing your own in-house application (for example, COTS applications), you would bring in the internal application owner.

The Adversary persona – Next, you need someone to play the role of the adversary. The aim of this persona is to put themselves in the shoes of an attacker, and to critically review the workload design and look for ways to take advantage of a design flaw in the workload to achieve a particular objective (for example, unauthorized sharing of data). The “attacks” they perform are a mental exercise, not actual hands-on-keyboard exploitation. If your organization has a so-called Red Team, then they could be a great fit for this role; if not, you may want to have one or more members of your security operations or engineering team play this role. Alternatively, bring in a third party that specializes in this area.

The Defender persona – Then, you need someone to play the role of the defender. The aim of this persona is to see the possible “attacks” designed by the adversary persona as potential threats, and to devise security controls that mitigate the threats. This persona also evaluates whether the possible mitigations are reasonably manageable in terms of on-going operational support, monitoring, and incident response.

The AppSec SME persona – The Application Security (AppSec) subject matter expert (SME) persona should be the most familiar with the threat modeling process and discussion moderation methods, and should have a depth of IT security knowledge and experience. Discussion moderation is crucial to keep the objectives of the exercise on track and to maintain the appropriate balance between security and delivery of the customer outcome. Ultimately, it’s this persona who endorses the threat model and advises the scope of the actions beyond the threat modeling exercise—for example, penetration testing scope.

2. Have a consistent approach

In the earlier section, I listed some of the popular threat modeling approaches, and which one you select is not as important as using it consistently both within and across your teams.

By using a consistent approach and format, teams can move faster and estimate effort more accurately. Individuals can learn from examples, by looking at threat models developed by other team members or other teams—saving them from having to start from scratch.

When your team estimates the effort and time required to create a threat model, the experience and time taken from previous threat models can be used to provide more accurate estimations of delivery timelines.

Beyond using a consistent approach and format, consistency in the granularity and relevance of the threats being modeled is key. Later in this post I describe a recommendation for creating a catalog of threats for reuse across your organization.

Finally, and importantly, this approach allows for scalability: if a given workload feature that’s undergoing a threat modeling exercise is using components that have an existing threat model, then the threat model (or individual security controls) of those components can be reused. With this approach, you can effectively take a dependency on a component’s existing threat model, and build on that model, eliminating re-work.

3. Align to the software delivery methodology

Your application development teams already have a particular workflow and delivery style. These days, Agile-style delivery is most popular. Ensure that the approach you take for threat modeling integrates well with both your delivery methodology and your tools.

Just like for any other deliverable, capture the user stories related to threat modeling as part of the workload feature’s sprint, epic, or backlog.

4. Use existing workflow tooling

Your application development teams are already using a suite of tools to support their delivery methodology. This would typically include collaboration tools for documentation (for example, a team wiki), and an issue-tracking tool to track work products through the software development lifecycle. Aim to use these same tools as part of your security review and threat modeling workflow.

Existing workflow tools can provide a single place to provide and view feedback, assign actions, and view the overall status of the threat modeling deliverables of the workload feature. Being part of the workflow reduces the friction of getting the project done and allows threat modeling to become as commonplace as unit testing, QA testing, or other typical steps of the workflow.

By using typical workflow tools, team members working on creating and reviewing the threat model can work asynchronously. For example, when the threat model reviewer adds feedback, the author is notified, and then the author can address the feedback when they have time, without having to set aside dedicated time for a meeting. Also, this allows the AppSec SME to more effectively work across multiple threat model reviews that they may be engaged in.

Having a consistent approach and language as described earlier is an important prerequisite to make this asynchronous process feasible, so that each participant can read and understand the threat model without having to re-learn the correct interpretation each time.

5. Break the workload down into smaller parts

It’s advisable to decompose (break down) the workload into features and perform the threat modeling exercise at the feature level, rather than create a single threat model for an entire workload. This approach has a number of key benefits:

  1. Having smaller chunks of work allows more granular tracking of progress, which aligns well with development teams that are following Agile-style delivery, and gives leadership a constant view of progress.
  2. This approach tends to create threat models that are more detailed, which results in more findings being identified.
  3. Decomposing also opens up the opportunity for the threat model to be reused as a dependency for other workload features that use the same components.
  4. Because threat mitigations are considered for each component, a single threat may have multiple mitigations at the overall workload level, resulting in improved resilience against those threats.
  5. An issue with a single threat model, for example a critical threat that isn’t yet mitigated, doesn’t become launch blocking for the entire workload, but only for the individual feature.

The question then becomes, how far should you decompose the workload?

As a general rule, in order to create a threat model, the following context is required, at a minimum:

  • One asset. For example, credentials, customer records, and so on.
  • One entry point. For example, Amazon API Gateway REST API deployment.
  • Two components. For example, a web browser and an API Gateway REST API; or API Gateway and an AWS Lambda function.

Creating a threat model for a given AWS service (for example, API Gateway) in isolation wouldn’t fully meet these criteria—given that the service is a single component, there is no movement of the data from one component to another. Furthermore, the context of all the possible use cases of the service within a workload isn’t known, so you can’t comprehensively derive the threats and mitigations. AWS performs threat modeling of the multiple features that make up a given AWS service. Therefore, for your workload feature that leverages a given AWS service, you wouldn’t need to threat model the AWS service, but instead consider the various AWS service configuration options and your own workload-specific mitigations when you look to mitigate the threats you’ve identified. I go into more depth on this in the “Identify and evaluate mitigations” section, where I discuss the concept of baseline security controls.

6. Distribute ownership

Having a central person or department responsible for creation of threat models doesn’t work in the long run. These central entities become bottlenecks and can only scale up with additional head count. Furthermore, centralized ownership doesn’t empower those who are actually designing and implementing your workload features.

Instead, what scales well is distributed ownership of threat model creation by the team that is responsible for designing and implementing each workload feature. Distributed ownership scales and drives behavior change, because now the application teams are in control, and importantly they’re taking security learnings from the threat modeling process and putting those learnings into their next feature release, and therefore constantly improving the security of their workload and features.

This creates the opportunity for the AppSec SME (or team) to effectively play the moderator and security advisor role to all the various application teams in your organization. The AppSec SME will be in a position to drive consistency, adoption, and communication, and to set and raise the security bar among teams.

7. Identify entry points

When you look to identify entry points for AWS services that are components within your overall threat model, it’s important to understand that, depending on the type of AWS service, the entry points may vary based on the architecture of the workload feature included in the scope of the threat model.

For example, with Amazon Simple Storage Service (Amazon S3), the possible types of entry-points to an S3 bucket are limited to what is exposed through the Amazon S3 API, and the service doesn’t offer the capability for you, as a customer, to create additional types of entry points. In this Amazon S3 example, as a customer you make choices about how these existing types of endpoints are exposed—including whether the bucket is private or publicly accessible.

On the other end of the spectrum, Amazon Elastic Compute Cloud (Amazon EC2) allows customers to create additional types of entry-points to EC2 instances (for example, your application API), besides the entry-point types that are provided by the Amazon EC2 API and those native to the operating system running on the EC2 instance (for example, SSH or RDP).

Therefore, make sure that you’re applying the entry points that are specific to the workload feature, in addition to the native endpoints for AWS services, as part of your threat model.

8. Come up with threats

Your aim here is to try to come up with answers to the question “What can go wrong?” There isn’t a canonical list of all the possible threats, because determining threats depends on the context of the workload feature that’s under assessment, and on the types of threats that are unique to a given industry, geographical area, and so on.

Coming up with threats requires brainstorming. The brainstorming exercise can be facilitated by using a mnemonic like STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege), or by looking through threat lists like the OWASP Top 10 or HiTrust Threat Catalog to get the ideas flowing.

Through this process, it’s recommended that you develop and contribute to a threat catalog that is contextual to your organization and will accelerate the brainstorming process going forward, as well as drive consistency in the granularity of threats that you model.

9. Identify and evaluate mitigations

Here, your aim is to identify the mitigations (security controls) within the workload design and evaluate whether threats have been adequately addressed. Keep in mind that there are typically multiple layers of controls and multiple responsibilities at play.

For your own in-house applications and code, you would want to review the mitigations you’ve included in your design—including, but not limited to, input validation, authentication, session handling, and bounds handling.

Consider all other components of your workload (for example, software as a service (SaaS), infrastructure supporting your COTS applications, or components hosted within your on-premises data centers) and determine the security controls that are part of the workload design.

When you use AWS services, Security and Compliance is a shared responsibility between AWS and you as our customer. This is described on the AWS Shared Responsibility Model page.

This means, for the portions of the AWS services that you’re using that are the responsibility of AWS (Security of the Cloud), the security controls are managed by AWS, along with threat identification and mitigation. The distribution of responsibility between AWS (Security of the Cloud) and you (Security in the Cloud) depends on which AWS service you use. Below, I provide examples of infrastructure, container, and abstracted AWS services to show how your responsibility for identifying and mitigating threats can vary:

  • Amazon EC2 is a good example of an infrastructure service, where you are able to access a virtual server in the cloud, you get to choose the operating system, and you have control of the service and all aspects you run on it—so you would be responsible for mitigating the identified threats.
  • Amazon Relational Database Service (Amazon RDS) is a representative example of a container service, where there is no operating system exposed for you, and instead AWS exposes the selected database engine to you (for example, MySQL). AWS is responsible for the security of the operating system in this example, and you don’t need to devise mitigations. However, the database engine is under your control as well as all aspects above it, so you would need to consider mitigations for these areas. Here, AWS is taking on a larger portion of the responsibility compared to infrastructure services.
  • Amazon S3, AWS Key Management Service (AWS KMS), and Amazon DynamoDB are examples of abstracted services, where AWS exposes the entire service control plane and data plane to you through the service API. Again, here there are no operating systems, database engines, or platforms exposed to you—these are an AWS responsibility. However, the API actions and associated policies are under your control, as are all aspects above the API level, so you should be considering mitigations for these. For this type of service, AWS takes a larger portion of responsibility compared to container and infrastructure types of services.

While these examples do not encompass all types of AWS services that may be in your workload, they demonstrate how your Security and Compliance responsibilities under the Shared Responsibility Model will vary in this context. Understanding the balance of responsibilities between AWS and yourself for the types of AWS services in your workload helps you scope your threat modeling exercise to the mitigations that are under your control, which are typically a combination of AWS service configuration options and your own workload-specific mitigations. For the AWS portion of the responsibility, you will find that AWS services are in-scope of many compliance programs, and the audit reports are available for download for AWS customers (at no cost) from AWS Artifact.

Regardless of which AWS services you’re using, there’s always an element of customer responsibility, and this should be included in your workload threat model.

Specifically, for security control mitigations for the AWS services themselves, you’d want to consider security controls across domains, including Identity and Access Management (Authentication/Authorization), Data Protection (At-Rest, In-Transit), Infrastructure Security, and Logging and Monitoring. AWS services each have a dedicated security chapter in the documentation, which provides guidance on the security controls to consider as mitigations. When capturing these security controls and mitigations in your threat model, you should aim to include references to the actual code, IAM policies, AWS CloudFormation templates, and other artifacts located in the workload’s infrastructure-as-code repository. This helps the reviewer or approver of your threat model to get an unambiguous view of the intended mitigation.

As in the case for threat identification, there’s no canonical list enumerating all the possible security controls. Through the process described here, you should consciously develop baseline security controls that align to your organization’s control objectives, and where possible, implement these baseline security controls as platform-level controls, including AWS service-level configurations (for example, encryption at rest) or guardrails (for example, through service control policies). By doing this, you can drive consistency and scale, so that these implemented baseline security controls are automatically inherited and enforced for other workload features that you design and deploy.

When you come up with the baseline security controls, it’s important to note that the context of a given workload feature isn’t known. Therefore, it’s advisable to consider these controls as a negotiable baseline that you can deviate from, provided that when you perform the workload threat modeling exercise, you find that the threat that the baseline control was designed to mitigate isn’t applicable, or there are other mitigations or compensating controls that adequately mitigate the threat. Compensating controls and mitigating factors could include: reduced data asset classification, non-human access, or ephemeral data/workload.

To learn more about how to start thinking about baseline security controls as part of your overall cloud security governance, have a look at the How to think about cloud security governance blog post.

10. Decide when enough is enough

There’s no perfect answer to this question. However, it’s important to have a risk-based perspective on the threat modeling process to create a balanced approach, so that the likelihood and impact of a risk are appropriately considered. Over-emphasis on “let’s build and ship it” could lead to significant costs and delays later. Conversely, over-emphasis on “let’s mitigate every conceivable threat” could lead to the workload feature shipping late (or never), and your customers might move on. In the recommendation I made earlier in the “Assemble the right team” section, the selection of personas is deliberate to make sure that there’s a natural tension between shipping the feature, and mitigating threats. Embrace this healthy tension.

11. Don’t let paralysis stop you before you start

Earlier in the “Break the workload down into smaller parts” section, I gave the recommendation that you should scope your threat models down to a workload feature. You may be thinking to yourself, “We’ve already shipped <X number> of features, how do we threat model those?” This is a completely reasonable question.

My view is that rather than go back to threat model features that are already live, aim to threat model any new features that you are working on now and improve the security properties of the code you ship next, and for each feature you ship after that. During this process you, your team, and your organization will learn—not just about threat modeling—but how to communicate effectively with one another. Make adjustments, iterate, improve. Sometime in the future, when you’re routinely providing high quality, consistent and reusable threat models for your new features, you can start putting activities to perform threat modeling for existing features into your backlog.

Conclusion

Threat modeling is an investment—in my view, it’s a good one, because finding and mitigating threats in the design phase of your workload feature can reduce the relative cost of mitigation, compared to finding the threats later. Consistently implementing threat modeling will likely also improve your security posture over time.

I’ve shared my observations and tips for practical ways to incorporate threat modeling into your organization, which center around communication, collaboration, and human-led expertise to find and address threats that your end customer expects. Armed with these tips, I encourage you to look across the workload features you’re working on now (or have in your backlog) and decide which ones will be the first you’ll threat model.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Darran Boyd

Darran is a Principal Security Solutions Architect at AWS, responsible for helping customers make good security choices and accelerating their journey to the AWS Cloud. Darran’s focus and passion is to deliver strategic security initiatives that unlock and enable our customers at scale.

Masking field values with Amazon Elasticsearch Service

Post Syndicated from Prashant Agrawal original https://aws.amazon.com/blogs/security/masking-field-values-with-amazon-elasticsearch-service/

Amazon Elasticsearch Service (Amazon ES) is a fully managed service that you can use to deploy, secure, and run Elasticsearch cost-effectively at scale. The service provides support for open-source Elasticsearch APIs, managed Kibana, and integration with Logstash and other AWS services. Amazon ES provides a deep security model that spans many layers of interaction and supports fine-grained access control at the cluster, index, document, and field level, on a per-user basis. The service’s security plugin integrates with federated identity providers for Kibana login.

A common use case for Amazon ES is log analytics. Customers configure their applications to store log data to the Elasticsearch cluster, where the data can be queried for insights into the functionality and use of the applications over time. In many cases, users reviewing those insights should not have access to all the details from the log data. The log data for a web application, for example, might include the source IP addresses of incoming requests. Privacy rules in many countries require that those details be masked, wholly or in part. This post explains how to set up field masking within your Amazon ES domain.

Field masking is an alternative to field-level security that lets you anonymize the data in a field rather than remove it altogether. When you create a role, you add a list of fields to mask. Field masking affects whether you can see the contents of a field when you search. You can use field masking to either apply a random hash or perform a pattern-based substitution, so that sensitive information is hidden from users who shouldn’t have access to it.

When you use field masking, Amazon ES creates a hash of the actual field values before returning the search results. You can apply field masking on a per-role basis, supporting different levels of visibility depending on the identity of the user making the query. Currently, field masking is only available for string-based fields. A search result with a masked field (clientIP) looks like this:

{
  "_index": "web_logs",
  "_type": "_doc",
  "_id": "1",
  "_score": 1,
  "_source": {
    "agent": "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1",
    "bytes": 0,
    "clientIP": "7e4df8d4df7086ee9c05efe1e21cce8ff017a711ee9addf1155608ca45d38219",
    "host": "www.example.com",
    "extension": "txt",
    "geo": {
      "src": "EG",
      "dest": "CN",
      "coordinates": {
        "lat": 35.98531194,
        "lon": -85.80931806
      }
    },
    "machine": {
      "ram": 17179869184,
      "os": "win 7"
    }
  }
}

To follow along in this post, make sure you have an Amazon ES domain with Elasticsearch version 6.7 or higher, sample data loaded (this example uses the web logs data supplied by Kibana), and access to Kibana through a role with administrator privileges for the domain.

Configure field masking

Field masking is managed by defining specific access controls within the Kibana visualization system. You’ll need to create a new Kibana role, define the fine-grained access-control privileges for that role, specify which fields to mask, and apply that role to specific users.

You can use either the Kibana console or direct-to-API calls to set up field masking. In our first example, we’ll use the Kibana console.

To configure field masking in the Kibana console

  1. Log in to Kibana, choose the Security pane, and then choose Roles, as shown in Figure 1.

    Figure 1: Choose security roles

  2. Choose the plus sign (+) to create a new role, as shown in Figure 2.

    Figure 2: Create role

  3. Choose the Index Permissions tab, and then choose Add index permissions, as shown in Figure 3.

    Figure 3: Set index permissions

  4. Add index patterns and appropriate permissions for data access. See the Amazon ES documentation for details on configuring fine-grained access control.
  5. Once you’ve set Index Patterns, Permissions: Action Groups, Document Level Security Query, and Include or exclude fields, you can use the Anonymize fields entry to mask the clientIP, as shown in Figure 4.

    Figure 4: Anonymize field

  6. Choose Save Role Definition.
  7. Next, you need to create one or more users and apply the role to the new users. Go back to the Security page and choose Internal User Database, as shown in Figure 5.

    Figure 5: Select Internal User Database

  8. Choose the plus sign (+) to create a new user, as shown in Figure 6.

    Figure 6: Create user

  9. Add a username and password, and under Open Distro Security Roles, select the role es-mask-role, as shown in Figure 7.

    Figure 7: Select the username, password, and roles

  10. Choose Submit.

If you prefer, you can perform the same task by using the Amazon ES REST API through Kibana dev tools.

Use the following API call to create a role, as shown in the snippet below and in Figure 8.

PUT _opendistro/_security/api/roles/es-mask-role
{
  "cluster_permissions": [],
  "index_permissions": [
    {
      "index_patterns": [
        "web_logs"
      ],
      "dls": "",
      "fls": [],
      "masked_fields": [
        "clientIP"
      ],
      "allowed_actions": [
        "data_access"
      ]
    }
  ]
}

Sample response:

{
  "status": "CREATED",
  "message": "'es-mask-role' created."
}

Figure 8: API to create Role

Use the following API call to create a user with that role, as shown in the snippet below and in Figure 9.

PUT _opendistro/_security/api/internalusers/es-mask-user
{
  "password": "xxxxxxxxxxx",
  "opendistro_security_roles": [
    "es-mask-role"
  ]
}

Sample response:

{
  "status": "CREATED",
  "message": "'es-mask-user' created."
}

Figure 9: API to create User

Verify field masking

You can verify field masking by running a simple search query using Kibana dev tools (GET web_logs/_search) and retrieving the data first by using the kibana_user (with no field masking), and then by using the es-mask-user (with field masking) you just created.

Query responses run by the kibana_user (all access) have the original values in all fields, as shown in Figure 10.

Figure 10: Retrieval of the full clientIP data with kibana_user

Figure 11, following, shows an example of what you would see if you logged in as the es-mask-user. In this case, the clientIP field is masked because of the es-mask-role you created.

Figure 11: Retrieval of the masked clientIP data with es-mask-user

Use pattern-based field masking

Rather than creating a hash, you can use one or more regular expressions and replacement strings to mask a field. The syntax is <field>::/<regular-expression>/::<replacement-string>.

You can use either the Kibana console or direct-to-API calls to set up pattern-based field masking. In the following example, clientIP is masked in such a way that the last three parts of the IP address are replaced by xxx, using the pattern clientIP::/[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$/::xxx.xxx.xxx. You see only the first part of the IP address, as shown in Figure 12.
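
If you prefer the API route, the same pattern-based mask can be expressed in the role definition by putting the <field>::/<regular-expression>/::<replacement-string> string into masked_fields. The following sketch updates the es-mask-role created earlier; it’s shown only to illustrate the syntax, and assumes the same index permissions as before.

PUT _opendistro/_security/api/roles/es-mask-role
{
  "cluster_permissions": [],
  "index_permissions": [
    {
      "index_patterns": [
        "web_logs"
      ],
      "dls": "",
      "fls": [],
      "masked_fields": [
        "clientIP::/[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$/::xxx.xxx.xxx"
      ],
      "allowed_actions": [
        "data_access"
      ]
    }
  ]
}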

Figure 12: Anonymize the field with a pattern

Run the search query to verify that the last three parts of clientIP are masked by custom characters and only the first part is shown to the requester, as shown in Figure 13.

Figure 13: Retrieval of the masked clientIP (according to the defined pattern) with es-mask-user

Conclusion

Field-level security should be the primary approach for ensuring data access security. However, if specific business requirements can’t be met with that approach, field masking may offer a viable alternative. By using field masking, you can selectively control whether your users can see private information such as personally identifiable information (PII) or personal health information (PHI). For more information about fine-grained access control, see the Amazon Elasticsearch Service Developer Guide.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon Elasticsearch Service forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Prashant Agrawal

Prashant is a Search Specialist Solutions Architect with Amazon Elasticsearch Service. He works closely with team members to help customers migrate their workloads to the cloud. Before joining AWS, he helped various customers use Elasticsearch for their search and analytics use cases.

Use AWS Secrets Manager to simplify the management of private certificates

Post Syndicated from Maitreya Ranganath original https://aws.amazon.com/blogs/security/use-aws-secrets-manager-to-simplify-the-management-of-private-certificates/

AWS Certificate Manager (ACM) lets you easily provision, manage, and deploy public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with Amazon Web Services (AWS) services and your internal connected resources. For private certificates, AWS Certificate Manager Private Certificate Authority (ACM PCA) can be used to create private CA hierarchies, including root and subordinate CAs, without the investment and maintenance costs of operating an on-premises CA. With these CAs, you can issue custom end-entity certificates or use the ACM defaults.

When you manage the lifecycle of certificates, it’s important to follow best practices. You can think of a certificate as an identity of a service you’re connecting to. You have to ensure that these identities are secure and up to date, ideally with the least amount of manual intervention. AWS Secrets Manager provides a mechanism for managing certificates, and other secrets, at scale. Specifically, you can configure secrets to automatically rotate on a scheduled basis by using pre-built or custom AWS Lambda functions, encrypt them by using AWS Key Management Service (AWS KMS) keys, and automatically retrieve or distribute them for use in applications and services across an AWS environment. This reduces the overhead of manually managing the deployment, creation, and secure storage of these certificates.
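
As a simple illustration of the retrieval side, an application or bootstrap script can pull a certificate secret at run time. The following is a minimal sketch using the AWS SDK for Java v2 (the automation in this post uses Python and Boto3; the secret name here is hypothetical, and the structure of the secret string depends on how the rotation function stores the certificate and private key):

import software.amazon.awssdk.services.secretsmanager.SecretsManagerClient;
import software.amazon.awssdk.services.secretsmanager.model.GetSecretValueRequest;
import software.amazon.awssdk.services.secretsmanager.model.GetSecretValueResponse;

public class RetrieveCertificateSecret {
    public static void main(String[] args) {
        String secretName = "my-private-certificate"; // hypothetical secret name

        try (SecretsManagerClient client = SecretsManagerClient.create()) {
            GetSecretValueResponse response = client.getSecretValue(
                    GetSecretValueRequest.builder().secretId(secretName).build());

            // The secret string holds the certificate and private key material;
            // parse it with your preferred JSON library before installing it.
            System.out.println(response.secretString());
        }
    }
}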

In this post, you’ll learn how to use Secrets Manager to manage and distribute certificates created by ACM PCA across AWS Regions and accounts.

We present two use cases in this blog post to demonstrate the difference between issuing private certificates with ACM and with ACM PCA. For the first use case, you will create a certificate by using the ACM defaults for private certificates. You will then deploy the ACM default certificate to an Amazon Elastic Compute Cloud (Amazon EC2) instance that is launched in the same account as the secret and private CA. In the second scenario, you will create a custom certificate by using ACM PCA templates and parameters. This custom certificate will be deployed to an EC2 instance in a different account to demonstrate cross-account sharing of secrets.

Solution overview

Figure 1 shows the architecture of our solution.

Figure 1: Solution architecture

This architecture includes resources that you will create during the blog walkthrough and by using AWS CloudFormation templates. This architecture outlines how these services can be used in a multi-account environment. As shown in the diagram:

  1. You create a certificate authority (CA) in ACM PCA to generate end-entity certificates.
  2. In the account where the issuing CA was created, you create secrets in Secrets Manager.
    1. There are several required parameters that you must provide when creating secrets, based on whether you want to create an ACM or ACM PCA issued certificate. These parameters will be passed to our Lambda function to make sure that the certificate is generated and stored properly.
    2. The Lambda rotation function created by the CloudFormation template is attached when configuring secrets rotation. Initially, the function generates two Privacy-Enhanced Mail (PEM) encoded files containing the certificate and private key, based on the provided parameters, and stores those in the secret. Subsequent calls to the function are made when the secret needs to be rotated, and then the function stores the resulting Certificate PEM and Private Key PEM in the desired secret. The function is written using Python, the AWS SDK for Python (Boto3), and OpenSSL. The flow of the function follows the requirements for rotating secrets in Secrets Manager.
  3. The first CloudFormation template creates a Systems Manager Run Command document that can be invoked to install the certificate and private key from the secret on an Apache Server running on EC2 in Account A.
  4. The second CloudFormation template deploys the same Run Command document and EC2 environment in Account B.
    1. To make sure that the account has the ability to pull down the certificate and private key from Secrets Manager, you need to update the key policy in Account A to give Account B access to decrypt the secret.
    2. You also need to add a resource-based policy to the secret that gives Account B access to retrieve the secret from Account A.
    3. Once the proper access is set up in Account A, you can use the Run Command document to install the certificate and private key on the Apache Server.

In a multi-account scenario, it’s common to have a central or shared AWS account that owns the ACM PCA resource, while workloads that are deployed in other AWS accounts use certificates issued by the ACM PCA. This can be achieved in two ways:

  1. Secrets in Secrets Manager can be shared with other AWS accounts by using resource-based policies. Once shared, the secrets can be deployed to resources, such as EC2 instances.
  2. You can share the central ACM PCA with other AWS accounts by using AWS Resource Access Manager or ACM PCA resource-based policies. These two options allow the receiving AWS account or accounts to issue private certificates by using the shared ACM PCA. These issued certificates can then use Secrets Manager to manage the secret in the child account and leverage features like rotation.

We will focus on the first case for sharing secrets.

Solution cost

The cost for running this solution comes from the following services:

  • AWS Certificate Manager Private Certificate Authority (ACM PCA)
    Based on the pricing page for ACM PCA, this solution incurs a prorated monthly charge of $400 for each CA that is created. A CA can be deleted the same day it’s created, leading to a charge of around $13/day (400 * 12 / 365.25). In addition, there is a cost for issuing certificates using ACM PCA. For the first 1,000 certificates, this cost is $0.75 per certificate. For this demonstration, you only need two certificates, resulting in a total charge of $1.50 for issuing certificates using ACM PCA. In all, the use of ACM PCA in this solution results in a charge of $14.50.
  • Amazon EC2
    The CloudFormation templates create t2.micro instances that cost $0.0116/hour, if they’re not eligible for Free Tier.
  • Secrets Manager
    There is a 30-day free trial for Secrets Manager, which is initiated when the first secret is created. After the free trial has completed, it costs $0.40 per secret stored per month. You will use two secrets for this solution and can schedule these for deletion after seven days, resulting in a prorated charge of $0.20.
  • Lambda
    Lambda has a free usage tier that allows for 1 million free requests per month and 400,000 GB-seconds of compute time per month. This fits within the usage for this blog, making the cost $0.
  • AWS KMS
    A single key created by one of the CloudFormation templates costs $1/month. The first 20,000 requests to AWS KMS are free, which fits within the usage of the test environment. In a production scenario, AWS KMS would charge $0.03 per 10,000 requests involving this key.

There are no charges for Systems Manager Run Command.

See the “Clean up resources” section of this blog post to get information on how to delete the resources that you create for this environment.

Deploy the solution

Now we’ll walk through the steps to deploy the solution. The CloudFormation templates and Lambda function code can be found in the AWS GitHub repository.

Create a CA to issue certificates

First, you’ll create an ACM PCA to issue private certificates. A common practice we see with customers is using a subordinate CA in AWS that is used to issue end-entity certificates for applications and workloads in the cloud. This subordinate can either point to a root CA in ACM PCA that is maintained by a central team, or to an existing on-premises public key infrastructure (PKI). There are some considerations when creating a CA hierarchy in ACM.

For demonstration purposes, you need to create a CA that can issue end-entity certificates. If you have an existing PKI that you want to use, you can create a subordinate CA that is signed by an external CA that can issue certificates. Otherwise, you can create a root CA and begin building a PKI on AWS. During creation of the CA, make sure that ACM has permissions to automatically renew certificates, because this feature will be used in later steps.

You should have one or more private CAs in the ACM console, as shown in Figure 2.

Figure 2: A private CA in the ACM PCA console

You will use two CloudFormation templates for this architecture. The first is launched in the same account where your private CA lives, and the second is launched in a different account. The first template generates the following: a Lambda function used for Secrets Manager rotation, an AWS KMS key to encrypt secrets, and a Systems Manager Run Command document to install the certificate on an Apache Server running on EC2 in Amazon Virtual Private Cloud (Amazon VPC). The second template launches the same Systems Manager Run Command document and EC2 environment.

To deploy the resources for the first template, select the following Launch Stack button. Make sure you’re in the N. Virginia (us-east-1) Region.

Select the Launch Stack button to launch the template

The template takes a few minutes to launch.

Use case #1: Create and deploy an ACM certificate

For the first use case, you’ll create a certificate by using the ACM defaults for private certificates, and then deploy it.

Create a Secrets Manager secret

To begin, create your first secret in Secrets Manager. You will create these secrets in the console to see how the service can be set up and used, but all these actions can be done through the AWS Command Line Interface (AWS CLI) or AWS SDKs.

To create a secret

  1. Navigate to the Secrets Manager console.
  2. Choose Store a new secret.
  3. For the secret type, select Other type of secrets.
  4. The Lambda rotation function has a set of required parameters in the secret, depending on what kind of certificate needs to be generated. For this first secret, you're going to create an ACM_ISSUED certificate. Provide the following parameters.

     Key                 Value
     CERTIFICATE_TYPE    ACM_ISSUED
     CA_ARN              The Amazon Resource Name (ARN) of your certificate-issuing CA in ACM PCA
     COMMON_NAME         The end-entity name for your certificate (for example, server1.example)
     ENVIRONMENT         TEST (You need this later on to test the renewal of certificates. If using this outside of the blog walkthrough, set it to something like DEV or PROD.)
  5. For Encryption key, select CAKey, and then choose Next.
  6. Give the secret a name and optionally add tags or a description. Choose Next.
  7. Select Enable automatic rotation and choose the Lambda function that starts with <CloudFormation Stack Name>-SecretsRotateFunction. Because you're creating an ACM-issued certificate, the rotation will be handled 60 days before the certificate expires. The certificate validity is 365 days, so any rotation interval higher than 305 days will work. Choose Next.
  8. Review the configuration, and then choose Store.
  9. This will take you back to a list of your secrets, and you will see your new secret, as shown in Figure 3. Select the new secret.

    Figure 3: The new secret in the Secrets Manager console

  10. Choose Retrieve secret value to confirm that CERTIFICATE_PEM, PRIVATE_KEY_PEM, CERTIFICATE_CHAIN_PEM, and CERTIFICATE_ARN are set in the secret value.

You now have an ACM-issued certificate that can be deployed to an end entity.
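
As noted at the start of this section, you can also create and configure this secret programmatically instead of in the console. The following is a minimal sketch using the AWS SDK for Python (Boto3); the secret name, CA ARN, KMS key alias, and rotation Lambda ARN are placeholder values that you would replace with your own.

    import json
    import boto3

    secretsmanager = boto3.client("secretsmanager", region_name="us-east-1")

    # Placeholder values -- replace with your own CA ARN and common name.
    secret_value = {
        "CERTIFICATE_TYPE": "ACM_ISSUED",
        "CA_ARN": "arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/EXAMPLE",
        "COMMON_NAME": "server1.example",
        "ENVIRONMENT": "TEST",
    }

    # Create the secret, encrypted with the KMS key from the CloudFormation template
    # (the alias shown here is a placeholder).
    secret = secretsmanager.create_secret(
        Name="acm-issued-certificate",
        KmsKeyId="alias/CAKey",
        SecretString=json.dumps(secret_value),
    )

    # Enable rotation with the rotation function deployed by the template; 306 days
    # keeps the schedule above the 305-day threshold described in step 7.
    secretsmanager.rotate_secret(
        SecretId=secret["ARN"],
        RotationLambdaARN="arn:aws:lambda:us-east-1:111122223333:function:EXAMPLE-SecretsRotateFunction",
        RotationRules={"AutomaticallyAfterDays": 306},
    )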

Deploy to an end entity

For testing purposes, you will now deploy the certificate that you just created to an Apache Server.

To deploy the certificate to the Apache Server

  1. In a new tab, navigate to the Systems Manager console.
  2. Choose Documents at the bottom left, and then choose the Owned by me tab.
  3. Choose RunUpdateTLS.
  4. Choose Run command at the top right.
  5. Copy and paste the secret ARN from Secrets Manager and make sure there are no leading or trailing spaces.
  6. Select Choose instances manually, and then choose ApacheServer.
  7. Select CloudWatch output to track progress.
  8. Choose Run.

The certificate and private key are now installed on the server, and it has been restarted.
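
In a production setup you would typically script this deployment rather than click through the console. The following Boto3 sketch sends the same Run Command document; the instance ID and secret ARN are placeholders, and the parameter name that RunUpdateTLS expects for the secret ARN is an assumption here, so check the document's Parameters section in the Systems Manager console for the actual name.

    import boto3

    ssm = boto3.client("ssm", region_name="us-east-1")

    response = ssm.send_command(
        DocumentName="RunUpdateTLS",
        InstanceIds=["i-0123456789abcdef0"],  # placeholder: your ApacheServer instance ID
        # The parameter key below is hypothetical -- match it to the parameter name
        # defined in the RunUpdateTLS document.
        Parameters={"SecretArn": ["arn:aws:secretsmanager:us-east-1:111122223333:secret:EXAMPLE"]},
        CloudWatchOutputConfig={"CloudWatchOutputEnabled": True},
    )
    print(response["Command"]["CommandId"])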

To verify that the certificate was installed

  1. Navigate to the EC2 console.
  2. In the dashboard, choose Running Instances.
  3. Select ApacheServer, and choose Connect.
  4. Select Session Manager, and choose Connect.
  5. When you’re logged in to the instance, enter the following command.
    openssl s_client -connect localhost:443 | openssl x509 -text -noout
    

    This will display the certificate that the server is using, along with other metadata like the certificate chain and validity period. For the validity period, note the Not Before and Not After dates and times, as shown in Figure 4.

    Figure 4: Server certificate

Now, test the rotation of the certificate manually. In a production scenario, this process would be automated by using maintenance windows. Maintenance windows allow for the least amount of disruption to the applications that are using certificates, because you can determine when the server will update its certificate.

To test the rotation of the certificate

  1. Navigate back to your secret in Secrets Manager.
  2. Choose Rotate secret immediately. Because you set the ENVIRONMENT key to TEST in the secret, this rotation will renew the certificate. When the key isn’t set to TEST, the rotation function pulls down the renewed certificate based on its rotation schedule, because ACM is managing the renewal for you. In a couple of minutes, you’ll receive an email from ACM stating that your certificate was rotated.
  3. Pull the renewed certificate down to the server, following the same steps that you used to deploy the certificate to the Apache Server.
  4. Follow the steps that you used to verify that the certificate was installed to make sure that the validity date and time has changed.

Use case #2: Create and deploy an ACM PCA certificate by using custom templates

Next, use the second CloudFormation template to create a certificate, issued by ACM PCA, which will be deployed to an Apache Server in a different account. Sign in to your other account and select the following Launch Stack button to launch the CloudFormation template.

Select the Launch Stack button to launch the template

This creates the same Run Command document you used previously, as well as the EC2 and Amazon VPC environment running an Apache Server. This template takes in a parameter for the KMS key ARN; this can be found in the first template's output section, shown in Figure 5.

Figure 5: CloudFormation outputs

While that’s completing, sign in to your original account so that you can create the new secret.

To create the new secret

  1. Follow the same steps you used to create a secret, but change the secret values passed in to the following.

     Key                 Value
     CA_ARN              The ARN of your certificate-issuing CA in ACM PCA
     COMMON_NAME         You can use any name you want, such as server2.example
     TEMPLATE_ARN        For testing purposes, use arn:aws:acm-pca:::template/EndEntityCertificate/V1
                         (This template ARN determines what type of certificate is being created and your desired path length. For more information, see Understanding Certificate Templates.)
     KEY_ALGORITHM       TYPE_RSA (you can also use TYPE_DSA)
     KEY_SIZE            2048 (you can also use 1024 or 4096)
     SIGNING_HASH        sha256 (you can also use sha384 or sha512)
     SIGNING_ALGORITHM   RSA (you can also use ECDSA if the key type for your issuing CA is set to ECDSA P256 or ECDSA P384)
     CERTIFICATE_TYPE    ACM_PCA_ISSUED
  2. Add the following resource policy during the name and description step. This gives your other account access to pull this secret down to install the certificate on its Apache Server.
    {
      "Version" : "2012-10-17",
      "Statement" : [ {
        "Effect" : "Allow",
        "Principal" : {
          "AWS" : "<ARN in output of second CloudFormation Template>"
        },
        "Action" : "secretsmanager:GetSecretValue",
        "Resource" : "*"
      } ]
    }
    

  3. Finish creating the secret.
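
If you script this step, the resource policy from step 2 can be attached with the PutResourcePolicy API. Here is a minimal Boto3 sketch; the secret name and the role ARN are placeholders.

    import json
    import boto3

    secretsmanager = boto3.client("secretsmanager", region_name="us-east-1")

    resource_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Placeholder: the ApacheServerInstanceRole ARN from the second template's outputs.
            "Principal": {"AWS": "arn:aws:iam::444455556666:role/ApacheServerInstanceRole"},
            "Action": "secretsmanager:GetSecretValue",
            "Resource": "*",
        }],
    }

    secretsmanager.put_resource_policy(
        SecretId="acm-pca-issued-certificate",  # placeholder secret name
        ResourcePolicy=json.dumps(resource_policy),
    )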

After the secret has been created, the last thing you need to do is add permissions to the KMS key policy so that your other account can decrypt the secret when installing the certificate on your server.

To add AWS KMS permissions

  1. Navigate to the AWS KMS console, and choose CAKey.
  2. Next to the key policy name, choose Edit.
  3. For the Statement ID (SID) Allow use of the key, add the ARN of the EC2 instance role in the other account. This can be found in the CloudFormation templates as an output called ApacheServerInstanceRole, as shown in Figure 5. The Statement should look something like this:
    {
                "Sid": "Allow use of the key",
                "Effect": "Allow",
                "Principal": {
                    "AWS": [
                        "arn:aws:iam::<AccountID with CA>:role/<Apache Server Instance Role>",
                        "arn:aws:iam:<Second AccountID>:role/<Apache Server Instance Role>"
                    ]
                },
                "Action": [
                    "kms:Encrypt",
                    "kms:Decrypt",
                    "kms:ReEncrypt*",
                    "kms:GenerateDataKey*",
                    "kms:DescribeKey"
                ],
                "Resource": "*"
    }
    

Your second account now has permissions to pull down the secret and certificate to the Apache Server. Follow the same steps described in the earlier section, “Deploy to an end entity.” Test rotating the secret the same way, and make sure the validity period has changed. You may notice that you didn’t get an email notifying you of renewal. This is because the certificate isn’t issued by ACM.
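
If you manage key policies in code rather than in the console, the equivalent call is PutKeyPolicy. The following is a sketch only; the key ID and role ARN are placeholders, and it reads the existing policy, appends the second account's instance role to the statement named in the steps above, and writes the merged policy back.

    import json
    import boto3

    kms = boto3.client("kms", region_name="us-east-1")
    key_id = "1234abcd-12ab-34cd-56ef-1234567890ab"  # placeholder: the CAKey key ID

    # Read the current policy, add the second account's instance role to the
    # "Allow use of the key" statement, and write the policy back.
    policy = json.loads(kms.get_key_policy(KeyId=key_id, PolicyName="default")["Policy"])
    for statement in policy["Statement"]:
        if statement.get("Sid") == "Allow use of the key":
            principals = statement["Principal"]["AWS"]
            if isinstance(principals, str):
                principals = [principals]
            # Placeholder: the ApacheServerInstanceRole ARN from the second account.
            principals.append("arn:aws:iam::444455556666:role/ApacheServerInstanceRole")
            statement["Principal"]["AWS"] = principals

    kms.put_key_policy(KeyId=key_id, PolicyName="default", Policy=json.dumps(policy))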

In this demonstration, you may have noticed you didn’t create resources that pull down the secret in different Regions, just in different accounts. If you want to deploy certificates in different Regions from the one where you create the secret, the process is exactly the same as what we described here. You don’t need to do anything else to accomplish provisioning and deploying in different Regions.

Clean up resources

Finally, delete the resources you created in the earlier steps, in order to avoid additional charges described in the section, “Solution cost.”

To delete all the resources created:

  1. Navigate to the CloudFormation console in both accounts, and select the stack that you created.
  2. Choose Actions, and then choose Delete Stack. This will take a few minutes to complete.
  3. Navigate to the Secrets Manager console in the CA account, and select the secrets you created.
  4. Choose Actions, and then choose Delete secret. The secret isn't deleted immediately, because you need to set a waiting period that allows the secret to be restored, if needed. The minimum waiting period is 7 days.
  5. Navigate to the Certificate Manager console in the CA account.
  6. Select the certificates that were created as part of this blog walkthrough, choose Actions, and then choose Delete.
  7. Choose Private CAs.
  8. Select the subordinate CA you created at the beginning of this process, choose Actions, and then choose Disable.
  9. After the CA is disabled, choose Actions, and then Delete. Similar to the secrets, this doesn’t automatically delete the CA but marks it for deletion, and the CA can be recovered during the specified period. The minimum waiting period is also 7 days.

Conclusion

In this blog post, we demonstrated how you could use Secrets Manager to rotate, store, and distribute private certificates issued by ACM and ACM PCA to end entities. Secrets Manager uses AWS KMS to secure these secrets during storage and delivery. You can introduce additional automation for deploying the certificates by using Systems Manager Maintenance Windows. This allows you to define a schedule for when to deploy potentially disruptive changes to EC2 instances.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Secrets Manager forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Maitreya Ranganath

Maitreya is an AWS Security Solutions Architect. He enjoys helping customers solve security and compliance challenges and architect scalable and cost-effective solutions on AWS.

Author

Blake Franzen

Blake is a Security Solutions Architect with AWS in Seattle. His passion is driving customers to a more secure AWS environment while ensuring they can innovate and move fast. Outside of work, he is an avid movie buff and enjoys recreational sports.

Signing executables with HSM-backed certificates using multiple Windows instances

Post Syndicated from Karim Hamdy Abdelmonsif Ibrahim original https://aws.amazon.com/blogs/security/signing-executables-with-hsm-backed-certificates-using-multiple-windows-instances/

Customers use code signing certificates to digitally sign software, documents, and other certificates. Signing is a cryptographic tool that lets users verify that the code hasn’t been altered and that the software, documents or other certificates can be trusted.

This blog post shows you how to configure your applications so you can use a key pair already on your hardware security module (HSM) to generate signatures using any Windows instance. Many customers use multiple Amazon Elastic Compute Cloud (Amazon EC2) instances to sign workloads using the same key pair. You must configure these instances to use a pre-existing key pair from the HSM. In this blog post, I show you how to create a key container on a new Windows instance from an existing key pair in AWS CloudHSM, and then update the certificate store to associate the newly imported certificate with the new container. I also show you how to use a common application to sign executables with this key pair.

Every certificate is associated with a key pair, which includes a private key and a public key. You can only trust a signature if you can be sure that the private key has remained confidential and can be used only by the owner of the certificate. You achieve this goal by generating the key pair on an HSM and securely storing the private key on the HSM. Enterprise certificate authority (CA) or public key infrastructure (PKI) applications are configured to use this private key in the HSM whenever they need to use the corresponding certificate to sign. This configuration is generally handled transparently between the application and the HSM on the Windows instance your application is running on. The process gets tricky when you want to use multiple Windows instances to sign using the same key pair. This is especially true if your current EC2 instance that acts as a Windows Server CA, which you used to issue the HSM-backed certificate, is deleted and you have a backup of the HSM-backed certificate.

Before we get into the details, you need to know about a library called the key storage provider (KSP). Windows systems use KSP libraries to connect applications to an HSM. For each HSM brand, such as CloudHSM, you need a corresponding KSP to run operations that involve cryptographic keys stored on that HSM. From your application, select the KSP that corresponds with the HSM you want to use to store (or use) your keys. All KSPs associate keys on their HSM with metadata in the Microsoft ecosystem using key containers. Key containers map the metadata in certificates with metadata on the HSM, which allows the application to properly address keys. The list of certificates available for Microsoft utilities to sign with is contained in a trust store. To use the same key pair across multiple Windows instances, you must copy the key containers to each instance—or create a new key container from an existing key pair in each instance—and import the corresponding certificate into the trust store for each instance.

Prerequisites

The solution in this post assumes that you’ve completed the steps in Signing executables with Microsoft SignTool.exe using AWS CloudHSM-backed certificates. You should already have your HSM-backed certificate on one Windows instance.

Before you implement the solution, you must:

  1. Install the AWS CloudHSM client on the new instance and make sure that you can interact with the HSM in your CloudHSM cluster.
  2. Verify that the CloudHSM KSP and CNG providers are installed on your new instance.
  3. Set the login credentials for the HSM on your system. Set credentials through Windows Credentials Manager. I recommend that you reboot your instance after setting up the credentials.

Note: The login credentials identify a crypto user (CU) in the HSM that has access to the key pair in CloudHSM.

Architectural overview

 

Figure 1: Architectural overview

This diagram shows a virtual private cloud (VPC) that contains an EC2 instance running Windows Server 2016 that resides on private subnet 1. This instance will run the CloudHSM client software and will use your HSM-backed certificate with a key pair already on your HSM to sign executable files. The instance can be accessed through a VPN connection. It will also have security groups that enable RDP access for your on-premises network. Private subnet 2 hosts the elastic network interface for the CloudHSM cluster, which has a single HSM.

Out of scope

The focus of this blog post is how to use an HSM-backed certificate with a key pair already on your HSM to sign executable files from any Windows instance using Microsoft SignTool.exe. This post isn’t intended to represent any best practices for implementing code signing or Amazon EC2. For more information, see the NIST cybersecurity whitepaper Security Considerations for Code Signing and Best practices for Amazon EC2, respectively.

Deploy the solution

To deploy the solution, you use certutil, import_key, and SignTool. Certutil is a Microsoft tool that helps you examine your system for available certificates and key containers. Import_key, a tool provided by CloudHSM, generates a local key container for a key pair that's on your HSM. To complete the process, you use SignTool, a Microsoft tool that lets Windows users digitally sign files, verify signatures, and timestamp files.

You will need the following:

Certificates or key material            Purpose
<my root certificate>.cer               Root certificate
<my signed certificate>.cer             HSM-backed signing certificate
<signed certificate in base64>.cer      HSM-backed signing certificate in base64 format
<public key handle>                     Public key handle of the signing certificate
<private key handle>                    Private key handle of the signing certificate

Import the HSM-backed certificate and its RootCA chain certificate into the new instance

Before you can use third-party tools such as SignTool to generate signatures using the HSM-backed certificate, you must move the signing certificate file to the Personal certificate store in the new Windows instance.

To do that, you copy the HSM-backed certificate that your application uses for signing operations and its root certificate chain from the original instance to the new Windows instance.

If you issued your signing certificate through a private CA (like in my example), you must deploy a copy of the root CA certificate and any intermediate certificates from the private CA to any systems you want to use to verify the integrity of your signed file.

To import the HSM-backed certificate and root certificate

  1. Sign in to the Windows Server that has the private CA that you used to issue your signing certificate. Then, run the following certutil command to export the root CA to a new file. Replace <my root certificate> with a name that you can remember easily.
    C:\Users\Administrator\Desktop>certutil -ca.cert <my root certificate>.cer
    
    CA cert[0]: 3 -- Valid
    CA cert[0]:
    
    -----BEGIN CERTIFICATE-----
    MIICiTCCAfICCQD6m7oRw0uXOjANBgkqhkiG9w0BAQUFADCBiDELMAkGA1UEBhMC
    VVMxCzAJBgNVBAgTAldBMRAwDgYDVQQHEwdTZWF0dGxlMQ8wDQYDVQQKEwZBbWF6
    b24xFDASBgNVBAsTC0lBTSBDb25zb2xlMRIwEAYDVQQDEwlUZXN0Q2lsYWMxHzAd
    BgkqhkiG9w0BCQEWEG5vb25lQGFtYXpvbi5jb20wHhcNMTEwNDI1MjA0NTIxWhcN
    MTIwNDI0MjA0NTIxWjCBiDELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAldBMRAwDgYD
    VQQHEwdTZWF0dGxlMQ8wDQYDVQQKEwZBbWF6b24xFDASBgNVBAsTC0lBTSBDb25z
    b2xlMRIwEAYDVQQDEwlUZXN0Q2lsYWMxHzAdBgkqhkiG9w0BCQEWEG5vb25lQGFt
    YXpvbi5jb20wgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBAMaK0dn+a4GmWIWJ
    21uUSfwfEvySWtC2XADZ4nB+BLYgVIk60CpiwsZ3G93vUEIO3IyNoH/f0wYK8m9T
    rDHudUZg3qX4waLG5M43q7Wgc/MbQITxOUSQv7c7ugFFDzQGBzZswY6786m86gpE
    Ibb3OhjZnzcvQAaRHhdlQWIMm2nrAgMBAAEwDQYJKoZIhvcNAQEFBQADgYEAtCu4
    nUhVVxYUntneD9+h8Mg9q6q+auNKyExzyLwaxlAoo7TJHidbtS4J5iNmZgXL0Fkb
    FFBjvSfpJIlJ00zbhNYS5f6GuoEDmFJl0ZxBHjJnyp378OD8uTs7fLvjx79LjSTb
    NYiytVbZPQUQ5Yaxu2jXnimvw3rrszlaEXAMPLE=
    -----END CERTIFICATE-----
            
    CertUtil: -ca.cert command completed successfully.
    
    C:\Users\Administrator\Desktop>
    

  2. Copy the <my root certificate>.cer file to your new Windows instance and run the following certutil command. This moves the root certificate from the file into the Trusted Root Certification Authorities store in Windows. You can verify that it exists by running certlm.msc and viewing the Trusted Root Certification Authorities certificates.
    C:\Users\Administrator\Desktop>certutil -addstore "Root" <my root certificate>.cer
    
    Root "Trusted Root Certification Authorities"
    Signature matches Public Key
    Certificate "MYRootCA" added to store.
    CertUtil: -addstore command completed successfully.
    

  3. Copy the HSM-backed signing certificate from the original instance to the new one, and run the following certutil command. This moves the certificate from the file into the Personal certificate store in Windows.
    C:\Users\Administrator\Desktop>certutil -addstore "My" <my signed certificate>.cer
    
    My "Personal"
    Certificate "www.mydomain.com" added to store.
    CertUtil: -addstore command completed successfully.
    

  4. Verify that the certificate exists in your Personal certificate store by running the following certutil command. The following sample output from certutil shows the serial number. Take note of the certificate serial number to use later.
    C:\Users\Administrator\Desktop>certutil -store my
    
    my "Personal"
    ================ Certificate 0 ================
    Serial Number: <certificate serial number>
    Issuer: CN=MYRootCA
     NotBefore: 2/5/2020 1:38 PM
     NotAfter: 2/5/2021 1:48 PM
    Subject: CN=www.mydomain.com, OU=Certificate Management, O=Information Technology, L=Houston, S=Texas, C=US
    Non-root Certificate
    Cert Hash(sha1): 5aaef93e7e972b1187363d880cfa3f71507c2e24
    No key provider information
    Cannot find the certificate and private key for decryption.
    CertUtil: -store command completed successfully.
    

Retrieve the key handles of the RSA key pair on the HSM

In this step, you retrieve the key handles of the existing public and private key pair on your CloudHSM in order to use that key pair to create a key container on the new Windows instance.

One way to get the key handles of an existing key pair on the CloudHSM is to use the modulus value. Because the certificate and its public and private keys must all have the same modulus value, and you already have the signing certificate, you can view its modulus value by using the OpenSSL tool. Then, you use the findKey command in key_mgmt_util to search for the public and private key handles on the HSM using the value of the certificate modulus.
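
If you already have Python available, you can read the modulus without OpenSSL by loading the base64 (PEM) encoded certificate that you export with certutil -encode in the steps below. This is a sketch that assumes the cryptography package is installed; the file name is a placeholder.

    # pip install cryptography
    from cryptography import x509

    # Placeholder: the base64 (PEM) encoded certificate exported with certutil -encode.
    with open("signed-certificate-base64.cer", "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read())

    # Print the RSA modulus as uppercase hex, mirroring the OpenSSL output shown below.
    modulus = format(cert.public_key().public_numbers().n, "X")
    print("Modulus=" + modulus)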

To retrieve the key handles

  1. Download the OpenSSL for Windows installation package.

    Note: In my example, I downloaded Win64OpenSSL-1_1_1d.exe.

  2. Right-click on the downloaded file and choose Run as administrator.
  3. Follow the installation instructions, accepting all default settings. Then choose Install.
    1. If the error message “The Win64 Open SSL Installation Project setup has detected that the following critical component is missing…”—shown in Figure 2—appears, you need to install Microsoft Visual C++ Redistributables to complete this procedure.

      Figure 2: OpenSSL installation error message

    2. Choose Yes to download and install the required Microsoft Visual C++ package on your system.
    3. Run the OpenSSL installer again and follow the installation instructions, accepting all default settings. Then choose Install.
  4. Choose Finish when the installation is complete.

    With the installation complete, OpenSSL for Windows can be found as OpenSSL.exe in C:\Program Files\OpenSSL-Win64\bin. Always open the program as the administrator.

  5. On the new CloudHSM client instance, copy your certificate to C:\Program Files\OpenSSL-Win64\bin and run the command certutil -encode <my signed certificate>.cer <signed certificate in base64>.cer to export the certificate using base64 .cer format. This exports the certificate to a file with the name you enter in place of <signed certificate in base64>.
    C:\Program Files\OpenSSL-Win64\bin>certutil -encode <my signed certificate>.cer <signed certificate in base64>.cer
    
    Input Length = 1066
    Output Length = 1526
    CertUtil: -encode command completed successfully.
    

  6. Run the command openssl x509 -noout -modulus -in <signed certificate in base64>.cer to view the certificate modulus.
    C:\Program Files\OpenSSL-Win64\bin>openssl x509 -noout -modulus -in <signed certificate in base64>.cer
    
    Modulus=9D1D625C041F7FAF076780E486CA2DB2FB846982E88804030F9C84F6CF553925C287934C18B92606EE9A4438F80E47961D7B2CD28213EADE2078BE1A921E6D164CC07F99DA42CF6DD1767A6392FC4BC2B19592474782E1B8574F4A46A93626CD2A8D56405EA7DFCED8DA7042F6FC6D3716CC1649174E93C66F0A9EC7EEFEC9661D43FD2BC8E2E261C06A619E4AF3B5E13190215F72EE5BDE2090818031F8AAD0AA7E934894DC54DF5F1E7577645137637F400E10B9ECDC0870C78C99E8027A86807CD719AA05931D1A4326A5ED1C3687C8EA8E54DF62BFD1851A92473348C98973DEF850B8A88A443A56E93B997F3286A1DC274E6A8DD187D8C59BAB32A6919F
    

  7. Save the certificate modulus in a text file named modulus.txt.
  8. Run the key_mgmt_util command line tool, and log in as the CU, as described in Getting Started with key_mgmt_util. Replace <cu username> and <cu password> with the username and password of the CU.
    Command: loginHSM -u CU -s <CU username> -p <CU password>
    
         	Cfm3LoginHSM returned: 0x00 : HSM Return: SUCCESS
    
            Cluster Error Status
            Node id 13 and err state 0x00000000 : HSM Return: SUCCESS
            Node id 14 and err state 0x00000000 : HSM Return: SUCCESS
    

  9. Run the following findKey command to find the public key handle that has the same RSA modulus that you generated previously. Enter the path to the modulus.txt file that you created in step 7. Take note of the public key handle that’s returned so that you can use it in the following steps.
    Command: findKey -c 2 -m C:\\Users\\Administrator\\Desktop\\modulus.txt
    
            Total number of keys present: 1
    
            Number of matching keys from start index 0::0
    
            Handles of matching keys:
            <public key handle>
    
            Cluster Error Status
            Node id 13 and err state 0x00000000 : HSM Return: SUCCESS
            Node id 14 and err state 0x00000000 : HSM Return: SUCCESS
    
            Cfm3FindKey returned: 0x00 : HSM Return: SUCCESS
    

  10. Run the following findKey command to find the private key handle that has the same RSA modulus that you generated previously. Enter the path to the modulus.txt file that you created in step 7. Take note of the private key handle that’s returned so that you can use it in the following steps.
    Command: findKey -c 3 -m C:\\Users\\Administrator\\Desktop\\modulus.txt
    
            Total number of keys present: 1
    
            Number of matching keys from start index 0::0
    
            Handles of matching keys:
            <private key handle>
    
            Cluster Error Status
            Node id 13 and err state 0x00000000 : HSM Return: SUCCESS
            Node id 14 and err state 0x00000000 : HSM Return: SUCCESS
    
            Cfm3FindKey returned: 0x00 : HSM Return: SUCCESS
    

Create a new key container for the existing public and private key pair in the CloudHSM

To use the same key pair across new Windows instances, you must copy over the key containers to each instance, or create a new key container from an existing key pair in the key storage provider of each instance. In this step, you create a new key container to hold the public key of the certificate and its corresponding private key metadata. To create a new key container from an existing public and private key pair in the HSM, first make sure to start the CloudHSM client daemon. Then, use the import_key.exe utility, which is included in CloudHSM version 3.0 and later.

To create a new key container

  1. Run the following import_key.exe command, replacing <private key handle> and <public key handle> with the public and private key handles you created in the previous procedure. This creates the HSM key pair in a new key container in the key storage provider.
    C:\Program Files\Amazon\CloudHSM>import_key.exe -from HSM -privateKeyHandle <private key handle> -publicKeyHandle <public key handle>
    
    Represented 1 keypairs in Cavium Key Storage Provider.
    

    Note: If you get the error message n3fips_password is not set, make sure that you set the login credentials for the HSM on your system.

  2. You can verify the new key container by running the following certutil command to list the key containers in your key storage provider (KSP). Take note of the key container name to use in the following steps.
    C:\Program Files\Amazon\CloudHSM>certutil -key -csp "Cavium Key Storage provider"
    
    Cavium Key Storage provider:
      <key container name>
      RSA
    
    
    CertUtil: -key command completed successfully.
    

Update the certificate store

Now you have everything in place: the imported certificate in the Personal certificate store of the new Windows instance and the key container that represents the key pair in CloudHSM. In this step, you associate the certificate to the key container that you made a note of earlier.

To update the certificate store

  1. Create a file named repair.txt as shown following.

    Note: You must use the key container name of your certificate that you got in the previous step as the input for the repair.txt file.

    [Properties]
    11 = "" ; Add friendly name property
    2 = "{text}" ; Add Key Provider Information property
    _continue_="Container=<key container name>&"
    _continue_="Provider=Cavium Key Storage Provider&"
    _continue_="Flags=0&"
    _continue_="KeySpec=2"
    

  2. Make sure that the CloudHSM client daemon is still running. Then, use the certutil -repairstore verb to update the certificate that has the serial number you took note of earlier, as shown in the following command and sample output. See the Microsoft documentation for information about the -repairstore verb.
    certutil -repairstore my <certificate serial number> repair.txt
    
    C:\Users\Administrator\Desktop>certutil -repairstore my <certificate serial number> repair.txt
    
    my "Personal"
    ================ Certificate 0 ================
    Serial Number: <certificate serial number>
    Issuer: CN=MYRootCA
     NotBefore: 2/5/2020 1:38 PM
     NotAfter: 2/5/2021 1:48 PM
    Subject: CN=www.mydomain.com, OU=Certificate Management, O=Information Technology, L=Houston, S=Texas, C=US
    Non-root Certificate
    Cert Hash(sha1): 5aaef93e7e972b1187363d880cfa3f71507c2e24
    CertUtil: -repairstore command completed successfully.
    

  3. Run the following certutil command to verify that your certificate has been associated with the new key container successfully.
    C:\Users\Administrator\Desktop>certutil -store my
    
    my "Personal"
    ================ Certificate 0 ================
    Serial Number: <certificate serial number>
    Issuer: CN=MYRootCA
     NotBefore: 2/5/2020 1:38 PM
     NotAfter: 2/5/2021 1:48 PM
    Subject: CN=www.mydomain.com, OU=Certificate Management, O=Information Technology, L=Houston, S=Texas, C=US
    Non-root Certificate
    Cert Hash(sha1): 5aaef93e7e972b1187363d880cfa3f71507c2e24
      Key Container = CNGRSAPriv-3145768-3407903-26dd1d
      Provider = Cavium Key Storage Provider
    Private key is NOT exportable
    Encryption test passed
    CertUtil: -store command completed successfully.
    

Now you can use this certificate and its corresponding private key with any third-party signing tool on Windows.

Use the certificate with Microsoft SignTool

Now that you have everything in place, you can use the certificate to sign a file using the Microsoft SignTool.

To use the certificate

  1. Get the thumbprint of your certificate. To do this, right-click PowerShell and choose Run as administrator. Enter the following command:
    PS C:\>Get-ChildItem -path cert:\LocalMachine\My
    

    If successful, you should see output similar to the following.

    PSParentPath: Microsoft.PowerShell.Security\Certificate::LocalMachine\My
    
    Thumbprint                                Subject
    ----------                                -------
    <thumbprint>   CN=www.mydomain.com, OU=Certificate Management, O=Information Technology, L=Ho...
    

  2. Copy the thumbprint. You need it to perform the actual signing operation on a file.
  3. Download and install one of the following versions of the Microsoft Windows SDK on your Windows EC2 instance:
    Microsoft Windows 10 SDK
    Microsoft Windows 8.1 SDK
    Microsoft Windows 7 SDK

    Install the latest applicable Windows SDK package for your operating system. For example, for Microsoft Windows 2012 R2 or later versions, you should install the Microsoft Windows 10 SDK.

  4. To open the SignTool application, navigate to the application directory within PowerShell. This is usually:
    C:\Program Files (x86)\Windows Kits\<SDK version>\bin\<version number>\<CPU architecture>\signtool.exe
    

  5. When you’ve located the directory, sign your file by running the following command. Remember to replace <thumbprint> and <test.exe> with your own values. <test.exe> can be any executable file in your directory.
    PS C:\>.\signtool.exe sign /v /fd sha256 /sha1 <thumbprint> /sm /as C:\Users\Administrator\Desktop\<test.exe>
    

    You should see a message like the following:

    Done Adding Additional Store
    Successfully signed: C:\Users\Administrator\Desktop\<test.exe>
    
    Number of files successfully Signed: 1
    Number of warnings: 0
    Number of errors: 0
    

  6. (Optional) To verify the signature on the file, you can use SignTool.exe with the verify option by using the following command.
    PS C:\>.\signtool.exe verify /v /pa C:\Users\Administrator\Desktop\<test.exe>
    

    If successful, you should see output similar to the following.

    Number of files successfully Verified: 1
    

Conclusion

In this post, I walked you through the process of using an HSM-backed certificate on a new Windows instance for signing operations. You used the import_key.exe utility to create a new key container from an existing private/public key pair in CloudHSM. Then, you updated the certificate store to associate your certificate with the key container. Finally, you saw how to use the HSM-backed certificate with the new key container to sign executable files. As you continue to use this solution, it’s important to keep Microsoft Windows SDK, CloudHSM client software, and any other installed software up-to-date.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS CloudHSM forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Karim Hamdy Abdelmonsif Ibrahim

Karim is a Cloud Support Engineer II at AWS and a subject matter expert for AWS Shield. He’s an avid pentester and security enthusiast who’s worked in IT for almost 11 years. He obtained OSCP, OSWP, CISSP, CEH, ECSA, CISM, and AWS Certified Security Specialist certifications. Outside of work, he enjoys jet skiing, hanging out with friends, and watching space documentaries.

Use a single AWS Managed Microsoft AD for Amazon RDS for SQL Server instances in multiple Regions

Post Syndicated from Jeremy Girven original https://aws.amazon.com/blogs/security/use-a-single-aws-managed-microsoft-ad-for-amazon-rds-for-sql-server-instances-in-multiple-regions/

Many Amazon Web Services (AWS) customers use Active Directory to centralize user authentication and authorization for a variety of applications and services. For these customers, Active Directory is a critical piece of their IT infrastructure.

AWS offers AWS Directory Service for Microsoft Active Directory, also known as AWS Managed Microsoft AD, to provide a highly available and resilient Active Directory service that is built on Microsoft Active Directory.

AWS also offers Amazon Relational Database Service (Amazon RDS) for SQL Server. Amazon RDS enables you to prioritize application development by managing time-consuming database administration tasks including provisioning, backups, software patching, monitoring, and hardware scaling. If you require Windows authentication with Amazon RDS for SQL Server, Amazon RDS for SQL Server instances need to be integrated with AWS Managed Microsoft AD.

With the release of AWS Managed Microsoft AD cross-Region support, you only need one distinct AWS Managed Microsoft AD that spans multiple AWS Regions; this simplifies directory management and configuration. Additionally, it simplifies trusts between the AWS Managed Microsoft AD domain and your on-premises domain. Now, only a single trust between your on-premises domain and AWS Managed Microsoft AD domain is required, as compared to the previous pattern of only one AWS Managed Microsoft AD per Region—each of which would require a trust if you wanted to allow on-premises objects access to your AWS Managed Microsoft AD domain. Further, AWS Managed Microsoft AD cross-Region support provides an additional benefit when using your on-premises users and groups with Amazon RDS for SQL Server: You only need a single, one-way, outgoing trust between your multi-Region AWS Managed Microsoft AD and your on-premises domain.

As detailed in this post, to enable AWS Managed Microsoft AD cross-Region support, you create a new AWS Managed Microsoft AD and extend it to multiple Regions (as shown in Figure 1 below). Once you’ve extended your directory, you deploy an Amazon RDS SQL Server instance in each Region, integrating it to the same directory. Finally, you install SQL Server Management Studio (SSMS) on an instance joined to the AWS Managed Microsoft AD directory. You use that instance to connect to the RDS SQL Server instances using the same domain user account.

Figure 1: High level diagram of resources deployed in this post

The architecture in Figure 1 includes a network connection between the Regions. That connection isn’t required for the AWS Managed Microsoft AD to function. If you don’t require network connectivity between your regions, you can disregard the network link in the diagram. Since you will be using a single Amazon Elastic Compute Cloud (Amazon EC2) instance in one Region, the network connection is needed between Amazon VPCs in the two Regions to allow that instance to connect to a domain controller in each Region.

Prerequisites for AWS Managed Microsoft AD cross-Region Support

  1. An AWS Managed Microsoft AD deployed in a Region of your choice. If you don’t have one already deployed, you can follow the instructions in Create Your AWS Managed Microsoft AD directory to create one. For this post, I recommend that you use us-east-1.
  2. The VPCs in the two Regions must be peered in order to complete the steps in this blog. Creating and accepting a VPC peering connection has information on how to create a peering connection between Regions. Be aware of unsupported VPC peering configurations.
  3. A Windows Server instance joined to your managed Active Directory domain. Join an EC2 Instance to Your AWS Managed Microsoft AD Directory has instructions if you need assistance.
  4. Install the Active Directory administration tools onto your domain-joined instance. Installing the Active Directory Administration Tools has detailed instructions.

Extend your AWS Managed Microsoft AD to another Region

We’ve made the process to extend your directory to another Region straightforward. There is no cost to add another Region; you only pay for the resources for your directory running in the new Region. See here for additional information on pricing changes with new Regions. For example, in this post you will be extending your directory into the us-east-2 region. There will be an additional cost for two new domain controllers. Figure 3 shows the additional cost to extend the directory.

Let’s walk through the steps of setting up Windows Authentication with Amazon RDS for SQL Server instances in multiple Regions using a single cross-Region AWS Managed Microsoft AD.

To extend your directory to another Region:

  1. In the AWS Directory Service console navigation pane, choose Directories.

    Note: You should see a list of your AWS Managed Microsoft AD directories.

  2. Choose the Directory ID of the directory you want to expand to another Region.
  3. Go to the Directory details page. In the Multi-region replication section, select Add Region.

    Figure 2: Directory details and new multi-Region replication pane

  4. On the Add region page:
    1. For Region to add, select the Region you want to extend your directory to.
    2. For VPC, select the Amazon Virtual Private Cloud (Amazon VPC) for the new domain controllers to use.
    3. For Subnets, select two unique subnets in the Amazon VPC that you selected in the preceding step.
    4. Once you have everything to your liking, choose Add.
      Figure 3: Add a Region

      In the background, AWS is provisioning two new AWS managed domain controllers in the Region you selected. It could take up to 2 hours for your directory to become available in the Region.

Note: Your managed domain controllers in the home Region are fully functional during this process.

  • On the Directory details page, in Multi-Region replication, the status should be Active when the process has completed. Now you’re ready to deploy your Amazon RDS SQL Server instances.
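
You can also perform the expansion programmatically through the Directory Service AddRegion API. Here's a minimal Boto3 sketch; the directory ID, VPC ID, and subnet IDs are placeholders, and the parameter shapes shown are assumptions you should confirm against the API reference for your SDK version.

    import boto3

    # Call the API in the directory's primary Region.
    ds = boto3.client("ds", region_name="us-east-1")

    ds.add_region(
        DirectoryId="d-1234567890",  # placeholder directory ID
        RegionName="us-east-2",      # the Region to extend the directory into
        VPCSettings={
            "VpcId": "vpc-0abc1234",  # placeholder VPC in us-east-2
            "SubnetIds": ["subnet-0abc1111", "subnet-0abc2222"],  # two subnets in that VPC
        },
    )

    # Track replication status; it can take up to two hours to become Active.
    print(ds.describe_regions(DirectoryId="d-1234567890")["RegionsDescription"])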

Enable Amazon RDS for SQL Server

Integrating Amazon RDS into AWS Managed Microsoft AD is exactly the same process as it was before the cross-Region feature was released. This post goes through that original process with only one change, which is that you select the same directory ID for both Regions.

Create an Amazon RDS SQL Server instance in each Region using the same directory

The steps for creating an Amazon RDS SQL Server instance in each Region are the same. The following process will create the first instance. Once you’ve completed the process, you change the AWS Management Console Region to the Region you extended your directory to and repeat the process.

To create an Amazon RDS SQL Server instance:

  1. In the AWS Managed Microsoft AD directory primary Region, go to the Amazon RDS console navigation pane and choose Create database.
  2. Choose Microsoft SQL Server.
  3. You can leave the default values, except for the following settings:
    1. Under Settings select Master and Confirm password.
    2. Under Connectivity, expand Additional connectivity configuration:
      1. Choose Create new to create a new VPC security group.
      2. Enter a name in New VPC security group name.
      3. Select No preference for Availability Zone.
      4. Enter 1433 for Database port.
      Figure 4: Connectivity settings

  4. Select the Enable Microsoft SQL Server Windows authentication check box and then choose Browse Directory.

    Figure 5: Enable Microsoft SQL Server Windows authentication selected

  5. Select your directory and select Choose.

    Figure 6: Select a directory

  6. Choose Create database.
  7. Repeat these steps in your expanded Region. Note that the Directory ID will be the same for both Regions. You can complete the next section while your Amazon RDS SQL instances are provisioning.
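
If you prefer to create the instances with code, the detail that matters for Windows authentication is the Domain and DomainIAMRoleName parameters of CreateDBInstance, which you point at the same directory ID in both Regions. The following Boto3 sketch uses placeholder identifiers, an illustrative instance class and storage size, and assumes an IAM role that has the AmazonRDSDirectoryServiceAccess managed policy attached.

    import boto3

    # Run once per Region, changing only region_name; the directory ID stays the same.
    rds = boto3.client("rds", region_name="us-east-1")

    rds.create_db_instance(
        DBInstanceIdentifier="sqlserver-ad-demo",      # placeholder
        Engine="sqlserver-se",
        LicenseModel="license-included",
        DBInstanceClass="db.m5.large",
        AllocatedStorage=200,
        MasterUsername="admin",
        MasterUserPassword="REPLACE_ME",               # placeholder
        Port=1433,
        VpcSecurityGroupIds=["sg-0abc1234"],           # placeholder security group
        Domain="d-1234567890",                         # the same directory ID in both Regions
        DomainIAMRoleName="rds-directoryservice-role", # placeholder role name
    )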

Create an Active Directory user and group to delegate SQL administrative rights

The following steps walk you through the process of creating an Active Directory user and group for delegation. Following this process, you add the user to the group you just created and to the AWS Delegated Server Administrators group.

To create a user and group:

  1. Log in to the domain-joined instance with a domain user account that has permissions to create Active Directory users and groups.
  2. Choose Start, enter dsa.msc, and press Enter.
  3. In Active Directory Users and Computers, right-click on the Users OU, select New, and then Group. The New Object – Group window pops up.
    1. Fill in the Group name boxes with your choice of name.
    2. For Group Scope, select Domain local.
    3. For Group type, select Security.
    4. Choose OK.
  4. In Active Directory Users and Computers, right-click on your Users OU and select New and then User. The New Object – User window pops up.
    1. Fill in the boxes with your choice of information, and then choose Next.
    2. Enter your choice of password and clear User must change password at next logon, then choose Next.
    3. On the confirmation page, choose Finish.
  5. Double-click on the user you just created. The user account properties window appears.
    1. Select the Member of tab.
    2. Choose Add.
    3. Enter the name of the group that you previously created and choose Check Names. Next, enter AWS Delegated Server Administrators and choose Check Names again. If you do not receive any error, choose OK, and then OK again.
  6. The Member of tab for the user should include the two groups you just added. Choose OK to close the properties page.

Delegate SQL Server permissions in each Region using the Active Directory group you just created

The following steps guide you through the process of modifying the Amazon RDS SQL security group, installing SQL Server Management Studio (SSMS), and delegating permission in SQL to your Active Directory group.

Modify the Amazon RDS SQL security group

In these next steps, you modify the security group you created with your Amazon RDS instances, allowing your Windows Server instance to connect to the Amazon RDS SQL Server instances over port 1433.

To modify the security group:

  1. From the Amazon Elastic Compute Cloud (Amazon EC2) console, select Security Groups under the Network & Security navigation section.
  2. Select the new Amazon RDS SQL security group that was created with your Amazon RDS SQL instance and select Edit inbound rules.
  3. Choose Add rule and enter the following:
    1. Type – Select Custom TCP.
    2. Protocol – Select TCP.
    3. Port range – Enter 1433.
    4. Source – Select Custom.
    5. Enter the private IP of your instance with a /32. An example would be 10.0.0.10/32.
  4. Choose Save rules.

    Figure 7: Create a security group rule

  5. Repeat these steps on the security group of your other Amazon RDS SQL instance in the other Region.
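
The same inbound rule can be added with code. A brief Boto3 sketch follows; the security group ID and the instance's private IP are placeholders.

    import boto3

    # Run this in each Region against that Region's Amazon RDS SQL security group.
    ec2 = boto3.client("ec2", region_name="us-east-1")

    ec2.authorize_security_group_ingress(
        GroupId="sg-0abc1234",  # placeholder: the security group created with the RDS instance
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 1433,
            "ToPort": 1433,
            "IpRanges": [{
                "CidrIp": "10.0.0.10/32",  # the private IP of your Windows Server instance
                "Description": "SQL Server access from the domain-joined instance",
            }],
        }],
    )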

Install SQL Server Management Studio

All of the steps after the first are done on the Windows Server instance from Prerequisite 3.

To install SSMS:

  1. On your local computer, download SQL Server Management Studio (SSMS).

    Note: If desired, you can disable IE Enhanced Security Configuration and download directly to the Windows Server instance using IE or any other browser, and skip to step 3.

  2. RDP into your Windows Server instance and copy SSMS-Setup-ENU.exe to your RDP session.
  3. Run the file on your Windows Server instance.
  4. Choose Install.

    Figure 8: Install SSMS

  5. It might take a few minutes to install. When complete, choose Close.

Delegate permissions in SSMS

All of the following steps are performed on the Windows Server instance from Prerequisite 3. Log in to the Amazon RDS SQL instance using the SQL master user account. Next, create a SQL login for the Active Directory group you created previously and give it elevated permission to the Amazon RDS SQL instance.

To delegate permissions:

  1. Start SSMS.
  2. On the Connect to Server window, enter or select:
    1. Server name – Your Amazon RDS SQL Server endpoint.
    2. Authentication – Select SQL Server Authentication.
    3. Login – Enter the master user name you used when you launched your Amazon RDS SQL instance. The default is admin.
    4. Password – Enter the password for the master user name.
    5. Choose Connect.
    Figure 9: Connect to server

  3. In SSMS, choose New Query at the top of the window.

    Figure 10: New query

  4. In the query window, enter the following query. Replace <CORP\SQL-Admins> with the name of the group you created earlier.
    CREATE LOGIN [<CORP\SQL-Admins>] FROM WINDOWS WITH DEFAULT_DATABASE = [master],
       DEFAULT_LANGUAGE = [us_english];
    

    Figure 11: Query SQL database

  5. Choose Execute on the menu bar. You should see a Commands completed successfully message.

    Figure 12: Commands completed successfully

  6. Next, navigate to the Logins directory in the navigation pane. Right-click on the group you added with the SQL command in step 5 and select Properties.

    Figure 13: Open group properties

  7. Select Server Roles and select the processadmin and setupadmin checkboxes. Then choose OK.

    Figure 14: Configure server roles

  8. You can log off from the instance. For the next steps, you log in to the instance using the user account you created previously.
  9. Repeat these steps on the Amazon RDS SQL instance in the other Region.

Connect to the Amazon RDS SQL Server with the same Active Directory user in both Regions

All of the steps are performed on the Windows Server instance from Prerequisite 3. You must log in to the instance using the account you created earlier. You then log in to the Amazon RDS SQL instance using Windows authentication with that account.

  1. Log in to the instance with the user account you created earlier.
  2. Start SSMS.
  3. On the Connect to Server window, enter or select:
    1. Server name: Your Amazon RDS SQL Server endpoint.
    2. Authentication: Select Windows Authentication.
    3. Choose Connect.
    Figure 15: Connect to server

  4. You should be logged in to SSMS. If you aren’t logged in, make sure you added your user account to the group you created earlier and try again.
  5. Repeat these steps using the other Amazon RDS SQL instance endpoint for the server name. You should be able to connect to both Amazon RDS SQL instances using the same user account.

Summary

In this post, you extended your AWS Managed Microsoft AD into a new Region. You then deployed Amazon RDS for SQL Server in multiple Regions attached to the same AWS Managed Microsoft AD directory. You then tested authentication to both Amazon RDS SQL instances using the same Active Directory user.

To learn more about using AWS Managed Microsoft AD or AD Connector, visit the AWS Directory Service documentation. For general information and pricing, see the AWS Directory Service home page. If you have comments about this blog post, submit a comment in the Comments section below. If you have implementation or troubleshooting questions, start a new thread on the AWS Directory Service forum or contact AWS Support.

Author

Jeremy Girven

Jeremy is a Solutions Architect specializing in Microsoft workloads on AWS. He has over 15 years of experience with Microsoft Active Directory and over 23 years of industry experience. One of his fun projects is using SSM to automate the Active Directory build processes in AWS. To see more please check out the Active Directory AWS QuickStart.

How to bulk import users and groups from CSV into AWS SSO

Post Syndicated from Darryn Hendricks original https://aws.amazon.com/blogs/security/how-to-bulk-import-users-and-groups-from-csv-into-aws-sso/

When you connect an external identity provider (IdP) to AWS Single Sign-On (SSO) by using the Security Assertion Markup Language (SAML) 2.0 standard, you must create all users and groups in AWS SSO before you can make any assignments to AWS accounts or applications. If your IdP supports user and group provisioning by way of the System for Cross-Domain Identity Management (SCIM), we strongly recommend using SCIM to simplify ongoing lifecycle management for your users and groups in AWS SSO.

If your IdP doesn’t yet support automatic provisioning, you will need to create your users and groups manually in AWS SSO. Although manual creation of users and groups is the least complicated option to get started, it can be tedious and prone to errors.

In this post, we show you how to use a comma-separated values (CSV) file to bulk create users and groups in AWS SSO.

How it works

AWS SSO supports automatic provisioning of user and group information from an external IdP into AWS SSO using the SCIM protocol. For this solution, you use a PowerShell script to simulate a SCIM server, to provision users and groups from a CSV file into AWS SSO. You create and populate the CSV file with your user and group information that is then used by the PowerShell script. Next, on your Windows, Linux, or macOS system with PowerShell Core installed, you run the PowerShell script. The PowerShell script reads users and groups from the CSV file and then programmatically creates the users and groups in AWS SSO using your SCIM configuration for AWS SSO.

Assumptions

In this blog post, we assume the following:

  • You already have an AWS SSO-enabled account (free). For more information, see Enable AWS SSO.
  • You have the permissions needed to add users and groups in AWS SSO.
  • You configured a SAML IdP with AWS SSO, as described in How to Configure SAML 2.0 for AWS Single Sign-On.
  • You’re using a Windows, MacOS, or Linux system with PowerShell Core installed.
  • If you’re not using a system with PowerShell Core installed, you’re using a Windows 7 or later system, with PowerShell 4.0 or later installed.

Note: This article was authored and the code tested on a Microsoft Windows Server 2019 system with PowerShell installed.

Enable automatic provisioning

In this step, you enable automatic provisioning in AWS SSO. You use the automatic provisioning endpoints for AWS SSO to connect and create users and groups in AWS SSO.

To enable automatic provisioning in AWS SSO

    1. On the AWS SSO Console, go to the Single Sign-On page and then go to Settings.
    2. Change the provisioning from Manual to SCIM by selecting Enable automatic provisioning.
Figure 1: Enable automatic provisioning

Figure 1: Enable automatic provisioning

    3. Copy the SCIM endpoint and the Access token (you can have up to two access token IDs). You use these values later.
Figure 2: Copy the SCIM endpoint and access token

Figure 2: Copy the SCIM endpoint and access token

Bulk create users and groups into AWS SSO

In this section, you create your users and groups in AWS SSO from a CSV file. To do this, you create a CSV file with your users’ profile information (for example, first name, last name, display name, and other values). You also create a PowerShell script that connects to AWS SSO and creates the users and groups from the CSV file in AWS SSO.

To bulk create your users from a CSV file

    1. Create a file called csv-example-users.csv with the following column headings: firstName, lastName, userName, displayName, emailAddress, and memberOf.

Note: The memberOf column will include all the groups you want to add the user to in AWS SSO. If the group you plan to add a user to isn’t in AWS SSO, the script automatically creates the group for you. If you want to add a user to multiple groups, you can add the group names separated by semicolons in the memberOf column.

    2. Populate the CSV file csv-example-users.csv with the users you want to create in AWS SSO.

Note: Before you populate the CSV file, take note of the existing users, groups, and group membership in AWS SSO. Make sure that none of the users or groups in the CSV file already exists in AWS SSO.

Note: For this to work, every user in the csv-example-users.csv must have a firstName, lastName, userName, displayName, and emailAddress value specified. If any of these values are missing, that user isn’t created. The userName and emailAddress values must not contain any spaces.
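
For example, the first few lines of a hypothetical csv-example-users.csv (the names, addresses, and group names below are made up) could look like this:

firstName,lastName,userName,displayName,emailAddress,memberOf
Jane,Doe,jdoe,Jane Doe,jdoe@example.com,Developers;Administrators
John,Smith,jsmith,John Smith,jsmith@example.com,Developers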

Figure 3: Create the CSV file and populate it with the users to create in AWS SSO

Figure 3: Create the CSV file and populate it with the users to create in AWS SSO

  3. Next, create a create_users.ps1 file and copy the following PowerShell code to it. Use a text editor like Notepad or TextEdit to edit the create_users.ps1 file.
    • Replace <SCIMENDPOINT> with the SCIM endpoint value you copied earlier.
    • Replace <BEARERTOKEN> with the Access token value you copied earlier.
    • Replace <CSVLOCATION> with the location of your CSV file (for example, C:\Users\testuser\Downloads\csv-example-users.csv; relative paths are also accepted).
    #Input SCIM configuration and CSV file location
    $Url = "<SCIMENDPOINT>"
    $Bearertoken = "<BEARERTOKEN>"
    $CSVfile = "<CSVLOCATION>"
    $Headers = @{ Authorization = "Bearer $Bearertoken" }
    
    #Get users from CSV file and store in variable
    $Users = Import-Csv -Delimiter "," -Path "$CSVfile"
    
     #Read groups in CSV and groups in AWS SSO
        
        $Groups = $Users.memberOf -split ";"
        $Groups = $Groups | Sort-Object -Unique | where {$_ -ne ""}
    
        foreach($Group in $Groups){
             $SSOgroup = @{
                "displayName" = $Group.trim()
                }
    
        #Store group attribute in json format
    
        $Groupjson = $SSOgroup | ConvertTo-Json
    
        #Create groups in AWS SSO
    
        try {
        
            $Response = Invoke-RestMethod -ContentType application/json -Uri "$Url/Groups" -Method POST -Headers $Headers -Body $Groupjson -UseBasicParsing
            Write-Host "Create group: The group $($Group) has been created successfully." -foregroundcolor green
    
        }
        catch 
        {
        
          $ErrorMessage = $_.Exception.Message
    
           if ($ErrorMessage -eq "The remote server returned an error: (409) Conflict.")
           {
             Write-Host "Error creating group: A group with the name $($Group) already exists." -foregroundcolor yellow
           }
           
           else 
           {       
             Write-Host "Error has occurred: $($ErrorMessage)" -foregroundcolor Red
           }
        }
        }
    
    #Loop through each user
    foreach ($User in $Users)
    {
    
        #Get user attributes from each field
        $SSOuser = @{
                name = @{ familyName = $User.lastName.trim(); givenName = $User.firstName.trim() }
                displayName = $User.displayName.trim()
                userName = $User.userName
                emails = @(@{ value = $User.emailAddress; type = "work"; primary = "true" })
                active = "true"
                }
    
        #Store user attributes in json format
        $Userjson = $SSOuser | ConvertTo-Json
    
        #Create users in AWS SSO
    
        try {
        $Response = Invoke-RestMethod -ContentType application/json -Uri "$Url/Users" -Method POST -Headers $Headers -Body $Userjson -UseBasicParsing
        Write-Host "Create user: The user $($User.userName) has been created successfully." -foregroundcolor green
    
        }
        catch 
        {
        
          $ErrorMessage = $_.Exception.Message
    
           if ($ErrorMessage -eq "The remote server returned an error: (409) Conflict.")
           {
             Write-Host "Error creating user: A user with the same username $($User.userName) already exists." -foregroundcolor yellow
           }
           
           else 
           {       
             Write-Host "Error has occurred: $($ErrorMessage)" -foregroundcolor Red
           }
        }   
    
    #Get user information
        $UserName = $User.userName
        $UserId = (Invoke-RestMethod -ContentType application/json -Uri "$Url/Users`?filter=userName%20eq%20%22$UserName%22" -Method GET -Headers $Headers).Resources.id
        $Groups = $User.memberOf -split ";"
    
    #Loop through each group and add user to group
        foreach($Group in $Groups){
    
    If (-not [string]::IsNullOrWhiteSpace($Group)) 
    {
    #Get the GroupName and GroupId
        $GroupName = $Group.trim()
        $GroupId = (Invoke-RestMethod -ContentType application/json -Uri "$Url/Groups`?filter=displayName%20eq%20%22$GroupName%22" -Method GET -Headers $Headers).Resources.id
    
    #Store group membership in variable. 
        $AddUserToGroup = @{
                Operations = @(@{ op = "add"; path = "members"; value = @(@{ value = $UserId })})
                }
                
        #Convert to json format
        $AddUsertoGroupjson = $AddUserToGroup | ConvertTo-Json -Depth 4
    
        #Add users to group in AWS SSO
        
            try {
        $Responses = Invoke-RestMethod -ContentType application/json -Uri "$Url/Groups/$GroupId" -Method PATCH -Headers $Headers -Body $AddUsertoGroupjson -UseBasicParsing
        Write-Host "Add user to group: The user $($User.userName) has been added successfully to group $($GroupName)." -foregroundcolor green
    
        }
        catch 
        {
        
          $ErrorMessage = $_.Exception.Message
    
    	if ($ErrorMessage -eq "The remote server returned an error: (409) Conflict.")
           {
             Write-Host "Error adding user to group: The user $($User.userName) is already added to group $($GroupName)." -foregroundcolor yellow
           }
           
           else 
           {       
             Write-Host "Error has occurred: $($ErrorMessage)" -foregroundcolor Red
           }
        }
       }        
      }
    }
    

  4. Use Windows PowerShell to run the script create_users.ps1, as shown in the following figure.

    Figure 4: Run PowerShell script to create users from CSV in AWS SSO

    Figure 4: Run PowerShell script to create users from CSV in AWS SSO

  5. Use the AWS SSO console to verify that the users were successfully created. In the AWS SSO console, select Users from the left menu, as shown in Figure 5.

    Figure 5: View the newly created users in AWS SSO console

    Figure 5: View the newly created users in AWS SSO console

  6. Use the AWS SSO console to verify that the groups were successfully created. In the AWS SSO console, select Groups from the left menu, as shown in Figure 6.

    Figure 6: View the newly created groups in AWS SSO console

    Figure 6: View the newly created groups in AWS SSO console

Your users, groups, and group memberships have been created in AWS SSO. You can now manage access for your identities in AWS SSO across your own applications, third-party applications (SaaS), and Amazon Web Services (AWS) environments.

How to run the PowerShell scripts on Linux and macOS

While this post focuses on running the PowerShell script on a Windows system, you can also run it on a Linux or macOS system that has PowerShell Core installed. Follow the steps in this post to create the required CSV files for creating users and groups and adding users to groups. Then, on your Linux or macOS system, run the PowerShell script using the following command.

pwsh -File <Path to PowerShell Script>

Conclusion

In this post, we showed you how to programmatically create users and groups from a CSV file into AWS SSO. This solution isn’t a replacement for automatic provisioning. However, it can help you to quickly get up and running with AWS SSO by reducing the administration burden of manually creating users in AWS SSO.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Darryn Hendricks

Darryn is a Senior Cloud Support Engineer for AWS Single Sign-On (SSO) based in Seattle, Washington. He is passionate about Cloud computing, identities, automation and helping customers leverage these key building blocks when moving to the Cloud. Outside of work, he loves spending time with his wife and daughter.

Author

Jose Ruiz

Jose is a Senior Solutions Architect – Security Specialist at AWS. He often enjoys “the road less traveled” and knows each technology has a security story often not spoken of. He takes this perspective when working with customers on highly complex solutions and driving security at the beginning of each build.

Automate domain join for Amazon EC2 instances from multiple AWS accounts and Regions

Post Syndicated from Sanjay Patel original https://aws.amazon.com/blogs/security/automate-domain-join-for-amazon-ec2-instances-multiple-aws-accounts-regions/

As organizations scale up their Amazon Web Services (AWS) presence, they are faced with the challenge of administering user identities and controlling access across multiple accounts and Regions. As this presence grows, managing user access to cloud resources such as Amazon Elastic Compute Cloud (Amazon EC2) becomes increasingly complex. AWS Directory Service for Microsoft Active Directory (also known as an AWS Managed Microsoft AD) makes it easier and more cost-effective for you to manage this complexity. AWS Managed Microsoft AD is built on highly available, AWS managed infrastructure. Each directory is deployed across multiple Availability Zones, and monitoring automatically detects and replaces domain controllers that fail. In addition, data replication and automated daily snapshots are configured for you. You don’t have to install software, and AWS handles all patching and software updates. AWS Managed Microsoft AD enables you to leverage your existing on-premises user credentials to access cloud resources such as the AWS Management Console and EC2 instances.

This blog post describes how EC2 resources launched across multiple AWS accounts and Regions can automatically domain-join a centralized AWS Managed Microsoft AD. The solution we describe in this post is implemented for both Windows and Linux instances. Removal of Computer objects from Active Directory upon instance termination is also implemented. The solution uses Amazon DynamoDB to centrally store account and directory information in a central security account. We also provide AWS CloudFormation templates and platform-specific domain join scripts for you to use with AWS Lambda as a quick start solution.

Architecture

The following diagram shows the domain-join process for EC2 instances across multiple accounts and Regions using AWS Managed Microsoft AD.

Figure 1: EC2 domain join architecture

Figure 1: EC2 domain join architecture

The event flow works as follows:

  1. An EC2 instance is launched in a peered virtual private cloud (VPC) of a workload or security account. VPCs that are hosting EC2 instances need to be peered with the VPC that contains AWS Managed Microsoft AD to enable network connectivity with Active Directory.
  2. An Amazon CloudWatch Events rule detects an EC2 instance in the “running” state.
  3. The CloudWatch event is forwarded to a regional CloudWatch event bus in the security account.
  4. If the CloudWatch event bus is in the same Region as AWS Managed Microsoft AD, it delivers the event to an Amazon Simple Queue Service (Amazon SQS) queue, referred to as the domain-join queue in this post.
  5. If the CloudWatch event bus is in a different Region from AWS Managed Microsoft AD, it delivers the event to an Amazon Simple Notification Service (Amazon SNS) topic. The event is then delivered to the domain-join queue described in step 4, through the Amazon SNS topic subscription.
  6. Messages in the domain-join queue are held for five minutes to allow for EC2 instances to stabilize after they reach the “running” state. This delay allows time for installation of additional software components and agents through the use of EC2 user data and AWS Systems Manager Distributor. (A hand-built sketch of the state-change rule and this delay queue appears after this list.)
  7. After the holding period is over, messages in the domain-join queue invoke the AWS AD Join/Leave Lambda function. The Lambda function does the following:
    1. Retrieves the AWS account ID that originated the event from the message and retrieves account-specific configurations from a DynamoDB table. This configuration identifies AWS Managed Microsoft AD domain controller IPs, credentials required to perform EC2 domain join, and an AWS Identity and Access Management (IAM) role that can be assumed by the Lambda function to invoke AWS Systems Manager Run Command.
    2. If needed, uses AWS Security Token Service (AWS STS) and prepares a cross-account access session.
    3. Retrieves EC2 instance information, such as the instance state, platform, and tags, and validates the instance state.
    4. Retrieves platform-specific domain-join scripts that are deployed with the Lambda function’s code bundle, and configures invocation of those scripts by using data read from the DynamoDB table (bash script for Linux instances and PowerShell script for Windows instances).
    5. Uses AWS Systems Manager Run Command to invoke the domain-join script on the instance. Run Command enables you to remotely and securely manage the configuration of your managed instances.
    6. The domain-join script runs on the instance. It uses script parameters and instance attributes to configure the instance and perform the domain join. The adGroupName tag value is used to configure the Active Directory user group that will have permissions to log in to the instance. The instance is rebooted to complete the domain join process. Various software components are installed on the instance when the script runs. For the Linux instance, sssd, realmd, krb5, samba-common, adcli, unzip, and packageit are installed. For the Windows instance, the RDS-RD-Server feature is installed.
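
The CloudFormation template described later in this post creates the rule, topics, and queue for you. Purely for illustration, a hand-built equivalent of the state-change rule and the five-minute delay queue might look like the following AWS CLI sketch (the rule and queue names are placeholders):

# Match EC2 "running" and "terminated" state changes
aws events put-rule \
  --name ec2-state-change-rule \
  --event-pattern '{
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": { "state": ["running", "terminated"] }
  }'

# Hold messages for 300 seconds so instances can stabilize before the Lambda function runs
aws sqs create-queue \
  --queue-name domain-join-queue \
  --attributes DelaySeconds=300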

Removal of EC2 instances from AWS Managed Microsoft AD upon instance termination follows a similar sequence of steps. Each instance that is domain joined creates an Active Directory domain object under the “Computer” hierarchy. This domain object needs to be removed upon instance termination so that a new instance that uses the same private IP address in the subnet (at a future time) can successfully domain join and enable instance access with Active Directory credentials. Removal of the Active Directory Computer object is done by running the leaveDomaini.ps1 script (included with this blog) through Run Command on the Active Directory Tools instance identified in Figure 1.

Prerequisites and setup

To build the solution outlined in this post, you need:

  • AWS Managed Microsoft AD with an appropriate DNS name (for example, example.com). For more information about getting started with AWS Managed Microsoft AD, see Create Your AWS Managed Microsoft AD directory.
  • AD Tools. To install AD Tools and use it to create the required users:
    1. Launch a Windows EC2 instance in the same account and Region, and domain-join it with the directory you created in the previous step. Log in to the instance through Remote Desktop Protocol (RDP) and install AD Tools as described in Installing the Active Directory Administration Tools.
    2. After the AD Tools are installed, launch the AD Users & Computers application to create domain users, and assign those users to an Active Directory security group (for example, my_UserGroup) that has permission to access domain-joined instances.
    3. Create a least-privileged user for performing domain joins as described in Delegate Directory Join Privileges for AWS Managed Microsoft AD. The identity of this user is stored in the DynamoDB table and read by the AD Join Lambda function to invoke Active Directory join scripts.
    4. Store the password for the least-privileged user in an encrypted Systems Manager parameter. The password for this user is stored in the secure string System Manager parameter and read by the AD Join Lambda function at runtime while processing Amazon SQS messages.
    5. Assign a unique tag key and value to identify the AD Tools instance. This instance will be invoked by the Lambda function to delete Computer objects from Active Directory upon termination of domain-joined instances.
  • All VPCs that are hosting EC2 instances to be domain joined must be peered with the VPC that hosts the relevant AWS Managed Microsoft AD. Alternatively, AWS Transit Gateway could be used to establish this connectivity.
  • In addition to having network connectivity to the AWS Managed Microsoft AD domain controllers, domain join scripts that run on EC2 instances must be able to resolve relevant Active Directory resource records. In this solution, we leverage Amazon Route 53 Outbound Resolver to forward DNS queries to the AWS Managed Microsoft AD DNS servers, while still preserving the default DNS capabilities that are available to the VPC. Learn more about deploying Route 53 Outbound Resolver and resolver rules to resolve your directory DNS name to DNS IPs.
  • Each domain-join EC2 instance must have a Systems Manager Agent (SSM Agent) installed and an IAM role that provides equivalent permissions as provided by the AmazonEC2RoleforSSM built-in policy. The SSM Agent is used to allow domain-join scripts to run automatically. See Working with SSM Agent for more information on installing and configuring SSM Agents on EC2 instances.

Solution deployment

The steps in this section deploy AD Join solution components by using the AWS CloudFormation service.

The CloudFormation template provided with this solution (mad_auto_join_leave.json) deploys resources that are identified in the security account’s AWS Region that hosts AWS Managed Microsoft AD (the top left quadrant highlighted in Figure 1). The template deploys a DynamoDB resource with 5 read and 5 write capacity units. This should be adjusted to match your usage. DynamoDB also provides the ability to auto-scale these capacities. You will need to create and deploy additional CloudFormation stacks for cross-account, cross-Region scenarios.

To deploy the solution

  1. Create a versioned Amazon Simple Storage Service (Amazon S3) bucket to store a zip file (for example, adJoinCode.zip) that contains Python Lambda code and domain join/leave bash and PowerShell scripts. Upload the source code zip file to an S3 bucket and find the version associated with the object.
  2. Navigate to the AWS CloudFormation console. Choose the appropriate AWS Region, and then choose Create Stack. Select With new resources.
  3. Choose Upload a template file (for this solution, mad_auto_join_leave.json), select the CloudFormation stack file, and then choose Next.
  4. Enter the stack name and values for the other parameters, and then choose Next.
    Figure 2: Defining the stack name and parameters

    Figure 2: Defining the stack name and parameters

    The parameters are defined as follows:

  • S3CodeBucket: The name of the S3 bucket that holds the Lambda code zip file object.
  • adJoinLambdaCodeFileName: The name of the Lambda code zip file that includes Lambda Python code, bash, and PowerShell scripts.
  • adJoinLambdaCodeVersion: The S3 Version ID of the uploaded Lambda code zip file.
  • DynamoDBTableName: The name of the DynamoDB table that will hold account configuration information.
  • CreateDynamoDBTable: The flag that indicates whether to create a new DynamoDB table or use an existing table.
  • ADToolsHostTagKey: The tag key of the Windows EC2 instance that has AD Tools installed and that will be used for removal of Active Directory Computer objects upon instance termination.
  • ADToolsHostTagValue: The tag value for the key identified by the ADToolsHostTagKey parameter.
  • Acknowledge creation of AWS resources and choose to continue to deploy AWS resources through AWS CloudFormation. The CloudFormation stack creation process is initiated and, after a few minutes, the stack status is marked as CREATE_COMPLETE. The following resources are created when the CloudFormation stack deploys successfully:
    • An AD Join Lambda function with associated scripts and IAM role.
    • A CloudWatch Events rule to detect the “running” and “terminated” states for EC2 instances.
    • An SQS event queue to hold the EC2 instance “running” and “terminated” events.
    • CloudWatch event mapping to the SQS event queue and further to the Lambda function.
    • A DynamoDB table to hold the account configuration (if you chose this option).

The DynamoDB table hosts account-level configurations. Account-specific configuration is required for an instance from a given account to join the Active Directory domain. Each DynamoDB item contains the account-specific configuration shown in the following table. Storing account-level information in the DynamoDB table provides the ability to use multiple AWS Managed Microsoft AD directories and group various accounts accordingly. Additional account configurations can also be stored in this table for implementation of various centralized security services (instance inspection, patch management, and so on).

Attribute | Description
accountId | AWS account number
adJoinUserName | User ID with AD Join permissions
adJoinUserPwParam | Encrypted Systems Manager parameter containing the AD Join user’s password
dnsIP1 | Domain controller 1 IP address
dnsIP2 | Domain controller 2 IP address
assumeRoleARN | Amazon Resource Name (ARN) of the role assumed by the AD Join Lambda function

Following is an example of how you could insert an item (row) in a DynamoDB table for an account.

aws dynamodb put-item --table-name <DynamoDB-Table-Name> --item file://itemData.json

where itemData.json is as follows.

{
    "accountId": { "S": "123412341234" },
    "adJoinUserName": { "S": "ADJoinUser" },
    "adJoinUserPwParam": { "S": "ADJoinUser-PwParam" },
    "dnsName": { "S": "example.com" },
    "dnsIP1": { "S": "192.0.2.1" },
    "dnsIP2": { "S": "192.0.2.2" },
    "assumeRoleARN": { "S": "arn:aws:iam::111122223333:role/adJoinLambdaRole" }
}

(Update with your own values as appropriate for your environment.)

In the preceding example, adJoinLambdaRole is assumed by the AD Join Lambda function (if needed) to establish cross-account access using AWS Security Token Service (AWS STS). The role needs to provide sufficient privileges for the AD Join Lambda function to retrieve instance information and run cross-account Systems Manager commands.

adJoinUserName identifies a user with the minimum privileges to do the domain join; you created this user in the prerequisite steps.

adJoinUserPwParam identifies the name of the encrypted Systems Manager parameter that stores the password for the AD Join user. You created this parameter in the prerequisite steps.
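
As a rough illustration (the parameter name matches the earlier example and the password is a placeholder), the parameter can be created and read back with the AWS CLI as follows; the Lambda function performs the equivalent read through the AWS SDK:

# Store the AD Join user's password as an encrypted SecureString parameter
aws ssm put-parameter \
  --name "ADJoinUser-PwParam" \
  --type SecureString \
  --value "<ad-join-user-password>"

# Read it back with decryption (requires permission to use the KMS key that encrypted it)
aws ssm get-parameter \
  --name "ADJoinUser-PwParam" \
  --with-decryption \
  --query Parameter.Value \
  --output text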

Solution test

After you successfully deploy the solution using the steps in the previous section, the next step is to test the deployed solution.

To test the solution

  1. Navigate to the AWS EC2 console and launch a Linux instance. Launch the instance in a public subnet of the available VPC.
  2. Choose an IAM role that gives at least AmazonEC2RoleforSSM permissions to the instance.
  3. Add an adGroupName tag with the value that identifies the name of the Active Directory security group whose members should have access to the instance.
  4. Make sure that the security group associated with your instance has permissions for your IP address to log in to the instance by using the Secure Shell (SSH) protocol.
  5. Wait for the instance to launch and perform the Active Directory domain join. You can navigate to the AWS SQS console and observe a delayed message that represents the CloudWatch instance “running” event. This message is processed after five minutes; after that you can observe the Lambda function’s message processing log in CloudWatch logs.
  6. Log in to the instance with Active Directory user credentials. This user must be the member of the Active Directory security group identified by the adGroupName tag value. Following is an example login command.
    ssh '<ad-user>@example.com'@<public-dns-name|public-ip-address>
    

  7. Similarly, launch a Windows EC2 instance to validate the Active Directory domain join by using Remote Desktop Protocol (RDP).
  8. Terminate domain-joined instances. Log in to the AD Tools instance to validate that the Active Directory Computer object that represents the instance is deleted.

The AD Join Lambda function invokes Systems Manager commands to deliver and run domain join scripts on the EC2 instances. The AWS-RunPowerShellScript command is used for Microsoft Windows instances, and the AWS-RunShellScript command is used for Linux instances. Systems Manager command parameters and execution status can be observed in the Systems Manager Run Command console.

The AD user used to perform the domain join is a least-privileged user, as described in Delegate Directory Join Privileges for AWS Managed Microsoft AD. The password for this user is passed to instances by way of SSM Run Commands, as described above. The password is visible in the SSM Command history log and in the domain join scripts run on the instance. Alternatively, all script parameters can be read locally on the instance through the “adjoin” encrypted SSM parameter. Refer to the domain join scripts for details of the “adjoin” SSM parameter.

Additional information

Directory sharing

AWS Managed Microsoft AD can be shared with other AWS accounts in the same Region. Learn how to use this feature and seamlessly domain join Microsoft Windows EC2 instances and Linux instances.

autoadjoin tag

Launching EC2 instances with an autoadjoin tag key with a “false” value excludes the instance from the automated Active Directory join process. You might want to do this in scenarios where you want to install additional agent software before or after the Active Directory join process. You can invoke domain join scripts (bash or PowerShell) by using user data or other means. However, you’ll need to reboot the instance and re-run scripts to complete the domain join process.

Summary

In this blog post, we demonstrated how you could automate the Active Directory domain join process for EC2 instances to AWS Managed Microsoft AD across multiple accounts and Regions, and also centrally manage this configuration by using AWS DynamoDB. By adopting this model, administrators can centrally manage Active Directory–aware applications and resources across their accounts.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Sanjay Patel

Sanjay is a Senior Cloud Application Architect with AWS Professional Services. He has a diverse background in software design, enterprise architecture, and API integrations. He has helped AWS customers automate infrastructure security. He enjoys working with AWS customers to identify and implement the best fit solution.

Author

Vaibhawa Kumar

Vaibhawa is a Senior Cloud Infrastructure Architect with AWS Professional Services. He helps customers with the architecture, design, and automation to build innovative, secured, and highly available solutions using various AWS services. In his free time, you can find him spending time with family, sports, and cooking.

Author

Kevin Higgins

Kevin is a Senior Cloud Infrastructure Architect with AWS Professional Services. He helps customers with the architecture, design, and development of cloud-optimized infrastructure solutions. As a member of the Microsoft Global Specialty Practice, he collaborates with AWS field sales, training, support, and consultants to help drive AWS product feature roadmap and go-to-market strategies.

Get started with fine-grained access control in Amazon Elasticsearch Service

Post Syndicated from Jon Handler original https://aws.amazon.com/blogs/security/get-started-with-fine-grained-access-control-in-amazon-elasticsearch-service/

Amazon Elasticsearch Service (Amazon ES) provides fine-grained access control, powered by the Open Distro for Elasticsearch security plugin. The security plugin adds Kibana authentication and access control at the cluster, index, document, and field levels that can help you secure your data. You now have many different ways to configure your Amazon ES domain to provide access control. In this post, I offer basic configuration information to get you started.

Figure 1: A high-level view of data flow and security

Figure 1: A high-level view of data flow and security

Figure 1 details the authentication and access control provided in Amazon ES. The left half of the diagram details the different methods of authenticating. Looking horizontally, requests originate either from Kibana or directly access the REST API. When using Kibana, you can use a login screen powered by the Open Distro security plugin, your SAML identity provider, or Amazon Cognito. Each of these methods results in an authenticated identity: SAML providers via the response, Amazon Cognito via an AWS Identity and Access Management (IAM) identity, and Open Distro via an internal user identity. When you use the REST API, you can use AWS Signature V4 request signing (SigV4 signing), or user name and password authentication. You can also send unauthenticated traffic, but your domain should be configured to reject all such traffic.

The right side of the diagram details the access control points. You can consider the handling of access control in two phases to better understand it—authentication at the edge by IAM and authentication in the Amazon ES domain by the Open Distro security plugin.

First, requests from Kibana or direct API calls have to reach your domain endpoint. If you follow best practices and the domain is in an Amazon Virtual Private Cloud (VPC), you can use Amazon Elastic Compute Cloud (Amazon EC2) security groups to allow or deny traffic based on the originating IP address or security group of the Amazon EC2 instances. Best practice includes least privilege based on subnet ACLs and security group ingress and egress restrictions. In this post, we assume that your requests are legitimate, meet your access control criteria, and can reach your domain.

When a request reaches the domain endpoint—the edge of your domain—, it can be anonymous or it can carry identity and authentication information as described previously. Each Amazon ES domain carries a resource-based IAM policy. With this policy, you can allow or deny traffic based on an IAM identity attached to the request. When your policy specifies an IAM principal, Amazon ES evaluates the request against the allowed Actions in the policy and allows or denies the request. If you don’t have an IAM identity attached to the request (SAML assertion, or user name and password) you should leave the domain policy open and pass traffic through to fine-grained access control in Amazon ES without any checks. You should employ IAM security best practices and add additional IAM restrictions for direct-to-API access control once your domain is set up.

The Open Distro for Elasticsearch security plugin has its own internal user database for user name and password authentication and handles access control for all users. When traffic reaches the Elasticsearch cluster, the plugin validates any user name and password authentication information against this internal database to identify the user and grant a set of permissions. If a request comes with identity information from either SAML or an IAM role, you map that backend role onto the roles or users that you have created in Open Distro security.
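
For example, once your domain is running with fine-grained access control, a request carrying only an internal user name and password is authenticated by the security plugin. A minimal sketch, using a placeholder endpoint and internal user, looks like this:

# Basic authentication against the domain's REST API; the plugin checks the credentials against its internal user database
curl -u 'some-internal-user:Some-Password-1' "https://<your-domain-endpoint>/_cluster/health?pretty"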

Amazon ES documentation and the Open Distro for Elasticsearch documentation give more information on all of these points. For this post, I walk through a basic console setup for a new domain.

Console set up

The Amazon ES console provides a guided wizard that lets you configure—and reconfigure—your Amazon ES domain. Step 1 offers you the opportunity to select some predefined configurations that carry through the wizard. In step 2, you choose the instances to deploy in your domain. In Step 3, you configure the security. This post focuses on step 3. See also these tutorials that explain using an IAM master user and using an HTTP-authenticated master user.

Note: At the time of writing, you cannot enable fine-grained access control on existing domains; you must create a new domain and enable the feature at domain creation time. You can use fine-grained access control with Elasticsearch versions 6.8 and later.

Set your endpoint

Amazon ES gives you a DNS name that resolves to an IP address that you use to send traffic to the Elasticsearch cluster in the domain. The IP address can be in the IP space of the public internet, or it can resolve to an IP address in your VPC. While—with fine-grained access control—you have the means of securing your cluster even when the endpoint is a public IP address, we recommend using VPC access as the more secure option. Shown in Figure 2.

Figure 2: Select VPC access

Figure 2: Select VPC access

With the endpoint in your VPC, you use security groups to control which ports accept traffic and limit access to the endpoints of your Amazon ES domain to IP addresses in your VPC. Make sure to use least privilege when setting up security group access.

Enable fine-grained access control

You should enable fine-grained access control. Shown in Figure 3.

Figure 3: Enabled fine-grained access control

Figure 3: Enabled fine-grained access control

Set up the master user

The master user is the administrator identity for your Amazon ES domain. This user can set up additional users in the Amazon ES security plugin, assign roles to them, and assign permissions for those roles. You can choose user name and password authentication for the master user, or use an IAM identity. User name and password authentication, shown in Figure 4, is simpler to set up and—with a strong password—may provide sufficient security depending on your use case. We recommend you follow your organization’s policy for password length and complexity. If you lose this password, you can return to the domain’s dashboard in the AWS Management Console and reset it. You’ll use these credentials to log in to Kibana. Following best practices on choosing your master user, you should move to an IAM master user once setup is complete.

Note: Password strength is a function of length, complexity of characters (e.g., upper and lower case letters, numbers, and special characters), and unpredictability to decrease the likelihood the password could be guessed or cracked over a period of time.
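
The console wizard handles all of these settings for you. If you prefer the AWS CLI, a rough sketch of creating a domain with fine-grained access control and a user name and password master user might look like the following (the domain name, version, instance type, and credentials are placeholders, and the encryption options that fine-grained access control requires are included):

aws es create-elasticsearch-domain \
  --domain-name my-domain \
  --elasticsearch-version 7.9 \
  --elasticsearch-cluster-config InstanceType=r5.large.elasticsearch,InstanceCount=1 \
  --ebs-options EBSEnabled=true,VolumeSize=10 \
  --node-to-node-encryption-options Enabled=true \
  --encryption-at-rest-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true \
  --advanced-security-options "Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions={MasterUserName=master-user,MasterUserPassword=Example-Passw0rd-2021}"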

 

Figure 4: Setting up the master username and password

Figure 4: Setting up the master username and password

Do not enable Amazon Cognito authentication

When you use Kibana, Amazon ES includes a login experience. You currently have three choices for the source of the login screen:

  1. The Open Distro security plugin
  2. Amazon Cognito
  3. Your SAML-compliant system

You can apply fine-grained access control regardless of how you log in. However, setting up fine-grained access control for the master user and additional users is most straightforward if you use the login experience provided by the Open Distro security plugin. After your first login, and when you have set up additional users, you should migrate to either Cognito or SAML for login, taking advantage of the additional security they offer. To use the Open Distro login experience, disable Amazon Cognito authentication, as shown in Figure 5.

Figure 5: Amazon Cognito authentication is not enabled

Figure 5: Amazon Cognito authentication is not enabled

If you plan to integrate with your SAML identity provider, check the Prepare SAML authentication box. You will complete the set up when the domain is active.

Figure 6: Choose Prepare SAML authentication if you plan to use it

Figure 6: Choose Prepare SAML authentication if you plan to use it

Use an open access policy

When you create your domain, you attach an IAM policy to it that controls whether your traffic must be signed with AWS SigV4 request signing for authentication. Policies that specify an IAM principal require that you use AWS SigV4 signing to authenticate those requests. The domain sends your traffic to IAM, which authenticates signed requests to resolve the user or role that sent the traffic. The domain and IAM apply the policy access controls and either accept the traffic or reject it based on the commands. This is done down to the index level for single-index API calls.

When you use fine-grained access control, your traffic is also authenticated by the Amazon ES security plugin, which makes the IAM authentication redundant. Create an open access policy, as shown in Figure 7, which doesn’t specify a principal and so doesn’t require request signing. This may be acceptable, since you can choose to require an authenticated identity on all traffic. The security plugin authenticates the traffic as above, providing access control based on the internal database.
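
For illustration only (the Region, account ID, and domain name are placeholders), an open access policy of this kind could be applied from the AWS CLI as follows; you can also paste the same JSON into the console wizard:

aws es update-elasticsearch-domain-config \
  --domain-name my-domain \
  --access-policies '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": { "AWS": "*" },
        "Action": "es:ESHttp*",
        "Resource": "arn:aws:es:us-east-1:111122223333:domain/my-domain/*"
      }
    ]
  }'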

Figure 7: Selected open access policy

Figure 7: Selected open access policy

Encrypted data

Amazon ES provides an option to encrypt data in transit and at rest for any domain. When you enable fine-grained access control, you must use encryption with the corresponding checkboxes automatically checked and not changeable. These include Transport Layer Security (TLS) for requests to the domain and for traffic between nodes in the domain, and encryption of data at rest through AWS Key Management Service (KMS). Shown in Figure 8.

Figure 8: Enabled encryption

Figure 8: Enabled encryption

Accessing Kibana

When you complete the domain creation wizard, it takes about 10 minutes for your domain to activate. Return to the console and the Overview tab of your Amazon ES dashboard. When the Domain Status is Active, select the Kibana URL. Since you created your domain in your VPC, you must be able to access the Kibana endpoint via proxy, VPN, SSH tunnel, or similar. Use the master user name and password that you configured earlier to log in to Kibana, as shown in Figure 9. As detailed above, you should only ever log in as the master user to set up additional users—administrators, users with read-only access, and others.

Figure 9: Kibana login page

Figure 9: Kibana login page

Conclusion

Congratulations, you now know the basic steps to set up the minimum configuration to access your Amazon ES domain with a master user. You can examine the settings for fine-grained access control in the Kibana console Security tab. Here, you can add additional users, assign permissions, map IAM users to security roles, and set up your Kibana tenancy. We’ll cover those topics in future posts.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon Elasticsearch Service forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Jon Handler

Jon is a Principal Solutions Architect at AWS. He works closely with the CloudSearch and Elasticsearch teams, providing help and guidance to a broad range of customers who have search workloads that they want to move to the AWS Cloud. Prior to joining AWS, Jon’s career as a software developer included four years of coding a large-scale, eCommerce search engine. Jon holds a Bachelor of the Arts from the University of Pennsylvania, and a Master of Science and a Ph. D. in Computer Science and Artificial Intelligence from Northwestern University.

Author

Sajeev Attiyil Bhaskaran

Sajeev is a Senior Cloud Engineer focused on big data and analytics. He works with AWS customers to provide architectural and engineering assistance and guidance. He dives deep into big data technologies and streaming solutions. He also does onsite and online sessions for customers to design best solutions for their use cases. In his free time, he enjoys spending time with his wife and daughter.

Configuring AWS VPN for UK public sector use

Post Syndicated from Charlie Llewellyn original https://aws.amazon.com/blogs/security/configuring-aws-vpn-for-uk-public-sector-use/

In this post, we explain the United Kingdom (UK) National Cyber Security Centre (NCSC)’s guidance on VPN profiles configuration, and how the configuration parameters for the AWS Virtual Private Network (AWS VPN) align with the NCSC guidance. At the end of the post, there are links to code to deploy the AWS VPN in line with those parameters.

Many public sector organizations in the UK need to connect their existing on-premises facilities, data centers, or offices to the Amazon Web Services (AWS) cloud so they can take advantage of the broad set of services AWS provides to help them deliver against their mission.

This can be achieved using the AWS VPN service. However, some customers find it difficult to know the exact configuration parameters they should choose when establishing the VPN connection in line with guidance for the UK public sector.

AWS VPN services enable organizations to establish secure connections between their on-premises networks, remote offices, and client devices and the AWS global network. AWS VPN comprises two services: AWS Site-to-Site VPN and AWS Client VPN. Together, they deliver a highly available, fully managed, elastic cloud VPN solution to protect your network traffic.

For the purposes of this post, we focus on the Site-to-Site VPN configuration, not Client VPN because the NCSC guidance we’re discussing is specifically related to site-to-site VPNs. This post covers two areas:

  • An overview of the current guidance for VPN configurations for the public sector.
  • Recommendations on how to configure AWS VPN to meet or exceed the current guidance.

VPN guidance for UK public sector organizations

The starting point for security guidance for the UK public sector is often the NCSC. The role of the NCSC includes:

  • Protecting government systems and information.
  • Planning for and responding to cyber incidents.
  • Working with providers of critical national infrastructure to improve the protection and computer security of such infrastructure against cyber-borne threats.

Specifically, for guidance on the configuration of VPNs for the UK public sector to support data at OFFICIAL, the NCSC has created detailed guidance on the technical configurations to support two different profiles: PRIME and Foundation. These two profiles provide different technical implementations to support different equipment and are both suitable for use with OFFICIAL data. Beyond these technical differences, NCSC also documents that Foundation is expected to provide suitable protection for OFFICIAL information until at least December 31, 2023, while PRIME has no review date specified at the time of writing.

This guidance is available in Using IPsec to protect data.

Let’s start by debunking a few myths.

Myth 1: I have to adhere exactly to the NCSC technical configuration or I cannot use a VPN for OFFICIAL data

It’s a common misconception that a public sector organization must adhere exactly to the configuration of either PRIME or Foundation in order to use a VPN for OFFICIAL data, even if other configuration options available—such as a longer key length—offer a higher security baseline.

Note that the NCSC isn’t mandating the use of the configuration in their guidance. They’re offering a configuration that provides a useful baseline, but you must assess your use of the NCSC guidance in context of the risks. To help with these risk-based decisions, the NCSC has developed a series of guidance documents to help organizations make risk-based decisions. A common consideration that might require deviating from the guidance would be supporting interoperability with legacy systems where the suggested algorithms aren’t supported. In this case, a risk-based decision should be made—including accounting for other factors such as cost.

It’s also worth noting that the NCSC creates guidance designed to be useful to as many organizations as possible. The NCSC balances adopting the latest possible configurations with backwards compatibility and vendor support. For example, the NCSC suggests AES-128 where—in theory—AES-256 could also be a good choice. Organizations need to be aware that if they choose to adopt devices that support only AES-256 and later need to connect in devices capable of only AES-128, there could be significant investment to replace the legacy devices with ones that support AES-256. However, AWS provides both AES-128 and AES-256, so if the remote device supports it, AWS would recommend opting for AES-256.

The NCSC also tries to develop advice that has some longevity. For example, the guidance suggesting use of AES-128 was created in 2012 with a view to providing solid guidance over a number of years. This means customers can choose different configuration parameters that offer increased levels of security if both sides of the VPN can support it.

It’s possible for a customer to choose options that might lower the security of the connection, provided that risks are identified and appropriately managed by the customer’s assurance team. This might be needed to support interoperability between existing systems where the cost of an upgrade outweighs the risk.

Myth 2: Foundation has been deprecated and I must use PRIME

Another common misconception is that Foundation has been deprecated in favor of PRIME. This is not the case. The NCSC has stated that Foundation is expected to provide suitable protection for OFFICIAL information until at least December 31, 2023. The security provided by both solutions provides commensurate security for accessing data classified as OFFICIAL. One of the main differences between PRIME and Foundation is the choice of signature algorithm: RSA or ECDSA. This difference can be helpful in enabling an organization to choose which profile to adopt. For example, if the organization already has a private key infrastructure (PKI), then the decision regarding which signature algorithm to use is based on what existing systems support.

Myth 3: I can’t use Foundation for accessing OFFICIAL SENSITIVE data

A final point that often causes confusion is data marked OFFICIAL SENSITIVE, because OFFICIAL SENSITIVE isn’t a classification but a handling caveat. The data would be classified as OFFICIAL and marked as OFFICIAL SENSITIVE, meaning that systems handling the data need risk-appropriate security measures. A system that can handle OFFICIAL data might be appropriate to handle sensitive information. Hence Foundation could be suitable for accessing OFFICIAL SENSITIVE data, depending on the risks identified.

Deep-dive into the technical specifications

Now that you know a little more about how the guidance should be viewed, let’s look more closely at the technical configurations for each VPN profile.

The following table shows the configuration parameters suggested by the NCSC VPN guidance discussed previously.

Technical detail | Foundation | PRIME
IKEv* – Encryption | IKEv1 – AES with 128-bit keys in CBC mode (RFC3602) | IKEv2 – AES-128 in GCM-128 (and optionally CBC)
IKEv2 – Pseudo-random function | HMAC-SHA256 | HMAC-SHA256
IKEv2 – Diffie-Hellman group | Group 14 (2048-bit MODP group) (RFC3526) | 256-bit random ECP (RFC5903) Group 19
IKEv2 – Authentication | X.509 certificates with RSA signatures (2048 bits) and SHA-256 (RFC4945 and RFC4055) | X.509 certificates with ECDSA-256 with SHA256 on P-256 curve
ESP – Encryption | AES with 128-bit keys in CBC mode (RFC3602), SHA-256 (RFC4868) | AES-128 in GCM-128, SHA-256 (RFC4868)

Recommended AWS VPN configuration for public sector

Bearing in mind these policies, and remembering that the configuration is only guidance, you must make a risk-based decision. AWS recommends the following configuration as a starting point for the configuration of the AWS VPN.

Technical detail | AWS configuration | Adherence
IKEv* – Encryption | IKEv2 – AES-256-GCM | Suitable for Foundation and PRIME
IKEv2 – Pseudo-random function | HMAC-SHA256 | Meets Foundation and PRIME
IKEv2 – Diffie-Hellman group | Group 19 | Suitable for Foundation and matches PRIME
IKEv2 – Authentication | RSA 2048 SHA2-512 | Suitable for Foundation
ESP – Encryption | AES-256-GCM | Suitable for PRIME and Foundation

In the table above, we use the term “suitable for” where the protocol doesn’t match the guidance exactly but the AWS configuration options provide equivalent or stronger security—for example, by using a longer key length.

With the configuration defined above, the AWS VPN service is suitable for use under the Foundation profile in all areas. It can also be made suitable for PRIME in all areas apart from IKEv2 authentication. The use of RSA or ECDSA is the main difference between the AWS VPN and PRIME configurations. This makes the current AWS VPN solution closer to Foundation than PRIME.

When considering which options are available to you, the starting point should be the capabilities of your current—and possible future—VPN devices. Based on its capabilities, you can use the NCSC guidance and preceding tables to choose the protocols that match or are suitable for the NCSC guidance.

Summary

To review:

  • The NCSC provides guidance for the VPN configuration, not a mandate.
  • An organization is free to decide not to use the guidance, but should consider risks when they make that decision.
  • The AWS VPN meets or is suitable for the configuration options for Foundation.

After reviewing the details contained in this blog, UK public sector organizations should have the confidence to use the AWS VPN service with systems running at OFFICIAL.

If you’re interested in deploying the AWS VPN configuration described in this post, you can download instructions and AWS CloudFormation templates to configure the AWS VPN service. The AWS VPN configuration can be deployed to either connect directly to a single Amazon Virtual Private Cloud (Amazon VPC) using a virtual private gateway, or to an AWS Transit Gateway to enable its use by multiple VPCs.

If you’re interested in configuring your AWS VPN tunnel options manually, you can follow Modifying Site-to-Site VPN tunnel options.
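
As a rough sketch of what that manual configuration could look like with the AWS CLI (the VPN connection ID and tunnel outside IP address are placeholders; check the chosen values against the tables above and your own device’s capabilities):

aws ec2 modify-vpn-tunnel-options \
  --vpn-connection-id vpn-0123456789abcdef0 \
  --vpn-tunnel-outside-ip-address 203.0.113.17 \
  --tunnel-options '{
    "IKEVersions": [{ "Value": "ikev2" }],
    "Phase1EncryptionAlgorithms": [{ "Value": "AES256-GCM-16" }],
    "Phase1IntegrityAlgorithms": [{ "Value": "SHA2-256" }],
    "Phase1DHGroupNumbers": [{ "Value": 19 }],
    "Phase2EncryptionAlgorithms": [{ "Value": "AES256-GCM-16" }],
    "Phase2IntegrityAlgorithms": [{ "Value": "SHA2-256" }],
    "Phase2DHGroupNumbers": [{ "Value": 19 }]
  }'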

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Charlie Llewellyn

Charlie is a Solutions Architect working in the Public Sector team with Amazon Web Services. He specializes in data analytics and enjoys helping customers use data to make better decisions. In his spare time he avidly enjoys mountain biking and cooking.

Author

Muhammad Khas

Muhammad Khas is a Solutions Architect working in the Public Sector team at Amazon Web Services. He enjoys supporting customers in using artificial intelligence and machine learning to enhance their decision making. Outside of work, Muhammad enjoys swimming, and horse riding.

Publishing private npm packages with AWS CodeArtifact

Post Syndicated from Ryan Sonshine original https://aws.amazon.com/blogs/devops/publishing-private-npm-packages-aws-codeartifact/

This post demonstrates how to create, publish, and download private npm packages using AWS CodeArtifact, allowing you to share code across your organization without exposing your packages to the public.

The ability to control CodeArtifact repository access using AWS Identity and Access Management (IAM) removes the need to manage additional credentials for a private npm repository when developers already have IAM roles configured.

You can use private npm packages for a variety of use cases, such as:

  • Reducing code duplication
  • Configuration such as code linting and styling
  • CLI tools for internal processes

This post shows how to easily create a sample project in which we publish an npm package and install the package from CodeArtifact. For more information about pipeline integration, see AWS CodeArtifact and your package management flow – Best Practices for Integration.

Solution overview

The following diagram illustrates this solution.

Diagram showing npm package publish and install with CodeArtifact

In this post, you create a private scoped npm package containing a sample function that can be used across your organization. You create a second project to download the npm package. You also learn how to structure your npm package to make logging in to CodeArtifact automatic when you want to build or publish the package.

The code covered in this post is available on GitHub:

Prerequisites

Before you begin, you need to complete the following:

  1. Create an AWS account.
  2. Install the AWS Command Line Interface (AWS CLI). CodeArtifact is supported in these CLI versions:
    • 1.18.83 or later: install the AWS CLI version 1
    • 2.0.54 or later: install the AWS CLI version 2
  3. Create a CodeArtifact repository.
  4. Add required IAM permissions for CodeArtifact.

Creating your npm package

You can create your npm package in three easy steps: set up the project, create your npm script for authenticating with CodeArtifact, and publish the package.

Setting up your project

Create a directory for your new npm package. We name this directory my-package because it serves as the name of the package. We use an npm scope for this package, where @myorg represents the scope all of our organization’s packages are published under. This helps us distinguish our internal private package from external packages. See the following code:

npm init --scope=@myorg -y

{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

The package.json file specifies that the main file of the package is called index.js. Next, we create that file and add our package function to it:

module.exports.helloWorld = function() {
  console.log('Hello world!');
}

Creating an npm script

To create your npm script, complete the following steps:

  1. On the CodeArtifact console, choose the repository you created as part of the prerequisites.

If you haven’t created a repository, create one before proceeding.
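
If you prefer the AWS CLI to the console, a rough sketch of creating the domain and repository used in this post (my-domain and my-repo are placeholder names) looks like this:

# Create a CodeArtifact domain and a repository inside it
aws codeartifact create-domain --domain my-domain
aws codeartifact create-repository \
  --domain my-domain \
  --repository my-repo \
  --description "Private npm packages for my organization"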

CodeArtifact repository details console

  2. Select your CodeArtifact repository and choose Details to view the additional details for your repository.

We use two items from this page:

  • Repository name (my-repo)
  • Domain (my-domain)
  3. Create a script named co:login in our package.json. The package.json contains the following code:
{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Running this script updates your npm configuration to use your CodeArtifact repository and sets your authentication token, which expires after 12 hours.
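
Under the hood, the login command fetches a temporary authorization token and points npm at your repository endpoint. A manual equivalent, shown here only to illustrate what the script automates (the account ID and Region are placeholders), might look like this:

# Fetch a temporary CodeArtifact token (valid for up to 12 hours)
TOKEN=$(aws codeartifact get-authorization-token \
  --domain my-domain \
  --query authorizationToken \
  --output text)

# Point npm at the CodeArtifact repository and attach the token
npm config set registry "https://my-domain-111122223333.d.codeartifact.us-east-1.amazonaws.com/npm/my-repo/"
npm config set "//my-domain-111122223333.d.codeartifact.us-east-1.amazonaws.com/npm/my-repo/:_authToken" "$TOKEN"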

  4. To test our new script, enter the following command:

npm run co:login

The following code is the output:

> aws codeartifact login --tool npm --repository my-repo --domain my-domain
Successfully configured npm to use AWS CodeArtifact repository https://my-domain-<ACCOUNT ID>.d.codeartifact.us-east-1.amazonaws.com/npm/my-repo/
Login expires in 12 hours at 2020-09-04 02:16:17-04:00
  5. Add a prepare script to our package.json to run our login command:
{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

This configures our project to automatically authenticate and generate an access token anytime npm install or npm publish runs on the project.

If you see an error containing Invalid choice, valid choices are:, you need to update the AWS CLI according to the versions listed in the prerequisites of this post.

Publishing your package

To publish our new package for the first time, run npm publish.

The following screenshot shows the output.

Terminal showing npm publish output

If we navigate to our CodeArtifact repository on the CodeArtifact console, we now see our new private npm package ready to be downloaded.

CodeArtifact console showing published npm package

Installing your private npm package

To install your private npm package, you first set up the project and add the CodeArtifact configs. After you install your package, it’s ready to use.

Setting up your project

Create a directory for a new application and name it my-app. This is a sample project to download our private npm package published in the previous step. You can apply this pattern to any repository in which you intend to install your organization’s npm packages.

npm init -y

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "A sample application consuming a private scoped npm package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Adding CodeArtifact configs

Copy the npm scripts prepare and co:login created earlier to your new project:

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "A sample application consuming a private scoped npm package",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Installing your new private npm package

Enter the following command:

npm install @myorg/my-package

Your package.json should now list @myorg/my-package in your dependencies:

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "dependencies": {
    "@myorg/my-package": "^1.0.0"
  }
}

Using your new npm package

In our my-app application, create a file named index.js containing the following code, which calls the function from our package:

const { helloWorld } = require('@myorg/my-package');

helloWorld();

Run node index.js in your terminal to see the console print the message from our @myorg/my-package helloWorld function.
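The command and its output should look like the following:

node index.js
Hello world!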

Cleaning Up

To clean up, first remove the changes made to your user profile’s npm configuration by running npm config delete registry; this removes the CodeArtifact repository as your default npm registry.

If you created a CodeArtifact repository for the purposes of this post, also delete the repository when you’re finished, either from the CodeArtifact console or with the AWS CLI.
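For example, a command along the following lines deletes it with the AWS CLI (my-domain and my-repo are the sample names used in this post; substitute your own):

aws codeartifact delete-repository --domain my-domain --repository my-repo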

Conclusion

In this post, you successfully published a private scoped npm package stored in CodeArtifact, which you can reuse across multiple teams and projects within your organization. You can use npm scripts to streamline the authentication process and apply this pattern to save time.

About the Author

Ryan Sonshine

Ryan Sonshine is a Cloud Application Architect at Amazon Web Services. He works with customers to drive digital transformations while helping them architect, automate, and re-engineer solutions to fully leverage the AWS Cloud.

 

 

How to deploy the AWS Solution for Security Hub Automated Response and Remediation

Post Syndicated from Ramesh Venkataraman original https://aws.amazon.com/blogs/security/how-to-deploy-the-aws-solution-for-security-hub-automated-response-and-remediation/

In this blog post I show you how to deploy the Amazon Web Services (AWS) Solution for Security Hub Automated Response and Remediation. The first installment of this series was about how to create playbooks using Amazon CloudWatch Events, AWS Lambda functions, and AWS Security Hub custom actions that you can run manually based on triggers from Security Hub in a specific account. That solution requires an analyst to directly trigger an action using Security Hub custom actions and doesn’t work for customers who want to set up fully automated remediation based on findings across one or more accounts from their Security Hub master account.

The solution described in this post automates the cross-account response and remediation lifecycle from executing the remediation action to resolving the findings in Security Hub and notifying users of the remediation via Amazon Simple Notification Service (Amazon SNS). You can also deploy these automated playbooks as custom actions in Security Hub, which allows analysts to run them on-demand against specific findings. You can deploy these remediations as custom actions or as fully automated remediations.

Currently, the solution includes 10 playbooks aligned to the controls in the Center for Internet Security (CIS) AWS Foundations Benchmark standard in Security Hub, but playbooks for other standards such as AWS Foundational Security Best Practices (FSBP) will be added in the future.

Solution overview

Figure 1 shows the flow of events in the solution described in the following text.

Figure 1: Flow of events

Figure 1: Flow of events

Detect

Security Hub gives you a comprehensive view of your security alerts and security posture across your AWS accounts and automatically detects deviations from defined security standards and best practices.

Security Hub also collects findings from various AWS services and supported third-party partner products to consolidate security detection data across your accounts.

Ingest

All of the findings from Security Hub are automatically sent to CloudWatch Events and Amazon EventBridge and you can set up CloudWatch Events and EventBridge rules to be invoked on specific findings. You can also send findings to CloudWatch Events and EventBridge on demand via Security Hub custom actions.

Remediate

The CloudWatch Events and EventBridge rules can have AWS Lambda functions, AWS Systems Manager automation documents, or AWS Step Functions workflows as their targets. This solution uses automation documents and Lambda functions as response and remediation playbooks. Using cross-account AWS Identity and Access Management (IAM) roles, the playbook performs the tasks needed to remediate the findings through the AWS API when a rule is invoked.
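As a rough illustration of this wiring (the solution’s templates create the actual rules and targets for you, so the rule name and Lambda ARN below are placeholders), an EventBridge rule that matches Security Hub custom action events and targets a remediation Lambda function looks roughly like this in the AWS CLI:

# Match findings sent via a Security Hub custom action
aws events put-rule \
  --name sharr-custom-action-rule \
  --event-pattern '{"source":["aws.securityhub"],"detail-type":["Security Hub Findings - Custom Action"]}'

# Send matching events to a remediation Lambda function (placeholder ARN)
aws events put-targets \
  --rule sharr-custom-action-rule \
  --targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:111122223333:function:example-remediation"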

Log

The playbook logs the results to the Amazon CloudWatch log group for the solution, sends a notification to an Amazon Simple Notification Service (Amazon SNS) topic, and updates the Security Hub finding. An audit trail of actions taken is maintained in the finding notes. The finding is updated as RESOLVED after the remediation is run. The security finding notes are updated to reflect the remediation performed.

Here are the steps to deploy the solution from this GitHub project.

  • In the Security Hub master account, you deploy the AWS CloudFormation template, which creates an AWS Service Catalog product along with some other resources. You can find the full set of deployed resources in the Resources section of the deployed AWS CloudFormation stack. The solution uses AWS Service Catalog to make the remediations available as a product that users can deploy after they’ve been granted the required permissions to launch it.
  • Add an IAM role that has administrator access to the AWS Service Catalog portfolio.
  • Deploy the CIS playbook from the AWS Service Catalog product list using the IAM role you added in the previous step.
  • Deploy the AWS Security Hub Automated Response and Remediation template in the master account in addition to the member accounts. This template establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. Use AWS CloudFormation StackSets in the master account to have a centralized deployment approach across the master account and multiple member accounts.

Deployment steps for automated response and remediation

This section reviews the steps to implement the solution, including screenshots of the solution launched from an AWS account.

Launch AWS CloudFormation stack on the master account

As part of this AWS CloudFormation stack deployment, you create custom actions to configure Security Hub to send findings to CloudWatch Events. Lambda functions are used to provide remediation in response to actions sent to CloudWatch Events.

Note: In this solution, you create custom actions for the CIS standards. There will be more custom actions added for other security standards in the future.

To launch the AWS CloudFormation stack

  1. Deploy the AWS CloudFormation template in the Security Hub master account. In your AWS console, select CloudFormation, choose Create new stack, and enter the S3 URL. (Alternatively, you can deploy the template with the AWS CLI, as shown in the sketch after this procedure.)
  2. Select Next to move to the Specify stack details tab, and then enter a Stack name as shown in Figure 2. In this example, I named the stack SO0111-SHARR, but you can use any name you want.
     
    Figure 2: Creating a CloudFormation stack

    Figure 2: Creating a CloudFormation stack

  3. Creating the stack automatically launches it, creating 21 new resources using AWS CloudFormation, as shown in Figure 3.
     
    Figure 3: Resources launched with AWS CloudFormation

    Figure 3: Resources launched with AWS CloudFormation

  4. An Amazon SNS topic is automatically created from the AWS CloudFormation stack.
  5. When you create a subscription, you’re prompted to enter an endpoint for receiving email notifications from Amazon SNS, as shown in Figure 4. To subscribe to the topic that was created by CloudFormation, you must confirm the subscription from the email address you used to receive notifications.
     
    Figure 4: Subscribing to Amazon SNS topic

    Figure 4: Subscribing to Amazon SNS topic
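If you prefer to script the deployment, the stack can also be created with the AWS CLI. The following is only a sketch: the template URL is a placeholder for the solution’s published Amazon S3 template, and the capability flags are an assumption based on the template creating IAM resources:

aws cloudformation create-stack \
  --stack-name SO0111-SHARR \
  --template-url https://<solution-template-bucket>.s3.amazonaws.com/aws-security-hub-automated-response-and-remediation.template \
  --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND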

Enable Security Hub

You should already have enabled Security Hub and AWS Config services on your master account and the associated member accounts. If you haven’t, you can refer to the documentation for setting up Security Hub on your master and member accounts. Figure 5 shows an AWS account that doesn’t have Security Hub enabled.
 

Figure 5: Enabling Security Hub for first time

Figure 5: Enabling Security Hub for first time

AWS Service Catalog product deployment

In this section, you use the AWS Service Catalog to deploy Service Catalog products.

To use the AWS Service Catalog for product deployment

  1. In the same master account, add roles that have administrator access and can deploy AWS Service Catalog products. To do this, from Services in the AWS Management Console, choose AWS Service Catalog. In AWS Service Catalog, select Administration, and then navigate to Portfolio details and select Groups, roles, and users as shown in Figure 6.
     
    Figure 6: AWS Service Catalog product

    Figure 6: AWS Service Catalog product

  2. After adding the role, you can see the products available for that role. You can switch roles on the console to assume the role that you granted access to for the product you added from the AWS Service Catalog. Select the three dots near the product name, and then select Launch product to launch the product, as shown in Figure 7.
     
    Figure 7: Launch the product

    Figure 7: Launch the product

  3. While launching the product, you can choose from the parameters to either enable or disable the automated remediation. Even if you do not enable fully automated remediation, you can still invoke a remediation action in the Security Hub console using a custom action. By default, it’s disabled, as highlighted in Figure 8.
     
    Figure 8: Enable or disable automated remediation

    Figure 8: Enable or disable automated remediation

  4. After launching the product, it can take from 3 to 5 minutes to deploy. When the product is deployed, it creates a new CloudFormation stack with a status of CREATE_COMPLETE as part of the provisioned product in the AWS CloudFormation console.

AssumeRole Lambda functions

Deploy the template that establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. You must deploy this template in the master account in addition to any member accounts. Choose CloudFormation and create a new stack. In Specify stack details, go to Parameters and specify the Master account number as shown in Figure 9.
 

Figure 9: Deploy AssumeRole Lambda function

Figure 9: Deploy AssumeRole Lambda function

Test the automated remediation

Now that you’ve completed the steps to deploy the solution, you can test it to be sure that it works as expected.

To test the automated remediation

  1. To test the solution, verify that 10 actions are listed on the Custom actions tab in the Security Hub master account. From the Security Hub master account, open the Security Hub console, select Settings, and then choose Custom actions. You should see 10 actions, as shown in Figure 10.
     
    Figure 10: Custom actions deployed

    Figure 10: Custom actions deployed

  2. Make sure you have member accounts available for testing the solution. If not, you can add member accounts to the master account as described in Adding and inviting member accounts.
  3. For testing purposes, you can use CIS control 1.5, which requires that the IAM password policy require at least one uppercase letter. Check the existing settings by navigating to IAM, and then to Account Settings. Under Password policy, you should see that there is no password policy set, as shown in Figure 11.
     
    Figure 11: Password policy not set

    Figure 11: Password policy not set

  4. To check the security settings, go to the Security Hub console and select Security standards. Choose CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 from the list to see the Findings. You will see the Status as Failed. This means that the password policy to require at least one uppercase letter hasn’t been applied to either the master or the member account, as shown in Figure 12.
     
    Figure 12: CIS 1.5 finding

    Figure 12: CIS 1.5 finding

  5. From the Actions dropdown at the top right of the Findings section from the previous step, select CIS 1.5 – 1.11. You should see a notification with the heading Successfully sent findings to Amazon CloudWatch Events, as shown in Figure 13.
     
    Figure 13: Sending findings to CloudWatch Events

    Figure 13: Sending findings to CloudWatch Events

  6. Return to Findings by selecting Security standards and then choosing CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 to review Findings and verify that the Workflow status of CIS 1.5 is RESOLVED, as shown in Figure 14.
     
    Figure 14: Resolved findings

    Figure 14: Resolved findings

  7. After the remediation runs, you can verify that the password policy is set on the master and the member accounts by navigating to IAM, and then to Account Settings. Under Password policy, you should see that the account now uses a password policy, as shown in Figure 15. (A CLI sketch for verifying this follows the procedure.)
     
    Figure 15: Password policy set

    Figure 15: Password policy set

  8. To check the CloudWatch logs for the Lambda function, in the console, go to Services, select Lambda, choose the Lambda function, and then select View logs in CloudWatch. You can see the details of the function run, including updating the password policy on both the master account and the member account, as shown in Figure 16.
     
    Figure 16: Lambda function log

    Figure 16: Lambda function log
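For reference, the remediation for CIS 1.5 is equivalent to turning on the uppercase-letter requirement in the account password policy, which you can also verify from the AWS CLI. The following is a sketch of that check, not the solution’s exact API calls:

# Enable the uppercase requirement manually (note: settings you don't specify revert to defaults)
aws iam update-account-password-policy --require-uppercase-characters

# Verify the resulting policy
aws iam get-account-password-policy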

Conclusion

In this post, you deployed the AWS Solution for Security Hub Automated Response and Remediation using Lambda and CloudWatch Events rules to remediate non-compliant CIS-related controls. With this solution, you can ensure that users in member accounts stay compliant with the CIS AWS Foundations Benchmark by automatically invoking guardrails whenever services move out of compliance. New or updated playbooks will be added to the existing AWS Service Catalog portfolio as they’re developed. You can choose when to take advantage of these new or updated playbooks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Ramesh Venkataraman

Ramesh is a Solutions Architect who enjoys working with customers to solve their technical challenges using AWS services. Outside of work, Ramesh enjoys following stack overflow questions and answers them in any way he can.

Set up centralized monitoring for DDoS events and auto-remediate noncompliant resources

Post Syndicated from Fola Bolodeoku original https://aws.amazon.com/blogs/security/set-up-centralized-monitoring-for-ddos-events-and-auto-remediate-noncompliant-resources/

When you build applications on Amazon Web Services (AWS), it’s a common security practice to isolate production resources from non-production resources by logically grouping them into functional units or organizational units. There are many benefits to this approach, such as making it easier to implement the principle of least privilege, or reducing the scope of adversely impactful activities that may occur in non-production environments. After building these applications, setting up monitoring for resource compliance and security risks, such as distributed denial of service (DDoS) attacks across your AWS accounts, is just as important. The recommended best practice to perform this type of monitoring involves using AWS Shield Advanced with AWS Firewall Manager, and integrating these with AWS Security Hub.

In this blog post, I show you how to set up centralized monitoring for Shield Advanced–protected resources across multiple AWS accounts by using Firewall Manager and Security Hub. This enables you to easily manage resources that are out of compliance from your security policy and to view DDoS events that are detected across multiple accounts in a single view.

Shield Advanced is a managed application security service that provides DDoS protection for your workloads against infrastructure layer (Layer 3–4) attacks, as well as application layer (Layer 7) attacks, by using AWS WAF. Firewall Manager is a security management service that enables you to centrally configure and manage firewall rules across your accounts and applications in an organization in AWS. Security Hub consumes, analyzes, and aggregates the security findings produced by AWS services and by your applications running on AWS. Security Hub integrates with Firewall Manager without the need for any action to be taken by you.

I’m going to cover two different scenarios that show you how to use Firewall Manager for:

  1. Centralized visibility into Shield Advanced DDoS events
  2. Automatic remediation of noncompliant resources

Scenario 1: Centralized visibility of DDoS detected events

This scenario represents a fully native and automated integration, where Shield Advanced DDoSDetected events (which indicate whether a DDoS event is underway for a particular Amazon Resource Name (ARN)) are made visible as security findings in Security Hub, through Firewall Manager.

Solution overview

Figure 1 shows the solution architecture for scenario 1.
 

Figure 1: Scenario 1 – Shield Advanced DDoS detected events visible in Security Hub

Figure 1: Scenario 1 – Shield Advanced DDoS detected events visible in Security Hub

The diagram illustrates a customer using AWS Organizations to isolate their production resources into the Production Organizational Unit (OU), with further separation into multiple accounts for each of the mission-critical applications. The resources in Account 1 are protected by Shield Advanced. The Security OU was created to centralize security functions across all AWS accounts and OUs, obscuring the visibility of the production environment resources from the Security Operations Center (SOC) engineers and other security staff. The Security OU is home to the designated administrator account for Firewall Manager and the Security Hub dashboard.

Scenario 1 implementation

You will be setting up Security Hub in an account that has the prerequisite services configured in it as explained below. Before you proceed, see the architecture requirements in the next section. Once Security Hub is enabled for your organization, you can simulate a DDoS event in strict accordance with the AWS DDoS Simulation Testing Policy or use one of AWS DDoS Test Partners.

Architecture requirements

In order to implement these steps, you must have the following:

Once you have all these requirements completed, you can move on to enable Security Hub.

Enable Security Hub

Note: If you plan to protect resources with Shield Advanced across multiple accounts and in multiple Regions, we recommend that you use the AWS Security Hub Multiaccount Scripts from AWS Labs. Security Hub needs to be enabled in all the Regions and all the accounts where you have Shield protected resources. For global resources, like Amazon CloudFront, you should enable Security Hub in the us-east-1 Region.

To enable Security Hub

  1. In the AWS Security Hub console, switch to the account you want to use as the designated Security Hub administrator account.
  2. Select the security standard or standards that are applicable to your application’s use-case, and choose Enable Security Hub.
     
    Figure 2: Enabling Security Hub

    Figure 2: Enabling Security Hub

  3. From the designated Security Hub administrator account, go to the Settings – Account tab, and add accounts by sending invites to all the accounts you want added as member accounts. An invited account becomes associated as a member account once its owner has accepted the invite and Security Hub has been enabled in it. You can also upload a comma-separated list of the accounts you want to send invites to. (A CLI sketch for adding and inviting member accounts follows this procedure.)
     
    Figure 3: Designating a Security Hub administrator account by adding member accounts

    Figure 3: Designating a Security Hub administrator account by adding member accounts
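You can also add and invite member accounts with the AWS CLI; the following is a minimal sketch, with placeholder account IDs and email address:

# From the designated Security Hub administrator account
aws securityhub create-members --account-details AccountId=111122223333,Email=member@example.com
aws securityhub invite-members --account-ids 111122223333

# From the member account, list and accept the invitation
aws securityhub list-invitations
aws securityhub accept-invitation --master-id 999999999999 --invitation-id <invitation-id>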

View detected events in Shield and Security Hub

When Shield Advanced detects signs of DDoS traffic that is destined for a protected resource, the Events tab in the Shield console displays information about the event detected and provides a status on the mitigation that has been performed. Following is an example of how this looks in the Shield console.
 

Figure 4: Scenario 1 - The Events tab on the Shield console showing a Shield event in progress

Figure 4: Scenario 1 – The Events tab on the Shield console showing a Shield event in progress

If you’re managing multiple accounts, switching between these accounts to view the Shield console to keep track of DDoS incidents can be cumbersome. Using the Amazon CloudWatch metrics that Shield Advanced reports for Shield events, visibility across multiple accounts and Regions is easier through a custom CloudWatch dashboard or by consuming these metrics in a third-party tool. For example, the DDoSDetected CloudWatch metric has a binary value, where a value of 1 indicates that an event that might be a DDoS has been detected. This metric is automatically updated by Shield when the DDoS event starts and ends. You only need permissions to access the Security Hub dashboard in order to monitor all events on production resources. Following is an example of what you see in the Security Hub console.
 

Figure 5: Scenario 1 - Shield Advanced DDoS alarm showing in Security Hub

Figure 5: Scenario 1 – Shield Advanced DDoS alarm showing in Security Hub
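To consume the same signal programmatically, you can query the DDoSDetected metric from the CLI. The following is a sketch; the namespace and dimension reflect my understanding of the Shield Advanced metrics, and the resource ARN and time range are placeholders:

aws cloudwatch get-metric-statistics \
  --namespace AWS/DDoSProtection \
  --metric-name DDoSDetected \
  --dimensions Name=ResourceArn,Value=arn:aws:cloudfront::111122223333:distribution/EXAMPLE123 \
  --start-time 2020-09-01T00:00:00Z \
  --end-time 2020-09-01T01:00:00Z \
  --period 60 \
  --statistics Maximum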

Configure Shield event notification in Firewall Manager

To increase your visibility into possible Shield events across your accounts, configure Firewall Manager to monitor your protected resources by using Amazon Simple Notification Service (Amazon SNS). With this configuration, Firewall Manager creates an Amazon SNS topic in each Region where you have protected resources and sends notifications of possible attacks to that topic. (A CLI sketch for subscribing to the topic follows the procedure below.)

To configure SNS topics in Firewall Manager

  1. In the Firewall Manager console, go to the Settings page.
  2. Under Amazon SNS Topic Configuration, select a Region.
  3. Choose Configure SNS Topic.
     
    Figure 6: The Firewall Manager Settings page for configuring SNS topics

    Figure 6: The Firewall Manager Settings page for configuring SNS topics

  4. Select an existing topic or create a new topic, and then choose Configure SNS Topic.
     
    Figure 7: Configure an SNS topic in a Region

    Figure 7: Configure an SNS topic in a Region
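To receive these notifications by email, you can subscribe an endpoint to the configured topic; the following AWS CLI sketch uses a placeholder topic ARN and email address:

aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:111122223333:fms-shield-notifications \
  --protocol email \
  --notification-endpoint security-team@example.com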

Scenario 2: Automatic remediation of noncompliant resources

The second scenario is an example in which a new production resource is created, and Security Hub has full visibility of the compliance state of the resource.

Solution overview

Figure 8 shows the solution architecture for scenario 2.
 

Figure 8: Scenario 2 – Visibility of Shield Advanced noncompliant resources in Security Hub

Figure 8: Scenario 2 – Visibility of Shield Advanced noncompliant resources in Security Hub

Firewall Manager identifies that the resource is out of compliance with the defined policy for Shield Advanced and posts a finding to Security Hub, notifying your operations team that a manual action is required to bring the resource into compliance. If configured, Firewall Manager can automatically bring the resource into compliance by creating it as a Shield Advanced–protected resource, and then update Security Hub when the resource is in a compliant state.

Scenario 2 implementation

The following steps describe how to use Firewall Manager to enforce Shield Advanced protection compliance of an application that is deployed to a member account within AWS Organizations. This implementation assumes that you set up Security Hub as described for scenario 1.

Create a Firewall Manager security policy for Shield Advanced protected resources

In this step, you create a Shield Advanced security policy that will be enforced by Firewall Manager. For the purposes of this walkthrough, you’ll choose to automatically remediate noncompliant resources and apply the policy to Application Load Balancer (ALB) resources.

To create the Shield Advanced policy

  1. Open the Firewall Manager console in the designated Firewall Manager administrator account.
  2. In the left navigation pane, choose Security policies, and then choose Create a security policy.
  3. Select AWS Shield Advanced as the policy type, and select the Region where your protected resources are. Choose Next.

    Note: You will need to create a security policy for each Region where you have regional resources, such as Elastic Load Balancers and Elastic IP addresses, and a security policy for global resources such as CloudFront distributions.

    Figure 9: Select the policy type and Region

    Figure 9: Select the policy type and Region

  4. On the Describe policy page, for Policy name, enter a name for your policy.
  5. For Policy action, you have the option to configure automatic remediation of noncompliant resources or to only send alerts when resources are noncompliant. You can change this setting after the policy has been created. For the purposes of this blog post, I’m selecting Auto remediate any noncompliant resources. Select your option, and then choose Next.

    Important: It’s a best practice to first identify and review noncompliant resources before you enable automatic remediation.

  6. On the Define policy scope page, define the scope of the policy by choosing which AWS accounts, resource type, or resource tags the policy should be applied to. For the purposes of this blog post, I’m selecting to manage Application Load Balancer (ALB) resources across all accounts in my organization, with no preference for resource tags. When you’re finished defining the policy scope, choose Next.
     
    Figure 10: Define the policy scope

    Figure 10: Define the policy scope

  7. Review and create the policy. Once you’ve reviewed and created the policy in the Firewall Manager designated administrator account, the policy will be pushed to all the Firewall Manager member accounts for enforcement. The new policy could take up to 5 minutes to appear in the console. Figure 11 shows a successful security policy propagation across accounts.
     
    Figure 11: View security policies in an account

    Figure 11: View security policies in an account

Test the Firewall Manager and Security Hub integration

You’ve now defined a policy to cover only ALB resources, so the best way to test this configuration is to create an ALB in one of the Firewall Manager member accounts. This policy causes resources within the policy scope to be added as protected resources.

To test the policy

  1. Switch to the Security Hub administrator account and open the Security Hub console in the same Region where you created the ALB. On the Findings page, set the Title filter to Resource lacks Shield Advanced protection and set the Product name filter to Firewall Manager.
     
    Figure 12: Security Hub findings filter

    Figure 12: Security Hub findings filter

    You should see a new security finding flagging the ALB as a noncompliant resource, according to the Shield Advanced policy defined in Firewall Manager. This confirms that Security Hub and Firewall Manager have been enabled correctly.
     

    Figure 13: Security Hub with a noncompliant resource finding

    Figure 13: Security Hub with a noncompliant resource finding

  2. With the automatic remediation feature enabled, you should see the “Updated at” time reflect exactly when the automatic remediation actions were completed. The completion of the automatic remediation actions can take up to 5 minutes to be reflected in Security Hub.
     
    Figure 14: Security Hub with an auto-remediated compliance finding

    Figure 14: Security Hub with an auto-remediated compliance finding

  3. Go back to the account where you created the ALB, and in the Shield Protected Resources console, navigate to the Protected Resources page, where you should see the ALB listed as a protected resource.
     
    Figure 15: Shield console in the member account shows that the new ALB is a protected resource

    Figure 15: Shield console in the member account shows that the new ALB is a protected resource

    Confirming that the ALB has been added automatically as a Shield Advanced–protected resource means that you have successfully configured the Firewall Manager and Security Hub integration.

(Optional): Send a custom action to a third-party provider

You can send all regional Security Hub findings to a ticketing system, Slack, AWS Chatbot, a Security Information and Event Management (SIEM) tool, a Security Orchestration, Automation, and Response (SOAR) tool, incident management tools, or custom remediation playbooks by using Security Hub custom actions.

Conclusion

In this blog post I showed you how to set up a Firewall Manager security policy for Shield Advanced so that you can monitor your applications for DDoS events, and their compliance to DDoS protection policies in your multi-account environment from the Security Hub findings console. In line with best practices for account governance, organizations should have a centralized security account that performs monitoring for multiple accounts. Security Hub and Firewall Manager provide a centralized solution to help you achieve your compliance and monitoring goals for DDoS protection.

If you’re interested in exploring how Shield Advanced and AWS WAF help to improve the security posture of your application, have a look at the following resources:

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Fola Bolodeoku

Fola is a Security Engineer on the AWS Threat Research Team, where he focuses on helping customers improve their application security posture against DDoS and other application threats. When he is not working, he enjoys spending time exploring the natural beauty of the Western Cape.

Snowflake: Running Millions of Simulation Tests with Amazon EKS

Post Syndicated from Keith Joelner original https://aws.amazon.com/blogs/architecture/snowflake-running-millions-of-simulation-tests-with-amazon-eks/

This post was co-written with Brian Nutt, Senior Software Engineer and Kao Makino, Principal Performance Engineer, both at Snowflake.

Transactional databases are a key component of any production system. Maintaining data integrity while rows are read and written at a massive scale is a major technical challenge for these types of databases. To ensure their stability, it’s necessary to test many different scenarios and configurations. Simulating as many of these as possible allows engineers to quickly catch defects and build resilience. But the Holy Grail is to accomplish this at scale and within a timeframe that allows your developers to iterate quickly.

Snowflake has been using and advancing FoundationDB (FDB), an open-source, ACID-compliant, distributed key-value store, since 2014. FDB, running on Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic Block Store (Amazon EBS), has proven to be extremely reliable and is a key part of Snowflake’s cloud services layer architecture. To support its development process of creating high-quality, stable software, Snowflake developed Project Joshua, an internal system that leverages Amazon Elastic Kubernetes Service (EKS), Amazon Elastic Container Registry (ECR), Amazon EC2 Spot Instances, and AWS PrivateLink to run over one hundred thousand validation and regression tests an hour.

About Snowflake

Snowflake is a single, integrated data platform delivered as a service. Built from the ground up for the cloud, Snowflake’s unique multi-cluster shared data architecture delivers the performance, scale, elasticity, and concurrency that today’s organizations require. It features storage, compute, and global services layers that are physically separated but logically integrated. Data workloads scale independently from one another, making it an ideal platform for data warehousing, data lakes, data engineering, data science, modern data sharing, and developing data applications.

Snowflake architecture

Developing a simulation-based testing and validation framework

Snowflake’s cloud services layer is composed of a collection of services that manage virtual warehouses, query optimization, and transactions. This layer relies on rich metadata stored in FDB.

Prior to the creation of the simulation framework, Project Joshua, FDB developers ran tests on their laptops and were limited by the number they could run. Additionally, there was a scheduled nightly job for running further tests.

Joshua at Snowflake

Amazon EKS as the foundation

Snowflake’s platform team decided to use Kubernetes to build Project Joshua. Their focus was on helping engineers run their workloads instead of spending cycles on the management of the control plane. They turned to Amazon EKS to achieve their scalability needs. This was a crucial success criterion for Project Joshua since at any point in time there could be hundreds of nodes running in the cluster. Snowflake utilizes the Kubernetes Cluster Autoscaler to dynamically scale worker nodes in minutes to support a tests-based queue of Joshua’s requests.

With the integration of Amazon EKS and Amazon Virtual Private Cloud (Amazon VPC), Snowflake is able to control access to the required resources. For example: the database that serves Joshua’s test queues is external to the EKS cluster. By using the Amazon VPC CNI plugin, each pod receives an IP address in the VPC and Snowflake can control access to the test queue via security groups.

To achieve its desired performance, Snowflake created its own custom pod scaler, which responds quicker to changes than using a custom metric for pod scheduling.

  • The agent scaler is responsible for monitoring a test queue in the coordination database (which, coincidentally, is also FDB) to schedule Joshua agents. The agent scaler communicates directly with Amazon EKS using the Kubernetes API to schedule tests in parallel.
  • Joshua agents (one agent per pod) are responsible for pulling tests from the test queue, executing, and reporting results. Tests are run one at a time within the EKS Cluster until the test queue is drained.

Achieving scale and cost savings with Amazon EC2 Spot

A Spot Fleet is a collection—or fleet—of Amazon EC2 Spot instances that Joshua uses to make the infrastructure more reliable and cost effective. Spot Fleet is used to reduce the cost of worker nodes by running a variety of instance types.

With Spot Fleet, Snowflake requests a combination of different instance types to help ensure that demand gets fulfilled. These options make the fleet more tolerant of surges in demand for particular instance types. If a surge occurs, it does not significantly affect tasks, because Joshua is agnostic to the type of instance and can fall back to a different instance type and still be available.

For reservations, Snowflake uses the capacity-optimized allocation strategy to automatically launch Spot Instances into the most available pools by looking at real-time capacity data and predicting which are the most available. This helps Snowflake quickly switch to the instance pools that are most available in the Spot market, instead of spending time contending for the cheapest instances, at the cost of a potentially higher price.

Overcoming hurdles

Snowflake’s usage of a public container registry posed a scalability challenge. When starting hundreds of worker nodes, each node needs to pull images from the public registry. This can lead to a potential rate limiting issue when all outbound traffic goes through a NAT gateway.

For example, consider 1,000 nodes pulling a 10 GB image. Each pull request requires each node to download the image across the public internet. Some issues that need to be addressed are latency, reliability, and increased costs due to the additional time to download an image for each test. Also, container registries can become unavailable or may rate-limit download requests. Lastly, images are pulled through public internet and other services in the cluster can experience pulling issues.

For anything more than a minimal workload, a local container registry is needed. If an image is first pulled from the public registry and then pushed to a local registry (cache), it only needs to be pulled once from the public registry; afterward, all worker nodes benefit from a local pull. That’s why Snowflake decided to replicate images to Amazon ECR, a fully managed Docker container registry, providing a reliable local registry to store images. An additional benefit of the local registry is that it’s not exclusive to Joshua; all platform components required for Snowflake clusters can be cached in the local ECR registry. For additional security and performance, Snowflake uses AWS PrivateLink to keep all network traffic from ECR to the worker nodes within the AWS network. It also resolved rate-limiting issues from pulling images from a public registry with unauthenticated requests, unblocking other cluster nodes from pulling critical images for operation.

Conclusion

Project Joshua allows Snowflake to let developers test more scenarios without having to worry about managing the infrastructure. Snowflake’s engineers can schedule thousands of test simulations and configurations to catch bugs faster. FDB is a key component of the Snowflake stack, and Project Joshua helps make FDB more stable and resilient. Additionally, Amazon EC2 Spot has provided non-trivial cost savings to Snowflake compared with running On-Demand Instances or buying Reserved Instances.

If you want to know more about how Snowflake built its high performance data warehouse as a Service on AWS, watch the This is My Architecture video below.

How to secure your Amazon WorkSpaces for external users

Post Syndicated from Olivia Carline original https://aws.amazon.com/blogs/security/how-to-secure-your-amazon-workspaces-for-external-users/

In response to the current shift towards a remote workforce, companies are providing greater access to corporate applications from a range of different devices. Amazon WorkSpaces is a desktop-as-a-service solution that can be used to quickly deploy cloud-based desktops to your external users, including employees, third-party vendors, and consultants. Amazon WorkSpaces desktops are accessible from anywhere with an internet connection. In this blog post, I review some key security controls that you can use to architect your Amazon WorkSpaces environment to provide external users access to your corporate applications and data in a way that satisfies your unique security and compliance objectives.

Amazon WorkSpaces provides a virtual desktop infrastructure that removes the need for upfront infrastructure expenditure. Instead, you can pay for Windows or Linux desktop environments as you need them. These environments can be provisioned in a few minutes, and enable you to scale up to thousands of desktops that can be accessed from wherever your users are located.

As part of the shared responsibility model, security is a shared responsibility between Amazon Web Services (AWS) and you. AWS is responsible for protecting the infrastructure that runs the AWS services while you are responsible for securing your data in AWS through appropriate permissions and WorkSpace management as outlined in the Best Practices for Deploying Amazon WorkSpaces whitepaper. Amazon WorkSpaces has been independently assessed to meet the requirements of a wide range of compliance programs, including IRAP, SOC, PCI DSS, FedRAMP, and HIPAA.

Prerequisites

Define user groups

A user group is a collection of people who all have the same security rights and permissions. Leveraging user groups helps you to identify the types of access and your requirements for user authentication. How you define your user groups should reflect how you classify your data and the access controls associated with the classifications. A common approach is to begin by separating your internal (employees) and external (vendors and consultants) users. Classifying your users into different groups helps you to define your security controls. For example, the security and configuration of your external users’ devices will be different from the configuration for your internal users’ devices. The identification process also helps to ensure that you’re following the principle of least privilege by limiting access to certain applications or resources. These user groups are the building blocks for designing the rest of your security controls, including the directories, access controls, and security groups.

In this blog post, I walk you through the security configurations for the following example external user groups. How you configure security for your user groups will depend on your own security requirements.

Example user groups

Internal users: Employees who need access to company resources from any location. In addition to having access to the internet and the internal network from any supported device, internal users have administrator access on their virtual desktops so they can install applications.

External users: Third-party vendors and consultants who need access to specific websites that are inside the corporate network. They have fewer permissions and tighter guardrails on their virtual desktops and can only access resources through trusted devices. External users should have access to only pre-installed applications and not be able to install additional applications onto their WorkSpaces.

At this stage, it’s okay to separate your user groups broadly based on the preceding requirements. Later, you can configure fine-grained access controls for individual users.

Configure your directories

Amazon WorkSpaces uses directories to manage information and configuration of WorkSpaces and users. Each WorkSpace that you provision exists within a directory. There are a couple of different options for configuring the directory. Amazon WorkSpaces can create and manage a directory for you so that users are entered into that directory when you provision a WorkSpace. As an alternative, you can integrate WorkSpaces with an existing, on-premises Microsoft Active Directory (AD) so your users can use the credentials they already know to access applications.

Within Amazon WorkSpaces, directories play a large part in how access to WorkSpaces is configured. Based on the preceding two example user groups, let’s split your users’ WorkSpaces across two directories, which helps you establish different access control settings for the two groups.

To define the two directories, you must set up the directories within AWS Directory Service. As previously mentioned, there are various approaches to handling user management that depend on your existing user directories and requirements. For this example, you can configure two simple Active Directories—one for internal users and one for external users. Handling the external users in a separate directory allows you to ensure your user groups are configured with least privilege. With this approach, external users can still be given access to objects inside the internal directory through a trust if required but can be configured with stricter access controls than users inside the internal directory.

A comprehensive guide to setting up your directories is available in the Amazon WorkSpaces administration guide and outlines the steps to configure a directory using AWS Managed Microsoft AD, Simple AD, or AD Connector.

Configure security settings

After you define what privileges and access controls you want in place for your external users and configure the directories you need, it’s time to establish the security controls for your WorkSpaces. This blog will focus on the external users’ security configurations from the prerequisites. Use the following steps to implement the security requirements:

  1. Establish security groups
  2. Disable local administrator rights
  3. Configure IP access control groups
  4. Define trusted devices
  5. Configure monitoring of WorkSpaces

Establish security groups

With your two AD directories configured, you can start implementing the security controls for your external users. Your Amazon WorkSpaces are configured within a logically isolated network known as Amazon Virtual Private Cloud (VPC). A key concept within Amazon VPC is security groups, which act as virtual firewalls to control inbound and outbound traffic to the virtual desktops. A properly configured security group can limit access to resources in your network or to the internet at the individual WorkSpace level or at the directory level.

To ensure that your external users can access only the network resources you want them to, you can define security groups with restrictive network access settings. One approach is to configure security groups so that your external users have only HTTP and HTTPS access to specific internal websites, identified by trusted IP address ranges. To define more fine-grained access control for individual users, you can define another restrictive security group and attach it to an individual user’s WorkSpace. This way, you can use a single directory to handle many different users with different network security requirements and ensure that third-party users only have access to authorized data and systems. In addition to security groups, you can use your preferred host-based firewall on a given WorkSpace to limit network access to resources within the VPC. (An AWS CLI sketch of this security group configuration follows the console procedure below.)

To establish and configure security groups

  1. In the Amazon WorkSpaces menu, select Directories from the left menu. Choose the directory you created for your external users. Select Actions and then Update Details as shown in the following figure.
     
    Figure 1: Updating details of your directory

    Figure 1: Updating details of your directory

  2. In the Update Directory Details screen that appears, select the down arrow next to Security Group to expand the section. Select Create New next to the dropdown menu to configure a new security group.
     
    Figure 2: Adding a security group to your directory

    Figure 2: Adding a security group to your directory

  3. In the next window, select Create security group.
  4. Enter a descriptive name for the Security group name and a description for the security group in Description. For example, the description could be external-workspaces-users-sg.
  5. Change the VPC using the dropdown menu to the VPC hosting the WorkSpaces.
  6. In the Inbound rules section, leave the rules as default. The default configuration blocks everything except for sessions that have already been established from the WorkSpace.
  7. In the Outbound rules section, configure the following settings:
    1. Select Delete the existing outbound rule.
    2. Select Add rule.
    3. Set Type to HTTP.
    4. Leave Protocol as TCP and Port range as 80.
    5. Change Destination to Custom and enter the appropriate CIDR range based on where your internal resources are located.
    6. Select Add rule again.
    7. Set Type to HTTPS.
    8. Leave Protocol as TCP and Port range as 443.
    9. Change Destination to Custom and enter the appropriate CIDR range based on where your internal resources are located.
    Figure 3: Configuring your security groups

    Figure 3: Configuring your security groups

  8. Select Create security group.
  9. Return to the WorkSpaces directory tab and select Refresh to see the newly created security group.
  10. Select Update and Exit.
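The console procedure above can also be expressed with the AWS CLI. The following is a minimal sketch; the VPC ID, security group ID, and destination CIDR range are placeholders that you would replace with your own values:

# Create the restrictive security group for external users
aws ec2 create-security-group \
  --group-name external-workspaces-users-sg \
  --description "Restrict external WorkSpaces users to internal web resources" \
  --vpc-id vpc-0123456789abcdef0

# Remove the default allow-all outbound rule
aws ec2 revoke-security-group-egress \
  --group-id sg-0123456789abcdef0 \
  --ip-permissions IpProtocol=-1,IpRanges=[{CidrIp=0.0.0.0/0}]

# Allow outbound HTTP and HTTPS only to the internal network range
aws ec2 authorize-security-group-egress \
  --group-id sg-0123456789abcdef0 \
  --ip-permissions IpProtocol=tcp,FromPort=80,ToPort=80,IpRanges=[{CidrIp=10.0.0.0/16}] IpProtocol=tcp,FromPort=443,ToPort=443,IpRanges=[{CidrIp=10.0.0.0/16}]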

Disable local administrator rights

One of the recommendations for external users is to disable the local administrator setting on their WorkSpaces and provide them with access to only specific, preinstalled applications. This guardrail helps to ensure that external users have limited permissions and reduces the risk that they might access or share sensitive information. If local administrator access isn’t disabled, users can install applications and modify settings on their WorkSpaces. You can disable local administrator access from within the external users’ directory. Changes to the directory are applied to all new WorkSpaces that you create and can be applied to existing WorkSpaces by rebuilding them after making the changes.

Note: If your internal users don’t need local administrator access, it’s a best practice to follow the principle of least privilege and disable it for them as well.

To disable local administrator rights for external users

  1. In the Amazon WorkSpaces menu, select Directories from the left menu. Choose the directory you configured for your external users.
  2. Select Actions and then Update Details.
  3. In Update Directory Details, select Local Administrator Setting and choose the Disable radio button.
  4. Select Update and Exit as shown in the following figure.
     
    Figure 4: Disabling your local administrator setting

    Figure 4: Disabling your local administrator setting

Define IP access control

So far, the security groups you defined allow external users access to company resources only from inside the corporate network. You can enhance this security configuration by using IP access control groups to limit traffic and allow only certain IPs to access the WorkSpaces. An IP access control group acts as a virtual firewall and filters access to WorkSpaces by controlling the source classless inter-domain routing (CIDR) ranges that users can access their WorkSpaces from. Each IP access control group consists of a set of rules that specify a permitted IP address or range of addresses that Amazon WorkSpaces can be accessed from. Using this feature, you can configure rules that permit access to your WorkSpaces only if the traffic is coming from your company’s VPN. To achieve this control, define rules that specify the ranges of IP addresses for your trusted networks within IP access control groups associated with the external users’ directory. (A CLI sketch for creating and associating an IP group follows the procedure below.)

Note: Currently only IPV4 addresses are permitted.

To define IP access control

  1. Inside the Amazon WorkSpaces page, select IP Access Controls on the left panel. Select Create IP Group and enter a Group Name and Description in the window that appears.
  2. Select Create as shown in the following figure.
     
    Figure 5: Creating an IP group

    Figure 5: Creating an IP group

  3. Select the box next to the IP group you just created to open the new rules form.
  4. Select Add Rule.
  5. Enter the individual IP addresses or CIDR IP ranges that you want to allow WorkSpaces to have access from in Source. If you want to restrict access to your VPN make sure to add the public IPs of the VPN. Enter a description in Description.
  6. Select Save as shown in the following figure.
     
    Figure 6: Adding rules to your IP group

    Figure 6: Adding rules to your IP group
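You can create and associate the same IP group with the AWS CLI; the following sketch uses placeholder values for the CIDR range, directory ID, and group ID:

# Create the IP group with a rule for your VPN's public range
aws workspaces create-ip-group \
  --group-name external-users-ip-group \
  --group-desc "Trusted VPN ranges for external users" \
  --user-rules ipRule=198.51.100.0/24,ruleDesc=corporate-vpn

# Associate the IP group with the external users' directory
aws workspaces associate-ip-groups \
  --directory-id d-1234567890 \
  --group-ids wsipg-0123456789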

Configure trusted devices

Regulating the devices that can connect to your workspaces can help reduce the risk of unauthorized access to your network and applications. By default, all Amazon WorkSpaces users can access their virtual desktop from any supported device that has internet connectivity. However, it’s a good practice to configure additional guardrails to limit external users to only accessing their WorkSpaces through trusted devices, otherwise known as managed devices (currently this feature only applies to Amazon WorkSpaces Windows and macOS clients). With this feature enabled, only devices that have been authenticated through a certificate-based approach will have access to WorkSpaces. If the WorkSpaces client application cannot verify that a device is trusted, it blocks attempts to log in or connect from the device.

Note: If you haven’t already configured certificates, you will need to follow the steps in the Amazon WorkSpaces Administration Guide that walkthrough the requirements of the certificates as well as the process to generate one.

To configure trusted devices

  1. In the Amazon WorkSpaces menu, select Directories in the left menu. After selecting the directory that has been configured for your external users, select Actions and then Update Details.
  2. In Update Directory Details, select Access Control Options. Select Allow next to Windows and macOS to allow only trusted Windows and macOS devices to access WorkSpaces.
  3. Select Import to import your root certificate.
  4. Next to Other Platforms, select Block so that only Windows and macOS devices will have access.
  5. Select Update and Exit.
     
    Figure 7: Establishing trusted devices

    Figure 7: Establishing trusted devices

  6. Test your settings by trying to access one of your WorkSpaces from a trusted device and from a non-trusted device.

Use Amazon CloudWatch to monitor your WorkSpaces

Once the guardrails for your external users have been set up, it’s important to monitor your environment for suspicious behavior and potential threats. Monitoring your infrastructure should be a fundamental aspect of your security plan. Amazon WorkSpaces is natively integrated with Amazon CloudWatch, which you can use to gather and analyze metrics to gain visibility into individual WorkSpaces and at the directory level. Alongside metrics, you can use Amazon CloudWatch Events to gain visibility into your Amazon WorkSpaces fleet, so that you can view, filter, and automatically respond to logins to your WorkSpaces in real time. A comprehensive example of this approach is outlined in this blog post that covers the steps involved in setting up a CloudWatch-based monitoring system for your WorkSpaces.
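As a starting point, the following sketch creates a CloudWatch Events rule that matches WorkSpaces client login events and forwards them to a Lambda function for filtering and response. The rule name and function ARN are placeholders, and the event pattern reflects my understanding of the WorkSpaces Access event type:

# Match WorkSpaces client login events
aws events put-rule \
  --name workspaces-login-events \
  --event-pattern '{"source":["aws.workspaces"],"detail-type":["WorkSpaces Access"]}'

# Forward matching events to a Lambda function for analysis (placeholder ARN)
aws events put-targets \
  --rule workspaces-login-events \
  --targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:111122223333:function:process-workspaces-logins"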

Conclusion

While you’ve used Amazon WorkSpaces features to help provide secure access for your external users, it’s also important to implement the principle of least privilege across all WorkSpaces users. You can use the design considerations and procedures in this blog post to help secure your WorkSpaces for all users, internal and external. To learn more about best practices for securing your Amazon WorkSpaces, read the Best Practices for Deploying Amazon WorkSpaces whitepaper, which covers other features and capabilities that are available.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon WorkSpaces forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Olivia Carline

Olivia is an Associate Solutions Architect working in the public sector team. In her role she enjoys helping customers up-skill and build their cloud knowledge with a particular focus on cloud security. In her free time, you can find her exploring local hiking tracks and trying out new recipes.

Integrating CloudEndure Disaster Recovery into your security incident response plan

Post Syndicated from Gonen Stein original https://aws.amazon.com/blogs/security/integrating-cloudendure-disaster-recovery-into-your-security-incident-response-plan/

An incident response plan (also known as procedure) contains the detailed actions an organization takes to prepare for a security incident in its IT environment. It also includes the mechanisms to detect, analyze, contain, eradicate, and recover from a security incident. Every incident response plan should contain a section on recovery, which outlines scenarios ranging from single component to full environment recovery. This recovery section should include disaster recovery (DR), with procedures to recover your environment from complete failure. Effective recovery from an IT disaster requires tools that can automate preparation, testing, and recovery processes. In this post, I explain how to integrate CloudEndure Disaster Recovery into the recovery section of your incident response plan. CloudEndure Disaster Recovery is an Amazon Web Services (AWS) DR solution that enables fast, reliable recovery of physical, virtual, and cloud-based servers on AWS. This post also discusses how you can use CloudEndure Disaster Recovery to reduce downtime and data loss when responding to a security incident, and best practices for maintaining your incident response plan.

How disaster recovery fits into a security incident response plan

The AWS Well-Architected Framework security pillar provides guidance to help you apply best practices and current recommendations in the design, delivery, and maintenance of secure AWS workloads. It includes a recommendation to integrate tools to secure and protect your data. A secure data replication and recovery tool helps you protect your data if there’s a security incident and quickly return to normal business operation as you resolve the incident. The recovery section of your incident response plan should define recovery point objectives (RPOs) and recovery time objectives (RTOs) for your DR-protected workloads. RPO is the window of time that data loss can be tolerated due to a disruption. RTO is the amount of time permitted to recover workloads after a disruption.

Your DR response to a security incident can vary based on the type of incident you encounter. For example, your DR plan for responding to a security incident such as ransomware—which involves data corruption—should describe how to recover workloads on your secondary DR site using a recovery point prior to the data corruption. This use case will be discussed further in the next section.

In addition to tools and processes, your security incident response plan should define the roles and responsibilities necessary during an incident. This includes the people and roles in your organization who perform incident mitigation steps, in addition to those who need to be informed and consulted. This can include technology partners, application owners, or subject matter experts (SMEs) outside of your organization who can offer additional expertise. DR-related roles for your incident response plan include:

  • A person who analyzes the situation and provides visibility to decision-makers.
  • A person who decides whether or not to trigger a DR response.
  • A person who actively triggers the DR response.

Be sure to include all of the stakeholders you identify in your documented security incident response procedures and runbooks. Test your plan to verify that the people in these roles have the pre-provisioned access they need to perform their defined role.

How to use CloudEndure Disaster Recovery during a security incident

CloudEndure Disaster Recovery continuously replicates your servers—including OS, system state configuration, databases, applications, and files—to a staging area in your target AWS Region. The staging area contains low-cost resources automatically provisioned and managed by CloudEndure Disaster Recovery. This reduces the cost of provisioning duplicate resources during normal operation. Your fully provisioned recovery environment is launched only during an incident or drill.

If your organization experiences a security incident that can be remediated using DR, you can use CloudEndure Disaster Recovery to perform failover to your target AWS Region from your source environment. When you perform failover, CloudEndure Disaster Recovery orchestrates the recovery of your environment in your target AWS Region. This enables quick recovery, with RPOs of seconds and RTOs of minutes.

To deploy CloudEndure Disaster Recovery, you must first install the CloudEndure agent on the servers in your environment that you want to replicate for DR, and then initiate data replication to your target AWS Region. Once data replication is complete and your data is in sync, you can launch machines in your target AWS Region from the CloudEndure User Console. CloudEndure Disaster Recovery enables you to launch target machines in either Test Mode or Recovery Mode. Your launched machines behave the same way in either mode; the only difference is how the machine lifecycle is displayed in the CloudEndure User Console. Launch machines by opening the Machines page, shown in the following figure, and selecting the machines you want to launch. Then select either Test Mode or Recovery Mode from the Launch Target Machines menu.
 

Figure 1: Machines page on the CloudEndure User Console

You can launch your entire environment, a group of servers comprising one or more applications, or a single server in your target AWS Region. When you launch machines from the CloudEndure User Console, you’re prompted to choose a recovery point from the Choose Recovery Point dialog box (shown in the following figure).

Use point-in-time recovery to respond to security incidents that involve data corruption, such as ransomware. Your incident response plan should include a mechanism to determine when data corruption occurred. Knowing how to determine which recovery point to choose in the CloudEndure User Console helps you minimize response time during a security incident. Each recovery point is a point-in-time snapshot of your servers that you can use to launch recovery machines in your target AWS Region. Select the latest recovery point before the data corruption to restore your workloads on AWS, and then choose Continue With Launch.
 

Figure 2: Selection of an earlier recovery point from the Choose Recovery Point dialog box

Run your recovered workloads in your target AWS Region until you’ve resolved the security incident. When the incident is resolved, you can perform failback to your primary environment using CloudEndure Disaster Recovery. You can learn more about CloudEndure Disaster Recovery setup, operation, and recovery by taking this online CloudEndure Disaster Recovery Technical Training.

Test and maintain the recovery section of your incident response plan

Your entire incident response plan must be kept accurate and up to date in order to effectively remediate security incidents if they occur. A best practice for achieving this is through frequently testing all sections of your plan, including your tools. When you first deploy CloudEndure Disaster Recovery, begin running tests as soon as all of your replicated servers are in sync on your target AWS Region. DR solution implementation is generally considered complete when all initial testing has succeeded.

By correctly configuring the networking and security groups in your target AWS Region, you can use CloudEndure Disaster Recovery to launch a test workload in an isolated environment without impacting your source environment. You can run tests as often as you want. Tests don’t incur additional fees beyond payment for the fully provisioned resources generated during tests.

Testing involves two main components: launching the machines you wish to test on AWS, and performing user acceptance testing (UAT) on the launched machines.

  1. Launch machines to test.
     
    Select the machines to test from the Machines page of the CloudEndure User Console by selecting the check box next to the machine. Then choose Test Mode from the Launch Target Machines menu, as shown in the following figure. You can select the latest recovery point or an earlier recovery point.
     
    Figure 3: Select Test Mode to launch selected machines

     

    The following figure shows the CloudEndure User Console. The Disaster Recovery Lifecycle column shows that the machines have been Tested Recently.

    Figure 4: Machines launched in Test Mode display purple icons in the Status column and Tested Recently in the Disaster Recovery Lifecycle column

  2. Perform UAT testing.
     
    Begin UAT testing when the machine launch job is successfully completed and your target machines have booted.

After you’ve successfully deployed, configured, and tested CloudEndure Disaster Recovery on your source environment, add it to your ongoing change management processes so that your incident response plan remains accurate and up-to-date. This includes deploying and testing CloudEndure Disaster Recovery every time you add new servers to your environment. In addition, monitor for changes to your existing resources and make corresponding changes to your CloudEndure Disaster Recovery configuration if necessary.

How CloudEndure Disaster Recovery keeps your data secure

CloudEndure Disaster Recovery includes multiple mechanisms to keep your data secure without introducing new security risks. Data replication is performed using AES 256-bit encryption in transit. Data at rest can be encrypted by using Amazon Elastic Block Store (Amazon EBS) encryption with an AWS managed key or a customer managed key. Amazon EBS encryption is supported by all volume types, and includes built-in key management infrastructure that has no performance impact. Replication traffic is transmitted directly from your source servers to your target AWS Region, and can be restricted to private connectivity such as AWS Direct Connect or a VPN. CloudEndure Disaster Recovery is ISO 27001 and GDPR compliant and HIPAA eligible.

Summary

Each organization tailors its incident response plan to meet its unique security requirements. As described in this post, you can use CloudEndure Disaster Recovery to improve your organization’s incident response plan. I also explained how to recover from an earlier point in time when you respond to security incidents involving data corruption, and how to test your servers as part of maintaining the DR section of your incident response plan. By following the guidance in this post, you can improve your IT resilience and recover more quickly from security incidents. You can also reduce your DR operational costs by avoiding duplicate provisioning of your DR infrastructure.

Visit the CloudEndure Disaster Recovery product page if you would like to learn more. You can also view the AWS Raise the Bar on Data Protection and Security webinar series for additional information on how to protect your data and improve IT resilience on AWS.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Gonen Stein

Gonen is the Head of Product Strategy for CloudEndure, an AWS company. He combines his expertise in business, cloud infrastructure, storage, and information security to assist enterprise organizations with developing and deploying IT resilience and business continuity strategies in the cloud.

New! Streamline existing IAM Access Analyzer findings using archive rules

Post Syndicated from Andrea Nedic original https://aws.amazon.com/blogs/security/new-streamline-existing-iam-access-analyzer-findings-using-archive-rules/

AWS Identity and Access Management (IAM) Access Analyzer generates comprehensive findings to help you identify resources that grant public and cross-account access. Now, you can also apply archive rules to existing findings, so you can better manage findings and focus on the findings that need your attention most.

You can think of archive rules as similar to email rules. You define email rules to automatically organize emails. With IAM Access Analyzer, you can define archive rules to automatically mark findings as intended access. Now, those rules can apply to existing as well as new IAM Access Analyzer findings. This helps you focus on findings for potential unintended access to your resources. You can then easily track and resolve these findings by reducing access, helping you to work towards least privilege.

In this post, first I give a brief overview of IAM Access Analyzer. Then I show you an example of how to create an archive rule to automatically archive findings for intended access. Finally, I show you how to update an archive rule to mark existing active findings as intended.

IAM Access Analyzer overview

IAM Access Analyzer helps you determine which resources can be accessed publicly or from other accounts or organizations. IAM Access Analyzer determines this by mathematically analyzing access control policies attached to resources. This form of analysis—called automated reasoning—applies logic and mathematical inference to determine all possible access paths allowed by a resource policy. This is how IAM Access Analyzer uses provable security to deliver comprehensive findings for potential unintended bucket access. You can enable IAM Access Analyzer in the IAM console by creating an analyzer for an account or an organization. Once you’ve created your analyzer, you can review findings for resources that can be accessed publicly or from other AWS accounts or organizations.

Create an archive rule to automatically archive findings for intended access

When you review findings and discover common patterns for intended access, you can create archive rules to automatically archive those findings. This helps you focus on findings for unintended access to your resources, just like email rules help streamline your inbox.

To create an archive rule

In the IAM console, choose Archive rules under Access Analyzer. Then, choose Create archive rule to display the Create archive rule page shown in Figure 1. There, you find the option to name the rule or use the name generated by default. In the Rule section, you define criteria to match properties of findings you want to archive. Just like email rules, you can add multiple criteria to the archive rule. You can define each criterion by selecting a finding property, an operator, and a value. To help ensure a rule doesn’t archive findings for public access, the criterion Public access is false is suggested by default.
 

Figure 1: IAM Access Analyzer create archive rule page where you add criteria to create a new archive rule

For example, I have a security audit role external to my account that I expect to have access to resources in my account. To mark that access as intended, I create a rule to archive all findings for Amazon S3 buckets in my account that can be accessed by the security audit role outside of the account. To do this, I include two criteria: Resource type matches S3 bucket, and the AWS Account value matches the security audit role ARN. Once I add these criteria, the Results section displays the list of existing active findings the archive rule matches, as shown in Figure 2.
 

Figure 2: A rule to archive all findings for S3 buckets in an account that can be accessed by the audit role outside of the account, with matching findings displayed

When you’re done adding criteria for your archive rule, select Create and archive active findings to archive new and existing findings based on the rule criteria. Alternatively, you can choose Create rule to create the rule for new findings only. In the preceding example, I chose Create and archive active findings to archive all findings—existing and new—that match the criteria.
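
You can also create the same kind of rule programmatically. The following boto3 sketch is illustrative only: the analyzer name, role ARN, and filter keys are assumptions you would adjust to match your analyzer and the finding fields you filter on. Note that a rule created this way applies to new findings; to archive existing findings that match, use the ApplyArchiveRule operation described in the next section.

    import boto3

    analyzer = boto3.client("accessanalyzer")

    # Archive findings for S3 buckets shared with the security audit role,
    # as long as the access is not public.
    analyzer.create_archive_rule(
        analyzerName="MyAccountAnalyzer",  # placeholder analyzer name
        ruleName="ArchiveAuditRoleS3Access",
        filter={
            "isPublic": {"eq": ["false"]},
            "resourceType": {"eq": ["AWS::S3::Bucket"]},
            "principal.AWS": {"eq": ["arn:aws:iam::111122223333:role/SecurityAuditRole"]},
        },
    )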

Update an archive rule to mark existing findings as intended

You can also update an archive rule to archive existing findings retroactively and streamline your findings. To edit an archive rule, choose Archive rules under Access Analyzer, then select an existing rule and choose Edit. In the Edit archive rule page, update the archive rule criteria and review the list of existing active findings the archive rule applies to. When you save the archive rule, you can apply it retroactively to existing findings by choosing Save and archive active findings as shown in Figure 3. Otherwise, you can choose Save rule to update the rule and apply it to new findings only.

Note: You can also use the new IAM Access Analyzer API operation ApplyArchiveRule to retroactively apply an archive rule to existing findings that meet the archive rule criteria.

 

Figure 3: IAM Access Analyzer edit archive rule page where you can apply the rule retroactively to existing findings by choosing Save and archive active findings
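
A minimal sketch of that operation with boto3 follows; the analyzer ARN and rule name are placeholders.

    import boto3

    analyzer = boto3.client("accessanalyzer")

    # Retroactively archive existing findings that match an archive rule's criteria.
    analyzer.apply_archive_rule(
        analyzerArn="arn:aws:access-analyzer:us-east-1:111122223333:analyzer/MyAccountAnalyzer",
        ruleName="ArchiveAuditRoleS3Access",
    )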

Get started

To turn on IAM Access Analyzer, open the IAM console. IAM Access Analyzer is available at no additional cost in the IAM console and through APIs in all commercial AWS Regions, AWS China Regions, and AWS GovCloud (US). To learn more about IAM Access Analyzer and which resources it supports, visit the feature page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS IAM forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Andrea Nedic

Andrea is a Sr. Tech Product Manager for AWS Identity and Access Management. She enjoys hearing from customers about how they build on AWS. Outside of work, Andrea likes to ski, dance, and be outdoors. She holds a PhD from Princeton University.

How to enhance Amazon CloudFront origin security with AWS WAF and AWS Secrets Manager

Post Syndicated from Cameron Worrell original https://aws.amazon.com/blogs/security/how-to-enhance-amazon-cloudfront-origin-security-with-aws-waf-and-aws-secrets-manager/

Whether your web applications provide static or dynamic content, you can improve their performance, availability, and security by using Amazon CloudFront as your content delivery network (CDN). CloudFront is a web service that speeds up distribution of your web content through a worldwide network of data centers called edge locations. CloudFront ensures that end-user requests are served by the closest edge location. As a result, viewer requests travel a short distance, improving performance for your viewers. When you deliver web content through a CDN such as CloudFront, a best practice is to prevent viewer requests from bypassing the CDN and accessing your origin content directly. In this blog post, you’ll see how to use CloudFront custom headers, AWS WAF, and AWS Secrets Manager to restrict viewer requests from accessing your CloudFront origin resources directly.

You can configure CloudFront to add custom HTTP headers to the requests that it sends to your origin. HTTP header fields are components of the header section of request and response messages in the Hypertext Transfer Protocol (HTTP). These custom headers enable you to send and gather information from your origin that isn’t included in typical viewer requests. You can use custom headers to control access to content. By configuring your origin to respond to requests only when they include a custom header that was added by CloudFront, you prevent users from bypassing CloudFront and accessing your origin content directly. In addition to offloading traffic from your origin servers, this also helps enforce web traffic being processed at CloudFront edge locations according to your AWS WAF rules prior to being forwarded to your origin.

AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources. It supports managed rules as well as a powerful rule language for custom rules. AWS WAF is tightly integrated with CloudFront and the Application Load Balancer (ALB). AWS Secrets Manager helps you protect the secrets needed to access your applications, services, and IT resources. This service enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle.

Solution overview

This blog post includes a sample solution you can deploy to see how its components integrate to implement the origin access restriction. The sample solution includes a web server deployed on Amazon Elastic Compute Cloud (Amazon EC2) Linux instances running in an AWS Auto Scaling group. Elastic Load Balancing distributes the incoming application traffic across the EC2 instances by using an ALB. The ALB is associated with an AWS WAF web access control list (web ACL), which is used to validate the incoming origin requests. Finally, a CloudFront distribution is deployed with an AWS WAF web ACL and configured to point to the origin ALB.

Although the sample solution is designed for deployment with CloudFront with an AWS WAF–associated ALB as its origin, the same approach could be used for origins that use Amazon API Gateway. A custom origin is any origin that is not an Amazon Simple Storage Service (Amazon S3) bucket, with one exception. An S3 bucket that is configured with static website hosting is a custom origin. You can refer to the CloudFront Developer Guide for more information on securing content that CloudFront delivers from S3 origins.

This solution is intended to enhance security for CloudFront custom origins that support AWS WAF, such as ALB, and is not a substitute for authentication and authorization mechanisms within your web applications. In this solution, Secrets Manager is used to control, audit, monitor, and rotate a random string used within your CloudFront and AWS WAF configurations. Although most of these lifecycle attributes could be set manually, Secrets Manager makes it easier.

Figure 1 shows how the provided AWS CloudFormation template creates the sample solution.
 

Figure 1: How the CloudFormation template works

Here’s how the solution works, as shown in the diagram:

  1. A viewer accesses your website or application and requests one or more files, such as an image file and an HTML file.
  2. DNS routes the request to the CloudFront edge location that can best serve the request—typically the nearest CloudFront edge location in terms of latency.
  3. At the edge location, AWS WAF inspects the incoming request according to configured web ACL rules.
  4. At the edge location, CloudFront checks its cache for the requested content. If the content is in the cache, CloudFront returns it to the user. If the content isn’t in the cache, CloudFront adds the custom header, X-Origin-Verify, with the value of the secret from Secrets Manager, and forwards the request to the origin.
  5. At the origin Application Load Balancer (ALB), AWS WAF inspects the incoming request header, X-Origin-Verify, and allows the request if the string value is valid. If the header isn’t valid, AWS WAF blocks the request.
  6. At the configured interval, Secrets Manager automatically rotates the custom header value and updates the origin AWS WAF and CloudFront configurations.

Solution deployment

This sample solution includes seven main steps:

  1. Deploy the CloudFormation template.
  2. Confirm successful viewer access to the CloudFront URL.
  3. Confirm that direct viewer access to the origin URL is blocked by AWS WAF.
  4. Review the CloudFront origin custom header configuration.
  5. Review the AWS WAF web ACL header validation rule.
  6. Review the Secrets Manager configuration.
  7. Review the Secrets Manager AWS Lambda rotation function.

Step 1: Deploy the CloudFormation template

The stack will launch in the N. Virginia (us-east-1) Region. It takes approximately 10 minutes for the CloudFormation stack to complete.

Note: The sample solution requires deployment in the N. Virginia (us-east-1) Region. Although out of scope for this blog post, an additional sample template is available in this solution’s GitHub repository for testing this solution with an existing CloudFront distribution and regional AWS WAF web ACL. Refer to the AWS regional service support information for more details on regional service availability.

To launch the CloudFormation stack

  1. Choose the following Launch Stack icon to launch a CloudFormation stack in your account in the N. Virginia Region.
     
    Select the Launch Stack button to launch the template
  2. In the CloudFormation console, leave the configured values, and then choose Next.
  3. On the Specify Details page, provide the following input parameters. You can modify the default values to customize the solution for your environment.

    Input parameters and their descriptions:

    • EC2InstanceSize: The instance size for EC2 web servers.
    • HeaderName: The HTTP header name for the secret string.
    • WAFRulePriority: The rule number to use for the regional AWS WAF web ACL. 0 is recommended, because rules are evaluated in order based on the value of priority.
    • RotateInterval: The rotation interval, in days, for the origin secret value. Full rotation requires two intervals.
    • ArtifactsBucket: The S3 bucket with artifact files (Lambda functions, templates, HTML files, and so on). Keep the default value.
    • ArtifactsPrefix: The path for the S3 bucket that contains artifact files. Keep the default value.

    Figure 2 shows an example of values entered under Parameters.
     

    Figure 2: Input parameters for the CloudFormation stack

  4. Enter values for all of the input parameters, and then choose Next.
  5. On the Options page, keep the defaults, and then choose Next.
  6. On the Review page, confirm the details, acknowledge the statements under Capabilities and transforms as shown in Figure 3, and then choose Create stack.
     
    Figure 3: CloudFormation Capabilities and Transforms acknowledgments

Step 2: Confirm access to the website through CloudFront

Next, confirm that website access through CloudFront is functioning as intended. After the CloudFormation stack completes deployment, you can access the test website using the domain name that was automatically assigned to the distribution.

To confirm viewer access to the website through CloudFront

  1. In the CloudFormation console, choose Services > CloudFormation > CFOriginVerify stack. On the stack Outputs tab, look for the cfEndpoint entry, similar to that shown in Figure 4.
     
    Figure 4: CloudFormation cfEndpoint stack output

  2. The cfEndpoint is the URL for the site, and it is automatically assigned by CloudFront. Choose the cfEndpoint link to open the test page, as shown in Figure 5.
     
    Figure 5: CloudFormation cfEndpoint test page

In this step, you’ve confirmed that website accessibility through CloudFront is functioning as intended.

Step 3: Confirm that direct viewer access to the origin URL is blocked by AWS WAF

In this step, you confirm that direct access to the test website is blocked by the regional AWS WAF web ACL.

To test direct access to the origin URL

  1. In the CloudFormation console, choose Services > CloudFormation > CFOriginVerify stack. On the stack Outputs tab, look for the albEndpoint entry.
  2. Choose the albEndpoint link to go to the test site URL that was automatically assigned to the ALB. Choosing this link will result in a 403 Forbidden response. When AWS WAF blocks a web request based on the conditions that you specify, it returns HTTP status code 403 (Forbidden).

In this step, you’ve confirmed that website accessibility directly to the origin ALB is blocked by the regional AWS WAF web ACL.

Step 4: Review the CloudFront origin custom header configuration

Now that you’ve confirmed that the test website can only be accessed through CloudFront, you can review the detailed CloudFront, WAF, and Secrets Manager configurations that enable this restriction.

To review the custom header configuration

  1. In the CloudFormation console, choose Services > CloudFormation > CFOriginVerify stack. On the stack Outputs tab, look for the cfDistro entry.
  2. Choose the cfDistro link to go to this distribution’s configuration in the CloudFront console. On the Origins and Origin Groups tab, under Origins, select the origin as shown in Figure 6.
     
    Figure 6: CloudFront Origins and Origin Groups settings

  3. Choose Edit to go to the Origin Settings section, scroll to the bottom and review the Origin Custom Headers as shown in Figure 7.
     
    Figure 7: CloudFront Origin Custom Headers settings

    You can see that the custom header, X-Origin-Verify, has been configured using Secrets Manager with a random 32-character alpha-numeric value. This custom header will be added to web requests that are forwarded from CloudFront to your origin. As you learned in steps 2 and 3, requests without this header are blocked by AWS WAF at the origin ALB. In the next two steps, you will dive deeper into how this works.
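
For reference, custom origin headers also appear in the distribution configuration when you work with CloudFront programmatically. The following boto3 sketch is illustrative only (the distribution ID is a placeholder): it reads the distribution configuration and prints the custom headers configured on each origin.

    import boto3

    cloudfront = boto3.client("cloudfront")

    # Read the distribution configuration and print each origin's custom headers.
    config = cloudfront.get_distribution_config(Id="E1EXAMPLE12345")  # placeholder distribution ID
    for origin in config["DistributionConfig"]["Origins"]["Items"]:
        for header in origin.get("CustomHeaders", {}).get("Items", []):
            print(origin["Id"], header["HeaderName"], header["HeaderValue"])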

Step 5: Review the AWS WAF web ACL header validation rule

In this step, you review the AWS WAF rule configuration that validates the CloudFront custom header X-Origin-Verify.

To review the header validation rule

  1. In the CloudFormation console, select Services > CloudFormation > CFOriginVerify stack. On the stack Outputs tab, look for the wafWebACLR entry.
  2. Choose the wafWebACLR link to go to the origin ALB web ACL configuration in the WAF and Shield console. On the Overview tab, you can view the Requests per 5 minute period chart and the Sampled requests list, which shows requests from the last three hours that the ALB has forwarded to AWS WAF for inspection. The sample of requests includes detailed data about each request, such as the originating IP address and Uniform Resource Identifier (URI). You also can view which rule the request matched, and whether the rule Action is configured to ALLOW, BLOCK, or COUNT requests. You can enable AWS WAF logging to get detailed information about traffic that’s analyzed by your web ACL. You send logs from your web ACL to an Amazon Kinesis Data Firehose with a configured storage destination such as Amazon S3. Information that’s contained in the logs includes the time that AWS WAF received the request from your AWS resource, detailed information about the request, and the action for the rule that each request matched.
  3. Choose the Rules tab to review the rules for this web ACL, as shown in Figure 8.
     
    Figure 8: AWS WAF web ACL rules

    On the Rules tab, you can see that the CFOriginVerifyXOriginVerify rule has been configured with the Allow action, while the Default web ACL action is Block. This means that any incoming requests that don’t match the conditions in this rule will be blocked.

    In every AWS WAF rule group and every web ACL, rules define how to inspect web requests and what to do when a web request matches the inspection criteria. Each rule requires one top-level statement, which might contain nested statements at any depth, depending on the rule and statement type. You can learn more about AWS WAF rule statements in the AWS WAF Developer Guide, AWS Online Tech Talks, and samples on GitHub.

  4. Choose the CFOriginVerifyXOriginVerify rule, and then choose Edit to bring up the Rule Builder tool. In the Rule Builder, you can see that a rule has been created with two Rule Statements similar to those in Figure 9.
     
    Figure 9: AWS WAF web ACL rule statement

    In the Rule Builder configuration for Statement 1, you can see that the request Header is being inspected for the x-origin-verify Header field name (HTTP header field names are case insensitive), and the String to match value is set to the value you reviewed in step 4. In the Rule Builder, you can also see a logical OR with an additional rule statement, Statement 2. You will notice that the configuration for Statement 2 is the same as Statement 1, except that the String to match value is different. You will learn about this in detail in step 7, but Statement 2 helps to ensure that valid web requests are processed by your origin servers when Secrets Manager automatically rotates the value of the X-Origin-Verify header. The effect of this rule configuration is that inspected web requests will be allowed if they match either of the two statements.

    In addition to the visual web ACL representation you just reviewed in the WAF Rule visual editor, every web ACL also has a JSON format representation you can edit by using the WAF Rule JSON editor. You can retrieve the complete configuration for a web ACL in JSON format, modify it as you need, and then provide it to AWS WAF through the console, API, or command line interface (CLI).

    This step demonstrated how your request was allowed to access the test website in step 2 and why your request was blocked in step 3.
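
To illustrate the rule structure you just reviewed, a rule of this shape in the web ACL’s JSON representation looks roughly like the following. The sketch below expresses it as a Python dictionary with placeholder names, priorities, and search strings; it is not the exact rule deployed by the CloudFormation template.

    # Illustrative shape of the header-validation rule as shown in the web ACL's
    # JSON representation (the search strings below are placeholders, not real secrets).
    x_origin_verify_rule = {
        "Name": "CFOriginVerifyXOriginVerify",
        "Priority": 0,
        "Action": {"Allow": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "XOriginVerify",
        },
        "Statement": {
            "OrStatement": {
                "Statements": [
                    {
                        "ByteMatchStatement": {
                            "FieldToMatch": {"SingleHeader": {"Name": "x-origin-verify"}},
                            "SearchString": "CURRENT-SECRET-VALUE",
                            "PositionalConstraint": "EXACTLY",
                            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                        }
                    },
                    {
                        "ByteMatchStatement": {
                            "FieldToMatch": {"SingleHeader": {"Name": "x-origin-verify"}},
                            "SearchString": "PENDING-SECRET-VALUE",
                            "PositionalConstraint": "EXACTLY",
                            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                        }
                    },
                ]
            }
        },
    }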

Step 6: Review Secrets Manager configuration

Now that you’re familiar with the CloudFront and AWS WAF configurations, you will learn how Secrets Manager creates and rotates the secret used for the X-Origin-Verify header field value. Secrets Manager uses an AWS Lambda function to perform the actual rotation of the secret used for the value and update the associated AWS WAF web ACL and CloudFront distribution.

To review the Secrets Manager configuration

  1. In the CloudFormation console, choose Services > CloudFormation > CFOriginVerify stack. On the stack Outputs tab, look for the OriginVerifySecret entry.
  2. Choose the OriginVerifySecret link to go to the configuration for the secret in the Secrets Manager console. Scroll down to the section titled Secret value, and then choose Retrieve secret value to display the Secret key/value as shown in Figure 10.
     
    Figure 10: Secrets Manager retrieve value

    When you retrieve the secret, Secrets Manager programmatically decrypts the secret and displays it in the console. You can see that the secret is stored as a key-value pair, where the secret key is HEADERVALUE, and the secret value is the string used in the CloudFront and WAF configurations you reviewed in steps 3 and 4.

  3. While you’re in the Secrets Manager console, review the Rotation configuration section, as shown in Figure 11.
     
    Figure 11: Secrets Manager rotation configuration

    You can see that rotation was enabled for this secret at an interval of one day. This configuration also includes a Lambda rotation function. Secrets Manager uses a Lambda function to perform the actual rotation of a secret. If you use your secret for one of the supported Amazon Relational Database Service (Amazon RDS) databases, then Secrets Manager provides the Lambda function for you. If you use your secret for another service, then you must provide the code for the Lambda function, as we’ve done in this solution.
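
Applications or automation that need this header value can read the secret the same way the console does. The following boto3 sketch is illustrative; the secret name is a placeholder, and it assumes the secret is stored as a JSON key-value pair with the HEADERVALUE key shown in Figure 10.

    import json
    import boto3

    secretsmanager = boto3.client("secretsmanager")

    # Retrieve the current secret and read the HEADERVALUE key used for X-Origin-Verify.
    response = secretsmanager.get_secret_value(SecretId="OriginVerifySecret")  # placeholder name
    secret = json.loads(response["SecretString"])
    header_value = secret["HEADERVALUE"]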

Step 7: Review the Secrets Manager Lambda rotation function

In this step, you review the Secrets Manager Lambda rotation function.

To review the Secrets Manager Lambda rotation function

  1. In the CloudFormation console, choose Services > CloudFormation > CFOriginVerify stack. In the stack Outputs tab, look for the OriginSecretRotateFunction entry.
  2. Choose the OriginSecretRotateFunction link to go to the Lambda function that is configured for this secret. The code used for this secrets rotation function is based on the AWS Secrets Manager Rotation Template. Choose the Monitoring tab and review the Invocations graph as shown in Figure 12.
     
    Figure 12: Monitoring tab for the Lambda rotation function

    Shortly after the CloudFormation stack creation completes, you should see several invocations in the Invocations graph. When a configured rotation schedule or a manual process triggers rotation, Secrets Manager calls the Lambda function several times, each time with different parameters. The Lambda function performs several tasks throughout the process of rotating a secret, in the following steps: createSecret, setSecret, testSecret, and finishSecret. Secrets Manager uses staging labels, which are simple text strings, to identify different versions of a secret during rotation. The staging labels used here are AWSPENDING, AWSCURRENT, and AWSPREVIOUS, which are covered in the following step.

  3. To learn more about the rotation steps configured for this solution, choose View logs in CloudWatch on the Monitoring tab.
    1. On the Log streams tab, select the top entry in the list.
    2. Enter Event in the Filter events field, and then choose the arrows to expand the details for each event as shown in Figure 13.
       
      Figure 13: CloudWatch event logs for the Lambda rotation function

The four rotation steps annotated in Figure 13 work as follows:

Note: This section provides an overview of the rotation process for this solution. For more detailed information about the Lambda rotation function, see the Secrets Manager User Guide.

  1. The createSecret step: In this step, the Lambda function generates a new version of the secret. The rotation Lambda function calls the GetRandomPassword method to generate a new random string, and then labels the new version of the secret with the staging label AWSPENDING to mark it as the in-process version of the secret.
  2. The setSecret step: In this step, the rotation function retrieves the version of the secret labeled AWSPENDING from Secrets Manager and updates the rule in the AWS WAF web ACL associated with the origin ALB. The two rule statements you reviewed in step 5 of this blog post are updated with the AWSPENDING and AWSCURRENT values. The rotation function also updates the value for the Origin Custom Header X-Origin-Verify. When the rotation function updates your distribution configuration, CloudFront starts to propagate the changes to all edge locations. Maintaining both the AWSPENDING and AWSCURRENT secret values helps to ensure that web requests forwarded to your origin by CloudFront are not blocked. Therefore, once a secret value is created, two rotation intervals are required for it to be removed from the configuration.
  3. The testSecret step: This step of the Lambda function verifies the AWSPENDING version of the secret by using it to access the origin ALB endpoint with the X-Origin-Verify header. Both AWSPENDING and AWSCURRENT X-Origin-Verify header values are tested to confirm a “200 OK” response from the origin ALB endpoint.
  4. The finishSecret step: In the last step, the Lambda function moves the AWSCURRENT staging label from the current version to the new version of the secret. The previous version receives the AWSPREVIOUS staging label and remains available for recovery as the last known good version of the secret, if needed. The version that held the AWSPREVIOUS label before this rotation no longer has any staging labels attached, so Secrets Manager considers it deprecated and subject to deletion.

When the finishSecret step has successfully completed, Secrets Manager schedules the next rotation by adding the rotation interval (number of days) to the completion date. This automated process causes the values used for the validation headers to be updated at the configured interval. Although out of scope for this blog post, you should monitor the usage of your secrets and log any changes to them. This helps you make sure that any unexpected usage or change can be investigated, and unwanted changes can be rolled back.
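
To make the createSecret step more concrete, the following is a simplified boto3 sketch of what a rotation function might do at that point, based on the generic rotation template pattern. It is not the exact function deployed by this solution; it omits error handling and the setSecret, testSecret, and finishSecret steps.

    import json
    import boto3

    def create_secret(secretsmanager, arn, token):
        """Stage a new random header value under the AWSPENDING label."""
        try:
            # If a pending version already exists for this rotation token, do nothing.
            secretsmanager.get_secret_value(
                SecretId=arn, VersionId=token, VersionStage="AWSPENDING"
            )
        except secretsmanager.exceptions.ResourceNotFoundException:
            # Generate a new 32-character alphanumeric value and store it as AWSPENDING.
            new_value = secretsmanager.get_random_password(
                PasswordLength=32, ExcludePunctuation=True
            )["RandomPassword"]
            secretsmanager.put_secret_value(
                SecretId=arn,
                ClientRequestToken=token,
                SecretString=json.dumps({"HEADERVALUE": new_value}),
                VersionStages=["AWSPENDING"],
            )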

Summary

You’ve learned how to use Amazon CloudFront, AWS WAF, and AWS Secrets Manager to prevent web requests from directly accessing your CloudFront origin resources. You can use this solution to improve security for CloudFront custom origins that support AWS WAF, such as ALB, Amazon API Gateway, and AWS AppSync.

When using this solution, you will incur AWS WAF usage charges for both the ALB and CloudFront associated AWS WAF web ACLs. You might wish to consider subscribing to AWS Shield Advanced, which provides higher levels of protection against distributed denial of service (DDoS) attacks and includes AWS WAF and AWS Firewall Manager at no additional cost for usage on resources protected by AWS Shield Advanced. You can also learn more about pricing for CloudFront, AWS WAF, Secrets Manager, and AWS Shield Advanced.

You can review more options for restricting access to content with CloudFront, additional AWS WAF security automations, or managed rules for AWS WAF. You can explore solutions for using AWS IP address ranges to enhance CloudFront origin security. You might also wish to learn more about Secrets Manager best practices. The code for this solution is available on GitHub.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about using this solution, you can start a thread in the CloudFront, WAF, or Secrets Manager forums, review or open an issue in this solution’s GitHub repository, or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Cameron Worrell

Cameron is a Solutions Architect with a passion for security and enterprise transformation. He joined AWS in 2015.