Tag Archives: PKCS#11

How to implement cryptographic modules to secure private keys used with IAM Roles Anywhere

Post Syndicated from Edouard Kachelmann original https://aws.amazon.com/blogs/security/how-to-implement-cryptographic-modules-to-secure-private-keys-used-with-iam-roles-anywhere/

AWS Identity and Access Management (IAM) Roles Anywhere enables workloads that run outside of Amazon Web Services (AWS), such as servers, containers, and applications, to use X.509 digital certificates to obtain temporary AWS credentials and access AWS resources, the same way that you use IAM roles for workloads on AWS. Now, IAM Roles Anywhere allows you to use PKCS #11–compatible cryptographic modules to help you securely store private keys associated with your end-entity X.509 certificates.

Cryptographic modules allow you to generate non-exportable asymmetric keys in the module hardware. The cryptographic module exposes high-level functions, such as encrypt, decrypt, and sign, through an interface such as PKCS #11. Using a cryptographic module with IAM Roles Anywhere helps to ensure that the private keys associated with your end-identity X.509 certificates remain in the module and cannot be accessed or copied to the system.

In this post, I will show how you can use PKCS #11–compatible cryptographic modules, such as YubiKey 5 Series and Thales ID smart cards, with your on-premises servers to securely store private keys. I’ll also show how to use those private keys and certificates to obtain temporary credentials for the AWS Command Line Interface (AWS CLI) and AWS SDKs.

Cryptographic modules use cases

IAM Roles Anywhere reduces the need to manage long-term AWS credentials for workloads running outside of AWS, to help improve your security posture. Now IAM Roles Anywhere has added support for compatible PKCS #11 cryptographic modules to the credential helper tool so that organizations that are currently using these (such as defense, government, or large enterprises) can benefit from storing their private keys on their security devices. This mitigates the risk of storing the private keys as files on servers where they can be accessed or copied by unauthorized users.

Note: If your organization does not implement PKCS #11–compatible modules, IAM Roles Anywhere credential helper supports OS certificate stores (Keychain Access for macOS and Cryptography API: Next Generation (CNG) for Windows) to help protect your certificates and private keys.

Solution overview

This authentication flow is shown in Figure 1 and is described in the following sections.

Figure 1: Authentication flow using crypto modules with IAM Roles Anywhere

Figure 1: Authentication flow using crypto modules with IAM Roles Anywhere

How it works

As a prerequisite, you must first create a trust anchor and profile within IAM Roles Anywhere. The trust anchor will establish trust between your public key infrastructure (PKI) and IAM Roles Anywhere, and the profile allows you to specify which roles IAM Roles Anywhere assumes and what your workloads can do with the temporary credentials. You establish trust between IAM Roles Anywhere and your certificate authority (CA) by creating a trust anchor. A trust anchor is a reference to either AWS Private Certificate Authority (AWS Private CA) or an external CA certificate. For this walkthrough, you will use the AWS Private CA.

The one-time initialization process (step “0 – Module initialization” in Figure 1) works as follows:

  1. You first generate the non-exportable private key within the secure container of the cryptographic module.
  2. You then create the X.509 certificate that will bind an identity to a public key:
    1. Create a certificate signing request (CSR).
    2. Submit the CSR to the AWS Private CA.
    3. Obtain the certificate signed by the CA in order to establish trust.
  3. The certificate is then imported into the cryptographic module for mobility purposes, to make it available and simple to locate when the module is connected to the server.

After initialization is done, the module is connected to the server, which can then interact with the AWS CLI and AWS SDK without long-term credentials stored on a disk.

To obtain temporary security credentials from IAM Roles Anywhere:

  1. The server will use the credential helper tool that IAM Roles Anywhere provides. The credential helper works with the credential_process feature of the AWS CLI to provide credentials that can be used by the CLI and the language SDKs. The helper manages the process of creating a signature with the private key.
  2. The credential helper tool calls the IAM Roles Anywhere endpoint to obtain temporary credentials that are issued in a standard JSON format to IAM Roles Anywhere clients via the API method CreateSession action.
  3. The server uses the temporary credentials for programmatic access to AWS services.

Alternatively, you can use the update or serve commands instead of credential-process. The update command will be used as a long-running process that will renew the temporary credentials 5 minutes before the expiration time and replace them in the AWS credentials file. The serve command will be used to vend temporary credentials through an endpoint running on the local host using the same URIs and request headers as IMDSv2 (Instance Metadata Service Version 2).

Supported modules

The credential helper tool for IAM Roles Anywhere supports most devices that are compatible with PKCS #11. The PKCS #11 standard specifies an API for devices that hold cryptographic information and perform cryptographic functions such as signature and encryption.

I will showcase how to use a YubiKey 5 Series device that is a multi-protocol security key that supports Personal Identity Verification (PIV) through PKCS #11. I am using YubiKey 5 Series for the purpose of demonstration, as it is commonly accessible (you can purchase it at the Yubico store or Amazon.com) and is used by some of the world’s largest companies as a means of providing a one-time password (OTP), Fast IDentity Online (FIDO) and PIV for smart card interface for multi-factor authentication. For a production server, we recommend using server-specific PKCS #11–compatible hardware security modules (HSMs) such as the YubiHSM 2, Luna PCIe HSM, or Trusted Platform Modules (TPMs) available on your servers.

Note: The implementation might differ with other modules, because some of these come with their own proprietary tools and drivers.

Implement the solution: Module initialization

You need to have the following prerequisites in order to initialize the module:

Following are the high-level steps for initializing the YubiKey device and generating the certificate that is signed by AWS Private Certificate Authority (AWS Private CA). Note that you could also use your own public key infrastructure (PKI) and register it with IAM Roles Anywhere.

To initialize the module and generate a certificate

  1. Verify that the YubiKey PIV interface is enabled, because some organizations might disable interfaces that are not being used. To do so, run the YubiKey Manager CLI, as follows:
    ykman info

    The output should look like the following, with the PIV interface enabled for USB.

    Figure 2:YubiKey Manager CLI showing that the PIV interface is enabled

    Figure 2:YubiKey Manager CLI showing that the PIV interface is enabled

  2. Use the YubiKey Manager CLI to generate a new RSA2048 private key on the security module in slot 9a and store the associated public key in a file. Different slots are available on YubiKey, and we will use the slot 9a that is for PIV authentication purpose. Use the following command to generate an asymmetric key pair. The private key is generated on the YubiKey, and the generated public key is saved as a file. Enter the YubiKey management key to proceed:
    ykman ‐‐device 123456 piv keys generate 9a pub-yubi.key

  3. Create a certificate request (CSR) based on the public key and specify the subject that will identify your server. Enter the user PIN code when prompted.
    ykman --device 123456 piv certificates request 9a --subject 'CN=server1-demo,O=Example,L=Boston,ST=MA,C=US' pub-yubi.key csr.pem

  4. Submit the certificate request to AWS Private CA to obtain the certificate signed by the CA.
    aws acm-pca issue-certificate \
    --certificate-authority-arn arn:aws:acm-pca:<region>:<accountID>:certificate-authority/<ca-id> \
    --csr fileb://csr.pem \
    --signing-algorithm "SHA256WITHRSA" \
    --validity Value=365,Type="DAYS"

  5. Copy the certificate Amazon Resource Number (ARN), which should look as follows in your clipboard:
    {
    "CertificateArn": "arn:aws:acm-pca:<region>:<accountID>:certificate-authority/<ca-id>/certificate/<certificate-id>"
    }

  6. Export the new certificate from AWS Private CA in a certificate.pem file.
    aws acm-pca get-certificate \
    --certificate-arn arn:aws:acm-pca:<region>:<accountID>:certificate-authority/<ca-id>/certificate/<certificate-id> \
    --certificate-authority-arn arn:aws:acm-pca: <region>:<accountID>:certificate-authority/<ca-id> \
    --query Certificate \
    --output text > certificate.pem

  7. Import the certificate file on the module by using the YubiKey Manager CLI or through the YubiKey Manager UI. Enter the YubiKey management key to proceed.
    ykman --device 123456 piv certificates import 9a certificate.pem

The security module is now initialized and can be plugged into the server.

Configuration to use the security module for programmatic access

The following steps will demonstrate how to configure the server to interact with the AWS CLI and AWS SDKs by using the private key stored on the YubiKey or PKCS #11–compatible device.

To use the YubiKey module with credential helper

  1. Download the credential helper tool for IAM Roles Anywhere for your operating system.
  2. Install the p11-kit package. Most providers (including opensc) will ship with a p11-kit “module” file that makes them discoverable. Users shouldn’t need to specify the PKCS #11 “provider” library when using the credential helper, because we use p11-kit by default.

    If your device library is not supported by p11-kit, you can install that library separately.

  3. Verify the content of the YubiKey by using the following command:
    ykman --device 123456 piv info

    The output should look like the following.

    Figure 3: YubiKey Manager CLI output for the PIV information

    Figure 3: YubiKey Manager CLI output for the PIV information

    This command provides the general status of the PIV application and content in the different slots such as the certificates installed.

  4. Use the credential helper command with the security module. The command will require at least:
    • The ARN of the trust anchor
    • The ARN of the target role to assume
    • The ARN of the profile to pull policies from
    • The certificate and/or key identifiers in the form of a PKCS #11 URI

You can use the certificate flag to search which slot on the security module contains the private key associated with the user certificate.

To specify an object stored in a cryptographic module, you should use the PKCS #11 URI that is defined in RFC7512. The attributes in the identifier string are a set of search criteria used to filter a set of objects. See a recommended method of locating objects in PKCS #11.

In the following example, we search for an object of type certificate, with the object label as “Certificate for Digital Signature”, in slot 1. The pin-value attribute allows you to directly use the pin to log into the cryptographic device.

pkcs11:type=cert;object=Certificate%20for%20Digital%20Signature;id=%01?pin-value=123456

From the folder where you have installed the credential helper tool, use the following command. Because we only have one certificate on the device, we can limit the filter to the certificate type in our PKCS #11 URI.

./aws_signing_helper credential-process
--profile-arn arn:aws:rolesanywhere:<region>:<accountID>:profile/<profileID>
--role-arn arn:aws:iam::<accountID>:role/<assumedRole> 
--trust-anchor-arn arn:aws:rolesanywhere:<region>:<accountID>:trust-anchor/<trustanchorID>
--certificate pkcs11:type=cert?pin-value=<PIN>

If everything is configured correctly, the credential helper tool will return a JSON that contains the credentials, as follows. The PIN code will be requested if you haven’t specified it in the command.

Please enter your user PIN:
  			{
                    "Version":1,
                    "AccessKeyId": <String>,
                    "SecretAccessKey": <String>,
                    "SessionToken": <String>,
                    "Expiration": <Timestamp>
                 }

To use temporary security credentials with AWS SDKs and the AWS CLI, you can configure the credential helper tool as a credential process. For more information, see Source credentials with an external process. The following example shows a config file (usually in ~/.aws/config) that sets the helper tool as the credential process.

[profile server1-demo]
credential_process = ./aws_signing_helper credential-process --profile-arn <arn-for-iam-roles-anywhere-profile> --role-arn <arn-for-iam-role-to-assume> --trust-anchor-arn <arn-for-roles-anywhere-trust-anchor> --certificate pkcs11:type=cert?pin-value=<PIN> 

You can provide the PIN as part of the credential command with the option pin-value=<PIN> so that the user input is not required.

If you prefer not to store your PIN in the config file, you can remove the attribute pin-value. In that case, you will be prompted to enter the PIN for every CLI command.

You can use the serve and update commands of the credential helper mentioned in the solution overview to manage credential rotation for unattended workloads. After the successful use of the PIN, the credential helper will store it in memory for the duration of the process and not ask for it anymore.

Auditability and fine-grained access

You can audit the activity of servers that are assuming roles through IAM Roles Anywhere. IAM Roles Anywhere is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or an AWS service in IAM Roles Anywhere.

To view IAM Roles Anywhere activity in CloudTrail

  1. In the AWS CloudTrail console, in the left navigation menu, choose Event history.
  2. For Lookup attributes, filter by Event source and enter rolesanywhere.amazonaws.com in the textbox. You will find all the API calls that relate to IAM Roles Anywhere, including the CreateSession API call that returns temporary security credentials for workloads that have been authenticated with IAM Roles Anywhere to access AWS resources.
    Figure 4: CloudTrail Events filtered on the “IAM Roles Anywhere” event source

    Figure 4: CloudTrail Events filtered on the “IAM Roles Anywhere” event source

  3. When you review the CreateSession event record details, you can find the assumed role ID in the form of <PrincipalID>:<serverCertificateSerial>, as in the following example:
    Figure 5: Details of the CreateSession event in the CloudTrail console showing which role is being assumed

    Figure 5: Details of the CreateSession event in the CloudTrail console showing which role is being assumed

  4. If you want to identify API calls made by a server, for Lookup attributes, filter by User name, and enter the serverCertificateSerial value from the previous step in the textbox.
    Figure 6: CloudTrail console events filtered by the username associated to our certificate on the security module

    Figure 6: CloudTrail console events filtered by the username associated to our certificate on the security module

    The API calls to AWS services made with the temporary credentials acquired through IAM Roles Anywhere will contain the identity of the server that made the call in the SourceIdentity field. For example, the EC2 DescribeInstances API call provides the following details:

    Figure 7: The event record in the CloudTrail console for the EC2 describe instances call, with details on the assumed role and certificate CN.

    Figure 7: The event record in the CloudTrail console for the EC2 describe instances call, with details on the assumed role and certificate CN.

Additionally, you can include conditions in the identity policy for the IAM role to apply fine-grained access control. This will allow you to apply a fine-grained access control filter to specify which server in the group of servers can perform the action.

To apply access control per server within the same IAM Roles Anywhere profile

  1. In the IAM Roles Anywhere console, select the profile used by the group of servers, then select one of the roles that is being assumed.
  2. Apply the following policy, which will allow only the server with CN=server1-demo to list all buckets by using the condition on aws:SourceIdentity.
    {
      "Version":"2012-10-17",
      "Statement":[
        {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "s3:ListBuckets",
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        "aws:SourceIdentity": "CN=server1-demo"
                    }
                }
            }
      ]
    }

Conclusion

In this blog post, I’ve demonstrated how you can use the YubiKey 5 Series (or any PKCS #11 cryptographic module) to securely store the private keys for the X.509 certificates used with IAM Roles Anywhere. I’ve also highlighted how you can use AWS CloudTrail to audit API actions performed by the roles assumed by the servers.

To learn more about IAM Roles Anywhere, see the IAM Roles Anywhere and Credential Helper tool documentation. For configuration with Thales IDPrime smart card, review the credential helper for IAM Roles Anywhere GitHub page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Identity and Access Management re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Edouard Kachelmann

Edouard is an Enterprise Senior Solutions Architect at Amazon Web Services. Based in Boston, he is a passionate technology enthusiast who enjoys working with customers and helping them build innovative solutions to deliver measurable business outcomes. Prior to his work at AWS, Edouard worked for the French National Cybersecurity Agency, sharing his security expertise and assisting government departments and operators of vital importance. In his free time, Edouard likes to explore new places to eat, try new French recipes, and play with his kids.

How to run AWS CloudHSM workloads in container environments

Post Syndicated from Derek Tumulak original https://aws.amazon.com/blogs/security/how-to-run-aws-cloudhsm-workloads-on-docker-containers/

January 25, 2023: We updated this post to reflect the fact that CloudHSM SDK3 does not support serverless environments and we strongly recommend deploying SDK5.


AWS CloudHSM provides hardware security modules (HSMs) in the AWS Cloud. With CloudHSM, you can generate and use your own encryption keys in the AWS Cloud, and manage your keys by using FIPS 140-2 Level 3 validated HSMs. Your HSMs are part of a CloudHSM cluster. CloudHSM automatically manages synchronization, high availability, and failover within a cluster.

CloudHSM is part of the AWS Cryptography suite of services, which also includes AWS Key Management Service (AWS KMS), AWS Secrets Manager, and AWS Private Certificate Authority (AWS Private CA). AWS KMS, Secrets Manager, and AWS Private CA are fully managed services that are convenient to use and integrate. You’ll generally use CloudHSM only if your workload requires single-tenant HSMs under your own control, or if you need cryptographic algorithms or interfaces that aren’t available in the fully managed alternatives.

CloudHSM offers several options for you to connect your application to your HSMs, including PKCS#11, Java Cryptography Extensions (JCE), OpenSSL Dynamic Engine, or Microsoft Cryptography API: Next Generation (CNG). Regardless of which library you choose, you’ll use the CloudHSM client to connect to HSMs in your cluster.

In this blog post, I’ll show you how to use Docker to develop, deploy, and run applications by using the CloudHSM SDK, and how to manage and orchestrate workloads by using tools and services like Amazon Elastic Container Service (Amazon ECS), Kubernetes, Amazon Elastic Kubernetes Service (Amazon EKS), and Jenkins.

Solution overview

This solution demonstrates how to create a Docker container that uses the CloudHSM JCE SDK to generate a key and use it to encrypt and decrypt data.

Note: In this example, you must manually enter the crypto user (CU) credentials as environment variables when you run the container. For production workloads, you’ll need to consider how to secure and automate the handling and distribution of these credentials. You should work with your security or compliance officer to ensure that you’re using an appropriate method of securing HSM login credentials. For more information on securing credentials, see AWS Secrets Manager.

Figure 1 shows the solution architecture. The Java application, running in a Docker container, integrates with JCE and communicates with CloudHSM instances in a CloudHSM cluster through HSM elastic network interfaces (ENIs). The Docker container runs in an EC2 instance, and access to the HSM ENIs is controlled with a security group.

Figure 1: Architecture diagram

Figure 1: Architecture diagram

Prerequisites

To implement this solution, you need to have working knowledge of the following items:

  • CloudHSM
  • Docker 20.10.17 – used at the time of this post
  • Java 8 or Java 11 – supported at the time of this post
  • Maven 3.05 – used at the time of this post

Here’s what you’ll need to follow along with my example:

  1. An active CloudHSM cluster with at least one active HSM instance. You can follow the CloudHSM getting started guide to create, initialize, and activate a CloudHSM cluster.

    Note: For a production cluster, you should have at least two active HSM instances spread across Availability Zones in the Region.

  2. An Amazon Linux 2 EC2 instance in the same virtual private cloud (VPC) in which you created your CloudHSM cluster. The Amazon Elastic Compute Cloud (Amazon EC2) instance must have the CloudHSM cluster security group attached—this security group is automatically created during the cluster initialization and is used to control network access to the HSMs. To learn about attaching security groups to allow EC2 instances to connect to your HSMs, see Create a cluster in the AWS CloudHSM User Guide.
  3. A CloudHSM crypto user (CU) account. You can create a CU by following the steps in the topic Managing HSM users in AWS CloudHSM in the AWS CloudHSM User Guide.

Solution details

In this section, I’ll walk you through how to download, configure, compile, and run a solution in Docker.

To set up Docker and run the application that encrypts and decrypts data with a key in AWS CloudHSM

  1. On your Amazon Linux EC2 instance, install Docker by running the following command.

    # sudo yum -y install docker

  2. Start the docker service.

    # sudo service docker start

  3. Create a new directory and move to it. In my example, I use a directory named cloudhsm_container. You’ll use the new directory to configure the Docker image.

    # mkdir cloudhsm_container
    # cd cloudhsm_container

  4. Copy the CloudHSM cluster’s trust anchor certificate (customerCA.crt) to the directory that you just created. You can find the trust anchor certificate on a working CloudHSM client instance under the path /opt/cloudhsm/etc/customerCA.crt. The certificate is created during initialization of the CloudHSM cluster and is required to connect to the CloudHSM cluster. This enables our application to validate that the certificate presented by the CloudHSM cluster was signed by our trust anchor certificate.
  5. In your new directory (cloudhsm_container), create a new file with the name run_sample.sh that includes the following contents. The script runs the Java class that is used to generate an Advanced Encryption Standard (AES) key to encrypt and decrypt your data.
    #! /bin/bash
    
    # start application
    echo -e "\n* Entering AES GCM encrypt/decrypt sample in Docker ... \n"
    
    java -ea -jar target/assembly/aesgcm-runner.jar -method environment
    
    echo -e "\n* Exiting AES GCM encrypt/decrypt sample in Docker ... \n"

  6. In the new directory, create another new file and name it Dockerfile (with no extension). This file will specify that the Docker image is built with the following components:
    • The CloudHSM client package.
    • The CloudHSM Java JCE package.
    • OpenJDK 1.8 (Java 8). This is needed to compile and run the Java classes and JAR files.
    • Maven, a build automation tool that is needed to assist with building the Java classes and JAR files.
    • The AWS CloudHSM Java JCE samples that will be downloaded and built as part of the solution.
  7. Cut and paste the following contents into Dockerfile.

    Note: You will need to customize your Dockerfile, as follows:

    • Make sure to specify the SDK version to replace the one specified in the pom.xml file in the sample code. As of the writing of this post, the most current version is 5.7.0. To find the SDK version, follow the steps in the topic Check your client SDK version. For more information, see the Building section in the README file for the Cloud HSM JCE examples.
    • Make sure to update the HSM_IP line with the IP of an HSM in your CloudHSM cluster. You can get your HSM IPs from the CloudHSM console, or by running the describe-clusters AWS CLI command.
      	# Use the amazon linux image
      	FROM amazonlinux:2
      
      	# Pass HSM IP address as a build argument
      	ARG HSM_IP
      
      	# Install CloudHSM client
      	RUN yum install -y https://s3.amazonaws.com/cloudhsmv2-software/CloudHsmClient/EL7/cloudhsm-jce-latest.el7.x86_64.rpm
      
      	# Install Java, Maven, wget, unzip and ncurses-compat-libs
      	RUN yum install -y java maven wget unzip ncurses-compat-libs
              
      	# Create a work dir
      	WORKDIR /app
              
      	# Download sample code
      	RUN wget https://github.com/aws-samples/aws-cloudhsm-jce-examples/archive/refs/heads/sdk5.zip
              
      	# unzip sample code
      	RUN unzip sdk5.zip
             
      	# Change to the create directory
      	WORKDIR aws-cloudhsm-jce-examples-sdk5
      
      # Build JAR files using the installed CloudHSM JCE Provider version
      RUN export CLOUDHSM_CLIENT_VERSION=`rpm -qi cloudhsm-jce | awk -F': ' '/Version/ {print $2}'` \
              && mvn validate -DcloudhsmVersion=$CLOUDHSM_CLIENT_VERSION \
              && mvn clean package -DcloudhsmVersion=$CLOUDHSM_CLIENT_VERSION
              
        # Configure cloudhsm-client
        COPY customerCA.crt /opt/cloudhsm/etc/
        RUN /opt/cloudhsm/bin/configure-jce -a $HSM_IP
             
        # Copy the run_sample.sh script
        COPY run_sample.sh .
              
        # Run the script
        CMD ["bash","run_sample.sh"]

  8. Now you’re ready to build the Docker image. Run the following command, with the name jce_sample. This command will let you use the Dockerfile that you created in step 6 to create the image.

    # sudo docker build --build-arg HSM_IP=”<your HSM IP address>” -t jce_sample .

  9. To run a Docker container from the Docker image that you just created, run the following command. Make sure to replace the user and password with your actual CU username and password. (If you need help setting up your CU credentials, see prerequisite 3. For more information on how to provide CU credentials to the AWS CloudHSM Java JCE Library, see Providing credentials to the JCE provider in the CloudHSM User Guide).

    # sudo docker run --env HSM_USER=<user> --env HSM_PASSWORD=<password> jce_sample

    If successful, the output should look like this:

    	* Entering AES GCM encrypt/decrypt sample in Docker ... 
    
    	737F92D1B7346267D329C16E
    	Successful decryption
    
    	* Exiting AES GCM encrypt/decrypt sample in Docker ...

Conclusion

This solution provides an example of how to run CloudHSM client workloads in Docker containers. You can use the solution as a reference to implement your cryptographic application in a way that benefits from the high availability and load balancing built in to CloudHSM without compromising the flexibility that Docker provides for developing, deploying, and running applications.

If you have comments about this post, submit them in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Derek Tumulak

Derek Tumulak

Derek joined AWS in May 2021 as a Principal Product Manager. He is a data protection and cybersecurity expert who is enthusiastic about assisting customers with a wide range of sophisticated use cases.

Migrate and secure your Windows PKI to AWS with AWS CloudHSM

Post Syndicated from Govindarajan Varadan original https://aws.amazon.com/blogs/security/migrate-and-secure-your-windows-pki-to-aws-with-aws-cloudhsm/

AWS CloudHSM provides a cloud-based hardware security module (HSM) that enables you to easily generate and use your own encryption keys in AWS. Using CloudHSM as part of a Microsoft Active Directory Certificate Services (AD CS) public key infrastructure (PKI) fortifies the security of your certificate authority (CA) private key and ensures the security of the trust hierarchy. In this blog post, we walk you through how to migrate your existing Microsoft AD CS CA private key to the HSM in a CloudHSM cluster.

The challenge

Organizations implement public key infrastructure (PKI) as an application to provide integrity and confidentiality between internal and customer-facing applications. A PKI provides encryption/decryption, message hashing, digital certificates, and digital signatures to ensure these security objectives are met. Microsoft AD CS is a popular choice for creating and managing a CA for enterprise applications such as Active Directory, Exchange, and Systems Center Configuration Manager. Moving your Microsoft AD CS to AWS as part of your overall migration plan allows you to continue to use your existing investment in Windows certificate auto enrollment for users and devices without disrupting existing workflows or requiring new certificates to be issued. However, when you migrate an on-premises infrastructure to the cloud, your security team may determine that storing private keys on the AD CS server’s disk is insufficient for protecting the private key that signs the certificates issued by the CA. Moving from storing private keys on the AD CS server’s disk to a hardware security module (HSM) can provide the added security required to maintain trust of the private keys.

This walkthrough shows you how to migrate your existing AD CS CA private key to the HSM in your CloudHSM cluster. The resulting configuration avoids the security concerns of using keys stored on your AD CS server, and uses the HSM to perform the cryptographic signing operations.

Prerequisites

For this walkthrough, you should have the following in place:

Migrating a domain

In this section, you will walk through migrating your AD CS environment to AWS by using your existing CA certificate and private key that will be secured in CloudHSM. In order to securely migrate the private key into the HSM, you will install the CloudHSM client and import the keys directly from the existing CA server.

This walkthrough includes the following steps:

  1. Create a crypto user (CU) account
  2. Import the CA private key into CloudHSM
  3. Export the CA certificate and database
  4. Configure and import the certificate into the new Windows CA server
  5. Install AD CS on the new server

The operations you perform on the HSM require the credentials of an HSM user. Each HSM user has a type that determines the operations you can perform when authenticated as that user. Next, you will create a crypto user (CU) account to use with your CA servers, to manage keys and to perform cryptographic operations.

To create the CU account

  1. From the on-premises CA server, use the following command to log in with the crypto officer (CO) account that you created when you activated the cluster. Be sure to replace <co_password> with your CO password.
    loginHSM CO admin <co_password>
    

  2. Use the following command to create the CU account. Replace <cu_user> and <cu_password> with the username and password you want to use for the CU.
    createUser CU <cu_user> <cu_password>
    

  3. Use the following command to set the login credentials for the HSM on your system and enable the AWS CloudHSM client for Windows to use key storage providers (KSPs) and Cryptography API: Next Generation (CNG) providers. Replace <cu_user> and <cu_password> with the username and password of the CU.
    set_cloudhsm_credentials.exe --username <cu_user> password <cu_password>
    

Now that you have the CloudHSM client installed and configured on the on-premises CA server, you can import the CA private key from the local server into your CloudHSM cluster.

To import the CA private key into CloudHSM

  1. Open an administrative command prompt and navigate to C:\Program Files\Amazon\CloudHSM.
  2. To identify the unique container name for your CA’s private key, enter certutil -store my to list all certificates stored in the local machine store. The CA certificate will be shown as follows:
    ================ Certificate 0 ================
    Serial Number: <certificate_serial_number>
    Issuer: CN=example-CA, DC=example, DC=com
     NotBefore: 6/25/2021 5:04 PM
     NotAfter: 6/25/2022 5:14 PM
    Subject: CN=example-CA-test3, DC=example, DC=com
    Certificate Template Name (Certificate Type): CA
    CA Version: V0.0
    Signature matches Public Key
    Root Certificate: Subject matches Issuer
    Template: CA, Root Certification Authority
    Cert Hash(sha1): cb7c09cd6c76d69d9682a31fbdbbe01c29cebd82
      Key Container = example-CA-test3
      Unique container name: <unique_container_name>
      Provider = Microsoft Software Key Storage Provider
    Signature test passed
    

  3. Verify that the key is backed by the Microsoft Software Key Storage Provider and make note of the <unique_container_name> from the output, to use it in the following steps.
  4. Use the following command to set the environment variable n3fips_password. Replace <cu_user> and <cu_password> with the username and password for the CU you created earlier for the CloudHSM cluster. This variable will be used by the import_key command in the next step.
    set n3fips_password=<cu_user>:<cu_password>
    

  5. Use the following import_key command to import the private key into the HSM. Replace <unique_container_name> with the value you noted earlier.
    import_key.exe -RSA "<unique_container_name>

The import_key command will report that the import was successful. At this point, your private key has been imported into the HSM, but the on-premises CA server will continue to run using the key stored locally.

The Active Directory Certificate Services Migration Guide for Windows Server 2012 R2 uses the Certification Authority snap-in to migrate the CA database, as well as the certificate and private key. Because you have already imported your private key into the HSM, next you will need to make a slight modification to this process and export the certificate manually, without its private key.

To export the CA certificate and database

  1. To open the Microsoft Management Console (MMC), open the Start menu and in the search field, enter MMC, and choose Enter.
  2. From the File menu, select Add/Remove Snapin.
  3. Select Certificates and choose Add.
  4. You will be prompted to select which certificate store to manage. Select Computer account and choose Next.
  5. Select Local Computer, choose Finish, then choose OK.
  6. In the left pane, choose Personal, then choose Certificates. In the center pane, locate your CA certificate, as shown in Figure 1.
     
    The MMC Certificates snap-in displays the Certificates directories for the local computer. The Personal Certificates location is open displaying the example-CA-test3 certificate.

    Figure 1: Microsoft Management Console Certificates snap-in

  7. Open the context (right-click) menu for the certificate, choose All Tasks, then choose Export.
  8. In the Certificate Export Wizard, choose Next, then choose No, do not export the private key.
  9. Under Select the format you want to use, select Cryptographic Message Syntax Standard – PKCS #7 format file (.p7b) and select Include all certificates in the certification path if possible, as shown in Figure 2.
     
    The Certificate Export Wizard window is displayed.  This windows is prompting for the selection of an export format.  The toggle is selected for Cryptographic Message Syntax Standard – PKCS #7 Certificates (.P7B) and the check box is marked to Include all certificates in the certification path if possible.

    Figure 2: Certificate Export Wizard

  10. Save the file in a location where you’ll be able to locate it later, so you will be able to copy it to the new CA server.
  11. From the Start menu, browse to Administrative Tools, then choose Certificate Authority.
  12. Open the context (right-click) menu for your CA and choose All Tasks, then choose Back up CA.
  13. In the Certificate Authority Backup Wizard, choose Next. For items to back up, select only Certificate database and certificate database log. Leave all other options unselected.
  14. Under Back up to this location, choose Browse and select a new empty folder to hold the backup files, which you will move to the new CA later.
  15. After the backup is complete, in the MMC, open the context (right-click) menu for your CA, choose All Tasks, then choose Stop service.

At this point, until you complete the migration, your CA will no longer be issuing new certificates.

To configure and import the certificate into the new Windows CA server

  1. Open a Remote Desktop session to the EC2 instance that you created in the prerequisite steps, which will serve as your new AD CS certificate authority.
  2. Copy the certificate (.p7b file) backup from the on-premises CA server to the EC2 instance.
  3. On your EC2 instance, locate the certificate you just copied, as shown in Figure 3. Open the certificate to start the import process.
     
    The Certificate Manager tool window shows the Certificates directory for the p7b file that was opened. The main window for this location is displaying the example-CA-test3 certificate.

    Figure 3: Certificate Manager tool

  4. Select Install Certificate. For Store Location, select Local Machine.
  5. Select Place the Certificates in the following store. Allowing Windows to place the certificate automatically will install it as a trusted root certificate, rather than a server certificate.
  6. Select Browse, select the Personal store, and then choose OK.
  7. Choose Next, then choose Finish to complete the certificate installation.

At this point, you’ve installed the public key and certificate from the on-premises CA server to your EC2-based Windows CA server. Next, you need to link this installed certificate with the private key, which is now stored on the CloudHSM cluster, in order to make it functional for signing issued certificates and CRLs.

To link the certificate with the private key

  1. Open an administrative command prompt and navigate to C:\Program Files\Amazon\CloudHSM.
  2. Use the following command to set the environment variable n3fips_password. Replace <cu_user> and <cu_password> with the username and password for the CU that you created earlier for the CloudHSM cluster. This variable will be used by the import_key command in the next step.
    set n3fips_password=<cu_user>:<cu_password>
    

  3. Use the following import_key command to represent all keys stored on the HSM in a new key container in the key storage provider. This step is necessary to allow the cryptography tools to see the CA private key that is stored on the HSM.
    import_key -from HSM -all
    

  4. Use the following Windows certutil command to find your certificate’s unique serial number.
    certutil -store my
    

    Take note of the CA certificate’s serial number.

  5. Use the following Windows certutil command to link the installed certificate with the private key stored on the HSM. Replace <certificate_serial_number> with the value noted in the previous step.
    certutil -repairstore my <certificate_serial_number>
    

  6. Enter the command certutil -store my. The CA certificate will be shown as follows. Verify that the certificate is now linked with the HSM-backed private key. Note that the private key is using the Cavium Key Store Provider. Also note the message Encryption test passed, which means that the private key is usable for encryption.
    ================ Certificate 0 ================
    Serial Number: <certificate_serial_number>
    Issuer: CN=example-CA, DC=example, DC=com
     NotBefore: 6/25/2021 5:04 PM
     NotAfter: 6/25/2022 5:14 PM
    Subject: CN=example-CA, DC=example, DC=com
    Certificate Template Name (Certificate Type): CA
    CA Version: V0.0
    Signature matches Public Key
    Root Certificate: Subject matches Issuer
    Template: CA, Root Certification Authority
    Cert Hash(sha1): cb7c09cd6c76d69d9682a31fbdbbe01c29cebd82
      Key Container = PRV_KEY_IMPORT-6-9-7e5cde
      Provider = Cavium Key Storage Provider
    Private key is NOT exportable
    Encryption test passed
    

Now that your CA certificate and key materials are in place, you are ready to setup your EC2 instance as a CA server.

To install AD CS on the new server

  1. In Microsoft’s documentation to Install the Certificate Authority role on your new EC2 instance, follow steps 1-8. Do not complete the remaining steps, because you will be configuring the CA to use the existing HSM backed certificate and private-key instead of generating a new key.
  2. In Confirm installation selections, select Install.
  3. After your installation is complete, Server Manager will show a notification banner prompting you to configure AD CS. Select Configure Active Directory Certificate Services from this prompt.
  4. Select either Standalone or Enterprise CA installation, based upon the configuration of your on-premises CA.
  5. Select Use Existing Certificate and Private Key and browse to select the CA certificate imported from your on-premises CA server.
  6. Select Next and verify your location for the certificate database files.
  7. Select Finish to complete the wizard.
  8. To restore the CA database backup, from the Start menu, browse to Administrative Tools, then choose Certificate Authority.
  9. Open the context (right-click) menu for the certificate authority and choose All Tasks, then choose Restore CA. Browse to and select the database backup that you copied from the on-premises CA server.

Review the Active Directory Certificate Services Migration Guide for Windows Server 2012 R2 to complete migration of your remaining Microsoft Public Key Infrastructure (PKI) components. Depending on your existing CA environment, these steps may include establishing new CRL and AIA endpoints, configuring Windows Routing and Remote Access to use the new CA, or configuring certificate auto enrollment for Windows clients.

Conclusion

In this post, we walked you through migrating an on-premises Microsoft AD CS environment to an AWS environment that uses AWS CloudHSM to secure the CA private key. By migrating your existing Windows PKI backed by AWS CloudHSM, you can continue to use your Windows certificate auto enrollment for users and devices with your private key secured in a dedicated HSM.

For more information about setting up and managing CloudHSM, see Getting Started with AWS CloudHSM and the AWS Security Blog post CloudHSM best practices to maximize performance and avoid common configuration pitfalls.

If you have feedback about this blog post, submit comments in the Comments section below. You can also start a new thread on the AWS CloudHSM forum to get answers from the community.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Govindarajan Varadan

Govindarajan is a senior solutions architect at AWS based out of Silicon Valley in California. He works with AWS customers to help them achieve their business objectives by innovating at scale, modernizing their applications, and adopting game-changing technologies like AI/ML.

Author

Brian Benscoter

Brian is a senior solutions architect at AWS with a passion for governance at scale and is based in Charlotte, NC. Brian works with enterprise AWS customers to help them design, deploy, and scale applications to achieve their business goals.

Author

Axel Larsson

Axel is an enterprise solutions architect at AWS. He has helped several companies migrate to AWS and modernize their architecture. Axel is passionate about helping organizations establish a solid foundation in the cloud, enabled by security best practices.

CloudHSM best practices to maximize performance and avoid common configuration pitfalls

Post Syndicated from Esteban Hernández original https://aws.amazon.com/blogs/security/cloudhsm-best-practices-to-maximize-performance-and-avoid-common-configuration-pitfalls/

AWS CloudHSM provides fully-managed hardware security modules (HSMs) in the AWS Cloud. CloudHSM automates day-to-day HSM management tasks including backups, high availability, provisioning, and maintenance. You’re still responsible for all user management and application integration.

In this post, you will learn best practices to help you maximize the performance of your workload and avoid common configuration pitfalls in the following areas:

Administration of CloudHSM

The administration of CloudHSM includes those tasks necessary to correctly set up your CloudHSM cluster, and to manage your users and keys in a secure and efficient manner.

Initialize your cluster with a customer key pair

To initialize a new CloudHSM cluster, you will first create a new RSA key pair, which we will call the customer key pair. First, generate a self-signed certificate using the customer key pair. Then, you sign the cluster’s certificate by using the customer public key as described in Initialize the Cluster section in the AWS CloudHSM User Guide. The resulting signed cluster certificate, as shown in Figure 1, identifies your CloudHSM cluster as yours.

Figure 1: CloudHSM key hierarchy and customer generated keys

Figure 1: CloudHSM key hierarchy and customer generated keys

It’s important to use best practices when you generate and store the customer private key. The private key is a binding secret between you and your cluster, and cannot be rotated. We therefore recommend that you create the customer private key in an offline HSM and store the HSM securely. Any entity (organization, person, system) that demonstrates possession of the customer private key will be considered an owner of the cluster and the data it contains. In this procedure, you are using the customer private key to claim a new cluster, but in the future you could also use it to demonstrate ownership of the cluster in scenarios such as cloning and migration.

Manage your keys with crypto user (CU) accounts

The HSMs provided by CloudHSM support different types of HSM users, each with specific entitlements. Crypto users (CUs) generate, manage, and use keys. If you’ve worked with HSMs in the past, you can think of CUs as similar to partitions. However, CU accounts are more flexible. The CU that creates a key owns the key, and can share it with other CUs. The shared key can be used for operations in accordance with the key’s attributes, but the CU that the key was shared with cannot manage it – that is, they cannot delete, wrap, or re-share the key.

From a security standpoint, it is a best practice for you to have multiple CUs with different scopes. For example, you can have different CUs for different classes of keys. As another example, you can have one CU account to create keys, and then share these keys with one or more CU accounts that your application leverages to utilize keys. You can also have multiple shared CU accounts, to simplify rotation of credentials in production applications.

Warning: You should be careful when deleting CU accounts. If the owner CU account for a key is deleted, the key can no longer be used. You can use the cloudhsm_mgmt_util tool command findAllKeys to identify which keys are owned by a specified CU. You should rotate these keys before deleting a CU. As part of your key generation and rotation scheme, consider using labels to identify current and legacy keys.

Manage your cluster by using crypto officer (CO) accounts

Crypto officers (COs) can perform user management operations including change password, create user, and delete user. COs can also set and modify cluster policies.

Important: When you add or remove a user, or change a password, it’s important to ensure that you connect to all the HSMs in a cluster, to keep them synchronized and avoid inconsistencies that can result in errors. It is a best practice to use the Configure tool with the –m option to refresh the cluster configuration file before making mutating changes to the cluster. This helps to ensure that all active HSMs in the cluster are properly updated, and prevents the cluster from becoming desynchronized. You can learn more about safe management of your cluster in the blog post Understanding AWS CloudHSM Cluster Synchronization. You can verify that all HSMs in the cluster have been added by checking the /opt/cloudhsm/etc/cloudhsm_mgmt_util.cfg file.

After a password has been set up or updated, we strongly recommend that you keep a record in a secure location. This will help you avoid lockouts due to erroneous passwords, because clients will fail to log in to HSM instances that do not have consistent credentials. Depending on your security policy, you can use AWS Secrets Manager, specifying a customer master key created in AWS Key Management Service (KMS), to encrypt and distribute your secrets – secrets in this case being the CU credentials used by your CloudHSM clients.

Use quorum authentication

To prevent a single CO from modifying critical cluster settings, a best practice is to use quorum authentication. Quorum authentication is a mechanism that requires any operation to be authorized by a minimum number (M) of a group of N users and is therefore also known as M of N access control.

To prevent lock-outs, it’s important that you have at least two more COs than the M value you define for the quorum minimum value. This ensures that if one CO gets locked out, the others can safely reset their password. Also be careful when deleting users, because if you fall under the threshold of M, you will be unable to create new users or authorize any other operations and will lose the ability to administer your cluster.

If you do fall below the minimum quorum required (M), or if all of your COs end up in a locked-out state, you can revert to a previously known good state by restoring from a backup to a new cluster. CloudHSM automatically creates at least one backup every 24 hours. Backups are event-driven. Adding or removing HSMs will trigger additional backups.

Configuration

CloudHSM is a fully managed service, but it is deployed within the context of an Amazon Virtual Private Cloud (Amazon VPC). This means there are aspects of the CloudHSM service configuration that are under your control, and your choices can positively impact the resilience of your solutions built using CloudHSM. The following sections describe the best practices that can make a difference when things don’t go as expected.

Use multiple HSMs and Availability Zones to optimize resilience

When you’re optimizing a cluster for high availability, one of the aspects you have control of is the number of HSMs in the cluster and the Availability Zones (AZs) where the HSMs get deployed. An AZ is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region, which can be formed of multiple physical buildings, and have different risk profiles between them. Most of the AWS Regions have three Availability Zones, and some have as many as six.

AWS recommends placing at least two HSMs in the cluster, deployed in different AZs, to optimize data loss resilience and improve the uptime in case an individual HSM fails. As your workloads grow, you may want to add extra capacity. In that case, it is a best practice to spread your new HSMs across different AZs to keep improving your resistance to failure. Figure 2 shows an example CloudHSM architecture using multiple AZs.

Figure 2: CloudHSM architecture using multiple AZs

Figure 2: CloudHSM architecture using multiple AZs

When you create a cluster in a Region, it’s a best practice to include subnets from every available AZ of that Region. This is important, because after the cluster is created, you cannot add additional subnets to it. In some Regions, such as Northern Virginia (us-east-1), CloudHSM is not yet available in all AZs at the time of writing. However, you should still include subnets from every AZ, even if CloudHSM is currently not available in that AZ, to allow your cluster to use those additional AZs if they become available.

Increase your resiliency with cross-Region backups

If your threat model involves a failure of the Region itself, there are steps you can take to prepare. First, periodically create copies of the cluster backup in the target Region. You can see the blog post How to clone an AWS CloudHSM cluster across regions to find an extensive description of how to create copies and deploy a clone of an active CloudHSM cluster.

As part of your change management process, you should keep copies of important files, such as the files stored in /opt/cloudhsm/etc/. If you customize the certificates that you use to establish communication with your HSM, you should back up those certificates as well. Additionally, you can use configuration scripts with the AWS Systems Manager Run Command to set up two or more client instances that use exactly the same configuration in different Regions.

The managed backup retention feature in CloudHSM automatically deletes out-of-date backups for an active cluster. However, because backups that you copy across Regions are not associated with an active cluster, they are not in scope of managed backup retention and you must delete out-of-date backups yourself. Backups are secure and contain all users, policies, passwords, certificates and keys for your HSM, so it’s important to delete older backups when you rotate passwords, delete a user, or retire keys. This ensures that you cannot accidentally bring older data back to life by creating a new cluster that uses outdated backups.

The following script shows you how to delete all backups older than a certain point in time. You can also download the script from S3.

#!/usr/bin/env python

#
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0
#
# Reference Links:
# https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
# https://docs.python.org/3/library/re.html
# https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/cloudhsmv2.html#CloudHSMV2.Client.describe_backups
# https://docs.python.org/3/library/datetime.html#datetime-objects
# https://pypi.org/project/typedate/
# https://pypi.org/project/pytz/
#

import boto3, time, datetime, re, argparse, typedate, json

def main():
    bkparser = argparse.ArgumentParser(prog='backdel',
                                    usage='%(prog)s [-h] --region --clusterID [--timestamp] [--timezone] [--deleteall] [--dryrun]',
                                    description='Deletes CloudHSMv2 backups from a given point in time\n')
    bkparser.add_argument('--region',
                    metavar='-r',
                    dest='region',
                    type=str,
                    help='region where the backups are stored',
                    required=True)
    bkparser.add_argument('--clusterID',
                    metavar='-c',
                    dest='clusterID',
                    type=str,
                    help='CloudHSMv2 cluster_id for which you want to delete backups',
                    required=True)
    bkparser.add_argument('--timestamp',
                    metavar='-t',
                    dest='timestamp',
                    type=str,
                    help="Enter the timestamp to filter the backups that should be deleted:\n   Backups older than the timestamp will be deleted.\n  Timestamp ('MM/DD/YY', 'MM/DD/YYYY' or 'MM/DD/YYYY HH:mm')",
                    required=False)
    bkparser.add_argument('--timezone',
                    metavar='-tz',
                    dest='timezone',
                    type=typedate.TypeZone(),
                    help="Enter the timezone to adjust the timestamp.\n Example arguments:\n --timezone '-0200' , --timezone '05:00' , --timezone GMT #If the pytz module has been installed  ",
                    required=False)
    bkparser.add_argument('--dryrun',
                    dest='dryrun',
                    action='store_true',
                    help="Set this flag to simulate the deletion",
                    required=False)
    bkparser.add_argument('--deleteall',
                    dest='deleteall',
                    action='store_true',
                    help="Set this flag to delete all the back ups for the specified cluster",
                    required=False)
    args = bkparser.parse_args()
    client = boto3.client('cloudhsmv2', args.region)
    cluster_id = args.clusterID 
    timestamp_str = args.timestamp 
    timezone = args.timezone
    dry_true = args.dryrun
    delall_true = args.deleteall
    delete_all_backups_before(client, cluster_id, timestamp_str, timezone, dry_true, delall_true)

def delete_all_backups_before(client, cluster_id, timestamp_str, timezone, dry_true, delall_true, max_results=25):
    timestamp_datetime = None
    if delall_true == True and not timestamp_str:
        
        print("\nAll backups will be deleted...\n")
    
    elif delall_true == True and timestamp_str:
    
        print("\nUse of incompatible instructions: --timestamp  and --deleteall cannot be used in the same invocation\n")
        return
    
    elif not timestamp_str :
    
        print("\nParameter missing: --timestamp must be defined\n")
        return
    
    else :
        # Valid formats: 'MM/DD/YY', 'MM/DD/YYYY' or 'MM/DD/YYYY HH:mm'
        if re.match(r'^\d\d/\d\d/\d\d\d\d \d\d:\d\d$', timestamp_str):
            try:
                timestamp_datetime = datetime.datetime.strptime(timestamp_str, "%m/%d/%Y %H:%M")
            except Exception as e:
                print("Exception: %s" % str(e))
                return
        elif re.match(r'^\d\d/\d\d/\d\d\d\d$', timestamp_str):
            try:
                timestamp_datetime = datetime.datetime.strptime(timestamp_str, "%m/%d/%Y")
            except Exception as e:
                print("Exception: %s" % str(e))
                return
        elif re.match(r'^\d\d/\d\d/\d\d$', timestamp_str):
            try:
                timestamp_datetime = datetime.datetime.strptime(timestamp_str, "%m/%d/%y")
            except Exception as e:
                print("Exception: %s" % str(e))
                return
        else:
            print("The format of the specified timestamp is not supported by this script. Aborting...")
            return

        print("Backups older than %s will be deleted...\n" % timestamp_str)

    try:
        response = client.describe_backups(MaxResults=max_results, Filters={"clusterIds": [cluster_id]}, SortAscending=True)
    except Exception as e:
        print("DescribeBackups failed due to exception: %s" % str(e))
        return

    failed_deletions = []
    while True:
        if 'Backups' in response.keys() and len(response['Backups']) > 0:
            for backup in response['Backups']:
                if timestamp_str and not delall_true:
                    if timezone != None:
                        timestamp_datetime = timestamp_datetime.replace(tzinfo=timezone)
                    else:
                        timestamp_datetime = timestamp_datetime.replace(tzinfo=backup['CreateTimestamp'].tzinfo)

                    if backup['CreateTimestamp'] > timestamp_datetime:
                        break

                print("Deleting backup %s whose creation timestamp is %s:" % (backup['BackupId'], backup['CreateTimestamp']))
                try:
                    if not dry_true :
                        delete_backup_response = client.delete_backup(BackupId=backup['BackupId'])
                except Exception as e:
                    print("DeleteBackup failed due to exception: %s" % str(e))
                    failed_deletions.append(backup['BackupId'])
                print("Sleeping for 1 second to avoid throttling. \n")
                time.sleep(1)

        if 'NextToken' in response.keys():
            try:
                response = client.describe_backups(MaxResults=max_results, Filters={"clusterIds": [cluster_id]}, SortAscending=True, NextToken=response['NextToken'])
            except Exception as e:
                print("DescribeBackups failed due to exception: %s" % str(e))
        else:
            break

    if len(failed_deletions) > 0:
        print("FAILED backup deletions: " + failed_deletions)

if __name__== "__main__":
    main()

Use Amazon VPC security features to control access to your cluster

Because each cluster is deployed inside an Amazon VPC, you should use the familiar controls of Amazon VPC security groups and network access control lists (network ACLs) to limit what instances are allowed to communicate with your cluster. Even though the CloudHSM cluster itself is protected in depth by your login credentials, Amazon VPC offers a useful first line of defense. Because it’s unlikely that you need your communications ports to be reachable from the public internet, it’s a best practice to take advantage of the Amazon VPC security features.

Managing PKI root keys

A common use case for CloudHSM is setting up public key infrastructure (PKI). The root key for PKI is a long-lived key which forms the basis for certificate hierarchies and worker keys. The worker keys are the private portion of the end-entity certificates and are meant for routine rotation, while root PKI keys are generally fixed. As a characteristic, these keys are infrequently used, with very long validity periods that are often measured in decades. Because of this, it is a best practice to not rely solely on CloudHSM to generate and store your root private key. Instead, you should generate and store the root key in an offline HSM (this is frequently referred to as an offline root) and periodically generate intermediate signing key pairs on CloudHSM.

If you decide to store and use the root key pair with CloudHSM, you should take precautions. You can either create the key in an offline HSM and import it into CloudHSM for use, or generate the key in CloudHSM and wrap it out to an offline HSM. Either way, you should always have a copy of the key, usable independently of CloudHSM, in an offline vault. This helps to protect your trust infrastructure against forgotten CloudHSM credentials, lost application code, changing technology, and other such scenarios.

Optimize performance by managing your cluster size

It is important to size your cluster correctly, so that you can maintain its performance at the desired level. You should measure throughput rather than latency, and keep in mind that parallelizing transactions is the key to getting the most performance out of your HSM. You can maximize how efficiently you use your HSM by following these best practices:

  1. Use threading at 50-100 threads per application. The impact of network round-trip delays is magnified if you serialize each operation. The exception to this rule is generating persistent keys – these are serialized on the HSM to ensure consistent state, and so parallelizing these will yield limited benefit.
  2. Use sufficient resources for your CloudHSM client. The CloudHSM client handles all load balancing, failover, and high availability tasks as your application transacts with your HSM cluster. You should ensure that the CloudHSM client has enough computational resources so that the client itself doesn’t become your performance bottleneck. Specifically, do not use resource-limited instances such as t.nano or t.micro instances to run the client. To learn more, see the Amazon Elastic Compute Cloud (EC2) instance types online documentation.
  3. Use cryptographically accelerated commands. There are two types of HSM commands: management commands (such as looking up a key based on its attributes) and cryptographically accelerated commands (such as operating on a key with a known key handle). You should rely on cryptographically accelerated commands as much as possible for latency-sensitive operations. As one example, you can cache the key handles for frequently used keys or do it per application run, rather than looking up a key handle each time. As another example, you can leave frequently used keys on the HSM, rather than unwrapping or importing them prior to each use.
  4. Authenticate once per session. Pay close attention to session logins. Your individual CloudHSM client should create just one session per execution, which is authenticated using the credentials of one cryptographic user. There’s no need to reauthenticate the session for every cryptographic operation.
  5. Use the PKCS #11 library. If performance is critical for your application and you can choose from the multiple software libraries to integrate with your CloudHSM cluster, give preference to PKCS #11, as it tends to give an edge on speed.
  6. Use token keys. For workloads with a limited number of keys, and for which high throughput is required, use token keys. When you create or import a key as a token key, it is available in all the HSMs in the cluster. However, when it is created as a session key with the “-sess” option, it only exists in the context of a single HSM.

After you maximize throughput by using these best practices, you can add HSMs to your cluster for additional throughput. Other reasons to add HSMs to your cluster include if you hit audit log buffering limits while rapidly generating or importing and then deleting keys, or if you run out of capacity to create more session keys.

Error handling

Occasionally, an HSM may fail or lose connectivity during a cryptographic operation. The CloudHSM client does not automatically retry failed operations because it’s not state-aware. It’s a best practice for you to retry as needed by handling retries in your application code. Before retrying, you may want to ensure that your CloudHSM client is still running, that your instance has connectivity, and that your session is still logged in (if you are using explicit login). For an overview of the considerations for retries, see the Amazon Builders’ Library article Timeouts, retries, and backoff with jitter.

Summary

In this post, we’ve outlined a set of best practices for the use of CloudHSM, whether you want to improve the performance and durability of the solution, or implement robust access control.

To get started building and applying these best practices, a great way is to look at the AWS samples we have published on GitHub for the Java Cryptography Extension (JCE) and for the Public-Key Cryptography Standards number 11 (PKCS11).

If you have feedback about this blog post, submit comments in the Comments session below. You can also start a new thread on the AWS CloudHSM forum to get answers from the community.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Esteban Hernández

Esteban is a Specialist Solutions Architect for Security & Compliance at AWS where he works with customers to create secure and robust architectures that help to solve business problems. He is interested in topics like Identity and Cryptography. Outside of work, he enjoys science fiction and taking new challenges like learning to sail.

Author

Avni Rambhia

Avni is the product manager for AWS CloudHSM. As part of AWS Cryptography, she drives technologies and defines best practices that help customers build secure, reliable workloads in the AWS Cloud. Outside of work, she enjoys hiking, travel and philosophical debates with her children.