All posts by Raj Jain

How to use ACM Private CA for enabling mTLS in AWS App Mesh

2021-08-31 Raj Jain

Post Syndicated from Raj Jain original https://aws.amazon.com/blogs/security/how-to-use-acm-private-ca-for-enabling-mtls-in-aws-app-mesh/

Securing east-west traffic in service meshes, such as AWS App Mesh, by using mutual Transport Layer Security (mTLS) adds an additional layer of defense beyond perimeter control. mTLS adds bidirectional peer-to-peer authentication on top of the one-way authentication in normal TLS. This is done by adding a client-side certificate during the TLS handshake, through which a client proves possession of the corresponding private key to the server, and as a result the server trusts the client. This prevents an arbitrary client from connecting to an App Mesh service, because the client wouldn’t possess a valid certificate.

In this blog post, you’ll learn how to enable mTLS in App Mesh by using certificates derived from AWS Certificate Manager Private Certificate Authority (ACM Private CA). You’ll also learn how to reuse AWS CloudFormation templates, which we make available through a companion open-source project, for configuring App Mesh and ACM Private CA.

You’ll first see how to derive server-side certificates from ACM Private CA into App Mesh internally by using the native integration between the two services. You’ll then see a method and code for installing client-side certificates issued from ACM Private CA into App Mesh; this method is needed because client-side certificates aren’t integrated natively.

You’ll learn how to use AWS Lambda to export a client-side certificate from ACM Private CA and store it in AWS Secrets Manager. You’ll then see Envoy proxies in App Mesh retrieve the certificate from Secrets Manager and use it in an mTLS handshake. The solution is designed to ensure confidentiality of the private key of a client-side certificate, in transit and at rest, as it moves from ACM to Envoy.

The solution described in this blog post simplifies and allows you to automate the configuration and operations of mTLS-enabled App Mesh deployments, because all of the certificates are derived from a single managed private public key infrastructure (PKI) service—ACM Private CA—eliminating the need to run your own private PKI. The solution uses Amazon Elastic Container Services (Amazon ECS) with AWS Fargate as the App Mesh hosting environment, although the design presented here can be applied to any compute environment that is supported by App Mesh.

Solution overview

ACM Private CA provides a highly available managed private PKI service that enables creation of private CA hierarchies—including root and subordinate CAs—without the investment and maintenance costs of operating your own private PKI service. The service allows you to choose among several CA key algorithms and key sizes and makes it easier for you to export and deploy private certificates anywhere by using API-based automation.

App Mesh is a service mesh that provides application-level networking across multiple types of compute infrastructure. It standardizes how your microservices communicate, giving you end-to-end visibility and helping to ensure transport security and high availability for your applications. In order to communicate securely between mesh endpoints, App Mesh directs the Envoy proxy instances that are running within the mesh to use one-way or mutual TLS.

TLS provides authentication, privacy, and data integrity between two communicating endpoints. The authentication in TLS communications is governed by the PKI system. The PKI system allows certificate authorities to issue certificates that are used by clients and servers to prove their identity. The authentication process in TLS happens by exchanging certificates via the TLS handshake protocol. By default, the TLS handshake protocol proves the identity of the server to the client by using X.509 certificates, while the authentication of the client to the server is left to the application layer. This is called one-way TLS. TLS also supports two-way authentication through mTLS. In mTLS, in addition to the one-way TLS server authentication with a certificate, a client presents its certificate and proves possession of the corresponding private key to a server during the TLS handshake.

Example application

The following sections describe one-way and mutual TLS integrations between App Mesh and ACM Private CA in the context of an example application. This example application exposes an API to external clients that returns a text string name of a color—for example, “yellow”. It’s an extension of the Color App that’s used to demonstrate several existing App Mesh examples.

The example application is comprised of two services running in App Mesh—ColorGateway and ColorTeller. An external client request enters the mesh through the ColorGateway service and is proxied to the ColorTeller service. The ColorTeller service responds back to the ColorGateway service with the name of a color. The ColorGateway service proxies the response to the external client. Figure 1 shows the basic design of the application.

Figure 1: App Mesh services in the Color App example application

The two services are mapped onto the following constructs in App Mesh:

ColorGateway is mapped as a Virtual gateway. A virtual gateway in App Mesh allows resources that are outside of a mesh to communicate to resources that are inside the mesh. A virtual gateway represents Envoy deployed by itself. In this example, the virtual gateway represents an Envoy proxy that is running as an Amazon ECS service. This Envoy proxy instance acts as a TLS client, since it initiates TLS connections to the Envoy proxy that is running in the ColorTeller service.
ColorTeller is mapped as a Virtual node. A virtual node in App Mesh acts as a logical pointer to a particular task group. In this example, the virtual node—ColorTeller—runs as another Amazon ECS service. The service runs two tasks—an Envoy proxy instance and a ColorTeller application instance. The Envoy proxy instance acts as a TLS server, receiving inbound TLS connections from ColorGateway.

Let’s review running the example application in one-way TLS mode. Although optional, starting with one-way TLS allows you to compare the two methods and establish how to look at certain Envoy proxy statistics to distinguish and verify one-way TLS versus mTLS connections.

For practice, you can deploy the example application project in your own AWS account and perform the steps described in your own test environment.

Note: In both the one-way TLS and mTLS descriptions in the following sections, we’re using a flat certificate hierarchy for demonstration purposes. The root CAs are issuing end-entity certificates. The AWS ACM Private CA best practices recommend that root CAs should only be used to issue certificates for intermediate CAs. When intermediate CAs are involved, your certificate chain has multiple certificates concatenated in it, but the mechanisms are the same as those described here.

One-way TLS in App Mesh using ACM Private CA

Because this is a one-way TLS authentication scenario, you need only one Private CA—ColorTeller—and issue one end-entity certificate from it that’s used as the server-side certificate for the ColorTeller virtual node.

Figure 2, following, shows the architecture for this setup, including notations and color codes for certificates and a step-by-step process that shows how the system is configured and functions. Because this architecture uses a server-side certificate only, you use the native integration between App Mesh and ACM Private CA and don’t need an external mechanism for certificate integration.

Figure 2: One-way TLS in App Mesh integrated with ACM Private CA

The steps in Figure 2 are:

Step 1: A Private CA instance—ColorTeller—is created in ACM Private CA. Next, an end-entity certificate is created and signed by the CA. This certificate is used as the server-side certificate in ColorTeller.

Step 2: The CloudFormation templates configure the ColorGateway to validate server certificates against the ColorTeller private CA certificate chain. As the App Mesh endpoints are starting up, the ColorTeller CA certificate trust chain is ingested into the ColorGateway Envoy instance. The TLS configuration for ColorGateway in App Mesh is shown in Figure 3.

Figure 3: One-way TLS configuration in the client policy of ColorGateway

Figure 3 shows that the client policy attributes for outbound transport connections for ColorGateway have been configured as follows:

Enforce TLS is set to Enforced. This enforces use of TLS while communicating with backends.
TLS validation method is set to AWS Certificate Manager Private Certificate Authority (ACM-PCA hosting). This instructs App Mesh to derive the certificate trust chain from ACM PCA.
Certificate is set to the Amazon Resource Name (ARN) of the ColorTeller Private CA, which is the identifier of the certificate trust chain in ACM PCA.

This configuration ensures that ColorGateway makes outbound TLS-only connections towards ColorTeller, extracts the CA trust chain from ACM-PCA, and validates the server certificate presented by the ColorTeller virtual node against the configured CA ARN.

Step 3: The CloudFormation templates configure the ColorTeller virtual node with the ColorTeller end-entity certificate ARN in ACM Private CA. While the App Mesh endpoints are started, the ColorTeller end-entity certificate is ingested into the ColorTeller Envoy instance.

The TLS configuration for the ColorTeller virtual node in App Mesh is shown in Figure 4.

Figure 4: One-way TLS configuration in the listener configuration of ColorTeller

Figure 4 shows that various TLS-related attributes are configured as follows:

Enable TLS termination is on.
Mode is set to Strict to limit connections to TLS only.
TLS Certificate method is set to ACM Certificate Manager (ACM) hosting as the source of the end-entity certificate.
Certificate is set to ARN of the ColorTeller end-entity certificate.

Note: Figure 4 shows an annotation where the certificate ARN has been superimposed by the cert icon in green color. This icon follows the color convention from Figure 2 and can help you relate how the individual resources are configured to construct the architecture shown in Figure 2. The cert shown (and the associated private key that is not shown) in the diagram is necessary for ColorTeller to run the TLS stack and serve the certificate. The exchange of this material happens over the internal communications between App Mesh and ACM Private CA.

Step 4: The ColorGateway service receives a request from an external client.

Step 5: This step includes multiple sub-steps:

The ColorGateway Envoy initiates a one-way TLS handshake towards the ColorTeller Envoy.
The ColorTeller Envoy presents its server-side certificate to the ColorGateway Envoy.
The ColorGateway Envoy validates the certificate against its configured CA trust chain—the ColorTeller CA trust chain—and the TLS handshake succeeds.

Verifying one-way TLS

To verify that a TLS connection was established and that it is one-way TLS authenticated, run the following command on your bastion host:

$ curl -s http://colorteller.mtls-ec2.svc.cluster.local:9901/stats |grep -E 'ssl.handshake|ssl.no_certificate'

listener.0.0.0.0_15000.ssl.handshake: 1
listener.0.0.0.0_15000.ssl.no_certificate: 1

This command queries the runtime statistics that are maintained in ColorTeller Envoy and filters the output for certain SSL-related counts. The count for ssl.handshake should be one. If the ssl.handshake count is more than one, that means there’s been more than one TLS handshake. The count for ssl.no_certificate should also be one, or equal to the count for ssl.handshake. The ssl.no_certificate count tracks the total successful TLS connections with no client certificate. Since this is a one-way TLS handshake that doesn’t involve client certificates, this count is the same as the count of ssl.handshake.

The preceding statistics verify that a TLS handshake was completed and the authentication was one-way, where the ColorGateway authenticated the ColorTeller but not vice-versa. You’ll see in the next section how the ssl.no_certificate count differs when mTLS is enabled.

Mutual TLS in App Mesh using ACM Private CA

In the one-way TLS discussion in the previous section, you saw that App Mesh and ACM Private CA integration works without needing external enhancements. You also saw that App Mesh retrieved the server-side end-entity certificate in ColorTeller and the root CA trust chain in ColorGateway from ACM Private CA internally, by using the native integration between the two services.

However, a native integration between App Mesh and ACM Private CA isn’t currently available for client-side certificates. Client-side certificates are necessary for mTLS. In this section, you’ll see how you can issue and export client-side certificates from ACM Private CA and ingest them into App Mesh.

The solution uses Lambda to export the client-side certificate from ACM Private CA and store it in Secrets Manager. The solution includes an enhanced startup script embedded in the Envoy image to retrieve the certificate from Secrets Manager and place it on the Envoy file system before the Envoy process is started. The Envoy process reads the certificate, loads it in memory, and uses it in the TLS stack for the client-side certificate exchange of the mTLS handshake.

The choice of Lambda is based on this being an ephemeral workflow that needs to run only during system configuration. You need a short-lived, runtime compute context that lets you run the logic for exporting certificates from ACM Private CA and store them in Secrets Manager. Because this compute doesn’t need to run beyond this step, Lambda is an ideal choice for hosting this logic, for cost and operational effectiveness.

The choice of Secrets Manager for storing certificates is based on the confidentiality requirements of the passphrase that is used for encrypting the private key (PKCS #8) of the certificate. You also need a higher throughput data store that can support secrets retrieval from large meshes. Secrets Manager supports a higher API rate limit than the API for exporting certificates from ACM Private CA, and thus serves as a high-throughput front end for ACM Private CA for serving certificates without compromising data confidentiality.

The resulting architecture is shown in Figure 5. The figure includes notations and color codes for certificates—such as root certificates, endpoint certificates, and private keys—and a step-by-step process showing how the system is configured, started, and functions at runtime. The example uses two CA hierarchies for ColorGateway and ColorTeller to demonstrate an mTLS setup where the client and server belong to separate CA hierarchies but trust each other’s CAs.

Figure 5: mTLS in App Mesh integrated with ACM Private CA

The numbered steps in Figure 5 are:

Step 1: A Private CA instance representing the ColorGateway trust hierarchy is created in ACM Private CA. Next, an end-entity certificate is created and signed by the CA, which is used as the client-side certificate in ColorGateway.

Step 2: Another Private CA instance representing the ColorTeller trust hierarchy is created in ACM Private CA. Next, an end-entity certificate is created and signed by the CA, which is used as the server-side certificate in ColorTeller.

Step 3: As part of running CloudFormation, the Lambda function is invoked. This Lambda function is responsible for exporting the client-side certificate from ACM Private CA and storing it in Secrets Manager. This function begins by requesting a random password from Secrets Manager. This random password is used as the passphrase for encrypting the private key inside ACM Private CA before it’s returned to the function. Generating a random password from Secrets Manager allows you to generate a random password with a specified complexity.

Step 4: The Lambda function issues an export certificate request to ACM, requesting the ColorGateway end-entity certificate. The request conveys the private key passphrase retrieved from Secrets Manager in the previous step so that ACM Private CA can use it to encrypt the private key that’s sent in the response.

Step 5: The ACM Private CA responds to the Lambda function. The response carries the following elements of the ColorGateway end-entity certificate.

{
  'Certificate': '..',
  'CertificateChain': '..',
  'PrivateKey': '..'
}

Step 6: The Lambda function processes the response that is returned from ACM. It extracts individual fields in the JSON-formatted response and stores them in Secrets Manager. The Lambda function stores the following four values in Secrets Manager:

The ColorGateway endpoint certificate
The ColorGateway certificate trust chain, which contains the ColorGateway Private CA root certificate
The encrypted private key for the ColorGateway end-entity certificate
The passphrase that was used to encrypt the private key

Step 7: The App Mesh services—ColorGateway and ColorTeller—are started, which then start their Envoy proxy containers. A custom startup script embedded in the Envoy docker image fetches a certificate from Secrets Manager and places it on the Envoy file system.

Note: App Mesh publishes its own custom Envoy proxy Docker container image that ensures it is fully tested and patched with the latest vulnerability and performance patches. You’ll notice in the example source code that a custom Envoy image is built on top of the base image published by App Mesh. In this solution, we add an Envoy startup script and certain utilities such as AWS Command Line Interface (AWS CLI) and jq to help retrieve the certificate from Secrets Manager and place it on the Envoy file system during Envoy startup.

Step 8: The CloudFormation scripts configure the client policy for mTLS in ColorGateway in App Mesh, as shown in Figure 6. The following attributes are configured:

Provide client certificate is enabled. This ensures that the client certificate is exchanged as part of the mTLS handshake.
Certificate method is set to Local file hosting so that the certificate is read from the local file system.
Certificate chain is set to the path for the file that contains the ColorGateway certificate chain.
Private key is set to the path for the file that contains the private key for the ColorGateway certificate.

Figure 6: Client-side mTLS configuration in ColorGateway

At the end of the custom Envoy startup script described in step 7, the core Envoy process in ColorGateway service is started. It retrieves the ColorTeller CA root certificate from ACM Private CA and configures it internally as a trusted CA. This retrieval happens due to native integration between App Mesh and ACM Private CA. This allows ColorGateway Envoy to validate the certificate presented by ColorTeller Envoy during the TLS handshake.

Step 9: The CloudFormation scripts configure the listener configuration for mTLS in ColorTeller in App Mesh, as shown in Figure 7. The following attributes are configured:

Require client certificate is enabled, which enforces mTLS.
Validation Method is set to Local file hosting, which causes Envoy to read the certificate from the local file system.
Certificate chain is set to the path for the file that contains the ColorGateway certificate chain.

Figure 7: Server-side mTLS configuration in ColorTeller

At the end of the Envoy startup script described in step 7, the core Envoy process in ColorTeller service is started. It retrieves its own server-side end-entity certificate and corresponding private key from ACM Private CA. This retrieval happens internally, driven by the native integration between App Mesh and ACM Private CA. This allows ColorTeller Envoy to present its server-side certificate to ColorGateway Envoy during the TLS handshake.

The system startup concludes with this step, and the application is ready to process external client requests.

Step 10: The ColorGateway service receives a request from an external client.

Step 11: The ColorGateway Envoy initiates a TLS handshake with the ColorTeller Envoy. During the first half of the TLS handshake protocol, the ColorTeller Envoy presents its server-side certificate to the ColorGateway Envoy. The ColorGateway Envoy validates the certificate. Because the ColorGateway Envoy has been configured with the ColorTeller CA trust chain in step 8, the validation succeeds.

Step 12: During the second half of the TLS handshake, the ColorTeller Envoy requests the ColorGateway Envoy to provide its client-side certificate. This step is what distinguishes an mTLS exchange from a one-way TLS exchange.

The ColorGateway Envoy responds with its end-entity certificate that had been placed on its file system in step 7. The ColorTeller Envoy validates the received certificate with its CA trust chain, which contains the ColorGateway root CA that was placed on its file system (in step 7). The validation succeeds, and so an mTLS session is established.

Verifying mTLS

You can now verify that an mTLS exchange happened by running the following command on your bastion host.

$ curl -s http://colorteller.mtls-ec2.svc.cluster.local:9901/stats |grep -E 'ssl.handshake|ssl.no_certificate'

listener.0.0.0.0_15000.ssl.handshake: 1
listener.0.0.0.0_15000.ssl.no_certificate: 0

The count for ssl.handshake should be one. If the ssl.handshake count is more than one, that means that you’ve gone through more than one TLS handshake. It’s important to note that the count for ssl.no_certificate—the total successful TLS connections with no client certificate—is zero. This shows that mTLS configuration is working as expected. Recall that this count was one or higher—equal to the ssl.handshake count—in the previous section that described one-way TLS. The ssl.no_certificate count being zero indicates that this was an mTLS authenticated connection, where the ColorGateway authenticated the ColorTeller and vice-versa.

Certificate renewal

The ACM Private CA certificates that are imported into App Mesh are not eligible for managed renewal, so an external certificate renewal method is needed. This example solution uses an external renewal method as recommended in Renewing certificates in a private PKI that you can use in your own implementations.

The certificate renewal mechanism can be broken down into six steps, which are outlined in Figure 8.

Figure 8: Certificate renewal process in ACM Private CA and App Mesh on ECS integration

Here are the steps illustrated in Figure 8:

Step 1: ACM generates an Amazon CloudWatch Events event when a certificate is close to expiring.

Step 2: CloudWatch triggers a Lambda function that is responsible for certificate renewal.

Step 3: The Lambda function renews the certificate in ACM and exports the new certificate by calling ACM APIs.

Step 4: The Lambda function writes the certificate to Secrets Manager.

Step 5: The Lambda function triggers a new service deployment in an Amazon ECS cluster. This will cause the ECS services to go through a graceful update process to acquire a renewed certificate.

Step 6: The Envoy processes in App Mesh fetch the client-side certificate from Secrets Manager using external integration, and the server-side certificate from ACM using native integration.

Conclusion

In this post, you learned a method for enabling mTLS authentication between App Mesh endpoints based on certificates issued by ACM Private CA. mTLS enhances security of App Mesh deployments due to its bidirectional authentication capability. While server-side certificates are integrated natively, you saw how to use Lambda and Secrets Manager to integrate client-side certificates externally. Because ACM Private CA certificates aren’t eligible for managed renewal, you also learned how to implement an external certificate renewal process.

This solution enhances your App Mesh security posture by simplifying configuration of mTLS-enabled App Mesh deployments. It achieves this because all mTLS certificate requirements are met by a single, managed private PKI service—ACM Private CA—which allows you to centrally manage certificates and eliminates the need to run your own private PKI.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Certificate Manager forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

How to protect sensitive data for its entire lifecycle in AWS

2021-02-26 Raj Jain

Post Syndicated from Raj Jain original https://aws.amazon.com/blogs/security/how-to-protect-sensitive-data-for-its-entire-lifecycle-in-aws/

Many Amazon Web Services (AWS) customer workflows require ingesting sensitive and regulated data such as Payments Card Industry (PCI) data, personally identifiable information (PII), and protected health information (PHI). In this post, I’ll show you a method designed to protect sensitive data for its entire lifecycle in AWS. This method can help enhance your data security posture and be useful for fulfilling the data privacy regulatory requirements applicable to your organization for data protection at-rest, in-transit, and in-use.

An existing method for sensitive data protection in AWS is to use the field-level encryption feature offered by Amazon CloudFront. This CloudFront feature protects sensitive data fields in requests at the AWS network edge. The chosen fields are protected upon ingestion and remain protected throughout the entire application stack. The notion of protecting sensitive data early in its lifecycle in AWS is a highly desirable security architecture. However, CloudFront can protect a maximum of 10 fields and only within HTTP(S) POST requests that carry HTML form encoded payloads.

If your requirements exceed CloudFront’s native field-level encryption feature, such as a need to handle diverse application payload formats, different HTTP methods, and more than 10 sensitive fields, you can implement field-level encryption yourself using the Lambda@Edge feature in CloudFront. In terms of choosing an appropriate encryption scheme, this problem calls for an asymmetric cryptographic system that will allow public keys to be openly distributed to the CloudFront network edges while keeping the corresponding private keys stored securely within the network core. One such popular asymmetric cryptographic system is RSA. Accordingly, we’ll implement a Lambda@Edge function that uses asymmetric encryption using the RSA cryptosystem to protect an arbitrary number of fields in any HTTP(S) request. We will discuss the solution using an example JSON payload, although this approach can be applied to any payload format.

A complex part of any encryption solution is key management. To address that, I use AWS Key Management Service (AWS KMS). AWS KMS simplifies the solution and offers improved security posture and operational benefits, detailed later.

Solution overview

You can protect data in-transit over individual communications channels using transport layer security (TLS), and at-rest in individual storage silos using volume encryption, object encryption or database table encryption. However, if you have sensitive workloads, you might need additional protection that can follow the data as it moves through the application stack. Fine-grained data protection techniques such as field-level encryption allow for the protection of sensitive data fields in larger application payloads while leaving non-sensitive fields in plaintext. This approach lets an application perform business functions on non-sensitive fields without the overhead of encryption, and allows fine-grained control over what fields can be accessed by what parts of the application.

A best practice for protecting sensitive data is to reduce its exposure in the clear throughout its lifecycle. This means protecting data as early as possible on ingestion and ensuring that only authorized users and applications can access the data only when and as needed. CloudFront, when combined with the flexibility provided by Lambda@Edge, provides an appropriate environment at the edge of the AWS network to protect sensitive data upon ingestion in AWS.

Since the downstream systems don’t have access to sensitive data, data exposure is reduced, which helps to minimize your compliance footprint for auditing purposes.

The number of sensitive data elements that may need field-level encryption depends on your requirements. For example:

For healthcare applications, HIPAA regulates 18 personal data elements.
In California, the California Consumer Privacy Act (CCPA) regulates at least 11 categories of personal information—each with its own set of data elements.

The idea behind field-level encryption is to protect sensitive data fields individually, while retaining the structure of the application payload. The alternative is full payload encryption, where the entire application payload is encrypted as a binary blob, which makes it unusable until the entirety of it is decrypted. With field-level encryption, the non-sensitive data left in plaintext remains usable for ordinary business functions. When retrofitting data protection in existing applications, this approach can reduce the risk of application malfunction since the data format is maintained.

The following figure shows how PII data fields in a JSON construction that are deemed sensitive by an application can be transformed from plaintext to ciphertext with a field-level encryption mechanism.

Figure 1: Example of field-level encryption

You can change plaintext to ciphertext as depicted in Figure 1 by using a Lambda@Edge function to perform field-level encryption. I discuss the encryption and decryption processes separately in the following sections.

Field-level encryption process

Let’s discuss the individual steps involved in the encryption process as shown in Figure 2.

Figure 2: Field-level encryption process

Figure 2 shows CloudFront invoking a Lambda@Edge function while processing a client request. CloudFront offers multiple integration points for invoking Lambda@Edge functions. Since you are processing a client request and your encryption behavior is related to requests being forwarded to an origin server, you want your function to run upon the origin request event in CloudFront. The origin request event represents an internal state transition in CloudFront that happens immediately before CloudFront forwards a request to the downstream origin server.

You can associate your Lambda@Edge with CloudFront as described in Adding Triggers by Using the CloudFront Console. A screenshot of the CloudFront console is shown in Figure 3. The selected event type is Origin Request and the Include Body check box is selected so that the request body is conveyed to Lambda@Edge.

Figure 3: Configuration of Lambda@Edge in CloudFront

The Lambda@Edge function acts as a programmable hook in the CloudFront request processing flow. You can use the function to replace the incoming request body with a request body with the sensitive data fields encrypted.

The process includes the following steps:

Step 1 – RSA key generation and inclusion in Lambda@Edge

You can generate an RSA customer managed key (CMK) in AWS KMS as described in Creating asymmetric CMKs. This is done at system configuration time.

Note: You can use your existing RSA key pairs or generate new ones externally by using OpenSSL commands, especially if you need to perform RSA decryption and key management independently of AWS KMS. Your choice won’t affect the fundamental encryption design pattern presented here.

The RSA key creation in AWS KMS requires two inputs: key length and type of usage. In this example, I created a 2048-bit key and assigned its use for encryption and decryption. The cryptographic configuration of an RSA CMK created in AWS KMS is shown in Figure 4.

Figure 4: Cryptographic properties of an RSA key managed by AWS KMS

Of the two encryption algorithms shown in Figure 4— RSAES_OAEP_SHA_256 and RSAES_OAEP_SHA_1, this example uses RSAES_OAEP_SHA_256. The combination of a 2048-bit key and the RSAES_OAEP_SHA_256 algorithm lets you encrypt a maximum of 190 bytes of data, which is enough for most PII fields. You can choose a different key length and encryption algorithm depending on your security and performance requirements. How to choose your CMK configuration includes information about RSA key specs for encryption and decryption.

Using AWS KMS for RSA key management versus managing the keys yourself eliminates that complexity and can help you:

Enforce IAM and key policies that describe administrative and usage permissions for keys.
Manage cross-account access for keys.
Monitor and alarm on key operations through Amazon CloudWatch.
Audit AWS KMS API invocations through AWS CloudTrail.
Record configuration changes to keys and enforce key specification compliance through AWS Config.
Generate high-entropy keys in an AWS KMS hardware security module (HSM) as required by NIST.
Store RSA private keys securely, without the ability to export.
Perform RSA decryption within AWS KMS without exposing private keys to application code.
Categorize and report on keys with key tags for cost allocation.
Disable keys and schedule their deletion.

You need to extract the RSA public key from AWS KMS so you can include it in the AWS Lambda deployment package. You can do this from the AWS Management Console, through the AWS KMS SDK, or by using the get-public-key command in the AWS Command Line Interface (AWS CLI). Figure 5 shows Copy and Download options for a public key in the Public key tab of the AWS KMS console.

Figure 5: RSA public key available for copy or download in the console

Note: As we will see in the sample code in step 3, we embed the public key in the Lambda@Edge deployment package. This is a permissible practice because public keys in asymmetric cryptography systems aren’t a secret and can be freely distributed to entities that need to perform encryption. Alternatively, you can use Lambda@Edge to query AWS KMS for the public key at runtime. However, this introduces latency, increases the load against your KMS account quota, and increases your AWS costs. General patterns for using external data in Lambda@Edge are described in Leveraging external data in Lambda@Edge.

Step 2 – HTTP API request handling by CloudFront

CloudFront receives an HTTP(S) request from a client. CloudFront then invokes Lambda@Edge during origin-request processing and includes the HTTP request body in the invocation.

Step 3 – Lambda@Edge processing

The Lambda@Edge function processes the HTTP request body. The function extracts sensitive data fields and performs RSA encryption over their values.

The following code is sample source code for the Lambda@Edge function implemented in Python 3.7:

import Crypto
import base64
import json
from Crypto.Cipher import PKCS1_OAEP
from Crypto.PublicKey import RSA

# PEM-formatted RSA public key copied over from AWS KMS or your own public key.
RSA_PUBLIC_KEY = "-----BEGIN PUBLIC KEY-----<your key>-----END PUBLIC KEY-----"
RSA_PUBLIC_KEY_OBJ = RSA.importKey(RSA_PUBLIC_KEY)
RSA_CIPHER_OBJ = PKCS1_OAEP.new(RSA_PUBLIC_KEY_OBJ, Crypto.Hash.SHA256)

# Example sensitive data field names in a JSON object. 
PII_SENSITIVE_FIELD_NAMES = ["fname", "lname", "email", "ssn", "dob", "phone"]

CIPHERTEXT_PREFIX = "#01#"
CIPHERTEXT_SUFFIX = "#10#"

def lambda_handler(event, context):
    # Extract HTTP request and its body as per documentation:
    # https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-event-structure.html
    http_request = event['Records'][0]['cf']['request']
    body = http_request['body']
    org_body = base64.b64decode(body['data'])
    mod_body = protect_sensitive_fields_json(org_body)
    body['action'] = 'replace'
    body['encoding'] = 'text'
    body['data'] = mod_body
    return http_request


def protect_sensitive_fields_json(body):
    # Encrypts sensitive fields in sample JSON payload shown earlier in this post.
    # [{"fname": "Alejandro", "lname": "Rosalez", … }]
    person_list = json.loads(body.decode("utf-8"))
    for person_data in person_list:
        for field_name in PII_SENSITIVE_FIELD_NAMES:
            if field_name not in person_data:
                continue
            plaintext = person_data[field_name]
            ciphertext = RSA_CIPHER_OBJ.encrypt(bytes(plaintext, 'utf-8'))
            ciphertext_b64 = base64.b64encode(ciphertext).decode()
            # Optionally, add unique prefix/suffix patterns to ciphertext
            person_data[field_name] = CIPHERTEXT_PREFIX + ciphertext_b64 + CIPHERTEXT_SUFFIX 
    return json.dumps(person_list)

The event structure passed into the Lambda@Edge function is described in Lambda@Edge Event Structure. Following the event structure, you can extract the HTTP request body. In this example, the assumption is that the HTTP payload carries a JSON document based on a particular schema defined as part of the API contract. The input JSON document is parsed by the function, converting it into a Python dictionary. The Python native dictionary operators are then used to extract the sensitive field values.

Note: If you don’t know your API payload structure ahead of time or you’re dealing with unstructured payloads, you can use techniques such as regular expression pattern searches and checksums to look for patterns of sensitive data and target them accordingly. For example, credit card primary account numbers include a Luhn checksum that can be programmatically detected. Additionally, services such as Amazon Comprehend and Amazon Macie can be leveraged for detecting sensitive data such as PII in application payloads.

While iterating over the sensitive fields, individual field values are encrypted using the standard RSA encryption implementation available in the Python Cryptography Toolkit (PyCrypto). The PyCrypto module is included within the Lambda@Edge zip archive as described in Lambda@Edge deployment package.

The example uses the standard optimal asymmetric encryption padding (OAEP) and SHA-256 encryption algorithm properties. These properties are supported by AWS KMS and will allow RSA ciphertext produced here to be decrypted by AWS KMS later.

Note: You may have noticed in the code above that we’re bracketing the ciphertexts with predefined prefix and suffix strings:

person_data[field_name] = CIPHERTEXT_PREFIX + ciphertext_b64 + CIPHERTEXT_SUFFIX

This is an optional measure and is being implemented to simplify the decryption process.

The prefix and suffix strings help demarcate ciphertext embedded in unstructured data in downstream processing and also act as embedded metadata. Unique prefix and suffix strings allow you to extract ciphertext through string or regular expression (regex) searches during the decryption process without having to know the data body format or schema, or the field names that were encrypted.

Distinct strings can also serve as indirect identifiers of RSA key pair identifiers. This can enable key rotation and allow separate keys to be used for separate fields depending on the data security requirements for individual fields.

You can ensure that the prefix and suffix strings can’t collide with the ciphertext by bracketing them with characters that don’t appear in cyphertext. For example, a hash (#) character cannot be part of a base64 encoded ciphertext string.

Deploying a Lambda function as a Lambda@Edge function requires specific IAM permissions and an IAM execution role. Follow the Lambda@Edge deployment instructions in Setting IAM Permissions and Roles for Lambda@Edge.

Step 4 – Lambda@Edge response

The Lambda@Edge function returns the modified HTTP body back to CloudFront and instructs it to replace the original HTTP body with the modified one by setting the following flag:

http_request['body']['action'] = 'replace'

Step 5 – Forward the request to the origin server

CloudFront forwards the modified request body provided by Lambda@Edge to the origin server. In this example, the origin server writes the data body to persistent storage for later processing.

Field-level decryption process

An application that’s authorized to access sensitive data for a business function can decrypt that data. An example decryption process is shown in Figure 6. The figure shows a Lambda function as an example compute environment for invoking AWS KMS for decryption. This functionality isn’t dependent on Lambda and can be performed in any compute environment that has access to AWS KMS.

Figure 6: Field-level decryption process

The steps of the process shown in Figure 6 are described below.

Step 1 – Application retrieves the field-level encrypted data

The example application retrieves the field-level encrypted data from persistent storage that had been previously written during the data ingestion process.

Step 2 – Application invokes the decryption Lambda function

The application invokes a Lambda function responsible for performing field-level decryption, sending the retrieved data to Lambda.

Step 3 – Lambda calls the AWS KMS decryption API

The Lambda function uses AWS KMS for RSA decryption. The example calls the KMS decryption API that inputs ciphertext and returns plaintext. The actual decryption happens in KMS; the RSA private key is never exposed to the application, which is a highly desirable characteristic for building secure applications.

Note: If you choose to use an external key pair, then you can securely store the RSA private key in AWS services like AWS Systems Manager Parameter Store or AWS Secrets Manager and control access to the key through IAM and resource policies. You can fetch the key from relevant vault using the vault’s API, then decrypt using the standard RSA implementation available in your programming language. For example, the cryptography toolkit in Python or javax.crypto in Java.

The Lambda function Python code for decryption is shown below.

import base64
import boto3
import re

kms_client = boto3.client('kms')
CIPHERTEXT_PREFIX = "#01#"
CIPHERTEXT_SUFFIX = "#10#"

# This lambda function extracts event body, searches for and decrypts ciphertext 
# fields surrounded by provided prefix and suffix strings in arbitrary text bodies 
# and substitutes plaintext fields in-place.  
def lambda_handler(event, context):    
    org_data = event["body"]
    mod_data = unprotect_fields(org_data, CIPHERTEXT_PREFIX, CIPHERTEXT_SUFFIX)
    return mod_data

# Helper function that performs non-greedy regex search for ciphertext strings on
# input data and performs RSA decryption of them using AWS KMS 
def unprotect_fields(org_data, prefix, suffix):
    regex_pattern = prefix + "(.*?)" + suffix
    mod_data_parts = []
    cursor = 0

    # Search ciphertexts iteratively using python regular expression module
    for match in re.finditer(regex_pattern, org_data):
        mod_data_parts.append(org_data[cursor: match.start()])
        try:
            # Ciphertext was stored as Base64 encoded in our example. Decode it.
            ciphertext = base64.b64decode(match.group(1))

            # Decrypt ciphertext using AWS KMS  
            decrypt_rsp = kms_client.decrypt(
                EncryptionAlgorithm="RSAES_OAEP_SHA_256",
                KeyId="<Your-Key-ID>",
                CiphertextBlob=ciphertext)
            decrypted_val = decrypt_rsp["Plaintext"].decode("utf-8")
            mod_data_parts.append(decrypted_val)
        except Exception as e:
            print ("Exception: " + str(e))
            return None
        cursor = match.end()

    mod_data_parts.append(org_data[cursor:])
    return "".join(mod_data_parts)

The function performs a regular expression search in the input data body looking for ciphertext strings bracketed in predefined prefix and suffix strings that were added during encryption.

While iterating over ciphertext strings one-by-one, the function calls the AWS KMS decrypt() API. The example function uses the same RSA encryption algorithm properties—OAEP and SHA-256—and the Key ID of the public key that was used during encryption in Lambda@Edge.

Note that the Key ID itself is not a secret. Any application can be configured with it, but that doesn’t mean any application will be able to perform decryption. The security control here is that the AWS KMS key policy must allow the caller to use the Key ID to perform the decryption. An additional security control is provided by Lambda execution role that should allow calling the KMS decrypt() API.

Step 4 – AWS KMS decrypts ciphertext and returns plaintext

To ensure that only authorized users can perform decrypt operation, the KMS is configured as described in Using key policies in AWS KMS. In addition, the Lambda IAM execution role is configured as described in AWS Lambda execution role to allow it to access KMS. If both the key policy and IAM policy conditions are met, KMS returns the decrypted plaintext. Lambda substitutes the plaintext in place of ciphertext in the encapsulating data body.

Steps three and four are repeated for each ciphertext string.

Step 5 – Lambda returns decrypted data body

Once all the ciphertext has been converted to plaintext and substituted in the larger data body, the Lambda function returns the modified data body to the client application.

Conclusion

In this post, I demonstrated how you can implement field-level encryption integrated with AWS KMS to help protect sensitive data workloads for their entire lifecycle in AWS. Since your Lambda@Edge is designed to protect data at the network edge, data remains protected throughout the application execution stack. In addition to improving your data security posture, this protection can help you comply with data privacy regulations applicable to your organization.

Since you author your own Lambda@Edge function to perform standard RSA encryption, you have flexibility in terms of payload formats and the number of fields that you consider to be sensitive. The integration with AWS KMS for RSA key management and decryption provides significant simplicity, higher key security, and rich integration with other AWS security services enabling an overall strong security solution.

By using encrypted fields with identifiers as described in this post, you can create fine-grained controls for data accessibility to meet the security principle of least privilege. Instead of granting either complete access or no access to data fields, you can ensure least privileges where a given part of an application can only access the fields that it needs, when it needs to, all the way down to controlling access field by field. Field by field access can be enabled by using different keys for different fields and controlling their respective policies.

In addition to protecting sensitive data workloads to meet regulatory and security best practices, this solution can be used to build de-identified data lakes in AWS. Sensitive data fields remain protected throughout their lifecycle, while non-sensitive data fields remain in the clear. This approach can allow analytics or other business functions to operate on data without exposing sensitive data.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Noise

All posts by Raj Jain

How to use ACM Private CA for enabling mTLS in AWS App Mesh

Solution overview

Example application

One-way TLS in App Mesh using ACM Private CA

Verifying one-way TLS

Mutual TLS in App Mesh using ACM Private CA

Verifying mTLS

Certificate renewal

Conclusion

The collective thoughts of the interwebz