Tag Archives: client side encryption

How to enable encryption in a browser with the AWS Encryption SDK for JavaScript and Node.js

Post Syndicated from Spencer Janyk original https://aws.amazon.com/blogs/security/how-to-enable-encryption-browser-aws-encryption-sdk-javascript-node-js/

In this post, we’ll show you how to use the AWS Encryption SDK (“ESDK”) for JavaScript to handle an in-browser encryption workload for a hypothetical application. First, we’ll review some of the security and privacy properties of encryption, including the names AWS uses for the different components of a typical application. Then, we’ll discuss some of the reasons you might want to encrypt each of those components, with a focus on in-browser encryption, and we’ll describe how to perform that encryption using the ESDK. Lastly, we’ll talk about some of the security properties to be mindful of when designing an application, and where to find additional resources.

An overview of the security and privacy properties of encryption

Encryption is a technique that can restrict access to sensitive data by making it unreadable without a key. An encryption process takes data that is plainly readable or processable (“plaintext”) and uses principles of mathematics to obscure the contents so that it can’t be read without the use of a secret key. To preserve user privacy and prevent unauthorized disclosure of sensitive business data, developers need ways to protect sensitive data during the entire data lifecycle. Data needs to be protected from risks associated with unintentional disclosure as data flows between collection, storage, processing, and sharing components of an application. In this context, encryption is typically divided into two separate techniques: encryption at rest for storing data; and encryption in transit for moving data between entities or systems.

Many applications use encryption in transit to secure connections between their users and the services they provide, and then encrypt the data before it’s stored. However, as applications become more complex and data must be moved between more nodes and stored in more diverse places, there are more opportunities for data to be accidentally leaked or unintentionally disclosed. When a user enters their data in a browser, Transport Layer Security (TLS) can protect that data in transit between the user’s browser and a service endpoint. But in a distributed system, intermediary services between that endpoint and the service that processes that sensitive data might log or cache the data before transporting it. Encrypting sensitive data at the point of collection in the browser is a form of encryption at rest that minimizes the risk of unauthorized access and protects the data if it’s lost, stolen, or accidentally exposed. Encrypting data in the browser means that even if it’s completely exposed elsewhere, it’s unreadable and worthless to anyone without access to the key.

A typical web application

A typical web application will accept some data as input, process it, and then store it. When the user needs to access stored data, the data often follows the same path used when it was input. In our example there are three primary components to the path:

Figure 1: A hypothetical web application where the application is composed of an end-user interacting with a browser front-end, a third party which processes data received from the browser, processing is performed in Amazon EC2, and storage happens in Amazon S3

Figure 1: A hypothetical web application where the application is composed of an end-user interacting with a browser front-end, a third party which processes data received from the browser, processing is performed in Amazon EC2, and storage happens in Amazon S3

  1. An end-user interacts with the application using an interface in the browser.
  2. As data is sent to Amazon EC2, it passes through the infrastructure of a third party which could be an Internet Service Provider, an appliance in the user’s environment, or an application running in the cloud.
  3. The application on Amazon EC2 processes the data once it has been received.
  4. Once the application is done processing data, it is stored in Amazon S3 until it is needed again.

As data moves between components, TLS is used to prevent inadvertent disclosure. But what if one or more of these components is a third-party service that doesn’t need access to sensitive data? That’s where encryption at rest comes in.

Encryption at rest is available as a server-side, client-side, and client-side in-browser protection. Server-side encryption (SSE) is the most commonly used form of encryption with AWS customers, and for good reason: it’s easy to use because it’s natively supported by many services, such as Amazon S3. When SSE is used, the service that’s storing data will encrypt each piece of data with a key (a “data key”) when it’s received, and then decrypt it transparently when it’s requested by an authorized user. This has the benefit of being seamless for application developers because they only need to check a box in Amazon S3 to enable encryption, and it also adds an additional level of access control by having separate permissions to download an object and perform a decryption operation. However, there is a security/convenience tradeoff to consider, because the service will allow any role with the appropriate permissions to perform a decryption. For additional control, many AWS services—including S3—support the use of customer-managed AWS Key Management Service (AWS KMS) customer master keys (CMKs) that allow you to specify key policies or use grants or AWS Identity and Access Management (IAM) policies to control which roles or users have access to decryption, and when. Configuring permission to decrypt using customer-managed CMKs is often sufficient to satisfy compliance regimes that require “application-level encryption.”

Some threat models or compliance regimes may require client-side encryption (CSE), which can add a powerful additional level of access control at the expense of additional complexity. As noted above, services perform server-side encryption on data after it has left the boundary of your application. TLS is used to secure the data in transit to the service, but some customers might want to only manage encrypt/decrypt operations within their application on EC2 or in the browser. Applications can use the AWS Encryption SDK to encrypt data within the application trust boundary before it’s sent to a storage service.

But what about a use case where customers don’t even want plaintext data to leave the browser? Or what if end-users input data that is passed through or logged by intermediate systems that belong to a third-party? It’s possible to create a separate application that only manages encryption to ensure that your environment is segregated, but using the AWS Encryption SDK for JavaScript allows you to encrypt data in an end-user browser before it’s ever sent to your application, so only your end-user will be able to view their plaintext data. As you can see in Figure 2 below, in-browser encryption can allow data to be safely handled by untrusted intermediate systems while ensuring its confidentiality and integrity.

Figure 2: A hypothetical web application with encryption where the application is composed of an end-user interacting with a browser front-end, a third party which processes data received from the browser, processing is performed in Amazon EC2, and storage happens in Amazon S3

Figure 2: A hypothetical web application with encryption where the application is composed of an end-user interacting with a browser front-end, a third party which processes data received from the browser, processing is performed in Amazon EC2, and storage happens in Amazon S3

  1. The application in the browser requests a data key to encrypt sensitive data entered by the user before it is passed to a third party.
  2. Because the sensitive data has been encrypted, the third party cannot read it. The third party may be an Internet Service Provider, an appliance in the user’s environment, an application running in the cloud, or a variety of other actors.
  3. The application on Amazon EC2 can make a request to KMS to decrypt the data key so the data can be decrypted, processed, and re-encrypted.
  4. The encrypted object is stored in S3 where a second encryption request is made so the object can be encrypted when it is stored server side.

How to encrypt in the browser

The first step of in-browser encryption is including a copy of the AWS Encryption SDK for JavaScript with the scripts you’re already sending to the user when they access your application. Once it’s present in the end-user environment, it’s available for your application to make calls. To perform the encryption, the ESDK will request a data key from the cryptographic materials provider that is used to encrypt, and an encrypted copy of the data key that will be stored with the object being encrypted. After a piece of data is encrypted within the browser, the ciphertext can be uploaded to your application backend for processing or storage. When a user needs to retrieve the plaintext, the ESDK can read the metadata attached to the ciphertext to determine the appropriate method to decrypt the data key, and if they have access to the CMK decrypt the data key and then use it to decrypt the data.

Important considerations

One common issue with browser-based applications is inconsistent feature support across different browser vendors and versions. For example, how will the application respond to browsers that lack native support for the strongest recommended cryptographic algorithm suites? Or, will there be a message or alternative mode if a user accesses the application using a browser that has JavaScript disabled? The ESDK for JavaScript natively supports a fallback mode, but it may not be appropriate for all use cases. Be sure to understand what kind of browser environments you will need to support to determine whether in-browser encryption is appropriate, and include support for graceful degradation if you expect limited browser support. Developers should also consider the ways that unauthorized users might monitor user actions via a browser extension, make unauthorized browser requests without user knowledge, or request a “downgraded” (less mathematically intensive) cryptographic operation.

It’s always a good idea to have your application designs reviewed by security professionals. If you have an AWS Account Manager or Technical Account Manager, you can ask them to connect you with a Solutions Architect to review your design. If you’re an AWS customer but don’t have an account manager, consider visiting an AWS Loft to participate in our “Ask an Expert” plan.

Where to learn more

If you have questions about this post, let us know in the Comments section below, or consult the AWS Encryption SDK Developer Forum. Because the Encryption SDK is open-source, you can always contribute, open an issue, or ask questions in Github.

The AWS Encryption SDK for JavaScript is available at: https://github.com/awslabs/aws-encryption-sdk-javascript
Documentation is available at: https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/javascript.html

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Janyk author photo

Spencer Janyk

Spencer is a Senior Product Manager at Amazon Web Services working on data encryption and privacy. He has previously worked on vulnerability management and monitoring for enterprises and applying machine learning to challenges in ad tech, social media, diversity in recruiting, and talent management. Spencer holds a Master of Arts in Performance Studies from New York University and a Bachelor of Arts in Gender Studies from Whitman College.

Gray author photo

Amanda Gray

Amanda is a Senior Security Engineer at Amazon Web Services on the Crypto Tools team. Previously, Amanda worked on application security and privacy by design, and she continues to promote these goals every day. Amanda holds Bachelors’ degrees in Physics and Computer Science from the University of Washington and Smith College respectively, and a Master’s degree in Physical Oceanography from the University of Washington.

How to decrypt ciphertexts in multiple regions with the AWS Encryption SDK in C

Post Syndicated from Liz Roth original https://aws.amazon.com/blogs/security/how-to-decrypt-ciphertexts-multiple-regions-aws-encryption-sdk-in-c/

You’ve told us that you want to encrypt data once with AWS Key Management Service (AWS KMS) and decrypt that data with customer master keys (CMKs) that you specify, often with CMKs in different AWS Regions. Doing this saves you compute resources and helps you to enable secure and efficient high-availability schemes.

The AWS Crypto Tools team has introduced the AWS Encryption SDK for C so you can achieve these goals. The new tool also adds more options for language and platform support and is fully interoperable with the implementations in Java and Python.

The AWS Encryption SDK is a client-side encryption library that helps make it easier for you to implement encryption best practices in your applications. You can use it with master keys from multiple sources, including AWS KMS CMKs. The AWS Encryption SDK doesn’t require AWS KMS or any other AWS service.

You can use AWS KMS APIs directly to encrypt data keys using multiple CMKs, but the AWS Encryption SDK provides tools to make working with multiple CMKs even easier, with everything you need stored in the Encryption SDK’s portable encrypted message format. The AWS Encryption SDK for C uses the concept of keyrings, which makes it easy to work with ciphertexts encrypted using multiple CMKs.

In this post, I will walk you through an example using the new AWS Encryption SDK for C. I’ll focus on some highlights from example code in the context of what an example application deployment might look like. You can find the complete example code in this GitHub repository. As always, we welcome your comments and your contributions.

Example scenario

To add some context around the example code, assume that you have a data processing application deployed both in US West (Oregon) us-west-2 and EU Central (Frankfurt) eu-central-1. For added durability, this example application creates and encrypts data in us-west-2 before it’s copied to the eu-central-1 Region. You have assurance that you could decrypt that data in us-west-2 if needed, but you want to mitigate the case where the decryption service in us-west-2 is unavailable. So how do you ensure you can decrypt your data in the eu-central-1 region when you need to?

In this example, your data processing application uses the AWS Encryption SDK and AWS KMS to generate a 256-bit data key to encrypt content locally in us-west-2. The AWS Encryption SDK for C deletes the plaintext data key after use, but an encrypted copy of that data key is included in the encrypted message that the AWS Encryption SDK returns. This prevents you from losing the encrypted copy of the data key, which would make your encrypted content unrecoverable. The data key is encrypted under the AWS KMS CMKs in each of the two regions in which you might want to decrypt the data in the future.

A best practice is to plan to decrypt data using in-region data keys and CMKs. This reduces latency and simplifies the permissions and auditing properties of the decryption operation. The latency impact from the cross-region API calls occur only during the encryption operation.

In this scenario, the AWS KMS CMK key policy permissions look like this:

  • To encrypt data, the AWS identity used by the data processing application in us-west-2 needs kms:GenerateDataKey permission on the us-west-2 CMK and kms:Encrypt permission on the eu-central-1 CMK. You can specify these permissions in a key policy or IAM policy. This will let the application create a data key in us-west-2 and encrypt the data key under CMKs in both AWS Regions.
  • To decrypt data, the AWS identity used by the data processing application in us-west-2 needs kms:Decrypt permissions on the CMK in us-west-2 or the CMK in eu-central-1.

Encryption path

First, define variables for the Amazon Resource Names (ARNs) of your CMKs in us-west-2 and eu-central-1. In the Encryption SDK for C, to encrypt, you can identify a CMK by its CMK ARN or the Alias ARN that is mapped to the CMK ARN.

const char *KEY_ARN_US_WEST_2 = "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab";

const char *KEY_ARN_EU_CENTRAL_1 = "arn:aws:kms:eu-central-1:111122223333:key/0987dcba-09fe-87dc-65ba-ab0987654321";      

Now, use the CMK ARNs to create a keyring. In the Encryption SDK, a keyring is used to generate, encrypt, and decrypt data keys under multiple master keys. You’ll create a KMS keyring configured to use multiple CMKs.

struct aws_cryptosdk_keyring *kms_keyring=Aws::Cryptosdk::KmsKeyring::Builder().Build(KEY_ARN_US_WEST_2, { KEY_ARN_EU_CENTRAL_1 });

When the AWS Encryption SDK uses this keyring to encrypt data, it calls GenerateDataKey on the first CMK that you specify, and Encrypt on each of the remaining CMKs that you specify. The result is a plaintext data key generated in us-west-2, an encryption of the data key using the CMK in us-west-2, and an encryption of the data key using the CMK in eu-central-1.

The plaintext data key that AWS KMS generated in us-west-2 is protected under a TLS session using only cipher suites that support forward-secrecy. The process of sending that same plaintext data key to the AWS KMS endpoint in eu-central-1 for encryption is also protected under a similar TLS session.

The Encryption SDK uses the data key to encrypt your data, and it stores the encrypted data keys with your encrypted content. The result is an encrypted message that can be decrypted using the CMK in us-west-2 or the CMK in eu-central-1.

Now that you understand what’s going to happen after you create the keyring, I’ll return to the code sample. Next, you need to create an encrypt-mode session with your keyring and pass in the CMM. In the AWS Encryption SDK for C, you use a session to encrypt a single plaintext message or decrypt a single ciphertext message, regardless of its size. The session maintains the state of the message throughout its processing.

struct aws_cryptosdk_session *session = aws_cryptosdk_session_new_from_keyring(alloc, AWS_CRYPTOSDK_ENCRYPT, kms_keyring);

With the keyring and encrypt-mode session, the data processing application can ask the Encryption SDK to encrypt the data under the CMKs that you specified in two different AWS regions:


The result is an encrypted message that contains the ciphertext and two encrypted copies of the same data key. One encrypted data key was encrypted by your CMK in us-west-2 and other encrypted data key was encrypted by your CMK in eu-central-1.

Decryption path

In the AWS Encryption SDK for C, you use keyrings for both encrypting and decrypting. You can use the same keyring for both, or you can use different keyrings for each operation.

Why would you want to use a different keyring for decryption? At a high level, encrypt keyrings specify all CMKs that can decrypt the ciphertext. Decrypt keyrings constrain the CMKs the application is permitted to use.

Reusing a keyring for both encrypt and decrypt mode can simplify your AWS Encryption SDK client configuration, but splitting the keyring and using different AWS KMS clients provides more flexibility to meet your security and architecture goals. The option you choose depends in part on the constraints you want to place on the CMKs your application uses.

The Decrypt API in the AWS KMS service doesn’t permit you to specify a CMK as a request parameter. But the AWS Encryption SDK lets you specify one or many CMKs in a decryption keyring, or even discover which CMKs to try automatically. I’ll discuss each option in the next section.

Decryption path 1: Use a specific CMK

This keyring option configures the AWS Encryption SDK to use only a specified CMK in the specified AWS Region. This implies that your data processing application will need kms:Decrypt permissions on that specific CMK and your application will always call the same AWS KMS endpoints in the specified AWS Region. CloudTrail events from the Decrypt API will also only appear in the specified AWS Region.

You might use a specific CMK when the user or application that is decrypting the data has kms:Decrypt permission on only one of the CMKs that encrypted the data keys.

The CMK that you specify to decrypt the data must be one of the CMKs that was used to encrypt the data. Make sure that at least one of the CMKs from your encrypt keyring is included in the decrypt keyring and that the caller has kms:Decrypt permission to use it.

In my example, I encrypted the data keys using CMKs in us-west-2 and eu-central 1, so I’ll start decrypting in eu-central-1 because I want to have a specific decrypt instantiation of the data processing application dedicated to eu-central-1. Assume the eu-central-1 data processing application has configured AWS IAM credentials for a principal with permission to call the Decrypt operation on the eu-central-1 CMK.

Configure a keyring that asks the AWS Encryption SDK to use the CMK in eu-central-1 to decrypt:


The Encryption SDK reads the encrypted message, finds the encrypted data key that was encrypted using the CMK in eu-central-1, and uses this keyring to decrypt.

Decryption path 2: Use any of several CMKs

This keyring option configures the AWS Encryption SDK to try several specific CMKs during its decryption attempts, stopping as soon as it succeeds. You should configure the AWS IAM credentials used by your data processing application to have kms:Decrypt permissions on each of the specified regional CMKs.

Your application could end up calling multiple regional AWS KMS endpoints. CloudTrail events from the Decrypt API will appear in the AWS Region in which the decrypt operation succeeds, and in any of the other AWS Regions that the keyring attempts to use. The CMK that you specify to decrypt the data must be one of the CMKs that was used to encrypt the data. Make sure that at least one of the CMKs from your encrypt keyring is included in the decrypt keyring and that the application has kms:Decrypt permission to use it.

You might define an encryption keyring that includes multiple CMKs so that users with different permissions can decrypt the same message. For example, you might include in your encryption keyring keys in multiple AWS regions.

Here’s an example keyring constructed with multiple CMKs:

Aws::Cryptosdk::KmsKeyring::Builder().Build(KEY_ARN_EU_CENTRAL_1, { KEY_ARN_US_WEST_2 })

The AWS Encryption SDK reads each of the encrypted data keys stored in the encrypted message in the order that they appear. For each data key, the Encryption SDK searches the keyring for the matching CMK that encrypted it. If it finds that CMK, the AWS Encryption SDK calls AWS KMS in the AWS Region where the CMK exists to decrypt that data key, then uses that decrypted key to decrypt the message. If the decryption operation fails for any reason, the AWS Encryption SDK moves on to the next encrypted data key in the message and tries again.

The AWS Encryption SDK will try to decrypt the encrypted message in this way until either decryption succeeds, or the AWS Encryption SDK has attempted and failed to decrypt any of the encrypted data keys using the CMKs specified in the keyring.

If this keyring configuration looks familiar, it’s because it’s similar to the configuration you used on the encrypt path when you encrypted under multiple CMKs. The difference is this:

  • Encryption: The AWS Encryption SDK uses every CMK in the keyring to encrypt the data key, and adds all of the encrypted data keys to the encrypted message.
  • Decryption: The AWS Encryption SDK attempts to decrypt one of the encrypted data key using only the CMKs in the keyring. It stops as soon as it succeeds.

Decryption path 3: Strategic ARNs reduction using the Discovery keyring

The previous decryption paths required you to keep track of the exact CMKs used during the encryption operation, which may suit your needs for security and event logging. But what if you want more flexibility? What if you want to change the CMKs that you use in encryption operations without updating the data processing application that decrypts your data? You can configure a keyring that doesn’t specify CMKs to use for decryption, but instead tries each CMK that encrypted a data key until decryption succeeds or all referenced CMKs fail. We call this configuration a KMS Discovery keyring.

A Discovery keyring is equivalent to a keyring that includes all of the same CMKs that were used to encrypt the data, but it’s simpler and less error-prone. You might use a KMS Discovery keyring if you have no preference among the CMKs that encrypted a data key, and don’t mind the latency tradeoffs of trying CMKs in remote AWS Regions, or trying CMKs that will fail a permissions check while searching for one that succeeds. You can think of the KMS Discovery keyring as a universal keyring that you can use and reuse in your applications in many AWS Regions.

When you use a KMS Discovery keyring, the AWS Encryption SDK reads each encrypted data key and discovers the ARN of the CMK used to encrypt it. The AWS Encryption SDK then uses the configured IAM credentials to call AWS KMS in that CMK’s AWS Region to decrypt the data key. The AWS Encryption SDK repeats that process until it has decrypted the data key or runs out of encrypted data keys to try. .


While KMS Discovery keyrings are simpler, you run the risk of having your data processing application make a cross-region call to an AWS KMS endpoint that adds unwanted latency. In my example, you might not want the decrypting application running in us-west-2 to wait for the AWS Encryption SDK to call AWS KMS in eu-central-1. To use only the CMKs in a particular AWS Region to decrypt the data keys, create a KMS Regional Discovery keyring that specifies the AWS Region, but not the CMK ARNs. In my example, the following keyring allows the AWS Encryption SDK to use only CMKs in us-west-2.


Because this example KMS Regional Discovery keyring specifies a client for the us-west-2 AWS Region, not a CMK ARN, the AWS Encryption SDK will only try to decrypt any encrypted data key it finds that was encrypted using any CMK in us-west-2. If, for some reason, none of the encrypted data keys was encrypted using a CMK in us-west-2, or the application decrypting the data doesn’t have permission to use CMKs in us-west-2, the AWS Encryption SDK call to decrypt the message with this keyring fails and fails fast. This may provide you with more options for deterministic error handling.

Keep in mind that the KMS Regional Discovery keyring allows the AWS Encryption SDK to try the CMK for each encrypted data key in the specified AWS Region. However, AWS KMS never uses a CMK until it verifies that the caller has permission to perform the requested operation. If the application doesn’t have kms:Decrypt permission for any of the CMKs that were used to encrypt the data keys, decryption fails.


Encrypting KMS data keys using multiple CMKs provides a variety of options to decrypt ciphertexts to meet your security, auditing, and latency requirements. My examples show how encrypted messages can be decrypted by using AWS KMS CMKs in multiple AWS Regions. You can also use the Encryption SDK with master keys supplied by a custom key management infrastructure independent of AWS.

The AWS Encryption SDK’s portable and interoperable encrypted message format makes it easier to combine multiple encrypted data keys with your encrypted data to support the decryption access scheme you want. The AWS Encryption SDK for C brings these utilities to a new, broader set of platform and application environments to complement the existing Java and Python versions.

You can find the AWS Encryption SDK for C on GitHub.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Crypto Tools forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Liz Roth

Liz is a Senior Software Development Engineer at Amazon Web Services. She has been at Amazon for more than 8 years and has more than 10 years of industry experience across a variety of areas, including security, networks, and operations.

How to Encrypt Amazon S3 Objects with the AWS SDK for Ruby

Post Syndicated from Doug Schwartz original https://aws.amazon.com/blogs/security/how-to-encrypt-amazon-s3-objects-with-the-aws-sdk-for-ruby/

AWS KMS image

Recently, Amazon announced some new Amazon S3 encryption and security features. The AWS Blog post showed how to use the Amazon S3 console to take advantage of these new features. However, if you have a large number of Amazon S3 buckets, using the console to implement these features could take hours, if not days. As an alternative, I created documentation topics in the AWS SDK for Ruby Developer Guide that include code examples showing you how to use the new Amazon S3 encryption features using the AWS SDK for Ruby.

What are my encryption options?

You can encrypt Amazon S3 bucket objects on a server or on a client:

  • When you encrypt objects on a server, you request that Amazon S3 encrypt the objects before saving them to disk in data centers and decrypt the objects when you download them. The main advantage of this approach is that Amazon S3 manages the entire encryption process.
  • When you encrypt objects on a client, you encrypt the objects before you upload them to Amazon S3. In this case, you manage the encryption process, the encryption keys, and related tools. Use this option when:
    • Company policy and standards require it.
    • You already have a development process in place that meets your needs.

    Encrypting on the client has always been available, but you should know the following points:

    • You must be diligent about protecting your encryption keys, which is analogous to having a burglar-proof lock on your front door. If you leave a key under the mat, your security is compromised.
    • If you lose your encryption keys, you won’t be able to decrypt your data.

    If you encrypt objects on the client, we strongly recommend that you use an AWS Key Management Service (AWS KMS) managed customer master key (CMK)

How to use encryption on a server

You can specify that Amazon S3 automatically encrypts objects as you upload them to a bucket or require that objects uploaded to an Amazon S3 bucket include encryption on a server before they are uploaded to an Amazon S3 bucket.

The advantage of these settings is that when you specify them, you ensure that objects uploaded to Amazon S3 are encrypted. Alternatively, you can have Amazon S3 encrypt individual objects on the server as you upload them to a bucket or encrypt them on the server with your own key as you upload them to a bucket.

The AWS SDK for Ruby Developer Guide now contains the following topics that explain your encryption options on a server:

How to use encryption on a client

You can encrypt objects on a client before you upload them to a bucket and decrypt them after you download them from a bucket by using the Amazon S3 encryption client.

The AWS SDK for Ruby Developer Guide now contains the following topics that explain your encryption options on the client:

Note: The Amazon S3 encryption client in the AWS SDK for Ruby is compatible with other Amazon S3 encryption clients, but it is not compatible with other AWS client-side encryption libraries, including the AWS Encryption SDK and the Amazon DynamoDB encryption client for Java. Each library returns a different ciphertext (“encrypted message”) format, so you can’t use one library to encrypt objects and a different library to decrypt them. For more information, see Protecting Data Using Client-Side Encryption.

If you have comments about this blog post, submit them in the “Comments” section below. If you have questions about encrypting objects on servers and clients, start a new thread on the Amazon S3 forum or contact AWS Support.

– Doug

New AWS DevOps Blog Post: How to Help Secure Your Code in a Cross-Region/Cross-Account Deployment Solution on AWS

Post Syndicated from Craig Liebendorfer original https://aws.amazon.com/blogs/security/new-aws-devops-blog-post-how-to-help-secure-your-code-in-a-cross-regioncross-account-deployment-solution/

Security image

You can help to protect your data in a number of ways while it is in transit and at rest, such as by using Secure Sockets Layer (SSL) or client-side encryption. AWS Key Management Service (AWS KMS) is a managed service that makes it easy for you to create, control, rotate, and use your encryption keys. AWS KMS allows you to create custom keys, which you can share with AWS Identity and Access Management users and roles in your AWS account or in an AWS account owned by someone else.

In a new AWS DevOps Blog post, BK Chaurasiya describes a solution for building a cross-region/cross-account code deployment solution on AWS. BK explains options for helping to protect your source code as it travels between regions and between AWS accounts.

For more information, see the full AWS DevOps Blog post.

– Craig

Secure Amazon EMR with Encryption

Post Syndicated from Sai Sriparasa original https://aws.amazon.com/blogs/big-data/secure-amazon-emr-with-encryption/

In the last few years, there has been a rapid rise in enterprises adopting the Apache Hadoop ecosystem for critical workloads that process sensitive or highly confidential data. Due to the highly critical nature of the workloads, the enterprises implement certain organization/industry wide policies and certain regulatory or compliance policies. Such policy requirements are designed to protect sensitive data from unauthorized access.

A common requirement within such policies is about encrypting data at-rest and in-flight. Amazon EMR uses “security configurations” to make it easy to specify the encryption keys and certificates, ranging from AWS Key Management Service to supplying your own custom encryption materials provider.

You create a security configuration that specifies encryption settings and then use the configuration when you create a cluster. This makes it easy to build the security configuration one time and use it for any number of clusters.


In this post, I go through the process of setting up the encryption of data at multiple levels using security configurations with EMR. Before I dive deep into encryption, here are the different phases where data needs to be encrypted.

Data at rest

  • Data residing on Amazon S3—S3 client-side encryption with EMR
  • Data residing on disk—the Amazon EC2 instance store volumes (except boot volumes) and the attached Amazon EBS volumes of cluster instances are encrypted using Linux Unified Key System (LUKS)

Data in transit

  • Data in transit from EMR to S3, or vice versa—S3 client side encryption with EMR
  • Data in transit between nodes in a cluster—in-transit encryption via Secure Sockets Layer (SSL) for MapReduce and Simple Authentication and Security Layer (SASL) for Spark shuffle encryption
  • Data being spilled to disk or cached during a shuffle phase—Spark shuffle encryption or LUKS encryption

Encryption walkthrough

For this post, you create a security configuration that implements encryption in transit and at rest. To achieve this, you create the following resources:

  • KMS keys for LUKS encryption and S3 client-side encryption for data exiting EMR to S3
  • SSL certificates to be used for MapReduce shuffle encryption
  • The environment into which the EMR cluster is launched. For this post, you launch EMR in private subnets and set up an S3 VPC endpoint to get the data from S3.
  • An EMR security configuration

All of the scripts and code snippets used for this walkthrough are available on the aws-blog-emrencryption GitHub repo.

Generate KMS keys

For this walkthrough, you use AWS KMS, a managed service that makes it easy for you to create and control the encryption keys used to encrypt your data and disks.

You generate two KMS master keys, one for S3 client-side encryption to encrypt data going out of EMR and the other for LUKS encryption to encrypt the local disks. The Hadoop MapReduce framework uses HDFS. Spark uses the local file system on each slave instance for intermediate data throughout a workload, where data could be spilled to disk when it overflows memory.

To generate the keys, use the kms.json AWS CloudFormation script.  As part of this script, provide an alias name, or display name, for the keys. An alias must be in the “alias/aliasname” format, and can only contain alphanumeric characters, an underscore, or a dash.


After you finish generating the keys, the ARNs are available as part of the outputs.


Generate SSL certificates

The SSL certificates allow the encryption of the MapReduce shuffle using HTTPS while the data is in transit between nodes.


For this walkthrough, use OpenSSL to generate a self-signed X.509 certificate with a 2048-bit RSA private key that allows access to the issuer’s EMR cluster instances. This prompts you to provide subject information to generate the certificates.

Use the cert-create.sh script to generate SSL certificates that are compressed into a zip file. Upload the zipped certificates to S3 and keep a note of the S3 prefix. You use this S3 prefix when you build your security configuration.


This example is a proof-of-concept demonstration only. Using self-signed certificates is not recommended and presents a potential security risk. For production systems, use a trusted certification authority (CA) to issue certificates.

To implement certificates from custom providers, use the TLSArtifacts provider interface.

Build the environment

For this walkthrough, launch an EMR cluster into a private subnet. If you already have a VPC and would like to launch this cluster into a public subnet, skip this section and jump to the Create a Security Configuration section.

To launch the cluster into a private subnet, the environment must include the following resources:

  • VPC
  • Private subnet
  • Public subnet
  • Bastion
  • Managed NAT gateway
  • S3 VPC endpoint

As the EMR cluster is launched into a private subnet, you need a bastion or a jump server to SSH onto the cluster. After the cluster is running, you need access to the Internet to request the data keys from KMS. Private subnets do not have access to the Internet directly, so route this traffic via the managed NAT gateway. Use an S3 VPC endpoint to provide a highly reliable and a secure connection to S3.


In the CloudFormation console, create a new stack for this environment and use the environment.json CloudFormation template to deploy it.

As part of the parameters, pick an instance family for the bastion and an EC2 key pair to be used to SSH onto the bastion. Provide an appropriate stack name and add the appropriate tags. For example, the following screenshot is the review step for a stack that I created.


After creating the environment stack, look at the Output tab and make a note of the VPC ID, bastion, and private subnet IDs, as you will use them when you launch the EMR cluster resources.


Create a security configuration

The final step before launching the secure EMR cluster is to create a security configuration. For this walkthrough, create a security configuration with S3 client-side encryption using EMR, and LUKS encryption for local volumes using the KMS keys created earlier. You also use the SSL certificates generated and uploaded to S3 earlier for encrypting the MapReduce shuffle.


Launch an EMR cluster

Now, you can launch an EMR cluster in the private subnet. First, verify that the service role being used for EMR has access to the AmazonElasticMapReduceRole managed service policy. The default service role is EMR_DefaultRole. For more information, see Configuring User Permissions Using IAM Roles.

From the Build an environment section, you have the VPC ID and the subnet ID for the private subnet into which the EMR cluster should be launched. Select those values for the Network and EC2 Subnet fields. In the next step, provide a name and tags for the cluster.


The last step is to select the private key, assign the security configuration that was created in the Create a security configuration section, and choose Create Cluster.


Now that you have the environment and the cluster up and running, you can get onto the master node to run scripts. You need the IP address, which you can retrieve from the EMR console page. Choose Hardware, Master Instance group and note the private IP address of the master node.


As the master node is in a private subnet, SSH onto the bastion instance first and then jump from the bastion instance to the master node. For information about how to SSH onto the bastion and then to the Hadoop master, open the ssh-commands.txt file. For more information about how to get onto the bastion, see the Securely Connect to Linux Instances Running in a Private Amazon VPC post.

After you are on the master node, bring your own Hive or Spark scripts. For testing purposes, the GitHub /code directory includes the test.py PySpark and test.q Hive scripts.


As part of this post, I’ve identified the different phases where data needs to be encrypted and walked through how data in each phase can be encrypted. Then, I described a step-by-step process to achieve all the encryption prerequisites, such as building the KMS keys, building SSL certificates, and launching the EMR cluster with a strong security configuration. As part of this walkthrough, you also secured the data by launching your cluster in a private subnet within a VPC, and used a bastion instance for access to the EMR cluster.

If you have questions or suggestions, please comment below.

About the Author

sai_90Sai Sriparasa is a Big Data Consultant for AWS Professional Services. He works with our customers to provide strategic & tactical big data solutions with an emphasis on automation, operations & security on AWS. In his spare time, he follows sports and current affairs.





Implementing Authorization and Auditing using Apache Ranger on Amazon EMR









New – At-Rest and In-Transit Encryption for Amazon EMR

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-at-rest-and-in-transit-encryption-for-amazon-emr/

Our customers use Amazon EMR (including Apache Hadoop and the full range of tools that make up the Apache Spark ecosystem) to handle many types of mission-critical big data use cases. For example:

  • Yelp processes over a terabyte of log files and photos every day.
  • Expedia processes streams of clickstream, user interaction, and supply data.
  • FINRA analyzes billions of brokerage transaction records daily.
  • DataXu evaluates 30 trillion ad opportunities monthly.

Because customers like these (see our big data use cases for many others) are processing data that is mission-critical and often sensitive, they need to keep it safe and sound.

We already offer several data encryption options for EMR including server and client side encryption for Amazon S3 with EMRFS and Transparent Data Encryption for HDFS. While these solutions do a good job of protecting data at rest, they do not address data stored in temporary files or data that is in flight, moving between job steps. Each of these encryption options must be individually enabled and configured, making the process of implementing encryption more tedious that it need be.

It is time to change this!

New Encryption Support
Today we are launch a new, comprehensive encryption solution for EMR. You can now easily enable at-rest and in-transit encryption for Apache Spark, Apache Tez, and Hadoop MapReduce on EMR.

The at-rest encryption addresses the following types of storage:

  • Data stored in S3 via EMRFS.
  • Data stored in the local file system of each node.
  • Data stored on the cluster using HDFS.

The in-transit encryption makes use of the open-source encryption features native to the following frameworks:

  • Apache Spark
  • Apache Tez
  • Apache Hadoop MapReduce

This new feature can be configured using an Amazon EMR security configuration.  You can create a configuration from the EMR Console, the EMR CLI, or via the EMR API.

The EMR Console now includes a list of security configurations:

Click on Create to make a new one:

Enter a name, and then choose the desired mode and type for each aspect of this new feature. Based on the mode or the type, the console will prompt you for additional information.

S3 Encryption:

Local disk encryption:

In-transit encryption:

If you choose PEM as the certificate provider type, you will need to enter the S3 location of a ZIP file that contains the PEM file(s) that you want to use for encryption. If you choose Custom, you will need to enter the S3 location of a JAR file and the class name of the custom certificate provider.

After you make all of your choices and click on Create, your security configuration will appear in the console:

You can then specify the configuration when you create a new EMR Cluster. This feature is available for clusters that are running Amazon EMR release 4.8.0 or 5.0.0. To learn more, read about Amazon EMR Encryption with Security Configurations.