Tag Archives: AWS KMS

The importance of encryption and how AWS can help

Post Syndicated from Ken Beer original https://aws.amazon.com/blogs/security/importance-of-encryption-and-how-aws-can-help/

Encryption is a critical component of a defense-in-depth strategy, a security approach that layers a series of defensive mechanisms so that if one mechanism fails, at least one more is still operating. As more organizations look to operate faster and at scale, they need ways to meet critical compliance requirements and improve data security. Encryption, when used correctly, can provide an additional layer of protection above basic access control.

How and why does encryption work?

Encryption works by using an algorithm with a key to convert readable data (plaintext) into unreadable data (ciphertext) that can only become readable again with the right key. For example, a simple phrase like “Hello World!” may look like “1c28df2b595b4e30b7b07500963dc7c” when encrypted. There are several different types of encryption algorithms, all using different types of keys. A strong encryption algorithm relies on mathematical properties to produce ciphertext that can’t be decrypted using any practically available amount of computing power without also having the necessary key. Therefore, protecting and managing the keys becomes a critical part of any encryption solution.
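To make this concrete, here is a minimal sketch using the OpenSSL command line (the file names are hypothetical). It encrypts a file with AES-256 under a randomly generated secret; without the secret, the ciphertext cannot be read:


# Generate 32 random bytes (hex-encoded) to serve as the secret key material
openssl rand -hex 32 > aes.key

# Encrypt with AES-256, deriving the cipher key from the secret via PBKDF2
openssl enc -aes-256-cbc -pbkdf2 -in plaintext.txt -out ciphertext.bin -pass file:./aes.key

# Decryption succeeds only with the same secret
openssl enc -d -aes-256-cbc -pbkdf2 -in ciphertext.bin -out recovered.txt -pass file:./aes.key

Anyone who obtains ciphertext.bin without aes.key sees only unreadable bytes.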

Encryption as part of your security strategy

An effective security strategy begins with stringent access control and continuous work to define the least privilege necessary for persons or systems accessing data. AWS requires that you manage your own access control policies, and also supports defense in depth to achieve the best possible data protection.

Encryption is a critical component of a defense-in-depth strategy because it can mitigate weaknesses in your primary access control mechanism. What if an access control mechanism fails and allows access to the raw data on disk or traveling along a network link? If the data is encrypted using a strong key, as long as the decryption key is not on the same system as your data, it is computationally infeasible for an attacker to decrypt your data. To show how infeasible it is, let’s consider the Advanced Encryption Standard (AES) with 256-bit keys (AES-256). It’s the strongest industry-adopted and government-approved algorithm for encrypting data. AES-256 is the technology we use to encrypt data in AWS, including Amazon Simple Storage Service (S3) server-side encryption. It would take at least a trillion years to break using current computing technology. Current research suggests that even the future availability of quantum-based computing won’t sufficiently reduce the time it would take to break AES encryption.

But what if you mistakenly create overly permissive access policies on your data? A well-designed encryption and key management system can also prevent this from becoming an issue, because it separates access to the decryption key from access to your data.

Requirements for an encryption solution

To get the most from an encryption solution, you need to think about two things:

  1. Protecting keys at rest: Are the systems using encryption keys secured so the keys can never be used outside the system? In addition, do these systems implement encryption algorithms correctly to produce strong ciphertexts that cannot be decrypted without access to the right keys?
  2. Independent key management: Is the authorization to use encryption independent from how access to the underlying data is controlled?

There are third-party solutions that you can bring to AWS to meet these requirements. However, these systems can be difficult and expensive to operate at scale. AWS offers a range of options to simplify encryption and key management.

Protecting keys at rest

When you use third-party key management solutions, it can be difficult to gauge the risk of your plaintext keys leaking and being used outside the solution. The keys have to be stored somewhere, and you can’t always know or audit all the ways those storage systems are secured from unauthorized access. The combination of technical complexity and the necessity of making the encryption usable without degrading performance or availability means that choosing and operating a key management solution can present difficult tradeoffs. The best practice to maximize key security is using a hardware security module (HSM). This is a specialized computing device that has several security controls built into it to prevent encryption keys from leaving the device in a way that could allow an adversary to access and use those keys.

One such control in modern HSMs is tamper response, in which the device detects physical or logical attempts to access plaintext keys without authorization, and destroys the keys before the attack succeeds. Because you can’t install and operate your own hardware in AWS datacenters, AWS offers two services using HSMs with tamper response to protect customers’ keys: AWS Key Management Service (KMS), which manages a fleet of HSMs on the customer’s behalf, and AWS CloudHSM, which gives customers the ability to manage their own HSMs. Each service can create keys on your behalf, or you can import keys from your on-premises systems to be used by each service.

The keys in AWS KMS or AWS CloudHSM can be used to encrypt data directly, or to protect other keys that are distributed to applications that directly encrypt data. The technique of encrypting encryption keys is called envelope encryption, and it enables encryption and decryption to happen on the computer where the plaintext customer data exists, rather than sending the data to the HSM each time. For very large data sets (e.g., a database), it’s not practical to move gigabytes of data between the data set and the HSM for every read/write operation. Instead, envelope encryption allows a data encryption key to be distributed to the application when it’s needed. The “master” keys in the HSM are used to encrypt a copy of the data key so the application can store the encrypted key alongside the data encrypted under that key. Once the application encrypts the data, the plaintext copy of the data key can be deleted from its memory. The only way for the data to be decrypted is if the encrypted data key, which is only a few hundred bytes in size, is sent back to the HSM and decrypted.
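Here is a minimal sketch of envelope encryption using the AWS CLI, OpenSSL, and jq (the key alias and file names are hypothetical):


# One call returns the data key in two forms: plaintext, for local use, and
# encrypted under the KMS master key, for storage alongside the data
KEYJSON=$(aws kms generate-data-key --key-id alias/my-master-key --key-spec AES_256)
echo "$KEYJSON" | jq -r .Plaintext      | base64 --decode > datakey.bin
echo "$KEYJSON" | jq -r .CiphertextBlob | base64 --decode > datakey.enc

# Encrypt the data locally with the plaintext data key, then delete that key
openssl enc -aes-256-cbc -pbkdf2 -in mydata.txt -out mydata.enc -pass file:./datakey.bin
rm datakey.bin

# Later, send only the small encrypted data key back to KMS to recover it
aws kms decrypt --ciphertext-blob fileb://datakey.enc \
    --output text --query Plaintext | base64 --decode > datakey.bin

Note that only the encrypted data key, a few hundred bytes, ever travels to KMS; the data itself never leaves the machine.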

The process of envelope encryption is used in all AWS services in which data is encrypted on a customer’s behalf (which is known as server-side encryption) to minimize performance degradation. If you want to encrypt data in your own applications (client-side encryption), you’re encouraged to use envelope encryption with AWS KMS or AWS CloudHSM. Both services offer client libraries and SDKs that let you add encryption functionality to your application code while using the cryptographic functionality of each service. The AWS Encryption SDK is an example of a tool that can be used anywhere, not just in applications running in AWS.

Because implementing encryption algorithms and HSMs is critical to get right, all vendors of HSMs should have their products validated by a trusted third party. HSMs in both AWS KMS and AWS CloudHSM are validated under the National Institute of Standards and Technology’s FIPS 140-2 program, the standard for evaluating cryptographic modules. This program validates the secure design and implementation of cryptographic modules, including functions related to ports and interfaces, authentication mechanisms, physical security and tamper response, operational environments, cryptographic key management, and electromagnetic interference/electromagnetic compatibility (EMI/EMC). Encryption using a FIPS 140-2 validated cryptographic module is often a requirement for other security-related compliance schemes like FedRAMP and HIPAA-HITECH in the U.S., or the international payment card industry standard (PCI-DSS).

Independent key management

While AWS KMS and AWS CloudHSM can protect plaintext master keys on your behalf, you are still responsible for managing access controls to determine who can cause which encryption keys to be used under which conditions. One advantage of using AWS KMS is that the policy language you use to define access controls on keys is the same one you use to define access to all other AWS resources. Note that the language is the same, not the actual authorization controls. You need a mechanism for managing access to keys that is different from the one you use for managing access to your data. AWS KMS provides that mechanism by allowing you to assign one set of administrators who can only manage keys and a different set of administrators who can only manage access to the underlying encrypted data. Configuring your key management process in this way helps provide the separation of duties you need to avoid accidentally escalating privilege to decrypt data to unauthorized users. For even further separation of control, AWS CloudHSM offers an independent policy mechanism to define access to keys.

Even with the ability to separate key management from data management, you can still verify that you have configured access to encryption keys correctly. AWS KMS is integrated with AWS CloudTrail so you can audit who used which keys, for which resources, and when. This provides granular visibility into your encryption management processes, which is typically much more in-depth than on-premises audit mechanisms. Audit events from AWS CloudHSM can be sent to Amazon CloudWatch, the AWS service for monitoring and alarming on the solutions you operate in AWS.
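For example, you can list recent AWS KMS API activity from the AWS CLI (a sketch; CloudTrail must be enabled in the Region you query):


# Show the ten most recent CloudTrail events generated by AWS KMS
aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventSource,AttributeValue=kms.amazonaws.com \
    --max-results 10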

Encrypting data at rest and in motion

All AWS services that handle customer data encrypt data in motion and provide options to encrypt data at rest. All AWS services that offer encryption at rest using AWS KMS or AWS CloudHSM use AES-256. None of these services store plaintext encryption keys at rest — that’s a function that only AWS KMS and AWS CloudHSM may perform using their FIPS 140-2 validated HSMs. This architecture helps minimize the unauthorized use of keys.

When encrypting data in motion, AWS services use the Transport Layer Security (TLS) protocol to provide encryption between your application and the AWS service. Most commercial solutions use an open source project called OpenSSL for their TLS needs. OpenSSL has roughly 500,000 lines of code with at least 70,000 of those implementing TLS. The code base is large, complex, and difficult to audit. Moreover, when OpenSSL has bugs, the global developer community is challenged to not only fix and test the changes, but also to ensure that the resulting fixes themselves do not introduce new flaws.

AWS’s response to challenges with the TLS implementation in OpenSSL was to develop our own implementation of TLS, known as s2n, or signal to noise, which we designed to be small and fast. We released s2n in June 2015. The goal of s2n is to provide you with network encryption that is easier to understand and that is fully auditable. We released it under the Apache 2.0 license and host it on GitHub.

We also designed s2n to be analyzed using automated reasoning to test for safety and correctness using mathematical logic. Through this process, known as formal methods, we verify the correctness of the s2n code base every time we change the code. We also automated these mathematical proofs, which we regularly re-run to ensure the desired security properties are unchanged with new releases of the code. Automated mathematical proofs of correctness are an emerging trend in the security industry, and AWS uses this approach for a wide variety of our mission-critical software.

Implementing TLS requires using encryption keys and digital certificates that assert the ownership of those keys. AWS Certificate Manager and AWS Private Certificate Authority are two services that can simplify the issuance and rotation of digital certificates across your infrastructure that needs to offer TLS endpoints. Both services use a combination of AWS KMS and AWS CloudHSM to generate and/or protect the keys used in the digital certificates they issue.

Summary

At AWS, security is our top priority and we aim to make it as easy as possible for you to use encryption to protect your data above and beyond basic access control. By building and supporting encryption tools that work both on and off the cloud, we help you secure your data and ensure compliance across your entire environment. We put security at the center of everything we do to make sure that you can protect your data using best-of-breed security technology in a cost-effective way.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS KMS forum or the AWS CloudHSM forum, or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Ken Beer

Ken is the General Manager of the AWS Key Management Service. Ken has worked in identity and access management, encryption, and key management for over 7 years at AWS. Before joining AWS, Ken was in charge of the network security business at Trend Micro. Before Trend Micro, he was at Tumbleweed Communications. Ken has spoken on a variety of security topics at events such as the RSA Conference, the DoD PKI User’s Forum, and AWS re:Invent.

How to verify AWS KMS asymmetric key signatures locally with OpenSSL

Post Syndicated from J.D. Bean original https://aws.amazon.com/blogs/security/how-to-verify-aws-kms-asymmetric-key-signatures-locally-with-openssl/

In this post, I demonstrate a sample workflow for generating a digital signature within AWS Key Management Service (KMS) and then verifying that signature on a client machine using OpenSSL.

The support for asymmetric keys in AWS KMS has exciting use cases. The ability to create, manage, and use public and private key pairs with KMS enables you to perform digital signing operations using RSA and Elliptic Curve (ECC) keys. You can also perform public key encryption or decryption operations using RSA keys.

For example, you can use ECC or RSA private keys to generate digital signatures. Third parties can perform verification outside of AWS KMS using the corresponding public keys. Similarly, AWS customers and third parties can perform unauthenticated encryption outside of AWS KMS using an RSA public key, and still enforce authenticated decryption within AWS KMS using the corresponding private key.

The commands found in this tutorial were tested using Amazon Linux 2. Other Linux, macOS, or Unix operating systems are likely to work with minimal modification but have not been tested.

Creating an asymmetric signing key pair

To start, create an asymmetric customer master key (CMK) using the AWS Command Line Interface (AWS CLI) example command below. This generates an RSA 4096 key for signature creation and verification using AWS KMS.


aws kms create-key --customer-master-key-spec RSA_4096 \
 --key-usage SIGN_VERIFY \
 --description "Sample Digital Signature Key Pair"

If successful, this command returns a KeyMetadata object. Take note of the KeyID value. As a best practice, I recommend assigning an alias for your key. The command below assigns an alias of sample-sign-verify-key to your newly created CMK (replace the target-key-id value of <1234abcd-12ab-34cd-56ef-1234567890ab> with your KeyID).


aws kms create-alias \
    --alias-name alias/sample-sign-verify-key \
    --target-key-id <1234abcd-12ab-34cd-56ef-1234567890ab> 

Creating signer and verifier roles

For the next phase of this tutorial, you must create two AWS principals: a signer role and a verifier role. First, navigate to the AWS Identity and Access Management (IAM) Create role console dialog and choose the option that allows entities in a specified account to assume the role. Enter your Account ID and select Next: Permissions, as shown in Figure 1 below.

Figure 1: Enter your Account ID to begin creating a role in AWS IAM

Select Next through the next two screens. On the fourth and final screen, enter a Role name of SignerRole and Role description, as shown in Figure 2 below.

Figure 2: Enter a role name and description to finish creating the role

Select Create role to finish creating the signer role. To create the verifier role, you must perform this same process one more time. On the final screen, provide the name OfflineVerifierRole for the role instead.

Configuring key policy permissions

A best practice is to adhere to the principle of least privilege and provide each AWS principal with the minimal permissions necessary to perform its tasks. The signer and verifier roles that you created currently have no permissions in your account. The signer principal must have permission to create digital signatures in KMS for files using the private portion of your CMK. The verifier principal must have permission to download the plaintext public key portion of your CMK.

To provide access control permissions for KMS actions to your AWS principals, attach a key policy to the CMK. The IAM role for the signer principal (SignerRole) is given kms:Sign permission in the CMK key policy. The IAM role for the verifier principal (OfflineVerifierRole) is given kms:GetPublicKey permission in the CMK key policy.

Navigate to the KMS page in the AWS Console and select customer-managed keys. Next, select your CMK, scroll down to the key policy section, and select edit.

To allow your signer principal to use the CMK for digital signing, append the following stanza to the key policy (replace the account ID value of <111122223333> with your own):


{
    "Sid": "Allow use of the key pair for digital signing",
    "Effect": "Allow",
    "Principal": {"AWS":"arn:aws:iam::<111122223333>:role/SignerRole"},
    "Action": "kms:Sign",
    "Resource": "*"
}

To allow your verifier principal to download the CMK public key, append the following stanza to the key policy (replace the account ID value of <111122223333> with your own):


{
    "Sid": "Allow plaintext download of the public portion of the key pair",
    "Effect": "Allow",
    "Principal": {"AWS":"arn:aws:iam::<111122223333>:role/OfflineVerifierRole"},
    "Action": "kms:GetPublicKey",
    "Resource": "*"
}

You can permit the verifier to perform digital signature verification using KMS by granting the kms:Verify action. However, the kms:GetPublicKey action enables the verifier principal to download the CMK public key in plaintext to verify the signature in a local environment without access to AWS KMS.

Although you configured the policy for a verifier principal within your own account, you can also configure the policy for a verifier principal in a separate account to validate signatures generated by your CMK.

Because AWS KMS enables the verifier principal to download the CMK public key in plaintext, you can also use the verifier principal you configure to download the public key and distribute it to third parties, whether or not they have AWS security credentials, via, for example, an S3 presigned URL.
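For example, after uploading the public key to an S3 bucket you control (the bucket name here is hypothetical), you could generate a time-limited presigned URL to share with a third party:


# Produce a URL that lets anyone holding it download the object for one hour
aws s3 presign s3://example-keys-bucket/SamplePublicKey.pem --expires-in 3600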

Signing a message

To demonstrate signature verification, you need KMS to sign a file with your CMK using the KMS Sign API. KMS signatures can be generated directly for messages of up to 4096 bytes. To sign a larger message, you can generate a hash digest of the message, and then provide the hash digest to KMS for signing.

For messages up to 4096 bytes, you first create a text file containing a short message of your choosing, which we refer to as SampleText.txt. To sign the file, you must assume your signer role. To do so, execute the following command, but replace the account ID value of <111122223333> with your own:


aws sts assume-role \
--role-arn arn:aws:iam::<111122223333>:role/SignerRole \
--role-session-name AWSCLI-Session

The return values provide an access key ID, secret key, and session token. Substitute these values into their respective fields in the following command and execute it:


export AWS_ACCESS_KEY_ID=<ExampleAccessKeyID1>
export AWS_SECRET_ACCESS_KEY=<ExampleSecretKey1>
export AWS_SESSION_TOKEN=<ExampleSessionToken1>

Then confirm that you have successfully assumed the signer role by issuing:


aws sts get-caller-identity

If the output of this command contains the text assumed-role/SignerRole then you have successfully assumed the signer role and you may sign your message file with:


aws kms sign \
    --key-id alias/sample-sign-verify-key \
    --message-type RAW \
    --signing-algorithm RSASSA_PKCS1_V1_5_SHA_512 \
    --message fileb://SampleText.txt \
    --output text \
    --query Signature | base64 --decode > SampleText.sig

To indicate that the file is a message and not a message digest, the command passes a MessageType parameter of RAW. The command then decodes the signature and writes it to a local disk as SampleText.sig. This file is important later when you want to verify the signature entirely client-side without calling AWS KMS.
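For messages larger than 4096 bytes, here is a sketch of the digest approach mentioned earlier: hash the file locally with OpenSSL, then pass the digest to KMS with a MessageType of DIGEST. The resulting signature verifies with the same OpenSSL command used later in this post:


# Compute the SHA-512 digest locally, then have KMS sign the digest
openssl dgst -sha512 -binary SampleText.txt > SampleText.digest

aws kms sign \
    --key-id alias/sample-sign-verify-key \
    --message-type DIGEST \
    --signing-algorithm RSASSA_PKCS1_V1_5_SHA_512 \
    --message fileb://SampleText.digest \
    --output text \
    --query Signature | base64 --decode > SampleText.sig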

Finally, to drop your assumed role, you may issue:


unset \
	AWS_ACCESS_KEY_ID \
	AWS_SECRET_ACCESS_KEY \
	AWS_SESSION_TOKEN

followed by:


aws sts get-caller-identity

Verifying a signature client-side

Assume your verifier role using the same process as before and issue the following command to fetch a copy of the public portion of your CMK from AWS KMS:


aws kms get-public-key \
 --key-id alias/sample-sign-verify-key \
 --output text \
 --query PublicKey | base64 --decode > SamplePublicKey.der

This command writes the DER-encoded X.509 public key to disk with the name SamplePublicKey.der. You can convert this DER-encoded key to a PEM-encoded key by running the following command:


openssl rsa -pubin -inform DER \
    -outform PEM -in SamplePublicKey.der \
    -pubout -out SamplePublicKey.pem

You now have the following three files:

  1. A PEM file, SamplePublicKey.pem, containing the CMK public key
  2. The original SampleText.txt file
  3. The SampleText.sig file that you generated in KMS using the CMK private key

With these three inputs, you can now verify the signature entirely client-side without calling AWS KMS. To verify the signature, run the following command:


openssl dgst -sha512 \
    -verify SamplePublicKey.pem \
    -signature SampleText.sig \
    SampleText.txt

If you performed all of the steps correctly, you see the following message on your console:


 Verified OK

This successful verification provides a high degree of confidence that the message was endorsed by a principal with permission to sign using your KMS CMK (authentication) — in this case, your signer role principal. It also verifies that the message has not been modified in transit (integrity).

To demonstrate this, update your SampleText.txt file by adding new characters to the file. If you rerun the command, you see the following message:


Verification Failure

Summary

In this tutorial, you verified the authenticity of a digital signature generated by a KMS asymmetric key pair on your local machine. You did this by using OpenSSL and a plaintext public key exported from KMS.

You created an asymmetric CMK in KMS and configured key policy permissions for your signer and verifier principals. You then digitally signed a message in KMS using the private portion of your asymmetric CMK. Then, you exported a copy of the public portion of your asymmetric key pair from KMS in plaintext. With this copy of your public key, you were able to perform signature verification using OpenSSL entirely in your local environment. A similar pattern can be applied with an asymmetric CMK configured with a KeyUsage value of ENCRYPT_DECRYPT, performing encryption operations in a local environment and decryption operations in AWS KMS.
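As a sketch of that encryption pattern, assuming a CMK created with a KeyUsage of ENCRYPT_DECRYPT, an RSA key spec, and a hypothetical alias, you could encrypt a small payload locally with the downloaded public key and decrypt it within AWS KMS:


# Encrypt locally using only the downloaded public key (OAEP with SHA-256);
# RSA encryption is limited to payloads smaller than the key modulus
openssl pkeyutl -encrypt -pubin -inkey SamplePublicKey.pem \
    -pkeyopt rsa_padding_mode:oaep -pkeyopt rsa_oaep_md:sha256 \
    -pkeyopt rsa_mgf1_md:sha256 \
    -in Secret.txt -out Secret.enc

# Decrypt in AWS KMS, where the private key never leaves the HSMs
aws kms decrypt --key-id alias/sample-encrypt-decrypt-key \
    --encryption-algorithm RSAES_OAEP_SHA_256 \
    --ciphertext-blob fileb://Secret.enc \
    --output text --query Plaintext | base64 --decode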

To learn more about the asymmetric keys feature of KMS, please read the KMS Developer Guide. If you’re considering implementing an architecture involving downloading public keys, be sure to refer to the KMS Developer Guide for Special Considerations for Downloading Public Keys. If you have questions about the asymmetric keys feature, please start a new thread on the AWS KMS Discussion Forum.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

J.D. Bean

J.D. Bean is a Solutions Architect at Amazon Web Services on the World Wide Public Sector Federal Financials team based out of New York City. He is passionate about his work enabling AWS customers’ successful cloud journeys, and his interests include security, privacy, and compliance. J.D. holds a Bachelor of Arts from The George Washington University and a Juris Doctor from New York University School of Law. In his spare time J.D. enjoys spending time with his family, yoga, and experimenting in his kitchen.

Manage your AWS KMS API request rates using Service Quotas and Amazon CloudWatch

Post Syndicated from Raj Copparapu original https://aws.amazon.com/blogs/security/manage-your-aws-kms-api-request-rates-using-service-quotas-and-amazon-cloudwatch/

AWS Key Management Service (KMS) publishes API usage metrics to Amazon CloudWatch and Service Quotas, allowing you to both monitor and manage your AWS KMS API request rate quotas. This functionality helps you understand trends in your usage of AWS KMS and can help prevent API request throttling as you grow your use of AWS KMS.

When you surpass your AWS KMS API request rate quotas, you receive an error “You have exceeded the rate at which you may call KMS. Reduce the frequency of your calls.” Such errors can also be caused by an increased use of AWS services that encrypt your data under keys managed in AWS KMS. For example, if you are using Amazon Redshift Spectrum, you might encounter this error – “HTTP response error code: 503 Message: SlowDown. Please reduce your request rate for operations involving AWS KMS.” Historically, in order to understand how close to a request rate quota you were, you had to perform three tasks: (i) send AWS CloudTrail events generated by AWS KMS to Amazon CloudWatch Logs; (ii) write queries in Amazon CloudWatch Logs Insights to track your API request usage; and (iii) submit an AWS Support case to request a quota increase. Now, you can view your AWS KMS API usage and request quota increases within the AWS Service Quotas console itself without doing any special configuration.

In this post, we will show you how to 1) view your KMS API utilization within Service Quotas, and 2) create a CloudWatch alarm that alerts you to an approaching quota so you can request a quota increase before you are throttled.

View your AWS KMS API utilization

Background

API utilization is the percentage rate at which you are calling a particular API compared to that API’s request rate quota in your account. For AWS KMS, the default request rate for cryptographic operations using symmetric keys is 10,000 requests per second in 6 specific AWS Regions*, aggregated across all requesting clients in an account. AWS KMS aggregates your API requests every minute and sends the count to CloudWatch, where it is consumed by AWS Service Quotas for you to see. Because quota usage is aggregated by the minute, your effective quota would be 600,000 requests per minute.

*See Request Quotas for Each AWS KMS API Operation for the specific quotas in the AWS Region in which you operate.

Scenario

Imagine that all the applications in your account using AWS KMS collectively made 100,000 requests to the Decrypt API, 100,000 requests to the GenerateDataKey API, and 100,000 requests to the Encrypt API in a minute. AWS KMS sends a count of 300,000 requests to Amazon CloudWatch for that particular minute. Your utilization for that minute will be 50% of your quota (300,000 divided by 600,000, where 600,000 is 60 seconds times your quota of 10,000 requests per second). Within the Service Quotas console, you can view utilization across several time frames, from the most recent hour up to a week.

Here are the steps to view your AWS KMS API Utilization within Service Quotas:

  1. Sign in to the AWS Management Console.
  2. Click on the “Services” dropdown in the top left corner, search for “Service Quotas”, and select it from the results.
  3. Click on the AWS Key Management Service (AWS KMS) tile on the Service Quotas dashboard.
  4. Search for “symmetric” and click on the link for “Cryptographic operations (symmetric) request rate”.
  5. The Monitoring section will display the combined utilization percentage for the following APIs – Decrypt, Encrypt, GenerateDataKey, GenerateDataKeyWithoutPlaintext, GenerateRandom, and ReEncrypt. All these APIs are grouped under the shared “Cryptographic operations (symmetric) request rate”.
  6. Adjust the graph to view the utilization trend over a week by selecting “1w” from the top right corner of the graph.

You can view the utilization for any of the other available AWS KMS APIs from the Service Quotas dashboard in a similar fashion.
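You can also inspect your applied quotas from the AWS CLI. For example, this sketch lists the KMS quotas whose names mention symmetric operations (quota names and values vary by Region):


aws service-quotas list-service-quotas --service-code kms \
    --query "Quotas[?contains(QuotaName, 'symmetric')].[QuotaName, Value]" \
    --output table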

The API utilization shows you the overall trend of your API usage. Because the requests sent from AWS KMS are aggregated per minute, you could still experience throttling errors at less than 100% utilization, especially if your usage is spiky and you do not have exponential backoff built into your applications’ error handling logic. For example, you might have surpassed the requests per second quota between the 12th second and the 15th second of the minute, but you were below the quota for the other 57 seconds of that minute.
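If you control the calling code, here is a minimal shell sketch of retry with exponential backoff (the key alias is hypothetical; production applications should rely on the AWS SDK's built-in retry and backoff configuration instead):


for attempt in 1 2 3 4 5; do
    # A call that may be throttled under heavy load
    if aws kms generate-data-key --key-id alias/my-key --key-spec AES_256 \
        --output text --query KeyId > /dev/null 2>&1; then
        break
    fi
    sleep $(( 2 ** attempt ))   # wait 2, 4, 8, 16, 32 seconds between retries
done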
 
Customizable CloudWatch graph

The utilization shown is across your entire AWS account in a given region, so if you are introducing a new application, you can monitor and see how it impacts your overall utilization. If you need a request rate quota increase before deploying your new application to production, you can request a quota increase at the top right portion of the Details section of the AWS Service Quotas page.

Create a CloudWatch Alarm

In the previous section we described how you can view historical utilization of API request rates from the Monitoring section of the AWS Service Quotas console. What if you want to be alerted when you have reached a predetermined utilization percentage so you can request a quota increase before you begin to experience extended throttling?

Here are the steps to do so:

  1. Click on the API of your interest from the Service Quotas console. In this example, let’s select Cryptographic operations (symmetric) request rate.
  2. In the Amazon CloudWatch alarms section (under the Monitoring section), click Create in the right-hand corner.
  3. From the Alarm threshold dropdown select “80% of applied quota value”.
  4. Enter “80threshold” as the Alarm name and click the orange Create button on the right side.
  5. Click on the “80threshold” link that now appears in the table. A new browser window will appear that takes you to the Amazon CloudWatch console.
  6. Click Edit on the top right corner.
  7. Leave all the default values selected on the Specify metrics and condition page and click Next on the bottom right.
  8. Click Add notification and select Create new topic under the Select an SNS topic section. Enter “SNS-Topic” as the topic name. Add your email address to receive notifications when the alarm is set. Click Create topic.
  9. Click Update alarm.
  10. Confirm your SNS subscription by clicking on View SNS Subscriptions.
  11. Select your email address endpoint and click Request confirmation.
  12. You will receive an email to confirm your subscription. Once you confirm the subscription, you are all set to receive email notifications on the new alarm.

    User interface after CloudWatch alarm created

Here are more details on creating CloudWatch alarms if you want to make additional modifications to your alarms. We recommend 80% as a good initial threshold for your alarm. When you are testing a new application, you can start with this threshold, run your application for a period of time, and monitor its utilization. When an alarm fires, you can proactively request a quota increase at the top right portion of the Details section of the AWS Service Quotas page.
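If you prefer to script the alarm, here is a sketch using the CloudWatch CLI. The AWS/Usage dimension values and the SNS topic ARN below are assumptions you should confirm in your own console, and the threshold of 480,000 is 80% of the example quota of 600,000 requests per minute:


aws cloudwatch put-metric-alarm \
    --alarm-name kms-symmetric-80-percent \
    --namespace AWS/Usage \
    --metric-name CallCount \
    --dimensions Name=Service,Value=KMS Name=Type,Value=API \
        Name=Resource,Value=CryptographicOperationsSymmetric Name=Class,Value=None \
    --statistic Sum --period 60 \
    --threshold 480000 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --evaluation-periods 1 \
    --alarm-actions arn:aws:sns:us-east-1:111122223333:SNS-Topic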

Conclusion

We’ve explored how to view your AWS KMS API request usage, how to add alarms on the most critical items in your application’s use of AWS KMS, and how to request quota increases. These items provide visibility and control over how your applications interact with AWS KMS.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread in the AWS Key Management Service forums.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Raj Copparapu

Raj Copparapu is a Senior Product Manager Technical on the AWS KMS team who focuses on defining the product roadmap to satisfy customer requirements. Raj has spent over 5 years innovating to deliver products that help customers secure their data in the cloud. In his spare time, he enjoys yoga and spending time with family.

How to use KMS and IAM to enable independent security controls for encrypted data in S3

Post Syndicated from Paco Hope original https://aws.amazon.com/blogs/security/how-to-use-kms-and-iam-to-enable-independent-security-controls-for-encrypted-data-in-s3/

Typically, when you protect data in Amazon Simple Storage Service (Amazon S3), you use a combination of Identity and Access Management (IAM) policies and S3 bucket policies to control access, and you use the AWS Key Management Service (AWS KMS) to encrypt the data. This approach is well-understood, documented, and widely implemented. However, many customers want to extend the value of encryption beyond basic protection against unauthorized access to the storage layer where the data resides. They want to enforce a separation of duties between which team manages access to the storage layer and which team manages access to the encryption keys. This model ensures that configuration errors made by only one of these teams won’t compromise the data in ways that grant unauthorized access to plaintext data. For example, if the team that owns permissions to the S3 bucket mistakenly grants access to unauthorized users, when those users attempt to access objects in S3 they will fail. Why? Because the separate team who manages access to the keys didn’t grant those users access to use the keys for decryption.

You can create this kind of independent access control by combining KMS encryption with IAM policies and S3 bucket policies. When data is encrypted with a customer-managed KMS customer master key (CMK), the key’s policy acts as an independent access control. Users can be prevented from accessing the data, even though the IAM permissions and the S3 bucket policy would permit the access. Figure 1 shows a Venn diagram of the access that is required. The bucket policy, the IAM policy, and the KMS key policy all play a role. Users have permission for the data only when they are granted permissions in all three policies.
 

Figure 1: Venn diagram showing the required permissions for access

This exercise builds the resources shown in Figure 2:

  • Three AWS IAM roles
    1. A role (1) with permission to create and manage permissions on an S3 bucket (secure-bucket-admin)
    2. A role (2) with permission to create and manage permissions on a KMS master key (secure-key-admin)
    3. A role (3) with permissions to access (but not manage) a specific S3 bucket and to use (but not manage) a specific AWS KMS customer master key (authorized-users).
  • An S3 bucket (4) with a custom bucket policy (5) that only allows data to be stored if that data is encrypted with a specific KMS key. The ability to write to or read from this bucket will be restricted to the IAM role authorized-users.
  • A KMS key (6) with a specific key policy (7) that can only be used by the IAM role authorized-users and only managed by the IAM user secure-key-admin.

 

Figure 2: Architecture diagram

When you have completed this exercise, you will have:

  • Created an S3 bucket protected by IAM policies, and a bucket policy that enforces encryption.
  • Attached the IAM role authorized-users to an EC2 instance so your applications in that instance can assume that role and access encrypted data in the S3 bucket.
  • Uploaded and downloaded data from the bucket that is protected by the KMS key.
  • Demonstrated that when the KMS key policy is modified, removing access for the IAM role authorized-users, the applications on the EC2 instance no longer have access to the data in the S3 bucket.

Set things up

For simplicity, I create the S3 bucket, KMS keys, and EC2 instances all in the same region and in the same AWS account. It’s possible to use KMS keys that are owned by a different AWS account, to assume roles across accounts, and to have instances in different regions from the buckets and the keys. I discuss those variations at the end.

I assume you have at least one administrator identity available to you already: one that has broad rights for creating users, creating roles, managing KMS keys, and launching EC2 instances. I will refer to this as your “Admin identity” throughout these instructions. This can be a federated identity (for example, from your corporate identity provider or from a social identity), or it can be an AWS IAM user.

Assuming Roles

Throughout this exercise I will use IAM roles to acquire and release privileges. If you’re working from the AWS command line, you’ll need to configure your command line environment to use profiles. If you’re working from the AWS Management Console, then you’ll follow these instructions to switch role. If you haven’t worked with roles before, take a minute to follow those instructions and become familiar with it before continuing.

Step 1: Create IAM policies

First, I will create three policies that grant very specific sets of rights. Then, I will attach those policies to roles: two roles for administrators, and one for software running on EC2 instances. You’re going to create an S3 bucket in Step 3. That bucket, like all S3 buckets, needs a globally unique name. You will reference that bucket’s name in these policies, even though you will create the bucket later. Decide the name of your bucket now. When you reach steps that require you to type or paste a JSON policy document for your bucket policy, remember to use the name of your bucket where I have written secure-demo-bucket.

Step 1a: Create the S3 bucket management policy

While logged in to the console as your Admin user, create an IAM policy in the web console using the JSON tab. Name the policy secure-bucket-admin. When you reach the step to type or paste a JSON policy document, paste the JSON from Listing 1 below. This policy allows broad S3 administration rights (creating, deleting, and modifying policies), so it is a high privilege policy. In an effort to be concise, it grants all permissions to S3 and then takes a few away by explicitly denying them. The intention is to permit managing all aspects of the bucket’s operation, while denying all access to the contents of the bucket. The explicit deny mechanism is important because, due to IAM’s policy evaluation logic, an explicit deny cannot be overridden by subsequent “allow” statements or by attaching additional policies. As the S3 service evolves over time and new features are added, the policy will permit using those new features, without any change to this policy. If you prefer to enable features explicitly, you’ll need to rewrite this policy to explicitly allow only the features you want, and then come back and revise the policy every so often, as S3 features are added that your role needs to use.


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAllActions",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "DenyObjectAccess",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:PutObjectVersionAcl"
      ],
      "Effect": "Deny",
      "Resource": "arn:aws:s3:::secure-demo-bucket"
    }
  ]
}

Listing 1: secure-bucket-admin IAM policy
 
Your policy will have an ARN (it will look something like arn:aws:iam::111122223333:policy/secure-bucket-admin). Make a note of this ARN. You will use it later to attach to the secure-bucket-admin role you’ll create in step 2.

Step 1b: Create the KMS administrator policy

While logged in to the console as your Admin user, create an IAM policy in the web console using the JSON tab. Name the policy secure-key-admin. When you reach the step to type or paste a JSON policy document, paste the JSON from Listing 2 below. Be sure to add your own 12-digit AWS account number where I have written 111122223333. This policy allows broad KMS administration rights (creating keys, granting access to keys, and modifying key policies), so it is a high privilege policy. In an effort to be concise, this policy grants all permissions to the KMS service and then denies certain rights through an explicit deny statement. The intention is to permit managing all aspects of KMS keys, while denying all access to perform encryption and decryption using KMS keys. As the KMS service evolves over time and new features are added, the policy will permit using those new features, without any change to this policy. If you prefer to enable features explicitly, you’ll need to rewrite this policy to explicitly allow only the features you want, and then come back and revise the policy every so often, as KMS features are added that your role needs to use.


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAllKMS",
      "Action": "kms:*",
      "Effect": "Allow",
      "Resource": " arn:aws:kms:*:111122223333:key/*"
    },
    {
      "Sid": "DenyKMSKeyUsage",
      "Action": [
        "kms:Decrypt",
        "kms:Encrypt",
        "kms:GenerateDataKey",
        "kms:ReEncryptFrom",
        "kms:ReEncryptTo"
      ],
      "Effect": "Deny",
      "Resource": " arn:aws:kms:*:111122223333:key/*"
    }
  ]
}

Listing 2: secure-key-admin IAM policy
 
Your policy will have an ARN (it will look something like arn:aws:iam::111122223333:policy/secure-key-admin). Make a note of this ARN. You will use it later to attach to the secure-key-admin role you’ll create in step 2.

Step 1c: Create the S3 bucket usage policy

This final policy grants access to read and write encrypted data in the target S3 bucket. This is a narrowly-scoped policy that only grants rights to a single bucket. While logged in to the console as your Admin user, create an IAM policy in the web console using the JSON tab. Name the policy secure-bucket-access.

When you reach the step to type or paste a JSON policy document for your bucket policy, paste the JSON from Listing 3 below, substituting the name of your bucket on the two lines where I have secure-demo-bucket.


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BasicList",
            "Effect": "Allow",
            "Action": [ "s3:ListAllMyBuckets", "s3:HeadBucket" ],
            "Resource": "*"
        },
        {
            "Sid": "AllowSecureBucket",
            "Effect": "Allow",
            "Action": [ "s3:PutObject", "s3:GetObjectAcl",
                "s3:GetObject", "s3:DeleteObjectVersion",
                "s3:DeleteObject", "s3:GetBucketLocation",
                "s3:GetObjectVersion" ],
            "Resource": [
                "arn:aws:s3:::secure-demo-bucket/*",
                "arn:aws:s3:::secure-demo-bucket"
            ]
        }
    ]
}

Listing 3: secure-bucket-access IAM policy

Note: In an effort to grant a minimal, but realistic, set of permissions, this IAM policy only grants access to basic get, put, and delete operations. You might have a use for other features, like tagging objects. If so, you will need to change the policy to enable the features you want to use.

Your policy will have an ARN (it will look something like arn:aws:iam::111122223333:policy/secure-bucket-access). Make a note of this ARN. You will use it later to attach to the authorized-users role you’ll create in step 2.

You might ask why this policy, designed to control access to encrypted objects, has no KMS permissions in it. Wouldn’t that prevent the users that assume this IAM role from using the encryption keys? It would normally prevent them, except you have the ability to list the authorized-users IAM role within the resource policy attached to the KMS key you’re about to create. By placing the authorized-users role in the KMS key resource policy, it further enforces the separation of duties so administrators in the account with an ability to modify IAM policies don’t inadvertently escalate privilege to other IAM users/roles and give them permissions to use KMS keys for decryption.

Step 2: Create IAM roles

An AWS IAM role is an identity that you can create in an AWS account that has specific permissions. An IAM role is similar to an IAM user, because it has permission policies that determine what the identity can and cannot do in AWS. It’s different from an IAM user because it’s not associated with a single person. A role can be used by users, by EC2 instances, by AWS services, or by other entities like AWS Lambda functions that you allow to use it. The IAM policies we created in step 1 do not grant permissions until we assign them to roles and assign the roles to users or entities.

Step 2a: Create the S3 bucket management role

This role will be used by administrators who need to manage the properties of the bucket.

  1. Follow the online instructions for creating an IAM role.
  2. Choose Another AWS account under the section labeled Select type of trusted entity.
  3. For the authorized AWS account ID, enter the 12-digit account number for the account that you’re working in. If you intend to authorize AWS IAM users that are defined in a different AWS IAM account to access the S3 bucket and decrypt objects, then you would include that AWS account’s ID number, instead.
  4. Name the IAM role secure-bucket-admin and import the customer managed policy named secure-bucket-admin that you created in step 1a to the role that you have created.

    Your AWS IAM role will have an ARN (it will look something like arn:aws:iam::111122223333:role/secure-bucket-admin). Make a note of this ARN. You will use it in the step 3 when you create your S3 bucket.

Step 2b: Create the KMS key management role

This role will be used by administrators who need to manage the KMS customer master keys that protect the data. The actions you take to manage the keys will be authorized by this role. Importantly, this role has no ability to modify the bucket, grant access to the bucket, or access any of the data in the bucket.

  1. Follow the online instructions for creating an IAM role.
  2. In the Select type of trusted entity section, select Another AWS account.
  3. For the authorized AWS account ID, enter the 12-digit account number for the account that you’re working in. If you intend to authorize AWS IAM users that are defined in a different AWS IAM account, then you would include that AWS account’s ID number, instead.
  4. Name the IAM role secure-key-admin and import the customer-managed policy named secure-key-admin that you created in step 1b to the role that you have created.

    Your AWS IAM role will have an ARN (it will look something like arn:aws:iam::111122223333:role/secure-key-admin). Make a note of this ARN. You will use it in step 4 when you create your KMS key.

Step 2c: Create the bucket usage role

This role will grant permissions to EC2 instances. An EC2 instance running with this role will be able to create and read encrypted data in the protected S3 bucket.

  1. Follow the online instructions for creating an IAM role.
  2. In the Select type of trusted entity section, select AWS service.
  3. Choose EC2 as the service that you will authorize. This authorizes all applications running on that EC2 instance to use credentials with permissions attached to the role.
  4. Name the IAM role authorized-users and import the customer-managed secure-bucket-access policy that you created in step 1c to the role that you have created.

This role is not for users trying to access the S3 bucket from any arbitrary application that happens to have the role’s credentials. It will only be used by users operating within applications running in AWS EC2 instances.

Step 3: Create an S3 bucket for the encrypted data

Log in to the console using your secure-bucket-admin role (either log in with a federated identity that can assume that role, or switch to the role as described in the Assuming Roles section above). Follow the instructions to create a bucket that will hold the encrypted data. In my example, I call my bucket secure-demo-bucket. You chose your own unique bucket name back in step 1. Type that bucket name throughout these steps where I use secure-demo-bucket. You will set a bucket policy and properties on that bucket later.
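If you prefer the command line, here is a sketch of the equivalent bucket creation (substitute your own bucket name and Region; omit the LocationConstraint only in us-east-1):


aws s3api create-bucket --bucket secure-demo-bucket --region us-west-2 \
    --create-bucket-configuration LocationConstraint=us-west-2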

Step 4: Create a KMS key to encrypt and decrypt the data in the S3 bucket

Log out of the console and log back in using your secure-key-admin role. Create a customer-managed customer master key (CMK) to encrypt and decrypt the data in the S3 bucket you just created. If you already have a customer-managed CMK created that you want to use for this purpose, you can do that. To use your own CMK, skip steps 1-5 below about creating a key and, instead, select your existing key in the KMS console and then follow steps 6-8 to change the key policy to allow the authorized-users role permissions to use the key.

  1. In the AWS console, go to Key Management Service.
  2. Select the Create Key button.
  3. On the Step 1 screen, set a display name (called an “Alias”) for the key and a description. I recommend a meaningful description that tells others what the key is for.
  4. On the Step 2 screen, set tags if you need them to track usage of keys for billing purposes. Tags won’t have a functional impact in this exercise so you can skip this step if you want by selecting Next.
  5. On the Step 3 screen, select key administrators. Pick only the secure-key-admin IAM role. You must not pick the secure-bucket-admin role or the authorized-users role as key administrators to ensure separation of duties. For example, if you were to pick the authorized-users IAM role, then any user that assumed that role could escalate their own (or others’) privileges to use this key to decrypt any other data encrypted under this key in your account. If you were to pick the secure-bucket-admin user, then that user could modify permissions both on the S3 bucket and the KMS key in ways that allowed unauthorized users access to decrypt data.
  6. On the Step 4 screen, select key users. Pick only the authorized-users IAM role you created in step 2c.
  7. On the Step 5 screen, select Finish.

    After you have created the key, make note of the key’s ARN. It will look something like this:

    arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab

    You will need it for the next step where you enforce all objects uploaded into the S3 bucket to be encrypted under this key.

Step 5: Modify the bucket policy

Log out of the console and log back with the secure-bucket-admin role. You’re going to attach a bucket policy to the bucket that does two things: it requires objects to be encrypted and it requires them to be encrypted with a specific KMS key. You will accomplish this by explicitly denying any attempt to call PutObject unless the correct conditions are true. This helps you increase your confidence that you will not store unencrypted data in this bucket.

Find the secure-demo-bucket bucket in the S3 web console, and then modify its bucket policy. Use the code from Listing 4 below as the entire bucket policy. Be sure to change secure-demo-bucket to the actual name of the bucket that you’re using in both places where it appears in the policy. You recorded the key’s ARN in step 4; make sure you insert that ARN for your KMS key where I use an example key ARN below.


{
    "Version": "2012-10-17",
    "Id": "PutObjPolicy",
    "Statement": [
      {
        "Sid": "DenyUnencryptedObjectUploads",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::secure-demo-bucket/*",
        "Condition": {
          "StringNotEquals": {
            "s3:x-amz-server-side-encryption": "aws:kms"
          }
        }
      },
      {
        "Sid": "DenyWrongKMSKey",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::secure-demo-bucket/*",
        "Condition": {
          "StringNotEquals": {
            "s3:x-amz-server-side-encryption-aws-kms-key-id": "arn:aws:kms::11112222333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
          }
        }
      }
    ]
  }

Listing 4: Bucket policy requiring encryption

Note: This bucket policy is not retroactive: If you apply this policy to a bucket that already exists and already has unencrypted objects, nothing happens to the objects that are already in the bucket. They remain unencrypted. They can be fetched or deleted. Once the policy is applied, however, new objects cannot be put in the bucket unless they are correctly encrypted.

Instead of applying a bucket policy, you could consider turning on S3 default encryption. This feature forces all new objects uploaded to an S3 bucket to be encrypted using the KMS key you created in step 4 unless the user specifies a different key. This feature doesn’t prohibit callers from encrypting objects under other KMS keys, but it ensures that the data is protected even if the user does not specify KMS encryption when putting the object. The bucket policy in Listing 4 is a bit stricter than S3 default encryption because it ensures that no object is ever encrypted by any key other than the CMK created in step 4. That strictness means the attempt to put an object fails, unless the caller explicitly names the KMS keyId in every S3 PUT request. With S3 default encryption, attempts to put an object without specifying encryption will succeed, and the data will be protected by the named KMS CMK.
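For reference, here is a sketch of enabling S3 default encryption with your CMK from the CLI (substitute your bucket name and the key ARN you recorded in step 4):


aws s3api put-bucket-encryption \
    --bucket secure-demo-bucket \
    --server-side-encryption-configuration '{
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
            }
        }]
    }'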

Step 6: Launch an EC2 instance to demonstrate the solution

The final step to showing how this solution works is to launch an EC2 instance and show that applications running in that instance can write and read data in the S3 bucket you created. If you launch an EC2 instance that has your authorized-users role attached and log in on that instance, you will be able to upload and download objects from the bucket, encrypting and decrypting transparently as you do it. No other identity (for example, other IAM users, other IAM roles, other EC2 instances, and Lambda functions) will be able to upload and download data to this S3 bucket because these other identities don’t have the permissions to use the KMS key that protects the data.

Start by logging out of the console and logging back in as your Admin user. Follow the instructions to launch an EC2 instance:

  1. Choose an Amazon Linux AMI.
  2. Choose an instance type. Any instance type will work. If you launch an Amazon Linux t2.micro instance, it might qualify for free tier pricing.
  3. For IAM Role, select the authorized-users role from the drop-down menu.
  4. Make sure you specify an SSH key that you have access to, and make sure that you have a way to reach the EC2 instance over the network.

Satisfy yourself that it works as expected

At this point, the solution is complete and is running. I want to demonstrate that the KMS key is providing the independent access controls the way I said it would. I will modify the key policy to remove the instance’s rights to use the KMS key. Then, I will confirm that the commands that had succeeded before now fail after the key policy change. This shows how the KMS key and its policy are completely independent of the S3 bucket policies and the IAM policies.

Test 1: Uploading encrypted objects

Using SSH, log in on the EC2 instance you launched that has the authorized-users role attached.

You will need to download a file onto the EC2 instance that you can then upload, encrypted, to the S3 bucket. If you don’t have a file that you want to use, you can use the AWS Cryptographic Details whitepaper as a reasonable test file.

On the instance, run the following command to download a local copy of the AWS Cryptographic Details whitepaper that you can use as test data:


curl -O 'https://d1.awsstatic.com/whitepapers/KMS-Cryptographic-Details.pdf'

Side note: You should also read this whitepaper. It’s very informative on how AWS KMS is built and operated to secure your encryption keys.

On the EC2 instance, use the AWS command line to upload the file to the S3 bucket. Note all the options that tell S3 to use KMS encryption and to use the correct key ID. Remember to insert the bucket name for the bucket that you’re using and the ARN of your KMS key from step 4 above.


aws s3 cp KMS-Cryptographic-Details.pdf s3://secure-demo-bucket/ \
--sse aws:kms --sse-kms-key-id arn:aws:kms:<region>:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab

If all went well, you should see a message like the following, showing that the object was uploaded successfully:


upload: ./KMS-Cryptographic-Details.pdf to s3://secure-demo-bucket/KMS-Cryptographic-Details.pdf

Test 2: Uploading an unencrypted object

You can now prove that an attempt to upload an unencrypted object from this instance will fail. Run this command to upload a second copy of the PDF file as an object named test2.pdf. Be sure to substitute your bucket’s name into the command.


aws s3 cp KMS-Cryptographic-Details.pdf s3://secure-demo-bucket/test2.pdf

You’ll notice this command doesn’t include the options instructing S3 to use KMS to encrypt the file. You should see this error message:


An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

If you see no error, then double-check that your bucket policy in Step 5 above is correct.

Test 3: Downloading encrypted objects

You’ve now proven that the EC2 instance can upload encrypted objects and that unencrypted uploads are refused. Next, you can prove that the EC2 instance can have S3 decrypt the encrypted object in the bucket using the KMS key. Here’s how: while still on your EC2 instance, run this command, substituting your bucket name, to download a copy of the PDF file:


aws s3 cp s3://secure-demo-bucket/KMS-Cryptographic-Details.pdf test3.pdf

If this command succeeds, then you will have a file in your current directory on your EC2 instance named test3.pdf. That shows that you have successfully decrypted and downloaded the PDF file.

Test 4: Demonstrate that the key policy regulates access

Now, I will demonstrate the independence of access control provided by the KMS key policy. Leaving the bucket policy and IAM role/policy as they are, you will disable the EC2 instance’s access to the objects using the KMS key policy. The IAM policy for S3 and the bucket policy on the bucket would still normally permit the EC2 instance to access the data. But, because the KMS key policy will prevent use of the key by the authorized-users IAM role, S3 will fail to encrypt or decrypt the object. This means that any commands that execute on the EC2 instance will no longer be able to upload or download data from the S3 bucket.

First, modify the key policy (the policy statement that the console edits on your behalf is sketched after this list).

  1. Log out of the console and log back in under the secure-key-admin user. Go to the Key Management Service console.
  2. In the left-hand navigation, select Customer managed keys and look for the key with the alias or Key ID that you’re using. The Key ID is the last 32 characters of the full key ARN.
  3. Select the Key ID for the key that you’re using to get to the screen where you can edit the key policy.
  4. In the list of Key users, you will see your authorized-users role listed. Select that role, and then select the Remove button to remove its access to use the KMS key.
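Behind the scenes, removing a key user deletes the role’s ARN from the “Allow use of the key” statement in the key policy. A sketch of what that console-managed statement typically looks like before removal is below; the account ID is illustrative:


{
	"Sid": "Allow use of the key",
	"Effect": "Allow",
	"Principal": {"AWS": "arn:aws:iam::111122223333:role/authorized-users"},
	"Action": [
		"kms:Encrypt",
		"kms:Decrypt",
		"kms:ReEncrypt*",
		"kms:GenerateDataKey*",
		"kms:DescribeKey"
	],
	"Resource": "*"
}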

At this point, the EC2 instance no longer has permission to use the KMS key because its role is no longer listed as a key user in the key policy.

Repeat the command that you did in Test 1 that uploaded a PDF file to the bucket. In this case, try to make a second copy of the PDF file into an object named test4.pdf. Run this command, substituting your bucket name and your KMS key ID as required:


aws s3 cp KMS-Cryptographic-Details.pdf s3://secure-demo-bucket/test4.pdf --sse aws:kms --sse-kms-key-id 1234abcd-12ab-34cd-56ef-1234567890ab

You should see an error like this:


An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

Now, try to download the copy of the KMS-Cryptographic-Details.pdf file from the bucket, again using the command that worked before, substituting the bucket name as required:


aws s3 cp s3://secure-demo-bucket/KMS-Cryptographic-Details.pdf test4.pdf

You should see an error message like this:


An error occurred (AccessDenied) when calling the GetObject operation: Access Denied

These two commands are denied because when S3 tried to invoke KMS to encrypt or decrypt data, the EC2 instance role did not have permission to use the KMS key and thus the request failed. Note that there is no situation where the API call returns the KMS-encrypted data from S3. Either the API call succeeds, and you receive the decrypted data, or the API call fails, and you receive an error. All AWS services that use KMS to encrypt data behave this way—you either get the decrypted data, or you get an error message.

Restoring access to the key

To restore the EC2 instance’s access to the data, you authorize its role again in the KMS key policy:

  1. Go to Key Management Service in the AWS Console.
  2. Select Customer managed keys.
  3. Find the key that you’re using and select it.
  4. Find your authorized-users role in the list of roles, or type “authorized-users” in the search box to find it.
  5. Select the checkbox next to the authorized-users role, and then select Add to add that role as a key user.

The role will now have permission to use the key as it did before.

Useful variations on this solution

Variation 1: Using KMS keys in different AWS accounts

You can use a KMS key that is in a different AWS account for encrypting and decrypting. This allows administrators in a central AWS account to manage KMS keys, while the data itself resides in other AWS accounts. This can offer further separation of roles from the example above because even a highly privileged user (for example, root) in the account in which the authorized-users role exists won’t be able to modify the key policy. The account ID in which the authorized-users role exists must be listed in the key policy. For more information, follow the instructions on sharing KMS keys across accounts.
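As a sketch, a key policy statement permitting principals in another account might look like the following, where 444455556666 is a hypothetical account in which the authorized-users role exists. Remember that an administrator in that account must also grant the role IAM permission to use the key:


{
	"Sid": "Allow use of the key by the data account",
	"Effect": "Allow",
	"Principal": {"AWS": "arn:aws:iam::444455556666:root"},
	"Action": [
		"kms:Encrypt",
		"kms:Decrypt",
		"kms:GenerateDataKey*",
		"kms:DescribeKey"
	],
	"Resource": "*"
}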

Note that the KMS key and the S3 bucket must always be in the same region. The EC2 instance doesn’t need to be in the same region as the S3 bucket, but you will experience higher latency when it isn’t.

Variation 2: Granting KMS key usage permissions to other AWS services

EC2 is not the only service that can be granted a role this way. Lambda functions can be granted AWS IAM roles that allow them to use KMS keys. That would permit the Lambda functions with the correct roles to manipulate the S3 data, while other entities (users, EC2 instances) could not. Likewise, AWS services such as Amazon Athena might require access to a KMS key if you want to use it to search data stored in S3 that has been encrypted using KMS. If Athena is given permission to assume a role with permissions to use the KMS key, then Athena can successfully execute its search queries because S3 will be allowed to decrypt objects on behalf of Athena, which is acting on your behalf when assuming the authorized-users role.

Variation 3: Creating isolated authorization to encrypt vs decrypt

You can use the KMS key policy to isolate authorization to encrypt versus decrypt data between two identities. For example, if a role has the kms:Encrypt or kms:GenerateDataKey permissions for a key, that role can write encrypted data directly or ask an AWS service to do it on its behalf (for example, during an upload to an S3 bucket). If the role does not also have kms:Decrypt permission, it can’t read encrypted data. This write-only permission might be appropriate for data acquisition, security log delivery, or other functions that should not be allowed to read the data they have written. Likewise, if a role has the kms:Decrypt permission, then the role has the ability to read data. But if it lacks the kms:Encrypt permission, it cannot write or modify encrypted data. This kind of isolated authorization is suitable for audit functions and log aggregation functions that need to read data but typically are prohibited from modifying the data/logs that they read. The complete set of permissions for KMS key policies can be found in the KMS developer guide.
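To make this concrete, here is a sketch of two key policy statements that separate these duties, assuming hypothetical roles named log-writer and log-auditor and an illustrative account ID:


{
	"Sid": "Allow encrypt-only access for log delivery",
	"Effect": "Allow",
	"Principal": {"AWS": "arn:aws:iam::111122223333:role/log-writer"},
	"Action": ["kms:Encrypt", "kms:GenerateDataKey*"],
	"Resource": "*"
}

{
	"Sid": "Allow decrypt-only access for auditing",
	"Effect": "Allow",
	"Principal": {"AWS": "arn:aws:iam::111122223333:role/log-auditor"},
	"Action": "kms:Decrypt",
	"Resource": "*"
}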

Cost of this solution

Three services with charges are used in this solution: EC2, S3, and KMS. The EC2 instance hours are charged according to standard EC2 pricing. Likewise, storing data in S3 will incur costs according to standard S3 pricing. There is no difference in S3 pricing for storing encrypted versus unencrypted data. Finally, KMS has a fixed price per month for each customer-managed CMK you create, which is described in the KMS pricing page. Each encryption and decryption of an object is a KMS API call and a certain number of KMS API calls are free each month. The number of free KMS API calls, and the price for API calls beyond the free tier, are described on the KMS pricing page.

Summary

The combination of IAM policies, S3 bucket policies, and KMS key policies gives you a powerful way to apply independent access control mechanisms on data. This mechanism means that one set of users can be granted rights to do maintenance operations on the buckets themselves, while not having rights to access or manipulate the data itself. Even a user or function with full privileges in S3 would be denied access to this encrypted data unless it also had the rights to use the KMS keys. It gives you an approach to access control that allows key policies to serve as an additional control when IAM policies or S3 bucket policies alone are not sufficient.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Key Management Service forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author bio

Paco Hope

Paco Hope is a Principal Security Consultant with AWS Professional Services working to help enterprise customers secure their workloads in the cloud. He has helped secure migration landing zones, design customer security architectures, and has mentored a number of AWS partners in the UK on AWS Security. He frequently speaks at information security conferences and security meetups.

Digital signing with the new asymmetric keys feature of AWS KMS

Post Syndicated from Raj Copparapu original https://aws.amazon.com/blogs/security/digital-signing-asymmetric-keys-aws-kms/

AWS Key Management Service (AWS KMS) now supports asymmetric keys. You can create, manage, and use public/private key pairs to protect your application data using the new APIs via the AWS SDK. Similar to the symmetric key features we’ve been offering, asymmetric keys can be generated as customer master keys (CMKs) where the private portion never leaves the service, or as a data key where the private portion is returned to your calling application encrypted under a CMK. The private portion of an asymmetric CMK is used only in AWS KMS hardware security modules (HSMs) designed so that no one, including AWS employees, can access the plaintext key material. AWS KMS supports the following asymmetric key types: RSA 2048, RSA 3072, RSA 4096, ECC NIST P-256, ECC NIST P-384, ECC NIST P-521, and ECC SECG P-256k1.

We’ve talked with customers and know that one popular use case for asymmetric keys is digital signing. In this post, I will walk you through an example of signing and verifying files using some of the new APIs in AWS KMS.

Background

A common way to ensure the integrity of a digital message as it passes between systems is to use a digital signature. A sender uses a secret along with cryptographic algorithms to create a data structure that is appended to the original message. A recipient with access to that secret can cryptographically verify that the message hasn’t been modified since the sender signed it. In cases where the recipient doesn’t have access to the same secret used by the sender for verification, a digital signing scheme that uses asymmetric keys is useful. The sender can make the public portion of the key available to any recipient to verify the signature, but the sender retains control over creating signatures using the private portion of the key. Asymmetric keys are used for digital signature applications such as trusted source code, authentication/authorization tokens, document e-signing, e-commerce transactions, and secure messaging.

AWS KMS supports what are known as raw digital signatures, where there is no identity information about the signer embedded in the signature object. A common way to attach identity information to a digital signature is to use digital certificates. If your application relies on digital certificates for signing and signature verification, we recommend you look at AWS Certificate Manager and Private Certificate Authority. These services allow you to programmatically create and deploy certificates with keys to your applications for digital signing operations. A common application of digital certificates is TLS termination on a web server to secure data in transit.

Signing and verifying files with AWS KMS

Assume that you have an application A that sends a file to application B in your AWS account. You want the file to be digitally signed so that the receiving application B can verify it hasn’t been tampered with in transit. You also want to make sure only application A can digitally sign files using the key because you don’t want application B to receive a file thinking it’s from application A when it was really from a different sender that had access to the signing key. Because AWS KMS is designed so that the private portion of the asymmetric key pair used for signing cannot be used outside the service or by unauthenticated users, you’re able to define and enforce policies so that only application A can sign with the key.

To start, application A will submit either the file itself or a digest of the file to the AWS KMS Sign API under an asymmetric CMK. If the file is less than 4KB, AWS KMS will compute a digest for you as a part of the signing operation. If the file is greater than 4KB, you must send only the digest you created locally and you must tell AWS KMS that you’re passing a digest in the MessageType parameter of the request. You can use any of several hashing functions in your local environment to create a digest of the file, but be aware that the receiving application in account B will need to be able to compute the digest using the same hash function in order to verify the integrity of the file. In my example, I’m using SHA256 as the hash function. Once the digest is created, AWS KMS uses the private portion of the asymmetric CMK to encrypt the digest using the signing algorithm specified in the API request. The result is a binary data object, which we’ll refer to as “the signature” throughout this post.
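As an illustration, one way to create the binary digest locally is with OpenSSL. The file name myfile is a placeholder, and ExampleDigest matches the digest file used in the CLI examples later in this post:


openssl dgst -sha256 -binary myfile > ExampleDigest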

Once application B receives the file with the signature, it must create a digest of the file. It then passes this newly generated digest, the signature object, the signing algorithm used, and the CMK keyId to the Verify API. AWS KMS uses the corresponding public key of the CMK with the signing algorithm specified in the request to verify the signature. Instead of submitting the signature to the Verify API, application B could verify the signature locally by acquiring the public key. This might be an attractive option if application B didn’t have a way to acquire valid AWS credentials to make a request of AWS KMS. However, this method requires application B to have access to the necessary cryptographic algorithms and to have previously received the public portion of the asymmetric CMK. In my example, application B is running in the same account as application A, so it can acquire AWS credentials to make the Verify API request. I’ll describe how to verify signatures using both methods in a bit more detail later in the post.

Creating signing keys and setting up key policy permissions

To start, you need to create an asymmetric CMK. When calling the CreateKey API, you’ll pass one of the asymmetric values for the CustomerMasterKeySpec parameter. In my example, I’m choosing a key spec of ECC_NIST_P384 because keys used with elliptic curve algorithms tend to be more efficient than those used with RSA-based algorithms.
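A minimal CLI sketch of that CreateKey call might look like this; the description is illustrative:


aws kms create-key \
	--customer-master-key-spec ECC_NIST_P384 \
	--key-usage SIGN_VERIFY \
	--description "Asymmetric CMK for digital signing"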

As a part of creating your asymmetric CMK, you need to attach a resource policy to the key to control which cryptographic operations the AWS principals representing applications A and B can use. A best practice is to use a different IAM principal for each application in order to scope down permissions. In this case, you want application A to only be able to sign files, and application B to only be able to verify them. I will assume each of these applications is running in Amazon EC2, so I’ll create a couple of IAM roles.

  • The IAM role for application A (SignRole) will be given kms:Sign permission in the CMK key policy
  • The IAM role for application B (VerifyRole) will be given kms:Verify permission in the CMK key policy

The stanza in the CMK key policy document to allow signing should look like this (replace the account ID value of <111122223333> with your own):


{
	"Sid": "Allow use of the key for digital signing",
	"Effect": "Allow",
	"Principal": {"AWS":"arn:aws:iam::<111122223333>:role/SignRole"},
	"Action": "kms:Sign",
	"Resource": "*"
}

The stanza in the CMK key policy document to allow verification should look like this (replace the account ID value of <111122223333> with your own):


{
	"Sid": "Allow use of the key for verification",
	"Effect": "Allow",
	"Principal": {"AWS":"arn:aws:iam::<111122223333>:role/VerifyRole"},
	"Action": "kms:Verify",
	"Resource": "*"
}

Signing Workflow

Once you have created the asymmetric CMK and IAM roles, you’re ready to sign your file. Application A will create a message digest of the file and make a sign request to AWS KMS with the asymmetric CMK keyId and the signing algorithm. The CLI command to do this is shown below. Replace the key-id parameter with your CMK’s specific keyId.


aws kms sign \
	--key-id <1234abcd-12ab-34cd-56ef-1234567890ab> \
	--message-type DIGEST \
	--signing-algorithm ECDSA_SHA_256 \
	--message fileb://ExampleDigest

I chose the ECDSA_SHA_256 signing algorithm for this example. See the Sign API specification for a complete list of supported signing algorithms.

After validating that the API call is authorized by the credentials available to SignRole, KMS generates a signature over the digest and returns the CMK keyId, the signature, and the signing algorithm.

Verify Workflow 1 — Calling the verify API

Once application B receives the file and the signature, it computes the SHA-256 digest over the copy of the file it received. It then makes a verify request to AWS KMS, passing this new digest, the signature it received from application A, the signing algorithm, and the CMK keyId. The CLI command to do this is shown below. Replace the key-id parameter with your CMK’s specific keyId.


aws kms verify \
	--key-id <1234abcd-12ab-34cd-56ef-1234567890ab> \
	--message-type DIGEST \
	--signing-algorithm ECDSA_SHA_256 \
	--message fileb://ExampleDigest \
	--signature fileb://Signature

After validating that the verify request is authorized, AWS KMS uses the public portion of the CMK with the specified signing algorithm to check the signature against the digest received in the verify request. If the signature is valid, it returns a SignatureValid boolean of True, indicating that the original digest created by the sender matches the digest created by the recipient. Because the digest is unique to the original file, the recipient can know that the file was not tampered with during transit.

One advantage of using the AWS KMS Verify API is that the caller doesn’t have to keep track of the specific public key matching the private key used to create the signature; the caller only has to know the CMK keyId and the signing algorithm used. Also, because all requests to AWS KMS are logged to AWS CloudTrail, you can audit that the signature and verification operations were both executed as expected. See the Verify API specification for more detail on available parameters.

Verify Workflow 2 — Verifying locally using the public key

Apart from using the Verify API directly, you can choose to retrieve the public key in the CMK using the AWS KMS GetPublicKey API and verify the signature locally. You might want to do this if application B needs to verify multiple signatures at a high rate and you don’t want to make a network call to the Verify API each time. In this method, application B makes a GetPublicKey request to AWS KMS to retrieve the public key. The CLI command to do this is below. Replace the key-id parameter with your CMK’s specific keyId.

aws kms get-public-key \
	--key-id <1234abcd-12ab-34cd-56ef-1234567890ab>

Note that application B will need permission to make a GetPublicKey request to AWS KMS. The stanza in the CMK key policy document to allow the VerifyRole identity to download the public key should look like this (replace the account ID value of <111122223333> with your own):


{
	"Sid": "Allow retrieval of the public key for verification",
	"Effect": "Allow",
	"Principal": {"AWS":"arn:aws:iam::<111122223333>:role/VerifyRole"},
	"Action": "kms:GetPublicKey ",
	"Resource": "*"
}

Once application B has the public key, it can use your preferred cryptographic provider to perform the signature verification locally. Application B needs to keep track of the public key and signing algorithm used for each signature object it will verify locally. Using the wrong public key will cause the signature verification to fail.
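As a sketch of local verification with OpenSSL, the commands below assume the ECDSA_SHA_256 signing algorithm from the earlier examples; the file names are placeholders:


# Retrieve and decode the DER-encoded public key from AWS KMS
aws kms get-public-key \
	--key-id <1234abcd-12ab-34cd-56ef-1234567890ab> \
	--output text --query PublicKey | base64 --decode > public_key.der

# Convert the public key to PEM format
openssl pkey -pubin -inform DER -in public_key.der -out public_key.pem

# Verify the signature over the original file
openssl dgst -sha256 -verify public_key.pem -signature Signature myfile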

Availability and pricing

Asymmetric keys and operations in AWS KMS are available now in the Northern Virginia, Oregon, Sydney, Ireland, and Tokyo AWS Regions with support for other regions planned. Pricing information for the new feature can be found at the AWS KMS pricing page.

Summary

I showed you a simple example of how you can use the new AWS KMS APIs to digitally sign and verify an arbitrary file. By having AWS KMS generate and store the private portion of the asymmetric key, you can limit use of the key for signing only to IAM principals you define. OIDC ID tokens, OAuth 2.0 access tokens, documents, configuration files, system update messages, and audit logs are but a few of the types of objects you might want to sign and verify using this feature.

You can also perform encrypt and decrypt operations under asymmetric CMKs in AWS KMS as an alternative to using the symmetric CMKs available since the service launched. Similar to how you can ask AWS KMS to generate symmetric keys for local use in your application, you can ask AWS KMS to generate and return asymmetric key pairs for local use to encrypt and decrypt data. Look for a future AWS Security Blog post describing these use cases. For more information about asymmetric key support, see the AWS KMS documentation page.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about the asymmetric key feature, please start a new thread on the AWS KMS Discussion Forum.

Want more AWS Security news? Follow us on Twitter.

Raj Copparapu

Raj Copparapu is a Senior Product Manager Technical. He’s a member of the AWS KMS team and focuses on defining the product roadmap to satisfy customer requirements. He has spent over five years innovating on behalf of customers to deliver products that help them secure their data in the cloud. Raj received his MBA from Duke’s Fuqua School of Business and spent his early career working as an engineer and a business intelligence consultant. In his spare time, Raj enjoys yoga and spending time with his kids.

Post-quantum TLS now supported in AWS KMS

Post Syndicated from Andrew Hopkins original https://aws.amazon.com/blogs/security/post-quantum-tls-now-supported-in-aws-kms/

AWS Key Management Service (AWS KMS) now supports post-quantum hybrid key exchange for the Transport Layer Security (TLS) network encryption protocol that is used when connecting to KMS API endpoints. In this post, I’ll tell you what post-quantum TLS is, what hybrid key exchange is, why it’s important, how to take advantage of this new feature, and how to give us feedback.

What is post-quantum TLS?

Post-quantum TLS is a feature that adds new, post-quantum cipher suites to the protocol. AWS implements TLS using s2n, a streamlined open source implementation of TLS. In June 2019, AWS introduced post-quantum s2n, which implements two proposed post-quantum hybrid cipher suites specified in this IETF draft. The cipher suites specify a key exchange that provides the security protections of both the classical and post-quantum schemes.

Why is this important?

A large-scale quantum computer would break the current public key cryptography that is used for key exchange in every TLS connection. While a large-scale quantum computer is not available today, it’s still important to think about and plan for your long-term security needs. TLS traffic recorded today could be decrypted by a large-scale quantum computer in the future. If you’re developing applications that rely on the long-term confidentiality of data passed over a TLS connection, you should consider a plan to migrate to post-quantum cryptography before a large-scale quantum computer is available for use by potential adversaries. AWS is working to prepare for this future, and we want you to be well-prepared, too.

We’re offering this feature now instead of waiting so you’ll have a way to measure the potential performance impact to your applications, and you’ll have the additional benefit of the protection afforded by the proposed post-quantum schemes today. While we believe the use of this feature raises the already high security bar for connecting to KMS endpoints, these new cipher suites will have an impact on bandwidth utilization, latency, and could also create issues for intermediate systems that proxy TLS connections. We’d like to get feedback from you on the effectiveness of our implementation so we can improve it over time.

Some background on post-quantum TLS

Today, all requests to AWS KMS use TLS with one of two key exchange schemes:

  • FFDHE (Finite Field Diffie-Hellman Ephemeral)
  • ECDHE (Elliptic Curve Diffie-Hellman Ephemeral)

FFDHE and ECDHE are industry standards for secure key exchange. KMS uses only ephemeral keys for TLS key negotiation; this ensures every connection uses a unique key and the compromise of one connection does not affect the security of another connection. These schemes are secure today against known cryptanalysis techniques that use classical computers; however, they’re not secure against known attacks that use a large-scale quantum computer. In the future, a sufficiently capable large-scale quantum computer could run Shor’s Algorithm to recover the TLS session key of a recorded session, and therefore gain access to the data inside. Protecting against a large-scale quantum computer requires using a post-quantum key exchange algorithm during the TLS handshake.

The possibility of large-scale quantum computing has spurred the development of new quantum-resistant cryptographic algorithms. The National Institute of Standards and Technology (NIST) has started the process of standardizing post-quantum cryptographic algorithms. AWS contributed to two NIST submissions:

  • BIKE (Bit Flipping Key Encapsulation)
  • SIKE (Supersingular Isogeny Key Encapsulation)

BIKE and SIKE are Key Encapsulation Mechanisms (KEMs); a KEM is a type of key exchange used to establish a shared symmetric key. Post-quantum s2n only uses ephemeral BIKE and SIKE keys.

The NIST standardization process isn’t expected to complete until 2024. Until then, there is a risk that the exclusive use of proposed algorithms like BIKE and SIKE could expose data in TLS connections to security vulnerabilities not yet discovered. To mitigate this risk and use these new post-quantum schemes safely today, we need a way to combine classical algorithms with the expected post-quantum security of the new algorithms submitted to NIST. The Hybrid Post-Quantum Key Encapsulation Methods for Transport Layer Security 1.2 IETF draft describes how to combine BIKE and SIKE with ECDHE to create two new cipher suites for TLS.

These two cipher suites use a hybrid key exchange that performs two independent key exchanges during the TLS handshake and then cryptographically combines the keys into a single TLS session key. This strategy combines the high assurance of a classical key exchange with the security of the proposed post-quantum key exchanges.

The effect of hybrid post-quantum TLS on performance

Post-quantum cipher suites have a different performance profile and bandwidth requirements than traditional cipher suites. We measured the latency and bandwidth for a single handshake on an EC2 c5.2xlarge instance. This provides a baseline for what to expect when you connect to KMS with the SDK. Your exact results will depend on your hardware (CPU speed and number of cores), existing workloads (how often you call KMS and what other work your application performs), and your network (location and capacity).

BIKE and SIKE have different performance tradeoffs: BIKE has faster computations and larger keys, and SIKE has slower computations and smaller keys. The tables below show the results of the AWS measurements. ECDHE, a classic cryptographic key exchange algorithm, is included by itself for comparison.

Table 1

TLS Message         ECDHE (bytes)   ECDHE w/ BIKE (bytes)   ECDHE w/ SIKE (bytes)
ClientHello         139             147                     147
ServerKeyExchange   329             2,875                   711
ClientKeyExchange   66              2,610                   470

Table 1 shows the amount of data (in bytes) sent in each TLS message. The ClientHello message is larger for post-quantum cipher suites because they include a new ClientHello extension. The key exchange messages are larger because they include BIKE or SIKE messages.

Table 2

Item                     ECDHE (ms)   ECDHE w/ BIKE (ms)   ECDHE w/ SIKE (ms)
Server processing time   0.112        0.269                5.53
Client processing time   0.10         0.395                7.05
Total handshake time     1.19         25.58                155.08

Table 2 shows the time (in milliseconds) a client and server in the same region take to complete a handshake. Server processing time includes: key generation, signing the server key exchange message, and processing the client key exchange message. The client processing time includes: verifying the server’s certificate, processing the server key exchange message, and generating the client key exchange message. The total time was measured on the client from the start of the handshake to the end and includes network transfer time. All connections used RSA authentication with a 2048-bit key, and ECDHE used the secp256r1 curve. The BIKE test used the BIKE-1 Level 1 parameter and the SIKE test used the SIKEp503 parameter.

A TLS handshake is performed only once to set up a new connection, and the SDK will reuse connections for multiple KMS requests when possible. This means that when you measure handshake performance, you should exclude requests made over an existing TLS session; including them will skew your performance data.

How to use hybrid post-quantum cipher suites

Note: The “AWS CRT HTTP Client” in the aws-crt-dev-preview branch of the aws-sdk-java-v2 repository is a beta release. This beta release and your use are subject to Section 1.10 (“Beta Service Participation”) of the AWS Service Terms.

To use the post-quantum cipher suites with AWS KMS, you’ll need the Developer Previews of the Java SDK 2.0 and the AWS Common Runtime. You’ll need to configure the AWS Common Runtime HTTP client to use s2n’s post-quantum hybrid cipher suites, and configure the AWS Java SDK 2.0 to use that HTTP client. You can then use this client when connecting to any KMS endpoint that does not use FIPS 140-2 validated crypto for the TLS termination. For example, kms.<region>.amazonaws.com supports the use of post-quantum cipher suites, while kms-fips.<region>.amazonaws.com does not.

To see a complete example of the full setup, check out the example application here.

Figure 1: GitHub and package layout

Figure 1 shows the GitHub and package layout. The steps below will walk you through building and configuring the SDK.

  1. Download the Java SDK v2 Common Runtime Developer Preview:
    
    $ git clone git@github.com:aws/aws-sdk-java-v2.git --branch aws-crt-dev-preview
    $ cd aws-sdk-java-v2
    

  2. Build the aws-crt-client JAR:
    
    $ mvn install -Pquick
    

  3. In your project add the AWS Common Runtime client to your Maven Dependencies:
    
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>aws-crt-client</artifactId>
        <version>2.10.7-SNAPSHOT</version>
    </dependency>
    

  4. Configure the new SDK and cipher suite in your application’s existing initialization code:
    
    // Check that the hybrid post-quantum cipher suite is supported on this platform
    if (!TLS_CIPHER_KMS_PQ_TLSv1_0_2019_06.isSupported()) {
        throw new RuntimeException("Post Quantum Ciphers not supported on this Platform");
    }
    // Build an AWS Common Runtime HTTP client that prefers the post-quantum cipher suite
    SdkAsyncHttpClient awsCrtHttpClient = AwsCrtAsyncHttpClient.builder()
              .tlsCipherPreference(TLS_CIPHER_KMS_PQ_TLSv1_0_2019_06)
              .build();
    // Configure the KMS async client to use the CRT HTTP client
    KmsAsyncClient kms = KmsAsyncClient.builder()
             .httpClient(awsCrtHttpClient)
             .build();
    // All KMS requests made with this client now use hybrid post-quantum TLS
    ListKeysResponse response = kms.listKeys().get();
    

Now, all connections made to AWS KMS in supported regions will use the new hybrid post-quantum cipher suites.

Things to try

Here are some ideas about how to use this post-quantum-enabled client:

  • Run load tests and benchmarks. These new cipher suites perform differently than traditional key exchange algorithms. You might need to adjust your connection timeouts to allow for the longer handshake times or, if you’re running inside an AWS Lambda function, extend the execution timeout setting.
  • Try connecting from different locations. Depending on the network path your request takes, you might discover that intermediate hosts, proxies, or firewalls with deep packet inspection (DPI) block the request. This could be due to the new cipher suites in the ClientHello or the larger key exchange messages. If this is the case, you might need to work with your Security team or IT administrators to update the relevant configuration to unblock the new TLS cipher suites. We’d like to hear from you about how your infrastructure interacts with this new variant of TLS traffic.

More info

If you’re interested in learning more about post-quantum cryptography, check out:

Conclusion

In this blog post, I introduced you to the topic of post-quantum security and covered what AWS and NIST are doing to address the issue. I also showed you how to begin experimenting with hybrid post-quantum key exchange algorithms for TLS when connecting to KMS endpoints.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about how to configure the HTTP client or its interaction with KMS endpoints, please start a new thread on the AWS KMS discussion forum.

How to deploy CloudHSM to securely share your keys with your SaaS provider

Post Syndicated from Vinod Madabushi original https://aws.amazon.com/blogs/security/how-to-deploy-cloudhsm-securely-share-keys-with-saas-provider/

If your organization is using software as a service (SaaS), your data is likely stored and protected by the SaaS provider. However, depending on the type of data that your organization stores and the compliance requirements that it must meet, you might need more control over how the encryption keys are stored, protected, and used. In this post, I’ll show you two options for deploying and managing your own CloudHSM cluster to secure your keys, while still allowing trusted third-party SaaS providers to securely access your HSM cluster in order to perform cryptographic operations. You can also use this architecture when you want to share your keys with another business unit or with an application that’s running in a separate AWS account.

AWS CloudHSM is one of several cryptography services provided by AWS to help you secure your data and keys in the AWS cloud. AWS CloudHSM provides single-tenant HSMs based on third-party FIPS 140-2 Level 3 validated hardware, under your control, in your Amazon Virtual Private Cloud (Amazon VPC). You can generate and use keys on your HSM using CloudHSM command line tools or standards-compliant C, Java, and OpenSSL SDKs.

A related, more widely used service is AWS Key Management Service (KMS). KMS is generally easier to use, cheaper to operate, and is natively integrated with most AWS services. However, there are some use cases for which you may choose to rely on CloudHSM to meet your security and compliance requirements.

Solution Overview

There are two ways you can set up your VPC and CloudHSM clusters to allow trusted third-party SaaS providers to use the HSM cluster for cryptographic operations. The first option is to use VPC peering to allow traffic to flow between the SaaS provider’s HSM client VPC and your CloudHSM VPC, and to utilize a custom application to harness the HSM.

The second option is to use KMS to manage the keys, specifying a custom key store to generate and store the keys. AWS KMS supports custom key stores backed by AWS CloudHSM clusters. When you create an AWS KMS customer master key (CMK) in a custom key store, AWS KMS generates and stores non-extractable key material for the CMK in an AWS CloudHSM cluster that you own and manage.

Decision Criteria: VPC Peering vs Custom Key Store

The right solution for you will depend on factors like your VPC configuration, security requirements, network setup, and the type of cryptographic operations you need. The following table provides a high-level summary of how these two options compare. Later in this post, I’ll go over both options in detail and explain the design considerations you need to be aware of before deploying the solution in your environment.

Technical Considerations                                                                             VPC Peering   Custom Keystore
Are you able to peer or connect your HSM VPC with your SaaS provider?                                     ✔
Is your SaaS provider sensitive to costs from KMS usage in their AWS account?                             ✔
Do you need CloudHSM-specific cryptographic tasks like signing, HMAC, or random number generation?        ✔
Does your SaaS provider need to encrypt your data directly with the Master Key?                           ✔
Does your application rely on a PKCS#11-compliant or JCE-compliant SDK?                                   ✔
Does your SaaS provider need to use the keys in AWS services?                                                           ✔
Do you need to log all key usage activities when SaaS providers use your HSM keys?                                      ✔

Option 1: VPC Peering

Figure 1: Architecture diagram showing VPC peering between the SaaS provider’s HSM client VPC and the customer’s HSM VPC

Figure 1 shows how you can deploy a CloudHSM cluster in a dedicated HSM VPC and peer this HSM VPC with your service provider’s VPC to allow them to access the HSM cluster through the client/application. I recommend that you deploy the CloudHSM cluster in a separate HSM VPC to limit the scope of resources running in that VPC. Since VPC peering is not transitive, service providers will not have access to any resources in your application VPCs or any other VPCs that are peered with the HSM VPC.

It’s possible to leverage the HSM cluster for other purposes and applications, but you should be aware of the potential drawbacks before you do. This approach could make it harder for you to find non-overlapping CIDR ranges for use with your SaaS provider. It would also mean that your SaaS provider could accidentally overwrite HSM account credentials or lock out your HSMs, causing an availability issue for your other applications. Due to these reasons, I recommend that you dedicate a CloudHSM cluster for use with your SaaS providers and use small VPC and subnet sizes, like /27, so that you’re not wasting IP space and it’s easier to find non-overlapping IP addresses with your SaaS provider.

If you’re using VPC peering, your HSM VPC CIDR cannot overlap with your SaaS provider’s VPC. Deploying the HSM cluster in a separate VPC gives you flexibility in selecting a suitable CIDR range that is non-overlapping with the service provider since you don’t have to worry about your other applications. Also, since you’re only hosting the HSM Cluster in this VPC, you can choose a CIDR range that is relatively small.

Design considerations

Here are additional considerations to think about when deploying this solution in your environment:

  • VPC peering allows resources in either VPC to communicate with each other as long as security groups, NACLs, and routing allow for it. In order to improve security, place only resources that are meant to be shared in the VPC, and secure communication at the port/protocol level by using security groups.
  • If you decide to revoke the SaaS provider’s access to your CloudHSM, you have two choices:
    • At the network layer, you can remove connectivity by deleting the VPC peering or by modifying the CloudHSM security groups to disallow the SaaS provider’s CIDR ranges.
    • Alternately, you can log in to the CloudHSM as Crypto Officer (CO) and change the password or delete the Crypto user that the SaaS provider is using.
  • If you’re deploying CloudHSM across multiple accounts or VPCs within your organization, you can also use AWS Transit Gateway to connect the CloudHSM VPC to your application VPCs. Transit Gateway is ideal when you have multiple application VPCs that need CloudHSM access, as it easily scales and you don’t have to worry about the VPC peering limits or the number of peering connections to manage.
  • If you’re the SaaS provider, and you have multiple clients who might be interested in this solution, you must make sure that one customer IP space doesn’t overlap with yours. You must also make sure that each customer’s HSM VPC doesn’t overlap with any of the others. One solution is to dedicate one VPC per customer, to keep the client/application dedicated to that customer, and to peer this VPC with your application VPC. This reduces the overlapping CIDR dependency among all your customers.

Option 2: Custom Key Store

As the AWS KMS documentation explains, KMS supports custom key stores backed by AWS CloudHSM clusters. When you create an AWS KMS customer master key (CMK) in a custom key store, AWS KMS generates and stores non-extractable key material for the CMK in an AWS CloudHSM cluster that you own and manage. When you use a CMK in a custom key store, the cryptographic operations are performed in the HSMs in the cluster. This feature combines the convenience and widespread integration of AWS KMS with the added control of an AWS CloudHSM cluster in your AWS account. This option allows you to keep your master key in the CloudHSM cluster but allows your SaaS provider to use your master key securely by using KMS.

Each custom key store is associated with an AWS CloudHSM cluster in your AWS account. When you connect the custom key store to its cluster, AWS KMS creates the network infrastructure to support the connection. Then it logs in to the cluster using the credentials of a dedicated crypto user in the cluster. All of this is set up automatically, with no need to peer VPCs or connect to your SaaS provider’s VPC.

You create and manage your custom key stores in AWS KMS, and you create and manage your HSM clusters in AWS CloudHSM. When you create CMKs in an AWS KMS custom key store, you view and manage the CMKs in AWS KMS. But you can also view and manage their key material in AWS CloudHSM, just as you would do for other keys in the cluster.

The following diagram shows how some keys can be located in a CloudHSM cluster but be visible through AWS KMS. These are the keys that AWS KMS can use for crypto operations performed through KMS.

Figure 2: High level overview of KMS custom key store

While this option eliminates many of the networking components you need to set up for Option 1, it does limit the type of cryptographic operations that your SaaS provider can perform. Since the SaaS provider doesn’t have direct access to CloudHSM, the crypto operations are limited to the encrypt and decrypt operations supported by KMS, and your SaaS provider must use KMS APIs for all of their operations. This is easy if they’re using AWS services which use KMS already, but if they’re performing operations within their application before storing the data in AWS storage services, this approach could be challenging, because KMS doesn’t support all the same types of cryptographic operations that CloudHSM supports.

Figure 3 illustrates the various components that make up a custom key store and shows how a CloudHSM cluster can connect to KMS to create a customer controlled key store.

Figure 3: A cluster of two CloudHSM instances is connected to KMS to create a customer controlled key store

Design Considerations

  • Note that when using a custom key store, you’re creating a kmsuser CU account in your AWS CloudHSM cluster and providing the kmsuser account credentials to AWS KMS.
  • This option requires your service provider to be able to use KMS as the key management option within their application. Because your SaaS provider cannot communicate directly with the CloudHSM cluster, they must instead use KMS APIs to encrypt the data. If your SaaS provider is encrypting within their application without using KMS, this option may not work for you.
  • When deploying a custom key store, you must not only control access to the CloudHSM cluster, you must also control access to AWS KMS.
  • Because the custom key store and KMS are located in your account, you must give permission to the SaaS provider to use certain KMS keys. You can do this by enabling cross account access. For more information, please refer to the blog post “Share custom encryption keys more securely between accounts by using AWS Key Management Service.”
  • I recommend dedicating an AWS account to the CloudHSM cluster and custom key store, as this simplifies setup. For more information, please refer to Controlling Access to Your Custom Key Store.

Network architecture that is not supported by CloudHSM

Figure 4: Diagram showing the network anti-pattern for deploying CloudHSM

Figure 4: Diagram showing the network anti-pattern for deploying CloudHSM

Figure 4 shows various networking technologies, like AWS PrivateLink, Network Address Translation (NAT), and AWS Load Balancers, that cannot be used with CloudHSM when placed between the CloudHSM cluster and the client/application. All of these methods mask the real IPs of the HSM cluster nodes from the client, which breaks the communication between the CloudHSM client and the HSMs.

When the CloudHSM client successfully connects to the HSM cluster, it downloads a list of HSM IP addresses which is then stored and used for subsequent connections. When one of the HSM nodes is unavailable, the client/application will automatically try the IP address of the HSM nodes it knows about. When HSMs are added or removed from the cluster, the client is automatically reconfigured. Since the client relies on a current list of IP addresses to transparently handle high availability and failover within the cluster, masking the real IP address of the HSM node thus breaks the communication between the cluster and the client.

You can read more about how the CloudHSM client works in the AWS CloudHSM User Guide.

Summary

In this blog post, I’ve shown you two options for deploying CloudHSM to store your key material while allowing your SaaS provider to access and use those keys on your behalf. This allows you to remain in control of your encryption keys and use a SaaS solution without compromising security.

It’s important to understand the security requirements, network setup, and type of cryptographic operation for each approach, and to choose the option that aligns the best with your goals. As a best practice, it’s also important to understand how to secure your CloudHSM and KMS deployment and to use necessary role-based access control with minimum privilege. Read more about AWS KMS Best Practices and CloudHSM Best Practices.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Key Management Service discussion forum.

Want more AWS Security news? Follow us on Twitter.

Vinod Madabushi

Vinod is an Enterprise Solutions Architect with AWS. He works with customers on building highly available, scalable, and secure applications on AWS Cloud. He’s passionate about solving technology challenges and helping customers with their cloud journey.

Are KMS custom key stores right for you?

Post Syndicated from Richard Moulds original https://aws.amazon.com/blogs/security/are-kms-custom-key-stores-right-for-you/

You can use the AWS Key Management Service (KMS) custom key store feature to gain more control over your KMS keys. The KMS custom key store integrates KMS with AWS CloudHSM to help satisfy compliance obligations that would otherwise require the use of on-premises hardware security modules (HSMs) while providing the AWS service integrations of KMS. However, the additional control comes with increased cost and potential impact on performance and availability. This post will help you decide if this feature is the best approach for you.

KMS is a fully managed service that generates encryption keys and helps you manage their use across more than 45 AWS services. It also supports the AWS Encryption SDK and other client-side encryption tools, and you can integrate it into your own applications. KMS is designed to meet the requirements of the vast majority of AWS customers. However, there are situations where customers need to manage their keys in single-tenant HSMs that they exclusively control. Previously, KMS did not meet these requirements since it offered only the ability to store keys in shared HSMs that are managed by KMS.

AWS CloudHSM is a service that’s primarily intended to support customer-managed applications that are specifically designed to use HSMs. It provides direct control over HSM resources, but the service isn’t, by itself, widely integrated with other AWS managed services. Before custom key store, this meant that if you required direct control of your HSMs but still wanted to use and store regulated data in AWS managed services, you had to choose between changing those requirements, not using a given AWS service, or building your own solution. KMS custom key store gives you another option.

How does a custom key store work?

With custom key store, you can configure your own CloudHSM cluster and authorize KMS to use it as a dedicated key store for your keys rather than the default KMS key store. Then, when you create keys in KMS, you can choose to generate the key material in your CloudHSM cluster. Your KMS customer master keys (CMKs) never leave the CloudHSM instances, and all KMS operations that use those keys are only performed in your HSMs. In all other respects, the master keys stored in your custom key store are used in a way that is consistent with other KMS CMKs.

This diagram illustrates the primary components of the service and shows how a cluster of two CloudHSM instances is connected to KMS to create a customer controlled key store.

Figure 1: A cluster of two CloudHSM instances is connected to KMS to create a customer controlled key store

Because you control your CloudHSM cluster, you can take direct action to manage certain aspects of the lifecycle of your keys, independently of KMS. Specifically, you can verify that KMS correctly created keys in your HSMs and you can delete key material and restore keys from backup at any time. You can also choose to connect and disconnect the CloudHSM cluster from KMS, effectively isolating your keys from KMS. However, with more control comes more responsibility. It’s important that you understand the availability and durability impact of using this feature, and I discuss the issues in the next section.

Decision criteria

KMS customers who plan to use a custom key store tell us they expect to use the feature selectively, deciding on a key-by-key basis where to store them. To help you decide if and how you might use the new feature, here are some important issues to consider.

Here are some reasons you might want to store a key in a custom key store:

  • You have keys that are required to be protected in a single-tenant HSM or in an HSM over which you have direct control.
  • You have keys that are explicitly required to be stored in an HSM validated at FIPS 140-2 level 3 overall (the HSMs used in the default KMS key store are validated to level 2 overall, with level 3 in several categories, including physical security).
  • You have keys that are required to be auditable independently of KMS.

And here are some considerations that might influence your decision to use a custom key store:

  • Cost — Each custom key store requires that your CloudHSM cluster contains at least two HSMs. CloudHSM charges vary by region, but you should expect costs of at least $1,000 per month, per HSM, if each device is permanently provisioned. This cost occurs regardless of whether you make any requests of the KMS API directly or indirectly through an AWS service.
  • Performance — The number of HSMs determines the rate at which keys can be used. It’s important that you understand the intended usage patterns for your keys and ensure that you have provisioned your HSM resources appropriately.
  • Availability — The number of HSMs and the use of availability zones (AZs) impact the availability of your cluster and, therefore, your keys. You must understand and assess the risk of configuration errors that result in a custom key store being disconnected, or in key material being deleted and left unrecoverable.
  • Operations — By using the custom key store feature, you will perform certain tasks that are normally handled by KMS. You will need to set up HSM clusters, configure HSM users, and potentially restore HSMs from backup. These are security-sensitive tasks for which you should have the appropriate resources and organizational controls in place to perform.

Getting Started

Here’s a basic rundown of the steps that you’ll take to create your first key in a custom key store within a given region (a CLI sketch of steps 3 through 5 follows the list).

  1. Create your CloudHSM cluster, initialize it, and add HSMs to the cluster. If you already have a CloudHSM cluster, you can use it as a custom key store in addition to your existing applications.
  2. Create a CloudHSM user so that KMS can access your cluster to create and use keys.
  3. Create a custom key store entry in KMS, give it a name, define which CloudHSM cluster you want it to use, and give KMS the credentials to access your cluster.
  4. Instruct KMS to make a connection to your cluster and log in.
  5. Create a CMK in KMS in the usual way except now select CloudHSM as the source of your key material. You’ll define administrators, users, and policies for the key as you would for any other CMK.
  6. Use the key via the existing KMS APIs, AWS CLI, or the AWS Encryption SDK. Requests to use the key don’t need to be context-aware of whether the key is stored in a custom key store or the default KMS key store.
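For reference, here is a hedged CLI sketch of steps 3 through 5; the key store name, cluster ID, key store ID, password, and certificate path are placeholders you must replace with your own values:


# Step 3: Create the custom key store entry in KMS
aws kms create-custom-key-store \
	--custom-key-store-name ExampleKeyStore \
	--cloud-hsm-cluster-id cluster-1a23b4cdefg \
	--key-store-password <kmsuser-password> \
	--trust-anchor-certificate file://customerCA.crt

# Step 4: Connect KMS to the cluster and log in
aws kms connect-custom-key-store --custom-key-store-id cks-1234567890abcdef0

# Step 5: Create a CMK whose key material lives in your CloudHSM cluster
aws kms create-key \
	--origin AWS_CLOUDHSM \
	--custom-key-store-id cks-1234567890abcdef0 \
	--description "CMK in my custom key store"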

Summary

Some customers need specific controls in place before they can use KMS to manage encryption keys in AWS. The new KMS custom key store feature is intended to satisfy that requirement. You can now apply the controls provided by CloudHSM to keys managed in KMS, without changing access control policies or service integration.

However, by using the new feature, you take responsibility for certain operational aspects that would otherwise be handled by KMS. It’s important that you have the appropriate controls in place and understand the performance and availability requirements of each key that you create in a custom key store.

If you’ve been prevented from migrating sensitive data to AWS because of specific key management requirements that are currently not met by KMS, consider using the new KMS custom key store feature.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Key Management Service discussion forum.

Want more AWS Security news? Follow us on Twitter.

Author

Richard Moulds

Richard is a Principal Product Manager at AWS. He’s a member of the KMS team and is primarily focused on helping to define the product roadmap and satisfy customer requirements. Richard has more than 15 years of experience in helping customers build encryption and key management systems to protect their data. His attraction to cryptography stems from the challenge of taking such a complex subject and translating it into simple solutions that customers should be able to take for granted, on a grand scale. When he’s not thinking ahead he’s focused on the past, restoring classic cars, the more rust the better.

Podcast: How AWS KMS could help customers meet encryption and deletion requirements, including GDPR

Post Syndicated from Katie Doptis original https://aws.amazon.com/blogs/security/podcast-how-aws-kms-could-help-customers-meet-encryption-and-deletion-requirements-including-gdpr/

Encryption is a powerful tool to protect your data but it can be difficult to get right because it demands understanding how encryption keys are created, distributed, used, and managed. To make encryption easier to use, we created AWS Key Management Service (KMS) to let you scale your use of the cloud without struggling to ensure encryption is used consistently across workloads.

Because AWS KMS makes it easy for you to create and control the encryption keys used to encrypt your data, the service can be used to meet both encryption and deletion requirements in a data lifecycle management policy. Cryptographic deletion is the idea that you can delete a relatively small number of keys to make a large amount of encrypted data irretrievable. This concept is being widely discussed as an option for organizations facing data deletion requirements, such as those in the EU’s General Data Protection Regulation (GDPR).
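As an illustration of the concept, scheduling deletion of a KMS customer master key with the AWS SDK for Python (Boto3) is enough to render data encrypted solely under that key irretrievable once the waiting period expires. The key ID below is a placeholder; this is a sketch of the API call, not deletion guidance.

import boto3

kms = boto3.client('kms')

# Schedule the CMK for deletion after the minimum 7-day waiting period.
# Once the key is deleted, ciphertext encrypted only under it cannot be recovered.
kms.schedule_key_deletion(
    KeyId='1234abcd-12ab-34cd-56ef-1234567890ab',  # placeholder key ID
    PendingWindowInDays=7
)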

Listen to the podcast and hear from Ken Beer, general manager of AWS KMS, about best practices related to encryption, key management, and cryptographic deletion. He also covers the advantages of KMS over on-premises systems and how the service has been designed so that even AWS operators can’t access customer keys.

Now You Can Create Encrypted Amazon EBS Volumes by Using Your Custom Encryption Keys When You Launch an Amazon EC2 Instance

Post Syndicated from Nishit Nagar original https://aws.amazon.com/blogs/security/create-encrypted-amazon-ebs-volumes-custom-encryption-keys-launch-amazon-ec2-instance-2/

Amazon Elastic Block Store (EBS) offers an encryption solution for your Amazon EBS volumes so you don’t have to build, maintain, and secure your own infrastructure for managing encryption keys for block storage. Amazon EBS encryption uses AWS Key Management Service (AWS KMS) customer master keys (CMKs) when creating encrypted Amazon EBS volumes, providing you all the benefits associated with using AWS KMS. You can specify either an AWS managed CMK or a customer-managed CMK to encrypt your Amazon EBS volume. If you use a customer-managed CMK, you retain granular control over your encryption keys, such as having AWS KMS rotate your CMK every year. To learn more about creating CMKs, see Creating Keys.

In this post, we demonstrate how to create an encrypted Amazon EBS volume using a customer-managed CMK when you launch an EC2 instance from the EC2 console, AWS CLI, and AWS SDK.

Creating an encrypted Amazon EBS volume from the EC2 console

Follow these steps to launch an EC2 instance from the EC2 console with Amazon EBS volumes that are encrypted by customer-managed CMKs:

  1. Sign in to the AWS Management Console and open the EC2 console.
  2. Select Launch instance, and then, in Step 1 of the wizard, select an Amazon Machine Image (AMI).
  3. In Step 2 of the wizard, select an instance type, and then provide additional configuration details in Step 3. For details about configuring your instances, see Launching an Instance.
  4. In Step 4 of the wizard, specify additional EBS volumes that you want to attach to your instances.
  5. To create an encrypted Amazon EBS volume, first add a new volume by selecting Add new volume. Leave the Snapshot column blank.
  6. In the Encrypted column, select your CMK from the drop-down menu. You can also paste the full Amazon Resource Name (ARN) of your custom CMK in this box. To learn more about finding the ARN of a CMK, see Working with Keys.
  7. Select Review and Launch. Your instance will launch with an additional Amazon EBS volume with the key that you selected. To learn more about the launch wizard, see Launching an Instance with Launch Wizard.

Creating Amazon EBS encrypted volumes from the AWS CLI or SDK

You also can use RunInstances to launch an instance with additional encrypted Amazon EBS volumes by setting Encrypted to true and adding KmsKeyId with the ARN of your CMK in the BlockDeviceMapping object, as shown in the following command:

$> aws ec2 run-instances --image-id ami-b42209de --count 1 --instance-type m4.large --region us-east-1 --block-device-mappings file://mapping.json

In this example, mapping.json describes the properties of the EBS volume that you want to create:


[
  {
    "DeviceName": "/dev/sda1",
    "Ebs": {
      "DeleteOnTermination": true,
      "VolumeSize": 100,
      "VolumeType": "gp2",
      "Encrypted": true,
      "KmsKeyId": "arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef"
    }
  }
]
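The equivalent call through the AWS SDK for Python (Boto3) might look like the following sketch; the AMI ID and key ARN are the same placeholder values used in the CLI example above.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Launch an instance with an EBS volume encrypted under a customer-managed CMK.
ec2.run_instances(
    ImageId='ami-b42209de',
    InstanceType='m4.large',
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[{
        'DeviceName': '/dev/sda1',
        'Ebs': {
            'DeleteOnTermination': True,
            'VolumeSize': 100,
            'VolumeType': 'gp2',
            'Encrypted': True,
            'KmsKeyId': 'arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef'
        }
    }]
)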

You can also launch instances with additional encrypted EBS data volumes via Amazon EC2 Auto Scaling or Spot Fleet by creating a launch template that includes the same BlockDeviceMapping. For example, where launch-template-data.json contains the launch template data, including the block device mappings shown above (a Boto3 equivalent follows):

$> aws ec2 create-launch-template --launch-template-name MyLTName --launch-template-data file://launch-template-data.json --region us-east-1
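Here is a hedged Boto3 sketch of creating such a launch template; the template name, AMI ID, and key ARN mirror the placeholder values from the examples above.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Create a launch template whose instances attach a CMK-encrypted EBS volume.
ec2.create_launch_template(
    LaunchTemplateName='MyLTName',
    LaunchTemplateData={
        'ImageId': 'ami-b42209de',
        'InstanceType': 'm4.large',
        'BlockDeviceMappings': [{
            'DeviceName': '/dev/sda1',
            'Ebs': {
                'Encrypted': True,
                'KmsKeyId': 'arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef'
            }
        }]
    }
)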

To learn more about launching an instance with the AWS CLI or SDK, see the AWS CLI Command Reference.

In this blog post, we’ve demonstrated a single-step, streamlined process for creating Amazon EBS volumes that are encrypted under your CMK when you launch your EC2 instance. To start using this functionality, navigate to the EC2 console.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Amazon EC2 forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

How to retain system tables’ data spanning multiple Amazon Redshift clusters and run cross-cluster diagnostic queries

Post Syndicated from Karthik Sonti original https://aws.amazon.com/blogs/big-data/how-to-retain-system-tables-data-spanning-multiple-amazon-redshift-clusters-and-run-cross-cluster-diagnostic-queries/

Amazon Redshift is a data warehouse service that logs the history of the system in STL log tables. The STL log tables manage disk space by retaining only two to five days of log history, depending on log usage and available disk space.

To retain STL tables’ data for an extended period, you usually have to create a replica table for every system table. Then, for each replica table, you load the data from the system table into it at regular intervals. By maintaining replica tables for STL tables, you can run diagnostic queries on historical data from the STL tables. You then can derive insights from query execution times, query plans, and disk-spill patterns, and make better cluster-sizing decisions. However, refreshing replica tables with live data from STL tables at regular intervals requires schedulers such as Cron or AWS Data Pipeline. Also, these tables are specific to one cluster and they are not accessible after the cluster is terminated. This is especially true for transient Amazon Redshift clusters that last for only a finite period of ad hoc query execution.

In this blog post, I present a solution that exports system tables from multiple Amazon Redshift clusters into an Amazon S3 bucket. This solution is serverless, and you can schedule it as frequently as every five minutes. The AWS CloudFormation deployment template that I provide automates the solution setup in your environment. The system tables’ data in the Amazon S3 bucket is partitioned by cluster name and query execution date to enable efficient joins in cross-cluster diagnostic queries.

I also provide another CloudFormation template later in this post. This second template helps to automate the creation of tables in the AWS Glue Data Catalog for the system tables’ data stored in Amazon S3. After the system tables are exported to Amazon S3, you can run cross-cluster diagnostic queries on the system tables’ data and derive insights about query executions in each Amazon Redshift cluster. You can do this using Amazon QuickSight, Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum.

You can find all the code examples in this post, including the CloudFormation templates, AWS Glue extract, transform, and load (ETL) scripts, and the resolution steps for common errors you might encounter in this GitHub repository.

Solution overview

The solution in this post uses AWS Glue to export system tables’ log data from Amazon Redshift clusters into Amazon S3. The AWS Glue ETL jobs are invoked at a scheduled interval by AWS Lambda. AWS Systems Manager, which provides secure, hierarchical storage for configuration data management and secrets management, maintains the details of Amazon Redshift clusters for which the solution is enabled. The last-fetched time stamp values for the respective cluster-table combination are maintained in an Amazon DynamoDB table.

The following diagram covers the key steps involved in this solution.

The solution as illustrated in the preceding diagram flows like this:

  1. The Lambda function, invoke_rs_stl_export_etl, is triggered at regular intervals, as controlled by Amazon CloudWatch. When triggered, it looks up the AWS Systems Manager Parameter Store to get the details of the Amazon Redshift clusters for which the system table export is enabled (a minimal sketch of this flow follows this list).
  2. The same Lambda function, based on the Amazon Redshift cluster details obtained in step 1, invokes the AWS Glue ETL job designated for the Amazon Redshift cluster. If an ETL job for the cluster is not found, the Lambda function creates one.
  3. The ETL job invoked for the Amazon Redshift cluster gets the cluster credentials from the parameter store. It gets from the DynamoDB table the last exported time stamp of when each of the system tables was exported from the respective Amazon Redshift cluster.
  4. The ETL job unloads the system tables’ data from the Amazon Redshift cluster into an Amazon S3 bucket.
  5. The ETL job updates the DynamoDB table with the last exported time stamp value for each system table exported from the Amazon Redshift cluster.
  6. The Amazon Redshift cluster system tables’ data is available in Amazon S3 and is partitioned by cluster name and date for running cross-cluster diagnostic queries.
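To make steps 1 and 2 concrete, here is a minimal, hypothetical sketch of what such a Lambda handler could look like. The parameter name is taken from the table in the next section, but the job-name convention and overall structure are illustrative assumptions; the solution’s actual code is available in the GitHub repository.

import boto3

ssm = boto3.client('ssm')
glue = boto3.client('glue')

def lambda_handler(event, context):
    # Step 1: Read the list of enabled clusters from the Parameter Store.
    param = ssm.get_parameter(Name='redshift_query_logs.global.enabled_cluster_list')
    clusters = param['Parameter']['Value'].split(',')

    # Step 2: Kick off the per-cluster AWS Glue ETL job for each enabled cluster.
    for cluster in clusters:
        job_name = '{}_extract_rs_query_logs'.format(cluster)  # assumed naming convention
        glue.start_job_run(JobName=job_name)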

Understanding the configuration data

This solution uses AWS Systems Manager parameter store to store the Amazon Redshift cluster credentials securely. The parameter store also securely stores other configuration information that the AWS Glue ETL job needs for extracting and storing system tables’ data in Amazon S3. Systems Manager comes with a default AWS Key Management Service (AWS KMS) key that it uses to encrypt the password component of the Amazon Redshift cluster credentials.

The following table explains the global parameters and cluster-specific parameters required in this solution. The global parameters are defined once and applicable at the overall solution level. The cluster-specific parameters are specific to an Amazon Redshift cluster and repeat for each cluster for which you enable this post’s solution. The CloudFormation template explained later in this post creates these parameters as part of the deployment process.

Global parameters (defined once and applied to all jobs):

  • redshift_query_logs.global.s3_prefix (String): The Amazon S3 path where the query logs are exported. Under this path, each exported table is partitioned by cluster name and date.
  • redshift_query_logs.global.tempdir (String): The Amazon S3 path that AWS Glue ETL jobs use for temporarily staging the data.
  • redshift_query_logs.global.role (String): The name of the role that the AWS Glue ETL jobs assume. Just the role name is sufficient; the complete Amazon Resource Name (ARN) is not required.
  • redshift_query_logs.global.enabled_cluster_list (StringList): A comma-separated list of cluster names for which system tables’ data export is enabled. This gives flexibility for a user to exclude certain clusters.

Cluster-specific parameters (one set for each cluster specified in the enabled_cluster_list parameter):

  • redshift_query_logs.<<cluster_name>>.connection (String): The name of the AWS Glue Data Catalog connection to the Amazon Redshift cluster. For example, if the cluster name is product_warehouse, the entry is redshift_query_logs.product_warehouse.connection.
  • redshift_query_logs.<<cluster_name>>.user (String): The user name that AWS Glue uses to connect to the Amazon Redshift cluster.
  • redshift_query_logs.<<cluster_name>>.password (SecureString): The password that AWS Glue uses to connect to the Amazon Redshift cluster, encrypted with a key managed in AWS KMS.

For example, suppose that you have two Amazon Redshift clusters, product-warehouse and category-management, for which the solution described in this post is enabled. In this case, the parameters shown in the following screenshot are created by the solution deployment CloudFormation template in the AWS Systems Manager parameter store.
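If you later need to add or update one of these parameters by hand, a hedged Boto3 sketch of storing a cluster-specific password as a SecureString looks like the following; the default SSM KMS key is used unless you also pass a KeyId, and the value shown is a placeholder.

import boto3

ssm = boto3.client('ssm')

# Store the cluster password encrypted with the default SSM KMS key.
ssm.put_parameter(
    Name='redshift_query_logs.product-warehouse.password',
    Value='REPLACE_WITH_REAL_PASSWORD',  # placeholder
    Type='SecureString',
    Overwrite=True
)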

Solution deployment

To make it easier for you to get started, I created a CloudFormation template that automatically configures and deploys the solution—only one step is required after deployment.

Prerequisites

To deploy the solution, you must have one or more Amazon Redshift clusters in a private subnet. This subnet must have a network address translation (NAT) gateway or a NAT instance configured, and also a security group with a self-referencing inbound rule for all TCP ports. For more information about why AWS Glue ETL needs the configuration it does, described previously, see Connecting to a JDBC Data Store in a VPC in the AWS Glue documentation.

To start the deployment, launch the CloudFormation template:

CloudFormation stack parameters

The following table lists and describes the parameters for deploying the solution to export query logs from multiple Amazon Redshift clusters.

  • S3Bucket (default: mybucket): The bucket this solution uses to store the exported query logs, stage code artifacts, and perform unloads from Amazon Redshift. For example, the mybucket/extract_rs_logs/data prefix is used for storing all the exported query logs for each system table, partitioned by cluster. The mybucket/extract_rs_logs/temp/ prefix is used for temporarily staging the unloaded data from Amazon Redshift. The mybucket/extract_rs_logs/code prefix is used for storing all the code artifacts required for Lambda and the AWS Glue ETL jobs.
  • ExportEnabledRedshiftClusters (requires input): A comma-separated list of cluster names from which the system table logs need to be exported.
  • DataStoreSecurityGroups (requires input): A list of security groups with an inbound rule to the Amazon Redshift clusters provided in the parameter, ExportEnabledRedshiftClusters. These security groups should also have a self-referencing inbound rule on all TCP ports, as explained in Connecting to a JDBC Data Store in a VPC.

After you launch the template and create the stack, you see that the following resources have been created:

  1. AWS Glue connections for each Amazon Redshift cluster you provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  2. All parameters required for this solution created in the parameter store.
  3. The Lambda function that invokes the AWS Glue ETL jobs for each configured Amazon Redshift cluster at a regular interval of five minutes.
  4. The DynamoDB table that captures the last exported time stamps for each exported cluster-table combination.
  5. The AWS Glue ETL jobs to export query logs from each Amazon Redshift cluster provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  6. The IAM roles and policies required for the Lambda function and AWS Glue ETL jobs.

After the deployment

For each Amazon Redshift cluster for which you enabled the solution through the CloudFormation stack parameter, ExportEnabledRedshiftClusters, the automated deployment includes temporary credentials that you must update after the deployment:

  1. Go to the parameter store.
  2. Note the parameters redshift_query_logs.<<cluster_name>>.user and redshift_query_logs.<<cluster_name>>.password that correspond to each Amazon Redshift cluster for which you enabled this solution. Edit these parameters to replace the placeholder values with the right credentials.

For example, if product-warehouse is one of the clusters for which you enabled system table export, you edit these two parameters with the right user name and password and choose Save parameter.

Querying the exported system tables

Within a few minutes after the solution deployment, you should see Amazon Redshift query logs being exported to the Amazon S3 location, <<S3Bucket_you_provided>>/extract_redshift_query_logs/data/. In that bucket, you should see the eight system tables partitioned by cluster name and date: stl_alert_event_log, stl_ddltext, stl_explain, stl_query, stl_querytext, stl_scan, stl_utilitytext, and stl_wlm_query.

To run cross-cluster diagnostic queries on the exported system tables, create external tables in the AWS Glue Data Catalog. To make it easier for you to get started, I provide a CloudFormation template that creates an AWS Glue crawler, which crawls the exported system tables stored in Amazon S3 and builds the external tables in the AWS Glue Data Catalog.

Launch this CloudFormation template to create external tables that correspond to the Amazon Redshift system tables. S3Bucket is the only input parameter required for this stack deployment. Provide the same Amazon S3 bucket name where the system tables’ data is being exported. After you successfully create the stack, you can see the eight tables in the database, redshift_query_logs_db, as shown in the following screenshot.

Now, navigate to the Athena console to run cross-cluster diagnostic queries. The following screenshot shows a diagnostic query executed in Athena that retrieves query alerts logged across multiple Amazon Redshift clusters.

You can build the following example Amazon QuickSight dashboard by running cross-cluster diagnostic queries on Athena to identify the hourly query count and the key query alert events across multiple Amazon Redshift clusters.

How to extend the solution

You can extend this post’s solution in two ways:

  • Add any new Amazon Redshift clusters that you spin up after you deploy the solution.
  • Add other system tables or custom query results to the list of exports from an Amazon Redshift cluster.

Extend the solution to other Amazon Redshift clusters

To extend the solution to more Amazon Redshift clusters, add the three cluster-specific parameters in the AWS Systems Manager parameter store following the guidelines earlier in this post. Modify the redshift_query_logs.global.enabled_cluster_list parameter to append the new cluster to the comma-separated string.

Extend the solution to add other tables or custom queries to an Amazon Redshift cluster

The current solution ships with the export functionality for the following Amazon Redshift system tables:

  • stl_alert_event_log
  • stl_ddltext
  • stl_explain
  • stl_query
  • stl_querytext
  • stl_scan
  • stl_utilitytext
  • stl_wlm_query

You can easily add another system table or custom query by adding a few lines of code to the AWS Glue ETL job, <<cluster-name>>_extract_rs_query_logs. For example, suppose that from the product-warehouse Amazon Redshift cluster you want to export orders greater than $2,000. To do so, add the following five steps to the AWS Glue ETL job product-warehouse_extract_rs_query_logs, where product-warehouse is your cluster name:

  1. Get the last-processed time-stamp value. The function creates a value if it doesn’t already exist.

salesLastProcessTSValue = functions.getLastProcessedTSValue(trackingEntry="mydb.sales_2000", job_configs=job_configs)

  2. Run the custom query with the time stamp. (The join condition and column names here are illustrative, for a hypothetical sales/orders schema.)

returnDF = functions.runQuery(query="select * from sales s join orders o on s.order_id = o.order_id where o.order_amnt > 2000 and s.sale_timestamp > '{}'".format(salesLastProcessTSValue), tableName="mydb.sales_2000", job_configs=job_configs)

  3. Save the results to Amazon S3.

functions.saveToS3(dataframe=returnDF, s3Prefix=s3Prefix, tableName="mydb.sales_2000", partitionColumns=["sale_date"], job_configs=job_configs)

  4. Get the latest time-stamp value from the data frame returned in step 2.

latestTimestampVal = functions.getMaxValue(returnDF, "sale_timestamp", job_configs)

  5. Update the last-processed time-stamp value in the DynamoDB table.

functions.updateLastProcessedTSValue("mydb.sales_2000", latestTimestampVal[0], job_configs)

Conclusion

In this post, I demonstrate a serverless solution to retain the system tables’ log data across multiple Amazon Redshift clusters. By using this solution, you can incrementally export the data from system tables into Amazon S3. By performing this export, you can build cross-cluster diagnostic queries, build audit dashboards, and derive insights into capacity planning by using services such as Athena. I also demonstrate how you can extend this solution to other ad hoc query use cases or tables other than system tables by adding a few lines of code.


Additional Reading

If you found this post useful, be sure to check out Using Amazon Redshift Spectrum, Amazon Athena, and AWS Glue with Node.js in Production and Amazon Redshift – 2017 Recap.


About the Author

Karthik Sonti is a senior big data architect at Amazon Web Services. He helps AWS customers build big data and analytical solutions and provides guidance on architecture and best practices.

Rotate Amazon RDS database credentials automatically with AWS Secrets Manager

Post Syndicated from Apurv Awasthi original https://aws.amazon.com/blogs/security/rotate-amazon-rds-database-credentials-automatically-with-aws-secrets-manager/

Recently, we launched AWS Secrets Manager, a service that makes it easier to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. You can configure Secrets Manager to rotate secrets automatically, which can help you meet your security and compliance needs. Secrets Manager offers built-in integrations for MySQL, PostgreSQL, and Amazon Aurora on Amazon RDS, and can rotate credentials for these databases natively. You can control access to your secrets by using fine-grained AWS Identity and Access Management (IAM) policies. To retrieve secrets, employees replace plaintext secrets with a call to Secrets Manager APIs, eliminating the need to hard-code secrets in source code or update configuration files and redeploy code when secrets are rotated.

In this post, I introduce the key features of Secrets Manager. I then show you how to store a database credential for a MySQL database hosted on Amazon RDS and how your applications can access this secret. Finally, I show you how to configure Secrets Manager to rotate this secret automatically.

Key features of Secrets Manager

These features include the ability to:

  • Rotate secrets safely. You can configure Secrets Manager to rotate secrets automatically without disrupting your applications. Secrets Manager offers built-in integrations for rotating credentials for Amazon RDS databases for MySQL, PostgreSQL, and Amazon Aurora. You can extend Secrets Manager to meet your custom rotation requirements by creating an AWS Lambda function to rotate other types of secrets. For example, you can create an AWS Lambda function to rotate OAuth tokens used in a mobile application. Users and applications retrieve the secret from Secrets Manager, eliminating the need to email secrets to developers or update and redeploy applications after AWS Secrets Manager rotates a secret.
  • Secure and manage secrets centrally. You can store, view, and manage all your secrets. By default, Secrets Manager encrypts these secrets with encryption keys that you own and control. Using fine-grained IAM policies, you can control access to secrets. For example, you can require developers to provide a second factor of authentication when they attempt to retrieve a production database credential. You can also tag secrets to help you discover, organize, and control access to secrets used throughout your organization.
  • Monitor and audit easily. Secrets Manager integrates with AWS logging and monitoring services to enable you to meet your security and compliance requirements. For example, you can audit AWS CloudTrail logs to see when Secrets Manager rotated a secret or configure AWS CloudWatch Events to alert you when an administrator deletes a secret.
  • Pay as you go. Pay for the secrets you store in Secrets Manager and for the use of these secrets; there are no long-term contracts or licensing fees.

Get started with Secrets Manager

Now that you’re familiar with the key features, I’ll show you how to store the credential for a MySQL database hosted on Amazon RDS. To demonstrate how to retrieve and use the secret, I use a Python application running on Amazon EC2 that requires this database credential to access the MySQL instance. Finally, I show how to configure Secrets Manager to rotate this database credential automatically. Let’s get started.

Phase 1: Store a secret in Secrets Manager

  1. Open the Secrets Manager console and select Store a new secret.
     
    Secrets Manager console interface
     
  2. I select Credentials for RDS database because I’m storing credentials for a MySQL database hosted on Amazon RDS. For this example, I store the credentials for the database superuser. I start by securing the superuser because it’s the most powerful database credential and has full access over the database.
     
    Store a new secret interface with Credentials for RDS database selected
     

    Note: For this example, you need permissions to store secrets in Secrets Manager. To grant these permissions, you can use the AWSSecretsManagerReadWriteAccess managed policy. Read the AWS Secrets Manager Documentation for more information about the minimum IAM permissions required to store a secret.

  3. Next, I review the encryption setting and choose to use the default encryption settings. Secrets Manager will encrypt this secret using the Secrets Manager DefaultEncryptionKey in this account. Alternatively, I can choose to encrypt using a customer master key (CMK) that I have stored in AWS KMS.
     
    Select the encryption key interface
     
  4. Next, I view the list of Amazon RDS instances in my account and select the database this credential accesses. For this example, I select the DB instance mysql-rds-database, and then I select Next.
     
    Select the RDS database interface
     
  5. In this step, I specify values for Secret Name and Description. For this example, I use Applications/MyApp/MySQL-RDS-Database as the name and enter a description of this secret, and then select Next.
     
    Secret Name and description interface
     
  6. For the next step, I keep the default setting Disable automatic rotation because my secret is used by my application running on Amazon EC2. I’ll enable rotation after I’ve updated my application (see Phase 2 below) to use Secrets Manager APIs to retrieve secrets. I then select Next.

    Note: If you’re storing a secret that you’re not using in your application, select Enable automatic rotation. See our AWS Secrets Manager getting started guide on rotation for details.

     
    Configure automatic rotation interface
     

  7. Review the information on the next screen and, if everything looks correct, select Store. We’ve now successfully stored a secret in Secrets Manager.
  8. Next, I select See sample code.
     
    The See sample code button
     
  9. Take note of the code samples provided. I will use this code to update my application to retrieve the secret using Secrets Manager APIs.
     
    Python sample code
     

Phase 2: Update an application to retrieve secret from Secrets Manager

Now that I have stored the secret in Secrets Manager, I update my application to retrieve the database credential from Secrets Manager instead of hard-coding this information in a configuration file or source code. For this example, I show how to configure a Python application to retrieve this secret from Secrets Manager.

  1. I connect to my Amazon EC2 instance via Secure Shell (SSH).
  2. Previously, I configured my application to retrieve the database user name and password from the configuration file. Below is the source code for my application.
    import MySQLdb
    import config

    def no_secrets_manager_sample():

        # Get the user name, password, and database connection information from a config file.
        database = config.database
        user_name = config.user_name
        password = config.password

        # Use the user name, password, and database connection information to connect to the database.
        db = MySQLdb.connect(database.endpoint, user_name, password, database.db_name, database.port)

  3. I use the sample code from Phase 1 above and update my application to retrieve the user name and password from Secrets Manager. This code sets up the client and retrieves and decrypts the secret Applications/MyApp/MySQL-RDS-Database. I’ve added comments to the code to make it easier to understand. (A sketch of how to parse and use the retrieved secret follows this list.)

    # Use the code snippet provided by Secrets Manager.
    import boto3
    from botocore.exceptions import ClientError

    def get_secret():
        # Define the secret you want to retrieve.
        secret_name = "Applications/MyApp/MySQL-RDS-Database"
        # Define the Secrets Manager endpoint your code should use.
        endpoint_url = "https://secretsmanager.us-east-1.amazonaws.com"
        region_name = "us-east-1"

        # Set up the client.
        session = boto3.session.Session()
        client = session.client(
            service_name='secretsmanager',
            region_name=region_name,
            endpoint_url=endpoint_url
        )

        # Use the client to retrieve the secret.
        try:
            get_secret_value_response = client.get_secret_value(
                SecretId=secret_name
            )
        # Error handling to make it easier for your code to tolerate faults.
        except ClientError as e:
            if e.response['Error']['Code'] == 'ResourceNotFoundException':
                print("The requested secret " + secret_name + " was not found")
            elif e.response['Error']['Code'] == 'InvalidRequestException':
                print("The request was invalid due to:", e)
            elif e.response['Error']['Code'] == 'InvalidParameterException':
                print("The request had invalid params:", e)
        else:
            # The secret was decrypted using the associated KMS CMK.
            # Depending on whether the secret was a string or binary, one of these fields is populated.
            if 'SecretString' in get_secret_value_response:
                secret = get_secret_value_response['SecretString']
            else:
                binary_secret_data = get_secret_value_response['SecretBinary']

            # Your code goes here.

  4. Applications require permissions to access Secrets Manager. My application runs on Amazon EC2 and uses an IAM role to obtain access to AWS services. I will attach the following policy to my IAM role. This policy uses the GetSecretValue action to grant my application permissions to read the secret from Secrets Manager. This policy also uses the resource element to limit my application to read only the Applications/MyApp/MySQL-RDS-Database secret from Secrets Manager. You can visit the AWS Secrets Manager Documentation to understand the minimum IAM permissions required to retrieve a secret.

    {
      "Version": "2012-10-17",
      "Statement": {
        "Sid": "RetrieveDbCredentialFromSecretsManager",
        "Effect": "Allow",
        "Action": "secretsmanager:GetSecretValue",
        "Resource": "arn:aws:secretsmanager:::secret:Applications/MyApp/MySQL-RDS-Database"
      }
    }
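For completeness, here is a hedged sketch of how the retrieved SecretString could replace the config-file lookup in the original application. Secrets Manager stores RDS credentials as JSON; the field names used below (username, password, host, port, dbname) follow the documented RDS secret structure, but you should verify them against your own secret.

import json
import MySQLdb

# 'secret' is the SecretString returned by get_secret() above.
creds = json.loads(secret)

# Connect using the values managed and rotated by Secrets Manager.
db = MySQLdb.connect(
    host=creds['host'],
    user=creds['username'],
    passwd=creds['password'],
    db=creds['dbname'],
    port=int(creds['port'])
)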

Phase 3: Enable Rotation for Your Secret

Rotating secrets periodically is a security best practice because it reduces the risk of misuse of secrets. Secrets Manager makes it easy to follow this security best practice and offers built-in integrations for rotating credentials for MySQL, PostgreSQL, and Amazon Aurora databases hosted on Amazon RDS. When you enable rotation, Secrets Manager creates a Lambda function and attaches an IAM role to this function to execute rotations on a schedule you define.

Note: Configuring rotation is a privileged action that requires several IAM permissions, and you should only grant this access to trusted individuals. To grant these permissions, you can use the IAMFullAccess managed policy.

Next, I show you how to configure Secrets Manager to rotate the secret Applications/MyApp/MySQL-RDS-Database automatically.

  1. From the Secrets Manager console, I go to the list of secrets and choose the secret I created in the first step Applications/MyApp/MySQL-RDS-Database.
     
    List of secrets in the Secrets Manager console
     
  2. I scroll to Rotation configuration, and then select Edit rotation.
     
    Rotation configuration interface
     
  3. To enable rotation, I select Enable automatic rotation. I then choose how frequently I want Secrets Manager to rotate this secret. For this example, I set the rotation interval to 60 days.
     
    Edit rotation configuration interface
     
  4. Next, Secrets Manager requires permissions to rotate this secret on your behalf. Because I’m storing the superuser database credential, Secrets Manager can use this credential to perform rotations. Therefore, I select Use the secret that I provided in step 1, and then select Next.
     
    Select which secret to use in the Edit rotation configuration interface
     
  5. The banner on the next screen confirms that I have successfully configured rotation and the first rotation is in progress, which enables you to verify that rotation is functioning as expected. Secrets Manager will rotate this credential automatically every 60 days.
     
    Confirmation banner message
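The same configuration can also be applied programmatically. A hedged Boto3 sketch follows; the rotation Lambda function is created for you when you enable rotation in the console, so the ARN shown here is a placeholder.

import boto3

sm = boto3.client('secretsmanager')

# Enable rotation every 60 days using an existing rotation Lambda function.
sm.rotate_secret(
    SecretId='Applications/MyApp/MySQL-RDS-Database',
    RotationLambdaARN='arn:aws:lambda:us-east-1:123456789012:function:SecretsManagerRotation',  # placeholder
    RotationRules={'AutomaticallyAfterDays': 60}
)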
     

Summary

I introduced AWS Secrets Manager, explained the key benefits, and showed you how to help meet your compliance requirements by configuring AWS Secrets Manager to rotate database credentials automatically on your behalf. Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and ongoing maintenance costs of operating your own secrets management infrastructure. To get started, visit the Secrets Manager console. To learn more, visit the Secrets Manager documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.

AWS Key Management Service now offers FIPS 140-2 validated cryptographic modules enabling easier adoption of the service for regulated workloads

Post Syndicated from Sreekumar Pisharody original https://aws.amazon.com/blogs/security/aws-key-management-service-now-offers-fips-140-2-validated-cryptographic-modules-enabling-easier-adoption-of-the-service-for-regulated-workloads/

AWS Key Management Service (KMS) now uses FIPS 140-2 validated hardware security modules (HSM) and supports FIPS 140-2 validated endpoints, which provide independent assurances about the confidentiality and integrity of your keys. Having additional third-party assurances about the keys you manage in AWS KMS can make it easier to use the service for regulated workloads.

The process of gaining FIPS 140-2 validation is rigorous. First, AWS KMS HSMs were tested by an independent lab; those results were further reviewed by the Cryptographic Module Validation Program run by NIST. You can view the FIPS 140-2 certificate of the AWS Key Management Service HSM to get more details.

AWS KMS HSMs are designed so that no one, not even AWS employees, can retrieve your plaintext keys. The service uses the FIPS 140-2 validated HSMs to protect your keys when you request the service to create keys on your behalf or when you import them. Your plaintext keys are never written to disk and are only used in volatile memory of the HSMs while performing your requested cryptographic operation. Furthermore, AWS KMS keys are never transmitted outside the AWS Regions in which they were created. And HSM firmware updates are controlled by multi-party access that is audited and reviewed by an independent group within AWS.

AWS KMS HSMs are validated at level 2 overall and at level 3 in the following areas:

  • Cryptographic Module Specification
  • Roles, Services, and Authentication
  • Physical Security
  • Design Assurance

You can also make AWS KMS requests to API endpoints that terminate TLS sessions using a FIPS 140-2 validated cryptographic software module. To do so, connect to the unique FIPS 140-2 validated HTTPS endpoints when making AWS KMS requests from your applications. AWS KMS FIPS 140-2 validated HTTPS endpoints are powered by the OpenSSL FIPS Object Module. FIPS 140-2 validated API endpoints are available in all commercial AWS Regions where AWS KMS is available.
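For example, a Boto3 client can be pointed at a FIPS endpoint explicitly. The endpoint shown follows the documented kms-fips.<region>.amazonaws.com naming; verify the exact hostname for your Region before relying on it.

import boto3

# Route KMS API calls through the FIPS 140-2 validated HTTPS endpoint.
kms = boto3.client(
    'kms',
    region_name='us-east-1',
    endpoint_url='https://kms-fips.us-east-1.amazonaws.com'
)

print(kms.list_keys(Limit=10)['Keys'])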

Best Practices for Running Apache Kafka on AWS

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-kafka-on-aws/

This post was written in partnership with Intuit to share learnings, best practices, and recommendations for running an Apache Kafka cluster on AWS. Thanks to Vaishak Suresh and his colleagues at Intuit for their contribution and support.

Intuit, in their own words: Intuit, a leading enterprise customer for AWS, is a creator of business and financial management solutions. For more information on how Intuit partners with AWS, see our previous blog post, Real-time Stream Processing Using Apache Spark Streaming and Apache Kafka on AWS. Apache Kafka is an open-source, distributed streaming platform that enables you to build real-time streaming applications.

The best practices described in this post are based on our experience in running and operating large-scale Kafka clusters on AWS for more than two years. Our intent for this post is to help AWS customers who are currently running Kafka on AWS, and also customers who are considering migrating on-premises Kafka deployments to AWS.

AWS offers Amazon Kinesis Data Streams, a Kafka alternative that is fully managed.

Running your Kafka deployment on Amazon EC2 provides a high performance, scalable solution for ingesting streaming data. AWS offers many different instance types and storage option combinations for Kafka deployments. However, given the number of possible deployment topologies, it’s not always trivial to select the most appropriate strategy suitable for your use case.

In this blog post, we cover the following aspects of running Kafka clusters on AWS:

  • Deployment considerations and patterns
  • Storage options
  • Instance types
  • Networking
  • Upgrades
  • Performance tuning
  • Monitoring
  • Security
  • Backup and restore

Note: While implementing Kafka clusters in a production environment, make sure also to consider factors like your number of messages, message size, monitoring, failure handling, and any operational issues.

Deployment considerations and patterns

In this section, we discuss various deployment options available for Kafka on AWS, along with pros and cons of each option. A successful deployment starts with thoughtful consideration of these options. Considering availability, consistency, and operational overhead of the deployment helps when choosing the right option.

Single AWS Region, Three Availability Zones, All Active

One typical deployment pattern (all active) is in a single AWS Region with three Availability Zones (AZs). One Kafka cluster is deployed in each AZ along with Apache ZooKeeper and Kafka producer and consumer instances as shown in the illustration following.

In this pattern, this is the Kafka cluster deployment:

  • Kafka producers and Kafka clusters are deployed in each AZ.
  • Data is distributed evenly across the three Kafka clusters by using Elastic Load Balancing.
  • Kafka consumers aggregate data from all three Kafka clusters.

Kafka cluster failover occurs this way:

  • Mark down all Kafka producers
  • Stop consumers
  • Debug and restack Kafka
  • Restart consumers
  • Restart Kafka producers

Following are the pros and cons of this pattern.

Pros:

  • Highly available
  • Can sustain the failure of two AZs
  • No message loss during failover
  • Simple deployment

Cons:

  • Very high operational overhead:
    • All changes need to be deployed three times, once for each Kafka cluster
    • Maintaining and monitoring three Kafka clusters
    • Maintaining and monitoring three consumer clusters
A restart is required for patching and upgrading brokers in a Kafka cluster. In this approach, a rolling upgrade is done separately for each cluster.

Single Region, Three Availability Zones, Active-Standby

Another typical deployment pattern (active-standby) is in a single AWS Region with a single Kafka cluster and Kafka brokers and Zookeepers distributed across three AZs. Another similar Kafka cluster acts as a standby as shown in the illustration following. You can use Kafka mirroring with MirrorMaker to replicate messages between any two clusters.

In this pattern, this is the Kafka cluster deployment:

  • Kafka producers are deployed on all three AZs.
  • Only one Kafka cluster is deployed across three AZs (active).
  • ZooKeeper instances are deployed on each AZ.
  • Brokers are spread evenly across all three AZs.
  • Kafka consumers can be deployed across all three AZs.
  • Standby Kafka producers and a Multi-AZ Kafka cluster are part of the deployment.

Kafka cluster failover occurs this way:

  • Switch traffic to standby Kafka producers cluster and Kafka cluster.
  • Restart consumers to consume from standby Kafka cluster.

Following are the pros and cons of this pattern.

Pros:

  • Less operational overhead when compared to the first option
  • Only one Kafka cluster to manage and consume data from
  • Can handle single AZ failures without activating a standby Kafka cluster

Cons:

  • Added latency due to cross-AZ data transfer among Kafka brokers
  • For Kafka versions before 0.10, replicas for topic partitions have to be assigned so they’re distributed to the brokers on different AZs (rack-awareness)
  • The cluster can become unavailable in case of a network glitch, where ZooKeeper does not see Kafka brokers
  • Possibility of in-transit message loss during failover

Intuit recommends using a single Kafka cluster in one AWS Region, with brokers distributed across three AZs (single region, three AZs). This approach offers stronger fault tolerance, because the failure of a single AZ won’t cause Kafka downtime.

Storage options

There are two options for file storage in Amazon EC2:

  • Ephemeral storage (instance store)
  • Amazon Elastic Block Store (Amazon EBS)

Ephemeral storage is local to the Amazon EC2 instance. It can provide high IOPS based on the instance type. On the other hand, Amazon EBS volumes offer higher resiliency, and you can configure IOPS based on your storage needs. EBS volumes also offer some distinct advantages in terms of recovery time. Your choice of storage is closely related to the type of workload supported by your Kafka cluster.

Kafka provides built-in fault tolerance by replicating data partitions across a configurable number of instances. If a broker fails, you can recover it by fetching all the data from other brokers in the cluster that host the other replicas. Depending on the size of the data transfer, this can affect the recovery process and increase network traffic, which in turn can affect the cluster’s performance.

The following table contrasts the benefits of using an instance store versus using EBS for storage.

Instance store:

  • Instance storage is recommended for large- and medium-sized Kafka clusters. For a large cluster, read/write traffic is distributed across a high number of brokers, so the loss of a broker has less of an impact. However, for smaller clusters, a quick recovery for the failed node is important, and recovering a failed broker takes longer and requires more network traffic.
  • Storage-optimized instances like h1, i3, and d2 are an ideal choice for distributed applications like Kafka.

EBS:

  • The primary advantage of using EBS in a Kafka deployment is that it significantly reduces data-transfer traffic when a broker fails or must be replaced. The replacement broker joins the cluster much faster.
  • Data stored on EBS is persisted in case of an instance failure or termination. The broker’s data stored on an EBS volume remains intact, and you can mount the EBS volume to a new EC2 instance. Most of the replicated data for the replacement broker is already available in the EBS volume and need not be copied over the network from another broker. Only the changes made after the original broker failure need to be transferred across the network, which makes this process much faster.
Intuit chose EBS because of their frequent instance restacking requirements and also other benefits provided by EBS.

Generally, Kafka deployments use a replication factor of three. EBS offers replication within the service, so Intuit chose a replication factor of two instead of three.

Instance types

The choice of instance types is generally driven by the type of storage required for your streaming applications on a Kafka cluster. If your application requires ephemeral storage, h1, i3, and d2 instances are your best option.

Intuit used r3.xlarge instances for their brokers and r3.large for ZooKeeper, with ST1 (throughput optimized HDD) EBS for their Kafka cluster.

Here are sample benchmark numbers from Intuit tests:

Configuration:

  • r3.xlarge
  • ST1 EBS
  • 12 brokers
  • 12 partitions

Broker bytes: aggregate 346.9 MB/s

If you need EBS storage, then AWS has a newer-generation r4 instance. The r4 instance is superior to r3 in many ways:

  • It has a faster processor (Broadwell).
  • EBS is optimized by default.
  • It features networking based on Elastic Network Adapter (ENA), with up to 10 Gbps on smaller sizes.
  • It costs 20 percent less than r3.

Note: It’s always best practice to check for the latest changes in instance types.

Networking

The network plays a very important role in a distributed system like Kafka. A fast and reliable network ensures that nodes can communicate with each other easily. The available network throughput controls the maximum amount of traffic that Kafka can handle. Network throughput, combined with disk storage, is often the governing factor for cluster sizing.

If you expect your cluster to receive high read/write traffic, select an instance type that offers 10-Gb/s performance.

In addition, choose an option that keeps interbroker network traffic on the private subnet while still allowing clients to connect to the brokers. Communication between brokers and clients uses the same network interface and port. For more details, see the documentation about IP addressing for EC2 instances.

If you are deploying in more than one AWS Region, you can connect the two VPCs in the two AWS Regions using cross-region VPC peering. However, be aware of the networking costs associated with cross-AZ deployments.

Upgrades

Kafka has a history of not being backward compatible, but its support of backward compatibility is getting better. During a Kafka upgrade, you should keep your producer and consumer clients on a version equal to or lower than the version you are upgrading from. After the upgrade is finished, you can start using a new protocol version and any new features it supports. There are three upgrade approaches available, discussed following.

Rolling or in-place upgrade

In a rolling or in-place upgrade scenario, upgrade one Kafka broker at a time. Take into consideration the recommendations for doing rolling restarts to avoid downtime for end users.

Downtime upgrade

If you can afford the downtime, you can take your entire cluster down, upgrade each Kafka broker, and then restart the cluster.

Blue/green upgrade

Intuit followed the blue/green deployment model for their workloads, as described following.

If you can afford to create a separate Kafka cluster and upgrade it, we highly recommend the blue/green upgrade scenario. In this scenario, we recommend that you keep your clusters up-to-date with the latest Kafka version. For additional details on Kafka version upgrades, see the Kafka upgrade documentation.

The following illustration shows a blue/green upgrade.

In this scenario, the upgrade plan works like this:

  • Create a new Kafka cluster on AWS.
  • Create a new Kafka producers stack to point to the new Kafka cluster.
  • Create topics on the new Kafka cluster.
  • Test the green deployment end to end (sanity check).
  • Using Amazon Route 53, switch the Kafka producers stack on AWS to point to the new green Kafka environment that you have created.

The roll-back plan works like this:

  • Switch Amazon Route 53 to the old Kafka producers stack on AWS to point to the old Kafka environment.

For additional details on blue/green deployment architecture using Kafka, see the re:Invent presentation Leveraging the Cloud with a Blue-Green Deployment Architecture.

Performance tuning

You can tune Kafka performance in multiple dimensions. Following are some best practices for performance tuning.

 These are some general performance tuning techniques:

  • If throughput is less than network capacity, try the following:
    • Add more threads
    • Increase batch size
    • Add more producer instances
    • Add more partitions
  • To improve latency when acks = -1, increase your num.replica.fetchers value.
  • For cross-AZ data transfer, tune your buffer settings for sockets and for OS TCP.
  • Make sure that num.io.threads is greater than the number of disks dedicated for Kafka.
  • Adjust num.network.threads based on the number of producers plus the number of consumers plus the replication factor.
  • Your message size affects your network bandwidth. To get higher performance from a Kafka cluster, select an instance type that offers 10 Gb/s performance.

For Java and JVM tuning, try the following:

  • Minimize GC pauses by using the Oracle JDK, which uses the new G1 garbage-first collector.
  • Try to keep the Kafka heap size below 4 GB.

Monitoring

Knowing whether a Kafka cluster is working correctly in a production environment is critical. Sometimes, just knowing that the cluster is up is enough, but Kafka applications have many moving parts to monitor. In fact, it can easily become confusing to understand what’s important to watch and what you can set aside. Items to monitor range from simple metrics about the overall rate of traffic, to producers, consumers, brokers, controller, ZooKeeper, topics, partitions, messages, and so on.

For monitoring, Intuit used several tools, including New Relic, Wavefront, Amazon CloudWatch, and AWS CloudTrail. Our recommended monitoring approach follows.

For system metrics, we recommend that you monitor:

  • CPU load
  • Network metrics
  • File handle usage
  • Disk space
  • Disk I/O performance
  • Garbage collection
  • ZooKeeper

For producers, we recommend that you monitor:

  • Batch-size-avg
  • Compression-rate-avg
  • Waiting-threads
  • Buffer-available-bytes
  • Record-queue-time-max
  • Record-send-rate
  • Records-per-request-avg

For consumers, we recommend that you monitor:

  • Fetch-rate
  • Fetch-latency-avg
  • Bytes-consumed-rate
  • Records-consumed-rate
  • Records-lag-max

Security

Like most distributed systems, Kafka provides the mechanisms to transfer data with relatively high security across the components involved. Depending on your setup, security might involve different services such as encryption, Kerberos, Transport Layer Security (TLS) certificates, and advanced access control list (ACL) setup in brokers and ZooKeeper. The following tells you more about the Intuit approach. For details on Kafka security not covered in this section, see the Kafka documentation.

Encryption at rest

For EBS-backed EC2 instances, you can enable encryption at rest by using Amazon EBS volumes with encryption enabled. Amazon EBS uses AWS Key Management Service (AWS KMS) for encryption. For more details, see Amazon EBS Encryption in the EBS documentation. For instance store–backed EC2 instances, you can enable encryption at rest by using Amazon EC2 instance store encryption.
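As a sketch, creating a KMS-encrypted EBS volume for a Kafka broker with Boto3 might look like the following; omitting KmsKeyId falls back to the default EBS CMK, and the AZ, size, and key ARN here are placeholder assumptions.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Create an ST1 volume encrypted at rest for a Kafka broker's log directory.
ec2.create_volume(
    AvailabilityZone='us-east-1a',
    Size=500,
    VolumeType='st1',
    Encrypted=True,
    KmsKeyId='arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef'  # placeholder
)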

Encryption in transit

Kafka uses TLS for client and internode communications.

Authentication

Authentication of connections to brokers, from clients (producers and consumers) as well as from other brokers and tools, uses either Secure Sockets Layer (SSL) or Simple Authentication and Security Layer (SASL).

Kafka supports Kerberos authentication. If you already have a Kerberos server, you can add Kafka to your current configuration.

Authorization

In Kafka, authorization is pluggable and integration with external authorization services is supported.

Backup and restore

The type of storage used in your deployment dictates your backup and restore strategy.

The best way to back up a Kafka cluster based on instance storage is to set up a second cluster and replicate messages using MirrorMaker. Kafka’s mirroring feature makes it possible to maintain a replica of an existing Kafka cluster. Depending on your setup and requirements, your backup cluster might be in the same AWS Region as your main cluster or in a different one.

For EBS-based deployments, you can enable automatic snapshots of EBS volumes to back up volumes. You can easily create new EBS volumes from these snapshots to restore. We recommend storing backup files in Amazon S3.
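A hedged Boto3 sketch of snapshotting a broker's EBS volume follows; the volume ID is a placeholder, and in practice you would automate this with a scheduled job or a lifecycle management service rather than ad hoc calls.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Snapshot the broker's data volume; snapshots are stored durably in Amazon S3.
snap = ec2.create_snapshot(
    VolumeId='vol-0123456789abcdef0',  # placeholder volume ID
    Description='Kafka broker data volume backup'
)
print(snap['SnapshotId'])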

For more information on how to back up in Kafka, see the Kafka documentation.

Conclusion

In this post, we discussed several patterns for running Kafka in the AWS Cloud. AWS also provides an alternative managed solution with Amazon Kinesis Data Streams: there are no servers to manage or scaling cliffs to worry about, and you can scale the size of your streaming pipeline in seconds without downtime. Data replication across Availability Zones is automatic, you benefit from security out of the box, and Kinesis Data Streams is tightly integrated with a wide variety of AWS services, such as Lambda, Amazon Redshift, and Amazon Elasticsearch Service. It also supports open-source frameworks like Storm, Spark, and Flink. To move data between Kafka and Kinesis, you can refer to the Kafka-Kinesis Connector.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Implement Serverless Log Analytics Using Amazon Kinesis Analytics and Real-time Clickstream Anomaly Detection with Amazon Kinesis Analytics.


About the Author

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable big data, machine learning, artificial intelligence, and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies, such as advanced edge computing and machine learning at the edge. In his spare time, he enjoys spending time with his family.

Best Practices for Running Apache Cassandra on Amazon EC2

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-cassandra-on-amazon-ec2/

Apache Cassandra is a commonly used, high performance NoSQL database. AWS customers that currently maintain Cassandra on-premises may want to take advantage of the scalability, reliability, security, and economic benefits of running Cassandra on Amazon EC2.

Amazon EC2 and Amazon Elastic Block Store (Amazon EBS) provide secure, resizable compute capacity and storage in the AWS Cloud. Combined, they let you deploy Cassandra and scale capacity according to your requirements. Given the number of possible deployment topologies, it’s not always trivial to select the most appropriate strategy suitable for your use case.

In this post, we outline three Cassandra deployment options, as well as provide guidance about determining the best practices for your use case in the following areas:

  • Cassandra resource overview
  • Deployment considerations
  • Storage options
  • Networking
  • High availability and resiliency
  • Maintenance
  • Security

Before we jump into best practices for running Cassandra on AWS, we should mention that we have many customers who decided to use DynamoDB instead of managing their own Cassandra cluster. DynamoDB is fully managed and serverless, and it provides multi-master cross-Region replication, encryption at rest, and managed backup and restore. Integration with AWS Identity and Access Management (IAM) enables DynamoDB customers to implement fine-grained access control for their data security needs.

Several customers who have been using large Cassandra clusters for many years have moved to DynamoDB to eliminate the complications of administering Cassandra clusters and maintaining high availability and durability themselves. Gumgum.com is one customer who migrated to DynamoDB and observed significant savings. For more information, see Moving to Amazon DynamoDB from Hosted Cassandra: A Leap Towards 60% Cost Saving per Year.

AWS provides options, so you’re covered whether you want to run your own NoSQL Cassandra database, or move to a fully managed, serverless DynamoDB database.

Cassandra resource overview

Here’s a short introduction to standard Cassandra resources and how they are implemented with AWS infrastructure. If you’re already familiar with Cassandra or AWS deployments, this can serve as a refresher.

Cluster

  • Cassandra: A single Cassandra deployment. This typically consists of multiple physical locations, keyspaces, and physical servers.
  • AWS: A logical deployment construct in AWS that maps to an AWS CloudFormation StackSet, which consists of one or many CloudFormation stacks to deploy Cassandra.

Datacenter

  • Cassandra: A group of nodes configured as a single replication group.
  • AWS: A logical deployment construct in AWS. A datacenter is deployed with a single CloudFormation stack consisting of Amazon EC2 instances, networking, storage, and security resources.

Rack

  • Cassandra: A collection of servers. A datacenter consists of at least one rack, and Cassandra tries to place the replicas on different racks.
  • AWS: A single Availability Zone.

Server/node

  • Cassandra: A physical or virtual machine running Cassandra software.
  • AWS: An EC2 instance.

Token

  • Cassandra: Conceptually, the data managed by a cluster is represented as a ring. The ring is divided into ranges equal to the number of nodes, and each node is responsible for one or more ranges of the data. Each node is assigned a token, which is essentially a random number from the range. The token value determines the node’s position in the ring and its range of data.
  • AWS: Managed within Cassandra.

Virtual node (vnode)

  • Cassandra: Responsible for storing a range of data. Each vnode receives one token in the ring. By default, each node is assigned 256 tokens, which are uniformly distributed across all servers in the Cassandra datacenter.
  • AWS: Managed within Cassandra.

Replication factor

  • Cassandra: The total number of replicas across the cluster.
  • AWS: Managed within Cassandra.

Deployment considerations

One of the many benefits of deploying Cassandra on Amazon EC2 is that you can automate many deployment tasks. In addition, AWS includes services, such as CloudFormation, that allow you to describe and provision all your infrastructure resources in your cloud environment.

We recommend orchestrating each Cassandra ring with one CloudFormation template. If you are deploying in multiple AWS Regions, you can use a CloudFormation StackSet to manage those stacks. All the maintenance actions (scaling, upgrading, and backing up) should be scripted with an AWS SDK. These may live as standalone AWS Lambda functions that can be invoked on demand during maintenance.
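
As a minimal sketch of that orchestration, creating one stack for one ring with boto3 might look like the following. The stack name, template URL, and parameter names are hypothetical stand-ins for your own ring template.

import boto3

cfn = boto3.client('cloudformation', region_name='us-east-1')

# One CloudFormation stack per Cassandra ring; the template and its
# parameters are assumptions standing in for your own template.
cfn.create_stack(
    StackName='cassandra-ring-us-east-1',
    TemplateURL='https://s3.amazonaws.com/my-bucket/cassandra-ring.yaml',
    Parameters=[
        {'ParameterKey': 'NodeCount', 'ParameterValue': '6'},
        {'ParameterKey': 'InstanceType', 'ParameterValue': 'i3.2xlarge'},
    ],
    Capabilities=['CAPABILITY_IAM'],  # if the template creates IAM resources
)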

You can get started by following the Cassandra Quick Start deployment guide. Keep in mind that this guide does not address the requirements to operate a production deployment and should be used only for learning more about Cassandra.

Deployment patterns

In this section, we discuss various deployment options available for Cassandra in Amazon EC2. A successful deployment starts with thoughtful consideration of these options. Consider the amount of data, network environment, throughput, and availability.

  • Single AWS Region, 3 Availability Zones
  • Active-active, multi-Region
  • Active-standby, multi-Region

Single AWS Region, 3 Availability Zones

In this pattern, you deploy the Cassandra cluster in one AWS Region and three Availability Zones. There is only one ring in the cluster. By using EC2 instances in three zones, you ensure that the replicas are distributed uniformly in all zones.

To ensure the even distribution of data across all Availability Zones, we recommend that you distribute the EC2 instances evenly in all three Availability Zones. The number of EC2 instances in the cluster is a multiple of three (the replication factor).

This pattern is suitable in situations where the application is deployed in one Region or where deployments in different Regions should be constrained to the same Region because of data privacy or other legal requirements.

Pros:

  • Highly available; can sustain failure of one Availability Zone.
  • Simple deployment.

Cons:

  • Does not protect against a situation in which many of the resources in a Region are experiencing intermittent failures.

Active-active, multi-Region

In this pattern, you deploy two rings in two different Regions and link them. The VPCs in the two Regions are peered so that data can be replicated between two rings.

We recommend that the two rings in the two Regions be identical in nature, having the same number of nodes, instance types, and storage configuration.

This pattern is most suitable when the applications using the Cassandra cluster are deployed in more than one Region.

Pros:

  • No data loss during failover.
  • Highly available; can sustain failures when many of the resources in a Region are experiencing intermittent failures.
  • Read/write traffic can be localized to the Region closest to the user, for lower latency and higher performance.

Cons:

  • High operational overhead.
  • The second Region effectively doubles the cost.

Active-standby, multi-Region

In this pattern, you deploy two rings in two different Regions and link them. The VPCs in the two Regions are peered so that data can be replicated between two rings.

However, the second Region does not receive traffic from the applications. It only functions as a secondary location for disaster recovery reasons. If the primary Region is not available, the second Region receives traffic.

We recommend that the two rings in the two Regions be identical in nature, having the same number of nodes, instance types, and storage configuration.

This pattern is most suitable when the applications using the Cassandra cluster require a low recovery point objective (RPO) and recovery time objective (RTO).

Pros:

  • No data loss during failover.
  • Highly available; can sustain failure or partitioning of one whole Region.

Cons:

  • High operational overhead.
  • Higher latency for writes, because replication across Regions is eventually consistent.
  • The second Region effectively doubles the cost.

Storage options

In on-premises deployments, Cassandra uses local disks to store data. There are two analogous storage options for EC2 instances:

  • Instance store (ephemeral storage that is local to the instance)
  • Amazon EBS

Your choice of storage is closely related to the type of workload supported by the Cassandra cluster. Instance store works best for most general purpose Cassandra deployments. However, in certain read-heavy clusters, Amazon EBS is a better choice.

The choice of instance type is generally driven by the type of storage:

  • If ephemeral storage is required for your application, a storage-optimized (I3) instance is the best option.
  • If your workload requires Amazon EBS, it is best to go with compute-optimized (C5) instances.
  • Burstable instance types (T2) don’t offer good performance for Cassandra deployments.

Instance store

Ephemeral storage is local to the EC2 instance and may provide high input/output operations per second (IOPS), depending on the instance type. An SSD-based instance store can support up to 3.3 million IOPS on I3 instances. This high performance makes it an ideal choice for transactional or write-intensive applications such as Cassandra.

In general, instance storage is recommended for transactional, large, and medium-size Cassandra clusters. For a large cluster, read/write traffic is distributed across a higher number of nodes, so the loss of one node has less of an impact. However, for smaller clusters, a quick recovery for the failed node is important.

As an example, for a cluster with 100 nodes, the loss of 1 node means a 3.33% loss of capacity (with a replication factor of 3). Similarly, for a cluster with 10 nodes, the loss of 1 node means a 33% loss of capacity (with a replication factor of 3).

IOPS (translates to higher query performance)

  • Ephemeral storage: Up to 3.3M on I3.
  • Amazon EBS: 80K per instance; 10K per gp2 volume; 32K per io1 volume.
  • Comments: Higher IOPS results in higher query performance on each host. However, Cassandra scales well horizontally, so in general we recommend scaling horizontally first and then scaling vertically to mitigate specific issues. (Note: 3.3M IOPS is observed with 100% random reads with a 4-KB block size on Amazon Linux.)

AWS instance types

  • Ephemeral storage: I3.
  • Amazon EBS: Compute optimized (C5).
  • Comments: Being able to choose between different instance types is an advantage in terms of CPU, memory, and so on, for horizontal and vertical scaling.

Backup/recovery

  • Ephemeral storage: Custom.
  • Amazon EBS: Basic building blocks are available from AWS.
  • Comments: Amazon EBS offers a distinct advantage here; it takes only a small engineering effort to establish a backup/restore strategy. (a) In case of an instance failure, the EBS volumes from the failing instance are attached to a new instance. (b) In case of an EBS volume failure, the data is restored by creating a new EBS volume from the last snapshot.

Amazon EBS

EBS volumes offer higher resiliency, and IOPS can be configured based on your storage needs. EBS volumes also offer distinct advantages in terms of recovery time. They can support up to 32K IOPS per volume and up to 80K IOPS per instance in a RAID configuration, and they have an annualized failure rate (AFR) of 0.1–0.2%, which makes EBS volumes 20 times more reliable than typical commodity disk drives.

The primary advantage of using Amazon EBS in a Cassandra deployment is that it reduces data-transfer traffic significantly when a node fails or must be replaced. The replacement node joins the cluster much faster. However, Amazon EBS could be more expensive, depending on your data storage needs.

Cassandra has built-in fault tolerance: it replicates data to partitions across a configurable number of nodes. It can not only withstand node failures, but also recover from them by copying data from other replicas onto a new node. Depending on your application, this could mean copying tens of gigabytes of data, which adds delay to the recovery process, increases network traffic, and could possibly impact the performance of the Cassandra cluster during recovery.

Data stored on Amazon EBS is persisted in case of an instance failure or termination. The node’s data stored on an EBS volume remains intact and the EBS volume can be mounted to a new EC2 instance. Most of the replicated data for the replacement node is already available in the EBS volume and won’t need to be copied over the network from another node. Only the changes made after the original node failed need to be transferred across the network. That makes this process much faster.

EBS volumes are snapshotted periodically. So, if a volume fails, a new volume can be created from the last known good snapshot and attached to a new instance. This is faster than creating a new volume and copying all the data to it.

Most Cassandra deployments use a replication factor of three. However, Amazon EBS does its own replication under the covers for fault tolerance. In practice, EBS volumes are about 20 times more reliable than typical disk drives. So, it is possible to go with a replication factor of two. This not only saves cost, but also enables deployments in a Region that has only two Availability Zones.

EBS volumes are recommended for read-heavy, small clusters (fewer nodes) that require storage of a large amount of data. Keep in mind that Amazon EBS provisioned IOPS can get expensive. General purpose EBS volumes work best when sized for the required performance.

Networking

If your cluster is expected to receive high read/write traffic, select an instance type that offers 10 Gb/s performance. As an example, i3.8xlarge and c5.9xlarge both offer 10 Gb/s networking performance. A smaller instance type in the same family leads to relatively lower networking throughput.

Cassandra generates a universally unique identifier (UUID) for each node, based on the IP address of the instance. This UUID is used for distributing vnodes on the ring.

In an AWS deployment, an IP address is assigned automatically when an EC2 instance is created, so a replacement instance receives a new IP address. With the new IP address, the data distribution changes and the whole ring has to be rebalanced. This is not desirable.

To preserve the assigned IP address, use a secondary elastic network interface with a fixed IP address. Before swapping an EC2 instance with a new one, detach the secondary network interface from the old instance and attach it to the new one. This way, the UUID remains same and there is no change in the way that data is distributed in the cluster.
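
A boto3 sketch of that swap might look like the following; the interface, attachment, and instance IDs are placeholders:

import boto3

ec2 = boto3.client('ec2')

ENI_ID = 'eni-0abc123def4567890'          # placeholder secondary interface
OLD_ATTACHMENT_ID = 'eni-attach-0aaa111'  # its attachment on the old node

# Detach the secondary interface from the node being replaced.
ec2.detach_network_interface(AttachmentId=OLD_ATTACHMENT_ID)
ec2.get_waiter('network_interface_available').wait(NetworkInterfaceIds=[ENI_ID])

# Attach it to the replacement instance. The fixed IP address, and
# therefore the node's UUID, moves with the interface.
ec2.attach_network_interface(
    NetworkInterfaceId=ENI_ID,
    InstanceId='i-0new0000000000000',     # placeholder replacement instance
    DeviceIndex=1,                        # secondary interface slot
)

The same interface swap underpins the rolling upgrade process described later in this post.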

If you are deploying in more than one Region, you can connect the two VPCs in the two Regions using cross-region VPC peering.

High availability and resiliency

Cassandra is designed to be fault-tolerant and highly available during multiple node failures. In the patterns described earlier in this post, you deploy Cassandra to three Availability Zones with a replication factor of three. Even though it limits the AWS Region choices to the Regions with three or more Availability Zones, it offers protection for the cases of one-zone failure and network partitioning within a single Region. The multi-Region deployments described earlier in this post protect when many of the resources in a Region are experiencing intermittent failure.

Resiliency is ensured through infrastructure automation. The deployment patterns all require a quick replacement of the failing nodes. In the case of a regionwide failure, when you deploy with the multi-Region option, traffic can be directed to the other active Region while the infrastructure is recovering in the failing Region. In the case of unforeseen data corruption, the standby cluster can be restored with point-in-time backups stored in Amazon S3.

Maintenance

In this section, we look at ways to ensure that your Cassandra cluster is healthy:

  • Scaling
  • Upgrades
  • Backup and restore

Scaling

Cassandra is horizontally scaled by adding more instances to the ring. We recommend doubling the number of nodes in a cluster to scale up in one scale operation. This leaves the data homogeneously distributed across Availability Zones. Similarly, when scaling down, it’s best to halve the number of instances to keep the data homogeneously distributed.

Cassandra is vertically scaled by increasing the compute power of each node. Larger instance types have proportionally bigger memory. Use deployment automation to swap instances for bigger instances without downtime or data loss.

Upgrades

All three types of upgrades (Cassandra, operating system patching, and instance type changes) follow the same rolling upgrade pattern.

In this process, you start a new EC2 instance and install software and patches on it. Thereafter, remove one node from the ring (for more information, see Cassandra cluster Rolling upgrade), detach the secondary network interface from that node’s EC2 instance, and attach it to the new EC2 instance. Restart the Cassandra service and wait for it to sync. Repeat this process for all nodes in the cluster.

Backup and restore

Your backup and restore strategy depends on the type of storage used in the deployment. Cassandra supports snapshots and incremental backups. When using instance store, a file-based backup tool works best. Customers use rsync or other third-party products to copy data backups from the instance to long-term storage. For more information, see Backing up and restoring data in the DataStax documentation. This process has to be repeated for all instances in the cluster for a complete backup. These backup files are copied back to new instances to restore. We recommend using Amazon S3 to durably store backup files for long-term storage.
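
As an illustration of the instance store approach, the sketch below takes a nodetool snapshot on one node and copies the resulting files to S3 with boto3. The keyspace, data directory, bucket, and snapshot tag are all assumptions for your environment, and the script must run on every node for a complete backup.

import subprocess
from pathlib import Path
import boto3

KEYSPACE = 'my_keyspace'                    # placeholder keyspace
DATA_DIR = Path('/var/lib/cassandra/data')  # default data directory; adjust as needed
BUCKET = 'my-cassandra-backups'             # hypothetical S3 bucket

# Take a point-in-time snapshot on this node.
subprocess.run(['nodetool', 'snapshot', '-t', 'nightly', KEYSPACE], check=True)

# Copy the snapshot files to S3 for long-term storage.
s3 = boto3.client('s3')
for path in (DATA_DIR / KEYSPACE).glob('*/snapshots/nightly/*'):
    if path.is_file():
        s3.upload_file(str(path), BUCKET, f'{KEYSPACE}/{path.relative_to(DATA_DIR)}')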

For Amazon EBS–based deployments, you can enable automated snapshots of EBS volumes to back up data. New EBS volumes can easily be created from these snapshots for restoration.

Security

We recommend that you think about security in all aspects of deployment. The first step is to ensure that the data is encrypted at rest and in transit. The second step is to restrict access to unauthorized users. For more information about security, see the Cassandra documentation.

Encryption at rest

Encryption at rest can be achieved by using encrypted EBS volumes. Amazon EBS uses AWS KMS for encryption. For more information, see Amazon EBS Encryption.

Instance store–based deployments require using an encrypted file system or an AWS partner solution. If you are using DataStax Enterprise, it supports transparent data encryption.

Encryption in transit

Cassandra uses Transport Layer Security (TLS) for client and internode communications.
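
On the client side, a TLS connection with the DataStax Python driver might look like this sketch; it assumes driver version 3.17 or later (for the ssl_context parameter), and the contact point and CA path are placeholders.

import ssl
from cassandra.cluster import Cluster

# Trust the CA that signed the cluster's certificates (placeholder path).
ssl_ctx = ssl.create_default_context(cafile='/etc/cassandra/ssl/ca-cert.pem')
ssl_ctx.check_hostname = False  # node certs often lack hostnames; tighten for production

cluster = Cluster(['10.0.1.10'], ssl_context=ssl_ctx)  # placeholder contact point
session = cluster.connect()
print(session.execute('SELECT release_version FROM system.local').one())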

Authentication

The authentication mechanism is pluggable, which means that you can easily swap out one authentication method for another. You can also provide your own method of authenticating to Cassandra, such as a Kerberos ticket, or store passwords in a different location, such as an LDAP directory.

Authorization

The authorizer that’s plugged in by default is org.apache.cassandra.auth.AllowAllAuthorizer, which performs no authorization checks. Cassandra also provides a role-based access control (RBAC) capability, which allows you to create roles and assign permissions to these roles.
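
For example, with password authentication and the CassandraAuthorizer enabled in cassandra.yaml, creating a role and granting it permissions through the Python driver might look like this sketch; the credentials, contact point, and role and keyspace names are placeholders.

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# Assumes PasswordAuthenticator and CassandraAuthorizer are configured.
auth = PlainTextAuthProvider(username='cassandra', password='cassandra')
cluster = Cluster(['10.0.1.10'], auth_provider=auth)  # placeholder contact point
session = cluster.connect()

# Create a role and grant it read-only access to one keyspace.
session.execute(
    "CREATE ROLE IF NOT EXISTS analyst WITH PASSWORD = 'ChangeMe123!' AND LOGIN = true"
)
session.execute("GRANT SELECT ON KEYSPACE my_keyspace TO analyst")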

Conclusion

In this post, we discussed several patterns for running Cassandra in the AWS Cloud. This post describes how you can manage Cassandra databases running on Amazon EC2. AWS also provides managed offerings for a number of databases. To learn more, see Purpose-built databases for all your application needs.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Analyze Your Data on Amazon DynamoDB with Apache Spark and Analysis of Top-N DynamoDB Objects using Amazon Athena and Amazon QuickSight.


About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable big data, machine learning, artificial intelligence, and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to technologies such as advanced edge computing and machine learning at the edge. In his spare time, he enjoys spending time with his family.


Provanshu Dey is a Senior IoT Consultant with AWS Professional Services. He works on highly scalable and reliable IoT, data and machine learning solutions with our customers. In his spare time, he enjoys spending time with his family and tinkering with electronics & gadgets.


New – Encryption at Rest for DynamoDB

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-encryption-at-rest-for-dynamodb/

At AWS re:Invent 2017, Werner encouraged his audience to “Dance like nobody is watching, and encrypt like everyone is.”

The AWS team is always eager to add features that make it easier for you to protect your sensitive data and to help you to achieve your compliance objectives. For example, in 2017 we launched encryption at rest for SQS and EFS, additional encryption options for S3, and server-side encryption of Kinesis Data Streams.

Today we are giving you another data protection option with the introduction of encryption at rest for Amazon DynamoDB. You simply enable encryption when you create a new table and DynamoDB takes care of the rest. Your data (tables, local secondary indexes, and global secondary indexes) will be encrypted using AES-256 and a service-default AWS Key Management Service (KMS) key. The encryption adds no storage overhead and is completely transparent; you can insert, query, scan, and delete items as before. The team did not observe any changes in latency after enabling encryption and running several different workloads on an encrypted DynamoDB table.

Creating an Encrypted Table
You can create an encrypted table from the AWS Management Console, the API (CreateTable), or the CLI (create-table). I’ll use the console! I enter the name and set up the primary key as usual.

Before proceeding, I uncheck Use default settings, scroll down to the Encryption section, and check Enable encryption. Then I click Create and my table is created in encrypted form.

I can see the encryption setting for the table at a glance.
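
If you prefer the API route, a minimal boto3 sketch of the equivalent CreateTable call might look like the following; the table name matches the CloudTrail extract below, but the attribute name and throughput values are illustrative assumptions.

import boto3

dynamodb = boto3.client('dynamodb', region_name='us-west-2')

# SSESpecification enables encryption at rest with the
# service-default AWS KMS key.
dynamodb.create_table(
    TableName='reg-users',
    AttributeDefinitions=[{'AttributeName': 'username', 'AttributeType': 'S'}],
    KeySchema=[{'AttributeName': 'username', 'KeyType': 'HASH'}],
    ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5},
    SSESpecification={'Enabled': True},
)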

When my compliance team asks me to show them how DynamoDB uses the key to encrypt the data, I can create an AWS CloudTrail trail, insert an item, and then scan the table to see the calls to the AWS KMS API. Here’s an extract from the trail:

{
  "eventTime": "2018-01-24T00:06:34Z",
  "eventSource": "kms.amazonaws.com",
  "eventName": "Decrypt",
  "awsRegion": "us-west-2",
  "sourceIPAddress": "dynamodb.amazonaws.com",
  "userAgent": "dynamodb.amazonaws.com",
  "requestParameters": {
    "encryptionContext": {
      "aws:dynamodb:tableName": "reg-users",
      "aws:dynamodb:subscriberId": "1234567890"
    }
  },
  "responseElements": null,
  "requestID": "7072def1-009a-11e8-9ab9-4504c26bd391",
  "eventID": "3698678a-d04e-48c7-96f2-3d734c5c7903",
  "readOnly": true,
  "resources": [
    {
      "ARN": "arn:aws:kms:us-west-2:1234567890:key/e7bd721d-37f3-4acd-bec5-4d08c765f9f5",
      "accountId": "1234567890",
      "type": "AWS::KMS::Key"
    }
  ]
}

Available Now
This feature is available now in the US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions and you can start using it today.

There’s no charge for the encryption; you will be charged for the calls that DynamoDB makes to AWS KMS on your behalf.

Jeff;