Post Syndicated from Damian Wylie original https://aws.amazon.com/blogs/big-data/under-the-hood-of-server-side-encryption-for-amazon-kinesis-streams/
Customers are using Amazon Kinesis Streams to ingest, process, and deliver data in real time from millions of devices or applications. Use cases for Kinesis Streams vary, but a few common ones include IoT data ingestion and analytics, log processing, clickstream analytics, and enterprise data bus architectures.
Within milliseconds of data arrival, applications (KCL, Apache Spark, AWS Lambda, Amazon Kinesis Analytics) attached to a stream are continuously mining value or delivering data to downstream destinations. Customers are then scaling their streams elastically to match demand. They pay incrementally for the resources that they need, while taking advantage of a fully managed, serverless streaming data service that allows them to focus on adding value closer to their customers.
These benefits are great; however, AWS learned that many customers could not take advantage of Kinesis Streams unless their data-at-rest within a stream was encrypted. Many customers did not want to manage encryption on their own, so they asked for a fully managed, automatic, server-side encryption mechanism leveraging centralized AWS Key Management Service (AWS KMS) customer master keys (CMK).
Motivated by this feedback, AWS added another fully managed, low cost aspect to Kinesis Streams by delivering server-side encryption via KMS managed encryption keys (SSE-KMS) in the following regions:
- US East (N. Virginia)
- US West (Oregon)
- US West (N. California)
- EU (Ireland)
- Asia Pacific (Singapore)
- Asia Pacific (Tokyo)
In this post, I cover the mechanics of the Kinesis Streams server-side encryption feature. I also share a few best practices and considerations so that you can get started quickly.
Understanding the mechanics
The following section walks you through how Kinesis Streams uses CMKs to encrypt a message in the PutRecord or PutRecords path before it is propagated to the Kinesis Streams storage layer, and then decrypt it in the GetRecords path after it has been retrieved from the storage layer.
When server-side encryption is enabled, which takes just a few clicks in the console, the partition key and payload for every incoming record are encrypted automatically as they flow into Kinesis Streams, using the selected CMK. When data is at rest within a stream, it’s encrypted.
When records are retrieved through a GetRecords request from the encrypted stream, they are decrypted automatically as they are flowing out of the service. That means your Kinesis Streams producers and consumers do not need to be aware of encryption. You have a fully managed data encryption feature at your fingertips, which can be enabled within seconds.
AWS also makes it easy to audit the application of server-side encryption. You can use the AWS Management Console for instant stream-level verification; the responses from PutRecord, PutRecords, and GetRecords; or AWS CloudTrail.
Calling PutRecord or PutRecords
Kinesis Streams and KMS perform the following actions when your applications call PutRecord or PutRecords on a stream with server-side encryption enabled. The Amazon Kinesis Producer Library (KPL) uses PutRecords.
- Data is sent from a customer’s producer (client) to a Kinesis stream using TLS via HTTPS. Data in transit to a stream is encrypted by default.
- After data is received, it is momentarily stored in RAM within a front-end proxy layer.
- Kinesis Streams authenticates the producer, then impersonates the producer to request input keying material from KMS.
- KMS creates key material, encrypts it by using the CMK, and sends both the plaintext and encrypted key material to the service, encrypted with TLS.
- The client uses the plaintext key material to derive data encryption keys (data keys) that are unique per-record.
- The client encrypts the payload and partition key using the data key in RAM within the front-end proxy layer and removes the plaintext data key from memory.
- The client appends the encrypted key material to the encrypted data.
- The plaintext key material is securely cached in memory within the front-end layer for reuse, until it expires after 5 minutes.
- The client delivers the encrypted message to a back-end store, where it is stored at rest and can be fetched by an authorized consumer through a GetRecords call.
Calling GetRecords
Kinesis Streams and KMS perform the following actions when your applications call GetRecords on a server-side encrypted stream. The Amazon Kinesis Client Library (KCL) calls GetRecords to retrieve records from a stream.
- When a GetRecords call is made, the front-end proxy layer retrieves the encrypted record from its back-end store.
- The consumer (client) makes a request to KMS using a token generated by the customer’s request. KMS authorizes it.
- The client requests that KMS decrypt the encrypted key material.
- KMS decrypts the encrypted key material and sends the plaintext key material to the client.
- Kinesis Streams derives the per-record data keys from the decrypted key material.
- If the calling application is authorized, the client decrypts the payload and removes the plaintext data key from memory.
- The client delivers the payload over TLS and HTTPS to the consumer requesting the records. Data in transit to a consumer is encrypted by default.
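The two flows above share two ideas: plaintext key material is cached for a few minutes, and a distinct data key is derived from it for each record. The following Python sketch illustrates only those two ideas. It is not the Kinesis Streams implementation: the derivation function (HMAC-SHA256 over a per-record value), the `fetch_key_material` callback standing in for the KMS request, and all names are assumptions made for illustration.

```python
import hashlib
import hmac
import time

# The post states that cached plaintext key material expires after 5 minutes.
KEY_MATERIAL_TTL_SECONDS = 5 * 60


class KeyMaterialCache:
    """Keeps plaintext key material in memory until its TTL elapses."""

    def __init__(self, fetch_key_material, ttl=KEY_MATERIAL_TTL_SECONDS,
                 clock=time.monotonic):
        # fetch_key_material is a hypothetical stand-in for the request to KMS.
        self._fetch = fetch_key_material
        self._ttl = ttl
        self._clock = clock
        self._material = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._material is None or now >= self._expires_at:
            # Cache miss or expired entry: ask for fresh key material.
            self._material = self._fetch()
            self._expires_at = now + self._ttl
        return self._material


def derive_record_key(key_material: bytes, record_value: bytes) -> bytes:
    """Derive a 32-byte per-record data key.

    HMAC-SHA256 is an assumption; the post does not name the derivation used.
    """
    return hmac.new(key_material, record_value, hashlib.sha256).digest()
```

Because the derivation is deterministic, the read path can rebuild the same data key from the decrypted key material and the same per-record value, which is why only the encrypted key material needs to travel with the record.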
Verifying server-side encryption
Auditors or administrators often ask for proof that server-side encryption was or is enabled. Here are a few ways to do this.
To check if encryption is enabled now for your streams:
- Use the AWS Management Console or the DescribeStream API operation. You can also see what CMK is being used for encryption.
- See encryption in action by looking at responses from PutRecord, PutRecords, or GetRecords. When encryption is enabled, the encryptionType parameter is set to “KMS”. If encryption is not enabled, encryptionType is not included in the response.
Sample PutRecord response
Sample GetRecords response
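As a rough illustration of these checks in code, the following Python sketch reads the encryption fields from a DescribeStream result and from put and get responses. It assumes a boto3-style Kinesis client is passed in, that the field appears as `EncryptionType` (the casing boto3 uses), and that GetRecords reports it on each record. The function names are hypothetical.

```python
def stream_encryption_status(kinesis_client, stream_name):
    """Return (encryption_type, key_id) as reported by DescribeStream."""
    description = kinesis_client.describe_stream(
        StreamName=stream_name
    )["StreamDescription"]
    # Treat a missing field as "not encrypted".
    return description.get("EncryptionType", "NONE"), description.get("KeyId")


def put_response_is_encrypted(response):
    """True if a PutRecord or PutRecords response reports KMS encryption."""
    return response.get("EncryptionType") == "KMS"


def get_records_all_encrypted(response):
    """True if every record in a GetRecords response reports KMS encryption."""
    records = response.get("Records", [])
    return bool(records) and all(
        record.get("EncryptionType") == "KMS" for record in records
    )
```

With a real client, `stream_encryption_status(boto3.client("kinesis"), "my-stream")` would be a typical call; the stream name here is a placeholder.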
To check if encryption was enabled, use CloudTrail, which logs the StartStreamEncryption() and StopStreamEncryption() API calls made against a particular stream.
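One way to pull those entries programmatically is the CloudTrail LookupEvents operation. The sketch below assumes a boto3-style CloudTrail client is passed in and that filtering on the `EventName` lookup attribute is enough for your audit; it returns events for all streams in the account and Region, so narrowing to one stream is left to the caller. The function name is hypothetical.

```python
def encryption_change_events(cloudtrail_client):
    """Collect StartStreamEncryption and StopStreamEncryption events, oldest first."""
    events = []
    for event_name in ("StartStreamEncryption", "StopStreamEncryption"):
        kwargs = {
            "LookupAttributes": [
                {"AttributeKey": "EventName", "AttributeValue": event_name}
            ]
        }
        while True:
            page = cloudtrail_client.lookup_events(**kwargs)
            events.extend(page.get("Events", []))
            token = page.get("NextToken")
            if not token:
                break
            # Follow pagination until CloudTrail stops returning a token.
            kwargs["NextToken"] = token
    return sorted(events, key=lambda event: event["EventTime"])
```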
It’s very easy to enable, disable, or modify server-side encryption for a particular stream.
- In the Kinesis Streams console, select a stream and choose Details.
- Select a CMK and select Enabled.
- Choose Save.
You can enable encryption only for a live stream, not upon stream creation. Follow the same process to disable encryption for a stream. To use a different CMK, select it and choose Save.
Each of these tasks can also be accomplished using the StartStreamEncryption and StopStreamEncryption API operations.
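A minimal sketch of those two calls, assuming a boto3-style Kinesis client is passed in. The wrapper names are hypothetical, and the default key shown is the alias for the “(Default) aws/kinesis” key; substitute your own key ID, ARN, or alias to use a custom master key.

```python
DEFAULT_KINESIS_KEY = "alias/aws/kinesis"


def enable_encryption(kinesis_client, stream_name, key_id=DEFAULT_KINESIS_KEY):
    """Start server-side encryption on an existing stream."""
    kinesis_client.start_stream_encryption(
        StreamName=stream_name,
        EncryptionType="KMS",
        KeyId=key_id,
    )


def disable_encryption(kinesis_client, stream_name, key_id=DEFAULT_KINESIS_KEY):
    """Stop server-side encryption; key_id names the key currently in use."""
    kinesis_client.stop_stream_encryption(
        StreamName=stream_name,
        EncryptionType="KMS",
        KeyId=key_id,
    )
```

Both operations are asynchronous, so a caller that needs to know when the change has taken effect would poll DescribeStream afterward.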
There are a few considerations you should be aware of when using server-side encryption for Kinesis Streams:
One benefit of using the “(Default) aws/kinesis” AWS managed key is that every producer and consumer with permissions to call PutRecord, PutRecords, or GetRecords inherits the right permissions over the “(Default) aws/kinesis” key automatically.
However, this is not necessarily the case for a custom master key. Kinesis Streams producers and consumers do not need to be aware of encryption, but if you enable encryption using a custom master key and a producer or consumer doesn’t have IAM permissions to use it, PutRecord, PutRecords, or GetRecords requests fail.
This is a great security feature. On the other hand, it can effectively lead to data loss if you inadvertently apply a custom master key that restricts producers and consumers from interacting with the Kinesis stream. Take precautions when applying a custom master key. For more information about the minimum IAM permissions required for producers and consumers interacting with an encrypted stream, see Using Server-Side Encryption.
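As a hedged illustration of what those key permissions can look like, the sketch below builds the KMS part of an IAM policy statement for each role. It assumes producers need `kms:GenerateDataKey` and consumers need `kms:Decrypt` on the key; confirm the exact minimum set in Using Server-Side Encryption before relying on it. The function name and the key ARN in the usage note are placeholders.

```python
import json

# Assumed minimum KMS actions per role; verify against the AWS documentation.
KMS_ACTIONS_BY_ROLE = {
    "producer": ["kms:GenerateDataKey"],
    "consumer": ["kms:Decrypt"],
}


def kms_policy_statement(role, key_arn):
    """Build the KMS statement for a producer or consumer IAM policy."""
    return {
        "Effect": "Allow",
        "Action": list(KMS_ACTIONS_BY_ROLE[role]),
        "Resource": key_arn,
    }


def policy_document(statements):
    """Wrap statements in a standard IAM policy document and render it as JSON."""
    return json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2)
```

For example, `policy_document([kms_policy_statement("consumer", key_arn)])` yields a document that can be attached alongside the consumer's existing Kinesis permissions, where `key_arn` is the ARN of your custom master key.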
When you apply server-side encryption, you are subject to KMS API usage and key costs. Unlike custom KMS master keys, the “(Default) aws/kinesis” CMK is offered free of charge. However, you still need to pay for the API usage costs that Kinesis Streams incurs on your behalf.
API usage costs apply for every CMK, including custom ones. Kinesis Streams calls KMS approximately every 5 minutes when it is rotating the data key. In a 30-day month, the total cost of KMS API calls initiated by a Kinesis stream should be less than a few dollars.
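To get a feel for the numbers, here is a back-of-the-envelope sketch. It assumes one KMS call per 5-minute interval per caller, where `callers` is a hypothetical count of distinct requesters of key material (the post does not say how many there are), and it takes the KMS request price as an input rather than assuming one; look up current KMS pricing for your Region.

```python
def kms_calls_per_month(days=30, rotation_minutes=5, callers=1):
    """Approximate KMS calls in a month at one call per rotation interval."""
    intervals = days * 24 * 60 // rotation_minutes
    return intervals * callers


def estimated_cost(calls, price_per_10k_requests):
    """Cost of the given number of calls at a caller-supplied price per 10,000 requests."""
    return calls / 10_000 * price_per_10k_requests
```

With the defaults, a single caller makes 8,640 calls in a 30-day month (30 × 24 × 12), so the monthly total scales linearly with the number of callers.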
During testing, AWS observed a slight increase (typically 0.2 milliseconds or less per record) in put and get record latencies due to the additional overhead of encryption.
If you have questions or suggestions, please comment below.