Businesses need to send real-time notifications so that stakeholders can act as soon as a critical situation arises. Examples include anomaly detection, healthcare emergencies, operational failures, and fraudulent transactions. Email, SMS, and push notifications are often used to notify stakeholders in real time. However, building a large-scale, real-time notification solution can be a complex and costly challenge for a business.
Amazon Pinpoint enables you to engage with your stakeholders in real-time by sending email, SMS and push notifications. Your app can use the Amazon Pinpoint API and the AWS SDKs to send direct messages. With transactional messages, you send alerts to specific recipients, as opposed to messages that you send to segments. There is no minimum fee, no setup cost, and no fixed monthly cost with Amazon Pinpoint.
In this blog post, we explore a solution that notifies stakeholders of a large customer transaction. Such a transaction requires immediate attention because our stakeholders want to ensure that enough inventory is available to deliver goods without delay.
Solution Overview
The solution that we build to handle this use case can be deployed in one hour. The following diagram illustrates the AWS services integrated in this solution:
At a high level, the solution uses the following workflow:
1. Define the large value transaction threshold in a rule table in Amazon DynamoDB.
2. Set up Amazon Pinpoint and configure it to send email and SMS.
3. Set up AWS Lambda and implement the logic to send SMS and email if a customer makes a large value order transaction.
4. Create a test transaction in an order table using Amazon DynamoDB.
5. Check the details of the SMS and email received.
Setting up the solution
Step 1: Set up Amazon Pinpoint
The first step in setting up this solution is to create a new Amazon Pinpoint project and configure the SMS and email channels.
1. Navigate to Services -> Pinpoint.
2. Click on Create a project.
3. Provide a name and click on the Create button.
4. Select Email in the left panel and click on Edit.
5. Select “Enable the email channel for this project”. Select “Verify a new email address” and provide a default sender address. Click on Verify email address, then click on Save. You should receive a verification email. Verify the address by clicking on the verification link.
Once your email is verified, it should look like this:
6. Repeat the same step to verify the recipient's email address as well.
Please note: By enabling the email channel, we can send up to 200 emails per day in sandbox mode. We must verify an email address or domain identity in order to send any email. While we remain in the sandbox, we must also verify all recipient email addresses before we proceed. This requirement does not apply in production.
7. Select SMS and voice in the left panel and select “Enable the SMS channel for this project”. Please make sure “Transactional” is selected for critical or time-sensitive messages. Amazon Pinpoint optimizes the delivery of these messages for highest reliability.
Please note: We do not need to verify any phone numbers for this channel. However, we can request dedicated codes (either short or long) for our own exclusive use. Otherwise AWS will automatically allocate codes when we send SMS.
4. Create the order_detail table. Provide order_id as the partition key. You can select “String” as the data type.
5. Now we must enable the stream on this table. Once we enable a stream on a table, Amazon DynamoDB captures information about every modification to data items in the table. We will integrate this table with AWS Lambda to validate the order value. Click on Manage Stream.
6. Select the appropriate view type and click on Enable.
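If you prefer to script the stream setup rather than use the console, the step can be sketched with boto3 along these lines. This is a minimal sketch and not part of the original walkthrough: the helper names `stream_settings` and `enable_stream` are hypothetical, and `NEW_IMAGE` is an assumed view type, since the steps above only say to select an appropriate one.

```python
def stream_settings(table_name, view_type="NEW_IMAGE"):
    """Build the arguments for a DynamoDB UpdateTable call that turns on a stream.

    view_type is an assumption: NEW_IMAGE makes each stream record carry the
    item as it looks after the change, which is enough to read the order value.
    """
    allowed = {"KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"}
    if view_type not in allowed:
        raise ValueError(f"unsupported view type: {view_type}")
    return {
        "TableName": table_name,
        "StreamSpecification": {"StreamEnabled": True, "StreamViewType": view_type},
    }


def enable_stream(table_name, client=None):
    """Enable the stream. boto3 is imported lazily so the builder runs without it."""
    if client is None:
        import boto3  # only needed when no client is injected
        client = boto3.client("dynamodb")
    return client.update_table(**stream_settings(table_name))
```

Calling `enable_stream("order_detail")` with valid AWS credentials would apply the setting; passing your own `client` makes the helper easy to exercise without touching AWS.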
Step 3: Set up AWS Lambda
In this step, we create an AWS Lambda function and integrate it with the order_detail table. The function then checks each order and sends a notification if the order value is large enough.
1. Navigate to Services > Lambda.
2. Click on Create function.
3. Select Python 3.6 as the runtime. Select an execution role. Please make sure that your role gives read access to the Amazon DynamoDB table.
4. Click on Add Trigger.
5. Select Amazon DynamoDB from the drop-down. Select the order_detail table from the drop-down. Keep all the defaults. This integrates the order_detail stream with our AWS Lambda function.
Our Lambda function is invoked every time a transaction happens in order_detail table.
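To make the trigger concrete, here is a minimal sketch of how such a function might pick large orders out of a DynamoDB stream event. It is an illustration under stated assumptions, not the function used in this post: the bounds are hard-coded here (the walkthrough keeps them in the transaction_alert_rule table), the helper name `large_orders` is hypothetical, and the stream is assumed to use a view type that includes the new image.

```python
from decimal import Decimal

# Assumed bounds; the walkthrough stores them in the transaction_alert_rule table.
MIN_VALUE = Decimal("10000")
MAX_VALUE = Decimal("50000")


def large_orders(event, min_value=MIN_VALUE, max_value=MAX_VALUE):
    """Return (order_id, order_value) for inserted or modified orders within range."""
    matches = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # REMOVE events carry no new image
        image = record.get("dynamodb", {}).get("NewImage")
        if not image or "order_value" not in image:
            continue
        # Stream records use DynamoDB's typed JSON: numbers arrive as strings under "N".
        value = Decimal(image["order_value"]["N"])
        if min_value <= value <= max_value:
            matches.append((image["order_id"]["S"], value))
    return matches


def lambda_handler(event, context):
    """Entry point: report how many large orders the batch contained."""
    orders = large_orders(event)
    for order_id, value in orders:
        print(f"Large order {order_id}: {value}")  # the notification call would go here
    return {"large_orders": len(orders)}
```

Using `Decimal` rather than `float` keeps the comparison exact for monetary values.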
1. That’s it! We have built the solution. Now it’s time to do an end to end test by creating an order transaction in order_detail table.
2. Run the Amazon DynamoDB put-item command to create a new item over our threshold, or replace an old item with a new one. In our case it will be a new order transaction in the order_detail table. Please make sure you have configured the AWS CLI. You can refer to the quickstart guide to learn more.
aws dynamodb put-item --table-name order_detail --item '{"order_id":{"S":"O0001"},"customer_id":{"S":"C0001"},"customer_name":{"S":"JOHN MILLER"},"order_date":{"S":"2020-06-13T17:42:34Z"},"item_id":{"S":"P0001"},"item_quantity":{"N":"12"},"order_value":{"N":"20000"},"unit":{"S":"$"},"delivery_date":{"S":"2020-06-20"}}' --return-consumed-capacity TOTAL
3. It returns a success message similar to this:
4. The order value of $20,000 falls within the range of our large value order transaction definition (between $10,000 and $50,000). You should receive an SMS on the mobile number you provided in the transaction_alert_rule table.
5. You will also receive an email as per configuration in the transaction_alert_rule table.
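For reference, the SMS and email in the two steps above could be sent with a single call to the Amazon Pinpoint SendMessages operation. The following is a minimal sketch rather than the code behind this walkthrough: the helper names `build_alert_request` and `send_alert`, the message wording, and the example addresses are all assumptions, and in the real solution the recipients would come from the transaction_alert_rule table.

```python
def build_alert_request(phone_number, to_email, from_email, order_id, order_value):
    """Build a MessageRequest for Pinpoint's SendMessages operation (SMS plus email)."""
    body = f"Large order {order_id} received with value ${order_value:,}."
    return {
        "Addresses": {
            phone_number: {"ChannelType": "SMS"},
            to_email: {"ChannelType": "EMAIL"},
        },
        "MessageConfiguration": {
            # TRANSACTIONAL matches the message type selected when enabling the SMS channel.
            "SMSMessage": {"Body": body, "MessageType": "TRANSACTIONAL"},
            "EmailMessage": {
                "FromAddress": from_email,
                "SimpleEmail": {
                    "Subject": {"Charset": "UTF-8", "Data": f"Large order alert: {order_id}"},
                    "TextPart": {"Charset": "UTF-8", "Data": body},
                },
            },
        },
    }


def send_alert(application_id, request, client=None):
    """Send the request. boto3 is imported lazily so the builder runs without it."""
    if client is None:
        import boto3  # only needed when no client is injected
        client = boto3.client("pinpoint")
    return client.send_messages(ApplicationId=application_id, MessageRequest=request)
```

Here `application_id` is the ID of the Amazon Pinpoint project created in Step 1, and `from_email` must be a verified sender address.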
Step 4: Clean up
You’ve now successfully built a real-time notification solution using Amazon Pinpoint. Delete the resources you created in this post, such as the Amazon DynamoDB tables, to avoid ongoing charges. You can use the AWS CLI, the AWS Management Console, or the AWS APIs to perform the cleanup.
Conclusion
Customers can use Amazon Pinpoint to help scale communications across use cases, including real-time notifications. Amazon Pinpoint is a flexible and scalable outbound and inbound marketing communications service. You can connect with customers and stakeholders over channels like email, SMS, push, or voice.
Note: This post was written by Murat Balkan, an AWS Senior Solutions Architect.
Many of our customers use voice notifications to deliver mission-critical and time-sensitive messages to their users. Customers often configure their systems to retry delivery when these voice messages aren’t delivered the first time around. Other customers set up their systems to fall back to another channel in this situation.
This blog post shows you how to retry the delivery of a voice message if the initial attempt wasn’t successful.
Architecture
By completing the steps in this post, you can create a system that uses the architecture illustrated in the following image:
First, a Lambda function calls the SendMessage operation in the Amazon Pinpoint API. The SendMessage operation then initiates a phone call to the recipient and generates a unique message ID, which is returned to the Lambda function. The Lambda function then adds this message ID to a DynamoDB table.
While Amazon Pinpoint attempts to deliver a message to a recipient, it emits several event records. These records indicate when the call is initiated, when the phone is ringing, when the call is answered, and so forth. Amazon Pinpoint publishes these events to an Amazon SNS topic. In this example, we’re only interested in the BUSY, FAILED, and NO_ANSWER event types, so we add some filtering criteria.
An Amazon SQS queue then subscribes to the Amazon SNS topic and receives the incoming events. The Delivery Delay attribute is set at the queue level, so every event waits for the same period before it becomes available to consumers. This configuration provides a back-off retry mechanism for failed voice messages.
When the Delivery Delay period elapses, another Lambda function polls the queue and extracts the MessageId attribute from the polled message. It uses this attribute to locate the DynamoDB record for the original call. This record also tells us how many times Amazon Pinpoint has attempted to deliver the message.
The Lambda function compares the number of retries to a MAX_RETRY environment variable to determine whether it should attempt to send the message again.
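The retry decision can be sketched as follows. This is an illustration under assumptions, not the code from the repository: the field name `message_id` inside the Pinpoint event, the strict less-than comparison, and the helper names are all hypothetical. The sketch also assumes that the SQS message body holds the SNS envelope, whose `Message` field contains the Pinpoint event as a JSON string.

```python
import json
import os

# Assumed name of the field that carries the Pinpoint message ID inside the event.
MESSAGE_ID_KEY = "message_id"


def extract_message_id(sqs_record, key=MESSAGE_ID_KEY):
    """Unwrap SQS body -> SNS envelope -> Pinpoint event and return the message ID."""
    envelope = json.loads(sqs_record["body"])
    pinpoint_event = json.loads(envelope["Message"])
    return pinpoint_event[key]


def should_retry(retry_count, max_retry=None):
    """Return True while the number of retries so far is below MAX_RETRY."""
    if max_retry is None:
        # Falls back to 0 (never retry) when the environment variable is not set.
        max_retry = int(os.environ.get("MAX_RETRY", "0"))
    return retry_count < max_retry
```

With `MAX_RETRY` set to 2, for example, a call that has already been retried twice would not be attempted again.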
After you add a long code to your account, use AWS SAM to deploy the remaining parts of this serverless architecture. You provide the long code as an input parameter to this template.
The AWS SAM template creates the following resources:
A Lambda function (CallGenerator) that initiates a voice call using the Amazon Pinpoint API.
An Amazon SNS topic that collects state change events from Amazon Pinpoint.
An Amazon SQS queue that queues the messages.
A Lambda function (RetryCallGenerator) that polls the Amazon SQS queue and re-initiates the previously failed call attempt by calling the CallGenerator function.
A DynamoDB table that contains information about previous attempts to deliver the call.
The template also defines a custom Lambda resource, CustomResource, which creates a configuration set in Amazon Pinpoint. This configuration set specifies the events to send to Amazon SNS. A Lambda environment variable, CONFIG_SET_NAME, contains the name of the configuration set.
This architecture consists of two Lambda functions, which are represented as two different apps in the AWS SAM template. These functions are named CallGenerator and RetryCallGenerator. The CallGenerator function initiates the voice message with Amazon Pinpoint. The SendMessage API in Amazon Pinpoint returns a MessageId. The architecture uses this ID as a key to connect messages to the various events that they generate. The CallGenerator function also retains this ID in a DynamoDB table called call_attempts. The RetryCallGenerator function looks up the MessageId in the call_attempts table. If necessary, the function tries to send the message again by invoking the CallGenerator function.
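The interplay between the two functions and the call_attempts table can be sketched like this. It is a simplified illustration, not the code from the GitHub repository: `send_call` and `attempts_table` are injected stand-ins, the item attribute names are hypothetical, and `next_attempt` is an assumed helper that shows how a retry event might be rebuilt from the stored record.

```python
def call_generator(event, send_call, attempts_table):
    """Start a voice call and record the attempt under the returned message ID.

    send_call(message, phone_number) is assumed to wrap the Amazon Pinpoint call
    and return a message ID. attempts_table is assumed to expose
    put_item(Item=...), as a boto3 DynamoDB Table resource does.
    """
    retry_count = int(event.get("RetryCount", 0))
    message_id = send_call(event["Message"], event["PhoneNumber"])
    attempts_table.put_item(
        Item={
            "message_id": message_id,
            "message": event["Message"],
            "phone_number": event["PhoneNumber"],
            "retry_count": retry_count,
        }
    )
    return {"MessageId": message_id, "RetryCount": retry_count}


def next_attempt(stored_item):
    """Build the event RetryCallGenerator would pass back to CallGenerator."""
    return {
        "Message": stored_item["message"],
        "PhoneNumber": stored_item["phone_number"],
        "RetryCount": stored_item["retry_count"] + 1,
    }
```

Keying the record on the message ID is what lets the retry path connect a failure event back to the original message text and destination number.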
Deploying and Testing
Start by downloading the template from the GitHub repository. AWS SAM requires you to specify an Amazon Simple Storage Service (Amazon S3) bucket to hold the deployment artifacts. If you haven’t already created a bucket for this purpose, create one now. The bucket should be reachable by an AWS Identity and Access Management (IAM) user.
At the command line, enter the following command to package the application:
sam package --template template.yaml --output-template-file output_template.yaml --s3-bucket BUCKET_NAME_HERE
In the preceding command, replace BUCKET_NAME_HERE with the name of the Amazon S3 bucket that should hold the deployment artifacts.
AWS SAM packages the application and copies it into the Amazon S3 bucket. This AWS SAM template requires you to specify three parameters: longCode, the phone number that’s used to make the outbound calls; maxRetry, which is used to set the MAX_RETRY environment variable for the RetryCallGenerator application; and retryDelaySeconds, which sets the delivery delay time for the Amazon SQS queue.
When the AWS SAM package command finishes running, enter the following command to deploy the package:
In the preceding command, replace LONG_CODE with the dedicated phone number that you acquired earlier.
When you run this command, AWS SAM shows the progress of the deployment. When the deployment finishes, you can test it by sending a sample event to the CallGenerator Lambda function. Use the following sample event to test the Lambda function:
{ "Message": "<speak>Thank you for visiting the AWS <emphasis>Messaging and Targeting Blog</emphasis>.</speak>", "PhoneNumber": "DESTINATION_PHONE_NUMBER", "RetryCount": 0 }
In the preceding event, replace DESTINATION_PHONE_NUMBER with the phone number to which you want to send a test message.
Important: Telecommunication providers take several steps to limit unsolicited voice messages. For example, providers in the United States only deliver a certain number of automated voice messages to each recipient per day. For this reason, you can only use Amazon Pinpoint to send 10 calls per day to each recipient. Keep this limit in mind during the testing process.
Conclusion
This architecture shows how Amazon Pinpoint can deliver state change events to Amazon SNS and how a serverless application can use it. You can adapt this architecture to apply to other use cases, such as call auditing, advanced call analytics and more.
Your customers deserve to have helpful communications with your brand, regardless of the channel that you use to interact with them. There are many situations in which you might have to move customers from one channel to another—for example, when a customer is interacting with a chatbot over SMS, but their needs suddenly change to require voice assistance. To create a great customer experience, your communications with your customers should be seamless across all communication channels.
Welcome aboard Customer Obsessed Airlines
In this post, we look at a scenario that involves our fictitious airline, Customer Obsessed Airlines. Severe storms in one area of the country have caused Customer Obsessed Airlines to cancel a large number of flights. Customer Obsessed Airlines has to notify all of the affected customers of the cancellations right away. But most importantly, to keep customers as happy as possible in this unfortunate and unavoidable situation, Customer Obsessed Airlines has to make it easy for customers to rebook their flights.
Fortunately, Customer Obsessed Airlines has implemented the solution that’s outlined later in this post. This solution uses Amazon Pinpoint to send messages to a targeted segment of customers—in this case, the specific customers who were booked on the affected flights. Some of these customers might have straightforward travel itineraries that can simply be rebooked through interactions with a chatbot. Other customers who have more complex itineraries, or those who simply prefer to interact with a human over the phone, can be handed off to an agent in your call center.
About the solution
The solution that we’ll build to handle this scenario can be deployed in under an hour. The following diagram illustrates the interactions in this solution.
At a high level, this solution uses the following workflow:
An event occurs. Automated impact analysis systems trigger the creation of custom segments—in this case, all passengers whose flights were cancelled.
Amazon Pinpoint sends a message to the affected passengers through their preferred channels. Amazon Pinpoint supports the email, SMS, push, and voice channels, but in this example, we focus exclusively on SMS.
Passengers who receive the message can respond. When they do, they interact with a chatbot that helps them book a different flight.
If a passenger requests a live agent, or if their situation can’t be handled by a chatbot, then Amazon Pinpoint passes information about the customer’s situation and communication history to Amazon Connect. The passenger is entered into a queue. When the passenger reaches the front of the queue, they receive a phone call from an agent.
After being re-booked, the passenger receives a written confirmation of the changes to their itinerary through their preferred channel. Passengers are also given the option of providing feedback on their interaction when the process is complete.
To build this solution, we use Amazon Pinpoint to segment our customers based on their attributes (such as which flight they’ve booked), and to deliver messages to those segments.
We also use Amazon Connect to manage the voice calling part of the solution, and Amazon Lex to power the chatbot. Finally, we connect these services using logic that’s defined in AWS Lambda functions.
Setting up the solution
Step 1: Set up Amazon Pinpoint and link it with Amazon Lex
The first step in setting up this solution is to create a new Amazon Pinpoint project and configure the SMS channel. When that’s done, you can create an Amazon Lex chatbot and link it to the Amazon Pinpoint project.
Step 2: Set up Amazon Connect and link it with your Amazon Lex chatbot
By completing step 1, we’ve created a system that can send messages to our passengers and receive messages from them. The next step is to create a way for passengers to communicate with our call center.
The Amazon Connect Administrator Guide provides instructions for linking an Amazon Lex bot to an Amazon Connect instance. For complete procedures, see Add an Amazon Lex Bot.
When you complete these procedures, link your Amazon Connect instance to the same Amazon Lex bot that you created in step 1. This step is intended to provide customers with a consistent, cohesive experience across channels.
Step 3: Set up an Amazon Connect callback queue and use Amazon Pinpoint keyword logic to trigger it
Now that we’ve configured Amazon Pinpoint and Amazon Connect, we can connect them.
Linking the two services makes it possible for passengers to request additional assistance. Traditionally, passengers in this situation would have to call a call center themselves and then wait on hold for an agent to become available. However, in this solution, our call center calls the passenger directly as soon as an agent is available. When the agent calls the passenger, the agent has all of the information about the passenger’s issue, as well as a transcript of the passenger’s interactions with your chatbot.
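The keyword logic that decides between the chatbot and the callback queue can be sketched as a small routing function. This is only an illustration: the keywords, the function name `route_inbound_sms`, and the return labels are hypothetical, and the actual solution implements this decision with Amazon Pinpoint, Amazon Lex, and AWS Lambda rather than a single function.

```python
import re

# Hypothetical keywords that signal the passenger wants to talk to a person.
AGENT_PATTERN = re.compile(r"\b(agent|human|representative)\b", re.IGNORECASE)


def route_inbound_sms(text, bot_could_handle=True):
    """Decide where an inbound SMS goes: the chatbot or the agent callback queue."""
    if AGENT_PATTERN.search(text):
        return "callback_queue"  # explicit request for a live agent
    if not bot_could_handle:
        return "callback_queue"  # the chatbot could not resolve the request
    return "chatbot"
```

Matching on whole words avoids false positives such as "humane" triggering a handoff.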
By completing the preceding three steps, you can send messages to a subset of your users based on the criteria you choose and the type of message you want to send. Your customers can interact with your message by replying with questions. When they do, a chatbot responds intelligently and appropriately.
You can add to this solution by expanding it to cover other communication channels, such as push notifications. You can also automate the initial communication by integrating the solution with your systems of record.
We’re excited to see what you build using the solution that we outlined in this post. Let us know of your ideas and your successes in the comments.
FireEye reports on a Chinese-sponsored espionage effort to eavesdrop on text messages:
FireEye Mandiant recently discovered a new malware family used by APT41 (a Chinese APT group) that is designed to monitor and save SMS traffic from specific phone numbers, IMSI numbers and keywords for subsequent theft. Named MESSAGETAP, the tool was deployed by APT41 in a telecommunications network provider in support of Chinese espionage efforts. APT41’s operations have included state-sponsored cyber espionage missions as well as financially-motivated intrusions. These operations have spanned from as early as 2012 to the present day. For an overview of APT41, see our August 2019 blog post or our full published report.
Yet another example that demonstrates why end-to-end message encryption is so important.
The AWS Digital User Engagement team hit the ground running this year. From speaking in front of crowds of digital marketers and developers, to developing new tutorials to help make it easier to get started building solutions to common use cases, here’s the latest on what we’ve been up to and our latest updates to Amazon Pinpoint.
How To Achieve Customer-Obsessed Digital User Engagement
Simon Poile, GM of AWS Digital User Engagement, had the pleasure of speaking to hundreds of digital marketers at the Digital Summit conference in Seattle, WA on February 26th. Digital Summit attendees are the movers and shakers influencing the growth and success of their company’s digital marketing — and the future landscape of the digital economy. Simon provided insights on how marketers can embody the Amazon culture of customer obsession to gain a deeper understanding of their customers, strengthen trust between brands and their users, and create a personalized digital engagement experience that is timely, contextually relevant, and reaches the right user at the right time through the right medium. He discussed how marketers can embrace technology such as machine learning and IoT to accomplish transformative engagement, and provided insights about how brands around the world are using AWS Digital User Engagement solutions to transform their engagement efforts.
Learn to implement two-way SMS messaging for a simple approach that results in higher levels of customer engagement
In a recent article posted on A Cloud Guru, Dennis Hills explains what two-way SMS is and how you can quickly and easily start sending personalized, timely, and relevant text messages to your customers with Amazon Pinpoint. He then shows how you can implement a practical solution for setting up an SMS long code so you can start sending and receiving text messages.
New Amazon Pinpoint Getting Started Guide: How to Create an SMS Registration System
On Wednesday the 27th, we launched the first Amazon Pinpoint Getting Started Guide. This guide, located in the Tutorials section of the Pinpoint Developer Guide, shows you the entire process of creating a customer registration solution for SMS messaging. A common way to capture customers’ mobile phone numbers is to use a web-based form. After you verify the customer’s phone number and confirm the customer’s subscription, you can start sending promotional, transactional, and informational SMS messages to that customer.
In the tutorial, you’ll learn how to set up two-way SMS messaging in Pinpoint, create a web form to capture customers’ contact information, send registration information from your own website to a Lambda function using API Gateway, implement a double opt-in strategy, and more.
The tutorial is intended for users of all skill levels. While there is some coding involved, all of the necessary code is included. You can use this tutorial to create a complete solution, or as a starting point for your own use case.
Amazon Pinpoint is now available in the US West (Oregon), EU (Frankfurt), and EU (Ireland) regions in addition to the US East (Virginia) region. You can now use Amazon Pinpoint to power your digital user engagement without having to transfer your customer data across regions.
This regional expansion is particularly useful for organizations in certain regions of the EU, where data residency considerations previously made it difficult for many customers to use Amazon Pinpoint. It also creates a global infrastructure that helps to improve availability and redundancy while reducing latency.
How Hulu uses Amazon Pinpoint for their real-time notification platform.
At Hulu, notifying viewers when their favorite teams are playing helps drive growth and improve viewer engagement. However, building this feature was a complex process. Managing their live TV metadata, while generating audiences in real time in high-scalability scenarios, posed unique challenges for the engineering team. In this video, Hulu discusses the challenges in building their real-time notification platform, how Amazon Pinpoint helped them with their goals, and how they architected their solution for global scale and deliverability. Watch to learn how they built their solution.
The AWS Digital User Engagement team will be at the AWS Booth #2617 at Shoptalk, March 3-6 at the Venetian in Las Vegas. Stop by to view our demo of the integration of Amazon Pinpoint and Amazon Personalize, which will show how a customer’s interaction with products in a retail setting can be tracked with smart-devices connected to AWS, resulting in real-time inferences and predictions on a customer’s affinity for products they haven’t yet interacted with. This information can be used to send push notifications with Amazon Pinpoint to a customer’s mobile device, making them aware of the products and possible deals that Amazon Personalize has predicted they will appreciate.
Learn to implement two-way SMS messaging for a simple approach that results in higher levels of customer engagement
SMS, or text messaging, is the simplest way to reach your users outside of normal customer-facing web or mobile applications. Compared to other communication channels, such as email and push notifications, text messaging results in higher engagement.
SMS messaging is extremely convenient — users don’t have to authenticate, download your app, or go to your website. They simply receive your message on their device. When it comes to customer acquisition and retention, it doesn’t get any easier than this.
In this article posted on A Cloud Guru, Dennis Hills explains what two-way SMS is and how you can quickly and easily start sending personalized, timely, and relevant text messages to your customers with Amazon Pinpoint. He then shows how you can implement a practical solution for setting up an SMS long code so you can start sending and receiving text messages.
Read the article now, and be sure to let us know in the comments what types of advanced topics for SMS messaging you’d like to see us or Dennis write about in the future.
Apple is rolling out an iOS security usability feature called Security code AutoFill. The basic idea is that the OS scans incoming SMS messages for security codes and suggests them in AutoFill, so that people can use them without having to memorize or type them.
Sounds like a really good idea, but Andreas Gutmann points out an application where this could become a vulnerability, namely transaction authentication:
Transaction authentication, as opposed to user authentication, is used to attest the correctness of the intention of an action rather than just the identity of a user. It is most widely known from online banking, where it is an essential tool to defend against sophisticated attacks. For example, an adversary can try to trick a victim into transferring money to a different account than the one intended. To achieve this the adversary might use social engineering techniques such as phishing and vishing and/or tools such as Man-in-the-Browser malware.
Transaction authentication is used to defend against these adversaries. Different methods exist but in the one of relevance here — which is among the most common methods currently used — the bank will summarise the salient information of any transaction request, augment this summary with a TAN tailored to that information, and send this data to the registered phone number via SMS. The user, or bank customer in this case, should verify the summary and, if this summary matches with his or her intentions, copy the TAN from the SMS message into the webpage.
This new iOS feature creates problems for the use of SMS in transaction authentication. Applied to 2FA, the user would no longer need to open and read the SMS from which the code has already been conveniently extracted and presented. Unless this feature can reliably distinguish between OTPs in 2FA and TANs in transaction authentication, we can expect that users will also have their TANs extracted and presented without context of the salient information, e.g. amount and destination of the transaction. Yet, precisely the verification of this salient information is essential for security. Examples of where this scenario could apply include a Man-in-the-Middle attack on the user accessing online banking from their mobile browser, or where a malicious website or app on the user’s phone accesses the bank’s legitimate online banking service.
This is an interesting interaction between two security systems. Security code AutoFill eliminates the need for the user to view the SMS or memorize the one-time code. Transaction authentication assumes the user read and approved the additional information in the SMS message before using the one-time code.
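To see why the context matters, consider a toy model of the interaction. This is purely an illustration: the quoted text does not say how banks derive TANs or how the AutoFill feature finds codes, so the HMAC-based derivation, the message wording, and the function names below are all assumptions.

```python
import hashlib
import hmac
import re


def make_tan(secret, amount, destination, digits=6):
    """Derive a numeric TAN from the transaction details with HMAC-SHA256.

    Binding the code to amount and destination is one possible way to tailor
    a TAN to a transaction; real schemes may differ.
    """
    message = f"{amount}|{destination}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


def build_sms(secret, amount, destination):
    """The bank's SMS: the transaction summary followed by the TAN."""
    tan = make_tan(secret, amount, destination)
    return f"Transfer {amount} to account {destination}. TAN: {tan}"


def autofill_extract(sms):
    """Mimic a code-suggestion feature: return only the code, without its context."""
    match = re.search(r"TAN: (\d+)", sms)
    return match.group(1) if match else None
```

In this model the code that `autofill_extract` returns is valid for whatever transaction the SMS describes, including one an attacker altered. The user who accepts the suggestion never sees the amount or destination that the summary was meant to put in front of them.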
Abstract: We review the salient evidence consistent with or predicted by the Hoyle-Wickramasinghe (H-W) thesis of Cometary (Cosmic) Biology. Much of this physical and biological evidence is multifactorial. One particular focus are the recent studies which date the emergence of the complex retroviruses of vertebrate lines at or just before the Cambrian Explosion of ~500 Ma. Such viruses are known to be plausibly associated with major evolutionary genomic processes. We believe this coincidence is not fortuitous but is consistent with a key prediction of H-W theory whereby major extinction-diversification evolutionary boundaries coincide with virus-bearing cometary-bolide bombardment events. A second focus is the remarkable evolution of intelligent complexity (Cephalopods) culminating in the emergence of the Octopus. A third focus concerns the micro-organism fossil evidence contained within meteorites as well as the detection in the upper atmosphere of apparent incoming life-bearing particles from space. In our view the totality of the multifactorial data and critical analyses assembled by Fred Hoyle, Chandra Wickramasinghe and their many colleagues since the 1960s leads to a very plausible conclusion — life may have been seeded here on Earth by life-bearing comets as soon as conditions on Earth allowed it to flourish (about or just before 4.1 Billion years ago); and living organisms such as space-resistant and space-hardy bacteria, viruses, more complex eukaryotic cells, fertilised ova and seeds have been continuously delivered ever since to Earth so being one important driver of further terrestrial evolution which has resulted in considerable genetic diversity and which has led to the emergence of mankind.
This post is courtesy of Otavio Ferreira, Manager, Amazon SNS, AWS Messaging.
Amazon SNS message filtering provides a set of string and numeric matching operators that allow each subscription to receive only the messages of interest. Hence, SNS message filtering can simplify your pub/sub messaging architecture by offloading the message filtering logic from your subscriber systems, as well as the message routing logic from your publisher systems.
After you set the subscription attribute that defines a filter policy, the subscribing endpoint receives only the messages that carry attributes matching this filter policy. Other messages published to the topic are filtered out for this subscription. In addition, the native integration between SNS and Amazon CloudWatch provides visibility into the number of messages delivered, as well as the number of messages filtered out.
CloudWatch metrics are captured automatically for you. To get started with SNS message filtering, see Filtering Messages with Amazon SNS.
Message Filtering Metrics
The following six CloudWatch metrics are relevant to understanding your SNS message filtering activity:
NumberOfMessagesPublished – Inbound traffic to SNS. This metric tracks all the messages that have been published to the topic.
NumberOfNotificationsDelivered – Outbound traffic from SNS. This metric tracks all the messages that have been successfully delivered to endpoints subscribed to the topic. A delivery takes place either when the incoming message attributes match a subscription filter policy, or when the subscription has no filter policy at all, which results in a catch-all behavior.
NumberOfNotificationsFilteredOut – This metric tracks all the messages that were filtered out because they carried attributes that didn’t match the subscription filter policy.
NumberOfNotificationsFilteredOut-NoMessageAttributes – This metric tracks all the messages that were filtered out because they didn’t carry any attributes at all and, consequently, didn’t match the subscription filter policy.
NumberOfNotificationsFilteredOut-InvalidAttributes – This metric keeps track of messages that were filtered out because they carried invalid or malformed attributes and, thus, didn’t match the subscription filter policy.
NumberOfNotificationsFailed – This last metric tracks all the messages that failed to be delivered to subscribing endpoints, regardless of whether a filter policy had been set for the endpoint. This metric is emitted after the message delivery retry policy is exhausted, and SNS stops attempting to deliver the message. At that moment, the subscribing endpoint is likely no longer reachable. For example, the subscribing SQS queue or Lambda function has been deleted by its owner. You may want to closely monitor this metric to address message delivery issues quickly.
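The relationship between a filter policy and these metrics can be sketched with a simplified model. This is not how SNS is implemented: it covers only exact string matching, treats message attributes as a plain dictionary of strings, assumes delivery succeeds, and leaves out the invalid-attribute and failed-delivery cases. The function names are hypothetical.

```python
def matches(policy, attributes):
    """Simplified filter-policy check with exact matching only.

    Every policy key must be present in the message attributes, and the
    attribute value must be one of the values listed for that key.
    """
    return all(
        key in attributes and attributes[key] in allowed
        for key, allowed in policy.items()
    )


def delivery_metric(policy, attributes):
    """Name the CloudWatch metric one subscription would contribute to."""
    if not policy:
        return "NumberOfNotificationsDelivered"  # no filter policy: catch-all
    if not attributes:
        return "NumberOfNotificationsFilteredOut-NoMessageAttributes"
    if matches(policy, attributes):
        return "NumberOfNotificationsDelivered"
    return "NumberOfNotificationsFilteredOut"
```

For example, with the policy `{"event_type": ["order_placed", "order_cancelled"]}`, a message carrying `event_type` set to `order_placed` counts as delivered, while one carrying `order_shipped` counts as filtered out.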
Message filtering graphs
Through the AWS Management Console, you can compose graphs to display your SNS message filtering activity. The graph shows the number of messages published, delivered, and filtered out within the timeframe you specify (1h, 3h, 12h, 1d, 3d, 1w, or custom).
To compose an SNS message filtering graph with CloudWatch:
Open the CloudWatch console.
Choose Metrics, SNS, All Metrics, and Topic Metrics.
Select all metrics to add to the graph, such as:
NumberOfMessagesPublished
NumberOfNotificationsDelivered
NumberOfNotificationsFilteredOut
Choose Graphed metrics.
In the Statistic column, switch from Average to Sum.
Title your graph with a descriptive name, such as “SNS Message Filtering”.
After you have your graph set up, you may want to copy the graph link for bookmarking, emailing, or sharing with co-workers. You may also want to add your graph to a CloudWatch dashboard for easy access in the future. Both actions are available to you on the Actions menu, which is found above the graph.
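If you prefer to pull the same numbers programmatically, the console steps can be approximated with the CloudWatch GetMetricData API. The sketch below builds one query per metric using the Sum statistic. The topic name "my-topic" and the five-minute period are placeholders, and the fetch function assumes boto3 is installed and AWS credentials are configured, so it is defined but not called here.

```python
from datetime import datetime, timedelta, timezone

FILTERING_METRICS = [
    "NumberOfMessagesPublished",
    "NumberOfNotificationsDelivered",
    "NumberOfNotificationsFilteredOut",
]


def build_queries(topic_name, period_seconds=300):
    """Build MetricDataQueries entries that request the Sum of each metric."""
    queries = []
    for index, metric_name in enumerate(FILTERING_METRICS):
        queries.append({
            "Id": f"m{index}",  # query IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/SNS",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "TopicName", "Value": topic_name}],
                },
                "Period": period_seconds,
                "Stat": "Sum",  # same switch as Average -> Sum in the console
            },
            "ReturnData": True,
        })
    return queries


def fetch_last_hour(topic_name):
    """Call CloudWatch for the last hour of data (needs boto3 and credentials)."""
    import boto3  # imported lazily so build_queries works without boto3

    end = datetime.now(timezone.utc)
    cloudwatch = boto3.client("cloudwatch")
    return cloudwatch.get_metric_data(
        MetricDataQueries=build_queries(topic_name),
        StartTime=end - timedelta(hours=1),
        EndTime=end,
    )


queries = build_queries("my-topic")  # "my-topic" is a placeholder topic name
print([query["MetricStat"]["Metric"]["MetricName"] for query in queries])
```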
Summary
SNS message filtering defines how SNS topics behave in terms of message delivery. By using CloudWatch metrics, you gain visibility into the number of messages published, delivered, and filtered out. This enables you to validate the operation of filter policies and more easily troubleshoot during development phases.
SNS message filtering can be implemented easily with existing AWS SDKs by applying message and subscription attributes across all SNS supported protocols (Amazon SQS, AWS Lambda, HTTP, SMS, email, and mobile push). CloudWatch metrics for SNS message filtering are available now, in all AWS Regions.
We announced a preview of AWS IoT 1-Click at AWS re:Invent 2017 and have been refining it ever since, focusing on simplicity and a clean out-of-box experience. Designed to make IoT available and accessible to a broad audience, AWS IoT 1-Click is now generally available, along with new IoT buttons from AWS and AT&T.
I sat down with the dev team a month or two ago to learn about the service so that I could start thinking about my blog post. During the meeting they gave me a pair of IoT buttons and I started to think about some creative ways to put them to use. Here are a few that I came up with:
Help Request – Earlier this month I spent a very pleasant weekend at the HackTillDawn hackathon in Los Angeles. As the participants were hacking away, they occasionally had questions about AWS, machine learning, Amazon SageMaker, and AWS DeepLens. While we had plenty of AWS Solution Architects on hand (decked out in fashionable & distinctive AWS shirts for easy identification), I imagined an IoT button for each team. Pressing the button would alert the SA crew via SMS and direct them to the proper table.
Camera Control – Tim Bray and I were in the AWS video studio, prepping for the first episode of Tim’s series on AWS Messaging. Minutes before we opened the Twitch stream I realized that we did not have a clean, unobtrusive way to ask the camera operator to switch to a closeup view. Again, I imagined that a couple of IoT buttons would allow us to make the request.
Remote Dog Treat Dispenser – My dog barks every time a stranger opens the gate in front of our house. While it is great to have confirmation that my Ring doorbell is working, I would like to be able to press a button and dispense a treat so that Luna stops barking!
Homes, offices, factories, schools, vehicles, and health care facilities can all benefit from IoT buttons and other simple IoT devices, all managed using AWS IoT 1-Click.
All About AWS IoT 1-Click
As I said earlier, we have been focusing on simplicity and a clean out-of-box experience. Here’s what that means:
Architects can dream up applications for inexpensive, low-powered devices.
Developers don’t need to write any device-level code. They can make use of pre-built actions, which send email or SMS messages, or write their own custom actions using AWS Lambda functions.
Installers don’t have to install certificates or configure cloud endpoints on newly acquired devices, and don’t have to worry about firmware updates.
Administrators can monitor the overall status and health of each device, and can arrange to receive alerts when a device nears the end of its useful life and needs to be replaced, using a single interface that spans device types and manufacturers.
I’ll show you how easy this is in just a moment. But first, let’s talk about the current set of devices that are supported by AWS IoT 1-Click.
Who’s Got the Button?
We’re launching with support for two types of buttons (both pictured above). Both types of buttons are pre-configured with X.509 certificates, communicate to the cloud over secure connections, and are ready to use.
The AWS IoT Enterprise Button communicates via Wi-Fi. It has a 2000-click lifetime, encrypts outbound data using TLS, and can be configured using BLE and our mobile app. It retails for $19.99 (shipping and handling not included) and can be used in the United States, Europe, and Japan.
The AT&T LTE-M Button communicates via the LTE-M cellular network. It has a 1500-click lifetime, and also encrypts outbound data using TLS. The device and the bundled data plan are available at an introductory price of $29.99 (shipping and handling not included), and can be used in the United States.
We are very interested in working with device manufacturers in order to make even more shapes, sizes, and types of devices (badge readers, asset trackers, motion detectors, and industrial sensors, to name a few) available to our customers. Our team will be happy to tell you about our provisioning tools and our facility for pushing OTA (over the air) updates to large fleets of devices; you can contact them at [email protected].
AWS IoT 1-Click Concepts
I’m eager to show you how to use AWS IoT 1-Click and the buttons, but need to introduce a few concepts first.
Device – A button or other item that can send messages. Each device is uniquely identified by a serial number.
Placement Template – Describes a like-minded collection of devices to be deployed. Specifies the action to be performed and lists the names of custom attributes for each device.
Placement – A device that has been deployed. Referring to placements instead of devices gives you the freedom to replace and upgrade devices with minimal disruption. Each placement can include values for custom attributes such as a location (“Building 8, 3rd Floor, Room 1337”) or a purpose (“Coffee Request Button”).
Action – The AWS Lambda function to invoke when the button is pressed. You can write a function from scratch, or you can make use of a pair of predefined functions that send an email or an SMS message. The actions have access to the attributes; you can, for example, send an SMS message with the text “Urgent need for coffee in Building 8, 3rd Floor, Room 1337.”
Getting Started with AWS IoT 1-Click
Let’s set up an IoT button using the AWS IoT 1-Click Console:
If I didn’t have any buttons I could click Buy devices to get some. But, I do have some, so I click Claim devices to move ahead. I enter the device ID or claim code for my AT&T button and click Claim (I can enter multiple claim codes or device IDs if I want):
The AWS buttons can be claimed using the console or the mobile app; the first step is to use the mobile app to configure the button to use my Wi-Fi:
Then I scan the barcode on the box and click the button to complete the process of claiming the device. Both of my buttons are now visible in the console:
I am now ready to put them to use. I click on Projects, and then Create a project:
I name and describe my project, and click Next to proceed:
Now I define a device template, along with names and default values for the placement attributes. Here’s how I set up a device template (projects can contain several, but I just need one):
The action has two mandatory parameters (phone number and SMS message) built in; I add three more (Building, Room, and Floor) and click Create project:
I’m almost ready to ask for some coffee! The next step is to associate my buttons with this project by creating a placement for each one. I click Create placements to proceed. I name each placement, select the device to associate with it, and then enter values for the attributes that I established for the project. I can also add additional attributes that are peculiar to this placement:
I can inspect my project and see that everything looks good:
I click on the buttons and the SMS messages appear:
I can monitor device activity in the AWS IoT 1-Click Console:
And also in the Lambda Console:
The Lambda function itself is also accessible, and can be used as-is or customized:
As you can see, this is the code that lets me use {{*}} to include all of the placement attributes in the message and {{Building}} (for example) to include a specific placement attribute.
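To illustrate how that kind of placeholder expansion can work, here is a minimal sketch of a custom action. It is not the code of the predefined SMS function: the render_message helper is hypothetical, and the handler assumes the placement attributes arrive under event["placementInfo"]["attributes"].

```python
import re


def render_message(template, attributes):
    """Expand {{*}} to all attributes and {{Name}} to one attribute value."""
    all_attributes = ", ".join(f"{key}: {value}" for key, value in attributes.items())

    def substitute(match):
        name = match.group(1)
        if name == "*":
            return all_attributes
        # Unknown names are left untouched so that typos stay visible.
        return str(attributes.get(name, match.group(0)))

    return re.sub(r"\{\{\s*(\*|\w+)\s*\}\}", substitute, template)


def handler(event, context=None):
    """Lambda-style entry point; the event shape used here is an assumption."""
    attributes = event["placementInfo"]["attributes"]
    template = "Urgent need for coffee in Building {{Building}}, {{Floor}} Floor, Room {{Room}}."
    return render_message(template, attributes)


# Hypothetical event carrying the three custom attributes from the project.
sample_event = {
    "placementInfo": {
        "attributes": {"Building": "8", "Floor": "3rd", "Room": "1337"},
    }
}
print(handler(sample_event))
print(render_message("Details: {{*}}", sample_event["placementInfo"]["attributes"]))
```

A real action would pass the rendered text to a messaging service instead of returning it.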
Now Available
I’ve barely scratched the surface of this cool new service and I encourage you to give it a try (or a click) yourself. Buy a button or two, build something cool, and let me know all about it!
Pricing is based on the number of enabled devices in your account, measured monthly and pro-rated for partial months. Devices can be enabled or disabled at any time. See the AWS IoT 1-Click Pricing page for more info.
Last month, Wired published a long article about Ray Ozzie and his supposed new scheme for adding a backdoor in encrypted devices. It’s a weird article. It paints Ozzie’s proposal as something that “attains the impossible” and “satisfies both law enforcement and privacy purists,” when (1) it’s barely a proposal, and (2) it’s essentially the same key escrow scheme we’ve been hearing about for decades.
Basically, each device has a unique public/private key pair and a secure processor. The public key goes into the processor and the device, and is used to encrypt whatever user key encrypts the data. The private key is stored in a secure database, available to law enforcement on demand. The only other trick is that for law enforcement to use that key, they have to put the device in some sort of irreversible recovery mode, which means it can never be used again. That’s basically it.
I have no idea why anyone is talking as if this were anything new. Several cryptographers have already explained why this key escrow scheme is no better than any other key escrow scheme. The short answer is (1) we won’t be able to secure that database of backdoor keys, (2) we don’t know how to build the secure coprocessor the scheme requires, and (3) it solves none of the policy problems around the whole system. This is the typical mistake non-cryptographers make when they approach this problem: they think that the hard part is the cryptography to create the backdoor. That’s actually the easy part. The hard part is ensuring that it’s only used by the good guys, and there’s nothing in Ozzie’s proposal that addresses any of that.
I worry that this kind of thing is damaging in the long run. There should be some rule that any backdoor or key escrow proposal be a fully specified proposal, not just some cryptography and hand-waving notions about how it will be used in practice. And before it is analyzed and debated, it should have to satisfy some sort of basic security analysis. Otherwise, we’ll be swatting pseudo-proposals like this one, while those on the other side of this debate become increasingly convinced that it’s possible to design one of these things securely.
Already people are using the National Academies report on backdoors for law enforcement as evidence that engineers are developing workable and secure backdoors. Writing in Lawfare, Alan Z. Rozenshtein claims that the report, and a related New York Times story, “undermine the argument that secure third-party access systems are so implausible that it’s not even worth trying to develop them.” Susan Landau effectively corrects this misconception, but the damage is done.
Here’s the thing: it’s not hard to design and build a backdoor. What’s hard is building the systems — both technical and procedural — around them. Here’s Rob Graham:
He’s only solving the part we already know how to solve. He’s deliberately ignoring the stuff we don’t know how to solve. We know how to make backdoors, we just don’t know how to secure them.
A bunch of us cryptographers have already explained why we don’t think this sort of thing will work in the foreseeable future. We write:
Exceptional access would force Internet system developers to reverse “forward secrecy” design practices that seek to minimize the impact on user privacy when systems are breached. The complexity of today’s Internet environment, with millions of apps and globally connected services, means that new law enforcement requirements are likely to introduce unanticipated, hard to detect security flaws. Beyond these and other technical vulnerabilities, the prospect of globally deployed exceptional access systems raises difficult problems about how such an environment would be governed and how to ensure that such systems would respect human rights and the rule of law.
The reason so few of us are willing to bet on massive-scale key escrow systems is that we’ve thought about it and we don’t think it will work. We’ve looked at the threat model, the usage model, and the quality of hardware and software that exists today. Our informed opinion is that there’s no detection system for key theft, there’s no renewability system, HSMs are terrifically vulnerable (and the companies largely staffed with ex-intelligence employees), and insiders can be suborned. We’re not going to put the data of a few billion people on the line in an environment where we believe with high probability that the system will fail.
SMS messaging is becoming an increasingly vital tool for companies in a variety of industries and across a diverse range of use cases. For example, many app developers use SMS messaging as part of their process for onboarding new customers. When new customers sign up for a service, they’re asked to provide their mobile phone numbers. The developer sends the customer a one-time password in an SMS message, which the customer then enters into a form on the web or in an app to complete the registration process. This capability is an important tool for ensuring the security of customers’ accounts. However, if the customer doesn’t receive the SMS message that contains the one-time password, he or she may become frustrated, and might abandon the registration process completely.
There are many things that could prevent an SMS message from arriving, but the most common causes are data entry errors. Apps and web forms typically only perform basic verification of phone numbers that end users provide. For example, they might check to make sure customers only entered numeric characters, or that the number contains the right number of digits. When customers enter their mobile numbers into an app or on a web form, they might accidentally enter their country code twice, or not enter a country code at all, or enter a leading zero in front of their phone number, or any of a variety of other common errors. In some cases, customers might even enter numbers that aren’t capable of receiving SMS messages, such as landline or Voice over IP (VoIP) numbers. Basic number validation techniques won’t stop any of these common errors from occurring. To help our customers find and correct these common issues, we’re launching a new feature in Amazon Pinpoint called Phone Number Verify.
With Phone Number Verify, Amazon Pinpoint checks to see whether the phone number provided is in a valid format based on its country code prefix. If the number is formatted correctly, Amazon Pinpoint leaves the number as the customer entered it. If the number isn’t formatted correctly, Amazon Pinpoint compares the phone number to various rules to make it valid. For example, if a customer enters a leading zero after the country code, Phone Number Verify removes the extra character to make the number valid. In this case, if the customer entered +1 0 206 555 0199 (a United States phone number in the 206 area code), Phone Number Verify changes the number to a correctly formatted number without the extra zero (+1 206 555 0199).
In addition to formatting numbers properly, Phone Number Verify can also detect and fix country-specific nuances. For example, in Brazil, older mobile phone numbers had eight digits. To create a larger pool of possible mobile phone numbers, Brazilian mobile companies added a ninth digit, and added a 9 to the beginning of the older eight digit numbers. If a customer provides a Brazilian mobile number that contains eight digits, Phone Number Verify will detect the issue and automatically insert the 9 in the appropriate place, between the area code and the phone number. In this example, Phone Number Verify would change +55 11 9123 4567 (a number in country code 55 and area code 11) to +55 11 99123 4567. By capturing and addressing these issues at the time of entry, Phone Number Verify can potentially increase the number of customers you’re able to contact. Our internal testing shows that up to 10% of mobile phone numbers contain these kinds of errors—that’s 10% of your potential customers that you might not be able to contact otherwise!
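The two corrections described above can be sketched in a few lines. This is only an illustration of the rules as the post describes them, not Amazon Pinpoint's implementation: the normalize function is hypothetical, it recognizes only country codes 1 and 55, and it assumes that eight-digit Brazilian mobile numbers begin with a digit from 6 to 9.

```python
import re


def normalize(raw_number):
    """Apply two illustrative clean-up rules and return an E.164-style string."""
    digits = re.sub(r"\D", "", raw_number)

    # Only two country codes are recognized in this sketch.
    for country_code in ("55", "1"):
        if digits.startswith(country_code):
            national = digits[len(country_code):]
            break
    else:
        raise ValueError(f"unsupported country code in {raw_number!r}")

    # Rule 1: drop a stray leading zero entered after the country code.
    if national.startswith("0"):
        national = national[1:]

    # Rule 2 (Brazil): a two-digit area code followed by an eight-digit mobile
    # number gets a 9 inserted between the area code and the number.
    # Assumption: old eight-digit mobile numbers start with 6, 7, 8, or 9.
    if country_code == "55" and len(national) == 10 and national[2] in "6789":
        national = national[:2] + "9" + national[2:]

    return f"+{country_code}{national}"


for example in ("+1 0 206 555 0199", "+55 11 9123 4567"):
    print(example, "->", normalize(example))
```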
Phone Number Verify also returns metadata about phone numbers, such as the name of the telephone carrier, the type of phone number (landline, VoIP, mobile), and the geographic location (city, country, state and time zone) where the number is registered. Amazon Pinpoint users can use this metadata to perform additional validation. For example, if a user provides a landline number, you can immediately prompt the customer to enter a phone number that is capable of receiving text messages. Alternatively, when a customer provides a landline or VoIP number, you can call the user by using text-to-speech technology, rather than by sending a text message.
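As a sketch of how an application might act on that metadata, the function below picks a contact channel from a validation result. The PhoneType field name and its values (MOBILE, PREPAID, LANDLINE, VOIP) are assumptions about the shape of the response, and the sample responses are made up for illustration.

```python
def choose_channel(validation_result):
    """Pick a contact channel from the phone-type metadata (assumed field names)."""
    phone_type = validation_result.get("PhoneType", "INVALID")
    if phone_type in ("MOBILE", "PREPAID"):
        return "SMS"
    if phone_type in ("LANDLINE", "VOIP"):
        # Cannot receive text messages reliably: fall back to a voice call.
        return "VOICE"
    # Anything else: ask the customer for a different number.
    return "ASK_FOR_ANOTHER_NUMBER"


# Made-up validation results, for illustration only.
samples = [
    {"PhoneType": "MOBILE", "Carrier": "ExampleCorp Mobile"},
    {"PhoneType": "LANDLINE", "Carrier": "Example Telecom"},
    {"PhoneType": "INVALID"},
]
for sample in samples:
    print(sample["PhoneType"], "->", choose_channel(sample))
```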
Phone Number Verify is now available in Amazon Pinpoint in a limited release. If you’re interested in testing this feature, complete our application form.
Enterprises adopt containers because they recognize the benefits: speed, agility, portability, and high compute density. They understand how accelerating application delivery and deployment pipelines makes it possible to rapidly slipstream new features to customers. Although the benefits are indisputable, this acceleration raises concerns about security and corporate compliance with software governance. In this blog post, I provide a solution that shows how Layered Insight, the pioneer and global leader in container-native application protection, can be used with seamless application build and delivery pipelines like those available in AWS CodeBuild to address these concerns.
Layered Insight solutions
Layered Insight enables organizations to unify DevOps and SecOps by providing complete visibility and control of containerized applications. Using the industry’s first embedded security approach, Layered Insight solves the challenges of container performance and protection by providing accurate insight into container images, adaptive analysis of running containers, and automated enforcement of container behavior.
AWS CodeBuild
AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers. CodeBuild scales continuously and processes multiple builds concurrently, so your builds are not left waiting in a queue. You can get started quickly by using prepackaged build environments, or you can create custom build environments that use your own build tools.
Problem Definition
Security and compliance concerns span the lifecycle of application containers. Common concerns include:
Visibility into the container images. You need to verify the software composition information of the container image to determine whether known vulnerabilities associated with any of the software packages and libraries are included in the container image.
Governance of container images is critical because only certain open source packages/libraries, of specific versions, should be included in the container images. You need mechanisms for blacklisting all container images that include a certain version of a software package/library, or for allowing only open source software that comes with a specific type of license (such as Apache, MIT, GPL, and so on). You need to be able to address challenges such as:
· Defining the process for image compliance policies at the enterprise, department, and group levels.
· Preventing the images that fail the compliance checks from being deployed in critical environments, such as staging, pre-prod, and production.
Visibility into running container instances is critical, including:
· CPU and memory utilization.
· Security of the build environment.
· All activities (system, network, storage, and application layer) of the application code running in each container instance.
Protection of running container instances that is:
· Zero-touch for developers (not an SDK-based approach).
· Zero-touch for the DevOps team, without limiting the portability of the containerized application.
· Flexible enough to retain the option to switch to a different container stack or orchestration layer, or even to a different Container as a Service (CaaS).
· Fully automated for SecOps, so that the SecOps team doesn’t have to manually analyze and define detailed blacklist and whitelist policies.
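To make the governance requirement more concrete, here is a minimal sketch of an image compliance check that combines a version deny-list with a license allow-list. It is not Layered Insight's policy engine: the Package type, the check_image function, and all package names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    version: str
    license: str


def check_image(packages, denied_versions, allowed_licenses):
    """Return a list of human-readable policy violations for one image."""
    violations = []
    for package in packages:
        if (package.name, package.version) in denied_versions:
            violations.append(f"{package.name} {package.version} is deny-listed")
        if package.license not in allowed_licenses:
            violations.append(f"{package.name} uses disallowed license {package.license}")
    return violations


# Hypothetical software composition of a container image.
image_packages = [
    Package("libexample", "1.0.2", "MIT"),
    Package("examplelogger", "2.4.0", "Apache-2.0"),
    Package("closedthing", "0.9.1", "Proprietary"),
]
violations = check_image(
    image_packages,
    denied_versions={("libexample", "1.0.2")},
    allowed_licenses={"MIT", "Apache-2.0"},
)
for violation in violations:
    print(violation)
# A pipeline step could fail the build whenever the list is non-empty.
```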
Solution Details
In AWS CodeCommit, we have three projects:
● “Democode” is a simple Java application, with one buildspec to build the app into a Docker container (run by the build-demo-image CodeBuild project), and another to instrument said container (the instrument-image CodeBuild project). The resulting container is stored in the ECR repo javatest as javatest:20180415-layered. This instrumented container is running in the AWS Fargate cluster demo-java-app and can be seen in the Layered Insight runtime console as the javatest application in us-east-1.
● aws-codebuild-docker-images is a clone of the official aws-codebuild-docker-images repo on GitHub. This CodeCommit project is used by the build-python-builder CodeBuild project to build the Python 3.3.6 CodeBuild image, which is stored in the codebuild-python ECR repo. We then manually instructed the Layered Insight console to instrument the image.
● scan-java-image contains just a buildspec.yml file. This file is used by the scan-java-image CodeBuild project to instruct Layered Assessment to perform a vulnerability scan of the javatest container image built previously, and then run the scan results through a compliance policy that states there should be no medium vulnerabilities. This build fails, but in this case that is a success: the scan completes successfully, but compliance fails because medium-level issues are found in the scan.
This build is performed using the instrumented version of the Python 3.3.6 CodeBuild image, so the activity of the processes running within the build is recorded each time within the LI console.
Build container image
Create or use a CodeCommit project with your application. To build this image and store it in Amazon Elastic Container Registry (Amazon ECR), add a buildspec file to the project that builds the container image, and then create a CodeBuild project that runs it.
Scan container image
Once the image is built, create a new buildspec in the same project or a new one that looks similar to below (update ECR URL as necessary):
version: 0.2
phases:
  pre_build:
    commands:
      - echo Pulling down LI Scan API client scripts
      - git clone https://github.com/LayeredInsight/scan-api-example-python.git
      - echo Setting up LI Scan API client
      - cd scan-api-example-python
      - pip install layint_scan_api
      - pip install -r requirements.txt
  build:
    commands:
      - echo Scanning container started on `date`
      - IMAGEID=$(./li_add_image --name <aws-region>.amazonaws.com/javatest:20180415)
      - ./li_wait_for_scan -v --imageid $IMAGEID
      - ./li_run_image_compliance -v --imageid $IMAGEID --policyid PB15260f1acb6b2aa5b597e9d22feffb538256a01fbb4e5a95
Add the buildspec file to the git repo, push it, and then create a CodeBuild project using the instrumented Python 3.3.6 CodeBuild image at <aws-region>.amazonaws.com/codebuild-python:3.3.6-layered. Set the following environment variables in the CodeBuild project:
● LI_APPLICATIONNAME – name of the build to display
● LI_LOCATION – location of the build project to display
● LI_API_KEY – ApiKey:<key-name>:<api-key>
● LI_API_HOST – location of the Layered Insight API service
Instrument container image
Next, to instrument the new container image:
In the Layered Insight runtime console, ensure that the ECR registry and credentials are defined (click the Setup icon and the ‘+’ sign on the top right of the screen to add a new container registry). Note the name given to the registry in the console, as this needs to be referenced in the li_add_image command in the script below.
Next, add a new buildspec (with a new name) to the CodeCommit project, such as the one shown below. This code will download the Layered Insight runtime client, and use it to instruct the Layered Insight service to instrument the image that was just built:
version: 0.2
phases:
  pre_build:
    commands:
      - echo Pulling down LI API Runtime client scripts
      - git clone https://github.com/LayeredInsight/runtime-api-example-python
      - echo Setting up LI API client
      - cd runtime-api-example-python
      - pip install layint-runtime-api
      - pip install -r requirements.txt
  build:
    commands:
      - echo Instrumentation started on `date`
      - ./li_add_image --registry "Javatest ECR" --name IMAGE_NAME:TAG --description "IMAGE DESCRIPTION" --policy "Default Policy" --instrument --wait --verbose
Commit and push the new buildspec file.
Going back to CodeBuild, create a new project, with the same CodeCommit repo, but this time select the new buildspec file. Use a Python 3.3.6 builder – either the AWS or LI Instrumented version.
Click Continue
Click Save
Run the build, again on the master branch.
If everything runs successfully, a new image should appear in the ECR registry with a -layered suffix. This is the instrumented image.
Run instrumented container image
When the instrumented container runs (in ECS, Fargate, or elsewhere), it logs data back to the Layered Insight runtime console. Its appearance in the console can be modified by setting the LI_APPLICATIONNAME and LI_LOCATION environment variables when running the container.
Conclusion
In this blog post, we provided the steps needed to embed governance and runtime security in your build pipelines running on AWS CodeBuild using Layered Insight.
Good news for cloud security experts: following our most popular beta exam ever, the AWS Certified Security – Specialty exam is here. This new exam allows experienced cloud security professionals to demonstrate and validate their knowledge of how to secure the AWS platform.
About the exam
The security exam covers incident response, logging and monitoring, infrastructure security, identity and access management, and data protection. The exam is open to anyone who currently holds a Cloud Practitioner or Associate-level certification. We recommend candidates have five years of IT security experience designing and implementing security solutions, and at least two years of hands-on experience securing AWS workloads.
The exam validates:
An understanding of specialized data classifications and AWS data protection mechanisms.
An understanding of data encryption methods and AWS mechanisms to implement them.
An understanding of secure Internet protocols and AWS mechanisms to implement them.
A working knowledge of AWS security services and features of services to provide a secure production environment.
Competency gained from two or more years of production deployment experience using AWS security services and features.
Ability to make trade-off decisions with regard to cost, security, and deployment complexity given a set of application requirements.
An understanding of security operations and risk.
How to prepare
We have training and other resources to help you prepare for the exam.
AWS Glue provides enhanced support for working with datasets that are organized into Hive-style partitions. AWS Glue crawlers automatically identify partitions in your Amazon S3 data. The AWS Glue ETL (extract, transform, and load) library natively supports partitions when you work with DynamicFrames. DynamicFrames represent a distributed collection of data without requiring you to specify a schema. You can now push down predicates when creating DynamicFrames to filter out partitions and avoid costly calls to S3. We have also added support for writing DynamicFrames directly into partitioned directories without converting them to Apache Spark DataFrames.
Partitioning has emerged as an important technique for organizing datasets so that they can be queried efficiently by a variety of big data systems. Data is organized in a hierarchical directory structure based on the distinct values of one or more columns. For example, you might decide to partition your application logs in Amazon S3 by date—broken down by year, month, and day. Files corresponding to a single day’s worth of data would then be placed under a prefix such as s3://my_bucket/logs/year=2018/month=01/day=23/.
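As a small illustration of this layout, the following sketch builds a Hive-style prefix from a date and reads the partition values back out of a path. The helper names are hypothetical, and the bucket and prefix are the example values from the paragraph above.

```python
from datetime import date


def partition_prefix(base, day):
    """Build a Hive-style year/month/day prefix under the given base path."""
    return f"{base.rstrip('/')}/year={day:%Y}/month={day:%m}/day={day:%d}/"


def parse_partitions(path):
    """Extract key=value partition pairs from a Hive-style path."""
    return dict(part.split("=", 1) for part in path.split("/") if "=" in part)


prefix = partition_prefix("s3://my_bucket/logs", date(2018, 1, 23))
print(prefix)                    # s3://my_bucket/logs/year=2018/month=01/day=23/
print(parse_partitions(prefix))  # {'year': '2018', 'month': '01', 'day': '23'}
```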
Systems like Amazon Athena, Amazon Redshift Spectrum, and now AWS Glue can use these partitions to filter data by value without making unnecessary calls to Amazon S3. This can significantly improve the performance of applications that need to read only a few partitions.
In this post, we show you how to efficiently process partitioned datasets using AWS Glue. First, we cover how to set up a crawler to automatically scan your partitioned dataset and create a table and partitions in the AWS Glue Data Catalog. Then, we introduce some features of the AWS Glue ETL library for working with partitioned data. You can now filter partitions using SQL expressions or user-defined functions to avoid listing and reading unnecessary data from Amazon S3. We’ve also added support in the ETL library for writing AWS Glue DynamicFrames directly into partitions without relying on Spark SQL DataFrames.
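As a preview of what those two features look like in a job script, here is a minimal sketch. It assumes a GlueContext object is passed in by the caller, that the partition columns are named year, month, and day, and that the output format is Parquet. The helper names are hypothetical, and the table name and output path are left as parameters for you to supply.

```python
def partition_predicate(year, month=None, day=None):
    """Build a SQL-style predicate over string-typed partition columns."""
    clauses = [f"year == '{year}'"]
    if month is not None:
        clauses.append(f"month == '{month:02d}'")
    if day is not None:
        clauses.append(f"day == '{day:02d}'")
    return " and ".join(clauses)


def read_partitions(glue_context, database, table_name, predicate):
    """Create a DynamicFrame that only lists and reads the matching partitions."""
    return glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=table_name,
        push_down_predicate=predicate,
    )


def write_partitioned(glue_context, frame, output_path):
    """Write a DynamicFrame into year/month/day partitioned directories."""
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={
            "path": output_path,
            "partitionKeys": ["year", "month", "day"],
        },
        format="parquet",  # assumed output format for this sketch
    )


print(partition_predicate(2017, 1, 1))
```

Inside a Glue job or development endpoint, you would pass the job's GlueContext to read_partitions and write_partitioned; outside of Glue, only the predicate helper runs.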
Let’s get started!
Crawling partitioned data
In this example, we use the same GitHub archive dataset that we introduced in a previous post about Scala support in AWS Glue. This data, which is publicly available from the GitHub archive, contains a JSON record for every API request made to the GitHub service. A sample dataset containing one month of activity from January 2017 is available at the following location:
Here you can replace <region> with the AWS Region in which you are working, for example, us-east-1. This dataset is partitioned by year, month, and day, so an actual file will be at a path like the following:
To crawl this data, you can either follow the instructions in the AWS Glue Developer Guide or use the provided AWS CloudFormation template. This template creates a stack that contains the following:
An IAM role with permissions to access AWS Glue resources
A database in the AWS Glue Data Catalog named githubarchive_month
A crawler set up to crawl the GitHub dataset
An AWS Glue development endpoint (which is used in the next section to transform the data)
To run this template, you must provide an S3 bucket and prefix where you can write output data in the next section. The role that this template creates will have permission to write to this bucket only. You also need to provide a public SSH key for connecting to the development endpoint. For more information about creating an SSH key, see our Development Endpoint tutorial. After you create the AWS CloudFormation stack, you can run the crawler from the AWS Glue console.
In addition to inferring file types and schemas, crawlers automatically identify the partition structure of your dataset and populate the AWS Glue Data Catalog. This ensures that your data is correctly grouped into logical tables and makes the partition columns available for querying in AWS Glue ETL jobs or query engines like Amazon Athena.
After you crawl the table, you can view the partitions by navigating to the table in the AWS Glue console and choosing View partitions. The partitions should look like the following:
For Hive-style partitioned paths of the form key=val, crawlers automatically populate the column name. In this case, because the GitHub data is stored in directories of the form 2017/01/01, the crawlers use default names like partition_0, partition_1, and so on. You can easily change these names on the AWS Glue console: Navigate to the table, choose Edit schema, and rename partition_0 to year, partition_1 to month, and partition_2 to day.
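The naming rule described here can be illustrated with a short sketch. This is not how the crawler is implemented; partition_columns is a hypothetical helper that only mimics the two cases, and it expects the directory portion of a path with no file name.

```python
def partition_columns(relative_path: str) -> dict:
    """Map the directory segments of a path to partition column names.

    Segments of the form key=val keep their own key; bare segments get
    positional default names (partition_0, partition_1, ...).
    """
    columns = {}
    segments = [s for s in relative_path.strip("/").split("/") if s]
    for index, segment in enumerate(segments):
        if "=" in segment:
            key, value = segment.split("=", 1)
        else:
            key, value = f"partition_{index}", segment
        columns[key] = value
    return columns


print(partition_columns("2017/01/01"))
print(partition_columns("year=2018/month=01/day=23"))
```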
Now that you’ve crawled the dataset and named your partitions appropriately, let’s see how to work with partitioned data in an AWS Glue ETL job.
Transforming and filtering the data
To get started with the AWS Glue ETL libraries, you can use an AWS Glue development endpoint and an Apache Zeppelin notebook. AWS Glue development endpoints provide an interactive environment to build and run scripts using Apache Spark and the AWS Glue ETL library. They are great for debugging and exploratory analysis, and can be used to develop and test scripts before migrating them to a recurring job.
If you ran the AWS CloudFormation template in the previous section, then you already have a development endpoint named partition-endpoint in your account. Otherwise, you can follow the instructions in this development endpoint tutorial. In either case, you need to set up an Apache Zeppelin notebook, either locally, or on an EC2 instance. You can find more information about development endpoints and notebooks in the AWS Glue Developer Guide.
The following examples are all written in the Scala programming language, but they can all be implemented in Python with minimal changes.
Reading a partitioned dataset
To get started, let’s read the dataset and see how the partitions are reflected in the schema. First, you import some classes that you will need for this example and set up a GlueContext, which is the main class that you will use to read and write data.
Execute the following in a Zeppelin paragraph, which is a unit of executable code:
%spark
import com.amazonaws.services.glue.DynamicFrame
import com.amazonaws.services.glue.DynamicRecord
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.util.JsonOptions
import org.apache.spark.SparkContext
import java.util.Calendar
import java.util.GregorianCalendar
import scala.collection.JavaConversions._
@transient val spark: SparkContext = SparkContext.getOrCreate()
val glueContext: GlueContext = new GlueContext(spark)
This is straightforward with two caveats: First, each paragraph must start with the line %spark to indicate that the paragraph is Scala. Second, the spark variable must be marked @transient to avoid serialization issues. This is only necessary when running in a Zeppelin notebook.
Next, read the GitHub data into a DynamicFrame, which is the primary data structure that is used in AWS Glue scripts to represent a distributed collection of data. A DynamicFrame is similar to a Spark DataFrame, except that it has additional enhancements for ETL transformations. DynamicFrames are discussed further in the post AWS Glue Now Supports Scala Scripts, and in the AWS Glue API documentation.
The following snippet creates a DynamicFrame by referencing the Data Catalog table that you just crawled and then prints the schema:
%spark
val githubEvents: DynamicFrame = glueContext.getCatalogSource(
  database = "githubarchive_month",
  tableName = "data"
).getDynamicFrame()
githubEvents.schema.asFieldList.foreach { field =>
  println(s"${field.getName}: ${field.getType.getType.getName}")
}
You could also print the full schema using githubEvents.printSchema(). But in this case, the full schema is quite large, so I’ve printed only the top-level columns. This paragraph takes about 5 minutes to run on a standard size AWS Glue development endpoint. After it runs, you should see the following output:
Note that the partition columns year, month, and day were automatically added to each record.
Filtering by partition columns
One of the primary reasons for partitioning data is to make it easier to operate on a subset of the partitions, so now let’s see how to filter data by the partition columns. In particular, let’s find out what people are building in their free time by looking at GitHub activity on the weekends. One way to accomplish this is to use the filter transformation on the githubEvents DynamicFrame that you created earlier to select the appropriate events:
%spark
def filterWeekend(rec: DynamicRecord): Boolean = {
  def getAsInt(field: String): Int = {
    rec.getField(field) match {
      case Some(strVal: String) => strVal.toInt
      // The filter transformation will catch exceptions and mark the record as an error.
      case _ => throw new IllegalArgumentException(s"Unable to extract field $field")
    }
  }
  val (year, month, day) = (getAsInt("year"), getAsInt("month"), getAsInt("day"))
  val cal = new GregorianCalendar(year, month - 1, day) // Calendar months start at 0.
  val dayOfWeek = cal.get(Calendar.DAY_OF_WEEK)
  dayOfWeek == Calendar.SATURDAY || dayOfWeek == Calendar.SUNDAY
}
val filteredEvents = githubEvents.filter(filterWeekend)
filteredEvents.count
This snippet defines the filterWeekend function that uses the Java Calendar class to identify those records where the partition columns (year, month, and day) fall on a weekend. If you run this code, you see that there were 6,303,480 GitHub events falling on the weekend in January 2017, out of a total of 29,160,561 events. This seems reasonable—about 22 percent of the events fell on the weekend, and about 29 percent of the days that month fell on the weekend (9 out of 31). So people are using GitHub slightly less on the weekends, but there is still a lot of activity!
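The percentages quoted above can be checked with a few lines of arithmetic. This sketch uses only the two event counts reported in this post and the calendar for January 2017.

```python
import calendar
from datetime import date

weekend_events = 6_303_480
total_events = 29_160_561

days_in_month = calendar.monthrange(2017, 1)[1]
weekend_days = sum(
    1
    for day in range(1, days_in_month + 1)
    if date(2017, 1, day).weekday() >= 5  # 5 = Saturday, 6 = Sunday
)

event_share = weekend_events / total_events
day_share = weekend_days / days_in_month

print(f"weekend days: {weekend_days} of {days_in_month}")
print(f"share of events on weekends: {event_share:.1%}")
print(f"share of days on weekends: {day_share:.1%}")
```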
Predicate pushdowns for partition columns
The main downside to using the filter transformation in this way is that you have to list and read all files in the entire dataset from Amazon S3 even though you need only a small fraction of them. This is manageable when dealing with a single month’s worth of data. But as you try to process more data, you will spend an increasing amount of time reading records only to immediately discard them.
To address this issue, we recently released support for pushing down predicates on partition columns that are specified in the AWS Glue Data Catalog. Instead of reading the data and filtering the DynamicFrame at executors in the cluster, you apply the filter directly on the partition metadata available from the catalog. Then you list and read only the partitions from S3 that you need to process.
To accomplish this, you can specify a Spark SQL predicate as an additional parameter to the getCatalogSource method. This predicate can be any SQL expression or user-defined function as long as it uses only the partition columns for filtering. Remember that you are applying this to the metadata stored in the catalog, so you don’t have access to other fields in the schema.
The following snippet shows how to use this functionality to read only those partitions occurring on a weekend:
%spark
val partitionPredicate =
  "date_format(to_date(concat(year, '-', month, '-', day)), 'E') in ('Sat', 'Sun')"
val pushdownEvents = glueContext.getCatalogSource(
  database = "githubarchive_month",
  tableName = "data",
  pushDownPredicate = partitionPredicate).getDynamicFrame()
Here you use the Spark SQL string concat function to construct a date string. You use the to_date function to convert it to a date object, and the date_format function with the ‘E’ pattern to convert the date to a three-character day of the week (for example, Mon, Tue, and so on). For more information about these functions, Spark SQL expressions, and user-defined functions in general, see the Spark SQL documentation and list of functions.
Note that the pushDownPredicate parameter is also available in Python, where it is named push_down_predicate. The corresponding call in Python is as follows:
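A minimal sketch of that call, using the push_down_predicate argument of create_dynamic_frame.from_catalog in the AWS Glue Python library. The wrapper function read_weekend_events is hypothetical, and glue_context is assumed to be a GlueContext that was already created elsewhere, for example in a Glue job or on a development endpoint.

```python
PARTITION_PREDICATE = (
    "date_format(to_date(concat(year, '-', month, '-', day)), 'E') "
    "in ('Sat', 'Sun')"
)


def read_weekend_events(glue_context):
    """Read only the weekend partitions of the crawled table.

    glue_context is assumed to be an awsglue.context.GlueContext built
    elsewhere; this function only shows where the predicate is passed.
    """
    return glue_context.create_dynamic_frame.from_catalog(
        database="githubarchive_month",
        table_name="data",
        push_down_predicate=PARTITION_PREDICATE,
    )
```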
You can observe the performance impact of pushing down predicates by looking at the execution time reported for each Zeppelin paragraph. The initial approach using a Scala filter function took 2.5 minutes:
Because the version using a pushdown lists and reads much less data, it takes only 24 seconds to complete, roughly a 6X improvement (150 seconds versus 24)!
Of course, the exact benefit that you see depends on the selectivity of your filter. The more partitions that you exclude, the more improvement you will see.
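The effect of selectivity can be sketched without any AWS services. The following is an illustration of the idea only, not of how AWS Glue evaluates predicates: the partitions list stands in for catalog metadata, and the predicate looks at partition values alone, so pruned partitions never need to be listed or read.

```python
from datetime import date

# Partition metadata as it might be recorded in a catalog: one entry per day.
partitions = [
    {"year": "2017", "month": "01", "day": f"{d:02d}"} for d in range(1, 32)
]


def is_weekend(partition: dict) -> bool:
    """Evaluate the predicate using partition values only (no file reads)."""
    day = date(int(partition["year"]), int(partition["month"]), int(partition["day"]))
    return day.weekday() >= 5  # 5 = Saturday, 6 = Sunday


selected = [p for p in partitions if is_weekend(p)]
pruned_fraction = 1 - len(selected) / len(partitions)

print(f"{len(selected)} of {len(partitions)} partitions would be listed and read")
print(f"{pruned_fraction:.0%} of partitions are skipped entirely")
```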
In addition to Hive-style partitioning for Amazon S3 paths, Parquet and ORC file formats further partition each file into blocks of data that represent column values. Each block also stores statistics for the records that it contains, such as min/max for column values. AWS Glue supports pushdown predicates for both Hive-style partitions and block partitions in these formats. While reading data, it prunes unnecessary S3 partitions and also skips any blocks that the column statistics in Parquet and ORC files show do not need to be read.
Additional transformations
Now that you’ve read and filtered your dataset, you can apply any additional transformations to clean or modify the data. For example, you could augment it with sentiment analysis as described in the previous AWS Glue post.
To keep things simple, you can just pick out some columns from the dataset using the ApplyMapping transformation:
ApplyMapping is a flexible transformation for performing projection and type-casting. In this example, we use it to unnest several fields, such as actor.login, which we map to the top-level actor field. We also cast the id column to a long and the partition columns to integers.
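To show what this kind of projection and casting does to a single record, here is a plain-Python illustration. It is not the AWS Glue ApplyMapping implementation; apply_mapping and get_path are hypothetical helpers, the four-element tuples (source path, source type, target name, target type) follow the shape used for mappings in the Glue Python library, and the sample record is made up.

```python
CASTS = {"string": str, "int": int, "long": int}


def get_path(record: dict, path: str):
    """Follow a dotted path such as 'actor.login' into a nested record."""
    value = record
    for part in path.split("."):
        value = value[part]
    return value


def apply_mapping(record: dict, mappings: list) -> dict:
    """Project and cast fields; each mapping is (source, source_type, target, target_type)."""
    return {
        target: CASTS[target_type](get_path(record, source))
        for source, _source_type, target, target_type in mappings
    }


mappings = [
    ("id", "string", "id", "long"),
    ("type", "string", "type", "string"),
    ("actor.login", "string", "actor", "string"),
    ("year", "string", "year", "int"),
    ("month", "string", "month", "int"),
    ("day", "string", "day", "int"),
]

# A made-up record, shaped like a GitHub event, for illustration only.
record = {
    "id": "5000000001",
    "type": "PushEvent",
    "actor": {"login": "octocat"},
    "year": "2017",
    "month": "01",
    "day": "07",
}

mapped = apply_mapping(record, mappings)
print(mapped)
```

Fields that are not listed in the mappings, and the nesting under actor, do not appear in the result.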
Writing out partitioned data
The final step is to write out your transformed dataset to Amazon S3 so that you can process it with other systems like Amazon Athena. By default, when you write out a DynamicFrame, it is not partitioned—all the output files are written at the top level under the specified output path. Until recently, the only way to write a DynamicFrame into partitions was to convert it into a Spark SQL DataFrame before writing. We are excited to share that DynamicFrames now support native partitioning by a sequence of keys.
You can accomplish this by passing the additional partitionKeys option when creating a sink. For example, the following code writes out the dataset that you created earlier in Parquet format to S3 in directories partitioned by the type field.
Here, $outpath is a placeholder for the base output path in S3. The partitionKeys parameter can also be specified in Python in the connection_options dict:
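A minimal sketch of the Python form, using write_dynamic_frame.from_options from the AWS Glue Python library. The wrapper function write_partitioned_by_type is hypothetical; glue_context and frame are assumed to be an existing GlueContext and DynamicFrame, and out_path plays the role of $outpath.

```python
def write_partitioned_by_type(glue_context, frame, out_path):
    """Write frame as Parquet under out_path, partitioned by the type field.

    glue_context and frame are assumed to be created elsewhere; the point
    here is the partitionKeys entry in the connection_options dict.
    """
    return glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={"path": out_path, "partitionKeys": ["type"]},
        format="parquet",
    )
```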
When you execute this write, the type field is removed from the individual records and is encoded in the directory structure. To demonstrate this, you can list the output path using the aws s3 ls command from the AWS CLI:
PRE type=CommitCommentEvent/
PRE type=CreateEvent/
PRE type=DeleteEvent/
PRE type=ForkEvent/
PRE type=GollumEvent/
PRE type=IssueCommentEvent/
PRE type=IssuesEvent/
PRE type=MemberEvent/
PRE type=PublicEvent/
PRE type=PullRequestEvent/
PRE type=PullRequestReviewCommentEvent/
PRE type=PushEvent/
PRE type=ReleaseEvent/
PRE type=WatchEvent/
As expected, there is a partition for each distinct event type. In this example, we partitioned by a single value, but this is by no means required. For example, if you want to preserve the original partitioning by year, month, and day, you could simply set the partitionKeys option to be Seq("year", "month", "day").
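The way partition keys move out of the records and into the directory structure can be illustrated in plain Python. This is a sketch of the idea only, not of how AWS Glue writes files; partition_records is a hypothetical helper and the sample records are made up.

```python
from collections import defaultdict


def partition_records(records, partition_keys):
    """Group records by partition directory and drop the key fields from each record."""
    output = defaultdict(list)
    for record in records:
        directory = "/".join(f"{key}={record[key]}" for key in partition_keys) + "/"
        remaining = {k: v for k, v in record.items() if k not in partition_keys}
        output[directory].append(remaining)
    return dict(output)


# Made-up records for illustration only.
records = [
    {"id": 1, "type": "PushEvent", "year": 2017, "month": 1, "day": 7},
    {"id": 2, "type": "ForkEvent", "year": 2017, "month": 1, "day": 7},
    {"id": 3, "type": "PushEvent", "year": 2017, "month": 1, "day": 8},
]

by_type = partition_records(records, ["type"])
by_date = partition_records(records, ["year", "month", "day"])
print(sorted(by_type))
print(sorted(by_date))
```

With a single key the output has one directory level per distinct value; with several keys the directories nest in the order the keys are given.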
Conclusion
In this post, we showed you how to work with partitioned data in AWS Glue. Partitioning is a crucial technique for getting the most out of your large datasets. Many tools in the AWS big data ecosystem, including Amazon Athena and Amazon Redshift Spectrum, take advantage of partitions to accelerate query processing. AWS Glue provides mechanisms to crawl, filter, and write partitioned data so that you can structure your data in Amazon S3 however you want, to get the best performance out of your big data applications.
Ben Sowell is a senior software development engineer at AWS Glue. He has worked for more than 5 years on ETL systems to help users unlock the potential of their data. In his free time, he enjoys reading and exploring the Bay Area.
Mohit Saxena is a senior software development engineer at AWS Glue. His passion is building scalable distributed systems for efficiently managing data on cloud. He also enjoys watching movies and reading about the latest technology.
AWS CloudHSM provides fully managed, single-tenant hardware security modules (HSMs) in the AWS cloud. A CloudHSM cluster contains either one or multiple HSMs. Multiple HSMs support higher throughput levels for cryptographic operations and provide redundancy. For clusters with multiple HSMs, the CloudHSM service supports server-side automated synchronization of keys and policies. Users, however, are synchronized from the client side, and that synchronization is driven by configuration files that must be refreshed when the cluster size changes. If you do not refresh the configuration files, your CloudHSM user configurations could become unsynchronized, which can prevent your CloudHSM cluster from responding consistently to cryptographic operations.
In this blog post, I’ll provide a general overview of a CloudHSM architecture, discuss the cluster synchronization process, build a CloudHSM environment, show how the cluster users can become unsynchronized, and then restore user synchronization to bring your cluster back to a consistent state.
CloudHSM Architectural Overview
When you provision an HSM instance in CloudHSM, the HSM instance provides an elastic network interface (ENI) in your Amazon VPC while the HSM itself resides in a separate VPC managed by AWS CloudHSM. Your applications use the CloudHSM cluster ID to add or remove HSMs from the cluster and the ENI(s) of the HSM instance(s) to access the HSM instances.
You configure your cluster and its HSM instances using CloudHSM client software you deploy on Amazon EC2 instances in your VPC. You only need one such EC2 instance to manage a CloudHSM cluster, but it’s common to deploy additional EC2 instances in other availability zones to provide for client redundancy. Your applications communicate with the HSM instances using the client daemon. You manage and configure the cluster with command line tools including cloudhsm_mgmt_util, key_mgmt_util, and configure. An example of a CloudHSM architecture appears below.
Figure 1: A 3-Node CloudHSM architecture
The diagram shows a three-node CloudHSM cluster deployed in the us-west-2 (Oregon) region with three Amazon EC2 instances with the CloudHSM software. The client in Availability Zone 2 is communicating with the cluster through the elastic network interfaces in each availability zone.
CloudHSM Synchronization Process
Having discussed the architecture of AWS CloudHSM, let’s turn our attention to the matter of cluster synchronization. There are three events that require synchronization: cluster expansion, key management operations, and user management operations. Let’s look at each of these in more detail.
Cluster Expansion
When you add an HSM to an existing cluster, AWS CloudHSM clones all users, keys, and policies from another HSM in the cluster. No additional steps are required on your part.
Key Management Operations
Key management with the key_mgmt_util tool uses the CloudHSM client to communicate with the HSM cluster. Additionally, a fallback, HSM-based synchronization protocol keeps keys in sync.
User Management
You perform user management tasks, such as adding users or changing passwords, using the cloudhsm_mgmt_util tool. This tool communicates directly with the HSMs, bypassing the client daemon. cloudhsm_mgmt_util uses its own configuration files to determine the HSMs that it should connect to within the cluster. These configuration files aren’t updated dynamically when HSM instances are added. To prevent user synchronization errors, you must update the configuration files before running cloudhsm_mgmt_util. You must also not add new HSM instances to the cluster while you’re using the tool. This helps ensure that no HSM instances are accidentally left out of user updates that would in turn result in user synchronization problems.
Again, these safeguards are only necessary when using cloudhsm_mgmt_util. For all other applications and utilities using CloudHSM, the client daemon automatically reconfigures itself as you add and remove HSM instances from your cluster. In the remainder of this post, I will build a CloudHSM infrastructure as shown in the above diagram. I’ll then show you how users on your CloudHSM instances can become unsynchronized, and how to restore proper synchronization.
Prerequisites and Assumptions
You’ll need to have an AWS account that allows you to provision Amazon VPCs, Amazon EC2 instances, and CloudHSMs.
I’ll use the us-west-2 (Oregon) region, but you can use any region that offers CloudHSM.
You’ll need an Amazon EC2 key pair in the region.
You should have a working knowledge of the services I’ve mentioned.
Important: You’ll incur charges for the resources used in this example. You can find the cost of each service on that service’s pricing page.
Building a CloudHSM Infrastructure
Create an Amazon VPC with subnets in the us-west-2a, us-west-2b, and us-west-2c availability zones. I’ll use the Amazon VPC Architecture Quick Start, which is an AWS CloudFormation template that will do this on your behalf. Make sure you select the correct region after you load the Quick Start. Select the following parameters:
Parameter | Value
Availability Zones | us-west-2a, us-west-2b, us-west-2c
Number of Availability Zones | 3
Create private subnets | False
Create additional private subnets with dedicated network ACLs | False
Key pair name | The name of your Amazon EC2 key pair
Accept the default values for all other parameters.
Follow these instructions to create a CloudHSM cluster in your new VPC in the us-west-2a, us-west-2b and us-west-2c availability zones. Note that the cluster will not have any HSMs after it’s created.
Follow these instructions to initialize the cluster with an HSM in the us-west-2a availability zone. After the cluster is initialized, note the ENI IP address from the cluster details section in the console as shown here:
Install the client software on the EC2 instance you launched in step 4.
Add the IP of the EC2 instance that you identified in step 4 to the security group you identified in step 3.
Activate the cluster. The activation instructions will guide you through connecting to the EC2 instance you launched in step 4. Remain logged into the EC2 instance following the activation of the cluster for the steps below.
While you are still logged into the EC2 instance you just launched, follow the steps below to add a crypto user named example_user to the cluster:
Ensure the CloudHSM daemon is stopped:
$ sudo stop cloudhsm-client
Configure the IP address of the initial HSM using the ENI IP address from step 3:
$ sudo /opt/cloudhsm/bin/configure -a 10.0.129.209
Note: the configure tool updates two configuration files: one for the CloudHSM client, and the other for the cloudhsm_mgmt_util program that is used to administer users.
Start the CloudHSM client:
$ sudo start cloudhsm-client
Ensure the cloudhsm_mgmt_util configuration file is up to date. We need to do this to ensure cloudhsm_mgmt_util is aware of all the HSM instances in the cluster:
$ sudo /opt/cloudhsm/bin/configure -m
Connect to the HSM instances, enable end-to-end encryption, and log in to the HSM instances. Enabling end-to-end encryption encrypts the communication between cloudhsm_mgmt_util and the HSM to prevent interception of sensitive information such as passwords:
$ /opt/cloudhsm/bin/cloudhsm_mgmt_util /opt/cloudhsm/etc/cloudhsm_mgmt_util.cfg
aws-cloudhsm> enable_e2e
aws-cloudhsm> loginHSM CO admin
Figure 4: Connecting to a Single CloudHSM
Note: The connection or login is automatically executed on every HSM instance that cloudhsm_mgmt_util is aware of. Note also that for each of the commands that you enter, the cloudhsm_mgmt_util program identifies the IP address of the HSM with which it is communicating.
Add the user example_user and then confirm the addition by listing the users in the HSM:
aws-cloudhsm> createUser CU example_user yourpassword
aws-cloudhsm> listUsers
Use the quit command to log out and exit the program:
aws-cloudhsm> quit
Now that we’ve added a user to the CloudHSM, let’s add a key so we can see how users and keys are synchronized as the cluster changes.
Start the key_mgmt_util program:
$ /opt/cloudhsm/bin/key_mgmt_util
Log in to the HSM:
Command: loginHSM -u CU -s example_user
Notice that key_mgmt_util displays the node ID with which it is communicating.
Use the exit command to leave the program:
exit
Add another HSM to the cluster in the us-west-2b availability zone and note the ENI IP address from the cluster details section in the console, as shown here:
Figure 6: The ENI IP address
Update the cluster configuration files and use cloudhsm_mgmt_util to examine the user configuration:
$ sudo stop cloudhsm-client
$ sudo /opt/cloudhsm/bin/configure -a 10.0.129.209
Figure 7: Connecting to the 2-node CloudHSM cluster
Note that cloudhsm_mgmt_util now sends commands to both of the HSMs in the cluster. You can see the same thing when we list the users in the cluster.
Figure 8: Showing proper user synchronization across two CloudHSMs
Now, use key_mgmt_util to examine the keys:
Command: findKey
Figure 9: Showing that keys are properly synchronized across a 2-node CloudHSM cluster
This command confirms that when we added the second HSM, CloudHSM used cluster-initiated synchronization to load the users and keys into the new HSM.
The CloudHSM Cluster Users Become Unsynchronized
Start cloudhsm_mgmt_util and enable end-to-end encryption:
$ /opt/cloudhsm/bin/cloudhsm_mgmt_util /opt/cloudhsm/etc/cloudhsm_mgmt_util.cfg
aws-cloudhsm> enable_e2e
Figure 10: Connecting to the 2-node CloudHSM cluster
While cloudhsm_mgmt_util is left running, add a third HSM in us-west-2c through the console and note the ENI IP address, as shown here:
Figure 11: Connecting to the 2-node CloudHSM cluster
Going back to cloudhsm_mgmt_util, let’s add a user named newest_user to our cluster. Note that we have not exited cloudhsm_mgmt_util and refreshed its configuration file, so it’s still connected only to the first two HSM instances.
aws-cloudhsm> enable_e2e
aws-cloudhsm> loginHSM CO admin yourpassword
aws-cloudhsm> createUser CU newest_user yourpassword
Figure 12: Adding a User to only two nodes of a 3-node CloudHSM Cluster and breaking synchronization
The cloudhsm_mgmt_util command adds the user to the two HSMs it already knows about and had connected to. It doesn’t communicate with the newly added HSM.
Let’s fix this by exiting cloudhsm_mgmt_util, refreshing the configuration, and then running the management utility again.
$ sudo stop cloudhsm-client
You can now see cloudhsm_mgmt_util is communicating with all of the cluster nodes.
Figure 13: Connecting to a 3-node CloudHSM cluster
Let’s see what happens when we list the users:
aws-cloudhsm> listUsers
Figure 14: Showing that users are now unsynchronized
You can see from the results that one of the HSMs (server 1) is missing the user named newest_user. The reason this happened is that cloudhsm_mgmt_util was unaware of the HSM instance that was added while it was running (recall that cloudhsm_mgmt_util doesn’t use the cloudhsm_client daemon and, therefore, doesn’t get automatic cluster configuration updates).
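The failure mode can be reproduced with a toy model. This sketch does not use or imitate the CloudHSM tools; Hsm and ManagementSession are hypothetical classes that capture only the two behaviors described in this post: a new HSM is cloned from a peer, and the management tool reads its list of HSMs once at start-up. The index of the out-of-sync HSM in this model does not correspond to the server numbers shown in the figures.

```python
class Hsm:
    """A toy HSM that only tracks its user table."""

    def __init__(self, users=()):
        self.users = set(users)


class ManagementSession:
    """Mimics a tool that reads its list of HSMs once, at start-up."""

    def __init__(self, cluster):
        self.known_hsms = list(cluster)  # snapshot; never refreshed

    def create_user(self, name):
        for hsm in self.known_hsms:
            hsm.users.add(name)


cluster = [Hsm({"admin", "example_user"}), Hsm({"admin", "example_user"})]
session = ManagementSession(cluster)

# A third HSM joins while the session is open; it is cloned from a peer.
cluster.append(Hsm(cluster[0].users))

# The open session still knows about only the first two HSMs.
session.create_user("newest_user")

for index, hsm in enumerate(cluster):
    print(index, sorted(hsm.users))
```

Starting a new session after the cluster changes would take a fresh snapshot, which corresponds to refreshing the configuration file before running the tool.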
Restoring User Synchronization to the CloudHSM Cluster
We now want to add the user newest_user to the single HSM (server 1) that is out of sync. Normally, cloudhsm_mgmt_util works in cluster mode and applies your commands to all HSMs in the cluster. Since we want to work on a single HSM, we’re going to enter the server command to tell cloudhsm_mgmt_util to work in server mode and apply our commands just to that one HSM.
In the server command below, we specify the number of the HSM that we want to change based on the figure above. In the createUser command, you must use the same password that you used in step 3 (in the section titled “The CloudHSM Cluster Users Become Unsynchronized”) on the other HSMs in the cluster so that all HSMs in the cluster have identical user names and passwords. After we make this change, we use the exit command to transition from server mode back to cluster mode.
aws-cloudhsm> server 1
server1> createUser CU newest_user yourpassword
exit
Figure 15: Adding a user to a single-node of a 3-node CloudHSM cluster
Now that we have transitioned back to cluster mode, let’s confirm that the HSM user tables are now synchronized by listing the users:
aws-cloudhsm> listUsers
Figure 16: Showing that users are now synchronized across the 3-node CloudHSM cluster
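The comparison you make by eye when reading the user listings can be sketched as a set difference. The missing_users helper and the dictionaries of user names are hypothetical; they stand in for per-server user tables that you would transcribe from the tool's output.

```python
def missing_users(user_tables: dict) -> dict:
    """Return, per server, the users present elsewhere in the cluster but absent there."""
    all_users = set().union(*user_tables.values())
    return {
        server: sorted(all_users - users)
        for server, users in user_tables.items()
        if all_users - users
    }


# Hypothetical transcription of the user tables before and after the fix.
before = {
    0: {"admin", "example_user", "newest_user"},
    1: {"admin", "example_user"},
    2: {"admin", "example_user", "newest_user"},
}
after = {server: users | {"newest_user"} for server, users in before.items()}

print(missing_users(before))
print(missing_users(after))
```

An empty result means every server reports the same set of user names. It does not verify that the passwords match, which is why the same password must be reused when re-creating the user.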
Let’s take a look at the keys using key_mgmt_util:
Command: loginHSM -u CU -s example_user -p yourpassword
Command: findKey
Figure 17: Showing that keys continued to be synchronized across a 3-node CloudHSM Cluster
You can see that CloudHSM kept the keys in sync because key synchronization is cluster-initiated. No additional actions are required on our part.
Conclusion
AWS CloudHSM provides the ability to create scalable clusters of HSM instances to support high volumes of cryptographic operations and provide resiliency by supporting multiple availability zones. As mentioned, it’s important to be aware of the various modes of synchronization used in CloudHSM so that each HSM can provide consistent service. In particular, users are synchronized only by the client. Since cloudhsm_mgmt_util doesn’t rely on the client daemon to talk to HSM instances in your cluster, it doesn’t automatically update its configuration. If you follow the steps above and refresh the configuration information before changing users or passwords, CloudHSM will keep users and passwords synchronized within the cluster and provide consistent responses to cryptographic operations even if the level of redundancy within the HSM cluster changes.
If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Amazon CloudHSM forum or contact AWS Support.
Regular visitors to this blog may have noticed that its name has changed from the Amazon SES Blog to the AWS Messaging and Targeting Blog. The Amazon SES team has been working closely with the Amazon Pinpoint team in recent months, so we decided to create a single source of information for both products.
If you’re a dedicated Amazon SES user, don’t worry—Amazon SES isn’t going anywhere. However, as the goals of our two teams started to overlap more and more, we realized that we had lots of ideas for blog posts that would be relevant to users of both products.
If you’re not familiar with Amazon Pinpoint yet, allow us to make a brief introduction. Amazon Pinpoint was originally created to help mobile app developers analyze the ways that their customers used their apps, and to send mobile push messages to those users. Over time, the capabilities of Amazon Pinpoint grew to include the ability to send transactional messages (such as order confirmations, welcome messages, and one-time passwords), segment your audience, schedule campaign execution, and send messages using other channels (including SMS and email). In short, Amazon Pinpoint helps you deliver the right message to the right customers at the right time using the right channel.
In the past, this blog focused mainly on providing information about new features and common issues. Our new blog will include that same information, as well as practical tips, industry best practices, and the exciting things our customers have done using Amazon SES and Amazon Pinpoint. We hope you enjoy it!
If you have any questions, or if there’s anything you’d like to see us cover in the blog, please let us know in the comments section.
Today we’re launching a new feature for AWS Certificate Manager (ACM), Private Certificate Authority (CA). This new service allows ACM to act as a private subordinate CA. Previously, if a customer wanted to use private certificates, they needed specialized infrastructure and security expertise that could be expensive to maintain and operate. ACM Private CA builds on ACM’s existing certificate capabilities to help you easily and securely manage the lifecycle of your private certificates with pay-as-you-go pricing. This enables developers to provision certificates in just a few simple API calls while administrators have a central CA management console and fine-grained access control through granular IAM policies. ACM Private CA keys are stored securely in AWS managed hardware security modules (HSMs) that adhere to FIPS 140-2 Level 3 security standards. ACM Private CA automatically maintains certificate revocation lists (CRLs) in Amazon Simple Storage Service (S3) and lets administrators generate audit reports of certificate creation with the API or console. This service is packed full of features, so let’s jump in and provision a CA.
Provisioning a Private Certificate Authority (CA)
First, I’ll navigate to the ACM console in my region and select the new Private CAs section in the sidebar. From there I’ll click Get Started to start the CA wizard. For now, I only have the option to provision a subordinate CA, so I’ll select that, use my super secure desktop as the root CA, and click Next. This isn’t what I would do in a production setting, but it will work for testing out our private CA.
Now, I’ll configure the CA with some common details. The most important thing here is the Common Name which I’ll set as secure.internal to represent my internal domain.
Now I need to choose my key algorithm. You should choose the best algorithm for your needs, but know that ACM has a limitation today: it can only manage certificates that chain up to RSA CAs. For now, I’ll go with RSA 2048 bit and click Next.
In this next screen, I’m able to configure my certificate revocation list (CRL). CRLs are essential for notifying clients in the case that a certificate has been compromised before certificate expiration. ACM will maintain the revocation list for me, and I have the option of routing my S3 bucket to a custom domain. In this case I’ll create a new S3 bucket to store my CRL in and click Next.
Finally, I’ll review all the details to make sure I didn’t make any typos and click Confirm and create.
A few seconds later and I’m greeted with a fancy screen saying I successfully provisioned a certificate authority. Hooray! I’m not done yet though. I still need to activate my CA by creating a certificate signing request (CSR) and signing that with my root CA. I’ll click Get started to begin that process.
Now I’ll copy the CSR or download it to a server or desktop that has access to my root CA (or potentially another subordinate – so long as it chains to a trusted root for my clients).
Now I can use a tool like openssl to sign my cert and generate the certificate chain.
$ openssl ca -config openssl_root.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in csr/CSR.pem -out certs/subordinate_cert.pem
Using configuration from openssl_root.cnf
Enter pass phrase for /Users/randhunt/dev/amzn/ca/private/root_private_key.pem:
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
stateOrProvinceName :ASN.1 12:'Washington'
localityName :ASN.1 12:'Seattle'
organizationName :ASN.1 12:'Amazon'
organizationalUnitName:ASN.1 12:'Engineering'
commonName :ASN.1 12:'secure.internal'
Certificate is to be certified until Mar 31 06:05:30 2028 GMT (3650 days)
Sign the certificate? [y/n]:y
1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
After that I’ll copy my subordinate_cert.pem and certificate chain back into the console and click Next.
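The copy-and-paste round trip can also be scripted. This is a rough sketch, assuming the boto3 acm-pca client methods get_certificate_authority_csr and import_certificate_authority_certificate. The helper names and file paths are hypothetical, and the signing step with openssl still happens on the machine that holds the root key.

```python
from pathlib import Path


def download_csr(pca_client, ca_arn, csr_path):
    """Fetch the pending CA's CSR and save it where the root CA can sign it."""
    response = pca_client.get_certificate_authority_csr(
        CertificateAuthorityArn=ca_arn
    )
    Path(csr_path).write_text(response["Csr"])
    return csr_path


def import_signed_certificate(pca_client, ca_arn, cert_path, chain_path):
    """Upload the signed subordinate certificate and its chain to activate the CA."""
    pca_client.import_certificate_authority_certificate(
        CertificateAuthorityArn=ca_arn,
        Certificate=Path(cert_path).read_bytes(),
        CertificateChain=Path(chain_path).read_bytes(),
    )
```

In this flow, download_csr would write csr/CSR.pem before the openssl ca command runs, and import_signed_certificate would be called with certs/subordinate_cert.pem and the chain file once signing is done.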
Finally, I’ll review all the information and click Confirm and import. I should see a screen like the one below that shows my CA has been activated successfully.
Now that I have a private CA, I can provision private certificates by hopping back to the ACM console and creating a new certificate. After clicking to create a new certificate, I’ll select the Request a private certificate radio button and then click Request a certificate.
From there, the process is similar to provisioning a normal certificate in ACM.
Now I have a private certificate that I can bind to my ELBs, CloudFront Distributions, API Gateways, and more. I can also export the certificate for use on embedded devices or outside of ACM managed environments.
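For anyone who prefers the API to the console, requesting and exporting a private certificate might look like the sketch below. It assumes the boto3 acm client, whose request_certificate call accepts a CertificateAuthorityArn and whose export_certificate call takes a passphrase as bytes. The helper names and the example domain are illustrative only.

```python
def request_private_certificate(acm_client, domain_name, ca_arn):
    """Request a certificate issued by the private CA and return its ARN."""
    response = acm_client.request_certificate(
        DomainName=domain_name,
        CertificateAuthorityArn=ca_arn,
    )
    return response["CertificateArn"]


def export_private_certificate(acm_client, certificate_arn, passphrase):
    """Export the certificate, chain, and passphrase-encrypted private key.

    passphrase must be bytes; it protects the exported private key.
    """
    response = acm_client.export_certificate(
        CertificateArn=certificate_arn,
        Passphrase=passphrase,
    )
    return (
        response["Certificate"],
        response["CertificateChain"],
        response["PrivateKey"],
    )
```

A call such as request_private_certificate(boto3.client("acm"), "app.secure.internal", ca_arn) would use a hypothetical host name under the secure.internal domain configured earlier. Exporting is only needed for use outside ACM-managed environments.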
Available Now
ACM Private CA is a service in and of itself, and it is packed full of features that won’t fit into a blog post. I strongly encourage interested readers to go through the developer guide and familiarize themselves with certificate-based security. ACM Private CA is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt) and EU (Ireland). Each private CA costs $400 per month (prorated). You are not charged for certificates created and maintained in ACM, but you are charged for certificates where you have access to the private key (exported or created outside of ACM). The pricing per certificate is tiered, starting at $0.75 per certificate for the first 1,000 certificates and going down to $0.001 per certificate after 10,000 certificates.
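To get a feel for how the tiers combine, here is a small cost estimator. It is a sketch only: the $400 CA fee and the first and last tier prices come from the figures above, but the price for certificates 1,001 through 10,000 is not stated in this post, so the caller has to supply it. Proration of the monthly CA fee is not modeled.

```python
from decimal import Decimal

CA_MONTHLY_FEE = Decimal("400")     # per private CA, per month
FIRST_TIER_PRICE = Decimal("0.75")  # certificates 1 to 1,000
LAST_TIER_PRICE = Decimal("0.001")  # certificates beyond 10,000


def monthly_cost(num_cas, billable_certs, middle_tier_price):
    """Estimate one month's bill for a full (non-prorated) month.

    billable_certs counts only certificates whose private key you can access.
    middle_tier_price (str or Decimal) applies to certificates 1,001-10,000;
    it is an input because the post does not state it.
    """
    first = min(billable_certs, 1000)
    middle = min(max(billable_certs - 1000, 0), 9000)
    last = max(billable_certs - 10000, 0)
    return (
        num_cas * CA_MONTHLY_FEE
        + first * FIRST_TIER_PRICE
        + middle * Decimal(middle_tier_price)
        + last * LAST_TIER_PRICE
    )
```

For example, one CA with 1,000 exported certificates comes to $400 + 1,000 × $0.75 = $1,150 for the month, whatever the middle tier price is.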
I’m excited to see administrators and developers take advantage of this new service. As always please let us know what you think of this service on Twitter or in the comments below.
Today, our customers use AWS CloudHSM to meet corporate, contractual and regulatory compliance requirements for data security by using dedicated Hardware Security Module (HSM) instances within the AWS cloud. CloudHSM delivers all the benefits of traditional HSMs including secure generation, storage, and management of cryptographic keys used for data encryption that are controlled and accessible only by you.
As a managed service, it automates time-consuming administrative tasks such as hardware provisioning, software patching, high availability, backups and scaling for your sensitive and regulated workloads in a cost-effective manner. Backup and restore functionality is the core building block enabling scalability, reliability and high availability in CloudHSM.
You should consider using AWS CloudHSM if you require:
Keys stored in dedicated, third-party validated hardware security modules under your exclusive control
FIPS 140-2 compliance
Integration with applications using PKCS#11, Java JCE, or Microsoft CNG interfaces
Healthcare applications subject to HIPAA regulations
Streaming video solutions subject to contractual DRM requirements
We recently released a whitepaper, “Security of CloudHSM Backups,” that provides in-depth information on how backups are protected in all three phases of the CloudHSM backup lifecycle process: Creation, Archive, and Restore.
About the Author
Balaji Iyer is a senior consultant in the Professional Services team at Amazon Web Services. In this role, he has helped several customers successfully navigate their journey to AWS. His specialties include architecting and implementing highly-scalable distributed systems, operational security, large scale migrations, and leading strategic AWS initiatives.