These services no longer support using TLS 1.0 or TLS 1.1 on their FIPS endpoints. To help you meet your compliance needs, we are updating all AWS FIPS endpoints to a minimum of TLS 1.2 across all Regions. We will continue to update our services to support only TLS 1.2 or later on AWS FIPS endpoints, which you can check on the AWS FIPS webpage. This change doesn’t affect non-FIPS AWS endpoints.
When you make a connection from your client application to an AWS service endpoint, the client advertises the minimum and maximum TLS versions it supports. The AWS service endpoint selects the highest TLS version offered by the client that the endpoint also supports.
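For example, you can verify which protocol a client restricted to TLS 1.2 or later negotiates with a given endpoint. The following is a minimal sketch using Python's standard ssl module; the hostname shown is the AWS KMS FIPS endpoint for us-east-1, used here only as an example, so substitute the FIPS endpoint for your own service and Region.

import socket
import ssl

host = "kms-fips.us-east-1.amazonaws.com"  # example FIPS endpoint; substitute your own

# Build a client context that refuses anything older than TLS 1.2
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2

with socket.create_connection((host, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=host) as tls:
        # Prints the negotiated protocol, for example "TLSv1.2" or "TLSv1.3"
        print("Negotiated:", tls.version())

If the endpoint offered only TLS 1.0 or 1.1, the handshake above would fail rather than silently downgrading.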
FIPS 140-2 is a US and Canadian government standard that specifies the security requirements for cryptographic modules that protect sensitive information.
What are AWS FIPS endpoints?
All AWS services offer TLS 1.2 encrypted endpoints that can be used for all API calls. Some AWS services also offer FIPS 140-2 endpoints for customers who need to use FIPS validated cryptographic libraries to connect to AWS services.
Why are we upgrading to TLS 1.2?
Our upgrade to TLS 1.2 across all Regions reflects our ongoing commitment to help customers meet their compliance needs.
Is there more assistance available to help verify or update client applications?
If you’re using an AWS software development kit (AWS SDK), you can find information about how to properly configure the minimum and maximum TLS versions for your clients in the following AWS SDK topics:
You can also visit Tools to Build on AWS and browse by programming language to find the relevant SDK. AWS Support tiers cover development and production issues for AWS products and services, along with other key stack components. AWS Support doesn’t include code development for client applications.
If you have any questions or issues, you can start a new thread on one of the AWS forums, or contact AWS Support or your technical account manager (TAM).
If you have feedback about this post, submit comments in the Comments section below.
Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.
Amazon Simple Notification Service (SNS) now supports VPC Endpoints (VPCE) via AWS PrivateLink. You can use VPC Endpoints to privately publish messages to SNS topics, from an Amazon Virtual Private Cloud (VPC), without traversing the public internet. When you use AWS PrivateLink, you don’t need to set up an Internet Gateway (IGW), Network Address Translation (NAT) device, or Virtual Private Network (VPN) connection. You don’t need to use public IP addresses, either.
VPC Endpoints don’t require code changes and can bring additional security to pub/sub messaging use cases that rely on SNS. VPC Endpoints help promote data privacy and are aligned with assurance programs, including the Health Insurance Portability and Accountability Act (HIPAA), FedRAMP, and others discussed below.
VPC Endpoints for SNS in action
Here’s how VPC Endpoints for SNS works. The following example is based on a banking system that processes mortgage applications. This banking system, which has been deployed to a VPC, publishes each mortgage application to an SNS topic. The SNS topic then fans out the mortgage application message to two subscribing AWS Lambda functions:
Save-Mortgage-Application stores the application in an Amazon DynamoDB table. As the mortgage application contains personally identifiable information (PII), the message must not traverse the public internet.
Save-Credit-Report checks the applicant’s credit history against an external Credit Reporting Agency (CRA), then stores the final credit report in an Amazon S3 bucket.
The following diagram depicts the underlying architecture for this banking system:
To protect applicants’ data, the financial institution responsible for developing this banking system needed a mechanism to prevent PII data from traversing the internet when publishing mortgage applications from their VPC to the SNS topic. Therefore, they created a VPC endpoint to enable their publisher Amazon EC2 instance to privately connect to the SNS API. As shown in the diagram, when the VPC endpoint is created, an Elastic Network Interface (ENI) is automatically placed in the same VPC subnet as the publisher EC2 instance. This ENI exposes a private IP address that is used as the entry point for traffic destined for SNS. This ensures that traffic between the VPC and SNS doesn’t leave the Amazon network.
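Because the SNS API address stays the same inside the VPC, the publisher’s code doesn’t change. The following is a minimal sketch of the publish call, assuming boto3; the topic ARN and message payload are hypothetical placeholders, and the request is routed through the endpoint network interface rather than the public internet.

import json
import boto3

sns = boto3.client("sns", region_name="us-east-1")  # example Region

# Hypothetical mortgage application payload containing PII
mortgage_application = {"applicant_id": "12345", "requested_amount": 350000}

sns.publish(
    TopicArn="arn:aws:sns:us-east-1:123456789012:mortgage-applications",  # hypothetical ARN
    Message=json.dumps(mortgage_application),
)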
Set up VPC Endpoints for SNS
The process for creating a VPC endpoint to privately connect to SNS doesn’t require code changes: access the VPC Management Console, navigate to the Endpoints section, and create a new Endpoint. Three attributes are required (an equivalent API-based sketch follows this list):
The SNS service name.
The VPC and Availability Zones (AZs) from which you’ll publish your messages.
The Security Group (SG) to be associated with the endpoint network interface. The Security Group controls the traffic to the endpoint network interface from resources in your VPC. If you don’t specify a Security Group, the default Security Group for your VPC will be associated.
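The same three attributes can also be supplied through the EC2 API instead of the console. The following is a minimal sketch with boto3; every resource ID and the Region-specific service name are hypothetical placeholders for your own values.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    ServiceName="com.amazonaws.us-east-1.sns",  # the SNS service name for your Region
    VpcId="vpc-0123456789abcdef0",              # the VPC you publish from
    SubnetIds=["subnet-0123456789abcdef0"],     # the subnets/AZs of your publishers
    SecurityGroupIds=["sg-0123456789abcdef0"],  # controls traffic to the endpoint ENI
    PrivateDnsEnabled=True,                     # keep resolving the standard SNS API hostname
)
print("Created endpoint:", response["VpcEndpoint"]["VpcEndpointId"])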
The SNS API is served through HTTP Secure (HTTPS), and encrypts all messages in transit with Transport Layer Security (TLS) certificates issued by Amazon Trust Services (ATS). The certificates verify the identity of the SNS API server when encrypted connections are established. The certificates help establish proof that your SNS API client (SDK, CLI) is communicating securely with the SNS API server. A Certificate Authority (CA) issues the certificate to a specific domain. Hence, when a domain presents a certificate that’s issued by a trusted CA, the SNS API client knows it’s safe to make the connection.
Summary
VPC Endpoints can increase the security of your pub/sub messaging use cases by allowing you to publish messages to SNS topics, from instances in your VPC, without traversing the internet. Setting up VPC Endpoints for SNS doesn’t require any code changes because the SNS API address remains the same.
VPC Endpoints for SNS is now available in all AWS Regions where AWS PrivateLink is available. For information on pricing and regional availability, visit the VPC pricing page. For more information and onboarding steps, see Publishing to Amazon SNS Topics from Amazon Virtual Private Cloud in the SNS documentation.
If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Amazon SNS forum or contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
Today we’re launching a new feature for AWS Certificate Manager (ACM), Private Certificate Authority (CA). This new service allows ACM to act as a private subordinate CA. Previously, if a customer wanted to use private certificates, they needed specialized infrastructure and security expertise that could be expensive to maintain and operate. ACM Private CA builds on ACM’s existing certificate capabilities to help you easily and securely manage the lifecycle of your private certificates with pay-as-you-go pricing. This enables developers to provision certificates in just a few simple API calls, while administrators have a central CA management console and fine-grained access control through granular IAM policies. ACM Private CA keys are stored securely in AWS managed hardware security modules (HSMs) that adhere to FIPS 140-2 Level 3 security standards. ACM Private CA automatically maintains certificate revocation lists (CRLs) in Amazon Simple Storage Service (S3) and lets administrators generate audit reports of certificate creation with the API or console. This service is packed full of features, so let’s jump in and provision a CA.
Provisioning a Private Certificate Authority (CA)
First, I’ll navigate to the ACM console in my region and select the new Private CAs section in the sidebar. From there I’ll click Get Started to start the CA wizard. For now, I only have the option to provision a subordinate CA, so I’ll select that, use my super secure desktop as the root CA, and click Next. This isn’t what I would do in a production setting, but it will work for testing out our private CA.
Now, I’ll configure the CA with some common details. The most important thing here is the Common Name which I’ll set as secure.internal to represent my internal domain.
Now I need to choose my key algorithm. You should choose the best algorithm for your needs, but know that ACM has a limitation today in that it can only manage certificates that chain up to RSA CAs. For now, I’ll go with RSA 2048 bit and click Next.
In this next screen, I’m able to configure my certificate revocation list (CRL). CRLs are essential for notifying clients in the case that a certificate has been compromised before certificate expiration. ACM will maintain the revocation list for me, and I have the option of routing my S3 bucket to a custom domain. In this case I’ll create a new S3 bucket to store my CRL in and click Next.
Finally, I’ll review all the details to make sure I didn’t make any typos and click Confirm and create.
A few seconds later and I’m greeted with a fancy screen saying I successfully provisioned a certificate authority. Hooray! I’m not done yet though. I still need to activate my CA by creating a certificate signing request (CSR) and signing that with my root CA. I’ll click Get started to begin that process.
Now I’ll copy the CSR or download it to a server or desktop that has access to my root CA (or potentially another subordinate – so long as it chains to a trusted root for my clients).
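If I’d rather script this step, the CSR can also be retrieved through the ACM Private CA API. The following is a minimal sketch with boto3; the CA ARN is a hypothetical placeholder, and the output path matches the file referenced by the openssl command below.

import boto3

acm_pca = boto3.client("acm-pca", region_name="us-east-1")

csr = acm_pca.get_certificate_authority_csr(
    CertificateAuthorityArn="arn:aws:acm-pca:us-east-1:123456789012:certificate-authority/11111111-2222-3333-4444-555555555555"  # hypothetical
)["Csr"]

# Write the PEM-encoded CSR where the signing tooling expects it
with open("csr/CSR.pem", "w") as f:
    f.write(csr)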
Now I can use a tool like openssl to sign my cert and generate the certificate chain.
$ openssl ca -config openssl_root.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in csr/CSR.pem -out certs/subordinate_cert.pem
Using configuration from openssl_root.cnf
Enter pass phrase for /Users/randhunt/dev/amzn/ca/private/root_private_key.pem:
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
stateOrProvinceName :ASN.1 12:'Washington'
localityName :ASN.1 12:'Seattle'
organizationName :ASN.1 12:'Amazon'
organizationalUnitName:ASN.1 12:'Engineering'
commonName :ASN.1 12:'secure.internal'
Certificate is to be certified until Mar 31 06:05:30 2028 GMT (3650 days)
Sign the certificate? [y/n]:y
1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
After that I’ll copy my subordinate_cert.pem and certificate chain back into the console and click Next.
Finally, I’ll review all the information and click Confirm and import. I should see a screen like the one below that shows my CA has been activated successfully.
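The import step can also be automated with the ACM Private CA API instead of pasting into the console. The following is a minimal sketch with boto3; the CA ARN is a hypothetical placeholder, and certs/chain.pem stands in for whatever file holds your certificate chain.

import boto3

acm_pca = boto3.client("acm-pca", region_name="us-east-1")

with open("certs/subordinate_cert.pem", "rb") as cert_file, open("certs/chain.pem", "rb") as chain_file:
    acm_pca.import_certificate_authority_certificate(
        CertificateAuthorityArn="arn:aws:acm-pca:us-east-1:123456789012:certificate-authority/11111111-2222-3333-4444-555555555555",  # hypothetical
        Certificate=cert_file.read(),
        CertificateChain=chain_file.read(),
    )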
Now that I have a private CA, we can provision private certificates by hopping back to the ACM console and creating a new certificate. After clicking to create a new certificate, I’ll select the Request a private certificate radio button, then click Request a certificate.
From there, the process is similar to provisioning a normal certificate in ACM.
Now I have a private certificate that I can bind to my ELBs, CloudFront Distributions, API Gateways, and more. I can also export the certificate for use on embedded devices or outside of ACM managed environments.
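Issuing and exporting can also be driven through the ACM API, which is handy for automation. The following is a minimal sketch with boto3; the domain name, CA ARN, and passphrase are hypothetical placeholders, and in practice you would wait for the certificate status to become ISSUED before exporting.

import boto3

acm = boto3.client("acm", region_name="us-east-1")

# Request a private certificate signed by the private CA
cert_arn = acm.request_certificate(
    DomainName="app.secure.internal",  # hypothetical internal domain
    CertificateAuthorityArn="arn:aws:acm-pca:us-east-1:123456789012:certificate-authority/11111111-2222-3333-4444-555555555555",  # hypothetical
)["CertificateArn"]

# Export the certificate, chain, and encrypted private key for use outside ACM,
# for example on an embedded device (wait until the certificate is ISSUED first)
export = acm.export_certificate(
    CertificateArn=cert_arn,
    Passphrase=b"example-passphrase",  # protects the exported private key
)
print(export["Certificate"][:60], "...")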
Available Now
ACM Private CA is a service in and of itself, and it is packed full of features that won’t fit into a blog post. I strongly encourage interested readers to go through the developer guide and familiarize themselves with certificate-based security. ACM Private CA is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), and EU (Ireland). Private CAs cost $400 per month (prorated) for each private CA. You are not charged for certificates created and maintained in ACM, but you are charged for certificates where you have access to the private key (exported or created outside of ACM). The pricing per certificate is tiered, starting at $0.75 per certificate for the first 1,000 certificates and going down to $0.001 per certificate after 10,000 certificates.
I’m excited to see administrators and developers take advantage of this new service. As always please let us know what you think of this service on Twitter or in the comments below.
Today, our customers use AWS CloudHSM to meet corporate, contractual and regulatory compliance requirements for data security by using dedicated Hardware Security Module (HSM) instances within the AWS cloud. CloudHSM delivers all the benefits of traditional HSMs including secure generation, storage, and management of cryptographic keys used for data encryption that are controlled and accessible only by you.
As a managed service, it automates time-consuming administrative tasks such as hardware provisioning, software patching, high availability, backups and scaling for your sensitive and regulated workloads in a cost-effective manner. Backup and restore functionality is the core building block enabling scalability, reliability and high availability in CloudHSM.
You should consider using AWS CloudHSM if you require:
Keys stored in dedicated, third-party validated hardware security modules under your exclusive control
FIPS 140-2 compliance
Integration with applications using PKCS#11, Java JCE, or Microsoft CNG interfaces
Healthcare applications subject to HIPAA regulations
Streaming video solutions subject to contractual DRM requirements
We recently released a whitepaper, “Security of CloudHSM Backups,” that provides in-depth information on how backups are protected in all three phases of the CloudHSM backup lifecycle process: Creation, Archive, and Restore.
About the Author
Balaji Iyer is a senior consultant in the Professional Services team at Amazon Web Services. In this role, he has helped several customers successfully navigate their journey to AWS. His specialties include architecting and implementing highly-scalable distributed systems, operational security, large scale migrations, and leading strategic AWS initiatives.
AWS Key Management Service (KMS) now uses FIPS 140-2 validated hardware security modules (HSM) and supports FIPS 140-2 validated endpoints, which provide independent assurances about the confidentiality and integrity of your keys. Having additional third-party assurances about the keys you manage in AWS KMS can make it easier to use the service for regulated workloads.
AWS KMS HSMs are designed so that no one, not even AWS employees, can retrieve your plaintext keys. The service uses the FIPS 140-2 validated HSMs to protect your keys when you request the service to create keys on your behalf or when you import them. Your plaintext keys are never written to disk and are only used in volatile memory of the HSMs while performing your requested cryptographic operation. Furthermore, AWS KMS keys are never transmitted outside the AWS Regions in which they were created. And HSM firmware updates are controlled by multi-party access that is audited and reviewed by an independent group within AWS.
AWS KMS HSMs are validated at level 2 overall and at level 3 in the following areas:
Cryptographic Module Specification
Roles, Services, and Authentication
Physical Security
Design Assurance
You can also make AWS KMS requests to API endpoints that terminate TLS sessions using a FIPS 140-2 validated cryptographic software module. To do so, point the AWS KMS requests made from your applications at the unique FIPS 140-2 validated HTTPS endpoints. AWS KMS FIPS 140-2 validated HTTPS endpoints are powered by the OpenSSL FIPS Object Module. FIPS 140-2 validated API endpoints are available in all commercial AWS Regions where AWS KMS is available.
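Pointing an SDK client at a FIPS endpoint is a matter of overriding the endpoint URL. The following is a minimal sketch with boto3, assuming the us-east-1 AWS KMS FIPS endpoint; the key alias is a hypothetical placeholder.

import boto3

kms = boto3.client(
    "kms",
    region_name="us-east-1",
    endpoint_url="https://kms-fips.us-east-1.amazonaws.com",  # FIPS 140-2 validated endpoint
)

# Encrypt a small payload under a customer master key (alias is hypothetical)
ciphertext = kms.encrypt(
    KeyId="alias/example-key",
    Plaintext=b"sensitive data",
)["CiphertextBlob"]

Everything else about the API behaves the same; only the TLS termination differs.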
Our Health Customer Stories page lists just a few of the many customers that are building and running healthcare and life sciences applications on AWS. Customers like Verge Health, Care Cloud, and Orion Health trust AWS with Protected Health Information (PHI) and Personally Identifiable Information (PII) as part of their efforts to comply with HIPAA and HITECH.
Sixteen More Services
In my last HIPAA Eligibility Update I shared the news that we added eight additional services to our list of HIPAA eligible services. Today I am happy to let you know that we have added another sixteen services to the list, bringing the total up to 46. Here are the newest additions, along with some short descriptions and links to some of my blog posts to jog your memory:
Amazon RDS for MariaDB – This service lets you set up scalable, managed MariaDB instances in minutes, and offers high performance, high availability, and a simplified security model that makes it easy for you to encrypt data at rest and in transit. Read Amazon RDS Update – MariaDB is Now Available to learn more.
AWS Batch – This service lets you run large-scale batch computing jobs on AWS. You don’t need to install or maintain specialized batch software or build your own server clusters. Read AWS Batch – Run Batch Computing Jobs on AWS to learn more.
AWS Key Management Service – This service makes it easy for you to create and control the encryption keys used to encrypt your data. It uses HSMs to protect your keys, and is integrated with AWS CloudTrail in order to provide you with a log of all key usage. Read New AWS Key Management Service (KMS) to learn more.
AWS Snowball Edge – This is a data transfer device with 100 terabytes of on-board storage as well as compute capabilities. You can use it to move large amounts of data into or out of AWS, as a temporary storage tier, or to support workloads in remote or offline locations. To learn more, read AWS Snowball Edge – More Storage, Local Endpoints, Lambda Functions.
AWS Snowmobile – This is an exabyte-scale data transfer service. Pulled by a semi-trailer truck, each Snowmobile packs 100 petabytes of storage into a ruggedized 45-foot long shipping container. Read AWS Snowmobile – Move Exabytes of Data to the Cloud in Weeks to learn more (and to see some of my finest LEGO work).
Now that you can reserve seating in AWS re:Invent 2017 breakout sessions, workshops, chalk talks, and other events, the time is right to review the list of introductory, advanced, and expert content being offered this year. To learn more about breakout content types and levels, see Breakout Content.
SID201 – IAM for Enterprises: How Vanguard Strikes the Balance Between Agility, Governance, and Security For Vanguard, managing the creation of AWS Identity and Access Management (IAM) objects is key to balancing developer velocity and compliance. In this session, you learn how Vanguard designs IAM roles to control the blast radius of AWS resources and maintain simplicity for developers. Vanguard will also share best practices to help you manage governance and improve your visibility across your AWS resources.
SID202 – Deep dive about how Capital One automates the delivery of directory services across AWS accounts Traditional solutions for using Microsoft Active Directory across on-premises and AWS Cloud Windows workloads can require complex networking or syncing identities across multiple systems. AWS Directory Service for Microsoft Active Directory, also known as AWS Managed AD, offers you actual Microsoft Active Directory in the AWS Cloud as a managed service. In this session, you will learn how Capital One uses AWS Managed AD to provide highly available authentication and authorization services for its Windows workloads, such as Amazon RDS for SQL Server.
SID205 – Building the Largest Repo for Serverless Compliance-as-Code When you use the cloud to enable speed and agility, how do you know if you’ve done it correctly? We are on a mission to help builders follow industry best practices within security guardrails by creating the largest compliance-as-code repository, available to all. Compliance-as-code is the idea to translate best practices, guardrails, policies, and standards into codified unit testing. Apply this to your AWS environment to provide insights about what can or must be improved. Learn why compliance-as-code matters to gain speed (by getting developers, architects, and security pros on the same page), how it is currently used (demo), and how to start using it or being part of building it.
SID206 – Best Practices for Managing Security Operation on AWS To help prevent unexpected access to your AWS resources, it is critical to maintain strong identity and access policies and track, detect, and react to changes. In this session, you will learn how to use AWS Identity and Access Management (IAM) to control access to AWS resources and integrate your existing authentication system with IAM. We will cover how to deploy and control AWS infrastructure using code templates, including change management policies with AWS CloudFormation.
SID207 – Feedback Security in the Cloud Like many security teams, Riot has been challenged by new paradigms that came with the move to the cloud. We discuss how our security team has developed a security culture based on feedback and self-service to best thrive in the cloud. We detail how the team assessed the security gaps and challenges in our move to AWS, and then describe how the team works within Riot’s unique feedback culture.
SID208 – Less (Privilege) Is More: Getting Least Privilege Right in AWS AWS services are designed to enable control through AWS Identity and Access Management (IAM) and Amazon Virtual Private Cloud (VPC). Join us in this chalk talk to learn how to apply these toward the security principle of least privilege for applications and data and how to practically integrate them in your security operations.
SID209 – Designing and Deploying an AWS Account Factory AWS customers start off with one AWS account, but quickly realize the benefits of having multiple AWS accounts. A common learning curve for customers is how to securely baseline and set up new accounts at scale. This talk helps you understand how to use AWS Organizations, AWS Identity and Access Management (IAM), AWS CloudFormation, and other tools to baseline new accounts, set them up for federation, and make a secure and repeatable account factory to create new AWS accounts. Walk away with demos and tools to use in your own environment.
SID210 – A CISO’s Journey at Vonage: Achieving Unified Security at Scale Making sense of the risks of IT deployments that sit in hybrid environments and span multiple countries is a major challenge. When you add in multiple toolsets and global compliance requirements, including GDPR, it can get overwhelming. Listen to Vonage’s Chief Information Security Officer, Johan Hybinette, share his experiences tackling these challenges.
SID212 – Maximizing Your Move to AWS – Five Key Lessons from Vanguard and Cloud Technology Partners CTP’s Robert Christiansen and Mike Kavis describe how to maximize the value of your AWS initiative. From building a Minimum Viable Cloud to establishing a cloud robust security and compliance posture, we walk through key client success stories and lessons learned. We also explore how CTP has helped Vanguard, the leading provider of investor communications and technology, take advantage of AWS to delight customers, drive new revenue streams, and transform their business.
SID213 – Managing Regulator Expectations – Lessons Learned on Positioning AWS Services from an Audit Perspective Cloud migration in highly regulated industries can stall without a solid understanding of how (and when) to address regulatory expectations. This session provides a guide to explaining the aspects of AWS services that are most frequently the subject of an internal or regulatory audit. Because regulatory agencies and internal auditors might not share a common understanding of the cloud, this session is designed to help you to help them, regardless of their level of technical fluency.
SID214 – Best Security Practices in the Intelligence Community Executives from the Intelligence community discuss cloud security best practices in a field where security is imperative to operations. CIA security cloud chief John Nicely and NGA security cloud chief Scot Kaplan share success stories of migrating mass data to the cloud from a security perspective. Hear how they migrated their IT portfolios while managing their organizations’ unique blend of constraints, budget issues, politics, culture, and security pressures. Learn how these institutions overcame barriers to migration, and ask these panelists what actions you can take to better prepare yourself for the journey of mass migration to the cloud.
SID216 – Defending Diverse Applications Against Common Threats In this session, you learn how to adapt application defenses and operational responses based on your unique requirements. You also hear directly from customers about how they architected their applications on AWS to protect their applications. There are many ways to build secure, high-availability applications in the cloud. Services such as Amazon API Gateway, Amazon VPC, ALB, ELB, and Amazon EC2 are the basic building blocks that enable you to address a wide range of use cases. Best practices for defending your applications against Distributed Denial of Service (DDoS) attacks, exploitation attempts, and bad bots can vary with your choices in architecture.
Advanced level
SID301 – Using AWS Lambda as a Security Team Operating a security practice on AWS brings many new challenges that haven’t been faced in data center environments. The dynamic nature of infrastructure, the relationship between development team members and their applications, and the architecture paradigms have all changed as a result of building software on top of AWS. In this session, learn how your security team can leverage AWS Lambda as a tool to monitor, audit, and enforce your security policies within an AWS environment.
SID302 – Force Multiply Your Security Team with Automation and Alexa Adversaries automate. Who says the good guys can’t as well? By combining AWS offerings like AWS CloudTrail, Amazon CloudWatch, AWS Config, and AWS Lambda with the power of Amazon Alexa, you can do more security tasks faster, with fewer resources. Force multiplying your security team is all about automation! Last year, we showed off penetration testing at the push of an (AWS IoT) button, and surprise-previewed how to ask Alexa to run Inspector as-needed. Want to see other ways to ask Alexa to be your cloud security sidekick? We have crazy new demos at the ready to show security geeks how to sling security automation solutions for their AWS environments (and impress and help your boss, too).
SID303 – How You Can Use AWS’s Identity Services to be Successful on Your AWS Cloud Journey Every journey to the AWS Cloud is unique. Some customers are migrating existing applications, while others are building new applications using cloud-native services. Along each of these journeys, identity and access management helps customers protect their applications and resources. In this session, you will learn how AWS’s identity services provide you a secure, flexible, and easy solution for managing identities and access on the AWS Cloud. With AWS’s identity services, you do not have to adapt to AWS. Instead, you have a choice of services designed to meet you anywhere along your journey to the AWS Cloud.
SID304 – SecOps 2021 Today: Using AWS Services to Deliver SecOps This talk dives deep on how to build end-to-end security capabilities using AWS. Our goal is orchestrating AWS Security services with other AWS building blocks to deliver enhanced security. We cover working with Amazon CloudWatch Events as a queueing mechanism for processing security events, using Amazon DynamoDB to provide a stateful layer that delivers tailored responses to events and handles other ancillary functions, using DynamoDB as an attack signature engine, and the use of analytics to derive tailored signatures for detection with AWS Lambda.
SID305 – How CrowdStrike Built a Real-time Security Monitoring Service on AWS The CrowdStrike motto is “We Stop Breaches.” To do that, it needed to build a real-time security monitoring service to detect threats. Join this session to learn how CrowdStrike uses Amazon EC2 and Amazon EBS to help its customers identify vulnerabilities before they become large-scale problems.
SID306 – How Chick-fil-A Embraces DevSecOps on AWS As Chick-fil-A became a cloud-first organization, their security team didn’t want to become the bottleneck for agility. But the security team also wanted to raise the bar for their security posture on AWS. Robert Davis, security architect at Chick-fil-A, provides an overview about how he and his team recognized that writing code was the best way for their security policies to scale across the many AWS accounts that Chick-fil-A operates.
SID307 – Serverless for Security Officers: Paradigm Walkthrough and Comprehensive Security Best Practices For security practitioners, serverless represents a context switch from the familiar servers and networks to a decentralized set of code snippets and AWS platform constructs. This new ecosystem also represents new operational teams, data flows, security tooling, and faster-than-ever change velocity. In this talk, we perform live demos and provide code samples for a wide array of security best practices aligned to industry standards such as NIST 800-53 and ISO 27001.
SID308 – Multi-Account Strategies We will explore a multi-account architecture and how to approach the design/thought process around it. This chalk talk will allow attendees to dive deep into the topic and discuss the nuances of the architecture as well as provide feedback around the approach.
SID309 – Credentials, Credentials, Credentials, Oh My! For new and experienced customers alike, understanding the various credential forms and exchange mechanisms within AWS can be a daunting exercise. In this chalk talk, we clear up the confusion by performing a cartography exercise. We visually depict the right source credentials (for example, enterprise user name and password, IAM keys, AWS STS tokens, and so on) and transformation mechanisms (for example, AssumeRole and so on) to use depending on what you’re trying to do and where you’re coming from.
SID310 – Moving from the Shadows to the Throne What do you do when leadership embraces what was called “shadow IT” as the new path forward? How do you onboard new accounts while simultaneously pushing policy to secure all existing accounts? This session walks through Cisco’s journey consolidating over 700 existing accounts in the Cisco organization, while building and applying Cisco’s new cloud policies.
SID311 – Designing Security and Governance Across a Multi-Account Strategy When organizations plan their journey to cloud adoption at scale, they quickly encounter questions such as: How many accounts do we need? How do we share resources? How do we integrate with existing identity solutions? In this workshop, we present best practices and give you the hands-on opportunity to test and develop best practices. You will work in teams to set up and create an AWS environment that is enterprise-ready for application deployment and integration into existing operations, security, and procurement processes. You will get hands-on experience with cross-account roles, consolidated logging, account governance and other challenges to solve.
SID312 – DevSecOps Capture the Flag In this Capture the Flag workshop, we divide groups into teams and work on AWS CloudFormation DevSecOps. The AWS Red Team supplies an AWS DevSecOps Policy that needs to be enforced via CloudFormation static analysis. Participant Blue Teams are provided with an AWS Lambda-based reference architecture to be used to inspect CloudFormation templates against that policy. Interesting items need to be logged, and made visible via ChatOps. Dangerous items need to be logged, and recorded accurately as a template fail. The secondary challenge is building a CloudFormation template to thwart the controls being created by the other Blue teams.
SID313 – Continuous Compliance on AWS at Scale In cloud migrations, the cloud’s elastic nature is often touted as a critical capability in delivering on key business initiatives. However, you must account for it in your security and compliance plans or face some real challenges. Always counting on a virtual host to be running, for example, causes issues when that host is rebooted or retired. Managing security and compliance in the cloud is continuous, requiring forethought and automation. Learn how a leading, next generation managed cloud provider uses automation and cloud expertise to manage security and compliance at scale in an ever-changing environment.
SID314 – IAM Policy Ninja Are you interested in learning how to control access to your AWS resources? Have you wondered how to best scope permissions to achieve least-privilege permissions access control? If your answer is “yes,” this session is for you. We look at the AWS Identity and Access Management (IAM) policy language, starting with the basics of the policy language and how to create and attach policies to IAM users, groups, and roles. We explore policy variables, conditions, and tools to help you author least privilege policies. We cover common use cases, such as granting a user secure access to an Amazon S3 bucket or to launch an Amazon EC2 instance of a specific type.
SID315 – Security and DevOps: Agility and Teamwork In this session, you learn pragmatic steps to integrate security controls into DevOps processes in your AWS environment at scale. Cybersecurity expert and founder of Alert Logic Misha Govshteyn shares insights from high performing teams who are embracing the reality that an agile security program can enable faster and more secure workload deployments. Joining Misha is Joey Peloquin, Director of Cloud Security Operations at Citrix, who discusses Citrix’s DevOps experiences and how they manage their cybersecurity posture within the AWS Cloud. Session sponsored by Alert Logic.
SID316 – Using Access Advisor to Strike the Balance Between Security and Usability AWS provides a killer feature for security operations teams: Access Advisor. In this session, we discuss how Access Advisor shows the services to which an IAM policy grants access and provides a timestamp for the last time that the role authenticated against that service. At Netflix, we use this valuable data to automatically remove permissions that are no longer used. By continually removing excess permissions, we can achieve a balance of empowering developers and maintaining a best-practice, secure environment.
SID317 – Automating Security and Compliance Testing of Infrastructure-as-Code for DevSecOps Infrastructure-as-Code (IaC) has emerged as an essential element of organizational DevOps practices. Tools such as AWS CloudFormation and Terraform allow software-defined infrastructure to be deployed quickly and repeatably to AWS. But the agility of CI/CD pipelines also creates new challenges in infrastructure security hardening. This session provides a foundation for how to bring proven software hardening practices into the world of infrastructure deployment. We discuss how to build security and compliance tests for infrastructure analogous to unit tests for application code, and showcase how security, compliance and governance testing fit in a modern CI/CD pipeline.
SID318 – From Obstacle to Advantage: The Changing Role of Security & Compliance in Your Organization A surprising trend is starting to emerge among organizations who are progressing through the cloud maturity lifecycle: major improvements in revenue growth, customer satisfaction, and mission success are being directly attributed to improvements in security and compliance. At one time thought of as speed bumps in the path to deployment, security and compliance are now seen as critical ingredients that help organizations differentiate their offerings in the market, win more deals, and achieve mission-critical goals faster. This session explores how organizations like Jive Software and the National Geospatial Agency use the Evident Security Platform, AWS, and AWS Quick Starts to automate security and compliance processes in their organization to accomplish more, do it faster, and deliver better results.
SID319 – Incident Response in the Cloud In this session, we walk you through a hypothetical incident response managed on AWS. Learn how to apply existing best practices as well as how to leverage the unique security visibility, control, and automation that AWS provides. We cover how to set up your AWS environment to prevent a security event and how to build a cloud-specific incident response plan so that your organization is prepared before a security event occurs. This session also covers specific environment recovery steps available on AWS.
SID320 – Fraud Prevention, Detection, Lessons Learned, and Best Practices Fighting fraud means countering human actors that quickly adapt to whatever you do to stop them. In this presentation, we discuss the key components of a fraud prevention program in the cloud. Additionally, we provide techniques for detecting known and unknown fraud activity and explore different strategies for effectively preventing detected patterns. Finally, we discuss lessons learned from our own prevention activities as well as the best practices that you can apply to manage risk.
SID321 – How Capital One Applies AWS Organizations Best Practices to Manage Multiple AWS Accounts In this session, we review best practices for managing multiple AWS accounts using AWS Organizations. We cover how to think about the master account and your account strategy, as well as how to roll out changes. You learn how Capital One applies these best practices to manage its AWS accounts, which number over 160, and PCI workloads.
SID322 – The AWS Philosophy of Security AWS distinguished engineer Eric Brandwine speaks with hundreds of customers each year, and noticed one question coming up more than any other, “How does AWS operationalize its own security?” In this session, Eric details both strategic and tactical considerations, along with an insider’s look at AWS tooling and processes.
SID324 – Automating DDoS Response in the Cloud If left unmitigated, Distributed Denial of Service (DDoS) attacks have the potential to harm application availability or impair application performance. DDoS attacks can also act as a smoke screen for intrusion attempts or as a harbinger for attacks against non-cloud infrastructure. Accordingly, it’s crucial that developers architect for DDoS resiliency and maintain robust operational capabilities that allow for rapid detection and engagement during high-severity events. In this session, you learn how to build a DDoS-resilient application and how to use services like AWS Shield and Amazon CloudWatch to defend against DDoS attacks and automate response to attacks in progress.
SID325 – Amazon Macie: Data Visibility Powered by Machine Learning for Security and Compliance Workloads In this session, Edmunds discusses how they create workflows to manage their regulated workloads with Amazon Macie, a newly released security and compliance management service that leverages machine learning to classify your sensitive data and business-critical information. Amazon Macie uses recurrent neural networks (RNN) to identify and alert on potential misuse of intellectual property. They do a deep dive into machine learning within the security ecosystem.
SID326 – AWS Security State of the Union Steve Schmidt, chief information security officer at AWS, addresses the current state of security in the cloud, with a particular focus on feature updates, the AWS internal “secret sauce,” and what’s on the horizon in terms of security, identity, and compliance tooling.
SID327 – How Zocdoc Achieved Security and Compliance at Scale With Infrastructure as Code In less than 12 months, Zocdoc became a cloud-first organization, diversifying their tech stack and liberating data to help drive rapid product innovation. Brian Lozada, CISO at Zocdoc, and Zhen Wang, Director of Engineering, provide an overview on how their teams recognized that infrastructure as code was the most effective approach for their security policies to scale across their AWS infrastructure. They leveraged tools such as AWS CloudFormation, hardened AMIs, and hardened containers. The use of DevSecOps within Zocdoc has enhanced data protection with the use of AWS services such as AWS KMS and AWS CloudHSM and auditing capabilities, and event-based policy enforcement with Amazon Elasticsearch Service and Amazon CloudWatch, all built on top of AWS.
SID328 – Cloud Adoption in Regulated Financial Services Macquarie, a global provider of financial services, identified early on that it would require strong partnership between its business, technology and risk teams to enable the rapid adoption of AWS cloud technologies. As a result, Macquarie built a Cloud Governance Platform to enable its risk functions to move as quickly as its development teams. This platform has been the backbone of Macquarie’s adoption of AWS over the past two years and has enabled Macquarie to accelerate its use of cloud technologies for the benefit of clients across multiple global markets. This talk will outline the strategy that Macquarie embarked on, describe the platform they built, and provide examples of other organizations who are on a similar journey.
SID329 – A Deep Dive into AWS Encryption Services AWS Encryption Services provide an easy and cost-effective way to protect your data in AWS. In this session, you learn about leveraging the latest encryption management features to minimize risk for your data.
SID330 – Best Practices for Implementing Your Encryption Strategy Using AWS Key Management Service AWS Key Management Service (KMS) is a managed service that makes it easy for you to create and manage the encryption keys used to encrypt your data. In this session, we will dive deep into best practices learned by implementing AWS KMS at AWS’s largest enterprise clients. We will review the different capabilities described in the AWS Cloud Adoption Framework (CAF) Security Perspective and how to implement these recommendations using AWS KMS. In addition to sharing recommendations, we will also provide examples that will help you protect sensitive information on the AWS Cloud.
SID331 – Architecting Security and Governance Across a Multi-Account Strategy Whether it is per business unit or per application, many AWS customers use multiple accounts to meet their infrastructure isolation, separation of duties, and billing requirements. In this session, we discuss considerations, limitations, and security patterns when building out a multi-account strategy. We explore topics such as identity federation, cross-account roles, consolidated logging, and account governance. Thomson Reuters shared their journey and their approach to a multi-account strategy. At the end of the session, we present an enterprise-ready, multi-account architecture that you can start leveraging today.
SID332 – Identity Management for Your Users and Apps: A Deep Dive on Amazon Cognito Learn how to set up an end-user directory, secure sign-up and sign-in, manage user profiles, authenticate and authorize your APIs, federate from enterprise and social identity providers, and use OAuth to integrate with your app—all without any server setup or code. With clear blueprints, we show you how to leverage Amazon Cognito to administer and secure your end users and enable identity for the applied patterns of mobile, web, and enterprise apps.
SID334 – How Amazon Business Uses Amazon Cloud Directory as the Data Store for Its Account Management Platform Join the Amazon Business team to learn how it uses Amazon Cloud Directory as the data store for its account management platform. You will learn how Amazon Business uses Amazon DynamoDB with Cloud Directory to manage user authorization and business process workflows. You also will learn how Cloud Directory helps to manage hierarchical datasets and how to get started modeling these datasets in Cloud Directory.
SID335 – Implementing Security and Governance across a Multi-Account Strategy As existing or new organizations expand their AWS footprint, managing multiple accounts while maintaining security quickly becomes a challenge. In this chalk talk, we will demonstrate how AWS Organizations, IAM roles, identity federation, and cross-account manager can be combined to build a scalable multi-account management platform. By the end of this session, attendees will have the understanding and deployment patterns to bring a secure, flexible and automated multi-account management platform to their organizations.
SID336 – Use AWS to Effectively Manage GDPR Compliance The General Data Protection Regulation (GDPR) is considered to be the most stringent privacy regulation ever enacted. Complying with GDPR could be a challenge for organizations, and AWS services can help get you ahead of the May 2018 enforcement deadline. In this chalk talk, the Legal and Compliance GDPR leadership at AWS discusses what enforcement of GDPR might mean for you and your customer’s compliance programs.
SID337 – Best Practices for Managing Access to AWS Resources Using IAM Roles In this chalk talk, we discuss why using temporary security credentials to manage access to your AWS resources is an AWS Identity and Access Management (IAM) best practice. IAM roles help you follow this best practice by delivering and rotating temporary credentials automatically. We discuss the different types of IAM roles, the assume role functionality, and how to author fine-grained trust and access policies that limit the scope of IAM roles. We then show you how to attach IAM roles to your AWS resources, such as Amazon EC2 instances and AWS Lambda functions. We also discuss migrating applications that use long-term AWS access keys to temporary credentials managed by IAM roles.
SID338 – Governance at Scale Once a customer achieves success with using AWS in a few pilot projects, most look to rapidly adopt an “all-in” enterprise migration strategy. Along this journey, several new challenges emerge that quickly become blockers and slow down migrations if they are not addressed properly. At this scale, customers will deal with the governance of hundreds of accounts, as well as thousands of IT resources residing within those accounts. Humans and traditional IT management processes cannot scale at the same pace and inevitably challenging questions emerge. In this session, we discuss those questions about governance at scale.
SID339 – Deep Dive on AWS CloudHSM Organizations building applications that handle confidential or sensitive data are subject to many types of regulatory requirements and often rely on hardware security modules (HSMs) to provide validated control of encryption keys and cryptographic operations. AWS CloudHSM is a cloud-based hardware security module (HSM) that enables you to easily generate and use your own encryption keys on the AWS Cloud using FIPS 140-2 Level 3 validated HSMs. This chalk talk will provide you a deep-dive on CloudHSM, and demonstrate how you can quickly and easily use CloudHSM to help secure your data and meet your compliance requirements.
SID340 – Using Infrastructure as Code to Inject Security Best Practices as Part of the Software Deployment Lifecycle A proactive approach to security is key to securing your applications as part of software deployment. In this chalk talk, T. Rowe Price, a financial asset management institution, outlines how they built their security automation process in enabling their numerous developer teams to rapidly and securely build and deploy applications at scale on AWS. Learn how they use services like AWS Identity and Access Management (IAM), HashiCorp tools, Terraform for automation, and Vault for secrets management, and incorporate certificate management and monitoring as part of the deployment process. T. Rowe Price discusses lessons learned and best practices to move from a tightly controlled legacy environment to an agile, automated software development process on AWS.
SID341 – Using AWS CloudTrail Logs for Scalable, Automated Anomaly Detection This workshop gives you an opportunity to develop a solution that can continuously monitor for and detect a realistic threat by analyzing AWS CloudTrail log data. Participants are provided with a CloudTrail data source and some clues to get started. Then you have to design a system that can process the logs, detect the threat, and trigger an alarm. You can make use of any AWS services that can assist in this endeavor, such as AWS Lambda for serverless detection logic, Amazon CloudWatch or Amazon SNS for alarming and notification, Amazon S3 for data and configuration storage, and more.
SID342 – Protect Your Web Applications from Common Attack Vectors Using AWS WAF As attacks and attempts to exploit vulnerabilities in web applications become more sophisticated, having an effective web request filtering solution becomes key to keeping your users’ data safe. In this workshop, discover how the OWASP Top 10 list of application security risks can help you secure your web applications. Learn how to use AWS services, such as AWS WAF, to mitigate vulnerabilities. This session includes hands-on labs to help you build a solution. Key learning goals include understanding the breadth and complexity of vulnerabilities customers need to protect from, understanding the AWS tools and capabilities that can help mitigate vulnerabilities, and learning how to configure effective HTTP request filtering rules using AWS WAF.
SID343 – User Management and App Authentication with Amazon Cognito Are you curious about how to authenticate and authorize your applications on AWS? Have you thought about how to integrate AWS Identity and Access Management (IAM) with your app authentication? Have you tried to integrate third-party SAML providers with your app authentication? Look no further. This workshop walks you through step by step to configure and create Amazon Cognito user pools and identity pools. This workshop presents you with the framework to build an application using Java, .NET, and serverless. You choose the stack and build the app with local users. See the service being used not only with mobile applications but with other stacks that normally don’t include Amazon Cognito.
SID344 – Soup to Nuts: Identity Federation for AWS AWS offers customers multiple solutions for federating identities on the AWS Cloud. In this session, we will embark on a tour of these solutions and the use cases they support. Along the way, we will dive deep with demonstrations and best practices to help you be successful managing identities on the AWS Cloud. We will cover how and when to use Security Assertion Markup Language 2.0 (SAML), OpenID Connect (OIDC), and other AWS native federation mechanisms. You will learn how these solutions enable federated access to the AWS Management Console, APIs, and CLI, AWS Infrastructure and Managed Services, your web and mobile applications running on the AWS Cloud, and much more.
SID345 – AWS Encryption SDK: The Busy Engineer’s Guide to Client-Side Encryption You know you want client-side encryption for your service but you don’t know exactly where to start. Join us for a hands-on workshop where we review some of your client-side encryption options and explore implementing client-side encryption using the AWS Encryption SDK. In this session, we cover the basics of client-side encryption, perform encrypt and decrypt operations using AWS KMS and the AWS Encryption SDK, and discuss security and performance considerations when implementing client-side encryption in your service.
Expert level
SID401 – Let’s Dive Deep Together: Advancing Web Application Security Beginning with a recap of best practices in CloudFront, AWS WAF, Route 53, and Amazon VPC security, we break into small teams to work together on improving the security of a typical web application. How can we creatively use the services? What additional features would help us? This technically advanced chalk talk requires certification at the solutions architect associate level or greater.
SID402 – An AWS Security Odyssey: Implementing Security Controls in the World of Internet, Big Data, IoT and E-Commerce Platforms This workshop will give participants the opportunity to take a security-focused journey across various AWS services and implement automated controls along the way. You will learn how to apply AWS security controls to services such as Amazon EC2, Amazon S3, AWS Lambda, and Amazon VPC. In short, you will learn how to use the cloud to protect the cloud. We will talk about how to adopt a workload-centric approach to your security strategy, address security issues in a cost-effective manner, and automate your security responses to promote maturity and auditability. In order to complete this workshop, attendees will need a laptop with wireless access, an AWS account, and an IAM user that has full administrative privileges within their account. AWS credits will be provided as attendees depart the session to cover the cost of running the workshop in their own account.
SID404 – Amazon Inspector – Automating the “Sec” in DevSecOps Adopting DevSecOps can be challenging using traditional security tools that are designed for on-premises infrastructure. Amazon Inspector is an automated security assessment service that helps you adopt DevSecOps by integrating security assessments directly into the development process of applications running on Amazon EC2. We dive deep on how to use Inspector to automate host security assessments. We show you how to integrate Inspector with other AWS Cloud services to provide automated security assessments throughout your development process. We demo installing the AWS agent, setting up assessment targets and templates, and running assessments. We review the findings and discuss how you can automate the management and remediation of those findings with your available AWS services.
SID405 – Five New Security Automation Improvements You Can Make by Using Amazon CloudWatch Events and AWS Config Rules This presentation will include a deep dive into the code behind multiple security automation and remediation functions. This session will consider potential use cases, as well as feature a demonstration of a proposed script, and then walk through the code set to explain the various challenges and solutions of the intended script. All examples of code will be previously unreleased and will feature integration with services such as Trusted Advisor and Macie. All code will be released as OSS after re:Invent.
A robust analytics platform is critical for an organization’s success. However, an analytical system is only as good as the data that is used to power it. Wrong or incomplete data can derail analytical projects completely. Moreover, with varied data types and sources being added to the analytical platform, it’s important that the platform is able to keep up. This means the system has to be flexible enough to adapt to changing data types and volumes.
For a platform based on relational databases, this would mean countless data model updates, code changes, and regression testing, all of which can take a very long time. For a decision support system, it’s imperative to provide the correct answers quickly. Architects must design analytical platforms with the following criteria:
Help ingest data easily from multiple sources
Analyze it for completeness and accuracy
Use it for metrics computations
Store the data assets and scale as they grow rapidly without causing disruptions
Adapt to changes as they happen
Have a relatively short development cycle that is repeatable and easy to implement.
Schema-on-read is a unique approach for storing and querying datasets. It reverses the order of things when compared to schema-on-write in that the data can be stored as is and you apply a schema at the time that you read it. This approach has changed the time to value for analytical projects. It means you can immediately load data and can query it to do exploratory activities.
In this post, I show how to build a schema-on-read analytical pipeline, similar to the one used with relational databases, using Amazon Athena. The approach is completely serverless, which allows the analytical platform to scale as more data is stored and processed via the pipeline.
The data integration challenge
Part of the reason why it’s so difficult to build analytical platforms is the challenge associated with data integration. This is usually done as a series of Extract, Transform, Load (ETL) jobs that pull data from multiple varied sources and integrate them into a central repository. The multiple steps of ETL can be viewed as a pipeline where raw data is fed in one end and integrated data accumulates at the other.
Two major hurdles complicate the development of ETL pipelines:
The sheer volume of data needing to be integrated makes it very difficult to scale these jobs.
There is an immense variety of data formats from which information needs to be gathered.
Put this in context by looking at the healthcare industry. Building an analytical pipeline for healthcare usually involves working with multiple datasets that cover care quality, provider operations, patient clinical records, and patient claims data. Customers are now looking for even more data sources that may contain valuable information that can be analyzed. Examples include social media data, data from devices and sensors, and biometric and human-generated data such as notes from doctors. These new data sources do not rely on fixed schemas, so integrating them by conventional ETL jobs is much more difficult.
The volume of data available for analysis is growing as well. According to an article published by NCBI, the US healthcare system reached a collective data volume of 150 exabytes in 2011. With the current rate of growth, it is expected to quickly reach zettabyte scale, and soon grow into the yottabytes.
Why schema-on-read?
The traditional data warehouses of the early 2000s, like the one shown in the following diagram, were based on a standard four-layer approach of ingest, stage, store, and report. This involved building and maintaining huge data models that suited certain sources and analytical queries. This type of design is based on a schema-on-write approach, where you write data into a predefined schema and read data by querying it.
As we progressed to more varied datasets and analytical requirements, the predefined schemas were not able to keep up. Moreover, by the time the models were built and put into production, the requirements had changed.
Schema-on-read provides much needed flexibility to an analytical project. Not having to rely on physical schemas improves the overall performance when it comes to high volume data loads. Because there are no physical constraints for the data, it works well with datasets with undefined data structures. It gives customers the power to experiment and try out different options.
How can Athena help?
Athena is a serverless analytical query engine that allows you to start querying data stored in Amazon S3 instantly. It supports standard formats like CSV and Parquet and integrates with Amazon QuickSight, which allows you to build interactive business intelligence reports.
The Amazon Athena User Guide provides multiple best practices that should be considered when using Athena for analytical pipelines at production scale. There are various performance tuning techniques that you can apply to speed up query performance. For more information, see Top 10 Performance Tuning Tips for Amazon Athena.
Solution architecture
To demonstrate this solution, I use the healthcare dataset from the Centers for Disease Control and Prevention (CDC) Behavioral Risk Factor Surveillance System (BRFSS). It gathers data via telephone surveys for health-related risk behaviors and conditions, and the use of preventive services. The dataset is available as zip files from the CDC FTP portal for general download and analysis. There is also a user guide with comprehensive details about the program and the process of collecting data.
The following diagram shows the schema-on-read pipeline that demonstrates this solution.
In this architecture, S3 is the central data repository for the CSV files, which are divided by behavioral risk factors like smoking, drinking, obesity, and high blood pressure.
Athena is used to collectively query the CSV files in S3. It uses the AWS Glue Data Catalog to maintain the schema details and applies it at the time of querying the data.
The dataset is further filtered and transformed into a subset that is specifically used for reporting with Amazon QuickSight.
As new data files become available, they are incrementally added to the S3 bucket and the subset query automatically appends them to the end of the table. The dashboards in Amazon QuickSight are refreshed with the new values of calculated metrics.
Walkthrough
To use Athena in an analytical pipeline, you have to consider how to design it for initial data ingestion and the subsequent incremental ingestions. Moreover, you also have to decide on the mechanism to trigger a particular stage in the pipeline, which can either be scheduled or event-based.
For the purposes of this post, you build a simple pipeline where the data is:
Ingested into a staging area in S3
Filtered and transformed for reporting into a different S3 location
Transformed into a reporting table in Athena
Included into a dashboard created using Amazon QuickSight
Incrementally ingested for updates
This approach is highly customizable and can be used for building more complex pipelines with many more steps.
Data staging in S3
The data ingestion into S3 is fairly straightforward. The files can be ingested via the AWS Command Line Interface (CLI), the API, or the AWS Management Console. The data files are in CSV format and are already divided by behavioral condition. To improve performance, I recommend that you use partitioned data with Athena, especially when dealing with large volumes. You can use pre-partitioned data in S3 or build the partitions later in the process. The example you are working with has a total of 247 CSV files storing about 205 MB of data, but a typical production-scale deployment would be much larger.
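For example, a bulk upload with the AWS CLI might look like the following sketch; the local directory name is a placeholder, and the bucket and prefix placeholders match the ones used in the CREATE TABLE statement later in this post:
# Upload all CSV files from a local folder to the S3 staging prefix
aws s3 cp ./brfss_csv/ s3://<YourBucket>/<YourPrefix>/ \
    --recursive --exclude "*" --include "*.csv"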
To automate the pipeline, you can make it either event-based or schedule-based. If you take the event-based approach, you can use S3 event notifications to invoke an AWS Lambda function when files are uploaded; the function then runs the next step in the pipeline. Traditional ETL jobs have to rely on mechanisms like database triggers to enable this, which can add performance overhead.
If you choose a schedule-based approach, you can use Lambda with scheduled events. The schedule is managed via a cron expression, and the Lambda function runs the next step of the pipeline. This is suitable for workloads similar to a scheduled batch ETL job.
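As a rough sketch (the rule name, function name, Region, and account ID below are placeholders, not values from this post), such a schedule can be wired up with the AWS CLI:
# Create a rule that fires once a day
aws events put-rule --name athena-pipeline-daily --schedule-expression "rate(1 day)"

# Allow CloudWatch Events to invoke the pipeline function
aws lambda add-permission --function-name RunPipelineStep \
    --statement-id athena-pipeline-daily --action lambda:InvokeFunction \
    --principal events.amazonaws.com \
    --source-arn arn:aws:events:<region>:<account-id>:rule/athena-pipeline-daily

# Point the rule at the function
aws events put-targets --rule athena-pipeline-daily \
    --targets "Id"="1","Arn"="arn:aws:lambda:<region>:<account-id>:function:RunPipelineStep"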
Filter and data transformation
To filter and transform the dataset, first look at the overall counts and structure of the data. This allows you to choose the columns that are important for a given report and use the filter clause to extract a subset of the data. Transforming the data as you progress through the pipeline ensures that you are only exposing relevant data to the reporting layer, to optimize performance.
To look at the entire dataset, create a table in Athena to go across the entire data volume. This can be done using the following query:
CREATE EXTERNAL TABLE IF NOT EXISTS brfsdata(
ID STRING,
HIW STRING,
SUSA_NAME STRING,
MATCH_NAME STRING,
CHSI_NAME STRING,
NSUM STRING,
MEAN STRING,
FLAG STRING,
IND STRING,
UP_CI STRING,
LOW_CI STRING,
SEMEAN STRING,
AGE_ADJ STRING,
DATASRC STRING,
FIPS STRING,
FIPNAME STRING,
HRR STRING,
DATA_YR STRING,
UNIT STRING,
AGEGRP STRING,
GENDER STRING,
RACE STRING,
EHN STRING,
EDU STRING,
FAMINC STRING,
DISAB STRING,
METRO STRING,
SEXUAL STRING,
FAMSTRC STRING,
MARITAL STRING,
POP_SPC STRING,
POP_POLICY STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
ESCAPED BY '\\'
LINES TERMINATED BY '\n'
LOCATION "s3://<YourBucket/YourPrefix>"
Replace YourBucket and YourPrefix with your corresponding values.
In this case, there are a total of ~1.4 million records, which you get by running a simple COUNT(*) query on the table.
From here, you can run multiple analysis queries on the dataset. For example, you can find the number of records that fall into a certain behavioral risk, or the state that has the highest number of diabetic patients recorded. These metrics provide data points that help to determine the attributes that would be needed from a reporting perspective.
As you can see, this is much simpler than a schema-on-write approach, where analyzing the source dataset is much more difficult. This solution allows you to design the reporting platform in accordance with the questions you are looking to answer from your data. Source data analysis is the first step in designing a good analytical platform, and this approach allows customers to do it much earlier in the project lifecycle.
Reporting table
After you have completed the source data analysis, the next step is to filter out the required data and transform it to create a reporting database. This is analogous to the data mart in a standard analytical pipeline. Based on the analysis carried out in the previous step, you might notice some mismatches with the data headers. You might also identify the filter clauses to apply to the dataset to get to your reporting data.
Athena automatically saves query results in S3 for every run. By default, the results are saved to a bucket created in the following format, where the account ID and Region are your own: aws-athena-query-results-<account-id>-<region>.
Athena creates a prefix for each saved query and stores the result set as CSV files organized by dates. You can use this feature to filter out result datasets and store them in an S3 bucket for reporting.
To enable this, create queries that can filter out and transform the subset of data on which to report. For this use case, create three separate queries to filter out unwanted data and fix the column headers:
Query 1:
SELECT ID, up_ci AS source, semean AS state, datasrc AS year, fips AS unit, fipname AS age, mean, current_date AS dt, current_time AS tm
FROM brfsdata
WHERE ID != '' AND hrr IS NULL AND semean NOT LIKE '%29193%'
Query 2:
SELECT ID, up_ci AS source, semean AS state, datasrc AS year, fips AS unit, fipname AS age, mean, current_date AS dt, current_time AS tm
FROM brfsdata
WHERE ID != '' AND hrr IS NOT NULL AND up_ci LIKE '%BRFSS%' AND semean NOT LIKE '"%' AND semean NOT LIKE '%29193%'
Query 3:
SELECT ID, low_ci AS source, age_adj AS state, fips AS year, fipname AS unit, hrr AS age, mean, current_date AS dt, current_time AS tm
FROM brfsdata
WHERE ID != '' AND hrr IS NOT NULL AND up_ci NOT LIKE '%BRFSS%' AND age_adj NOT LIKE '"%' AND semean NOT LIKE '%29193%' AND low_ci LIKE 'BRFSS'
You can save these queries in Athena so that you can get to the query results easily every time they are executed. The following screenshot is an example of the results when query 1 is executed four times.
The next step is to copy these results over to a new bucket for creating your reporting table. This can be done by running an aws s3 cp command from the CLI, or by using the API, as shown below:
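Here is a minimal sketch, assuming the default Athena results bucket and the healthdataset reporting bucket used later in this post; the date prefix is a placeholder:
# Copy only the CSV result files for the saved query "Query1" to the reporting prefix
aws s3 cp s3://aws-athena-query-results-<account-id>-<region>/Query1/2017/08/21/ \
    s3://healthdataset/brfsdata/Reporting_Data/ \
    --recursive --exclude "*" --include "*.csv"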
Note the prefix structure in which Athena stores query results. It creates a separate prefix for each day on which the query is executed, and stores the corresponding CSV and metadata file for each run. Copy the result set over to a new prefix, Reporting_Data. Use the --exclude and --include options of aws s3 cp to copy only the CSV files, and use --recursive to copy all the files from the run.
You can replace the value of the saved query name from “Query1” to “Query2” or “Query3” to copy all data resulting from those queries to the same target prefix. For pipelines that require more complicated transformations, divide the query transformation into multiple steps and execute them based on events or schedule them, as described in the earlier data staging step.
Amazon QuickSight dashboard
After the filtered results sets are copied as CSV files into the new Reporting_Data prefix, create a new table in Athena that is used specifically for BI reporting. This can be done using a create table statement similar to the one below:
CREATE EXTERNAL TABLE IF NOT EXISTS BRFSS_REPORTING(
ID varchar(100),
source varchar(100),
state varchar(100),
year int,
unit varchar(10),
age varchar(10),
mean float
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
ESCAPED BY '\\'
LINES TERMINATED BY '\n'
LOCATION "s3://healthdataset/brfsdata/Reporting_Data"
This table can now act as a source for a dashboard on Amazon QuickSight, which is a straightforward process to enable. When you choose a new data source in Amazon QuickSight, Athena shows up as an option, and Amazon QuickSight automatically detects the tables in Athena that are exposed for querying. (Athena was among the data sources supported by Amazon QuickSight at the time of this post.)
After choosing Athena, give a name to the data source and choose the database. The tables available for querying automatically show up in the list.
If you choose “BRFSS_REPORTING”, you can create custom metrics using the columns in the reporting table, which can then be used in reports and dashboards.
To build a complete pipeline, think about ingesting data incrementally as new records become available. To enable this, make sure that the data can be incrementally ingested into the reporting schema and that the reporting metrics are refreshed on each run of the report. To demonstrate this, look at a scenario where a new dataset is ingested into S3 on a periodic basis and has to be included in the reporting schema when calculating the metrics.
Look at the number of records in the reporting table before the incremental ingestion.
SELECT count(*) FROM brfss_reporting;
The results are as follows:
_col0
713123
Use Query1 as an example transformation query that can isolate the incremental load. Here is a view of the query result bucket before Query1 runs.
After the incremental data is ingested into S3, trigger (or pre-schedule) an execution of Query1 in Athena, which produces the CSV result set and the metadata file shown below.
Next, trigger (or schedule) the copy command to copy the incremental records file into the reporting prefix. This is easily automated by using the predefined structure in which Athena saves the query results on S3. On checking the records count in the reporting table after copying, you get an increased count.
_col0
810042
This shows that 96,919 records were added to our reporting table that can be used in metric calculations.
This process can be implemented programmatically to add incremental records into the reporting table every time new records are ingested into the staging area. As a result, you can simulate an end-to-end analytical pipeline that runs based on events or is scheduled to run as a batch workload.
Conclusion
Using Athena, you can benefit from the advantages of a schema-on-read analytical application. You can combine other AWS services like Lambda and Amazon QuickSight to build an end-to-end analytical pipeline.
However, it’s important to note that a schema-on-read analytical pipeline may not be the answer for all use cases. Carefully consider the choices between schema-on-read and schema-on-write. The source systems you are working with play a critical role in making that decision. For example:
Some systems have defined data structures and the chance of variation is very low. These systems can work with a fixed relational target schema.
If the target queries mostly involve joining across normalized tables, they work better with a relational database. For this use case, schema-on-write is a good choice.
The query performance in a schema-on-write application is faster as the data is pre-structured. For fixed dashboards with little to no changes, a schema-on-write is a good choice.
You can also choose to go hybrid. Offload a part of the pipeline that deals with unstructured flat datasets into a schema-on-read architecture, and integrate the output into the main schema-on-write pipeline at a later stage. AWS provides multiple options to build analytical pipelines that suit various use cases. For more information, read about the AWS big data and analytics services.
If you have questions or suggestions, please comment below.
Ujjwal Ratan is a healthcare and life sciences Solutions Architect at AWS. He has worked with organizations ranging from large enterprises to smaller startups on problems related to distributed computing, analytics and machine learning. In his free time, he enjoys listening to (and playing) music and taking unplanned road trips with his family.
Simple AD, which is powered by Samba 4, supports basic Active Directory (AD) authentication features such as users, groups, and the ability to join domains. Simple AD also includes an integrated Lightweight Directory Access Protocol (LDAP) server. LDAP is a standard application protocol for the access and management of directory information. You can use the BIND operation from Simple AD to authenticate LDAP client sessions. This makes LDAP a common choice for centralized authentication and authorization for services such as Secure Shell (SSH), client-based virtual private networks (VPNs), and many other applications. Authentication, the process of confirming the identity of a principal, typically involves the transmission of highly sensitive information such as user names and passwords. To protect this information in transit over untrusted networks, companies often require encryption as part of their information security strategy.
In this blog post, we show you how to configure an LDAPS (LDAP over SSL/TLS) encrypted endpoint for Simple AD so that you can extend Simple AD over untrusted networks. Our solution uses Elastic Load Balancing (ELB) to send decrypted LDAP traffic to HAProxy running on Amazon EC2, which then sends the traffic to Simple AD. ELB offers integrated certificate management, SSL/TLS termination, and the ability to use a scalable EC2 backend to process decrypted traffic. ELB also tightly integrates with Amazon Route 53, enabling you to use a custom domain for the LDAPS endpoint. The solution needs the intermediate HAProxy layer because ELB can direct traffic only to EC2 instances. To simplify testing and deployment, we have provided an AWS CloudFormation template to provision the ELB and HAProxy layers.
This post assumes that you have an understanding of concepts such as Amazon Virtual Private Cloud (VPC) and its components, including subnets, routing, Internet and network address translation (NAT) gateways, DNS, and security groups. You should also be familiar with launching EC2 instances and logging in to them with SSH. If needed, you should familiarize yourself with these concepts and review the solution overview and prerequisites in the next section before proceeding with the deployment.
Note: This solution is intended for use by clients requiring an LDAPS endpoint only. If your requirements extend beyond this, you should consider accessing the Simple AD servers directly or by using AWS Directory Service for Microsoft AD.
Solution overview
The following diagram and description illustrate and explain the Simple AD LDAPS environment. The CloudFormation template creates the items designated by the bracket (internal ELB load balancer and two HAProxy nodes configured in an Auto Scaling group).
Here is how the solution works, as shown in the preceding numbered diagram:
The LDAP client sends an LDAPS request to ELB on TCP port 636.
ELB terminates the SSL/TLS session and decrypts the traffic using a certificate. ELB sends the decrypted LDAP traffic to the EC2 instances running HAProxy on TCP port 389.
The HAProxy servers forward the LDAP request to the Simple AD servers listening on TCP port 389 in a fixed Auto Scaling group configuration.
The Simple AD servers send an LDAP response through the HAProxy layer to ELB. ELB encrypts the response and sends it to the client.
Note: Amazon VPC prevents a third party from intercepting traffic within the VPC. Because of this, the VPC protects the decrypted traffic between ELB and HAProxy and between HAProxy and Simple AD. The ELB encryption provides an additional layer of security for client connections and protects traffic coming from hosts outside the VPC.
Prerequisites
Our approach requires an Amazon VPC with two public and two private subnets. The previous diagram illustrates the environment’s VPC requirements. If you do not yet have these components in place, follow these guidelines for setting up a sample environment:
Identify a region that supports Simple AD, ELB, and NAT gateways. The NAT gateways are used with an Internet gateway to allow the HAProxy instances to access the internet to perform their required configuration. You also need to identify the two Availability Zones in that region for use by Simple AD. You will supply these Availability Zones as parameters to the CloudFormation template later in this process.
Create or choose an Amazon VPC in the region you chose. In order to use Route 53 to resolve the LDAPS endpoint, make sure you enable DNS support within your VPC. Create an Internet gateway and attach it to the VPC, which will be used by the NAT gateways to access the internet.
Create a route table with a default route to the Internet gateway. Create two NAT gateways, one per Availability Zone in your public subnets to provide additional resiliency across the Availability Zones. Together, the routing table, the NAT gateways, and the Internet gateway enable the HAProxy instances to access the internet.
Create two private routing tables, one per Availability Zone. Create two private subnets, one per Availability Zone. The dual routing tables and subnets allow for a higher level of redundancy. Add each subnet to the routing table in the same Availability Zone. Add a default route in each routing table to the NAT gateway in the same Availability Zone. The Simple AD servers use subnets that you create.
The LDAP service requires a DNS domain that resolves within your VPC and from your LDAP clients. If you do not have an existing DNS domain, follow the steps to create a private hosted zone and associate it with your VPC. To avoid encryption protocol errors, you must ensure that the DNS domain name is consistent across your Route 53 zone and in the SSL/TLS certificate (see Step 2 in the “Solution deployment” section).
We will use a self-signed certificate for ELB to perform SSL/TLS decryption. You can use a certificate issued by your preferred certificate authority or a certificate issued by AWS Certificate Manager (ACM). Note: To prevent unauthorized connections directly to your Simple AD servers, you can modify the Simple AD security group on port 389 to block traffic from locations outside of the Simple AD VPC. You can find the security group in the EC2 console by creating a search filter for your Simple AD directory ID. It is also important to allow the Simple AD servers to communicate with each other as shown on Simple AD Prerequisites.
Solution deployment
This solution includes five main parts:
Create a Simple AD directory.
Create a certificate.
Create the ELB and HAProxy layers by using the supplied CloudFormation template.
Create a Route 53 record.
Test LDAPS access using an Amazon Linux client.
1. Create a Simple AD directory
With the prerequisites completed, you will create a Simple AD directory in your private VPC subnets:
In the Directory Service console navigation pane, choose Directories and then choose Set up directory.
Choose Simple AD.
Provide the following information:
Directory DNS – The fully qualified domain name (FQDN) of the directory, such as corp.example.com. You will use the FQDN as part of the testing procedure.
NetBIOS name – The short name for the directory, such as CORP.
Administrator password – The password for the directory administrator. The directory creation process creates an administrator account with the user name Administrator and this password. Do not lose this password because it is nonrecoverable. You also need this password for testing LDAPS access in a later step.
Description – An optional description for the directory.
Directory Size – The size of the directory.
Provide the following information in the VPC Details section, and then choose Next Step:
VPC – Specify the VPC in which to install the directory.
Subnets – Choose two private subnets for the directory servers. The two subnets must be in different Availability Zones. Make a note of the VPC and subnet IDs for use as CloudFormation input parameters. In the following example, the Availability Zones are us-east-1a and us-east-1c.
Review the directory information and make any necessary changes. When the information is correct, choose Create Simple AD.
It takes several minutes to create the directory. From the AWS Directory Service console, refresh the screen periodically and wait until the directory Status value changes to Active before continuing. Choose your Simple AD directory and note the two IP addresses in the DNS address section. You will enter them when you run the CloudFormation template later.
2. Create a certificate
In the previous step, you created the Simple AD directory. Next, you will generate a self-signed SSL/TLS certificate using OpenSSL. You will use the certificate with ELB to secure the LDAPS endpoint. OpenSSL is a standard, open source library that supports a wide range of cryptographic functions, including the creation and signing of x509 certificates. You then import the certificate into ACM, which is integrated with ELB.
You must have a system with OpenSSL installed to complete this step. If you do not have OpenSSL, you can install it on Amazon Linux by running the command, sudo yum install openssl. If you do not have access to an Amazon Linux instance you can create one with SSH access enabled to proceed with this step. Run the command, openssl version, at the command line to see if you already have OpenSSL installed.
$ openssl version
OpenSSL 1.0.1k-fips 8 Jan 2015
Create a private key using the openssl genrsa command.
$ openssl genrsa 2048 > privatekey.pem
Generating RSA private key, 2048 bit long modulus
......................................................................................................................................................................+++
..........................+++
e is 65537 (0x10001)
Generate a certificate signing request (CSR) using the openssl req command. Provide the requested information for each field. The Common Name is the FQDN for your LDAPS endpoint (for example, ldap.corp.example.com). The Common Name must use the domain name you will later register in Route 53. You will encounter certificate errors if the names do not match.
$ openssl req -new -key privatekey.pem -out server.csr
You are about to be asked to enter information that will be incorporated into your certificate request.
Use the openssl x509 command to sign the certificate. The following example uses the private key from the previous step (privatekey.pem) and the signing request (server.csr) to create a public certificate named server.crt that is valid for 365 days. This certificate must be updated within 365 days to avoid disruption of LDAPS functionality.
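A typical invocation matching those file names (shown here as a sketch) is:
$ openssl x509 -req -days 365 -in server.csr -signkey privatekey.pem -out server.crt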
Keep the private key and public certificate for later use. You can discard the signing request because you are using a self-signed certificate and not using a Certificate Authority. Always store the private key in a secure location and avoid adding it to your source code.
To import the certificate into ACM, open the ACM console and choose Import a certificate. Using your favorite Linux text editor, paste the contents of your server.crt file in the Certificate body box.
Using your favorite Linux text editor, paste the contents of your privatekey.pem file in the Certificate private key box. For a self-signed certificate, you can leave the Certificate chain box blank.
Choose Review and import. Confirm the information and choose Import.
3. Create the ELB and HAProxy layers by using the supplied CloudFormation template
Now that you have created your Simple AD directory and SSL/TLS certificate, you are ready to use the CloudFormation template to create the ELB and HAProxy layers.
Load the supplied CloudFormation template to deploy an internal ELB and two HAProxy EC2 instances into a fixed Auto Scaling group. After you load the template, provide the following input parameters. Note: You can find the parameters relating to your Simple AD from the directory details page by choosing your Simple AD in the Directory Service console.
HAProxyInstanceSize – The EC2 instance size for the HAProxy servers. The default size is t2.micro and can scale up for large Simple AD environments.
MyKeyPair – The SSH key pair for the EC2 instances. If you do not have an existing key pair, you must create one.
VPCId – The target VPC for this solution. This must be the VPC where you deployed Simple AD; it is available on your Simple AD directory details page.
SubnetId1 – The Simple AD primary subnet. This information is available on your Simple AD directory details page.
SubnetId2 – The Simple AD secondary subnet. This information is available on your Simple AD directory details page.
MyTrustedNetwork – The trusted network Classless Inter-Domain Routing (CIDR) range allowed to connect to the LDAPS endpoint. For example, use the VPC CIDR to allow clients in the VPC to connect.
SimpleADPriIP – The primary Simple AD server IP address. This information is available on your Simple AD directory details page.
SimpleADSecIP – The secondary Simple AD server IP address. This information is available on your Simple AD directory details page.
LDAPSCertificateARN – The Amazon Resource Name (ARN) for the SSL/TLS certificate. This information is available in the ACM console.
Enter the input parameters and choose Next.
On the Options page, accept the defaults and choose Next.
On the Review page, confirm the details and choose Create. The stack will be created in approximately 5 minutes.
4. Create a Route 53 record
The next step is to create a Route 53 record in your private hosted zone so that clients can resolve your LDAPS endpoint.
If you do not have an existing DNS domain for use with LDAP, create a private hosted zone and associate it with your VPC. The hosted zone name should be consistent with your Simple AD (for example, corp.example.com).
When the CloudFormation stack is in CREATE_COMPLETE status, locate the value of the LDAPSURL on the Outputs tab of the stack. Copy this value for use in the next step.
On the Route 53 console, choose Hosted Zones and then choose the zone you used for the Common Name box for your self-signed certificate. Choose Create Record Set and enter the following information:
Name – The label of the record (such as ldap).
Type – Leave as A – IPv4 address.
Alias – Choose Yes.
Alias Target – Paste the value of the LDAPSURL on the Outputs tab of the stack.
Leave the defaults for Routing Policy and Evaluate Target Health, and choose Create.
5. Test LDAPS access using an Amazon Linux client
At this point, you have configured your LDAPS endpoint and now you can test it from an Amazon Linux client.
Create an Amazon Linux instance with SSH access enabled to test the solution. Launch the instance into one of the public subnets in your VPC. Make sure the IP assigned to the instance is in the trusted IP range you specified in the CloudFormation parameter MyTrustedNetwork in Step 3.b.
SSH into the instance and complete the following steps to verify access.
Install the openldap-clients package and any required dependencies: sudo yum install -y openldap-clients.
Add the server.crt file to the /etc/openldap/certs/ directory so that the LDAPS client will trust your SSL/TLS certificate. You can copy the file using Secure Copy (SCP) or create it using a text editor.
Edit the /etc/openldap/ldap.conf file and define the environment variables BASE, URI, and TLS_CACERT.
The value for BASE should match the configuration of the Simple AD directory name.
The value for URI should match your DNS alias.
The value for TLS_CACERT is the path to your public certificate.
Here is an example of the contents of the file.
BASE dc=corp,dc=example,dc=com
URI ldaps://ldap.corp.example.com
TLS_CACERT /etc/openldap/certs/server.crt
To test the solution, query the directory through the LDAPS endpoint, as shown in the following command. Replace corp.example.com with your domain name, and use the Administrator password that you configured when you created the Simple AD directory.
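A hedged example using the OpenLDAP client tools (the bind DN shown assumes the default Administrator account in the Users container):
$ ldapsearch -x -H ldaps://ldap.corp.example.com:636 \
    -b "dc=corp,dc=example,dc=com" \
    -D "cn=Administrator,cn=Users,dc=corp,dc=example,dc=com" -W \
    "sAMAccountName=Administrator"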
You should see a response similar to the following response, which provides the directory information in LDAP Data Interchange Format (LDIF) for the administrator distinguished name (DN) from your Simple AD LDAP server.
# extended LDIF
#
# LDAPv3
# base <dc=corp,dc=example,dc=com> (default) with scope subtree
# filter: sAMAccountName=Administrator
# requesting: ALL
#
# Administrator, Users, corp.example.com
dn: CN=Administrator,CN=Users,DC=corp,DC=example,DC=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: user
description: Built-in account for administering the computer/domain
instanceType: 4
whenCreated: 20170721123204.0Z
uSNCreated: 3223
name: Administrator
objectGUID:: l3h0HIiKO0a/ShL4yVK/vw==
userAccountControl: 512
…
You can now use the LDAPS endpoint for directory operations and authentication within your environment. If you would like to learn more about how to interact with your LDAPS endpoint within a Linux environment, here are a few resources to get started:
If you encounter connection issues, verify that the parameters in ldap.conf match your configured LDAPS URI endpoint and that all parameters can be resolved by DNS. You can use the following dig command, substituting your configured endpoint DNS name.
$ dig ldap.corp.example.com
Confirm that the client instance from which you are connecting is in the CIDR range of the CloudFormation parameter, MyTrustedNetwork.
Confirm that the path to your public SSL/TLS certificate configured in ldap.conf as TLS_CACERT is correct. You configured this in Step 5.b.3. You can also check your SSL/TLS connection with the following command, substituting your configured endpoint DNS name for the string after -connect.
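For example (a sketch; substitute your own endpoint name):
$ openssl s_client -connect ldap.corp.example.com:636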
Verify that your HAProxy instances have the status InService in the EC2 console: Choose Load Balancers under Load Balancing in the navigation pane, highlight your LDAPS load balancer, and then choose the Instances tab.
Conclusion
You can use ELB and HAProxy to provide an LDAPS endpoint for Simple AD and transport sensitive authentication information over untrusted networks. You can explore using LDAPS to authenticate SSH users or integrate with other software solutions that support LDAP authentication. This solution’s CloudFormation template is available on GitHub.
If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing this solution, start a new thread on the Directory Service forum.
Encryption at Rest
Today we are adding support for encryption of data at rest. When you create a new file system, you can select a key that will be used to encrypt the contents of the files that you store on the file system. The key can be a built-in key that is managed by AWS or a key that you created yourself using AWS Key Management Service (KMS). File metadata (file names, directory names, and directory contents) will be encrypted using a key managed by AWS. Both forms of encryption are implemented using an industry-standard AES-256 algorithm.
You can set this up in seconds when you create a new file system. You simply choose the built-in key (aws/elasticfilesystem) or one of your own:
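If you prefer the command line, here is a minimal sketch using the AWS CLI; the creation tokens are arbitrary placeholders:
# Create an encrypted file system; omitting --kms-key-id uses the built-in aws/elasticfilesystem key
aws efs create-file-system --creation-token my-encrypted-fs --encrypted

# Or supply your own CMK
aws efs create-file-system --creation-token my-encrypted-fs-2 --encrypted --kms-key-id <your-kms-key-id>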
EFS will take care of the rest! You can select the file system in the console to verify that it is encrypted as desired:
A cryptographic algorithm that meets the approval of FIPS 140-2 is used to encrypt data and metadata. The encryption is transparent and has a minimal effect on overall performance.
You can use AWS Identity and Access Management (IAM) to control access to the Customer Master Key (CMK). The CMK must be enabled in order to grant access to the file system; disabling the key prevents it from being used to create new file systems and blocks access (after a period of time) to existing file systems that it protects. To learn more about your options, read Managing Access to Encrypted File Systems.
Available Now
Encryption of data at rest is available now in all regions where EFS is supported, at no additional charge.
Our customers run an incredible variety of mission-critical workloads on AWS, many of which process and store sensitive data. As detailed in our Overview of Security Processes document, AWS customers have access to an ever-growing set of options for encrypting and protecting this data. For example, Amazon Relational Database Service (RDS) supports encryption of data at rest and in transit, with options tailored for each supported database engine (MySQL, SQL Server, Oracle, MariaDB, PostgreSQL, and Aurora).
Many customers use AWS Key Management Service (KMS) to centralize their key management, with others taking advantage of the hardware-based key management, encryption, and decryption provided by AWS CloudHSM to meet stringent security and compliance requirements for their most sensitive data and regulated workloads (you can read my post, AWS CloudHSM – Secure Key Storage and Cryptographic Operations, to learn more about Hardware Security Modules, also known as HSMs).
Major CloudHSM Update
Today, building on what we have learned from our first-generation product, we are making a major update to CloudHSM, with a set of improvements designed to make the benefits of hardware-based key management available to a much wider audience while reducing the need for specialized operating expertise. Here’s a summary of the improvements:
Pay As You Go – CloudHSM is now offered under a pay-as-you-go model that is simpler and more cost-effective, with no up-front fees.
Fully Managed – CloudHSM is now a scalable managed service; provisioning, patching, high availability, and backups are all built-in and taken care of for you. Scheduled backups extract an encrypted image of your HSM from the hardware (using keys that only the HSM hardware itself knows) that can be restored only to identical HSM hardware owned by AWS. For durability, those backups are stored in Amazon Simple Storage Service (S3), and for an additional layer of security, encrypted again with server-side S3 encryption using an AWS KMS master key.
Open & Compatible – CloudHSM is open and standards-compliant, with support for multiple APIs, programming languages, and cryptography extensions such as PKCS #11, Java Cryptography Extension (JCE), and Microsoft CryptoNG (CNG). The open nature of CloudHSM gives you more control and simplifies the process of moving keys (in encrypted form) from one CloudHSM to another, and also allows migration to and from other commercially available HSMs.
More Secure – CloudHSM Classic (the original model) supports the generation and use of keys that comply with FIPS 140-2 Level 2. We’re stepping that up a notch today with support for FIPS 140-2 Level 3, with security mechanisms that are designed to detect and respond to physical attempts to access or modify the HSM. Your keys are protected with exclusive, single-tenant access to tamper-resistant HSMs that appear within your Virtual Private Clouds (VPCs). CloudHSM supports quorum authentication for critical administrative and key management functions. This feature allows you to define a list of N possible identities that can access the functions, and then require at least M of them to authorize the action. It also supports multi-factor authentication using tokens that you provide.
AWS-Native – The updated CloudHSM is an integral part of AWS and plays well with other tools and services. You can create and manage a cluster of HSMs using the AWS Management Console, AWS Command Line Interface (CLI), or API calls.
Diving In
You can create CloudHSM clusters that contain 1 to 32 HSMs, each in a separate Availability Zone in a particular AWS Region. Spreading HSMs across AZs gives you high availability (including built-in load balancing); adding more HSMs gives you additional throughput. The HSMs within a cluster are kept in sync: performing a task or operation on one HSM in a cluster automatically updates the others. Each HSM in a cluster has its own Elastic Network Interface (ENI).
All interaction with an HSM takes place via the AWS CloudHSM client. It runs on an EC2 instance and uses certificate-based mutual authentication to create secure (TLS) connections to the HSMs.
At the hardware level, each HSM includes hardware-enforced isolation of crypto operations and key storage. Each customer HSM runs on dedicated processor cores.
Setting Up a Cluster Let’s set up a cluster using the CloudHSM Console:
I click on Create cluster to get started, select my desired VPC and the subnets within it (I can also create a new VPC and/or subnets if needed):
Then I review my settings and click on Create:
After a few minutes, my cluster exists, but is uninitialized:
Initialization simply means retrieving a certificate signing request (the Cluster CSR):
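As a sketch, you can retrieve the CSR with the AWS CLI; the query path below is my assumption about the describe-clusters output shape:
# Save the cluster CSR to a local file (cluster-id is a placeholder)
aws cloudhsmv2 describe-clusters --filters clusterIds=<cluster-id> \
    --output text --query 'Clusters[].Certificates.ClusterCsr' > <cluster-id>_ClusterCsr.csr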
And then creating a private key and using it to sign the request (these commands were copied from the Initialize Cluster docs and I have omitted the output. Note that ID identifies the cluster):
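A sketch of what those OpenSSL commands typically look like (the file names are illustrative placeholders):
# Create a private key for a self-signed issuing certificate
openssl genrsa -aes256 -out customerCA.key 2048

# Create the self-signed certificate that acts as the trust anchor
openssl req -new -x509 -days 3652 -key customerCA.key -out customerCA.crt

# Sign the cluster CSR with that key
openssl x509 -req -days 3652 -in <cluster-id>_ClusterCsr.csr \
    -CA customerCA.crt -CAkey customerCA.key -CAcreateserial \
    -out <cluster-id>_CustomerHsmCertificate.crt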
The next step is to apply the signed certificate to the cluster using the console or the CLI. After this has been done, the cluster can be activated by changing the password for the HSM’s administrative user, otherwise known as the Crypto Officer (CO).
Once the cluster has been created, initialized and activated, it can be used to protect data. Applications can use the APIs in AWS CloudHSM SDKs to manage keys, encrypt & decrypt objects, and more. The SDKs provide access to the CloudHSM client (running on the same instance as the application). The client, in turn, connects to the cluster across an encrypted connection.
Available Today
The new HSM is available today in the US East (Northern Virginia), US West (Oregon), US East (Ohio), and EU (Ireland) Regions, with more in the works. Pricing starts at $1.45 per HSM per hour.
As I explained in my previous Security Blog post, a hardware security module (HSM) is a hardware device designed with the security of your data and cryptographic key material in mind. It is tamper-resistant hardware that prevents unauthorized users from attempting to pry open the device, plug in any extra devices to access data or keys such as subtokens, or damage the outside housing. The HSM device AWS CloudHSM offers is the Luna SA 7000 (also called Safenet Network HSM 7000), which is created by Gemalto. Depending on the firmware version you install, many of the security properties of these HSMs will have been validated under Federal Information Processing Standard (FIPS) 140-2, a standard issued by the National Institute of Standards and Technology (NIST) for cryptography modules. These standards are in place to protect the integrity and confidentiality of the data stored on cryptographic modules.
To help ensure its continued use, functionality, and support from AWS, we suggest that you update your AWS CloudHSM device software and firmware as well as the client instance software to current versions offered by AWS. As of the publication of this blog post, the current non-FIPS-validated versions are 5.4.9/client, 5.3.13/software, and 6.20.2/firmware, and the current FIPS-validated versions are 5.4.9/client, 5.3.13/software, and 6.10.9/firmware. (The firmware version determines FIPS validation.) It is important to know your current versions before updating so that you can follow the correct update path.
In this post, I demonstrate how to update your current CloudHSM devices and client instances so that you are using the most current versions of software and firmware. If you contact AWS Support for CloudHSM hardware and application issues, you will be required to update to these supported versions before proceeding. Also, any newly provisioned CloudHSM devices will use these supported software and firmware versions only, and AWS does not offer “downgrade” options.
Note: Before you perform any updates, check with your local CloudHSM administrator and application developer to verify that these updates will not conflict with your current applications or architecture.
Overview of the update process
To update your client and CloudHSM devices, you must use both update paths offered by AWS. The first path involves updating the software on your client instance, also known as a control instance. Following the second path updates the software first and then the firmware on your CloudHSM devices. The CloudHSM software must be updated before the firmware because of the firmware’s dependencies on the software in order to work appropriately.
As I demonstrate in this post, the correct update order is:
Updating your client instance
Updating your CloudHSM software
Updating your CloudHSM firmware
To update your client instance, you must have the private SSH key you created when you first set up your client environment. This key is used to connect via SSH protocol on port 22 of your client instance. If you have more than one client instance, you must repeat this connection and update process on each of them. The following diagram shows the flow of an SSH connection from your local network to your client instances in the AWS Cloud.
After you update your client instance to the most recent software (5.3.13), you then must update the CloudHSM device software and firmware. First, you must initiate an SSH connection from any one client instance to each CloudHSM device, as illustrated in the following diagram. A successful SSH connection will have you land at the Luna shell, denoted by lunash:>. Second, you must be able to initiate a Secure Copy (SCP) of files to each device from the client instance. Because the software and firmware updates require an elevated level of privilege, you must have the Security Officer (SO) password that you created when you initialized your CloudHSM devices.
After you have completed all updates, you can receive enhanced troubleshooting assistance from AWS, if you need it. When new versions of software and firmware are released, AWS performs extensive testing to ensure your smooth transition from version to version.
Detailed guidance for updating your client instance, CloudHSM software, and CloudHSM firmware
1. Updating your client instance
Let’s start by updating your client instances. My client instance and CloudHSM devices are in the eu-west-1 region, but these steps work the same in any AWS region. Because Gemalto offers client instances in both Linux and Windows, I will cover steps to update both. I will start with Linux. Please note that all commands should be run as the “root” user.
Updating the Linux client
SSH from your local network into the client instance. You can do this from Linux or Windows. Typically, you would do this from the directory where you have stored your private SSH key, by using a command like the following in a terminal or in PuTTY. This initiates the SSH connection by pointing to the path of your SSH key and specifying the user name and IP address of your client instance.
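A sketch (the key path and address are placeholders, and the ec2-user name applies to Amazon Linux AMIs):
$ ssh -i /path/to/your-private-key.pem ec2-user@<client-instance-ip>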
After the SSH connection is established, you must stop all applications and services on the instance that are using the CloudHSM device. This is required because you are removing old software and installing new software in its place. After you have stopped all applications and services, you can move on to remove the existing version of the client software.
/usr/safenet/lunaclient/bin/uninstall.sh
This command will remove the old client software, but will not remove your configuration file or certificates. These are saved in the Chrystoki.conf file in your /etc directory and in your /usr/safenet/lunaclient/cert directory. Do not delete these files, because you will lose the configuration of your CloudHSM devices and client connections.
Download the new client software package: cloudhsm-safenet-client. Extract the archive on the client instance.
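On a Linux instance, extraction is typically done with tar; the archive file name below is an assumption based on the package name:
# Extract the downloaded client package (file name is an assumption)
tar xvf cloudhsm-safenet-client.tar.gz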
SafeNet-Luna-client-5-4-9/linux/64/install.sh
Make sure you choose the Luna SA option when presented with it. Because the directory where your certificates are installed is the same, you do not need to copy these certificates to another directory. You do, however, need to ensure that the Chrystoki.conf file, located at /etc/Chrystoki.conf, has the same path and name for the certificates as when you created them. (The path or names should not have changed, but you should still verify they are same as before the update.)
Check that the PATH environment variable includes the directory /usr/safenet/lunaclient/bin to avoid issues when you restart applications and services. The update process for your Linux client instance is now complete.
Updating the Windows client
Use the following steps to update your Windows client instances:
Establish a Remote Desktop Protocol (RDP) connection to your Windows client instance. After you establish the RDP connection, stop all applications and services on the instance that are using the CloudHSM device. This is required because you will remove old software and install new software in its place, or overwrite it. If your client software version is older than 5.4.1, you need to completely remove it and all patches by using Programs and Features in the Windows Control Panel. If your client software version is 5.4.1 or newer, proceed without removing the software. Your configuration file will remain intact in the crystoki.ini file of your C:\Program Files\SafeNet\Lunaclient\ directory. All certificates are preserved in the C:\Program Files\SafeNet\Lunaclient\cert\ directory. Again, do not delete these files, or you will lose all configuration and client connection data.
After you have completed these steps, download the new client software: cloudhsm-safenet-client. Extract the archive from the downloaded file, and launch the SafeNet-Luna-client-5-4-9\win\64\Lunaclient.msi installer. Choose the Luna SA option when it is presented to you. Because the directory where your certificates are installed is the same, you do not need to copy these certificates to another directory. You do, however, need to ensure that the crystoki.ini file, which is located at C:\Program Files\SafeNet\Lunaclient\crystoki.ini, has the same path and name for the certificates as when you created them. (The path and names should not have changed, but you should still verify they are the same as before the update.)
Make one last check to ensure that the PATH environment variable includes the directory C:\Program Files\SafeNet\Lunaclient\ to help avoid issues when you restart applications and services. The update process for your Windows client instance is now complete.
2. Updating your CloudHSM software
Now that your clients are up to date with the most current software version, it’s time to move on to your CloudHSM devices. A few important notes:
Back up your data to a Luna SA Backup device. AWS does not sell or support the Luna SA Backup devices, but you can purchase them from Gemalto. We do, however, offer the steps to back up your data to a Luna SA Backup device. Do not update your CloudHSM devices without backing up your data first.
If the names of your clients used for Network Trust Link Service (NTLS) connections have a capital “T” as the eighth character, the clients will not work after this update because of a Gemalto naming convention. Before upgrading, ensure that you modify your client names accordingly. The NTLS connection uses two-way digital certificate authentication and SSL data encryption to protect sensitive data transmitted between your CloudHSM device and the client instances.
The syslog configuration for the CloudHSM devices will be lost. After the update is complete, notify AWS Support and we will update the configuration for you.
Now on to updating the software versions. There are actually three different update paths to follow, and I will cover each. Depending on the current software versions on your CloudHSM devices, you might need to follow all three or just one.
Updating the software from version 5.1.x to 5.1.5
If you are running any version of the software older than 5.1.5, you must first update to version 5.1.5 before proceeding. To update to version 5.1.5:
Stop all applications and services that access the CloudHSM device.
In the following commands, <private_key_file> is the private portion of your SSH key pair and <hsm_ip_address> is the IP address of your CloudHSM elastic network interface (ENI). The ENI is the network endpoint that permits access to your CloudHSM device. The IP address was supplied to you when the CloudHSM device was provisioned.
Use the following command to connect to your CloudHSM device and log in with your Security Officer (SO) password.
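As a sketch (assuming the appliance’s standard manager account), the connection and login look like this:
$ ssh -i <private_key_file> manager@<hsm_ip_address>
lunash:> hsm login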
The value you will use for <auth_code> is contained in the lunasa_update-5.1.5-2.auth file found in the 630-010165-018_reva.tar archive you downloaded in Step 2.
Reboot the CloudHSM device by running the following command.
lunash:> sysconf appliance reboot
When all the steps in this section are completed, you will have updated your CloudHSM software to version 5.1.5. You can now move on to update to version 5.3.10.
Updating the software to version 5.3.10
You can update to version 5.3.10 only if you are currently running version 5.1.5. To update to version 5.3.10:
Stop all applications and services that access the CloudHSM device.
The value you will use for <auth_code> is contained in the lunasa_update-5.3.10-7.auth file found in the SafeNet-Luna-SA-5-3-10.zip archive you downloaded in Step 2.
Reboot the CloudHSM device by running the following command.
lunash:> sysconf appliance reboot
When all the steps in this section are completed, you will have updated your CloudHSM software to version 5.3.10. You can now move on to update to version 5.3.13.
Note: Do not configure your applog settings at this point; you must first update the software to version 5.3.13 in the following step.
Updating the software to version 5.3.13
You can update to version 5.3.13 only if you are currently running version 5.3.10. If you are not already running version 5.3.10, follow the two update paths mentioned previously in this section.
To update to version 5.3.13:
Stop all applications and services that access the CloudHSM device.
The value you will use for <auth_code> is contained in the lunasa_update-5.3.13-1.auth file found in the SafeNet-Luna-SA-5-3-13.zip archive that you downloaded in Step 2.
When updating to this software version, the option to update the firmware also is offered. If you do not require a version of the firmware validated under FIPS 140-2, accept the firmware update to version 6.20.2. If you do require a version of the firmware validated under FIPS 140-2, do not accept the firmware update and instead update by using the steps in the next section, “Updating your CloudHSM FIPS 140-2 validated firmware.”
After updating the CloudHSM device, reboot it by running the following command.
lunash:> sysconf appliance reboot
Disable NTLS IP checking on the CloudHSM device so that it can operate within its VPC. To do this, run the following command.
lunash:> ntls ipcheck disable
When all the steps in this section are completed, you will have updated your CloudHSM software to version 5.3.13. If you don’t need the FIPS 140-2 validated firmware, you will have also updated the firmware to version 6.20.2. If you do need the FIPS 140-2 validated firmware, proceed to the next section.
3. Updating your CloudHSM FIPS 140-2 validated firmware
To update the FIPS 140-2 validated version of the firmware to 6.10.9, use the following steps:
The value you will use for <auth_code> is contained in the 630-010430-010_SPKG_LunaFW_6.10.9.auth file found in the 630-010430-010_SPKG_LunaFW_6.10.9.zip archive that you downloaded in Step 1.
Run the following command to update the firmware of the CloudHSM devices.
lunash:> hsm update firmware
After you have updated the firmware, reboot the CloudHSM devices to complete the installation.
lunash:> sysconf appliance reboot
Summary
In this blog post, I walked you through how to update your existing CloudHSM devices and clients so that they are using supported client, software, and firmware versions. Per AWS Support and CloudHSM Terms and Conditions, your devices and clients must use the most current supported software and firmware for continued troubleshooting assistance. Software and firmware versions regularly change based on customer use cases and requirements. Because AWS tests and validates all updates from Gemalto, you must install all updates for firmware and software by using our package links described in this post and elsewhere in our documentation.
If you have comments about this blog post, submit them in the “Comments” section below. If you have questions about implementing this solution, please start a new thread on the CloudHSM forum.
– Tracy