Earlier this month we launched the C5 Instances with Local NVMe Storage and I told you that we would be doing the same for additional instance types in the near future!
Today we are introducing M5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for workloads that require a balance of compute and memory resources. Here are the specs:
Instance Name
vCPUs
RAM
Local Storage
EBS-Optimized Bandwidth
Network Bandwidth
m5d.large
2
8 GiB
1 x 75 GB NVMe SSD
Up to 2.120 Gbps
Up to 10 Gbps
m5d.xlarge
4
16 GiB
1 x 150 GB NVMe SSD
Up to 2.120 Gbps
Up to 10 Gbps
m5d.2xlarge
8
32 GiB
1 x 300 GB NVMe SSD
Up to 2.120 Gbps
Up to 10 Gbps
m5d.4xlarge
16
64 GiB
1 x 600 GB NVMe SSD
2.210 Gbps
Up to 10 Gbps
m5d.12xlarge
48
192 GiB
2 x 900 GB NVMe SSD
5.0 Gbps
10 Gbps
m5d.24xlarge
96
384 GiB
4 x 900 GB NVMe SSD
10.0 Gbps
25 Gbps
The M5d instances are powered by Custom Intel® Xeon® Platinum 8175M series processors running at 2.5 GHz, including support for AVX-512.
You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.
Here are a couple of things to keep in mind about the local NVMe storage on the M5d instances:
Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.
Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.
Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.
Available Now M5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent M5 instances.
As you can see from my EC2 Instance History post, we add new instance types on a regular and frequent basis. Driven by increasingly powerful processors and designed to address an ever-widening set of use cases, the size and diversity of this list reflects the equally diverse group of EC2 customers!
Near the bottom of that list you will find the new compute-intensive C5 instances. With a 25% to 50% improvement in price-performance over the C4 instances, the C5 instances are designed for applications like batch and log processing, distributed and or real-time analytics, high-performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding. Some of these applications can benefit from access to high-speed, ultra-low latency local storage. For example, video encoding, image manipulation, and other forms of media processing often necessitates large amounts of I/O to temporary storage. While the input and output files are valuable assets and are typically stored as Amazon Simple Storage Service (S3) objects, the intermediate files are expendable. Similarly, batch and log processing runs in a race-to-idle model, flushing volatile data to disk as fast as possible in order to make full use of compute resources.
New C5d Instances with Local Storage In order to meet this need, we are introducing C5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for the applications that I described above, as well as others that you will undoubtedly dream up! Here are the specs:
Instance Name
vCPUs
RAM
Local Storage
EBS Bandwidth
Network Bandwidth
c5d.large
2
4 GiB
1 x 50 GB NVMe SSD
Up to 2.25 Gbps
Up to 10 Gbps
c5d.xlarge
4
8 GiB
1 x 100 GB NVMe SSD
Up to 2.25 Gbps
Up to 10 Gbps
c5d.2xlarge
8
16 GiB
1 x 225 GB NVMe SSD
Up to 2.25 Gbps
Up to 10 Gbps
c5d.4xlarge
16
32 GiB
1 x 450 GB NVMe SSD
2.25 Gbps
Up to 10 Gbps
c5d.9xlarge
36
72 GiB
1 x 900 GB NVMe SSD
4.5 Gbps
10 Gbps
c5d.18xlarge
72
144 GiB
2 x 900 GB NVMe SSD
9 Gbps
25 Gbps
Other than the addition of local storage, the C5 and C5d share the same specs. Both are powered by 3.0 GHz Intel Xeon Platinum 8000-series processors, optimized for EC2 and with full control over C-states on the two largest sizes, giving you the ability to run two cores at up to 3.5 GHz using Intel Turbo Boost Technology.
You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.
Here are a couple of things to keep in mind about the local NVMe storage:
Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.
Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.
Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.
Available Now C5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent C5 instances.
I’ve been busy trying to replicate the “eFail” PGP/SMIME bug. I thought I’d write up some notes.
PGP and S/MIME encrypt emails, so that eavesdroppers can’t read them. The bugs potentially allow eavesdroppers to take the encrypted emails they’ve captured and resend them to you, reformatted in a way that allows them to decrypt the messages.
Disable remote/external content in email
The most important defense is to disable “external” or “remote” content from being automatically loaded. This is when HTML-formatted emails attempt to load images from remote websites. This happens legitimately when they want to display images, but not fill up the email with them. But most of the time this is illegitimate, they hide images on the webpage in order to track you with unique IDs and cookies. For example, this is the code at the end of an email from politician Bernie Sanders to his supporters. Notice the long random number assigned to track me, and the width/height of this image is set to one pixel, so you don’t even see it:
Such trackers are so pernicious they are disabled by default in most email clients. This is an example of the settings in Thunderbird:
The problem is that as you read email messages, you often get frustrated by the fact the error messages and missing content, so you keep adding exceptions:
The correct defense against this eFail bug is to make sure such remote content is disabled and that you have no exceptions, or at least, no HTTP exceptions. HTTPS exceptions (those using SSL) are okay as long as they aren’t to a website the attacker controls. Unencrypted exceptions, though, the hacker can eavesdrop on, so it doesn’t matter if they control the website the requests go to. If the attacker can eavesdrop on your emails, they can probably eavesdrop on your HTTP sessions as well.
Some have recommended disabling PGP and S/MIME completely. That’s probably overkill. As long as the attacker can’t use the “remote content” in emails, you are fine. Likewise, some have recommend disabling HTML completely. That’s not even an option in any email client I’ve used — you can disable sending HTML emails, but not receiving them. It’s sufficient to just disable grabbing remote content, not the rest of HTML email rendering.
I couldn’t replicate the direct exfiltration
There rare two related bugs. One allows direct exfiltration, which appends the decrypted PGP email onto the end of an IMG tag (like one of those tracking tags), allowing the entire message to be decrypted.
An example of this is the following email. This is a standard HTML email message consisting of multiple parts. The trick is that the IMG tag in the first part starts the URL (blog.robertgraham.com/…) but doesn’t end it. It has the starting quotes in front of the URL but no ending quotes. The ending will in the next chunk.
The next chunk isn’t HTML, though, it’s PGP. The PGP extension (in my case, Enignmail) will detect this and automatically decrypt it. In this case, it’s some previous email message I’ve received the attacker captured by eavesdropping, who then pastes the contents into this email message in order to get it decrypted.
What should happen at this point is that Thunderbird will generate a request (if “remote content” is enabled) to the blog.robertgraham.com server with the decrypted contents of the PGP email appended to it. But that’s not what happens. Instead, I get this:
I am indeed getting weird stuff in the URL (the bit after the GET /), but it’s not the PGP decrypted message. Instead what’s going on is that when Thunderbird puts together a “multipart/mixed” message, it adds it’s own HTML tags consisting of lines between each part. In the email client it looks like this:
The HTML code it adds looks like:
That’s what you see in the above URL, all this code up to the first quotes. Those quotes terminate the quotes in the URL from the first multipart section, causing the rest of the content to be ignored (as far as being sent as part of the URL).
So at least for the latest version of Thunderbird, you are accidentally safe, even if you have “remote content” enabled. Though, this is only according to my tests, there may be a work around to this that hackers could exploit.
STARTTLS
In the old days, email was sent plaintext over the wire so that it could be passively eavesdropped on. Nowadays, most providers send it via “STARTTLS”, which sorta encrypts it. Attackers can still intercept such email, but they have to do so actively, using man-in-the-middle. Such active techniques can be detected if you are careful and look for them.
Some organizations don’t care. Apparently, some nation states are just blocking all STARTTLS and forcing email to be sent unencrypted. Others do care. The NSA will passively sniff all the email they can in nations like Iraq, but they won’t actively intercept STARTTLS messages, for fear of getting caught.
The consequence is that it’s much less likely that somebody has been eavesdropping on you, passively grabbing all your PGP/SMIME emails. If you fear they have been, you should look (e.g. send emails from GMail and see if they are intercepted by sniffing the wire).
You’ll know if you are getting hacked
If somebody attacks you using eFail, you’ll know. You’ll get an email message formatted this way, with multipart/mixed components, some with corrupt HTML, some encrypted via PGP. This means that for the most part, your risk is that you’ll be attacked only once — the hacker will only be able to get one message through and decrypt it before you notice that something is amiss. Though to be fair, they can probably include all the emails they want decrypted as attachments to the single email they sent you, so the risk isn’t necessarily that you’ll only get one decrypted.
As mentioned above, a lot of attackers (e.g. the NSA) won’t attack you if its so easy to get caught. Other attackers, though, like anonymous hackers, don’t care.
Somebody ought to write a plugin to Thunderbird to detect this.
Summary
It only works if attackers have already captured your emails (though, that’s why you use PGP/SMIME in the first place, to guard against that).
It only works if you’ve enabled your email client to automatically grab external/remote content.
It seems to not be easily reproducible in all cases.
Instead of disabling PGP/SMIME, you should make sure your email client hast remote/external content disabled — that’s a huge privacy violation even without this bug.
Notes: The default email client on the Mac enables remote content by default, which is bad:
As a follow-up to our earlier work to provide longer IDs for a small set of essential EC2 resources, we are now doing the same for the remaining EC2 resources, with a migration deadline of July 2018. You can opt-in on a per-user, per-region, per-type basis and verify that your code, regular expressions, database schemas, and database queries work as expected
If you have code that recognizes, processes, or stores IDs for any type of EC2 resources, please read this post with care! Here’s what you need to know:
Migration Deadline – You have until July 2018 to make sure that your code and your schemas can process and store the new, longer IDs. After that, longer IDs will be assigned by default for all newly created resources. IDs for existing resources will remain as-is and will continue to work.
More Resource Types – Longer IDs are now supported for all types of EC2 resources, and you can opt-in as desired:
I would like to encourage you to opt-in, starting with your test accounts, as soon as possible. This will give you time to thoroughly test your code and to make any necessary changes before promoting the code to production.
More Regions – The longer IDs are now available in the AWS China (Beijing) and AWS China (Ningxia) Regions.
Test AMIs – We have published AMIs with longer IDs that you can use for testing (search for testlongids to find them in the Public images):
Kumar Venkateswar, Product Manager on the AWS ML Platforms Team, shares details on the announcement of Auto Scaling with Amazon SageMaker.
With Amazon SageMaker, thousands of customers have been able to easily build, train and deploy their machine learning (ML) models. Today, we’re making it even easier to manage production ML models, with Auto Scaling for Amazon SageMaker. Instead of having to manually manage the number of instances to match the scale that you need for your inferences, you can now have SageMaker automatically scale the number of instances based on an AWS Auto Scaling Policy.
SageMaker has made managing the ML process easier for many customers. We’ve seen customers take advantage of managed Jupyter notebooks and managed distributed training. We’ve seen customers deploying their models to SageMaker hosting for inferences, as they integrate machine learning with their applications. SageMaker makes this easy – you don’t have to think about patching the operating system (OS) or frameworks on your inference hosts, and you don’t have to configure inference hosts across Availability Zones. You just deploy your models to SageMaker, and it handles the rest.
Until now, you have needed to specify the number and type of instances per endpoint (or production variant) to provide the scale that you need for your inferences. If your inference volume changes, you can change the number and/or type of instances that back each endpoint to accommodate that change, without incurring any downtime. In addition to making it easy to change provisioning, customers have asked us how we can make managing capacity for SageMaker even easier.
With Auto Scaling for Amazon SageMaker, in the SageMaker console, the AWS Auto Scaling API, and the AWS SDK, this becomes much easier. Now, instead of having to closely monitor inference volume, and change the endpoint configuration in response, customers can configure a scaling policy to be used by AWS Auto Scaling. Auto Scaling adjusts the number of instances up or down in response to actual workloads, determined by using Amazon CloudWatch metrics and target values defined in the policy. In this way, customers can automatically adjust their inference capacity to maintain predictable performance at a low cost. You simply specify the target inference throughput per instance and provide upper and lower bounds for the number of instances for each production variant. SageMaker will then monitor throughput per instance using Amazon CloudWatch alarms, and then it will adjust provisioned capacity up or down as needed.
After you configure the endpoint with Auto Scaling, SageMaker will continue to monitor your deployed models to automatically adjust the instance count. SageMaker will keep throughput within desired levels, in response to changes in application traffic. This makes it easier to manage models in production, and it can help reduce the cost of deployed models, as you no longer have to provision sufficient capacity in order to manage your peak load. Instead, you configure the limits to accommodate your minimum expected traffic and the maximum peak, and Amazon SageMaker will work within those limits to minimize cost.
How do you get started? Open the SageMaker console. For existing endpoints, you first access the endpoint to modify the settings.
Then, scroll to the Endpoint runtime settings section, select the variant, and choose Configure auto scaling.
First, configure the minimum and maximum number of instances.
Next, choose the throughput per instance at which you want to add an additional instance, given previous load testing.
You can optionally set cool down periods for scaling in or out, to avoid oscillation during periods of wide fluctuation in workload. If not, SageMaker will assume default values.
And that’s it! You now have an endpoint that will automatically scale with increasing inferences.
You pay for the capacity used at regular SageMaker pay-as-you-go pricing, so you no longer have to pay for unused capacity during relative idle periods!
Auto Scaling in Amazon SageMaker is available today in the US East (N. Virginia & Ohio), EU (Ireland), and U.S. West (Oregon) AWS regions. To learn more, see the Amazon SageMaker Auto Scaling documentation.
Kumar Venkateswar is a Product Manager in the AWS ML Platforms team, which includes Amazon SageMaker, Amazon Machine Learning, and the AWS Deep Learning AMIs. When not working, Kumar plays the violin and Magic: The Gathering.
Most malware tries to compromise your systems by using a known vulnerability that the operating system maker has already patched. As best practices to help prevent malware from affecting your systems, you should apply all operating system patches and actively monitor your systems for missing patches.
Launch an Amazon EC2 instance for use with Systems Manager.
Configure Systems Manager to patch your Amazon EC2 Linux instances.
In two previous blog posts (Part 1 and Part 2), I showed how to use the AWS Management Console to perform the necessary steps to patch, inspect, and protect Microsoft Windows workloads. You can implement those same processes for your Linux instances running in AWS by changing the instance tags and types shown in the previous blog posts.
Because most Linux system administrators are more familiar with using a command line, I show how to patch Linux workloads by using the AWS CLI in this blog post. The steps to use the Amazon EBS Snapshot Scheduler and Amazon Inspector are identical for both Microsoft Windows and Linux.
What you should know first
To follow along with the solution in this post, you need one or more Amazon EC2 instances. You may use existing instances or create new instances. For this post, I assume this is an Amazon EC2 for Amazon Linux instance installed from Amazon Machine Images (AMIs).
Systems Manager is a collection of capabilities that helps you automate management tasks for AWS-hosted instances on Amazon EC2 and your on-premises servers. In this post, I use Systems Manager for two purposes: to run remote commands and apply operating system patches. To learn about the full capabilities of Systems Manager, see What Is AWS Systems Manager?
If you are not familiar with how to launch an Amazon EC2 instance, see Launching an Instance. I also assume you launched or will launch your instance in a private subnet. You must make sure that the Amazon EC2 instance can connect to the internet using a network address translation (NAT) instance or NAT gateway to communicate with Systems Manager. The following diagram shows how you should structure your VPC.
Later in this post, you will assign tasks to a maintenance window to patch your instances with Systems Manager. To do this, the IAM user you are using for this post must have the iam:PassRole permission. This permission allows the IAM user assigning tasks to pass his own IAM permissions to the AWS service. In this example, when you assign a task to a maintenance window, IAM passes your credentials to Systems Manager. You also should authorize your IAM user to use Amazon EC2 and Systems Manager. As mentioned before, you will be using the AWS CLI for most of the steps in this blog post. Our documentation shows you how to get started with the AWS CLI. Make sure you have the AWS CLI installed and configured with an AWS access key and secret access key that belong to an IAM user that have the following AWS managed policies attached to the IAM user you are using for this example: AmazonEC2FullAccess and AmazonSSMFullAccess.
Step 1: Launch an Amazon EC2 Linux instance
In this section, I show you how to launch an Amazon EC2 instance so that you can use Systems Manager with the instance. This step requires you to do three things:
Create an IAM role for Systems Manager before launching your Amazon EC2 instance.
Launch your Amazon EC2 instance with Amazon EBS and the IAM role for Systems Manager.
Add tags to the instances so that you can add your instances to a Systems Manager maintenance window based on tags.
A. Create an IAM role for Systems Manager
Before launching an Amazon EC2 instance, I recommend that you first create an IAM role for Systems Manager, which you will use to update the Amazon EC2 instance. AWS already provides a preconfigured policy that you can use for the new role and it is called AmazonEC2RoleforSSM.
Create a JSON file named trustpolicy-ec2ssm.json that contains the following trust policy. This policy describes which principal (an entity that can take action on an AWS resource) is allowed to assume the role we are going to create. In this example, the principal is the Amazon EC2 service.
Use the following command to create a role named EC2SSM that has the AWS managed policy AmazonEC2RoleforSSM attached to it. This generates JSON-based output that describes the role and its parameters, if the command is successful.
$ aws iam create-role – role-name EC2SSM – assume-role-policy-document file://trustpolicy-ec2ssm.json
Use the following command to attach the AWS managed IAM policy (AmazonEC2RoleforSSM) to your newly created role.
$ aws iam attach-role-policy – role-name EC2SSM – policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM
Use the following commands to create the IAM instance profile and add the role to the instance profile. The instance profile is needed to attach the role we created earlier to your Amazon EC2 instance.
$ aws iam create-instance-profile – instance-profile-name EC2SSM-IP
$ aws iam add-role-to-instance-profile – instance-profile-name EC2SSM-IP – role-name EC2SSM
B. Launch your Amazon EC2 instance
To follow along, you need an Amazon EC2 instance that is running Amazon Linux. You can use any existing instance you may have or create a new instance.
When launching a new Amazon EC2 instance, be sure that:
Use the following command to launch a new Amazon EC2 instance using an Amazon Linux AMI available in the US East (N. Virginia) Region (also known as us-east-1). Replace YourKeyPair and YourSubnetId with your information. For more information about creating a key pair, see the create-key-pair documentation. Write down the InstanceId that is in the output because you will need it later in this post.
If you are using an existing Amazon EC2 instance, you can use the following command to attach the instance profile you created earlier to your instance.
The final step of configuring your Amazon EC2 instances is to add tags. You will use these tags to configure Systems Manager in Step 2 of this post. For this example, I add a tag named Patch Group and set the value to Linux Servers. I could have other groups of Amazon EC2 instances that I treat differently by having the same tag name but a different tag value. For example, I might have a collection of other servers with the tag name Patch Group with a value of Web Servers.
Use the following command to add the Patch Group tag to your Amazon EC2 instance.
Note: You must wait a few minutes until the Amazon EC2 instance is available before you can proceed to the next section. To make sure your Amazon EC2 instance is online and ready, you can use the following AWS CLI command:
At this point, you now have at least one Amazon EC2 instance you can use to configure Systems Manager.
Step 2: Configure Systems Manager
In this section, I show you how to configure and use Systems Manager to apply operating system patches to your Amazon EC2 instances, and how to manage patch compliance.
To start, I provide some background information about Systems Manager. Then, I cover how to:
Create the Systems Manager IAM role so that Systems Manager is able to perform patch operations.
Create a Systems Manager patch baseline and associate it with your instance to define which patches Systems Manager should apply.
Define a maintenance window to make sure Systems Manager patches your instance when you tell it to.
Monitor patch compliance to verify the patch state of your instances.
You must meet two prerequisites to use Systems Manager to apply operating system patches. First, you must attach the IAM role you created in the previous section, EC2SSM, to your Amazon EC2 instance. Second, you must install the Systems Manager agent on your Amazon EC2 instance. If you have used a recent Amazon Linux AMI, Amazon has already installed the Systems Manager agent on your Amazon EC2 instance. You can confirm this by logging in to an Amazon EC2 instance and checking the Systems Manager agent log files that are located at /var/log/amazon/ssm/.
For a maintenance window to be able to run any tasks, you must create a new role for Systems Manager. This role is a different kind of role than the one you created earlier: this role will be used by Systems Manager instead of Amazon EC2. Earlier, you created the role, EC2SSM, with the policy, AmazonEC2RoleforSSM, which allowed the Systems Manager agent on your instance to communicate with Systems Manager. In this section, you need a new role with the policy, AmazonSSMMaintenanceWindowRole, so that the Systems Manager service can execute commands on your instance.
To create the new IAM role for Systems Manager:
Create a JSON file named trustpolicy-maintenancewindowrole.json that contains the following trust policy. This policy describes which principal is allowed to assume the role you are going to create. This trust policy allows not only Amazon EC2 to assume this role, but also Systems Manager.
Use the following command to create a role named MaintenanceWindowRole that has the AWS managed policy, AmazonSSMMaintenanceWindowRole, attached to it. This command generates JSON-based output that describes the role and its parameters, if the command is successful.
$ aws iam create-role – role-name MaintenanceWindowRole – assume-role-policy-document file://trustpolicy-maintenancewindowrole.json
Use the following command to attach the AWS managed IAM policy (AmazonEC2RoleforSSM) to your newly created role.
$ aws iam attach-role-policy – role-name MaintenanceWindowRole – policy-arn arn:aws:iam::aws:policy/service-role/AmazonSSMMaintenanceWindowRole
B. Create a Systems Manager patch baseline and associate it with your instance
Next, you will create a Systems Manager patch baseline and associate it with your Amazon EC2 instance. A patch baseline defines which patches Systems Manager should apply to your instance. Before you can associate the patch baseline with your instance, though, you must determine if Systems Manager recognizes your Amazon EC2 instance. Use the following command to list all instances managed by Systems Manager. The --filters option ensures you look only for your newly created Amazon EC2 instance.
If your instance is missing from the list, verify that:
Your instance is running.
You attached the Systems Manager IAM role, EC2SSM.
You deployed a NAT gateway in your public subnet to ensure your VPC reflects the diagram shown earlier in this post so that the Systems Manager agent can connect to the Systems Manager internet endpoint.
Now that you have checked that Systems Manager can manage your Amazon EC2 instance, it is time to create a patch baseline. With a patch baseline, you define which patches are approved to be installed on all Amazon EC2 instances associated with the patch baseline. The Patch Group resource tag you defined earlier will determine to which patch group an instance belongs. If you do not specifically define a patch baseline, the default AWS-managed patch baseline is used.
To create a patch baseline:
Use the following command to create a patch baseline named AmazonLinuxServers. With approval rules, you can determine the approved patches that will be included in your patch baseline. In this example, you add all Critical severity patches to the patch baseline as soon as they are released, by setting the Auto approval delay to 0 days. By setting the Auto approval delay to 2 days, you add to this patch baseline the Important, Medium, and Low severity patches two days after they are released.
$ aws ssm create-patch-baseline – name "AmazonLinuxServers" – description "Baseline containing all updates for Amazon Linux" – operating-system AMAZON_LINUX – approval-rules "PatchRules=[{PatchFilterGroup={PatchFilters=[{Values=[Critical],Key=SEVERITY}]},ApproveAfterDays=0,ComplianceLevel=CRITICAL},{PatchFilterGroup={PatchFilters=[{Values=[Important,Medium,Low],Key=SEVERITY}]},ApproveAfterDays=2,ComplianceLevel=HIGH}]"
{
"BaselineId": "YourBaselineId"
}
Use the following command to register the patch baseline you created with your instance. To do so, you use the Patch Group tag that you added to your Amazon EC2 instance.
Now that you have successfully set up a role, created a patch baseline, and registered your Amazon EC2 instance with your patch baseline, you will define a maintenance window so that you can control when your Amazon EC2 instances will receive patches. By creating multiple maintenance windows and assigning them to different patch groups, you can make sure your Amazon EC2 instances do not all reboot at the same time.
To define a maintenance window:
Use the following command to define a maintenance window. In this example command, the maintenance window will start every Saturday at 10:00 P.M. UTC. It will have a duration of 4 hours and will not start any new tasks 1 hour before the end of the maintenance window.
After defining the maintenance window, you must register the Amazon EC2 instance with the maintenance window so that Systems Manager knows which Amazon EC2 instance it should patch in this maintenance window. You can register the instance by using the same Patch Group tag you used to associate the Amazon EC2 instance with the AWS-provided patch baseline, as shown in the following command.
Assign a task to the maintenance window that will install the operating system patches on your Amazon EC2 instance. The following command includes the following options.
name is the name of your task and is optional. I named mine Patching.
task-arn is the name of the task document you want to run.
max-concurrency allows you to specify how many of your Amazon EC2 instances Systems Manager should patch at the same time. max-errors determines when Systems Manager should abort the task. For patching, this number should not be too low, because you do not want your entire patch task to stop on all instances if one instance fails. You can set this, for example, to 20%.
service-role-arn is the Amazon Resource Name (ARN) of the AmazonSSMMaintenanceWindowRole role you created earlier in this blog post.
task-invocation-parameters defines the parameters that are specific to the AWS-RunPatchBaseline task document and tells Systems Manager that you want to install patches with a timeout of 600 seconds (10 minutes).
Now, you must wait for the maintenance window to run at least once according to the schedule you defined earlier. If your maintenance window has expired, you can check the status of any maintenance tasks Systems Manager has performed by using the following command.
You also can see the overall patch compliance of all Amazon EC2 instances using the following command in the AWS CLI.
$ aws ssm list-compliance-summaries
This command shows you the number of instances that are compliant with each category and the number of instances that are not in JSON format.
You also can see overall patch compliance by choosing Compliance under Insights in the navigation pane of the Systems Manager console. You will see a visual representation of how many Amazon EC2 instances are up to date, how many Amazon EC2 instances are noncompliant, and how many Amazon EC2 instances are compliant in relation to the earlier defined patch baseline.
In this section, you have set everything up for patch management on your instance. Now you know how to patch your Amazon EC2 instance in a controlled manner and how to check if your Amazon EC2 instance is compliant with the patch baseline you have defined. Of course, I recommend that you apply these steps to all Amazon EC2 instances you manage.
Summary
In this blog post, I showed how to use Systems Manager to create a patch baseline and maintenance window to keep your Amazon EC2 Linux instances up to date with the latest security patches. Remember that by creating multiple maintenance windows and assigning them to different patch groups, you can make sure your Amazon EC2 instances do not all reboot at the same time.
If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing any part of this solution, start a new thread on the Amazon EC2 forum or contact AWS Support.
I hope that you have configured your AMIs and your current-generation EC2 instances to use the Elastic Network Adapter (ENA) that I told you about back in mid-2016. The ENA gives you high throughput and low latency, while minimizing the load on the host processor. It is designed to work well in the presence of multiple vCPUs, with intelligent packet routing backed up by multiple transmit and receive queues.
Today we are opening up the floodgates and giving you access to more bandwidth in all AWS Regions. Here are the specifics (in each case, the actual bandwidth is dependent on the instance type and size):
EC2 to S3 – Traffic to and from Amazon Simple Storage Service (S3) can now take advantage of up to 25 Gbps of bandwidth. Previously, traffic of this type had access to 5 Gbps of bandwidth. This will be of benefit to applications that access large amounts of data in S3 or that make use of S3 for backup and restore.
EC2 to EC2 – Traffic to and from EC2 instances in the same or different Availability Zones within a region can now take advantage of up to 5 Gbps of bandwidth for single-flow traffic, or 25 Gbps of bandwidth for multi-flow traffic (a flow represents a single, point-to-point network connection) by using private IPv4 or IPv6 addresses, as described here.
EC2 to EC2 (Cluster Placement Group) – Traffic to and from EC2 instances within a cluster placement group can continue to take advantage of up to 10 Gbps of lower-latency bandwidth for single-flow traffic, or 25 Gbps of lower-latency bandwidth for multi-flow traffic.
To take advantage of this additional bandwidth, make sure that you are using the latest, ENA-enabled AMIs on current-generation EC2 instances. ENA-enabled AMIs are available for Amazon Linux, Ubuntu 14.04 & 16.04, RHEL 7.4, SLES 12, and Windows Server (2008 R2, 2012, 2012 R2, and 2016). The FreeBSD AMI in AWS Marketplace is also ENA-enabled, as is VMware Cloud on AWS.
I’m running a game jam, and this announcement is before the jam starts! What a concept!
The idea is simple: you have all of February to make a horny game.
(This jam is, as you may have guessed, NSFW. 🔞)
I think there’s a lot of interesting potential at the intersection of sex and games, but we see very little exploration of it — in large part because mega-platforms like Steam (and its predecessor, Walmart) have historically been really squeamish about anything sexual. Unless it’s scantily-clad women draped over everything, that’s fine. But un-clad women are right out. Also gratuitous high-definition gore is cool. But no nipples!!
The result is a paltry cultural volume of games about sex, but as boundaries continue to be pushed without really being broken, we get more and more blockbuster games with sex awkwardly tacked on top as lazy titillation. “Ah, it’s a story-driven role-playing shooter, but in this one part you can have sex, which will affect nothing and never come up again, but you can see a butt!” Truly revolutionary.
The opposite end of the spectrum also exists, in the form of porn games where the game part is tacked on to make something interactive — you know, click really fast to make clothes fall off or whatever. It’s not especially engaging, but it’s more compelling than staring at a JPEG.
So my secret motive here is to encourage people to explore the vast gulf in the middle — to make games that are interesting as games and that feature sexuality as a fundamental part of the game. Something where both parts could stand alone, yet are so intertwined as to be inseparable.
The one genre that is seeing a lot of experimentation is the raunchy visual novel, which is a great example: they tend to tell stories where sexuality plays a heavy part, but they’re still compelling interactive stories and hold up on those grounds just as well. What, I wonder, would this same sort of harmony look like for other genres, other kinds of interaction? What does a horny racing game look like, or a horny inventory-horror game, or a horny brawler? Hell, why are there no horny co-op games to speak of? That seems obvious, right?
I haven’t said all this on the jam page because it would add half a dozen paragraphs to what is already a lengthy document. I also suspect that I’ll sound like I’m suggesting “a racing game but all the cars are dicks,” which isn’t quite right, and I’d need to blather even more to clarify. Anyway, it seems vaguely improper as the jam organizer to be telling people what kind of games not to make; last year I just tried to lead by example by making fox flux.
If exploring this design space seems interesting to you, please do join in! If you’ve never made a game before, this might be a great opportunity to give it a try — everything is going to be embarrassing and personal regardless. Maybe hop on Discord if you need help or want a teammate. Feel free to flip through last year’s entries, too, or my (super nsfw) thread where I played some and talked about them. Some of them are even open source, cough, cough.
As companies mature in their cloud journey, they implement layered security capabilities and practices in their cloud architectures. One such practice is to continually assess golden Amazon Machine Images (AMIs) for security vulnerabilities. AMIs provide the information required to launch an Amazon EC2 instance, which is a virtual server in the AWS Cloud. A golden AMI is an AMI that contains the latest security patches, software, configuration, and software agents that you need to install for logging, security maintenance, and performance monitoring. You can build and deploy golden AMIs in your environment, but the AMIs quickly become dated as new vulnerabilities are discovered.
A security best practice is to perform routine vulnerability assessments of your golden AMIs to identify if newly found vulnerabilities apply to them. If you identify a vulnerability, you can update your golden AMIs with the appropriate security patches, test the AMIs, and deploy the patched AMIs in your environment. In this blog post, I demonstrate how to use Amazon Inspector to set up such continuous vulnerability assessments to scan your golden AMIs routinely.
Solution overview
Amazon Inspector performs security assessments of Amazon EC2 instances by using AWS managed rules packages such as the Common Vulnerabilities and Exposures (CVEs) package. The solution in this post creates EC2 instances from golden AMIs and then runs an Amazon Inspector security assessment on the created instances. When the assessment results are available, the solution consolidates the findings and advises you about next steps. Furthermore, the solution schedules an Amazon CloudWatch Events rule to run the golden AMI vulnerability assessments on a regular basis.
The following solution diagram illustrates how this solution works.
Here’s how this solution works, as illustrated in the preceding diagram:
A scheduled CloudWatch Events event triggers the StartContinuousAssessmentAWS Lambda function, which starts the security assessment of your golden AMIs.
The StartContinuousAssessment Lambda function performs the following actions:
It reads a JSON parameter stored in the AWS Systems Manager (Systems Manager) Parameter Store. This JSON parameter contains the following metadata for each golden AMI:
InstanceType – A valid instance-type for launching an EC2 instance of the golden AMI.
Later in this blog post, I provide instructions for creating this JSON parameter.
For each AMI specified in the JSON parameter, the Lambda function creates an EC2 instance. When each instance starts, it installs the Amazon Inspector agent by using the user-data script provided in the JSON. The Lambda function then copies each golden AMI’s tags (you will assign custom metadata in the form of tags to each golden AMI when you set up the solution) to the corresponding EC2 instance. The function also adds a tag with the key of continuous-assessment-instance and value as true. This tag identifies EC2 instances that require regular security assessments. The Lambda function copies the AMI’s tags to the instance (and later, to the security findings found for the instance) to help you identify the golden AMIs for each security finding. After you analyze security findings, you can patch your golden AMIs.
The first time the StartContinuousAssessment function runs, it creates:
An Amazon Inspector assessment template: The template contains a reference to the Amazon Inspector assessment target created in the preceding step and the following AWS managed rules packages to evaluate:
For subsequent assessments, the StartContinuousAssessment function reuses the target and the template created during the first run of StartContinuousAssessment function.
Note: Amazon Inspector can start an assessment only after it finds at least one running Amazon Inspector agent. To allow EC2 instances to boot and the Amazon inspector agent to start, the Lambda function waits four minutes. Because the assessment runs for approximately one hour and boot time for EC2 instances typically takes a few minutes, all Amazon Inspector agents start before the assessment ends.
The Lambda function then runs the assessment. The Amazon Inspector agents collect behavior and configuration data, and pass it to Amazon Inspector. Amazon Inspector analyzes the data and generates Amazon Inspector findings, which are possible security findings you may need to address.
After the Lambda function completes the assessment, Amazon Inspector publishes an assessment-completion notification message to an Amazon SNS topic called ContinuousAssessmentCompleteTopic. SNS uses topics, which are communication channels for sending messages and subscribing to notifications.
The notification message published to SNS triggers the AnalyzeInspectorFindings Lambda function, which performs the following actions:
Associates the tags of each EC2 instance with security findings found for that EC2 instance. This enables you to identify the security findings using the app-name tag you specified for your golden AMIs. You can use the information provided in the findings to patch your golden AMIs.
Terminates all instances associated with the continuous-assessment-instance=true tag.
Aggregates the number of findings found for each EC2 instance by severity and then publishes a consolidated result to an SNS topic called ContinuousAssessmentResultsTopic.
How to deploy the solution
To deploy this solution, you must set it up in the AWS Region where you build your golden AMIs. If that AWS Region does not support Amazon Inspector, at the end of your continuous integration pipeline, you can copy your AMIs to an AWS Region where Amazon Inspector assessments are supported. To learn more about continuous integration pipelines, see What is Continuous Integration?
To deploy continuous golden AMI vulnerability assessments in your AWS account, follow these steps:
Tag your golden AMIs – Tagging your golden AMIs lets you search assessment result findings based on tags after Amazon Inspector completes an assessment.
Store your golden AMI metadata in the Systems Manager Parameter Store – Prepare and store the golden AMI metadata in the Systems Manager Parameter Store. The StartContinuousAssessment Lambda function reads golden AMI metadata and starts assessing for vulnerabilities.
Run the supplied AWS CloudFormation template and subscribe to an SNS topic to receive assessment results – Set up the infrastructure required to run vulnerability assessments and subscribe to an SNS topic to receive assessment results via email.
Test golden AMI vulnerability assessments – Ensure you have successfully set up the required resources to run vulnerability assessments.
Set up a CloudWatch Events rule for triggering continuous golden AMI vulnerability assessments – Schedule the execution of vulnerability assessments on a regular basis.
1. Tag your golden AMIs
You can search assessment findings based on golden AMI tags after Amazon Inspector completes an assessment.
To tag a golden AMI by using the AWS Management Console:
Choose your AMI from the list, and then choose Actions > Add/Edit Tags.
Choose Create Tag. In the Key column, type app-name. In the Value column, type your application name. Following the same steps, create the app-version and app-environment tags. Choose Save.
Now that you have tagged your golden AMIs, you need to create golden AMI metadata, which will be read by the StartContinuousAssessment function to initiate vulnerability assessments. You will store the golden AMI metadata in the Systems Manager Parameter Store.
2. Store your golden AMI metadata in the Systems Manager Parameter Store
This solution reads golden AMI metadata from a parameter stored in the Systems Manager Parameter Store. The metadata must be in JSON format and must contain the following information for each golden AMI:
Ami-Id
InstanceType
UserData
Step A: Find the AMI ID of your golden AMI.
An AMI ID uniquely identifies an AMI in an AWS Region and is a required parameter for launching an EC2 instance from a golden AMI. To find the AMI ID of your golden AMI:
Choose your AMI from the list and then note the corresponding value in the AMI ID column.
Step B: Find a compatible InstanceType for your golden AMI.
Each AMI has a list of compatible InstanceTypes. The InstanceType is a required parameter for launching an EC2 instance from a golden AMI. To find a compatible InstanceType for your golden AMI:
Choose Launch Instance. On the Choose an Amazon Machine Image (AMI) page, choose My AMIs.
Type the AMI ID that you noted in Step A in the Search my AMIs box, and then choose Enter.
The search result will contain your golden AMI. To choose it, choose Select.
Locate any available Instance Type and then note the corresponding value in the Type column.
Choose Cancel.
Note: Amazon Inspector will launch the chosen InstanceType every time the vulnerability assessment runs.
Step C: Create the user-data script to install and start the Amazon Inspector agent.
The user-data script automates the installation of software packages when an EC2 instance launches for the first time. In this step, you create an operating system specific, JSON-compatible user-data script that installs and starts the Amazon Inspector agent.
Identify the command that installs the Amazon Inspector agent
Based on Installing Amazon Inspector Agents, the following shell command installs the Amazon Inspector agent on an Amazon Linux-based EC2 instance.
Based on Running Commands on Your Linux Instance at Launch, you make a Linux shell script user-data compatible by prefixing it with a #!/bin/bash. In this step, you add the #!/bin/bash prefix to the script from the preceding step. The following is the user-data compatible version of the script from the preceding step.
The user-data script provided in the JSON metadata must be JSON-compatible, which you will do next.
Make the user-data script JSON compatible
To make the user-data script JSON compatible, you must replace all new-line characters with a \r\n\r\n sequence. The following is the JSON-compatible user-data script that you specify for your Amazon Linux-based golden AMI in Step D.
Repeat Steps A, B, and C to find the Ami Id, InstanceType, and UserData for each of your golden AMIs. When you have this metadata, you can create the JSON document of metadata for all your golden AMIs. The StartContinuousAssessment Lambda function reads this JSON to start golden AMI vulnerability assessments.
Step D: Create a JSON document of metadata of all your golden AMIs.
Use the following template to create a JSON document:
Replace all placeholder values with values corresponding to your first golden AMI. If your golden AMI is Amazon Linux-based, you can specify the userData as the JSON-compatible-user-data-for-Amazon-Linux-AMI from Step C.5. Next, replace the placeholder values for your second golden AMI. You can add more entries to your JSON document, if you have more than two golden AMIs.
Note: The total number of characters in the JSON document must be fewer than or equal to 4,096 characters, and the number of golden AMIs must be fewer than 500. You must verify whether your account has permissions to run one on-demand EC2 instance for each of your golden AMIs. For information about how to verify service limits, see Amazon EC2 Service Limits.
Now that you have created the JSON document of your golden AMIs, you will store the JSON document in a Systems Manager parameter. The StartContinuousAssessment Lambda function will read the metadata from this parameter.
Step E: Store the JSON in a Systems Manager parameter.
Expand Systems Manager Shared Resources in the navigation pane, and then choose Parameter Store.
Choose Create Parameter.
For Name, type ContinuousAssessmentInput.
In the Description field, type Continuous golden AMI vulnerability assessment process metadata.
For Type, choose String.
Paste the JSON that you created in Step D in the Value field.
Choose Create Parameter. After the system creates the parameter, choose Close.
To set up the remaining components required to run assessments, you will run a CloudFormation template and perform the configuration explained in the next section.
3. Run the CloudFormation template and subscribe to an SNS topic to receive assessment results
Next, create a CloudFormation stack using the provided CloudFormation template. Before you start, download the CloudFormation template to your computer.
On the Stacks page, choose AmazonInspectorAssessment.
In the Detail pane, choose Outputs to view the output of your stack.
After CloudFormation successfully creates a stack, the Outputs tab displays following results:
StartContinuousAssessmentLambdaFunction – The Value box displays the name of the StartContinuousAssessment function. You will run this function to trigger the entire workflow.
ContinuousAssessmentResultsTopic – The Value box displays the ContinuousAssessmentResultsTopic topic’s Amazon Resource Name (ARN), which you will use later.
To receive consolidated vulnerability assessment results in email, you must subscribe to ContinuousAssessmentResultsTopic.
Choose Create subscription. In the Topic ARN field, paste the ARN of ContinuousAssessmentResultsTopic that you noted in the previous section.
In the Protocol drop-down, choose Email.
In the Endpoint box, type the email address where you will receive notifications.
Choose Create subscription.
Navigate to your email application and open the message from AWS Notifications. Click the link to confirm your subscription to the SNS topic.
4. Test golden AMI vulnerability assessments
Before you schedule vulnerability assessments, you should test the process by running the StartContinuousAssessment function. In this test, you trigger a security assessment and monitor it. You then receive an email after the assessment has completed, which shows that vulnerability assessments have been successfully set up.
On Dashboard under Recent Assessment Runs, you will see an entry with the status, Collecting Data. This status indicates that Amazon Inspector agents are collecting data from instances running your golden AMIs. The agents collect data for an hour and then Amazon Inspector analyzes the collected data.
After Amazon Inspector completes the assessment, the status in the console changes to Analysis complete. Amazon Inspector then publishes an SNS message that triggers the AnalyzeInspectionReports Lambda function. When AnalyzeInspectionReports publishes results, you will receive an email containing consolidated assessment results. You also will be able to see the findings.
To see the findings in Amazon Inspector’s Findings section:
In the navigation pane, choose Assessment Runs. In the table on the Amazon Inspector – Assessment Runs page, choose the findings of the latest assessment run.
Choose the settings () icon and choose the appropriate tags to see the details of findings, as shown in the following screenshot. The findings also contain information about how you can address each underlying vulnerability.
Having verified that you have successfully set up all components of golden AMI vulnerability assessments, you now will schedule the vulnerability assessments to run on a regular basis to give you continual insight into the health of instances created from your golden AMIs.
5. Set up a CloudWatch Events rule for triggering continuous golden AMI vulnerability assessments
The last step is to create a CloudWatch Events rule to schedule the execution of the vulnerability assessments on a daily or weekly basis.
In the navigation pane, choose Rules > Create rule.
On the Event Source page, choose Schedule. Choose Fixed rate of and specify the interval (for example, 1 day).
For Targets, choose Add target and then choose Lambda function.
For Function, choose the StartContinuousAssessment function.
Choose Configure Input.
Choose Constant (JSON text).
In the box, paste the following JSON code.
{
"AMIsParamName": "ContinuousAssessmentInput"
}
Choose Configure details.
For Rule definition, type ContinuousGoldenAMIAssessmentTrigger for the name, and type as the description, This rule triggers the continuous golden AMI vulnerability assessment process.
Choose Create rule.
The vulnerability assessments are executed on the first occurrence of the schedule you chose while setting up the CloudWatch Events rule. After the vulnerability assessment is executed, you will receive an email to indicate that your continuous golden AMI vulnerability assessments are set up.
Summary
To get visibility into the security of your EC2 instances created from your golden AMIs, it is important that you perform security assessments of your golden AMIs on a regular basis. In this blog post, I have demonstrated how to set up vulnerability assessments, and the results of these continuous golden AMI vulnerability assessments can help you keep your environment up to date with security patches. To learn how to patch your golden AMIs, see Streamline AMI Maintenance and Patching Using Amazon EC2 Systems Manager.
If you have comments about this blog post, submit them in the “Comments” section below. If you have questions about implementing the solution in this post, start a new thread on the Amazon Inspector forum or contact AWS Support.
I’m getting ready to wrap up my work for the year, cleaning up my inbox and catching up on a few recent AWS launches that happened at and shortly after AWS re:Invent.
Last week we launched Amazon Linux 2. This is modern version of Linux, designed to meet the security, stability, and productivity needs of enterprise environments while giving you timely access to new tools and features. It also includes all of the things that made the Amazon Linux AMI popular, including AWS integration, cloud-init, a secure default configuration, regular security updates, and AWS Support. From that base, we have added many new features including:
Long-Term Support – You can use Amazon Linux 2 in situations where you want to stick with a single major version of Linux for an extended period of time, perhaps to avoid re-qualifying your applications too frequently. This build (2017.12) is a candidate for LTS status; the final determination will be made based on feedback in the Amazon Linux Discussion Forum. Long-term support for the Amazon Linux 2 LTS build will include security updates, bug fixes, user-space Application Binary Interface (ABI), and user-space Application Programming Interface (API) compatibility for 5 years.
Extras Library – You can now get fast access to fresh, new functionality while keeping your base OS image stable and lightweight. The Amazon Linux Extras Library eliminates the age-old tradeoff between OS stability and access to fresh software. It contains open source databases, languages, and more, each packaged together with any needed dependencies.
Tuned Kernel – You have access to the latest 4.9 LTS kernel, with support for the latest EC2 features and tuned to run efficiently in AWS and other virtualized environments.
Systemd – Amazon Linux 2 includes the systemd init system, designed to provide better boot performance and increased control over individual services and groups of interdependent services. For example, you can indicate that Service B must be started only after Service A is fully started, or that Service C should start on a change in network connection status.
Wide Availabilty – Amazon Linux 2 is available in all AWS Regions in AMI and Docker image form. Virtual machine images for Hyper-V, KVM, VirtualBox, and VMware are also available. You can build and test your applications on your laptop or in your own data center and then deploy them to AWS.
I’m interested in the Extras Library; here’s how I see which topics (lists of packages) are available:
As you can see, the library includes languages, editors, and web tools that receive frequent updates. Each topic contains all of dependencies that are needed to install the package on Amazon Linux 2. For example, the Rust topic includes the cmake build system for Rust, cargo for Rust package maintenance, and the LLVM-based compiler toolchain for Rust.
SNS Updates Many AWS customers use the Amazon Linux AMIs as a starting point for their own AMIs. If you do this and would like to kick off your build process whenever a new AMI is released, you can subscribe to an SNS topic:
You can be notified by email, invoke a AWS Lambda function, and so forth.
I always advise new EC2 users to start with our general-purpose instances, run some stress tests, and to get a really good feel for the compute, memory, and networking profile of their application before taking a look at other instance types. With a broad selection of instances optimized for compute, memory, and storage, our customers have many options and the flexibility to choose the instance type that is the best fit for their needs.
As you can see from my EC2 Instance History post, the general-purpose (M) instances go all the way back to 2006 when we launched the m1.small. We continued to evolve along this branch of our family tree, launching the the M2 (2009), M3 (2012), and the M4 (2015) instances. Our customers use the general-purpose instances to run web & app servers, host enterprise applications, support online games, and build cache fleets.
New M5 Instances Today we are taking the next step forward with the launch of our new M5 instances. These instances benefit from our commitment to continued innovation and offer even better price-performance than their predecessors. Based on Custom Intel® Xeon® Platinum 8175M series processors running at 2.5 GHz, the M5 instances are designed for highly demanding workloads and will deliver 14% better price/performance than the M4 instances on a per-core basis. Applications that use the AVX-512 instructions will crank out twice as many FLOPS per core. We’ve also added a new size at the high-end, giving you even more options.
Here are the M5 instances (all VPC-only, HVM-only, and EBS-Optimized):
Instance Name
vCPUs
RAM
Network Bandwidth
EBS-Optimized Bandwidth
m5.large
2
8 GiB
Up to 10 Gbps
Up to 2120 Mbps
m5.xlarge
4
16 GiB
Up to 10 Gbps
Up to 2120 Mbps
m5.2xlarge
8
32 GiB
Up to 10 Gbps
Up to 2120 Mbps
m5.4xlarge
16
64 GiB
Up to 10 Gbps
2120 Mbps
m5.12xlarge
48
192 GiB
10 Gbps
5000 Mbps
m5.24xlarge
96
384 GiB
25 Gbps
10000 Mbps
At the top end of the lineup, the m5.24xlarge is second only to the X instances when it comes to vCPU count, giving you more room to scale up and to consolidate workloads. The instances support Enhanced Networking, and can deliver up to 25 Gbps when used within a Placement Group.
In addition to dedicated, EBS-Optimized bandwidth to EBS, access to EBS storage is enhanced by the use of NVMe (you’ll need to install the proper drivers if you are using older AMIs). The combination of more bandwidth and NVMe will increase the amount of data that your M5 instances can chew through.
Launch One Today You can launch M5 instances today in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) Regions in On-Demand and Spot form (Reserved Instances are also available), with additional Regions in the works.
One quick note: the current NVMe driver is not optimized for high-performance sequential workloads and we don’t recommend the use of M5 instances in conjunction with sc1 or st1 volumes. We are aware of this issue and have been working to optimize the driver for this important use case.
Threats to your IT infrastructure (AWS accounts & credentials, AWS resources, guest operating systems, and applications) come in all shapes and sizes! The online world can be a treacherous place and we want to make sure that you have the tools, knowledge, and perspective to keep your IT infrastructure safe & sound.
Amazon GuardDuty is designed to give you just that. Informed by a multitude of public and AWS-generated data feeds and powered by machine learning, GuardDuty analyzes billions of events in pursuit of trends, patterns, and anomalies that are recognizable signs that something is amiss. You can enable it with a click and see the first findings within minutes.
How it Works GuardDuty voraciously consumes multiple data streams, including several threat intelligence feeds, staying aware of malicious IP addresses, devious domains, and more importantly, learning to accurately identify malicious or unauthorized behavior in your AWS accounts. In combination with information gleaned from your VPC Flow Logs, AWS CloudTrail Event Logs, and DNS logs, this allows GuardDuty to detect many different types of dangerous and mischievous behavior including probes for known vulnerabilities, port scans and probes, and access from unusual locations. On the AWS side, it looks for suspicious AWS account activity such as unauthorized deployments, unusual CloudTrail activity, patterns of access to AWS API functions, and attempts to exceed multiple service limits. GuardDuty will also look for compromised EC2 instances talking to malicious entities or services, data exfiltration attempts, and instances that are mining cryptocurrency.
GuardDuty operates completely on AWS infrastructure and does not affect the performance or reliability of your workloads. You do not need to install or manage any agents, sensors, or network appliances. This clean, zero-footprint model should appeal to your security team and allow them to green-light the use of GuardDuty across all of your AWS accounts.
Findings are presented to you at one of three levels (low, medium, or high), accompanied by detailed evidence and recommendations for remediation. The findings are also available as Amazon CloudWatch Events; this allows you to use your own AWS Lambda functions to automatically remediate specific types of issues. This mechanism also allows you to easily push GuardDuty findings into event management systems such as Splunk, Sumo Logic, and PagerDuty and to workflow systems like JIRA, ServiceNow, and Slack.
A Quick Tour Let’s take a quick tour. I open up the GuardDuty Console and click on Get started:
Then I confirm that I want to enable GuardDuty. This gives it permission to set up the appropriate service-linked roles and to analyze my logs by clicking on Enable GuardDuty:
My own AWS environment isn’t all that exciting, so I visit the General Settings and click on Generate sample findings to move ahead. Now I’ve got some intriguing findings:
I can click on a finding to learn more:
The magnifying glass icons allow me to create inclusion or exclusion filters for the associated resource, action, or other value. I can filter for all of the findings related to this instance:
I can customize GuardDuty by adding lists of trusted IP addresses and lists of malicious IP addresses that are peculiar to my environment:
After I enable GuardDuty in my administrator account, I can invite my other accounts to participate:
Once the accounts decide to participate, GuardDuty will arrange for their findings to be shared with the administrator account.
I’ve barely scratched the surface of GuardDuty in the limited space and time that I have. You can try it out at no charge for 30 days; after that you pay based on the number of entries it processes from your VPC Flow, CloudTrail, and DNS logs.
Available Now Amazon GuardDuty is available in production form in the US East (Northern Virginia), US East (Ohio), US West (Oregon), US West (Northern California), EU (Ireland), EU (Frankfurt), EU (London), South America (São Paulo), Canada (Central), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Mumbai) Regions and you can start using it today!
Most malware tries to compromise your systems by using a known vulnerability that the maker of the operating system has already patched. To help prevent malware from affecting your systems, two security best practices are to apply all operating system patches to your systems and actively monitor your systems for missing patches. In case you do need to recover from a malware attack, you should make regular backups of your data.
In today’s blog post (Part 1 of a two-part post), I show how to keep your Amazon EC2 instances that run Microsoft Windows up to date with the latest security patches by using Amazon EC2 Systems Manager. Tomorrow in Part 2, I show how to take regular snapshots of your data by using Amazon EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).
Later on, you will assign tasks to a maintenance window to patch your instances with Systems Manager. To do this, the AWS Identity and Access Management (IAM) user you are using for this post must have the iam:PassRole permission. This permission allows this IAM user to assign tasks to pass their own IAM permissions to the AWS service. In this example, when you assign a task to a maintenance window, IAM passes your credentials to Systems Manager. This safeguard ensures that the user cannot use the creation of tasks to elevate their IAM privileges because their own IAM privileges limit which tasks they can run against an EC2 instance. You should also authorize your IAM user to use EC2, Amazon Inspector, Amazon CloudWatch, and Systems Manager. You can achieve this by attaching the following AWS managed policies to the IAM user you are using for this example: AmazonInspectorFullAccess, AmazonEC2FullAccess, and AmazonSSMFullAccess.
Architectural overview
The following diagram illustrates the components of this solution’s architecture.
For this blog post, Microsoft Windows EC2 is Amazon EC2 for Microsoft Windows Server 2012 R2 instances with attached Amazon Elastic Block Store (Amazon EBS) volumes, which are running in your VPC. These instances may be standalone Windows instances running your Windows workloads, or you may have joined them to an Active Directory domain controller. For instances joined to a domain, you can be using Active Directory running on an EC2 for Windows instance, or you can use AWS Directory Service for Microsoft Active Directory.
Amazon EC2 Systems Manager is a scalable tool for remote management of your EC2 instances. You will use the Systems Manager Run Command to install the Amazon Inspector agent. The agent enables EC2 instances to communicate with the Amazon Inspector service and run assessments, which I explain in detail later in this blog post. You also will create a Systems Manager association to keep your EC2 instances up to date with the latest security patches.
You can use the EBS Snapshot Scheduler to schedule automated snapshots at regular intervals. You will use it to set up regular snapshots of your Amazon EBS volumes. EBS Snapshot Scheduler is a prebuilt solution by AWS that you will deploy in your AWS account. With Amazon EBS snapshots, you pay only for the actual data you store. Snapshots save only the data that has changed since the previous snapshot, which minimizes your cost.
You will use Amazon Inspector to run security assessments on your EC2 for Windows Server instance. In this post, I show how to assess if your EC2 for Windows Server instance is vulnerable to any of the more than 50,000 CVEs registered with Amazon Inspector.
In today’s and tomorrow’s posts, I show you how to:
Launch an EC2 instance with an IAM role, Amazon EBS volume, and tags that Systems Manager and Amazon Inspector will use.
Configure Systems Manager to install the Amazon Inspector agent and patch your EC2 instances.
Use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).
Step 1: Launch an EC2 instance
In this section, I show you how to launch your EC2 instances so that you can use Systems Manager with the instances and use instance tags with EBS Snapshot Scheduler to automate snapshots. This requires three things:
Create an IAM role for Systems Manager before launching your EC2 instance.
Launch your EC2 instance with Amazon EBS and the IAM role for Systems Manager.
Add tags to instances so that you can automate policies for which instances you take snapshots of and when.
Create an IAM role for Systems Manager
Before launching your EC2 instance, I recommend that you first create an IAM role for Systems Manager, which you will use to update the EC2 instance you will launch. AWS already provides a preconfigured policy that you can use for your new role, and it is called AmazonEC2RoleforSSM.
Sign in to the IAM console and choose Roles in the navigation pane. Choose Create new role.
In the role-creation workflow, choose AWS service > EC2 > EC2 to create a role for an EC2 instance.
Choose the AmazonEC2RoleforSSM policy to attach it to the new role you are creating.
Give the role a meaningful name (I chose EC2SSM) and description, and choose Create role.
Launch your EC2 instance
To follow along, you need an EC2 instance that is running Microsoft Windows Server 2012 R2 and that has an Amazon EBS volume attached. You can use any existing instance you may have or create a new instance.
When launching your new EC2 instance, be sure that:
The operating system is Microsoft Windows Server 2012 R2.
You attach at least one Amazon EBS volume to the EC2 instance.
The final step of configuring your EC2 instances is to add tags. You will use these tags to configure Systems Manager in Step 2 of this blog post and to configure Amazon Inspector in Part 2. For this example, I add a tag key, Patch Group, and set the value to Windows Servers. I could have other groups of EC2 instances that I treat differently by having the same tag key but a different tag value. For example, I might have a collection of other servers with the Patch Group tag key with a value of IAS Servers.
Note: You must wait a few minutes until the EC2 instance becomes available before you can proceed to the next section.
At this point, you now have at least one EC2 instance you can use to configure Systems Manager, use EBS Snapshot Scheduler, and use Amazon Inspector.
Note: If you have a large number of EC2 instances to tag, you may want to use the EC2 CreateTags API rather than manually apply tags to each instance.
Step 2: Configure Systems Manager
In this section, I show you how to use Systems Manager to apply operating system patches to your EC2 instances, and how to manage patch compliance.
To start, I will provide some background information about Systems Manager. Then, I will cover how to:
Create the Systems Manager IAM role so that Systems Manager is able to perform patch operations.
Associate a Systems Manager patch baseline with your instance to define which patches Systems Manager should apply.
Define a maintenance window to make sure Systems Manager patches your instance when you tell it to.
Monitor patch compliance to verify the patch state of your instances.
Systems Manager is a collection of capabilities that helps you automate management tasks for AWS-hosted instances on EC2 and your on-premises servers. In this post, I use Systems Manager for two purposes: to run remote commands and apply operating system patches. To learn about the full capabilities of Systems Manager, see What Is Amazon EC2 Systems Manager?
Patch management is an important measure to prevent malware from infecting your systems. Most malware attacks look for vulnerabilities that are publicly known and in most cases are already patched by the maker of the operating system. These publicly known vulnerabilities are well documented and therefore easier for an attacker to exploit than having to discover a new vulnerability.
Patches for these new vulnerabilities are available through Systems Manager within hours after Microsoft releases them. There are two prerequisites to use Systems Manager to apply operating system patches. First, you must attach the IAM role you created in the previous section, EC2SSM, to your EC2 instance. Second, you must install the Systems Manager agent on your EC2 instance. If you have used a recent Microsoft Windows Server 2012 R2 AMI published by AWS, Amazon has already installed the Systems Manager agent on your EC2 instance. You can confirm this by logging in to an EC2 instance and looking for Amazon SSM Agent under Programs and Features in Windows. To install the Systems Manager agent on an instance that does not have the agent preinstalled or if you want to use the Systems Manager agent on your on-premises servers, see the documentation about installing the Systems Manager agent. If you forgot to attach the newly created role when launching your EC2 instance or if you want to attach the role to already running EC2 instances, see Attach an AWS IAM Role to an Existing Amazon EC2 Instance by Using the AWS CLI or use the AWS Management Console.
To make sure your EC2 instance receives operating system patches from Systems Manager, you will use the default patch baseline provided and maintained by AWS, and you will define a maintenance window so that you control when your EC2 instances should receive patches. For the maintenance window to be able to run any tasks, you also must create a new role for Systems Manager. This role is a different kind of role than the one you created earlier: Systems Manager will use this role instead of EC2. Earlier we created the EC2SSM role with the AmazonEC2RoleforSSM policy, which allowed the Systems Manager agent on our instance to communicate with the Systems Manager service. Here we need a new role with the policy AmazonSSMMaintenanceWindowRole to make sure the Systems Manager service is able to execute commands on our instance.
Create the Systems Manager IAM role
To create the new IAM role for Systems Manager, follow the same procedure as in the previous section, but in Step 3, choose the AmazonSSMMaintenanceWindowRole policy instead of the previously selected AmazonEC2RoleforSSM policy.
Finish the wizard and give your new role a recognizable name. For example, I named my role MaintenanceWindowRole.
By default, only EC2 instances can assume this new role. You must update the trust policy to enable Systems Manager to assume this role.
To update the trust policy associated with this new role:
Navigate to the IAM console and choose Roles in the navigation pane.
Choose MaintenanceWindowRole and choose the Trust relationships tab. Then choose Edit trust relationship.
Update the policy document by copying the following policy and pasting it in the Policy Document box. As you can see, I have added the ssm.amazonaws.com service to the list of allowed Principals that can assume this role. Choose Update Trust Policy.
Associate a Systems Manager patch baseline with your instance
Next, you are going to associate a Systems Manager patch baseline with your EC2 instance. A patch baseline defines which patches Systems Manager should apply. You will use the default patch baseline that AWS manages and maintains. Before you can associate the patch baseline with your instance, though, you must determine if Systems Manager recognizes your EC2 instance.
Navigate to the EC2 console, scroll down to Systems Manager Shared Resources in the navigation pane, and choose Managed Instances. Your new EC2 instance should be available there. If your instance is missing from the list, verify the following:
Go to the EC2 console and verify your instance is running.
Select your instance and confirm you attached the Systems Manager IAM role, EC2SSM.
Make sure that you deployed a NAT gateway in your public subnet to ensure your VPC reflects the diagram at the start of this post so that the Systems Manager agent can connect to the Systems Manager internet endpoint.
Now that you have confirmed that Systems Manager can manage your EC2 instance, it is time to associate the AWS maintained patch baseline with your EC2 instance:
Choose Patch Baselines under Systems Manager Services in the navigation pane of the EC2 console.
Choose the default patch baseline as highlighted in the following screenshot, and choose Modify Patch Groups in the Actions drop-down.
In the Patch group box, enter the same value you entered under the Patch Group tag of your EC2 instance in “Step 1: Configure your EC2 instance.” In this example, the value I enter is Windows Servers. Choose the check mark icon next to the patch group and choose Close.
Define a maintenance window
Now that you have successfully set up a role and have associated a patch baseline with your EC2 instance, you will define a maintenance window so that you can control when your EC2 instances should receive patches. By creating multiple maintenance windows and assigning them to different patch groups, you can make sure your EC2 instances do not all reboot at the same time. The Patch Group resource tag you defined earlier will determine to which patch group an instance belongs.
To define a maintenance window:
Navigate to the EC2 console, scroll down to Systems Manager Shared Resources in the navigation pane, and choose Maintenance Windows. Choose Create a Maintenance Window.
Select the Cron schedule builder to define the schedule for the maintenance window. In the example in the following screenshot, the maintenance window will start every Saturday at 10:00 P.M. UTC.
To specify when your maintenance window will end, specify the duration. In this example, the four-hour maintenance window will end on the following Sunday morning at 2:00 A.M. UTC (in other words, four hours after it started).
Systems manager completes all tasks that are in process, even if the maintenance window ends. In my example, I am choosing to prevent new tasks from starting within one hour of the end of my maintenance window because I estimated my patch operations might take longer than one hour to complete. Confirm the creation of the maintenance window by choosing Create maintenance window.
After creating the maintenance window, you must register the EC2 instance to the maintenance window so that Systems Manager knows which EC2 instance it should patch in this maintenance window. To do so, choose Register new targets on the Targets tab of your newly created maintenance window. You can register your targets by using the same Patch Group tag you used before to associate the EC2 instance with the AWS-provided patch baseline.
Assign a task to the maintenance window that will install the operating system patches on your EC2 instance:
Open Maintenance Windows in the EC2 console, select your previously created maintenance window, choose the Tasks tab, and choose Register run command task from the Register new task drop-down.
Choose the AWS-RunPatchBaseline document from the list of available documents.
For Parameters:
For Role, choose the role you created previously (called MaintenanceWindowRole).
For Execute on, specify how many EC2 instances Systems Manager should patch at the same time. If you have a large number of EC2 instances and want to patch all EC2 instances within the defined time, make sure this number is not too low. For example, if you have 1,000 EC2 instances, a maintenance window of 4 hours, and 2 hours’ time for patching, make this number at least 500.
For Stop after, specify after how many errors Systems Manager should stop.
For Operation, choose Install to make sure to install the patches.
Now, you must wait for the maintenance window to run at least once according to the schedule you defined earlier. Note that if you don’t want to wait, you can adjust the schedule to run sooner by choosing Edit maintenance window on the Maintenance Windows page of Systems Manager. If your maintenance window has expired, you can check the status of any maintenance tasks Systems Manager has performed on the Maintenance Windows page of Systems Manager and select your maintenance window.
Monitor patch compliance
You also can see the overall patch compliance of all EC2 instances that are part of defined patch groups by choosing Patch Compliance under Systems Manager Services in the navigation pane of the EC2 console. You can filter by Patch Group to see how many EC2 instances within the selected patch group are up to date, how many EC2 instances are missing updates, and how many EC2 instances are in an error state.
In this section, you have set everything up for patch management on your instance. Now you know how to patch your EC2 instance in a controlled manner and how to check if your EC2 instance is compliant with the patch baseline you have defined. Of course, I recommend that you apply these steps to all EC2 instances you manage.
Summary
In Part 1 of this blog post, I have shown how to configure EC2 instances for use with Systems Manager, EBS Snapshot Scheduler, and Amazon Inspector. I also have shown how to use Systems Manager to keep your Microsoft Windows–based EC2 instances up to date. In Part 2 of this blog post tomorrow, I will show how to take regular snapshots of your data by using EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any CVEs.
If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing this solution, start a new thread on the EC2 forum or the Amazon Inspector forum, or contact AWS Support.
This post courtesy of Aaron Friedman, Healthcare and Life Sciences Partner Solutions Architect, AWS and Angel Pizarro, Genomics and Life Sciences Senior Solutions Architect, AWS
Precision medicine is tailored to individuals based on quantitative signatures, including genomics, lifestyle, and environment. It is often considered to be the driving force behind the next wave of human health. Through new initiatives and technologies such as population-scale genomics sequencing and IoT-backed wearables, researchers and clinicians in both commercial and public sectors are gaining new, previously inaccessible insights.
Many of these precision medicine initiatives are already happening on AWS. A few of these include:
PrecisionFDA – This initiative is led by the US Food and Drug Administration. The goal is to define the next-generation standard of care for genomics in precision medicine.
Deloitte ConvergeHEALTH – Gives healthcare and life sciences organizations the ability to analyze their disparate datasets on a singular real world evidence platform.
Central to many of these initiatives is genomics, which gives healthcare organizations the ability to establish a baseline for longitudinal studies. Due to its wide applicability in precision medicine initiatives—from rare disease diagnosis to improving outcomes of clinical trials—genomics data is growing at a larger rate than Moore’s law across the globe. Many expect these datasets to grow to be in the range of tens of exabytes by 2025.
Genomics data is also regularly re-analyzed by the community as researchers develop new computational methods or compare older data with newer genome references. These trends are driving innovations in data analysis methods and algorithms to address the massive increase of computational requirements.
Edico Genome, an AWS Partner Network (APN) Partner, has developed a novel solution that accelerates genomics analysis using field-programmable gate arrays, or FPGAs. Historically, Edico Genome deployed their FPGA appliances on-premises. When AWS announced the Amazon EC2 F1 PGA-based instance family in December 2016, Edico Genome adopted a cloud-first strategy, became a F1 launch partner, and was one of the first partners to deploy FPGA-enabled applications on AWS.
On October 19, 2017, Edico Genome partnered with the Children’s Hospital of Philadelphia (CHOP) to demonstrate their FPGA-accelerated genomic pipeline software, called DRAGEN. It can significantly reduce time-to-insight for patient genomes, and analyzed 1,000 genomes from the Center for Applied Genomics Biobank in the shortest time possible. This set a Guinness World Record for the fastest analysis of 1000 whole human genomes, and they did this using 1000 EC2 f1.2xlarge instances in a single AWS region. Not only were they able to analyze genomes at high throughput, they did so averaging approximately $3 per whole human genome of AWS compute for the analysis.
The version of DRAGEN that Edico Genome used for this analysis was also the same one used in the precisionFDA Hidden Treasures – Warm Up challenge, where they were one of the top performers in every assessment.
In the remainder of this post, we walk through the architecture used by Edico Genome, combining EC2 F1 instances and AWS Batch to achieve this milestone.
EC2 F1 instances and Edico’s DRAGEN
EC2 F1 instances provide access to programmable hardware-acceleration using FPGAs at a cloud scale. AWS customers use F1 instances for a wide variety of applications, including big data, financial analytics and risk analysis, image and video processing, engineering simulations, AR/VR, and accelerated genomics. Edico Genome’s FPGA-backed DRAGEN Bio-IT Platform is now integrated with EC2 F1 instances. You can access the accuracy, speed, flexibility, and low compute cost of DRAGEN through a number of third-party platforms, AWS Marketplace, and Edico Genome’s own platform. The DRAGEN platform offers a scalable, accelerated, and cost-efficient secondary analysis solution for a wide variety of genomics applications. Edico Genome also provides a highly optimized mechanism for the efficient storage of genomic data.
Scaling DRAGEN on AWS
Edico Genome used 1,000 EC2 F1 instances to help their customer, the Children’s Hospital of Philadelphia (CHOP), to process and analyze all 1,000 whole human genomes in parallel. They used AWS Batch to provision compute resources and orchestrate DRAGEN compute jobs across the 1,000 EC2 F1 instances. This solution successfully addressed the challenge of creating a scalable genomic processing pipeline that can easily scale to thousands of engines running in parallel.
Architecture
A simplified view of the architecture used for the analysis is shown in the following diagram:
DRAGEN’s portal uses Elastic Load Balancing and Auto Scaling groups to scale out EC2 instances that submitted jobs to AWS Batch.
Job metadata is stored in their Workflow Management (WFM) database, built on top of Amazon Aurora.
The DRAGEN Workflow Manager API submits jobs to AWS Batch.
These jobs are executed on the AWS Batch managed compute environment that was responsible for launching the EC2 F1 instances.
These jobs run as Docker containers that have the requisite DRAGEN binaries for whole genome analysis.
As each job runs, it retrieves and stores genomics data that is staged in Amazon S3.
The steps listed previously can also be bucketed into the following higher-level layers:
Workflow: Edico Genome used their Workflow Management API to orchestrate the submission of AWS Batch jobs. Metadata for the jobs (such as the S3 locations of the genomes, etc.) resides in the Workflow Management Database backed by Amazon Aurora.
Batch execution: AWS Batch launches EC2 F1 instances and coordinates the execution of DRAGEN jobs on these compute resources. AWS Batch enabled Edico to quickly and easily scale up to the full number of instances they needed as jobs were submitted. They also scaled back down as each job was completed, to optimize for both cost and performance.
Compute/job: Edico Genome stored their binaries in a Docker container that AWS Batch deployed onto each of the F1 instances, giving each instance the ability to run DRAGEN without the need to pre-install the core executables. The AWS based DRAGEN solution streams all genomics data from S3 for local computation and then writes the results to a destination bucket. They used an AWS Batch job role that specified the IAM permissions. The role ensured that DRAGEN only had access to the buckets or S3 key space it needed for the analysis. Jobs didn’t need to embed AWS credentials.
In the following sections, we dive deeper into several tasks that enabled Edico Genome’s scalable FPGA genome analysis on AWS:
Prepare your Amazon FPGA Image for AWS Batch
Create a Dockerfile and build your Docker image
Set up your AWS Batch FPGA compute environment
Prerequisites
In brief, you need a modern Linux distribution (3.10+), Amazon ECS Container Agent, awslogs driver, and Docker configured on your image. There are additional recommendations in the Compute Resource AMI specification.
Preparing your Amazon FPGA Image for AWS Batch
You can use any Amazon Machine Image (AMI) or Amazon FPGA Image (AFI) with AWS Batch, provided that it meets the Compute Resource AMI specification. This gives you the ability to customize any workload by increasing the size of root or data volumes, adding instance stores, and connecting with the FPGA (F) and GPU (G and P) instance families.
Next, install the AWS CLI:
pip install awscli
Add any additional software required to interact with the FPGAs on the F1 instances.
As a starting point, AWS publishes an FPGA Developer AMI in the AWS Marketplace. It is based on a CentOS Linux image and includes pre-integrated FPGA development tools. It also includes the runtime tools required to develop and use custom FPGAs for hardware acceleration applications.
For more information about how to set up custom AMIs for your AWS Batch managed compute environments, see Creating a Compute Resource AMI.
Building your Dockerfile
There are two common methods for connecting to AWS Batch to run FPGA-enabled algorithms. The first method, which is the route Edico Genome took, involves storing your binaries in the Docker container itself and running that on top of an F1 instance with Docker installed. The following code example is what a Dockerfile to build your container might look like for this scenario.
# DRAGEN_EXEC Docker image generator –
# Run this Dockerfile from a local directory that contains the latest release of
# - Dragen RPM and Linux DMA Driver available from Edico
# - Edico's Dragen WFMS Wrapper files
FROM centos:centos7
RUN rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# Install Basic packages needed for Dragen
RUN yum -y install \
perl \
sos \
coreutils \
gdb \
time \
systemd-libs \
bzip2-libs \
R \
ca-certificates \
ipmitool \
smartmontools \
rsync
# Install the Dragen RPM
RUN mkdir -m777 -p /var/log/dragen /var/run/dragen
ADD . /root
RUN rpm -Uvh /root/edico_driver*.rpm || true
RUN rpm -Uvh /root/dragen-aws*.rpm || true
# Auto generate the Dragen license
RUN /opt/edico/bin/dragen_lic -i auto
#########################################################
# Now install the Edico WFMS "Wrapper" functions
# Add development tools needed for some util
RUN yum groupinstall -y "Development Tools"
# Install necessary standard packages
RUN yum -y install \
dstat \
git \
python-devel \
python-pip \
time \
tree && \
pip install – upgrade pip && \
easy_install requests && \
pip install psutil && \
pip install python-dateutil && \
pip install constants && \
easy_install boto3
# Setup Python path used by the wrapper
RUN mkdir -p /opt/workflow/python/bin
RUN ln -s /usr/bin/python /opt/workflow/python/bin/python2.7
RUN ln -s /usr/bin/python /opt/workflow/python/bin/python
# Install d_haul and dragen_job_execute wrapper functions and associated packages
RUN mkdir -p /root/wfms/trunk/scheduler/scheduler
COPY scheduler/d_haul /root/wfms/trunk/scheduler/
COPY scheduler/dragen_job_execute /root/wfms/trunk/scheduler/
COPY scheduler/scheduler/aws_utils.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/constants.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/job_utils.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/logger.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/scheduler_utils.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/webapi.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/wfms_exception.py /root/wfms/trunk/scheduler/scheduler/
RUN touch /root/wfms/trunk/scheduler/scheduler/__init__.py
# Landing directory should be where DJX is located
WORKDIR "/root/wfms/trunk/scheduler/"
# Debug print of container's directories
RUN tree /root/wfms/trunk/scheduler
# Default behaviour. Over-ride with – entrypoint on docker run cmd line
ENTRYPOINT ["/root/wfms/trunk/scheduler/dragen_job_execute"]
CMD []
Note: Edico Genome’s custom Python wrapper functions for its Workflow Management System (WFMS) in the latter part of this Dockerfile should be replaced with functions that are specific to your workflow.
The second method is to install binaries and then use Docker as a lightweight connector between AWS Batch and the AFI. For example, this might be a route you would choose to use if you were provisioning DRAGEN from the AWS Marketplace.
In this case, the Dockerfile would not contain the installation of the binaries to run DRAGEN, but would contain any other packages necessary for job completion. When you run your Docker container, you enable Docker to access the underlying file system.
Connecting to AWS Batch
AWS Batch provisions compute resources and runs your jobs, choosing the right instance types based on your job requirements and scaling down resources as work is completed. AWS Batch users submit a job, based on a template or “job definition” to an AWS Batch job queue.
Job queues are mapped to one or more compute environments that describe the quantity and types of resources that AWS Batch can provision. In this case, Edico created a managed compute environment that was able to launch 1,000 EC2 F1 instances across multiple Availability Zones in us-east-1. As jobs are submitted to a job queue, the service launches the required quantity and types of instances that are needed. As instances become available, AWS Batch then runs each job within appropriately sized Docker containers.
The Edico Genome workflow manager API submits jobs to an AWS Batch job queue. This job queue maps to an AWS Batch managed compute environment containing On-Demand F1 instances. In this section, you can set this up yourself.
To create the compute environment that DRAGEN can use:
An f1.2xlarge EC2 instance contains one FPGA, eight vCPUs, and 122-GiB RAM. As DRAGEN requires an entire FPGA to run, Edico Genome needed to ensure that only one analysis per time executed on an instance. By using the f1.2xlarge vCPUs and memory as a proxy in their AWS Batch job definition, Edico Genome could ensure that only one job runs on an instance at a time. Here’s what that looks like in the AWS CLI:
You can query the status of your DRAGEN job with the following command:
aws batch describe-jobs – jobs <the job ID from the above command>
The logs for your job are written to the /aws/batch/job CloudWatch log group.
Conclusion
In this post, we demonstrated how to set up an environment with AWS Batch that can run DRAGEN on EC2 F1 instances at scale. If you followed the walkthrough, you’ve replicated much of the architecture Edico Genome used to set the Guinness World Record.
There are several ways in which you can harness the computational power of DRAGEN to analyze genomes at scale. First, DRAGEN is available through several different genomics platforms, such as the DNAnexus Platform. DRAGEN is also available on the AWS Marketplace. You can apply the architecture presented in this post to build a scalable solution that is both performant and cost-optimized.
For more information about how AWS Batch can facilitate genomics processing at scale, be sure to check out our aws-batch-genomics GitHub repo on high-throughput genomics on AWS.
I’m thrilled to announce that the new compute-intensive C5 instances are available today in six sizes for launch in three AWS regions!
These instances designed for compute-heavy applications like batch processing, distributed analytics, high-performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding. The new instances offer a 25% price/performance improvement over the C4 instances, with over 50% for some workloads. They also have additional memory per vCPU, and (for code that can make use of the new AVX-512 instructions), twice the performance for vector and floating point workloads.
Over the years we have been working non-stop to provide our customers with the best possible networking, storage, and compute performance, with a long-term focus on offloading many types of work to dedicated hardware designed and built by AWS. The C5 instance type incorporates the latest generation of our hardware offloads, and also takes another big step forward with the addition of a new hypervisor that runs hand-in-glove with our hardware. The new hypervisor allows us to give you access to all of the processing power provided by the host hardware, while also making performance even more consistent and further raising the bar on security. We’ll be sharing many technical details about it at AWS re:Invent.
The New Instances The C5 instances are available in six sizes:
Instance Name
vCPUs
RAM
EBS Bandwidth
Network Bandwidth
c5.large
2
4 GiB
Up to 2.25 Gbps
Up to 10 Gbps
c5.xlarge
4
8 GiB
Up to 2.25 Gbps
Up to 10 Gbps
c5.2xlarge
8
16 GiB
Up to 2.25 Gbps
Up to 10 Gbps
c5.4xlarge
16
32 GiB
2.25 Gbps
Up to 10 Gbps
c5.9xlarge
36
72 GiB
4.5 Gbps
10 Gbps
c5.18xlarge
72
144 GiB
9 Gbps
25 Gbps
Each vCPU is a hardware hyperthread on a 3.0 GHz Intel Xeon Platinum 8000-series processor. This custom processor, optimized for EC2, allows you have full control over the C-states on the two largest sizes, allowing you to run a single core at up to 3.5 GHz using Intel Turbo Boost Technology.
As you can see from the table, the four smallest instance sizes offer substantially more EBS and network bandwidth than the previous generation of compute-intensive instances.
Because all networking and storage functionality is implemented in hardware, C5 instances require HVM AMIs that include drivers for the Elastic Network Adapter (ENA) and NVMe. The latest Amazon Linux, Microsoft Windows, Ubuntu, RHEL, CentOS, SLES, Debian, and FreeBSD AMIs all support C5 instances. If you are doing machine learning inferencing, or other compute-intensive work, be sure to check out the most recent version of the Intel Math Kernel Library. It has been optimized for the Intel® Xeon® Platinum processor and has the potential to greatly accelerate your work.
In order to remain compatible with instances that use the Xen hypervisor, the device names for EBS volumes will continue to use the existing /dev/sd and /dev/xvd prefixes. The device name that you provide when you attach a volume to an instance is not used because the NVMe driver assigns its own device name (read Amazon EBS and NVMe to learn more):
The nvme command displays additional information about each volume (install it using sudo yum -y install nvme-cli if necessary):
The SN field in the output can be mapped to an EBS volume ID by inserting a “-” after the “vol” prefix (sadly, the NVMe SN field is not long enough to store the entire ID). Here’s a simple script that uses this information to create an EBS snapshot of each attached volume:
With a little more work (and a lot of testing), you could create a script that expands EBS volumes that are getting full.
Getting to C5 As I mentioned earlier, our effort to offload work to hardware accelerators has been underway for quite some time. Here’s a recap:
CC1 – Launched in 2010, the CC1 was designed to support scale-out HPC applications. It was the first EC2 instance to support 10 Gbps networking and one of the first to support HVM virtualization. The network fabric that we designed for the CC1 (based on our own switch hardware) has become the standard for all AWS data centers.
C3 – Launched in 2013, the C3 introduced Enhanced Networking and uses dedicated hardware accelerators to support the software defined network inside of each Virtual Private Cloud (VPC). Hardware virtualization removes the I/O stack from the hypervisor in favor of direct access by the guest OS, resulting in higher performance and reduced variability.
C4 – Launched in 2015, the C4 instances are EBS Optimized by default via a dedicated network connection, and also offload EBS processing (including CPU-intensive crypto operations for encrypted EBS volumes) to a hardware accelerator.
C5 – Launched today, the hypervisor that powers the C5 instances allow practically all of the resources of the host CPU to be devoted to customer instances. The ENA networking and the NVMe interface to EBS are both powered by hardware accelerators. The instances do not require (or support) the Xen paravirtual networking or block device drivers, both of which have been removed in order to increase efficiency.
Going forward, we’ll use this hypervisor to power other instance types and plan to share additional technical details in a set of AWS re:Invent sessions.
Launch a C5 Today You can launch C5 instances today in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) Regions in On-Demand and Spot form (Reserved Instances are also available), with additional Regions in the works.
One quick note before I go: The current NVMe driver is not optimized for high-performance sequential workloads and we don’t recommend the use of C5 instances in conjunction with sc1 or st1 volumes. We are aware of this issue and have been working to optimize the driver for this important use case.
Driven by customer demand and made possible by on-going advances in the state-of-the-art, we’ve come a long way since the original m1.small instance that we launched in 2006, with instances that are emphasize compute power, burstable performance, memory size, local storage, and accelerated computing.
The New P3 Today we are making the next generation of GPU-powered EC2 instances available in four AWS regions. Powered by up to eight NVIDIA Tesla V100 GPUs, the P3 instances are designed to handle compute-intensive machine learning, deep learning, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, and genomics workloads.
P3 instances use customized Intel Xeon E5-2686v4 processors running at up to 2.7 GHz. They are available in three sizes (all VPC-only and EBS-only):
Model
NVIDIA Tesla V100 GPUs
GPU Memory
NVIDIA NVLink
vCPUs
Main Memory
Network Bandwidth
EBS Bandwidth
p3.2xlarge
1
16 GiB
n/a
8
61 GiB
Up to 10 Gbps
1.5 Gbps
p3.8xlarge
4
64 GiB
200 GBps
32
244 GiB
10 Gbps
7 Gbps
p3.16xlarge
8
128 GiB
300 GBps
64
488 GiB
25 Gbps
14 Gbps
Each of the NVIDIA GPUs is packed with 5,120 CUDA cores and another 640 Tensor cores and can deliver up to 125 TFLOPS of mixed-precision floating point, 15.7 TFLOPS of single-precision floating point, and 7.8 TFLOPS of double-precision floating point. On the two larger sizes, the GPUs are connected together via NVIDIA NVLink 2.0 running at a total data rate of up to 300 GBps. This allows the GPUs to exchange intermediate results and other data at high speed, without having to move it through the CPU or the PCI-Express fabric.
What’s a Tensor Core? I had not heard the term Tensor core before starting to write this post. According to this very helpful post on the NVIDIA Blog, Tensor cores are designed to speed up the training and inference of large, deep neural networks. Each core is able to quickly and efficiently multiply a pair of 4×4 half-precision (also known as FP16) matrices together, add the resulting 4×4 matrix to another half or single-precision (FP32) matrix, and store the resulting 4×4 matrix in either half or single-precision form. Here’s a diagram from NVIDIA’s blog post:
This operation is in the innermost loop of the training process for a deep neural network, and is an excellent example of how today’s NVIDIA GPU hardware is purpose-built to address a very specific market need. By the way, the mixed-precision qualifier on the Tensor core performance means that it is flexible enough to work with with a combination of 16-bit and 32-bit floating point values.
Performance in Perspective I always like to put raw performance numbers into a real-world perspective so that they are easier to relate to and (hopefully) more meaningful. This turned out to be surprisingly difficult, given that the eight NVIDIA Tesla V100 GPUs on a single p3.16xlarge can do 125 trillion single-precision floating point multiplications per second.
Let’s go back to the dawn of the microprocessor era, and consider the Intel 8080A chip that powered the MITS Altair that I bought in the summer of 1977. With a 2 MHz clock, it was able to do about 832 multiplications per second (I used this data and corrected it for the faster clock speed). The p3.16xlarge is roughly 150 billion times faster. However, just 1.2 billion seconds have gone by since that summer. In other words, I can do 100x more calculations today in one second than my Altair could have done in the last 40 years!
What about the innovative 8087 math coprocessor that was an optional accessory for the IBM PC that was announced in the summer of 1981? With a 5 MHz clock and purpose-built hardware, it was able to do about 52,632 multiplications per second. 1.14 billion seconds have elapsed since then, p3.16xlarge is 2.37 billion times faster, so the poor little PC would be barely halfway through a calculation that would run for 1 second today.
Ok, how about a Cray-1? First delivered in 1976, this supercomputer was able to perform vector operations at 160 MFLOPS, making the p3.x16xlarge 781,000 times faster. It could have iterated on some interesting problem 1500 times over the years since it was introduced.
Comparisons between the P3 and today’s scale-out supercomputers are harder to make, given that you can think of the P3 as a step-and-repeat component of a supercomputer that you can launch on as as-needed basis.
Run One Today In order to take full advantage of the NVIDIA Tesla V100 GPUs and the Tensor cores, you will need to use CUDA 9 and cuDNN7. These drivers and libraries have already been added to the newest versions of the Windows AMIs and will be included in an updated Amazon Linux AMI that is scheduled for release on November 7th. New packages are already available in our repos if you want to to install them on your existing Amazon Linux AMI.
The newest AWS Deep Learning AMIs come preinstalled with the latest releases of Apache MxNet, Caffe2, and Tensorflow (each with support for the NVIDIA Tesla V100 GPUs), and will be updated to support P3 instances with other machine learning frameworks such as Microsoft Cognitive Toolkit and PyTorch as soon as these frameworks release support for the NVIDIA Tesla V100 GPUs. You can also use the NVIDIA Volta Deep Learning AMI for NGC.
P3 instances are available in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions in On-Demand, Spot, Reserved Instance, and Dedicated Host form.
The recent addition of Xilinx FPGAs to AWS Cloud compute offerings is one way that AWS is enabling global growth in the areas of advanced analytics, deep learning and AI. The customized F1 servers use pooled accelerators, enabling interconnectivity of up to 8 FPGAs, each one including 64 GiB DDR4 ECC protected memory, with a dedicated PCIe x16 connection. That makes this a powerful engine with the capacity to process advanced analytical applications at scale, at a significantly faster rate. For example, AWS commercial partner Edico Genome is able to achieve an approximately 30X speedup in analyzing whole genome sequencing datasets using their DRAGEN platform powered with F1 instances.
While the availability of FPGA F1 compute on-demand provides clear accessibility and cost advantages, many mainstream users are still finding that the “threshold to entry” in developing or running FPGA-accelerated simulations is too high. Researchers at the UC Berkeley RISE Lab have developed “FireSim”, powered by Amazon FPGA F1 instances as an open-source resource, FireSim lowers that entry bar and makes it easier for everyone to leverage the power of an FPGA-accelerated compute environment. Whether you are part of a small start-up development team or working at a large datacenter scale, hardware-software co-design enables faster time-to-deployment, lower costs, and more predictable performance. We are excited to feature FireSim in this post from Sagar Karandikar and his colleagues at UC-Berkeley.
―Mia Champion, Sr. Data Scientist, AWS
Mapping an 8-node FireSim cluster simulation to Amazon EC2 F1
As traditional hardware scaling nears its end, the data centers of tomorrow are trending towards heterogeneity, employing custom hardware accelerators and increasingly high-performance interconnects. Prototyping new hardware at scale has traditionally been either extremely expensive, or very slow. In this post, I introduce FireSim, a new hardware simulation platform under development in the computer architecture research group at UC Berkeley that enables fast, scalable hardware simulation using Amazon EC2 F1 instances.
FireSim benefits both hardware and software developers working on new rack-scale systems: software developers can use the simulated nodes with new hardware features as they would use a real machine, while hardware developers have full control over the hardware being simulated and can run real software stacks while hardware is still under development. In conjunction with this post, we’re releasing the first public demo of FireSim, which lets you deploy your own 8-node simulated cluster on an F1 Instance and run benchmarks against it. This demo simulates a pre-built “vanilla” cluster, but demonstrates FireSim’s high performance and usability.
Why FireSim + F1?
FPGA-accelerated hardware simulation is by no means a new concept. However, previous attempts to use FPGAs for simulation have been fraught with usability, scalability, and cost issues. FireSim takes advantage of EC2 F1 and open-source hardware to address the traditional problems with FPGA-accelerated simulation: Problem #1: FPGA-based simulations have traditionally been expensive, difficult to deploy, and difficult to reproduce. FireSim uses public-cloud infrastructure like F1, which means no upfront cost to purchase and deploy FPGAs. Developers and researchers can distribute pre-built AMIs and AFIs, as in this public demo (more details later in this post), to make experiments easy to reproduce. FireSim also automates most of the work involved in deploying an FPGA simulation, essentially enabling one-click conversion from new RTL to deploying on an FPGA cluster.
Problem #2: FPGA-based simulations have traditionally been difficult (and expensive) to scale. Because FireSim uses F1, users can scale out experiments by spinning up additional EC2 instances, rather than spending hundreds of thousands of dollars on large FPGA clusters.
Problem #3: Finding open hardware to simulate has traditionally been difficult.Finding open hardware that can run real software stacks is even harder. FireSim simulates RocketChip, an open, silicon-proven, RISC-V-based processor platform, and adds peripherals like a NIC and disk device to build up a realistic system. Processors that implement RISC-V automatically support real operating systems (such as Linux) and even support applications like Apache and Memcached. We provide a custom Buildroot-based FireSim Linux distribution that runs on our simulated nodes and includes many popular developer tools.
Problem #4: Writing hardware in traditional HDLs is time-consuming. Both FireSim and RocketChip use the Chisel HDL, which brings modern programming paradigms to hardware description languages. Chisel greatly simplifies the process of building large, highly parameterized hardware components.
How to use FireSim for hardware/software co-design
FireSim drastically improves the process of co-designing hardware and software by acting as a push-button interface for collaboration between hardware developers and systems software developers. The following diagram describes the workflows that hardware and software developers use when working with FireSim.
Figure 2. The FireSim custom hardware development workflow.
The hardware developer’s view:
Write custom RTL for your accelerator, peripheral, or processor modification in a productive language like Chisel.
Run a software simulation of your hardware design in standard gate-level simulation tools for early-stage debugging.
Run FireSim build scripts, which automatically build your simulation, run it through the Vivado toolchain/AWS shell scripts, and publish an AFI.
Deploy your simulation on EC2 F1 using the generated simulation driver and AFI
Run real software builds released by software developers to benchmark your hardware
The software developer’s view:
Deploy the AMI/AFI generated by the hardware developer on an F1 instance to simulate a cluster of nodes (or scale out to many F1 nodes for larger simulated core-counts).
Connect using SSH into the simulated nodes in the cluster and boot the Linux distribution included with FireSim. This distribution is easy to customize, and already supports many standard software packages.
Directly prototype your software using the same exact interfaces that the software will see when deployed on the real future system you’re prototyping, with the same performance characteristics as observed from software, even at scale.
FireSim demo v1.0
Figure 3. Cluster topology simulated by FireSim demo v1.0.
This first public demo of FireSim focuses on the aforementioned “software-developer’s view” of the custom hardware development cycle. The demo simulates a cluster of 1 to 8 RocketChip-based nodes, interconnected by a functional network simulation. The simulated nodes work just like “real” machines: they boot Linux, you can connect to them using SSH, and you can run real applications on top. The nodes can see each other (and the EC2 F1 instance on which they’re deployed) on the network and communicate with one another. While the demo currently simulates a pre-built “vanilla” cluster, the entire hardware configuration of these simulated nodes can be modified after FireSim is open-sourced.
In this post, I walk through bringing up a single-node FireSim simulation for experienced EC2 F1 users. For more detailed instructions for new users and instructions for running a larger 8-node simulation, see FireSim Demo v1.0 on Amazon EC2 F1. Both demos walk you through setting up an instance from a demo AMI/AFI and booting Linux on the simulated nodes. The full demo instructions also walk you through an example workload, running Memcached on the simulated nodes, with YCSB as a load generator to demonstrate network functionality.
Deploying the demo on F1
In this release, we provide pre-built binaries for driving simulation from the host and a pre-built AFI that contains the FPGA infrastructure necessary to simulate a RocketChip-based node.
Starting your F1 instances
First, launch an instance using the free FireSim Demo v1.0 product available on the AWS Marketplace on an f1.2xlarge instance. After your instance has booted, log in using the user name centos. On the first login, you should see the message “FireSim network config completed.” This sets up the necessary tap interfaces and bridge on the EC2 instance to enable communicating with the simulated nodes.
AMI contents
The AMI contains a variety of tools to help you run simulations and build software for RISC-V systems, including the riscv64 toolchain, a Buildroot-based Linux distribution that runs on the simulated nodes, and the simulation driver program. For more details, see the AMI Contents section on the FireSim website.
Single-node demo
First, you need to flash the FPGA with the FireSim AFI. To do so, run:
This automatically calls the simulation driver, telling it to load the Linux kernel image and root filesystem for the Linux distro. This produces output similar to the following:
Simulations Started. You can use the UART console of each simulated node by attaching to the following screens:
There is a screen on:
2492.fsim0 (Detached)
1 Socket in /var/run/screen/S-centos.
You could connect to the simulated UART console by connecting to this screen, but instead opt to use SSH to access the node instead.
First, ping the node to make sure it has come online. This is currently required because nodes may get stuck at Linux boot if the NIC does not receive any network traffic. For more information, see Troubleshooting/Errata. The node is always assigned the IP address 192.168.1.10:
This should eventually produce the following output:
PING 192.168.1.10 (192.168.1.10) 56(84) bytes of data.
From 192.168.1.1 icmp_seq=1 Destination Host Unreachable
…
64 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=2017 ms
64 bytes from 192.168.1.10: icmp_seq=2 ttl=64 time=1018 ms
64 bytes from 192.168.1.10: icmp_seq=3 ttl=64 time=19.0 ms
…
At this point, you know that the simulated node is online. You can connect to it using SSH with the user name root and password firesim. It is also convenient to make sure that your TERM variable is set correctly. In this case, the simulation expects TERM=linux, so provide that:
At this point, you’re connected to the simulated node. Run uname -a as an example. You should see the following output, indicating that you’re connected to a RISC-V system:
# uname -a
Linux buildroot 4.12.0-rc2 #1 Fri Aug 4 03:44:55 UTC 2017 riscv64 GNU/Linux
Now you can run programs on the simulated node, as you would with a real machine. For an example workload (running YCSB against Memcached on the simulated node) or to run a larger 8-node simulation, see the full FireSim Demo v1.0 on Amazon EC2 F1 demo instructions.
Finally, when you are finished, you can shut down the simulated node by running the following command from within the simulated node:
# poweroff
You can confirm that the simulation has ended by running screen -ls, which should now report that there are no detached screens.
Future plans
At Berkeley, we’re planning to keep improving the FireSim platform to enable our own research in future data center architectures, like FireBox. The FireSim platform will eventually support more sophisticated processors, custom accelerators (such as Hwacha), network models, and peripherals, in addition to scaling to larger numbers of FPGAs. In the future, we’ll open source the entire platform, including Midas, the tool used to transform RTL into FPGA simulators, allowing users to modify any part of the hardware/software stack. Follow @firesimproject on Twitter to stay tuned to future FireSim updates.
Acknowledgements
FireSim is the joint work of many students and faculty at Berkeley: Sagar Karandikar, Donggyu Kim, Howard Mao, David Biancolin, Jack Koenig, Jonathan Bachrach, and Krste Asanović. This work is partially funded by AWS through the RISE Lab, by the Intel Science and Technology Center for Agile HW Design, and by ASPIRE Lab sponsors and affiliates Intel, Google, HPE, Huawei, NVIDIA, and SK hynix.
Now that you can reserve seating in AWS re:Invent 2017 breakout sessions, workshops, chalk talks, and other events, the time is right to review the list of introductory, advanced, and expert content being offered this year. To learn more about breakout content types and levels, see Breakout Content.
SID201 – IAM for Enterprises: How Vanguard Strikes the Balance Between Agility, Governance, and Security For Vanguard, managing the creation of AWS Identity and Access Management (IAM) objects is key to balancing developer velocity and compliance. In this session, you learn how Vanguard designs IAM roles to control the blast radius of AWS resources and maintain simplicity for developers. Vanguard will also share best practices to help you manage governance and improve your visibility across your AWS resources.
SID202 – Deep dive about how Capital One automates the delivery of directory services across AWS accounts Traditional solutions for using Microsoft Active Directory across on-premises and AWS Cloud Windows workloads can require complex networking or syncing identities across multiple systems. AWS Directory Service for Microsoft Active Directory, also known as AWS Managed AD, offers you actual Microsoft Active Directory in the AWS Cloud as a managed service. In this session, you will learn how Capital One uses AWS Managed AD to provide highly available authentication and authorization services for its Windows workloads, such as Amazon RDS for SQL Server.
SID205 – Building the Largest Repo for Serverless Compliance-as-Code When you use the cloud to enable speed and agility, how do you know if you’ve done it correctly? We are on a mission to help builders follow industry best practices within security guardrails by creating the largest compliance-as-code repository, available to all. Compliance-as-code is the idea to translate best practices, guardrails, policies, and standards into codified unit testing. Apply this to your AWS environment to provide insights about what can or must be improved. Learn why compliance-as-code matters to gain speed (by getting developers, architects, and security pros on the same page), how it is currently used (demo), and how to start using it or being part of building it.
SID206 – Best Practices for Managing Security Operation on AWS To help prevent unexpected access to your AWS resources, it is critical to maintain strong identity and access policies and track, detect, and react to changes. In this session, you will learn how to use AWS Identity and Access Management (IAM) to control access to AWS resources and integrate your existing authentication system with IAM. We will cover how to deploy and control AWS infrastructure using code templates, including change management policies with AWS CloudFormation.
SID207 – Feedback Security in the Cloud Like many security teams, Riot has been challenged by new paradigms that came with the move to the cloud. We discuss how our security team has developed a security culture based on feedback and self-service to best thrive in the cloud. We detail how the team assessed the security gaps and challenges in our move to AWS, and then describe how the team works within Riot’s unique feedback culture.
SID208 – Less (Privilege) Is More: Getting Least Privilege Right in AWS AWS services are designed to enable control through AWS Identity and Access Management (IAM) and Amazon Virtual Private Cloud (VPC). Join us in this chalk talk to learn how to apply these toward the security principal of least privilege for applications and data and how to practically integrate them in your security operations.
SID209 – Designing and Deploying an AWS Account Factory AWS customers start off with one AWS account, but quickly realize the benefits of having multiple AWS accounts. A common learning curve for customers is how to securely baseline and set up new accounts at scale. This talk helps you understand how to use AWS Organizations, AWS Identity and Access Management (IAM), AWS CloudFormation, and other tools to baseline new accounts, set them up for federation, and make a secure and repeatable account factory to create new AWS accounts. Walk away with demos and tools to use in your own environment.
SID210 – A CISO’s Journey at Vonage: Achieving Unified Security at Scale Making sense of the risks of IT deployments that sit in hybrid environments and span multiple countries is a major challenge. When you add in multiple toolsets and global compliance requirements, including GDPR, it can get overwhelming. Listen to Vonage’s Chief Information Security Officer, Johan Hybinette, share his experiences tackling these challenges.
SID212 – Maximizing Your Move to AWS – Five Key Lessons from Vanguard and Cloud Technology Partners CTP’s Robert Christiansen and Mike Kavis describe how to maximize the value of your AWS initiative. From building a Minimum Viable Cloud to establishing a cloud robust security and compliance posture, we walk through key client success stories and lessons learned. We also explore how CTP has helped Vanguard, the leading provider of investor communications and technology, take advantage of AWS to delight customers, drive new revenue streams, and transform their business.
SID213 – Managing Regulator Expectations – Lessons Learned on Positioning AWS Services from an Audit Perspective Cloud migration in highly regulated industries can stall without a solid understanding of how (and when) to address regulatory expectations. This session provides a guide to explaining the aspects of AWS services that are most frequently the subject of an internal or regulatory audit. Because regulatory agencies and internal auditors might not share a common understanding of the cloud, this session is designed to help you to help them, regardless of their level of technical fluency.
SID214 – Best Security Practices in the Intelligence Community Executives from the Intelligence community discuss cloud security best practices in a field where security is imperative to operations. CIA security cloud chief John Nicely and NGA security cloud chief Scot Kaplan share success stories of migrating mass data to the cloud from a security perspective. Hear how they migrated their IT portfolios while managing their organizations’ unique blend of constraints, budget issues, politics, culture, and security pressures. Learn how these institutions overcame barriers to migration, and ask these panelists what actions you can take to better prepare yourself for the journey of mass migration to the cloud.
SID216 – Defending Diverse Applications Against Common Threats In this session, you learn how to adapt application defenses and operational responses based on your unique requirements. You also hear directly from customers about how they architected their applications on AWS to protect their applications. There are many ways to build secure, high-availability applications in the cloud. Services such as Amazon API Gateway, Amazon VPC, ALB, ELB, and Amazon EC2 are the basic building blocks that enable you to address a wide range of use cases. Best practices for defending your applications against Distributed Denial of Service (DDoS) attacks, exploitation attempts, and bad bots can vary with your choices in architecture.
Advanced level
SID301 – Using AWS Lambda as a Security Team Operating a security practice on AWS brings many new challenges that haven’t been faced in data center environments. The dynamic nature of infrastructure, the relationship between development team members and their applications, and the architecture paradigms have all changed as a result of building software on top of AWS. In this session, learn how your security team can leverage AWS Lambda as a tool to monitor, audit, and enforce your security policies within an AWS environment.
SID302 – Force Multiply Your Security Team with Automation and Alexa Adversaries automate. Who says the good guys can’t as well? By combining AWS offerings like AWS CloudTrail, Amazon Cloudwatch, AWS Config, and AWS Lambda with the power of Amazon Alexa, you can do more security tasks faster, with fewer resources. Force multiplying your security team is all about automation! Last year, we showed off penetration testing at the push of an (AWS IoT) button, and surprise-previewed how to ask Alexa to run Inspector as-needed. Want to see other ways to ask Alexa to be your cloud security sidekick? We have crazy new demos at the ready to show security geeks how to sling security automation solutions for their AWS environments (and impress and help your boss, too).
SID303 – How You Can Use AWS’s Identity Services to be Successful on Your AWS Cloud Journey Every journey to the AWS Cloud is unique. Some customers are migrating existing applications, while others are building new applications using cloud-native services. Along each of these journeys, identity and access management helps customers protect their applications and resources. In this session, you will learn how AWS’s identity services provide you a secure, flexible, and easy solution for managing identities and access on the AWS Cloud. With AWS’’s Identity Services, you do not have to adapt to AWS. Instead, you have a choice of services designed to meet you anywhere along your journey to the AWS Cloud. Every journey to the AWS Cloud is unique. Some customers are migrating existing applications, while others are building new applications using cloud-native services.
SID304 – SecOps 2021 Today: Using AWS Services to Deliver SecOps This talk dives deep on how to build end-to-end security capabilities using AWS. Our goal is orchestrating AWS Security services with other AWS building blocks to deliver enhanced security. We cover working with AWS CloudWatch Events as a queueing mechanism for processing security events, using Amazon DynamoDB to provide a stateful layer to provide tailored response to events and other ancillary functions, using DynamoDB as an attack signature engine, and the use of analytics to derive tailored signatures for detection with AWS Lambda.
SID305 – How CrowdStrike Built a Real-time Security Monitoring Service on AWS The CrowdStrike motto is “We Stop Breaches.” To do that, it needed to build a real-time security monitoring service to detect threats. Join this session to learn how Crowdstrike uses Amazon EC2 and Amazon EBS to help its customers identify vulnerabilities before they become large-scale problems.
SID306 – How Chick-fil-A Embraces DevSecOps on AWS As Chick-fil-A became a cloud-first organization, their security team didn’t want to become the bottleneck for agility. But the security team also wanted to raise the bar for their security posture on AWS. Robert Davis, security architect at Chick-fil-A, provides an overview about how he and his team recognized that writing code was the best way for their security policies to scale across the many AWS accounts that Chick-fil-A operates.
SID307 – Serverless for Security Officers: Paradigm Walkthrough and Comprehensive Security Best Practices For security practitioners, serverless represents a context switch from the familiar servers and networks to a decentralized set of code snippets and AWS platform constructs. This new ecosystem also represents new operational teams, data flows, security tooling, and faster-then-ever change velocity. In this talk, we perform live demos and provide code samples for a wide array of security best practices aligned to industry standards such as NIST 800-53 and ISO 27001.
SID308 – Multi-Account Strategies We will explore a multi-account architecture and how to approach the design/thought process around it. This chalk talk will allow attendees to dive deep into the topic and discuss the nuances of the architecture as well as provide feedback around the approach.
SID309 – Credentials, Credentials, Credentials, Oh My! For new and experienced customers alike, understanding the various credential forms and exchange mechanisms within AWS can be a daunting exercise. In this chalk talk, we clear up the confusion by performing a cartography exercise. We visually depict the right source credentials (for example, enterprise user name and password, IAM keys, AWS STS tokens, and so on) and transformation mechanisms (for example, AssumeRole and so on) to use depending on what you’re trying to do and where you’re coming from.
SID310 – Moving from the Shadows to the Throne What do you do when leadership embraces what was called “shadow IT” as the new path forward? How do you onboard new accounts while simultaneously pushing policy to secure all existing accounts? This session walks through Cisco’s journey consolidating over 700 existing accounts in the Cisco organization, while building and applying Cisco’s new cloud policies.
SID311 – Designing Security and Governance Across a Multi-Account Strategy When organizations plan their journey to cloud adoption at scale, they quickly encounter questions such as: How many accounts do we need? How do we share resources? How do we integrate with existing identity solutions? In this workshop, we present best practices and give you the hands-on opportunity to test and develop best practices. You will work in teams to set up and create an AWS environment that is enterprise-ready for application deployment and integration into existing operations, security, and procurement processes. You will get hands-on experience with cross-account roles, consolidated logging, account governance and other challenges to solve.
SID312 – DevSecOps Capture the Flag In this Capture the Flag workshop, we divide groups into teams and work on AWS CloudFormation DevSecOps. The AWS Red Team supplies an AWS DevSecOps Policy that needs to be enforced via CloudFormation static analysis. Participant Blue Teams are provided with an AWS Lambda-based reference architecture to be used to inspect CloudFormation templates against that policy. Interesting items need to be logged, and made visible via ChatOps. Dangerous items need to be logged, and recorded accurately as a template fail. The secondary challenge is building a CloudFormation template to thwart the controls being created by the other Blue teams.
SID313 – Continuous Compliance on AWS at Scale In cloud migrations, the cloud’s elastic nature is often touted as a critical capability in delivering on key business initiatives. However, you must account for it in your security and compliance plans or face some real challenges. Always counting on a virtual host to be running, for example, causes issues when that host is rebooted or retired. Managing security and compliance in the cloud is continuous, requiring forethought and automation. Learn how a leading, next generation managed cloud provider uses automation and cloud expertise to manage security and compliance at scale in an ever-changing environment.
SID314 – IAM Policy Ninja Are you interested in learning how to control access to your AWS resources? Have you wondered how to best scope permissions to achieve least-privilege permissions access control? If your answer is “yes,” this session is for you. We look at the AWS Identity and Access Management (IAM) policy language, starting with the basics of the policy language and how to create and attach policies to IAM users, groups, and roles. We explore policy variables, conditions, and tools to help you author least privilege policies. We cover common use cases, such as granting a user secure access to an Amazon S3 bucket or to launch an Amazon EC2 instance of a specific type.
SID315 – Security and DevOps: Agility and Teamwork In this session, you learn pragmatic steps to integrate security controls into DevOps processes in your AWS environment at scale. Cybersecurity expert and founder of Alert Logic Misha Govshteyn shares insights from high performing teams who are embracing the reality that an agile security program can enable faster and more secure workload deployments. Joining Misha is Joey Peloquin, Director of Cloud Security Operations at Citrix, who discusses Citrix’s DevOps experiences and how they manage their cybersecurity posture within the AWS Cloud. Session sponsored by Alert Logic.
SID316 – Using Access Advisor to Strike the Balance Between Security and Usability AWS provides a killer feature for security operations teams: Access Advisor. In this session, we discuss how Access Advisor shows the services to which an IAM policy grants access and provides a timestamp for the last time that the role authenticated against that service. At Netflix, we use this valuable data to automatically remove permissions that are no longer used. By continually removing excess permissions, we can achieve a balance of empowering developers and maintaining a best-practice, secure environment.
SID317 – Automating Security and Compliance Testing of Infrastructure-as-Code for DevSecOps Infrastructure-as-Code (IaC) has emerged as an essential element of organizational DevOps practices. Tools such as AWS CloudFormation and Terraform allow software-defined infrastructure to be deployed quickly and repeatably to AWS. But the agility of CI/CD pipelines also creates new challenges in infrastructure security hardening. This session provides a foundation for how to bring proven software hardening practices into the world of infrastructure deployment. We discuss how to build security and compliance tests for infrastructure analogous to unit tests for application code, and showcase how security, compliance and governance testing fit in a modern CI/CD pipeline.
SID318 – From Obstacle to Advantage: The Changing Role of Security & Compliance in Your Organization A surprising trend is starting to emerge among organizations who are progressing through the cloud maturity lifecycle: major improvements in revenue growth, customer satisfaction, and mission success are being directly attributed to improvements in security and compliance. At one time thought of as speed bumps in the path to deployment, security and compliance are now seen as critical ingredients that help organizations differentiate their offerings in the market, win more deals, and achieve mission-critical goals faster. This session explores how organizations like Jive Software and the National Geospatial Agency use the Evident Security Platform, AWS, and AWS Quick Starts to automate security and compliance processes in their organization to accomplish more, do it faster, and deliver better results.
SID319 – Incident Response in the Cloud In this session, we walk you through a hypothetical incident response managed on AWS. Learn how to apply existing best practices as well as how to leverage the unique security visibility, control, and automation that AWS provides. We cover how to set up your AWS environment to prevent a security event and how to build a cloud-specific incident response plan so that your organization is prepared before a security event occurs. This session also covers specific environment recovery steps available on AWS.
SID320 – Fraud Prevention, Detection, Lessons Learned, and Best Practices Fighting fraud means countering human actors that quickly adapt to whatever you do to stop them. In this presentation, we discuss the key components of a fraud prevention program in the cloud. Additionally, we provide techniques for detecting known and unknown fraud activity and explore different strategies for effectively preventing detected patterns. Finally, we discuss lessons learned from our own prevention activities as well as the best practices that you can apply to manage risk.
SID321 – How Capital One Applies AWS Organizations Best Practices to Manage Multiple AWS Accounts In this session, we review best practices for managing multiple AWS accounts using AWS Organizations. We cover how to think about the master account and your account strategy, as well as how to roll out changes. You learn how Capital One applies these best practices to manage its AWS accounts, which number over 160, and PCI workloads.
SID322 – The AWS Philosophy of Security AWS distinguished engineer Eric Brandwine speaks with hundreds of customers each year, and noticed one question coming up more than any other, “How does AWS operationalize its own security?” In this session, Eric details both strategic and tactical considerations, along with an insider’s look at AWS tooling and processes.
SID324 – Automating DDoS Response in the Cloud If left unmitigated, Distributed Denial of Service (DDoS) attacks have the potential to harm application availability or impair application performance. DDoS attacks can also act as a smoke screen for intrusion attempts or as a harbinger for attacks against non-cloud infrastructure. Accordingly, it’s crucial that developers architect for DDoS resiliency and maintain robust operational capabilities that allow for rapid detection and engagement during high-severity events. In this session, you learn how to build a DDoS-resilient application and how to use services like AWS Shield and Amazon CloudWatch to defend against DDoS attacks and automate response to attacks in progress.
SID325 – Amazon Macie: Data Visibility Powered by Machine Learning for Security and Compliance Workloads In this session, Edmunds discusses how they create workflows to manage their regulated workloads with Amazon Macie, a newly-released security and compliance management service that leverages machine learning to classify your sensitive data and business-critical information. Amazon Macie uses recurrent neural networks (RNN) to identify and alert potential misuse of intellectual property. They do a deep dive into machine learning within the security ecosystem.
SID326 – AWS Security State of the Union Steve Schmidt, chief information security officer at AWS, addresses the current state of security in the cloud, with a particular focus on feature updates, the AWS internal “secret sauce,” and what’s on horizon in terms of security, identity, and compliance tooling.
SID327 – How Zocdoc Achieved Security and Compliance at Scale With Infrastructure as Code In less than 12 months, Zocdoc became a cloud-first organization, diversifying their tech stack and liberating data to help drive rapid product innovation. Brian Lozada, CISO at Zocdoc, and Zhen Wang, Director of Engineering, provide an overview on how their teams recognized that infrastructure as code was the most effective approach for their security policies to scale across their AWS infrastructure. They leveraged tools such as AWS CloudFormation, hardened AMIs, and hardened containers. The use of DevSecOps within Zocdoc has enhanced data protection with the use of AWS services such as AWS KMS and AWS CloudHSM and auditing capabilities, and event-based policy enforcement with Amazon Elasticsearch Service and Amazon CloudWatch, all built on top of AWS.
SID328 – Cloud Adoption in Regulated Financial Services Macquarie, a global provider of financial services, identified early on that it would require strong partnership between its business, technology and risk teams to enable the rapid adoption of AWS cloud technologies. As a result, Macquarie built a Cloud Governance Platform to enable its risk functions to move as quickly as its development teams. This platform has been the backbone of Macquarie’s adoption of AWS over the past two years and has enabled Macquarie to accelerate its use of cloud technologies for the benefit of clients across multiple global markets. This talk will outline the strategy that Macquarie embarked on, describe the platform they built, and provide examples of other organizations who are on a similar journey.
SID329 – A Deep Dive into AWS Encryption Services AWS Encryption Services provide an easy and cost-effective way to protect your data in AWS. In this session, you learn about leveraging the latest encryption management features to minimize risk for your data.
SID330 – Best Practices for Implementing Your Encryption Strategy Using AWS Key Management Service AWS Key Management Service (KMS) is a managed service that makes it easy for you to create and manage the encryption keys used to encrypt your data. In this session, we will dive deep into best practices learned by implementing AWS KMS at AWS’s largest enterprise clients. We will review the different capabilities described in the AWS Cloud Adoption Framework (CAF) Security Perspective and how to implement these recommendations using AWS KMS. In addition to sharing recommendations, we will also provide examples that will help you protect sensitive information on the AWS Cloud.
SID331 – Architecting Security and Governance Across a Multi-Account Strategy Whether it is per business unit or per application, many AWS customers use multiple accounts to meet their infrastructure isolation, separation of duties, and billing requirements. In this session, we discuss considerations, limitations, and security patterns when building out a multi-account strategy. We explore topics such as identity federation, cross-account roles, consolidated logging, and account governance. Thomson Reuters shared their journey and their approach to a multi-account strategy. At the end of the session, we present an enterprise-ready, multi-account architecture that you can start leveraging today.
SID332 – Identity Management for Your Users and Apps: A Deep Dive on Amazon Cognito Learn how to set up an end-user directory, secure sign-up and sign-in, manage user profiles, authenticate and authorize your APIs, federate from enterprise and social identity providers, and use OAuth to integrate with your app—all without any server setup or code. With clear blueprints, we show you how to leverage Amazon Cognito to administer and secure your end users and enable identity for the applied patterns of mobile, web, and enterprise apps.
SID334 – How Amazon Business Uses Amazon Cloud Directory as the Data Store for Its Account Management Platform Join the Amazon Business team to learn how it uses Amazon Cloud Directory as the data store for its account management platform. You will learn how Amazon Business uses Amazon DynamoDB with Cloud Directory to manage user authorization and business process workflows. You also will learn how Cloud Directory helps to manage hierarchical datasets and how to get started modeling these datasets in Cloud Directory.
SID335 – Implementing Security and Governance across a Multi-Account Strategy As existing or new organizations expand their AWS footprint, managing multiple accounts while maintaining security quickly becomes a challenge. In this chalk talk, we will demonstrate how AWS Organizations, IAM roles, identity federation, and cross-account manager can be combined to build a scalable multi-account management platform. By the end of this session, attendees will have the understanding and deployment patterns to bring a secure, flexible and automated multi-account management platform to their organizations.
SID336 – Use AWS to Effectively Manage GDPR Compliance The General Data Protection Regulation (GDPR) is considered to be the most stringent privacy regulation ever enacted. Complying with GDPR could be a challenge for organizations, and AWS services can help get you ahead of the May 2018 enforcement deadline. In this chalk talk, the Legal and Compliance GDPR leadership at AWS discusses what enforcement of GDPR might mean for you and your customer’s compliance programs.
SID337 – Best Practices for Managing Access to AWS Resources Using IAM Roles In this chalk talk, we discuss why using temporary security credentials to manage access to your AWS resources is an AWS Identity and Access Management (IAM) best practice. IAM roles help you follow this best practice by delivering and rotating temporary credentials automatically. We discuss the different types of IAM roles, the assume role functionality, and how to author fine-grained trust and access policies that limit the scope of IAM roles. We then show you how to attach IAM roles to your AWS resources, such as Amazon EC2 instances and AWS Lambda functions. We also discuss migrating applications that use long-term AWS access keys to temporary credentials managed by IAM roles.
SID338 – [email protected] Once a customer achieves success with using AWS in a few pilot projects, most look to rapidly adopt an “all-in” enterprise migration strategy. Along this journey, several new challenges emerge that quickly become blockers and slow down migrations if they are not addressed properly. At this scale, customers will deal with the governance of hundreds of accounts, as well as thousands of IT resources residing within those accounts. Humans and traditional IT management processes cannot scale at the same pace and inevitably challenging questions emerge. In this session, we discuss those questions about governance at scale.
SID339 – Deep Dive on AWS CloudHSM Organizations building applications that handle confidential or sensitive data are subject to many types of regulatory requirements and often rely on hardware security modules (HSMs) to provide validated control of encryption keys and cryptographic operations. AWS CloudHSM is a cloud-based hardware security module (HSM) that enables you to easily generate and use your own encryption keys on the AWS Cloud using FIPS 140-2 Level 3 validated HSMs. This chalk talk will provide you a deep-dive on CloudHSM, and demonstrate how you can quickly and easily use CloudHSM to help secure your data and meet your compliance requirements.
SID340 – Using Infrastructure as Code to Inject Security Best Practices as Part of the Software Deployment Lifecycle A proactive approach to security is key to securing your applications as part of software deployment. In this chalk talk, T. Rowe Price, a financial asset management institution, outlines how they built their security automation process in enabling their numerous developer teams to rapidly and securely build and deploy applications at scale on AWS. Learn how they use services like AWS Identity and Access Management (IAM), HashiCorp tools, Terraform for automation, and Vault for secrets management, and incorporate certificate management and monitoring as part of the deployment process. T. Rowe Price discusses lessons learned and best practices to move from a tightly controlled legacy environment to an agile, automated software development process on AWS.
SID341 – Using AWS CloudTrail Logs for Scalable, Automated Anomaly Detection This workshop gives you an opportunity to develop a solution that can continuously monitor for and detect a realistic threat by analyzing AWS CloudTrail log data. Participants are provided with a CloudTrail data source and some clues to get started. Then you have to design a system that can process the logs, detect the threat, and trigger an alarm. You can make use of any AWS services that can assist in this endeavor, such as AWS Lambda for serverless detection logic, Amazon CloudWatch or Amazon SNS for alarming and notification, Amazon S3 for data and configuration storage, and more.
SID342 – Protect Your Web Applications from Common Attack Vectors Using AWS WAF As attacks and attempts to exploit vulnerabilities in web applications become more sophisticated, having an effective web request filtering solution becomes key to keeping your users’ data safe. In this workshop, discover how the OWASP Top 10 list of application security risks can help you secure your web applications. Learn how to use AWS services, such as AWS WAF, to mitigate vulnerabilities. This session includes hands-on labs to help you build a solution. Key learning goals include understanding the breadth and complexity of vulnerabilities customers need to protect from, understanding the AWS tools and capabilities that can help mitigate vulnerabilities, and learning how to configure effective HTTP request filtering rules using AWS WAF.
SID343 – User Management and App Authentication with Amazon Cognito Are you curious about how to authenticate and authorize your applications on AWS? Have you thought about how to integrate AWS Identity and Access Management (IAM) with your app authentication? Have you tried to integrate third-party SAML providers with your app authentication? Look no further. This workshop walks you through step by step to configure and create Amazon Cognito user pools and identity pools. This workshop presents you with the framework to build an application using Java, .NET, and serverless. You choose the stack and build the app with local users. See the service being used not only with mobile applications but with other stacks that normally don’t include Amazon Cognito.
SID344 – Soup to Nuts: Identity Federation for AWS AWS offers customers multiple solutions for federating identities on the AWS Cloud. In this session, we will embark on a tour of these solutions and the use cases they support. Along the way, we will dive deep with demonstrations and best practices to help you be successful managing identities on the AWS Cloud. We will cover how and when to use Security Assertion Markup Language 2.0 (SAML), OpenID Connect (OIDC), and other AWS native federation mechanisms. You will learn how these solutions enable federated access to the AWS Management Console, APIs, and CLI, AWS Infrastructure and Managed Services, your web and mobile applications running on the AWS Cloud, and much more.
SID345 – AWS Encryption SDK: The Busy Engineer’s Guide to Client-Side Encryption You know you want client-side encryption for your service but you don’t know exactly where to start. Join us for a hands-on workshop where we review some of your client-side encryption options and explore implementing client-side encryption using the AWS Encryption SDK. In this session, we cover the basics of client-side encryption, perform encrypt and decrypt operations using AWS KMS and the AWS Encryption SDK, and discuss security and performance considerations when implementing client-side encryption in your service.
Expert level
SID401 – Let’s Dive Deep Together: Advancing Web Application Security Beginning with a recap of best practices in CloudFront, AWS WAF, Route 53, and Amazon VPC security, we break into small teams to work together on improving the security of a typical web application. How can we creatively use the services? What additional features would help us? This technically advanced chalk talk requires certification at the solutions architect associate level or greater.
SID402 – An AWS Security Odyssey: Implementing Security Controls in the World of Internet, Big Data, IoT and E-Commerce Platforms This workshop will give participants the opportunity to take a security-focused journey across various AWS services and implement automated controls along the way. You will learn how to apply AWS security controls to services such as Amazon EC2, Amazon S3, AWS Lambda, and Amazon VPC. In short, you will learn how to use the cloud to protect the cloud. We will talk about how to: Adopt a workload-centric approach to your security strategy, Address security issues in a cost-effective manner Automate your security responses to promote maturity and auditability. In order to complete this workshop, attendees will need a laptop with wireless access, an AWS account and an IAM user that has full administrative privileges within their account. AWS credits will be provided as attendees depart the session to cover the cost of running the workshop in their own account.
SID404 – Amazon Inspector – Automating the “Sec” in DevSecOps Adopting DevSecOps can be challenging using traditional security tools that are designed for on-premises infrastructure. Amazon Inspector is an automated security assessment service that helps you adopt DevSecOps by integrating security assessments directly into the development process of applications running on Amazon EC2. We dive deep on how to use Inspector to automate host security assessments. We show you how to integrate Inspector with other AWS Cloud services to provide automated security assessments throughout your development process. We demo installing the AWS agent, setting up assessment targets and templates, and running assessments. We review the findings and discuss how you can automate the management and remediation of those findings with your available AWS services.
SID405 – Five New Security Automation Improvements You Can Make by Using Amazon CloudWatch Events and AWS Config Rules This presentation will include a deep dive into the code behind multiple security automation and remediation functions. This session will consider potential use cases, as well as feature a demonstration of a proposed script, and then walk through the code set to explain the various challenges and solutions of the intended script. All examples of code will be previously unreleased and will feature integration with services such as Trusted Advisor and Macie. All code will be released as OSS after re:Invent.
This time I need it to run in a cluster on AWS, so I went on to setup such a cluster. Googling how to do it gives several interesting results, like this, this and this, but they are either incomplete, or outdates, or have too many irrelevant details. So they are only of moderate help.
My goal is to use CloudFormation (or Terraform potentially) to launch a stack which has a Cassandra auto-scaling group (in a single region) that can grow as easily as increasing the number of nodes in the group.
Also, in order to have the web application connect to Cassandra without hardcoding the node IPs, I wanted to have a load balancer in front of all Cassandra nodes that does the round-robin for me. The alternative for that would be to have a client-side round-robin, but that would mean some extra complexity on the client which seems avoidable with a load balancer in front of the Cassandra auto-scaling group.
Sets up 3 private subnet (1 per availability zone in the eu-west region)
Creates a security group which allows incoming and outgoing ports that allow cassandra to accept connections (9042) and for the nodes to gossip (7000/7001). Note that the ports are only accessible from within the VPC, no external connection is allowed. SSH goes only through a bastion host.
Defines a TCP load balancer for port 9042 where all clients will connect. The load balancer requires a so-called “Target group” which is defined as well.
Configures an auto-scaling group, with a pre-configured number of nodes. The autoscaling group has a reference to the “target group”, so that the load balancer always sees all nodes in the auto-scaling group
Each node in the auto-scaling group is identical based on a launch configuration. The launch configuration runs a few scripts on initialization. These scripts will be run for every node – either initially, or in case a node dies and another one is spawned in its place, or when the cluster has to grow. The scripts are fetched from S3, where you can publish them (and version them) either manually, or with an automated process.
Note: this does not configure specific EBS volumes and in reality you may need to configure and attach them, if the instance storage is insufficient. Don’t worry about nodes dying, though, as data is safely replicated.
That was the easy part – a bunch of AWS resources and port configurations. The Cassandra-specific setup is a bit harder, as it requires understanding on how Cassandra functions.
The two scripts are setup-cassandra.sh and update-cassandra-cluster-config.py, so bash and python. Bash for setting-up the machine, and python for cassandra-specific stuff. Instead of the bash script one could use a pre-built AMI (image), e.g. with packer, but since only 2 pieces of software are installed, I thought it’s a bit of an overhead to support AMIs.
The bash script can be seen here, and simply installs Java 8 and the latest Cassandra, runs the python script, runs the Cassandra services and creates (if needed) a keyspace with proper replication configuration. A few notes here – the cassandra.yaml.template could be supplied via the cloudformation script instead of having it fetched via bash (and having the pass the bucket name); you could also have it fetched in the python script itself – it’s a matter of preference. Cassandra is not configured for use with SSL, which is generally a bad idea, but the SSL configuration is out of scope of the basic setup. Finally, the script waits for the Cassandra process to run (using a while/sleep loop) and then creates the keyspace if needed. The keyspace (=database) has to be created with a NetworkTopologyStrategy, and the number of replicas for the particular datacenter (=AWS region) has to be configured. The value is 3, for the 3 availability zones where we’ll have nodes. That means there’s a copy in each AZ (which is seen like a “rack”, although it’s exactly that).
The python script does some very important configurations – without them the cluster won’t work. (I don’t work with Python normally, so feel free to criticize my Python code). The script does the following:
Gets the current autoscaling group details (using AWS EC2 APIs)
Sorts the instances by time
Fetches the first instance in the group in order to assign it as seed node
Sets the seed node in the configuration file (by replacing a placeholder)
Sets the listen_address (and therefore rpc_address) to the private IP of the node in order to allow Cassandra to listen for incoming connections
Designating the seed node is important, as all cluster nodes have to join the cluster by specifying at least one seed. You can get the first two nodes instead of just one, but it shouldn’t matter. Note that the seed node is not always fixed – it’s just the oldest node in the cluster. If at some point the oldest node is terminated, each new node will use the second oldest as seed.
What I haven’t shown is the cassandra.yaml.template file. It is basically a copy of the cassandra.yaml file from a standard Cassandra installation, with a few changes:
cluster_name is modified to match your application name. This is just for human-readable purposes, doesn’t matter what you set it to.
allocate_tokens_for_keyspace: your_keyspace is uncommented and the keyspace is set to match your main keyspace. This enables the new token distribution algorithm in Cassandra 3.0. It allows for evenly distributing the data across nodes.
endpoint_snitch: Ec2Snitch is set instead of the SimpleSnitch to make use of AWS metadata APIs. Note that this setup is in a single region. For multi-region there’s another snitch and some addtional complications of exposing ports and changing the broadcast address.
as mentionted above, ${private_ip} and ${seeds} placeholders are placed in the appropriate places (listen_address and rpc_address for the IP) in order to allow substitution.
The lets you run a Cassandra cluster as part of your AWS stack, which is auto-scalable and doesn’t require any manual intervention – neither on setup, nor on scaling up. Well, allegedly – there may be issues that have to be resolved once you hit the usecases of reality. And for clients to connect to the cluster, simply use the load balancer DNS name (you can print it in a config file on each application node)
Run on EC2 I’m happy to announce that you can now launch EC2 instances that run Windows Server 2016 and four editions (Web, Express, Standard, and Enterprise) of SQL Server 2017. The AMIs (Amazon Machine Images) are available today in all AWS Regions and run on a wide variety of EC2 instance types, including the new x1e.32xlarge with 128 vCPUs and almost 4 TB of memory.
Licensing Options Galore You have lots of licensing options for SQL Server:
Pay As You Go – This option works well if you would prefer to avoid buying licenses, are already running an older version of SQL Server, and want to upgrade. You don’t have to deal with true-ups, software compliance audits, or Software Assurance and you don’t need to make a long-term purchase. If you are running the Standard Edition of SQL Server, you also benefit from our recent price reduction, with savings of up to 52%.
License Mobility – This option lets your use your active Software Assurance agreement to bring your existing licenses to EC2, and allows you to run SQL Server on Windows or Linux instances.
Bring Your Own Licenses – This option lets you take advantage of your existing license investment while minimizing upgrade costs. You can run SQL Server on EC2 Dedicated Instances or EC2 Dedicated Hosts, with the potential to reduce operating costs by licensing SQL Server on a per-core basis. This option allows you to run SQL Server 2017 on EC2 Linux instances (SUSE, RHEL, and Ubuntu are supported) and also supports Docker-based environments running on EC2 Windows and Linux instances. To learn more about these options, read the Installation Guidance for SQL Server on Linux and Run SQL Server 2017 Container Image with Docker.
Learn More To learn more about SQL Server 2017 and to explore your licensing options in depth, take a look at the SQL Server on AWS page.
If you need advice and guidance as you plan your migration effort, check out the AWS Partners who have qualified for the Microsoft Workloads competency and focus on database solutions.
Amazon RDS support for SQL Server 2017 is planned for November. This will give you a fully managed option.
Plan to join the AWS team at the PASS Summit (November 1-3 in Seattle) and at AWS re:Invent (November 27th to December 1st in Las Vegas).
PS – Special thanks to my colleague Tom Staab (Partner Solutions Architect) for his help with this post!
The collective thoughts of the interwebz
By continuing to use the site, you agree to the use of cookies. more information
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.