Tag Archives: Amazon EC2

BYOL and Oversubscription

Post Syndicated from Martin Yip original https://aws.amazon.com/blogs/compute/byol-and-oversubscription/

This post is courtesy of Mike Eizensmits, Senior Solutions Architect – AWS

Most AWS customers have a significant Windows Server deployment and are also tied to a Microsoft licensing program. When it comes to Microsoft products, such as Windows Server and SQL Server, licensing models can easily dictate Cloud infrastructure solutions. AWS provides several options to support Bring Your Own Licensing (BYOL) as well as EC2 License Included models for non-BYOL workloads. Most Enterprise customers have EA’s with Microsoft which can skew their licensing strategy when considering Azure, On-premises and other Cloud Service Providers such as AWS. BYOL models may be the only reasonable implementation path when entering a new environment or spinning up new applications. Licensing can constitute a significant investment when running workloads on public cloud. To help facilitate the maximum benefit of a customer’s existing Microsoft licensing, AWS provides multiple options to utilize BYOL EC2 Dedicated Hosts and Dedicated Instances expose the physical cores of the server to Windows and applications such as SQL Server while allowing licenses with or without Software Assurance to be utilized. Bare Metal as well as VMware on AWS can minimize additional licensing costs.

Dedicated Hosts and Instances

An EC2 Dedicated Host is a dedicated server of a specific Instance Class that is allotted to a single customer, referred to as dedicated tenancy. The density of a host is based on the Instance Size as well as the Instance Type defined at creation. If you chose the M5 Instance Type and chose the m5.large Instance Size, you would have 48 “slots” available on the host to deploy m5.large Instances. If you chose the m5.xlarge, you would have enough capacity to house 24 Instances. Dedicated hosts have a fixed number of vCPU and RAM per Instance Type. To deploy Windows on a Dedicated Host, the customer imports an image (vmdk, ova, vhd) using the import-image utility and tags the image as a “BYOL” in the command. The BYOL flag dictates whether the image will acquire a license from AWS or the customer’s existing licensing framework. When dealing with an oversubscribed customer environment, such as an on-premise VMware deployment, the customer has likely oversubscribed the environment with minimum of 4 vCPU to 1 physical core (4:1). In these environments, Microsoft licensing typically takes place at the host level using physical cores rather than the resources in a provisioned instance (vCPU). An AWS Dedicated Host is oversubscribed 2 vCPU to 1 CPU, meaning each core is Hyper-Threaded. While the math can be performed to show the actual value of a vCPU, customers can be reluctant to modify vCPU configurations to reflect the greater value of the AWS vCPU. Simply matching the quantity of vCPU’s to their current environment may be much more costly and expansive than rightsizing the instances for cost optimization.

Below is a sample configuration of a customer interested in migrating to AWS and utilizing BYOL for Microsoft Windows and SQL Server Enterprise Edition. By licensing the 400 physical cores in their cluster, the customer is able to assign any number of vCPU’s to the VM’s deployed on the hosts. Enterprise Architects have spent a considerable amount of time sizing VM’s with the proper resource attributes, so it can be difficult to initiate that process all over again to bring them to the public cloud.

Customer Environment (SQL Server Enterprise Cluster on VMware):

ESXi Nodes in Cluster 10
Cores/Host (2.6GHz) 40
Total Cores in Cluster 400
vCPU assigned (240/host) 2400
Virtual Machines 470
Oversubscription 5:1
Value of vCPU 520 MHz

In this case, the customer has decided not to right-size their VM’s and instead maintain their current vCPU/RAM specifications. This is a Dedicated Host solution to match their vCPU configurations.

AWS Dedicated Host Environment (Dedicated Hosts required to match vCPU of VMware above)

Dedicated Hosts (r4) 38
Cores/Host (2.6GHz) 36
vCPU/Host 64
Total vCPU assigned 2432
Oversubscription 2:1
Value of vCPU 1300 MHz

If the customer is not willing to consider right-sizing their VM’s by assigning fewer, yet higher powered vCPU’s, they face a considerably larger server deployment. After doing the math, we can determine that the Dedicated Host solution has 2.5 times the power of the VMware solution. Following that logic, and math, right-sizing the VM’s would drop the required vCPU count down to 960 vCPU to match their current solution. This would change the number of required r4 Dedicated Hosts from 38 to 15 hosts and slash the SQL licensing requirements for the solution.

EC2 Bare Metal instances and VMware Cloud on AWS

AWS does have other products that lend to the BYOL/oversubscription story. EC2 Bare Metal instances and VMware Cloud on AWS gives the customer full control of the configuration of instances just as they have on-premise. The EC2 Bare Metal instances are built on the Nitro System which is a collection of AWS-built hardware offload and security components that offers high performance networking and storage to EC2 instances. EC2 Bare Metal instances can utilize AWS services such as Amazon Virtual Private Cloud (VPC), Elastic Block Store (EBS), Elastic Load Balancing (ELB), AutoScaling and more. The Nitro configuration gives the customer the ability to install a server Operating System or hypervisor directly on the hardware. By utilizing their own hypervisor, the customer can define and configure their own instance configurations of RAM, disk and vCPU. By surpassing the fixed configurations in the EC2 Dedicated Host environment, Nitro configurations enable migrating highly oversubscribed on-premise workloads.

The VMware Cloud on AWS offering provides organizations the ability to extend and migrate their on-premise vSphere environment to AWS’s scalable and secure Cloud infrastructure. Customers can leverage vSphere, vSAN, NSX and vCenter technologies to extend their data centers and consume AWS services. vMotion provides the ability to live migrate VM’s to AWS with limited or no downtime. While licensing for migrated VM’s does not change at the VM level, it is imperative that the licensing on the new vSphere node be adequate. Since the customer has complete control of the environment, they have the ability to oversubscribe CPU’s to any ratio. By licensing applications such as SQL Server at the host level, oversubscription rates are irrelevant to licensing. If a vSphere node has 40 cores, as long as the 40 cores are licensed, the number of vCPU’s assigned is immaterial. In the VMware environment, all OS’s and applications are BYOL and no licensing will be provided by AWS. Ultimately, this solution is free of the oversubscription burden that affects certain AWS dedicated tenancy options.

Optimize CPU

EC2 Instance Types offer multiple fixed vCPU to memory configurations to match the customer’s workloads and use cases. With Optimize CPU, customers now have the ability to specify the number of cores that an instance has access to as well as determining if Hyper-Threading is enabled. Hyper-Threading Technology enables multiple threads to run concurrently on a single Intel Xeon CPU core. Each thread is represented as a virtual CPU (vCPU) on the instance. Controlling the threads and core count is significant for Microsoft SQL Server as it is typically more RAM constrained than compute bound. With Optimize CPUs, you can potentially reduce the number of SQL Server licenses required by specifying a custom number of vCPUs. SQL Server on Amazon EC2 is often licensed per virtual core. EC2 vCPU’s are the equivalent of virtual cores. When licensing with virtual cores on EC2, the number of active vCPUs provisioned through Optimize CPUs may indicate the number of SQL Server licenses required. For example, if you have a SQL Standard build that needs the RAM and network capabilities of an r4.4xlarge but not the 16 vCPUs that comes configured on it, you can define the Optimize CPU options in the CLI or API at launch to disable Hyper-Threading and limit the instance to 1, 2, 3, 4, 5, 6, 7 or 8 cores. This example would cut the licensing costs exponentially. The Optimize CPU feature is available for new instance launches in all AWS public regions. To benefit from Optimize CPUs, you can bring SQL Server licenses to Amazon EC2 and use it on EC2 default tenancy or on allocated instances on a EC2 Dedicated Host or a EC2 Dedicated Instance. For a list of supported instance types, and valid CPU counts, see instance type documentation.

In this post, we’ve covered three AWS scenarios and how they fulfill specific areas of BYOL with CPU oversubscription scenario as well as how Optimize CPU can help cut licensing costs. EC2 Dedicated hosts are generally the first choice in the Microsoft BYOL realm unless the customer is absolutely unwilling to right-size their highly oversubscribed instances. EC2 Bare Metal instances provide the customer the ability to configure all aspects of their hypervisor of choice and maintain any oversubscription that exists in their environment. This is a very popular choice that requires little change and ultimately gets their workloads to the AWS Cloud. The VMware Cloud on AWS option is sold by and provisioned by VMware. For users that are current VMware customers, this service allows them to bridge the gap between their on-premise data center and AWS while providing a seamless migration path to the Cloud using their current toolsets.

Launch an edge node for Amazon EMR to run RStudio

Post Syndicated from Tanzir Musabbir original https://aws.amazon.com/blogs/big-data/launch-an-edge-node-for-amazon-emr-to-run-rstudio/

RStudio Server provides a browser-based interface for R and a popular tool among data scientists. Data scientist use Apache Spark cluster running on  Amazon EMR to perform distributed training. In a previous blog post, the author showed how you can install RStudio Server on Amazon EMR cluster. However, in certain scenarios you might want to install it on a standalone Amazon EC2 instance and connect to a remote Amazon EMR cluster. Benefits of running RStudio on EC2 include the following:

  • Running RStudio Server on an EC2 instance, you can keep your scientific models and model artifacts on the instance. You might have to relaunch your EMR cluster to meet your application requirements. By running RStudio Server separately, you have more flexibility and don’t have to depend entirely on an Amazon EMR cluster.
  • Installing RStudio on the master node of Amazon EMR requires sharing of resources with the applications running on the same node. By running RStudio on a standalone Amazon EC2 instance, you can use resources as you need without having to share the resources with other applications.
  • You might have multiple Amazon EMR clusters in your environment. With RStudio on Edge node, you have the flexibility to connect to any EMR clusters in your environment.

There is one major difference between running RStudio Server on an Amazon EMR cluster vs. running it on a standalone Amazon EC2 instance. In the latter case, the instance needs to be configured as an Amazon EMR client (or edge node). By doing so, you can submit Apache Spark jobs and other Hadoop-based jobs from an instance other than EMR master node.

In this post, I walk you through a list of steps to configure an Amazon EC2 instance as an Amazon EMR edge node with RStudio Server configured for remote workloads.

Solution overview

In the next few sections, I describe creating an edge node, installing RStudio, and connecting to a remote Spark cluster from R running on the edge node.

At a high level, the solution includes the following steps:

  1. Create an Amazon EC2 instance.
  2. Install RStudio Server and required dependencies on that instance.
  3. Install the Apache Spark and Hadoop client libraries and dependencies on the same instance.
  4. Launch an Amazon EMR cluster with the Apache Spark, Livy, and Hive applications.
  5. Configure the Amazon EC2 instance as an EMR client.
  6. Test EMR client functionality by running sample remote jobs.
  7. Connect to the Spark cluster from R using Sparklyr.
  8. Interact with Hive tables on EMR from RStudio.
  9. Run R models on the data on the EMR cluster.

Let’s take a look at the steps to configure an Amazon EC2 instance as EMR edge node.

Creating an edge node for Amazon EMR

In this exercise, I create a Spark client on the edge node. Because Spark relies on Hadoop libraries, I also install Hadoop on the edge node. To make sure that the client works properly, I install the same Hadoop and Spark versions as those on the EMR cluster. Because most of the libraries also run on JVM, I recommend that you have the same JVM version as the one on the EMR cluster.

After I install Spark and Hadoop on the edge node, I configure the edge node to talk to the remote Amazon EMR cluster. To do that, several configurations files from the EMR cluster need to copy to the edge node. There are two ways to copy the configuration files from a newly created EMR cluster to the edge node on EC2 instance—manual and automated. In the next section, I discuss those two approaches.

Manual approach

After the EMR cluster is up and running, you can use the secure transfer tool scp to copy the required configuration files from an EMR master node to a local machine. In my case, I used my laptop.

> mkdir emr-config
> cd emr-config
> scp -i <key> [email protected]<master-node-dns>:/etc/hadoop/conf/*-site.xml .

You can also use the same tool to copy those files from that local machine to the edge node:

> scp -i <key> hdfs-site.xml [email protected]<edge-node-dns>:/etc/hadoop/conf/.

PC users can use an application like WinSCP to connect and transfer files between an EMR master node and a PC.

Note: Depending on the applications installed on Amazon EMR, you might need to copy other libraries from the cluster. As an example, the open-source distributions of Hadoop and Spark packages that are used in this solution don’t have libraries for EMRFS. So, to use EMRFS, copy the EMRFS libraries to the edge node and update the classpath to include the libraries.

Automated approach (used in this solution)

As you might have noticed in the previous approach, you need to run the copy operation twice:

  1. From the EMR master node to a local machine
  2. From the local machine to the edge node

If you use a bastion host to access EMR, then the copy process also needs to go one extra hop. One way to automate this process is to execute a script as an EMR step, which uploads all the required libraries and configuration files to an Amazon S3 location. A second script on the edge node runs through cfn-init, which downloads files from the S3 location and places them in the right application paths. The following diagram illustrates a sequence of steps that take place during this process.

In this solution, the EMR step (CreateEMRClientDeps) executes the script create-emr-client.sh to copy the configuration files to Amazon S3. The script first creates an archive file awsemrdeps.tgz with all the required libraries. It then uploads that file into a temporary S3 bucket with a prefix ending in /emr-client/. On the edge node, the install-client-and-rstudio.sh script is used to copy the awsemrdeps.tgz file from S3 back to the edge node.

Let’s take a look at the AWS CloudFormation steps to create an edge node for Amazon EMR and run RStudio on the edge node.

Walkthrough using AWS CloudFormation

Following are the prerequisites to run the AWS CloudFormation template for this solution:

  • Create an Amazon VPC with at least one public subnet and one private subnet.
  • Update the IAM policy so that the user has access to create IAM policies, instance profile, roles, and security groups.
  • Enable VPC endpoints for Amazon S3.
  • Create an EC2 key-pair to connect to EC2 instances.

To set up this entire solution, you need to create a few AWS resources. The attached CloudFormation template creates all those required AWS resources and configures them to create an Amazon EMR edge node and running RStudio on it.

This CloudFormation template requires you to pass the following parameters during launch.

Parameter Description
EmrSubnet The subnet where the Amazon EMR cluster is deployed. It can be either a public or private subnet.
InstanceType The Amazon EC2 instance type used for the RStudio Server and edge node, which defaults to m4.xlarge.
KeyName The name of the existing EC2 key pair to access the Amazon EMR and edge node.
RStudioServerSubnet The public subnet where the RStudio Server and edge node are launched.
S3RepoPath The Amazon S3 path where all required files (template, scripts job, sample data, and so on) are stored.
S3TempUploadPath The S3 path in your AWS account for housing temporary dependency files and sample data for Hive.
VPC The ID of the virtual private cloud (VPC) where the EMR and edge node is deployed.

Important: This template is designed only to show how you can create an EMR edge node and configure RStudio for remote EMR workloads. This setup isn’t intended for production use without modification. If you try this solution outside of the US-East-1 Region, be sure to download the necessary files from s3://aws-data-analytics-blog/rstudio-edge-node. You then upload the files to the buckets in your AWS Region, edit the script as appropriate, and then run it.

To launch the CloudFormation stack, choose Launch Stack:

The following sample screenshot shows the stack parameters.

Launching this stack creates the following AWS resources.

Logical ID Resource type Description
EMRCluster Amazon EMR cluster The EMR cluster to run the Spark and Hive jobs
CreateEMRClientDeps EMR step job A job that runs a script to create client dependencies and uploads to S3
CreateHiveTables EMR step job A job to copy sample data for Hive and create Hive tables
RStudioConfigureWaitCondition CloudFormation wait condition A wait condition that works with the wait handler, and waits for the RStudio Server setup process to complete
RStudioEIP Elastic IP address The elastic IP address for RStudio Server
RStudioInstanceProfile Instance profile The instance profile for the RStudio and edge node instance (for this solution, I used the default role EMR_EC2_DefaultRole created during EMR launch)
RStudioSecGroup Amazon EC2 security group The security group that controls incoming traffic to the edge node
RStudioServerEC2 Amazon EC2 instance The EC2 instance for the edge node and RStudio Server
RStudioToEMRSecGroup Amazon EC2 security group The security group that controls traffic between EMR and the edge node
RStudioWaitHandle CloudFormation wait handler The wait handler that gets triggered after RStudio Server is launched
SecGroupSelfIngress Amazon EC2 security group ingress rule An ingress rule to RStudioToEMRSecGroup that allows the instance to talk an instance with the same security group

The CloudFormation template used in this solution configures S3 paths and stores files to their respective locations. The EMR client dependencies archive awsemrdeps.tgz is stored at the <<s3-temp-upload-path>>/emr-client/ location. The sample data file tripdata.csv is stored at <<s3-temp-upload-path>>/ny-taxi/.

The following screenshot shows how the S3 paths are configured after deployment. In this example, I passed an S3 full path, s3://<<my-bucket>>/rstudio-edge-node, which is on my Amazon S3 account.

When the CloudFormation template has successfully completed, the DNS address of RStudio Server is displayed on the Outputs tab, as shown following.

The address shown is the DNS address of the RStudio Server and edge node. A user should be able to connect to this address immediately after enabling FoxyProxy.

Test data and tables

For the source data, I have used New York City Taxi and Limousine Commission (TLC) trip record data. For a description of the data, see this detailed dictionary of the taxi data. The trip data is in comma-separated value (CSV) format with the first row as a header. The following image shows data from the trip dataset.

A second EMR step, CreateHiveTables, is created as part of the CloudFormation template. This step creates two Hive tables that will be later used by R on RStudio to run sample models. Both are external Hive tables—one is stored in HDFS on the EMR cluster and the other in Amazon S3. The goal is to demonstrate how RStudio can consume data with storage that is backed by HDFS and S3.

Table name Storage type Path
ny_taxi_hdfs HDFS /user/ruser/ny_taxi
ny_taxi_s3 S3 s3://<s3-temp-upload-path>/ny_taxi

The following section shows a list of steps to test Amazon EMR client functionality, which is optional.

Testing EMR client functionality (optional)

If the EMR client is configured correctly on the edge node, you should be able to submit Spark jobs from the edge node to the EMR cluster. You can apply the following few steps to the edge node to verify this functionality:

  1. Log in to the edge node using the default user ec2-user.
  2. Choose this host address from the CloudFormation Outputs tab:
ssh -i <<key-pair>> [email protected]<<rstudio-server-address>>
  1. The CloudFormation template also creates a new user, called ruser, that you can use to submit Spark jobs or use the RStudio UI. Switch the user from ec2-user to ruser:
[[email protected] ~]$ sudo -s
[[email protected] ec2-user]# su – ruser
  1. Submit a Spark example job to the remote EMR cluster:
$SPARK_HOME/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn $SPARK_HOME/examples/jars/spark-examples_2.11-2.3.1.jar

  1. Check the job status in the terminal and also on the EMR console. The Spark example job should be able to finish successfully. In the terminal, it should display the value of Pi as shown following.

  1. Check the job status in the Resource Manager UI; notice that the Spark PI job ran as ruser and completed successfully.

  1. Test this setup further by running spark-shell, and retrieve Hive table data from the remote EMR cluster:
[[email protected] ~]$ $SPARK_HOME/bin/spark-shell

  1. Check the list of all available Hive tables and their content:
scala> spark.sql("show tables").show

scala> spark.sql("select * from ny_taxi_s3 limit 10").show

Running R and connecting to Apache Spark

In this section, let’s run some tests and models from RStudio consuming data from Amazon EMR. Locate the RStudio Server address on the Outputs tab on the CloudFormation console. The user name is ruser and the password is BigData26.

A successful login redirects you to this welcome window. The left big window is the console window, where you write R.

Create a SparkContext on R console. No additional configuration is needed, because RStudio is already set up with the required environment variables and files through the AWS CloudFormation stack. In this solution, Sparklyr is used to connect to Spark. Attach the required R packages before creating SparkContext as follows:

sc <- spark_connect(master = "yarn")

When a connection to Spark is established, it creates a “yarn” connection channel (find this in the RStudio UI, on the Connections tab at the right corner). It also shows Hive metadata on the same widget. Because the CloudFormation template created two Hive tables, they appear under the “yarn” connection as shown following.

A YARN application is also placed under ruser. The status of the application is RUNNING as long as the connection is established. You can find the status of that application on the YARN ResourceManager UI also. Notice that the user is ruser and the name of the application is sparklyr.

For more information, check the YARN app log by choosing Log on the widget.

Now, test whether the data for those two Hive tables is accessible. Choose the ny_taxi_hdfs table to sample the data.

Choose the ny_taxi_s3 table to sample the data on S3.

By successfully running these two tests, you can see that RStudio running on an edge node can consume data stored in a remote EMR cluster.

During the development phase, users might want to write some data back to your S3 bucket. So it’s a good idea to verify whether a user can write data directly to S3 using R and Spark. To test this, I read the ny_taxi_hdfs Hive table using the spark_read_table API. Then I write the data to Amazon S3 by calling the spark_write_csv API and passing my S3 target path. For this solution, I used s3://tm-blogs-placeholder/write-from-rstudio as my new S3 path.

ny_taxi <- spark_read_table(sc, "ny_taxi_hdfs")
spark_write_csv(ny_taxi,path = "s3://tm-blogs-placeholder/write-from-rstudio")

After the write operation, the S3 location appears as follows.

You can also see Spark write logs in the YARN application log.

Now analyze the data with R and draw some plots. To do so, first check the count of ny_taxi data. It should return 20,000.

ny_taxi <- spark_read_table(sc, "ny_taxi_hdfs")
ny_taxi %>% count

Now, find the number of trips for each rate code type. There are six different rate code types where 1 is the standard rate code and 5 is the negotiated rate. For details, see this detailed dictionary of the taxi data.

trip_by_rate_code_id <- ny_taxi %>%
  mutate(rate_code_id) %>%
  group_by(rate_code_id) %>%
  summarize(n = n()) %>%

ggplot(trip_by_rate_code_id, aes(rate_code_id, n)) + 
  geom_bar(stat="Identity") +
  scale_y_continuous(labels = scales::comma) +
  labs(title = "Number of Trips by Rate Code", x = "Rate Code Id", y = "")

Based on the graph, I can say that (except for some passengers who paid a negotiated rate) the rest of the passengers paid the standard rate during their ride.

Now find the average trip duration between two New York areas—Queens and Manhattan. The pu_location_id value represents the taxi pick-up zone, and do_location_id represents the taxi drop-off zone. For this test, I use 129 as the pick-up zone and 82 as the drop-off zone. Taxi zone 129 represents the Jackson Heights area in Queens, and taxi zone 82 represents the Elmhurst area. For details, see this taxi zone lookup table.

trip_duration_tbl <- ny_taxi %>%
  filter(pu_location_id == 129 & do_location_id == 82) %>%
  mutate(pickup_time = hour(from_unixtime(unix_timestamp(lpep_pickup_datetime, "MM/dd/yy HH:mm")))) %>%
  mutate(trip_duration = unix_timestamp(lpep_dropoff_datetime, "MM/dd/yy HH:mm") - unix_timestamp(lpep_pickup_datetime, "MM/dd/yy HH:mm")) %>%
  group_by(pickup_time) %>% 
  summarize(n = n(),
            trip_duration_mean = mean(trip_duration),
            trip_duration_p10 = percentile(trip_duration, 0.10),
            trip_duration_p25 = percentile(trip_duration, 0.25),
            trip_duration_p50 = percentile(trip_duration, 0.50),
            trip_duration_p75 = percentile(trip_duration, 0.75),
            trip_duration_p90 = percentile(trip_duration, 0.90)) %>% 
ggplot(trip_duration_tbl, aes(x = pickup_time)) +
          geom_line(aes(y = trip_duration_p50, alpha = "Median")) +
          geom_ribbon(aes(ymin = trip_duration_p25, ymax = trip_duration_p75, 
                          alpha = "25–75th percentile")) +
          geom_ribbon(aes(ymin = trip_duration_p10, ymax = trip_duration_p90, 
                          alpha = "10–90th percentile")) +
          scale_y_continuous("Trip Duration (in seconds)") + 
          scale_x_continuous("Pickup Time of the day")

Based on the plot, I can say that on average, each trip duration was about 10–12 minutes. There was a rare peak around 1 a.m. for some days, where the trip duration was more than 30 minutes.

Next steps

The goal of this post is to show, first how to create an edge node or Amazon EMR client on an Amazon EC2 instance. Second, it’s to show how other applications—RStudio in this case—can use that edge node or Amazon EMR client to submit workloads remotely. By following the same approach, you can also create an edge node for other Hadoop applications—Hive client, Oozie client, HBase client, and so on. Data scientists can keep enriching their R environment by adding additional packages and keeping it totally isolated from developers EMR environments. To enhance this solution further and make this production ready, you can explore the following options:

  • Use friendly URLs for Amazon EMR interfaces. For example, instead of thrift://ip-10-0-20-253.ec2.internal:9083 for the hive.metastore.uris value, you can use something like thrift://hive-metastore.dev.example.corp:9083. In the same way, instead of using ip-10-0-20-253.ec2.internal:8032 for the yarn.resourcemanager.address property value, you can use dev.emr.example.corp:8032. The benefit of this approach is that, even if you terminate your EMR cluster and recreate it again (with new IP addresses), you don’t have to change your client node’s configuration. This blog post shows how you can create friendly URLs for Amazon EMR.
  • If you already integrated Microsoft Active Directory into your Amazon EMR cluster, you can do the same with RStudio. That way, you can achieve single sign-on across your data analytics solutions.
  • Enable detailed Amazon CloudWatch logs to monitor your edge node behaviors and trigger alerts for different scenarios (disk space utilization, memory usage, and so on). With this approach, you can proactively notify your data scientists before a possible failure.
  • H20 is one of the popular packages used in R. It’s open-source software that allows users to fit thousands of potential models to discover patterns in user data. You can install H20 using CRAN just like the way Sparklyr was installed in this solution. You can execute this on RStudio. Alternatively, you can add the H20 package as part of the installation process by placing it in the install-client-and-rstudio.sh
localH2O = h2o.init()

Common issues

Although it’s hard to cover every possible scenario (because these vary on AWS environments), this section covers some common issues that can occur and ways to fix them.

Issue 1: Clicking the RStudio Server URL returns a There is no Internet connection error.

Solution:  Make sure that you configured FoxyProxy in your browser and that you are connecting to the public IP address of the RStudio EC2 instance. You can get this address from the AWS CloudFormation console on the Outputs tab.

Issue 2: The EMR step job CreateClientDeps fails.

Solution: This EMR step job runs the create-emr-client.sh script, which creates an archive with all required dependencies and uploads it to the S3 location. If the edge node doesn’t have write access to S3, this step job fails. In this solution, the default EMR role EMR_EC2_DefaultRole is assigned to the edge node instance also. We assume that EMR_EC2_DefaultRole has write access to the S3 location given through the CloudFormation parameter S3TempUploadPath.

Issue 3: The AWS CloudFormation template Blog-EMR-Edge-Node-With-RStudio times out or fails.

Solution: A script called install-client-and-rstudio.sh runs through cfn-init on the edge node, and it writes logs to the /tmp/edge-node-rstudio-installation.log file. This script contains a sleep clause, where it waits for the awsemrdeps.tgz file to be available on S3. This clause times out after 20 minutes. If the script fails to find that file within that time period, subsequent execution fails. Also, in this solution, RStudio uses http://cran.rstudio.com/ as its repo when installing packages. If the Amazon EC2 instance can’t reach the internet, it can’t download and install those packages, and the template might fail. Make sure that you pick a public subnet or a private subnet with NAT for the edge node.

Issue 4: During Amazon EMR client testing, the Spark sample application fails with a NoClassdefFounderror or UnsupportedOperationException error.

Solution: This blog post uses Amazon EMR 5.16.0. Make sure to use the Hadoop and Spark versions corresponding to the EMR release. If the master node’s application version is different from the edge node’s application version, the client might fail with a NoClassdefFounderror or UnsupportedOperationException error. Make sure that you always install the same version of Hadoop and Spark in both locations.

Cleaning up

When you’ve finished testing this solution, remember to clean up all those AWS resources that you created using AWS CloudFormation. Use the AWS CloudFormation console or AWS CLI to delete the stack named Blog-EMR-Edge-Node-With-RStudio.


In this post, I show you how to create a client for Amazon EMR. I also show how you can install RStudio on that client node and connect Apache Spark clusters running on Amazon EMR. I used Sparklyr to connect to Spark, consume data from both HDFS and S3, and analyze the data using R models. Go ahead—give this solution a try and share your experience with us!


Additional Reading

If you found this post useful, be sure to check out Running sparklyr – RStudio’s R Interface to Spark on Amazon EMR, Statistical Analysis with Open-Source R and RStudio on Amazon EMR, and Running R on Amazon Athena.


About the Author

Tanzir Musabbir is an EMR Specialist Solutions Architect with AWS. He is an early adopter of open source Big Data technologies. At AWS, he works with our customers to provide them architectural guidance for running analytics solutions on Amazon EMR, Amazon Athena & AWS Glue. Tanzir is a big Real Madrid fan and he loves to travel in his free time.


New – AWS Systems Manager Session Manager for Shell Access to EC2 Instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-session-manager/

It is a very interesting time to be a corporate IT administrator. On the one hand, developers are talking about (and implementing) an idyllic future where infrastructure as code, and treating servers and other resources as cattle. On the other hand, legacy systems still must be treated as pets, set up and maintained by hand or with the aid of limited automation. Many of the customers that I speak with are making the transition to the future at a rapid pace, but need to work in the world that exists today. For example, they still need shell-level access to their servers on occasion. They might need to kill runaway processes, consult server logs, fine-tune configurations, or install temporary patches, all while maintaining a strong security profile. They want to avoid the hassle that comes with running Bastion hosts and the risks that arise when opening up inbound SSH ports on the instances.

We’ve already addressed some of the need for shell-level access with the AWS Systems Manager Run Command. This AWS facility gives administrators secure access to EC2 instances. It allows them to create command documents and run them on any desired set of EC2 instances, with support for both Linux and Microsoft Windows. The commands are run asynchronously, with output captured for review.

New Session Manager
Today we are adding a new option for shell-level access. The new Session Manager makes the AWS Systems Manager even more powerful. You can now use a new browser-based interactive shell and a command-line interface (CLI) to manage your Windows and Linux instances. Here’s what you get:

Secure Access – You don’t have to manually set up user accounts, passwords, or SSH keys on the instances and you don’t have to open up any inbound ports. Session Manager communicates with the instances via the SSM Agent across an encrypted tunnel that originates on the instance, and does not require a bastion host.

Access Control – You use IAM policies and users to control access to your instances, and don’t need to distribute SSH keys. You can limit access to a desired time/maintenance window by using IAM’s Date Condition Operators.

Auditability – Commands and responses can be logged to Amazon CloudWatch and to an S3 bucket. You can arrange to receive an SNS notification when a new session is started.

Interactivity – Commands are executed synchronously in a full interactive bash (Linux) or PowerShell (Windows) environment

Programming and Scripting – In addition to the console access that I will show you in a moment, you can also initiate sessions from the command line (aws ssm ...) or via the Session Manager APIs.

The SSM Agent running on the EC2 instances must be able to connect to Session Manager’s public endpoint. You can also set up a PrivateLink connection to allow instances running in private VPCs (without Internet access or a public IP address) to connect to Session Manager.

Session Manager in Action
In order to use Session Manager to access my EC2 instances, the instances must be running the latest version (2.3.12 or above) of the SSM Agent. The instance role for the instances must reference a policy that allows access to the appropriate services; you can create your own or use AmazonEC2RoleForSSM. Here are my EC2 instances (sk1 and sk2 are running Amazon Linux; sk3-win and sk4-win are running Microsoft Windows):

Before I run my first command, I open AWS Systems Manager and click Preferences. Since I want to log my commands, I enter the name of my S3 bucket and my CloudWatch log group. If I enter either or both values, the instance policy must also grant access to them:

I’m ready to roll! I click Sessions, see that I have no active sessions, and click Start session to move ahead:

I select a Linux instance (sk1), and click Start session again:

The session opens up immediately:

I can do the same for one of my Windows instances:

The log streams are visible in CloudWatch:

Each stream contains the content of a single session:

In the Works
As usual, we have some additional features in the works for Session Manager. Here’s a sneak peek:

SSH Client – You will be able to create SSH sessions atop Session Manager without opening up any inbound ports.

On-Premises Access – We plan to give you the ability to access your on-premises instances (which must be running the SSM Agent) via Session Manager.

Available Now
Session Manager is available in all AWS regions (including AWS GovCloud) at no extra charge.


Compute Abstractions on AWS: A Visual Story

Post Syndicated from Massimo Re Ferre original https://aws.amazon.com/blogs/architecture/compute-abstractions-on-aws-a-visual-story/

When I joined AWS last year, I wanted to find a way to explain, in the easiest way possible, all the options it offers to users from a compute perspective. There are many ways to peel this onion, but I want to share a “visual story” that I have created.

I define the compute domain as “anything that has CPU and Memory capacity that allows you to run an arbitrary piece of code written in a specific programming language.” Your mileage may vary in how you define it, but this is broad enough that it should cover a lot of different interpretations.

A key part of my story is around the introduction of different levels of compute abstractions this industry has witnessed in the last 20 or so years.

Separation of duties

The start of my story is a line. In a cloud environment, this line defines the perimeter between the consumer role and the provider role. In the cloud, there are things that AWS will do and things that the consumer will do. The perimeter of these responsibilities varies depending on the services you opt to use. If you want to understand more about this concept, read the AWS Shared Responsibility Model documentation.

The different abstraction levels

The reason why the line above is oblique is because it needs to intercept different compute abstraction levels. If you think about what happened in the last 20 years of IT, we have seen a surge of different compute abstractions that changed the way people consume CPU and Memory resources. It all started with physical (x86) servers back in the 80s, and then we have seen the industry adding abstraction layers over the years (for example, hypervisors, containers, functions).

The higher you go in the abstraction levels, the more the cloud provider can add value and can offload the consumer from non-strategic activities. A lot of these activities tend to be “undifferentiated heavy lifting.” We define this as something that AWS customers have to do but that don’t necessarily differentiate them from their competitors (because those activities are table-stakes in that particular industry).

What we found is that supporting millions of customers on AWS requires a certain degree of flexibility in the services we offer because there are many different patterns, use cases, and requirements to satisfy. Giving our customers choices is something AWS always strives for.

A couple of final notes before we dig deeper. The way this story builds up through the blog post is aligned to the progression of the launch dates of the various services, with a few noted exceptions. Also, the services mentioned are all generally available and production-grade. For full transparency, the integration among some of them may still be work-in-progress, which I’ll call out explicitly as we go.

The instance (or virtual machine) abstraction

This is the very first abstraction we introduced on AWS back in 2006. Amazon Elastic Compute Cloud (Amazon EC2) is the service that allows AWS customers to launch instances in the cloud. When customers intercept us at this level, they retain responsibility of the guest operating system and above (middleware, applications, etc.) and their lifecycle. AWS has the responsibility for managing the hardware and the hypervisor including their lifecycle.

At the very same level of the stack there is also Amazon Lightsail, which “is the easiest way to get started with AWS for developers, small businesses, students, and other users who need a simple virtual private server (VPS) solution. Lightsail provides developers compute, storage, and networking capacity and capabilities to deploy and manage websites and web applications in the cloud.”

And this is how these two services appear in our story:

The container abstraction

With the rise of microservices, a new abstraction took the industry by storm in the last few years: containers. Containers are not a new technology, but the rise of Docker a few years ago democratized access. You can think of a container as a self-contained environment with soft boundaries that includes both your own application as well as the software dependencies to run it. Whereas an instance (or VM) virtualizes a piece of hardware so that you can run dedicated operating systems, a container technology virtualizes an operating system so that you can run separated applications with different (and often incompatible) software dependencies.

And now the tricky part. Modern containers-based solutions are usually implemented in two main logical pieces:

  • A containers control plane that is responsible for exposing the API and interfaces to define, deploy, and lifecycle containers. This is also sometimes referred to as the container orchestration layer.
  • A containers data plane that is responsible for providing capacity (as in CPU/Memory/Network/Storage) so that those containers can actually run and connect to a network. From a practical perspective this is typically a Linux host or less often a Windows host where the containers get started and wired to the network.

Arguably, in a specific compute abstraction discussion, the data plane is key, but it is as important to understand what’s happening for the control plane piece.

In 2014, Amazon launched a production-grade containers control plane called Amazon Elastic Container Service (ECS), which “is a highly scalable, high performance container management service that supports Docker … Amazon ECS eliminates the need for you to install, operate, and scale your own cluster management infrastructure.”

In 2017, Amazon also announced the intention to release a new service called Amazon Elastic Container Service for Kubernetes (EKS) based on Kubernetes, a successful open source containers control plane technology. Amazon EKS was made generally available in early June 2018.

Just like for ECS, the aim for this service is to free AWS customers from having to manage a containers control plane. In the past, AWS customers would spin up EC2 instances and deploy/manage their own Kubernetes masters (masters is the name of the Kubernetes hosts running the control plane) on top of an EC2 abstraction. However, we believe many AWS customers will leave to AWS the burden of managing this layer by either consuming ECS or EKS, depending on their use cases. A comparison between ECS and EKS is beyond the scope of this blog post.

You may have noticed that what we have discussed so far is about the container control plane. How about the containers data plane? This is typically a fleet of EC2 instances managed by the customer. In this particular setup, the containers control plane is managed by AWS while the containers data plane is managed by the customer. One could argue that, with ECS and EKS, we have raised the abstraction level for the control plane, but we have not yet really raised the abstraction level for the data plane as the data plane is still comprised of regular EC2 instances that the customer has responsibility for.

There is more on that later on but, for now, this is how the containers control plane and the containers data plane services appear:

The function abstraction

At re:Invent 2014, AWS introduced another abstraction layer: AWS Lambda. Lambda is an execution environment that allows an AWS customer to run a single function. So instead of having to manage and run a full-blown OS instance to run your code, or having to track all software dependencies in a user-built container to run your code, Lambda allows you to upload your code and let AWS figure out how to run it at scale.

What makes Lambda so special is its event-driven model. Not only can you invoke Lambda directly (for example, via the Amazon API Gateway), but you can trigger a Lambda function upon an event in another AWS service (for example, an upload to Amazon S3 or a change in an Amazon DynamoDB table).

The key point about Lambda is that you don’t have to manage the infrastructure underneath the function you are running. No need to track the status of the physical hosts, no need to track the capacity of the fleet, no need to patch the OS where the function will be running. In a nutshell, no need to spend time and money on the undifferentiated heavy lifting.

And this is how the Lambda service appears:

The bare metal abstraction

Also known as the “no abstraction.”

As recently as re:Invent 2017, we announced (the preview of) the Amazon EC2 bare metal instances. We made this service generally available to the public in May 2018.

This announcement is part of Amazon’s strategy to provide choice to our customers. In this case, we are giving customers direct access to hardware. To quote from Jeff Barr’s post:

“…. (AWS customers) wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.”

This is how the bare metal Amazon EC2 i3.metal instance appears:

As a side note, and also as alluded to by Jeff, i3.metal is the foundational EC2 instance type on top of which VMware created their own VMware Cloud on AWS service. We are now offering the ability to any AWS user to provision bare metal instances. This doesn’t necessarily mean you can load your hypervisor of choice out of the box, but you can certainly do things you wouldn’t be able to do with a traditional EC2 instance (note: this was just a Saturday afternoon hack).

More seriously, a question I get often asked is whether users could install ESXi on i3.metal on their own. Today this cannot be done, but I’d be interested in hearing your use case for this.

The full container abstraction (for lack of a better term)

Now that we covered all the abstractions, it is time to go back and see if there are other optimizations we can provide for AWS customers. When we discussed the container abstraction, we called out that while there are two different fully managed containers control planes (ECS and EKS), there wasn’t a managed option for the data plane.

Some customers were (and still are) happy about being in full control of said instances. Others have been very vocal that they wanted to get out of the (undifferentiated heavy-lifting) business of managing the lifecycle of that piece of infrastructure.

Enter AWS Fargate, a production-grade service that provides compute capacity to AWS containers control planes. Practically speaking, Fargate is making the containers data plane fall into the “Provider space” responsibility. This means the compute unit exposed to the user is the container abstraction, while AWS will manage transparently the data plane abstractions underneath.

This is how the Fargate service appears:

Now ECS has two “launch types”: one called “EC2” (where your tasks get deployed on a customer-managed fleet of EC2 instances), and the other one called “Fargate” (where your tasks get deployed on an AWS-managed fleet of EC2 instances).

For EKS, the strategy will be very similar, but as of this writing it was not yet available. If you’re interested in some of the exploration being done to make this happen, this is a good read.


We covered the spectrum of abstraction levels available on AWS and how AWS customers can intercept them depending on their use cases and where they sit on their cloud maturity journey. Customers with a “lift & shift” approach may be more akin to consume services on the left-hand side of the slide, whereas customers with a more mature cloud native approach may be more interested in consuming services on the right-hand side of the slide.

In general, customers tend to use higher-level services to get out of the business of managing non-differentiating activities. For example, I recently talked to a customer interested in using Fargate. The trigger there was the fact that Fargate is ISO, PCI, SOC and HIPAA compliant, which was a huge time and money saver for them because it’s easier to point to an AWS document during an audit than having to architect and document for compliance the configuration of a DIY containers data plane.

As a recap, here’s our visual story with all the abstractions available:

I hope you found it useful. Any feedback is greatly appreciated.

About the author

Massimo is a Principal Solutions Architect at AWS. For about 25 years, he specialized on the x86 ecosystem starting with operating systems and virtualization technologies, and lately he has been head down learning about cloud and how application architectures are evolving in that space. Massimo has a blog at www.it20.info and his Twitter handle is @mreferre.

Run your Kubernetes Workloads on Amazon EC2 Spot Instances with Amazon EKS

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/run-your-kubernetes-workloads-on-amazon-ec2-spot-instances-with-amazon-eks/

Contributed by Madhuri Peri, Sr. EC2 Spot Specialist SA, and Shawn OConnor, AWS Enterprise Solutions Architect

Many organizations today are using containers to package source code and dependencies into lightweight, immutable artifacts that can be deployed reliably to any environment.

Kubernetes (K8s) is an open-source framework for automated scheduling and management of containerized workloads. In addition to master nodes, a K8s cluster is made up of worker nodes where containers are scheduled and run.

Amazon Elastic Container Service for Kubernetes (Amazon EKS) is a managed service that removes the need to manage the installation, scaling, or administration of master nodes and the etcd distributed key-value store. It provides a highly available and secure K8s control plane.

This post demonstrates how to use Spot Instances as K8s worker nodes, and shows the areas of provisioning, automatic scaling, and handling interruptions (termination) of K8s worker nodes across your cluster.

What this post does not cover

This post focuses primarily on EC2 instance scaling. This post also assumes a default interruption mode of terminate for EC2 instances, though there are other interruption types, stop and hibernate. For stateless K8s sessions, I recommend choosing the interruption mode of terminate.

Spot Instances

Amazon EC2 Spot Instances are spare EC2 capacity that offer discounts of 70-90% over On-Demand prices. The Spot price is determined by term trends in supply and demand and the amount of On-Demand capacity on a particular instance size, family, Availability Zone, and AWS Region.

If the available On-Demand capacity of a particular instance type is depleted, the Spot Instance is sent an interruption notice two minutes ahead to gracefully wrap up things. I recommend a diversified fleet of instances, with multiple instance types created by Spot Fleets or EC2 Fleets.

You can use Spot Instances for various fault-tolerant and flexible applications. In a workload that uses container orchestration and management platforms like EKS or Amazon Elastic Container Service (Amazon ECS), the schedulers have built-in mechanisms to identify any pods or containers on these interrupted EC2 instances. The interrupted pods or containers are then replaced on other EC2 instances in the cluster.

Solution architecture

There are three goals to accomplish with this solution:

  1.  The cluster must scale automatically to match the demands of an application.
  2. Optimize for cost by using Spot Instances.
  3. The cluster must be resilient to Spot Instance interruptions.

These goals are accomplished with the following components:

Solution component Role in solution Code Deployment
Cluster Autoscaler Scales EC2 instances in or out Open source K8s pod DaemonSet on On-Demand Instances
Auto Scaling group Provisions Spot or On-Demand Instances AWS Via CloudFormation
Spot Instance interrupt handler Sets K8s nodes to drain state, when the Spot Instance is interrupted Open source K8s pod DaemonSet on all K8s nodes with the label lifecycle=EC2Spot

Here’s a diagram of the solution architecture.

There are a few important things to note in this architecture:

  • Cluster Autoscaler is being used to control all scaling activities, with changes to the MinSize and DesiredCapacity parameters of the Auto Scaling group. This separation of duties ensures that there are no race conditions.
  • The Auto Scaling groups are used purely to replace any lost instances automatically (for example, terminations or interruptions) and maintain the desired number of instances. There are no scaling policies attached to the groups.
  • Auto Scaling, at the time of this post, supports a single instance type. As noted by Jeff Barr’s post EC2 Fleet – Manage Thousands of On-Demand and Spot Instances with One Request, in H2 2018, Auto Scaling groups will support mixed instance types. At that point, multiple groups will not be required, and can collapse into a single group specifying all instance types.

Here’s a further breakdown on the components.

Cluster Autoscaler

Automatic scaling in K8s comes in two forms:

  • Horizontal Pod Autoscaler scales the pods in a deployment or replica set. It is implemented as a K8s API resource and a controller. The controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition. It obtains the metrics from either the resource metrics API (for per-pod resource metrics), or the custom metrics API (for all other metrics).
  • Cluster Autoscaler scales the worker nodes available for pods to be placed. Cluster Autoscaler is the focus for this post.

Cluster Autoscaler is the default K8s component that can be used to perform pod scaling as well as scaling nodes in a cluster. It automatically increases the size of an Auto Scaling group so that pods have a place to run. And it attempts to remove idle nodes, that is, nodes with no running pods.

When a pod cannot be scheduled due to lack of available resources, Cluster Autoscaler determines that the cluster must scale up. Expander interfaces allow you to apply different pod placement strategies. Currently, the following strategies are supported:

  • Random – Randomly select an available node group.
  • Most Pods – Selects the group that can schedule the largest quantity of nodes. This can be used balance the load across groups of nodes.
  • Least Waste – This is commonly referred to as ‘bin packing.’ It selects the node-group with the least available tied resource (CPU or memory). This helps to reduce the total node footprint, and is the strategy used in this post.

Although Cluster Autoscaler is the de facto standard for automatic scaling in K8s, it is not part of the main release. Deploy it like any other pod in the kube-system namespace, like other management pods. Those management pods would prevent the cluster from scaling down. Override this default behavior by passing in the –-skip-nodes-with-system-pods=false flag.

But how do you reliably control scale-down operations so that you do not remove the pods that you need? This is accomplished using a pod disruption budget (PDB). A PDB limits the number of replicated pods that can be down at a given time. Create a PDB to ensure that you always have at least one Cluster Autoscaler pod running

In summary, Cluster Autoscaler does not remove nodes under the following scenarios:

  • Pods with a restrictive PDB.
  • Pods running in the kube-system namespace that are deployed (that is, not run on the node by default or which do not have a PDB).
  • Pods not backed by a controller object (not created by a deployment, replica set, job, stateful-set, and so on).
  • Pods running with local storage.
  • Pods running that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, matching anti-affinity, and so on).

Auto Scaling Group

With Spot Instances, each instance type in each Availability Zone is a pool with its own Spot price based on the available capacity. A recommended best practice when working with Spot Instances is to use a diversified fleet of instances with multiple instance types, as created by Spot Fleet or EC2 Fleet. These APIs aim to fulfill the specified TargetCapacity across the instance types to launch the number of Spot Instances and optionally, On-Demand Instances.

Unfortunately, Cluster Autoscaler does not support Spot Fleets at this time. You need a different strategy to provide diversification. Cluster Autoscaler for AWS provides integration with Auto Scaling groups. It enables users to choose from four different options of deployment:

  • One Auto Scaling group
  • Multiple Auto Scaling groups
  • Auto-Discovery
  • Master Node setup

For this post, you use the Multi-ASG deployment option. For Cluster Autoscaler and other cluster administration and management pods that run on EKS worker nodes, create a small Auto Scaling group using On-Demand Instances. This ensures that the health of the cluster is not impacted by Spot interruptions.

In K8s, label selectors are used to control where pods are placed. Use the K8s node label selector to place the appropriate pods on Spot or On-Demand Instances.

Interrupt handler

The last component to consider handles how the cluster responds to the interruption of a Spot Instance. The workflow can be summarized as:

  • Identify that a Spot Instance is being reclaimed.
  • Use the 2-minute notification window to gracefully prepare the node for termination.
  • Taint the node and cordon it off to prevent new pods from being placed.
  • Drain connections on the running pods.
  • To maintain desired capacity, replace the pods on remaining nodes.

Spot interruptions are reported in the following ways:

For this post, you use a K8s DaemonSet, which means running one pod per node. The pod periodically polls the EC2 metadata service for a Spot termination notice. If a termination notice is received (HTTP status 200), then it tries to gracefully stop and restart on other nodes before the 2-minute grace period expires. This approach is based on an existing project at the kube-spot-termination-notice-handler GitHub repo.


Here’s the suggested workflow for this solution:

  1. Provision the worker nodes with EC2 instances using CloudFormation templates.
  2. Deploy the K8s Cluster Autoscaler pods as a DaemonSet, with a PDB.
  3. Deploy the Spot Instance interrupt handler pods as a DaemonSet.
  4. Deploy the sample application


You should have the following resources or configurations before starting this walkthrough:

  • An EKS cluster master endpoint
  • An EKS service role ARN
  • Subnet IDs and the control plane security group values
  • EKS master cluster certificates
  • Configuration of kubectl against the master EKS endpoint

For more information, see Amazon EKS – Now Generally Available and Deploy a Kubernetes Application with Amazon Elastic Container Service for Kubernetes.

When you describe the EKS cluster, you get a response like the following sample output:

    "cluster": {
        "name": " DemoSpotClusterScale",
        "arn": "arn:aws:eks:us-west-2: 0123456789012:cluster/ DemoSpotClusterScale",
        "createdAt": 1528317531.751,
        "version": "1.10",
        "endpoint": "https://B960845ED5E21A3439ABB5E12F09CE88.sk1.us-west-2.eks.amazonaws.com",
        "roleArn": "arn:aws:iam::0123456789012:role/eksServiceRoleGA",
        "resourcesVpcConfig": {
            "subnetIds": [
            "securityGroupIds": [
            "vpcId": "vpc-c7c8c4be"
        "status": "ACTIVE",
        "certificateAuthority": {
            "data": "<Your ca data here>"

I use the cluster name DemoSpotClusterScale throughout this post. Replace that with your cluster name in the following commands.

Get started

git clone https://github.com/awslabs/ec2-spot-labs.git

cd ec2-spot-labs/ec2-spot-eks-solution

Provision the worker nodes

Add worker nodes to your cluster so that you can deploy your applications. Worker nodes can be either Spot or On-Demand Instances. In this example, use Spot Instances for worker nodes.

You can use this customized AWS CloudFormation template to create the Auto Scaling groups described earlier. This template also labels the node with a lifecycle key value indicating whether it is an On-Demand or Spot Instance node.

The template deploys Auto Scaling groups dedicated to the following instance types:

  • Spot Instances, m4.large, across three Availability Zones.
  • Spot Instances, t2.medium, across three Availability Zones.
  • On-Demand Instances, across three Availability Zones.

Make sure that you apply the aws-auth-cm.yaml file with the appropriate NodeInstanceRole value, as provisioned by the CloudFormation template. Find this parameter on the Resources tab.

kubectl apply -f aws-auth-cm.yaml

If the kubectl get nodes command worked as documented, then you are ready to proceed to the next section

Deploying Cluster Autoscaler and PDB

  1. Download the manifest file cluster-autoscaler-ds.yaml. There are six K8s resources that enable the cluster-autoscaler add-on to work in the EKS environment:
    • Service account
    • Cluster role
    • Role
    • Cluster role binding
    • Role binding
    • Two Auto Scaling groups created by the CloudFormation template for Spot and On-Demand Instances

    You also see the cluster-autoscaler command with configured parameters.

  2. Edit the cluster-autoscaler-ds.yaml file to replace the [OD-NodeGroup-Name], [Spot-NodeGroup1-Name], [Spot-NodeGroup2-Name] sections in lines 141-143 with the resources created in your worker node cloudformation template as shown in screenshot above. Deploy the cluster-autoscaler-ds.yaml manifest
    $ kubectl create -f cluster-autoscaler/cluster-autoscaler-ds.yaml

  3. Monitor the deployment:
    $ kubectl logs cluster-autoscaler-<podgeneratedID> --namespace=kube-system

  4. Download and deploy the Cluster Autoscaler PDB:
    $ kubectl create -f cluster-autoscaler/cluster-autoscaler-pdb.yaml

Deploy the Spot Instance interrupt handler

Each K8s EC2 node being launched must have the lifecycle=Ec2Spot value for -node-label, as in the following example. This line is an excerpt from the CloudFormation template:

“sed -i s,MAX_PODS,”, !Join [ “”, [ “‘”, { “Fn::FindInMap”: [ MaxPodsPerNode, { Ref: SpotNode2InstanceType }, MaxPods ] }, ” –node-labels “, “lifecycle=Ec2Spot” , “‘” ] ], “,g /etc/systemd/system/kubelet.service”, “\n”,

The Docker image contains the instance metadata poll script, as shown in entrypoint.sh. Publish this image to your repository. In the following screenshot, I used my ECR repository. A sample image is available on Docker Hub.

Deploy the Spot interrupt handler pod using spec. This sets up the DaemonSet only on the instances that have a K8s label of lifecycle=Ec2Spot.

kubectl apply -f spot-termination-handler/deploy-k8-pod/spot-interrupt-handler.yaml

When the Spot Instance is interrupted, this pod catches the interruption and vacates the pods.

Deploy the sample application and test out scaling up & down

Deploy a sample application with three replicas. Create a new manifest file named greeter-sample.yaml from the code below, or download it from here

You are using node affinity to prefer deployment on Spot Instances. If the Ec2Spot label is unavailable, the manifest file allows the application to run elsewhere

$ kubectl create -f sample/greeter-sample.yaml

Scale up, and watch Cluster Autoscaler manage the Auto Scaling groups. Verify that Cluster Autoscaler is working by scaling up the sample service beyond the current limits of the cluster.

$ kubectl scale --replicas=50 deployment/greeter-sample

Check the AWS Management Console to confirm that the Auto Scaling groups are scaling up to meet demand. This may take a few minutes. You can also follow along with the pod deployment from the command line. You should see the pods transition from pending to running as nodes are scaled up.

$ kubectl get pods -o wide --watch

Scale down, and watch Cluster Autoscaler manage the Auto Scaling groups:

$ kubectl scale --replicas=1 deployment/greeter

Check the K8s logs to watch the terminations occur:

$ kubectl logs deployment/cluster-autoscaler-<podgeneratedID> –namespace=kube-system


In this post, I showed you how to use Spot Instances with K8s workloads, by provisioning, scaling, and managing terminations effectively in EKS clusters to leverage both cost and scale optimizations. Happy coding!

Running GPU-Accelerated Kubernetes Workloads on P3 and P2 EC2 Instances with Amazon EKS

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/running-gpu-accelerated-kubernetes-workloads-on-p3-and-p2-ec2-instances-with-amazon-eks/

This post contributed by Scott Malkie, AWS Solutions Architect

Amazon EC2 P3 and P2 instances, featuring NVIDIA GPUs, power some of the most computationally advanced workloads today, including machine learning (ML), high performance computing (HPC), financial analytics, and video transcoding. Now Amazon Elastic Container Service for Kubernetes (Amazon EKS) supports P3 and P2 instances, making it easy to deploy, manage, and scale GPU-based containerized applications.

This blog post walks through how to start up GPU-powered worker nodes and connect them to an existing Amazon EKS cluster. Then it demonstrates an example application to show how containers can take advantage of all that GPU power!


You need an existing Amazon EKS cluster, kubectl, and the aws-iam-authenticator set up according to Getting Started with Amazon EKS.

Two steps are required to enable GPU workloads. First, join Amazon EC2 P3 or P2 GPU compute instances as worker nodes to the Kubernetes cluster. Second, configure pods to enable container-level access to the node’s GPUs.

Spinning up Amazon EC2 GPU instances and joining them to an existing Amazon EKS Cluster

To start the worker nodes, use the standard AWS CloudFormation template for Amazon EKS worker nodes, specifying the AMI ID of the new Amazon EKS-optimized AMI for GPU workloads. This AMI is available on AWS Marketplace.

Subscribe to the AMI and then launch it using the AWS CloudFormation template. The template takes care of networking, configuring kubelets, and placing your worker nodes into an Auto Scaling group, as shown in the following image.

This template creates an Auto Scaling group with up to two p3.8xlarge Amazon EC2 GPU instances. Powered by up to eight NVIDIA Tesla V100 GPUs, these instances deliver up to 1 petaflop of mixed-precision performance per instance to significantly accelerate ML and HPC applications. Amazon EC2 P3 instances have been proven to reduce ML training times from days to hours and to reduce time-to-results for HPC.

After the AWS CloudFormation template completes, the Outputs view contains the NodeInstanceRole parameter, as shown in the following image.

NodeInstanceRole needs to be passed in to the AWS Authenticator ConfigMap, as documented in the AWS EKS Getting Started Guide. To do so, edit the ConfigMap template and run the command kubectl apply -f aws-auth-cm.yaml in your terminal to apply the ConfigMap. You can then run kubectl get nodes —watch to watch the two Amazon EC2 GPU instances join the cluster, as shown in the following image.

Configuring Kubernetes pods to access GPU resources

First, use the following command to apply the NVIDIA Kubernetes device plugin as a daemon set on the cluster.

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.10/nvidia-device-plugin.yml

This command produces the following output:

Once the daemon set is running on the GPU-powered worker nodes, use the following command to verify that each node has allocatable GPUs.

kubectl get nodes \

The following output shows that each node has four GPUs available:

Next, modify any Kubernetes pod manifests, such as the following one, to take advantage of these GPUs. In general, adding the resources configuration (resources: limits:) to pod manifests gives containers access to one GPU. A pod can have access to all of the GPUs available to the node that it’s running on.

apiVersion: v1
kind: Pod
  name: pod-name
  - name: container-name
        nvidia.com/gpu: 4

As a more specific example, the following sample manifest displays the results of the nvidia-smi binary, which shows diagnostic information about all GPUs visible to the container.

apiVersion: v1
kind: Pod
  name: nvidia-smi
  restartPolicy: OnFailure
  - name: nvidia-smi
    image: nvidia/cuda:latest
    - "nvidia-smi"
        nvidia.com/gpu: 4

Download this manifest as nvidia-smi-pod.yaml and launch it with kubectl apply -f nvidia-smi-pod.yaml.

To confirm successful nvidia-smi execution, use the following command to examine the log.

kubectl logs nvidia-smi

The above commands produce the following output:

Existing limitations

  • GPUs cannot be overprovisioned – containers and pods cannot share GPUs
  • The maximum number of GPUs that you can schedule to a pod is capped by the number of GPUs available to that pod’s node
  • Depending on your account, you might have Amazon EC2 service limits on how many and which type of Amazon EC2 GPU compute instances you can launch simultaneously

For more information about GPU support in Kubernetes, see the Kubernetes documentation. For more information about using Amazon EKS, see the Amazon EKS documentation. Guidance setting up and running Amazon EKS can be found in the AWS Workshop for Kubernetes on GitHub.

Please leave any comments about this post and share what you’re working on. I can’t wait to see what you build with GPU-powered workloads on Amazon EKS!

New T3 Instances – Burstable, Cost-Effective Performance

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-t3-instances-burstable-cost-effective-performance/

We launched the t1.micro instance type in 2010, and followed up with the first of the T2 instances (micro, small, and medium) in 2014, more sizes in 2015 (nano) and 2016 (xlarge and 2xlarge), and unlimited bursting last year.

Today we are launching T3 instances in twelve regions. These general-purpose instances are even more cost-effective than the T2 instances, with On-Demand prices that start at $0.0052 per hour ($3.796 per month). If you have workloads that currently run on M4 or M5 instances but don’t need sustained compute power, consider moving them to T3 instances. You can host the workloads at a very low cost while still having access to sustainable high performance when needed (unlimited bursting is enabled by default, making these instances even easier to use than their predecessors).

As is the case with the T1 and the T2, you get a generous and assured baseline amount of processing power and the ability to transparently scale up to full core performance when you need more processing power, for as long as necessary. The instances are powered by 2.5 GHz Intel® Xeon® Scalable (Skylake) Processors featuring the new Intel® AVX-512 instructions and you can launch them today in seven sizes:

Name vCPUs Baseline Performance / vCPU
Price / Hour
Price / Hour
t3.nano 2 5% 0.5 GiB $0.0052 $0.0980
t3.micro 2 10% 1 GiB $0.0104 $0.0196
t3.small 2 20% 2 GiB $0.0209 $0.0393
t3.medium 2 20% 4 GiB $0.0418 $0.0602
t3.large 2 30% 8 GiB $0.0835 $0.1111
t3.xlarge 4 40% 16 GiB $0.1670 $0.2406
t3.2xlarge 8 40% 32 GiB $0.3341 $0.4813

The column labeled Baseline Performance indicates the percentage of a single hyperthread’s processing power that is allocated to the instance. Instances accumulate credits when idle and consume them when running, with credits stored for up to 7 days. If the average CPU utilization of a T3 instance is lower than the baseline over a 24 hour period, the hourly instance price covers all spikes in usage. If the instance runs at higher CPU utilization for a prolonged period, there will be an additional charge of $0.05 per vCPU-hour. This billing model means that you can choose the T3 instance that has enough memory and vCPUs for your needs, and then count on the bursting to deliver all necessary CPU cycles. This is a unique form of vertical scaling that could very well enable some new types of applications.

T3 instances are powered by the Nitro system. In addition to CPU bursting, they support network and EBS bursting, giving you access to additional throughput when you need it. Network traffic can burst to 5 Gbps for all instance sizes; EBS bursting ranges from 1.5 Gbps to 2.05 Gbps depending on the size of the instance, with corresponding bursts for EBS IOPS.

Like all of our most recent instance types, the T3 instances are HVM only, and must be launched within a Virtual Private Cloud (VPC) using an AMI that includes the Elastic Network Adapter (ENA) driver.

Now Available
The T3 instances are available in the US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Canada (Central), Europe (Ireland), Europe (London), Europe (Frankfurt), South America (São Paulo), Asia Pacific (Tokyo), Asia Pacific (Singapore), and Asia Pacific (Sydney) Regions.


Running Hyper-V on Amazon EC2 Bare Metal Instances

Post Syndicated from Martin Yip original https://aws.amazon.com/blogs/compute/running-hyper-v-on-amazon-ec2-bare-metal-instances/

AWS recently announced the general availability of Amazon EC2 bare metal Instances. This post provides an overview of launching, setting up, and configuring a Hyper-V enabled host, launching a guest virtual machine (VM) within Hyper-V running on i3.metal.


The key elements of this process include the following steps:

  1. Launch a Windows Server 2016 with Hyper-V AMI provided by Amazon.
  2. Configure Hyper-V networking.
  3. Launch a Hyper-V guest VM.

Launch a Windows Server 2016 with Hyper-V AMI provided by Amazon

1. Open the EC2 console.
2. Choose Public Images and search for the Amazon Hyper-V AMIs.
3. Select your preferred Hyper-V AMI, and choose Launch.
4. Follow the Launch wizard process to launch the instance on i3.metal.

The Amazon Hyper-V AMIs have the Hyper-V role pre-enabled. You can also launch a Windows Server 2016 Base AMI to i3.metal, and enable the Hyper-V role for your use case.

Configure Hyper-V networking

To enable networking for your Hyper-V guests—so they can have connectivity to other resources in your VPC, or to the internet via your VPC internet gateway, ensure that you have first configured your VPC. For more information, see Creating and Attaching an Internet Gateway.

Hyper-V provides three types of virtual switches for networking:

  • External
  • Internal
  • Private

In this solution, you are creating an internal virtual switch and using the Hyper-V host as the NAT server for the guest VMs, similar to Microsoft’s topic Set up a NAT network.

You can specify your own virtual network range. For this example, use as the range for the virtual network inside the Hyper-V host.

  1. Run the following PowerShell command to create the internal virtual switch:
    New-VMSwitch -SwitchName "Hyper-VSwitch" -SwitchType Internal
  2. Determine which network interface is associated with the virtual switch. For this solution, the Get-NetAdapter command shows that the Hyper-V virtual switch has an ifIndex value of 12.
  3. Configure the Hyper-V Virtual Ethernet adapter with the NAT gateway IP address. This IP address is used as default gateway (Router IP) for the guest VMs. The following command sets the IP address with a subnet mask on the Interface (InterfaceIndex 12):
    New-NetIPAddress -IPAddress -PrefixLength 24 -InterfaceIndex 12
  4. Create a NAT virtual network using the range of
    New-NetNat -Name MyNATnetwork -InternalIPInterfaceAddressPrefix

Now the environment is ready for the guest VMs to have outbound communication with other resources through the host NAT. For each VM, assign an IP address with the default gateway ( This can be done manually within each guest VM. In this solution, you make it easier by enabling a DHCP server within the Hyper-V host to automatically assign IP addresses.

Setting up DHCP server role on the host

  1. Run the following command to add the DHCP role to the host:
    Install-WindowsFeature -Name 'DHCP' -IncludeManagementTools
  2. To configure the DHCP server to bind on the Hyper-V virtual interface, choose Control Panel, Administrative Tools, DHCP.
  3. Select this computer, add or remove bindings, and then select the IP address corresponding to Hyper-V virtual interface (that is,
  4. Configure the DHCP scope and specify a range from the subnet that you determined earlier. In this example, use
    Add-DhcpServerv4Scope -Name GuestIPRange -StartRange -EndRange -SubnetMask -State Active

    You should be able to see the range in the DHCP console, as in the following screenshot:

  5. For Router, choose the NAT gateway IP address assigned it to the Hyper-V network adapter (
  6. For DNS server, use the Amazon DNS, which is the second IP address for the VPC (

Launch a Hyper-V guest VM

For this post, follow the new VM wizard to create an Ubuntu 18.04 LTS guest VM. First, download the Ubuntu installation ISO from the Ubuntu website to your Hyper-V host, and store it on a secondary EBS volume that you added as the D: drive.

I3.metal instances use Amazon EBS and instance store volumes with the NVM Express (NVMe) interface. When you stop an I3.metal instance, any data stored on instance store volumes is gone. I recommend storing your guest VM’s hard drive (vhd or vhdx) on an EBS volume that is attached to your I3.Metal instance. This can be the root volume (C:) or any additional EBS volumes attached to the instance. For more information, see What’s the difference between instance store and EBS?

After that is complete, follow these steps:

  1. In Hyper-V Manager, choose Actions, New, Virtual Machine.
  2. Follow the wizard with your desired configuration up to the Configure Networking section.
  3. In the Configure Networking step, for Connection, choose Hyper-V Switch, and choose Next.
  4. In the Connect Virtual Hard Disk step, enter a name for the virtual hard disk. Use the default location C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks\.
  5. Specify the size of the virtual hard disk, and choose Next.
  6. In the Installation Options step, choose the Ubuntu ISO that you downloaded earlier.
  7. Finish the wizard and start the VM, then follow the steps on the Ubuntu installation wizard. As you have already set up DHCP and NAT for the Hyper-V network, the Ubuntu VM automatically gets an IP address from the DHCP scope that you defined earlier.
  8. Confirm the connectivity of the VM to the internet


You’ve just built a Hyper-V host on an EC2 bare metal instance. Now you’re ready to add more guest VMs and put them to work!

Migrating a multi-tier application from a Microsoft Hyper-V environment using AWS SMS and AWS Migration Hub

Post Syndicated from Martin Yip original https://aws.amazon.com/blogs/compute/migrating-a-multi-tier-application-from-a-microsoft-hyper-v-environment-using-aws-sms-and-aws-migration-hub/

Shane Baldacchino is a Solutions Architect at Amazon Web Services

Many customers ask for guidance to migrate end-to-end solutions running in their on-premises data center to AWS. This post provides an overview of moving a common blogging platform, WordPress, running on an on-premises virtualized Microsoft Hyper-V platform to AWS, including re-pointing the DNS records associated to the website.

AWS Server Migration Service (AWS SMS) is an agentless service that makes it easier and faster for you to migrate thousands of on-premises workloads to AWS. In November 2017, AWS added support for Microsoft’s Hyper-V hypervisor. AWS SMS allows you to automate, schedule, and track incremental replications of live server volumes, making it easier for you to coordinate large-scale server migrations. In this post, I guide you through migrating your multi-tier workloads using both AWS SMS and AWS Migration Hub.

Migration Hub provides a single location to track the progress of application migrations across multiple AWS and partner solutions. In this post, you use AWS SMS as a mechanism to migrate the virtual machines (VMs) and track them via Migration Hub. You can also use other third-party tools in Migration Hub, and choose the migration tools that best fit your needs. Migration Hub allows you to get progress updates across all migrations, identify and troubleshoot any issues, and reduce the overall time and effort spent on your migration projects.

Migration Hub and AWS SMS are both free. You pay only for the cost of the individual migration tools that you use, and any resources being consumed on AWS.


For this walkthrough, the WordPress blog is currently running as a two-tier stack in a corporate data center. The example environment is multi-tier and polyglot in nature. The frontend uses Windows Server 2016 (running IIS 10 with PHP as an ISAPI extension) and the backend is supported by a MySQL server running on Ubuntu 16.04 LTS. All systems are hosted on a virtualized platform. As the environment consists of multiple servers, you can use Migration Hub to group the servers together as an application and manage the holistic process of migrating the application.
The key elements of this migration process involve the following steps:

  1. Establish your AWS environment.
  2. Replicate your database.
  3. Download the SMS Connector from the AWS Management Console.
  4. Configure AWS SMS and Hyper-V permissions.
  5. Install and configure the SMS Connector appliance.
  6. Configure Hyper-V host permissions.
  7. Import your virtual machine inventory and create a replication job.
  8. Use AWS Migration Hub to track progress.
  9. Launch your Amazon EC2 instance.
  10. Change your DNS records to resolve the WordPress blog to your EC2 instance.

Before you start, ensure that your source systems OS and hypervisor version are supported by AWS SMS. For more information, see the Server Migration Service FAQ. This post focuses on the Microsoft Hyper-V hypervisor.

Establish your AWS environment

First, establish your AWS environment. If your organization is new to AWS, this may include account or subaccount creation, a new virtual private cloud (VPC), and associated subnets, route tables, internet gateways, and so on. Think of this phase as setting up your software-defined data center. For more information, see Getting Started with Amazon EC2 Linux Instances.

The blog is a two-tier stack, so go with two private subnets. Because you want it to be highly available, use multiple Availability Zones. An Availability Zone resides within an AWS Region. Each Availability Zone is isolated, but the zones within a Region are connected through low-latency links. This allows architects and solution designers to build highly available solutions.

Replicate your database

WordPress uses a MySQL relational database. You could continue to manage MySQL and the associated EC2 instances associated with maintaining and scaling a database. But for this walkthrough, I am using this opportunity to migrate to an RDS instance of Amazon Aurora, as it is a MySQL-compliant database. Not only is Amazon Aurora a high-performant database engine but it frees you up to focus on application development by managing time-consuming database administration tasks, including backups, software patching, monitoring, scaling, and replication.

Use AWS Database Migration Service (AWS DMS) to migrate your MySQL database to Amazon Aurora easily and securely. You can send the results from AWS DMS to Migration Hub. This allows you to create a single pane view of your application migration.

After a database migration instance has been instantiated, configure the source and destination endpoints and create a replication task.

By attaching to the MySQL binlog, you can seed in the current data in the database and also capture all future state changes in near–real time. For more information, see Migrating a MySQL-Compatible Database to Amazon Aurora.

Finally, the task shows that you are replicating current data in your WordPress blog database and future changes from MySQL into Amazon Aurora.

Download the SMS Connector from the AWS Management Console

Now, use AWS SMS to migrate your IIS/PHP frontend. AWS SMS is delivered as a virtual appliance that can be deployed in your Hyper-V environment.

To download the SMS Connector, log in to the console and choose Server Migration Service, Connectors, SMS Connector setup guide. Download the VHD file for SCVMM/Hyper-V.

Configure SMS

Your hypervisor and AWS SMS need an appropriate user with sufficient privileges to perform migrations:

Launch a new VM in Hyper-V based on the SMS Connector that you downloaded. To configure the connector, connect to it via HTTPS. You can obtain the SMS Connector IP address from within Hyper-V. By default, the SMS Connector uses DHCP to obtain a valid IP address.

Connect to the SMS Connector via HTTPS. In the example above, the connector IP address is In your browser, enter As the SMS Connector can only work with one hypervisor at a time, you must state the hypervisor with which to interface. For the purpose of this post, the examples use Microsoft Hyper-V.

Configure the connector with the IAM and hypervisor credentials that you created earlier.

After you have entered in both your AWS and Hyper-V credentials and the associated connectivity and authentication checks have passed, you are redirected to the home page of your SMS Connector. The home page provides you a status on connectivity and the health of the SMS Connector.

Configure Hyper-V host permissions

You also must modify your Hyper-V hosts to provide WinRM connectivity. AWS provides a downloadable PowerShell script to configure your Windows environment to support WinRM communications with the SMS Connector. The same script is used for configuring either standalone Hyper-V or SCVMM.

Execute the PowerShell script and follow the prompts. In the following example, Reconfigure Hyper-V not managed by SCVMM (Standalone Hyper-V)… was selected.

Import your virtual machine inventory and create a replication job

You have now configured the SMS Connector and your Microsoft Hyper-V hosts. Switch to the console to import your server catalog to AWS SMS. Within AWS SMS, choose Connectors, Import Server Catalog.

This process can take up to a few minutes and is dependent on the number of machines in your Hyper-V inventory.

Select the server to migrate and choose Create replication job. The console guides you through the process. The time that the initial replication task takes to complete is dependent on the available bandwidth and the size of your VM. After the initial seed replication, network bandwidth is minimized as AWS SMS replicates only incremental changes occurring on the VM.

Use Migration Hub to track progress

You have now successfully started your database migration via AWS DMS, set up your SMS Connector, configured your Microsoft Hyper-V environment, and started a replication job.

You can now track the collective progress of your application migration. To track migration progress, connect AWS DMS and AWS SMS to Migration Hub.

To do this, navigate to Migration Hub in the AWS Management Console. Under Migrate and Tools, connect both services so that the migration status of these services is sent to Migration Hub.

You can then group your servers into an application in Migration Hub and collectively track the progress of your migration. In this example, I created an application, Company Blog, and added in my servers from both AWS SMS and AWS DMS.

The progress updates from linked services are automatically sent to Migration Hub so that you can track tasks in progress. The dashboard reflects any status changes that occur in the linked services. You can see from the following image that one server is complete while another is in progress.

Using Migration Hub, you can view the migration progress of all applications. This allows you to quickly get progress updates across all of your migrations, easily identify and troubleshoot any issues, and reduce the overall time and effort spent on your migration projects.

Launch your EC2 instance

When your replication task is complete, the artifact created by AWS SMS is a custom AMI that you can use to deploy an EC2 instance. Follow the usual process to launch your EC2 instance, using the custom AMI created by AWS SMS, noting that you may need to replace any host-based firewalls with security groups and NACLs.

When you create an EC2 instance, ensure that you pick the most suitable EC2 instance type and size to match your performance requirements while optimizing for cost.

While your new EC2 instance is a replica of your on-premises VM, you should always validate that applications are functioning. How you do this differ on an application-by-application basis. You can use a combination of approaches, such as editing a local host file and testing your application, SSH, RDP, and Telnet.

From the RDS console, get your connection string details and update your WordPress configuration file to point to the Amazon Aurora database. As WordPress is expecting a MySQL database and Amazon Aurora is MySQL-compliant, this change of database engine is transparent to WordPress.

Change your DNS records to resolve the WordPress blog to your EC2 instance

You have validated that your WordPress application is running correctly, as you are still receiving changes from your on-premises data center via AWS DMS into your Amazon Aurora database. You can now update your DNS zone file using Amazon Route 53. Amazon Route 53 can be driven by multiple methods: console, SDK, or AWS CLI.

For this walkthrough, use Windows PowerShell for AWS to update the DNS zone file. The example shows UPSERTING the A record in the zone to resolve to the Amazon EC2 instance created with AWS SMS.

Based on the TTL of your DNS zone file, end users slowly resolve the WordPress blog to AWS.


You have now successfully migrated your WordPress blog to AWS using AWS migration services, specifically the AWS SMS Hyper-V/SCVMM Connector. Your blog now resolves to AWS. After validation, you are ready to decommission your on-premises resources.

Many architectures can be extended to use many of the inherent benefits of AWS, with little effort. For example, by using Amazon CloudWatch metrics to drive scaling policies, you can use an Application Load Balancer as your frontend. This removes the single point of failure for a single EC2 instance

New – Provisioned Throughput for Amazon Elastic File System (EFS)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-provisioned-throughput-for-amazon-elastic-file-system-efs/

Amazon Elastic File System lets you create petabyte-scale file systems that can be accessed in massively parallel fashion from hundreds or thousands of Amazon Elastic Compute Cloud (EC2) servers and on-premises resources, scaling on demand without disrupting applications. Behind the scenes, storage is distributed across multiple Availability Zones and redundant storage servers in order to provide you with file systems that are scalable, durable, and highly available. Space is allocated and billed on as as-needed basis, allowing you to consume as much as you need while keeping costs proportional to actual usage. Applications can achieve high levels of aggregate throughput and IOPS, with consistent low latencies. Our customers are using EFS for a broad spectrum of use cases including media processing workflows, big data & analytics jobs, code repositories, build trees, and content management repositories, taking advantage of the ability to simply lift-and-shift their existing file-based applications and workflows to the cloud.

A Quick Review
As you may already know, EFS lets you choose one of two performance modes each time you create a file system:

General Purpose – This is the default mode, and the one that you should start with. It is perfect for use cases that are sensitive to latency, and supports up to 7,000 operations per second per file system.

Max I/O – This mode scales to higher levels of aggregate throughput and performance, with slightly higher latency. It does not have an intrinsic limit on operations per second.

With either mode, throughput scales with the size of the file system, and can also burst to higher levels as needed. The size of the file system determines the upper bound on how high and how long you can burst. For example:

1 TiB – A 1 TiB file system can deliver 50 MiB/second continuously, and burst to 100 MiB/second for up to 12 hours each day.

10 TiB – A 10 TiB file system can deliver 500 MiB/second continuously, and burst up to 1 GiB/second for up to 12 hours each day.

EFS uses a credit system that allows you to “save up” throughput during quiet times and then use it during peak times. You can read about Amazon EFS Performance to learn more about the credit system and the two performance modes.

New Provisioned Throughput
Web server content farms, build trees, and EDA simulations (to name a few) can benefit from high throughput to a set of files that do not occupy a whole lot of space. In order to address this usage pattern, you now have the option to provision any desired level of throughput (up to 1 GiB/second) for each of your EFS file systems. You can set an initial value when you create the file system, and then increase it as often as you’d like. You can dial it back down every 24 hours, and you can also switch between provisioned throughput and bursting throughput on the same cycle. You can for example, configure an EFS file system to provide 50 MiB/second of throughput for your web server, even if the volume contains a relatively small amount of content.

If your application has throughput requirements that exceed what is possible using the default (bursting) model, the provisioned model is for you! You can achieve the desired level of throughput as soon as you create the file system, regardless of how much or how little storage you consume.

Here’s how I set this up:

Using provisioned throughput means that I will be billed separately for storage (in GiB/month units) and for provisioned throughput (in MiB/second-month units).

I can monitor average throughput by using a CloudWatch metric math expression. The Amazon EFS Monitoring Tutorial contains all of the formulas, along with CloudFormation templates that I can use to set up a complete CloudWatch Dashboard in a matter of minutes:

I select the desired template, enter the Id of my EFS file system, and click through the remaining screens to create my dashboard:

The template creates an IAM role, a Lambda function, a CloudWatch Events rule, and the dashboard:

The dashboard is available within the CloudWatch Console:

Here’s the dashboard for my test file system:

To learn more about how to use the EFS performance mode that is the best fit for your application, read Amazon Elastic File System – Choosing Between the Different Throughput & Performance Modes.

Available Now
This feature is available now and you can start using in today in all AWS regions where EFS is available.



Improving application performance and reducing costs with Amazon EBS-Optimized Instance burst capability

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/improving-application-performance-and-reducing-costs-with-amazon-ebs-optimized-instance-burst-capability/

Contributed by Sooraj Prasannan, Senior Product Manager, Amazon Elastic Block Store

In November 2017, Amazon EC2 introduced C5 compute-intensive instances and M5 general-purpose instances. In the first half of 2018, we released EC2 C5d instances and M5d instances by adding high-speed, ultra-low latency local NVMe storage to the EC2 C5 and M5 instance families. EC2 C5/C5d and M5/M5d instances are built on the Nitro system. This collection of AWS-built hardware and software components enables high performance, high availability, high security, and bare metal capabilities to reduce virtualization overhead.

During the design of the Nitro system, we analyzed real-world workloads and recognized the need for smaller instance sizes to drive higher performance from their Amazon EBS volumes. We found that the majority of application storage needs are bursty, with short, intense periods of high I/O and plenty of idle time between bursts. To improve the experience for these workloads, we developed burst capability for smaller instance sizes. Available on EC2 C5/C5d and M5/M5d instances, this feature enables large, xlarge, and 2xlarge instance sizes to drive the same performance as the 4xlarge instance for 30 minutes each day.

For applications with spiky Amazon EBS demand, you can right-size your instances based on your CPU and memory requirements and still meet your EBS-optimized instance performance requirements. This higher performance also enables you to speed up sections of your workflow dependent on EBS-optimized instance performance. Faster workflows result in quicker job completions and improved resource utilization. The burst capability ultimately enables you to reduce costs by right-sizing your instance and improving total resource usage.

With this performance increase, you will be able to handle unplanned spikes in demand without any impact to your application performance. You can now size your instances based on historical average trends. This burst capability gives you more performance to absorb spikes without affecting your customer experience.

Using Amazon CloudWatch metrics to monitor burst usage

For better visibility into your performance, instances based on the Nitro system provide Amazon CloudWatch metrics to help profile your usage. Based on the usage profile, you can decide if smaller instances meet your requirements.

These instances give you the ability to monitor your usage via instance level CloudWatch metrics for operations (EBSReadOpsandEBSWriteOps) and bytes transferred (EBSReadBytesand EBSWriteBytes). For more information on these metrics, see List of available CloudWatch metrics for your instances. These metrics support basic monitoring (five-minute frequency) by default, but you can enable detailed monitoring (one-minute frequency) for an additional cost. For more information, see Amazon CloudWatch pricing.

For large, xlarge, and 2xlarge instances, we also provide burst balance metrics. EBSIOBalance% monitors the instance I/O burst bucket, and EBSByteBalance% monitors the instance byte burst bucket. These metrics give information about the percentage of I/O or bytes credits remaining in the respective burst buckets. The metrics are expressed as a percentage, where 100% means that the instance has accumulated the maximum number of credits. You can set up an alarm that triggers if the balance gets too low.

To demonstrate these metrics, we launched an m5.large instance. We then attached a 500GB io1 Amazon EBS volume with 32,000 provisioned IOPS to the instance. Amazon EBS volumes attached to instances based on the Nitro system are exposed as NVMe devices.

First, we ran a large block (128 KiB) test using fio to /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 and monitored both EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename= /dev/disk/by-id/nvme-
Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 --rw=randread --
bs=128k --runtime=2400 --time_based=1 --iodepth=32 --
ioengine=libaio --direct=1 --name=large-block-test 

Because this is a large block workload, it’s not driving enough IOPS to deplete EBSIOBalance%. It depletes EBSByteBalance% instead, as shown in the following image.

Then we ran a small block test to understand how it affects EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename= /dev/disk/by-id/nvme-
Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 --rw=randread --
bs=16k --runtime=2400 --time_based=1 --iodepth=32 --
ioengine=libaio --direct=1 --name=small-block-test 

Because this is a small block test, it drives higher IOPS than bytes/second. Hence, EBSIOBalance% drops faster than EBSByteBalance%, as shown in the following image.

As long as EBSIOBalance% and EBSByteBalance% are above 0%, the instance can drive the burst performance. When the instance I/O activity is below the baseline rate, the burst buckets refill. After the tests finished, we paused all I/O from the instance. This period of inactivity allows the instance burst buckets to refill, as EBSIOBalance% and EBSByteBalance% show in the following image.

The refill rate for a burst bucket is the difference between the baseline rate and the instance I/O activity. For example, m5.large has a baseline throughput rate of 60 MB/s and a baseline IOPS rate of 3600 IOPS. Suppose the instance I/O activity is 10 MB/s and 1000 IOPS. The byte bucket fills at the rate of 50 MB/s (60 MB/s minus 10 MB/s). The IOPS bucket fills at the rate of 2600 IOPS (3600 IOPS minus 1000 IOPS). For the baseline rates for the different instances, see Amazon EBS–optimized instances. In addition, we top off the burst buckets every 24 hours, which means that the instance has burst performance available for 30 minutes each day.

Performance enhancements

We have continued to make enhancements to the Nitro system. With the latest set of enhancements, we have increased the maximum burst bandwidth on the large, xlarge, and 2xlarge EC2 C5/C5d and M5/M5d instances to 3.5 Gbps, up from 2.25 Gbps and 2.12 Gbps, respectively. We have also increased the maximum burst IOPS for EC2 C5/C5d to 20,000 IOPS and to 18,750 IOPS for M5/M5d, up from 16,000 IOPS for both. All new EC2 C5/C5d and M5/M5d smaller instances can take advantage of this performance increase at no additional cost.

For the latest list of instances based on the Nitro system that support this burst feature and their corresponding performance numbers, see Amazon EBS–optimized instances.

Deploy an 8K HEVC pipeline using Amazon EC2 P3 instances with AWS Batch

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/deploy-an-8k-hevc-pipeline-using-amazon-ec2-p3-instances-with-aws-batch/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

AWS provides several managed services for file- and streaming-based media encoding options.

Currently, these services offer up to 4K encoding. Recent developments and the growing popularity of 8K content has now increased the need to distribute higher resolution content.

In this solution, you use an Amazon EC2 P3 instance to create a file-based encoding pipeline utilizing AWS Batch by first uploading a sample 8K (7680×4320) file to Amazon S3.

AWS Batch

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. With AWS Batch, there is no need to install and manage batch computing software or server clusters that you use to run your jobs, allowing you to focus on analyzing results and solving problems. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.

P3 instances for video transcoding workloads

The P3 instance comes equipped with the NVIDIA Tesla V100 GPU. The V100 is a 16 GB 5,120 CUDA Core-GPU based on the latest Volta architecture; well suited for video coding workloads. The largest instance size in that family, p3.16xlarge, has 64 vCPU, 488 GB of RAM, 8 NVIDIA Tesla V100 GPUs, and 25 Gbps networking bandwidth.

Other than being a mainstay in computational workloads the V100 offers enhanced hardware-based encoding/decoding (NVENC/NVDEC). The following tables summarize the NVENC/NVDEC options available compared to other GPUs offered at AWS.

NVENC Support Matrix

AWS GPU instance
GPU FAMILY GPU H.264 (AVCHD) YUV 4:2:0 H.264 (AVCHD) YUV 4:4:4 H.264 (AVCHD) Lossless H.265 (HEVC) 4K YUV 4:2:0 H.265 (HEVC) 4K YUV 4:4:4 H.265 (HEVC) 4K Lossless H.265 (HEVC) 8k
G2 Kepler GRID K520 YES
P2 Kepler (2nd Gen) Tesla K80 YES
G3 Maxwell (2nd Gen) Tesla M60 YES YES YES YES

NVDEC Support Matrix

AWS GPU instance GPU FAMILY GPU MPEG-2 VC-1 H.264 (AVCHD) H.265 (HEVC) VP8 VP9
P2 Kepler (2nd Gen) Tesla K80 YES YES YES
G3 Maxwell (2nd Gen) Tesla M60 YES YES YES YES

Cinematic 8K encoding is supported using the Tesla V100 (P3 instance family) either in landscape or portrait orientations using the HEVC codec. 

GPU H264 H264_444 H264_ME H264_WxH HEVC HEVC_Main10 HEVC_Lossless HEVC_SAO HEVC_444 HEVC_ME HEVC_WxH
Tesla M60 + + + 4096x
+ 4096x
Tesla V100 + + + 4096x
+ + + + + + 8192x


To follow along with these procedures, ensure that you have the following:

  • An AWS account with permissions to create IAM roles and policies, as well as read and write access to S3
  • Registration with the NVIDIA Developer Network
  • Familiarity with Docker


For deployment, you containerize the encoding pipeline. After building the underlying P3 container instance, you then use nvidia-docker2 to build the video-encoding Docker image, which is registered with Amazon Elastic Container Registry (Amazon ECR).

As shown in the following diagram, the pipeline reads an input raw YUV file from S3, then pulls the containerized encoding application to execute at scale on the P3 container instance. The encoded video file is then be transferred to S3.

The nvidia-docker2 image video encoding stack contains the following components:

  • FFMPEG 4.0
  • NVIDIA Video Codec SDK 8.1

This is a relatively lengthy procedure. However, after it’s built, the underlying instance and Docker image are reusable and can be quickly deployed as part of a high performance computing (HPC) pipeline.

Creating the ECS container instance

The underlying instance can be built by selecting the Amazon Linux AMI with the p3.2xlarge instance type in a public subnet. Additionally, add an EBS volume (150 GB), which is used for the 8k input, raw yuv, and output files. Scale the storage amount for larger input files. Persist the mount in /etc/fstab. Connect to the instance over SSH and install any OS updates as well as the EPEL Release and support packages as well as the base docker-ce.

sudo yum update -y
sudo yum install yum-utils \
                 device-mapper-persistent-data \
                 lvm2 \

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum install epel-release-latest-7.noarch.rpm
sudo yum update
sudo yum install docker-ce -y

The NVIDIA/CUDA stack can be installed using the cuda-repo-rhel7.rpm file. The CUDA framework installs the NVIDIA driver dependencies.

sudo yum install cuda -y

Next, install nvidia-docker2 as provided in the NVIDIA GitHub repo.

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
  sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

sudo tee /etc/docker/daemon.json <<EOF
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
    "default-runtime": "nvidia"

sudo systemctl restart docker

With the base components in place, make this instance compatible with the ECS service:

sudo yum install ecs-init -y

Create the /etc/ecs/ecs.config file with the following template:

cat << EOF > /etc/ecs/ecs.config

Iptables and packet forwarding rules need to be created to pass IAM roles into task operations:

sudo sh -c "echo 'net.ipv4.conf.all.route_localnet = 1' >> /etc/sysctl.conf"
sudo sysctl -p /etc/sysctl.conf
sudo iptables -t nat -A PREROUTING -p tcp -d --dport 80 -j DNAT --to-destination
sudo iptables -t nat -A OUTPUT -d -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 51679
sudo sh -c 'iptables-save > /etc/sysconfig/iptables'

Finally, a systemd unit file needs to be created:

sudo cat << EOF > /etc/systemd/system/[email protected]
Description=Docker Container %I

ExecStartPre=-/usr/bin/docker rm -f %i
ExecStart=/usr/bin/docker run --name %i \
--privileged \
--restart=on-failure:10 \
--volume=/var/run:/var/run \
--volume=/var/log/ecs/:/log:Z \
--volume=/var/lib/ecs/data:/data:Z \
--volume=/etc/ecs:/etc/ecs \
--net=host \
--env-file=/etc/ecs/ecs.config \
ExecStop=/usr/bin/docker stop %i


sudo systemctl enable [email protected]
sudo systemctl start [email protected]
sudo systemctl status [email protected]

Ensure that the [email protected] service starts successfully.

Creating the NVIDIA-Docker image

With Docker installed, pull the latest nvidia/cuda:latest image from DockerHub.

docker pull nvidia/cuda:latest

It is best at this point to run the Docker container in interactive mode. However, a Docker build file can be created afterwards. At the time of publication, only CUDA 9.0 is installed. NVIDIA has already provided the necessary repositories. Install CUDA 9.2, and support packages, inside the Docker container, referenced by the (docker)  label:

docker run -it --runtime=nvidia --rm nvidia/cuda
(docker) apt update
(docker) apt install pkg-config build-essential wget curl nasm unzip \
                     git libglew-dev cuda-toolkit-9-2 python3-pip -y
(docker) pip3 install awscli

Next, download the FFMPEG 4.0, nv-codec-headers, and the Video Codec SDK 8.1 from the NVIDIA Developer platform.

First, extract the nv-codec-headers and into the directory:

(docker) make
(docker) make install

Extract the ffmpeg-4.0 directory and compile and install FFmpeg:

(docker) ./configure --enable-cuda --enable-cuvid --enable-nvenc --enable-nonfree --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64
(docker) make -j 4
(docker) make install

Download and extract the NVIDIA Video Codec SDK 8.1. The “Samples” directory has a preconfigured Makefile that compiles the binaries in the SDK. After it’s successful, confirm that the binaries are correctly set up.

(docker): ~/Video_Codec_SDK_8.1.24/Samples/AppEncode/AppEncCuda$ ./AppEncCuda -h
-i Input file path
-o Output file path
-s Input resolution in this form: WxH
-if Input format: iyuv nv12 yuv444 p010 yuv444p16 bgra bgra10 ayuv abgr abgr10
-gpu Ordinal of GPU to use
-codec Codec: h264 hevc
-preset Preset: default hp hq bd ll ll_hp ll_hq lossless lossless_hp
-profile H264: baseline main high high444; HEVC: main main10 frext
-444 (Only for RGB input) YUV444 encode
-rc Rate control mode: constqp vbr cbr cbr_ll_hq cbr_hq vbr_hq
-fps Frame rate
-gop Length of GOP (Group of Pictures)
-bf Number of consecutive B-frames
-bitrate Average bit rate, can be in unit of 1, K, M
-maxbitrate Max bit rate, can be in unit of 1, K, M
-vbvbufsize VBV buffer size in bits, can be in unit of 1, K, M
-vbvinit VBV initial delay in bits, can be in unit of 1, K, M
-aq Enable spatial AQ and set its stength (range 1-15, 0-auto)
-temporalaq (No value) Enable temporal AQ
-lookahead Maximum depth of lookahead (range 0-32)
-cq Target constant quality level for VBR mode (range 1-51, 0-auto)
-qmin Min QP value
-qmax Max QP value
-initqp Initial QP value
-constqp QP value for constqp rate control mode
Note: QP value can be in the form of qp_of_P_B_I or qp_P,qp_B,qp_I (no space)

Encoder Capability
# GPU H264 H264_444 H264_ME H264_WxH HEVC HEVC_Main10 HEVC_Lossless HEVC_SAO HEVC_444 HEVC_ME HEVC_WxH
0 Tesla V100-SXM2-16GB + + + 4096x4096 + + + + + + 8192x8192

Create a small script to be used for the 8K-encoding test inside the Docker container. Save the file as /root/nvenc-processor.sh. In the basic form, this script encodes using a single thread. For comparison, the same file is encoded using four threads.

#!/bin/bash -xe
time aws s3 cp $S3_INPUT /mnt/8k.webm

time /usr/local/bin/ffmpeg -y -hwaccel cuda -i /mnt/8k.webm -c:v rawvideo -pix_fmt yuv420p /mnt/8k.yuv
time /root/Video_Codec_SDK/Samples/AppEncode/AppEncCuda/AppEncCuda -i /mnt/8k.yuv -o /mnt/8k.hevc -s 7680x4320 -codec hevc
time /root/Video_Codec_SDK/Samples/AppEncode/AppEncPerf/AppEncPerf -i /mnt/8k.yuv -s 7680x4320 -thread 4 -codec hevc

time aws s3 cp /mnt/8k.hevc $S3_OUTPUT

This script downloads a file from S3 and processes it through FFmpeg. Using the AppEncCuda and AppEncPerf methods, create the 8K-encoded file to be uploaded back to S3. Commit your Docker container into a new Docker image:

docker commit -m "creating hvec-processor image" <containerid> nvidia-hvec:latest

Ensure that a Docker repo has been created in Amazon ECS. Choose Repositories, Create Repository. After you open the repository, choose View Push Commands. Commit the new created image to your ECR repo.

After confirming that your image is in your ECR repo, delete all images locally in the instance:

docker rmi -f $(docker images -a -q)

Before stopping the instance, remove the ECS agent checkpoint file:

sudo rm -rf /var/lib/ecs/data/ecs_agent_data.json

Create an AMI from the instance, maintaining the attached EBS volume. Note the AMI ID.

Creating IAM role permissions

To ensure that access to ECS is controlled and to allow AWS Batch to be called, create two IAM roles:

  • BatchServiceRole allows AWS Batch to call services on your behalf.
  • ecsInstanceRole is specific to this workflow and adds permissions for S3FullAccess. This allows the container to read from and write to your S3 bucket. The following screenshot shows the example policy stack.

In AWS Batch, select the compute environment and create a managed compute environment. Assign a cluster name and min and max vCPUs values. Use the AMI ID, and IAM roles created earlier. Use the Spot pricing model with a consideration of running at 60% of the On-Demand price. Look at the current Spot price to see if more aggressive discounts are possible.

Note the cluster name. In Amazon ECS, you should see the cluster created. Next, create a job queue and associate this job queue with the compute environment created earlier. Note the job queue name.

Next, create a job definition file. This provides the job parameters to be used including mounting paths, CPU, and memory requirements.

    "containerProperties": {
        "mountPoints": [
                "sourceVolume": "codec-data",
                "readOnly": false,
                "containerPath": "/mnt"
        "image": "<accountnumber>.dkr.ecr.us-east-1.amazonaws.com/nvidia/nvidia-hvec:latest",
        "command": ["/root/nvenc-processor.sh"],
        "volumes": [
                "host": {"sourcePath": "/mnt"},
                "name": "codec-data"
        "memory": 32768,
        "vcpus": 8,
        "privileged": true,
        "environment": [
                "name": "S3_INPUT",
                "value": "s3://<bucket>/<key_name>"
                "name": "S3_OUTPUT",
                "value": "s3://<bucket>"
        "ulimits": []
    "type": "container",
    "jobDefinitionName": "nvenc-test"

Save the file as nvenc-test.json and register the job in AWS Batch.

aws batch register-job-definition --cli-input-json file://nvenc-test.json

In the AWS Batch console, create a job queue assigning a priority of 1 to the compute environment created earlier. Create a job assigning a job name, with the job definition file, and job queue. Add additional environment variables for the S3 bucket. Ensure that these buckets and input file are created.

S3_INPUT = s3://<bucket>/<key_name> 
S3_OUTPUT = s3://<bucket> 

Execute the job. In a few moments, the job should be in the Running state. Check the CloudWatch logs for an updated status of the job progression. Open the job record information and scroll down to CloudWatch metrics. The events are logged in a new AWS Batch log stream.

A 1-minute 8K YUV 4:2:0 file took approximately 10 minutes single-threaded (top panel), and 58 seconds using four threads (bottom panel). The nvenc-processor.sh script serves as a basic implementation of 8K encoding. Explore the options provided by the NVIDIA Video Codec SDK for additional encoding/decoding and transcoding options.


With AWS Batch, a customized container instance, and a dockerized NVIDIA video encoding platform, AWS can provide your HD, 4K, and now 8K media distribution. I invite you to incorporate this into your automated pipeline.

With some minor modification, it’s possible to trigger this pipeline after a new file is uploaded into S3. Then, execute through AWS Lambda or as part of an AWS Step Functions workflow.






Now Available: R5, R5d, and z1d Instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-r5-r5d-and-z1d-instances/

Just last week I told you about our plans to launch EC2 Instances with Faster Processors and More Memory. Today I am happy to report that the R5, R5d, and z1d instances are available now and you can start using them today. Let’s take a look at each one!

R5 Instances
The memory-optimized R5 instances use custom Intel® Xeon® Platinum 8000 Series (Skylake-SP) processors running at up to 3.1 GHz, powered by sustained all-core Turbo Boost. They are perfect for distributed in-memory caches, in-memory analytics, and big data analytics, and are available in six sizes:

Instance Name vCPUs Memory EBS-Optimized Bandwidth Network Bandwidth
r5.large 2 16 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.xlarge 4 32 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.2xlarge 8 64 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.4xlarge 16 128 GiB 3.5 Gbps Up to 10 Gbps
r5.12xlarge 48 384 GiB 7.0 Gbps 10 Gbps
r5.24xlarge 96 768 GiB 14.0 Gbps 25 Gbps

You can launch R5 instances today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions. To learn more, read about R5 and R5d Instances.

R5d Instances
The R5d instances share their specs with the R5 instances, and also include up to to 3.6 TB of local NVMe storage. Here are the sizes:

Instance Name vCPUs Memory Local Storage EBS-Optimized Bandwidth Network Bandwidth
r5d.large 2 16 GiB 1 x 75 GB NVMe SSD Up to 3.5 Gbps Up to 10 Gbps
r5d.xlarge 4 32 GiB 1 x 150 GB NVMe SSD Up to 3.5 Gbps Up to 10 Gbps
r5d.2xlarge 8 64 GiB 1 x 300 GB NVMe SSD Up to 3.5 Gbps Up to 10 Gbps
r5d.4xlarge 16 128 GiB 2 x 300 GB NVMe SSD 3.5 Gbps Up to 10 Gbps
r5d.12xlarge 48 384 GiB 2 x 900 GB NVMe SSD 7.0 Gbps 10 Gbps
r5d.24xlarge 96 768 GiB 4 x 900 GB NVMe SSD 14.0 Gbps 25 Gbps

The R5d instances are available in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions. To learn more, read about R5 and R5d Instances.

z1d Instances
The high frequency z1d instances use custom Intel® Xeon® Scalable Processors running at up to 4.0 GHz, powered by sustained all-core Turbo Boost, perfect for Electronic Design Automation (EDA), financial simulation, relational database, and gaming workloads that can benefit from extremely high per-core performance. They are available in six sizes:

Instance Name vCPUs Memory Local Storage EBS-Optimized Bandwidth Network Bandwidth
z1d.large 2 16 GiB 1 x 75 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.xlarge 4 32 GiB 1 x 150 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.2xlarge 8 64 GiB 1 x 300 GB NVMe SSD 2.333 Gbps Up to 10 Gbps
z1d.3xlarge 12 96 GiB 1 x 450 GB NVMe SSD 3.5 Gbps Up to 10 Gbps
z1d.6xlarge 24 192 GiB 1 x 900 GB NVMe SSD 7.0 Gbps 10 Gbps
z1d.12xlarge 48 384 GiB 2 x 900 GB NVMe SSD 14.0 Gbps 25 Gbps

The fast cores allow you to run your existing jobs to completion more quickly than ever before, giving you the ability to fine-tune your use of databases and EDA tools that are licensed on a per-core basis.

You can launch z1d instances today in the US East (N. Virginia), US West (Oregon), US West (N. California), EU (Ireland), Asia Pacific (Tokyo), and Asia Pacific (Singapore) Regions. To learn more, read about z1d Instances.

In the Works
You will be able to create Amazon Relational Database Service (RDS), Amazon ElastiCache, and Amazon Aurora instances that make use of the R5 in the next two or three months.

The r5.metal, r5d.metal, and z1d.metal instances are on the near-term roadmap and I’ll let you know when they are available.



Building a GPU workstation for visual effects with AWS

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/building-a-gpu-workstation-for-visual-effects-with-aws/

Contributed by Mike Owen, Solutions Architect, AWS Thinkbox

The elasticity, scalability, and cost effectiveness of the cloud value proposition is attractive to media customers. One of the key design patterns in media and entertainment (M&E) workloads is using the cloud as a content lake and bringing the underlying processes closer without having to synchronize data. In this high-end graphics visualization business, a pixel-perfect, color-accurate, fully interactive native desktop experience is required for both Windows and Linux platforms. Visual effects (VFX) artists also require input peripherals such as latest-generation Wacom 8K pressure-sensitive tablets and Wacom Cintiq monitors to work as seamlessly as they do on-premises.

AWS offers Amazon EC2 G3 instances backed by NVIDIA Tesla M60 GPUs with powerful graphics capabilities: OpenGL 4.6, DirectX 12, CUDA 9.2, GRID 6.1. You can combine these instances with the Teradici streaming protocol via their Cloud Access Software (CAS) agent to enable a high-end desktop experience on either Windows or Linux with an on-demand pricing model to fit your business needs. Teradici PCoIP is a popular protocol in the M&E industry, where Teradici have delivered a custom silicon accelerated zero-client hardware device to deliver secure pixel streaming to on-premises monitors. AWS also enables customers to create managed virtual desktop environments with Amazon WorkSpaces Graphics bundles (Windows and Linux) or Amazon AppStream 2.0 (Windows). Both solutions offer a managed environment with GPU-backed instances. This blog describes how you can set up an unmanaged VFX desktop using Amazon EC2 G3 instances combined with high-performance storage and scalable compute options such as Amazon EC2 Spot Instances.


The following diagram describes a typical Windows and Linux configuration. In this setup, you use a Teradici PCoIP Zero Client over a dedicated network connection from your on-premises location via your chosen network provider to their nearest AWS Region containing an Amazon EC2 G3 instance. AWS Direct Connect provides a low-latency, high-bandwidth dedicated connection that doesn’t traverse the public internet. With the Windows instance, you might use a creative pen display such as a Wacom Cintiq monitor or, on a Linux instance, the latest generation of Wacom 8K pressure-sensitive tablets. You can connect both types of environments to dual 2K monitors and be ready for film VFX work.

Once built, the g3.4xl instance runs your custom Amazon Machine Image (AMI) with encrypted volume(s) in Amazon Elastic Block Storage (EBS) containing all your software, pulling floating licenses from your on-premises license servers where necessary. For Linux, you have the option of centrally installing your software via a fast NVMe SSD–based i3 instance type and building a minimal-sized boot AMI. In both cases, you can add encrypted Amazon EBS SSD volumes for increased local storage. The Teradici CAS agent runs on each individual G3 instance and can be provisioned, brokered, and managed by the optional Teradici Cloud Access Manager (CAM) solution. Finally, Amazon WorkSpaces Graphics bundles are compatible with a Teradici zero client, providing easy access to a fully managed Windows desktop. This might be useful for Linux-based studios that require ad hoc Windows usage such as Adobe Creative Cloud.

In this configuration, a Teradici zero client interacts with the provisioned desktop (served on a G3 instance) in the cloud. The Teradici CAS agent captures the frame buffer and sends it in real time to the zero client over the network via UDP using the PCoIP protocol. A smooth, reliable experience depends on a low-latency and high-bandwidth connection to the Amazon EC2 instance hosting the desktop. Bandwidth requirements depend on the number of monitors used, resolution, frame rate, and lossless quality of the desktop experience. For Wacom tablet support, Teradici CAS 2.12 requires the latency level to be less than 25 ms. You can use ping.psa.fun or cloudping.info to check the latency time of public pings between your location and your closest AWS Region. Ideally, you will provision an AWS Direct Connect connection for private (doesn’t traverse the public internet) and fast (low-latency) connectivity to the AWS Region from your location. You can also use a public internet connection for initial testing. In both cases, you can route traffic over a VPN for added security.


Instead of doing a manual build, you can visit the AWS Marketplace and subscribe to a Teradici-provided pre-built AMI. It already has the NVIDIA GRID driver and Teradici CAS software installed, configured, and licensed as part of the overall usage cost. See the following offerings on AWS Marketplace:


Make sure that everything in the following list is in place before deploying to either platform:

  • Create an AWS account.
  • Ensure that your AWS account has an EC2 key-pair associated with it by going to the AWS Management Console and checking Key Pairs under Network and Security in the applicable AWS Region.
  • Set up an AWS account <ACCESS KEY> and <SECRET ACCESS KEY> to access the NVIDIA GRID driver from an Amazon S3 bucket. The deployment instructions explain how to install and set up the AWS Command-Line Interface (AWS CLI).
  • Minimum version: CentOS 7.2 or Windows 2016.
  • Recommended Teradici PCoIP Zero Client firmware version: 6.0. Contact Teradici to download.
  • Contact Teradici who will provide a 60-day trial license: <TERADICI LICENSE CODE> for Cloud Access Software. You should receive your license within 1 business day. If you don’t receive your license, please contact [email protected].
  • You must have superuser (root) or Administrator privileges to the AMI.
  • The Amazon EC2 security group provides a stateful firewall on each instance via a set of rules. The following inbound ports must be available on the Amazon EC2 instance from a specific client’s source IP address (restrictive access).
Type Protocol Port Range Source Description Platform
Custom TCP Rule TCP 4172 <YOUR SOURCE IP> PCoIP Both
Custom UDP Rule UDP 4172 <YOUR SOURCE IP> PCoIP Both
Custom TCP Rule TCP 60443 <YOUR SOURCE IP> PCoIP Both
RDP TCP 3389 <YOUR SOURCE IP> RDP Windows only

Deploying the desktop on Linux

For our Linux deployment, we use the latest CentOS 7.5 AMI from AWS Marketplace and install the NVIDIA/Xorg/KDE/Wacom stack to create a fully functioning VFX Linux desktop environment. This stack contains the following components:

  • CentOS 7.5.1804_2 AMI
  • NVIDIA Grid 6.1 (390.57 May 2018) driver
  • Teradici CAS 2.12
  • Wacom 0.40 driver

Feel free to use your own CentOS 7.2+ AMI and modify the step by step instructions accordingly.

Setting up the desktop on Linux

To launch a g3.4xl instance in the closest AWS Region in your AWS account using the created key-pair and security group, use an AMI ID from the ones in the following table. For reference, search for the AMI using the keywords CentOS Linux 7 x86_64 HVM EBS 1804_2.

AWS Region AWS Region ID AMI ID
US East (N. Virginia) us-east-1 ami-d5bf2caa
US East (Ohio) us-east-2 ami-77724e12
US West (N. California) us-west-1 ami-3b89905b
US West (Oregon) us-west-2 ami-5490ed2c
EU (Frankfurt) eu-central-1 ami-9a183671
EU (Ireland) eu-west-1 ami-4c457735
Asia Pacific (Tokyo) ap-northeast-1 ami-3185744e
Asia Pacific (Singapore) ap-southeast-1 ami-da6151a6
Asia Pacific (Sydney) ap-southeast-2 ami-0d13c26f

Once the g3.4xl instance has passed its EC2 instance 2/2 status checks, we can build in true AWS style.

First, log in to the instance and set up the environment.

# ssh into running Amazon EC2 instance
ssh [email protected]<IP-ADDRESS>.<AWS-REGION>.compute.amazonaws.com
# yes

# set a password for your user
sudo passwd centos

# disable selinux
sudo sed -ir 's/SELINUX=\(disabled\|enforcing\|permissive\)/SELINUX=disabled/' /etc/selinux/config

# install the EPEL repository
sudo yum install wget -y
sudo wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo rpm -i epel-release-latest-7.noarch.rpm

# run yum update to make sure all packages are up-to-date
sudo yum update -y

# install the "Server with GUI" group
sudo yum groupinstall "Server with GUI" -y

# prefer KDE desktop? (optional)
sudo yum groupinstall -y "KDE Plasma Workspaces"
sudo systemctl set-default graphical.target
echo "exec startkde" >> ~/.xinitrc

# uninstall KDE (optional)
# sudo yum groupremove -y "KDE Plasma Workspaces"
# sudo yum autoremove -y
# sudo reboot

# reboot to make sure the latest installed kernel is running
sudo reboot

# install kernel-devel
sudo yum install kernel-devel -y

Next, install and register the Teradici CAS 2.12 software.

# import the Teradici signing key
sudo rpm --import https://downloads.teradici.com/rhel/teradici.pub.gpg

# grab the PCoIP repo file
sudo curl -o /etc/yum.repos.d/pcoip.repo https://downloads.teradici.com/rhel/pcoip.repo

# install PCoIP agent package
sudo yum install pcoip-agent-graphics -y

# load vhci-hcd kernel modules
sudo modprobe -a usb-vhci-hcd usb-vhci-iocifc

# register with the licensing service
pcoip-register-host --registration-code=<TERADICI LICENSE CODE>

# set up PCoIP agent config to enable USB
echo """pcoip.grid_diff_map = 0 pcoip.enable_usb = 1 pcoip.usb_auth_table = "23XXXXXX" pcoip.usb_unauth_table = "" """ | sudo tee /etc/pcoip-agent/pcoip-agent.conf

# make sure you're running latest pcoip-agent version
sudo yum update pcoip-agent-graphics

Then install the NVIDIA GRID graphics driver and apply performance optimization to its configuration.

# NVIDIA GRID driver
# https://docs.nvidia.com/grid/index.html
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html

# install nano editor
sudo yum install nano -y

# remove any old NVIDIA drivers/CUDA
sudo yum erase nvidia cuda

# disable the nouveau open source driver for NVIDIA graphics cards
sudo touch /etc/modprobe.d/blacklist.conf

# paste the following lines in one go into your shell
cat << EOF | sudo tee --append /etc/modprobe.d/blacklist.conf
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv

# edit the /etc/default/grub file and add the line:
sudo nano /etc/default/grub

# rebuild grub2 config
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot

# install pip
curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py --user

# install AWS CLI
pip install awscli --upgrade --user

# configure AWS CLI credentials
aws configure

# AWS Access Key ID [None]: <ACCESS KEY>
# AWS Secret Access Key [None]: <SECRET ACCESS KEY>
# Default Region name [None]: <AWS REGION>
# Default output format [None]: <enter>

# 390.57 driver
aws s3 cp --recursive s3://ec2-linux-nvidia-drivers/latest/ .
chmod +x NVIDIA-Linux-x86_64-390.57-grid.run

sudo /bin/bash ./NVIDIA-Linux-x86_64-390.57-grid.run

# respond to the NVIDIA installer prompts as follows:
    # <accept> the EULA
    # <Yes> to register kernel module sources with DKMS
    # <No> to installing 32-bit libraries
    # <No> to modifying the x.org file at end of install
    # <OK> to complete the installer

# check driver installed
nvidia-smi -q | head

# g3/NVIDIA optimization settings
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/optimize_gpu.html
sudo nvidia-persistenced
sudo nvidia-smi --auto-boost-default=0
sudo nvidia-smi -ac 2505,1177

sudo reboot

Install CUDA if required by any of your VFX software such as Autodesk Maya or SideFX Houdini:

# install CUDA and OpenCL
# https://developer.download.nvidia.com/compute/cuda/9.2/Prod/docs/sidebar/CUDA_Installation_Guide_Linux.pdf
# https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=CentOS&target_version=7&target_type=runfilelocal

wget https://developer.nvidia.com/compute/cuda/9.2/Prod/local_installers/cuda_9.2.88_396.26_linux
mv cuda_9.2.88_396.26_linux cuda_9.2.88_396.26_linux.run

# don't install the actual graphics driver, just CUDA 9.2 toolkit, sym-link
sudo /bin/sh cuda_9.2.88_396.26_linux.run

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.26?
(y)es/(n)o/(q)uit: n

Install the CUDA 9.2 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-9.2 ]: 

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 9.2 Samples?
(y)es/(n)o/(q)uit: n

Installing the CUDA Toolkit in /usr/local/cuda-9.2 ...

# CUDA Patch 1 (Released May 16, 2018)
wget https://developer.nvidia.com/compute/cuda/9.2/Prod/patches/1/cuda_9.2.88.1_linux
mv cuda_9.2.88.1_linux cuda_9.2.88.1_linux.run
sudo /bin/sh cuda_9.2.88.1_linux.run

# Ensure these ENV VARs are present: /etc/profile.d
export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Finally, install Wacom drivers.

# install Wacom driver
# https://github.com/linuxwacom/input-wacom/releases
cd ~
wget https://github.com/linuxwacom/input-wacom/releases/download/input-wacom-0.40.0/input-wacom-0.40.0.tar.bz2
tar jxf input-wacom-0.40.0.tar.bz2
cd input-wacom-0.40.0
sudo su
make && make install
modprobe wacom
dracut --force
sudo touch /etc/X11/xorg.conf.d/99-wacom-pressure2k.conf

# edit Wacom conf file as follows
sudo nano /etc/X11/xorg.conf.d/99-wacom-pressure2k.conf

Section "InputClass"
    Identifier "Wacom pressure compatibility"
    MatchDriver "wacom"
    Option "Pressure2K" "true"

# check Elastic Network Adapter (ENA) is running on your instance
modinfo ena
ethtool -i eth0
aws ec2 describe-images --image-id <AMI-ID> --query 'Images[].EnaSupport'

# if that command returns false, proceed to enable it
# make sure that you have AWS CLI installed with AWS credentials on your local machine
sudo shutdown now
aws ec2 modify-instance-attribute --instance-id <CURRENT EC2 INSTANCE ID> --ena-support

# if you're using a pre-existing Linux AMI, you need to install the ENA driver yourself
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking-ena.html#enhanced-networking-ena-linux

sudo reboot

Deploying the desktop on Windows

We use the latest AWS-provided Windows 2016 AMI for our deployment and install the NVIDIA/Teradici/Wacom stack to create a fully functioning VFX Windows desktop environment. This stack contains the following components:

  • Windows Server 2016 Base 2018.04.11
  • NVIDIA Grid 6.1 (391.58 May 2018) driver
  • Teradici CAS 2.12
  • Latest Wacom driver

Feel free to use your own Windows 2016 AMI and modify the step by step instructions accordingly.

Windows Instructions

To launch a g3.4xl instance in the closest AWS Region in your AWS account using the created key-pair and security group, use an AMI ID from the ones in the following table. For reference, the AMI name is Microsoft Windows Server 2016 Base 2018.04.11.

AWS Region AWS Region ID AMI ID
US East (N. Virginia) us-east-1 ami-3633b149
US East (Ohio) us-east-2 ami-5984b43c
US West (N. California) us-west-1 ami-3dd1c25d
US West (Oregon) us-west-2 ami-f3dcbc8b
EU (Frankfurt) eu-central-1 ami-b5530b5e
EU (Ireland) eu-west-1 ami-4cc09a35
Asia Pacific (Tokyo) ap-northeast-1 ami-0e809272
Asia Pacific (Singapore) ap-southeast-1 ami-00a2847c
Asia Pacific (Sydney) ap-southeast-2 ami-7279b010

Once the g3.4xl instance has passed its Amazon EC2 instance 2/2 status checks, let’s go build:

# use AWS Management Console to right-click EC2 instance and "Get Windows Password" -> <RDP PASSWORD>

# RDP into machine
# address: ec2-<IP-ADDRESS>.<AWS-REGION>.compute.amazonaws.com
# username: Administrator
# password: <RDP PASSWORD>

# set a password in command prompt
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ec2-windows-passwords.html
net user Administrator <NEW PASSWORD>

# configure Powershell - Allow ExecutionPolicy of Powershell scripts
Set-ExecutionPolicy -ExecutionPolicy AllSigned

# enable Software Secure Attention Sequence (SAS) setting
Open gpedit.msc
Expand Computer Configuration > Administrative Templates > Windows Components
Select Windows Logon Options
Double-click Disable or enable software Secure Attention Sequence
Select Enabled
Select Services from the drop down list in the bottom left pane
Click OK

# install AWS CLI
# https://docs.aws.amazon.com/cli/latest/userguide/awscli-install-windows.html
# download and install: https://s3.amazonaws.com/aws-cli/AWSCLI64.msi

# configure AWS CLI credentials in Powershell
aws configure

# AWS Access Key ID [None]: <ACCESS KEY>
# AWS Secret Access Key [None]: <SECRET ACCESS KEY>
# Default Region name [None]: <AWS REGION>
# Default output format [None]: <enter>

# download NVIDIA GRID driver from Amazon S3
# right-click Powershell, Run As Administrator, paste following into Powershell

$Bucket = "ec2-windows-nvidia-drivers"
$KeyPrefix = "latest"
$LocalPath = "C:\Users\Administrator\Desktop\NVIDIA"
$Objects = Get-S3Object -BucketName $Bucket -KeyPrefix $KeyPrefix -Region us-east-1
foreach ($Object in $Objects) {
    $LocalFileName = $Object.Key
    if ($LocalFileName -ne '' -and $Object.Size -ne 0) {
        $LocalFilePath = Join-Path $LocalPath $LocalFileName
        Copy-S3Object -BucketName $Bucket -Key $Object.Key -LocalFile $LocalFilePath -Region us-east-1

# run NVIDIA GRID installer

# reboot machine via command prompt
cmd shutdown /r

# Optimize GPU settings (follow these instructions)
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/optimize_gpu.html

# via Powershell
cd "C:\Program Files\NVIDIA Corporation\NVSMI"
.\nvidia-smi --auto-boost-default=0
.\nvidia-smi -ac "2505,1177"

# go to www.teradici.com, create account, and request access from Teradici via support ticket
# download Teradici PCoIP CAS software: PCoIP Graphics Agent 2.12 for Windows or later

# install PCoIP graphics agent package via GUI based installer
enter <TERADICI LICENSE CODE> via GUI installer
reboot machine

# download and install latest Wacom drivers from Wacom website
# https://www.wacom.com/en/support/product-support/drivers

# double-check the Elastic Network Adapter (ENA) is running
# ensure you have AWS CLI installed with AWS credentials on your local machine
aws ec2 describe-instances --instance-ids <CURRENT EC2 INSTANCE ID> --query "Reservations[].Instances[].EnaSupport"

# if the check returns false, install ENA drivers
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/enhanced-networking-ena.html

# if you're using a pre-existing Windows AMI, you need to install the ENA driver yourself
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/enhanced-networking-ena.html

Validating the desktop

Finally, take your new Linux or Windows VFX workstation for a spin. Using a zero client:

# connect Wacom tablet to zero-client and start a PCoIP session...
# ensure you configure zero-client to connect via:
# “Auto-Detect” in local z/c connection settings

# install any other software you need...

# don't forget to configure your floating license servers...

# finally, create a new AMI to capture your new custom VFX workstation image in your account

Teradici provides a software client for Windows and macOS that you can use to validate the setup of your Windows or Linux desktop. It’s also handy for system administrators who need to access a graphics workstation for artist technical support.

Testing the desktop

For testing, let’s run Autodesk 3ds Max on Windows and Autodesk Maya on Linux.

In 3ds Max, we have a 35-million-poly scene from the GPU-accelerated renderer Redshift, fully interactive and able to use the NVIDIA card to perform CUDA-based GPU final rendering.

In Maya, we show the 16 vCPUs and 120 GB of RAM available to this 3D scene file. The file takes 10 minutes to final render at HD resolution on a g3.4xl instance or, if you decide to offload the CUDA rendering to the Amazon EC2 P3.16xl instance type, just 19 seconds!


The Amazon EC2 G3 instance type is purpose-built to provide a high-end professional graphics infrastructure for visual computing applications. With remote protocols like Teradici PCoIP, G3 instances are the next-generation VFX cloud desktops that can deliver outstanding performance. With many studios already taking advantage of elastic cloud scaling for rendering, now is a great time to deploy cloud desktops for your business.

Amazon EC2 Instance Update – Faster Processors and More Memory

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-instance-update-faster-processors-and-more-memory/

Last month I told you about the Nitro system and explained how it will allow us to broaden the selection of EC2 instances and to pick up the pace as we do so, with an ever-broadening selection of compute, storage, memory, and networking options. This will allow us to give you access to the latest technology very quickly, giving you the ability to choose the instance type that is the best match for your applications.

Today, I would like to tell you about three new instance types that are in the works and that will be available soon:

Z1d – Compute-intensive instances running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They are ideal for Electronic Design Automation (EDA) and relational database workloads, and are also a great fit for several kinds of HPC workloads.

R5 – Memory-optimized instances running at up to 3.1 GHz powered by sustained all-core Turbo Boost, with up to 50% more vCPUs and 60% more memory than R4 instances.

R5d – Memory-optimized instances equipped with local NVMe storage (up to 3.6 TB for the largest R5d instance), and will be available in the same sizes and with the same specs as the R5 instances.

We are also planning to launch R5 Bare Metal, R5d Bare Metal, and Z1d Bare Metal instances. As is the case with the existing i3.metal instances, you will be able to access low-level hardware features and to run applications that are not licensed or supported in virtualized environments.

Z1d Instances
The Z1d instances are designed for applications that can benefit from extremely high per-core performance. These include:

Electronic Design Automation – As chips become smaller and denser, the amount of compute power needed to design and verify the chips increases non-linearly. Semiconductor customers deploy jobs that span thousands of cores; having access to faster cores reduces turnaround time for each job and can also lead to a measurable reduction in software licensing costs.

HPC – In the financial services world, jobs that run analyses or compute risks also benefit from faster cores. Manufacturing organizations can run their Finite Element Analysis (FEA) and simulation jobs to completion more quickly.

Relational Database – CPU-bound workloads that run on a database that “features” high per-core license fees will enjoy both cost and performance benefits.

Z1d instances use custom Intel® Xeon® Scalable Processors running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They will be available in 6 sizes, with up to 48 vCPUs, 384 GiB of memory, and 1.8 TB of local NVMe storage. On the network side, they feature ENA networking that will deliver up to 25 Gbps of bandwidth, and are EBS-Optimized by default for up to 14 Gbps of bandwidth. As usual, you can launch them in a Cluster Placement Group to increase throughput and reduce latency. Here are the sizes and specs:

Instance Name vCPUs Memory Local Storage EBS-Optimized Bandwidth Network Bandwidth
z1d.large 2 16 GiB 1 x 75 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.xlarge 4 32 GiB 1 x 150 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.2xlarge 8 64 GiB 1 x 300 GB NVMe SSD 2.333 Gbps Up to 10 Gbps
z1d.3xlarge 12 96 GiB 1 x 450 GB NVMe SSD 3.5 Gbps Up to 10 Gbps
z1d.6xlarge 24 192 GiB 1 x 900 GB NVMe SSD 7.0 Gbps 10 Gbps
z1d.12xlarge 48 384 GiB 2 x 900 GB NVMe SSD 14.0 Gbps 25 Gbps

The instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers. Any AMI that runs on C5 or M5 instances will also run on Z1d instances.

R5 Instances
Building on the earlier generations of memory-intensive instance types (M2, CR1, R3, and R4), the R5 instances are designed to support high-performance databases, distributed in-memory caches, in-memory analytics, and big data analytics. They use custom Intel® Xeon® Platinum 8000 Series (Skylake-SP) processors running at up to 3.1 GHz, again powered by sustained all-core Turbo Boost. The instances will be available in 6 sizes, with up to 96 vCPUs and 768 GiB of memory. Like the Z1d instances, they feature ENA networking and are EBS-Optimized by default, and can be launched in Placement Groups. Here are the sizes and specs:

Instance Name vCPUs Memory EBS-Optimized Bandwidth Network Bandwidth
r5.large 2 16 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.xlarge 4 32 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.2xlarge 8 64 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.4xlarge 16 128 GiB 3.5 Gbps Up to 10 Gbps
r5.12xlarge 48 384 GiB 7.0 Gbps 10 Gbps
r5.24xlarge 96 768 GiB 14.0 Gbps 25 Gbps

Once again, the instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers.

Learn More
The new EC2 instances announced today highlight our plan to continue innovating in order to better meet your needs! I’ll share additional information as soon as it is available.




New – EC2 Compute Instances for AWS Snowball Edge

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-ec2-compute-instances-for-aws-snowball-edge/

I love factories and never miss an opportunity to take a tour. Over the years, I have been lucky enough to watch as raw materials and sub-assemblies are turned into cars, locomotives, memory chips, articulated buses, and more. I’m always impressed by the speed, precision, repeat-ability, and the desire to automate every possible step. On one recent tour, the IT manager told me that he wanted to be able to set up and centrally manage the global collection of on-premises industrialized PCs that monitor their machinery as easily and as efficiently as he does their EC2 instances and other cloud resources.

Today we are making that manager’s dream a reality, with the introduction of EC2 instances that run on AWS Snowball Edge devices! These ruggedized devices, with 100 TB of local storage, can be used to collect and process data in hostile environments with limited or non-existent Internet connections before shipping the processed data back to AWS for storage, aggregation, and detailed analysis. Here are the instance specs:

Instance Name vCPUs Memory
sbe1.small 1 1 GiB
sbe1.medium 1 2 GiB
sbe1.large 2 4 GiB
sbe1.xlarge 4 8 GiB
sbe1.2xlarge 8 16 GiB
sbe1.4xlarge 16 32 GiB

Each Snowball Edge device is powered by an Intel® Xeon® D processor running at 1.8 GHz, and supports any combination instances that consume up to 24 vCPUs and 32 GiB of memory. You can build and test AMIs (Amazon Machine Images) in the cloud and then preload them onto the device as part of the ordering process (I’ll show you how in just a minute). You can use the EC2-compatible endpoint exposed by each device to programmatically start, stop, resume, and terminate instances. This allows you to use the existing CLI commands and to build tools and scripts to manage fleets of devices. It also allows you to take advantage of your existing EC2 skills and knowledge, and to put them to good use in a new environment.

There are three main setup steps:

  1. Creating a suitable AMI.
  2. Ordering a Snowball Edge Device.
  3. Connecting and Configuring the Device.

Let’s take an in-depth look at the first two steps. Time was tight and I didn’t have time to get hands-on experience with an actual device, so the third step will have to wait for another time.

Creating a Suitable AMI
I have the ability to choose up to 10 AMIs that will be preloaded onto the device. The AMIs must be owned by my AWS account, and must be based on one of the following Marketplace AMIs:

These AMIs have been tested for use on Snowball Edge devices and can be used as a starting point for customization. We will be adding additional options over time, so let us know what you need.

I decided to start with the newest Ubuntu AMI, and launch it on an M5 instance, taking care to specify the SSH keypair that I will eventually use to connect to the instance from my terminal client:

After my instance launches, I connect to it, customize it as desired for use on my device, and then return to the EC2 Console to create an AMI. I select the running instance, choose Create Image from the Actions menu, specify the details, and click Create Image:

The size of the root volume will determine how much of the device’s SSD storage is allocated to the instance when it launches. A total of one TB of space is available for all running instances, so keep your local file storage needs in mind as your analyze your use case and set up your AMIs. Also, Snowball Edge devices cannot make use of additional EBS volumes, so don’t bother including them in your AMI. My AMI is ready within minutes (To learn more about how to create AMIs, read Creating an Amazon EBS-Backed Linux AMI):

Now I am ready to order my first device!

Ordering a Snowball Edge Device
The ordering procedure lets me designate a shipping address and specify how I would like my Snowball device to be configured. I open the AWS Snowball Console and click Create job:

I specify the job type (they all support EC2 compute instances):

Then I select my shipping address, entering a new one if necessary (come and visit me):

Next, I define my job. I give it a name (SJ1), select the 100 TB device, and pick the S3 bucket that will receive data when the device is returned to AWS:

Now comes the fun part! I click Enable compute with EC2 and select the AMIs to be loaded on the Snowball Edge:

I click Add an AMI and find the one that I created earlier:

I can add up to ten AMIs to my job, but will stop at one for this post:

Next, I set up my IAM role and configure encryption:

Then I configure the optional SNS notifications. I can choose to receive notification for a wide variety of job status values:

My job is almost ready! I review the settings and click Create job to create it:

Connecting and Configuring the Device
After I create the job, I wait until my Snowball Edge device arrives. I connect it to my network, power it on, and then unlock it using my manifest and device code, as detailed in Unlock the Snowball Edge. Then I configure my EC2 CLI to use the EC2 endpoint on the device and launch an instance. Since I configured my AMI for SSH access, I can connect to it as if it were an EC2 instance in the cloud.

Things to Know
Here are a couple of things to keep in mind:

Long-Term Usage – You can keep the Snowball Edge devices on your premises and hard at work for as long as you would like. You’ll be billed for a one-time setup fee for each job; after 10 days you will pay an additional, per-day fee for each device. If you want to keep a device for an extended period of time, you can also pay upfront as part of a one or three year commitment.

Dev/Test – You should be able to do much of your development and testing on an EC2 instance running in the cloud; some of our early users are working in this way as part of a “Digital Twin” strategy.

S3 Access – Each Snowball Edge device includes an S3-compatible endpoint that you can access from your on-device code. You can also make use of existing S3 tools and applications.

Now Available
You can start ordering devices today and make use of this exciting new AWS feature right away.




New – Lifecycle Management for Amazon EBS Snapshots

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-lifecycle-management-for-amazon-ebs-snapshots/

It is always interesting to zoom in on the history of a single AWS service or feature and watch how it has evolved over time in response to customer feedback. For example, Amazon Elastic Block Store (EBS) launched a decade ago and has been gaining more features and functionality every since. Here are a few of the most significant announcements:

Several of the items that I chose to highlight above make EBS snapshots more useful and more flexible. As you may already know, it is easy to create snapshots. Each snapshot is a point-in-time copy of the blocks that have changed since the previous snapshot, with automatic management to ensure that only the data unique to a snapshot is removed when it is deleted. This incremental model reduces your costs and minimizes the time needed to create a snapshot.

Because snapshots are so easy to create and use, our customers create a lot of them, and make great use of tags to categorize, organize, and manage them. Going back to my list, you can see that we have added multiple tagging features over the years.

Lifecycle Management – The Amazon Data Lifecycle Manager
We want to make it even easier for you to create, use, and benefit from EBS snapshots! Today we are launching Amazon Data Lifecycle Manager to automate the creation, retention, and deletion of Amazon EBS volume snapshots. Instead of creating snapshots manually and deleting them in the same way (or building a tool to do it for you), you simply create a policy, indicating (via tags) which volumes are to be snapshotted, set a retention model, fill in a few other details, and let Data Lifecycle Manager do the rest. Data Lifecycle Manager is powered by tags, so you should start by setting up a clear and comprehensive tagging model for your organization (refer to the links above to learn more).

It turns out that many of our customers have invested in tools to automate the creation of snapshots, but have skimped on the retention and deletion. Sooner or later they receive a surprisingly large AWS bill and find that their scripts are not working as expected. The Data Lifecycle Manager should help them to save money and to be able to rest assured that their snapshots are being managed as expected.

Creating and Using a Lifecycle Policy
Data Lifecycle Manager uses lifecycle policies to figure out when to run, which volumes to snapshot, and how long to keep the snapshots around. You can create the policies in the AWS Management Console, from the AWS Command Line Interface (CLI) or via the Data Lifecycle Manager APIs; I’ll use the Console today. Here are my EBS volumes, all suitably tagged with a department:

I access the Lifecycle Manager from the Elastic Block Store section of the menu:

Then I click Create Snapshot Lifecycle Policy to proceed:

Then I create my first policy:

I use tags to specify the volumes that the policy applies to. If I specify multiple tags, then the policy applies to volumes that have any of the tags:

I can create snapshots at 12 or 24 hour intervals, and I can specify the desired snapshot time. Snapshot creation will start no more than an hour after this time, with completion based on the size of the volume and the degree of change since the last snapshot.

I can use the built-in default IAM role or I can create one of my own. If I use my own role, I need to enable the EC2 snapshot operations and all of the DLM (Data Lifecycle Manager) operations; read the docs to learn more.

Newly created snapshots will be tagged with the aws:dlm:lifecycle-policy-id and  aws:dlm:lifecycle-schedule-name automatically; I can also specify up to 50 additional key/value pairs for each policy:

I can see all of my policies at a glance:

I took a short break and came back to find that the first set of snapshots had been created, as expected (I configured the console to show the two tags created on the snapshots):

Things to Know
Here are a couple of things to keep in mind when you start to use Data Lifecycle Manager to automate your snapshot management:

Data Consistency – Snapshots will contain the data from all completed I/O operations, also known as crash consistent.

Pricing – You can create and use Data Lifecyle Manager policies at no charge; you pay the usual storage charges for the EBS snapshots that it creates.

Availability – Data Lifecycle Manager is available in the US East (N. Virginia), US West (Oregon), and EU (Ireland) Regions.

Tags and Policies – If a volume has more than one tag and the tags match multiple policies, each policy will create a separate snapshot and both policies will govern the retention. No two policies can specify the same key/value pair for a tag.

Programmatic Access – You can create and manage policies programmatically! Take a look at the CreateLifecyclePolicy, GetLifecyclePolicies, and UpdateLifeCyclePolicy functions to get started. You can also write an AWS Lambda function in response to the createSnapshot event.

Error Handling – Data Lifecycle Manager generates a “DLM Policy State Change” event if a policy enters the error state.

In the Works – As you might have guessed from the name, we plan to add support for additional AWS data sources over time. We also plan to support policies that will let you do weekly and monthly snapshots, and also expect to give you additional scheduling flexibility.


Amazon EC2 Update – Additional Instance Types, Nitro System, and CPU Options

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-update-additional-instance-types-nitro-system-and-cpu-options/

I have a backlog of EC2 updates to share with you. We’ve been releasing new features and instance types at a rapid clip and it is time to catch up. Here’s a quick peek at where we are and where we are going…

Additional Instance Types
Here’s a quick recap of the most recent EC2 instance type announcements:

Compute-Intensive – The compute-intensive C5d instances provide a 25% to 50% performance improvement over the C4 instances. They are available in 5 regions and offer up to 72 vCPUs, 144 GiB of memory, and 1.8 TB of local NVMe storage.

General Purpose – The general purpose M5d instances are also available in 5 regions. They offer up to 96 vCPUs, 384 GiB of memory, and 3.6 TB of local NVMe storage.

Bare Metal – The i3.metal instances became generally available in 5 regions a couple of weeks ago. You can run performance analysis tools that are hardware-dependent, workloads that require direct access to bare-metal infrastructure, applications that need to run in non-virtualized environments for licensing or support reasons, and container environments such as Clear Containers, while you take advantage of AWS features such as Elastic Block Store (EBS), Elastic Load Balancing, and Virtual Private Clouds. Bare metal instances with 6 TB, 9 TB, 12 TB, and more memory are in the works, all designed specifically for SAP HANA and other in-memory workloads.

Innovation and the Nitro System
The Nitro system is a rich collection of building blocks that can be assembled in many different ways, giving us the flexibility to design and rapidly deliver EC2 instance types with an ever-broadening selection of compute, storage, memory, and networking options. We will deliver new instance types more quickly than ever in the months to come, with the goal of helping you to build, migrate, and run even more types of workloads.

Local NVMe Storage – The new C5d, M5d, and bare metal EC2 instances feature our Nitro local NVMe storage building block, which is also used in the Xen-virtualized I3 and F1 instances. This building block provides direct access to high-speed local storage over a PCI interface and transparently encrypts all data using dedicated hardware. It also provides hardware-level isolation between storage devices and EC2 instances so that bare metal instances can benefit from local NVMe storage.

Nitro Security Chip – A component that is part of our AWS server designs that continuously monitors and protects hardware resources and independently verifies firmware each time a system boots.

Nitro Hypervisor – A thin, quiescent hypervisor that manages memory and CPU allocation, and delivers performance that is indistinguishable from bare metal for most workloads (Brendan Gregg of Netflix benchmarked it at less than 1%).

Networking – Hardware support for the software defined network inside of each Virtual Private Cloud (VPC), Enhanced Networking, and Elastic Network Adapter.

Elastic Block Storage – Hardware EBS processing including CPU-intensive cryptographic operations.

Moving storage, networking, and security functions to hardware has important consequences for both bare metal and virtualized instance types:

Virtualized instances can make just about all of the host’s CPU power and memory available to the guest operating systems since the hypervisor plays a greatly diminished role.

Bare metal instances have full access to the hardware, but also have the same the flexibility and feature set as virtualized EC2 instances including CloudWatch metrics, EBS, and VPC.

To learn more about the hardware and software that make up the Nitro system, watch Amazon EC2 Bare Metal Instances or C5 Instances and the Evolution of Amazon EC2 Virtualization and take a look at The Nitro Project: Next-Generation EC2 Infrastructure.

CPU Options
This feature provides you with additional control over your EC2 instances and lets you optimize your instance for a particular workload. First, you can specify the desired number of vCPUs at launch time. This allows you to control the vCPU to memory ratio for Oracle and SQL Server workloads that need high memory, storage, and I/O but perform well with a low vCPU count. As a result, you can optimize your vCPU-based licensing costs when you Bring Your Own License (BYOL). Second, you can disable Intel® Hyper-Threading Technology (Intel® HT Technology) on instances that run compute-intensive workloads. These workloads sometimes exhibit diminished performance when Intel HT is enabled. Both of these options are available when you launch an instance using the AWS Command Line Interface (CLI) or one of the AWS SDKs. You simply specify the total number of cores and the number of threads per core using values chosen from the CPU Cores and Threads per CPU Core Per Instance Type table. Here’s how you would launch an instance with 6 CPU cores and Intel® HT Technology disabled:

$ aws ec2 run-instances --image-id ami-1a2b3c4d --instance-type r4.4xlarge --cpu-options "CoreCount=6,ThreadsPerCore=1"

To learn more, read about Optimizing CPU Options.

Help Wanted
The EC2 team is always hiring! Here are a few of their open positions:


Deploying a 4K, GPU-backed Linux desktop instance on AWS

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/deploying-4k-gpu-backed-linux-desktop-instance-on-aws/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

AWS currently supports many managed des­ktop delivery mechanisms. Amazon WorkSpaces and Amazon AppStream 2.0 both deliver managed Windows-based machine images with GPU-backed instances. However, many desktop services and applications are better served through a Linux backed instance. Given the variety of Linux distributions as well as desktop managers, it can be valuable to have a generic solution for provisioning a Linux desktop on Amazon EC2.

A GPU-backed instance reduces the computational requirements from the client (local) machine, eliminating the need for a local discrete GPU to run graphical workloads. The framebuffer objects generated by the GPU are compressed when sent over the network, and decompressed by the local CPU resources. This allows clients to take advantage of the server GPU and display the high-resolution content on local thin clients, mobile devices, and low-powered desktops and laptops. Such GPU-backed Linux instances have been used for VFX rendering, computational drug discovery, and computational fluid dynamics (CFD) simulation use cases. An upcoming followup post details enabling this technology on the Windows platform.


In this configuration, a client machine connects to the provisioned desktop (server) in the cloud. The server captures the framebuffer, which is sent in real time to the client machine over the network. Thus latency is an important metric to consider when provisioning this solution. I recommend choosing the nearest AWS Region (under 100 ms). Some customers may even prefer to install AWS Direct Connect.

Region Latency
US-East (Virginia) 18 ms
US East (Ohio) 31 ms
US-West (California) 77 ms
US-West (Oregon) 97 ms
Canada (Central) 29 ms
Europe (Ireland) 89 ms
Europe (London) 90 ms
Europe (Frankfurt) 108 ms
Asia Pacific (Mumbai) 197 ms
Asia Pacific (Seoul) 198 ms
Asia Pacific (Singapore) 288 ms
Asia Pacific (Sydney) 218 ms
Asia Pacific (Tokyo) 188 ms
South America (São Paulo) 138 ms
China (Beijing) 267 ms
AWS GovCloud (US) 97 ms

Source: http://www.cloudping.info/ from the Amazon offices located in Herndon, VA

Bandwidth requirements depend on the quality of the desktop experience as well as the desired resolution. Provision the backend Linux desktop instance with a 4096×2160 (4K) resolution. Depending on the specific G3 instance type selected, multi-GPU managed desktops give additional performance benefits. Each instance can also host multiple users, either in collaborative sessions, or with up to four independent 4K monitors. The GPU framebuffer memory used per session generally limits the number of sessions per managed desktop.

A smooth reliable experience depends on a low latency and high-bandwidth connection to the EC2 instance hosting the desktop. One of the benefits of using a multithreaded framebuffer reader is that only the defined block of the rendered desktop that is changing needs to be sent over the network. Full-screen redraws may be necessary only in rare cases. The minimum requirements for this 4K (3840×2160) configuration are as follows:

  • Bandwidth: 50 Mbps
  • Latency: < 30 ms
  • Jitter: < 5 ms


Use RHEL/CentOS for the deployment. Except for DCV, this stack is compatible with Debian/Ubuntu distributions. Use the CentOS 7.5 Server AMI and install the NVIDIA/Xorg/KDE stack  to create a fully functioning desktop environment with a max resolution of 16384 x 8640 (that is, 4x4K) at 60 Hz.

This stack contains the following software:

  • CentOS 7.5 Base
  • Xorg 1.19
  • NVIDIA Grid Driver 6.1 (for the G3 instance family)
  • KDE Desktop environment
  • VirtualGL
  • TurboVNC

To make the most efficient use of the NVIDIA Tesla M60 framebuffer memory, disable the compositing features of the desktop manager. Other non-compositing desktop managers (such as XFCE, MATE, etc.) are supported as well. This ensures that the GPU is reserved for specific OpenGL API tasks for the application, and that the performance is not impacted by the desktop environment decorations.

Start up a CentOS 7.5 server desktop based on the latest AMI available in the closest Region:

Distributor ID:    CentOS
Description:       CentOS Linux release 7.5.1804 (Core)
Release:           7.5.1804
Codename:          Core

Now install the Xorg stack with the KDE desktop manager:

sudo yum install epel-release
sudo yum update
sudo yum groupinstall "Development Tools"
sudo yum install xorg-* kernel-devel dkms python-pip lsb
sudo pip install awscli
sudo yum groupinstall "KDE Plasma Workspaces"
sudo systemctl disable firewalld #AWS security groups will provide our firewall rules
# if there is a kernel update
sudo reboot

Download the NVIDIA Grid driver (6.1). For more information, see Installing the NVIDIA Driver on Linux Instances.

aws s3 cp --recursive s3://ec2-linux-nvidia-drivers/ .
chmod +x latest/NVIDIA-Linux-x86_64-390.57-grid.run
sudo .latest/NVIDIA-Linux-x86_64-390.57-grid.run
# register the driver with dkms, ignore errors associated with 32bit compatible libraries

Deposit the xorg.conf file in /etc/X11/xorg.conf:

Section "ServerLayout"
        Identifier     "X.org Configured"
        Screen      0  "Screen0" 0 0
        InputDevice    "Mouse0" "CorePointer"
        InputDevice    "Keyboard0" "CoreKeyboard"
Section "Files"
        ModulePath   "/usr/lib64/xorg/modules"
        FontPath     "catalogue:/etc/X11/fontpath.d"
        FontPath     "built-ins"
Section "Module"
        Load  "glx"
Section "InputDevice"
        Identifier  "Keyboard0"
        Driver      "kbd"
Section "InputDevice"
        Identifier  "Mouse0"
        Driver      "mouse"
        Option      "Protocol" "auto"
        Option      "Device" "/dev/input/mice"
        Option      "ZAxisMapping" "4 5 6 7"
Section "Monitor"
        Identifier   "Monitor0"
        VendorName   "Monitor Vendor"
        ModelName    "Monitor Model"
        Modeline "3840x2160_60.00"  712.34  3840 4152 4576 5312  2160 2161 2164 2235  -HSync +Vsync

Section "Device"
        Identifier  "Card0"
        Driver      "nvidia"
        Option "ConnectToAcpid" "0"
        BusID       "PCI:0:30:0"
Section "Screen"
        Identifier "Screen0"
        Device     "Card0"
        Monitor    "Monitor0"
        SubSection "Display"
                Viewport   0 0
                Depth     24
        Modes    "4096x2160" "3840x2160"

Reboot again and check that the nvidia-gridd service is running. You may notice errors. They can be safely ignored after the nvidia-gridd service successfully acquires a license.

[[email protected] ~]# systemctl status nvidia-gridd.service
● nvidia-gridd.service - NVIDIA Grid Daemon
   Loaded: loaded (/usr/lib/systemd/system/nvidia-gridd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-05-29 18:37:35 UTC; 39s ago
  Process: 863 ExecStart=/usr/bin/nvidia-gridd (code=exited, status=0/SUCCESS)
 Main PID: 881 (nvidia-gridd)
   CGroup: /system.slice/nvidia-gridd.service
           └─881 /usr/bin/nvidia-gridd
May 29 18:37:35 ip-10-0-125-164.ec2.internal systemd[1]: Starting NVIDIA Grid Daemon...
May 29 18:37:35 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Started (881)
May 29 18:37:35 ip-10-0-125-164.ec2.internal systemd[1]: Started NVIDIA Grid Daemon.
May 29 18:37:36 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Configuration parameter ( ServerAddress  FeatureType) not set
May 29 18:37:40 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Calling load_byte_array(tra)
May 29 18:37:41 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: License acquired successfully (2)

You can confirm that 4K resolution is enabled by running the following command:

DISPLAY=:0 xrandr -q
Screen 0: minimum 8 x 8, current 4096 x 2160, maximum 16384 x 8640
DVI-D-0 connected primary 4096x2160+0+0 (normal left inverted right x axis y axis) 641mm x 400mm
2560x1600 59.86+
4096x2160 60.03*
3840x2160 60.00 

Finally, check that your underlying GL renderer is using the NVIDIA driver by querying glxinfo

DISPLAY=:0 glxinfo

OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Quadro FX Tesla M60/PCIe/SSE2
OpenGL core profile version string: 4.5.0 NVIDIA 390.57
OpenGL core profile shading language version string: 4.50 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6.0 NVIDIA 390.57
OpenGL shading language version string: 4.60 NVIDIA

At the time of publication, OpenGL 4.5 is enabled. Your applications can take advantage of that API for rendering.

To interact with the instance, install server-side desktop remote display software that can specifically take advantage of the 3D hardware acceleration. For example, AWS provides the NICE DCV platform.

DCV is an accelerated remote desktop framework that provides in-web browser desktop connections. DCV is supported in both Windows and Linux (RHEL/CentOS). In the Windows platform, OpenGL and DirectX are fully supported. DCV entitlement is free when provisioning on AWS. NICE DCV is also provided as a component to the AWS EnginFrame and myHPC solutions.

To install DCV, download the NICE DCV 2017 EL7 archive and Administrative Guide. After you extract the archive in the instance, you see a list of nice-* RPMS. You don’t have to worry about licensing, as the installer captures that the instance is running in AWS.

sudo yum localinstall nice-*
sudo systemctl enable dcvserver
sudo systemctl start dcvserver

When the DCV server starts, you have the option to create a single console session or multiple virtual sessions. You must assign a password for the CentOS user issued, by running the following command:

sudo passwd centos

Start the console session:

sudo dcv create-session --type=console --owner centos session1
sudo dcv list-sessions

The AWS security groups are enabled to allow TCP 8443 traffic to the instance. You see the DCV login portal and can interact with the instance. Other popular frameworks include the following:

You can also find plug and play images for managed desktops in the AWS Marketplace.


Implement the changes outlined in the Optimizing GPU Settings (P2, P3, and G3 Instances) topic. You can turn off the autoboost feature and set the maximum graphics and memory clocks manually.

sudo nvidia-smi --auto-boost-default=0
sudo nvidia-smi -ac 2505,1177

Application testing

For testing, look at PyMOL (PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.). PyMOL is a standard commercial drug discovery application that is used for processing, and visualizing biochemical structures.  I used the opensource fork.

With the NVIDIA GRID licensing enabled earlier, PyMOL can take advantage of the Quadro features supplied by the Tesla M60. After it’s installed and loaded, you can confirm the functionality of the entire G3 instance software stack installed earlier:

PyMOL(TM) Molecular Graphics System, Version 2.1.0.
 Copyright (c) Schrodinger, LLC.
 All Rights Reserved.
    Created by Warren L. DeLano, Ph.D. 
    PyMOL is user-supported open-source software.  Although some versions
    are freely available, PyMOL is not in the public domain.
    If PyMOL is helpful in your work or study, then please volunteer 
    support for our ongoing efforts to create open and affordable scientific
    software by purchasing a PyMOL Maintenance and/or Support subscription.

    More information can be found at "http://www.pymol.org".
    Enter "help" for a list of commands.
    Enter "help <command-name>" for information on a specific command.

 Hit ESC anytime to toggle between text and graphics.

 Detected OpenGL version 2.0 or greater. Shaders available.
 Detected GLSL version 4.60.
 OpenGL graphics engine:
  GL_VENDOR:   NVIDIA Corporation
  GL_RENDERER: Quadro FX Tesla M60/PCIe/SSE2
  GL_VERSION:  4.6.0 NVIDIA 390.57
 Adapting to Quadro hardware.
 Detected 16 CPU cores.  Enabled multithreaded rendering.

In the PyMOL window, run “fetch 5ta3”, which is a 39k amino acid protein, under the 4K desktop environment. Rotating and translating the protein should be smooth and respond quickly to pointer events.

The PyMOL Gallery contains other representative examples that take advantage of various visualization and processing workflows. Also, you can find many demos (choose Wizard, Demo).

Under the Sculpting demo, you can show the pointer latency between the client and server.

Finally, look at ray tracing. From the PyMOL wiki, take a chemical structure and render each frame with ray tracing to produce a video. On the Tesla M60 with Quadro features enabled, the total render time was approximately 1 minute.


As I mentioned previously, the framebuffer redirection protocols have a feature set to create multiple virtual sessions per node. A virtual session is not necessarily tied to a single user either. In other words, the number of independent virtual sessions is limited by the total amount of GPU frame buffer memory used in all sessions per GPU. Thus, it’s possible to scale horizontally by increasing the number of G3 instances, or vertically by using larger instance types in the G3 family.


The G3 instance type is purpose-built to provide a managed, high-end professional graphics infrastructure for visual computing needs. With NICE DCV, you can take advantage of NVIDIA Quadro software features for a range of applications including drug discovery and VFX rendering. Connected with the AWS high-performance network backbone, the instance can become an integral part of your graphics workload pipeline. Now, you can power up and deliver your applications to teams working anywhere in the world.

EC2 Instance Update – M5 Instances with Local NVMe Storage (M5d)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/ec2-instance-update-m5-instances-with-local-nvme-storage-m5d/

Earlier this month we launched the C5 Instances with Local NVMe Storage and I told you that we would be doing the same for additional instance types in the near future!

Today we are introducing M5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for workloads that require a balance of compute and memory resources. Here are the specs:

Instance Name vCPUs RAM Local Storage EBS-Optimized Bandwidth Network Bandwidth
m5d.large 2 8 GiB 1 x 75 GB NVMe SSD Up to 2.120 Gbps Up to 10 Gbps
m5d.xlarge 4 16 GiB 1 x 150 GB NVMe SSD Up to 2.120 Gbps Up to 10 Gbps
m5d.2xlarge 8 32 GiB 1 x 300 GB NVMe SSD Up to 2.120 Gbps Up to 10 Gbps
m5d.4xlarge 16 64 GiB 1 x 600 GB NVMe SSD 2.210 Gbps Up to 10 Gbps
m5d.12xlarge 48 192 GiB 2 x 900 GB NVMe SSD 5.0 Gbps 10 Gbps
m5d.24xlarge 96 384 GiB 4 x 900 GB NVMe SSD 10.0 Gbps 25 Gbps

The M5d instances are powered by Custom Intel® Xeon® Platinum 8175M series processors running at 2.5 GHz, including support for AVX-512.

You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.

Here are a couple of things to keep in mind about the local NVMe storage on the M5d instances:

Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.

Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.

Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.

Available Now
M5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent M5 instances.