Tag Archives: Compute

Learn about AWS – November AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-november-aws-online-tech-talks/

AWS Tech Talks

AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. Join us this month to learn about AWS services and solutions. We’ll have experts online to help answer any questions you may have.

Featured this month! Check out the tech talks: Virtual Hands-On Workshop: Amazon Elasticsearch Service – Analyze Your CloudTrail Logs, AWS re:Invent: Know Before You Go and AWS Office Hours: Amazon GuardDuty Tips and Tricks.

Register today!

Note – All sessions are free and in Pacific Time.

Tech talks this month:


AR/VR

November 13, 2018 | 11:00 AM – 12:00 PM PT – How to Create a Chatbot Using Amazon Sumerian and Sumerian Hosts – Learn how to quickly and easily create a chatbot using Amazon Sumerian & Sumerian Hosts.


Compute

November 19, 2018 | 11:00 AM – 12:00 PM PT – Using Amazon Lightsail to Create a Database – Learn how to set up a database on your Amazon Lightsail instance for your applications or stand-alone websites.

November 21, 2018 | 09:00 AM – 10:00 AM PT – Save up to 90% on CI/CD Workloads with Amazon EC2 Spot Instances – Learn how to automatically scale a fleet of Spot Instances with Jenkins and EC2 Spot Plug-In.

Containers

November 13, 2018 | 09:00 AM – 10:00 AM PT – Customer Showcase: How Portal Finance Scaled Their Containerized Application Seamlessly with AWS Fargate – Learn how to scale your containerized applications without managing servers and clusters, using AWS Fargate.

November 14, 2018 | 11:00 AM – 12:00 PM PT – Customer Showcase: How 99designs Used AWS Fargate and Datadog to Manage their Containerized Application – Learn how 99designs scales their containerized applications using AWS Fargate.

November 21, 2018 | 11:00 AM – 12:00 PM PT – Monitor the World: Meaningful Metrics for Containerized Apps and Clusters – Learn about metrics and tools you need to monitor your Kubernetes applications on AWS.

Data Lakes & Analytics

November 12, 2018 | 01:00 PM – 01:45 PM PT – Search Your DynamoDB Data with Amazon Elasticsearch Service – Learn the joint power of Amazon Elasticsearch Service and DynamoDB and how to set up your DynamoDB tables and streams to replicate your data to Amazon Elasticsearch Service.

November 13, 2018 | 01:00 PM – 01:45 PM PT – Virtual Hands-On Workshop: Amazon Elasticsearch Service – Analyze Your CloudTrail Logs – Get hands-on experience and learn how to ingest and analyze CloudTrail logs using Amazon Elasticsearch Service.

November 14, 2018 | 01:00 PM – 01:45 PM PT – Best Practices for Migrating Big Data Workloads to AWS – Learn how to migrate analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premises deployments to AWS.

November 15, 2018 | 11:00 AM – 11:45 AM PT – Best Practices for Scaling Amazon Redshift – Learn about the most common scalability pain points with analytics platforms and see how Amazon Redshift can quickly scale to fulfill growing analytical needs and data volume.

Databases

November 12, 2018 | 11:00 AM – 11:45 AM PT – Modernize your SQL Server 2008/R2 Databases with AWS Database Services – As the end of extended support for SQL Server 2008/R2 nears, learn how AWS’s portfolio of fully managed, cost-effective databases and easy-to-use migration tools can help.

DevOps

November 16, 2018 | 09:00 AM – 09:45 AM PT – Build and Orchestrate Serverless Applications on AWS with PowerShell – Learn how to build and orchestrate serverless applications on AWS with AWS Lambda and PowerShell.

End-User Computing

November 19, 2018 | 01:00 PM – 02:00 PM PT – Work Without Workstations with AppStream 2.0 – Learn how to work without workstations and accelerate your engineering workflows using AppStream 2.0.

Enterprise & Hybrid

November 19, 2018 | 09:00 AM – 10:00 AM PT – Enterprise DevOps: New Patterns of Efficiency – Learn how to implement “Enterprise DevOps” in your organization through building a culture of inclusion, common sense, and continuous improvement.

November 20, 2018 | 11:00 AM – 11:45 AM PT – Are Your Workloads Well-Architected? – Learn how to measure and improve your workloads with AWS Well-Architected best practices.

IoT

November 16, 2018 | 01:00 PM – 02:00 PM PT – Pushing Intelligence to the Edge in Industrial Applications – Learn how GE uses AWS IoT for industrial use cases, including 3D printing and aviation.

Machine Learning

November 12, 2018 | 09:00 AM – 09:45 AM PT – Automate for Efficiency with Amazon Transcribe and Amazon Translate – Learn how you can increase the efficiency and reach of your operations with Amazon Translate and Amazon Transcribe.

Mobile

November 20, 2018 | 01:00 PM – 02:00 PM PT – GraphQL Deep Dive – Designing Schemas and Automating Deployment – Get an overview of the basics of how GraphQL works and dive into different schema designs, best practices, and considerations for providing data to your applications in production.

re:Invent

November 9, 2018 | 08:00 AM – 08:30 AM PT – Episode 7: Getting Around the re:Invent Campus – Learn how to efficiently get around the re:Invent campus using our new mobile app technology. Make sure you arrive on time and never miss a session.

November 14, 2018 | 08:00 AM – 08:30 AM PT – Episode 8: Know Before You Go – Learn about all the final details you need to know before you arrive in Las Vegas for AWS re:Invent!

Security, Identity & Compliance

November 16, 2018 | 11:00 AM – 12:00 PM PT – AWS Office Hours: Amazon GuardDuty Tips and Tricks – Join us for office hours and get the latest tips and tricks for Amazon GuardDuty from AWS Security experts.

Serverless

November 14, 2018 | 09:00 AM – 10:00 AM PT – Serverless Workflows for the Enterprise – Learn how to seamlessly build and deploy serverless applications across multiple teams in large organizations.

Storage

November 15, 2018 | 01:00 PM – 01:45 PM PT – Move From Tape Backups to AWS in 30 Minutes – Learn how to switch to cloud backups easily with AWS Storage Gateway.

November 20, 2018 | 09:00 AM – 10:00 AM PT – Deep Dive on Amazon S3 Security and Management – Amazon S3 provides some of the most enhanced data security features available in the cloud today, including access controls, encryption, security monitoring, remediation, and security standards and compliance certifications.

Re-affirming Long-Term Support for Java in Amazon Linux

Post Syndicated from Deepak Singh original https://aws.amazon.com/blogs/compute/re-affirming-long-term-support-for-java-in-amazon-linux/

In light of Oracle’s recent announcement indicating an end to free long-term support for OpenJDK after January 2019, we re-affirm that the OpenJDK 8 and OpenJDK 11 Java runtimes in Amazon Linux 2 will continue to receive free long-term support from Amazon until at least June 30, 2023. We are collaborating and contributing in the OpenJDK community to provide our customers with a free long-term supported Java runtime.

In addition, Amazon Linux AMI 2018.03, the last major release of Amazon Linux AMI, will receive support for the OpenJDK 8 runtime at least until June 30, 2020, to facilitate migration to Amazon Linux 2. Java runtimes provided by AWS services such as AWS Lambda, Amazon Elastic MapReduce (EMR), and AWS Elastic Beanstalk will also use the AWS-supported OpenJDK builds.

Amazon Linux users will not need to make any changes to get support for OpenJDK 8. OpenJDK 11 will be made available through the Amazon Linux 2 repositories at a future date. The Amazon Linux OpenJDK support posture will also apply to the on-premises virtual machine images and Docker base image of Amazon Linux 2.

Amazon Linux 2 provides a secure, stable, and high-performance execution environment. Amazon Linux AMI and Amazon Linux 2 include a Java runtime based on OpenJDK 8 and are available in all public AWS regions at no additional cost beyond the pricing for Amazon EC2 instance usage.

Deploying a Burstable and Event-driven HPC Cluster on AWS Using SLURM, Part 2

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/deploy-a-burstable-and-event-driven-hpc-cluster-on-aws-using-slurm-part-2/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

In part 1 of this series, you deployed the base components to create the HPC cluster. This unique deployment stands up the SLURM headnode. For every job submitted to the queue, the headnode provisions the needed compute resources to run the job, based on job submission parameters.

By provisioning the compute nodes dynamically, you can immediately see the benefit of elasticity, scale, and optimized operational compute costs. As new technologies are released, you can take advantage of heterogeneous deployments, such as scaling high, tightly coupled, CPU-bound workloads independently from high memory or distributed GPU-based workloads.

To further extend a cloud-native approach to designing HPC architectures, you can integrate with existing AWS services and provide additional benefits by abstracting the underlying compute resources. It is possible for the HPC cluster to be event-driven in response to requests from a web application or from direct API calls.

Additional frontend components can be added to take advantage of an API-instantiated execution of an HPC workload. The following reference architecture describes the pattern.


The difference from the previous reference architecture in Part 1 is that the user submits the job described as JSON through an HTTP call to Amazon API Gateway, which is then processed by an AWS Lambda function to submit the job.


I recommend that you start this section after completing the deployment in Part 1. Write down the private IP address of the SLURM controller.

In the Amazon EC2 console, select the SLURM headnode and retrieve the private IPv4 address. In the Lambda console, create a new function based on Python 2.7 authored from scratch.

Under the environment variables, add new entries for “HEADNODE”, “SLURM_BUCKET_S3”, and “SLURM_KEY_S3”. Set HEADNODE to the private IPv4 address of the SLURM controller noted earlier, and set the other two to the bucket and key pair. This allows the Lambda function to connect to the instance using SSH.
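Inside the function, these entries arrive as ordinary environment variables. The following is a minimal sketch of reading them and failing early when one is missing. The helper name `load_settings` is hypothetical and is not taken from hpc_worker.zip.

```python
import os

# The three variable names come from the Lambda configuration described above.
REQUIRED_VARS = ("HEADNODE", "SLURM_BUCKET_S3", "SLURM_KEY_S3")


def load_settings(env=None):
    """Return the required settings, raising KeyError if one is missing or empty.

    Hypothetical helper for illustration; not part of hpc_worker.zip.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dictionary instead of relying on `os.environ` makes the helper easy to exercise outside Lambda.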

In the AWS GitHub repo that you cloned in part 1, find the lambda/hpc_worker.zip file and upload the contents to the Function Code section of the Lambda function. A derivative of this function was referenced by Puneet Agarwal, in the Scheduling SSH jobs using AWS Lambda post.

The Lambda function needs to launch in the same VPC as the SLURM headnode and have the same security groups as the headnode, because the function connects to the SLURM controller using SSH. Ignore the error about creating the Lambda function across two Availability Zones for high availability (HA).

The default memory settings, with a timeout of 20 seconds, are sufficient. The Lambda execution role needs access to Amazon EC2, Amazon CloudWatch, and Amazon S3.

In the API Gateway console, create a new API from scratch and name it “hpc.” Under Resources, create a new resource as “hpc.” Then, create a new method under the “hpc” resource for POST.

Under the POST method, set the integration method to the Lambda function created earlier.

Under the resource “hpc”, choose to deploy the API for staging, calling the endpoint “dev.” You get an endpoint to execute:

curl -H "Content-Type: application/json" -X POST https://<endpoint>.execute-api.us-west-2.amazonaws.com/dev/hpc -d @test.json

Then, create a JSON file named test.json, which the curl command above sends as the request body, with the following content.

{
    "username": "awsuser",
    "jobname": "hpc_test",
    "nodes": 2,
    "tasks-per-node": 1,
    "cpus-per-task": 4,
    "feature": "us-west-2a|us-west-2b|us-west-2c",
    "io":
        [{"workdir": "/home/centos/job123"},
         {"input": "s3://ar-job-input/test.input"},
         {"output": "s3://ar-job-output"}],
    "launch": "env && sleep 60"
}

Next, in the API Gateway console, watch the following four events happen:

  1. The API gateway passes the input JSON to the Lambda function.
  2. The Lambda function writes out a SLURM sbatch job submission file.
  3. The job is submitted and held until the instance is provisioned.
  4. After the instance is running, the job script executes, copies data from S3, and completes the job.
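Step 2 can be sketched as a small function that turns the job description into a batch script. This is an illustration only: the mapping from JSON fields to sbatch options is an assumption, and the actual hpc_worker function may render the file differently. Only standard sbatch options are used.

```python
def render_sbatch(job):
    """Render a SLURM batch script from a job description dict.

    Hypothetical mapping for illustration; not the code in hpc_worker.zip.
    """
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name={}".format(job["jobname"]),
        "#SBATCH --nodes={}".format(job["nodes"]),
        "#SBATCH --ntasks-per-node={}".format(job["tasks-per-node"]),
        "#SBATCH --cpus-per-task={}".format(job["cpus-per-task"]),
        # Assumption: "feature" is passed through as a node constraint.
        "#SBATCH --constraint={}".format(job["feature"]),
        "",
        job["launch"],
    ]
    return "\n".join(lines) + "\n"


job = {
    "username": "awsuser",
    "jobname": "hpc_test",
    "nodes": 2,
    "tasks-per-node": 1,
    "cpus-per-task": 4,
    "feature": "us-west-2a|us-west-2b|us-west-2c",
    "launch": "env && sleep 60",
}
script = render_sbatch(job)
```

The resulting text is what a headnode would receive through `sbatch`; the data-staging fields of the request are left out of this sketch.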

The response body of the API call returns the job ID.

{
"body": "{\"error\": \"\", \"name\": \"awsuser\", \"jobid\": \"Submitted batch job 5\\n\"}",
"statusCode": 200
}

When the job completes, the instance is held for 60 seconds in case another job is submitted. If no jobs are submitted, the instance is terminated by the SLURM cluster.
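The job ID in the API response is embedded in sbatch's text output, and the body is itself a JSON string. A client that wants the numeric ID could extract it along these lines; the function name `extract_job_id` is hypothetical.

```python
import json
import re


def extract_job_id(response):
    """Return the numeric SLURM job ID from the API response, or None."""
    body = json.loads(response["body"])  # "body" is a JSON document in a string
    match = re.search(r"Submitted batch job (\d+)", body.get("jobid", ""))
    return int(match.group(1)) if match else None


response = {
    "body": "{\"error\": \"\", \"name\": \"awsuser\", \"jobid\": \"Submitted batch job 5\\n\"}",
    "statusCode": 200,
}
job_id = extract_job_id(response)
```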


End-to-end scalable job submission and instance provisioning is one way to execute your HPC workloads in a scalable and elastic fashion. Now, go power your HPC workloads on AWS!

Amazon ECS and Docker volume drivers, part 1: Amazon EBS

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/amazon-ecs-and-docker-volume-drivers-amazon-ebs/

→ Part 2: Amazon EFS


Post by: Jeremy Cowan, Ronnie Eichler, and Tiffany Jernigan


Containers are emerging as the default compute primitive for building cloud-native applications. They facilitate the adoption of continuous delivery and help increase infrastructure utilization.

However, deploying stateful applications as containers has been challenging because containers have short life-spans, get re-deployed frequently, are scaled up and down dynamically, and often share the same host with other containers. All of these factors make it challenging for you to appropriately align the lifecycles of storage volumes and containers.

Before Docker volume driver support was added to Amazon ECS, you had to manage storage volumes manually using custom tooling such as bash scripts, Lambda functions, or manual configuration of Docker volumes. Now, you can take full advantage of the Docker plugin ecosystem by using popular plugins such as REX-Ray or Portworx.

ECS support for Docker volumes means that you can now deploy stateful and storage-intensive use cases. These include:

  • Machine learning and data processing workloads
  • Applications such as GitLab or Jenkins that share a filesystem across multiple tasks
  • Databases such as Cassandra or RocksDB
  • Streaming tools such as Kafka
  • Additional scratch space added to containers that process large workloads and are storage-intensive

To support this broad array of use cases, ECS offers you the flexibility to configure the lifecycle of the Docker volume. For example, you can specify whether it is a scratch space volume specific to a single instantiation of a task, or a persistent volume that persists beyond the lifecycle of a unique instantiation of the task. You can also choose to use a Docker volume that you’ve created before launching your task.

In addition to managing the Docker volume configuration and lifecycle, the ECS scheduler is now plugin-aware. ECS takes the availability of the requested driver into account in its placement decisions, so that tasks that require a certain driver are only placed on container instances that have the driver installed.
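The effect of plugin-aware placement can be pictured as a filter over the cluster. The sketch below uses a deliberately simplified, hypothetical data model (a list of dictionaries with an `id` and a set of `drivers`); ECS tracks this information internally and does not expose it in this shape.

```python
def eligible_instances(instances, required_driver):
    """Keep only the container instances that have the required volume driver.

    `instances` is a hypothetical, simplified view of a cluster.
    """
    return [inst["id"] for inst in instances if required_driver in inst["drivers"]]


cluster = [
    {"id": "instance-a", "drivers": {"local", "rexray/ebs"}},
    {"id": "instance-b", "drivers": {"local"}},
]
candidates = eligible_instances(cluster, "rexray/ebs")
```

A task whose volume uses the rexray/ebs driver would only be considered for `instance-a` in this example.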

Docker and Docker volumes

Docker volumes are a way to persist data outside of the lifecycle of a container. Containers themselves are made up of multiple immutable layers of storage with an ephemeral layer, which is read/write. If your application writes files to the ephemeral layer, these changes are lost when the container stops.

Volumes are managed outside of the container lifecycle: stopping or removing the container does not remove the volume. Docker also supports volume drivers that allow you to use volumes as an abstraction between containers and persistent storage such as Amazon EBS or Amazon EFS. By default, Docker provides a driver called ‘local’ that provides local storage volumes to containers. With Docker plugins, you can now add volume drivers such as REX-Ray, Portworx, and NetShare to provision and manage EBS and EFS storage.

To deploy a stateful application such as Cassandra, MongoDB, Zookeeper, or Kafka, you likely need high-performance persistent storage like EBS. Docker volumes allow you to present an EBS volume to your application as a Docker volume.

There are other applications such as Jenkins and GitLab, where multiple copies of the application need access to the same data. With volume drivers and EFS, you can present EFS as a shared volume to multiple instances of your container so that you can scale your application yet still retain and persist shared data on EFS.

Another overlooked use case involves applications that need scratch space. When you define a task in ECS and your application writes to the filesystem inside of the container (not on a Docker volume), the task consumes space on the underlying EC2 instance that is shared by all other running tasks. This can lead to issues of ‘noisy neighbors’ if a task were to write a bunch of data to /tmp on its local filesystem.

Now with Docker volume support in ECS, you can map an EBS volume to /tmp (or whichever scratch space directory you prefer). You can ensure good performance while limiting the size of the underlying EBS volume by passing arguments to the volume driver in your ECS task definition.

What is REX-Ray?

REX-Ray is just one example of a Docker volume driver plugin that provides an abstraction between Docker volumes and the underlying storage. Built on top of the libStorage framework, REX-Ray’s simplified architecture consists of a single binary. It runs as a stateless service on every host, using a configuration file to orchestrate multiple storage platforms. REX-Ray supports multiple storage backends. For this post, we focus on EBS as a storage backend. Part two of this series focuses on EFS.

Using a plugin such as REX-Ray, your Docker container is able to persist data outside of the lifespan of a running container. You don’t have to worry about the underlying storage. Instead, you simply reference a Docker volume in your task definition and let REX-Ray provide the abstraction. While this post is specific to REX-Ray, ECS is designed to be open and pass through the volume driver arguments from your task definition to Docker. You can use any volume driver (such as Portworx) that is supported by Docker.

Putting it all together

Before you can get started using Docker volumes with ECS, there are a few things you need to do.

First, you need a suitable volume driver plugin, such as REX-Ray, to provide an abstraction between the Docker volume and the underlying storage, for example, EBS or EFS. Docker designed volumes and the associated driver mechanism to be pluggable to support a variety of storage backends. Although we’ve chosen to highlight REX-Ray for this post, there are several others to choose from, including Portworx and NetShare.

Because the volume plugin interacts with the AWS storage services on your behalf, an IAM role has to be assigned to the ECS container instances. This allows REX-Ray to issue the appropriate AWS API calls and perform actions such as attaching and detaching EBS volumes, and so on.
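The exact inline policy lives in the CloudFormation template used later in this post and is not reproduced here. As a rough illustration of what such a policy document looks like, the sketch below builds one in Python. The action list is an assumed subset of EC2 actions related to volumes; check your volume driver's documentation for the set it actually needs.

```python
import json

# Assumed subset of EC2 actions for an EBS volume driver. The policy in the
# CloudFormation template may grant a different or longer list.
EBS_ACTIONS = [
    "ec2:AttachVolume",
    "ec2:CreateVolume",
    "ec2:CreateTags",
    "ec2:DeleteVolume",
    "ec2:DescribeAvailabilityZones",
    "ec2:DescribeInstances",
    "ec2:DescribeVolumes",
    "ec2:DetachVolume",
]


def ebs_driver_policy():
    """Return an IAM policy document as a plain dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": EBS_ACTIONS, "Resource": "*"}
        ],
    }


policy_json = json.dumps(ebs_driver_policy(), indent=2)
```

For production use, consider narrowing `Resource` rather than allowing every volume in the account.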

Using REX-Ray with Amazon EBS

To help you get started, we’ve created an AWS CloudFormation template that builds a two-node ECS cluster.  The template bootstraps the rexray/ebs volume driver onto each node and assigns them an IAM role with an inline policy that allows them to call the API actions that REX-Ray needs.  The template also creates a Network Load Balancer, which is used to expose an ECS service to the internet.

Finally, you create a task definition for a stateful service, MySQL, that uses the rexray/ebs driver. Observe how the volume where MySQL stores its data is moved when the MySQL task is scheduled on another instance in the cluster.

Set up the environment

Here’s how to set up the environment for this walkthrough.

Step 1: Instantiate the AWS CloudFormation template

aws cloudformation create-stack --stack-name rexray-demo \
--capabilities CAPABILITY_NAMED_IAM \
--template-url http://s3.amazonaws.com/ecs-refarch-volume-plugins/rexray-demo.json \
--parameters ParameterKey=KeyName,ParameterValue=<keypair-name>

The ECS container instances are bootstrapped using the following script, which is given as user data in rexray-demo.json.

#!/bin/bash
#open file descriptor for stderr
exec 2>>/var/log/ecs/ecs-agent-install.log
set -x
#verify that the agent is running
until curl -s http://localhost:51678/v1/metadata
do
	sleep 1
done
#install the Docker volume plugin
docker plugin install rexray/ebs REXRAY_PREEMPT=true EBS_REGION=<AWS_REGION> --grant-all-permissions
#restart the ECS agent
stop ecs
start ecs

Step 2: Export output parameters as environment variables

This shell script exports the output parameters from the CloudFormation template and imports them as OS environment variables.  You use these variables later to create task and service definitions.

cat > get-outputs.sh << 'EOF'
#!/bin/bash
function usage {
  echo "usage: source <(./get-outputs.sh <stackname-or-stackid> <region>)"
  echo "stack name or ID must be provided or exported as the CloudFormationStack environment variable"
  echo "region must be provided or set with aws configure"
}

function main {
    #Get stack
    if [ -z "$1" ]; then
        if [ -z "$CloudFormationStack" ]; then
            echo "please provide stack name or ID"
            usage
            exit 1
        fi
    else
        CloudFormationStack="$1"
    fi

    #Get region
    if [ -z "$2" ]; then
        region=$(aws configure get region)
        if [ -z "$region" ]; then
            echo "please provide region"
            usage
            exit 1
        fi
    else
        region="$2"
    fi

    echo "#Region: $region"
    echo "#Stack: $CloudFormationStack"
    echo "#---"
    echo "#Checking if stack exists..."
    aws cloudformation wait stack-exists \
    --region $region \
    --stack-name $CloudFormationStack
    echo "#Checking if stack creation is complete..."
    aws cloudformation wait stack-create-complete \
    --region $region \
    --stack-name $CloudFormationStack
    echo "#Getting output keys and values..."
    echo "#---"
    aws cloudformation describe-stacks \
    --region $region \
    --stack-name $CloudFormationStack \
    --query 'Stacks[].Outputs[].[OutputKey, OutputValue]' \
    --output text | awk '{print "export", $1"="$2}'
}

main "$@"
EOF

#Add executable permissions
chmod +x get-outputs.sh

Export the output parameters. The region parameter is only needed if your Region configuration is not us-west-2, as defined in the CloudFormation template.

./get-outputs.sh rexray-demo && source <(./get-outputs.sh rexray-demo)
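The last stage of get-outputs.sh is the awk command that turns each key and value pair from `describe-stacks` into an `export` statement, which `source` then evaluates. The same transformation, written out in Python for clarity, looks roughly like this. The sample values are invented for illustration, and unlike awk this version skips lines with fewer than two fields.

```python
def to_exports(describe_stacks_text):
    """Turn "OutputKey<whitespace>OutputValue" lines into export statements."""
    exports = []
    for line in describe_stacks_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # Like awk's $1 and $2: the first two whitespace-separated fields.
            exports.append("export {}={}".format(fields[0], fields[1]))
    return exports


# Invented sample output; real values come from your stack.
sample = "AWSRegion\tus-west-2\nECSClusterName\trexray-demo-cluster\n"
export_lines = to_exports(sample)
```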

Step 3: Create the task definition

In this step, you create a task definition for MySQL. MySQL is considered a stateful service because the data stored in the database has to persist beyond the life of the task.

When the MySQL task is restarted on another instance in the cluster, the scheduler and the rexray/ebs plugin ensure that the task is launched on an instance that can re-establish a connection to the EBS volume where the database is stored.

The placement constraint in the task definition tells the ECS service scheduler to launch the task in a specific Availability Zone: the zone where the EBS volume was originally created. Such a constraint is necessary because instances cannot connect to volumes in a different Availability Zone.
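As a small illustration of that rule, the sketch below builds the constraint used in the task definition and checks the same-zone condition. Both function names are hypothetical helpers, and the zone names are examples.

```python
def az_constraint(availability_zone):
    """Build a memberOf constraint pinning tasks to one Availability Zone."""
    return {
        "type": "memberOf",
        "expression": "attribute:ecs.availability-zone=={}".format(availability_zone),
    }


def can_attach(instance_az, volume_az):
    """An EBS volume can only be attached to an instance in its own zone."""
    return instance_az == volume_az


constraint = az_constraint("us-west-2a")
```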

cat > mysql-taskdef.json << EOF
{
    "containerDefinitions": [
        {
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "${CWLogGroupName}",
                    "awslogs-region": "${AWSRegion}",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "portMappings": [
                {
                    "containerPort": 3306,
                    "protocol": "tcp"
                }
            ],
            "environment": [
                {
                    "name": "MYSQL_ROOT_PASSWORD",
                    "value": "my-secret-pw"
                }
            ],
            "mountPoints": [
                {
                    "containerPath": "/var/lib/mysql",
                    "sourceVolume": "rexray-vol"
                }
            ],
            "image": "mysql",
            "essential": true,
            "name": "mysql"
        }
    ],
    "placementConstraints": [
        {
            "type": "memberOf",
            "expression": "attribute:ecs.availability-zone==${AvailabilityZone}"
        }
    ],
    "memory": "512",
    "family": "mysql",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "512",
    "volumes": [
        {
            "name": "rexray-vol",
            "dockerVolumeConfiguration": {
                "autoprovision": true,
                "scope": "shared",
                "driver": "rexray/ebs",
                "driverOpts": {
                    "volumetype": "gp2",
                    "size": "5"
                }
            }
        }
    ]
}
EOF

Docker volume support adds several new parameters to the ECS task definition. These include the volume type, scope, drivers, and Docker options and labels. A volume can either be scoped to a single, specific task or it can be shared among multiple tasks.

When a volume is scoped to a task, it is not meant to be shared across different running tasks.  In contrast, a shared volume is for use cases where the volume lifecycle is independent of the ECS task. The volume can be used by different tasks concurrently or at different times. It is primarily intended for use cases such as single-task applications where the volume persists after the task dies and is re-used when the task starts again. Another use case is when multiple tasks on the same EC2 container instance access the volume concurrently.

The autoprovision parameter is used to specify whether ECS manages the lifecycle of the volume.  When this is set to true, ECS automatically provisions the volume for you, which is what you are doing in the above example.  When it’s set to false, ECS assumes that the volume already exists.  For this example, you could instead set autoprovision to false and run the following command to create a volume:

aws ec2 create-volume --size 1 --volume-type gp2 \
--availability-zone $AvailabilityZone \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=rexray-vol}]'

The driver options are used to configure the type of EBS storage to use (for example, gp2, standard, or io1), the size of the volume to provision, IOPS, and encryption. The specific options vary depending on the volume plugin that you are using.
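To see how scope, autoprovision, and driver options combine, here is a sketch that builds one entry of the `volumes` list for either case. The helper `docker_volume` is hypothetical. It assumes that autoprovision is only meaningful for shared volumes and rejects it for task-scoped ones.

```python
def docker_volume(name, scope, size_gib, autoprovision=None):
    """Build one entry for the "volumes" list of a task definition.

    Assumption: autoprovision applies only when the scope is "shared".
    """
    if scope not in ("task", "shared"):
        raise ValueError("scope must be 'task' or 'shared'")
    if scope == "task" and autoprovision is not None:
        raise ValueError("autoprovision applies only to shared volumes")
    config = {
        "scope": scope,
        "driver": "rexray/ebs",
        "driverOpts": {"volumetype": "gp2", "size": str(size_gib)},
    }
    if autoprovision is not None:
        config["autoprovision"] = autoprovision
    return {"name": name, "dockerVolumeConfiguration": config}


# Scratch space tied to one task, and the persistent database volume.
scratch = docker_volume("scratch-vol", "task", 10)
database = docker_volume("rexray-vol", "shared", 5, autoprovision=True)
```

The `database` entry matches the volume in the task definition above; `scratch-vol` is an invented name for the scratch-space use case.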

Register the task definition and extract the task definition ARN from the result:

TaskDefinitionArn=$(aws ecs register-task-definition \
--cli-input-json 'file://mysql-taskdef.json' \
| jq -r .taskDefinition.taskDefinitionArn)

Step 4: Create a service definition

In this step, you create a service definition for MySQL.  An ECS service is a long running task that is monitored by the service scheduler.  If the task dies or becomes unhealthy, the scheduler automatically attempts to restart the task.

The MySQL service is fronted by a Network Load Balancer that is configured to forward traffic on port 3306 to the tasks registered with a specific target group. The desired count is the desired number of task copies to run. The minimum and maximum healthy percent parameters tell the scheduler to run no more than the desired number of copies of this task at a time. Unless a task has been stopped, the scheduler does not try to start a new one.
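The arithmetic behind those two percentages can be made explicit. ECS rounds the minimum up and the maximum down to whole tasks, so the bounds on running tasks during a deployment can be sketched as follows; the function name is hypothetical.

```python
def deployment_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (min_running, max_running) task counts during a deployment."""
    # Ceiling division for the lower bound, floor division for the upper bound.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


# The values used for the MySQL service: desiredCount 1, 0% minimum, 100% maximum.
bounds = deployment_bounds(1, 0, 100)
```

With these settings the service may drop to zero running tasks but never runs two copies at once, which suits a single database task that owns its data volume.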

cat > mysql-svcdef.json << EOF
{
    "cluster": "${ECSClusterName}",
    "serviceName": "mysql-svc",
    "taskDefinition": "${TaskDefinitionArn}",
    "loadBalancers": [
        {
            "targetGroupArn": "${MySQLTargetGroupArn}",
            "containerName": "mysql",
            "containerPort": 3306
        }
    ],
    "desiredCount": 1,
    "launchType": "EC2",
    "healthCheckGracePeriodSeconds": 60,
    "deploymentConfiguration": {
        "maximumPercent": 100,
        "minimumHealthyPercent": 0
    },
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": [
                "<subnet-id>"
            ],
            "securityGroups": [
                "<security-group-id>"
            ],
            "assignPublicIp": "DISABLED"
        }
    }
}
EOF

Create the MySQL service:

SvcDefinitionArn=$(aws ecs create-service \
--cli-input-json file://mysql-svcdef.json \
| jq -r .service.serviceArn)

Step 5: Connect to the MySQL service

After the service is running, configure a MySQL client, such as MySQL Workbench, to connect to the service:

  1. For Connection Name, type “rexray-demo”.
  2. For Hostname, copy and paste the DNS name of the Network Load Balancer.
  3. For Password, type the default password found in the mysql-taskdef.json file.
  4. Choose Test Connection, Close.
  5. Under MySQL Connections, open the rexray-demo connection.

MySQL Workbench

In the Query window, paste the following:

CREATE DATABASE rexraydb;
USE rexraydb;
CREATE TABLE pets (name VARCHAR(20), breed VARCHAR(20));
INSERT INTO pets VALUES ('Fluffy', 'Poodle');
SELECT * FROM pets;

You can execute each line separately by placing the cursor on a line and clicking the execute statement button.

Execute MySQL commands

Step 6: Drain the instance

Now that you have a MySQL database server running in a container and persisting its data, make sure that it will survive a container replacement.

Docker containers by their nature are designed to be ephemeral. If you upgrade the underlying host operating system, you must drain the tasks off of the instance and let them be re-scheduled onto another ECS host. Below, I show the behavior of persisting the MySQL instance’s data to an EBS volume and allowing the task to be re-scheduled.

The following script identifies the instance that is currently running the task and puts it in a draining state.  This forces the task to be rescheduled onto the other EC2 container instance in the cluster.

cat > drain-instance.sh << 'EOF'
#!/bin/bash

echo "Region [$AWSRegion]"
echo "Cluster [$ECSClusterName]"
echo "Task Definition [$TaskDefinitionArn]"

TaskArns=$(aws ecs list-tasks --region $AWSRegion \
--cluster $ECSClusterName --query taskArns --output text)
echo "Task ARNs [$TaskArns]"

#select only the instances that run tasks from this task definition
ContainerInstanceArns=$(aws ecs describe-tasks \
--region $AWSRegion --cluster $ECSClusterName \
--tasks $TaskArns \
--query "tasks[?taskDefinitionArn=='$TaskDefinitionArn'].containerInstanceArn" \
--output text)
echo "Container Instance ARNs [$ContainerInstanceArns]"

echo "DRAINING Instances"
aws ecs update-container-instances-state --region $AWSRegion \
--cluster $ECSClusterName --container-instances $ContainerInstanceArns \
--status "DRAINING"
EOF

In the ECS console, if you click on the cluster and then the tab for the cluster’s tasks, you see the container instance ID for the MySQL task.

Clicking the link of the container instance ID takes you to another page that shows the EC2 instance ID of the instance where the MySQL task is running.

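Now make the script executable and run it. TaskDefinitionArn is passed explicitly because it was set in Step 3 without being exported:

chmod +x drain-instance.sh
TaskDefinitionArn=$TaskDefinitionArn ./drain-instance.sh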

When you run the script, the tasks on the draining instance are stopped. Because you have an ECS service definition for MySQL, ECS launches new tasks on other ECS instances in the cluster that meet the placement constraints. In this example, you placed a constraint on the Availability Zone of the EBS volume as it’s not possible to detach and re-attach volumes across Availability Zones. Because the volume already exists, REX-Ray attaches the existing volume to the new task. When MySQL starts, it sees this as its data volume and you have access to the recently stored data.

Step 7: Re-connect to the MySQL service

After you see that a new task has been provisioned on the ECS cluster, you can return to MySQL Workbench and attempt to run the following query:

USE rexraydb;
SELECT * FROM pets;

You may get an error message stating “The MySQL server has gone away.” This usually means that the new ECS task has not completed starting or hasn’t been registered yet as a healthy target behind the Network Load Balancer. If you wait a little longer and try again, you should see the same results in the query grid as before.
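Instead of retrying by hand, you can poll the load balancer until the port accepts connections again. The following is a minimal sketch using only the Python standard library. The function name and the default timings are assumptions, and a successful TCP connection only shows that something is listening, not that MySQL has finished starting.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Returns True on the first successful connect, False once the deadline
    has passed without one.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For this walkthrough, you would call `wait_for_port("<nlb-dns-name>", 3306)` with the DNS name of the Network Load Balancer before re-running the query.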

This environment is meant as a demonstration on how to use Docker volume plugins with ECS for supporting persistent workloads. For an actual production implementation, I recommend scoping the VPC and security groups to only allow network access from trusted resources. This post creates a MySQL server that is accessible from the internet. In addition, you should implement your own strong MySQL root password, among other things.

To clean up this demo, take the following steps.

Delete the service.

aws ecs update-service --cluster $ECSClusterName \
--service $SvcDefinitionArn \
--desired-count 0
aws ecs delete-service --cluster $ECSClusterName \
--service $SvcDefinitionArn

Delete the volume.

Even though you deleted the task and the service, you still need to clean up the EBS volume that you created. You created this volume and referenced it in the ECS task definition. ECS passed this information along to Docker running on the host, which in turn handed it to REX-Ray (your volume driver), which knew how to attach the EBS volume and map it to the container.

The easiest way to delete this volume is from the EC2 console. In the list of volumes, you should see a volume named rexray-vol that is unattached (state=available). Delete this volume as it is no longer needed.


REX-Ray Volume

Otherwise, you can run the following command, which grabs the volume ID and deletes it:

rexrayVolumeID=$(aws ec2 describe-volumes --filters Name="tag:Name",Values=rexray-vol \
--query "Volumes[].VolumeId" --output text)
aws ec2 delete-volume --volume-id "$rexrayVolumeID"

Delete the CloudFormation stack.

Lastly, delete the CloudFormation stack. This removes the rest of the environment that was pre-created for this exercise.

aws cloudformation delete-stack --stack-name rexray-demo


While it was possible to use Docker volume plugins with ECS previously, doing so required you to create volumes out of band, that is, outside of ECS, and create placement constraints to restrict where tasks could be run. With native support for Docker volumes, volumes can now be provisioned simply by adding a handful of parameters to an ECS task definition.

Moreover, the ECS scheduler is now volume plugin aware. Instances that have a volume driver installed on them automatically get annotated with attributes that inform the scheduler where to place tasks that use a particular driver. Together, these features help you to run stateful, storage-intensive applications such as databases, machine learning, and data processing applications, streaming applications like Kafka, as well as applications that need additional scratch space. We look forward to hearing about the use cases that this new feature enables.

– Jeremy, Ronnie, and Tiffany

Celebrating 10 years of Microsoft Windows Server and SQL Server on AWS! Happy Birthday!

Post Syndicated from Betsy Chernoff original https://aws.amazon.com/blogs/compute/celebrating-10-years-of-microsoft-windows-server-and-sql-server-on-aws-happy-birthday/

Contributed by Sandy Carter, Vice President of Windows on AWS and Enterprise Workloads

Happy birthday to all of our AWS customers! In particular, I want to call out Autodesk, RightScale (now part of Flexera), and Suunto (Movescount) – just a few of our customers who have been running Microsoft Windows Server and Microsoft SQL Server on AWS for 10 years! Thank you for your business and seeing the value of Windows on AWS!

So many customers trust their Windows workloads on AWS because of our experience, reliability, security, and performance. IDC (a leading IT Analyst) estimates that AWS accounted for approximately 57.7% of total Windows instances in public cloud IaaS during 2017 – nearly 2x the nearest cloud provider.

Our Windows on AWS customers benefit from the millions of active customers per month across the AWS Cloud. Ancestry, the global leader in family history and consumer genomics, has over 6000 instances on AWS. Nat Natarajan, Executive Vice President of Product and Technology at Ancestry just spoke with us in Seattle. I loved hearing how they are using Windows on AWS.

“AWS provides us with the flexibility we need to stay at the forefront of consumer genomics, as the science and technology in the space continues to rapidly evolve. We’re confident that AWS provides us with unmatched scalability, security, and privacy.”

Reliability is one of the reasons why NextGen Healthcare, provider of tailored healthcare solutions for ambulatory practices and healthcare providers around the world, trusts AWS to run their SQL Server databases. One of the foundations of our reliability is how we design our Regions. AWS has 18 Regions around the globe, each of which is made up of two or more Availability Zones. Availability Zones are physically separate locations with independent infrastructure engineered to be insulated from failures in other Availability Zones. Today we have 55 Availability Zones across these 18 Regions, and we’ve announced plans for 12 more Availability Zones and four more Regions.

I talk to so many of our customers every week who tell me that their Windows and SQL Server workloads run better on AWS. For example, eMarketer enables thousands of companies around the world to better understand markets and consumer behavior. This helps them get the data they need to succeed in a competitive and fast changing digital economy. They recently told me how they started their digital transformation initiative on another public cloud.

“We chose to move our Microsoft workloads to AWS because of your extensive migration experience, higher availability, and better performance. We are seeing 35% cost savings and thrilled to see 4x faster launch times now.” – Ryan Hoffman, Senior Vice President of Engineering

One of the things I get asked about more and more is, can you modernize those Windows apps as well? Using serverless compute on AWS Lambda, Windows containers, and Amazon Machine Learning (Amazon ML), you can really take those Windows apps into the 21st century! For example, Mitek, the global leader in mobile capture and identity verification software solutions, wanted to modernize their Mobile Verify application to accelerate integration across multiple regions and environments. They leveraged Windows containers using Amazon ECS so they could focus their resources on developing more features instead of servers, VMs, and patching. They reduced their deployment time from hours to minutes!

We know .NET developers love using their existing tools. We created tools such as AWS Toolkit for Visual Studio and AWS Tools for Visual Studio Team Services (VSTS) to provide integration into many popular AWS services. Agero tells us how easy it is for their .NET developers to get started with AWS. Agero provides connected vehicle data, roadside assistance, and claims management services to over 115 million drivers and leading insurers.

“We experimented with AWS Elastic Beanstalk and found it was the simplest, fastest way to get .NET code running in AWS.” Bernie Gracy, Chief Digital Officer

Of course, most of our customers use Microsoft Active Directory on-premises for directory-based identity-related services, and some also use Azure AD to manage users with Office365. Customers use AWS Directory Service for Microsoft Active Directory (AWS Managed Microsoft AD) to easily integrate AWS resources with on-premises AD and Azure AD. That way, there’s no data to be synchronized or replicated from on-premises to AWS. AWS Managed AD lets you use the same administration tools and built-in features such as single sign-on (SSO) or Group Policy as you use on-premises. And we now enable our customers to share a single directory with multiple AWS accounts within an AWS Region!

This birthday is significant to us here at Amazon Web Services as we obsess over our customers. Over 90% of our roadmap items are driven directly from you! With hundreds of thousands of active customers running Windows on AWS, that’s a lot of great ideas. In fact, did you know that our premier serverless engine, AWS Lambda, which lets you run .NET Core without provisioning or managing servers, came directly from you, our customers?

Some of you wanted an easier way to jumpstart Windows Server projects on AWS, which led us to build Amazon Lightsail, giving you compute, storage and networking with a low, predictable price. Based on feedback from machine learning practitioners and researchers, we launched our AWS Deep Learning AMI for Windows so you can quickly launch Amazon EC2 instances pre-installed with popular deep learning frameworks including Apache MXNet, Caffe and Tensorflow.

Licensing Options

Our customer obsession means that we are committed to helping you lower your total cost of ownership (TCO). When I talk to customers, they tell me they appreciate that AWS does not approach their cloud migration journey as a way to lock-in additional software license subscriptions. For example, TSO Logic, one of our AWS Partner Network (APN) partners, described in a blog the work we did with one of our joint customers, a privately-held U.S. company with more than 70,000 employees, operating in 50 countries. We helped this customer save 22 percent on their SQL Server workloads by optimizing core counts and reducing licensing costs.

Delaware North, a global leader in hospitality management and food service management, uses our pay-as-you-go licenses to scale-up their SQL Server instances during peak periods, without having to pay for those licenses for multiple years. Many customers also use License Mobility benefits to bring their Microsoft application licenses to AWS, and other customers, such as Xero, the accounting software platform for small and medium-sized businesses, reduce costs by bringing their own Windows Server Datacenter Edition and SQL Server Enterprise Edition licenses to AWS on our Amazon EC2 Dedicated Hosts. And, we also have investment programs to help qualified customers offset Microsoft licensing costs when migrating to the AWS Cloud.

We know that many of you are thinking about what to do with legacy applications still using the 2008 versions of SQL Server and Windows Server. I hear from many leaders who don’t want to base their cloud strategy on software end-of-support. AWS provides flexibility to easily upgrade and modernize your legacy workloads. ClickSoftware Technologies, the SaaS provider of field service management solutions, found how easy it was to upgrade to a current version of SQL Server on AWS.

“After migrating to AWS, we upgraded to SQL Server 2016 using SQL Server 2008 in compatibility mode, which meant we did not have to make any application changes, and now have a fully supported version of SQL Server.” – Udi Keidar, VP of Cloud Services, ClickSoftware

Did you know you can also bring your Microsoft licenses to VMware Cloud on AWS? VMware Cloud on AWS is a great solution when you need to execute a fast migration – whether that’s due to running out of data center space, an upcoming lease expiration, or a natural disaster such as the recent hurricanes. Massachusetts Institute of Technology (MIT) started with a proof of concept (POC) and moved their initial 300 VMs in less than 96 hours, with just one employee. Over the next three months they migrated all of their 2,800 production VMs to VMware Cloud on AWS.

Looking Forward

Next month, I hope you’ll join me at AWS re:Invent to learn “What’s New with Microsoft and .NET on AWS” as well as the dozens of other sessions we have for Windows on AWS for IT leaders, DevOps engineers, system administrators, DBAs and .NET developers. We have so many new innovations to share with you!

I want to thank each of you for trusting us these last ten years with your most critical business applications and allowing us to continue to help you innovate and transform your business. If you’d like to learn more about how we can help you bring your applications built on Windows Server and SQL Server to the cloud, please check out the following resources and events or contact us!

  1. https://aws.amazon.com/windows/
  2. https://aws.amazon.com/sql/
  3. https://aws.amazon.com/directoryservice/
  4. https://aws.amazon.com/windows/windows-study-guide/
  5. AWS .NET Developer Center

Upcoming Events

  1. October 23, 2018 Webinar: Migrating Microsoft SQL Server 2008 Databases to AWS
  2. Live with AM & Nicki – A fun new twitch.tv series to show you how to build a modern web application on AWS!
  3. November 26-30 2018 re:Invent: Check out the complete list of sessions for Windows, SQL Server, Active Directory and .NET on AWS!




Using Cromwell with AWS Batch

Post Syndicated from Josh Rad original https://aws.amazon.com/blogs/compute/using-cromwell-with-aws-batch/

Contributed by W. Lee Pang and Emil Lerch, WWPS Professional Services

DNA is often referred to as the “source code of life.” All living cells contain long chains of deoxyribonucleic acid that encode instructions on how they are constructed and behave in their surroundings. Genomics is the study of the structure and function of DNA at the molecular level. It has recently shown immense potential to provide improved detection, diagnosis, and treatment of human diseases.

Continuous improvements in genome sequencing technologies have accelerated genomics research by providing unprecedented speed, accuracy, and quantity of DNA sequence data. In fact, the rate of sequencing efficiency has been shown to outpace Moore’s law. Processing this influx of genomic data is ideally aligned with the power and scalability of cloud computing.

Genomic data processing typically uses a wide assortment of specialized bioinformatics tools, like sequence alignment algorithms, variant callers, and statistical analysis methods. These tools are run in sequence as workflow pipelines that can range from a couple of steps to many long toolchains executing in parallel.

Traditionally, bioinformaticians and genomics scientists relied on Bash, Perl, or Python scripts to orchestrate their pipelines. As pipelines have gotten more complex, and maintainability and reproducibility have become standard requirements in science, the need for specialized orchestration tooling and portable workflow definitions has grown significantly.

What is Cromwell?

The Broad Institute’s Cromwell is purpose-built for this need. It is a workflow execution engine for orchestrating command line and containerized tools. Most importantly, it is the engine that drives the GATK Best Practices genome analysis pipeline.

Workflows for Cromwell are defined using the Workflow Definition Language (WDL – pronounced “widdle”), a flexible meta-scripting language that allows researchers to focus on the pieces of their workflow that matter: the tools for each step and their respective inputs and outputs, not the plumbing in between.

Genomics data is not small (on the order of TBs-PBs for one experiment!), so processing it usually requires significant computing scale, like HPC clusters and cloud computing. Cromwell has previously enabled this with support for many backends such as Spark, and HPC frameworks like Sun GridEngine and SLURM.

AWS and Cromwell

We are excited to announce that Cromwell now supports AWS! In this post, we go over how to configure Cromwell on AWS and get started running genomics pipelines in the cloud.

In a nutshell, the AWS backend for Cromwell is a layer that communicates with AWS Batch. Why AWS Batch? As stated before, genomics analysis pipelines are composed of many different tools. Each of these tools can have specific computing requirements. Operations like genome alignment can be memory-intensive, whereas joint genotyping may be compute-heavy.

AWS Batch dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory-optimized instances). Provisioning is based on the volume and specific resource requirements of the batch jobs submitted. This means that each step of a genomics workflow gets the most ideal instance to run on.

The AWS backend translates Cromwell task definitions into batch job definitions and submits them via API calls to a user-specified batch queue. Runtime parameters such as the container image to use, and resources like desired vCPUs and memory are also translated from the WDL task and transmitted to the batch job. A number of environment variables are automatically set on the job to support data localization and de-localization to the job instance. Ultimately, scientists and genomics researchers should be familiar with the backend method to submit jobs to AWS Batch because it uses their existing WDL files and research processes.

Getting started

To get started using Cromwell with AWS, create a custom AMI. This is necessary to ensure that the AMI is private to the account, encrypted, and has tooling specific to genomics workloads and Cromwell.

One feature of this tooling is the automatic creation and attachment of additional Amazon Elastic Block Store (Amazon EBS) capacity as additional data is copied onto the EC2 instance for processing. It also contains an ECS agent that has been customized to the needs of Cromwell, and a Cromwell Docker image responsible for interfacing the Cromwell task with Amazon S3.

After the custom AMI is created, install Cromwell on your workstation or EC2 instance. Configure an S3 bucket to hold Cromwell execution directories. For the purposes of this post, we refer to the bucket as s3-bucket-name. Lastly, go to the AWS Batch console, and create a job queue. Save the ARN of the queue, as this is needed later.

To set up these resources with a single click, this link provides a set of AWS CloudFormation templates that get all the needed infrastructure running in minutes.

The next step is to configure Cromwell to work with AWS Batch and your newly created S3 bucket. Use the sample hello.wdl and hello.inputs files from the Cromwell AWS backend tutorial. You also need a custom configuration file so that Cromwell can interact with AWS Batch.

The following sample file can be used on an EC2 instance with the appropriate IAM role attached, or on a developer workstation with the AWS CLI configured. Keep in mind that you must replace <s3-bucket-name> in the configuration file with the appropriate bucket name. Also, replace “your ARN here” with the ARN of the job queue that you created earlier.

// aws.conf

include required(classpath("application"))

aws {
    application-name = "cromwell"
    auths = [
        {
            name = "default"
            scheme = "default"
        }
    ]
    region = "default"
    // uses region from ~/.aws/config set by aws configure command,
    // or us-east-1 by default
}

engine {
    filesystems {
        s3 {
            auth = "default"
        }
    }
}

backend {
    default = "AWSBATCH"
    providers {
        AWSBATCH {
            actor-factory = "cromwell.backend.impl.aws.AwsBatchBackendLifecycleActorFactory"
            config {
                // Base bucket for workflow executions
                root = "s3://<s3-bucket-name>/cromwell-execution"
                // A reference to an auth defined in the `aws` stanza at the top. This auth is used to create
                // Jobs and manipulate auth JSONs.
                auth = "default"

                numSubmitAttempts = 3
                numCreateDefinitionAttempts = 3

                concurrent-job-limit = 16
                default-runtime-attributes {
                    queueArn: "<your ARN here>"
                }
                filesystems {
                    s3 {
                        // A reference to a potentially different auth for manipulating files via engine functions.
                        auth = "default"
                    }
                }
            }
        }
    }
}
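
The two placeholder substitutions described above can also be scripted. The following is a minimal sketch, assuming the sample is saved with its placeholders intact as aws.conf.template (a hypothetical file name). The helper name render_cromwell_conf is also illustrative, and the sketch assumes the bucket name and ARN contain no "|" or "&" characters.

```shell
#!/usr/bin/env bash

# Hypothetical helper: fill the two placeholders in a Cromwell config template.
# Arguments: template path, S3 bucket name, AWS Batch job queue ARN.
# Writes the rendered config to stdout.
render_cromwell_conf() {
    local template="$1" bucket="$2" queue_arn="$3"
    # "|" is the sed delimiter because ARNs contain "/" and ":".
    sed -e "s|<s3-bucket-name>|${bucket}|g" \
        -e "s|<your ARN here>|${queue_arn}|g" \
        "$template"
}
```

For example, render_cromwell_conf aws.conf.template my-bucket "$QUEUE_ARN" > aws.conf would write the finished file, where my-bucket and QUEUE_ARN stand in for your own values.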

Now, you can run your workflow. The following command runs Hello World, and ensures that everything is connected properly:

$ java -Dconfig.file=aws.conf -jar cromwell-34.jar run hello.wdl -i hello.inputs

After the workflow has run, your workflow logs should report the workflow outputs.

[info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
  "outputs": {
    "wf_hello.hello.message": "Hello World! Welcome to Cromwell . . . on AWS!"
  },
  "id": "08213b40-bcf5-470d-b8b7-1d1a9dccb10e"
}

You also see your job in the “succeeded” section of the AWS Batch Jobs console.

After the environment is configured properly, other Cromwell WDL files can be used as usual.

With AWS Batch, a customized AMI instance, and Cromwell workflow definitions, AWS provides a simple solution to process genomics data easily. We invite you to incorporate this into your automated pipeline.

Deploying a Burstable and Event-driven HPC Cluster on AWS Using SLURM, Part 1

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/deploying-a-burstable-and-event-driven-hpc-cluster-on-aws-using-slurm-part-1/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

When you execute high performance computing (HPC) workflows on AWS, you can take advantage of the elasticity and concomitant scale associated with recruiting resources for your computational workloads. AWS offers a variety of services, solutions, and open source tools to deploy, manage, and dynamically destroy compute resources for running various types of HPC workloads.

Best practices in deploying HPC resources on AWS include creating much of the infrastructure on-demand, and making it as ephemeral and dynamic as possible. Traditional HPC clusters use a resource scheduler that maintains a set of computational resources and distributes those resources over a collection of queued jobs.

With a central resource scheduler, all users have a single point of entry to a broad range of compute. Traditionally, many of these schedulers managed on-premises systems. Those systems were far less dynamic than cloud-based HPC clusters, so the schedulers usually only needed to manage a largely static set of resources.

However, many schedulers now support the ability to burst into AWS or manage a dynamically changing HPC environment through plugins, connectors, and custom scripting. Some of the more common resource schedulers include:


Simple Linux Utility for Resource Management (SLURM) by SchedMD is one such popular scheduler. Using a derivative of SLURM’s elastic power plugin, you can coordinate the launch of a set of compute nodes with the appropriate CPU/disk/GPU/network topologies. You stand up the compute resources for the job, instead of trying to fit a job within a set of pre-existing compute topologies.

We have recently released an example implementation of the SLURM bursting capability in the AWS Samples GitHub repo.

The following diagram shows the reference architecture.


Download the aws-plugin-for-slurm directory locally. Use the following commands to clone the repository and sync the directory with an S3 bucket to be referenced later. For more detailed deployment instructions, see the README in the GitHub repo.

git clone https://github.com/aws-samples/aws-plugin-for-slurm.git 
aws s3 sync aws-plugin-for-slurm/ s3://<bucket-name>

Included is a CloudFormation script, which you use to stand up the VPC and subnets, as well as the headnode. In AWS CloudFormation, choose Create Stack and import the slurm_headnode-cloudformation.yml script.

The CloudFormation script lays down the landing zone with the appropriate network topology. The headnode is based on the publicly available CentOS 7.5 AMI in the AWS Marketplace. The latest security packages are installed with the dependencies needed to install SLURM.

I have found that scheduling performance is best if the source is compiled at runtime, which the CloudFormation script takes care of. The script sets up the headnode as a single controller. However, with minor modifications, it can be set up in a highly available manner with a backup SLURM controller.

After the deployment moves to CREATE_COMPLETE status in CloudFormation, use SSH to connect to the SLURM headnode:

ssh -i <path/to/private/key.pem> centos@<public-ip-address>

Create a new sbatch job submission file named test.sbatch in a text editor such as vi or nano, with the following contents:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --constraint=[us-west-2a]

sleep 60

This job submission script requests two nodes, running one task per node with four CPUs per task. The constraint is optional but allows SLURM to allocate the job among the available zones.

The elasticity of the cluster comes from setting the SuspendProgram and ResumeProgram parameters in the slurm.conf file.
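
As an illustration, the relevant entries might look like the fragment embedded in the sketch below. The script paths are taken from the power_save.log lines shown later in this post, and SuspendTime=60 comes from the value quoted in the text. The layout of the fragment and the helper names are assumptions, not the contents of the repository's slurm.conf.

```shell
#!/usr/bin/env bash

# Hypothetical fragment of slurm.conf covering only the power-save settings.
sample_power_save_conf() {
cat <<'EOF'
# Seconds a node may sit idle before it becomes eligible for suspension
SuspendTime=60
# Called with a hostlist of nodes to terminate
SuspendProgram=/nfs/slurm/bin/slurm-aws-shutdown.sh
# Called with a hostlist of nodes to launch
ResumeProgram=/nfs/slurm/bin/slurm-aws-startup.sh
EOF
}

# Print only the power-save keys from slurm.conf-style text on stdin.
power_save_settings() {
    grep -E '^(SuspendTime|SuspendProgram|ResumeProgram|SuspendRate|ResumeRate)='
}

sample_power_save_conf | power_save_settings
```

The same filter can be pointed at a real configuration file with power_save_settings < /path/to/slurm.conf to check which of these values are set.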


You can set the responsiveness of the scaling on AWS by modifying SuspendTime. Do not set a value for ResumeRate or SuspendRate, as the underlying SuspendProgram and ResumeProgram scripts have API calls that impose their own rate limits. If you find that your API call rate limit is reached at scale (approximately 1000 nodes/sec), you can set ResumeRate and SuspendRate accordingly.

If you are familiar with SLURM’s configuration options, you can make further modifications by editing the /nfs/slurm/etc/slurm.conf.d/slurm_nodes.conf file. That file contains the node definitions, with some minor modifications. You can schedule GPU-based workloads separate from CPU to instantiate a heterogeneous cluster layout. You also get more flexibility running tightly coupled workloads alongside loosely coupled jobs, as well as job array support. For additional commands administrating the SLURM cluster, see the SchedMD SLURM documentation.

The initial state of the cluster shows that no compute resources are available. Run the sinfo command:

$ sinfo
all*         up  infinite     0   n/a
gpu          up  infinite     0   n/a

The job described earlier is submitted with the sbatch command:

sbatch test.sbatch

The power plugin allocates the requested number of nodes based on your job definition and runs the Amazon EC2 API operations to request those instances.

$ sinfo
all*      up    infinite      2 alloc# ip-10-0-1-[6-7]
gpu       up    infinite      2 alloc# ip-10-0-1-[6-7]

The log file is located at /var/log/power_save.log.

Wed Sep 12 18:37:45 UTC 2018 Resume invoked 
/nfs/slurm/bin/slurm-aws-startup.sh ip-10-0-1-[6-7]

After the requested job is complete, the compute nodes remain idle for the duration of the SuspendTime=60 value in the slurm.conf file.

$ sinfo
all*         up  infinite     2  idle ip-10-0-1-[6-7]
gpu          up  infinite     2  idle ip-10-0-1-[6-7]

Ideally, ensure that other queued jobs have an opportunity to run on the current infrastructure, assuming that the job requirements are fulfilled by the compute nodes.

If the job requirements are not fulfilled and there are no more jobs in the queue, the aws-slurm shutdown script takes over and terminates the instance. That’s one of the benefits of an elastic cluster.

Wed Sep 12 18:42:38 UTC 2018 Suspend invoked /nfs/slurm/bin/slurm-aws-shutdown.sh ip-10-0-1-[6-7]
{
    "TerminatingInstances": [
        {
            "InstanceId": "i-0b4c6ec4945afe52e",
            "CurrentState": {
                "Code": 32,
                "Name": "shutting-down"
            },
            "PreviousState": {
                "Code": 16,
                "Name": "running"
            }
        }
    ]
}
{
    "TerminatingInstances": [
        {
            "InstanceId": "i-0f3139a52f2602c60",
            "CurrentState": {
                "Code": 32,
                "Name": "shutting-down"
            },
            "PreviousState": {
                "Code": 16,
                "Name": "running"
            }
        }
    ]
}

The SLURM elastic compute plugin provisions the compute resources based on the scheduler queue load. In this example implementation, you are distributing a set of compute nodes to take advantage of scale and capacity across all Availability Zones within an AWS Region.

With a minor modification on the VPC layer, you can use this same plugin to stand up compute resources across multiple Regions. With this implementation, you can truly take advantage of a global HPC footprint.

“Imagine creating your own custom cluster mix of GPUs, CPUs, storage, memory, and networking – just the way you want it, then running your experiment, getting the results, and then tearing it all down.” — InsideHPC. Innovation Unbound: What Would You do with a Million Cores?

Part 2

In Part 2 of this post series, you integrate this cluster with AWS native services to handle scalable job submission and monitoring using Amazon API Gateway, AWS Lambda, and Amazon CloudWatch.

BYOL and Oversubscription

Post Syndicated from Martin Yip original https://aws.amazon.com/blogs/compute/byol-and-oversubscription/

This post is courtesy of Mike Eizensmits, Senior Solutions Architect – AWS

Most AWS customers have a significant Windows Server deployment and are also tied to a Microsoft licensing program. When it comes to Microsoft products, such as Windows Server and SQL Server, licensing models can easily dictate cloud infrastructure solutions. AWS provides several options to support Bring Your Own License (BYOL) as well as EC2 License Included models for non-BYOL workloads. Most enterprise customers have Enterprise Agreements (EAs) with Microsoft, which can skew their licensing strategy when considering Azure, on-premises, and other cloud service providers such as AWS. BYOL models may be the only reasonable implementation path when entering a new environment or spinning up new applications. Licensing can constitute a significant investment when running workloads on public cloud.

To help customers get the maximum benefit from their existing Microsoft licensing, AWS provides multiple options to utilize BYOL. EC2 Dedicated Hosts and Dedicated Instances expose the physical cores of the server to Windows and applications such as SQL Server while allowing licenses with or without Software Assurance to be utilized. Bare Metal as well as VMware on AWS can minimize additional licensing costs.

Dedicated Hosts and Instances

An EC2 Dedicated Host is a dedicated server of a specific Instance Class that is allotted to a single customer, referred to as dedicated tenancy. The density of a host is based on the Instance Size as well as the Instance Type defined at creation. If you chose the M5 Instance Type and the m5.large Instance Size, you would have 48 “slots” available on the host to deploy m5.large Instances. If you chose the m5.xlarge, you would have enough capacity to house 24 Instances. Dedicated Hosts have a fixed number of vCPUs and a fixed amount of RAM per Instance Type. To deploy Windows on a Dedicated Host, the customer imports an image (vmdk, ova, vhd) using the import-image utility and tags the image as “BYOL” in the command. The BYOL flag dictates whether the image acquires a license from AWS or from the customer’s existing licensing framework.

When dealing with an oversubscribed customer environment, such as an on-premises VMware deployment, the customer has likely oversubscribed the environment with a minimum of 4 vCPUs to 1 physical core (4:1). In these environments, Microsoft licensing typically takes place at the host level using physical cores rather than the resources in a provisioned instance (vCPU). An AWS Dedicated Host is oversubscribed 2 vCPUs to 1 physical core, meaning each core is Hyper-Threaded. While the math can be performed to show the actual value of a vCPU, customers can be reluctant to modify vCPU configurations to reflect the greater value of the AWS vCPU. Simply matching the quantity of vCPUs to their current environment may be much more costly and expansive than rightsizing the instances for cost optimization.
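
The slot counts mentioned above follow from dividing the host's vCPU capacity by the vCPUs of the chosen Instance Size. Here is a small sketch of that arithmetic. It assumes an M5 host exposes 96 vCPUs (inferred from the 48 m5.large slots, since an m5.large has 2 vCPUs), and the helper name host_slots is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: Instances of one size that fit on a Dedicated Host.
# Arguments: vCPUs on the host, vCPUs per Instance.
host_slots() {
    echo $(( $1 / $2 ))
}

M5_HOST_VCPUS=96                  # assumption, inferred from 48 m5.large slots
host_slots "$M5_HOST_VCPUS" 2     # m5.large  (2 vCPUs) -> 48
host_slots "$M5_HOST_VCPUS" 4     # m5.xlarge (4 vCPUs) -> 24
```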

Below is a sample configuration of a customer interested in migrating to AWS and utilizing BYOL for Microsoft Windows and SQL Server Enterprise Edition. By licensing the 400 physical cores in their cluster, the customer is able to assign any number of vCPUs to the VMs deployed on the hosts. Enterprise Architects have spent a considerable amount of time sizing VMs with the proper resource attributes, so it can be difficult to initiate that process all over again to bring them to the public cloud.

Customer Environment (SQL Server Enterprise Cluster on VMware):

ESXi Nodes in Cluster:      10
Cores/Host (2.6GHz):        40
Total Cores in Cluster:     400
vCPU assigned (240/host):   2400
Virtual Machines:           470
Oversubscription:           5:1
Value of vCPU:              520 MHz

In this case, the customer has decided not to right-size their VMs and instead maintain their current vCPU/RAM specifications. This is a Dedicated Host solution to match their vCPU configurations.

AWS Dedicated Host Environment (Dedicated Hosts required to match vCPU of VMware above)

Dedicated Hosts (r4) 38
Cores/Host (2.6GHz) 36
vCPU/Host 64
Total vCPU assigned 2432
Oversubscription 2:1
Value of vCPU 1300 MHz

If the customer is not willing to consider right-sizing their VMs by assigning fewer, yet higher-powered vCPUs, they face a considerably larger server deployment. Comparing the per-vCPU values above (1300 MHz versus 520 MHz) shows that each Dedicated Host vCPU has 2.5 times the power of a vCPU in the VMware solution. Following that logic, right-sizing the VMs would drop the required count from 2400 to 960 vCPUs to match their current solution. This would change the number of required r4 Dedicated Hosts from 38 to 15 and slash the SQL Server licensing requirements for the solution.
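The arithmetic behind these figures can be reproduced with a short Python sketch. It takes the oversubscription ratios (5:1 and 2:1), the 2.6 GHz clock, the 2400 assigned vCPUs, and the 64 vCPUs per r4 host exactly as stated in the two tables; the function names are illustrative assumptions, not part of any AWS tool.

```python
import math

CLOCK_MHZ = 2600  # 2.6 GHz cores, as listed in both tables


def vcpu_value_mhz(oversubscription: float, clock_mhz: int = CLOCK_MHZ) -> float:
    """Clock share of one vCPU when `oversubscription` vCPUs share a core."""
    return clock_mhz / oversubscription


def hosts_needed(vcpus: float, vcpus_per_host: int = 64) -> int:
    """Whole Dedicated Hosts required to supply `vcpus`."""
    return math.ceil(vcpus / vcpus_per_host)


vmware_value = vcpu_value_mhz(5)         # VMware cluster, stated 5:1 ratio
aws_value = vcpu_value_mhz(2)            # Dedicated Host, stated 2:1 ratio
power_ratio = aws_value / vmware_value   # relative value of one AWS vCPU
right_sized_vcpus = 2400 / power_ratio   # vCPUs needed after right-sizing

if __name__ == "__main__":
    print(vmware_value, aws_value, power_ratio, right_sized_vcpus)
    print(hosts_needed(2400), hosts_needed(right_sized_vcpus))
```

Keeping the ratios as inputs makes it easy to rerun the comparison for a cluster with a different oversubscription level.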

EC2 Bare Metal instances and VMware Cloud on AWS

AWS does have other products that lend themselves to the BYOL/oversubscription story. EC2 Bare Metal instances and VMware Cloud on AWS give the customer full control of the configuration of instances, just as they have on-premises. The EC2 Bare Metal instances are built on the Nitro System, which is a collection of AWS-built hardware offload and security components that offers high performance networking and storage to EC2 instances. EC2 Bare Metal instances can utilize AWS services such as Amazon Virtual Private Cloud (VPC), Elastic Block Store (EBS), Elastic Load Balancing (ELB), AutoScaling and more. The Nitro configuration gives the customer the ability to install a server Operating System or hypervisor directly on the hardware. By utilizing their own hypervisor, the customer can define and configure their own instance configurations of RAM, disk and vCPU. By surpassing the fixed configurations in the EC2 Dedicated Host environment, Nitro configurations enable migrating highly oversubscribed on-premises workloads.

The VMware Cloud on AWS offering provides organizations the ability to extend and migrate their on-premises vSphere environment to AWS’s scalable and secure Cloud infrastructure. Customers can leverage vSphere, vSAN, NSX and vCenter technologies to extend their data centers and consume AWS services. vMotion provides the ability to live migrate VMs to AWS with limited or no downtime. While licensing for migrated VMs does not change at the VM level, it is imperative that the licensing on the new vSphere node be adequate. Since the customer has complete control of the environment, they have the ability to oversubscribe CPUs to any ratio. When applications such as SQL Server are licensed at the host level, oversubscription rates are irrelevant to licensing. If a vSphere node has 40 cores, as long as the 40 cores are licensed, the number of vCPUs assigned is immaterial. In the VMware environment, all operating systems and applications are BYOL and no licensing will be provided by AWS. Ultimately, this solution is free of the oversubscription burden that affects certain AWS dedicated tenancy options.

Optimize CPUs

EC2 Instance Types offer multiple fixed vCPU to memory configurations to match the customer’s workloads and use cases. With Optimize CPUs, customers now have the ability to specify the number of cores that an instance has access to, as well as whether Hyper-Threading is enabled. Hyper-Threading Technology enables multiple threads to run concurrently on a single Intel Xeon CPU core. Each thread is represented as a virtual CPU (vCPU) on the instance.

Controlling the thread and core count is significant for Microsoft SQL Server, as it is typically more RAM constrained than compute bound. With Optimize CPUs, you can potentially reduce the number of SQL Server licenses required by specifying a custom number of vCPUs. SQL Server on Amazon EC2 is often licensed per virtual core, and EC2 vCPUs are the equivalent of virtual cores. When licensing with virtual cores on EC2, the number of active vCPUs provisioned through Optimize CPUs may indicate the number of SQL Server licenses required.

For example, if you have a SQL Standard build that needs the RAM and network capabilities of an r4.4xlarge but not the 16 vCPUs that come configured on it, you can define the Optimize CPUs options in the CLI or API at launch to disable Hyper-Threading and limit the instance to 1, 2, 3, 4, 5, 6, 7 or 8 cores. With Hyper-Threading disabled, the instance runs at most 8 vCPUs instead of 16, so the number of vCPUs to license is cut by at least half.

The Optimize CPUs feature is available for new instance launches in all AWS public regions. To benefit from Optimize CPUs, you can bring SQL Server licenses to Amazon EC2 and use them on EC2 default tenancy or on allocated instances on an EC2 Dedicated Host or an EC2 Dedicated Instance. For a list of supported instance types and valid CPU counts, see the instance type documentation.
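As a rough illustration of the r4.4xlarge example, the following Python sketch builds the CpuOptions structure that the EC2 RunInstances API accepts (CoreCount and ThreadsPerCore) and computes the resulting number of active vCPUs. The helper names and the max_cores default of 8 are assumptions made for this example, based on the core counts listed above for r4.4xlarge.

```python
def cpu_options(core_count: int, threads_per_core: int = 1, max_cores: int = 8) -> dict:
    """Build a CpuOptions structure for an EC2 launch request.

    threads_per_core=1 disables Hyper-Threading; 2 leaves it enabled.
    max_cores=8 is an assumption matching the r4.4xlarge example.
    """
    if threads_per_core not in (1, 2):
        raise ValueError("threads_per_core must be 1 or 2")
    if not 1 <= core_count <= max_cores:
        raise ValueError(f"core_count must be between 1 and {max_cores}")
    return {"CoreCount": core_count, "ThreadsPerCore": threads_per_core}


def active_vcpus(options: dict) -> int:
    """Number of vCPUs the instance exposes with these options."""
    return options["CoreCount"] * options["ThreadsPerCore"]


if __name__ == "__main__":
    default = cpu_options(8, threads_per_core=2)  # r4.4xlarge as shipped
    trimmed = cpu_options(4)                      # 4 cores, no Hyper-Threading
    print(active_vcpus(default), active_vcpus(trimmed))
```

With boto3, a dictionary like this would be passed as the CpuOptions argument of the EC2 client's run_instances call.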

In this post, we’ve covered three AWS options and how each addresses BYOL with CPU oversubscription, as well as how Optimize CPUs can help cut licensing costs. EC2 Dedicated Hosts are generally the first choice in the Microsoft BYOL realm unless the customer is absolutely unwilling to right-size their highly oversubscribed instances. EC2 Bare Metal instances provide the customer the ability to configure all aspects of their hypervisor of choice and maintain any oversubscription that exists in their environment. This is a very popular choice that requires little change and ultimately gets their workloads to the AWS Cloud. The VMware Cloud on AWS option is sold by and provisioned by VMware. For users that are current VMware customers, this service allows them to bridge the gap between their on-premises data center and AWS while providing a seamless migration path to the Cloud using their current toolsets.

Introducing private registry authentication support for AWS Fargate

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/introducing-private-registry-authentication-support-for-aws-fargate/

Private registry authentication support for Amazon Elastic Container Service (Amazon ECS) is now available with the AWS Fargate launch type! Now, in addition to Amazon Elastic Container Registry (Amazon ECR), you can use any private registry or repository of your choice for both EC2 and Fargate launch types.

For ECS to pull from a private repository, it needs three things: a secret in AWS Secrets Manager with your registry credentials, an ECS task execution IAM role in AWS Identity and Access Management (IAM) with a policy granting access to the secret, and a task definition that references the ARNs of both the secret and the task execution IAM role.

Diagram of ECS Private Registry Authentication Architecture

Here’s how to use ECS with a private repository on Docker Hub via the AWS Management Console.


If you don’t already have a private repository (or account), you can create a free repo now. To follow along, run the following commands in a terminal to pull an image, get the image ID, and push it to your new repository:

docker pull tiffanyfay/space
docker images tiffanyfay/space --format {{.ID}}
docker tag <image-id> <your-username/repository-name>:latest
docker login
docker push <your-username/repository-name>

Secrets Manager

In the Secrets Manager console, store a new secret with your Docker Hub credentials, which is used to access your private repository.

By default, Secrets Manager creates an encryption key, DefaultEncryptionKey, on your behalf. You can instead use an existing key or add a new one with AWS Key Management Service (AWS KMS), if you would prefer.

Choose Other type of secrets and add secret keys and values for username and password.

Next, create a name, such as dockerhub, and description for your secret.

Because the keys correspond to your Docker Hub credentials, leave rotation disabled.

On the next page, you can review your settings and store your secret. Open your new secret to see the details. Write down the Secret ARN value and keep it handy, as it is used in the next step and later, in your task definition.
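If you prefer to script this step, the secret value is a small JSON document with the username and password keys described above. The following Python sketch only builds that string; the function name and the sample credentials are placeholders invented for illustration.

```python
import json


def registry_secret_string(username: str, password: str) -> str:
    """Serialize registry credentials using the username/password keys."""
    return json.dumps({"username": username, "password": password})


if __name__ == "__main__":
    # Placeholder credentials; never hard-code real ones.
    print(registry_secret_string("example-user", "example-password"))
```

With boto3, a string like this could be supplied as the SecretString argument of the Secrets Manager client's create_secret call.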


Now that you have a secret, you need to provide Fargate permissions to read it. This is done via a task execution IAM role.

In the IAM console, choose Policies, Create policy. Select Secrets Manager as the service, grant the read action secretsmanager:GetSecretValue, and specify your secret’s ARN as the resource.

Name your policy dockerhubsecret.

If you chose to use your own encryption key, you also need to create a policy with kms:Decrypt permissions for KMS.
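For reference, the permissions described above can be written as an IAM policy document. The Python sketch below assembles one. Combining the Secrets Manager and KMS statements into a single document is a choice made for this example (the text above describes them as separate policies), and the function name and the ARN in the usage line are made-up placeholders.

```python
import json
from typing import Optional


def execution_role_policy(secret_arn: str, kms_key_arn: Optional[str] = None) -> dict:
    """Build a policy document granting read access to one secret.

    If kms_key_arn is given (custom encryption key), a kms:Decrypt
    statement for that key is added as well.
    """
    statements = [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": [secret_arn],
        }
    ]
    if kms_key_arn:
        statements.append(
            {"Effect": "Allow", "Action": ["kms:Decrypt"], "Resource": [kms_key_arn]}
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Placeholder ARN for illustration only.
    arn = "arn:aws:secretsmanager:us-east-1:123456789012:secret:dockerhub-EXAMPLE"
    print(json.dumps(execution_role_policy(arn), indent=4))
```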

Next, choose Role to create an IAM role, which is used as your task execution role. Choose AWS service, Elastic Container Service, and Elastic Container Service Task.

Search for your dockerhubsecret policy and attach it to the role.

Lastly, give the role a name, such as ecsExecutionRoleDockerHub, and create it. Copy the role ARN value. Depending on how you create your task definition, you may need it.


While the mechanism to authenticate private registries is supported on both EC2 and Fargate launch types, for this example we will be launching a task on Fargate.

Before you can create a task, you need an ECS cluster, VPC, and subnets. If you don’t already have them, in the ECS console, choose Clusters, Get Started. Keep track of the cluster name, VPC ID, and subnet IDs, as you use them soon.

It’s time to create your task definition, which is used to create your task (grouping of up to ten containers that run on the same host). This is where you need your Secrets Manager ARN and IAM role name.

Choose Task Definitions, Create new Task Definition, and select the Fargate launch type. You can then configure your task definition via the wizard or scroll down, choose Configure via JSON and paste the following task definition after replacing the fields in angle brackets. This task definition also works with the EC2 launch type.

    {
        "family": "space-td",
        "containerDefinitions": [
            {
                "name": "space",
                "image": "<your-username/repository-name>",
                "portMappings": [
                    {
                        "protocol": "tcp",
                        "containerPort": 80
                    }
                ],
                "cpu": 0,
                "repositoryCredentials": {
                    "credentialsParameter": "<secret-ARN>"
                }
            }
        ],
        "memory": "512",
        "cpu": "256",
        "requiresCompatibilities": [
            "FARGATE"
        ],
        "networkMode": "awsvpc",
        "executionRoleArn": "<execution-role-ARN>"
    }

If you use the wizard, give your task a name, such as space-td, and specify your task execution IAM role (ecsExecutionRoleDockerHub), a task size of 0.5 GB of memory, and 0.25 vCPU.

Next, choose Container Definitions, Add container. Give the container a name, specify your image <your-username/repository-name>, check the box for private registry authentication, and add your Secrets Manager ARN and container port 80. Choose Add.
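The same task definition can also be assembled programmatically. The Python sketch below mirrors the JSON fields used in this walkthrough and checks the task size against a partial table of Fargate CPU and memory combinations. Only the two smallest CPU sizes are listed, and the function and constant names are assumptions made for this example.

```python
# Partial table of Fargate CPU units -> allowed memory values (MiB).
# Only the two smallest sizes are listed in this sketch.
FARGATE_MEMORY_FOR_CPU = {
    256: (512, 1024, 2048),
    512: (1024, 2048, 3072, 4096),
}


def build_task_definition(image, secret_arn, execution_role_arn, cpu=256, memory=512):
    """Return a task definition dict with private registry credentials."""
    if memory not in FARGATE_MEMORY_FOR_CPU.get(cpu, ()):
        raise ValueError(f"unsupported size in this sketch: cpu={cpu}, memory={memory}")
    return {
        "family": "space-td",
        "containerDefinitions": [
            {
                "name": "space",
                "image": image,
                "portMappings": [{"protocol": "tcp", "containerPort": 80}],
                "cpu": 0,
                "repositoryCredentials": {"credentialsParameter": secret_arn},
            }
        ],
        "memory": str(memory),
        "cpu": str(cpu),
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "executionRoleArn": execution_role_arn,
    }
```

With boto3, the resulting dictionary could be unpacked into the ECS client's register_task_definition call.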

After you create your task definition, choose Actions, Run Task, and specify the Fargate launch type, your cluster, cluster VPC, subnets, and a security group with inbound permissions for your container ports (the default one provides access to port 80). Enable auto-assigning a public IP address.
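For readers who script this step, the console choices above map onto the parameters of the ECS RunTask API. The following Python sketch builds that parameter set. The function name is illustrative, and the cluster, subnet, and security group values in the usage line are placeholders.

```python
def fargate_run_task_args(cluster, task_definition, subnets, security_groups):
    """Build RunTask parameters for a Fargate task with a public IP."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                "assignPublicIp": "ENABLED",  # auto-assign a public IP address
            }
        },
    }


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    print(fargate_run_task_args("my-cluster", "space-td", ["subnet-EXAMPLE"], ["sg-EXAMPLE"]))
```

With boto3, these arguments could be unpacked into the ECS client's run_task call.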

Open the task from its ID to see the details:

When the Last status field is RUNNING, under Network, copy the public IP address and paste it in a browser.

If you pushed tiffanyfay/space to your repository, you should see the following:

I hope this post has helped you. If you have any questions, feel free to reach out!


Special thanks to Yuling Zhou, Deepak Dayama, Derek Petersen, Varun Iyer, Adnan Khan and several others for their insights in this blog.

tiffany jernigan

Tiffany is a developer advocate at Amazon for containers on AWS. Previously she worked at Docker and Intel in software engineering and as a hardware engineer after graduating from Georgia Tech in Electrical Engineering. In the majority of her free time she dabbles in photography and spends time with family and friends. You can find her on twitter/ig as tiffanyfayj.

Compute Abstractions on AWS: A Visual Story

Post Syndicated from Massimo Re Ferre original https://aws.amazon.com/blogs/architecture/compute-abstractions-on-aws-a-visual-story/

When I joined AWS last year, I wanted to find a way to explain, in the easiest way possible, all the options it offers to users from a compute perspective. There are many ways to peel this onion, but I want to share a “visual story” that I have created.

I define the compute domain as “anything that has CPU and Memory capacity that allows you to run an arbitrary piece of code written in a specific programming language.” Your mileage may vary in how you define it, but this is broad enough that it should cover a lot of different interpretations.

A key part of my story is around the introduction of different levels of compute abstractions this industry has witnessed in the last 20 or so years.

Separation of duties

The start of my story is a line. In a cloud environment, this line defines the perimeter between the consumer role and the provider role. In the cloud, there are things that AWS will do and things that the consumer will do. The perimeter of these responsibilities varies depending on the services you opt to use. If you want to understand more about this concept, read the AWS Shared Responsibility Model documentation.

The different abstraction levels

The line above is oblique because it needs to intercept different compute abstraction levels. If you think about what happened in the last 20 years of IT, we have seen a surge of different compute abstractions that changed the way people consume CPU and Memory resources. It all started with physical (x86) servers back in the 80s, and then we have seen the industry adding abstraction layers over the years (for example, hypervisors, containers, functions).

The higher you go in the abstraction levels, the more the cloud provider can add value and relieve the consumer of non-strategic activities. A lot of these activities tend to be “undifferentiated heavy lifting.” We define this as something that AWS customers have to do but that doesn’t necessarily differentiate them from their competitors (because those activities are table-stakes in that particular industry).

What we found is that supporting millions of customers on AWS requires a certain degree of flexibility in the services we offer because there are many different patterns, use cases, and requirements to satisfy. Giving our customers choices is something AWS always strives for.

A couple of final notes before we dig deeper. The way this story builds up through the blog post is aligned to the progression of the launch dates of the various services, with a few noted exceptions. Also, the services mentioned are all generally available and production-grade. For full transparency, the integration among some of them may still be work-in-progress, which I’ll call out explicitly as we go.

The instance (or virtual machine) abstraction

This is the very first abstraction we introduced on AWS back in 2006. Amazon Elastic Compute Cloud (Amazon EC2) is the service that allows AWS customers to launch instances in the cloud. When customers intercept us at this level, they retain responsibility of the guest operating system and above (middleware, applications, etc.) and their lifecycle. AWS has the responsibility for managing the hardware and the hypervisor including their lifecycle.

At the very same level of the stack there is also Amazon Lightsail, which “is the easiest way to get started with AWS for developers, small businesses, students, and other users who need a simple virtual private server (VPS) solution. Lightsail provides developers compute, storage, and networking capacity and capabilities to deploy and manage websites and web applications in the cloud.”

And this is how these two services appear in our story:

The container abstraction

With the rise of microservices, a new abstraction took the industry by storm in the last few years: containers. Containers are not a new technology, but the rise of Docker a few years ago democratized access. You can think of a container as a self-contained environment with soft boundaries that includes both your own application as well as the software dependencies to run it. Whereas an instance (or VM) virtualizes a piece of hardware so that you can run dedicated operating systems, a container technology virtualizes an operating system so that you can run separated applications with different (and often incompatible) software dependencies.

And now the tricky part. Modern containers-based solutions are usually implemented in two main logical pieces:

  • A containers control plane that is responsible for exposing the API and interfaces to define, deploy, and lifecycle containers. This is also sometimes referred to as the container orchestration layer.
  • A containers data plane that is responsible for providing capacity (as in CPU/Memory/Network/Storage) so that those containers can actually run and connect to a network. From a practical perspective this is typically a Linux host or less often a Windows host where the containers get started and wired to the network.

Arguably, in a specific compute abstraction discussion, the data plane is key, but it is as important to understand what’s happening for the control plane piece.

In 2014, Amazon launched a production-grade containers control plane called Amazon Elastic Container Service (ECS), which “is a highly scalable, high performance container management service that supports Docker … Amazon ECS eliminates the need for you to install, operate, and scale your own cluster management infrastructure.”

In 2017, Amazon also announced the intention to release a new service called Amazon Elastic Container Service for Kubernetes (EKS) based on Kubernetes, a successful open source containers control plane technology. Amazon EKS was made generally available in early June 2018.

Just like for ECS, the aim for this service is to free AWS customers from having to manage a containers control plane. In the past, AWS customers would spin up EC2 instances and deploy/manage their own Kubernetes masters (masters is the name of the Kubernetes hosts running the control plane) on top of an EC2 abstraction. However, we believe many AWS customers will leave to AWS the burden of managing this layer by either consuming ECS or EKS, depending on their use cases. A comparison between ECS and EKS is beyond the scope of this blog post.

You may have noticed that what we have discussed so far is about the container control plane. How about the containers data plane? This is typically a fleet of EC2 instances managed by the customer. In this particular setup, the containers control plane is managed by AWS while the containers data plane is managed by the customer. One could argue that, with ECS and EKS, we have raised the abstraction level for the control plane, but we have not yet really raised the abstraction level for the data plane as the data plane is still comprised of regular EC2 instances that the customer has responsibility for.

There is more on that later on but, for now, this is how the containers control plane and the containers data plane services appear:

The function abstraction

At re:Invent 2014, AWS introduced another abstraction layer: AWS Lambda. Lambda is an execution environment that allows an AWS customer to run a single function. So instead of having to manage and run a full-blown OS instance to run your code, or having to track all software dependencies in a user-built container to run your code, Lambda allows you to upload your code and let AWS figure out how to run it at scale.

What makes Lambda so special is its event-driven model. Not only can you invoke Lambda directly (for example, via the Amazon API Gateway), but you can trigger a Lambda function upon an event in another AWS service (for example, an upload to Amazon S3 or a change in an Amazon DynamoDB table).
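To make the event-driven model concrete, here is a minimal Python sketch of a function that could be triggered by an upload to Amazon S3. It assumes the standard S3 event notification layout (a Records list with s3.bucket.name and s3.object.key), and the handler name is a convention chosen for this example, not a requirement.

```python
def lambda_handler(event, context):
    """Collect the bucket/key pairs named in an S3 event notification."""
    uploaded = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        uploaded.append((s3["bucket"]["name"], s3["object"]["key"]))
    return {"uploaded": uploaded}


if __name__ == "__main__":
    # Hand-built sample event for local experimentation.
    sample = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"}, "object": {"key": "photo.jpg"}}}
        ]
    }
    print(lambda_handler(sample, None))
```

The function contains only the business logic; where and how it runs is left to the service.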

The key point about Lambda is that you don’t have to manage the infrastructure underneath the function you are running. No need to track the status of the physical hosts, no need to track the capacity of the fleet, no need to patch the OS where the function will be running. In a nutshell, no need to spend time and money on the undifferentiated heavy lifting.

And this is how the Lambda service appears:

The bare metal abstraction

Also known as the “no abstraction.”

As recently as re:Invent 2017, we announced (the preview of) the Amazon EC2 bare metal instances. We made this service generally available to the public in May 2018.

This announcement is part of Amazon’s strategy to provide choice to our customers. In this case, we are giving customers direct access to hardware. To quote from Jeff Barr’s post:

“…. (AWS customers) wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.”

This is how the bare metal Amazon EC2 i3.metal instance appears:

As a side note, and also as alluded to by Jeff, i3.metal is the foundational EC2 instance type on top of which VMware created their own VMware Cloud on AWS service. We are now offering the ability to any AWS user to provision bare metal instances. This doesn’t necessarily mean you can load your hypervisor of choice out of the box, but you can certainly do things you wouldn’t be able to do with a traditional EC2 instance (note: this was just a Saturday afternoon hack).

More seriously, a question I often get asked is whether users could install ESXi on i3.metal on their own. Today this cannot be done, but I’d be interested in hearing your use case for this.

The full container abstraction (for lack of a better term)

Now that we covered all the abstractions, it is time to go back and see if there are other optimizations we can provide for AWS customers. When we discussed the container abstraction, we called out that while there are two different fully managed containers control planes (ECS and EKS), there wasn’t a managed option for the data plane.

Some customers were (and still are) happy about being in full control of said instances. Others have been very vocal that they wanted to get out of the (undifferentiated heavy-lifting) business of managing the lifecycle of that piece of infrastructure.

Enter AWS Fargate, a production-grade service that provides compute capacity to AWS containers control planes. Practically speaking, Fargate is making the containers data plane fall into the “Provider space” responsibility. This means the compute unit exposed to the user is the container abstraction, while AWS will manage transparently the data plane abstractions underneath.

This is how the Fargate service appears:

Now ECS has two “launch types”: one called “EC2” (where your tasks get deployed on a customer-managed fleet of EC2 instances), and the other one called “Fargate” (where your tasks get deployed on an AWS-managed fleet of EC2 instances).

For EKS, the strategy will be very similar, but as of this writing it is not yet available. If you’re interested in some of the exploration being done to make this happen, this is a good read.


We covered the spectrum of abstraction levels available on AWS and how AWS customers can intercept them depending on their use cases and where they sit on their cloud maturity journey. Customers with a “lift & shift” approach may be more inclined to consume services on the left-hand side of the slide, whereas customers with a more mature cloud native approach may be more interested in consuming services on the right-hand side of the slide.

In general, customers tend to use higher-level services to get out of the business of managing non-differentiating activities. For example, I recently talked to a customer interested in using Fargate. The trigger there was the fact that Fargate is ISO, PCI, SOC and HIPAA compliant, which was a huge time and money saver for them because it’s easier to point to an AWS document during an audit than having to architect and document for compliance the configuration of a DIY containers data plane.

As a recap, here’s our visual story with all the abstractions available:

I hope you found it useful. Any feedback is greatly appreciated.

About the author

Massimo is a Principal Solutions Architect at AWS. For about 25 years, he specialized in the x86 ecosystem starting with operating systems and virtualization technologies, and lately he has been head down learning about cloud and how application architectures are evolving in that space. Massimo has a blog at www.it20.info and his Twitter handle is @mreferre.

Running Hyper-V on Amazon EC2 Bare Metal Instances

Post Syndicated from Martin Yip original https://aws.amazon.com/blogs/compute/running-hyper-v-on-amazon-ec2-bare-metal-instances/

AWS recently announced the general availability of Amazon EC2 bare metal instances. This post provides an overview of launching, setting up, and configuring a Hyper-V enabled host, and then launching a guest virtual machine (VM) within Hyper-V running on i3.metal.


The key elements of this process include the following steps:

  1. Launch a Windows Server 2016 with Hyper-V AMI provided by Amazon.
  2. Configure Hyper-V networking.
  3. Launch a Hyper-V guest VM.

Launch a Windows Server 2016 with Hyper-V AMI provided by Amazon

1. Open the EC2 console.
2. Choose Public Images and search for the Amazon Hyper-V AMIs.
3. Select your preferred Hyper-V AMI, and choose Launch.
4. Follow the Launch wizard process to launch the instance on i3.metal.

The Amazon Hyper-V AMIs have the Hyper-V role pre-enabled. You can also launch a Windows Server 2016 Base AMI to i3.metal, and enable the Hyper-V role for your use case.

Configure Hyper-V networking

To enable networking for your Hyper-V guests, so that they can connect to other resources in your VPC or to the internet via your VPC internet gateway, ensure that you have first configured your VPC. For more information, see Creating and Attaching an Internet Gateway.

Hyper-V provides three types of virtual switches for networking:

  • External
  • Internal
  • Private

In this solution, you are creating an internal virtual switch and using the Hyper-V host as the NAT server for the guest VMs, similar to Microsoft’s topic Set up a NAT network.

You can specify your own virtual network range. The commands below use <virtual-network-CIDR> for the range of the virtual network inside the Hyper-V host and <NAT-gateway-IP> for the gateway address within that range.

  1. Run the following PowerShell command to create the internal virtual switch:
    New-VMSwitch -SwitchName "Hyper-VSwitch" -SwitchType Internal
  2. Determine which network interface is associated with the virtual switch. For this solution, the Get-NetAdapter command shows that the Hyper-V virtual switch has an ifIndex value of 12.
  3. Configure the Hyper-V Virtual Ethernet adapter with the NAT gateway IP address. This IP address is used as the default gateway (router IP) for the guest VMs. The following command sets the IP address with a 24-bit prefix (subnet mask) on the interface (InterfaceIndex 12):
    New-NetIPAddress -IPAddress <NAT-gateway-IP> -PrefixLength 24 -InterfaceIndex 12
  4. Create a NAT virtual network using the virtual network range:
    New-NetNat -Name MyNATnetwork -InternalIPInterfaceAddressPrefix <virtual-network-CIDR>

Now the environment is ready for the guest VMs to have outbound communication with other resources through the host NAT. For each VM, assign an IP address with the default gateway (<NAT-gateway-IP>). This can be done manually within each guest VM. In this solution, you make it easier by enabling a DHCP server within the Hyper-V host to automatically assign IP addresses.

Setting up DHCP server role on the host

  1. Run the following command to add the DHCP role to the host:
    Install-WindowsFeature -Name 'DHCP' -IncludeManagementTools
  2. To configure the DHCP server to bind on the Hyper-V virtual interface, choose Control Panel, Administrative Tools, DHCP.
  3. Select this computer, add or remove bindings, and then select the IP address corresponding to the Hyper-V virtual interface (that is, <NAT-gateway-IP>).
  4. Configure the DHCP scope and specify a range from the subnet that you determined earlier:
    Add-DhcpServerv4Scope -Name GuestIPRange -StartRange <start-IP> -EndRange <end-IP> -SubnetMask <subnet-mask> -State Active

    You should be able to see the range in the DHCP console, as in the following screenshot:

  5. For Router, choose the NAT gateway IP address that you assigned to the Hyper-V network adapter (<NAT-gateway-IP>).
  6. For DNS server, use the Amazon DNS, which is the second IP address for the VPC (the base of the VPC network range plus two).
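The gateway address, subnet mask, and DHCP range used in the steps above can all be derived from the virtual network range you pick. The Python sketch below does that with the standard ipaddress module. The 10.0.0.0/24 and 172.31.0.0/16 ranges in the usage lines are arbitrary values chosen for illustration, not the ones used in this post, and using the first host address as the gateway is an assumption of this sketch.

```python
import ipaddress


def nat_network_plan(cidr: str) -> dict:
    """Derive gateway, mask, and DHCP range from a virtual network CIDR.

    Assumes the first usable address is the NAT gateway and the rest of
    the usable addresses form the DHCP scope.
    """
    net = ipaddress.ip_network(cidr)
    hosts = list(net.hosts())
    return {
        "gateway": str(hosts[0]),
        "prefix_length": net.prefixlen,
        "subnet_mask": str(net.netmask),
        "dhcp_start": str(hosts[1]),
        "dhcp_end": str(hosts[-1]),
    }


def vpc_dns_address(vpc_cidr: str) -> str:
    """Amazon DNS address: base of the VPC network range plus two."""
    return str(ipaddress.ip_network(vpc_cidr).network_address + 2)


if __name__ == "__main__":
    # Example ranges are placeholders, not values from this walkthrough.
    print(nat_network_plan("10.0.0.0/24"))
    print(vpc_dns_address("172.31.0.0/16"))
```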

Launch a Hyper-V guest VM

For this post, follow the new VM wizard to create an Ubuntu 18.04 LTS guest VM. First, download the Ubuntu installation ISO from the Ubuntu website to your Hyper-V host, and store it on a secondary EBS volume that you added as the D: drive.

i3.metal instances use Amazon EBS and instance store volumes with the NVM Express (NVMe) interface. When you stop an i3.metal instance, any data stored on instance store volumes is lost. I recommend storing your guest VM’s hard drive (vhd or vhdx) on an EBS volume that is attached to your i3.metal instance. This can be the root volume (C:) or any additional EBS volumes attached to the instance. For more information, see What’s the difference between instance store and EBS?

After that is complete, follow these steps:

  1. In Hyper-V Manager, choose Actions, New, Virtual Machine.
  2. Follow the wizard with your desired configuration up to the Configure Networking section.
  3. In the Configure Networking step, for Connection, choose Hyper-V Switch, and choose Next.
  4. In the Connect Virtual Hard Disk step, enter a name for the virtual hard disk. Use the default location C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks\.
  5. Specify the size of the virtual hard disk, and choose Next.
  6. In the Installation Options step, choose the Ubuntu ISO that you downloaded earlier.
  7. Finish the wizard and start the VM, then follow the steps on the Ubuntu installation wizard. As you have already set up DHCP and NAT for the Hyper-V network, the Ubuntu VM automatically gets an IP address from the DHCP scope that you defined earlier.
  8. Confirm the connectivity of the VM to the internet.


You’ve just built a Hyper-V host on an EC2 bare metal instance. Now you’re ready to add more guest VMs and put them to work!

Migrating a multi-tier application from a Microsoft Hyper-V environment using AWS SMS and AWS Migration Hub

Post Syndicated from Martin Yip original https://aws.amazon.com/blogs/compute/migrating-a-multi-tier-application-from-a-microsoft-hyper-v-environment-using-aws-sms-and-aws-migration-hub/

Shane Baldacchino is a Solutions Architect at Amazon Web Services

Many customers ask for guidance to migrate end-to-end solutions running in their on-premises data center to AWS. This post provides an overview of moving a common blogging platform, WordPress, running on an on-premises virtualized Microsoft Hyper-V platform to AWS, including re-pointing the DNS records associated to the website.

AWS Server Migration Service (AWS SMS) is an agentless service that makes it easier and faster for you to migrate thousands of on-premises workloads to AWS. In November 2017, AWS added support for Microsoft’s Hyper-V hypervisor. AWS SMS allows you to automate, schedule, and track incremental replications of live server volumes, making it easier for you to coordinate large-scale server migrations. In this post, I guide you through migrating your multi-tier workloads using both AWS SMS and AWS Migration Hub.

Migration Hub provides a single location to track the progress of application migrations across multiple AWS and partner solutions. In this post, you use AWS SMS as a mechanism to migrate the virtual machines (VMs) and track them via Migration Hub. You can also use other third-party tools in Migration Hub, and choose the migration tools that best fit your needs. Migration Hub allows you to get progress updates across all migrations, identify and troubleshoot any issues, and reduce the overall time and effort spent on your migration projects.

Migration Hub and AWS SMS are both free. You pay only for the cost of the individual migration tools that you use, and any resources being consumed on AWS.


For this walkthrough, the WordPress blog is currently running as a two-tier stack in a corporate data center. The example environment is multi-tier and polyglot in nature. The frontend uses Windows Server 2016 (running IIS 10 with PHP as an ISAPI extension) and the backend is supported by a MySQL server running on Ubuntu 16.04 LTS. All systems are hosted on a virtualized platform. As the environment consists of multiple servers, you can use Migration Hub to group the servers together as an application and manage the holistic process of migrating the application.
The key elements of this migration process involve the following steps:

  1. Establish your AWS environment.
  2. Replicate your database.
  3. Download the SMS Connector from the AWS Management Console.
  4. Configure AWS SMS and Hyper-V permissions.
  5. Install and configure the SMS Connector appliance.
  6. Configure Hyper-V host permissions.
  7. Import your virtual machine inventory and create a replication job.
  8. Use AWS Migration Hub to track progress.
  9. Launch your Amazon EC2 instance.
  10. Change your DNS records to resolve the WordPress blog to your EC2 instance.

Before you start, ensure that your source systems' OS and hypervisor versions are supported by AWS SMS. For more information, see the Server Migration Service FAQ. This post focuses on the Microsoft Hyper-V hypervisor.

Establish your AWS environment

First, establish your AWS environment. If your organization is new to AWS, this may include account or subaccount creation, a new virtual private cloud (VPC), and associated subnets, route tables, internet gateways, and so on. Think of this phase as setting up your software-defined data center. For more information, see Getting Started with Amazon EC2 Linux Instances.

The blog is a two-tier stack, so go with two private subnets. Because you want it to be highly available, use multiple Availability Zones. An Availability Zone resides within an AWS Region. Each Availability Zone is isolated, but the zones within a Region are connected through low-latency links. This allows architects and solution designers to build highly available solutions.

Replicate your database

WordPress uses a MySQL relational database. You could continue to manage MySQL yourself, along with the EC2 instances needed to maintain and scale it. But for this walkthrough, I am using this opportunity to migrate to an RDS instance of Amazon Aurora, as it is a MySQL-compatible database. Not only is Amazon Aurora a high-performance database engine, but it also frees you up to focus on application development by managing time-consuming database administration tasks, including backups, software patching, monitoring, scaling, and replication.

Use AWS Database Migration Service (AWS DMS) to migrate your MySQL database to Amazon Aurora easily and securely. You can send the results from AWS DMS to Migration Hub. This allows you to create a single pane view of your application migration.

After a database migration instance has been instantiated, configure the source and destination endpoints and create a replication task.

By attaching to the MySQL binlog, you can seed in the current data in the database and also capture all future state changes in near–real time. For more information, see Migrating a MySQL-Compatible Database to Amazon Aurora.

Finally, the task shows that you are replicating current data in your WordPress blog database and future changes from MySQL into Amazon Aurora.
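The replication task described above can be sketched in code. The following Python is a minimal illustration, not the configuration used in this walkthrough: it only builds the parameters that the AWS DMS CreateReplicationTask API accepts for a full load followed by ongoing change capture. The task name, the `wordpress` schema name, and the ARNs are placeholders assumed for the example.

```python
import json


def build_replication_task(source_arn, target_arn, instance_arn, schema="wordpress"):
    """Return keyword arguments for the DMS create_replication_task call."""
    # Selection rule: include every table in the (assumed) WordPress schema.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-wordpress-tables",
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": "wordpress-to-aurora",  # hypothetical name
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Seed the existing rows first, then keep applying binlog changes.
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
    }


if __name__ == "__main__":
    params = build_replication_task(
        "arn:aws:dms:region:account:endpoint:SOURCE",  # placeholder ARNs
        "arn:aws:dms:region:account:endpoint:TARGET",
        "arn:aws:dms:region:account:rep:INSTANCE",
    )
    print(json.dumps(params, indent=2))
    # With boto3 installed and credentials configured, the call would be:
    #   boto3.client("dms").create_replication_task(**params)
```

Keeping the parameter construction separate from the API call makes it easy to review the task definition before anything is created in your account.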

Download the SMS Connector from the AWS Management Console

Now, use AWS SMS to migrate your IIS/PHP frontend. AWS SMS is delivered as a virtual appliance that can be deployed in your Hyper-V environment.

To download the SMS Connector, log in to the console and choose Server Migration Service, Connectors, SMS Connector setup guide. Download the VHD file for SCVMM/Hyper-V.

Configure SMS

Both your hypervisor and AWS SMS need a user with sufficient privileges to perform migrations.

Launch a new VM in Hyper-V based on the SMS Connector that you downloaded. To configure the connector, connect to it via HTTPS. You can obtain the SMS Connector IP address from within Hyper-V. By default, the SMS Connector uses DHCP to obtain a valid IP address.

Connect to the SMS Connector via HTTPS by entering its IP address in your browser. As the SMS Connector can only work with one hypervisor at a time, you must state the hypervisor with which to interface. For the purpose of this post, the examples use Microsoft Hyper-V.

Configure the connector with the IAM and hypervisor credentials that you created earlier.

After you have entered both your AWS and Hyper-V credentials and the associated connectivity and authentication checks have passed, you are redirected to the home page of your SMS Connector. The home page shows the connectivity status and health of the SMS Connector.

Configure Hyper-V host permissions

You also must modify your Hyper-V hosts to provide WinRM connectivity. AWS provides a downloadable PowerShell script to configure your Windows environment to support WinRM communications with the SMS Connector. The same script is used for configuring either standalone Hyper-V or SCVMM.

Execute the PowerShell script and follow the prompts. In the following example, Reconfigure Hyper-V not managed by SCVMM (Standalone Hyper-V)… was selected.

Import your virtual machine inventory and create a replication job

You have now configured the SMS Connector and your Microsoft Hyper-V hosts. Switch to the console to import your server catalog to AWS SMS. Within AWS SMS, choose Connectors, Import Server Catalog.

This process can take up to a few minutes and is dependent on the number of machines in your Hyper-V inventory.

Select the server to migrate and choose Create replication job. The console guides you through the process. The time that the initial replication task takes to complete depends on the available bandwidth and the size of your VM. After the initial seed replication, network bandwidth usage is minimized, as AWS SMS replicates only incremental changes occurring on the VM.

Use Migration Hub to track progress

You have now successfully started your database migration via AWS DMS, set up your SMS Connector, configured your Microsoft Hyper-V environment, and started a replication job.

You can now track the collective progress of your application migration. To track migration progress, connect AWS DMS and AWS SMS to Migration Hub.

To do this, navigate to Migration Hub in the AWS Management Console. Under Migrate and Tools, connect both services so that the migration status of these services is sent to Migration Hub.

You can then group your servers into an application in Migration Hub and collectively track the progress of your migration. In this example, I created an application, Company Blog, and added in my servers from both AWS SMS and AWS DMS.

The progress updates from linked services are automatically sent to Migration Hub so that you can track tasks in progress. The dashboard reflects any status changes that occur in the linked services. You can see from the following image that one server is complete while another is in progress.

Using Migration Hub, you can view the migration progress of all applications. This allows you to quickly get progress updates across all of your migrations, easily identify and troubleshoot any issues, and reduce the overall time and effort spent on your migration projects.

Launch your EC2 instance

When your replication task is complete, the artifact created by AWS SMS is a custom AMI that you can use to deploy an EC2 instance. Follow the usual process to launch your EC2 instance, using the custom AMI created by AWS SMS, noting that you may need to replace any host-based firewalls with security groups and NACLs.

When you create an EC2 instance, ensure that you pick the most suitable EC2 instance type and size to match your performance requirements while optimizing for cost.

While your new EC2 instance is a replica of your on-premises VM, you should always validate that applications are functioning. How you do this differs on an application-by-application basis. You can use a combination of approaches, such as editing a local hosts file and testing your application, SSH, RDP, and Telnet.

From the RDS console, get your connection string details and update your WordPress configuration file to point to the Amazon Aurora database. As WordPress is expecting a MySQL database and Amazon Aurora is MySQL-compatible, this change of database engine is transparent to WordPress.

Change your DNS records to resolve the WordPress blog to your EC2 instance

You have validated that your WordPress application is running correctly, and you are still receiving changes from your on-premises data center via AWS DMS into your Amazon Aurora database. You can now update your DNS zone file using Amazon Route 53. Amazon Route 53 can be driven by multiple methods: console, SDK, or AWS CLI.

For this walkthrough, use Windows PowerShell for AWS to update the DNS zone file. The example shows UPSERTING the A record in the zone to resolve to the Amazon EC2 instance created with AWS SMS.

Based on the TTL of your DNS zone file, end users gradually resolve the WordPress blog to AWS.
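The same UPSERT can be expressed through any of the SDKs. As a hedged sketch in Python rather than the PowerShell used in this walkthrough, the following builds the change batch that the Route 53 ChangeResourceRecordSets API expects. The record name, IP address, TTL, and comment text are placeholder values assumed for illustration.

```python
def build_upsert_batch(record_name, ip_address, ttl=300):
    """Build a Route 53 change batch that UPSERTs a single A record."""
    return {
        "Comment": "Point the WordPress blog at the migrated EC2 instance",
        "Changes": [
            {
                # UPSERT creates the record, or overwrites it if it exists.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip_address}],
                },
            }
        ],
    }


def apply_change(hosted_zone_id, change_batch):
    """Submit the change batch; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("route53")
    return client.change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=change_batch
    )


if __name__ == "__main__":
    # blog.example.com and 203.0.113.10 are placeholders, not real values.
    print(build_upsert_batch("blog.example.com.", "203.0.113.10"))
```

A shorter TTL on the existing record before the cutover reduces how long resolvers keep returning the on-premises address.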


You have now successfully migrated your WordPress blog to AWS using AWS migration services, specifically the AWS SMS Hyper-V/SCVMM Connector. Your blog now resolves to AWS. After validation, you are ready to decommission your on-premises resources.

Many architectures can be extended to take advantage of the inherent benefits of AWS with little effort. For example, you can place an Application Load Balancer in front of your instances and use Amazon CloudWatch metrics to drive scaling policies. This removes the single point of failure of a single EC2 instance.

Building Real Time AI with AWS Fargate

Post Syndicated from AWS Admin original https://aws.amazon.com/blogs/architecture/building-real-time-ai-with-aws-fargate/

This post is a contribution from AWS customer, Veritone. It was originally published on the company’s Website

Here at Veritone, we deal with a lot of data. Our product uses the power of cognitive computing to analyze and interpret the contents of structured and unstructured data, particularly audio and video. We use cognitive computing to provide valuable insights to our customers.

Our platform is designed to ingest audio, video and other types of data via a series of batch processes (called “engines”) that process the media and attach some sort of output to it, such as transcripts or facial recognition data.

Our goal was to design a data pipeline that could process streaming audio, video, or other content from sources, such as IP cameras, mobile devices, and structured data feeds in real-time, through an open ecosystem of cognitive engines. This enables support for customer use cases like real-time transcription for live-broadcast TV and radio, face and object detection for public safety applications, and the real-time analysis of social media for harmful content.

Why AWS Fargate?
We leverage Docker containers as the deployment artifact of both our internal services and cognitive engines. This gave us the flexibility to deploy and execute services in a reliable and portable way. Fargate on AWS turned out to be a perfect tool for orchestrating the dynamic nature of our deployments.

Fargate allows us to quickly scale Docker-based engines from zero to any desired number without having to worry about pre-provisioning capacity or bootstrapping and managing EC2 instances. We use Fargate both as a backend for quickly starting engine containers on demand and for the orchestration of services that need to always be running. It enables us to handle sudden bursts of real-time workloads with a consistent launch time. Fargate also allows our developers to get near-immediate feedback on deployments without having to manage any infrastructure or deal with downtime. The integration with Fargate makes this super simple.

Moving to Real Time
We designed a solution (shown below), in which media from a source, such as a mobile app, which “pushes” streams into our platform, or an IP camera feed, which is “pulled”, is streamed through a series of containerized engines, processing the data as it is ingested. Some engines, which we refer to as Stream Engines, work on raw media streams from start to finish. For all others, streams are decomposed into a series of objects, such as video frames or small audio/video chunks that can be processed in parallel by what we call Object Engines. An output stream of results from each engine in the pipeline is relayed back to our core platform or customer-facing applications via Veritone’s APIs.

Message queues placed between the components facilitate the flow of stream data, objects, and events through the data pipeline. For that, we defined a number of message formats. We decided to use Apache Kafka, a streaming message platform, as the message bus between these components.

Kafka gives us the ability to:

  • Guarantee that a consumer receives an entire stream of messages, in sequence.
  • Buffer streams and have consumers process streams at their own pace.
  • Determine “lag” of engine queues.
  • Distribute workload across engine groups, by utilizing partitions.
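To illustrate the ordering and partitioning points in the list above, here is a small sketch of key-based partition assignment. It is not Veritone's implementation: it uses CRC32 as a stand-in hash (Kafka clients use their own hash function), and the stream IDs and payloads are invented for the example. The idea it shows is that messages keyed by stream ID always land in the same partition, so one consumer sees a stream in sequence while different streams spread across partitions.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Map a message key to a partition with a stable hash (CRC32 stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Group (stream_id, payload) pairs by partition, preserving arrival order."""
    partitions = defaultdict(list)
    for stream_id, payload in messages:
        partitions[partition_for(stream_id, num_partitions)].append((stream_id, payload))
    return partitions


if __name__ == "__main__":
    msgs = [
        ("camera-1", "frame-0"),
        ("mobile-7", "chunk-0"),
        ("camera-1", "frame-1"),
        ("mobile-7", "chunk-1"),
    ]
    for partition, items in sorted(route(msgs, 4).items()):
        print(partition, items)
```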

The flow of stream data and the lifecycle of the engines is managed and coordinated by a number of microservices written in Go. These include the Scheduler, Coordinator, and Engine Orchestrators.

Deployment and Orchestration
For processing real-time data, such as streaming video from a mobile device, we required the flexibility to deploy dynamic container configurations and often define new services (engines) on the fly. Stream Engines need to be launched on-demand to handle an incoming stream. Object Engines, on the other hand, are brought up and torn down in response to the amount of pending work in their respective queues.

EC2 instances typically require provisioning to be done in anticipation of incoming load and generally take too long to start in this case. We needed a way to quickly scale Docker containers on demand, and Fargate made this achievable with very little effort.
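The queue-driven scaling of Object Engines described above could be reduced to a rule like the following. This is a hypothetical sketch, not the logic of our Engine Orchestrators: the per-engine capacity and the upper bound are assumed parameters, and a real orchestrator would also smooth the signal to avoid starting and stopping tasks too often.

```python
import math


def desired_engine_count(lag, per_engine_capacity, max_engines):
    """Return how many Object Engine tasks to run for the current queue lag.

    lag: messages waiting in the engine's queue
    per_engine_capacity: messages one engine is expected to clear per interval
    max_engines: upper bound that caps cost
    """
    if per_engine_capacity <= 0:
        raise ValueError("per_engine_capacity must be positive")
    if lag <= 0:
        return 0  # scale to zero when nothing is pending
    return min(max_engines, math.ceil(lag / per_engine_capacity))


if __name__ == "__main__":
    for lag in (0, 250, 5000):
        print(lag, desired_engine_count(lag, per_engine_capacity=100, max_engines=10))
```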

In Closing
Fargate helped us solve a lot of problems related to real-time processing, including the reduction of operational overhead, for this dynamic environment. We expect it to continue to grow and mature as a service. Some features we would like to see in the near future include GPU support for our GPU-based AI Engines and the ability to cache larger container images for quicker “warm” launch times.

About Veritone
Veritone created the world’s first operating system for Artificial Intelligence. Veritone’s aiWARE operating system unlocks the power of cognitive computing to transform and analyze audio, video and other data sources in an automated manner to generate actionable insights. The Veritone platform provides customers ease, speed and accuracy at low cost.

The Veritone authors are Christopher Stobie – [email protected] and Mezzi Sotoodeh – [email protected]

Improving application performance and reducing costs with Amazon EBS-Optimized Instance burst capability

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/improving-application-performance-and-reducing-costs-with-amazon-ebs-optimized-instance-burst-capability/

Contributed by Sooraj Prasannan, Senior Product Manager, Amazon Elastic Block Store

In November 2017, Amazon EC2 introduced C5 compute-intensive instances and M5 general-purpose instances. In the first half of 2018, we released EC2 C5d instances and M5d instances by adding high-speed, ultra-low latency local NVMe storage to the EC2 C5 and M5 instance families. EC2 C5/C5d and M5/M5d instances are built on the Nitro system. This collection of AWS-built hardware and software components enables high performance, high availability, high security, and bare metal capabilities to reduce virtualization overhead.

During the design of the Nitro system, we analyzed real-world workloads and recognized the need for smaller instance sizes to drive higher performance from their Amazon EBS volumes. We found that the majority of application storage needs are bursty, with short, intense periods of high I/O and plenty of idle time between bursts. To improve the experience for these workloads, we developed burst capability for smaller instance sizes. Available on EC2 C5/C5d and M5/M5d instances, this feature enables large, xlarge, and 2xlarge instance sizes to drive the same performance as the 4xlarge instance for 30 minutes each day.

For applications with spiky Amazon EBS demand, you can right-size your instances based on your CPU and memory requirements and still meet your EBS-optimized instance performance requirements. This higher performance also enables you to speed up sections of your workflow dependent on EBS-optimized instance performance. Faster workflows result in quicker job completions and improved resource utilization. The burst capability ultimately enables you to reduce costs by right-sizing your instance and improving total resource usage.

With this performance increase, you will be able to handle unplanned spikes in demand without any impact to your application performance. You can now size your instances based on historical average trends. This burst capability gives you more performance to absorb spikes without affecting your customer experience.

Using Amazon CloudWatch metrics to monitor burst usage

For better visibility into your performance, instances based on the Nitro system provide Amazon CloudWatch metrics to help profile your usage. Based on the usage profile, you can decide if smaller instances meet your requirements.

These instances give you the ability to monitor your usage via instance-level CloudWatch metrics for operations (EBSReadOps and EBSWriteOps) and bytes transferred (EBSReadBytes and EBSWriteBytes). For more information on these metrics, see List of available CloudWatch metrics for your instances. These metrics support basic monitoring (five-minute frequency) by default, but you can enable detailed monitoring (one-minute frequency) for an additional cost. For more information, see Amazon CloudWatch pricing.

For large, xlarge, and 2xlarge instances, we also provide burst balance metrics. EBSIOBalance% monitors the instance I/O burst bucket, and EBSByteBalance% monitors the instance byte burst bucket. These metrics give information about the percentage of I/O or bytes credits remaining in the respective burst buckets. The metrics are expressed as a percentage, where 100% means that the instance has accumulated the maximum number of credits. You can set up an alarm that triggers if the balance gets too low.
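As one way to set up the alarm mentioned above, the following Python sketch builds the parameters for the CloudWatch PutMetricAlarm API. The 20% threshold, the alarm naming scheme, and the instance ID are assumptions chosen for illustration; pick a threshold that fits how quickly your workload drains the bucket.

```python
def build_burst_alarm(instance_id, metric_name="EBSByteBalance%", threshold=20.0):
    """Return keyword arguments for CloudWatch put_metric_alarm."""
    if metric_name not in ("EBSByteBalance%", "EBSIOBalance%"):
        raise ValueError("unexpected burst balance metric: " + metric_name)
    return {
        # Hypothetical naming scheme; the percent sign is dropped from the name.
        "AlarmName": f"{instance_id}-{metric_name.rstrip('%')}-low",
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds; matches basic (five-minute) monitoring
        "EvaluationPeriods": 1,
        "Threshold": threshold,  # percent of burst credits remaining
        "ComparisonOperator": "LessThanThreshold",
    }


if __name__ == "__main__":
    params = build_burst_alarm("i-0123456789abcdef0")  # placeholder instance ID
    print(params)
    # With boto3 installed and credentials configured, the call would be:
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```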

To demonstrate these metrics, we launched an m5.large instance. We then attached a 500 GB io1 Amazon EBS volume with 32,000 provisioned IOPS to the instance. Amazon EBS volumes attached to instances based on the Nitro system are exposed as NVMe devices.

First, we ran a large block (128 KiB) test using fio to /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 and monitored both EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 \
    --rw=randread --bs=128k --runtime=2400 --time_based=1 --iodepth=32 \
    --ioengine=libaio --direct=1 --name=large-block-test

Because this is a large block workload, it’s not driving enough IOPS to deplete EBSIOBalance%. It depletes EBSByteBalance% instead, as shown in the following image.

Then we ran a small block test to understand how it affects EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 \
    --rw=randread --bs=16k --runtime=2400 --time_based=1 --iodepth=32 \
    --ioengine=libaio --direct=1 --name=small-block-test

Because this is a small block test, it drives higher IOPS than bytes/second. Hence, EBSIOBalance% drops faster than EBSByteBalance%, as shown in the following image.
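The difference between the two tests comes down to block size: throughput equals IOPS multiplied by block size, so large blocks hit the byte limit first and small blocks hit the IOPS limit first. The following sketch makes that arithmetic explicit. It is a simplified model that assumes the workload saturates whichever limit it reaches first, and it uses the M5 burst maximums quoted later in this post (3.5 Gbps and 18,750 IOPS) as illustrative limits.

```python
M5_BURST_IOPS = 18_750
M5_BURST_BYTES_PER_S = 3.5e9 / 8  # 3.5 Gbps expressed in bytes per second


def limiting_bucket(block_bytes, max_iops=M5_BURST_IOPS,
                    max_bytes_per_s=M5_BURST_BYTES_PER_S):
    """Return the burst balance metric that a saturating workload drains fastest."""
    # IOPS needed to reach the byte limit at this block size.
    iops_at_byte_limit = max_bytes_per_s / block_bytes
    if iops_at_byte_limit < max_iops:
        return "EBSByteBalance%"  # byte limit is reached before the IOPS limit
    return "EBSIOBalance%"


if __name__ == "__main__":
    for kib in (128, 16):
        print(f"{kib} KiB blocks:", limiting_bucket(kib * 1024))
```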

As long as EBSIOBalance% and EBSByteBalance% are above 0%, the instance can drive the burst performance. When the instance I/O activity is below the baseline rate, the burst buckets refill. After the tests finished, we paused all I/O from the instance. This period of inactivity allows the instance burst buckets to refill, as EBSIOBalance% and EBSByteBalance% show in the following image.

The refill rate for a burst bucket is the difference between the baseline rate and the instance I/O activity. For example, m5.large has a baseline throughput rate of 60 MB/s and a baseline IOPS rate of 3600 IOPS. Suppose the instance I/O activity is 10 MB/s and 1000 IOPS. The byte bucket fills at the rate of 50 MB/s (60 MB/s minus 10 MB/s). The IOPS bucket fills at the rate of 2600 IOPS (3600 IOPS minus 1000 IOPS). For the baseline rates for the different instances, see Amazon EBS–optimized instances. In addition, we top off the burst buckets every 24 hours, which means that the instance has burst performance available for 30 minutes each day.
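The refill arithmetic above can be written as a small helper. This is a simplified model assumed for illustration: it uses the m5.large baseline figures quoted in the paragraph and treats any activity at or above the baseline as a refill rate of zero.

```python
M5_LARGE_BASELINE_MBPS = 60    # MB/s, from the paragraph above
M5_LARGE_BASELINE_IOPS = 3600  # IOPS, from the paragraph above


def refill_rate(baseline, activity):
    """Rate at which a burst bucket refills: baseline minus current activity."""
    return max(0, baseline - activity)


if __name__ == "__main__":
    print(refill_rate(M5_LARGE_BASELINE_MBPS, 10), "MB/s")
    print(refill_rate(M5_LARGE_BASELINE_IOPS, 1000), "IOPS")
```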

Performance enhancements

We have continued to make enhancements to the Nitro system. With the latest set of enhancements, we have increased the maximum burst bandwidth on the large, xlarge, and 2xlarge EC2 C5/C5d and M5/M5d instances to 3.5 Gbps, up from 2.25 Gbps and 2.12 Gbps, respectively. We have also increased the maximum burst IOPS for EC2 C5/C5d to 20,000 IOPS and to 18,750 IOPS for M5/M5d, up from 16,000 IOPS for both. All new EC2 C5/C5d and M5/M5d smaller instances can take advantage of this performance increase at no additional cost.

For the latest list of instances based on the Nitro system that support this burst feature and their corresponding performance numbers, see Amazon EBS–optimized instances.

Deploy an 8K HEVC pipeline using Amazon EC2 P3 instances with AWS Batch

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/deploy-an-8k-hevc-pipeline-using-amazon-ec2-p3-instances-with-aws-batch/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

AWS provides several managed services for file- and streaming-based media encoding options.

Currently, these services offer up to 4K encoding. Recent developments and the growing popularity of 8K content have increased the need to distribute higher resolution content.

In this solution, you create a file-based encoding pipeline that uses AWS Batch and an Amazon EC2 P3 instance. You start by uploading a sample 8K (7680×4320) file to Amazon S3.

AWS Batch

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. With AWS Batch, there is no need to install and manage batch computing software or server clusters that you use to run your jobs, allowing you to focus on analyzing results and solving problems. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.

P3 instances for video transcoding workloads

The P3 instance comes equipped with the NVIDIA Tesla V100 GPU. The V100 is a 16 GB GPU with 5,120 CUDA cores, based on the latest Volta architecture, and is well suited for video coding workloads. The largest instance size in that family, p3.16xlarge, has 64 vCPUs, 488 GB of RAM, 8 NVIDIA Tesla V100 GPUs, and 25 Gbps networking bandwidth.

Besides being a mainstay in computational workloads, the V100 offers enhanced hardware-based encoding/decoding (NVENC/NVDEC). The following tables summarize the NVENC/NVDEC options available compared to other GPUs offered at AWS.

NVENC Support Matrix

AWS GPU instance GPU FAMILY GPU H.264 (AVCHD) YUV 4:2:0 H.264 (AVCHD) YUV 4:4:4 H.264 (AVCHD) Lossless H.265 (HEVC) 4K YUV 4:2:0 H.265 (HEVC) 4K YUV 4:4:4 H.265 (HEVC) 4K Lossless H.265 (HEVC) 8k
G2 Kepler GRID K520 YES
P2 Kepler (2nd Gen) Tesla K80 YES
G3 Maxwell (2nd Gen) Tesla M60 YES YES YES YES

NVDEC Support Matrix

AWS GPU instance GPU FAMILY GPU MPEG-2 VC-1 H.264 (AVCHD) H.265 (HEVC) VP8 VP9
P2 Kepler (2nd Gen) Tesla K80 YES YES YES
G3 Maxwell (2nd Gen) Tesla M60 YES YES YES YES

Cinematic 8K encoding is supported using the Tesla V100 (P3 instance family) either in landscape or portrait orientations using the HEVC codec. 

GPU H264 H264_444 H264_ME H264_WxH HEVC HEVC_Main10 HEVC_Lossless HEVC_SAO HEVC_444 HEVC_ME HEVC_WxH
Tesla M60 + + + 4096x + 4096x
Tesla V100 + + + 4096x4096 + + + + + + 8192x8192


To follow along with these procedures, ensure that you have the following:

  • An AWS account with permissions to create IAM roles and policies, as well as read and write access to S3
  • Registration with the NVIDIA Developer Network
  • Familiarity with Docker


For deployment, you containerize the encoding pipeline. After building the underlying P3 container instance, you then use nvidia-docker2 to build the video-encoding Docker image, which is registered with Amazon Elastic Container Registry (Amazon ECR).

As shown in the following diagram, the pipeline reads an input raw YUV file from S3, then pulls the containerized encoding application to execute at scale on the P3 container instance. The encoded video file is then transferred to S3.

The nvidia-docker2 image video encoding stack contains the following components:

  • FFMPEG 4.0
  • NVIDIA Video Codec SDK 8.1

This is a relatively lengthy procedure. However, after it’s built, the underlying instance and Docker image are reusable and can be quickly deployed as part of a high performance computing (HPC) pipeline.

Creating the ECS container instance

The underlying instance can be built by selecting the Amazon Linux AMI with the p3.2xlarge instance type in a public subnet. Additionally, add an EBS volume (150 GB), which is used for the 8K input, raw YUV, and output files. Scale the storage amount for larger input files. Persist the mount in /etc/fstab. Connect to the instance over SSH and install any OS updates, the EPEL release and support packages, and the base docker-ce package.

sudo yum update -y
sudo yum install yum-utils \
                 device-mapper-persistent-data \
                 lvm2

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum install epel-release-latest-7.noarch.rpm
sudo yum update
sudo yum install docker-ce -y
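The 150 GB volume suggested above has to hold the decoded raw YUV file, which is far larger than the compressed input. As a rough sizing sketch, the following estimates the size of an 8-bit YUV 4:2:0 file. The 30 fps frame rate and 60-second duration are assumptions for illustration, not properties of the sample file used in this post.

```python
def yuv420p_frame_bytes(width, height):
    """Bytes in one 8-bit YUV 4:2:0 frame.

    One full-resolution luma plane plus two quarter-resolution chroma
    planes comes to 1.5 bytes per pixel.
    """
    return width * height * 3 // 2


def raw_clip_bytes(width, height, fps, seconds):
    """Total bytes for an uncompressed clip of the given length."""
    return yuv420p_frame_bytes(width, height) * fps * seconds


if __name__ == "__main__":
    frame = yuv420p_frame_bytes(7680, 4320)
    clip = raw_clip_bytes(7680, 4320, fps=30, seconds=60)  # assumed clip
    print(f"one frame: {frame / 1e6:.1f} MB")
    print(f"60 s at 30 fps: {clip / 1e9:.1f} GB")
```

Under these assumptions, a one-minute clip already uses most of the volume, which is why the storage amount should scale with the input length.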

The NVIDIA/CUDA stack can be installed using the cuda-repo-rhel7.rpm file. The CUDA framework installs the NVIDIA driver dependencies.

sudo yum install cuda -y

Next, install nvidia-docker2 as provided in the NVIDIA GitHub repo.

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
  sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

sudo tee /etc/docker/daemon.json <<EOF
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF

sudo systemctl restart docker

With the base components in place, make this instance compatible with the ECS service:

sudo yum install ecs-init -y

Create the /etc/ecs/ecs.config file with the following template:

cat << EOF > /etc/ecs/ecs.config

Iptables and packet forwarding rules need to be created to pass IAM roles into task operations:

sudo sh -c "echo 'net.ipv4.conf.all.route_localnet = 1' >> /etc/sysctl.conf"
sudo sysctl -p /etc/sysctl.conf
sudo iptables -t nat -A PREROUTING -p tcp -d --dport 80 -j DNAT --to-destination
sudo iptables -t nat -A OUTPUT -d -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 51679
sudo sh -c 'iptables-save > /etc/sysconfig/iptables'

Finally, a systemd unit file needs to be created:

sudo cat << EOF > /etc/systemd/system/[email protected]
Description=Docker Container %I

ExecStartPre=-/usr/bin/docker rm -f %i
ExecStart=/usr/bin/docker run --name %i \
--privileged \
--restart=on-failure:10 \
--volume=/var/run:/var/run \
--volume=/var/log/ecs/:/log:Z \
--volume=/var/lib/ecs/data:/data:Z \
--volume=/etc/ecs:/etc/ecs \
--net=host \
--env-file=/etc/ecs/ecs.config \
ExecStop=/usr/bin/docker stop %i


sudo systemctl enable [email protected]
sudo systemctl start [email protected]
sudo systemctl status [email protected]

Ensure that the [email protected] service starts successfully.

Creating the NVIDIA-Docker image

With Docker installed, pull the latest nvidia/cuda:latest image from DockerHub.

docker pull nvidia/cuda:latest

It is best at this point to run the Docker container in interactive mode. However, a Docker build file can be created afterwards. At the time of publication, the image has only CUDA 9.0 installed. NVIDIA has already provided the necessary repositories. Install CUDA 9.2 and support packages inside the Docker container. Commands run inside the container are marked with the (docker) label:

docker run -it --runtime=nvidia --rm nvidia/cuda
(docker) apt update
(docker) apt install pkg-config build-essential wget curl nasm unzip \
                     git libglew-dev cuda-toolkit-9-2 python3-pip -y
(docker) pip3 install awscli

Next, download the FFMPEG 4.0, nv-codec-headers, and the Video Codec SDK 8.1 from the NVIDIA Developer platform.

First, extract the nv-codec-headers archive, change into the extracted directory, and build and install the headers:

(docker) make
(docker) make install

Extract the ffmpeg-4.0 archive, change into the resulting directory, and then compile and install FFmpeg:

(docker) ./configure --enable-cuda --enable-cuvid --enable-nvenc --enable-nonfree --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64
(docker) make -j 4
(docker) make install

Download and extract the NVIDIA Video Codec SDK 8.1. The “Samples” directory has a preconfigured Makefile that compiles the binaries in the SDK. After the build succeeds, confirm that the binaries are correctly set up.

(docker): ~/Video_Codec_SDK_8.1.24/Samples/AppEncode/AppEncCuda$ ./AppEncCuda -h
-i Input file path
-o Output file path
-s Input resolution in this form: WxH
-if Input format: iyuv nv12 yuv444 p010 yuv444p16 bgra bgra10 ayuv abgr abgr10
-gpu Ordinal of GPU to use
-codec Codec: h264 hevc
-preset Preset: default hp hq bd ll ll_hp ll_hq lossless lossless_hp
-profile H264: baseline main high high444; HEVC: main main10 frext
-444 (Only for RGB input) YUV444 encode
-rc Rate control mode: constqp vbr cbr cbr_ll_hq cbr_hq vbr_hq
-fps Frame rate
-gop Length of GOP (Group of Pictures)
-bf Number of consecutive B-frames
-bitrate Average bit rate, can be in unit of 1, K, M
-maxbitrate Max bit rate, can be in unit of 1, K, M
-vbvbufsize VBV buffer size in bits, can be in unit of 1, K, M
-vbvinit VBV initial delay in bits, can be in unit of 1, K, M
-aq Enable spatial AQ and set its stength (range 1-15, 0-auto)
-temporalaq (No value) Enable temporal AQ
-lookahead Maximum depth of lookahead (range 0-32)
-cq Target constant quality level for VBR mode (range 1-51, 0-auto)
-qmin Min QP value
-qmax Max QP value
-initqp Initial QP value
-constqp QP value for constqp rate control mode
Note: QP value can be in the form of qp_of_P_B_I or qp_P,qp_B,qp_I (no space)

Encoder Capability
# GPU H264 H264_444 H264_ME H264_WxH HEVC HEVC_Main10 HEVC_Lossless HEVC_SAO HEVC_444 HEVC_ME HEVC_WxH
0 Tesla V100-SXM2-16GB + + + 4096x4096 + + + + + + 8192x8192

Create a small script to be used for the 8K-encoding test inside the Docker container. Save the file as /root/nvenc-processor.sh. In the basic form, this script encodes using a single thread. For comparison, the same file is encoded using four threads.

#!/bin/bash -xe
# download the source clip from S3
time aws s3 cp "$S3_INPUT" /mnt/8k.webm

# decode the clip to raw YUV 4:2:0
time /usr/local/bin/ffmpeg -y -hwaccel cuda -i /mnt/8k.webm -c:v rawvideo -pix_fmt yuv420p /mnt/8k.yuv
# single-threaded HEVC encode
time /root/Video_Codec_SDK/Samples/AppEncode/AppEncCuda/AppEncCuda -i /mnt/8k.yuv -o /mnt/8k.hevc -s 7680x4320 -codec hevc
# four-thread encode of the same file, for comparison
time /root/Video_Codec_SDK/Samples/AppEncode/AppEncPerf/AppEncPerf -i /mnt/8k.yuv -s 7680x4320 -thread 4 -codec hevc

# upload the encoded file to S3
time aws s3 cp /mnt/8k.hevc "$S3_OUTPUT"

This script downloads a file from S3 and decodes it to raw YUV with FFmpeg. AppEncCuda then creates the 8K-encoded file that is uploaded back to S3, and AppEncPerf encodes the same input with four threads for comparison. Commit your Docker container into a new Docker image:

docker commit -m "creating hvec-processor image" <containerid> nvidia-hvec:latest

Ensure that a Docker repo has been created in Amazon ECR. Choose Repositories, Create Repository. After you open the repository, choose View Push Commands. Push the newly created image to your ECR repo.

After confirming that your image is in your ECR repo, delete all images locally in the instance:

docker rmi -f $(docker images -a -q)

Before stopping the instance, remove the ECS agent checkpoint file:

sudo rm -rf /var/lib/ecs/data/ecs_agent_data.json

Create an AMI from the instance, maintaining the attached EBS volume. Note the AMI ID.

Creating IAM role permissions

To ensure that access to ECS is controlled and to allow AWS Batch to be called, create two IAM roles:

  • BatchServiceRole allows AWS Batch to call services on your behalf.
  • ecsInstanceRole is specific to this workflow and adds permissions for S3FullAccess. This allows the container to read from and write to your S3 bucket. The following screenshot shows the example policy stack.

In AWS Batch, choose Compute environments and create a managed compute environment. Assign a cluster name and minimum and maximum vCPU values. Use the AMI ID and IAM roles created earlier. Use the Spot pricing model and consider setting the maximum price to 60% of the On-Demand price. Check the current Spot price to see if more aggressive discounts are possible.
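The console choices above correspond to fields of the AWS Batch CreateComputeEnvironment request. The following is a minimal sketch for readers who prefer to script this step. The environment name, vCPU limits, p3.2xlarge instance type, subnet, security group, and the Spot Fleet role ARN are all placeholders or assumptions that do not come from this walkthrough. The code only builds and prints the request; the boto3 call is left as a comment.

```python
import json


def build_compute_environment(name, ami_id, service_role, instance_role,
                              spot_fleet_role, subnets, security_group_ids,
                              min_vcpus=0, max_vcpus=16, bid_percentage=60,
                              instance_types=("p3.2xlarge",)):
    """Build keyword arguments for the Batch CreateComputeEnvironment API."""
    if not 1 <= bid_percentage <= 100:
        raise ValueError("bid_percentage must be between 1 and 100")
    if min_vcpus > max_vcpus:
        raise ValueError("min_vcpus cannot exceed max_vcpus")
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "serviceRole": service_role,
        "computeResources": {
            "type": "SPOT",
            # maximum Spot price as a percentage of the On-Demand price
            "bidPercentage": bid_percentage,
            "spotIamFleetRole": spot_fleet_role,
            "minvCpus": min_vcpus,
            "maxvCpus": max_vcpus,
            "instanceTypes": list(instance_types),
            "imageId": ami_id,
            "instanceRole": instance_role,
            "subnets": list(subnets),
            "securityGroupIds": list(security_group_ids),
        },
    }


# All values below are placeholders; substitute your own.
request = build_compute_environment(
    name="nvenc-spot",                      # hypothetical environment name
    ami_id="<AMI-ID>",                      # the AMI ID noted earlier
    service_role="BatchServiceRole",
    instance_role="ecsInstanceRole",
    spot_fleet_role="arn:aws:iam::<accountnumber>:role/<spot-fleet-role>",
    subnets=["<subnet-id>"],
    security_group_ids=["<security-group-id>"],
)
print(json.dumps(request, indent=2))
# To create the environment:
#   boto3.client("batch").create_compute_environment(**request)
```

Lowering `bid_percentage` is the scripted equivalent of choosing a more aggressive discount in the console.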

Note the cluster name. In Amazon ECS, you should see the cluster created. Next, create a job queue and associate this job queue with the compute environment created earlier. Note the job queue name.

Next, create a job definition file. This provides the job parameters to be used, including mount paths, CPU, and memory requirements.

{
    "containerProperties": {
        "mountPoints": [
            {
                "sourceVolume": "codec-data",
                "readOnly": false,
                "containerPath": "/mnt"
            }
        ],
        "image": "<accountnumber>.dkr.ecr.us-east-1.amazonaws.com/nvidia/nvidia-hvec:latest",
        "command": ["/root/nvenc-processor.sh"],
        "volumes": [
            {
                "host": {"sourcePath": "/mnt"},
                "name": "codec-data"
            }
        ],
        "memory": 32768,
        "vcpus": 8,
        "privileged": true,
        "environment": [
            {
                "name": "S3_INPUT",
                "value": "s3://<bucket>/<key_name>"
            },
            {
                "name": "S3_OUTPUT",
                "value": "s3://<bucket>"
            }
        ],
        "ulimits": []
    },
    "type": "container",
    "jobDefinitionName": "nvenc-test"
}

Save the file as nvenc-test.json and register the job definition in AWS Batch.

aws batch register-job-definition --cli-input-json file://nvenc-test.json

In the AWS Batch console, create a job queue, assigning a priority of 1 to the compute environment created earlier. Create a job, assigning it a job name, the job definition, and the job queue. Add environment variables for the S3 input and output locations. Ensure that the bucket and input file exist.

S3_INPUT = s3://<bucket>/<key_name> 
S3_OUTPUT = s3://<bucket> 

Execute the job. In a few moments, the job should be in the Running state. Check the CloudWatch logs for an updated status of the job progression. Open the job record information and scroll down to CloudWatch metrics. The events are logged in a new AWS Batch log stream.

A 1-minute 8K YUV 4:2:0 file took approximately 10 minutes single-threaded (top panel), and 58 seconds using four threads (bottom panel). The nvenc-processor.sh script serves as a basic implementation of 8K encoding. Explore the options provided by the NVIDIA Video Codec SDK for additional encoding/decoding and transcoding options.
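Because the script writes the decoded clip to /mnt/8k.yuv, it helps to estimate how much space a raw file needs and how the two timings compare. The following is a rough sketch, assuming 8-bit samples and a frame rate of 30 fps. The source does not state either value, so treat the size as an order-of-magnitude figure.

```python
WIDTH, HEIGHT = 7680, 4320  # from the -s 7680x4320 flag in nvenc-processor.sh


def raw_yuv420_bytes(width, height, fps, seconds):
    """Size of an 8-bit YUV 4:2:0 clip: 1.5 bytes per pixel per frame."""
    # full-resolution luma plane plus two quarter-resolution chroma planes
    frame_bytes = width * height * 3 // 2
    return frame_bytes * fps * seconds


def speedup(baseline_seconds, improved_seconds):
    """Ratio of the baseline duration to the improved duration."""
    return baseline_seconds / improved_seconds


ASSUMED_FPS = 30  # assumption: the frame rate is not given in the post

clip_bytes = raw_yuv420_bytes(WIDTH, HEIGHT, ASSUMED_FPS, seconds=60)
print(f"raw 1-minute clip: {clip_bytes / 1e9:.1f} GB")   # about 89.6 GB at 30 fps

# 10 minutes single-threaded versus 58 seconds with four threads
print(f"speedup: {speedup(10 * 60, 58):.1f}x")           # about 10.3x
```

At that size, the volume mounted at /mnt needs room for the source clip, the raw file, and the encoded output at the same time.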


With AWS Batch, a customized container instance, and a dockerized NVIDIA video encoding platform, AWS can provide your HD, 4K, and now 8K media distribution. I invite you to incorporate this into your automated pipeline.

With some minor modification, it’s possible to trigger this pipeline after a new file is uploaded into S3. Then, execute through AWS Lambda or as part of an AWS Step Functions workflow.
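One way to wire this up is a Lambda function subscribed to the bucket's object-created notifications that submits the Batch job with the uploaded object as S3_INPUT. The following is a minimal sketch, not the author's implementation. The queue name `nvenc-queue` is hypothetical, the output location is a placeholder, and the function assumes the job definition registered earlier.

```python
import urllib.parse

JOB_QUEUE = "nvenc-queue"        # hypothetical; use the job queue name you noted
JOB_DEFINITION = "nvenc-test"    # jobDefinitionName registered earlier
OUTPUT_URI = "s3://<bucket>"     # placeholder, as in the walkthrough


def build_submit_request(event):
    """Translate an S3 object-created event into Batch submit_job arguments."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # object keys arrive URL-encoded in S3 event notifications
    key = urllib.parse.unquote_plus(record["object"]["key"])
    # Batch job names allow only letters, numbers, hyphens, and underscores
    safe = "".join(
        c if (c.isascii() and c.isalnum()) or c in "-_" else "-" for c in key
    )
    return {
        "jobName": ("nvenc-" + safe)[:128],
        "jobQueue": JOB_QUEUE,
        "jobDefinition": JOB_DEFINITION,
        "containerOverrides": {
            "environment": [
                {"name": "S3_INPUT", "value": f"s3://{bucket}/{key}"},
                {"name": "S3_OUTPUT", "value": OUTPUT_URI},
            ]
        },
    }


def lambda_handler(event, context):
    # imported here so build_submit_request can be exercised without AWS access
    import boto3

    response = boto3.client("batch").submit_job(**build_submit_request(event))
    return response["jobId"]
```

If the encoded output lands in the same bucket as the input, restrict the notification to a prefix or suffix so that the uploaded result does not trigger another job.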






Building a GPU workstation for visual effects with AWS

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/building-a-gpu-workstation-for-visual-effects-with-aws/

Contributed by Mike Owen, Solutions Architect, AWS Thinkbox

The elasticity, scalability, and cost-effectiveness of the cloud are attractive to media customers. One of the key design patterns in media and entertainment (M&E) workloads is using the cloud as a content lake and bringing the underlying processes closer without having to synchronize data. In this high-end graphics visualization business, a pixel-perfect, color-accurate, fully interactive native desktop experience is required for both Windows and Linux platforms. Visual effects (VFX) artists also require input peripherals such as latest-generation Wacom 8K pressure-sensitive tablets and Wacom Cintiq monitors to work as seamlessly as they do on-premises.

AWS offers Amazon EC2 G3 instances backed by NVIDIA Tesla M60 GPUs with powerful graphics capabilities: OpenGL 4.6, DirectX 12, CUDA 9.2, GRID 6.1. You can combine these instances with the Teradici streaming protocol via their Cloud Access Software (CAS) agent to enable a high-end desktop experience on either Windows or Linux with an on-demand pricing model to fit your business needs. Teradici PCoIP is a popular protocol in the M&E industry, where Teradici have delivered a custom silicon accelerated zero-client hardware device to deliver secure pixel streaming to on-premises monitors. AWS also enables customers to create managed virtual desktop environments with Amazon WorkSpaces Graphics bundles (Windows and Linux) or Amazon AppStream 2.0 (Windows). Both solutions offer a managed environment with GPU-backed instances. This blog describes how you can set up an unmanaged VFX desktop using Amazon EC2 G3 instances combined with high-performance storage and scalable compute options such as Amazon EC2 Spot Instances.


The following diagram describes a typical Windows and Linux configuration. In this setup, you use a Teradici PCoIP Zero Client over a dedicated network connection from your on-premises location via your chosen network provider to your nearest AWS Region containing an Amazon EC2 G3 instance. AWS Direct Connect provides a low-latency, high-bandwidth dedicated connection that doesn’t traverse the public internet. With the Windows instance, you might use a creative pen display such as a Wacom Cintiq monitor or, on a Linux instance, the latest generation of Wacom 8K pressure-sensitive tablets. You can connect both types of environments to dual 2K monitors and be ready for film VFX work.

Once built, the g3.4xl instance runs your custom Amazon Machine Image (AMI) with encrypted volume(s) in Amazon Elastic Block Store (Amazon EBS) containing all your software, pulling floating licenses from your on-premises license servers where necessary. For Linux, you have the option of centrally installing your software via a fast NVMe SSD–based i3 instance type and building a minimal-sized boot AMI. In both cases, you can add encrypted Amazon EBS SSD volumes for increased local storage. The Teradici CAS agent runs on each individual G3 instance and can be provisioned, brokered, and managed by the optional Teradici Cloud Access Manager (CAM) solution. Finally, Amazon WorkSpaces Graphics bundles are compatible with a Teradici zero client, providing easy access to a fully managed Windows desktop. This might be useful for Linux-based studios that require ad hoc Windows usage such as Adobe Creative Cloud.

In this configuration, a Teradici zero client interacts with the provisioned desktop (served on a G3 instance) in the cloud. The Teradici CAS agent captures the frame buffer and sends it in real time to the zero client over the network via UDP using the PCoIP protocol. A smooth, reliable experience depends on a low-latency and high-bandwidth connection to the Amazon EC2 instance hosting the desktop. Bandwidth requirements depend on the number of monitors used, resolution, frame rate, and lossless quality of the desktop experience. For Wacom tablet support, Teradici CAS 2.12 requires the latency level to be less than 25 ms. You can use ping.psa.fun or cloudping.info to check the latency time of public pings between your location and your closest AWS Region. Ideally, you will provision an AWS Direct Connect connection for private (doesn’t traverse the public internet) and fast (low-latency) connectivity to the AWS Region from your location. You can also use a public internet connection for initial testing. In both cases, you can route traffic over a VPN for added security.


Instead of doing a manual build, you can visit the AWS Marketplace and subscribe to a Teradici-provided pre-built AMI. It already has the NVIDIA GRID driver and Teradici CAS software installed, configured, and licensed as part of the overall usage cost. See the following offerings on AWS Marketplace:


Make sure that everything in the following list is in place before deploying to either platform:

  • Create an AWS account.
  • Ensure that your AWS account has an EC2 key-pair associated with it by going to the AWS Management Console and checking Key Pairs under Network and Security in the applicable AWS Region.
  • Set up an AWS account <ACCESS KEY> and <SECRET ACCESS KEY> to access the NVIDIA GRID driver from an Amazon S3 bucket. The deployment instructions explain how to install and set up the AWS Command-Line Interface (AWS CLI).
  • Minimum version: CentOS 7.2 or Windows Server 2016.
  • Recommended Teradici PCoIP Zero Client firmware version: 6.0. Contact Teradici to download.
  • Contact Teradici who will provide a 60-day trial license: <TERADICI LICENSE CODE> for Cloud Access Software. You should receive your license within 1 business day. If you don’t receive your license, please contact [email protected].
  • You must have superuser (root) or Administrator privileges to the AMI.
  • The Amazon EC2 security group provides a stateful firewall on each instance via a set of rules. The following inbound ports must be available on the Amazon EC2 instance from a specific client’s source IP address (restrictive access).
| Type | Protocol | Port Range | Source | Description | Platform |
|---|---|---|---|---|---|
| Custom TCP Rule | TCP | 4172 | <YOUR SOURCE IP> | PCoIP | Both |
| Custom UDP Rule | UDP | 4172 | <YOUR SOURCE IP> | PCoIP | Both |
| Custom TCP Rule | TCP | 60443 | <YOUR SOURCE IP> | PCoIP | Both |
| RDP | TCP | 3389 | <YOUR SOURCE IP> | RDP | Windows only |
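The inbound rules above can also be expressed as the `IpPermissions` list that the EC2 AuthorizeSecurityGroupIngress API accepts. The following is a minimal sketch, assuming a single IPv4 client address. The example address is from a documentation range and the group ID is a placeholder.

```python
import ipaddress

# (protocol, port, description, platforms) rows from the inbound rules table
INBOUND_RULES = [
    ("tcp", 4172, "PCoIP", {"linux", "windows"}),
    ("udp", 4172, "PCoIP", {"linux", "windows"}),
    ("tcp", 60443, "PCoIP", {"linux", "windows"}),
    ("tcp", 3389, "RDP", {"windows"}),
]


def build_ingress(source_ip, platform):
    """Return IpPermissions entries restricted to one client IPv4 address."""
    # a bare address becomes a /32 network; invalid input raises ValueError
    cidr = str(ipaddress.IPv4Network(source_ip))
    return [
        {
            "IpProtocol": protocol,
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": description}],
        }
        for protocol, port, description, platforms in INBOUND_RULES
        if platform in platforms
    ]


rules = build_ingress("203.0.113.10", "windows")  # example address only
for rule in rules:
    print(rule["IpProtocol"], rule["FromPort"], rule["IpRanges"][0]["CidrIp"])
# To apply the rules:
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId="<security-group-id>", IpPermissions=rules)
```

Restricting every rule to a /32 source keeps the PCoIP and RDP ports closed to everyone except the named client.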

Deploying the desktop on Linux

For our Linux deployment, we use the latest CentOS 7.5 AMI from AWS Marketplace and install the NVIDIA/Xorg/KDE/Wacom stack to create a fully functioning VFX Linux desktop environment. This stack contains the following components:

  • CentOS 7.5.1804_2 AMI
  • NVIDIA GRID 6.1 (390.57 May 2018) driver
  • Teradici CAS 2.12
  • Wacom 0.40 driver

Feel free to use your own CentOS 7.2+ AMI and modify the step-by-step instructions accordingly.

Setting up the desktop on Linux

To launch a g3.4xl instance in the closest AWS Region in your AWS account using the created key-pair and security group, use an AMI ID from the ones in the following table. For reference, search for the AMI using the keywords CentOS Linux 7 x86_64 HVM EBS 1804_2.

| AWS Region | AWS Region ID | AMI ID |
|---|---|---|
| US East (N. Virginia) | us-east-1 | ami-d5bf2caa |
| US East (Ohio) | us-east-2 | ami-77724e12 |
| US West (N. California) | us-west-1 | ami-3b89905b |
| US West (Oregon) | us-west-2 | ami-5490ed2c |
| EU (Frankfurt) | eu-central-1 | ami-9a183671 |
| EU (Ireland) | eu-west-1 | ami-4c457735 |
| Asia Pacific (Tokyo) | ap-northeast-1 | ami-3185744e |
| Asia Pacific (Singapore) | ap-southeast-1 | ami-da6151a6 |
| Asia Pacific (Sydney) | ap-southeast-2 | ami-0d13c26f |

Once the g3.4xl instance has passed its EC2 instance 2/2 status checks, we can build in true AWS style.

First, log in to the instance and set up the environment.

# ssh into running Amazon EC2 instance
ssh centos@ec2-<IP-ADDRESS>.<AWS-REGION>.compute.amazonaws.com
# yes

# set a password for your user
sudo passwd centos

# disable selinux
sudo sed -i 's/^SELINUX=\(disabled\|enforcing\|permissive\)/SELINUX=disabled/' /etc/selinux/config

# install the EPEL repository
sudo yum install wget -y
sudo wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo rpm -i epel-release-latest-7.noarch.rpm

# run yum update to make sure all packages are up-to-date
sudo yum update -y

# install the "Server with GUI" group
sudo yum groupinstall "Server with GUI" -y

# prefer KDE desktop? (optional)
sudo yum groupinstall -y "KDE Plasma Workspaces"
sudo systemctl set-default graphical.target
echo "exec startkde" >> ~/.xinitrc

# uninstall KDE (optional)
# sudo yum groupremove -y "KDE Plasma Workspaces"
# sudo yum autoremove -y
# sudo reboot

# reboot to make sure the latest installed kernel is running
sudo reboot

# install kernel-devel
sudo yum install kernel-devel -y

Next, install and register the Teradici CAS 2.12 software.

# import the Teradici signing key
sudo rpm --import https://downloads.teradici.com/rhel/teradici.pub.gpg

# grab the PCoIP repo file
sudo curl -o /etc/yum.repos.d/pcoip.repo https://downloads.teradici.com/rhel/pcoip.repo

# install PCoIP agent package
sudo yum install pcoip-agent-graphics -y

# load vhci-hcd kernel modules
sudo modprobe -a usb-vhci-hcd usb-vhci-iocifc

# register with the licensing service
pcoip-register-host --registration-code=<TERADICI LICENSE CODE>

# set up PCoIP agent config to enable USB
echo """
pcoip.grid_diff_map = 0
pcoip.enable_usb = 1
pcoip.usb_auth_table = "23XXXXXX"
pcoip.usb_unauth_table = ""
""" | sudo tee /etc/pcoip-agent/pcoip-agent.conf

# make sure you're running latest pcoip-agent version
sudo yum update pcoip-agent-graphics

Then install the NVIDIA GRID graphics driver and apply performance optimization to its configuration.

# NVIDIA GRID driver
# https://docs.nvidia.com/grid/index.html
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html

# install nano editor
sudo yum install nano -y

# remove any old NVIDIA drivers/CUDA
sudo yum erase nvidia cuda

# disable the nouveau open source driver for NVIDIA graphics cards
sudo touch /etc/modprobe.d/blacklist.conf

# paste the following lines in one go into your shell
cat << EOF | sudo tee --append /etc/modprobe.d/blacklist.conf
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
EOF

# edit the /etc/default/grub file and add the line:
sudo nano /etc/default/grub

# rebuild grub2 config
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot

# install pip
curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py --user

# install AWS CLI
pip install awscli --upgrade --user

# configure AWS CLI credentials
aws configure

# AWS Access Key ID [None]: <ACCESS KEY>
# AWS Secret Access Key [None]: <SECRET ACCESS KEY>
# Default Region name [None]: <AWS REGION>
# Default output format [None]: <enter>

# 390.57 driver
aws s3 cp --recursive s3://ec2-linux-nvidia-drivers/latest/ .
chmod +x NVIDIA-Linux-x86_64-390.57-grid.run

sudo /bin/bash ./NVIDIA-Linux-x86_64-390.57-grid.run

# respond to the NVIDIA installer prompts as follows:
    # <accept> the EULA
    # <Yes> to register kernel module sources with DKMS
    # <No> to installing 32-bit libraries
    # <No> to modifying the x.org file at end of install
    # <OK> to complete the installer

# check driver installed
nvidia-smi -q | head

# g3/NVIDIA optimization settings
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/optimize_gpu.html
sudo nvidia-persistenced
sudo nvidia-smi --auto-boost-default=0
sudo nvidia-smi -ac 2505,1177

sudo reboot

Install CUDA if required by any of your VFX software such as Autodesk Maya or SideFX Houdini:

# install CUDA and OpenCL
# https://developer.download.nvidia.com/compute/cuda/9.2/Prod/docs/sidebar/CUDA_Installation_Guide_Linux.pdf
# https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=CentOS&target_version=7&target_type=runfilelocal

wget https://developer.nvidia.com/compute/cuda/9.2/Prod/local_installers/cuda_9.2.88_396.26_linux
mv cuda_9.2.88_396.26_linux cuda_9.2.88_396.26_linux.run

# don't install the actual graphics driver, just CUDA 9.2 toolkit, sym-link
sudo /bin/sh cuda_9.2.88_396.26_linux.run

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.26?
(y)es/(n)o/(q)uit: n

Install the CUDA 9.2 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-9.2 ]: 

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 9.2 Samples?
(y)es/(n)o/(q)uit: n

Installing the CUDA Toolkit in /usr/local/cuda-9.2 ...

# CUDA Patch 1 (Released May 16, 2018)
wget https://developer.nvidia.com/compute/cuda/9.2/Prod/patches/1/cuda_9.2.88.1_linux
mv cuda_9.2.88.1_linux cuda_9.2.88.1_linux.run
sudo /bin/sh cuda_9.2.88.1_linux.run

# ensure these environment variables are set, for example in a script under /etc/profile.d
export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Finally, install Wacom drivers.

# install Wacom driver
# https://github.com/linuxwacom/input-wacom/releases
cd ~
wget https://github.com/linuxwacom/input-wacom/releases/download/input-wacom-0.40.0/input-wacom-0.40.0.tar.bz2
tar jxf input-wacom-0.40.0.tar.bz2
cd input-wacom-0.40.0
sudo su
make && make install
modprobe wacom
dracut --force
sudo touch /etc/X11/xorg.conf.d/99-wacom-pressure2k.conf

# edit Wacom conf file as follows
sudo nano /etc/X11/xorg.conf.d/99-wacom-pressure2k.conf

Section "InputClass"
    Identifier "Wacom pressure compatibility"
    MatchDriver "wacom"
    Option "Pressure2K" "true"
EndSection

# check Elastic Network Adapter (ENA) is running on your instance
modinfo ena
ethtool -i eth0
aws ec2 describe-images --image-ids <AMI-ID> --query 'Images[].EnaSupport'

# if that command returns false, proceed to enable it
# make sure that you have AWS CLI installed with AWS credentials on your local machine
sudo shutdown now
aws ec2 modify-instance-attribute --instance-id <CURRENT EC2 INSTANCE ID> --ena-support

# if you're using a pre-existing Linux AMI, you need to install the ENA driver yourself
# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking-ena.html#enhanced-networking-ena-linux

sudo reboot

Deploying the desktop on Windows

We use the latest AWS-provided Windows 2016 AMI for our deployment and install the NVIDIA/Teradici/Wacom stack to create a fully functioning VFX Windows desktop environment. This stack contains the following components:

  • Windows Server 2016 Base 2018.04.11
  • NVIDIA GRID 6.1 (391.58 May 2018) driver
  • Teradici CAS 2.12
  • Latest Wacom driver

Feel free to use your own Windows 2016 AMI and modify the step-by-step instructions accordingly.

Setting up the desktop on Windows

To launch a g3.4xl instance in the closest AWS Region in your AWS account using the created key-pair and security group, use an AMI ID from the ones in the following table. For reference, the AMI name is Microsoft Windows Server 2016 Base 2018.04.11.

| AWS Region | AWS Region ID | AMI ID |
|---|---|---|
| US East (N. Virginia) | us-east-1 | ami-3633b149 |
| US East (Ohio) | us-east-2 | ami-5984b43c |
| US West (N. California) | us-west-1 | ami-3dd1c25d |
| US West (Oregon) | us-west-2 | ami-f3dcbc8b |
| EU (Frankfurt) | eu-central-1 | ami-b5530b5e |
| EU (Ireland) | eu-west-1 | ami-4cc09a35 |
| Asia Pacific (Tokyo) | ap-northeast-1 | ami-0e809272 |
| Asia Pacific (Singapore) | ap-southeast-1 | ami-00a2847c |
| Asia Pacific (Sydney) | ap-southeast-2 | ami-7279b010 |

Once the g3.4xl instance has passed its Amazon EC2 instance 2/2 status checks, let’s go build:

# use AWS Management Console to right-click EC2 instance and "Get Windows Password" -> <RDP PASSWORD>

# RDP into machine
# address: ec2-<IP-ADDRESS>.<AWS-REGION>.compute.amazonaws.com
# username: Administrator
# password: <RDP PASSWORD>

# set a password in command prompt
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ec2-windows-passwords.html
net user Administrator <NEW PASSWORD>

# configure Powershell - Allow ExecutionPolicy of Powershell scripts
Set-ExecutionPolicy -ExecutionPolicy AllSigned

# enable Software Secure Attention Sequence (SAS) setting
Open gpedit.msc
Expand Computer Configuration > Administrative Templates > Windows Components
Select Windows Logon Options
Double-click Disable or enable software Secure Attention Sequence
Select Enabled
Select Services from the drop down list in the bottom left pane
Click OK

# install AWS CLI
# https://docs.aws.amazon.com/cli/latest/userguide/awscli-install-windows.html
# download and install: https://s3.amazonaws.com/aws-cli/AWSCLI64.msi

# configure AWS CLI credentials in Powershell
aws configure

# AWS Access Key ID [None]: <ACCESS KEY>
# AWS Secret Access Key [None]: <SECRET ACCESS KEY>
# Default Region name [None]: <AWS REGION>
# Default output format [None]: <enter>

# download NVIDIA GRID driver from Amazon S3
# right-click Powershell, Run As Administrator, paste following into Powershell

$Bucket = "ec2-windows-nvidia-drivers"
$KeyPrefix = "latest"
$LocalPath = "C:\Users\Administrator\Desktop\NVIDIA"
$Objects = Get-S3Object -BucketName $Bucket -KeyPrefix $KeyPrefix -Region us-east-1
foreach ($Object in $Objects) {
    $LocalFileName = $Object.Key
    if ($LocalFileName -ne '' -and $Object.Size -ne 0) {
        $LocalFilePath = Join-Path $LocalPath $LocalFileName
        Copy-S3Object -BucketName $Bucket -Key $Object.Key -LocalFile $LocalFilePath -Region us-east-1
    }
}

# run NVIDIA GRID installer

# reboot machine via command prompt
shutdown /r

# Optimize GPU settings (follow these instructions)
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/optimize_gpu.html

# via Powershell
cd "C:\Program Files\NVIDIA Corporation\NVSMI"
.\nvidia-smi --auto-boost-default=0
.\nvidia-smi -ac "2505,1177"

# go to www.teradici.com, create account, and request access from Teradici via support ticket
# download Teradici PCoIP CAS software: PCoIP Graphics Agent 2.12 for Windows or later

# install PCoIP graphics agent package via GUI based installer
enter <TERADICI LICENSE CODE> via GUI installer
reboot machine

# download and install latest Wacom drivers from Wacom website
# https://www.wacom.com/en/support/product-support/drivers

# double-check the Elastic Network Adapter (ENA) is running
# ensure you have AWS CLI installed with AWS credentials on your local machine
aws ec2 describe-instances --instance-ids <CURRENT EC2 INSTANCE ID> --query "Reservations[].Instances[].EnaSupport"

# if the check returns false, install ENA drivers
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/enhanced-networking-ena.html

# if you're using a pre-existing Windows AMI, you need to install the ENA driver yourself
# https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/enhanced-networking-ena.html

Validating the desktop

Finally, take your new Linux or Windows VFX workstation for a spin. Using a zero client:

# connect Wacom tablet to zero-client and start a PCoIP session...
# ensure you configure zero-client to connect via:
# “Auto-Detect” in local z/c connection settings

# install any other software you need...

# don't forget to configure your floating license servers...

# finally, create a new AMI to capture your new custom VFX workstation image in your account

Teradici provides a software client for Windows and macOS that you can use to validate the setup of your Windows or Linux desktop. It’s also handy for system administrators who need to access a graphics workstation for artist technical support.

Testing the desktop

For testing, let’s run Autodesk 3ds Max on Windows and Autodesk Maya on Linux.

In 3ds Max, we have a 35-million-poly scene from the GPU-accelerated renderer Redshift, fully interactive and able to use the NVIDIA card to perform CUDA-based GPU final rendering.

In Maya, we show the 16 vCPUs and 120 GB of RAM available to this 3D scene file. The file takes 10 minutes to final render at HD resolution on a g3.4xl instance or, if you decide to offload the CUDA rendering to the Amazon EC2 P3.16xl instance type, just 19 seconds!


The Amazon EC2 G3 instance type is purpose-built to provide a high-end professional graphics infrastructure for visual computing applications. With remote protocols like Teradici PCoIP, G3 instances are the next-generation VFX cloud desktops that can deliver outstanding performance. With many studios already taking advantage of elastic cloud scaling for rendering, now is a great time to deploy cloud desktops for your business.

AWS Online Tech Talks – July 2018

Post Syndicated from Sara Rodas original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-july-2018/

Join us this month to learn about AWS services and solutions featuring topics on Amazon EMR, Amazon SageMaker, AWS Lambda, Amazon S3, Amazon WorkSpaces, Amazon EC2 Fleet and more! We also have our third episode of the “How to re:Invent” where we’ll dive deep with the AWS Training and Certification team on Bootcamps, Hands-on Labs, and how to get AWS Certified at re:Invent. Register now! We look forward to seeing you. Please note – all sessions are free and in Pacific Time.


Tech talks featured this month:


Analytics & Big Data

July 23, 2018 | 11:00 AM – 12:00 PM PT – Large Scale Machine Learning with Spark on EMR – Learn how to do large scale machine learning on Amazon EMR.

July 25, 2018 | 01:00 PM – 02:00 PM PT – Introduction to Amazon QuickSight: Business Analytics for Everyone – Get an introduction to Amazon QuickSight, Amazon’s BI service.

July 26, 2018 | 11:00 AM – 12:00 PM PT – Multi-Tenant Analytics on Amazon EMR – Discover how to make an Amazon EMR cluster multi-tenant to have different processing activities on the same data lake.



July 31, 2018 | 11:00 AM – 12:00 PM PT – Accelerate Machine Learning Workloads Using Amazon EC2 P3 Instances – Learn how to use Amazon EC2 P3 instances, the most powerful, cost-effective and versatile GPU compute instances available in the cloud.

August 1, 2018 | 09:00 AM – 10:00 AM PT – Technical Deep Dive on Amazon EC2 Fleet – Learn how to launch workloads across instance types, purchase models, and AZs with EC2 Fleet to achieve the desired scale, performance and cost.



July 25, 2018 | 11:00 AM – 11:45 AM PT – How Harry’s Shaved Off Their Operational Overhead by Moving to AWS Fargate – Learn how Harry’s migrated their messaging workload to Fargate and reduced message processing time by more than 75%.



July 23, 2018 | 01:00 PM – 01:45 PM PT – Purpose-Built Databases: Choose the Right Tool for Each Job – Learn about purpose-built databases and when to use which database for your application.

July 24, 2018 | 11:00 AM – 11:45 AM PT – Migrating IBM Db2 Databases to AWS – Learn how to migrate your IBM Db2 database to the cloud database of your choice.



July 25, 2018 | 09:00 AM – 09:45 AM PT – Optimize Your Jenkins Build Farm – Learn how to optimize your Jenkins build farm using the plug-in for AWS CodeBuild.


Enterprise & Hybrid

July 31, 2018 | 09:00 AM – 09:45 AM PT – Enable Developer Productivity with Amazon WorkSpaces – Learn how your development teams can be more productive with Amazon WorkSpaces.

August 1, 2018 | 11:00 AM – 11:45 AM PT – Enterprise DevOps: Applying ITIL to Rapid Innovation – Innovation doesn’t have to equate to more risk for your organization. Learn how Enterprise DevOps delivers agility while maintaining governance, security and compliance.



July 30, 2018 | 01:00 PM – 01:45 PM PT – Using AWS IoT & Alexa Skills Kit to Voice-Control Connected Home Devices – Hands-on workshop that covers how to build a simple backend service using AWS IoT to support an Alexa Smart Home skill.


Machine Learning

July 23, 2018 | 09:00 AM – 09:45 AM PT – Leveraging ML Services to Enhance Content Discovery and Recommendations – See how customers are using computer vision and language AI services to enhance content discovery & recommendations.

July 24, 2018 | 09:00 AM – 09:45 AM PT – Hyperparameter Tuning with Amazon SageMaker’s Automatic Model Tuning – Learn how to use Automatic Model Tuning with Amazon SageMaker to tune hyperparameters and get the best machine learning model for your datasets.

July 26, 2018 | 09:00 AM – 10:00 AM PT – Build Intelligent Applications with Machine Learning on AWS – Learn how to accelerate development of AI applications using machine learning on AWS.



July 18, 2018 | 08:00 AM – 08:30 AM PT – Episode 3: Training & Certification Round-Up – Join us as we dive deep with the AWS Training and Certification team on Bootcamps, Hands-on Labs, and how to get AWS Certified at re:Invent.


Security, Identity, & Compliance

July 30, 2018 | 11:00 AM – 11:45 AM PT – Get Started with Well-Architected Security Best Practices – Discover and walk through essential best practices for securing your workloads using a number of AWS services.



July 24, 2018 | 01:00 PM – 02:00 PM PT – Getting Started with Serverless Computing Using AWS Lambda – Get an introduction to serverless and how to start building applications with no server management.



July 30, 2018 | 09:00 AM – 09:45 AM PT – Best Practices for Security in Amazon S3 – Learn about Amazon S3 security fundamentals and lots of new features that help make security simple.

Deploying a 4K, GPU-backed Linux desktop instance on AWS

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/deploying-4k-gpu-backed-linux-desktop-instance-on-aws/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

AWS currently supports many managed desktop delivery mechanisms. Amazon WorkSpaces and Amazon AppStream 2.0 both deliver managed Windows-based machine images with GPU-backed instances. However, many desktop services and applications are better served through a Linux-backed instance. Given the variety of Linux distributions as well as desktop managers, it can be valuable to have a generic solution for provisioning a Linux desktop on Amazon EC2.

A GPU-backed instance reduces the computational requirements from the client (local) machine, eliminating the need for a local discrete GPU to run graphical workloads. The framebuffer objects generated by the GPU are compressed when sent over the network, and decompressed by the local CPU resources. This allows clients to take advantage of the server GPU and display the high-resolution content on local thin clients, mobile devices, and low-powered desktops and laptops. Such GPU-backed Linux instances have been used for VFX rendering, computational drug discovery, and computational fluid dynamics (CFD) simulation use cases. An upcoming followup post details enabling this technology on the Windows platform.


In this configuration, a client machine connects to the provisioned desktop (server) in the cloud. The server captures the framebuffer, which is sent in real time to the client machine over the network. Thus latency is an important metric to consider when provisioning this solution. I recommend choosing the nearest AWS Region (under 100 ms). Some customers may even prefer to install AWS Direct Connect.

| Region | Latency |
|---|---|
| US-East (Virginia) | 18 ms |
| US East (Ohio) | 31 ms |
| US-West (California) | 77 ms |
| US-West (Oregon) | 97 ms |
| Canada (Central) | 29 ms |
| Europe (Ireland) | 89 ms |
| Europe (London) | 90 ms |
| Europe (Frankfurt) | 108 ms |
| Asia Pacific (Mumbai) | 197 ms |
| Asia Pacific (Seoul) | 198 ms |
| Asia Pacific (Singapore) | 288 ms |
| Asia Pacific (Sydney) | 218 ms |
| Asia Pacific (Tokyo) | 188 ms |
| South America (São Paulo) | 138 ms |
| China (Beijing) | 267 ms |
| AWS GovCloud (US) | 97 ms |

Source: http://www.cloudping.info/ from the Amazon offices located in Herndon, VA
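
As a rough sketch of how the Region choice could be scripted, the snippet below filters the latencies listed above against a threshold. The numbers are the Herndon, VA measurements from this post and will differ from your location; the function name `regions_under` is hypothetical.

```shell
#!/bin/sh
# Latencies in ms as measured from Herndon, VA (copied from the table in this post).
# Format: "<latency_ms> <Region name>", one entry per line.
latencies='18 US East (Virginia)
31 US East (Ohio)
77 US West (California)
97 US West (Oregon)
29 Canada (Central)
89 Europe (Ireland)
90 Europe (London)
108 Europe (Frankfurt)
197 Asia Pacific (Mumbai)
198 Asia Pacific (Seoul)
288 Asia Pacific (Singapore)
218 Asia Pacific (Sydney)
188 Asia Pacific (Tokyo)
138 South America (São Paulo)
267 China (Beijing)
97 AWS GovCloud (US)'

# regions_under THRESHOLD_MS
# Reads "<ms> <name>" lines on stdin and prints those strictly below the
# threshold, lowest latency first.
regions_under() {
    awk -v max="$1" '$1 + 0 < max' | sort -n
}

# Regions that satisfy the "under 100 ms" recommendation.
printf '%s\n' "$latencies" | regions_under 100
```

Replace the embedded list with your own measurements before relying on the result.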

Bandwidth requirements depend on the quality of the desktop experience as well as the desired resolution. Provision the backend Linux desktop instance with a 4096×2160 (4K) resolution. Depending on the specific G3 instance type selected, multi-GPU managed desktops give additional performance benefits. Each instance can also host multiple users, either in collaborative sessions, or with up to four independent 4K monitors. The GPU framebuffer memory used per session generally limits the number of sessions per managed desktop.

A smooth, reliable experience depends on a low-latency, high-bandwidth connection to the EC2 instance hosting the desktop. One of the benefits of using a multithreaded framebuffer reader is that only the block of the rendered desktop that is changing needs to be sent over the network. Full-screen redraws are necessary only in rare cases. The minimum requirements for this 4K (3840×2160) configuration are as follows:

  • Bandwidth: 50 Mbps
  • Latency: < 30 ms
  • Jitter: < 5 ms
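
To put the 50 Mbps figure in context, here is a back-of-the-envelope calculation of what an uncompressed 4K stream would need. It assumes 24 bits per pixel and 60 frames per second, which are my assumptions for illustration rather than measured values.

```shell
#!/bin/sh
# Raw bandwidth of an uncompressed 3840x2160 desktop stream.
width=3840
height=2160
bits_per_pixel=24    # assumption: 24-bit color
refresh_hz=60        # assumption: every frame is sent in full
link_mbps=50         # minimum bandwidth quoted above

raw_bps=$((width * height * bits_per_pixel * refresh_hz))
raw_mbps=$((raw_bps / 1000000))
ratio=$((raw_mbps / link_mbps))

echo "Uncompressed stream: ${raw_mbps} Mbps"
echo "Reduction needed to fit ${link_mbps} Mbps: about ${ratio}x"
```

The gap of roughly two orders of magnitude is what compression and sending only the changed blocks have to cover.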


Use RHEL/CentOS for the deployment. Except for DCV, this stack is compatible with Debian/Ubuntu distributions. Use the CentOS 7.5 Server AMI and install the NVIDIA/Xorg/KDE stack to create a fully functioning desktop environment with a maximum resolution of 16384 x 8640 (that is, 4x4K) at 60 Hz.

This stack contains the following software:

  • CentOS 7.5 Base
  • Xorg 1.19
  • NVIDIA Grid Driver 6.1 (for the G3 instance family)
  • KDE Desktop environment
  • VirtualGL
  • TurboVNC

To make the most efficient use of the NVIDIA Tesla M60 framebuffer memory, disable the compositing features of the desktop manager. Other non-compositing desktop managers (such as XFCE, MATE, etc.) are supported as well. This ensures that the GPU is reserved for specific OpenGL API tasks for the application, and that the performance is not impacted by the desktop environment decorations.

Start up a CentOS 7.5 server desktop based on the latest AMI available in the closest Region:

Distributor ID:    CentOS
Description:       CentOS Linux release 7.5.1804 (Core)
Release:           7.5.1804
Codename:          Core

Now install the Xorg stack with the KDE desktop manager:

sudo yum install epel-release
sudo yum update
sudo yum groupinstall "Development Tools"
sudo yum install xorg-* kernel-devel dkms python-pip lsb
sudo pip install awscli
sudo yum groupinstall "KDE Plasma Workspaces"
sudo systemctl disable firewalld #AWS security groups will provide our firewall rules
# if there is a kernel update
sudo reboot

Download the NVIDIA Grid driver (6.1). For more information, see Installing the NVIDIA Driver on Linux Instances.

aws s3 cp --recursive s3://ec2-linux-nvidia-drivers/ .
chmod +x latest/NVIDIA-Linux-x86_64-390.57-grid.run
sudo ./latest/NVIDIA-Linux-x86_64-390.57-grid.run
# register the driver with dkms, ignore errors associated with 32bit compatible libraries

Deposit the xorg.conf file in /etc/X11/xorg.conf:

Section "ServerLayout"
        Identifier     "X.org Configured"
        Screen      0  "Screen0" 0 0
        InputDevice    "Mouse0" "CorePointer"
        InputDevice    "Keyboard0" "CoreKeyboard"
EndSection
Section "Files"
        ModulePath   "/usr/lib64/xorg/modules"
        FontPath     "catalogue:/etc/X11/fontpath.d"
        FontPath     "built-ins"
EndSection
Section "Module"
        Load  "glx"
EndSection
Section "InputDevice"
        Identifier  "Keyboard0"
        Driver      "kbd"
EndSection
Section "InputDevice"
        Identifier  "Mouse0"
        Driver      "mouse"
        Option      "Protocol" "auto"
        Option      "Device" "/dev/input/mice"
        Option      "ZAxisMapping" "4 5 6 7"
EndSection
Section "Monitor"
        Identifier   "Monitor0"
        VendorName   "Monitor Vendor"
        ModelName    "Monitor Model"
        Modeline "3840x2160_60.00"  712.34  3840 4152 4576 5312  2160 2161 2164 2235  -HSync +Vsync
EndSection

Section "Device"
        Identifier  "Card0"
        Driver      "nvidia"
        Option "ConnectToAcpid" "0"
        BusID       "PCI:0:30:0"
EndSection
Section "Screen"
        Identifier "Screen0"
        Device     "Card0"
        Monitor    "Monitor0"
        SubSection "Display"
                Viewport   0 0
                Depth     24
                Modes    "4096x2160" "3840x2160"
        EndSubSection
EndSection
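
The BusID value deserves a note: lspci prints PCI addresses in hexadecimal, while xorg.conf expects decimal fields. The sketch below converts one form to the other. The address 00:1e.0 is an assumption chosen to match the BusID in the configuration above, and the helper name `pci_to_busid` is hypothetical; check `lspci | grep -i nvidia` on your own instance.

```shell
#!/bin/sh
# pci_to_busid ADDRESS
# Converts an lspci-style address ("bus:device.function", hexadecimal, with an
# optional leading "domain:") into the decimal "PCI:bus:device:function" form.
pci_to_busid() {
    addr=$1
    case $addr in
        *:*:*) addr=${addr#*:} ;;   # drop a leading domain such as "0000:"
    esac
    bus=${addr%%:*}
    rest=${addr#*:}
    dev=${rest%%.*}
    fn=${rest#*.}
    # printf accepts C-style hex constants, so prefix each field with 0x.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

pci_to_busid "00:1e.0"    # prints PCI:0:30:0
```

If the GPU sits at a different address on your instance type, the BusID line in xorg.conf needs to change accordingly.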

Reboot again and check that the nvidia-gridd service is running. You may notice errors. They can be safely ignored after the nvidia-gridd service successfully acquires a license.

systemctl status nvidia-gridd.service
● nvidia-gridd.service - NVIDIA Grid Daemon
   Loaded: loaded (/usr/lib/systemd/system/nvidia-gridd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-05-29 18:37:35 UTC; 39s ago
  Process: 863 ExecStart=/usr/bin/nvidia-gridd (code=exited, status=0/SUCCESS)
 Main PID: 881 (nvidia-gridd)
   CGroup: /system.slice/nvidia-gridd.service
           └─881 /usr/bin/nvidia-gridd
May 29 18:37:35 ip-10-0-125-164.ec2.internal systemd[1]: Starting NVIDIA Grid Daemon...
May 29 18:37:35 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Started (881)
May 29 18:37:35 ip-10-0-125-164.ec2.internal systemd[1]: Started NVIDIA Grid Daemon.
May 29 18:37:36 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Configuration parameter ( ServerAddress  FeatureType) not set
May 29 18:37:40 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Calling load_byte_array(tra)
May 29 18:37:41 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: License acquired successfully (2)
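
If you want to script this check rather than read the log by eye, something like the following could work. It is a minimal sketch that only looks for the success message shown in the log above; the function name `license_acquired` is hypothetical.

```shell
#!/bin/sh
# license_acquired
# Reads nvidia-gridd log lines on stdin and succeeds if the license
# acquisition message appears anywhere in them.
license_acquired() {
    grep -q 'License acquired successfully'
}

# Demo using the final log line shown above.
sample='May 29 18:37:41 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: License acquired successfully (2)'
printf '%s\n' "$sample" | license_acquired && echo "GRID license OK"

# On the instance itself, feed it the service journal instead:
#   journalctl -u nvidia-gridd.service --no-pager | license_acquired
```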

You can confirm that 4K resolution is enabled by running the following command:

DISPLAY=:0 xrandr -q
Screen 0: minimum 8 x 8, current 4096 x 2160, maximum 16384 x 8640
DVI-D-0 connected primary 4096x2160+0+0 (normal left inverted right x axis y axis) 641mm x 400mm
2560x1600 59.86+
4096x2160 60.03*
3840x2160 60.00 

Finally, check that your underlying GL renderer is using the NVIDIA driver by querying glxinfo:

DISPLAY=:0 glxinfo

OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Quadro FX Tesla M60/PCIe/SSE2
OpenGL core profile version string: 4.5.0 NVIDIA 390.57
OpenGL core profile shading language version string: 4.50 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6.0 NVIDIA 390.57
OpenGL shading language version string: 4.60 NVIDIA

At the time of publication, OpenGL 4.5 is enabled. Your applications can take advantage of that API for rendering.

To interact with the instance, install server-side desktop remote display software that can specifically take advantage of the 3D hardware acceleration. For example, AWS provides the NICE DCV platform.

DCV is an accelerated remote desktop framework that provides in-browser desktop connections. DCV is supported on both Windows and Linux (RHEL/CentOS). On the Windows platform, OpenGL and DirectX are fully supported. DCV entitlement is free when provisioning on AWS. NICE DCV is also provided as a component of the AWS EnginFrame and myHPC solutions.

To install DCV, download the NICE DCV 2017 EL7 archive and Administrative Guide. After you extract the archive on the instance, you see a list of nice-* RPMs. You don’t have to worry about licensing, as the installer detects that the instance is running in AWS.

sudo yum localinstall nice-*
sudo systemctl enable dcvserver
sudo systemctl start dcvserver

When the DCV server starts, you have the option to create a single console session or multiple virtual sessions. You must first assign a password to the centos user by running the following command:

sudo passwd centos

Start the console session:

sudo dcv create-session --type=console --owner centos session1
sudo dcv list-sessions
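
Clients reach the session over TCP port 8443, so the instance's security group needs a matching inbound rule. The sketch below builds the AWS CLI call for that rule and, by default, only prints it. The security group ID and client CIDR are placeholders, and the `open_dcv_port` wrapper and its `DRY_RUN` switch are hypothetical conveniences.

```shell
#!/bin/sh
# open_dcv_port SECURITY_GROUP_ID CLIENT_CIDR
# Prints the AWS CLI command that allows inbound TCP 8443 from CLIENT_CIDR.
# Set DRY_RUN=0 to actually run it (requires configured AWS credentials).
open_dcv_port() {
    set -- aws ec2 authorize-security-group-ingress \
        --group-id "$1" --protocol tcp --port 8443 --cidr "$2"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Placeholder values: substitute your own security group and client address.
open_dcv_port sg-0123456789abcdef0 203.0.113.10/32
```

Restricting the CIDR to your client address, rather than opening the port to everyone, keeps the login portal off the public internet.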

With the AWS security groups configured to allow TCP 8443 traffic to the instance, you see the DCV login portal in your browser and can interact with the instance. Other popular frameworks include the following:

You can also find plug and play images for managed desktops in the AWS Marketplace.


Implement the changes outlined in the Optimizing GPU Settings (P2, P3, and G3 Instances) topic. You can turn off the autoboost feature and set the maximum graphics and memory clocks manually.

sudo nvidia-smi --auto-boost-default=0
sudo nvidia-smi -ac 2505,1177

Application testing

For testing, look at PyMOL (PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.). PyMOL is a standard commercial drug discovery application that is used for processing and visualizing biochemical structures. I used the open-source fork.

With the NVIDIA GRID licensing enabled earlier, PyMOL can take advantage of the Quadro features supplied by the Tesla M60. After it’s installed and loaded, you can confirm the functionality of the entire G3 instance software stack installed earlier:

PyMOL(TM) Molecular Graphics System, Version 2.1.0.
 Copyright (c) Schrodinger, LLC.
 All Rights Reserved.
    Created by Warren L. DeLano, Ph.D. 
    PyMOL is user-supported open-source software.  Although some versions
    are freely available, PyMOL is not in the public domain.
    If PyMOL is helpful in your work or study, then please volunteer 
    support for our ongoing efforts to create open and affordable scientific
    software by purchasing a PyMOL Maintenance and/or Support subscription.

    More information can be found at "http://www.pymol.org".
    Enter "help" for a list of commands.
    Enter "help <command-name>" for information on a specific command.

 Hit ESC anytime to toggle between text and graphics.

 Detected OpenGL version 2.0 or greater. Shaders available.
 Detected GLSL version 4.60.
 OpenGL graphics engine:
  GL_VENDOR:   NVIDIA Corporation
  GL_RENDERER: Quadro FX Tesla M60/PCIe/SSE2
  GL_VERSION:  4.6.0 NVIDIA 390.57
 Adapting to Quadro hardware.
 Detected 16 CPU cores.  Enabled multithreaded rendering.

In the PyMOL window, run “fetch 5ta3”, which is a 39k amino acid protein, under the 4K desktop environment. Rotating and translating the protein should be smooth and respond quickly to pointer events.

The PyMOL Gallery contains other representative examples that take advantage of various visualization and processing workflows. Also, you can find many demos (choose Wizard, Demo).

Under the Sculpting demo, you can show the pointer latency between the client and server.

Finally, look at ray tracing. From the PyMOL wiki, take a chemical structure and render each frame with ray tracing to produce a video. On the Tesla M60 with Quadro features enabled, the total render time was approximately 1 minute.


As I mentioned previously, the framebuffer redirection protocols can create multiple virtual sessions per node. A virtual session is not necessarily tied to a single user either. Instead, the number of independent virtual sessions is limited by the total amount of GPU framebuffer memory used across all sessions on each GPU. Thus, it’s possible to scale horizontally by increasing the number of G3 instances, or vertically by using larger instance types in the G3 family.
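
A quick way to reason about that limit is to divide the framebuffer memory of one GPU by what a typical session uses. The sketch below assumes the nominal 8 GiB per Tesla M60 GPU and a made-up per-session figure of 1536 MiB; measure your own workload (for example with `nvidia-smi --query-gpu=memory.used --format=csv`) before sizing anything. The function name `max_sessions` is hypothetical.

```shell
#!/bin/sh
GPU_FB_MIB=8192    # nominal framebuffer memory of one Tesla M60 GPU

# max_sessions GPU_COUNT SESSION_FB_MIB
# Sessions are counted per GPU first, because the limit applies per GPU.
max_sessions() {
    echo $(( $1 * (GPU_FB_MIB / $2) ))
}

session_fb_mib=1536    # hypothetical per-session usage
for gpus in 1 2 4; do  # GPU counts of g3.4xlarge, g3.8xlarge, g3.16xlarge
    echo "$gpus GPU(s): up to $(max_sessions "$gpus" "$session_fb_mib") sessions"
done
```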


The G3 instance type is purpose-built to provide a managed, high-end professional graphics infrastructure for visual computing needs. With NICE DCV, you can take advantage of NVIDIA Quadro software features for a range of applications including drug discovery and VFX rendering. Connected with the AWS high-performance network backbone, the instance can become an integral part of your graphics workload pipeline. Now, you can power up and deliver your applications to teams working anywhere in the world.

EC2 Instance Update – M5 Instances with Local NVMe Storage (M5d)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/ec2-instance-update-m5-instances-with-local-nvme-storage-m5d/

Earlier this month we launched the C5 Instances with Local NVMe Storage and I told you that we would be doing the same for additional instance types in the near future!

Today we are introducing M5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for workloads that require a balance of compute and memory resources. Here are the specs:

Instance Name | vCPUs | RAM | Local Storage | EBS-Optimized Bandwidth | Network Bandwidth
m5d.large | 2 | 8 GiB | 1 x 75 GB NVMe SSD | Up to 2.120 Gbps | Up to 10 Gbps
m5d.xlarge | 4 | 16 GiB | 1 x 150 GB NVMe SSD | Up to 2.120 Gbps | Up to 10 Gbps
m5d.2xlarge | 8 | 32 GiB | 1 x 300 GB NVMe SSD | Up to 2.120 Gbps | Up to 10 Gbps
m5d.4xlarge | 16 | 64 GiB | 1 x 600 GB NVMe SSD | 2.210 Gbps | Up to 10 Gbps
m5d.12xlarge | 48 | 192 GiB | 2 x 900 GB NVMe SSD | 5.0 Gbps | 10 Gbps
m5d.24xlarge | 96 | 384 GiB | 4 x 900 GB NVMe SSD | 10.0 Gbps | 25 Gbps

The M5d instances are powered by Custom Intel® Xeon® Platinum 8175M series processors running at 2.5 GHz, including support for AVX-512.

You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.

Here are a couple of things to keep in mind about the local NVMe storage on the M5d instances:

Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.

Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.

Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.
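
Because EBS volumes also appear as NVMe devices on these instances, it can help to tell the two apart by model string. The sketch below filters `lsblk` output for instance store devices. The model strings in the sample are assumptions based on what such instances typically report, so verify them on your own instance; `instance_store_devices` is a hypothetical helper name.

```shell
#!/bin/sh
# instance_store_devices
# Reads "NAME MODEL" lines (as printed by `lsblk -d -n -o NAME,MODEL`) on
# stdin and prints /dev paths of devices whose model mentions Instance Storage.
instance_store_devices() {
    awk '/Instance Storage/ { print "/dev/" $1 }'
}

# Demo with assumed sample output: one EBS root volume, one local NVMe device.
sample='nvme0n1 Amazon Elastic Block Store
nvme1n1 Amazon EC2 NVMe Instance Storage'
printf '%s\n' "$sample" | instance_store_devices    # prints /dev/nvme1n1

# On the instance:
#   lsblk -d -n -o NAME,MODEL | instance_store_devices
```

Anything written to the devices this prints is lost when the instance is stopped or terminated, as noted above.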

Available Now
M5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent M5 instances.



AWS Online Tech Talks – June 2018

Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-june-2018/

Join us this month to learn about AWS services and solutions. New this month, we have a fireside chat with the GM of Amazon WorkSpaces and our 2nd episode of the “How to re:Invent” series. We’ll also cover best practices, deep dives, use cases and more! Join us and register today!

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:


Analytics & Big Data

June 18, 2018 | 11:00 AM – 11:45 AM PT – Get Started with Real-Time Streaming Data in Under 5 Minutes – Learn how to use Amazon Kinesis to capture, store, and analyze streaming data in real time, including IoT device data, VPC flow logs, and clickstream data.
June 20, 2018 | 11:00 AM – 11:45 AM PT – Insights For Everyone – Deploying Data across your Organization – Learn how to deploy data at scale using AWS Analytics and QuickSight’s new reader role and usage based pricing.


AWS re:Invent
June 13, 2018 | 05:00 PM – 05:30 PM PT – Episode 2: AWS re:Invent Breakout Content Secret Sauce – Hear from one of our own AWS content experts as we dive deep into the re:Invent content strategy and how we maintain a high bar.

June 25, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Containerized Workloads with Amazon EC2 Spot Instances – Learn how to efficiently deploy containerized workloads and easily manage clusters at any scale at a fraction of the cost with Spot Instances.

June 26, 2018 | 01:00 PM – 01:45 PM PT – Ensuring Your Windows Server Workloads Are Well-Architected – Get the benefits, best practices and tools on running your Microsoft Workloads on AWS leveraging a well-architected approach.


June 25, 2018 | 09:00 AM – 09:45 AM PT – Running Kubernetes on AWS – Learn about the basics of running Kubernetes on AWS, including how to set up masters, networking, and security, and how to add auto scaling to your cluster.



June 18, 2018 | 01:00 PM – 01:45 PM PT – Oracle to Amazon Aurora Migration, Step by Step – Learn how to migrate your Oracle database to Amazon Aurora.

June 20, 2018 | 09:00 AM – 09:45 AM PT – Set Up a CI/CD Pipeline for Deploying Containers Using the AWS Developer Tools – Learn how to set up a CI/CD pipeline for deploying containers using the AWS Developer Tools.


Enterprise & Hybrid
June 18, 2018 | 09:00 AM – 09:45 AM PT – De-risking Enterprise Migration with AWS Managed Services – Learn how enterprise customers are de-risking cloud adoption with AWS Managed Services.

June 19, 2018 | 11:00 AM – 11:45 AM PT – Launch AWS Faster using Automated Landing Zones – Learn how the AWS Landing Zone can automate the setup of best-practice baselines when setting up new AWS environments.

June 21, 2018 | 11:00 AM – 11:45 AM PT – Leading Your Team Through a Cloud Transformation – Learn how you can help lead your organization through a cloud transformation.

June 21, 2018 | 01:00 PM – 01:45 PM PT – Enabling New Retail Customer Experiences with Big Data – Learn how AWS can help retailers realize actual value from their big data and deliver on differentiated retail customer experiences.

June 28, 2018 | 01:00 PM – 01:45 PM PT – Fireside Chat: End User Collaboration on AWS – Learn how End User Compute services can help you deliver access to desktops and applications anywhere, anytime, using any device.

June 27, 2018 | 11:00 AM – 11:45 AM PT – AWS IoT in the Connected Home – Learn how to use AWS IoT to build innovative Connected Home products.


Machine Learning

June 19, 2018 | 09:00 AM – 09:45 AM PT – Integrating Amazon SageMaker into your Enterprise – Learn how to integrate Amazon SageMaker and other AWS Services within an Enterprise environment.

June 21, 2018 | 09:00 AM – 09:45 AM PT – Building Text Analytics Applications on AWS using Amazon Comprehend – Learn how you can unlock the value of your unstructured data with NLP-based text analytics.


Management Tools

June 20, 2018 | 01:00 PM – 01:45 PM PT – Optimizing Application Performance and Costs with Auto Scaling – Learn how selecting the right scaling option can help optimize application performance and costs.


June 25, 2018 | 11:00 AM – 11:45 AM PT – Drive User Engagement with Amazon Pinpoint – Learn how Amazon Pinpoint simplifies and streamlines effective user engagement.


Security, Identity & Compliance

June 26, 2018 | 09:00 AM – 09:45 AM PT – Understanding AWS Secrets Manager – Learn how AWS Secrets Manager helps you rotate and manage access to secrets centrally.
June 28, 2018 | 09:00 AM – 09:45 AM PT – Using Amazon Inspector to Discover Potential Security Issues – See how Amazon Inspector can be used to discover security issues of your instances.



June 19, 2018 | 01:00 PM – 01:45 PM PT – Productionize Serverless Application Building and Deployments with AWS SAM – Learn expert tips and techniques for building and deploying serverless applications at scale with AWS SAM.



June 26, 2018 | 11:00 AM – 11:45 AM PT – Deep Dive: Hybrid Cloud Storage with AWS Storage Gateway – Learn how you can reduce your on-premises infrastructure by using the AWS Storage Gateway to connect your applications to the scalable and reliable AWS storage services.
June 27, 2018 | 01:00 PM – 01:45 PM PT – Changing the Game: Extending Compute Capabilities to the Edge – Discover how to change the game for IIoT and edge analytics applications with AWS Snowball Edge plus enhanced Compute instances.
June 28, 2018 | 11:00 AM – 11:45 AM PT – Big Data and Analytics Workloads on Amazon EFS – Get best practices and deployment advice for running big data and analytics workloads on Amazon EFS.