This post is courtesy of Giuseppe Angelo Porcelli, Principal ML Specialist SA, and Diego Natali, Solutions Architect.
Amazon EFS for AWS Lambda makes it easier to build serverless applications that require persistent file storage or access to large amounts of reference data. Previously, applications had to download data from an object store or database to local ephemeral storage, which is limited to 512 MB, before processing. This required more code, slowed startup, and slowed data processing. Customers also faced challenges when loading large code packages and models for ML inference.
Recently, AWS announced Amazon EFS support for AWS Lambda. It enables customers to easily share data across function invocations. It also allows you to read large reference data files, and write function output to a persistent and shared data store. Customers can now use Lambda to build data-intensive applications, and load larger libraries and models. They can process larger amounts of data in a highly distributed manner, and share data across functions, containers, and instances.
In this blog post, we show how you can use EFS to store deep learning (DL) framework libraries and models, and load them from Lambda to run inference. We provide a code example for running serverless inference with TensorFlow 2.
Using EFS and Lambda for deep learning inference requires two steps:
Storing the deep learning libraries and model on EFS
Creating a Lambda function for inference, which loads the libraries and model from the EFS file system
In the next sections, we share some best practices to implement these steps, and then discuss a full, working example.
Prerequisites
This post assumes experience with Lambda, EFS, plus general knowledge of Python programming, DL, and DL frameworks. To help you get started, read the blog post and documentation.
1. Storing the deep learning libraries and model on Amazon EFS
There are different options for populating EFS with the DL framework Python libraries and the DL model: EC2 instances, third-party tools like lambci, or AWS CodeBuild. AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages for deployment.
This blog post uses an AWS CodeBuild project, configured as follows:
The build environment is a Docker container replicating the Lambda runtime environment. To make sure that the packages work in Lambda, it uses the lambci/lambda build container images on Docker Hub.
The EFS file system is mounted to the CodeBuild environment.
Build commands are used to install the DL framework and download the model to specific paths of the file system.
After the build completes, the EFS file system contains the Python libraries and the model in specific paths. The file system is attached to the Lambda function, which loads those libraries at runtime and executes inference.
For this example, the CodeBuild commands install the TensorFlow 2 framework and download an SSD (Single Shot MultiBox Detector) pre-trained object detection model from TensorFlow Hub.
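A minimal Python sketch of such a build step, assuming the EFS file system is mounted in the build environment at /mnt/python for the libraries and /mnt/model for the model (the paths, version pin, and model URL are illustrative, not the project's exact commands):

import subprocess
import tarfile
import urllib.request

LIB_DIR = "/mnt/python"   # assumed EFS path for the Python packages
MODEL_DIR = "/mnt/model"  # assumed EFS path for the model
# Assumed TensorFlow Hub model; "?tf-hub-format=compressed" serves a tar.gz archive.
MODEL_URL = "https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1?tf-hub-format=compressed"

# Install TensorFlow 2 into the EFS-backed directory instead of the default site-packages.
subprocess.check_call(["pip", "install", "--target", LIB_DIR, "tensorflow==2.2.0"])

# Download the pre-trained SSD object detection model and unpack it onto EFS.
archive_path, _ = urllib.request.urlretrieve(MODEL_URL)
with tarfile.open(archive_path, "r:gz") as tar:
    tar.extractall(MODEL_DIR)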
The approach described can also work for other ML/DL frameworks.
The EFS file system can be attached to multiple Lambda functions. This means it can share the DL framework libraries with multiple inference functions (up to 25,000 connections per file system).
There are alternatives to using EFS for model storage. If the model fits in the Lambda deployment package, you could optimize the first invocation, since the function doesn’t need to download the model. You can also use the function’s initializer to load the model, since the first mount of EFS only takes a few hundred milliseconds.
2. Creating a Lambda function for inference
After attaching the EFS file system, you may structure the Lambda code as follows:
The code outside the handler method first adds the local mount path to the Python path. It then imports the frameworks and loads the model into memory. Executing those operations outside of the function’s handler ensures that those objects remain initialized and are reused in subsequent invocations of the same Lambda function instance. The code inside the handler runs the inference flow by reading inputs, executing the actual inference, and returning the results to the caller.
For hosting the TensorFlow 2 object detection model in the example, the function code follows this structure.
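A hedged sketch of such a handler, assuming the libraries sit under /mnt/python and the model under /mnt/model (the paths and the event format are illustrative):

import json
import sys

# Make the EFS-hosted libraries importable before anything else.
sys.path.insert(0, "/mnt/python")  # assumed EFS mount path for the libraries

import tensorflow as tf  # imported from EFS, not from the deployment package

# Load the model once, outside the handler, so warm invocations reuse it.
model = tf.saved_model.load("/mnt/model")  # assumed EFS mount path for the model
detector = model.signatures["default"]

def lambda_handler(event, context):
    # Read and prepare the input image (how the image arrives is illustrative).
    image = tf.io.decode_jpeg(tf.io.read_file(event["image_path"]), channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)[tf.newaxis, ...]

    result = detector(image)
    return {
        "statusCode": 200,
        "body": json.dumps({
            "detection_boxes": result["detection_boxes"].numpy().tolist(),
            "detection_class_entities": [
                e.decode() for e in result["detection_class_entities"].numpy()
            ],
            "detection_scores": result["detection_scores"].numpy().tolist(),
        }),
    }

Invoking the function returns a response shaped like this: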
{
  "statusCode": 200,
  "body": "{
    \"detection_boxes\": <the relative positions of the bounding boxes>,
    \"detection_class_entities\": <the class labels>,
    \"detection_scores\": <the detection confidences>
  }"
}
Running the example
A full working example is provided to set up and run ML/AI inference on Lambda using EFS. To run it, you need the AWS CDK, which the commands below install. Execute the following commands:
# clone repository
$ git clone https://github.com/aws-samples/lambda-efs-deep-learning-inference.git
$ cd lambda-efs-ml-demo
# Install the CDK and bootstrap the target account (if this was never done before)
$ npm install -g aws-cdk
$ cdk bootstrap aws://{account_id}/{region}
# Install packages for the project, build and deploy
$ cd cdk/
$ npm install
$ npm run build
$ cdk deploy
It takes a few minutes for AWS CodeBuild to deploy the libraries and framework to EFS. To test the Lambda function, invoke it, replacing the function name with the one created by your deployment.
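For example, with boto3 (the function name below is a placeholder; use the one from your stack’s output):

import boto3

client = boto3.client("lambda")
# Placeholder function name; replace it with the function created by the stack.
response = client.invoke(FunctionName="LambdaEFSMLDemo")
with open("/tmp/return.json", "wb") as f:
    f.write(response["Payload"].read())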
Then you can check the inference’s result:
$ tail /tmp/return.json
The following image shows the bounding boxes created from the inference output.
To generate this image with the bounding boxes, use the Jupyter notebook from the repository. We reduce the number of bounding boxes to the most relevant classes:
Bicycle: 91%
Wheel: 48%
Person: 45%
Wheel: 44%
Man: 40%
Bicycle wheel: 37%
Bicycle wheel: 30%
To clean up the deployment, run:
$ cdk destroy
Performance considerations
When planning for ML inference, you must keep three main aspects in mind: the type of compute resources required for inference, the model size and memory footprint, and function initialization and cold start.
Lambda is best suited for CPU-based inference, which meets the needs of most ML/DL inference use cases. Lambda memory can be set between 128 MB and 3008 MB. This means that large models (for example, FasterRCNN models) that may require more memory or dedicated GPUs are not a good fit.
It’s important to understand how Lambda invocations affect performance. The first request to a function instance is called a “cold start”: the function is provisioned, the code is downloaded, and the initializer runs to import the libraries and load the model. In this example, it takes about 40 seconds to load the full TensorFlow 2 libraries from EFS, and another 8 seconds to load the model into memory.
Subsequent calls to the same Lambda function instance don’t incur cold start latency if the request is handled by an existing execution environment. Customers who want to reduce this one-time cold start can use Provisioned Concurrency. This feature provides customers with greater control over performance of their serverless applications at any scale.
The EFS mount operation only takes a few hundred milliseconds and only happens once, during function provisioning. EFS supports up to 25,000 connections, so it is ideal for functions that scale up. We recommend using EFS provisioned throughput with Provisioned Concurrency for better performance. To learn more, read the documentation about Amazon EFS performance and monitoring Amazon EFS.
Conclusion
This post shows how you can use EFS for Lambda to deploy large DL libraries and models into a function for synchronous invocations. The same approach applies to asynchronous invocations. For example, you could perform object detection on images stored in Amazon S3, or process streaming data in Amazon Kinesis and Amazon DynamoDB.
We’ve all been able to check on our kitties’ outdoor activities for a while now, thanks to motion-activated cameras. And the internet’s favourite cat flap even live-tweets when it senses paws through the door.
“Did you already make dinner? I stopped on the way home to pick this up for you.”
But what’s eluded us “owners” of felines up until now is the ability to stop our furry companions from bringing home mauled presents we neither want nor asked for.
A cat flap bouncer powered by deep learning
Now this Raspberry Pi–powered machine learning build, shared by Reddit user u/eee_bume, can help us out: at its heart, there’s a convolutional neural network cascade that detects whether a cat is trying to enter a cat flap with something in its maw. (No word on how many half-consumed rodents the makers had to dispose of while training the machine learning model.)
The neural network first detects the whole cat in an image; then it homes in on the cat’s maw. Image classification is performed to detect whether there is anything in or around the maw. If the network thinks the cat is trying to smuggle caught contraband into the house, it’s a “no” from this virtual door bouncer.
The system runs on a Raspberry Pi 4 with an infrared camera, at an average detection rate of around 1 fps. The prey/no-prey classification threshold, representing the certainty required to flag prey, is set at 0.5.
The infrared camera setup, powered by Raspberry Pi
How to get enough training data
This project formed Nicolas Baumann’s and Michael Ganz’s spring semester thesis at the Swiss Federal Institute of Technology. One of the problems they ran into while trying to train their device is that cats are only expected to enter the cat flap carrying prey 3% of the time, which leads to a largely imbalanced classification problem. It would have taken a loooong time if they had just waited for Nicolas and Michael’s pets to bring home enough decomposing gifts.
The cutest mugshots you ever did see
To get around this, they custom-built a scalable image data gathering network to simplify and maximise the collection of training data. It features multiple distributed Camera Nodes (CN), a centralised main archive, and a custom labeling tool. As a result of the data gathering network, 40 GB of training data has been amassed.
What is my cat eating?!
The makers also took the time to train their neural network to classify different types of prey. So far, it recognises mice, lizards, slow-worms, and birds.
“Come ooooon, it’s not even a *whole* mouse, let me in!”
It’s still being tweaked, but at the moment the machine learning model correctly detects when a cat has prey in its mouth 93% of the time. But it still falsely accuses kitties 28% of the time. We’ll leave it to you to decide whether your feline companion will stand for that kind of false positive rate, or whether it’s more than your job’s worth.
Netflix brings delightful customer experiences to homes on a variety of devices that continues to grow each day. The device ecosystem is rich, with partners ranging from Silicon-on-Chip (SoC) manufacturers to Original Design Manufacturer (ODM) and Original Equipment Manufacturer (OEM) vendors.
Partners across the globe leverage the Netflix device certification process on a continual basis to ensure that quality products and experiences are delivered to their customers. The certification process involves verification of the partner’s implementation of features provided by the Netflix SDK.
The Partner Device Ecosystem organization in Netflix is responsible for ensuring successful integration and testing of the Netflix application on all partner devices. Netflix engineers run a series of tests and benchmarks to validate the device across multiple dimensions including compatibility of the device with the Netflix SDK, device performance, audio-video playback quality, license handling, encryption and security. All this leads to a plethora of test cases, most of them automated, that need to be executed to validate the functionality of a device running Netflix.
Problem
With a collection of tests that, by nature, are time consuming to run and sometimes require manual intervention, we need to prioritize and schedule test executions in a way that will expedite detection of test failures. There are several problems efficient test scheduling could help us solve:
Quickly detect a regression in the integration of the Netflix SDK on a consumer electronic or MVPD (multichannel video programming distributor) device.
Detect a regression in a test case. Using the Netflix Reference Application and known good devices, ensure the test case continues to function and tests what is expected.
When code that many test cases depend on changes, choose the right test cases among thousands of affected tests to quickly validate the change before committing it and running extensive, and expensive, tests.
Choose the most promising subset of tests out of thousands of test cases available when running continuous integration against a device.
Recommend a set of test cases to execute against the device that would increase the probability of failing the device in real-time.
Solving the above problems could help Netflix and our Partners save time and money during the entire lifecycle of device design, build, test, and certification.
These problems could be solved in several different ways. In our quest to be objective, scientific, and in line with the Netflix philosophy of using data to drive solutions for intriguing problems, we proceeded by leveraging machine learning.
In the case of continuously testing a Netflix SDK integration on a new device, we usually lack relevant data for model training in the early phases of integration. In this situation, training an agent is a great fit, as it allows us to start with very little input data and let the agent explore and exploit the patterns it learns in the process of SDK integration and regression testing. The agent in reinforcement learning is an entity that decides what action to take given the current state of the environment, and receives a reward based on the quality of that action.
Solution
We built a system called Lerner, which consists of a set of microservices and a Python library that allow scalable agent training and inference for test case scheduling. We also provide an API client in Python.
Lerner works in tandem with our continuous integration framework that executes on-device tests using the Netflix Test Studio platform. Tests are run on Netflix Reference Applications (running as containers on Titus), as well as on physical devices.
There were several motivations that led to building a custom solution:
We wanted to keep the APIs and integrations as simple as possible.
We needed a way to run agents and tie the runs to the internal infrastructure for analytics, reporting, and visualizations.
We wanted the tool to be available as a standalone library as well as a scalable API service.
Lerner provides the ability to set up any number of agents, making it the first component in our re-usable reinforcement learning framework for device certification.
Lerner, as a web service, relies on Amazon Web Services (AWS) and Netflix’s Open Source Software (OSS) tools. We use Spinnaker to deploy instances and host the API containers on Titus, which allows fast deployment times and rapid scalability. Lerner uses AWS services to store binary versions of the agents, agent configurations, and training data. To maintain the quality of the Lerner APIs, we use the serverless paradigm for Lerner’s own integration testing by utilizing AWS Lambda.
The agent training library is written in Python and supports versions 2.7, 3.5, 3.6, and 3.7. The library is available in the artifactory repository for easy installation. It can be used in Python notebooks — allowing for rapid experimentation in isolated environments without a need to perform API calls. The agent training library exposes different types of learning agents that utilize neural networks to approximate the value of actions.
The neural network (NN)-based agent uses a deep net with fully connected layers. The NN gets the state of a particular test case (the input) and outputs a continuous value, where a higher number means an earlier position in a test execution schedule. The inputs to the neural network include general historical features, such as the last N executions, and several domain-specific features that provide meta-information about a test case.
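As an illustration only (not Lerner’s actual implementation), a scoring network of this shape could look like the following Keras sketch, where the feature vector combines recent execution history with test-case metadata:

import numpy as np
from tensorflow import keras

N_FEATURES = 16  # illustrative: last N pass/fail results plus test-case metadata

# A small fully connected net mapping a test case's state to a priority score;
# a higher score means an earlier position in the execution schedule.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(N_FEATURES,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1),  # continuous output: schedule priority
])
model.compile(optimizer="adam", loss="mse")

# Rank a batch of test cases by predicted priority, highest first.
states = np.random.rand(100, N_FEATURES).astype("float32")  # placeholder features
priorities = model.predict(states).ravel()
schedule = np.argsort(-priorities)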
The Lerner APIs are split into three areas:
Storing execution results.
Getting recommendations based on the current state of the environment.
Assigning a reward to the agent based on the execution result and predicted recommendations.
The process of getting recommendations and rewarding the agent through the APIs consists of four steps, sketched in code after this list:
Out of all available test cases for a particular job, form a request that Lerner can interpret. This involves aggregating historical results and additional features.
Lerner returns a recommendation identified with a unique episode id.
A CI system can execute the recommendation and submit the execution results to Lerner based on the episode id.
Call an API to assign a reward based on the agent id and episode id.
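A hypothetical sketch of this flow (the endpoint, URL paths, and payload shapes below are illustrative; Lerner’s actual client API is internal to Netflix):

import requests

BASE_URL = "https://lerner.example.net/api/v1"  # hypothetical endpoint

# 1. Form a request Lerner can interpret: aggregated history plus features.
payload = {
    "agent_id": "sdk-regression-agent",  # illustrative agent id
    "test_cases": [{"id": "tc-001", "last_runs": [1, 1, 0]},
                   {"id": "tc-002", "last_runs": [0, 0, 0]}],
}

# 2. Lerner returns a recommendation identified by a unique episode id.
rec = requests.post(f"{BASE_URL}/recommendations", json=payload).json()

# 3. The CI system executes the recommendation and submits the results.
results = {"tc-001": "passed", "tc-002": "failed"}  # stand-in execution results
requests.post(f"{BASE_URL}/episodes/{rec['episode_id']}/results", json=results)

# 4. Assign a reward based on the agent id and episode id.
requests.post(f"{BASE_URL}/agents/{payload['agent_id']}"
              f"/episodes/{rec['episode_id']}/reward", json={"reward": 1.0})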
Below is a diagram of the services and persistence layers that support the functionality of the Lerner API.
The self-service nature of the tool makes it easy for service owners to integrate with Lerner, create agents, ask agents for recommendations and reward them after execution results are available.
The metrics relevant to the training and recommendation process are reported to Atlas and visualized using Netflix’s Lumen. Users of the service can track the statistics specific to the agents they setup and deploy, which allows them to build their own dashboards.
We have identified some interesting patterns while doing online reinforcement learning.
The recommendation/execution reward cycle can happen without any prior training data.
We can bootstrap several CI jobs that would use agents with different reward functions, and gain additional insight based on agents performance. It could help us design and implement more targeted reward functions.
We can keep a small amount of historical data to train agents. The data can be truncated after each execution and offloaded to a long-term storage for further analysis.
Some of the downsides:
It might take time for an agent to stop exploring and start exploiting the accumulated experience.
As agents are stored in a binary format in the database, an update of an agent from multiple jobs could cause a race condition on its state. Handling concurrency in the training process is cumbersome and requires trade-offs. We achieved the desired state by relying on the locking mechanisms of the underlying persistence layer that stores and serves agent binaries.
Thus, we have the luxury of training as many agents as we want that could prioritize and recommend test cases based on their unique learning experiences.
Outcome
We are currently piloting the system and have live agents serving predictions for various CI runs. At the moment, we run Lerner-based CIs in parallel with CIs that either execute test cases in random order or use simple heuristics, such as sorting test cases by time and executing everything that previously failed.
The system was built with simplicity and performance in mind, so the set of APIs is minimal. We developed client libraries that allow seamless, but opinionated, integration with Lerner.
We collect several metrics to evaluate the performance of a recommendation, the main ones being time to first failure and time to complete a whole scheduled run.
Lerner-based recommendations are proving to be different and more insightful than random runs, as they allow us to fit a particular time budget and detect patterns such as cases that tend to fail together in a cluster, cases that haven’t been run in a long time, and so on.
The graphs below show a more or less artificial case, in which a schedule of 100+ test cases contains several flaky tests. The Y-axis represents how many minutes it took to complete the schedule or reach the first failed test case. In blue, we have random recommendations with no time budget constraints. In green, you can see executions based on Lerner recommendations under a time constraint of 60 minutes. The green spikes represent Lerner exploring the environment; the wiggly lines around 0 are the executions that failed quickly as Lerner was exploiting its policy.
Execution of schedules that were randomly generated. The Y-axis represents time to finish execution or reach the first failure.
Execution of Lerner-based schedules. You can see moments when Lerner was exploring the environment; the wiggly lines represent schedules generated by exploiting existing knowledge.
Next Steps
The next phases of the project will focus on:
Reward functions that are aware of a comprehensive domain context, such as assigning appropriate rewards to states where infrastructure is fragile and a test case could not be run appropriately.
Administrative user-interface to manage agents.
More generic, simple, and user-friendly framework for reinforcement learning and agent deployment.
Using Lerner on all available CI jobs against all SDK versions.
Experiment with different neural network architectures.
If you would like to be a part of our team, come join us.
Contributed by Manuel Manzano Hoss, Cloud Support Engineer
I remember playing around with graphics processing unit (GPU) workload examples in 2017, when the Deep Learning on AWS Batch post was published by my colleague Kiuk Chung. He provided an example of how to train a convolutional neural network (CNN), the LeNet architecture, to recognize handwritten digits from the MNIST dataset using Apache MXNet as the framework. Back then, to run such jobs with GPU capabilities, I had to do the following:
Identify the type of P2 EC2 instance that had the required number of GPUs for my job.
Check the number of vCPUs that it offered (even if I was not interested in using them).
Specify that number of vCPUs for my job.
All that, without any certainty that the required GPU would be available to my job once it was running. Back then, there was no GPU pinning: other jobs running on the same EC2 instance could use that GPU, making the orchestration of my jobs a tricky task.
Fast forward two years. Today, AWS Batch announced integrated support for Amazon EC2 Accelerated Instances. It is now possible to specify a number of GPUs as a resource that AWS Batch considers in choosing the EC2 instance to run your job, along with vCPU and memory. That allows me to take advantage of the main benefits of using AWS Batch: the compute resource selection algorithm and the job scheduler. It also frees me from having to check which EC2 instance types have enough GPUs.
Also, I can take advantage of the Amazon ECS GPU-optimized AMI maintained by AWS. It comes with the NVIDIA drivers and all the necessary software to run GPU-enabled jobs. When I allow the P2 or P3 instance types on my compute environment, AWS Batch launches my compute resources using the Amazon ECS GPU-optimized AMI automatically.
In other words, now I don’t worry about the GPU task list mentioned earlier. I can focus on deciding which framework and command to run on my GPU-accelerated workload. At the same time, I’m now sure that my jobs have access to the required performance, as physical GPUs are pinned to each job and not shared among them.
A GPU race against the past
As a kind of GPU-race exercise, I checked a similar example to the one from Kiuk’s post, to see how fast it could be to run a GPU-enabled job now. I used the AWS Management Console to demonstrate how simple the steps are.
In this case, I decided to use the deep neural network architecture called multilayer perceptron (MLP), not the LeNet CNN, to compare the validation accuracy between them.
To make the test even simpler and faster to implement, I thought I would use one of the recently announced AWS Deep Learning containers, which come pre-packed with different frameworks and ready-to-process data. I chose the container that comes with MXNet and Python 2.7, customized for Training and GPU. For more information about the Docker images available, see the AWS Deep Learning Containers documentation.
In the AWS Batch console, I created a managed compute environment with the default settings, allowing AWS Batch to create the required IAM roles on my behalf.
On the configuration of the compute resources, I selected the P2 and P3 instance families, as those are the instance types with GPU capabilities. You can select On-Demand Instances, but in this case I decided to use Spot Instances to take advantage of the discounts that this pricing model offers. I left the defaults for all other settings, selecting the AmazonEC2SpotFleetRole role that I created the first time that I used Spot Instances:
Finally, I also left the network settings as default. My compute environment selected the default VPC, three subnets, and a security group. They are enough to run my jobs and at the same time keep my environment safe by limiting connections from outside the VPC:
I created a job queue, GPU_JobQueue, attaching it to the compute environment that I just created:
Next, I registered the same job definition that I would have created following Kiuk’s post. I specified enough memory to run this test, one vCPU, and the AWS Deep Learning Docker image that I chose, in this case mxnet-training:1.4.0-gpu-py27-cu90-ubuntu16.04. The number of GPUs required was, in this case, one. To have access to run the script, the container must run as privileged, or using the root user.
Finally, I submitted the job. I first cloned the MXNet repository for the train_mnist.py Python script. Then I ran the script itself, with the parameter --gpus 0 to indicate that the assigned GPU should be used. The job inherits all the other parameters from the job definition.
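A hedged boto3 sketch of the same registration and submission (the job definition name, memory value, and image registry are illustrative; the repository and script path follow the MXNet example layout):

import boto3

batch = boto3.client("batch")

# Register a job definition that requests one GPU alongside vCPU and memory.
batch.register_job_definition(
    jobDefinitionName="mxnet-train-mnist",  # illustrative name
    type="container",
    containerProperties={
        # Placeholder registry; the tag matches the Deep Learning container above.
        "image": "<registry>/mxnet-training:1.4.0-gpu-py27-cu90-ubuntu16.04",
        "vcpus": 1,
        "memory": 8192,       # illustrative amount
        "privileged": True,   # the script requires privileged or root execution
        "resourceRequirements": [{"type": "GPU", "value": "1"}],
    },
)

# Submit the job: clone the MXNet repository, then run train_mnist.py on GPU 0.
batch.submit_job(
    jobName="train-mnist-gpu",
    jobQueue="GPU_JobQueue",
    jobDefinition="mxnet-train-mnist",
    containerOverrides={
        "command": ["sh", "-c",
                    "git clone https://github.com/apache/incubator-mxnet.git && "
                    "python incubator-mxnet/example/image-classification/train_mnist.py --gpus 0"],
    },
)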
That’s all, and my GPU-enabled job was running. It took me less than two minutes to go from zero to having the job submitted. This is the log of my job, from which I removed the iterations from epoch 1 to 18 to make it shorter:
As you can see, after AWS Batch launched the instance, the job took slightly more than two minutes to run. I spent roughly five minutes from start to finish. That was much faster than the time that I was previously spending just to configure the AMI. Using the AWS CLI, one of the AWS SDKs, or AWS CloudFormation, the same environment could be created even faster.
From a training point of view, I lost on validation accuracy, as the results obtained using the LeNet CNN are higher than those of the MLP network. On the other hand, my job was faster, with an average time cost of 1.6 seconds per epoch. As the software stack evolves and hardware capabilities increase, these numbers keep improving, but that shouldn’t mean extra complexity. Using managed primitives like the one presented in this post enables a simpler implementation.
I encourage you to test this example and see for yourself how just a few clicks or commands lets you start running GPU jobs with AWS Batch. Then, it is just a matter of replacing the Docker image that I used for one with the framework of your choice, TensorFlow, Caffe, PyTorch, Keras, etc. Start to run your GPU-enabled machine learning, deep learning, computational fluid dynamics (CFD), seismic analysis, molecular modeling, genomics, or computational finance workloads. It’s faster and easier than ever.
If you decide to give it a try, have any doubt or just want to let me know what you think about this post, please write in the comments section!
Running on a smart mirror, YogAI uses a database of postures, image recognition software, and the magic of mirrors to not only show users their current pose, but also teach them how to correct it to reach peak yogi-ness. Here’s Rob Zwetsloot from The MagPi magazine with more.
We’ve seen many ‘magic mirror’ projects over the past few years, featuring a TV screen behind the glass to show useful information, but YogAI takes the concept to a whole new level by providing an AI personal trainer to guide and correct your yoga positions.
Self-confessed fitness nuts Salma Mayorquin and Terry Rodriguez thought that having a personal trainer could be a way to keep track of their fitness progress, so why not try to make a virtual one? “With [deep learning] models like pose estimation, we figured there was a way we could make a program that could track how we were exercising and started experimenting from there,” says Terry.
“YogAI guides users through a flow of yoga poses, offering generally helpful advice when the camera senses a user not in the correct pose,” explains Salma. “At the heart, YogAI uses pose estimation to find reference key points on the body. This is used to understand and classify common yoga poses.”
Users interact with YogAI through both visual feedback via the mirror display, and a voice interface — using the Snips AIR voice assistant — which enables the user to give spoken commands to start, stop, pause, and restart a yoga session. YogAI also talks back through the Flite voice synthesiser to guide the yogi to achieve the correct poses.
While a prototype magic mirror only took the experienced makers a week to build, training the AI to recognise yoga poses in real time was a trickier task. “We need our computer vision models to run quickly so that we have enough resolution in time to identify the move,” reveals Terry.
Strike a pose
A Raspberry Pi 3 interprets the camera images in real time, detecting key body points to display the pose on the mirror and classify it using a deep learning model trained with a dataset of around 35,000 samples.
However, the pair found that the Pi could only run image inference at one frame every 4–5 seconds, resulting in lag. A workaround was soon found: “Shrinking our pose estimation models down using TensorFlow Lite, we were able to bring our frame rate from 0.2 fps to 2.5 fps,” says Salma. “For faster inference, we will look for ways to reduce the model further. We also believe upgrading to the Raspberry Pi Compute Module 3 will increase the performance significantly.”
“Overall, the accuracy across a dozen common poses is roughly 80%,” divulges Terry. “Not surprisingly, we find similar pose variants, e.g. warrior poses, can be a source of confusion. When the head/face is blocked, the pose estimates degrade, which impacts our classification of poses like downward dog.”
More intense exercise
As well as using the system for yoga, Salma and Terry are planning to adapt YogAI to monitor more energetic workouts. “We’re interested in strength training, and others have suggested dance and karate katas,” says Terry. “We think YogAI is well-positioned to perform more general health and personal wellness tasks.”
“We want to integrate with popular health wearables,” adds Salma. “A smart watch with an accelerometer and heart rate monitor can introduce a lot of important context to bring YogAI closer to our vision for a smart mirror yoga instructor and toward a personal wellness platform.”
More from The MagPi magazine
The MagPi magazine issue 80 is out today. Buy your copy now from the Raspberry Pi Press store, major newsagents in the UK, or Barnes & Noble, Fry’s, or Micro Center in the US. Or, download your free PDF copy from The MagPi magazine website.
Subscribe now
Subscribe to The MagPi magazine on a monthly, quarterly, or twelve-month basis to save money against newsstand prices!
Twelve-month print subscribers get a free Raspberry Pi 3A+, the perfect Raspberry Pi to try your hand at some of the latest projects covered in The MagPi magazine.
This post is contributed by Brent Langston – Sr. Developer Advocate, Amazon Container Services
Last week, AWS announced enhanced Amazon Elastic Container Service (Amazon ECS) support for GPU-enabled EC2 instances. This means that GPUs are now first-class resources that can be requested in your task definition and scheduled on your cluster by ECS.
Previously, to schedule a GPU workload, you had to maintain your own custom configured AMI, with a custom configured Docker runtime. You also had to use custom vCPU logic as a stand-in for assigning your GPU workloads to GPU instances. Even when all that was in place, there was still no pinning of a GPU to a task. One task might consume more GPU resources than it should. This could cause other tasks to not have a GPU available.
Now, AWS maintains an ECS-optimized AMI that includes the correct NVIDIA drivers and Docker customizations. You can use this AMI to provision your GPU workloads. With this enhancement, GPUs can also be requested directly in the task definition. Like allocating CPU or RAM to a task, now you can explicitly request a number of GPUs to be allocated to your task. The scheduler looks for matching resources on the cluster to place those tasks. The GPUs are pinned to the task for as long as the task is running, and can’t be allocated to any other tasks.
I thought I’d see how easy it is to deploy GPU workloads to my ECS cluster. I’m working in the US-EAST-2 (Ohio) region, from my AWS Cloud9 IDE, so these commands work for Amazon Linux. Feel free to adapt to your environment as necessary.
If you’d like to run this example yourself, you can find all the code in this GitHub repo. If you run this example in your own account, be aware of the instance pricing, and clean up your resources when your experiment is complete.
While AWS CloudFormation is provisioning resources, examine the template used to build your infrastructure. Open `cluster-cpu-gpu.yml`, and you see that you are provisioning a test VPC with two c5.2xlarge instances, and two p3.2xlarge instances. This gives you one NVIDIA Tesla V100 GPU per instance, for a total of two GPUs to run training tasks.
I adapted the TensorFlow benchmark Docker container to create a training workload. I use this container to compare the GPU scheduling and runtime.
When the CloudFormation stack is deployed, register a task definition with the ECS service:
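A hedged boto3 sketch of such a registration (the family, container name, memory and CPU values, and image URI are illustrative):

import boto3

ecs = boto3.client("ecs")

# Register a task definition that requests one GPU for the training container.
ecs.register_task_definition(
    family="tensorflow-benchmark-gpu",  # illustrative family name
    requiresCompatibilities=["EC2"],
    containerDefinitions=[{
        "name": "tf-benchmark",
        "image": "<your-registry>/tensorflow-benchmark:latest",  # placeholder image URI
        "cpu": 1024,
        "memory": 8192,
        # The GPU is pinned to the task while it runs and cannot be shared.
        "resourceRequirements": [{"type": "GPU", "value": "1"}],
    }],
)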
When you launch the task, the output shows the `gpuIds` values that are assigned to the task. This GPU is pinned to this task, and can’t be shared with any other tasks. If all GPUs are allocated, you can’t schedule additional GPU tasks until a running task with a GPU completes. That frees the GPU to be scheduled again.
When you look at the log output in Amazon CloudWatch Logs, you see that the container discovered one GPU: `/gpu0` and the training benchmark trained at a rate of 321.16 images/sec.
With your two p3.2xlarge nodes in the cluster, you are limited to two concurrent single GPU based workloads. To scale horizontally, you could add additional p3.2xlarge nodes. This would limit your workloads to a single GPU each. To scale vertically, you could bump up the instance type, which would allow you to assign multiple GPUs to a single task. Now, let’s see how fast your TensorFlow container can train when assigned multiple GPUs.
Launch a multiple-GPU training workload
To begin, replace the p3.2xlarge instances with p3.16xlarge instances. This gives your cluster two instances that each have eight GPUs, for a total of 16 GPUs that can be allocated.
With each task request, GPUs are allocated: four in the first request, and eight in the second request. Again, these GPUs are pinned to the task, and not usable by any other task until these tasks are complete.
Check the log output in CloudWatch Logs:
On the “devices” lines, you can see that the container discovered and used four (or eight) GPUs. Also, the total images/sec improved to 1297.41 with four GPUs, and 1707.23 with eight GPUs.
Because you can pin single or multiple GPUs to a task, running advanced GPU based training tasks on Amazon ECS is easier than ever!
Cleanup
To clean up your running resources, delete the CloudFormation stack:
We have seen a lot of discussion this past week about the role of Amazon Rekognition in facial recognition, surveillance, and civil liberties, and we wanted to share some thoughts.
Amazon Rekognition is a service we announced in 2016. It makes use of new technologies – such as deep learning – and puts them in the hands of developers in an easy-to-use, low-cost way. Since then, we have seen customers use the image and video analysis capabilities of Amazon Rekognition in ways that materially benefit both society (e.g. preventing human trafficking, inhibiting child exploitation, reuniting missing children with their families, and building educational apps for children), and organizations (enhancing security through multi-factor authentication, finding images more easily, or preventing package theft). Amazon Web Services (AWS) is not the only provider of services like these, and we remain excited about how image and video analysis can be a driver for good in the world, including in the public sector and law enforcement.
There have always been and will always be risks with new technology capabilities. Each organization choosing to employ technology must act responsibly or risk legal penalties and public condemnation. AWS takes its responsibilities seriously. But we believe it is the wrong approach to impose a ban on promising new technologies because they might be used by bad actors for nefarious purposes in the future. The world would be a very different place if we had restricted people from buying computers because it was possible to use that computer to do harm. The same can be said of thousands of technologies upon which we all rely each day. Through responsible use, the benefits have far outweighed the risks.
Customers are off to a great start with Amazon Rekognition; the evidence of the positive impact this new technology can provide is strong (and growing by the week), and we’re excited to continue to support our customers in its responsible use.
-Dr. Matt Wood, general manager of artificial intelligence at AWS
Today, at the AWS Summit in Tokyo, we announced a number of updates and new features for Amazon SageMaker. Starting today, SageMaker is available in Asia Pacific (Tokyo)! SageMaker also now supports CloudFormation. A new machine learning framework, Chainer, is now available in the SageMaker Python SDK, in addition to MXNet and TensorFlow. Finally, support for running Chainer models on several devices was added to AWS Greengrass Machine Learning.
Amazon SageMaker Chainer Estimator
Chainer is a popular, flexible, and intuitive deep learning framework. Chainer networks work on a “Define-by-Run” scheme, where the network topology is defined dynamically via forward computation. This is in contrast to many other frameworks which work on a “Define-and-Run” scheme where the topology of the network is defined separately from the data. A lot of developers enjoy the Chainer scheme since it allows them to write their networks with native python constructs and tools.
Luckily, using Chainer with SageMaker is just as easy as using a TensorFlow or MXNet estimator. In fact, it might even be a bit easier, since you can likely take your existing scripts and use them to train on SageMaker with very few modifications. With TensorFlow or MXNet, users have to implement a train function with a particular signature. With Chainer, your scripts can be a little more portable, as you can simply read from a few environment variables like SM_MODEL_DIR, SM_NUM_GPUS, and others. We can wrap our existing script in an if __name__ == '__main__': guard and invoke it locally or on SageMaker.
import argparse
import os

if __name__ == '__main__':
    parser = argparse.ArgumentParser()

    # Hyperparameters sent by the client are passed as command-line arguments to the script.
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--batch-size', type=int, default=64)
    parser.add_argument('--learning-rate', type=float, default=0.05)

    # Data, model, and output directories
    parser.add_argument('--output-data-dir', type=str, default=os.environ['SM_OUTPUT_DATA_DIR'])
    parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
    parser.add_argument('--train', type=str, default=os.environ['SM_CHANNEL_TRAIN'])
    parser.add_argument('--test', type=str, default=os.environ['SM_CHANNEL_TEST'])

    args, _ = parser.parse_known_args()

    # ... load from args.train and args.test, train a model, write the model to args.model_dir.
Then, we can run that script locally or use the SageMaker Python SDK to launch it on some GPU instances in SageMaker. The hyperparameters are passed to the script as command-line arguments, and the environment variables above are auto-populated. When we call fit, the input channels we pass are populated in the SM_CHANNEL_* environment variables.
from sagemaker.chainer.estimator import Chainer
# Create my estimator
chainer_estimator = Chainer(
entry_point='example.py',
train_instance_count=1,
train_instance_type='ml.p3.2xlarge',
hyperparameters={'epochs': 10, 'batch-size': 64}
)
# Train my estimator
chainer_estimator.fit({'train': train_input, 'test': test_input})
# Deploy my estimator to a SageMaker Endpoint and get a Predictor
predictor = chainer_estimator.deploy(
instance_type="ml.m4.xlarge",
initial_instance_count=1
)
Now, instead of bringing your own Docker container for training and hosting with Chainer, you can just maintain your script. You can see the full sagemaker-chainer-containers on GitHub. One of my favorite features of the new container is built-in chainermn for easy multi-node distribution of your Chainer training jobs.
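As a hedged illustration (assuming example.py initializes a chainermn communicator), distributing the training job above over two GPU instances is mostly a matter of raising the instance count:

from sagemaker.chainer.estimator import Chainer

# Same estimator as above, scaled out to two instances; the entry-point script
# is assumed to set up chainermn for multi-node training.
distributed_estimator = Chainer(
    entry_point='example.py',
    train_instance_count=2,              # multi-node distribution
    train_instance_type='ml.p3.2xlarge',
    hyperparameters={'epochs': 10, 'batch-size': 64}
)
distributed_estimator.fit({'train': train_input, 'test': test_input})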
There’s a lot more documentation and information available in both the README and the example notebooks.
AWS Greengrass ML with Chainer
AWS Greengrass ML now includes a pre-built Chainer package for all devices powered by Intel Atom, NVIDIA Jetson TX2, and Raspberry Pi. So now Greengrass ML provides pre-built packages for TensorFlow, Apache MXNet, and Chainer! You can train your models on SageMaker and then easily deploy them to any Greengrass-enabled device using Greengrass ML.
JAWS UG
I want to give a quick shout out to all of our wonderful and inspirational friends in the JAWS UG who attended the AWS Summit in Tokyo today. I’ve very much enjoyed seeing your pictures of the summit. Thanks for making Japan an amazing place for AWS developers! I can’t wait to visit again and meet with all of you.
This video demos a real-life Pokedex, complete with visual recognition, that I created using a Raspberry Pi, Python, and deep learning. You can find the entire blog post, including code, using this link: https://www.pyimagesearch.com/2018/04/30/a-fun-hands-on-deep-learning-project-for-beginners-students-and-hobbyists/ Music credit to YouTube user “No Copyright” for providing royalty-free music: https://www.youtube.com/watch?v=PXpjqURczn8
The history of Pokémon in 30 seconds
The Pokémon franchise was created by video game designer Satoshi Tajiri in 1995. In the fictional world of Pokémon, Pokémon Trainers explore the vast landscape, catching and training small creatures called Pokémon. To date, there are 802 different types of Pokémon. They range from the ever-recognisable Pikachu, a bright yellow electric Pokémon, to the highly sought-after Shiny Charizard, a metallic, playing-card-shaped Pokémon that your mate Alex claims she has in mint condition, but refuses to show you.
In the world of Pokémon, children as young as ten-year-old protagonist and all-round annoyance Ash Ketchum are allowed to leave home and wander the wilderness. There, they hunt vicious, deadly creatures in the hope of becoming a Pokémon Master.
Adrian’s deep learning Pokédex
Adrian is a bit of a deep learning pro, as demonstrated by his Santa/Not Santa detector, which we wrote about last year. For that project, he also provided a great explanation of what deep learning actually is. In a nutshell:
…a subfield of machine learning, which is, in turn, a subfield of artificial intelligence (AI). While AI embodies a large, diverse set of techniques and algorithms related to automatic reasoning (inference, planning, heuristics, etc.), the machine learning subfields are specifically interested in pattern recognition and learning from data.
As with his earlier Raspberry Pi project, Adrian uses the Keras deep learning model and the TensorFlow backend, plus a few other packages such as Adrian’s own imutils functions and OpenCV.
Adrian trained a Convolutional Neural Network using Keras on a dataset of 1191 Pokémon images, obtaining 96.84% accuracy. As Adrian explains, this model is able to identify Pokémon via still image and video. It’s perfect for creating a Pokédex – an interactive Pokémon catalogue that should, according to the franchise, be able to identify and read out information on any known Pokémon when captured by camera. More information on model training can be found on Adrian’s blog.
For the physical build, a Raspberry Pi 3 with camera module is paired with the Raspberry Pi 7″ touch display to create a portable Pokédex. And while Adrian comments that the same result can be achieved using your home computer and a webcam, that’s not how Adrian rolls as a Raspberry Pi fan.
Plus, the smaller size of the Pi is perfect for one of you to incorporate this deep learning model into a 3D-printed Pokédex for ultimate Pokémon glory, pretty please, thank you.
Adrian has gone into impressive detail about how the project works and how you can create your own on his blog, pyimagesearch. So if you’re interested in learning more about deep learning, and making your own Pokédex, be sure to visit.
Amazon SageMaker continues to iterate quickly and release new features on behalf of customers. Starting today, SageMaker adds support for many new instance types, local testing with the SDK, and Apache MXNet 1.1.0 and TensorFlow 1.6.0. Let’s take a quick look at each of these updates.
New Instance Types
Amazon SageMaker customers now have additional options for right-sizing their workloads for notebooks, training, and hosting. Notebook instances now support almost all T2, M4, P2, and P3 instance types, with the exception of t2.micro, t2.small, and m4.large instances. Model training now supports nearly all M4, M5, C4, C5, P2, and P3 instances, with the exception of m4.large, c4.large, and c5.large instances. Finally, model hosting now supports nearly all T2, M4, M5, C4, C5, P2, and P3 instances, with the exception of m4.large instances. Many customers can take advantage of the newest P3, C5, and M5 instances to get the best price/performance for their workloads. Customers can also take advantage of the burstable compute model on T2 instances for endpoints or notebooks that are used less frequently.
Open Sourced Containers, Local Mode, and TensorFlow 1.6.0 and MXNet 1.1.0
Today Amazon SageMaker has open sourced the MXNet and TensorFlow deep learning containers that power the MXNet and TensorFlow estimators in the SageMaker SDK. The ability to write Python scripts that conform to a simple interface is still one of my favorite SageMaker features, and now those containers can additionally be customized to include any extra libraries. You can download these containers locally to iterate and experiment, which can accelerate your debugging cycle. When you’re ready to go from local testing to production training and hosting, you just change one line of code.
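A sketch of that one-line change, using the MXNet estimator (the script name and data path are illustrative, and required arguments such as the IAM role are omitted for brevity):

from sagemaker.mxnet import MXNet

# Iterate locally: 'local' runs the open-sourced container on your own machine.
estimator = MXNet(
    entry_point='train.py',       # illustrative training script
    train_instance_count=1,
    train_instance_type='local',  # change to e.g. 'ml.p3.2xlarge' for production
)

# Local mode can read training data straight from the local file system.
estimator.fit({'train': 'file:///tmp/mnist-data'})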
These containers launch with support for TensorFlow 1.6.0 and MXNet 1.1.0 as well. TensorFlow 1.6.0 includes a number of new features, such as support for CUDA 9.0, cuDNN 7, and AVX instructions, which allows for significant speedups in many training applications. MXNet 1.1.0 adds a number of new features, including a text API, mxnet.text, with support for text processing, indexing, glossaries, and more. Two of the really cool pre-trained embeddings included are GloVe and fastText.
Available Now
All of the features mentioned above are available today. As always, please let us know on Twitter or in the comments below if you have any questions or if you’re building something interesting. Now, if you’ll excuse me, I’m going to go experiment with some of those new MXNet APIs!
Can you believe it’s already the month of March? With some great new Tech Talks available this month, there’s no better time to grow your knowledge about AWS services and solutions.
AWS Online Tech Talks
March 2018 – Schedule
Below is the full schedule for the live, online technical sessions being held during the month of March. Make sure to register ahead of time so you won’t miss out on these free talks conducted by AWS subject matter experts.
March 28, 2018 | 11:00 AM – 12:00 PM PT – Deep Dive on Amazon Athena (300) – Dive deep into the most common Amazon Athena use cases, including working with other AWS services.
Compute
March 26, 2018 | 01:00 PM – 01:45 PM PT – High Performance Computing in the Cloud (200) – Learn how AWS is enabling faster time to results and higher ROI when it comes to solving the big problems in science, engineering and business with high performance computing in the cloud.
March 27, 2018 | 01:00 PM – 01:45 PM PT – Introduction to Hybrid Cloud on AWS (200) – Learn how AWS is building the industry’s broadest capabilities for Hybrid Cloud deployments.
March 28, 2018 | 01:00 PM – 02:00 PM PT – Media Processing Workflows at High Velocity and Scale using AI and ML (200) – Hear how AWS customers have improved media supply chains using AI in areas such as metadata tagging (Rekognition and Comprehend), translations, transcriptions, and cloud services (Elemental).
March 22, 2018 | 09:00 AM – 09:45 AM PT – New Mobile CLI and Console Experience (200) – Learn how AWS Mobile Services has introduced a new CLI and streamlined console experience in order to simplify and speed up the development of mobile applications with innovative AWS features and back-end functionality.
Networking
March 28, 2018 | 09:00 AM – 09:45 AM PT – Deep Dive on New AWS Networking Features (300) – Learn how AWS PrivateLink, Direct Connect gateway, and new features with Elastic Load Balancers (ELB) come together to meet the needs of a modern enterprise.
March 29, 2018 | 09:00 AM – 09:45 AM PT – Navigating GDPR Compliance on AWS (300) – Get a walkthrough of potential General Data Protection Regulation (GDPR) obligations and see how the AWS cloud offers services and features that are consistent with GDPR considerations in the ramp-up to the May 25th, 2018 enforcement date.
March 27, 2018 | 11:00 AM – 11:45 AM PT – Enterprise Applications with Amazon EFS (300) – Join us for a technical deep dive on Amazon EFS, where you’ll learn tips and tricks for integrating your enterprise applications with Amazon EFS.
March 29, 2018 | 11:00 AM – 11:45 AM PT – Transforming Data Lakes with Amazon S3 Select & Amazon Glacier Select (300) – Join us for a webinar where we’ll demonstrate how Amazon S3 Select can increase analytics query performance up to 400%, and Amazon Glacier Select makes it practical to extend queries to archive storage, significantly reducing data lake storage costs.
Kumar Venkateswar, Product Manager on the AWS ML Platforms Team, shares details on the announcement of Auto Scaling with Amazon SageMaker.
With Amazon SageMaker, thousands of customers have been able to easily build, train and deploy their machine learning (ML) models. Today, we’re making it even easier to manage production ML models, with Auto Scaling for Amazon SageMaker. Instead of having to manually manage the number of instances to match the scale that you need for your inferences, you can now have SageMaker automatically scale the number of instances based on an AWS Auto Scaling Policy.
SageMaker has made managing the ML process easier for many customers. We’ve seen customers take advantage of managed Jupyter notebooks and managed distributed training. We’ve seen customers deploying their models to SageMaker hosting for inferences, as they integrate machine learning with their applications. SageMaker makes this easy – you don’t have to think about patching the operating system (OS) or frameworks on your inference hosts, and you don’t have to configure inference hosts across Availability Zones. You just deploy your models to SageMaker, and it handles the rest.
Until now, you have needed to specify the number and type of instances per endpoint (or production variant) to provide the scale that you need for your inferences. If your inference volume changes, you can change the number and/or type of instances that back each endpoint to accommodate that change, without incurring any downtime. In addition to making it easy to change provisioning, customers have asked us how we can make managing capacity for SageMaker even easier.
With Auto Scaling for Amazon SageMaker, available through the SageMaker console, the AWS Auto Scaling API, and the AWS SDK, this becomes much easier. Now, instead of having to closely monitor inference volume and change the endpoint configuration in response, customers can configure a scaling policy to be used by AWS Auto Scaling. Auto Scaling adjusts the number of instances up or down in response to actual workloads, determined by using Amazon CloudWatch metrics and target values defined in the policy. In this way, customers can automatically adjust their inference capacity to maintain predictable performance at a low cost. You simply specify the target inference throughput per instance and provide upper and lower bounds for the number of instances for each production variant. SageMaker then monitors throughput per instance using Amazon CloudWatch alarms, and adjusts provisioned capacity up or down as needed.
After you configure the endpoint with Auto Scaling, SageMaker will continue to monitor your deployed models to automatically adjust the instance count. SageMaker will keep throughput within desired levels, in response to changes in application traffic. This makes it easier to manage models in production, and it can help reduce the cost of deployed models, as you no longer have to provision sufficient capacity in order to manage your peak load. Instead, you configure the limits to accommodate your minimum expected traffic and the maximum peak, and Amazon SageMaker will work within those limits to minimize cost.
How do you get started? Open the SageMaker console. For existing endpoints, you first access the endpoint to modify the settings.
Then, scroll to the Endpoint runtime settings section, select the variant, and choose Configure auto scaling.
First, configure the minimum and maximum number of instances.
Next, choose the throughput per instance at which you want to add an additional instance, given previous load testing.
You can optionally set cooldown periods for scaling in or out, to avoid oscillation during periods of wide fluctuation in workload. Otherwise, SageMaker assumes default values.
And that’s it! You now have an endpoint that will automatically scale with increasing inferences.
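The same configuration can also be applied programmatically through the Application Auto Scaling API. A hedged boto3 sketch, where the endpoint and variant names are placeholders and the target value is illustrative:

import boto3

autoscaling = boto3.client("application-autoscaling")

# Placeholder endpoint and variant names.
resource_id = "endpoint/my-endpoint/variant/AllTraffic"

# Register the production variant as a scalable target with instance bounds.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Track a target throughput per instance; capacity scales in or out around it.
autoscaling.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1000.0,  # illustrative invocations per instance
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)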
You pay for the capacity used at regular SageMaker pay-as-you-go pricing, so you no longer have to pay for unused capacity during relative idle periods!
Auto Scaling in Amazon SageMaker is available today in the US East (N. Virginia and Ohio), EU (Ireland), and US West (Oregon) AWS Regions. To learn more, see the Amazon SageMaker Auto Scaling documentation.
Kumar Venkateswar is a Product Manager in the AWS ML Platforms team, which includes Amazon SageMaker, Amazon Machine Learning, and the AWS Deep Learning AMIs. When not working, Kumar plays the violin and Magic: The Gathering.
Noted below are the upcoming scheduled live, online technical sessions being held during the month of January. Make sure to register ahead of time so you won’t miss out on these free talks conducted by AWS subject matter experts.
The video is a demo of my “Not Santa” detector that I deployed to the Raspberry Pi. I trained the detector using deep learning, Keras, and Python. You can find the full source code and tutorial here: https://www.pyimagesearch.com/2017/12/18/keras-deep-learning-raspberry-pi/
Ho-ho-how does it work?
Note: Adrian Rosebrock is not Santa. But he does a good enough impression of the jolly old fellow that his disguise can fool a Raspberry Pi into thinking otherwise.
We jest, but has anyone seen Adrian and Santa in the same room together? Image c/o Adrian Rosebrock
But how is the Raspberry Pi able to detect the Santa-ness or Not-Santa-ness of people who walk into the frame?
Two words: deep learning
If you’re not sure what deep learning is, you’re not alone. It’s a hefty topic, and one that Adrian has written a book about, so I grilled him for a bluffers’ guide. In his words, deep learning is:
…a subfield of machine learning, which is, in turn, a subfield of artificial intelligence (AI). While AI embodies a large, diverse set of techniques and algorithms related to automatic reasoning (inference, planning, heuristics, etc.), the machine learning subfields are specifically interested in pattern recognition and learning from data.
Artificial Neural Networks (ANNs) are a class of machine learning algorithms that can learn from data. We have been using ANNs successfully for over 60 years, but something special happened in the past 5 years — (1) we’ve been able to accumulate massive datasets, orders of magnitude larger than previous datasets, and (2) we have access to specialized hardware to train networks faster (i.e., GPUs).
Given these large datasets and specialized hardware, deeper neural networks can be trained, leading to the term “deep learning”.
Now that we have a bird’s-eye view of deep learning, how does the detector detect?
Components for Santa detection (image c/o Adrian Rosebrock)
The camera captures footage of Santa in the wild, while the Christmas tree add-on provides a twinkly notification, accompanied by a resonant ho, ho, ho from the speakers.
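In between sits a trained Keras model that classifies each frame. As a minimal sketch of that classification step (the model path, input size, and output ordering here are illustrative assumptions; the real pipeline is in Adrian’s tutorial):

import cv2
import numpy as np
from keras.models import load_model

# Hypothetical path to the trained "Not Santa" model
model = load_model("not_santa.h5")

frame = cv2.imread("frame.jpg")  # stand-in for a frame from the Pi camera
image = cv2.resize(frame, (28, 28)).astype("float") / 255.0
image = np.expand_dims(image, axis=0)  # add the batch dimension

(not_santa, santa) = model.predict(image)[0]
if santa > not_santa:
    print("Santa detected! Light the tree and cue the ho, ho, ho.")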
A deeper deep dive into deep learning
A full breakdown of the project and the workings of the Not Santa detector can be found on Adrian’s blog, PyImageSearch, which includes links to other deep learning and image classification tutorials using TensorFlow and Keras. It’s an excellent place to start if you’d like to understand more about deep learning.
Build your own Santa detector
Santa might catch on to Adrian’s clever detector and start avoiding the camera. For that eventuality, we have our own Santa detector. It uses motion detection to notify you of his presence (and your presents!).
Check out our Santa Detector resource here and use a passive infrared sensor, Raspberry Pi, and Scratch to catch the big man in action.
My colleague Sandy Carter delivered the Enterprise Innovation State of the Union last week at AWS re:Invent. She wrote the guest post below to recap the announcements that she made from the stage.
“I want my company to innovate, but I am not convinced we can execute successfully.” Far too many times I have heard this fear expressed by senior executives that I have met at different points in my career. In fact, a recent study published by PricewaterhouseCoopers found that while 93% of executives depend on innovation to drive growth, more than half are challenged to take innovative ideas to market quickly in a scalable way.
Many customers are struggling with how to drive enterprise innovation, so I was thrilled to share the stage at AWS re:Invent this past week with several senior executives who have successfully broken this mold to drive amazing enterprise innovation. In particular, I want to thank Parag Karnik from Johnson & Johnson, Bill Rothe from Hess Corporation, Dave Williams from Just Eat, and Olga Lagunova from Pitney Bowes for sharing their stories of innovation, creativity, and solid execution.
Among the many new announcements from AWS this past week, I am particularly excited about the following newly-launched AWS products and programs that I announced at re:Invent to drive new innovations by our enterprise customers:
AI: New Deep Learning Amazon Machine Image (AMI) on EC2 Windows
As I shared at re:Invent, customers such as Infor are already successfully leveraging artificial intelligence tools on AWS to deliver tailored, industry-specific applications to their customers. We want to make it easier for our Windows developers to get started quickly with AI, leveraging machine learning-based tools and popular deep learning frameworks such as Apache MXNet, TensorFlow, and Caffe2. To enable this, I announced at re:Invent that AWS now offers a new Deep Learning AMI for Microsoft Windows. The AMI is tailored to facilitate large-scale training of deep learning models, and enables quick and easy setup of Windows Server-based compute resources for machine learning applications.
IoT: Visualize and Analyze SQL and IoT Data
Forecasts show as many as 31 billion IoT devices by 2020. AWS wants every Windows customer to take advantage of the data available from their devices. Pitney Bowes, for example, now has more than 130,000 IoT devices streaming data to AWS. Using machine learning, Pitney Bowes enriches and analyzes data to enhance their customer experience, improve efficiencies, and create new data products. You can now use AWS IoT Analytics to run analytics on IoT data and get insights that help you make better and more accurate decisions for IoT applications and machine learning use cases. AWS IoT Analytics can automatically enrich IoT device data with contextual metadata, such as your SQL Server transactional data.
New Capabilities for .NET Developers on AWS
In addition to all of the enhancements we’ve introduced to deliver a first-class experience to Windows developers on AWS, we announced that we are including .NET Core 2.0 support in AWS Lambda and AWS CodeBuild, which will be available for broader use early next year. .NET Core 2.0 packs a number of new features, such as Razor pages, better compatibility with the .NET Framework, and more than double the number of APIs compared to previous versions. With this announcement, you will be able to take advantage of all of the latest .NET Core features on Lambda and CodeBuild for building modern serverless and DevOps-centric solutions.
License optimization for BYOL
AWS provides a wide variety of instance types and families that best meet your workload needs. If you are using software licensed by the number of vCPUs, you want the ability to further tweak vCPU count to optimize license spend. I announced the upcoming ability to optimize CPUs for EC2, giving you greater control over your EC2 instances on two fronts (see the sketch after this list):
You can specify a custom number of vCPUs when launching new instances to save on vCPU-based licensing costs, such as SQL Server licensing spend.
You can disable Hyper-Threading Technology for workloads that perform well with single-threaded CPUs, like some high-performance computing (HPC) applications.
Using these capabilities, customers who bring their own license (BYOL) will be able to optimize their license usage and save on license costs.
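Since the capability was announced as upcoming, the exact interface wasn’t final at the time; it ultimately surfaced in the EC2 API as a CpuOptions parameter. A sketch of a launch with a custom core count via boto3 (the AMI ID, instance type, and core count are placeholders):

import boto3

ec2 = boto3.client("ec2")

# Launch an instance with 4 physical cores and Hyper-Threading disabled
# (ImageId and InstanceType are placeholders for your own values)
ec2.run_instances(
    ImageId="ami-12345678",
    InstanceType="r4.4xlarge",
    MinCount=1,
    MaxCount=1,
    CpuOptions={"CoreCount": 4, "ThreadsPerCore": 1},
)

Setting ThreadsPerCore to 1 disables Hyper-Threading, and lowering CoreCount reduces the vCPU count that per-vCPU licenses are metered against.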
Server Migration Service for Hyper-V Virtual Machines
As Bill Rothe from Hess Corporation shared at re:Invent, Hess has successfully migrated a wide range of workloads to the cloud, including SQL Server, SharePoint, SAP HANA, and many others. To further support enterprise migrations like these, AWS Server Migration Service (SMS) now supports Hyper-V virtual machine (VM) migration. AWS Server Migration Service enables you to more easily coordinate large-scale server migrations from on-premises Hyper-V environments to AWS, and allows you to automate, schedule, and track incremental replications of live server volumes. The replicated volumes are encrypted in transit and saved as a new Amazon Machine Image (AMI), which can be launched as an EC2 instance on AWS.
Microsoft Premier Support for AWS End-Customers
I was pleased to announce that Microsoft and AWS have developed new areas of support integration to help ensure a great customer experience. Microsoft Premier Support is on board to help AWS assist end customers. AWS Support engineers can escalate directly to Microsoft Support on behalf of AWS customers running Microsoft workloads.
Best Practice Tools: HIPAA Compliance and Digital Innovation Workshop
In November, we updated our HIPAA-focused white paper, outlining how you can use AWS to create HIPAA-compliant applications. In the first quarter of next year, we will publish a HIPAA Implementation Guide that expands on our HIPAA Quick Start to enable you to follow strict security, compliance, and risk management controls for common healthcare use cases. I was also pleased to award a Digital Innovation Workshop to one of our customers in my re:Invent session, and look forward to seeing more customers take advantage of this workshop.
AWS: The Continuous Innovation Cloud
A common thread we see across customers is that continuous innovation from AWS enables their ongoing reinvention. Continuous innovation means that you are always getting a newer, better offering every single day. Sometimes it is in the form of brand new services and capabilities, and sometimes it is happening invisibly, under the covers where your environment just keeps getting better. I invite you to learn more about how you can accelerate your innovation journey with recently launched AWS services and AWS best practices. If you are migrating Windows workloads, speak with your AWS sales representative or an AWS Microsoft Workloads Competency Partner to learn how you can leverage our re:Think for Windows program for credits to start your migration.
Today AWS announced contributions to the milestone 1.0 release of the Apache MXNet deep learning engine including the introduction of a new model-serving capability for MXNet. The new capabilities in MXNet provide the following benefits to users:
1) MXNet is easier to use: The model server for MXNet is a new capability introduced by AWS. It packages, runs, and serves deep learning models in seconds with just a few lines of code, making them accessible over the internet via an API endpoint and thus easy to integrate into applications. The 1.0 release also includes an advanced indexing capability that enables users to perform matrix operations in a more intuitive manner.
Model Serving enables setting up an API endpoint for prediction: It saves developers time and effort by condensing the task of setting up an API endpoint for running and integrating prediction functionality into an application to just a few lines of code. It bridges the gap between Python-based deep learning frameworks and production systems through a Docker container-based deployment model.
Advanced indexing for array operations in MXNet: It is now more intuitive for developers to leverage the powerful array operations in MXNet. They can use the advanced indexing capability by applying their existing knowledge of NumPy/SciPy arrays. For example, MXNet now supports an MXNet NDArray or a NumPy ndarray as an index, e.g. a[mx.nd.array([1, 2], dtype='int32')].
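To make that concrete, here is a tiny example of NDArray-based indexing (the values are arbitrary):

import mxnet as mx

a = mx.nd.arange(12).reshape((3, 4))       # a 3x4 matrix holding 0..11
idx = mx.nd.array([1, 2], dtype="int32")   # an NDArray used as an index
rows = a[idx]                              # selects rows 1 and 2, NumPy-style
print(rows.asnumpy())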
2) MXNet is faster: The 1.0 release includes implementations of cutting-edge features that optimize the performance of training and inference. Gradient compression enables users to train models up to five times faster by reducing the communication bandwidth between compute nodes, without loss in convergence rate or accuracy. For acoustic modeling in speech recognition, such as for the Alexa voice, this feature can reduce network bandwidth by up to three orders of magnitude during training. With support for the NVIDIA Collective Communication Library (NCCL), users can train a model 20% faster on multi-GPU systems.
Optimize network bandwidth with gradient compression: In distributed training, each machine must communicate frequently with the others to update the weight vectors and thereby collectively build a single model, leading to high network traffic. The gradient compression algorithm enables users to train models up to five times faster by compressing the model changes communicated by each instance.
Optimize training performance by taking advantage of NCCL: NCCL implements multi-GPU and multi-node collective communication primitives that are performance-optimized for NVIDIA GPUs. It provides communication routines optimized to achieve high bandwidth over the interconnects between GPUs. MXNet supports NCCL to train models about 20% faster on multi-GPU systems.
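Both optimizations are enabled through the kvstore. A minimal sketch follows; the 2-bit compression threshold is illustrative, a dist_sync kvstore assumes the job runs under MXNet’s distributed launcher, and the nccl kvstore requires an MXNet build with NCCL support:

import mxnet as mx

# Gradient compression for distributed training
# (run under MXNet's distributed launcher)
kv = mx.kv.create("dist_sync")
kv.set_gradient_compression({"type": "2bit", "threshold": 0.5})

# On a single multi-GPU machine, an NCCL-backed kvstore
# accelerates gradient aggregation across GPUs
kv_nccl = mx.kv.create("nccl")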
3) MXNet provides easy interoperability: MXNet now includes a tool for converting neural network code written with the Caffe framework to MXNet code, making it easier for users to take advantage of MXNet’s scalability and performance.
Migrate Caffe models to MXNet: It is now possible to easily migrate Caffe code to MXNet, using the new source code translation tool.
MXNet has helped developers and researchers make progress with everything from language translation to autonomous vehicles and behavioral biometric security. We are excited to see the broad base of users that are building production artificial intelligence applications powered by neural network models developed and trained with MXNet. For example, the autonomous driving company TuSimple recently piloted a self-driving truck on a 200-mile journey from Yuma, Arizona to San Diego, California using MXNet. This release also includes a full-featured and performance-optimized version of the Gluon programming interface. Its ease of use, combined with an extensive set of tutorials, has led to significant adoption among developers new to deep learning. The flexibility of the interface has also driven interest within the research community, especially in the natural language processing domain.
Getting started with MXNet
Getting started with MXNet is simple. To learn more about the Gluon interface and deep learning, you can reference this comprehensive set of tutorials, which covers everything from an introduction to deep learning to how to implement cutting-edge neural network models. If you’re a contributor to a machine learning framework, check out the interface specs on GitHub.
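To give a feel for the Gluon interface, here is a minimal, illustrative network definition (the layer sizes are arbitrary):

import mxnet as mx
from mxnet import gluon

# A small feed-forward network in Gluon's imperative style
net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(64, activation="relu"))
    net.add(gluon.nn.Dense(10))
net.collect_params().initialize(mx.init.Xavier())

x = mx.nd.ones((1, 784))  # a dummy input batch
print(net(x).shape)       # -> (1, 10)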
To get started with the Model Server for Apache MXNet, install the library with the following command:
$ pip install mxnet-model-server
The Model Server library has a Model Zoo with 10 pre-trained deep learning models, including the SqueezeNet 1.1 object classification model. You can start serving the SqueezeNet model with just the following command:
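Based on the Model Server documentation at the time, the serving command looked like the following; the Model Zoo URL shown is a reconstruction and may have changed since:

$ mxnet-model-server --models squeezenet=https://s3.amazonaws.com/model-server/models/squeezenet_v1.1/squeezenet_v1.1.model

This starts a local server exposing a prediction endpoint for the SqueezeNet model (on port 8080 by default), ready to classify images sent to it over HTTP.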
We can’t believe that there are just a few days left before re:Invent 2017. If you are attending this year, you’ll want to check out our Big Data sessions! The Big Data and Machine Learning categories are bigger than ever. As in previous years, you can find these sessions in various tracks, including Analytics & Big Data, Deep Learning Summit, Artificial Intelligence & Machine Learning, Architecture, and Databases.
We have great sessions from organizations and companies like Vanguard, Cox Automotive, Pinterest, Netflix, FINRA, Amtrak, AmazonFresh, Sysco Foods, Twilio, American Heart Association, Expedia, Esri, Nextdoor, and many more. All sessions are recorded and made available on YouTube. In addition, all slide decks from the sessions will be available on SlideShare.net after the conference.
This post highlights the sessions that will be presented as part of the Analytics & Big Data track, as well as relevant sessions from other tracks like Architecture, Artificial Intelligence & Machine Learning, and IoT. If you’re interested in Machine Learning sessions, don’t forget to check out our Guide to Machine Learning at re:Invent 2017.
This year’s session catalog contains the following breakout sessions.
Raju Gulabani, VP of Database, Analytics, and AI at AWS, will discuss the evolution of database and analytics services at AWS, the new database and analytics services and features we launched this year, and our vision for continued innovation in this space. We are witnessing unprecedented growth in the amount of data collected, in many different forms. Storage, management, and analysis of this data require database services that scale and perform in ways not possible before. AWS offers a collection of database and other data services—including Amazon Aurora, Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon ElastiCache, Amazon Kinesis, and Amazon EMR—to process, store, manage, and analyze data. In this session, we provide an overview of AWS database and analytics services and discuss how customers are using these services today.
Deep dive customer use cases
ABD401 – How Netflix Monitors Applications in Near Real-Time with Amazon Kinesis Thousands of services work in concert to deliver millions of hours of video streams to Netflix customers every day. These applications vary in size, function, and technology, but they all make use of the Netflix network to communicate. Understanding the interactions between these services is a daunting challenge both because of the sheer volume of traffic and the dynamic nature of deployments. In this session, we first discuss why Netflix chose Kinesis Streams to address these challenges at scale. We then dive deep into how Netflix uses Kinesis Streams to enrich network traffic logs and identify usage patterns in real time. Lastly, we cover how Netflix uses this system to build comprehensive dependency maps, increase network efficiency, and improve failure resiliency. From this session, you will learn how to build a real-time application monitoring system using network traffic logs and get real-time, actionable insights.
In this session, learn how Nextdoor, a private social networking service for neighborhoods, replaced their home-grown data pipeline based on a topology of Flume nodes with a completely serverless architecture based on Kinesis and Lambda. By making these changes, they improved both the reliability of their data and the delivery times of billions of records of data to their Amazon S3–based data lake and Amazon Redshift cluster.
ABD205 – Taking a Page Out of Ivy Tech’s Book: Using Data for Student Success Data speaks. Discover how Ivy Tech, the nation’s largest singly accredited community college, uses AWS to gather, analyze, and take action on student behavioral data for the betterment of over 3,100 students. This session outlines the process from inception to implementation across the state of Indiana and highlights how Ivy Tech’s model can be applied to your own complex business problems.
ABD207 – Leveraging AWS to Fight Financial Crime and Protect National Security Banks aren’t known to share data and collaborate with one another. But that is exactly what the Mid-Sized Bank Coalition of America (MBCA) is doing to fight digital financial crime—and protect national security. Using the AWS Cloud, the MBCA developed a shared data analytics utility that processes terabytes of non-competitive customer account, transaction, and government risk data. The intelligence produced from the data helps banks increase the efficiency of their operations, cut labor and operating costs, and reduce false positive volumes. The collective intelligence also allows greater enforcement of Anti-Money Laundering (AML) regulations by helping members detect internal risks—and identify the challenges to detecting these risks in the first place. This session demonstrates how the AWS Cloud supports the MBCA to deliver advanced data analytics, provide consistent operating models across financial institutions, reduce costs, and strengthen national security.
ABD208 – Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores New Innovation with Amazon Kinesis Firehose In this session, learn how Cox Automotive is using Splunk Cloud for real time visibility into its AWS and hybrid environments to achieve near instantaneous MTTI, reduce auction incidents by 90%, and proactively predict outages. We also introduce a highly anticipated capability that allows you to ingest, transform, and analyze data in real time using Splunk and Amazon Kinesis Firehose to gain valuable insights from your cloud resources. It’s now quicker and easier than ever to gain access to analytics-driven infrastructure monitoring using Splunk Enterprise & Splunk Cloud.
ABD209 – Accelerating the Speed of Innovation with a Data Sciences Data & Analytics Hub at Takeda Historically, silos of data, analytics, and processes across functions, stages of development, and geography created a barrier to R&D efficiency. Gathering the right data necessary for decision-making was challenging due to issues of accessibility, trust, and timeliness. In this session, learn how Takeda is undergoing a transformation in R&D to increase the speed-to-market of high-impact therapies to improve patient lives. The Data and Analytics Hub was built, with Deloitte, to address these issues and support the efficient generation of data insights for functions such as clinical operations, clinical development, medical affairs, portfolio management, and R&D finance. In the AWS hosted data lake, this data is processed, integrated, and made available to business end users through data visualization interfaces, and to data scientists through direct connectivity. Learn how Takeda has achieved significant time reductions—from weeks to minutes—to gather and provision data that has the potential to reduce cycle times in drug development. The hub also enables more efficient operations and alignment to achieve product goals through cross functional team accountability and collaboration due to the ability to access the same cross domain data.
ABD210 – Modernizing Amtrak: Serverless Solution for Real-Time Data Capabilities As the nation’s only high-speed intercity passenger rail provider, Amtrak needs to know critical information to run their business such as: Who’s onboard any train at any time? How are booking and revenue trending? Amtrak was faced with unpredictable and often slow response times from existing databases, ranging from seconds to hours; existing booking and revenue dashboards were spreadsheet-based and manual; multiple copies of data were stored in different repositories, lacking integration and consistency; and operations and maintenance (O&M) costs were relatively high. Join us as we demonstrate how Deloitte and Amtrak successfully went live with a cloud-native operational database and analytical datamart for near-real-time reporting in under six months. We highlight the specific challenges and the modernization of architecture on an AWS native Platform as a Service (PaaS) solution. The solution includes cloud-native components such as AWS Lambda for microservices, Amazon Kinesis and AWS Data Pipeline for moving data, Amazon S3 for storage, Amazon DynamoDB for a managed NoSQL database service, and Amazon Redshift for near-real-time reports and dashboards. Deloitte’s solution enabled “at scale” processing of 1 million transactions/day and up to 2K transactions/minute. It provided flexibility and scalability, largely eliminated the need for system management, and dramatically reduced operating costs. Moreover, it laid the groundwork for decommissioning legacy systems, anticipated to save at least $1M over 3 years.
ABD211 – Sysco Foods: A Journey from Too Much Data to Curated Insights In this session, we detail Sysco’s journey from a company focused on hindsight-based reporting to one focused on insights and foresight. For this shift, Sysco moved from multiple data warehouses to an AWS ecosystem, including Amazon Redshift, Amazon EMR, AWS Data Pipeline, and more. As the team at Sysco worked with Tableau, they gained agile insight across their business. Learn how Sysco decided to use AWS, how they scaled, and how they became more strategic with the AWS ecosystem and Tableau.
ABD217 – From Batch to Streaming: How Amazon Flex Uses Real-time Analytics to Deliver Packages on Time Reducing the time to get actionable insights from data is important to all businesses, and customers who employ batch data analytics tools are exploring the benefits of streaming analytics. Learn best practices to extend your architecture from data warehouses and databases to real-time solutions. Learn how to use Amazon Kinesis to get real-time data insights and integrate them with Amazon Aurora, Amazon RDS, Amazon Redshift, and Amazon S3. The Amazon Flex team describes how they used streaming analytics in their Amazon Flex mobile app, used by Amazon delivery drivers to deliver millions of packages each month on time. They discuss the architecture that enabled the move from a batch processing system to a real-time system, overcoming the challenges of migrating existing batch data to streaming data, and how to benefit from real-time analytics.
ABD218 – How EuroLeague Basketball Uses IoT Analytics to Engage Fans IoT and big data have made their way out of industrial applications, general automation, and consumer goods, and are now a valuable tool for improving consumer engagement across a number of industries, including media, entertainment, and sports. The low cost and ease of implementation of AWS analytics services and AWS IoT have allowed AGT, a leader in IoT, to develop their IoTA analytics platform. Using IoTA, AGT brought a tailored solution to EuroLeague Basketball for real-time content production and fan engagement during the 2017-18 season. In this session, we take a deep dive into how this solution is architected for secure, scalable, and highly performant data collection from athletes, coaches, and fans. We also talk about how the data is transformed into insights and integrated into a content generation pipeline. Lastly, we demonstrate how this solution can be easily adapted for other industries and applications.
ABD222 – How to Confidently Unleash Data to Meet the Needs of Your Entire Organization Where are you on the spectrum of IT leaders? Are you confident that you’re providing the technology and solutions that consistently meet or exceed the needs of your internal customers? Do your peers at the executive table see you as an innovative technology leader? Innovative IT leaders understand the value of getting data and analytics directly into the hands of decision makers, and into their own. In this session, Daren Thayne, Domo’s Chief Technology Officer, shares how innovative IT leaders are helping drive a culture change at their organizations. See how transformative it can be to have real-time access to all of the data that is relevant to YOUR job (including a complete view of your entire AWS environment), as well as understand how it can help you lead the way in applying that same pattern throughout your entire company.
ABD303 – Developing an Insights Platform – Sysco’s Journey from Disparate Systems to Data Lake and Beyond Sysco has nearly 200 operating companies across its multiple lines of business throughout the United States, Canada, Central/South America, and Europe. As the global leader in food services, Sysco identified the need to streamline the collection, transformation, and presentation of data produced by the distributed units and systems, into a central data ecosystem. Sysco’s Business Intelligence and Analytics team addressed these requirements by creating a data lake with scalable analytics and query engines leveraging AWS services. In this session, Sysco will outline their journey from a hindsight-reporting-focused company to an insights-driven organization. They will cover solution architecture, challenges, and lessons learned from deploying a self-service insights platform. They will also walk through the design patterns they used and how they designed the solution to provide predictive analytics using Amazon Redshift Spectrum, Amazon S3, Amazon EMR, AWS Glue, Amazon Elasticsearch Service and other AWS services.
ABD309 – How Twilio Scaled Its Data-Driven Culture As a leading cloud communications platform, Twilio has always been strongly data-driven. But as headcount and data volumes grew—and grew quickly—they faced many new challenges. One-off, static reports work when you’re a small startup, but how do you support a growth stage company to a successful IPO and beyond? Today, Twilio’s data team relies on AWS and Looker to provide data access to 700 colleagues. Departments have the data they need to make decisions, and cloud-based scale means they get answers fast. Data delivers real-business value at Twilio, providing a 360-degree view of their customer, product, and business. In this session, you hear firsthand stories directly from the Twilio data team and learn real-world tips for fostering a truly data-driven culture at scale.
ABD310 – How FINRA Secures Its Big Data and Data Science Platform on AWS FINRA uses big data and data science technologies to detect fraud, market manipulation, and insider trading across US capital markets. As a financial regulator, FINRA analyzes highly sensitive data, so information security is critical. Learn how FINRA secures its Amazon S3 Data Lake and its data science platform on Amazon EMR and Amazon Redshift, while empowering data scientists with tools they need to be effective. In addition, FINRA shares AWS security best practices, covering topics such as AMI updates, micro segmentation, encryption, key management, logging, identity and access management, and compliance.
ABD331 – Log Analytics at Expedia Using Amazon Elasticsearch Service Expedia uses Amazon Elasticsearch Service (Amazon ES) for a variety of mission-critical use cases, ranging from log aggregation to application monitoring and pricing optimization. In this session, the Expedia team reviews how they use Amazon ES and Kibana to analyze and visualize Docker startup logs, AWS CloudTrail data, and application metrics. They share best practices for architecting a scalable, secure log analytics solution using Amazon ES, so you can add new data sources almost effortlessly and get insights quickly.
ABD316 – American Heart Association: Finding Cures to Heart Disease Through the Power of Technology Combining disparate datasets and making them accessible to data scientists and researchers is a prevalent challenge for many organizations, not just in healthcare research. American Heart Association (AHA) has built a data science platform using Amazon EMR, Amazon Elasticsearch Service, and other AWS services, that corrals multiple datasets and enables advanced research on phenotype and genotype datasets, aimed at curing heart diseases. In this session, we present how AHA built this platform and the key challenges they addressed with the solution. We also provide a demo of the platform, and leave you with suggestions and next steps so you can build similar solutions for your use cases.
ABD319 – Tooling Up for Efficiency: DIY Solutions @ Netflix At Netflix, we have traditionally approached cloud efficiency from a human standpoint, whether it be in-person meetings with the largest service teams or manually flipping reservations. Over time, we realized that these manual processes are not scalable as the business continues to grow. Therefore, in the past year, we have focused on building out tools that allow us to make more insightful, data-driven decisions around capacity and efficiency. In this session, we discuss the DIY applications, dashboards, and processes we built to help with capacity and efficiency. We start at the ten thousand foot view to understand the unique business and cloud problems that drove us to create these products, and discuss implementation details, including the challenges encountered along the way. Tools discussed include Picsou, the successor to our AWS billing file cost analyzer; Libra, an easy-to-use reservation conversion application; and cost and efficiency dashboards that relay useful financial context to 50+ engineering teams and managers.
ABD312 – Deep Dive: Migrating Big Data Workloads to AWS Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premises deployments to AWS in order to save costs, increase availability, and improve performance. AWS offers a broad set of analytics services, including solutions for batch processing, stream processing, machine learning, data workflow orchestration, and data warehousing. This session will focus on identifying the components and workflows in your current environment, and providing best practices to migrate these workloads to the right AWS data analytics product. We will cover services such as Amazon EMR, Amazon Athena, Amazon Redshift, Amazon Kinesis, and more. We will also feature Vanguard, an American investment management company based in Malvern, Pennsylvania with over $4.4 trillion in assets under management. Ritesh Shah, Sr. Program Manager for Cloud Analytics Program at Vanguard, will describe how they orchestrated their migration to AWS analytics services, including Hadoop and Spark workloads to Amazon EMR. Ritesh will highlight the technical challenges they faced and overcame along the way, as well as share common recommendations and tuning tips to accelerate the time to production.
ABD402 – How Esri Optimizes Massive Image Archives for Analytics in the Cloud Petabyte-scale archives of satellite, plane, and drone imagery continue to grow exponentially. These archives mostly exist as semi-structured data, but they are only valuable when accessed and processed by a wide range of products for both visualization and analysis. This session provides an overview of how ArcGIS indexes and structures data so that any part of it can be quickly accessed, processed, and analyzed by reading only the minimum amount of data needed for the task. In this session, we share best practices for structuring and compressing massive datasets in Amazon S3, so they can be analyzed efficiently. We also review a number of different image formats, including GeoTIFF (used for the Public Datasets on AWS program, Landsat on AWS), cloud-optimized GeoTIFF, MRF, and CRF, as well as different compression approaches to show the effect on processing performance. Finally, we provide examples of how this technology has been used to help image processing and analysis for the response to Hurricane Harvey.
ABD329 – A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Massive Scale Amazon’s consumer business continues to grow, and so does the volume of data and the number and complexity of the analytics done in support of the business. In this session, we talk about how Amazon.com uses AWS technologies to build a scalable environment for data and analytics. We look at how Amazon is evolving the world of data warehousing with a combination of a data lake and parallel, scalable compute engines such as Amazon EMR and Amazon Redshift.
ABD327 – Migrating Your Traditional Data Warehouse to a Modern Data Lake In this session, we discuss the latest features of Amazon Redshift and Redshift Spectrum, and take a deep dive into its architecture and inner workings. We share many of the recent availability, performance, and management enhancements and how they improve your end user experience. You also hear from 21st Century Fox, who presents a case study of their fast migration from an on-premises data warehouse to Amazon Redshift. Learn how they are expanding their data warehouse to a data lake that encompasses multiple data sources and data formats. This architecture helps them tie together siloed business units and get actionable 360-degree insights across their consumer base.
MCL202 – Ally Bank & Cognizant: Transforming Customer Experience Using Amazon Alexa Given the increasing popularity of natural language interfaces such as Voice as User technology or conversational artificial intelligence (AI), Ally® Bank was looking to interact with customers by enabling direct transactions through conversation or voice. They also needed to develop a capability that allows third parties to connect to the bank securely for information sharing and exchange, using oAuth, an authentication protocol seen as the future of secure banking technology. Cognizant’s Architecture team partnered with Ally Bank’s Enterprise Architecture group and identified the right product for oAuth integration with Amazon Alexa and third-party technologies. In this session, we discuss how building products with conversational AI helps Ally Bank offer an innovative customer experience; increase retention through improved data-driven personalization; increase the efficiency and convenience of customer service; and gain deep insights into customer needs through data analysis and predictive analytics to offer new products and services.
MCL317 – Orchestrating Machine Learning Training for Netflix Recommendations At Netflix, we use machine learning (ML) algorithms extensively to recommend relevant titles to our 100+ million members based on their tastes. Everything on the member home page is an evidence-driven, A/B-tested experience that we roll out backed by ML models. These models are trained using Meson, our workflow orchestration system. Meson distinguishes itself from other workflow engines by handling more sophisticated execution graphs, such as loops and parameterized fan-outs. Meson can schedule Spark jobs, Docker containers, bash scripts, gists of Scala code, and more. Meson also provides a rich visual interface for monitoring active workflows and inspecting execution logs. It has a powerful Scala DSL for authoring workflows, as well as a REST API. In this session, we focus on how Meson trains recommendation ML models in production, and how we have re-architected it to scale up for a growing need for broad ETL applications within Netflix. As a driver for this change, we have had to evolve the persistence layer for Meson. We talk about how we migrated from Cassandra to Amazon RDS backed by Amazon Aurora.
MCL350 – Humans vs. the Machines: How Pinterest Uses Amazon Mechanical Turk’s Worker Community to Improve Machine Learning Ever since the term “crowdsourcing” was coined in 2006, it’s been a buzzword for technology companies and social institutions. In the technology sector, crowdsourcing is instrumental for verifying machine learning algorithms, which, in turn, improves the user’s experience. In this session, we explore how Pinterest adapted to an increased reliance on human evaluation to improve their product, with a focus on how they’ve integrated with Mechanical Turk’s platform. This presentation is aimed at engineers, analysts, program managers, and product managers who are interested in how companies rely on Mechanical Turk’s human evaluation platform to better understand content and improve machine learning algorithms. The discussion focuses on the analysis and product decisions related to building a high-quality crowdsourcing system that takes advantage of Mechanical Turk’s powerful worker community.
ABD201 – Big Data Architectural Patterns and Best Practices on AWS In this session, we simplify big data processing as a data bus comprising various stages: collect, store, process, analyze, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architectures, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
ABD202 – Best Practices for Building Serverless Big Data Applications Serverless technologies let you build and scale applications and services rapidly without the need to provision or manage servers. In this session, we show you how to incorporate serverless concepts into your big data architectures. We explore the concepts behind and benefits of serverless architectures for big data, looking at design patterns to ingest, store, process, and visualize your data. Along the way, we explain when and how you can use serverless technologies to streamline data processing, minimize infrastructure management, and improve agility and robustness, and we share a reference architecture using a combination of cloud and open source technologies to solve your big data problems. Topics include: use cases and best practices for serverless big data applications; leveraging AWS technologies such as Amazon DynamoDB, Amazon S3, Amazon Kinesis, AWS Lambda, Amazon Athena, and Amazon EMR; and serverless ETL, event processing, ad hoc analysis, and real-time analytics.
ABD206 – Building Visualizations and Dashboards with Amazon QuickSight Just as a picture is worth a thousand words, a visual is worth a thousand data points. A key aspect of our ability to gain insights from our data is to look for patterns, and these patterns are often not evident when we simply look at data in tables. The right visualization will help you gain a deeper understanding in a much quicker timeframe. In this session, we will show you how to quickly and easily visualize your data using Amazon QuickSight. We will show you how you can connect to data sources, generate custom metrics and calculations, create comprehensive business dashboards with various chart types, and set up filters and drill-downs to slice and dice the data.
ABD203 – Real-Time Streaming Applications on AWS: Use Cases and Patterns To win in the marketplace and provide differentiated customer experiences, businesses need to be able to use live data in real time to facilitate fast decision making. In this session, you learn common streaming data processing use cases and architectures. First, we give an overview of streaming data and AWS streaming data capabilities. Next, we look at a few customer examples and their real-time streaming applications. Finally, we walk through common architectures and design patterns of top streaming data use cases.
ABD213 – How to Build a Data Lake with AWS Glue Data Catalog As data volumes grow and customers store more data on AWS, they often have valuable data that is not easily discoverable and available for analytics. The AWS Glue Data Catalog provides a central view of your data lake, making data readily available for analytics. We introduce key features of the AWS Glue Data Catalog and its use cases. Learn how crawlers can automatically discover your data, extract relevant metadata, and add it as table definitions to the AWS Glue Data Catalog. We will also explore the integration between AWS Glue Data Catalog and Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
ABD214 – Real-time User Insights for Mobile and Web Applications with Amazon Pinpoint With customers demanding relevant and real-time experiences across a range of devices, digital businesses are looking to gather user data at scale, understand this data, and respond to customer needs instantly. This requires tools that can record large volumes of user data in a structured fashion, and then instantly make this data available to generate insights. In this session, we demonstrate how you can use Amazon Pinpoint to capture user data in a structured yet flexible manner. Further, we demonstrate how this data can be set up for instant consumption using services like Amazon Kinesis Firehose and Amazon Redshift. We walk through example data based on real world scenarios, to illustrate how Amazon Pinpoint lets you easily organize millions of events, record them in real-time, and store them for further analysis.
ABD223 – IT Innovators: New Technology for Leveraging Data to Enable Agility, Innovation, and Business Optimization Companies of all sizes are looking for technology to efficiently leverage data and their existing IT investments to stay competitive and understand where to find new growth. Regardless of where companies are in their data-driven journey, they face greater demands for information by customers, prospects, partners, vendors and employees. All stakeholders inside and outside the organization want information on-demand or in “real time”, available anywhere on any device. They want to use it to optimize business outcomes without having to rely on complex software tools or human gatekeepers to relevant information. Learn how IT innovators at companies such as MasterCard, Jefferson Health, and TELUS are using Domo’s Business Cloud to help their organizations more effectively leverage data at scale.
ABD301 – Analyzing Streaming Data in Real Time with Amazon Kinesis Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. In this session, we present an end-to-end streaming data solution using Kinesis Streams for data ingestion, Kinesis Analytics for real-time processing, and Kinesis Firehose for persistence. We review in detail how to write SQL queries using streaming data and discuss best practices to optimize and monitor your Kinesis Analytics applications. Lastly, we discuss how to estimate the cost of the entire system.
ABD302 – Real-Time Data Exploration and Analytics with Amazon Elasticsearch Service and Kibana In this session, we use Apache web logs as an example and show you how to build an end-to-end analytics solution. First, we cover how to configure an Amazon ES cluster and ingest data using Amazon Kinesis Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data. Then we demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we review approaches for generating custom, ad-hoc reports.
ABD304 – Best Practices for Data Warehousing with Amazon Redshift & Redshift Spectrum Most companies are overrun with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we take an in-depth look at how modern data warehousing blends and analyzes all your data, inside and outside your data warehouse without moving the data, to give you deeper insights to run your business. We will cover best practices on how to design optimal schemas, load data efficiently, and optimize your queries to deliver high throughput and performance.
ABD305 – Design Patterns and Best Practices for Data Analytics with Amazon EMR Amazon EMR is one of the largest Hadoop operators in the world, enabling customers to run ETL, machine learning, real-time processing, data science, and low-latency SQL at petabyte scale. In this session, we introduce you to Amazon EMR design patterns such as using Amazon S3 instead of HDFS, taking advantage of both long and short-lived clusters, and other Amazon EMR architectural best practices. We talk about lowering cost with Auto Scaling and Spot Instances, and security best practices for encryption and fine-grained access control. Finally, we dive into some of our recent launches to keep you current on our latest features.
ABD307 – Deep Analytics for Global AWS Marketing Organization To meet the needs of the global marketing organization, the AWS marketing analytics team built a scalable platform that allows the data science team to deliver custom econometric and machine learning models for end user self-service. To meet data security standards, we use end-to-end data encryption and different AWS services such as Amazon Redshift, Amazon RDS, Amazon S3, Amazon EMR with Apache Spark and Auto Scaling. In this session, you see real examples of how we have scaled and automated critical analysis, such as calculating the impact of marketing programs like re:Invent and prioritizing leads for our sales teams.
ABD311 – Deploying Business Analytics at Enterprise Scale with Amazon QuickSight One of the biggest tradeoffs customers usually make when deploying BI solutions at scale is agility versus governance. Large-scale BI implementations with the right governance structure can take months to design and deploy. In this session, learn how you can avoid making this tradeoff using Amazon QuickSight. Learn how to easily deploy Amazon QuickSight to thousands of users using Active Directory and Federated SSO, while securely accessing your data sources in Amazon VPCs or on-premises. We also cover how to control access to your datasets, implement row-level security, create scheduled email reports, and audit access to your data.
ABD315 – Building Serverless ETL Pipelines with AWS Glue Organizations need to gain insight and knowledge from a growing number of Internet of Things (IoT), APIs, clickstreams, unstructured and log data sources. However, organizations are also often limited by legacy data warehouses and ETL processes that were designed for transactional data. In this session, we introduce key ETL features of AWS Glue and cover common use cases ranging from scheduled nightly data warehouse loads to near real-time, event-driven ETL flows for your data lake. We discuss how to build scalable, efficient, and serverless ETL pipelines using AWS Glue. Additionally, Merck will share how they built an end-to-end ETL pipeline for their application release management system, and launched it in production in less than a week using AWS Glue.
ABD318 – Architecting a data lake with Amazon S3, Amazon Kinesis, and Amazon Athena Learn how to architect a data lake where different teams within your organization can publish and consume data in a self-service manner. As organizations aim to become more data-driven, data engineering teams have to build architectures that can cater to the needs of diverse users – from developers, to business analysts, to data scientists. Each of these user groups employs different tools, has different data needs, and accesses data in different ways. In this talk, we will dive deep into assembling a data lake using Amazon S3, Amazon Kinesis, Amazon Athena, Amazon EMR, and AWS Glue. The session will feature Mohit Rao, Architect and Integration lead at Atlassian, the maker of products such as JIRA, Confluence, and Stride. First, we will look at a couple of common architectures for building a data lake. Then we will show how Atlassian built a self-service data lake, where any team within the company can publish a dataset to be consumed by a broad set of users.
Companies have valuable data that they may not be analyzing due to the complexity, scalability, and performance issues of loading the data into their data warehouse. However, with the right tools, you can extend your analytics to query data in your data lake—with no loading required. Amazon Redshift Spectrum extends the analytic power of Amazon Redshift beyond data stored in your data warehouse to run SQL queries directly against vast amounts of unstructured data in your Amazon S3 data lake. This gives you the freedom to store your data where you want, in the format you want, and have it available for analytics when you need it. Join a discussion with AWS solution architects to ask questions.
ABD330 – Combining Batch and Stream Processing to Get the Best of Both Worlds Today, many architects and developers are looking to build solutions that integrate batch and real-time data processing, and deliver the best of both approaches. Lambda architecture (not to be confused with the AWS Lambda service) is a design pattern that leverages both batch and real-time processing within a single solution to meet the latency, accuracy, and throughput requirements of big data use cases. Come join us for a discussion on how to implement Lambda architecture (batch, speed, and serving layers) and best practices for data processing, loading, and performance tuning.
ABD335 – Real-Time Anomaly Detection Using Amazon Kinesis Amazon Kinesis Analytics offers a built-in machine learning algorithm that you can use to easily detect anomalies in your VPC network traffic and improve security monitoring. Join us for an interactive discussion on how to stream your VPC flow Logs to Amazon Kinesis Streams and identify anomalies using Kinesis Analytics.
ABD339 – Deep Dive and Best Practices for Amazon Athena Amazon Athena is an interactive query service that enables you to process data directly from Amazon S3 without the need for infrastructure. Since its launch at re:Invent 2016, several organizations have adopted Athena as the central tool to process all their data. In this talk, we dive deep into the most common use cases, including working with other AWS services. We review the best practices for creating tables and partitions and performance optimizations. We also dive into how Athena handles security, authorization, and authentication. Lastly, we hear from a customer who has reduced costs and improved time to market by deploying Athena across their organization.
We look forward to meeting you at re:Invent 2017!
About the Author
Roy Ben-Alta is a solution architect and principal business development manager at Amazon Web Services in New York. He focuses on data analytics and ML technologies, working with AWS customers to build innovative data-driven products.
Contributed by Tiffany Jernigan, Developer Advocate for Amazon ECS
Get ready for takeoff!
We made sure that this year’s re:Invent is chock-full of containers: there are over 40 sessions! New to containers? No problem, we have several introductory sessions to help you dip your toes in. Been using containers for years and know the ins and outs? Don’t miss our technical deep dives and interactive chalk talks led by container experts.
If you can’t make it to Las Vegas, you can catch the keynotes and session recaps from our livestream and on Twitch.
Session types
Not everyone learns the same way, so we have multiple types of breakout content:
Birds of a Feather: An interactive discussion with industry leaders about containers on AWS.
Breakout sessions: 60-minute presentations about building on AWS. Sessions are delivered by both AWS experts and customers and span all content levels.
Workshops: 2.5-hour, hands-on sessions that teach how to build on AWS. AWS credits are provided. Bring a laptop, and have an active AWS account.
Chalk Talks: 1-hour, highly interactive sessions with a smaller audience. They begin with a short lecture delivered by an AWS expert, followed by a discussion with the audience.
Session levels
Whether you’re new to containers or you’ve been using them for years, you’ll find useful information at every level.
Introductory: Sessions are focused on providing an overview of AWS services and features, with the assumption that attendees are new to the topic.
Advanced: Sessions dive deeper into the selected topic. Presenters assume that the audience has some familiarity with the topic, but may or may not have direct experience implementing a similar solution.
Expert: Sessions are for attendees who are deeply familiar with the topic, have implemented a solution on their own already, and are comfortable with how the technology works across multiple services, architectures, and implementations.
Session locations
All container sessions are located in the Aria Resort.
MONDAY 11/27
Breakout sessions
Level 200 (Introductory)
CON202 – Getting Started with Docker and Amazon ECS By packaging software into standardized units, Docker gives code everything it needs to run, ensuring consistency from your laptop all the way into production. But once you have your code ready to ship, how do you run and scale it in the cloud? In this session, you become comfortable running containerized services in production using Amazon ECS. We cover container deployment, cluster management, service auto-scaling, service discovery, secrets management, logging, monitoring, security, and other core concepts. We also cover integrated AWS services and supplementary services that you can take advantage of to run and scale container-based services in the cloud.
Chalk talks
Level 200 (Introductory)
CON211 – Reducing your Compute Footprint with Containers and Amazon ECS Tomas Riha, platform architect for Volvo, shows how Volvo transitioned its WirelessCar platform from using Amazon EC2 virtual machines to containers running on Amazon ECS, significantly reducing cost. Tomas dives deep into the architecture that Volvo used to achieve the migration in under four months, including Amazon ECS, Amazon ECR, Elastic Load Balancing, and AWS CloudFormation.
CON212 – Anomaly Detection Using Amazon ECS, AWS Lambda, and Amazon EMR Learn about the architecture that Cisco CloudLock uses to enable automated security and compliance checks throughout the entire development lifecycle, from the first line of code through runtime. It includes integration with IAM roles, Amazon VPC, and AWS KMS.
Level 400 (Expert)
CON410 – Advanced CICD with Amazon ECS Control Plane Mohit Gupta, product and engineering lead for Clever, demonstrates how to extend the Amazon ECS control plane to optimize management of container deployments and how the control plane can be broadly applied to take advantage of new AWS services. This includes ark, an AWS CLI-based deployment tool for Amazon ECS; Dapple, a Slack-based automation system for deployments and notifications; and Kayvee, log and event routing libraries based on Amazon Kinesis.
Workshops
Level 200 (Introductory)
CON209 – Interstella 8888: Learn How to Use Docker on AWS Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. Join this workshop to get hands-on experience with Docker as you containerize Interstella 8888’s aging monolithic application and deploy it using Amazon ECS.
CON213 – Hands-on Deployment of Kubernetes on AWS In this workshop, attendees get hands-on experience using Kubernetes and Kops (Kubernetes Operations), as described in our recent blog post. Attendees learn how to provision a cluster, assign role-based permissions and security, and launch a container. If you’re interested in learning best practices for running Kubernetes on AWS, don’t miss this workshop.
TUESDAY 11/28
Breakout Sessions
Level 200 (Introductory)
CON206 – Docker on AWS In this session, Docker Technical Staff Member Patrick Chanezon discusses how Finnish Rail, the national train system for Finland, is using Docker on Amazon Web Services to modernize their customer-facing applications, from ticket sales to reservations. Patrick also shares the state of Docker development and adoption on AWS, explaining the opportunities and implications of efforts such as Project Moby and Docker EE, and how developers can use and contribute to Docker projects.
CON208 – Building Microservices on AWS Increasingly, organizations are turning to microservices to help them empower autonomous teams, letting them innovate and ship software faster than ever before. But implementing a microservices architecture comes with a number of new challenges that need to be dealt with. Chief among these is finding an appropriate platform to help manage a growing number of independently deployable services. In this session, Sam Newman, author of Building Microservices and a renowned expert in microservices strategy, discusses strategies for building scalable and robust microservices architectures. He also tells you how to choose the right platform for building microservices, and about common challenges and mistakes organizations make when they move to microservices architectures.
Level 300 (Advanced)
CON302 – Building a CICD Pipeline for Containers on AWS Containers can make it easier to scale applications in the cloud, but how do you set up your CICD workflow to automatically test and deploy code to containerized apps? In this session, we explore how developers can build effective CICD workflows to manage their containerized code deployments on AWS.
Ajit Zadgaonkar, Director of Engineering and Operations at Edmunds, walks through best practices for CICD architectures used by his team to deploy containers. We also dive deep into topics such as how to create an accessible CICD platform and architect for safe blue/green deployments.
CON307 – Building Effective Container Images Sick of getting paged at 2am and wondering “where did all my disk space go?” New Docker users often start with a stock image in order to get up and running quickly, but this can cause problems as your application matures and scales. Creating efficient container images is important to maximize resources, and deliver critical security benefits.
In this session, AWS Sr. Technical Evangelist Abby Fuller covers how to create effective images to run containers in production. This includes an in-depth discussion of how Docker image layers work, things you should think about when creating your images, working with Amazon ECR, and mise en place for installing dependencies. Prakash Janakiraman, Co-Founder and Chief Architect at Nextdoor, discusses high-level and language-specific best practices for building images and how Nextdoor uses these practices to successfully scale their containerized services with a small team.
CON309 – Containerized Machine Learning on AWS Image recognition is a field of deep learning that uses neural networks to recognize the subject and traits for a given image. In Japan, Cookpad uses Amazon ECS to run an image recognition platform on clusters of GPU-enabled EC2 instances. In this session, hear from Cookpad about the challenges they faced building and scaling this advanced, user-friendly service to ensure high-availability and low-latency for tens of millions of users.
CON320 – Monitoring, Logging, and Debugging for Containerized Services As containers become more embedded in the platform, debugging tools, traces, and logs become increasingly important. Nare Hayrapetyan, Senior Software Engineer, and Calvin French-Owen, Senior Technical Officer for Segment, discuss the principles of monitoring and debugging containers and the tools Segment has implemented and built for logging, alerting, metric collection, and debugging of containerized services running on Amazon ECS.
Chalk Talks
Level 300 (Advanced)
CON314 – Automating Zero-Downtime Production Cluster Upgrades for Amazon ECS Containers make it easy to deploy new code into production to update the functionality of a service, but what happens when you need to update the Amazon EC2 compute instances that your containers are running on? In this talk, we dive deep into how to upgrade the Amazon EC2 infrastructure underlying a live production Amazon ECS cluster without affecting service availability. Matt Callanan, Engineering Manager at Expedia, walks through Expedia’s “PRISM” project that safely relocates hundreds of tasks onto new Amazon EC2 instances with zero downtime to applications.
CON322 – Maximizing Amazon ECS for Large-Scale Workloads Head of Mobfox DevOps, David Spitzer, shows how Mobfox used Docker and Amazon ECS to scale the Mobfox services and development teams to achieve low-latency networking and automatic scaling. This session covers Mobfox’s ecosystem architecture, comparing where it stood in 2015 with where it is today, the challenges Mobfox faced in growing their platform, and how they overcame them.
CON323 – Microservices Architectures for the Enterprise Salva Jung, Principal Engineer for Samsung Mobile, shares how Samsung Connect is architected as microservices running on Amazon ECS to securely, stably, and efficiently handle requests from millions of mobile and IoT devices around the world.
CON324 – Windows Containers on Amazon ECS Docker containers are commonly regarded as powerful and portable runtime environments for Linux code, but Docker also offers API and toolchain support for running Windows Server in containers. In this talk, we discuss the various options for running Windows-based applications in containers on AWS.
CON326 – Remote Sensing and Image Processing on AWS Learn how Encirca services by DuPont Pioneer uses Amazon ECS powered by GPU instances and Amazon EC2 Spot Instances to run proprietary image-processing algorithms against satellite imagery. Mark Lanning and Ethan Harstad, engineers at DuPont Pioneer, show how this architecture has allowed them to process satellite imagery multiple times a day for each agricultural field in the United States in order to identify crop health changes.
Workshops
Level 300 (Advanced)
CON317 – Advanced Container Management at Catsndogs.lol Catsndogs.lol is a (fictional) company that needs help deploying and scaling its container-based application. During this workshop, attendees join the new DevOps team at Catsndogs.lol and help the company manage its applications using Amazon ECS and release new features to make customers happier than ever.
Attendees get hands-on with service and container-instance auto-scaling, Spot Fleet integration, container placement strategies, service discovery, secrets management with AWS Systems Manager Parameter Store, time-based and event-based scheduling, and automated deployment pipelines. If you are a developer interested in learning more about how Amazon ECS can accelerate your application development and deployment workflows, or if you are a systems administrator or DevOps person interested in understanding how Amazon ECS can simplify the operational model associated with running containers at scale, then this workshop is for you. You should have basic familiarity with Amazon ECS, Amazon EC2, and IAM.
Additional requirements:
The AWS CLI or AWS Tools for PowerShell installed
An AWS account with administrative permissions (including the ability to create IAM roles and policies) created at least 24 hours in advance.
WEDNESDAY 11/29
Birds of a Feather (BoF)
CON01 – Birds of a Feather: Containers and Open Source at AWS Cloud native architectures take advantage of on-demand delivery, global deployment, elasticity, and higher-level services to enable developer productivity and business agility. Open source is a core part of making cloud native possible for everyone. In this session, we welcome thought leaders from the CNCF, Docker, and AWS to discuss the cloud’s direction for growth and enablement of the open source community. We also discuss how AWS is integrating open source code into its container services and its contributions to open source projects.
Breakout Sessions
Level 300 (Advanced)
CON308 – Mastering Kubernetes on AWS Much progress has been made on bootstrapping a cluster since Kubernetes’ first commit; it now takes only a matter of minutes to go from zero to a running cluster on Amazon Web Services. However, evolving a simple Kubernetes architecture to be ready for production in a large enterprise can quickly become overwhelming with options for configuration and customization.
In this session, Arun Gupta, Open Source Strategist for AWS, and Raffaele Di Fazio, software engineer at leading European fashion platform Zalando, show the common practices for running Kubernetes on AWS and share insights from experience operating tens of Kubernetes clusters in production on AWS. We cover options and recommendations on how to install and manage clusters, configure high availability, perform rolling upgrades, and handle disaster recovery, as well as continuous integration and deployment of applications, logging, and security.
CON310 – Moving to Containers: Building with Docker and Amazon ECS If you’ve ever considered moving part of your application stack to containers, don’t miss this session. We cover best practices for containerizing your code, implementing automated service scaling and monitoring, and setting up automated CI/CD pipelines with fail-safe deployments. Manjeeva Silva and Thilina Gunasinghe show how McDonald’s implemented their home delivery platform in four months using Docker containers and Amazon ECS to serve tens of thousands of customers.
Level 400 (Expert)
CON402 – Advanced Patterns in Microservices Implementation with Amazon ECS Scaling a microservice-based infrastructure can be challenging in terms of both technical implementation and developer workflow. In this talk, AWS Solutions Architect Pierre Steckmeyer is joined by Will McCutchen, Architect at BuzzFeed, to discuss Amazon ECS as a platform for building a robust infrastructure for microservices. We look at the key attributes of microservice architectures and how Amazon ECS supports these requirements in production, from configuration to sophisticated workload scheduling to networking capabilities to resource optimization. We also examine what it takes to build an end-to-end platform on top of the wider AWS ecosystem, and what it’s like to migrate a large engineering organization from a monolithic approach to microservices.
CON404 – Deep Dive into Container Scheduling with Amazon ECS As your application’s infrastructure grows and scales, well-managed container scheduling is critical to ensuring high availability and resource optimization. In this session, we deep dive into the challenges and opportunities around container scheduling, as well as the different tools available within Amazon ECS and AWS to carry out efficient container scheduling. We discuss patterns for container scheduling available with Amazon ECS, the Blox scheduling framework, and how you can customize and integrate third-party scheduler frameworks to manage container scheduling on Amazon ECS.
Chalk Talks
Level 300 (Advanced)
CON312 – Building a Selenium Fleet on the Cheap with Amazon ECS with Spot Fleet Roberto Rivera and Matthew Wedgwood, engineers at RetailMeNot, give a practical overview of setting up a fleet of Selenium nodes running on Amazon ECS with Spot Fleet. They discuss the challenges of running Selenium with high availability at minimum cost, using Amazon ECS container introspection to connect the Selenium Hub with its nodes.
CON315 – Virtually There: Building a Render Farm with Amazon ECS Learn how 8i Corp scales its multi-tenanted, volumetric render farm up to thousands of instances using AWS, Docker, and an API-driven infrastructure. This render farm enables them to turn the video footage from an array of synchronized cameras into a photo-realistic hologram capable of playback on a range of devices, from mobile phones to high-end head mounted displays. Join Owen Evans, VP of Engineering for 8i, as they dive deep into how 8i’s rendering infrastructure is built and maintained by just a handful of people and powered by Amazon ECS.
CON325 – Developing Microservices – from Your Laptop to the Cloud Wesley Chow, Staff Engineer at Adroll, shows how his team extends Amazon ECS by enabling local development capabilities. Hologram, Adroll’s local development program, brings the capabilities of the Amazon EC2 instance metadata service to non-EC2 hosts, so that developers can run the same software on local machines with the same credentials source as in production.
CON327 – Patterns and Considerations for Service Discovery Roven Drabo, head of cloud operations at Kaplan Test Prep, illustrates Kaplan’s complete container automation solution using Amazon ECS along with how his team uses NGINX and HashiCorp Consul to provide an automated approach to service discovery and container provisioning.
CON328 – Building a Development Platform on Amazon ECS Quinton Anderson, Head of Engineering for Commonwealth Bank of Australia, walks through how they migrated their internal development and deployment platform from Mesos/Marathon to Amazon ECS. The platform uses a custom DSL to abstract a layered application architecture, in a way that makes it easy to plug or replace new implementations into each layer in the stack.
Workshops
Level 300 (Advanced)
CON318 – Interstella 8888: Monolith to Microservices with Amazon ECS Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. Join this workshop to get hands-on experience deploying Docker containers as you break Interstella 8888’s aging monolithic application into containerized microservices. Using Amazon ECS and an Application Load Balancer, you create API-based microservices and deploy them leveraging integrations with other AWS services.
CON332 – Build a Java Spring Application on Amazon ECS This workshop teaches you how to lift and shift existing Spring and Spring Cloud applications onto the AWS platform. Learn how to build a Spring application container, understand bootstrap secrets, push container images to Amazon ECR, and deploy the application to Amazon ECS. Then, learn how to configure the deployment for production.
THURSDAY 11/30
Breakout Sessions
Level 200 (Introductory)
CON201 – Containers on AWS – State of the Union Just over four years after the first public release of Docker, and three years to the day after the launch of Amazon ECS, the use of containers has surged to run a significant percentage of production workloads at startups and enterprise organizations. Join Deepak Singh, General Manager of Amazon Container Services, as he covers the state of containerized application development and deployment trends, new container capabilities on AWS that are available now, options for running containerized applications on AWS, and how AWS customers successfully run container workloads in production.
Level 300 (Advanced)
CON304 – Batch Processing with Containers on AWS Batch processing is useful to analyze large amounts of data. But configuring and scaling a cluster of virtual machines to process complex batch jobs can be difficult. In this talk, we show how to use containers on AWS for batch processing jobs that can scale quickly and cost-effectively. We also discuss AWS Batch, our fully managed batch-processing service. You also hear from GoPro and Here about how they use AWS to run batch processing jobs at scale including best practices for ensuring efficient scheduling, fine-grained monitoring, compute resource automatic scaling, and security for your batch jobs.
Level 400 (Expert)
CON406 – Architecting Container Infrastructure for Security and Compliance While organizations gain agility and scalability when they migrate to containers and microservices, they also benefit from compliance and security, advantages that are often overlooked. In this session, Kelvin Zhu, lead software engineer at Okta, joins Mitch Beaumont, enterprise solutions architect at AWS, to discuss security best practices for containerized infrastructure. Learn how Okta built their development workflow with an emphasis on security through testing and automation. Dive deep into how containers enable automated security and compliance checks throughout the development lifecycle. Also understand best practices for implementing AWS security and secrets management services for any containerized service architecture.
Chalk Talks
Level 300 (Advanced)
CON329 – Full Software Lifecycle Management for Containers Running on Amazon ECS Learn how The Washington Post uses Amazon ECS to run Arc Publishing, a digital journalism platform that powers The Washington Post and a growing number of major media websites. Amazon ECS enabled The Washington Post to containerize their existing microservices architecture, avoiding a complete rewrite that would have delayed the platform’s launch by several years. In this session, Jason Bartz, Technical Architect at The Washington Post, discusses the platform’s architecture. He addresses the challenges of optimizing Arc Publishing’s workload, and managing the application lifecycle to support 2,000 containers running on more than 50 Amazon ECS clusters.
CON330 – Running Containerized HIPAA Workloads on AWS Nihar Pasala, Engineer at Aetion, discusses the Aetion Evidence Platform, a system for generating the real-world evidence used by healthcare decision makers to implement value-based care. This session discusses the architecture Aetion uses to run HIPAA workloads using containers on Amazon ECS, best practices, and learnings.
Level 400 (Expert)
CON408 – Building a Machine Learning Platform Using Containers on AWS DeepLearni.ng develops and implements machine learning models for complex enterprise applications. In this session, Thomas Rogers, Engineer for DeepLearni.ng, discusses how they worked with Scotiabank to leverage Amazon ECS, Amazon ECR, Docker, GPU-accelerated Amazon EC2 instances, and TensorFlow to develop a retail risk model that helps manage payment collections for millions of Canadian credit card customers.
Workshops
Level 300 (Advanced)
CON319 – Interstella 8888: CICD for Containers on AWS Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. Join this workshop to learn how to set up a CI/CD pipeline for containerized microservices. You get hands-on experience deploying Docker container images using Amazon ECS, AWS CloudFormation, AWS CodeBuild, and AWS CodePipeline, automating everything from code check-in to production.
FRIDAY 12/1
Breakout Sessions
Level 400 (Expert)
CON405 – Moving to Amazon ECS – the Not-So-Obvious Benefits If you ask 10 teams why they migrated to containers, you will likely get answers like ‘developer productivity’, ‘cost reduction’, and ‘faster scaling’. But teams often find there are several other ‘hidden’ benefits to using containers for their services. In this talk, Franziska Schmidt, Platform Engineer at Mapbox, and Yaniv Donenfeld from AWS discuss the obvious and not-so-obvious benefits of moving to a containerized architecture. These include using Docker and Amazon ECS to achieve shared libraries for dev teams, separating private infrastructure from shareable code, and making it easier for non-ops engineers to run services.
Chalk Talks
Level 300 (Advanced)
CON331 – Deploying a Regulated Payments Application on Amazon ECS Travelex discusses how they built an FCA-compliant international payments service using a microservices architecture on AWS. This chalk talk covers the challenges of designing and operating an Amazon ECS-based PaaS in a regulated environment using a DevOps model.
Workshops
Level 400 (Expert)
CON407 – Interstella 8888: Advanced Microservice Operations Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. In this workshop, you help Interstella 8888 build a modern microservices-based logistics system to save the company from financial ruin. We give you the hands-on experience you need to run microservices in the real world. This includes implementing advanced container scheduling and scaling to deal with variable service requests, implementing a service mesh, issue tracing with AWS X-Ray, container and instance-level logging with Amazon CloudWatch, and load testing.
Know before you go
Want to brush up on your container knowledge before re:Invent? Here are some helpful resources to get started:
Driven by customer demand and made possible by on-going advances in the state of the art, we’ve come a long way since the original m1.small instance that we launched in 2006, with instances that emphasize compute power, burstable performance, memory size, local storage, and accelerated computing.
The New P3
Today we are making the next generation of GPU-powered EC2 instances available in four AWS Regions. Powered by up to eight NVIDIA Tesla V100 GPUs, the P3 instances are designed to handle compute-intensive machine learning, deep learning, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, and genomics workloads.
P3 instances use customized Intel Xeon E5-2686v4 processors running at up to 2.7 GHz. They are available in three sizes (all VPC-only and EBS-only):
| Model | NVIDIA Tesla V100 GPUs | GPU Memory | NVIDIA NVLink | vCPUs | Main Memory | Network Bandwidth | EBS Bandwidth |
|---|---|---|---|---|---|---|---|
| p3.2xlarge | 1 | 16 GiB | n/a | 8 | 61 GiB | Up to 10 Gbps | 1.5 Gbps |
| p3.8xlarge | 4 | 64 GiB | 200 GBps | 32 | 244 GiB | 10 Gbps | 7 Gbps |
| p3.16xlarge | 8 | 128 GiB | 300 GBps | 64 | 488 GiB | 25 Gbps | 14 Gbps |
Each of the NVIDIA GPUs is packed with 5,120 CUDA cores and another 640 Tensor cores and can deliver up to 125 TFLOPS of mixed-precision floating point, 15.7 TFLOPS of single-precision floating point, and 7.8 TFLOPS of double-precision floating point. On the two larger sizes, the GPUs are connected together via NVIDIA NVLink 2.0 running at a total data rate of up to 300 GBps. This allows the GPUs to exchange intermediate results and other data at high speed, without having to move it through the CPU or the PCI-Express fabric.
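As a back-of-the-envelope check on the 125 TFLOPS figure, the per-GPU math works out from the Tensor core count. Note that the ~1,530 MHz boost clock below comes from NVIDIA’s published V100 specifications, not from this post:

```python
# Rough derivation of the per-GPU 125 TFLOPS mixed-precision figure.
# Each Tensor core does one 4x4x4 matrix multiply-accumulate per clock:
# 64 fused multiply-adds, counted as 128 floating point operations.
tensor_cores = 640
flops_per_core_per_clock = 4 * 4 * 4 * 2
boost_clock_hz = 1.53e9  # ~1,530 MHz boost clock (NVIDIA V100 spec; assumption)

tflops = tensor_cores * flops_per_core_per_clock * boost_clock_hz / 1e12
print(f"{tflops:.0f} TFLOPS")  # ~125 TFLOPS
```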
What’s a Tensor Core?
I had not heard the term Tensor core before starting to write this post. According to this very helpful post on the NVIDIA Blog, Tensor cores are designed to speed up the training and inference of large, deep neural networks. Each core is able to quickly and efficiently multiply a pair of 4×4 half-precision (also known as FP16) matrices together, add the resulting 4×4 matrix to another half or single-precision (FP32) matrix, and store the resulting 4×4 matrix in either half or single-precision form. (NVIDIA’s blog post includes a diagram of this operation.)
This operation is in the innermost loop of the training process for a deep neural network, and is an excellent example of how today’s NVIDIA GPU hardware is purpose-built to address a very specific market need. By the way, the mixed-precision qualifier on the Tensor core performance means that it is flexible enough to work with a combination of 16-bit and 32-bit floating point values.
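To make the mixed-precision numerics concrete, here is a minimal NumPy sketch of one Tensor-core-style fused operation. This illustrates the arithmetic only; it is not how the hardware is actually programmed:

```python
import numpy as np

# Emulate one Tensor-core-style fused operation, D = A x B + C,
# on 4x4 matrices: A and B are half precision (FP16), and the
# multiply-accumulate happens in single precision (FP32).
A = np.random.randn(4, 4).astype(np.float16)
B = np.random.randn(4, 4).astype(np.float16)
C = np.random.randn(4, 4).astype(np.float32)

D = A.astype(np.float32) @ B.astype(np.float32) + C  # FP32 accumulation

# The result can be stored in either half or single precision.
D_fp16 = D.astype(np.float16)
print(D.dtype, D_fp16.dtype)  # float32 float16
```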
Performance in Perspective
I always like to put raw performance numbers into a real-world perspective so that they are easier to relate to and (hopefully) more meaningful. This turned out to be surprisingly difficult, given that the eight NVIDIA Tesla V100 GPUs on a single p3.16xlarge can do 125 trillion single-precision floating point multiplications per second.
Let’s go back to the dawn of the microprocessor era, and consider the Intel 8080A chip that powered the MITS Altair that I bought in the summer of 1977. With a 2 MHz clock, it was able to do about 832 multiplications per second (I used this data and corrected it for the faster clock speed). The p3.16xlarge is roughly 150 billion times faster. However, just 1.2 billion seconds have gone by since that summer. In other words, I can do 100x more calculations today in one second than my Altair could have done in the last 40 years!
What about the innovative 8087 math coprocessor that was an optional accessory for the IBM PC that was announced in the summer of 1981? With a 5 MHz clock and purpose-built hardware, it was able to do about 52,632 multiplications per second. 1.14 billion seconds have elapsed since then, and the p3.16xlarge is 2.37 billion times faster, so the poor little PC would be barely halfway through a calculation that would run for 1 second today.
Ok, how about a Cray-1? First delivered in 1976, this supercomputer was able to perform vector operations at 160 MFLOPS, making the p3.16xlarge 781,000 times faster. It could have iterated on some interesting problem 1,500 times over the years since it was introduced.
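These back-of-the-envelope comparisons are easy to reproduce from the multiply rates and dates quoted above; a quick sketch:

```python
# Reproduce the perspective comparisons above (rates and dates as quoted).
P3_MULS_PER_SEC = 125e12  # p3.16xlarge
SECONDS_PER_YEAR = 365.25 * 24 * 3600

machines = {
    # name: (multiplications per second, year introduced)
    "Altair 8080A": (832, 1977),
    "IBM PC 8087": (52_632, 1981),
    "Cray-1": (160e6, 1976),
}

for name, (rate, year) in machines.items():
    speedup = P3_MULS_PER_SEC / rate          # how many times faster the P3 is
    elapsed = (2017 - year) * SECONDS_PER_YEAR
    # Fraction of one "P3 second" of work the old machine finished since launch
    p3_seconds = elapsed / speedup
    print(f"{name}: P3 is {speedup:,.0f}x faster; "
          f"{p3_seconds:.2f} P3-seconds of work since {year}")
```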
Comparisons between the P3 and today’s scale-out supercomputers are harder to make, given that you can think of the P3 as a step-and-repeat component of a supercomputer that you can launch on an as-needed basis.
Run One Today
In order to take full advantage of the NVIDIA Tesla V100 GPUs and the Tensor cores, you will need to use CUDA 9 and cuDNN 7. These drivers and libraries have already been added to the newest versions of the Windows AMIs and will be included in an updated Amazon Linux AMI that is scheduled for release on November 7th. New packages are already available in our repos if you want to install them on your existing Amazon Linux AMI.
The newest AWS Deep Learning AMIs come preinstalled with the latest releases of Apache MXNet, Caffe2, and TensorFlow (each with support for the NVIDIA Tesla V100 GPUs), and will be updated to support P3 instances with other machine learning frameworks such as Microsoft Cognitive Toolkit and PyTorch as soon as these frameworks release support for the NVIDIA Tesla V100 GPUs. You can also use the NVIDIA Volta Deep Learning AMI for NGC.
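As a quick sanity check after launching one of these AMIs, you can confirm that the framework sees all of the GPUs. A minimal sketch using the TensorFlow 1.x API of that era:

```python
# List the GPUs visible to TensorFlow (TF 1.x API) on a P3 instance.
from tensorflow.python.client import device_lib

devices = device_lib.list_local_devices()
gpus = [d for d in devices if d.device_type == "GPU"]

print(f"Found {len(gpus)} GPU(s)")
for gpu in gpus:
    # physical_device_desc names the device, e.g. a Tesla V100
    print(gpu.name, gpu.physical_device_desc)
```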
P3 instances are available in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions in On-Demand, Spot, Reserved Instance, and Dedicated Host form.
The recent addition of Xilinx FPGAs to AWS Cloud compute offerings is one way that AWS is enabling global growth in the areas of advanced analytics, deep learning, and AI. The customized F1 servers use pooled accelerators, enabling interconnectivity of up to 8 FPGAs, each one including 64 GiB of ECC-protected DDR4 memory and a dedicated PCIe x16 connection. That makes this a powerful engine with the capacity to process advanced analytical applications at scale, at a significantly faster rate. For example, AWS commercial partner Edico Genome is able to achieve an approximately 30X speedup in analyzing whole genome sequencing datasets using their DRAGEN platform powered with F1 instances.
While the availability of FPGA F1 compute on demand provides clear accessibility and cost advantages, many mainstream users are still finding that the “threshold to entry” in developing or running FPGA-accelerated simulations is too high. Researchers at the UC Berkeley RISE Lab have developed FireSim, an open-source resource powered by Amazon EC2 F1 instances that lowers that entry bar and makes it easier for everyone to leverage the power of an FPGA-accelerated compute environment. Whether you are part of a small start-up development team or working at large datacenter scale, hardware-software co-design enables faster time-to-deployment, lower costs, and more predictable performance. We are excited to feature FireSim in this post from Sagar Karandikar and his colleagues at UC Berkeley.
―Mia Champion, Sr. Data Scientist, AWS
Figure 1. Mapping an 8-node FireSim cluster simulation to Amazon EC2 F1.
As traditional hardware scaling nears its end, the data centers of tomorrow are trending towards heterogeneity, employing custom hardware accelerators and increasingly high-performance interconnects. Prototyping new hardware at scale has traditionally been either extremely expensive, or very slow. In this post, I introduce FireSim, a new hardware simulation platform under development in the computer architecture research group at UC Berkeley that enables fast, scalable hardware simulation using Amazon EC2 F1 instances.
FireSim benefits both hardware and software developers working on new rack-scale systems: software developers can use the simulated nodes with new hardware features as they would use a real machine, while hardware developers have full control over the hardware being simulated and can run real software stacks while hardware is still under development. In conjunction with this post, we’re releasing the first public demo of FireSim, which lets you deploy your own 8-node simulated cluster on an F1 Instance and run benchmarks against it. This demo simulates a pre-built “vanilla” cluster, but demonstrates FireSim’s high performance and usability.
Why FireSim + F1?
FPGA-accelerated hardware simulation is by no means a new concept. However, previous attempts to use FPGAs for simulation have been fraught with usability, scalability, and cost issues. FireSim takes advantage of EC2 F1 and open-source hardware to address the traditional problems with FPGA-accelerated simulation:
Problem #1: FPGA-based simulations have traditionally been expensive, difficult to deploy, and difficult to reproduce. FireSim uses public-cloud infrastructure like F1, which means no upfront cost to purchase and deploy FPGAs. Developers and researchers can distribute pre-built AMIs and AFIs, as in this public demo (more details later in this post), to make experiments easy to reproduce. FireSim also automates most of the work involved in deploying an FPGA simulation, essentially enabling one-click conversion from new RTL to deploying on an FPGA cluster.
Problem #2: FPGA-based simulations have traditionally been difficult (and expensive) to scale. Because FireSim uses F1, users can scale out experiments by spinning up additional EC2 instances, rather than spending hundreds of thousands of dollars on large FPGA clusters.
Problem #3: Finding open hardware to simulate has traditionally been difficult. Finding open hardware that can run real software stacks is even harder. FireSim simulates RocketChip, an open, silicon-proven, RISC-V-based processor platform, and adds peripherals like a NIC and disk device to build up a realistic system. Processors that implement RISC-V automatically support real operating systems (such as Linux) and even support applications like Apache and Memcached. We provide a custom Buildroot-based FireSim Linux distribution that runs on our simulated nodes and includes many popular developer tools.
Problem #4: Writing hardware in traditional HDLs is time-consuming. Both FireSim and RocketChip use the Chisel HDL, which brings modern programming paradigms to hardware description languages. Chisel greatly simplifies the process of building large, highly parameterized hardware components.
How to use FireSim for hardware/software co-design
FireSim drastically improves the process of co-designing hardware and software by acting as a push-button interface for collaboration between hardware developers and systems software developers. The following diagram describes the workflows that hardware and software developers use when working with FireSim.
Figure 2. The FireSim custom hardware development workflow.
The hardware developer’s view:
Write custom RTL for your accelerator, peripheral, or processor modification in a productive language like Chisel.
Run a software simulation of your hardware design in standard gate-level simulation tools for early-stage debugging.
Run FireSim build scripts, which automatically build your simulation, run it through the Vivado toolchain/AWS shell scripts, and publish an AFI.
Deploy your simulation on EC2 F1 using the generated simulation driver and AFI.
Run real software builds released by software developers to benchmark your hardware.
The software developer’s view:
Deploy the AMI/AFI generated by the hardware developer on an F1 instance to simulate a cluster of nodes (or scale out to many F1 nodes for larger simulated core-counts).
Connect to the simulated nodes in the cluster using SSH and boot the Linux distribution included with FireSim. This distribution is easy to customize, and already supports many standard software packages.
Directly prototype your software using the same exact interfaces that the software will see when deployed on the real future system you’re prototyping, with the same performance characteristics as observed from software, even at scale.
FireSim demo v1.0
Figure 3. Cluster topology simulated by FireSim demo v1.0.
This first public demo of FireSim focuses on the aforementioned “software-developer’s view” of the custom hardware development cycle. The demo simulates a cluster of 1 to 8 RocketChip-based nodes, interconnected by a functional network simulation. The simulated nodes work just like “real” machines: they boot Linux, you can connect to them using SSH, and you can run real applications on top. The nodes can see each other (and the EC2 F1 instance on which they’re deployed) on the network and communicate with one another. While the demo currently simulates a pre-built “vanilla” cluster, the entire hardware configuration of these simulated nodes can be modified after FireSim is open-sourced.
In this post, I walk through bringing up a single-node FireSim simulation for experienced EC2 F1 users. For more detailed instructions for new users and instructions for running a larger 8-node simulation, see FireSim Demo v1.0 on Amazon EC2 F1. Both demos walk you through setting up an instance from a demo AMI/AFI and booting Linux on the simulated nodes. The full demo instructions also walk you through an example workload, running Memcached on the simulated nodes, with YCSB as a load generator to demonstrate network functionality.
Deploying the demo on F1
In this release, we provide pre-built binaries for driving simulation from the host and a pre-built AFI that contains the FPGA infrastructure necessary to simulate a RocketChip-based node.
Starting your F1 instances
First, launch an instance using the free FireSim Demo v1.0 product available on the AWS Marketplace on an f1.2xlarge instance. After your instance has booted, log in using the user name centos. On the first login, you should see the message “FireSim network config completed.” This sets up the necessary tap interfaces and bridge on the EC2 instance to enable communicating with the simulated nodes.
AMI contents
The AMI contains a variety of tools to help you run simulations and build software for RISC-V systems, including the riscv64 toolchain, a Buildroot-based Linux distribution that runs on the simulated nodes, and the simulation driver program. For more details, see the AMI Contents section on the FireSim website.
Single-node demo
First, you need to flash the FPGA with the FireSim AFI. To do so, run:
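(The exact commands and AGFI ID are listed in the FireSim demo instructions. Representatively, the flash step and the launcher script look like the following, with the AGFI ID shown as a placeholder and the script name taken from the demo AMI; check both against the instructions:)
$ sudo fpga-load-local-image -S 0 -I agfi-XXXXXXXXXXXXXXXXX
$ boot-firesim-singlenode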
This automatically calls the simulation driver, telling it to load the Linux kernel image and root filesystem for the Linux distro. This produces output similar to the following:
Simulations Started. You can use the UART console of each simulated node by attaching to the following screens:
There is a screen on:
2492.fsim0 (Detached)
1 Socket in /var/run/screen/S-centos.
You could connect to the simulated UART console by attaching to this screen, but instead use SSH to access the node.
First, ping the node to make sure it has come online. This is currently required because nodes may get stuck at Linux boot if the NIC does not receive any network traffic. For more information, see Troubleshooting/Errata. The node is always assigned the IP address 192.168.1.10:
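For example:
$ ping 192.168.1.10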
This should eventually produce the following output:
PING 192.168.1.10 (192.168.1.10) 56(84) bytes of data.
From 192.168.1.1 icmp_seq=1 Destination Host Unreachable
…
64 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=2017 ms
64 bytes from 192.168.1.10: icmp_seq=2 ttl=64 time=1018 ms
64 bytes from 192.168.1.10: icmp_seq=3 ttl=64 time=19.0 ms
…
At this point, you know that the simulated node is online. You can connect to it using SSH with the user name root and password firesim. It is also convenient to make sure that your TERM variable is set correctly. In this case, the simulation expects TERM=linux, so provide that:
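One way to do this is to set TERM when connecting (SSH passes the client’s TERM value through when it allocates a terminal):
$ TERM=linux ssh root@192.168.1.10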
At this point, you’re connected to the simulated node. Run uname -a as an example. You should see the following output, indicating that you’re connected to a RISC-V system:
# uname -a
Linux buildroot 4.12.0-rc2 #1 Fri Aug 4 03:44:55 UTC 2017 riscv64 GNU/Linux
Now you can run programs on the simulated node, as you would with a real machine. For an example workload (running YCSB against Memcached on the simulated node) or to run a larger 8-node simulation, see the full FireSim Demo v1.0 on Amazon EC2 F1 demo instructions.
Finally, when you are finished, you can shut down the simulated node by running the following command from within the simulated node:
# poweroff
You can confirm that the simulation has ended by running screen -ls, which should now report that there are no detached screens.
Future plans
At Berkeley, we’re planning to keep improving the FireSim platform to enable our own research in future data center architectures, like FireBox. The FireSim platform will eventually support more sophisticated processors, custom accelerators (such as Hwacha), network models, and peripherals, in addition to scaling to larger numbers of FPGAs. In the future, we’ll open source the entire platform, including Midas, the tool used to transform RTL into FPGA simulators, allowing users to modify any part of the hardware/software stack. Follow @firesimproject on Twitter to keep up with future FireSim updates.
Acknowledgements
FireSim is the joint work of many students and faculty at Berkeley: Sagar Karandikar, Donggyu Kim, Howard Mao, David Biancolin, Jack Koenig, Jonathan Bachrach, and Krste Asanović. This work is partially funded by AWS through the RISE Lab, by the Intel Science and Technology Center for Agile HW Design, and by ASPIRE Lab sponsors and affiliates Intel, Google, HPE, Huawei, NVIDIA, and SK hynix.