Tag Archives: artificial intelligence

Amazon Forecast – Now Generally Available

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-forecast-now-generally-available/

Getting accurate time series forecasts from historical data is not an easy task. Last year at re:Invent we introduced Amazon Forecast, a fully managed service that requires no experience in machine learning to deliver highly accurate forecasts. I’m excited to share that Amazon Forecast is generally available today!

With Amazon Forecast, there are no servers to provision. You only need to provide historical data, plus any additional metadata that you think may have an impact on your forecasts. For example, the demand for a particular product you need or produce may change with the weather, the time of the year, and the location where the product is used.

Amazon Forecast is based on the same technology used at Amazon and packages our years of experience in building and operating scalable, highly accurate forecasting technology in a way that is easy to use. It uses deep learning to learn from multiple datasets and automatically tries different algorithms, so it works well for lots of different use cases, such as estimating product demand, cloud computing usage, financial planning, and resource planning in a supply chain management system.

Using Amazon Forecast
For this post, I need some sample data. To have an interesting use case, I go for the individual household electric power consumption dataset from the UCI Machine Learning Repository. For simplicity, I am using a version where data is aggregated hourly in a file in CSV format. Here are the first few lines where you can see the timestamp, the energy consumption, and the client ID:

2014-01-01 01:00:00,38.34991708126038,client_12
2014-01-01 02:00:00,33.5820895522388,client_12
2014-01-01 03:00:00,34.41127694859037,client_12
2014-01-01 04:00:00,39.800995024875625,client_12
2014-01-01 05:00:00,41.044776119402975,client_12

Let’s see how easy it is to build a predictor and get forecasts by using the Amazon Forecast console. Another option, for more advanced users, would be to use a Jupyter notebook and the AWS SDK for Python. You can find some sample notebooks in this GitHub repository.

In the Amazon Forecast console, the first step is to create a dataset group. Dataset groups act as containers for datasets that are related.

I can select a forecasting domain for my dataset group. Each domain covers a specific use case, such as retail, inventory planning, or web traffic, and brings its own dataset types based on the type of data used for training. For now, I use a custom domain that covers all use cases that don’t fall in other categories.

Next, I create a dataset. The data I am going to upload is aggregated by the hour, so 1 hour is the frequency of my data. The default data schema depends on the forecasting domain I selected earlier. I am using a custom domain here, and I change the data schema to have a timestamp, a target_value, and an item_id, in that order, as you can see in the few sample lines of data at the beginning of this post.
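
If you prefer to script these steps instead of using the console, the same operations are available in the AWS SDK for Python. Here is a minimal boto3 sketch (the names are illustrative, and the schema mirrors the timestamp/target_value/item_id layout described above):

import boto3

forecast = boto3.client('forecast')

# Create a dataset group in the CUSTOM domain (illustrative name)
dataset_group = forecast.create_dataset_group(
    DatasetGroupName='household_power_dsg',
    Domain='CUSTOM'
)

# Create the target time series dataset with hourly frequency and the
# timestamp / target_value / item_id schema used in this walkthrough
dataset = forecast.create_dataset(
    DatasetName='household_power_ds',
    Domain='CUSTOM',
    DatasetType='TARGET_TIME_SERIES',
    DataFrequency='H',
    Schema={'Attributes': [
        {'AttributeName': 'timestamp',    'AttributeType': 'timestamp'},
        {'AttributeName': 'target_value', 'AttributeType': 'float'},
        {'AttributeName': 'item_id',      'AttributeType': 'string'},
    ]}
)

# Attach the dataset to the dataset group
forecast.update_dataset_group(
    DatasetGroupArn=dataset_group['DatasetGroupArn'],
    DatasetArns=[dataset['DatasetArn']]
)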

Now is the time to upload my time series data from Amazon Simple Storage Service (S3) into my dataset. The default timestamp format is exactly what I have in my data, so I don’t need to change it. I need an AWS Identity and Access Management (IAM) role to give Amazon Forecast access to the S3 bucket. I can select one here, or create a new one for this use case. As usual, avoid creating IAM roles that are too permissive and apply a least privilege approach to reduce the amount of permissions to the minimum required for this activity. After I tell Amazon Forecast in which S3 bucket and folder to look for my historical data, I start the import job.

The dataset group dashboard gives an overview of the process. My target time series data is being imported, and I can optionally add:

  • item metadata information on the items I want to predict on; for example, the color of the items in a retail scenario, or the kind of household (is this an apartment or a detached house?) for this electricity-focused use case.
  • related time series data that don’t include the target variable I want to predict, but can help improve my model; for example, price and promotions used by an ecommerce company are probably related to actual sales.

I am not adding more data for this use case. As soon as my dataset is imported, I start to train a predictor that I can then use to generate forecasts. I give the predictor a name, then select the forecast horizon, which in my case is 24 hours, and the frequency at which my forecasts are generated.

To train the predictor, I can select a specific machine learning algorithm of my choice, such as ARIMA or DeepAR+, but I prefer simplicity and use AutoML to let Amazon Forecast evaluate all algorithms and choose the one that performs best for my dataset.
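
For reference, the same predictor training can be started programmatically. Here is a hedged boto3 sketch (the dataset group ARN is a placeholder from the earlier import step):

import boto3

forecast = boto3.client('forecast')

dataset_group_arn = 'arn:aws:forecast:...'  # placeholder: the dataset group created earlier

# Train a predictor over a 24-hour horizon and let AutoML pick the algorithm
predictor = forecast.create_predictor(
    PredictorName='household_power_predictor',
    PerformAutoML=True,          # evaluate the built-in algorithms automatically
    ForecastHorizon=24,          # predict the next 24 hours
    InputDataConfig={'DatasetGroupArn': dataset_group_arn},
    FeaturizationConfig={'ForecastFrequency': 'H'}
)
print(predictor['PredictorArn'])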

In the case of my dataset, each household is identified by a single variable, the item_id, but you can add more dimensions if required. I can then select the Country for holidays. This is optional, but can improve your results if the data you are using may be affected by people being on holidays or not. I think energy usage is different on holidays, so I select United States, the country my dataset is coming from.

The configuration of the backtest windows is a more advanced topic, and you can skip the next paragraph if you’re not interested in the details of how a machine learning model is evaluated in the case of time series. Here, I am leaving the default.

When training a machine learning model, you need to split your dataset in two: a training dataset you use to train the machine learning algorithm, and an evaluation dataset that you use to evaluate the performance of your trained model. With time series, you can’t just create these two subsets of your data randomly, like you would normally do, because the order of the data points is important. The approach we use for Amazon Forecast is to split the time series in one or more parts, each one called a backtest window, preserving the order of the data. When evaluating your model against a backtest window, you should always use an evaluation dataset of the same length, otherwise it would be very difficult to compare different results. The backtest window offset tells how many points ahead of a split point you want to use for evaluation, and this is the same value for all the splits. For example, by leaving 24 (hours), I always use one day of data to evaluate my model against multiple window offsets.

In the advanced configurations, I have the option to enable hyperparameter optimization (HPO), for the algorithms that support it, and featurizations, to develop additional features computed from the ones in your data. I am not touching those settings now.

After a few minutes, the predictor is active. To understand the quality of a predictor, I look at some of the metrics that are automatically computed.

Quantile loss (QL) calculates how far off the forecast at a certain quantile is from the actual demand. It weights underestimation and overestimation according to a specific quantile. For example, a P90 forecast, if calibrated, means that 90% of the time the true demand is less than the forecast value. Thus, when the demand turns out to be higher than the forecast, the loss would be greater than the other way around.
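
To make the weighting concrete, here is a small Python sketch of the per-point quantile (pinball) loss; this is the general formula rather than the exact aggregated metric Amazon Forecast reports:

def quantile_loss(actual, forecast, tau):
    """Per-point quantile (pinball) loss for quantile tau (e.g. 0.9 for P90)."""
    diff = actual - forecast
    # Underestimation (actual > forecast) is weighted by tau, overestimation
    # by (1 - tau), so a P90 forecast is penalized more heavily when the true
    # demand exceeds it.
    return max(tau * diff, (tau - 1) * diff)

# Example with a P90 forecast of 100 units:
print(quantile_loss(actual=120, forecast=100, tau=0.9))  # 18.0 (demand above forecast)
print(quantile_loss(actual=80,  forecast=100, tau=0.9))  #  2.0 (demand below forecast)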

When the predictor is ready, and I am satisfied by its metrics, I can use it to create a forecast.

When the forecast is active, I can query it to get predictions. I can export the whole forecast as a CSV file, or query for specific lookups. Let’s do a lookup. In the case of the dataset I am using, I can forecast the energy used by a household for a specific range of time. Dates here are in the past because I used an old dataset. I am pretty sure you’re going to use Amazon Forecast to look into the future.

For each timestamp in the forecast, I get a range of values. The P10, P50, and P90 forecasts have respectively 10%, 50%, and 90% probability of satisfying the actual demand. How you use these three values depends on your use case and how it is impacted by overestimating or underestimating demand. The P50 forecast is the most likely estimate for the demand. The P10 and P90 forecasts give you an 80% confidence interval for what to expect.
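
Lookups can also be done programmatically via the query API. Here is a minimal boto3 sketch (the forecast ARN, dates, and item id are placeholders):

import boto3

forecastquery = boto3.client('forecastquery')

forecast_arn = 'arn:aws:forecast:...'  # placeholder: the forecast created above

response = forecastquery.query_forecast(
    ForecastArn=forecast_arn,
    StartDate='2015-01-01T00:00:00',
    EndDate='2015-01-02T00:00:00',
    Filters={'item_id': 'client_12'}
)

# Predictions are keyed by quantile (p10, p50, p90), each a list of timestamped values
for quantile, points in response['Forecast']['Predictions'].items():
    print(quantile, points[:3])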

Available Now
You can use Amazon Forecast via the console, the AWS Command Line Interface (CLI) and the AWS SDKs. For example, you can use Amazon Forecast within a Jupyter notebook with the AWS SDK for Python to create a new predictor, or use the AWS SDK for JavaScript in the Browser to get predictions from within a web or mobile app, or the AWS SDK for Java or AWS SDK for .NET to add forecast capabilities to an existing enterprise application.

Here’s the overall flow of the Amazon Forecast API, from creating the dataset group to querying and extracting the forecast:

The dataset I used for this walkthrough and other examples are available in this GitHub repository:

Amazon Forecast is now available in US East (N. Virginia), US West (Oregon), US East (Ohio), Europe (Ireland), Asia Pacific (Singapore), and Asia Pacific (Tokyo).

More information on specific features and pricing is one click away at:

I look forward to seeing what you’re going to use it for; please share your results with me!

Danilo

Architecture Monthly Magazine for July: Machine Learning

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/architecture-monthly-magazine-for-july-machine-learning/

Every month, AWS publishes the AWS Architecture Monthly Magazine (available for free on Kindle and Flipboard) that curates some of the best technical and video content from around AWS.

In the June edition, we offered several pieces of content related to Internet of Things (IoT). This month we’re talking about artificial intelligence (AI), namely machine learning.

Machine Learning: Let’s Get it Started

Alan Turing, the British mathematician whose life and work were documented in the movie The Imitation Game, was a pioneer of theoretical computer science and AI. He was the first to put forth the idea that machines can think.

Jump ahead 80 years to this month, when researchers asked four-time World Poker Tour title holder Darren Elias to play Texas Hold’em with Pluribus, a poker-playing bot (actually, five of these bots were at the table). Pluribus learns by playing against itself over and over and remembering which strategies worked best. The bot became a world-class poker player in a matter of days. Read about it in the journal Science.

If AI is making a machine more human, AI’s subset, machine learning, involves the techniques that allow these machines to make sense of the data we feed them. Machine learning is mimicking how humans learn, and Pluribus is actually learning from itself.

From self-driving cars, medical diagnostics, and facial recognition to our helpful (and sometimes nosy) pals Siri, Alexa, and Cortana, all these smart machines are constantly improving from the moment we unbox them. We humans are teaching the machines to think like us.

For July’s magazine, we assembled architectural best practices about machine learning from all over AWS, and we’ve made sure that a broad audience can appreciate it.

  • Interview: Mahendra Bairagi, Solutions Architect, Artificial Intelligence
  • Training: Getting in the Voice Mindset
  • Quick Start: Predictive Data Science with Amazon SageMaker and a Data Lake on AWS
  • Blog post: Amazon SageMaker Neo Helps Detect Objects and Classify Images on Edge Devices
  • Solution: Fraud Detection Using Machine Learning
  • Video: Viz.ai Uses Deep Learning to Analyze CT Scans and Save Lives
  • Whitepaper: Power Machine Learning at Scale

We hope you find this edition of Architecture Monthly useful, and we’d like your feedback. Please give us a star rating and your comments on Amazon. You can also reach out to [email protected] anytime. Check back in a month to discover what the August magazine will offer.

Amazon Polly Introduces Neural Text-To-Speech and Newscaster Style

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-polly-introduces-neural-text-to-speech-and-newscaster-style/

From Robbie the Robot to Jarvis, science fiction writers have long understood how important it was for an artificial being to sound as lifelike as possible. Speech is central to human interaction, and beyond words, it helps us express feelings and emotions: who can forget HAL 9000’s haunting final scene in 2001: A Space Odyssey?

In the real world, things are more complicated of course. Decades before the term ‘artificial intelligence’ had even been coined, scientists were designing systems that tried to mimic the human voice. In 1937, almost 20 years before the seminal Dartmouth workshop, Homer Dudley invented the Voder, the first attempt to synthesize human speech with electronic components: this video has sound samples and extra information on this incredible device.

We’ve come a long way since then! At AWS re:Invent 2016, we announced Polly, a managed service that turns text into lifelike speech, allowing customers to create applications that talk, and build entirely new categories of speech-enabled products. Zero machine learning expertise required: just call an API and get the job done! Since then, the team has regularly added new voices, for a current total of 29 languages and 59 voices.

Today, we’re happy to announce two major new features for Polly: Neural Text-To-Speech, and a groundbreaking newscaster style.

Introducing Neural Text-To-Speech (NTTS)
Through a new machine learning approach, NTTS delivers significant improvements in speech quality. It increases naturalness and expressiveness, two key factors in synthesizing lifelike speech that is closer than ever to human voices. Here’s an example of the quality you can expect.

As of today, NTTS is available for 11 voices, both in real-time and in batch mode:

  • All 3 UK English voices: Amy, Emma and Brian.
  • All 8 US English voices: Ivy, Joanna, Kendra, Kimberly, Salli, Joey, Justin and Matthew.

Why not head out to the AWS console for a quick test?
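
If you would rather test from code than from the console, here is a minimal boto3 sketch using the real-time SynthesizeSpeech API with the neural engine (the voice and text are just examples):

import boto3

polly = boto3.client('polly')

# Synthesize a short sentence with an NTTS voice and the neural engine
response = polly.synthesize_speech(
    Engine='neural',
    VoiceId='Amy',
    OutputFormat='mp3',
    Text='Hello, this is Amy speaking with the new neural engine.'
)

with open('amy-neural.mp3', 'wb') as f:
    f.write(response['AudioStream'].read())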

Introducing the newscaster style

Speech quality is certainly important, but more can be done to make a synthetic voice sound even more realistic and engaging. What about style? For sure, human ears can tell the difference between a newscast, a sportscast, a university class and so on; indeed, most humans adopt the right style of speech for the right context, and this certainly helps in getting their message across.

Thanks to NTTS, it’s possible to apply styles to synthesized speech, and you can now use a newscaster style with Polly. Here’s an example.

From news to blog posts, this makes narration sound even more realistic, and customers like The Globe and Mail already use it today. Thanks to Polly and the newscaster style, their readers (or should we say listeners now?) can enjoy articles read in a high-quality voice that sounds like what they might expect to hear on the TV or radio. Adding Amazon Translate, they can also listen to articles that are automatically translated to a language they understand.

As of today, the newscaster style is available for two US English voices (Joanna and Matthew), both in real-time and in batch mode. Again, you can head out to the AWS console for a quick test, and here’s the same clip as above with the newscaster style.

Using Polly APIs with the NTTS voices and the newscaster style is extremely easy. Please let me show you how to get started with both.

Using NTTS Voices and the Newscaster Style
Let’s grab a bit of text for Polly to read: how about this paragraph from Amazon Simple Storage Service (S3)‘s announcement in 2006?

“Earlier today we rolled out Amazon S3, our reliable, highly scalable, and low-latency data storage service. Using SOAP and REST interfaces, developers can easily store any number of blocks of data in S3. Each block can be up to 5 GB in length, and is associated with a user-defined key and additional key-value metadata pairs. Further, each block is protected by an ACL (Access Control List) allowing the developer to keep the data private, share it for reading, or share it for reading and writing, as desired. The system was designed to provide a data availability factor of 99.99%; all data is transparently stored in multiple locations”.

I will use batch mode in order to save sound files in S3 and let you grab them: I explicitly changed permissions to make them public, but don’t worry, your own files are completely private by default.

Let’s first try the standard Matthew voice.

$ aws polly start-speech-synthesis-task \
      --voice-id Matthew --text file://s3.txt \
      --output-s3-bucket-name "jsimon-polly" --output-format mp3 \
      --query "SynthesisTask.TaskId"
"e3db409c-419d-4a31-a3a7-72c1e712fe23"
$ wget https://jsimon-polly.s3.amazonaws.com/e3db409c-419d-4a31-a3a7-72c1e712fe23.mp3 -O matthew-standard.mp3

Tell us a bit about S3, Matthew.

Now, let’s use the NTTS version of the same voice: all we have to do is set the ‘engine‘ parameter to ‘neural‘.

$ aws polly start-speech-synthesis-task \
      --voice-id Matthew --engine neural --text file://s3.txt \
      --output-s3-bucket-name "jsimon-polly" --output-format mp3 \
      --query "SynthesisTask.TaskId"
"e3902335-c1e6-450b-b6e9-f913d6d52055"
$ wget https://jsimon-polly.s3.amazonaws.com/e3902335-c1e6-450b-b6e9-f913d6d52055.mp3 -O matthew-neural.mp3

You should immediately notice the quality improvement that NTTS brings. Of course, Polly has correctly picked up technical abbreviations, numbers, etc.

Now let’s spice things up and apply the newscaster style. This requires that we use SSML (Speech Synthesis Markup Language). All we need to do is to enclose the text like so:

<speak>
<amazon:domain name="news">
Earlier today we rolled out Amazon S3, our reliable, highly scalable, and low-latency data storage service. Using SOAP and REST interfaces, developers can easily store any number of blocks of data in S3. Each block can be up to 5 GB in length, and is associated with a user-defined key and additional key-value metadata pairs. Further, each block is protected by an ACL (Access Control List) allowing the developer to keep the data private, share it for reading, or share it for reading and writing, as desired. The system was designed to provide a data availability factor of 99.99%; all data is transparently stored in multiple locations.
</amazon:domain>
</speak>

Let’s synthesize this text again, making sure to set text type to SSML.

$ aws polly start-speech-synthesis-task \
       --voice-id Matthew --engine neural \
       --text file://s3.ssml --text-type ssml \
       --output-s3-bucket-name "jsimon-polly" --output-format mp3 \
       --query "SynthesisTask.TaskId"
"25c18bda-b32b-4485-a45f-eb9b757a513b"
$ wget https://jsimon-polly.s3.amazonaws.com/25c18bda-b32b-4485-a45f-eb9b757a513b.mp3 -O matthew-neural-newscaster.mp3

I’m sure you can immediately tell the difference! Doesn’t this sound like a news reporter reading our text?

If you’re curious about the Joanna voice, here are the equivalent clips: standard, neural, and neural with newscaster style.



Available Now!
As you can see, it’s extremely easy to use these new features, and they are available today in US East (N. Virginia), US West (Oregon) and Europe (Ireland). The free tier offers 1 million characters for NTTS voices per month for the first 12 months, starting from your first request for speech (standard or NTTS).

We’re looking forward to your feedback! Please post it to the AWS Forum for Polly, or send it to your usual AWS support contacts.

Julien;

Amazon Transcribe Streaming Now Supports WebSockets

Post Syndicated from Brandon West original https://aws.amazon.com/blogs/aws/amazon-transcribe-streaming-now-supports-websockets/

I love services like Amazon Transcribe. They are the kind of just-futuristic-enough technology that excites my imagination the same way that magic does. It’s incredible that we have accurate, automatic speech recognition for a variety of languages and accents, in real-time. There are so many use cases, and nearly all of them are intriguing. Until now, the Amazon Transcribe Streaming API has been available using HTTP/2 streaming. Today, we’re adding WebSockets as another integration option for bringing real-time voice capabilities to the things you build.

In this post, we are going to transcribe speech in real-time using only client-side JavaScript in a browser. But before we can build, we need a foundation. We’ll review just enough information about Amazon Transcribe, WebSockets, and the Amazon Transcribe Streaming API to broadly explain the demo. For more detailed information, check out the Amazon Transcribe docs.

If you are itching to see things in action, you can head directly to the demo, but I recommend taking a quick read through this post first.

What is Amazon Transcribe?

Amazon Transcribe applies machine learning models to convert speech in audio to text transcriptions. One of the most powerful features of Amazon Transcribe is the ability to perform real-time transcription of audio. Until now, this functionality has been available via HTTP/2 streams. Today, we’re announcing the ability to connect to Amazon Transcribe using WebSockets as well.

For real-time transcription, Amazon Transcribe currently supports British English (en-GB), US English (en-US), French (fr-FR), Canadian French (fr-CA), and US Spanish (es-US).

What are WebSockets?

WebSockets are a protocol built on top of TCP, like HTTP. While HTTP is great for short-lived requests, it hasn’t historically been good at handling situations that require persistent real-time communications. While an HTTP connection is normally closed at the end of the message, a WebSocket connection remains open. This means that messages can be sent bi-directionally with no bandwidth or latency added by handshaking and negotiating a connection. WebSocket connections are full-duplex, meaning that the server and the client can both transmit data at the same time. They were also designed for cross-domain usage, so there’s no messing around with cross-origin resource sharing (CORS) as there is with HTTP.

HTTP/2 streams solve a lot of the issues that HTTP had with real-time communications, and the first Amazon Transcribe Streaming API available uses HTTP/2. WebSocket support opens Amazon Transcribe Streaming up to a wider audience, and makes integrations easier for customers that might have existing WebSocket-based integrations or knowledge.

How the Amazon Transcribe Streaming API Works

Authorization

The first thing we need to do is authorize an IAM user to use Amazon Transcribe Streaming WebSockets. In the AWS Management Console, attach the following policy to your user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "transcribestreaming",
            "Effect": "Allow",
            "Action": "transcribe:StartStreamTranscriptionWebSocket",
            "Resource": "*"
        }
    ]
}

Authentication

Transcribe uses AWS Signature Version 4 to authenticate requests. For WebSocket connections, we use a pre-signed URL that contains all of the necessary information as query parameters. This gives us an authenticated endpoint that we can use to establish our WebSocket.

Required Parameters

All of the required parameters are included in our pre-signed URL as part of the query string. These are:

  • language-code: The language code. One of en-US, en-GB, fr-FR, fr-CA, es-US.
  • sample-rate: The sample rate of the audio, in Hz. Max of 48000 for en-US and es-US, and 16000 for the other languages.
  • media-encoding: Currently only pcm is valid.
  • vocabulary-name: Amazon Transcribe allows you to define custom vocabularies for uncommon or unique words that you expect to see in your data. To use a custom vocabulary, reference it here.

Audio Data Requirements

There are a few things that we need to know before we start sending data. First, Transcribe expects audio to be encoded as PCM data. The sample rate of a digital audio file relates to the quality of the captured audio. It is the number of times per second (Hz) that the analog signal is checked in order to generate the digital signal. For high-quality data, a sample rate of 16,000 Hz or higher is recommended. For lower-quality audio, such as a phone conversation, use a sample rate of 8,000 Hz. Currently, US English (en-US) and US Spanish (es-US) support sample rates up to 48,000 Hz. Other languages support rates up to 16,000 Hz.

In our demo, the file lib/audioUtils.js contains a downsampleBuffer() function for reducing the sample rate of the incoming audio bytes from the browser, and a pcmEncode() function that takes the raw audio bytes and converts them to PCM.
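
The demo does this in client-side JavaScript, but the idea is easy to sketch in Python with NumPy (a rough equivalent, not the demo code: naive decimation without an anti-aliasing filter):

import numpy as np

def pcm_encode(samples):
    """Convert float samples in [-1.0, 1.0] to 16-bit signed PCM bytes."""
    clipped = np.clip(samples, -1.0, 1.0)
    return (clipped * 32767).astype(np.int16).tobytes()

def downsample(samples, input_rate, output_rate=16000):
    """Naive decimation from input_rate down to output_rate."""
    if output_rate >= input_rate:
        return samples
    step = input_rate / output_rate
    indices = np.arange(0, len(samples), step).astype(int)
    return samples[indices]

# Example: browsers typically capture at 44100 or 48000 Hz
audio = np.random.uniform(-1, 1, 48000).astype(np.float32)  # 1 second of noise
pcm_bytes = pcm_encode(downsample(audio, input_rate=48000, output_rate=16000))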

Request Format

Once we’ve got our audio encoded as PCM data with the right sample rate, we need to wrap it in an envelope before we send it across the WebSocket connection. Each message consists of three headers, followed by the PCM-encoded audio bytes in the message body. The entire message is then encoded as a binary event stream message and sent. If you’ve used the HTTP/2 API before, there’s one difference that I think makes using WebSockets a bit more straightforward: you don’t need to cryptographically sign each chunk of audio data you send.

Response Format

The messages we receive follow the same general format: they are binary-encoded event stream messages, with three headers and a body. But instead of audio bytes, the message body contains a Transcript object. Partial responses are returned until a natural stopping point in the audio is determined. For more details on how this response is formatted, check out the docs and have a look at the handleEventStreamMessage() function in main.js.

Let’s See the Demo!

Now that we’ve got some context, let’s try out a demo. I’ve deployed it using AWS Amplify Console – take a look, or push the button to deploy your own copy. Enter the Access ID and Secret Key for the IAM User you authorized earlier, hit the Start Transcription button, and start speaking into your microphone.

Deploy to Amplify Console

The complete project is available on GitHub. The most important file is lib/main.js. This file defines all our required dependencies, wires up the buttons and form fields in index.html, accesses the microphone stream, and pushes the data to Transcribe over the WebSocket. The code has been thoroughly commented and will hopefully be easy to understand, but if you have questions, feel free to open issues on the GitHub repo and I’ll be happy to help. I’d like to extend a special thanks to Karan Grover, Software Development Engineer on the Transcribe team, for providing the code that formed the basis of this demo.

Amazon Personalize is Now Generally Available

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-personalize-is-now-generally-available/

Today, we’re happy to announce that Amazon Personalize is available to all AWS customers. Announced in preview at AWS re:Invent 2018, Amazon Personalize is a fully-managed service that allows you to create private, customized personalization recommendations for your applications, with little to no machine learning experience required.

Whether it is a timely video recommendation inside an application or a personalized notification email delivered at just the right moment, personalized experiences based on your data are more relevant to your customers and often deliver much higher business returns.

The task of developing an efficient recommender system is quite challenging: building, optimizing, and deploying real-time personalization requires specialized expertise in analytics, applied machine learning, software engineering, and systems operations. Few organizations have the knowledge, skills, and experience to overcome these challenges, and simple rule-based systems become brittle and costly to maintain as new products and promotions are introduced, or customer behavior changes.

For over 20 years, Amazon.com has perfected machine learning models that provide personalized buying experiences from product discovery to checkout. With Amazon Personalize, we are bringing developers that same capability to build custom models without having to deal with the complexity of infrastructure and machine learning that typically accompanies these types of solutions.

With Amazon Personalize, you provide the unique signals in your activity data (page views, signups, purchases, and so forth) along with optional customer demographic information (age, location, etc.). You then provide the inventory of the items you want to recommend, such as articles, products, videos, or music. Then, entirely under the covers, Amazon Personalize will process and examine the data, identify what is meaningful, select the right algorithms, and train and optimize a personalization model that is customized for your data and accessible via an API. All data analyzed by Amazon Personalize is kept private and secure, and only used for your customized recommendations. The resulting models are yours and yours alone.

With a single API call, you can make recommendations for your users and personalize the customer experience, driving more engagement, higher conversion, and increased performance on marketing campaigns. Domino’s Pizza, for instance, is using Amazon Personalize to deliver customized communications such as promotional deals through their digital properties. Sony Interactive Entertainment uses Personalize with Amazon SageMaker to automate and accelerate their machine learning development and drive more effective personalization at scale.

Personalize is like having your own Amazon.com machine learning personalization team at your beck and call, 24 hours a day.

Introducing Amazon Personalize

Amazon Personalize can make recommendations based on your historical data stored in Amazon S3, or on streaming data sent in real-time by your applications, or on both.

This gives customers a lot of flexibility to build recommendation solutions. For instance, you could build an initial recommender based on historical data, and retrain it periodically when you’ve accumulated enough live events. Alternatively, if you have no historical data to start from, you could ingest events for a while, and then build your recommender.

Having covered historical data in my previous blog post, I will focus on ingesting live events this time.

The high-level process looks like this:

  1. Create a dataset group in order to store events sent by your application.
  2. Create an interaction dataset and define its schema (no data is needed at this point).
  3. Create an event tracker in order to send events to Amazon Personalize.
  4. Start sending events to Amazon Personalize.
  5. Select a recommendation recipe, or let Amazon Personalize pick one for you thanks to AutoML.
  6. Create a solution, i.e. train the recipe on your dataset.
  7. Create a campaign and start recommending items.

Creating a dataset group

Let’s say we’d like to capture a click stream of movie recommendations. Using the first-time setup wizard, we create a dataset group to store these events. Here, let’s assume we don’t have any historical data to start with: all events are generated by the click stream, and are ingested using the event ingestion SDK.

Creating a dataset group just requires a name.

Then, we have to create the interaction dataset, which shows how users are interacting with items (liking, clicking, etc.). Of course, we need to define a schema describing the data: here, we’ll simply use the default schema provided by Amazon Personalize.

Optionally, we could now define an import job, in order to add historical data to the data set: as mentioned above, we’ll skip this step as all data will come from the stream.
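
For those who prefer to script this setup, here is a hedged boto3 sketch of the same three steps (the names are illustrative, and the schema is the minimal USER_ID / ITEM_ID / TIMESTAMP interactions layout):

import boto3, json

personalize = boto3.client('personalize')

# 1. Dataset group to hold the click stream
dataset_group = personalize.create_dataset_group(name='movieclick-dsg')
# (in practice, wait for the dataset group to become ACTIVE before the next calls)

# 2. Interactions schema
schema = personalize.create_schema(
    name='movieclick-interactions-schema',
    schema=json.dumps({
        "type": "record",
        "name": "Interactions",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
            {"name": "USER_ID",   "type": "string"},
            {"name": "ITEM_ID",   "type": "string"},
            {"name": "TIMESTAMP", "type": "long"}
        ],
        "version": "1.0"
    })
)

# 3. Interactions dataset (empty for now, events will stream into it)
dataset = personalize.create_dataset(
    name='movieclick-interactions',
    datasetType='Interactions',
    datasetGroupArn=dataset_group['datasetGroupArn'],
    schemaArn=schema['schemaArn']
)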

Configuring the event tracker

The next step is to create the event tracker that will allow us to send streaming events to the dataset group.

After a minute or so, our tracker is ready. Please take note of the tracking id: we’ll need it to send events.
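
Here is the equivalent boto3 call, for reference (the dataset group ARN is a placeholder):

import boto3

personalize = boto3.client('personalize')

dataset_group_arn = 'arn:aws:personalize:...'  # placeholder

tracker = personalize.create_event_tracker(
    name='movieclick-tracker',
    datasetGroupArn=dataset_group_arn
)
print(tracker['trackingId'])  # keep this id, PutEvents calls need it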

Creating the dataset

When Amazon Personalize creates an event tracker, it automatically creates a new dataset in the dataset group associated with the event tracker. This dataset has a well-defined schema storing the following information:

  • user_id and session_id: these values are defined by your application.
  • tracking_id: the event tracker id.
  • timestamp, item_id, event_type, event_value: these values describe the event itself and must be passed by your application.

Real-time events can be sent to this dataset in two different ways:

  • Server-side, via the AWS SDK: please note ingestion can happen from any source, whether your code is hosted inside of AWS (e.g. in Amazon EC2 or AWS Lambda) or outside.
  • With the AWS Amplify JavaScript library.

Let’s look at both options.

Sending real-time events with the AWS SDK

This is a very easy process: we can simply use the PutEvents API to send either a single event, or a list of up to 10 events. Of course, we could use any of the AWS SDKs: since my favourite language is Python, this is how we can send events using the boto3 SDK.

import boto3
personalize_events = boto3.client('personalize-events')
personalize_events.put_events(
    trackingId = <TRACKING_ID>,
    userId = <USER_ID>,
    sessionId = <SESSION_ID>,
    eventList = [
      {
          "eventId": "event1",
          "sentAt": 1549959198,
          "eventType": "rating",
          "properties": """{\"itemId\": \"123\", \"eventValue\": \"4\"}"""
      },
      {
          "eventId": "event2",
          "sentAt": 1549959205,
          "eventType": "rating",
          "properties": """{\"itemId\": \"456\", \"eventValue\": \"2\"}"""
      }
    ]
)

In our application, we rated movie 123 as a 4, and movie 456 as a 2. Using the appropriate tracking identifier, we sent two Events to our event tracker:

  • eventId: an application-specific identifier.
  • sentAt: a timestamp, matching the timestamp property defined in the schema. This value is expressed in seconds since the Unix Epoch (1 January 1970 00:00:00.000 UTC), and is independent of any particular time zone.
  • eventType: the event type, matching the event_type property defined in the schema.
  • properties: the item id and event value, matching the item_id and event_value properties defined in the schema.

Here’s a similar code snippet in Java.

List<Event> eventList = new ArrayList<>();
eventList.add(new Event().withProperties(properties).withType(eventType));
PutEventsRequest request = new PutEventsRequest()
  .withTrackingId(<TRACKING_ID>)
  .withUserId(<USER_ID>)
  .withSessionId(<SESSION_ID>)
  .withEventList(eventList);
client.putEvents(request);

You get the idea!

Sending real-time events with AWS Amplify

AWS Amplify is a JavaScript library that makes it easy to create, configure, and implement scalable mobile and web apps powered by AWS. It’s integrated with the event tracking service in Amazon Personalize.

A couple of setup steps are required before we can send events. For the sake of brevity, please refer to these detailed instructions in the Amazon Personalize documentation:

  • Create an identity pool in Amazon Cognito, in order to authenticate users.
  • Configure the Amazon Personalize plug-in with the pool id and tracker id.

Once this is taken care of, we can send events to Amazon Personalize. We can still use any text string for event types, but please note that a couple of special types are available:

  • Identify lets you send the userId for a particular user to Amazon Personalize. The userId then becomes an optional parameter in subsequent calls.
  • MediaAutoTrack automatically calculates the play, pause and resume position for media events, and Amazon Personalize uses the position as event value.

Here is how to send some sample events with AWS Amplify:

Analytics.record({
    eventType: "Identify",
    properties: {
      "userId": "<USER_ID>"
    }
}, "AmazonPersonalize");
Analytics.record({
    eventType: "<EVENT_TYPE>",
    properties: {
      "itemId": "<ITEM_ID>",
      "eventValue": "<EVENT_VALUE>"
    }
}, "AmazonPersonalize");
Analytics.record({
    eventType: "MediaAutoTrack",
    properties: {
      "itemId": "<ITEM_ID>",
      "domElementId": "MEDIA DOM ELEMENT ID"
    }
}, "AmazonPersonalize");

As you can see, this is pretty simple as well.

Creating a recommendation solution

Now that we know how to ingest events, let’s define how our recommendation solution will be trained.

We first need to select a recipe, which is much more than an algorithm: it also includes predefined feature transformations, initial parameters for the algorithm as well as automatic model tuning. Thus, recipes remove the need to have expertise in personalization. Amazon Personalize comes with several recipes suitable for different use cases.

Still, if you’re new to machine learning, you may wonder which one of these recipes best fits your use case. No worries: as mentioned earlier, Amazon Personalize supports AutoML, a new technique that automatically searches for the optimal recipe, so let’s enable it. While we’re at it, let’s also ask Amazon Personalize to automatically tune recipe parameters.

All of this is very straightforward in the AWS console: as you’ll probably want to automate from now on, let’s use the AWS CLI instead.

$ aws personalize create-solution \
  --name jsimon-movieclick-solution \ 
  --perform-auto-ml --perform-hpo \
  --dataset-group-arn $DATASET_GROUP_ARN

Now we’re ready to train the solution. No servers to worry about: training takes place on fully managed infrastructure.

$ aws personalize create-solution-version \
  --solution-arn $SOLUTION_ARN 

Once training is complete, we can use the solution version to create a recommendation campaign.

Deploying a recommendation campaign

Still no servers to worry about! In fact, campaigns scale automatically according to incoming traffic: we simply need to define the minimum number of transactions per second (TPS) that we want to support.

This number is used to size the initial fleet for hosting the model. It also impacts how much you will be charged for recommendations ($0.20 per TPS-hour). Here, I’m setting that parameter to 10, which means that I will initially be charged $2 per hour. If traffic exceeds 10 TPS, Personalize will scale up, increasing my bill according to the new TPS setting. Once traffic drops, Personalize will scale down, but it won’t go below my minimum TPS setting.

$ aws personalize create-campaign \
  --name jsimon-movieclick-campaign \
  --min-provisioned-tps 10 \
  --solution-version-arn $SOLUTION_VERSION_ARN

Should you later need to update the campaign with a new solution version, you can simply use the UpdateCampaign API and pass the ARN of the new solution version.
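
For reference, here is what that looks like with boto3 (the ARNs are placeholders):

import boto3

personalize = boto3.client('personalize')

# Point the existing campaign at a newly trained solution version
personalize.update_campaign(
    campaignArn='arn:aws:personalize:...:campaign/jsimon-movieclick-campaign',
    solutionVersionArn='arn:aws:personalize:...',  # placeholder: the new solution version
    minProvisionedTPS=10
)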

Once the campaign has been deployed, we can quickly test that it’s able to recommend new movies.

Recommending new items in real-time

I don’t think this could be simpler: just pass the id of the user and receive recommendations.

$ aws personalize-rec get-recommendations \
--campaign-arn $CAMPAIGN_ARN \
--user-id 123 --query "itemList[*].itemId"
["1210", "260", "2571", "110", "296", "1193", ...]

At this point, we’re ready to integrate our recommendation model into an application. For example, a web application would have to implement the following steps to display a list of recommended movies (a boto3 sketch of the first step follows the list):

  • Use the GetRecommendations API in our favorite language to invoke the campaign and receive movie recommendations for a given user,
  • Read movie metadata from a backend (say, image URL, title, genre, release date, etc.),
  • Generate HTML code to be rendered in the user’s browser.
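
Here is a hedged boto3 sketch of that first step (the campaign ARN is a placeholder, and the metadata lookup is a hypothetical backend call):

import boto3

personalize_runtime = boto3.client('personalize-runtime')

# Fetch recommendations for a user, then resolve item ids to movie metadata
response = personalize_runtime.get_recommendations(
    campaignArn='arn:aws:personalize:...:campaign/jsimon-movieclick-campaign',
    userId='123'
)
movie_ids = [item['itemId'] for item in response['itemList']]
# movies = metadata_store.batch_get(movie_ids)  # hypothetical backend lookup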

Amazon Personalize in action

Actually, my colleague Jake Wells has built a web application recommending books. Using an open dataset containing over 19 million book reviews, Jake first used a notebook hosted on Amazon SageMaker to clean and prepare the data. Then, he trained a recommendation model with Amazon Personalize, and wrote a simple web application demonstrating the recommendation process. This is a really cool project, which would definitely be worthy of its own blog post!

Available now!

Whether you work with historical data or event streams, a few simple API calls are all it takes to train and deploy recommendation models. Zero machine learning experience is required, so please visit aws.amazon.com/personalize, give it a try and let us know what you think.

Amazon Personalize is available in the following regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Singapore), and EU (Ireland).

The service is also part of the AWS free tier. For the first two months after sign-up, you will be offered:
1. Data processing and storage: up to 20 GB per month
2. Training: up to 100 training hours per month
3. Prediction: up to 50 TPS-hours of real-time recommendations per month

We’re looking forward to your feedback!

Julien;

The AWS DeepRacer League Virtual Circuit is Now Open – Train Your Model Today!

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/the-aws-deepracer-league-virtual-circuit-is-now-open-train-your-model-today/

AWS DeepRacer is a 1/18th scale four-wheel drive car with a considerable amount of onboard hardware and software. Starting at re:Invent 2018 and continuing with the AWS Global Summits, you have the opportunity to get hands-on experience with a DeepRacer. At these events, you can train a model using reinforcement learning, and then race it around a track. The fastest racers and their lap times for each summit are shown on our leaderboards.

New DeepRacer League Virtual Circuit
Today we are launching the AWS DeepRacer League Virtual Circuit. You can build, train, and evaluate your reinforcement learning models and compete online for some amazing prizes, all from the comfort of the DeepRacer Console!

We’ll add a new track each month, taking inspiration from famous race tracks around the globe, so that you can refine your models and broaden your skill set. The top entrant in the leaderboard each month will win an expenses-paid package to AWS re:Invent 2019, where they will take part in the League Knockout Rounds, with a chance to win the Championship Cup!

New DeepRacer Console
We are making the DeepRacer Console available today in the US East (N. Virginia) Region. You can use it to build and train your DeepRacer models and to compete in the Virtual Circuit, while gaining practical, hands-on experience with Reinforcement Learning. Following the steps in the DeepRacer Lab that is used at the hands-on DeepRacer workshops, I open the console and click Get started:

The console provides me with an overview of the model training process, and then asks to create the AWS resources needed to train and evaluate my models. I review the info and click Create resources to proceed:

The resources are created in minutes (I can click Learn RL to learn more about reinforcement learning while this is happening). I click Create model to move ahead:

I enter a name and a description for my model:

Then I pick a track (more tracks will appear throughout the duration of the Virtual League):

Now I define the Action space for my model. This is a set of discrete actions that my model can perform. Choosing values that increase the number of options will generally enhance the quality of my model, at the cost of additional training time:

Next, I define the reward function for my model. This function evaluates the current state of the vehicle throughout the training process and returns a reward value to indicate how well the model is performing (higher rewards signify better performance). I can use one of three predefined models (available by clicking Reward function examples) as-is, customize them, or build one from scratch. I’ll use Prevent zig-zag, a sample reward function that penalizes zig-zag behavior, to get started:

The reward function is written in Python 3, and has access to parameters (track_width, distance_from_center, all_wheels_on_track, and many more) that describe the position and state of the car, and also provide information about the track.
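
To make this concrete, here is an illustrative reward function sketch (not one of the built-in examples) that uses a few of the parameters mentioned above to reward staying close to the center line:

def reward_function(params):
    """Illustrative sketch: reward the car for staying near the center line."""
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    all_wheels_on_track = params['all_wheels_on_track']

    if not all_wheels_on_track:
        return 1e-3  # near-zero reward when the car leaves the track

    # Reward shrinks linearly as the car drifts away from the center line
    half_width = 0.5 * track_width
    reward = max(1e-3, 1.0 - (distance_from_center / half_width))
    return float(reward)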

I also control a set of hyperparameters that affect the overall training performance. Since I don’t understand any of these (just being honest here), I will accept all of the defaults:

To learn more about hyperparameters, read Systematically Tune Hyperparameters.

Finally, I specify a time limit for my training job, and click Start training. In general, simple models will converge in 90 to 120 minutes, but this is highly dependent on the maximum speed and the reward function.

The training job is initialized (this takes about 6 minutes), and I can track progress in the console:

The training job makes use of AWS RoboMaker so I can also monitor it from the RoboMaker Console. For example, I can open the Gazebo window, see my car, and watch the training process in real time:

One note of caution: changing the training environment (by directly manipulating Gazebo) will adversely affect the training run, most likely rendering it useless.

As the training progresses, the Reward graph will go up and to the right (as we often say at Amazon) if the car is learning how to stay on the track:

If the graph flattens out or declines precipitously and stays there, your reward function is not rewarding the desired behavior or some other setting is getting in the way. However, patience is a virtue, and there will be the occasional regression on the way to the top. After the training is complete, there’s a short pause while the new model is finalized and stored, and then it is time for me to evaluate my model by running it in a simulation. I click Start evaluation to do this:

I can evaluate the model on any of the available tracks. Using one track for training and a different one for evaluation is a good way to make sure that the model is general, and has not been overfit so that it works on just one track. However, using the same track for training and testing is a good way to get started, and that’s what I will do. I select the Oval Track and 3 trials, and click Start evaluation:

The RoboMaker simulator launches, with an hourly cost for the evaluation, as noted above. The results (lap times) are displayed when the simulation is complete:

At this point I can evaluate my model on another track, step back and refine my model and evaluate it again, or submit my model to the current month’s Virtual Circuit track to take part in the DeepRacer League. To do that, I click Submit to virtual race, enter my racer name, choose a model, agree to the Ts and C’s, and click Submit model:

After I submit, my model will be evaluated on the pre-season track and my lap time will be used to place me in the Virtual Circuit Leaderboard.

Things to Know
Here are a couple of things to know about the AWS DeepRacer and the AWS DeepRacer League:

AWS Resources – Amazon SageMaker is used to train models, which are then stored in Amazon Simple Storage Service (S3). AWS RoboMaker provides the virtual track environment, which is used for training and evaluation. An AWS CloudFormation stack is used to create an Amazon Virtual Private Cloud, complete with subnets, routing tables, an Elastic IP Address, and a NAT Gateway.

Costs – You can use the DeepRacer console at no charge. As soon as you start training your first model, you will get service credits for SageMaker and RoboMaker to give you 10 hours of free training on these services. The credits are applied at the end of the month and are available for 30 days, as part of the AWS Free Tier. The DeepRacer architecture uses a NAT Gateway that carries an availability charge. Your account will automatically receive service credits to offset this charge, showing net zero on your account.

DeepRacer Cars – You can preorder your DeepRacer car now! Deliveries to addresses in the United States will begin in July 2019.

Jeff;

Amazon SageMaker Ground Truth keeps simplifying labeling workflows

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-ground-truth-keeps-simplifying-labeling-workflows/

Launched at AWS re:Invent 2018, Amazon SageMaker Ground Truth is a capability of Amazon SageMaker that makes it easy for customers to efficiently and accurately label the datasets required for training machine learning systems.

A quick recap on Amazon SageMaker Ground Truth

Amazon SageMaker Ground Truth helps you build highly accurate training datasets for machine learning quickly. SageMaker Ground Truth offers easy access to public and private human labelers and provides them with built-in workflows and interfaces for common labeling tasks. Additionally, SageMaker Ground Truth can lower your labeling costs by up to 70% using automatic labeling, which works by training Ground Truth from data labeled by humans so that the service learns to label data independently.

Amazon SageMaker Ground Truth helps you build datasets for:

  • Text classification.
  • Image classification, i.e. categorizing images into specific classes.
  • Object detection, i.e. locating objects in images with bounding boxes.
  • Semantic segmentation, i.e. locating objects in images with pixel-level precision.
  • Custom user-defined tasks, that let customers annotate literally anything.

You can choose to use your team of labelers and route labeling requests directly to them. Alternatively, if you need to scale up, options are provided directly in the Amazon SageMaker Ground Truth console to work with labelers outside of your organization. You can access a public workforce of over 500,000 labelers via integration with Amazon Mechanical Turk. Alternatively, if your data requires confidentiality or special skills, you can use professional labeling companies pre-screened by Amazon, and listed on the AWS Marketplace.

Announcing new features

Since the service was launched, we gathered plenty of customer feedback (keep it coming!), from companies such as T-Mobile, Pinterest, Change Healthcare, GumGum, Automagi and many more. We used it to define what the next iteration of the service would look like, and just a few weeks ago, we launched two highly requested features:

  • Multi-category bounding boxes, allowing you to label multiple categories within an image simultaneously.
  • Three new UI templates for your custom workflows, for a total of fifteen different templates that help you quickly build annotation workflows for images, text, and audio datasets.

Today, we’re happy to announce another set of new features that keep simplifying the process of building and running cost-effective labeling workflows. Let’s look at each one of them.

Job chaining

Customers often want to run a subsequent labeling job using the output of a previous labeling job. Basically, they want to chain together labeling jobs using the outputted labeled dataset (and outputted ML model if automated data labeling was enabled). For example, they may run an initial job where they identify if humans exist in an image, and then they may want to run a subsequent job where they get bounding boxes drawn around the humans.

If active learning was used, customers may also want to use the ML model that was produced in order to bootstrap automated data labeling in a subsequent job. Setup couldn’t be easier: you can chain labeling jobs with just one click!

Job tracking

Customers want to be able to see the progress of their labeling jobs. We now provide near real-time status for labeling jobs.

Long-lived jobs

Many customers use experts as labelers, and these individuals perform labeling on a periodic basis. For example, healthcare companies often use clinicians as their expert labelers, and they can only perform labeling occasionally during downtime. In these scenarios, labeling jobs need to run longer, sometimes for weeks or months. We now support extended task timeout windows where each batch of a labeling job can run for 10 days, meaning labeling jobs can extend for months.

Dynamic custom workflows

When setting up custom workflows, customers want to insert or use additional context in addition to the source data. For example, a customer may want to display the specific weather conditions above each image in the tasks they send to labelers; this information can help labelers better perform the task at hand. Specifically, this feature allows customers to inject output from previous labeling jobs or other custom content into the custom workflow. This information is passed into a pre-processing Lambda function using the augmented manifest file that includes the source data and additional context. The customer can also use the additional context to dynamically adjust the workflow.
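
As a rough illustration, here is what such a pre-processing Lambda function might look like in Python; the field names follow the documented pre-annotation interface, and the weather attribute is a hypothetical piece of extra context carried in the augmented manifest:

def lambda_handler(event, context):
    """Pre-annotation Lambda sketch: pass extra context from the augmented
    manifest line through to the custom worker template."""
    data_object = event['dataObject']  # the current line of the augmented manifest

    task_input = {
        'taskObject': data_object.get('source-ref', data_object.get('source')),
        # Hypothetical extra context, e.g. weather conditions to display above each image
        'weather': data_object.get('weather', 'unknown'),
    }

    # Whatever is returned under 'taskInput' becomes available to the worker UI template
    return {
        'taskInput': task_input,
        'isHumanAnnotationRequired': 'true'
    }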

New service providers and new languages

We are listing two new data labeling service providers on the AWS Marketplace: Vivetic and SmartOne. With the addition of these two vendors, Amazon SageMaker Ground Truth will add support for data labeling in French, German, and Spanish.

Regional expansion

In addition to US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo), Amazon SageMaker Ground Truth is now available in Asia Pacific (Sydney).

Customer case study: ZipRecruiter

ZipRecruiter is helping people find great jobs, and helping employers build great companies. They’ve been using Amazon SageMaker since launch. Says ZipRecruiter CTO Craig Ogg: “ZipRecruiter’s AI-powered algorithm learns what each employer is looking for and provides a personalized, curated set of highly relevant candidates. On the other side of the marketplace, the company’s technology matches job seekers with the most pertinent jobs. And to do all that efficiently, we needed a Machine Learning model to extract relevant data automatically from uploaded resumes”.

Of course, building datasets is a critical part of the machine learning process, and it’s often expensive and extremely time-consuming. To solve both problems, ZipRecruiter turned to Ground Truth and one of our labeling partners, iMerit.

As Craig puts it: “Amazon SageMaker Ground Truth will significantly help us reduce the time and effort required to create datasets for training. Due to the confidential nature of the data, we initially considered using one of our teams but it would take time away from their regular tasks and it would take months to collect the data we needed. Using Amazon SageMaker Ground Truth, we engaged iMerit, a professional labeling company that has been pre-screened by Amazon, to assist with the custom annotation project. With their assistance we were able to collect thousands of annotations in a fraction of the time it would have taken using our own team.”

Getting started

I hope that this post was informative, and that the new features will let you build even faster. Please try Amazon SageMaker Ground Truth, let us know what you think, and help us build the next iteration of this cool service!

Julien

Using AWS AI and Amazon Sumerian in IT Education

Post Syndicated from Cyrus Wong original https://aws.amazon.com/blogs/aws/using-aws-ai-and-amazon-sumerian-in-it-education/

This guest post is by AWS Machine Learning Hero, Cyrus Wong. Cyrus is a Data Scientist at the Hong Kong Institute of Vocational Education (Lee Wai Lee) Cloud Innovation Centre. He has achieved all nine AWS Certifications and enjoys sharing his AWS knowledge with others through open-source projects, blog posts, and events.

Our institution (IVE) provides IT training to several thousand students every year, and one of our courses successfully applied for AWS Promotional Credits. We recently built an open-source project called “Lab Monitor,” which uses AWS AI, serverless, and AR/VR services to enhance our learning experience and gather data to understand what students are doing during labs.

Problem

One of the common problems with lab activities is that students are often doing things that have nothing to do with the course (such as watching videos or playing games). Students can also easily copy answers from their classmates because the lab answers are in softcopy, and teachers struggle to challenge students as, in general, there is only one correct answer. No one knows which students are working on the lab or which are copying from one another!

Solution

Lab Monitor changes the assessment model from just the final result to the entire development process. We can support and monitor students using AWS AI services.

The system consists of the following parts:

  • A lab monitor agent
  • A lab monitor collector
  • An AR lab assistant

Lab monitor agent

The Lab Monitor agent is a Python application that runs on each student’s computer and tracks their activities. All information is periodically sent to AWS. To identify students and protect the API gateway, each student has a unique API key with a usage limit. Its functions include the following (a minimal sketch of the reporting loop is shown after the list):

  • Capturing all keyboard and pointer events. This ensures that students are really working on the exercise, as it is impossible to complete a coding task without using the keyboard and pointer! We also encourage students to use shortcuts, and we need that information as an indicator.
  • Monitoring and controlling PC processes. Teachers can stop students from running programs that are irrelevant to the lab; for computer-based tests, we can kill all browsers and communication software. Detailed process information also helps us decide whether or not to upgrade the lab hardware!
  • Capturing screens. Amazon Rekognition can detect whether students are watching videos or viewing inappropriate content, and extracted text content can trigger an Amazon Sumerian host to talk to a student automatically. It is impossible for a teacher to monitor all student screens! We use a presigned URL with S3 Transfer Acceleration to speed up the image upload.
  • Uploading source code to AWS when students save their code. It is good to know when students complete tasks and to give support to those students who are slower!
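To make this more concrete, here is a minimal sketch of what the agent’s reporting loop could look like. The endpoint URL, the API key value, and the event format are hypothetical placeholders, not the actual Lab Monitor implementation.

import time
import requests

API_URL = "https://example.execute-api.us-east-1.amazonaws.com/prod/events"  # hypothetical endpoint
API_KEY = "student-specific-api-key"                                         # hypothetical per-student key

def collect_events():
    # Placeholder: the real agent hooks keyboard/pointer events, process lists,
    # and screen captures here.
    return [{"type": "keyboard", "count": 42, "timestamp": time.time()}]

def send_events(student_id, events):
    # The per-student API key lets API Gateway identify and throttle each caller.
    response = requests.post(API_URL,
                             headers={"x-api-key": API_KEY},
                             json={"student_id": student_id, "events": events},
                             timeout=10)
    response.raise_for_status()

if __name__ == "__main__":
    while True:
        send_events("student-001", collect_events())
        time.sleep(60)  # report once per minute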

Lab monitor collector

The Lab Monitor collector is an AWS Serverless Application Model (SAM) application that collects data and provides an API to the AR Lab Assistant. Optionally, a teacher can grade students immediately every time they save code by running the unit tests inside AWS Lambda. It constantly saves all data into an Amazon S3 data lake, and teachers can use Amazon Athena to analyze the data.

To save costs, a scheduled Lambda function checks the teacher’s class calendar every 15 minutes. When there is an upcoming class, it automatically creates a Kinesis stream and a Kinesis Data Analytics application. Teachers can then have a nearly real-time view of all student activity.
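A rough sketch of that scheduled function is shown below. The calendar lookup is a hypothetical helper and the stream name is illustrative; only the Kinesis call reflects the actual AWS API.

import boto3

kinesis = boto3.client('kinesis')

def get_upcoming_class():
    # Hypothetical calendar lookup; the real project reads the teacher's class calendar.
    return {"class_id": "cloud-101"}

def lambda_handler(event, context):
    upcoming = get_upcoming_class()
    if upcoming is None:
        return {"status": "no class scheduled"}
    stream_name = "lab-monitor-" + upcoming["class_id"]
    # Create the stream only when a class is about to start, to save costs.
    kinesis.create_stream(StreamName=stream_name, ShardCount=1)
    return {"status": "created", "stream": stream_name}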

AR Lab Assistant

The AR Lab Assistant is an Amazon Sumerian application that reminds students to work on their lab exercise. It sends a camera image to Amazon Rekognition and gets back a student ID.

A Sumerian host, Christine, uses Amazon Polly to speak to students when something happens:

  • When students pass a unit test, she says congratulations.
  • When students watch movies, she scolds them with the movie actor’s name, such as Tom Cruise.
  • When students watch porn, she scolds them.
  • When students do something wrong, such as forgetting to set up the Python interpreter, she reminds them to set it up.

Students can also ask her questions, for example, to check their overall progress. The host can connect to a Lex chatbot. Students’ conversations are saved in DynamoDB along with the sentiment analysis result provided by Amazon Comprehend; a rough sketch of that flow is shown below.
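Here is a minimal sketch, assuming a hypothetical DynamoDB table named ‘StudentConversations’; only the Comprehend and DynamoDB calls reflect the actual AWS APIs.

import time
import boto3

comprehend = boto3.client('comprehend')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('StudentConversations')  # hypothetical table name

def save_utterance(student_id, text):
    # Detect sentiment with Amazon Comprehend and store it with the utterance.
    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode='en')
    table.put_item(Item={
        'student_id': student_id,
        'timestamp': int(time.time()),
        'text': text,
        'sentiment': sentiment['Sentiment']  # POSITIVE, NEGATIVE, NEUTRAL, or MIXED
    })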

The student screen is like a projector inside the Sumerian application.

Christine: “Stop watching dirty things during the lab! Tom Cruise should not be able to help you write Python code!”

Simplified Architectural Diagrams

Demo video

AR Lab Assistant reaction: https://youtu.be/YZCR2aROBp4

Conclusion

With the combined power of various AWS services, students can now concentrate on their lab exercises and stop thinking about copying answers from each other! They also feel that the class is much more fun with Christine. We built the project in about four months and it is still evolving. In a future version, we plan to build a machine learning model to predict students’ final grades based on their class behavior.

Lastly, we would like to say thank you to AWS Educate, who provided us with AWS credit, and my AWS Academy student developer team: Mike, Long, Mandy, Tung, Jacqueline, and Hin from IVE Higher Diploma in Cloud and Data Centre Administration. They submitted this application to the AWS Artificial Intelligence (AI) Hackathon and just learned that they received a 3rd place prize!

Amazon SageMaker Neo – Train Your Machine Learning Models Once, Run Them Anywhere

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-neo-train-your-machine-learning-models-once-run-them-anywhere/

Machine learning (ML) is split into two distinct phases: training and inference. Training deals with building the model, i.e. running a ML algorithm on a dataset in order to identify meaningful patterns. This often requires large amounts of storage and computing power, making the cloud a natural place to run training jobs with services such as Amazon SageMaker and the AWS Deep Learning AMIs.

Inference deals with using the model, i.e. predicting results for data samples that the model has never seen. Here, the requirements are different: developers are typically concerned with optimizing latency (how long does a single prediction take?) and throughput (how many predictions can I run in parallel?). Of course, the hardware architecture of your prediction environment has a very significant impact on such metrics, especially if you’re dealing with resource-constrained devices: as a Raspberry Pi enthusiast, I often wish the little fellow packed a little more punch to speed up my inference code.

Tuning a model for a specific hardware architecture is possible, but the lack of tooling makes this an error-prone and time-consuming process. Minor changes to the ML framework or the model itself usually require the user to start all over again. Unfortunately, this forces most ML developers to deploy the same model everywhere regardless of the underlying hardware, thus missing out on significant performance gains.

Well, no more. Today, I’m very happy to announce Amazon SageMaker Neo, a new capability of Amazon SageMaker that enables machine learning models to train once and run anywhere in the cloud and at the edge with optimal performance.

Introducing Amazon SageMaker Neo

Without any manual intervention, Amazon SageMaker Neo optimizes models deployed on Amazon EC2 instances, Amazon SageMaker endpoints and devices managed by AWS Greengrass.

Here are the supported configurations:

  • Frameworks and algorithms: TensorFlow, Apache MXNet, PyTorch, ONNX, and XGBoost.
  • Hardware architectures: ARM, Intel, and NVIDIA starting today, with support for Cadence, Qualcomm, and Xilinx hardware coming soon. In addition, Amazon SageMaker Neo is released as open source code under the Apache Software License, enabling hardware vendors to customize it for their processors and devices.

The Amazon SageMaker Neo compiler converts models into an efficient common format, which is executed on the device by a compact runtime that uses less than one-hundredth of the resources that a generic framework would traditionally consume. The Amazon SageMaker Neo runtime is optimized for the underlying hardware, using specific instruction sets that help speed up ML inference.

This has three main benefits:

  • Converted models perform at up to twice the speed, with no loss of accuracy.
  • Sophisticated models can now run on virtually any resource-limited device, unlocking innovative use cases like autonomous vehicles, automated video security, and anomaly detection in manufacturing.
  • Developers can run models on the target hardware without dependencies on the framework.

Under the hood

Most machine learning frameworks represent a model as a computational graph: a vertex represents an operation on data arrays (tensors) and an edge represents data dependencies between operations. The Amazon SageMaker Neo compiler exploits patterns in the computational graph to apply high-level optimizations including operator fusion, which fuses multiple small operations together; constant-folding, which statically pre-computes portions of the graph to save execution costs; a static memory planning pass, which pre-allocates memory to hold each intermediate tensor; and data layout transformations, which transform internal data layouts into hardware-friendly forms. The compiler then produces efficient code for each operator.

Once a model has been compiled, it can be run by the Amazon SageMaker Neo runtime. This runtime takes about 1MB of disk space, compared to the 500MB-1GB required by popular deep learning libraries. An application invokes a model by first loading the runtime, which then loads the model definition, model parameters, and precompiled operations.

I can’t wait to try this on my Raspberry Pi. Let’s get to work.

Downloading a pre-trained model

Plenty of pre-trained models are available in the Apache MXNet, Gluon CV or TensorFlow model zoos: here, I’m using a 50-layer model based on the ResNet architecture, pre-trained with Apache MXNet on the ImageNet dataset.

First, I’m downloading the 227MB model as well as the JSON file defining its different layers. This file is particularly important: it tells me that the input symbol is called ‘data’ and that its shape is [1, 3, 224, 224], i.e. 1 image, 3 channels (red, green and blue), 224×224 pixels. I’ll need to make sure that images passed to the model have this exact shape. The output shape is [1, 1000], i.e. a vector containing the probability for each one of the 1,000 classes present in the ImageNet dataset.

To define a performance baseline, I use this model and a vanilla unoptimized version of Apache MXNet 1.2 to predict a few images: on average, inference takes about 6.5 seconds and requires about 306 MB of RAM.
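For reference, here is roughly what that baseline measurement could look like with the MXNet Module API. The preprocessing is omitted, and the image file name (‘dog.npy’, a preprocessed 1x3x224x224 array) is an assumption carried over from the prediction script shown later in this post.

import time
import mxnet as mx
import numpy as np

# Load the pre-trained symbol and parameters downloaded above.
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet50_v1', 0)
mod = mx.mod.Module(symbol=sym, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params)

# 'dog.npy' is assumed to hold a preprocessed image of shape (1, 3, 224, 224).
image = mx.nd.array(np.load('dog.npy').astype(np.float32))

start = time.time()
mod.forward(mx.io.DataBatch(data=[image]))
probabilities = mod.get_outputs()[0].asnumpy()
print("Top class: %d, inference time: %.2f seconds" % (np.argmax(probabilities), time.time() - start))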

That’s pretty slow: let’s compile the model and see how fast it gets.

Compiling the model for the Raspberry Pi

First, let’s store both model files in a compressed TAR archive and upload it to an Amazon S3 bucket.

$ tar cvfz model.tar.gz resnet50_v1-symbol.json resnet50_v1-0000.params
a resnet50_v1-symbol.json
a resnet50_v1-0000.params
$ aws s3 cp model.tar.gz s3://jsimon-neo/
upload: ./model.tar.gz to s3://jsimon-neo/model.tar.gz

Then, I just have to write a simple configuration file for my compilation job. If you’re curious about other frameworks and hardware targets, ‘aws sagemaker create-compilation-job help‘ will give you the exact syntax to use.

{
    "CompilationJobName": "resnet50-mxnet-raspberrypi",
    "RoleArn": $SAGEMAKER_ROLE_ARN,
    "InputConfig": {
        "S3Uri": "s3://jsimon-neo/model.tar.gz",
        "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
        "Framework": "MXNET"
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://jsimon-neo/",
        "TargetDevice": "rasp3b"
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 300
    }
}

Launching the compilation process takes a single command.

$ aws sagemaker create-compilation-job --cli-input-json file://job.json

Compilation is complete in seconds. Let’s figure out the name of the compilation artifact, fetch it from Amazon S3, and extract it locally.

$ aws sagemaker describe-compilation-job \
--compilation-job-name resnet50-mxnet-raspberrypi \
--query "ModelArtifacts"
{
"S3ModelArtifacts": "s3://jsimon-neo/model-rasp3b.tar.gz"
}
$ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
$ tar xvfz model-rasp3b.tar.gz
x compiled.params
x compiled_model.json
x compiled.so

As you can see, the artifact contains:

  • The original model and symbol files.
  • A shared object file storing the compiled, hardware-optimized operators used by the model.

For convenience, let’s rename them to ‘model.params’, ‘model.json’ and ‘model.so’, and then copy them to the Raspberry Pi in a ‘resnet50’ directory.

$ mkdir resnet50
$ mv compiled.params resnet50/model.params
$ mv compiled_model.json resnet50/model.json
$ mv compiled.so resnet50/model.so
$ scp -r resnet50 [email protected]:~

Setting up the inference environment on the Raspberry Pi

Before I can predict images with the model, I need to install the appropriate runtime on my Raspberry Pi. Pre-built packages are available: I just have to download the one for the ‘armv7l’ architecture and install it on my Pi with the provided script. Please note that I don’t need to install any additional deep learning framework (Apache MXNet in this case), saving up to 1GB of persistent storage.

$ scp -r dlr-1.0-py2.py3-armv7l [email protected]:~
<ssh to the Pi>
$ cd dlr-1.0-py2.py3-armv7l
$ sh ./install-py3.sh

We’re all set. Time to predict images!

Using the Amazon SageMaker Neo runtime

On the Pi, the runtime is available as a Python package named ‘dlr’ (deep learning runtime). Using it to predict images is what you would expect:

  • Load the model, defining its input and output symbols.
  • Load an image.
  • Predict!

Here’s the corresponding Python code.

import os
import numpy as np
from dlr import DLRModel

# Load the compiled model
input_shape = {'data': [1, 3, 224, 224]} # A single RGB 224x224 image
output_shape = [1, 1000]                 # The probability for each one of the 1,000 classes
device = 'cpu'                           # Go, Raspberry Pi, go!
model_path = 'resnet50'                  # Directory containing model.json, model.params, and model.so
model = DLRModel(model_path, input_shape, output_shape, device)

# Load names for ImageNet classes
synset_path = os.path.join(model_path, 'synset.txt')
with open(synset_path, 'r') as f:
    synset = eval(f.read())

# Load an image stored as a numpy array
image = np.load('dog.npy').astype(np.float32)
print(image.shape)
input_data = {'data': image}

# Predict 
out = model.run(input_data)
top1 = np.argmax(out[0])
prob = np.max(out)
print("Class: %s, probability: %f" % (synset[top1], prob))

Let’s give it a try on this image. Aren’t chihuahuas and Raspberry Pis made for one another?



(1, 3, 224, 224)
Class: Chihuahua, probability: 0.901803

The prediction is correct, but what about speed and memory consumption? Well, this prediction takes about 0.85 seconds and requires about 260MB of RAM: with Amazon SageMaker Neo, it’s now 5 times faster and 15% more RAM-efficient than with a vanilla model.

This impressive performance gain didn’t require any complex and time-consuming work: all we had to do was to compile the model. Of course, your mileage will vary depending on models and hardware architectures, but you should see significant improvements across the board, including on Amazon EC2 instances such as the C5 or P3 families.

Now available

I hope this post was informative. Compiling models with Amazon SageMaker Neo is free of charge; you only pay for the underlying resources that run the model (Amazon EC2 instances, Amazon SageMaker instances, and devices managed by AWS Greengrass).

The service is generally available today in US-East (N. Virginia), US-West (Oregon) and Europe (Ireland). Please start exploring and let us know what you think. We can’t wait to see what you will build!

Julien;

Amazon Forecast – Time Series Forecasting Made Easy

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-forecast-time-series-forecasting-made-easy/

The capacity to foresee the future would be an incredible superpower. At AWS, we can’t give you that, but we can help you use machine learning to forecast time series in a few steps.

The goal of time series forecasting is to predict future values of time-dependent data such as weekly sales, daily inventory levels, or hourly website traffic. Companies today use everything from simple spreadsheets to complex financial planning software to attempt to accurately forecast future business outcomes such as product demand, resource needs, or financial performance.

These tools build forecasts by looking at a historical series of data, which is called time series data. For example, such tools may try to predict the future sales of a raincoat by looking only at its previous sales data with the underlying assumption that the future is determined by the past.

This approach can struggle to produce accurate forecasts for large sets of data that have irregular trends. Also, it fails to easily combine data series that change over time (such as price, discounts, web traffic) with relevant independent variables like product features and store locations.

Introducing Amazon Forecast

Amazon has been solving time-series forecasting challenges across multiple areas including retail, supply chain, and server capacity for over two decades. Using machine learning techniques we have learned from this experience, today we are introducing Amazon Forecast, a fully managed deep learning service for time-series forecasting. Amazon Forecast packages our years of experience in building and operating scalable, highly accurate forecasting technology into an easy-to-use and fully-managed service.

You can use Amazon Forecast to generate predictions on time-series data to estimate:

  • Operational metrics, such as web traffic to servers, AWS usage, or IoT sensor metrics.
  • Business metrics, such as sales, profits, and expenses.
  • Resource requirements, such as the quantity of energy or bandwidth needed to meet a specific demand.
  • The amount of raw goods, services, or other inputs needed by a manufacturing process.
  • Retail demand considering the impact of price discounts, marketing promotions, and other campaigns.

Amazon Forecast is designed with these three main benefits in mind:

  • Accuracy, using deep neural nets and traditional statistical methods for forecasting. Amazon Forecast can learn from your data automatically and pick the best algorithms to train a model designed for your data. When you have many related time series, forecasts made using the Amazon Forecast deep learning algorithms, such as DeepAR and MQ-RNN, tend to be more accurate than forecasts made with traditional methods, such as exponential smoothing.
  • End-to-end management, automating the entire forecasting workflow from data upload to data processing, model training, dataset updates, and forecasting. Enterprise systems can directly consume your forecasts as an API.
  • Usability, in the console you can look up and visualize forecasts for any time series at different granularities. You can also see metrics for the accuracy of your predictor’s forecasts. Developers with no machine learning expertise can use the Amazon Forecast APIs, AWS Command Line Interface (CLI), or the console to import training data into one or more Amazon Forecast datasets, train models, and deploy the models to generate forecasts.

Using Amazon Forecast

When creating forecasting projects in Amazon Forecast, you primarily work with the following resources:

  • Dataset, to upload your data. Amazon Forecast algorithms use the datasets to train models.
  • Dataset Group, a container for one or more datasets, to use multiple datasets for model training.
  • Predictor, a result of training models. To create a predictor you provide a dataset group and a recipe (which provides an algorithm) or let Amazon Forecast decide which forecasting model works best. The algorithm trains a model using the data in the datasets.
  • Forecast, using a predictor you can run inference to generate forecasts.

You can use Amazon Forecast with the AWS console, CLI and SDKs. For example, you can use the AWS SDK for Python to train a model or get a forecast in a Jupyter notebook, or the AWS SDK for Java to add forecasting capabilities to an existing business application.
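As a rough sketch of what that could look like with the AWS SDK for Python, the calls below follow the shape of the Forecast API at general availability (the preview API may differ slightly). Names, the forecast horizon, and the item ID are illustrative, and the dataset creation and import steps are omitted; each create call is asynchronous, so in practice you would wait for each resource to become active before moving on.

import boto3

forecast = boto3.client('forecast')
forecastquery = boto3.client('forecastquery')

# Create a dataset group for the project (datasets would then be created and imported into it).
dsg = forecast.create_dataset_group(DatasetGroupName='energy_consumption', Domain='CUSTOM')

# Train a predictor, letting Amazon Forecast pick the best algorithm (AutoML).
predictor = forecast.create_predictor(
    PredictorName='energy_predictor',
    ForecastHorizon=24,                    # predict the next 24 hours
    PerformAutoML=True,
    InputDataConfig={'DatasetGroupArn': dsg['DatasetGroupArn']},
    FeaturizationConfig={'ForecastFrequency': 'H'})

# Once the predictor is active, generate a forecast and query it for one item.
fc = forecast.create_forecast(ForecastName='energy_forecast',
                              PredictorArn=predictor['PredictorArn'])
result = forecastquery.query_forecast(ForecastArn=fc['ForecastArn'],
                                      Filters={'item_id': 'client_12'})
print(result['Forecast']['Predictions'])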

Pricing and Availability

With Amazon Forecast, you pay only for what you use. There are three different types of costs in Amazon Forecast:

  • Generated forecast: A forecast is a prediction of future values for a single variable over any time horizon. Forecasts are billed in units of 1,000 (rounded up to the nearest thousand).
  • Data storage: Costs for each GB of data stored and used to train your models.
  • Training hours: Costs for each hour of training required for a custom model based on data provided by customers.

As part of the AWS Free Tier, for the first two months after first using Amazon Forecast, you have no charge for:

  • Generated forecasts: Up to 10K time series forecasts per month
  • Data storage: Up to 10GB per month
  • Training hours: Up to 10 hours per month

Amazon Forecast is available in preview in the following regions: US East (Northern Virginia), US West (Oregon).

It has never been so easy to do time-series forecasts with high accuracy. I really look forward to seeing what our customers are going to build with this!

Amazon Personalize – Real-Time Personalization and Recommendation for Everyone

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-personalize-real-time-personalization-and-recommendation-for-everyone/

Machine learning definitely offers a wide range of exciting topics to work on, but there’s nothing quite like personalization and recommendation.

At first glance, matching users to items that they may like sounds like a simple problem. However, the task of developing an efficient recommender system is challenging. Years ago, Netflix even ran a movie recommendation competition with a $1 Million award! Indeed, building, optimizing and deploying real-time personalization today requires specialized expertise in analytics, applied machine learning, software engineering, and systems operations. Few organizations have the knowledge, skills, and experience to overcome these challenges, and they either abandon the idea of using recommendation or build under-performing models.

For over 20 years, Amazon.com has built recommender systems at scale, integrating personalized recommendations across the buying experience – from product discovery to checkout.

To help all AWS customers do the same, we are very happy to announce Amazon Personalize, a fully-managed service that puts personalization and recommendation in the hands of developers with little machine learning experience.

Introducing Amazon Personalize

How does Amazon Personalize simplify personalization and recommendation? As explained in a previous blog post, you could already build recommendation models on Amazon SageMaker using algorithms such as Factorization Machines. However, it’s fair to say that this requires extensive data preparation and expert tuning in order to get good results.

Creating a recommendation model with Amazon Personalize is much simpler. Using AutoML, a new process that automates complex machine learning tasks, Personalize performs and accelerates the difficult work required to design, train, and deploy a machine learning model.

Amazon Personalize supports both datasets stored in Amazon S3 and streaming data sets, e.g. events sent in real-time from a JavaScript tracker or server-side. The high-level process looks like this:

  1. Create a schema describing the dataset, using Personalize-reserved keywords for user ids, item ids, etc.
  2. Create a dataset group that contains datasets used for building the model and for predicting: user-item interactions (aka “who liked what”), users and items. The last two are optional, as we will see in the example below.
  3. Send data to Personalize.
  4. Create a solution, i.e. select a recommendation recipe and train it on the dataset group.
  5. Create a campaign to predict new samples.

With data stored in Amazon S3, sending data to Personalize simply means adding your data files to the dataset group. Ingestion is triggered automatically.

Working with streaming data is different. One way to send events is to use the AWS Amplify JavaScript library, which is integrated with the event tracking service in Personalize. Another is to send them server-side via the AWS SDK in your favourite language: ingestion can happen from any source, whether the code is hosted inside AWS (e.g. on Amazon EC2 or AWS Lambda) or outside.
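For instance, here is a minimal sketch of sending a single event server-side with the AWS SDK for Python. The tracking ID comes from an event tracker created for the dataset group (creating it is not shown here), and the user, session, and item values are illustrative.

import json
import time
import boto3

personalize_events = boto3.client('personalize-events')

personalize_events.put_events(
    trackingId='your-event-tracker-id',   # placeholder: ID of the event tracker for the dataset group
    userId='1',
    sessionId='session-0001',
    eventList=[{
        'eventType': 'RATING',
        'sentAt': int(time.time()),
        'properties': json.dumps({'itemId': '1210'})
    }])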

Time for an example. Let’s build a solution based on the MovieLens dataset!

The MovieLens dataset

MovieLens is a well-known dataset storing movies recommendations. It comes in different sizes and formats: here, we will use ml-20m, which contains 20 million ratings applied to 27,000 movies by 138,000 users.

This dataset contains a file named ‘ratings.csv’ storing user-item interactions. The first lines look like this.

userId,movieId,rating,timestamp
1,2,3.5,1112486027
1,29,3.5,1112484676
1,32,3.5,1112484819
1,47,3.5,1112484727
1,50,3.5,1112484580

It reads like this: user 1 gave movie 2 a 3.5 rating. Same for movies 29, 32, 47, 50 and so on! This is exactly what we need to build a recommendation model. Let’s get to work.

Creating a schema for the dataset

The first step is to create an Avro schema for this dataset. This is pretty straightforward, we just need to use some of the keywords defined in Amazon Personalize.

{"type": "record", 
"name": "Interactions", 
"namespace": "com.amazonaws.personalize.schema",
"fields":[
    {"name": "ITEM_ID", "type": "string"},
    {"name": "USER_ID", "type": "string"},
    {"name": "TIMESTAMP", "type": "long"}
],
"version": "1.0"}

Preparing the dataset

Once we’ve downloaded and unzipped the dataset, let’s load the ‘ratings.csv’ file and apply the following processing:

  • Shuffle reviews.
  • Keep only movies rated 4 and above, and drop the rating column: we just want our model to recommend movies that users should really like.
  • Rename columns to the names used in the schema.
  • Keep only 100,000 interactions to minimize training time (this is just a demo after all!).

All of this is easily achieved with the Pandas Python library, the Swiss Army knife for columnar data processing. While we’re at it, we’ll also upload the processed file to an Amazon S3 bucket.

import pandas, boto3 
from sklearn.utils import shuffle
ratings = pandas.read_csv('ratings.csv')
ratings = shuffle(ratings)
ratings = ratings[ratings['rating']>3.6]
ratings = ratings.drop(columns='rating')
ratings.columns = ['USER_ID','ITEM_ID','TIMESTAMP']
ratings = ratings[:100000]
ratings.to_csv('ratings.processed.csv',index=False)
s3 = boto3.client('s3')
s3.upload_file('ratings.processed.csv','jsimon-ml20m','ratings.processed.csv')

Creating the dataset group

First, we need to create a dataset group containing the user-item dataset as well as its schema. Let’s do this with the AWS CLI: as you’ll see, a lot of these CLI operations require Amazon Resource Names (ARNs) output by a previous call, so make sure you keep track of everything when you experiment.

$ aws personalize create-dataset-group --name jsimon-ml20m-dataset-group
$ aws personalize create-schema --name jsimon-ml20m-schema \
--schema file://jsimon-ml20m-schema.json
$ aws personalize create-dataset --schema-arn $SCHEMA_ARN \
--dataset-group-arn $DATASET_GROUP_ARN \
--dataset-type INTERACTIONS 

Importing datasets

In this simple example, we’ll import data on-demand. It’s also possible to schedule import jobs in order to load new data regularly. We need to pass a role allowing data to be read from the Amazon S3 bucket.

$ aws personalize create-dataset-import-job --job-name jsimon-ml20m-job \
--role-arn $ROLE_ARN \
--dataset-arn $DATASET_ARN \
--data-source dataLocation=s3://jsimon-ml20m/ratings.processed.csv

This will take a little while and we can use the describe-dataset-import-job API to check for completion. Plenty of information is returned, but let’s just query the import status.

$ aws personalize describe-dataset-import-job \
--dataset-import-job-arn $DATASET_IMPORT_JOB_ARN \
--query "datasetImportJob.latestDatasetImportJobRun.status"
"CREATE IN_PROGRESS"

Putting it all together: creating a solution

Once datasets have been imported, we need to select a recipe to cook our recommendation model. A recipe is much more than an algorithm: it also includes predefined feature transformation, initial parameters for the algorithm as well as automatic model tuning. Thus, recipes remove the need to have expertise in personalization.

Amazon Personalize comes with several recipes suitable for different use cases, and advanced users can also add their own recipes.

Here’s the list of available recipes.

arn:aws:personalize:::recipe/awspersonalizehrnnmodel
arn:aws:personalize:::recipe/awspersonalizehrnnmodel-for-coldstart
arn:aws:personalize:::recipe/awspersonalizehrnnmodel-for-metadata
arn:aws:personalize:::recipe/awspersonalizeffnnmodel
arn:aws:personalize:::recipe/awspersonalizedeepfmmodel
arn:aws:personalize:::recipe/awspersonalizesimsmodel
arn:aws:personalize:::recipe/search-personalization
arn:aws:personalize:::recipe/popularity-baseline

Recommendation experts will certainly enjoy the flexibility that they bring, but what about developers who are new to the topic?

As mentioned earlier, Amazon Personalize supports AutoML, a new technique that automatically searches for the optimal recipe, so let’s enable it. Hyperparameter optimization is enabled by default. Last but not least, Amazon Personalize solutions can scale automatically according to incoming traffic: we simply need to define the minimum number of transactions per second (TPS) that we want to support.

Thus, we can create the solution like so:

$ aws personalize create-solution --name jsimon-ml20m-solution \
--minTPS 10 --perform-auto-ml \
--dataset-group-arn $DATASET_GROUP_ARN \
--query 'solution.status'
"CREATE IN_PROGRESS"

This will take a little while as the optimal recipe is selected, trained and tuned. Once all of this is complete, we can look at solution metrics.

$ aws personalize get-metrics --solution-arn $SOLUTION_ARN

Recommending new items in real-time

If we’re happy with the model, we can now create a campaign in order to deploy it. It will be updated automatically every time the solution is deployed.

$ aws personalize create-campaign --name jsimon-ml20m-solution \
--solution-arn $SOLUTION_ARN --update-mode AUTO

Now, let’s recommend some movies.

$ aws personalize-rec get-recommendations --campaign-arn $CAMPAIGN_ARN \
--user-id $USER_ID --query "itemList[*].itemId"
["1210", "260", "2571", "110", "296", "1193", ...]

That’s it! As you can see, we successfully built a recommendation model with a few API calls. All we had to do was define a schema and upload the dataset. We relied on Amazon Personalize to select the best recipe with AutoML, and to optimize its hyperparameters. The solution was trained and deployed on fully-managed infrastructure, letting us focus even more on building our application.

Sign up for the preview now!

I hope this post was informative. We just scratched the surface of what Amazon Personalize can do. The service is available in preview in US-East (Virginia) and US-West (Oregon).

There is no charge for the service during the preview. Once the preview is complete, the service will be part of the AWS free tier. For the first two months after sign-up, you will be offered:
  • Data processing and storage: Up to 20 GB per month
  • Training: Up to 100 training hours per month
  • Inference: Up to 50 TPS-hours of real-time recommendations per month

To get started, visit aws.amazon.com/personalize/. Now it’s your turn to try it and let us know what you think.

Julien;

Amazon SageMaker RL – Managed Reinforcement Learning with Amazon SageMaker

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-rl-managed-reinforcement-learning-with-amazon-sagemaker/

In the last few years, machine learning (ML) has generated a lot of excitement. Indeed, from medical image analysis to self-driving trucks, the list of complex tasks that ML models can successfully accomplish keeps growing, but what makes these models so smart?

In a nutshell, you can train a model in several different ways; here are three of them:

  1. Supervised learning: run an algorithm on a labelled data set, i.e. a data set containing samples and answers. Gradually, the model will learn how to correctly predict the right answer. Regression and classification are examples of supervised learning.
  2. Unsupervised learning: run an algorithm on an unlabelled data set, i.e. a data set containing samples only. Here, the model will progressively learn patterns in data and organize samples accordingly. Clustering and topic modeling are examples of unsupervised learning.
  3. Reinforcement learning: this one is quite different. Here, a computer program (aka an agent) interacts with its environment: most of the time, this takes place in a simulator. The agent receives a positive or negative reward for actions that it takes: rewards are computed by a user-defined function which outputs a numeric representation of the actions that should be incentivized. By trying to maximize positive rewards, the agent learns an optimal strategy for decision making.

Launched at AWS re:Invent 2017, Amazon SageMaker is helping customers quickly build, train and deploy ML models. Today, with the launch of Amazon SageMaker RL, we’re happy to extend the advantages of Amazon SageMaker to reinforcement learning, making it easier to use for all developers and data scientists regardless of their ML expertise.

A quick primer on reinforcement learning

Reinforcement learning (RL) can sound very confusing at first, so let’s take an example. Imagine an agent learning to navigate a maze. The simulator allows it to move in certain directions but blocks it from going through walls: using RL to learn a policy, the agent soon starts to take increasingly relevant actions.

One critical thing to understand is that the RL model isn’t trained on a predefined set of labelled mazes (that would be supervised learning). Instead, the agent discovers its environment (the current maze) one step at a time, moves one more step and receives a reward: stepping into a dead end is a negative reward, moving one step closer to the exit is a positive reward. Once a number of different mazes have been processed, the agent learns from the action/reward data points and trains a model to make better decisions next time around. This cycle of exploring and training is central to RL: given enough mazes and enough training time, it would soon know how to navigate any maze.

RL is particularly suitable for complex, unpredictable, environments that can be simulated and where building a prior dataset would either be infeasible or prohibitively expensive: autonomous vehicles, games, portfolio management, inventory management, robotics or industrial control systems. For instance, researchers have shown that applying RL-based control to HVAC systems can result in 20% – 40% cost savings compared to typical rule-based systems [1], not to mention the large reduction in ecological footprint.

Introducing Amazon SageMaker RL

Amazon SageMaker RL builds on top of Amazon SageMaker, adding pre-packaged RL toolkits and making it easy to integrate any simulation environment. As you would expect, training and prediction infrastructure is fully managed, so that you can focus on your RL problem and not on managing servers.

Today, you can use containers provided by SageMaker for Apache MXNet and TensorFlow that include OpenAI Gym, Intel Coach and Berkeley Ray RLlib. As usual with Amazon SageMaker, you can easily create your own custom environment using other RL libraries such as TensorForce or StableBaselines.

When it comes to simulation environments, Amazon SageMaker RL supports the following options:

  • First party simulators for AWS RoboMaker and Amazon Sumerian.
  • OpenAI Gym environments and open source simulation environments that are developed using Gym interfaces, such as Roboschool or EnergyPlus.
  • Customer-developed simulation environments using the Gym interface.
  • Commercial simulators such as MATLAB and Simulink (customers will need to manage their own licenses).

Amazon SageMaker RL also comes with a collection of Jupyter notebooks, just like Amazon SageMaker does. They are available on GitHub, featuring both simple examples (cartpole, simple corridor) as well as advanced ones in a variety of domains such as robotics, operations research, finance, and more. You can easily extend these notebooks and customize them for your own business problem.

In addition, you’ll find examples showing you how to scale RL using either homogeneous or heterogeneous scaling. The latter is particularly important for many RL applications where simulation runs on CPUs and training on GPUs. Your simulation environment can also run locally or remotely in a different network and SageMaker will set everything up for you.

Don’t worry, this is easier than it seems. Let’s look at an example.

Predictive Auto Scaling with Amazon SageMaker RL

Auto Scaling allows you to dynamically scale your service (such as Amazon EC2), adding or removing capacity automatically according to conditions you define. Today, this typically requires setting up thresholds, alarms, scaling policies, etc.

Let’s see how we could optimize this process with an RL model and a custom simulator, pretending to scale your Amazon EC2 capacity (of course, this is just a toy example). For the sake of brevity, I will only highlight the most important code snippets: you’ll find the complete example on GitHub.

Here, the name of the game is to adapt the instance capacity to the load profile. We don’t want to be under-provisioned (losing traffic) or over-provisioned (wasting money): we want to be ‘just right’.

In RL terms:

  • The environment contains the load profile and the number of running instances.
  • At each step, the agent can take two actions: add instances and remove instances. Adding instances helps process more transactions, but they cost money and need a few minutes to come online. Removing instances saves money but reduces the overall processing capacity.
  • The reward is a combination of the cost for running instances and the value for completing successful transactions, with a big penalty for insufficient capacity.

Setting up the simulation

First, we need a simulator in order to generate load profiles similar to what you would observe on a high-traffic web server: let’s use a very simple Python program for that. Here’s an example plotting transactions per minute (tpm) over a 3-day period: mostly periodic with sharp unpredictable spikes.

Load profile

This is the initial state:

config_defaults = {
            "warmup_latency": 5,       # It takes 5 minutes for a new machine to warm up and become available.
            "tpm_per_machine": 300,    # Each machine can process 300 transactions per minute (tpm) on average
            "tpm_sigma": 30,           # Machine's TPM capacity is variable with +/- 30 standard deviation
            "machine_cost": 0.05,      # Machines cost $0.05/min
            "transaction_val": 0.90,   # Successful transactions are worth $0.90 per thousand (CPM)
            "downtime_cost": 200,      # Downtime is assumed to cost the business $200/min beyond incomplete transactions
            "downtime_percent": 99.5,  # Downtime is defined as availability dropping below 99.5%
            "initial_machines": 50,    # How many machines are initially turned on
            "max_time_steps": 1000,    # Maximum number of timesteps per episode
        }

Computing the reward

This is quite straightforward! The current load is compared to the current capacity, we deduct the cost of any lost transaction and we apply a large penalty for losing more than 0.5% (a pretty strict definition of downtime!).

def _react_to_load(self):
        self.capacity = int(self.active_machines * np.random.normal(self.tpm_per_machine, self.tpm_sigma))
        if self.current_load <= self.capacity:
            # All transactions succeed
            self.failed = 0
            succeeded = self.current_load
        else:
            # Some transactions failed
            self.failed = self.current_load - self.capacity
            succeeded = self.capacity
        reward = succeeded * self.transaction_val / 1000.0  # divide by thousand for CPM
        percent_success = 100.0 * succeeded / (self.current_load + 1e-20)
        if percent_success < self.downtime_percent:
            self.is_down = 1
            reward -= self.downtime_cost
        else:
            self.is_down = 0
        reward -= self.active_machines * self.machine_cost
        return reward

Stepping through the simulation

Here’s how the agent goes through each time step initiated by the RL framework. As explained above, the model will initially predict random actions, but after a few training rounds, it’ll get much smarter.

def step(self, action):
        # First, react to the actions and adjust the fleet
        turn_on_machines = int(action[0])
        turn_off_machines = int(action[1])
        self.active_machines = max(0, self.active_machines - turn_off_machines)
        warmed_up_machines = self.warmup_queue[0]
        self.active_machines = min(self.active_machines + warmed_up_machines, self.max_machines)
        self.warmup_queue = self.warmup_queue[1:] + [turn_on_machines]
        # Now react to the current load and calculate reward
        self.current_load = self.load_simulator.time_step_load()
        reward = self._react_to_load()
        self.t += 1
        done = self.t > self.max_time_steps
        return self._observation(), reward, done, {}

Training on Amazon SageMaker

Now, we’re ready to train our model, just like any other SageMaker model: passing the image name (here, the TensorFlow container for Intel Coach), the instance type, etc.

rlestimator = RLEstimator(role=role,
        framework=Framework.TENSORFLOW,
        framework_version='1.11.0',
        toolkit=Toolkit.COACH,
        entry_point="train-autoscale.py",
        train_instance_count=1,
        train_instance_type='ml.p3.2xlarge')
rlestimator.fit()

In the training log, we see that the agent first explores its environment without any training: this is called the heatup phase and it’s used to generate an initial dataset to learn from.

## simple_rl_graph: Starting heatup
Heatup> Name=main_level/agent, Worker=0, Episode=1, Total reward=-39771.13, Steps=1001, Training iteration=0
Heatup> Name=main_level/agent, Worker=0, Episode=2, Total reward=-3089.54, Steps=2002, Training iteration=0
Heatup> Name=main_level/agent, Worker=0, Episode=3, Total reward=-43205.29, Steps=3003, Training iteration=0
Heatup> Name=main_level/agent, Worker=0, Episode=4, Total reward=-24542.07, Steps=4004, Training iteration=0
...

Once the heatup phase is complete, the model goes through repeated cycles of learning (aka ‘policy training’) and exploration based on what it has learned (aka ‘training’).

Policy training> Surrogate loss=-0.09095033258199692, KL divergence=0.0003891458618454635, Entropy=2.8382163047790527, training epoch=0, learning_rate=0.0003
Policy training> Surrogate loss=-0.1263471096754074, KL divergence=0.00145535240881145, Entropy=2.836780071258545, training epoch=1, learning_rate=0.0003
Policy training> Surrogate loss=-0.12835979461669922, KL divergence=0.0022696126252412796, Entropy=2.835214376449585, training epoch=2, learning_rate=0.0003
Policy training> Surrogate loss=-0.12992703914642334, KL divergence=0.00254297093488276, Entropy=2.8339898586273193, training epoch=3, learning_rate=0.0003
....
Training> Name=main_level/agent, Worker=0, Episode=152, Total reward=-54843.29, Steps=152152, Training iteration=1
Training> Name=main_level/agent, Worker=0, Episode=153, Total reward=-51277.82, Steps=153153, Training iteration=1
Training> Name=main_level/agent, Worker=0, Episode=154, Total reward=-26061.17, Steps=154154, Training iteration=1 

Once the model hits the number of epochs that we set, training is complete. In this case, we trained for 18 minutes: let’s see how well our model learned.

Visualizing training

One way to find out is to plot the rewards received by the agent after each exploration iteration. As expected, rewards in the heatup phase (150 iterations) are extremely negative because the agent hasn’t been trained at all. Then, as soon as training is applied, rewards start to improve rapidly.

Rewards vs iterations
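Here is a minimal sketch of that kind of plot, assuming the per-episode rewards have been exported to a CSV file named ‘rewards.csv’ with ‘episode’ and ‘total_reward’ columns (the actual log format produced by the RL toolkit may differ).

import pandas as pd
import matplotlib.pyplot as plt

rewards = pd.read_csv('rewards.csv')
plt.plot(rewards['episode'], rewards['total_reward'])
plt.axvline(x=150, linestyle='--', label='end of heatup')  # heatup lasted about 150 episodes
plt.xlabel('Episode')
plt.ylabel('Total reward')
plt.legend()
plt.show()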

Here’s a zoom on post-heatup iterations. As you can see, about halfway through, the agent starts receiving pretty consistent positive rewards, showing that it’s able to apply efficient scaling to the load profiles that it discovers.

Rewards vs iterations

Deploying the model

If we’re happy with the model, we can then deploy it just like any SageMaker model and use the newly-created HTTPS endpoint to predict. Alternatively, if you are training a robot, you can also deploy to edge devices using AWS Greengrass.
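Deployment itself could look like the short sketch below; the instance type and the observation payload are illustrative.

# Deploy the trained policy to a real-time endpoint.
predictor = rlestimator.deploy(initial_instance_count=1,
                               instance_type='ml.c5.large')

# Send the current observation and get back the recommended scaling action.
observation = [260.0, 50.0]   # e.g. current load (tpm) and active machines (illustrative)
action = predictor.predict(observation)
print(action)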

Now available

I hope this post was informative. We’ve barely scratched the surface of what Amazon SageMaker RL can do. You can use it today in all regions where Amazon SageMaker is available. Please start exploring and let us know what you think. We can’t wait to see what you will build!

Julien;

[1] “Deep Reinforcement Learning for Building HVAC Control”, T. Wei, Y. Wang and Q. Zhu, DAC’17, June 18-22, 2017, Austin, TX, USA.

Amazon SageMaker Ground Truth – Build Highly Accurate Datasets and Reduce Labeling Costs by up to 70%

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-ground-truth-build-highly-accurate-datasets-and-reduce-labeling-costs-by-up-to-70/

In 1959, Arthur Samuel defined machine learning as a “field of study that gives computers the ability to learn without being explicitly programmed”. However, there is no deus ex machina: the learning process requires an algorithm (“how to learn”) and a training dataset (“what to learn from”).

Today, most machine learning tasks use a technique called supervised learning: an algorithm learns patterns or behaviours from a labeled dataset. A labeled dataset contains data samples as well as the correct answer for each one of them, aka the ‘ground truth’. Depending on the problem at hand, one could use labeled images (“this is a dog”, “this is a cat”), labeled text (“this is spam”, “this isn’t”), etc.

Fortunately, developers and data scientists can now rely on a vast collection of off-the-shelf algorithms (as illustrated by the built-in algorithms in Amazon SageMaker) and of reference datasets. Deep learning has popularized image datasets such as MNIST, CIFAR-10 or ImageNet, and more are also available for tasks like machine translation or text classification. These reference datasets are extremely useful for beginners and experienced practitioners alike, but a lot of companies and organizations still need to train machine learning models on their own dataset: think about medical imaging, autonomous driving, etc.

Building such datasets is a complex problem, particularly when working at scale. How long would it take one person to label one thousand images or documents? ‘Quite some time’ is probably the answer! Now imagine having to label one million images or documents: how many people would you now need? For most companies and organizations, this is a moot point, as they would never be able to muster enough people anyway.

Well, no more! Today, I’m very happy to announce Amazon SageMaker Ground Truth, a new capability of Amazon SageMaker that makes it easy for customers to efficiently and accurately label the datasets required for training machine learning systems.

Introducing Amazon SageMaker Ground Truth

Amazon SageMaker Ground Truth helps you build datasets for:

  • Text classification.
  • Image classification, i.e. categorizing images into specific classes.
  • Object detection, i.e. locating objects in images with bounding boxes.
  • Semantic segmentation, i.e. locating objects in images with pixel-level precision.
  • Custom user-defined tasks.

Amazon SageMaker Ground Truth can optionally use active learning to automate the labeling of your input data. Active learning is a machine learning technique that identifies data that needs to be labeled by humans and data that can be labeled by machine. Automated data labeling incurs Amazon SageMaker training and inference costs, but it can reduce the cost (by up to 70%) and the time it takes to label your dataset, compared to having humans label the complete dataset. The toy sketch below illustrates the general idea.
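This is only meant to illustrate the concept, not the Ground Truth implementation: samples the model is confident about keep their machine-generated labels, and the rest are routed to human workers.

import numpy as np

def split_for_labeling(predictions, threshold=0.9):
    # Keep high-confidence machine labels, route low-confidence samples to humans.
    auto_labeled, needs_human = [], []
    for sample_id, probs in predictions.items():
        confidence = float(np.max(probs))
        if confidence >= threshold:
            auto_labeled.append((sample_id, int(np.argmax(probs)), confidence))
        else:
            needs_human.append(sample_id)
    return auto_labeled, needs_human

# Two confident predictions keep their machine labels, one uncertain image goes to humans.
predictions = {'img1.jpg': [0.05, 0.95], 'img2.jpg': [0.55, 0.45], 'img3.jpg': [0.02, 0.98]}
machine, human = split_for_labeling(predictions)
print(machine)   # [('img1.jpg', 1, 0.95), ('img3.jpg', 1, 0.98)]
print(human)     # ['img2.jpg']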

When manual effort is required, you can choose to use a crowdsourced Amazon Mechanical Turk workforce of over 500,000 workers, a private workforce of your own workers, or one of the curated third party vendors listed on the AWS Marketplace.

Let’s look at the high-level steps required to label a dataset:

  • Store your data in Amazon S3,
  • Create a labeling workforce,
  • Create a labeling job,
  • Get to work,
  • Visualize results.

How about an example? Let me show you how to label images from the CBCL StreetScenes dataset. This dataset contains 3548 images such as this one. For the sake of brevity, I will only use the first 10 images and annotate cars only.

Street scene

Storing data in Amazon S3

The first step is to create a manifest file for the dataset. This is a simple JSON file listing all images present in the dataset. Mine looks like this: please note that each line corresponds to a single object and is an independent JSON document.

{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00001.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00002.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00003.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00004.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00005.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00006.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00007.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00008.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00009.JPG"}
{"source-ref": "s3://jsimon-groundtruth-demo/SSDB00010.JPG"}

Then, I simply copy the manifest file and the corresponding images to an Amazon S3 bucket.
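If you prefer to script this step, here is a small sketch using the AWS SDK for Python; the manifest file name is arbitrary and the images are assumed to sit in the current directory.

import boto3

bucket = 'jsimon-groundtruth-demo'
images = ['SSDB%05d.JPG' % i for i in range(1, 11)]   # first 10 images of the dataset

# Write one independent JSON document per line, as Ground Truth expects.
with open('dataset.manifest', 'w') as f:
    for image in images:
        f.write('{"source-ref": "s3://%s/%s"}\n' % (bucket, image))

s3 = boto3.client('s3')
s3.upload_file('dataset.manifest', bucket, 'dataset.manifest')
for image in images:
    s3.upload_file(image, bucket, image)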

Creating a labeling workforce

Amazon SageMaker Ground Truth gives us different options:

  • Public workforce, backed by Amazon Mechanical Turk,
  • Private workforce, backed by internal resources,
  • Vendor workforce, backed by third-party resources.

The first option is probably the most scalable one. However, the last two may be a better fit if your job requires confidentiality, service guarantees, or special skills.

I can only count on myself here, so I create a private team authenticated by a new Amazon Cognito group. Indeed, authentication is required before any worker can access the dataset.

Work force

Then, I add myself to the team by entering my email address. A few seconds later, I receive an invitation containing credentials and a URL. This URL can also be found on the labeling workforces dashboard.

Once I’ve clicked on the link and changed my password, I am registered as a verified worker for this team.

The one-man team is now ready. It’s time to create the labeling job itself.

Creating a labeling job

As you would expect, I have to define the location of the manifest file and of the dataset.

Dataset

Then, I can decide whether I want to use the full dataset or a subset: I could even write a SQL query to filter the files. Here, let’s use the full dataset, as it only has 10 images.

Data set

Next, I have to select the type of the labeling job. As stated earlier, there are multiple options available and here I’m interested in adding bounding boxes to my images.

Next, I select the team that I want to assign to the job. This is where I could select automated data labeling. I could also decide to ask multiple workers to label the same image to increase accuracy.

Labeling job

Finally, I can provide additional instructions to workers, detailing the specific task that needs to be performed and giving them a couple of examples.

Labeling job

That’s it. Our labeling job is now ready. Time for the team (well… me, really) to get to work.

Labeling job

Labeling images

Logging into the URL I received by email, I see the list of jobs I’m assigned to.

Working

When I click on the ‘Start working’ button, I see instructions as well as a first image to work on. Using the toolbox, I can draw boxes, zoom in and out, etc. This is pretty intuitive, but drawing boxes that fit just right takes time and care. Now I understand why this is such a time-consuming process… and I have only ten images to go!

Here’s a zoom on another image. Can you see all seven cars?

Working

Once I’m done with all ten images, I can take a well-deserved break and enjoy the completion of the labeling job.

Labeling job

Visualizing results

Annotated images are visible directly in the AWS console, which comes in handy for sanity checks. I can also click on any image and see the list of labels that have been applied.

Of course, our purpose is to use this information to train machine learning models: we can find it in the augmented manifest file stored in our bucket. For example, here’s what the manifest has to say about the first image, where I labeled five cars.

{
"source-ref": "s3://jsimon-groundtruth-demo/SSDB00001.JPG",
"GroundTruthDemo": {
  "annotations": [
    {"class_id": 0, "width": 54, "top": 482, "height": 39, "left": 337},
    {"class_id": 0, "width": 69, "top": 495, "height": 53, "left": 461},
    {"class_id": 0, "width": 52, "top": 482, "height": 41, "left": 523},
    {"class_id": 0, "width": 71, "top": 481, "height": 62, "left": 589},
    {"class_id": 0, "width": 347, "top": 479, "height": 120, "left": 573}
  ],
  "image_size": [{"width": 1280, "depth": 3, "height": 960}
]
},
"GroundTruthDemo-metadata": {
  "job-name": "labeling-job/groundtruthdemo",
  "class-map": {"0": "Car"},
  "human-annotated": "yes",
  "objects": [
    {"confidence": 0.94},
    {"confidence": 0.94},
    {"confidence": 0.94},
    {"confidence": 0.94},
    {"confidence": 0.94}
  ],
  "creation-date": "2018-11-26T04:01:09.038134",
  "type": "groundtruth/object-detection"
  }
}

This has all the information required to train an object detection model, such as the built-in Single-Shot Detector available in Amazon SageMaker, but this is another story!
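For instance, a few lines of Python are enough to turn the augmented manifest into a list of bounding boxes; the manifest file name below is illustrative.

import json

with open('output.manifest') as f:            # path to the augmented manifest (illustrative)
    for line in f:
        record = json.loads(line)
        image = record['source-ref']
        class_map = record['GroundTruthDemo-metadata']['class-map']
        for box in record['GroundTruthDemo']['annotations']:
            print(image, class_map[str(box['class_id'])],
                  box['left'], box['top'], box['width'], box['height'])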

Now available!

I hope this post was informative. We just scratched the surface of what Amazon SageMaker Ground Truth can do. The service is available today in US-East (Virginia), US-East (Ohio), US-West (Oregon), Europe (Ireland) and Asia Pacific (Tokyo). Now it’s your turn to try it, and let us know what you think!

Julien;

Amazon Elastic Inference – GPU-Powered Deep Learning Inference Acceleration

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-elastic-inference-gpu-powered-deep-learning-inference-acceleration/

One of the reasons for the recent progress of Artificial Intelligence and Deep Learning is the fantastic computing capabilities of Graphics Processing Units (GPU). About ten years ago, researchers learned how to harness their massive hardware parallelism for Machine Learning and High Performance Computing: curious minds will enjoy the seminal paper (PDF) published in 2009 by Stanford University.

Today, GPUs help developers and data scientists train complex models on massive data sets for medical image analysis or autonomous driving. For instance, the Amazon EC2 P3 family lets you use up to eight NVIDIA V100 GPUs in the same instance, for up to 1 PetaFLOP of mixed-precision performance: can you believe that 10 years ago this was the performance of the fastest supercomputer ever built?

Of course, training a model is half the story: what about inference, i.e. putting the model to work and predicting results for new data samples? Unfortunately, developers are often stumped when the time comes to pick an instance type and size. Indeed, for larger models, the inference latency of CPUs may not meet the needs of online applications, while the cost of a full-fledged GPU may not be justified. In addition, resources like RAM and CPU may be more important to the overall performance of your application than raw inference speed.

For example, let’s say your power-hungry application requires a c5.9xlarge instance ($1.53 per hour in us-east-1): a single inference call with an SSD model would take close to 400 milliseconds, which is certainly too slow for real-time interaction. Moving your application to a p2.xlarge instance (the most inexpensive general-purpose GPU instance at $0.90 per hour in us-east-1) would improve inference performance to 180 milliseconds: then again, this would impact application performance as p2.xlarge has fewer vCPUs and less RAM.

Well, no more compromising. Today, I’m very happy to announce Amazon Elastic Inference, a new service that lets you attach just the right amount of GPU-powered inference acceleration to any Amazon EC2 instance. This is also available for Amazon SageMaker notebook instances and endpoints, bringing acceleration to built-in algorithms and to deep learning environments.

Pick the best CPU instance type for your application, attach the right amount of GPU acceleration and get the best of both worlds! Of course, you can use EC2 Auto Scaling to add and remove accelerated instances whenever needed.

Introducing Amazon Elastic Inference

Amazon Elastic Inference supports the popular machine learning frameworks TensorFlow, Apache MXNet, and ONNX (applied via MXNet). Changes to your existing code are minimal, but you will need to use AWS-optimized builds which automatically detect accelerators attached to instances, ensure that only authorized access is allowed, and distribute computation across the local CPU resource and the attached accelerator. These builds are available in the AWS Deep Learning AMIs and on Amazon S3 so you can build them into your own image or container, and they are provided automatically when you use Amazon SageMaker.

Amazon Elastic Inference is available in three sizes, making it efficient for a wide range of inference models including computer vision, natural language processing, and speech recognition.

  • eia1.medium: 8 TeraFLOPs of mixed-precision performance.
  • eia1.large: 16 TeraFLOPs of mixed-precision performance.
  • eia1.xlarge: 32 TeraFLOPs of mixed-precision performance.

This lets you select the best price/performance ratio for your application. For instance, a c5.large instance configured with eia1.medium acceleration will cost you $0.22 an hour (us-east-1). This combination is only 10-15% slower than a p2.xlarge instance, which hosts a dedicated NVIDIA K80 GPU and costs $0.90 an hour (us-east-1). Bottom line: you get a 75% cost reduction for equivalent GPU performance, while picking the exact instance type that fits your application.

Let’s dive in and look at Apache MXNet and TensorFlow examples on an Amazon EC2 instance.

Setting up Amazon Elastic Inference

Here are the high-level steps required to use the service with an Amazon EC2 instance.

  1. Create a security group for the instance allowing only incoming SSH traffic.
  2. Create an IAM role for the instance, allowing it to connect to the Amazon Elastic Inference service.
  3. Create a VPC endpoint for Amazon Elastic Inference in the VPC where the instance will run, attaching a security group allowing only incoming HTTPS traffic from the instance. Please note that you’ll only have to do this once per VPC and that charges for the endpoint are included in the cost of the accelerator.

VPC endpoint
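If you prefer to script the endpoint creation rather than use the console, here is a hedged boto3 sketch of step 3. The VPC, subnet, and security group IDs are placeholders, and the Elastic Inference service name is an assumption that should be verified for your region.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Sketch: create an interface VPC endpoint for Amazon Elastic Inference.
# Replace the IDs below with your own; the service name should be checked
# for your region before use.
response = ec2.create_vpc_endpoint(
    VpcEndpointType='Interface',
    VpcId='vpc-0123456789abcdef0',
    ServiceName='com.amazonaws.us-east-1.elastic-inference.runtime',
    SubnetIds=['subnet-0123456789abcdef0'],
    SecurityGroupIds=['sg-0123456789abcdef0'])  # allows HTTPS from the instance

print(response['VpcEndpoint']['VpcEndpointId'])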

Creating an accelerated instance

Now that the endpoint is available, let’s use the AWS CLI to fire up a c5.large instance with the AWS Deep Learning AMI.

aws ec2 run-instances --image-id $AMI_ID \
--key-name $KEYPAIR_NAME --security-group-ids $SG_ID \
--subnet-id $SUBNET_ID --instance-type c5.large \
--elastic-inference-accelerator Type=eia1.large

That’s it! You don’t need to learn any new APIs to use Amazon Elastic Inference: simply pass an extra parameter describing the accelerator type. After a few minutes, the instance is up and we can connect to it.

Accelerating Apache MXNet

In this classic example, we will load a large pre-trained convolutional neural network on the Amazon Elastic Inference Accelerator (if you’re not familiar with pre-trained models, I covered the topic in a previous post). Specifically, we’ll use a ResNet-152 network trained on the ImageNet dataset.

Then, we’ll simply classify an image on the Amazon Elastic Inference Accelerator.

import mxnet as mx
import numpy as np
from collections import namedtuple
Batch = namedtuple('Batch', ['data'])

# Download model (ResNet-152 trained on ImageNet) and ImageNet categories
path='http://data.mxnet.io/models/imagenet/'
[mx.test_utils.download(path+'resnet/152-layers/resnet-152-0000.params'),
 mx.test_utils.download(path+'resnet/152-layers/resnet-152-symbol.json'),
 mx.test_utils.download(path+'synset.txt')]

# Set compute context to Elastic Inference Accelerator
# ctx = mx.gpu(0) # This is how we'd predict on a GPU
ctx = mx.eia()    # This is how we predict on an EI accelerator

# Load pre-trained model
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))],
         label_shapes=mod._label_shapes)
mod.set_params(arg_params, aux_params, allow_missing=True)

# Load ImageNet category labels
with open('synset.txt', 'r') as f:
    labels = [l.rstrip() for l in f]

# Download and load test image
fname = mx.test_utils.download('https://github.com/dmlc/web-data/blob/master/mxnet/doc/tutorials/python/predict_image/dog.jpg?raw=true')
img = mx.image.imread(fname)

# Convert and reshape image to (batch=1, channels=3, width, height)
img = mx.image.imresize(img, 224, 224) # Resize to training settings
img = img.transpose((2, 0, 1)) # Move channels first (HWC -> CHW)
img = img.expand_dims(axis=0)  # Batch size
# img = img.as_in_context(ctx) # Not needed: data is loaded automatically to the EIA

# Predict the image
mod.forward(Batch([img]))
prob = mod.get_outputs()[0].asnumpy()

# Print the top 3 classes
prob = np.squeeze(prob)
a = np.argsort(prob)[::-1]
for i in a[0:3]:
    print('probability=%f, class=%s' %(prob[i], labels[i]))

As you can see, there are only a couple of differences:

  • I set the compute context to mx.eia(). No numbering is required, as only one Amazon Elastic Inference accelerator may be attached to an Amazon EC2 instance.
  • I did not explicitly load the image on the Amazon Elastic Inference accelerator, as I would have done with a GPU. This is taken care of automatically.

Running this example produces the following result.

probability=0.979113, class=n02110958 pug, pug-dog
probability=0.003781, class=n02108422 bull mastiff
probability=0.003718, class=n02112706 Brabancon griffon

What about performance? On our c5.large instance, this prediction takes about 0.23 second on the CPU, and only 0.031 second on its eia1.large accelerator. For comparison, it takes about 0.015 second on a p3.2xlarge instance equipped with a full-fledged NVIDIA V100 GPU. If we use an eia1.medium accelerator instead, this prediction takes 0.046 second, which is just as fast as a p2.xlarge (0.042 second) but at a 75% discount!
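If you want to reproduce this kind of measurement with the script above, a simple approach (a rough sketch reusing mod, img, and Batch from the example) is to time the forward pass and force MXNet to finish its asynchronous work before stopping the clock.

import time

# Rough latency measurement for the model defined above (mod, img, Batch).
# mx.nd.waitall() blocks until all asynchronous operations complete, so the
# timer captures the full prediction rather than just the call dispatch.
def time_prediction(mod, img, warmup=5, runs=20):
    for _ in range(warmup):                      # warm up caches and lazy initialization
        mod.forward(Batch([img]))
        mx.nd.waitall()
    start = time.perf_counter()
    for _ in range(runs):
        mod.forward(Batch([img]))
        mx.nd.waitall()
    return (time.perf_counter() - start) / runs

print('average prediction time: %.3f seconds' % time_prediction(mod, img))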

Accelerating TensorFlow

You can use TensorFlow Serving to serve accelerated predictions: it’s a model server which loads saved models and serves high-performance prediction through REST APIs and gRPC.

Amazon Elastic Inference includes an accelerated version of TensorFlow Serving, which you would use like this.

$ ei_tensorflow_model_server --model_name=resnet --model_base_path=$MODEL_PATH --port=9000
$ python resnet_client.py --server=localhost:9000
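The resnet_client.py script ships with the examples, but if you are curious what such a client looks like, here is a minimal hand-rolled sketch. The model name, the port, and the input tensor name 'image_bytes' are assumptions that depend on how the SavedModel was exported.

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Minimal gRPC client sketch for the accelerated TensorFlow Serving endpoint above.
# The input tensor name ('image_bytes') depends on the exported SavedModel signature.
channel = grpc.insecure_channel('localhost:9000')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

with open('dog.jpg', 'rb') as f:
    image_data = f.read()

request = predict_pb2.PredictRequest()
request.model_spec.name = 'resnet'
request.model_spec.signature_name = 'serving_default'
request.inputs['image_bytes'].CopyFrom(
    tf.contrib.util.make_tensor_proto(image_data, shape=[1]))  # tf.make_tensor_proto in newer versions

result = stub.Predict(request, 10.0)  # 10-second timeout
print(result)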

Now Available

I hope this post was informative. Amazon Elastic Inference is available now in US East (N. Virginia and Ohio), US West (Oregon), EU (Ireland) and Asia Pacific (Seoul and Tokyo). You can start building applications with it today!

Julien;

Amazon Comprehend Medical – Natural Language Processing for Healthcare Customers

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-comprehend-medical-natural-language-processing-for-healthcare-customers/

As the son of a Gastroenterologist and a Dermatologist, I grew up listening to arcane conversations involving a never-ending stream of complex medical terms: human anatomy, surgical procedures, medication names… and their abbreviations. A fascinating experience for a curious child wondering whether his parents were wizards of some sort and what all this gibberish meant.

For this reason, I am very happy to announce Amazon Comprehend Medical, an extension of Amazon Comprehend for healthcare customers.

A quick reminder on Amazon Comprehend

Amazon Comprehend was launched last year at AWS re:Invent. In a nutshell, this Natural Language Processing service provides simple real-time APIs for language detection, entity categorization, sentiment analysis, and key phrase extraction. In addition, it lets you organize text documents automatically using an unsupervised learning technique called “topic modeling”.

Used by customers such as FINRA, LexisNexis, and Isentia, Amazon Comprehend can understand general-purpose text. However, given the very specific nature of clinical documents, healthcare customers have asked us to build them a version of Amazon Comprehend tailored to their unique needs.

Introducing Amazon Comprehend Medical

Amazon Comprehend Medical builds on top of Amazon Comprehend and adds the following features:

  • Support for entity extraction and entity traits on a vast vocabulary of medical terms: anatomy, conditions, procedures, medications, abbreviations, etc.
  • An entity extraction API (detect_entities) trained on these categories and their subtypes.
  • A Protected Health Information extraction API (detect_phi) able to locate contact details, medical record numbers, etc.

A word of caution: Amazon Comprehend Medical may not accurately identify protected health information in all circumstances, and does not meet the requirements for de-identification of protected health information under HIPAA. You are responsible for reviewing any output provided by Amazon Comprehend Medical to ensure it meets your needs.

Now let me show you how to get started with this new service. First, I’ll use the AWS Console and then I’ll run a simple Python example.

Using Amazon Comprehend Medical in the AWS Console

Opening the AWS Console, all we have to do is paste some text and click on the ‘Analyze’ button.

Analysing text

The document is processed immediately. Entities are extracted and highlighted: we see personal information in orange, medication in red, anatomy in purple, and medical conditions in green.

Amazon Comprehend Medical

Personally Identifiable Information is correctly picked up. This is particularly important for researchers who need to anonymize documents before exchanging or publishing them. Also, ‘rash’ and ‘sleeping trouble’ are correctly detected as medical conditions diagnosed by the doctor (‘Dx’ is shorthand for ‘diagnosis’). Medications are detected as well.

However, Amazon Comprehend Medical goes beyond the simple extraction of medical terms. It’s also able to understand complex relationships, such as the dosage for a medication or detailed diagnosis information. Here’s a nice example.

Amazon Comprehend Medical

As you can see, Amazon Comprehend Medical is able to figure out abbreviations such as ‘po’ and ‘qhs’: the first one means that the medication should be taken orally, and the second is an abbreviation for ‘quaque hora somni’ (yes, it’s Latin), i.e. at bedtime.

Let’s now dive a little deeper and run a Python example.

Using Amazon Comprehend Medical with the AWS SDK for Python

First, let’s import the boto3 SDK and create a client for the service.

import boto3
comprehend = boto3.client(service_name='comprehendmedical')

Now let’s call the detect_entities API on a text sample and print the detected entities.

text = "Pt is 40yo mother, software engineer HPI : Sleeping trouble on present 
dosage of Clonidine. Severe Rash  on face and leg, slightly itchy  Meds : Vyvanse 
50 mgs po at breakfast daily, Clonidine 0.2 mgs -- 1 and 1 / 2 tabs po qhs HEENT : 
Boggy inferior turbinates, No oropharyngeal lesion Lungs : clear Heart : Regular 
rhythm Skin :  Papular mild erythematous eruption to hairline Follow-up as scheduled"

result = comprehend.detect_entities(Text=text)
entities = result['Entities']
for entity in entities:
    print(entity)

Take a look at this medication entity: it has three nested attributes (dosage, route and frequency) which add critically important context.

{u'Id': 3,
u'Score': 0.9976208806037903,
u'BeginOffset': 145, u'EndOffset': 152, 
u'Category': u'MEDICATION',
u'Type': u'BRAND_NAME', 
u'Text': u'Vyvanse', 
u'Traits': [],
u'Attributes': [
  {u'Id': 4,
     u'Score': 0.9681360125541687, 
     u'BeginOffset': 153, u'EndOffset': 159, 
     u'Type': u'DOSAGE', 
     u'Text': u'50 mgs', 
     u'Traits': []
     }, 
  {u'Id': 5,
     u'Score': 0.99924635887146, 
     u'BeginOffset': 160, u'EndOffset': 162, 
     u'Type': u'ROUTE_OR_MODE', 
     u'Text': u'po', 
     u'Traits': []
     }, 
  {u'Id': 6,
     u'Score': 0.9738683700561523, 
     u'BeginOffset': 163, u'EndOffset': 181, 
     u'Type': u'FREQUENCY', 
     u'Text': u'at breakfast daily', 
     u'Traits': []
     }]
}
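To make use of these nested attributes in an application, you could simply walk the response. Here is a quick sketch based on the response structure shown above, reusing the entities list from the earlier code.

# Sketch: print each medication with its dosage, route and frequency,
# based on the detect_entities response structure shown above.
for entity in entities:
    if entity['Category'] == 'MEDICATION':
        details = {a['Type']: a['Text'] for a in entity.get('Attributes', [])}
        print('%s: dosage=%s, route=%s, frequency=%s' % (
            entity['Text'],
            details.get('DOSAGE', '-'),
            details.get('ROUTE_OR_MODE', '-'),
            details.get('FREQUENCY', '-')))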

Here is another example. This medical condition entity carries a ‘NEGATION’ trait, meaning that the condition is negated in the text, i.e. this patient doesn’t have an oropharyngeal lesion.

{u'Category': u'MEDICAL_CONDITION',
u'Id': 16,
u'Score': 0.9825472235679626, 
u'BeginOffset': 266, u'EndOffset': 286,
u'Type': u'DX_NAME', 
u'Text': u'oropharyngeal lesion', 
u'Traits': [
    {u'Score': 0.9701067209243774, u'Name': u'NEGATION'}, 
    {u'Score': 0.9053299427032471, u'Name': u'SIGN'}
]}

The last feature I’d like to show you is extracting personal information with the detect_phi API.

result = comprehend.detect_phi(Text=text)
entities = result['Entities']
for entity in entities:
    print(entity)

A couple of pieces of personal information are present in this text and we correctly extract them.

{u'Category': u'PERSONAL_IDENTIFIABLE_INFORMATION', 
u'BeginOffset': 6, u'EndOffset': 10, u'Text': u'40yo', 
u'Traits': [], 
u'Score': 0.997914731502533, 
u'Type': u'AGE', u'Id': 0}

{u'Category': u'PERSONAL_IDENTIFIABLE_INFORMATION', 
u'BeginOffset': 19, u'EndOffset': 36, u'Text': u'software engineer', 
u'Traits': [], 
u'Score': 0.8865673542022705, 
u'Type': u'PROFESSION', u'Id': 1}
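Because each entity comes back with character offsets, a natural next step is to redact personal information before sharing a document. Here is a minimal sketch that reuses text and the entities returned by detect_phi above.

# Sketch: replace each PHI entity with its type, working backwards so that
# earlier offsets remain valid while we edit the string.
redacted = text
for entity in sorted(entities, key=lambda e: e['BeginOffset'], reverse=True):
    redacted = (redacted[:entity['BeginOffset']]
                + '[' + entity['Type'] + ']'
                + redacted[entity['EndOffset']:])
print(redacted)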

As you can see, Amazon Comprehend Medical can help you extract complex information and relationships, while being extremely simple to use.

Once again, please keep in mind that Amazon Comprehend Medical is not a substitute for professional medical advice, diagnosis, or treatment. You definitely want to closely review any information it provides and use your own experience and judgement before taking any decision.

Now Available
I hope this post was informative. You can start building applications with Amazon Comprehend Medical today in the following regions: US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland).

In addition, the service is part of the AWS free tier: for three months after signup, the first 25,000 units of text (or 2.5 million characters) are free of charge.

Why don’t you try it on your latest prescription or medical exam and let us know what you think?

Julien;


How AWS SideTrail verifies key AWS cryptography code

Post Syndicated from Daniel Schwartz-Narbonne original https://aws.amazon.com/blogs/security/how-aws-sidetrail-verifies-key-aws-cryptography-code/

We know you want to spend your time learning valuable new skills, building innovative software, and scaling up applications — not worrying about managing infrastructure. That’s why we’re always looking for ways to help you automate the management of AWS services, particularly when it comes to cloud security. With that in mind, we recently developed SideTrail, an open source, program analysis tool that helps AWS developers verify key security properties of cryptographic implementations. In other words, the tool gives you assurances that secret information is kept secret. For example, developers at AWS have used this tool to verify that an AWS open source library, s2n, is protected against cryptographic side-channels. Let’s dive a little deeper into what this tool is and how it works.

SideTrail’s secret sauce is its automated reasoning capability, a mathematical proof-based technology, that verifies the correctness of critical code and helps detect errors that could lead to data leaks. Automated reasoning mathematically checks every possible input and reports any input that could cause an error. Full detail about SideTrail can be found in a scientific publication from VSTTE, a prominent software conference.

Automated reasoning is also useful because it can help prevent timing side-channels, through which data may be inadvertently revealed, and it is what enables SideTrail to detect them. If a code change introduces a timing side-channel, the mathematical proof will fail, and you’ll be notified about the flaw before the code is released.

Timing side-channels can allow attackers to learn sensitive information, such as encryption keys, by observing the timing of messages on a network. For example, let’s say you have some code that checks a password as follows:


for (i = 0; i < length; ++i) {
    if (password[i] != input[i]) {
        send("bad password");
    }
}

The amount of time until the reply message is received depends on which byte in the password is incorrect. As a result, an attacker can start guessing what the first character of the password is:

  • a*******
  • b*******
  • c*******

When the time to receive the error message changes, the first letter in the password is revealed. Repeating this process for each of the remaining characters turns a complex, exponential guessing challenge into a comparatively simple, linear one. SideTrail identifies these timing gaps, alerting developers to potential data leaks that could impact customers.
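As an aside, the classic application-level countermeasure is to compare the entire value in constant time instead of reacting at the first mismatch. Here is a simple Python illustration of that idea using the standard library; it is unrelated to SideTrail’s own C/LLVM analysis and is only meant to show the concept.

import hmac

# hmac.compare_digest compares the two values in time that does not depend
# on where (or whether) they differ, unlike a naive character-by-character loop.
def check_password(stored, supplied):
    return hmac.compare_digest(stored, supplied)

print(check_password(b'correct horse battery staple', b'correct horse battery stapl3'))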

Here’s how to get started with SideTrail:

  1. Download SideTrail from GitHub.
  2. Follow the instructions provided with the download to set up your environment.
  3. Annotate your code with a threat model. The threat model is used to determine:
    • Which values should be considered secret. For example, a cryptographic key would be considered a secret, whereas the IP address of a public server wouldn’t.
    • The minimum timing difference an attacker can observe.
  4. Compile your code using SideTrail and analyze the report (for full guidance on how to analyze your report, read the VSTTE paper).

How SideTrail proves code correctness

The prior example had a data leak detected by SideTrail. However, SideTrail can do more than just detect leaks caused by timing side-channels: it can prove that correct code won’t result in such leaks. One technique that developers can use to prevent data loss is to ensure that all paths in a program take the same amount of time to run, no matter what a secret value may be. SideTrail helps to prove that such code is correct. Consider the following code that implements this technique:


int example(int pub, int secret) {
    if (secret > pub) {
        pub -= secret;
    } else {
        pub += secret;
    }
    return pub;
}

Following the steps outlined in the prior section, a developer creates a threat model for this code by determining which values are public and which are secret. After annotating their code accordingly, they then compile it using SideTrail, which verifies that all paths in the code take the same amount of time. SideTrail verifies the code by leveraging the industry-standard Clang compiler to compile the code to LLVM bitcode. Working at the LLVM bitcode level allows SideTrail to work on the same optimized representation the compiler does. This is important because compiler optimizations can potentially affect the timing behavior of code, and hence the effectiveness of countermeasures used in that code. To maximize the effectiveness of SideTrail, developers can compile code for SideTrail using the same compiler configuration that’s used for released code.

The result of this process is a modified program that tracks the time required to execute each instruction. Here is how the result looks, with the annotations SideTrail adds (the *cost updates) to indicate the runtime of each operation. For more detail on the LLVM annotations, please read the VSTTE paper referenced earlier in this post.


int example(int pub, int secret, int *cost) {
    *cost += GT_OP_COST;
    if (secret > pub) {
        *cost += SUBTRACTION_OP_COST;
        pub -= secret;
    } else {
        *cost += ADDITION_OP_COST;
        pub += secret;
    }
    return pub;
}

If you’re wondering how this works in practice, you can inspect the SideTrail model by examining the annotated LLVM bitcode. Guidance on inspecting the SideTrail model can be found in the VSTTE paper.

To check whether the above code has a timing side-channel, SideTrail makes two copies of the code, symbolically executes both copies, and then checks whether the runtime differs. SideTrail does this by generating a wrapper and then using an automated reasoning engine to mathematically verify the correctness of this test for all possible inputs. If the automated reasoning engine does this successfully, you can have confidence that the code does not contain a timing side-channel.

Conclusion

By using SideTrail, developers are able to prove that there are no timing side-channels present in their code, providing increased assurance against data loss. For example, Amazon s2n uses code-balancing countermeasures to mitigate this kind of attack. Through the use of SideTrail, we’ve proven the correctness of these countermeasures by running regression tests on every check-in to s2n. This allows AWS developers to build on top of this library with even more confidence, while delivering product and technology updates to customers with the highest security standards available.

More information about SideTrail and our other research can be found on our Provable Security hub. SideTrail is also available for public use as an open-source project.

If you have feedback about this blog post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Daniel Schwartz-Narbonne

Daniel is a Software Development Engineer in the AWS Automated Reasoning Group. Prior to joining Amazon, he earned a PhD at Princeton, where he developed a software framework to debug parallel programs. Then, at New York University, he designed a tool that automatically isolates and explains the cause of crashes in C programs. When he’s not working, you might find Daniel in the kitchen making dinner for his family, in a tent camping, or volunteering as an EMT with a local ambulance squad.

Store, Protect, Optimize Your Healthcare Data with AWS: Part 2

Post Syndicated from Stephen Jepsen original https://aws.amazon.com/blogs/architecture/store-protect-optimize-your-healthcare-data-with-aws-part-2/

Leveraging Analytics and Machine Learning Tools for Readmissions Prediction

This blog post was co-authored by Ujjwal Ratan, a senior AI/ML solutions architect on the global life sciences team.

In Part 1, we looked at various options to ingest and store sensitive healthcare data using AWS. The post described our shared responsibility model and provided a reference architecture that healthcare organizations could use as a foundation to build a robust platform on AWS to store and protect their sensitive data, including protected health information (PHI). In Part 2, we will dive deeper into how customers can optimize their healthcare datasets for analytics and machine learning (ML) to address clinical and operational challenges.

There are a number of factors creating pressures for healthcare organizations, both providers and payers, to adopt analytic tools to better understand their data: regulatory requirements, changing reimbursement models from volume- to value-based care, population health management for risk-bearing organizations, and movement toward personalized medicine. As organizations deploy new solutions to address these areas, the availability of large and complex datasets from electronic health records, genomics, images (for example, CAT, PET, MRI, ultrasound, X-ray), and IoT has been increasing. With these data assets growing in size, healthcare organizations want to leverage analytic and ML tools to derive new actionable insights across their departments.

One example of the use of ML in healthcare is diagnostic image analysis, including digital pathology. Pathology is extremely important in diagnosing and treating patients, but it is also extremely time-consuming and largely a manual process. While the complexity and quantity of workloads are increasing, the number of pathologists is decreasing. According to one study, the number of active pathologists could drop by 30 percent by 2030 compared to 2010 levels. (1) A cloud architecture and solution can automate part of the workflow, including sample management, analysis, storing, sharing, and comparison with previous samples to complement existing provider workflows effectively. A recent study using deep learning to analyze metastatic breast cancer tissue samples resulted in an approximately 85% reduction in human error rate. (2)

ML is also being used to assist radiologists in examining other diagnostic images such as X-rays, MRIs, and CAT scans. Having large quantities of images and metadata to train the algorithms that are the key to ML is one of the main challenges for ML adoption. To help address this problem, the National Institutes of Health recently released 90,000 X-ray plates tagged either with one of 14 diseases or tagged as being normal. Leading academic medical centers are using these images to build their neural networks and train their algorithms. With advanced analytics and ML, we can answer the hard questions such as “what is the next best action for my patient, the expected outcome, and the cost.”

The foundations for a great analytical layer

Let’s pick up from where we left off in Part 1. We have seen how providers can ingest data into AWS from their data centers and store it securely into different services depending on the type of data. For example:

  1. All object data is stored in Amazon S3, Amazon S3 Infrequent Access, or Amazon Glacier depending on how often it is accessed.
  2. Data from the provider’s database is either processed and stored as objects in Amazon S3 or aggregated into data marts on Amazon Redshift.
  3. Metadata for the objects on Amazon S3 is maintained in Amazon DynamoDB.
  4. Amazon Athena is used to query the objects directly stored on Amazon S3 to address ad hoc requirements.

We will now look at two best practices that are key to building a robust analytical layer using these datasets.

  1. Separating storage and compute: You should not be compelled to scale compute resources just to store more data. The scaling rules of the two layers should be separate.
  2. Leverage the vast array of AWS big data services when it comes to building the analytical platforms instead of concentrating on just a few of them. Remember, one size does not fit all.

Technical overview

In this overview, we will demonstrate how we can leverage AWS big data and ML services to build a scalable analytical layer for our healthcare data. We will use a single source of data stored in Amazon S3 for performing ad hoc analysis using Amazon Athena, integrate it with a data warehouse on Amazon Redshift, build a visual dashboard for some metrics using Amazon QuickSight, and finally build a ML model to predict readmissions using Amazon SageMaker. By not moving the data around and just connecting to it using different services, we avoid building redundant copies of the same data. There are multiple advantages to this approach:

  1. We optimize our storage. Not having redundant copies reduces the amount of storage required.
  2. We keep the data secure with only authorized services having access to it. Keeping multiple copies of the data can result in higher security risk.
  3. We are able to scale the storage and compute separately as needed.
  4. It becomes easier to manage the data and monitor usage metrics centrally such as how often the data has been accessed, who has been accessing it, and what has been the growth pattern of the data over a period of time. These metrics can be difficult to aggregate if the data is duplicated multiple times.

Let’s build out this architecture using the following steps:

  1. Create a database in AWS Glue Data Catalog

We will do this using a Glue crawler. First create a JSON file that contains the parameters for the Glue crawler.

{
  "Name": "readmissions",
  "Role": "arn of the role for Glue",
  "DatabaseName": "readmissions",
  "Description": "glue data catalog for storing readmission data",
  "Targets": {
    "S3Targets": [
      {"Path": "s3://<bucket>/<prefix>"},
      {"Path": "s3://<bucket>/<prefix>"}
    ]
  }
}

As you can see, the crawler will crawl two locations in Amazon S3 and save the resulting tables in a new database called “readmissions.” Replace the role ARN and Amazon S3 locations with your corresponding details. Save this in a file create_crawler.json. Then from the AWS CLI, call the following command to create the crawler:

aws glue create-crawler --cli-input-json file://create_crawler.json

Once the crawler is created, run it by calling the following command:

aws glue start-crawler --name readmissions

Log on to the AWS Glue console, navigate to the crawlers, and wait until the crawler completes running.
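If you would rather poll from code than watch the console, a hedged boto3 sketch for checking the crawler state looks like this.

import time
import boto3

glue = boto3.client('glue')

# Poll the crawler until it returns to the READY state.
while True:
    state = glue.get_crawler(Name='readmissions')['Crawler']['State']
    print('crawler state: %s' % state)
    if state == 'READY':
        break
    time.sleep(30)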

This will create two tables — phi and non-phi — in a database named “readmissions” in the AWS Glue Data Catalog as shown below.

  2. Query the data using Athena

The AWS Glue Data Catalog is seamlessly integrated with Amazon Athena. For details on how to enable this, see Integration with AWS Glue.

As a result of this integration, the tables created using the Glue crawler can now be queried using Amazon Athena. Amazon Athena allows you to do ad hoc analysis on the dataset. You can do exploratory analysis on the data and also determine its structure and quality. This type of upfront ad hoc analysis is invaluable for ensuring the data quality in your downstream data warehouse or your ML algorithms that will make use of this data for training models. In the next few sections, we will explore these aspects in greater detail.

To query the data using Amazon Athena, navigate to the Amazon Athena console.

NOTE: Make sure the region is the same as the region you chose in the previous step. If it’s not the same, switch the region by using the drop-down menu on the top right-hand corner of the screen.

Once you arrive in the Amazon Athena console, you should already see the tables and databases you created previously, and you should be able to see the data in the two tables by writing Amazon Athena queries. Here is a list of the top 10 rows from the table readmissions.nonphi:

Now that we are able to query the dataset, we can run some queries for exploratory analysis. Here are just a few examples:

Analysis: How many patients have been discharged to home?
Amazon Athena query: SELECT count(*) from nonphi where discharge_disposition = 'Discharged to home'

Analysis: What’s the minimum and the maximum number of procedures carried out on a patient?
Amazon Athena query: SELECT min(num_procedures), max(num_procedures) from nonphi

Analysis: How many patients were referred to this hospital by another physician?
Amazon Athena query: SELECT count(*) FROM nonphi group by admission_source having admission_source = 'Physician Referral'

Analysis: What were the top 5 specialties with positive readmissions?
Amazon Athena query: SELECT count(readmission_result) as num_readmissions, medical_specialty from (select readmission_result, medical_specialty from nonphi where readmission_result = 'Yes') group by medical_specialty order by num_readmissions desc limit 5

Analysis: Which payer was responsible for paying for treatments that involved more than 5 procedures?
Amazon Athena query: SELECT distinct payer_code from nonphi where num_procedures > 5 and payer_code != '(null)'

While this information is valuable, you typically do not want to invest too much time and effort into building an ad hoc query platform like this because at this stage, you are not even sure if the data is of any value for your business-critical analytical applications. One benefit of using Amazon Athena for ad hoc analysis is that it requires little effort or time. It uses schema-on-read instead of schema-on-write, allowing you to work with various source data formats without worrying about the underlying structures. You can put the data on Amazon S3 and start querying immediately.
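When one of these queries later needs to run from code rather than the console, the same SQL can be submitted through the Athena API. Here is a hedged boto3 sketch; the result bucket is a placeholder.

import time
import boto3

athena = boto3.client('athena')

# Sketch: run one of the exploratory queries above programmatically.
query = athena.start_query_execution(
    QueryString="SELECT count(*) FROM nonphi WHERE discharge_disposition = 'Discharged to home'",
    QueryExecutionContext={'Database': 'readmissions'},
    ResultConfiguration={'OutputLocation': 's3://<bucket>/athena-results/'})

execution_id = query['QueryExecutionId']
while True:
    status = athena.get_query_execution(QueryExecutionId=execution_id)
    state = status['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(2)

if state == 'SUCCEEDED':
    rows = athena.get_query_results(QueryExecutionId=execution_id)['ResultSet']['Rows']
    print(rows)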

  3. Create an external table in Amazon Redshift Spectrum with the same data

Now that we are satisfied with the data quality and understand the structure of the data, we would like to integrate this with a data warehouse. We’ll use Amazon Redshift Spectrum to create external tables on the files in S3 and then integrate these external tables with a physical table in Amazon Redshift.

Amazon Redshift Spectrum allows you to run Amazon Redshift SQL queries against data on Amazon S3, extending the capabilities of your data warehouse beyond the physical Amazon Redshift clusters. You don’t need to do any elaborate ETL or move the data around. The data exists in one place in Amazon S3 and you interface with it using different services (Athena and Redshift Spectrum) to satisfy different requirements.

Before beginning, please look at this step-by-step guide to set up Redshift Spectrum.

After you have set up Amazon Redshift Spectrum, you can begin executing the steps below:

  1. Create an external schema called “readmissions.” Amazon Redshift Spectrum integrates with the AWS Glue Data Catalog and allows you to create Spectrum tables by referring to the catalog. This feature allows you to build the external table on the same data that you analyzed with Amazon Athena in the previous step without the need for ETL. This can be achieved by the following:
create external schema readmissions
from data catalog
database 'readmissions'
iam_role 'arn for your redshift spectrum role'
region 'region where the S3 data exists';

NOTE: Make sure you select the appropriate role ARN and region.

  2. Once the command executes successfully, you can confirm the schema was created by running the following:
select * from svv_external_schemas;

You should see a row similar to the one above with your corresponding region and role.

You can also see the external tables that were created by running the following command:

select * from SVV_EXTERNAL_TABLES;

  3. Let’s confirm we can see all the rows in the external table by counting the number of rows:
select count(*) from readmissions.phi;
select count(*) from readmissions.nonphi;

You should see 101,766 rows in both the tables, confirming that your external tables contain all the records that you read using the AWS Glue data crawler and analyzed using Athena.

  4. Now that we have all the external tables created, let’s create an aggregate fact table in the physical Redshift data warehouse. We can use the “As Select” clause of the Redshift create table query to do this:
create table readmissions_aggregate_fact as
select
readmission_result,admission_type,discharge_disposition,diabetesmed,
avg(time_in_hospital) as avg_time_in_hospital,
min(num_procedures) as min_procedures,
max(num_procedures) as max_procedures,
avg(num_procedures) as avg_num_procedures,
avg(num_medications) as avg_num_medications,
avg(number_outpatient) as avg_number_outpatient,
avg(number_emergency) as avg_number_emergency,
avg(number_inpatient) as avg_number_inpatient,
avg(number_diagnoses) as avg_number_diagnoses
from readmissions.nonphi
group by readmission_result,admission_type,discharge_disposition,diabetesmed

Once this query executes successfully, you can see a new table created in the physical public schema of your Amazon Redshift cluster. You can confirm this by executing the following query:

select distinct(tablename) from pg_table_def where schemaname = 'public'

  4. Build a QuickSight Dashboard from the aggregate fact

We can now create dashboards to visualize the data in our readmissions aggregate fact table using Amazon QuickSight. Here are some examples of reports you can generate using Amazon QuickSight on the readmission data.

For more details on Amazon QuickSight, refer to the service documentation.

  5. Build a ML model in Amazon SageMaker to predict readmissions

As a final step, we will create a ML model to predict the attribute readmission_result, which denotes if a patient was readmitted or not, using the non-PHI dataset.

  1. Create a notebook instance in Amazon SageMaker that is used to develop our code.
  2. The code reads non-PHI data from the Amazon S3 bucket as a data frame in Python. This is achieved using the pandas.read_csv function.
  3. Use the pandas.get_dummies function to encode categorical values into numeric values for use with the model.
  4. Split the data into two, 80% for training and 20% for testing, using the numpy.random.rand function.
  5. Form train_X, train_y and test_X, test_y corresponding to training features, training labels, testing features, and testing labels respectively.
  6. Use the Amazon SageMaker Linear Learner algorithm to train our model. The implementation of the algorithm uses dense tensor format to optimize the training job. Use the function write_numpy_to_dense_tensor from the Amazon SageMaker library to convert the numpy array into the dense tensor format.
  7. Create the training job in Amazon SageMaker with appropriate configurations and run it.
  8. Once the training job completes, create an endpoint in Amazon SageMaker to host our model, using the linear.deploy function to deploy the endpoint.
  9. Finally, run a prediction by invoking the endpoint using the linear_predictor.predict function.

You can view the complete notebook here.
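The full notebook covers everything end to end, but as a rough sketch of the data-preparation steps above (assuming the non-PHI data is a local CSV with a readmission_result column taking the values 'Yes' and 'No'; the dummy column names below are assumptions that depend on your data), the Python looks roughly like this.

import numpy as np
import pandas as pd

# Sketch of steps 2-5: load, one-hot encode, split, and separate features/labels.
df = pd.read_csv('nonphi.csv')
df = pd.get_dummies(df)                # encode categorical values as numerics

mask = np.random.rand(len(df)) < 0.8   # 80/20 train/test split
train, test = df[mask], df[~mask]

# After get_dummies, a 'Yes'/'No' readmission_result column becomes two indicator
# columns; the exact names depend on your dataset.
label_cols = ['readmission_result_Yes', 'readmission_result_No']
train_y = train['readmission_result_Yes'].values.astype('float32')
train_X = train.drop(label_cols, axis=1).values.astype('float32')
test_y = test['readmission_result_Yes'].values.astype('float32')
test_X = test.drop(label_cols, axis=1).values.astype('float32')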

Data, analytics, and ML are strategic assets to help you manage your patients, staff, equipment, and supplies more efficiently. These technologies can also help you be more proactive in treating and preventing disease. Industry luminaries share this opinion: “By leveraging big data and scientific advancements while maintaining the important doctor-patient bond, we believe we can create a health system that will go beyond curing disease after the fact to preventing disease before it strikes by focusing on health and wellness,” writes Lloyd B. Minor, MD, dean of the Stanford School of Medicine.

ML and analytics offer huge value in helping achieve the quadruple aim: improved patient satisfaction, improved population health, improved provider satisfaction, and reduced costs. Technology should never replace the clinician but instead become an extension of the clinician, allowing them to be more efficient by removing some of the mundane, repetitive tasks involved in prevention, diagnostics, and treatment of patients.

(1) “The Digital Future of Pathology.” The Medical Futurist, 28 May 2018, medicalfuturist.com/digital-future-pathology.

(2) Wang, Dayong, et al. “Deep Learning for Identifying Metastatic Breast Cancer.” Deep Learning for Identifying Metastatic Breast Cancer, 18 June 2016, arxiv.org/abs/1606.05718.

About the Author

Stephen Jepsen is a Global HCLS Practice Manager in AWS Professional Services.


Rock, paper, scissors, lizard, Spock, fire, water balloon!

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/rock-paper-scissors-lizard-spock-fire-water-balloon/

Use a Raspberry Pi and a Pi Camera Module to build your own machine learning–powered rock paper scissors game!

Rock-Paper-Scissors game using computer vision and machine learning on Raspberry Pi

A Rock-Paper-Scissors game using computer vision and machine learning on the Raspberry Pi. Project GitHub page: https://github.com/DrGFreeman/rps-cv PROJECT ORIGIN: This project results from a challenge my son gave me when I was teaching him the basics of computer programming making a simple text based Rock-Paper-Scissors game in Python.

Virtual rock paper scissors

Here’s why you should always leave comments on our blog: this project from Julien de la Bruère-Terreault instantly had our attention when he shared it on our recent Android Things post.

Julien and his son were building a text-based version of rock paper scissors in Python when his son asked him: “Could you make a rock paper scissors game that uses the camera to detect hand gestures?” Obviously, Julien really had no choice but to accept the challenge.

“The game uses a Raspberry Pi computer and Raspberry Pi Camera Module installed on a 3D-printed support with LED strips to achieve consistent images,” Julien explains in the tutorial for the build. “The pictures taken by the camera are processed and fed to an image classifier that determines whether the gesture corresponds to ‘Rock’, ‘Paper’, or ‘Scissors’ gestures.”

How does it work?

Physically, the build uses a Pi 3 Model B and a Camera Module V2 alongside 3D-printed parts. The parts are all green, since a consistent colour allows easy subtraction of background from the captured images. You can download the files for the setup from Thingiverse.

rock paper scissors raspberry pi

To illustrate how the software works, Julien has created a rather delightful pipeline demonstrating where computer vision and machine learning come in.

rock paper scissors using raspberry pi

The way the software works means the game doesn’t need to be limited to the standard three hand signs. If you wanted to, you could add other signs such as ‘lizard’ and ‘Spock’! Or ‘fire’ and ‘water balloon’. Or any other alterations made to the game in your pop culture favourites.

rock paper scissors lizard spock

Check out Julien’s full tutorial to build your own AI-powered rock paper scissors game here on Julien’s GitHub. Massive kudos to Julien for spending a year learning the skills required to make it happen. And a massive thank you to Julien’s son for inspiring him! This is why it’s great to do coding and digital making with kids — they have the best project ideas!

Sharing is caring

If you’ve built your own project using Raspberry Pi, please share it with us in the comments below, or via social media. As you can tell from today’s blog post, we love to see them and share them with the whole community!

The post Rock, paper, scissors, lizard, Spock, fire, water balloon! appeared first on Raspberry Pi.

Amazon SageMaker Adds Batch Transform Feature and Pipe Input Mode for TensorFlow Containers

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/sagemaker-nysummit2018/

At the New York Summit a few days ago we launched two new Amazon SageMaker features: a new batch inference feature called Batch Transform that allows customers to make predictions in non-real time scenarios across petabytes of data and Pipe Input Mode support for TensorFlow containers. SageMaker remains one of my favorite services and we’ve covered it extensively on this blog and the machine learning blog. In fact, the rapid pace of innovation from the SageMaker team is a bit hard to keep up with. Since our last post on SageMaker’s Automatic Model Tuning with Hyper Parameter Optimization, the team launched 4 new built-in algorithms and tons of new features. Let’s take a look at the new Batch Transform feature.

Batch Transform

The Batch Transform feature is a high-performance and high-throughput method for transforming data and generating inferences. It’s ideal for scenarios where you’re dealing with large batches of data, don’t need sub-second latency, or need to both preprocess and transform the training data. The best part? You don’t have to write a single additional line of code to make use of this feature. You can take all of your existing models and start batch transform jobs based on them. This feature is available at no additional charge and you pay only for the underlying resources.

Let’s take a look at how we would do this for the built-in Object Detection algorithm. I followed the example notebook to train my object detection model. Now I’ll go to the SageMaker console and open the Batch Transform sub-console.

From there I can start a new batch transform job.

Here I can name my transform job, select which of my models I want to use, and the number and type of instances to use. Additionally, I can configure the specifics around how many records to send to my inference concurrently and the size of the payload. If I don’t manually specify these then SageMaker will select some sensible defaults.

Next I need to specify my input location. I can either use a manifest file or just load all the files in an S3 location. Since I’m dealing with images here I’ve manually specified my input content-type.

Finally, I’ll configure my output location and start the job!

Once the job is running, I can open the job detail page and follow the links to the metrics and the logs in Amazon CloudWatch.

I can see the job is running and if I look at my results in S3 I can see the predicted labels for each image.

The transform generated one output JSON file per input file containing the detected objects.

From here it would be easy to create a table for the bucket in AWS Glue and either query the results with Amazon Athena or visualize them with Amazon QuickSight.

Of course it’s also possible to start these jobs programmatically from the SageMaker API.
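Here is a hedged boto3 sketch of what that could look like; the job name, model name, bucket paths, and instance type are placeholders.

import boto3

sm = boto3.client('sagemaker')

# Sketch: start a batch transform job against an existing SageMaker model.
sm.create_transform_job(
    TransformJobName='object-detection-batch-1',
    ModelName='my-object-detection-model',
    TransformInput={
        'DataSource': {'S3DataSource': {'S3DataType': 'S3Prefix',
                                        'S3Uri': 's3://<bucket>/images/'}},
        'ContentType': 'image/jpeg',
    },
    TransformOutput={'S3OutputPath': 's3://<bucket>/batch-output/'},
    TransformResources={'InstanceType': 'ml.p2.xlarge', 'InstanceCount': 1})

print(sm.describe_transform_job(
    TransformJobName='object-detection-batch-1')['TransformJobStatus'])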

You can find a lot more detail on how to use batch transforms in your own containers in the documentation.

Pipe Input Mode for Tensorflow

Pipe input mode allows customers to stream their training dataset directly from Amazon Simple Storage Service (S3) into Amazon SageMaker using a highly optimized multi-threaded background process. This mode offers significantly better read throughput than the File input mode that must first download the data to the local Amazon Elastic Block Store (EBS) volume. This means your training jobs start sooner, finish faster, and use less disk space, lowering the costs associated with training your models. It has the added benefit of letting you train on datasets beyond the 16 TB EBS volume size limit.

Earlier this year, we ran some experiments with Pipe Input Mode and found that startup times were reduced up to 87% on a 78 GB dataset, with throughput twice as fast in some benchmarks, ultimately resulting in up to a 35% reduction in total training time.

By adding support for Pipe Input Mode to TensorFlow we’re making it easier for customers to take advantage of the same increased speed available to the built-in algorithms. Let’s look at how this works in practice.

First, I need to make sure I have the sagemaker-tensorflow-extensions available for my training job. This gives us the new PipeModeDataset class which takes a channel and a record format as inputs and returns a TensorFlow dataset. We can use this in our input_fn for the TensorFlow estimator and read from the channel. The code sample below shows a simple example.

import tensorflow as tf

from sagemaker_tensorflow import PipeModeDataset

def input_fn(channel):
    # Simple example data - a labeled vector.
    features = {
        'data': tf.FixedLenFeature([], tf.string),
        'labels': tf.FixedLenFeature([], tf.int64),
    }
    
    # A function to parse record bytes to a labeled vector record
    def parse(record):
        parsed = tf.parse_single_example(record, features)
        return ({
            'data': tf.decode_raw(parsed['data'], tf.float64)
        }, parsed['labels'])

    # Construct a PipeModeDataset reading from a 'training' channel, using
    # the TF Record encoding.
    ds = PipeModeDataset(channel=channel, record_format='TFRecord')

    # The PipeModeDataset is a TensorFlow Dataset and provides standard Dataset methods
    ds = ds.repeat(20)
    ds = ds.prefetch(10)
    ds = ds.map(parse, num_parallel_calls=10)
    ds = ds.batch(64)
    
    return ds

Then you can define your model the same way you would for a normal TensorFlow estimator. When you create the estimator, you just need to pass in input_mode='Pipe' as one of the parameters.
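For example, with the SageMaker Python SDK of that era, enabling Pipe mode is roughly a one-parameter change. This is only a sketch: the entry point script, role ARN, instance type, and channel name are placeholders.

from sagemaker.tensorflow import TensorFlow

# Sketch: a TensorFlow estimator with Pipe input mode; train.py would contain
# the input_fn above plus the model definition.
estimator = TensorFlow(entry_point='train.py',
                       role='<your SageMaker execution role ARN>',
                       train_instance_count=1,
                       train_instance_type='ml.p3.2xlarge',
                       framework_version='1.8.0',
                       input_mode='Pipe')

estimator.fit({'training': 's3://<bucket>/train-data/'})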

Available Now

Both of these new features are available now at no additional charge, and I’m looking forward to seeing what customers can build with the batch transform feature. I can already tell you that it will help us with some of our internal ML workloads here in AWS Marketing.

As always, let us know what you think of this feature in the comments or on Twitter!

Randall

Amazon Translate Adds Support for Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-translate-adds-support-for-new-languages/

Today I’m excited to announce that Amazon Translate has added support for Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech! Amazon Translate is a translation API that delivers fast, high-quality, and affordable language translation. Translate was originally released in preview at AWS re:Invent in 2017 and my colleague, Tara, wrote about the service in depth.

Since the initial preview, we’ve continued to iterate on customer feedback by adding features like automatic source language inference via Amazon Comprehend, Amazon CloudWatch metrics, and support for larger pieces of text in each TranslateText call. In April, we made the service generally available and have continued to collect feature requests and feedback from customers.

Working with Amazon Translate

You can use the API explorer in the Amazon Translate Console to try out the new languages immediately.

You could also use any of the SDKs as well. I wrote a quick Python sample below.

import boto3
translate = boto3.client("translate")
lang_flag_pairs = [("ja", "🇯🇵"), ("ru", "🇷🇺"),
                   ("it", "🇮🇹"), ("zh-TW", "🇹🇼"),
                   ("tr", "🇹🇷"), ("cs", "🇨🇿")]
for lang, flag in lang_flag_pairs:
    print(flag)
    print(translate.translate_text(
        Text="Hello, World!",
        SourceLanguageCode="en",
        TargetLanguageCode=lang
    )['TranslatedText'])

Which gave me a cool result:

🇯🇵
ハローワールド!
🇷🇺
Привет, Мир!
🇮🇹
Ciao, Mondo!
🇹🇼
你好,世界杯!
🇹🇷
Merhaba, Dünya!
🇨🇿
Ahoj, světe!

You can check out the examples page in the documentation for more interesting ways to use Amazon Translate.

I had the chance to chat with a few of our customers about how they’re using Amazon Translate. I found out that they’re using it to enable a lot of innovative use cases and customers are building real-time chat translation in games, capturing and translating tweets for analytics, and as you can see on this blog – translating blog posts and creating spoken versions of them with Polly.

Additional Info

Amazon Translate is available in US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and AWS GovCloud (US). It has a useful monthly free tier of 2 million characters for the first 12 months, and $15 per million characters after that.

The team wanted me to let you know that they’re hard at work on some additional functionality and that they’re hoping, later this year, to add support for Danish, Dutch, Finnish, Hebrew, Polish, and Swedish.

As always, let us know if you have any feedback on this service in the comments or on Twitter!

Randall