Tag Archives: artificial intelligence

re:Invent 2019: Introducing the Amazon Builders’ Library (Part I)

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/reinvent-2019-introducing-the-amazon-builders-library-part-i/

Today, I’m going to tell you about a new site we launched at re:Invent, the Amazon Builders’ Library, a collection of living articles covering topics across architecture, software delivery, and operations. You get to peek under the hood of how Amazon architects, releases, and operates the software underpinning Amazon.com and AWS.

Want to know how Amazon.com does what it does? This is for you. In this two-part series (the next one coming December 23), I’ll highlight some of the best architecture articles written by Amazon’s senior technical leaders and engineers.

Avoiding insurmountable queue backlogs

In queueing theory, the behavior of queues when they are short is relatively uninteresting. After all, when a queue is short, everyone is happy. It’s only when the queue is backlogged, when the line to an event goes out the door and around the corner, that people start thinking about throughput and prioritization.

In this article, I discuss strategies we use at Amazon to deal with queue backlog scenarios – design approaches we take to drain queues quickly and to prioritize workloads. Most importantly, I describe how to prevent queue backlogs from building up in the first place. In the first half, I describe scenarios that lead to backlogs, and in the second half, I describe many approaches used at Amazon to avoid backlogs or deal with them gracefully.

Read the full article by David Yanacek – Principal Engineer

Timeouts, retries, and backoff with jitter

Whenever one service or system calls another, failures can happen. They can come from a variety of sources: servers, networks, load balancers, software, operating systems, or even mistakes from system operators. We design our systems to reduce the probability of failure, but it’s impossible to build systems that never fail. So at Amazon, we design our systems to tolerate failures and to avoid amplifying a small percentage of them into a complete outage. To build resilient systems, we employ three essential tools: timeouts, retries, and backoff.
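
To make the pattern concrete, here’s a minimal sketch in Python (my illustration, not code from the article) of retries with capped exponential backoff and full jitter:

import random
import time

def call_with_retries(operation, max_attempts=5, base_delay=0.1, max_delay=5.0):
    """Retry a zero-argument callable with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()  # the callable should enforce its own timeout
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts, propagate the failure
            # Exponential backoff, capped at max_delay, with full jitter so
            # that many clients don't retry in lockstep after an outage.
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))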

Read the full article by Marc Brooker, Senior Principal Engineer

Challenges with distributed systems

The moment we added our second server, distributed systems became the way of life at Amazon. When I started at Amazon in 1999, we had so few servers that we could give some of them recognizable names like “fishy” or “online-01”. However, even in 1999, distributed computing was not easy. Then as now, challenges with distributed systems involved latency, scaling, understanding networking APIs, marshalling and unmarshalling data, and the complexity of algorithms such as Paxos. As the systems quickly grew larger and more distributed, what had been theoretical edge cases turned into regular occurrences.

Developing distributed utility computing services, such as reliable long-distance telephone networks, or Amazon Web Services (AWS) services, is hard. Distributed computing is also weirder and less intuitive than other forms of computing because of two interrelated problems: independent failures and nondeterminism, which cause the most impactful issues in distributed systems. In addition to the typical computing failures most engineers are used to, failures in distributed systems can occur in many other ways. What’s worse, it’s not always possible to know whether something failed.

Read the full article by Jacob Gabrielson, Senior Principal Engineer

Static stability using Availability Zones

At Amazon, the services we build must meet extremely high availability targets. This means that we need to think carefully about the dependencies that our systems take. We design our systems to stay resilient even when those dependencies are impaired. In this article, we’ll define a pattern that we use called static stability to achieve this level of resilience. We’ll show you how we apply this concept to Availability Zones, a key infrastructure building block in AWS and therefore a bedrock dependency on which all of our services are built.

Read the full article by Becky Weiss, Senior Principal Engineer, and Mike Furr, Principal Engineer

Check back in two weeks for more expert articles on architecture that let you in on how Amazon does what it does.

Amazon SageMaker Studio: The First Fully Integrated Development Environment For Machine Learning

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-studio-the-first-fully-integrated-development-environment-for-machine-learning/

Today, we’re extremely happy to launch Amazon SageMaker Studio, the first fully integrated development environment (IDE) for machine learning (ML).

We have come a long way since we launched Amazon SageMaker in 2017, and it shows in the growing number of customers using the service. However, the ML development workflow is still very iterative, and is challenging for developers to manage due to the relative immaturity of ML tooling. Many of the tools that developers take for granted when building traditional software (debuggers, project management, collaboration, monitoring, and so forth) have yet to be invented for ML.

For example, when trying a new algorithm or tweaking hyperparameters, developers and data scientists typically run hundreds or even thousands of experiments on Amazon SageMaker, and they need to manage all of this manually. Over time, it becomes much harder to track the best-performing models, and to capitalize on lessons learned during the course of experimentation.

Amazon SageMaker Studio at last unifies all the tools needed for ML development. Developers can write code, track experiments, visualize data, and perform debugging and monitoring, all within a single, integrated visual interface, which significantly boosts developer productivity.

In addition, since all these steps of the ML workflow are tracked within the environment, developers can quickly move back and forth between steps, and also clone, tweak, and replay them. This gives developers the ability to make changes quickly, observe outcomes, and iterate faster, reducing the time to market for high quality ML solutions.

Introducing Amazon SageMaker Studio
Amazon SageMaker Studio lets you manage your entire ML workflow through a single pane of glass. Let me give you the whirlwind tour!

With Amazon SageMaker Notebooks (currently in preview), you can enjoy an enhanced notebook experience that lets you easily create and share Jupyter notebooks. Without having to manage any infrastructure, you can also quickly switch from one hardware configuration to another.

With Amazon SageMaker Experiments, you can organize, track and compare thousands of ML jobs: these can be training jobs, or data processing and model evaluation jobs run with Amazon SageMaker Processing.

With Amazon SageMaker Debugger, you can debug and analyze complex training issues, and receive alerts. It automatically introspects your models, collects debugging data, and analyzes it to provide real-time alerts and advice on ways to optimize your training times, and improve model quality. All information is visible as your models are training.

With Amazon SageMaker Model Monitor, you can detect quality deviations for deployed models, and receive alerts. You can easily visualize issues like data drift that could be affecting your models. No code needed: all it takes is a few clicks.

With Amazon SageMaker Autopilot, you can build models automatically with full control and visibility. Algorithm selection, data preprocessing, and model tuning are taken care of automatically, as is all of the underlying infrastructure.

Thanks to these new capabilities, Amazon SageMaker now covers the complete ML workflow to build, train, and deploy machine learning models, quickly and at any scale.

The services mentioned above, except for Amazon SageMaker Notebooks, are covered in individual blog posts (see below) showing you how to get started quickly, so keep your eyes peeled and read on!

Now Available!
Amazon SageMaker Studio is available today in US East (Ohio).

Give it a try, and please send us feedback either in the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

– Julien

Amazon SageMaker Debugger – Debug Your Machine Learning Models

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-debugger-debug-your-machine-learning-models/

Today, we’re extremely happy to announce Amazon SageMaker Debugger, a new capability of Amazon SageMaker that automatically identifies complex issues developing in machine learning (ML) training jobs.

Building and training ML models is a mix of science and craft (some would even say witchcraft). From collecting and preparing data sets to experimenting with different algorithms to figuring out optimal training parameters (the dreaded hyperparameters), ML practitioners need to clear quite a few hurdles to deliver high-performance models. This is the very reason why we built Amazon SageMaker: a modular, fully managed service that simplifies and speeds up ML workflows.

As I keep finding out, ML seems to be one of Mr. Murphy’s favorite hangouts, and everything that may possibly go wrong often does! In particular, many obscure issues can happen during the training process, preventing your model from correctly extracting and learning patterns present in your data set. I’m not talking about software bugs in ML libraries (although they do happen too): most failed training jobs are caused by an inappropriate initialization of parameters, a poor combination of hyperparameters, a design issue in your own code, etc.

To make things worse, these issues are rarely visible immediately: they grow over time, slowly but surely ruining your training process, and yielding low accuracy models. Let’s face it, even if you’re a bona fide expert, it’s devilishly difficult and time-consuming to identify them and hunt them down, which is why we built Amazon SageMaker Debugger.

Let me tell you more.

Introducing Amazon SageMaker Debugger
In your existing training code for TensorFlow, Keras, Apache MXNet, PyTorch and XGBoost, you can use the new SageMaker Debugger SDK to save internal model state at periodic intervals; as you can guess, it will be stored in Amazon Simple Storage Service (S3).

This state is composed of:

  • The parameters being learned by the model, e.g. weights and biases for neural networks,
  • The changes applied to these parameters by the optimizer, aka gradients,
  • The optimization parameters themselves,
  • Scalar values, e.g. accuracies and losses,
  • The output of each layer,
  • Etc.

Each specific set of values – say, the sequence of gradients flowing over time through a specific neural network layer – is saved independently, and referred to as a tensor. Tensors are organized in collections (weights, gradients, etc.), and you can decide which ones you want to save during training. Then, using the SageMaker SDK and its estimators, you configure your training job as usual, passing additional parameters defining the rules you want SageMaker Debugger to apply.

A rule is a piece of Python code that analyzes tensors for the model in training, looking for specific unwanted conditions. Pre-defined rules are available for common problems such as exploding/vanishing tensors (parameters reaching NaN or zero values), exploding/vanishing gradients, loss not changing, and more. Of course, you can also write your own rules.
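
As an illustration, a custom rule might look roughly like the sketch below; the exact smdebug base class and method names are assumptions on my part, so check the SDK documentation before using them.

from smdebug.rules.rule import Rule

class GradientTooLarge(Rule):
    # Illustrative custom rule: flag any step where a gradient value
    # exceeds a threshold. API details assumed from the smdebug SDK.
    def __init__(self, base_trial, threshold=1000.0):
        super().__init__(base_trial)
        self.threshold = float(threshold)

    def invoke_at_step(self, step):
        for tname in self.base_trial.tensor_names(collection='gradients'):
            gradient = self.base_trial.tensor(tname).value(step)
            if abs(gradient).max() > self.threshold:
                return True  # unwanted condition found, stop and report
        return False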

Once the SageMaker estimator is configured, you can launch the training job. Immediately, it fires up a debug job for each rule that you configured, and they start inspecting available tensors. If a debug job detects a problem, it stops and logs additional information. A CloudWatch Events event is also sent, should you want to trigger additional automated steps.
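
For instance, you could route those events to an AWS Lambda function with a rule along these lines (the event pattern and target ARN below are assumptions to adapt to your own account):

import json
import boto3

events = boto3.client('events')

# Match SageMaker training job state changes, which include debug rule results.
events.put_rule(
    Name='sagemaker-debugger-alerts',
    EventPattern=json.dumps({
        'source': ['aws.sagemaker'],
        'detail-type': ['SageMaker Training Job State Change']
    }))

# Hypothetical Lambda function that notifies the team or stops the job.
events.put_targets(
    Rule='sagemaker-debugger-alerts',
    Targets=[{'Id': '1',
              'Arn': 'arn:aws:lambda:us-west-2:123456789012:function:notify-team'}])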

So now you know that your deep learning job suffers from, say, vanishing gradients. With a little brainstorming and experience, you’ll know where to look: maybe the neural network is too deep? Maybe your learning rate is too small? As the internal state has been saved to S3, you can now use the SageMaker Debugger SDK to explore the evolution of tensors over time, confirm your hypothesis, and fix the root cause.

Let’s see SageMaker Debugger in action with a quick demo.

Debugging Machine Learning Models with Amazon SageMaker Debugger
At the core of SageMaker Debugger is the ability to capture tensors during training. This requires a little bit of instrumentation in your training code, in order to select the tensor collections you want to save, the frequency at which you want to save them, and whether you want to save the values themselves or a reduction (mean, max, etc.).

For this purpose, the SageMaker Debugger SDK provides simple APIs for each framework that it supports. Let me show you how this works with a simple TensorFlow script, trying to fit a two-dimensional linear regression model. Of course, you’ll find more examples in this GitHub repository.

Let’s take a look at the initial code:

import argparse
import numpy as np
import tensorflow as tf
import random

parser = argparse.ArgumentParser()
parser.add_argument('--model_dir', type=str, help="S3 path for the model")
parser.add_argument('--lr', type=float, help="Learning Rate", default=0.001)
parser.add_argument('--steps', type=int, help="Number of steps to run", default=100)
parser.add_argument('--scale', type=float, help="Scaling factor for inputs", default=1.0)

args = parser.parse_args()

with tf.name_scope('initialize'):
    # 2-dimensional input sample
    x = tf.placeholder(shape=(None, 2), dtype=tf.float32)
    # Initial weights: [10, 10]
    w = tf.Variable(initial_value=[[10.], [10.]], name='weight1')
    # True weights, i.e. the ones we're trying to learn
    w0 = [[1], [1.]]
with tf.name_scope('multiply'):
    # Compute true label
    y = tf.matmul(x, w0)
    # Compute "predicted" label
    y_hat = tf.matmul(x, w)
with tf.name_scope('loss'):
    # Compute loss
    loss = tf.reduce_mean((y_hat - y) ** 2, name="loss")

optimizer = tf.train.AdamOptimizer(args.lr)
optimizer_op = optimizer.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(args.steps):
        x_ = np.random.random((10, 2)) * args.scale
        _loss, opt = sess.run([loss, optimizer_op], {x: x_})
        print(f'Step={i}, Loss={_loss}')

Let’s train this script using the TensorFlow Estimator. I’m using SageMaker local mode, which is a great way to quickly iterate on experimental code.

import sagemaker
from sagemaker.tensorflow import TensorFlow

bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000}

estimator = TensorFlow(
    role=sagemaker.get_execution_role(),
    base_job_name='debugger-simple-demo',
    train_instance_count=1,
    train_instance_type='local',
    entry_point='script-v1.py',
    framework_version='1.13.1',
    py_version='py3',
    script_mode=True,
    hyperparameters=bad_hyperparameters)

estimator.fit()

Looking at the training log, things did not go well.

Step=0, Loss=7.883463958023267e+23
algo-1-hrvqg_1 | Step=1, Loss=9.502028841062608e+23
algo-1-hrvqg_1 | Step=2, Loss=nan
algo-1-hrvqg_1 | Step=3, Loss=nan
algo-1-hrvqg_1 | Step=4, Loss=nan
algo-1-hrvqg_1 | Step=5, Loss=nan
algo-1-hrvqg_1 | Step=6, Loss=nan
algo-1-hrvqg_1 | Step=7, Loss=nan
algo-1-hrvqg_1 | Step=8, Loss=nan
algo-1-hrvqg_1 | Step=9, Loss=nan

Loss does not decrease at all, and quickly becomes NaN… This looks like an exploding tensor problem, which is one of the built-in rules defined in SageMaker Debugger. Let’s get to work.

Using the Amazon SageMaker Debugger SDK
In order to capture tensors, I need to instrument the training script with:

  • A SaveConfig object specifying the frequency at which tensors should be saved,
  • A SessionHook object attached to the TensorFlow session, putting everything together and saving required tensors during training,
  • An (optional) ReductionConfig object, listing tensor reductions that should be saved instead of full tensors,
  • An (optional) optimizer wrapper to capture gradients.

Here’s the updated code, with extra command line arguments for SageMaker Debugger parameters.

import argparse
import numpy as np
import tensorflow as tf
import random
import smdebug.tensorflow as smd

parser = argparse.ArgumentParser()
parser.add_argument('--model_dir', type=str, help="S3 path for the model")
parser.add_argument('--lr', type=float, help="Learning Rate", default=0.001)
parser.add_argument('--steps', type=int, help="Number of steps to run", default=100)
parser.add_argument('--scale', type=float, help="Scaling factor for inputs", default=1.0)
parser.add_argument('--debug_path', type=str, default='/opt/ml/output/tensors')
parser.add_argument('--debug_frequency', type=int, help="How often to save tensor data", default=10)
feature_parser = parser.add_mutually_exclusive_group(required=False)
feature_parser.add_argument('--reductions', dest='reductions', action='store_true', help="save reductions of tensors instead of saving full tensors")
feature_parser.add_argument('--no_reductions', dest='reductions', action='store_false', help="save full tensors")
args = parser.parse_args()

reduc = smd.ReductionConfig(reductions=['mean'], abs_reductions=['max'], norms=['l1']) if args.reductions else None

hook = smd.SessionHook(out_dir=args.debug_path,
                       include_collections=['weights', 'gradients', 'losses'],
                       save_config=smd.SaveConfig(save_interval=args.debug_frequency),
                       reduction_config=reduc)

with tf.name_scope('initialize'):
    # 2-dimensional input sample
    x = tf.placeholder(shape=(None, 2), dtype=tf.float32)
    # Initial weights: [10, 10]
    w = tf.Variable(initial_value=[[10.], [10.]], name='weight1')
    # True weights, i.e. the ones we're trying to learn
    w0 = [[1], [1.]]
with tf.name_scope('multiply'):
    # Compute true label
    y = tf.matmul(x, w0)
    # Compute "predicted" label
    y_hat = tf.matmul(x, w)
with tf.name_scope('loss'):
    # Compute loss
    loss = tf.reduce_mean((y_hat - y) ** 2, name="loss")
    hook.add_to_collection('losses', loss)

optimizer = tf.train.AdamOptimizer(args.lr)
optimizer = hook.wrap_optimizer(optimizer)
optimizer_op = optimizer.minimize(loss)

hook.set_mode(smd.modes.TRAIN)

with tf.train.MonitoredSession(hooks=[hook]) as sess:
    for i in range(args.steps):
        x_ = np.random.random((10, 2)) * args.scale
        _loss, opt = sess.run([loss, optimizer_op], {x: x_})
        print(f'Step={i}, Loss={_loss}')

I also need to modify the TensorFlow Estimator, to use the SageMaker Debugger-enabled training container and to pass additional parameters.

bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000, 'debug_frequency': 1}

from sagemaker.debugger import Rule, rule_configs
estimator = TensorFlow(
    role=sagemaker.get_execution_role(),
    base_job_name='debugger-simple-demo',
    train_instance_count=1,
    train_instance_type='ml.c5.2xlarge',
    image_name=cpu_docker_image_name,
    entry_point='script-v2.py',
    framework_version='1.15',
    py_version='py3',
    script_mode=True,
    hyperparameters=bad_hyperparameters,
    rules=[Rule.sagemaker(rule_configs.exploding_tensor())]
)

estimator.fit()

2019-11-27 10:42:02 Starting - Starting the training job...
2019-11-27 10:42:25 Starting - Launching requested ML instances
********* Debugger Rule Status *********
*
* ExplodingTensor: InProgress 
*
****************************************

Two jobs are running: the actual training job, and a debug job checking for the rule defined in the Estimator. Quickly, the debug job fails!

Describing the training job, I can get more information on what happened.

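# 'client' is assumed to be a boto3 SageMaker client, e.g. client = boto3.client('sagemaker')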
description = client.describe_training_job(TrainingJobName=job_name)
print(description['DebugRuleEvaluationStatuses'][0]['RuleConfigurationName'])
print(description['DebugRuleEvaluationStatuses'][0]['RuleEvaluationStatus'])

ExplodingTensor
IssuesFound

Let’s take a look at the saved tensors.

Exploring Tensors
I can easily grab the tensors saved in S3 during the training process.

from smdebug.trials import create_trial

s3_output_path = description["DebugConfig"]["DebugHookConfig"]["S3OutputPath"]
trial = create_trial(s3_output_path)

Let’s list available tensors.

trial.tensors()

['loss/loss:0',
'gradients/multiply/MatMul_1_grad/tuple/control_dependency_1:0',
'initialize/weight1:0']

All values are numpy arrays, and I can easily iterate over them.

tensor = 'gradients/multiply/MatMul_1_grad/tuple/control_dependency_1:0'
for s in list(trial.tensor(tensor).steps()):
    print("Value: ", trial.tensor(tensor).step(s).value)

Value:  [[1.1508383e+23] [1.0809098e+23]]
Value:  [[1.0278440e+23] [1.1347468e+23]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]

As tensor names include the TensorFlow scope defined in the training code, I can easily see that something is wrong with my matrix multiplication.

# Compute true label
y = tf.matmul(x, w0)
# Compute "predicted" label
y_hat = tf.matmul(x, w)

Digging a little deeper, the x input is modified by a scaling parameter, which I set to 100000000000 in the Estimator. The learning rate doesn’t look sane either. Bingo!

x_ = np.random.random((10, 2)) * args.scale

bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000, 'debug_frequency': 1}

As you probably knew all along, setting these hyperparameters to more reasonable values will fix the training issue.

Now Available!
We believe Amazon SageMaker Debugger will help you find and solve training issues quicker, so it’s now your turn to go bug hunting.

Amazon SageMaker Debugger is available today in all commercial regions where Amazon SageMaker is available. Give it a try and please send us feedback, either on the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

– Julien


Amazon SageMaker Model Monitor – Fully Managed Automatic Monitoring For Your Machine Learning Models

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-model-monitor-fully-managed-automatic-monitoring-for-your-machine-learning-models/

Today, we’re extremely happy to announce Amazon SageMaker Model Monitor, a new capability of Amazon SageMaker that automatically monitors machine learning (ML) models in production, and alerts you when data quality issues appear.

The first thing I learned when I started working with data is that there is no such thing as paying too much attention to data quality. Raise your hand if you’ve spent hours hunting down problems caused by unexpected NULL values or by exotic character encodings that somehow ended up in one of your databases.

As models are literally built from large amounts of data, it’s easy to see why ML practitioners spend so much time caring for their data sets. In particular, they make sure that data samples in the training set (used to train the model) and in the validation set (used to measure its accuracy) have the same statistical properties.

There be monsters! Although you have full control over your experimental data sets, the same can’t be said for real-life data that your models will receive. Of course, that data will be unclean, but a more worrisome problem is “data drift”, i.e. a gradual shift in the very statistical nature of the data you receive. Minimum and maximum values, mean, variance, and more: all these are key attributes that shape assumptions and decisions made during the training of a model. Intuitively, you can surely feel that any significant change in these values would impact the accuracy of predictions: imagine a loan application predicting higher amounts because input features are drifting or even missing!

Detecting these conditions is pretty difficult: you would need to capture data received by your models, run all kinds of statistical analysis to compare that data to the training set, define rules to detect drift, send alerts if it happens… and do it all over again each time you update your models. Expert ML practitioners certainly know how to build these complex tools, but at the great expense of time and resources. Undifferentiated heavy lifting strikes again…
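
To give a flavor of what just one of those checks involves, here’s a toy sketch (mine, not part of the service) comparing a production sample of a single feature against its training distribution:

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
training_sample = rng.normal(loc=180.0, scale=54.0, size=2000)    # baseline feature values
production_sample = rng.normal(loc=210.0, scale=54.0, size=500)   # same feature, drifted mean

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests the samples
# no longer come from the same distribution, i.e. the feature has drifted.
statistic, p_value = stats.ks_2samp(training_sample, production_sample)
if p_value < 0.01:
    print(f'Drift suspected: KS statistic={statistic:.3f}, p-value={p_value:.3g}')

Now multiply that by every feature of every model, on a schedule, with reporting and alerting on top, and you can see the heavy lifting involved.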

To help all customers focus on creating value instead, we built Amazon SageMaker Model Monitor. Let me tell you more.

Introducing Amazon SageMaker Model Monitor
A typical monitoring session goes like this. You first start from a SageMaker endpoint to monitor, either an existing one, or a new one created specifically for monitoring purposes. You can use SageMaker Model Monitor on any endpoint, whether the model was trained with a built-in algorithm, a built-in framework, or your own container.

Using the SageMaker SDK, you can capture a configurable fraction of the data sent to the endpoint (you can also capture predictions if you’d like), and store it in one of your Amazon Simple Storage Service (S3) buckets. Captured data is enriched with metadata (content type, timestamp, etc.), and you can secure and access it just like any S3 object.

Then, you create a baseline from the data set that was used to train the model deployed on the endpoint (of course, you can reuse an existing baseline too). This will fire up an Amazon SageMaker Processing job where SageMaker Model Monitor will:

  • Infer a schema for the input data, i.e. type and completeness information for each feature. You should review it, and update it if needed.
  • For pre-built containers only, compute feature statistics using Deequ, an open source tool based on Apache Spark that is developed and used at Amazon (blog post and research paper). These statistics include KLL sketches, an advanced technique to compute accurate quantiles on streams of data, that we recently contributed to Deequ.

Using these artifacts, the next step is to launch a monitoring schedule, to let SageMaker Model Monitor inspect collected data and prediction quality. Whether you’re using a built-in or custom container, a number of built-in rules are applied, and reports are periodically pushed to S3. The reports contain statistics and schema information on the data received during the latest time frame, as well as any violation that was detected.

Last but not least, SageMaker Model Monitor emits per-feature metrics to Amazon CloudWatch, which you can use to set up dashboards and alerts. The summary metrics from CloudWatch are also visible in Amazon SageMaker Studio, and of course all statistics, monitoring results and data collected can be viewed and further analyzed in a notebook.
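
For example, a drift alarm on one of those metrics could be set up roughly like this sketch (the namespace, metric name, and ARNs below are assumptions; check the metrics actually emitted for your endpoint):

import boto3

cloudwatch = boto3.client('cloudwatch')

# Hypothetical namespace/metric: verify against what SageMaker Model Monitor
# actually emits for your endpoint and monitoring schedule.
cloudwatch.put_metric_alarm(
    AlarmName='feature-baseline-drift-Day-Mins',
    Namespace='aws/sagemaker/Endpoints/data-metrics',
    MetricName='feature_baseline_drift_Day_Mins',
    Dimensions=[
        {'Name': 'Endpoint', 'Value': 'DEMO-xgb-churn-pred-model-monitor'},
        {'Name': 'MonitoringSchedule', 'Value': 'DEMO-xgb-churn-monitor-schedule'},
    ],
    Statistic='Average',
    Period=3600,
    EvaluationPeriods=1,
    Threshold=0.2,
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-west-2:123456789012:model-monitor-alerts'])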

For more information and an example on how to use SageMaker Model Monitor using AWS CloudFormation, refer to the developer guide.

Now, let’s do a demo, using a churn prediction model trained with the built-in XGBoost algorithm.

Enabling Data Capture
The first step is to create an endpoint configuration to enable data capture. Here, I decide to capture 100% of incoming data, as well as model output (i.e. predictions). I’m also passing the content types for CSV and JSON data.

data_capture_configuration = {
    "EnableCapture": True,
    "InitialSamplingPercentage": 100,
    "DestinationS3Uri": s3_capture_upload_path,
    "CaptureOptions": [
        { "CaptureMode": "Output" },
        { "CaptureMode": "Input" }
    ],
    "CaptureContentTypeHeader": {
        "CsvContentTypes": ["text/csv"],
        "JsonContentTypes": ["application/json"]
    }
}

Next, I create the endpoint configuration with the usual CreateEndpointConfig API, passing the data capture configuration; creating the endpoint itself then works as usual.

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType':'ml.m5.xlarge',
        'InitialInstanceCount':1,
        'InitialVariantWeight':1,
        'ModelName':model_name,
        'VariantName':'AllTrafficVariant'
    }],
    DataCaptureConfig = data_capture_configuration)

On an existing endpoint, I would have used the UpdateEndpoint API to seamlessly update the endpoint configuration.
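
Something along these lines (a sketch reusing the names from the snippet above):

# Create a new endpoint configuration that adds data capture, then swap it
# in with UpdateEndpoint; traffic keeps flowing during the transition.
new_config_name = endpoint_config_name + '-with-capture'
sm_client.create_endpoint_config(
    EndpointConfigName=new_config_name,
    ProductionVariants=[{
        'InstanceType': 'ml.m5.xlarge',
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': model_name,
        'VariantName': 'AllTrafficVariant'
    }],
    DataCaptureConfig=data_capture_configuration)

sm_client.update_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=new_config_name)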

After invoking the endpoint repeatedly, I can see some captured data in S3 (output was edited for clarity).

$ aws s3 ls --recursive s3://sagemaker-us-west-2-123456789012/sagemaker/DEMO-ModelMonitor/datacapture/DEMO-xgb-churn-pred-model-monitor-2019-11-22-07-59-33/
AllTrafficVariant/2019/11/22/08/24-40-519-9a9273ca-09c2-45d3-96ab-fc7be2402d43.jsonl
AllTrafficVariant/2019/11/22/08/25-42-243-3e1c653b-8809-4a6b-9d51-69ada40bc809.jsonl

Here’s a line from one of these files.

{
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "132,25,113.2,96,269.9,107,229.1,87,7.1,7,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1",
            "encoding": "CSV"
        },
        "endpointOutput": {
            "observedContentType": "text/csv; charset=utf-8",
            "mode": "OUTPUT",
            "data": "0.01076381653547287",
            "encoding": "CSV"
        }
    },
    "eventMetadata": {
        "eventId": "6ece5c74-7497-43f1-a263-4833557ffd63",
        "inferenceTime": "2019-11-22T08:24:40Z"
    },
    "eventVersion": "0"
}

Pretty much what I expected. Now, let’s create a baseline for this model.

Creating A Monitoring Baseline
This is a very simple step: pass the location of the baseline data set, and the location where results should be stored.

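# processingjob_wrapper is assumed to be a helper module from the sample
# notebook, wrapping the lower-level CreateProcessingJob API.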
from processingjob_wrapper import ProcessingJob

processing_job = ProcessingJob(sm_client, role).create(
    job_name, baseline_data_uri, baseline_results_uri)

Once that job is complete, I can see two new objects in S3: one for statistics, and one for constraints.

aws s3 ls s3://sagemaker-us-west-2-123456789012/sagemaker/DEMO-ModelMonitor/baselining/results/
constraints.json
statistics.json

The constraints.json file tells me about the inferred schema for the training data set (don’t forget to check that it’s accurate). Each feature is typed, and I also get information on whether a feature is always present or not (1.0 means 100% here). Here are the first few lines.

{
  "version" : 0.0,
  "features" : [ {
    "name" : "Churn",
    "inferred_type" : "Integral",
    "completeness" : 1.0
  }, {
    "name" : "Account Length",
    "inferred_type" : "Integral",
    "completeness" : 1.0
  }, {
    "name" : "VMail Message",
    "inferred_type" : "Integral",
    "completeness" : 1.0
  }, {
    "name" : "Day Mins",
    "inferred_type" : "Fractional",
    "completeness" : 1.0
  }, {
    "name" : "Day Calls",
    "inferred_type" : "Integral",
    "completeness" : 1.0

At the end of that file, I can see configuration information for CloudWatch monitoring: turn it on or off, set the drift threshold, etc.

"monitoring_config" : {
    "evaluate_constraints" : "Enabled",
    "emit_metrics" : "Enabled",
    "distribution_constraints" : {
      "enable_comparisons" : true,
      "min_domain_mass" : 1.0,
      "comparison_threshold" : 1.0
    }
  }

The statistics.json file shows different statistics for each feature (mean, standard deviation, quantiles, etc.), as well as unique values received by the endpoint. Here’s an example.

"name" : "Day Mins",
    "inferred_type" : "Fractional",
    "numerical_statistics" : {
      "common" : {
        "num_present" : 2333,
        "num_missing" : 0
      },
      "mean" : 180.22648949849963,
      "sum" : 420468.3999999996,
      "std_dev" : 53.987178959901556,
      "min" : 0.0,
      "max" : 350.8,
      "distribution" : {
        "kll" : {
          "buckets" : [ {
            "lower_bound" : 0.0,
            "upper_bound" : 35.08,
            "count" : 14.0
          }, {
            "lower_bound" : 35.08,
            "upper_bound" : 70.16,
            "count" : 48.0
          }, {
            "lower_bound" : 70.16,
            "upper_bound" : 105.24000000000001,
            "count" : 130.0
          }, {
            "lower_bound" : 105.24000000000001,
            "upper_bound" : 140.32,
            "count" : 318.0
          }, {
            "lower_bound" : 140.32,
            "upper_bound" : 175.4,
            "count" : 565.0
          }, {
            "lower_bound" : 175.4,
            "upper_bound" : 210.48000000000002,
            "count" : 587.0
          }, {
            "lower_bound" : 210.48000000000002,
            "upper_bound" : 245.56,
            "count" : 423.0
          }, {
            "lower_bound" : 245.56,
            "upper_bound" : 280.64,
            "count" : 180.0
          }, {
            "lower_bound" : 280.64,
            "upper_bound" : 315.72,
            "count" : 58.0
          }, {
            "lower_bound" : 315.72,
            "upper_bound" : 350.8,
            "count" : 10.0
          } ],
          "sketch" : {
            "parameters" : {
              "c" : 0.64,
              "k" : 2048.0
            },
            "data" : [ [ 178.1, 160.3, 197.1, 105.2, 283.1, 113.6, 232.1, 212.7, 73.3, 176.9, 161.9, 128.6, 190.5, 223.2, 157.9, 173.1, 273.5, 275.8, 119.2, 174.6, 133.3, 145.0, 150.6, 220.2, 109.7, 155.4, 172.0, 235.6, 218.5, 92.7, 90.7, 162.3, 146.5, 210.1, 214.4, 194.4, 237.3, 255.9, 197.9, 200.2, 120, ...

Now, let’s start monitoring our endpoint.

Monitoring An Endpoint
Again, one API call is all that it takes: I simply create a monitoring schedule for my endpoint, passing the constraints and statistics file for the baseline data set. Optionally, I could also pass preprocessing and postprocessing functions, should I want to tweak data and predictions.

ms = MonitoringSchedule(sm_client, role)
schedule = ms.create(
   mon_schedule_name, 
   endpoint_name, 
   s3_report_path, 
   # record_preprocessor_source_uri=s3_code_preprocessor_uri, 
   # post_analytics_source_uri=s3_code_postprocessor_uri,
   baseline_statistics_uri=baseline_results_uri + '/statistics.json',
   baseline_constraints_uri=baseline_results_uri+ '/constraints.json'
)

Then, I start sending bogus data to the endpoint, i.e. samples constructed from random values, and I wait for SageMaker Model Monitor to start generating reports. The suspense is killing me!
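
Concretely, the traffic generator can be as simple as this sketch (the feature count is an assumption to match your model’s input width; endpoint_name is the endpoint created above):

import boto3
import numpy as np

runtime = boto3.client('sagemaker-runtime')
num_features = 69  # assumed width of the model's CSV input

# Send CSV rows of random values: statistically nothing like the training
# set, so the monitoring job should flag plenty of violations.
for _ in range(100):
    payload = ','.join(str(round(v, 1)) for v in np.random.uniform(0, 500, num_features))
    runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='text/csv',
        Body=payload)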

Inspecting Reports
Quickly, I see that reports are available in S3.

mon_executions = sm_client.list_monitoring_executions(MonitoringScheduleName=mon_schedule_name, MaxResults=3)
for execution_summary in mon_executions['MonitoringExecutionSummaries']:
    print("ProcessingJob: {}".format(execution_summary['ProcessingJobArn'].split('/')[1]))
    print('MonitoringExecutionStatus: {} \n'.format(execution_summary['MonitoringExecutionStatus']))

ProcessingJob: model-monitoring-201911221050-df2c7fc4
MonitoringExecutionStatus: Completed 

ProcessingJob: model-monitoring-201911221040-3a738dd7
MonitoringExecutionStatus: Completed 

ProcessingJob: model-monitoring-201911221030-83f15fb9
MonitoringExecutionStatus: Completed 

Let’s find the reports for one of these monitoring jobs.

desc_analytics_job_result=sm_client.describe_processing_job(ProcessingJobName=job_name)
report_uri=desc_analytics_job_result['ProcessingOutputConfig']['Outputs'][0]['S3Output']['S3Uri']
print('Report Uri: {}'.format(report_uri))

Report Uri: s3://sagemaker-us-west-2-123456789012/sagemaker/DEMO-ModelMonitor/reports/2019112208-2019112209

Ok, so what do we have here?

aws s3 ls s3://sagemaker-us-west-2-123456789012/sagemaker/DEMO-ModelMonitor/reports/2019112208-2019112209/

constraint_violations.json
constraints.json
statistics.json

As you would expect, the constraints.json and statistics.json files contain schema and statistics information on the data samples processed by the monitoring job. Let’s open the third one, constraint_violations.json, right away!

"violations" : [ {
    "feature_name" : "State_AL",
    "constraint_check_type" : "data_type_check",
    "description" : "Value: 0.8 does not meet the constraint requirement! "
  }, {
    "feature_name" : "Eve Mins",
    "constraint_check_type" : "baseline_drift_check",
    "description" : "Numerical distance: 0.2711598746081505 exceeds numerical threshold: 0"
  }, {
    "feature_name" : "CustServ Calls",
    "constraint_check_type" : "baseline_drift_check",
    "description" : "Numerical distance: 0.6470588235294117 exceeds numerical threshold: 0"
  } ]

Oops! It looks like I’ve been assigning floating point values to integer features: surely that’s not going to work too well!

Some features are also exhibiting drift, and that’s not good either. Maybe something is wrong with my data ingestion process, or maybe the distribution of data has actually changed, and I need to retrain the model. As all this information is available as CloudWatch metrics, I could define thresholds, set alarms, and even trigger new training jobs automatically.

Now Available!
As you can see, Amazon SageMaker Model Monitor is easy to set up, and helps you quickly know about quality issues in your ML models.

Now it’s your turn: you can start using Amazon SageMaker Model Monitor today in all commercial regions where Amazon SageMaker is available. This capability is also integrated in Amazon SageMaker Studio, our workbench for ML projects. Last but not least, all information can be viewed and further analyzed in a notebook.

Give it a try and please send us feedback, either on the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

– Julien

Amazon SageMaker Processing – Fully Managed Data Processing and Model Evaluation

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-processing-fully-managed-data-processing-and-model-evaluation/

Today, we’re extremely happy to launch Amazon SageMaker Processing, a new capability of Amazon SageMaker that lets you easily run your preprocessing, postprocessing and model evaluation workloads on fully managed infrastructure.

Training an accurate machine learning (ML) model requires many different steps, but arguably none is more important than preprocessing your data set, e.g.:

  • Converting the data set to the input format expected by the ML algorithm you’re using,
  • Transforming existing features to a more expressive representation, such as one-hot encoding categorical features,
  • Rescaling or normalizing numerical features,
  • Engineering high level features, e.g. replacing mailing addresses with GPS coordinates,
  • Cleaning and tokenizing text for natural language processing applications,
  • And more!

These tasks involve running bespoke scripts on your data set (beneath a moonless sky, I’m told), and saving the processed version for later use by your training jobs. As you can guess, running them manually or having to build and scale automation tools is not an exciting prospect for ML teams. The same could be said about postprocessing jobs (filtering, collating, etc.) and model evaluation jobs (scoring models against different test sets).

Solving this problem is why we built Amazon SageMaker Processing. Let me tell you more.

Introducing Amazon SageMaker Processing
Amazon SageMaker Processing introduces a new Python SDK that lets data scientists and ML engineers easily run preprocessing, postprocessing and model evaluation workloads on Amazon SageMaker.

This SDK uses SageMaker’s built-in container for scikit-learn, possibly the most popular library for data set transformation.

If you need something else, you also have the ability to use your own Docker images without having to conform to any Docker image specification: this gives you maximum flexibility in running any code you want, whether on SageMaker Processing, on AWS container services like Amazon ECS and Amazon Elastic Kubernetes Service, or even on premises.

How about a quick demo with scikit-learn? Then, I’ll briefly discuss using your own container. Of course, you’ll find complete examples on GitHub.

Preprocessing Data With The Built-In Scikit-Learn Container
Here’s how to use the SageMaker Processing SDK to run your scikit-learn jobs.

First, let’s create an SKLearnProcessor object, passing the scikit-learn version we want to use, as well as our managed infrastructure requirements.

from sagemaker.sklearn.processing import SKLearnProcessor
sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                     role=role,
                                     instance_count=1,
                                     instance_type='ml.m5.xlarge')

Then, we can run our preprocessing script (more on this fellow in a minute) like so:

  • The data set (dataset.csv) is automatically copied inside the container under the destination directory (/opt/ml/processing/input). We could add additional inputs if needed.
  • This is where the Python script (preprocessing.py) reads it. Optionally, we could pass command line arguments to the script.
  • It preprocesses it, splits it three ways, and saves the files inside the container under /opt/ml/processing/output/train, /opt/ml/processing/output/validation, and /opt/ml/processing/output/test.
  • Once the job completes, all outputs are automatically copied to your default SageMaker bucket in S3.

from sagemaker.processing import ProcessingInput, ProcessingOutput
sklearn_processor.run(
    code='preprocessing.py',
    # arguments = ['arg1', 'arg2'],
    inputs=[ProcessingInput(
        source='dataset.csv',
        destination='/opt/ml/processing/input')],
    outputs=[ProcessingOutput(source='/opt/ml/processing/output/train'),
        ProcessingOutput(source='/opt/ml/processing/output/validation'),
        ProcessingOutput(source='/opt/ml/processing/output/test')]
)

That’s it! Let’s put everything together by looking at the skeleton of the preprocessing script.

import os
import pandas as pd
from sklearn.model_selection import train_test_split
# Read data locally 
df = pd.read_csv('/opt/ml/processing/input/dataset.csv')
# Preprocess the data set
downsampled = apply_mad_data_science_skills(df)
# Split data set into training, validation, and test
train, test = train_test_split(downsampled, test_size=0.2)
train, validation = train_test_split(train, test_size=0.2)
# Create local output directories
try:
    os.makedirs('/opt/ml/processing/output/train')
    os.makedirs('/opt/ml/processing/output/validation')
    os.makedirs('/opt/ml/processing/output/test')
except:
    pass
# Save data locally
train.to_csv("/opt/ml/processing/output/train/train.csv")
validation.to_csv("/opt/ml/processing/output/validation/validation.csv")
test.to_csv("/opt/ml/processing/output/test/test.csv")
print('Finished running processing job')

A quick look at the S3 bucket confirms that the files have been successfully processed and saved. Now I could use them directly as input for a SageMaker training job.

$ aws s3 ls --recursive s3://sagemaker-us-west-2-123456789012/sagemaker-scikit-learn-2019-11-20-13-57-17-805/output
2019-11-20 15:03:22 19967 sagemaker-scikit-learn-2019-11-20-13-57-17-805/output/test.csv
2019-11-20 15:03:22 64998 sagemaker-scikit-learn-2019-11-20-13-57-17-805/output/train.csv
2019-11-20 15:03:22 18058 sagemaker-scikit-learn-2019-11-20-13-57-17-805/output/validation.csv

Now what about using your own container?

Processing Data With Your Own Container
Let’s say you’d like to preprocess text data with the popular spaCy library. Here’s how you could define a vanilla Docker container for it.

FROM python:3.7-slim-buster
# Install spaCy, pandas, and an English language model for spaCy.
RUN pip3 install spacy==2.2.2 && pip3 install pandas==0.25.3
RUN python3 -m spacy download en_core_web_md
# Make sure python doesn't buffer stdout so we get logs ASAP.
ENV PYTHONUNBUFFERED=TRUE
ENTRYPOINT ["python3"]

Then, you would build the Docker container, test it locally, and push it to Amazon Elastic Container Registry, our managed Docker registry service.

The next step would be to configure a processing job using the ScriptProcessor object, passing the name of the container you built and pushed.

from sagemaker.processing import ScriptProcessor
script_processor = ScriptProcessor(image_uri='123456789012.dkr.ecr.us-west-2.amazonaws.com/sagemaker-spacy-container:latest',
                role=role,
                instance_count=1,
                instance_type='ml.m5.xlarge')

Finally, you would run the job just like in the previous example.

script_processor.run(code='spacy_script.py',
    inputs=[ProcessingInput(
        source='dataset.csv',
        destination='/opt/ml/processing/input_data')],
    outputs=[ProcessingOutput(source='/opt/ml/processing/processed_data')],
    arguments=['tokenizer', 'lemmatizer', 'pos-tagger']
)

The rest of the process is exactly the same as above: copy the input(s) inside the container, copy the output(s) from the container to S3.

Pretty simple, don’t you think? Again, I focused on preprocessing, but you can run similar jobs for postprocessing and model evaluation. Don’t forget to check out the examples on GitHub.

Now Available!
Amazon SageMaker Processing is available today in all commercial regions where Amazon SageMaker is available.

Give it a try and please send us feedback, either on the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

Julien

Amazon SageMaker Autopilot – Automatically Create High-Quality Machine Learning Models With Full Control And Visibility

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-autopilot-fully-managed-automatic-machine-learning/

Today, we’re extremely happy to launch Amazon SageMaker Autopilot to automatically create the best classification and regression machine learning models, while allowing full control and visibility.

In 1959, Arthur Samuel defined machine learning as the ability for computers to learn without being explicitly programmed. In practice, this means finding an algorithm that can extract patterns from an existing data set, and use these patterns to build a predictive model that will generalize well to new data. Since then, lots of machine learning algorithms have been invented, giving scientists and engineers plenty of options to choose from, and helping them build amazing applications.

However, this abundance of algorithms also creates a difficulty: which one should you pick? How can you reliably figure out which one will perform best on your specific business problem? In addition, machine learning algorithms usually have a long list of training parameters (also called hyperparameters) that need to be set “just right” if you want to squeeze every bit of extra accuracy from your models. To make things worse, algorithms also require data to be prepared and transformed in specific ways (aka feature engineering) for optimal learning… and you need to pick the best instance type.

If you think this sounds like a lot of experimental, trial-and-error work, you’re absolutely right. Machine learning is definitely a mix of hard science and cooking recipes, making it difficult for non-experts to get good results quickly.

What if you could rely on a fully managed service to solve that problem for you? Call an API and get the job done? Enter Amazon SageMaker Autopilot.

Introducing Amazon SageMaker Autopilot
Using a single API call, or a few clicks in Amazon SageMaker Studio, SageMaker Autopilot first inspects your data set, and runs a number of candidates to figure out the optimal combination of data preprocessing steps, machine learning algorithms and hyperparameters. Then, it uses this combination to train an Inference Pipeline, which you can easily deploy either on a real-time endpoint or for batch processing. As usual with Amazon SageMaker, all of this takes place on fully-managed infrastructure.

Last but not least, SageMaker Autopilot also generates Python code showing you exactly how data was preprocessed: not only can you understand what SageMaker Autopilot did, you can also reuse that code for further manual tuning if you’re so inclined.

As of today, SageMaker Autopilot supports:

  • Input data in tabular format, with automatic data cleaning and preprocessing,
  • Automatic algorithm selection for linear regression, binary classification, and multi-class classification,
  • Automatic hyperparameter optimization,
  • Distributed training,
  • Automatic instance and cluster size selection.

Let me show you how simple this is.

Using AutoML with Amazon SageMaker Autopilot
Let’s use this sample notebook as a starting point: it builds a binary classification model predicting if customers will accept or decline a marketing offer. Please take a few minutes to read it: as you will see, the business problem itself is easy to understand, and the data set is neither large nor complicated. Yet, several non-intuitive preprocessing steps are required, and there’s also the delicate matter of picking an algorithm and its parameters… SageMaker Autopilot to the rescue!

First, I grab a copy of the data set, and take a quick look at the first few lines.

Then, I upload it in Amazon Simple Storage Service (S3) without any preprocessing whatsoever.

sess.upload_data(path="automl-train.csv", key_prefix=prefix + "/input")

's3://sagemaker-us-west-2-123456789012/sagemaker/DEMO-automl-dm/input/automl-train.csv'

Now, let’s configure the AutoML job:

  • Set the location of the data set,
  • Select the target attribute that I want the model to predict: in this case, it’s the ‘y’ column showing if a customer accepted the offer or not,
  • Set the location of training artifacts.

input_data_config = [{
      'DataSource': {
        'S3DataSource': {
          'S3DataType': 'S3Prefix',
          'S3Uri': 's3://{}/{}/input'.format(bucket,prefix)
        }
      },
      'TargetAttributeName': 'y'
    }
  ]

output_data_config = {
    'S3OutputPath': 's3://{}/{}/output'.format(bucket,prefix)
  }

That’s it! Of course, SageMaker Autopilot has a number of options that will come in handy as you learn more about your data and your models, e.g.:

  • Set the type of problem you want to train on: linear regression, binary classification, or multi-class classification. If you’re not sure, SageMaker Autopilot will figure it out automatically by analyzing the values of the target attribute.
  • Use a specific metric for model evaluation.
  • Define completion criteria: maximum running time, etc.

One thing I don’t have to do is size the training cluster, as SageMaker Autopilot uses a heuristic based on data size and algorithm. Pretty cool!

With configuration out of the way, I can fire up the job with the CreateAutoMLJob API.

auto_ml_job_name = 'automl-dm-' + timestamp_suffix
print('AutoMLJobName: ' + auto_ml_job_name)

sm.create_auto_ml_job(AutoMLJobName=auto_ml_job_name,
                      InputDataConfig=input_data_config,
                      OutputDataConfig=output_data_config,
                      RoleArn=role)

AutoMLJobName: automl-dm-28-10-17-49
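
Had I wanted to use the options mentioned earlier, I would simply have passed extra parameters to the same API; here’s a sketch (the completion criteria fields come from the CreateAutoMLJob API, and the values are illustrative):

sm.create_auto_ml_job(
    AutoMLJobName=auto_ml_job_name,
    InputDataConfig=input_data_config,
    OutputDataConfig=output_data_config,
    ProblemType='BinaryClassification',          # skip automatic problem detection
    AutoMLJobObjective={'MetricName': 'F1'},     # metric used to rank candidates
    AutoMLJobConfig={'CompletionCriteria': {
        'MaxCandidates': 50,
        'MaxRuntimePerTrainingJobInSeconds': 3600,
        'MaxAutoMLJobRuntimeInSeconds': 36000}},
    RoleArn=role)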

A job runs in four steps (you can use the DescribeAutoMLJob API to view them).

  1. Splitting the data set into train and validation sets,
  2. Analyzing data, in order to recommend pipelines that should be tried out on the data set,
  3. Feature engineering, where transformations are applied to the data set and to individual features,
  4. Pipeline selection and hyperparameter tuning, where the top performing pipeline is selected along with the optimal hyperparameters for the training algorithm.

Once the maximum number of candidates – or one of the stopping conditions – has been reached, the job is complete. I can get detailed information on all candidates using the ListCandidatesForAutoMLJob API, and also view them in the AWS console.

candidates = sm.list_candidates_for_auto_ml_job(AutoMLJobName=auto_ml_job_name, SortBy='FinalObjectiveMetricValue')['Candidates']
index = 1
for candidate in candidates:
  print(str(index) + "  " + candidate['CandidateName'] + "  " + str(candidate['FinalAutoMLJobObjectiveMetric']['Value']))
  index += 1

1 automl-dm-28-tuning-job-1-fabb8-001-f3b6dead 0.9186699986457825
2 automl-dm-28-tuning-job-1-fabb8-004-03a1ff8a 0.918304979801178
3 automl-dm-28-tuning-job-1-fabb8-003-c443509a 0.9181839823722839
4 automl-dm-28-tuning-job-1-ed07c-006-96f31fde 0.9158779978752136
5 automl-dm-28-tuning-job-1-ed07c-004-da2d99af 0.9130859971046448
6 automl-dm-28-tuning-job-1-ed07c-005-1e90fd67 0.9130859971046448
7 automl-dm-28-tuning-job-1-ed07c-008-4350b4fa 0.9119930267333984
8 automl-dm-28-tuning-job-1-ed07c-007-dae75982 0.9119930267333984
9 automl-dm-28-tuning-job-1-ed07c-009-c512379e 0.9119930267333984
10 automl-dm-28-tuning-job-1-ed07c-010-d905669f 0.8873512744903564

For now, I’m only interested in the best trial: 91.87% validation accuracy. Let’s deploy it to a SageMaker endpoint, just like we would deploy any model:

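# best_candidate is assumed to be taken from the sorted list above, e.g.
# best_candidate = candidates[0]; model_name, epc_name, etc. are defined elsewhere.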
model_arn = sm.create_model(Containers=best_candidate['InferenceContainers'],
                            ModelName=model_name,
                            ExecutionRoleArn=role)

ep_config = sm.create_endpoint_config(EndpointConfigName = epc_name,
                                      ProductionVariants=[{'InstanceType':'ml.m5.2xlarge',
                                                           'InitialInstanceCount':1,
                                                           'ModelName':model_name,
                                                           'VariantName':variant_name}])

create_endpoint_response = sm.create_endpoint(EndpointName=ep_name,
                                              EndpointConfigName=epc_name)

After a few minutes, the endpoint is live, and I can use it for prediction. SageMaker business as usual!

Now, I bet you’re curious about how the model was built, and what the other candidates are. Let me show you.

Full Visibility And Control with Amazon SageMaker Autopilot
SageMaker Autopilot stores training artifacts in S3, including two auto-generated notebooks!

job = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)
job_data_notebook = job['AutoMLJobArtifacts']['DataExplorationNotebookLocation']
job_candidate_notebook = job['AutoMLJobArtifacts']['CandidateDefinitionNotebookLocation']

print(job_data_notebook)
print(job_candidate_notebook)

s3://<PREFIX_REMOVED>/notebooks/SageMakerAutopilotCandidateDefinitionNotebook.ipynb
s3://<PREFIX_REMOVED>/notebooks/SageMakerAutopilotDataExplorationNotebook.ipynb

The first one contains information about the data set.

The second one contains full details on the SageMaker Autopilot job: candidates, data preprocessing steps, etc. All code is available, as well as ‘knobs’ you can change for further experimentation.

As you can see, you have full control and visibility on how models are built.

Now Available!
I’m very excited about Amazon SageMaker Autopilot, because it’s making machine learning simpler and more accessible than ever. Whether you’re just beginning with machine learning, or whether you’re a seasoned practitioner, SageMaker Autopilot will help you build better models quicker, either through the SageMaker APIs or from Amazon SageMaker Studio.

Now it’s your turn. You can start using SageMaker Autopilot today in the following regions:

  • US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon),
  • Canada (Central), South America (São Paulo),
  • Europe (Ireland), Europe (London), Europe (Paris), Europe (Frankfurt),
  • Middle East (Bahrain),
  • Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo).

Please send us feedback, either on the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

Julien

Amazon SageMaker Experiments – Organize, Track And Compare Your Machine Learning Trainings

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-experiments-organize-track-and-compare-your-machine-learning-trainings/

Today, we’re extremely happy to announce Amazon SageMaker Experiments, a new capability of Amazon SageMaker that lets you organize, track, compare and evaluate machine learning (ML) experiments and model versions.

ML is a highly iterative process. During the course of a single project, data scientists and ML engineers routinely train thousands of different models in search of maximum accuracy. Indeed, the number of combinations for algorithms, data sets, and training parameters (aka hyperparameters) is infinite… and therein lies the proverbial challenge of finding a needle in a haystack.

Tools like Automatic Model Tuning and Amazon SageMaker Autopilot help ML practitioners explore a large number of combinations automatically, and quickly zoom in on high-performance models. However, they further add to the explosive growth of training jobs. Over time, this creates a new difficulty for ML teams, as it becomes near-impossible to efficiently deal with hundreds of thousands of jobs: keeping track of metrics, grouping jobs by experiment, comparing jobs in the same experiment or across experiments, querying past jobs, etc.

Of course, this can be solved by building, managing and scaling bespoke tools: however, doing so diverts valuable time and resources away from actual ML work. In the spirit of helping customers focus on ML and nothing else, we couldn’t leave this problem unsolved.

Introducing Amazon SageMaker Experiments
First, let’s define core concepts:

  • A trial is a collection of training steps involved in a single training job. Training steps typically include preprocessing, training, model evaluation, etc. A trial is also enriched with metadata for inputs (e.g. algorithm, parameters, data sets) and outputs (e.g. models, checkpoints, metrics).
  • An experiment is simply a collection of trials, i.e. a group of related training jobs.

The goal of SageMaker Experiments is to make it as simple as possible to create experiments, populate them with trials, and run analytics across trials and experiments. For this purpose, we introduce a new Python SDK containing logging and analytics APIs.

When running your training jobs on SageMaker or SageMaker Autopilot, all you have to do is pass an extra parameter to the Estimator, defining the name of the experiment that the trial should be attached to. All inputs and outputs will be logged automatically.

Once you’ve run your training jobs, the SageMaker Experiments SDK lets you load experiment and trial data in the popular pandas dataframe format. Pandas truly is the Swiss army knife of ML practitioners, and you’ll be able to perform any analysis that you may need. Go one step further by building cool visualizations with matplotlib, and you’ll be well on your way to taming that wild horde of training jobs!

As you would expect, SageMaker Experiments is nicely integrated in Amazon SageMaker Studio. You can run complex queries to quickly find the past trial you’re looking for. You can also visualize real-time model leaderboards and metric charts.

How about a quick demo?

Logging Training Information With Amazon SageMaker Experiments
Let’s start from a PyTorch script classifying images from the MNIST data set, using a simple two-layer convolution neural network (CNN). If I wanted to run a single job on SageMaker, I could use the PyTorch estimator like so:

from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point='mnist.py',
    role=role,
    sagemaker_session=sess,
    framework_version='1.1.0',
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge')

estimator.fit(inputs={'training': inputs})

Instead, let’s say that I want to run multiple versions of the same script, changing only one of the hyperparameters (the number of convolution filters used by the two convolution layers, aka number of hidden channels) to measure its impact on model accuracy. Of course, we could run these jobs, grab the training logs, extract metrics with fancy text filtering, etc. Or we could use SageMaker Experiments!

All I need to do is:

  • Set up an experiment,
  • Use a tracker to log experiment metadata,
  • Create a trial for each training job I want to run,
  • Run each training job, passing parameters for the experiment name and the trial name.

First things first, let’s take care of the experiment.

from smexperiments.experiment import Experiment
mnist_experiment = Experiment.create(
    experiment_name="mnist-hand-written-digits-classification", 
    description="Classification of mnist hand-written digits", 
    sagemaker_boto_client=sm)

Then, let’s add a few things that we want to keep track of, like the location of the data set and normalization values we applied to it.

from smexperiments.tracker import Tracker
with Tracker.create(display_name="Preprocessing", sagemaker_boto_client=sm) as tracker:
     tracker.log_input(name="mnist-dataset", media_type="s3/uri", value=inputs)
     tracker.log_parameters({
        "normalization_mean": 0.1307,
        "normalization_std": 0.3081,
    })

Now let’s run a few jobs. I simply loop over the different values that I want to try, creating a new trial for each training job and adding the tracker information to it.

import time
from smexperiments.trial import Trial

for num_hidden_channels in [2, 5, 10, 20, 32]:
    trial_name = f"cnn-training-job-{num_hidden_channels}-hidden-channels-{int(time.time())}"
    cnn_trial = Trial.create(
        trial_name=trial_name,
        experiment_name=mnist_experiment.experiment_name,
        sagemaker_boto_client=sm,
    )
    cnn_trial.add_trial_component(tracker.trial_component)

Then, I configure the estimator, passing the value for the hyperparameter I’m interested in, and leaving the other ones as is. I’m also passing regular expressions to extract metrics from the training log. All of these will be stored in the trial: in fact, all parameters (passed or default) will be.

    estimator = PyTorch(
        entry_point='mnist.py',
        role=role,
        sagemaker_session=sess,
        framework_version='1.1.0',
        train_instance_count=1,
        train_instance_type='ml.p3.2xlarge',
        hyperparameters={
            'hidden_channels': num_hidden_channels
        },
        metric_definitions=[
            {'Name':'train:loss', 'Regex':'Train Loss: (.*?);'},
            {'Name':'test:loss', 'Regex':'Test Average loss: (.*?),'},
            {'Name':'test:accuracy', 'Regex':'Test Accuracy: (.*?)%;'}
        ]
    )

Finally, I run the training job, associating it to the experiment and the trial.

    cnn_training_job_name = "cnn-training-job-{}".format(int(time.time()))
    
    estimator.fit(
        inputs={'training': inputs}, 
        job_name=cnn_training_job_name,
        experiment_config={
            "ExperimentName": mnist_experiment.experiment_name, 
            "TrialName": cnn_trial.trial_name,
            "TrialComponentDisplayName": "Training",
        }
    )
# end of loop

Once all jobs are complete, I can run analytics. Let’s find out how we did.

Analytics with Amazon SageMaker Experiments
All information on an experiment can be easily exported to a Pandas DataFrame.

from sagemaker.analytics import ExperimentAnalytics
trial_component_analytics = ExperimentAnalytics(
    sagemaker_session=sess, 
    experiment_name=mnist_experiment.experiment_name
)
analytic_table = trial_component_analytics.dataframe()

If I want to drill down, I can specify additional parameters, e.g.:

trial_component_analytics = ExperimentAnalytics(
    sagemaker_session=sess, 
    experiment_name=mnist_experiment.experiment_name,
    sort_by="metrics.test:accuracy.max",
    sort_order="Descending",
    metric_names=['test:accuracy'],
    parameter_names=['hidden_channels', 'epochs', 'dropout', 'optimizer']
)
analytic_table = trial_component_analytics.dataframe()

This builds a DataFrame where trials are sorted by decreasing test accuracy, showing only some of the hyperparameters for each trial.

for col in analytic_table.columns: 
    print(col) 

TrialComponentName
DisplayName
SourceArn
dropout
epochs
hidden_channels
optimizer
test:accuracy - Min
test:accuracy - Max
test:accuracy - Avg
test:accuracy - StdDev
test:accuracy - Last
test:accuracy - Count

From here on, your imagination is the limit. Pandas is the Swiss army knife of data analysis, and you’ll be able to compare trials and experiments in every possible way.
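For example, here’s a minimal matplotlib sketch plotting the impact of the hyperparameter we varied, assuming the drill-down DataFrame and column names shown above:

import matplotlib.pyplot as plt

# Sort trials by the number of hidden channels, then plot its effect on the best test accuracy.
df = analytic_table.sort_values('hidden_channels')
plt.plot(df['hidden_channels'], df['test:accuracy - Max'], marker='o')
plt.xlabel('hidden channels')
plt.ylabel('best test accuracy (%)')
plt.show()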

Last but not least, thanks to the integration with Amazon SageMaker Studio, you’ll be able to visualize all this information in real-time with predefined widgets. To learn more about Amazon SageMaker Studio, visit this blog post.

Now Available!
I just scratched the surface of what you can do with Amazon SageMaker Experiments, and I believe it will help you tame the wild horde of jobs that you have to deal with every day.

The service is available today in all commercial AWS Regions where Amazon SageMaker is available.

Give it a try and please send us feedback, either in the AWS forum for Amazon SageMaker, or through your usual AWS contacts.

– Julien

 

Now Available on Amazon SageMaker: The Deep Graph Library

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/now-available-on-amazon-sagemaker-the-deep-graph-library/

Today, we’re happy to announce that the Deep Graph Library, an open source library built for easy implementation of graph neural networks, is now available on Amazon SageMaker.

In recent years, deep learning has taken the world by storm thanks to its uncanny ability to extract elaborate patterns from complex data, such as free-form text, images, or videos. However, lots of datasets don’t fit these categories and are better expressed with graphs. Intuitively, we can feel that traditional neural network architectures like convolution neural networks or recurrent neural networks are not a good fit for such datasets, and a new approach is required.

A Primer On Graph Neural Networks
Graph neural networks (GNNs) are one of the most exciting developments in machine learning today, and these reference papers will get you started.

GNNs are used to train predictive models on datasets such as:

  • Social networks, where graphs show connections between related people,
  • Recommender systems, where graphs show interactions between customers and items,
  • Chemical analysis, where compounds are modeled as graphs of atoms and bonds,
  • Cybersecurity, where graphs describe connections between source and destination IP addresses,
  • And more!

Most of the time, these datasets are extremely large and only partially labeled. Consider a fraud detection scenario where we would try to predict the likelihood that an individual is a fraudulent actor by analyzing their connections to known fraudsters. This problem could be defined as a semi-supervised learning task, where only a fraction of graph nodes would be labeled (‘fraudster’ or ‘legitimate’). This should be a better solution than trying to build a large hand-labeled dataset, and “linearizing” it to apply traditional machine learning algorithms.

Working on these problems requires domain knowledge (retail, finance, chemistry, etc.), computer science knowledge (Python, deep learning, open source tools), and infrastructure knowledge (training, deploying, and scaling models). Very few people master all these skills, which is why tools like the Deep Graph Library and Amazon SageMaker are needed.

Introducing The Deep Graph Library
First released on GitHub in December 2018, the Deep Graph Library (DGL) is an open source Python library that helps researchers and scientists quickly build, train, and evaluate GNNs on their datasets.

DGL is built on top of popular deep learning frameworks like PyTorch and Apache MXNet. If you know either one of these, you’ll find yourself quite at home. No matter which framework you use, you can get started easily thanks to these beginner-friendly examples. I also found the slides and code for the GTC 2019 workshop very useful.

Once you’re done with toy examples, you can start exploring the collection of cutting edge models already implemented in DGL. For example, you can train a document classification model using a Graph Convolution Network (GCN) and the CORA dataset by simply running:

$ python3 train.py --dataset cora --gpu 0 --self-loop

The code for all models is available for inspection and tweaking. These implementations have been carefully validated by AWS teams, who verified performance claims and made sure results could be reproduced.

DGL also includes a collection of graph datasets that you can easily download and experiment with.

Of course, you can install and run DGL locally, but to make your life simpler, we added it to the Deep Learning Containers for PyTorch and Apache MXNet. This makes it easy to use DGL on Amazon SageMaker, in order to train and deploy models at any scale, without having to manage a single server. Let me show you how.

Using DGL On Amazon SageMaker
We added complete examples in the GitHub repository for SageMaker examples: one of them trains a simple GNN for molecular toxicity prediction using the Tox21 dataset.

The problem we’re trying to solve is figuring out the potential toxicity of new chemical compounds with respect to 12 different targets (receptors inside biological cells, etc.). As you can imagine, this type of analysis is crucial when designing new drugs, and being able to quickly predict results without having to run in vitro experiments helps researchers focus their efforts on the most promising drug candidates.

The dataset contains a little over 8,000 compounds: each one is modeled as a graph (atoms are vertices, atomic bonds are edges), and labeled 12 times (one label per target). Using a GNN, we’re going to build a multi-label binary classification model, allowing us to predict the potential toxicity of candidate molecules.

In the training script, we can easily download the dataset from the DGL collection.

from dgl.data.chem import Tox21
dataset = Tox21()

Similarly, we can easily build a GNN classifier using the DGL model zoo.

from dgl import model_zoo
model = model_zoo.chem.GCNClassifier(
    in_feats=args['n_input'],
    gcn_hidden_feats=[args['n_hidden'] for _ in range(args['n_layers'])],
    n_tasks=dataset.n_tasks,
    classifier_hidden_feats=args['n_hidden']).to(args['device'])

The rest of the code is mostly vanilla PyTorch, and you should be able to find your bearings if you’re familiar with this library.
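To give you an idea, here’s a deliberately simplified, self-contained sketch of such a training loop for multi-label binary classification. The stand-in model and random tensors below replace the real GCN classifier and Tox21 graphs, and all sizes are illustrative only.

import torch
import torch.nn as nn

# Stand-in for the GCN classifier: any module producing one logit per task trains the same way.
model = nn.Sequential(nn.Linear(74, 64), nn.ReLU(), nn.Linear(64, 12))
loss_fn = nn.BCEWithLogitsLoss()  # one binary decision per toxicity target
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch standing in for featurized molecules and their 12 labels.
features = torch.randn(32, 74)
labels = torch.randint(0, 2, (32, 12)).float()

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    print(f'epoch {epoch}, loss {loss.item():.4f}')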

When it comes to running this code on Amazon SageMaker, all we have to do is use a SageMaker Estimator, passing the full name of our DGL container, and the name of the training script as a hyperparameter.

estimator = sagemaker.estimator.Estimator(container,
                                          role,
                                          train_instance_count=1,
                                          train_instance_type='ml.p3.2xlarge',
                                          hyperparameters={'entrypoint': 'main.py'},
                                          sagemaker_session=sess)
code_location = sess.upload_data(CODE_PATH,
                                 bucket=bucket,
                                 key_prefix=custom_code_upload_location)
estimator.fit({'training-code': code_location})

<output removed>
epoch 23/100, batch 48/49, loss 0.4684

epoch 23/100, batch 49/49, loss 0.5389
epoch 23/100, training roc-auc 0.9451
EarlyStopping counter: 10 out of 10
epoch 23/100, validation roc-auc 0.8375, best validation roc-auc 0.8495
Best validation score 0.8495
Test score 0.8273
2019-11-21 14:11:03 Uploading - Uploading generated training model
2019-11-21 14:11:03 Completed - Training job completed
Training seconds: 209
Billable seconds: 209

Now, we could grab the trained model in S3, and use it to predict toxicity for a large number of compounds, without having to run actual experiments. Fascinating stuff!

Now Available!
You can start using DGL on Amazon SageMaker today.

Give it a try, and please send us feedback in the DGL forum, in the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

Julien

 

AWS DeepComposer – Compose Music with Generative Machine Learning Models

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/aws-deepcomposer-compose-music-with-generative-machine-learning-models/

Today, we’re extremely happy to announce AWS DeepComposer, the world’s first machine learning-enabled musical keyboard. Yes, you read that right.

Machine learning (ML) requires quite a bit of math, computer science, code, and infrastructure. These topics are exceedingly important but to a lot of aspiring ML developers, they look overwhelming and sometimes, dare I say it, boring.

To help everyone learn about practical ML and have fun doing it, we introduced several ML-powered devices. At AWS re:Invent 2017, we introduced AWS DeepLens, the world’s first deep learning-enabled camera, to help developers learn about ML for computer vision. Last year, we launched AWS DeepRacer, a fully autonomous 1/18th scale race car driven by reinforcement learning. This year, we’re raising the bar (pardon the pun).

Introducing AWS DeepComposer
AWS DeepComposer is a 32-key, 2-octave keyboard designed for developers to get hands-on with generative AI, using either pretrained models or your own.

You can request to be notified by email when the device becomes available, or you can use a virtual keyboard in the AWS console.

Here’s the high-level view:

  1. Log into the DeepComposer console,
  2. Record a short musical tune, or use a prerecorded one,
  3. Select a generative model for your favorite genre, either pretrained or your own,
  4. Use this model to generate a new polyphonic composition,
  5. Play the composition in the console,
  6. Export the composition or share it on SoundCloud.

Let me show you how to quickly generate your first composition with a pretrained model. Then, I’ll discuss training your own model, and I’ll close the post with a primer on the underlying technology powering DeepComposer: Generative Adversarial Networks (GAN).

Using a Pretrained Model
Opening the console, I go to the Music Studio, where I can either select a prerecorded tune, or record one myself.

I go with the former, selecting Beethoven’s “Ode to Joy”.

I also select the pretrained model I want to use: classical, jazz, rock, or pop. These models have been trained on large music data sets for their respective genres, and I can use them directly. In the absence of ‘metal’ (watch out for that feature request, team), I pick ‘rock’ and generate the composition.

A few seconds later, I see the additional accompaniments generated by the model. I assign them different instruments: drums, overdriven guitar, electric guitar (clean), and electric bass (finger).

Here’s the result. What do you think?

Finally, I can export the composition to a MIDI or MP3 file, and share it on my SoundCloud account. Fame awaits!

Training Your Own Model
I can also train my own model on a data set for my favorite genre. I need to select:

  • Architecture parameters for the Generator and the Discriminator (more on what these are in the next section),
  • The loss function used during training to measure the difference between the output of the algorithm and the expected value,
  • Hyperparameters,
  • A validation sample that I’ll be able to listen to while the model is trained.

During training, I can see quality metrics, and I can listen to the validation sample selected above. Once the model has been fully trained, I can use it to generate compositions, just like for pretrained models.

A Primer on Generative Adversarial Networks
GANs saw the light of day in 2014, with the publication of “Generative Adversarial Networks” by Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville and Yoshua Bengio.

In the authors’ words:

In the proposed adversarial nets framework, the generative model is pitted against an adversary: a discriminative model that learns to determine whether a sample is from the model distribution or the data distribution. The generative model can be thought of as analogous to a team of counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous to the police, trying to detect the counterfeit currency. Competition in this game drives both teams to improve their methods until the counterfeits are indistinguishable from the genuine articles.

Let me expand on this a bit:

  • The Generator has no access to the data set. Using random data, it creates samples that are forwarded through the Discriminator model.
  • The Discriminator is a binary classification model, learning how to recognize genuine data samples (included in the training set) from fake samples (made up by the Generator). The training process uses traditional techniques like gradient descent, back propagation, etc.
  • As the Discriminator learns, its weights are updated.
  • The Discriminator’s gradients are also backpropagated to the Generator. This is the key to understanding GANs: guided by this feedback, the Generator progressively learns how to generate samples that are closer and closer to the ones that the Discriminator considers as genuine (see the sketch below).

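If code speaks to you more clearly than counterfeiting metaphors, here’s a deliberately minimal PyTorch sketch of a single GAN training step. The architectures and dimensions are illustrative (flattened 28x28 images), not what DeepComposer actually uses.

import torch
import torch.nn as nn

latent_dim, data_dim, batch = 100, 784, 64  # illustrative sizes

generator = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real = torch.randn(batch, data_dim)  # stand-in for a batch of genuine samples
ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

# 1 - Train the Discriminator to tell genuine samples from the Generator's fakes.
fake = generator(torch.randn(batch, latent_dim))
d_loss = loss_fn(discriminator(real), ones) + loss_fn(discriminator(fake.detach()), zeros)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 2 - Train the Generator: gradients flow back through the Discriminator, nudging
#     the Generator toward samples the Discriminator labels as genuine.
fake = generator(torch.randn(batch, latent_dim))
g_loss = loss_fn(discriminator(fake), ones)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()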
To sum things up, you have to train as a counterfeiting expert in order to become a great counterfeiter… but don’t take this as career advice! If you’re curious to learn more, you may like this post from my own blog, explaining how to generate MNIST samples with an Apache MXNet GAN.

If you just want to play music and have fun like this little fellow, that’s fine too!

Coming Soon!
AWS DeepComposer absolutely rocks. You can sign up for the preview today, and get notified when the keyboard is released.

– Julien

New for Amazon Aurora – Use Machine Learning Directly From Your Databases

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-amazon-aurora-use-machine-learning-directly-from-your-databases/

Machine Learning allows you to get better insights from your data. But where is most of the structured data stored? In databases! Today, in order to use machine learning with data in a relational database, you need to develop a custom application to read the data from the database and then apply the machine learning model. Developing this application requires a mix of skills to be able to interact with the database and use machine learning. This is a new application, and now you have to manage its performance, availability, and security.

Can we make it easier to apply machine learning to data in a relational database? Even for existing applications?

Starting today, Amazon Aurora is natively integrated with two AWS machine learning services:

  • Amazon SageMaker, a service providing you with the ability to build, train, and deploy custom machine learning models quickly.
  • Amazon Comprehend, a natural language processing (NLP) service that uses machine learning to find insights in text.

Using this new functionality, you can use a SQL function in your queries to apply a machine learning model to the data in your relational database. For example, you can detect the sentiment of a user comment using Comprehend, or apply a custom machine learning model built with SageMaker to estimate the risk of “churn” for your customers. Churn is a word mixing “change” and “turn” and is used to describe customers that stop using your services.

You can store the output of a large query including the additional information from machine learning services in a new table, or use this feature interactively in your application by just changing the SQL code run by the clients, with no machine learning experience required.

Let’s see a couple of examples of what you can do from an Aurora database, first by using Comprehend, then SageMaker.

Configuring Database Permissions
The first step is to give the database permissions to access the services you want to use: Comprehend, SageMaker, or both. In the RDS console, I create a new Aurora MySQL 5.7 database. When it is available, in the Connectivity & security tab of the regional endpoint, I look for the Manage IAM roles section.

There I connect Comprehend and SageMaker to this database cluster. For SageMaker, I need to provide the Amazon Resource Name (ARN) of the endpoint of a deployed machine learning model. If you want to use multiple endpoints, you need to repeat this step. The console takes care of creating the service roles for the Aurora database to access those services in order for the new machine learning integration to work.

Using Comprehend from Amazon Aurora
I connect to the database using a MySQL client. To run my tests, I create a table storing comments for a blogging platform and insert a few sample records:

CREATE TABLE IF NOT EXISTS comments (
       comment_id INT AUTO_INCREMENT PRIMARY KEY,
       comment_text VARCHAR(255) NOT NULL
);

INSERT INTO comments (comment_text)
VALUES ("This is very useful, thank you for writing it!");
INSERT INTO comments (comment_text)
VALUES ("Awesome, I was waiting for this feature.");
INSERT INTO comments (comment_text)
VALUES ("An interesting write up, please add more details.");
INSERT INTO comments (comment_text)
VALUES ("I don’t like how this was implemented.");

To detect the sentiment of the comments in my table, I can use the aws_comprehend_detect_sentiment and aws_comprehend_detect_sentiment_confidence SQL functions:

SELECT comment_text,
       aws_comprehend_detect_sentiment(comment_text, 'en') AS sentiment,
       aws_comprehend_detect_sentiment_confidence(comment_text, 'en') AS confidence
  FROM comments;

The aws_comprehend_detect_sentiment function returns the most probable sentiment for the input text: POSITIVE, NEGATIVE, or NEUTRAL. The aws_comprehend_detect_sentiment_confidence function returns the confidence of the sentiment detection, between 0 (not confident at all) and 1 (fully confident).

Using SageMaker Endpoints from Amazon Aurora
Similarly to what I did with Comprehend, I can access a SageMaker endpoint to enrich the information stored in my database. To see a practical use case, let’s implement the customer churn example mentioned at the beginning of this post.

Mobile phone operators have historical records on which customers ultimately ended up churning and which continued using the service. We can use this historical information to construct a machine learning model. As input for the model, we’re looking at the current subscription plan, how much the customer is speaking on the phone at different times of day, and how often they have called customer service.

Here’s the structure of my customer table:

SHOW COLUMNS FROM customers;

To be able to identify customers at risk of churn, I train a model following this sample SageMaker notebook using the XGBoost algorithm. When the model has been created, it’s deployed to a hosted endpoint.

When the SageMaker endpoint is in service, I go back to the Manage IAM roles section of the console to give the Aurora database permissions to access the endpoint ARN.

Now, I create a new will_churn SQL function that passes the endpoint the parameters required by the model:

CREATE FUNCTION will_churn (
       state varchar(2048), acc_length bigint(20),
       area_code bigint(20), int_plan varchar(2048),
       vmail_plan varchar(2048), vmail_msg bigint(20),
       day_mins double, day_calls bigint(20),
       eve_mins double, eve_calls bigint(20),
       night_mins double, night_calls bigint(20),
       int_mins double, int_calls bigint(20),
       cust_service_calls bigint(20))
RETURNS varchar(2048) CHARSET latin1
       alias aws_sagemaker_invoke_endpoint
       endpoint name 'estimate_customer_churn_endpoint_version_123';

As you can see, the model looks at the customer’s phone subscription details and service usage patterns to identify the risk of churn. Using the will_churn SQL function, I run a query over my customers table to flag customers based on my machine learning model. To store the result of the query, I create a new customers_churn table:

CREATE TABLE customers_churn AS
SELECT *, will_churn(state, acc_length, area_code, int_plan,
       vmail_plan, vmail_msg, day_mins, day_calls,
       eve_mins, eve_calls, night_mins, night_calls,
       int_mins, int_calls, cust_service_calls) will_churn
  FROM customers;

Let’s see a few records from the customers_churn table:

SELECT * FROM customers_churn LIMIT 7;

Luckily, the first 7 customers are apparently not going to churn. But what happens overall? Since I stored the results of the will_churn function, I can run a SELECT GROUP BY statement on the customers_churn table.

SELECT will_churn, COUNT(*) FROM customers_churn GROUP BY will_churn;

Starting from there, I can dive deep to understand what brings my customers to churn.

If I create a new version of my machine learning model, with a new endpoint ARN, I can recreate the will_churn function without changing my SQL statements.

Available Now
The new machine learning integration is available today for Aurora MySQL 5.7, with the SageMaker integration generally available and the Comprehend integration in preview. You can learn more in the documentation. We are working on other engines and versions: Aurora MySQL 5.6 and Aurora PostgreSQL 10 and 11 are coming soon.

The Aurora machine learning integration is available in all regions in which the underlying services are available. For example, if both Aurora MySQL 5.7 and SageMaker are available in a region, then you can use the integration for SageMaker. For a complete list of services availability, please see the AWS Regional Table.

There’s no additional cost for using the integration; you just pay for the underlying services at your normal rates. Pay attention to the size of your queries when using Comprehend. For example, if you do sentiment analysis on user feedback in your customer service web page, to contact those who made particularly positive or negative comments, and people are making 10,000 comments a day, you’d pay $3/day. To optimize your costs, remember to store results.

It’s never been easier to apply machine learning models to data stored in your relational databases. Let me know what you are going to build with this!

Danilo

22 New Languages And Variants, 6 New Regions For Amazon Translate

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/22-new-languages-and-variants-6-new-regions-for-amazon-translate/

Just a few weeks ago, I told you about 7 new languages supported by Amazon Translate, our fully managed service for machine translation. Well, here I am again, announcing no less than 22 new languages and variants, as well as 6 additional AWS Regions where Translate is now available.

Introducing 22 New Languages And Variants
That’s what I call an update! In addition to existing languages, Translate now supports: Afrikaans, Albanian, Amharic, Azerbaijani, Bengali, Bosnian, Bulgarian, Croatian, Dari, Estonian, Canadian French, Georgian, Hausa, Latvian, Pashto, Serbian, Slovak, Slovenian, Somali, Swahili, Tagalog, and Tamil. Congratulations if you can name all countries and regions of origin: I couldn’t!

With these, Translate now supports a total of 54 languages and variants, and 2804 language pairs. The full list is available in the documentation.

Whether you are expanding your retail operations globally like Regatta, analyzing employee surveys like Siemens, or enabling multilingual chat in customer engagement like Verint, the new language pairs will help you further streamline and automate your translation workflows, by delivering fast, high-quality, and affordable language translation.

Introducing 6 New AWS Regions
In addition to existing regions, you can now use Translate in US West (N. California), Europe (London), Europe (Paris), Europe (Stockholm), Asia Pacific (Hong Kong) and Asia Pacific (Sydney). This brings to 17 the number of regions where Translate is available.

This expansion is great news for many customers who will now be able to translate data in the region where it’s stored, without having to invoke the service in another region. Again, this will make workflows simpler, faster, and even more cost-effective.

Using Amazon Translate
In the last post, I showed you how to use Translate with the AWS SDK for C++. In the continued spirit of language diversity, let’s use the SDK for Ruby this time. Just run gem install aws-sdk to install it.

The simple program below opens a text file, then reads and translates one line at a time. As you can see, translating only requires one simple API call. Of course, it’s the same with other programming languages: call an API and get the job done!

require 'aws-sdk'

if ARGV.length != 2
  puts "Usage: translate.rb <filename> <target language code>"
  exit
end

translate = Aws::Translate::Client.new(region: 'eu-west-1')

File.open(ARGV[0], "r") do |f|
  f.each_line do |line|
    resp = translate.translate_text({
      text: line,
      source_language_code: "auto",
      target_language_code: ARGV[1],
    })
    puts(resp.translated_text)
  end
end

Here’s an extract from “Notes on Structured Programming”, a famous computer science paper published by E.W. Dijkstra in 1972.

In my life I have seen many programming courses that were essentially like the usual kind of driving lessons, in which one is taught how to handle a car instead of how to use a car to reach one’s destination. My point is that a program is never a goal in itself; the purpose of a program is to evoke computations and the purpose of the computations is to establish a desired effect. Although the program is the final product made by the programmer, the possible computations evoked by it – the “making” of which is left to the machine! – are the true subject matter of his trade. For instance, whenever a programmer states that his program is correct, he really makes an assertion about the computations it may evoke.

Let’s translate it to a few languages: how about Albanian, Hausa, Pashto and Tagalog?

$ ruby translate.rb dijkstra.txt sq
Në jetën time kam parë shumë kurse programimi që ishin në thelb si lloji i zakonshëm i mësimeve të vozitjes, në të cilën mësohet se si të merret me një makinë në vend se si të përdorësh një makinë për të arritur destinacionin e dikujt. Pika ime është se një program nuk është kurrë një qëllim në vetvete; qëllimi i një programi është të ndjell llogaritjet dhe qëllimi i llogaritjeve është të krijojë një efekt të dëshiruar. Megjithëse programi është produkti përfundimtar i bërë nga programuesi, llogaritjet e mundshme të evokuara nga ai - “bërja” e të cilit i është lënë makinë! - janë çështja e vërtetë subjekt i tregtisë së tij. Për shembull, sa herë që një programues thotë se programi i tij është i saktë, ai me të vërtetë bën një pohim në lidhje me llogaritjet që mund të ndjell.

$ ruby translate.rb dijkstra.txt ha
A rayuwata na ga kwasa-kwasai da dama da suka kasance da gaske kamar irin darussan tuki da aka saba da su, inda ake koya wa mutum yadda zai rike mota maimakon yadda zai yi amfani da mota don kaiwa mutum makoma. Dalilina shi ne, shirin ba shi da wata manufa a kanta; manufar shirin shi ne tayar da komfuta kuma manufar ƙididdigar ita ce kafa tasirin da ake so. Ko da yake shirin shine samfurin karshe da mai shiryawa ya yi, ƙididdigar da za a iya amfani da ita - “yin” wanda aka bar shi zuwa na'ura! - su ne batun gaskiya game da cinikinsa. Alal misali, duk lokacin da mai shiryawa ya ce shirinsa daidai ne, yana yin tabbaci game da ƙididdigar da zai iya fitarwa.

$ ruby translate.rb dijkstra.txt ps
زما په ژوند کې ما د پروګرام کولو ډیری کورسونه لیدلي دي چې په اصل کې د معمول ډول ډول چلولو درسونو په څیر وو، په کوم کې چې دا درس ورکول کیږي چې څنګه د موټر سره معامله وکړي ترڅو د چا منزل ته ورسیږي. زما ټکی دا دی چې یو پروګرام هېڅکله هم په ځان کې هدف نه دی؛ د یوه پروګرام هدف دا دی چې محاسبه راوباسي، او د محاسبې هدف دا دی چې یو مطلوب اثر رامنځته کړي. که څه هم دا پروګرام وروستی محصول دی چې د پروګرام لخوا جوړ شوی، هغه ممکنه حسابونه چې د هغه لخوا رامینځته شوي - د چا «جوړولو» ماشین ته پریښودل کیږي! - اصلي مسله د هغه د سوداګرۍ موضوع ده. د مثال په توګه، کله چې یو پروګرام کوونکی وايي چې د هغه پروګرام سم دی، هغه په حقیقت کې د هغه محاسبې په اړه یو ادعا کوي چې هغه یې کولی شي.

$ ruby translate.rb dijkstra.txt tl
Sa aking buhay nakita ko ang maraming mga kurso sa programming na karaniwang tulad ng karaniwang uri ng mga aralin sa pagmamaneho, kung saan itinuturo kung paano haharapin ang isang kotse upang makapunta sa patutunguhan ng isang tao. Ang aking punto ay ang isang programa ay hindi kailanman naglalayong mismo; ang layunin ng isang programa ay upang gumuhit ng mga kalkulasyon, at ang layunin ng accounting ay upang lumikha ng nais na epekto. Kahit na ang programa ay ang huling produkto na nilikha ng programa, ang mga posibleng kalkulasyon na nilikha niya - na nag-iiwan ng isang tao na “bumuo” ng makina! - Ang pangunahing isyu ay ang paksa ng kanyang negosyo. Halimbawa, kapag sinabi ng isang programmer na tama ang kanyang programa, talagang gumagawa siya ng claim tungkol sa pagkalkula na maaari niyang gawin.

Available Now!
The new languages and the new regions are available today. If you’ve never tried Amazon Translate, did you know that the free tier offers 2 million characters per month for the first 12 months, starting from your first translation request?

Also, which languages should Translate support next? We’re looking forward to your feedback: please post it to the AWS Forum for Amazon Translate, or send it to your usual AWS support contacts.

Julien

Now available: Batch Recommendations in Amazon Personalize

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/now-available-batch-recommendations-in-amazon-personalize/

Today, we’re very happy to announce that Amazon Personalize now supports batch recommendations.

Launched at AWS re:Invent 2018, Personalize is a fully-managed service that allows you to create private, customized personalization recommendations for your applications, with little to no machine learning experience required.

With Personalize, you provide the unique signals in your activity data (page views, sign-ups, purchases, and so forth) along with optional customer demographic information (age, location, etc.). You then provide the inventory of the items you want to recommend, such as articles, products, videos, or music: as explained in previous blog posts, you can use both historical data stored in Amazon Simple Storage Service (S3) and streaming data sent in real-time from a JavaScript tracker or server-side.

Then, entirely under the covers, Personalize will process and examine the data, identify what is meaningful, select the right algorithms, train and optimize a personalization model that is customized for your data, and is accessible via an API that can be easily invoked by your business application.

However, some customers have told us that batch recommendations would be a better fit for their use cases. For example, some of them need the ability to compute recommendations for very large numbers of users or items in one go, store them, and feed them over time to batch-oriented workflows such as sending email or notifications: although you could certainly do this with a real-time recommendation endpoint, batch processing is simply more convenient and more cost-effective.

Let’s do a quick demo.

Introducing Batch Recommendations
For the sake of brevity, I’ll reuse the movie recommendation solution trained in this post on the MovieLens data set. Here, instead of deploying a real-time campaign based on this solution, we’re going to create a batch recommendation job.

First, let’s define users for whom we’d like to recommend movies. I simply list their user ids in a JSON file that I store in an S3 bucket.

{"userId": "123"}
{"userId": "456"}
{"userId": "789"}
{"userId": "321"}
{"userId": "654"}
{"userId": "987"}

Then, I apply a bucket policy to that bucket, so that Personalize may read and write objects in it. I’m using the AWS console here, and you can do the same thing programmatically with the PutBucketPolicy API.

Now let’s head out to the Personalize console, and create a batch inference job.

As you would expect, I need to give the job a name, and select an AWS Identity and Access Management (IAM) role for Personalize in order to allow access to my S3 bucket. The bucket policy was taken care of already.

Then, I select the solution that I want to use to recommend movies.

Finally, I define the location of input and output data, with optional AWS Key Management Service (KMS) keys for decryption and encryption.

After a little while, the job is complete, and I can fetch recommendations from my bucket.

$ aws s3 cp s3://jsimon-personalize-euwest-1/batch/output/batch/users.json.out -
{"input":{"userId":"123"}, "output": {"recommendedItems": ["137", "285", "14", "283", "124", "13", "508", "276", "275", "475", "515", "237", "246", "117", "19", "9", "25", "93", "181", "100", "10", "7", "273", "1", "150"]}}
{"input":{"userId":"456"}, "output": {"recommendedItems": ["272", "333", "286", "271", "268", "313", "340", "751", "332", "750", "347", "316", "300", "294", "690", "331", "307", "288", "304", "302", "245", "326", "315", "346", "305"]}}
{"input":{"userId":"789"}, "output": {"recommendedItems": ["275", "14", "13", "93", "1", "117", "7", "246", "508", "9", "248", "276", "137", "151", "150", "111", "124", "237", "744", "475", "24", "283", "20", "273", "25"]}}
{"input":{"userId":"321"}, "output": {"recommendedItems": ["86", "197", "180", "603", "170", "427", "191", "462", "494", "175", "61", "198", "238", "45", "507", "203", "357", "661", "30", "428", "132", "135", "479", "657", "530"]}}
{"input":{"userId":"654"}, "output": {"recommendedItems": ["272", "270", "268", "340", "210", "313", "216", "302", "182", "318", "168", "174", "751", "234", "750", "183", "271", "79", "603", "204", "12", "98", "333", "202", "902"]}}
{"input":{"userId":"987"}, "output": {"recommendedItems": ["286", "302", "313", "294", "300", "268", "269", "288", "315", "333", "272", "242", "258", "347", "690", "310", "100", "340", "50", "292", "327", "332", "751", "319", "181"]}}

In a real-life scenario, I would then feed these recommendations to downstream applications for further processing. Of course, instead of using the console, I would create and manage jobs programmatically with the CreateBatchInferenceJob, DescribeBatchInferenceJob, and ListBatchInferenceJobs APIs.
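For reference, here’s a sketch of the job creation call with boto3; the ARNs and S3 paths below are placeholders for your own resources.

import boto3

personalize = boto3.client('personalize')
response = personalize.create_batch_inference_job(
    jobName='movie-batch-recommendations',
    solutionVersionArn='arn:aws:personalize:eu-west-1:123412341234:solution/movies/<VERSION>',  # placeholder
    roleArn='arn:aws:iam::123412341234:role/PersonalizeS3Role',  # placeholder
    jobInput={'s3DataSource': {'path': 's3://my-bucket/batch/users.json'}},
    jobOutput={'s3DataDestination': {'path': 's3://my-bucket/batch/output/'}})
print(response['batchInferenceJobArn'])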

Now Available!
Using batch recommendations with Amazon Personalize is an easy and cost-effective way to add personalization to your applications. You can start using this feature today in all regions where Personalize is available.

Please send us feedback, either on the AWS forum for Amazon Personalize, or through your usual AWS support contacts.

Julien

FogHorn: Edge-to-Edge Communication and Deep Learning

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/foghorn-edge-to-edge-communication-and-deep-learning/

FogHorn is an intelligent Internet of Things (IoT) edge solution that delivers data processing and real-time inference where data is created. Referring to itself as “the only ‘real’ edge intelligence solution in the market today,” FogHorn is powered by a hyper-efficient Complex Event Processor (CEP) and delivers comprehensive data enrichment and real-time analytics on high volumes, varieties, and velocities of streaming sensor data, and is optimized for constrained compute footprints and limited connectivity.

Andrea Sabet, AWS Solutions Architect, speaks with Ramya Ravichandar, Vice President of Products at FogHorn, about how FogHorn integrates with IoT MQTT for edge-to-edge communication, as well as with Amazon SageMaker for deep learning model deployment. The edgefication process involves running inference with real-time streaming data against a trained deep learning model. Drifts in the model accuracy trigger a callback to SageMaker for retraining.

Check out more of the This Is My Architecture video series.

 

USPTO Questions if Artificial Intelligence Can Create or Infringe Copyrighted Works

Post Syndicated from Ernesto original https://torrentfreak.com/uspto-questions-if-artificial-intelligence-can-create-or-infringe-copyrighted-works-191107/

Artificial Intelligence (AI) is a buzzword that’s frequently used by startups and established businesses in the tech industry.

In some cases, it refers to little more than advanced algorithms, but complex self-learning computer systems with human-like traits are actively being developed as well.

As these AI technologies become increasingly advanced, they raise more ethical and legal questions. This was recognized by the US Patent and Trademark Office (USPTO) recently, which launched a public consultation on the matter.

“Artificial Intelligence technologies are increasingly becoming important across a diverse spectrum of technologies and businesses. AI poses unique challenges in the sphere of intellectual property law,” USPTO writes.

The USPTO is part of the US Department of Commerce and deals with various intellectual property rights issues. It previously raised questions on how AI technology impacts patent law and is now expanding this to copyright matters.

The consultation starts off by asking whether anything created by an AI, without human involvement, can be copyrighted. This can refer to any type of content, including music, images, and texts.

“Should a work produced by an AI algorithm or process, without the involvement of a natural person contributing expression to the resulting work, qualify as a work of authorship protectable under U.S. copyright law? Why or why not?” the Office asks.

The technology and code that makes any AI work obviously relies on human interaction, but USPTO’s question is destined to raise a lively debate. Since it’s expected that more and more creations will rely heavily on AI in the future, the US Government requests guidance on these issues.

AI composed music?

In a follow-up question, the Office zooms in further still by asking what kind of human involvement is required to make something copyrightable. Yet another question deals with possible copyright infringements by an AI. Or in other words, can an AI pirate?

This is a relevant question since these technologies can rely on input from other copyrighted works. A simple example would be where an AI ‘decides’ to use hundreds of music tracks to create a new one.

If that’s the case, should this simply be allowed under fair use, or should the original authors have the right to be compensated?

“To the extent an AI algorithm or process learns its function(s) by ingesting large volumes of copyrighted material, does the existing statutory language (e.g., the fair use doctrine) and related case law adequately address the legality of making such use? Should authors be recognized for this type of use of their works? If so, how?” USPTO questions.

The Office notes that further guidance is needed on these and other topics so it’s asking the public for input. USPTO says that it’s not predisposed to any particular views and also welcomes additional AI feedback, beyond the questions it asked.

The full set of questions is available in the Federal Register notice, which includes additional background information. For those who want to chime in, the comment period closes December 16.

Update: IPKAT published an article today showing that similar issues are being discussed in Europe as well.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Now available in Amazon SageMaker: EC2 P3dn GPU Instances

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/now-available-in-amazon-sagemaker-ec2-p3dn-gpu-instances/

In recent years, the meteoric rise of deep learning has made incredible applications possible, such as detecting skin cancer (SkinVision) and building autonomous vehicles (TuSimple). Thanks to neural networks, deep learning indeed has the uncanny ability to extract and model intricate patterns from vast amounts of unstructured data (e.g. images, video, and free-form text).

However, training these neural networks requires equally vast amounts of computing power. Graphics Processing Units (GPUs) have long proven that they were up to that task, and AWS customers have quickly understood how they could use Amazon Elastic Compute Cloud (EC2) P2 and P3 instances to train their models, in particular on Amazon SageMaker, our fully-managed, modular, machine learning service.

Today, I’m very happy to announce that the largest P3 instance, named p3dn.24xlarge, is now available for model training on Amazon SageMaker. Launched last year, this instance is designed to accelerate large, complex, distributed training jobs: it has twice as much GPU memory as other P3 instances, 50% more vCPUs, blazing-fast local NVMe storage, and 100 Gbit networking.

How about we give it a try on Amazon SageMaker?

Introducing EC2 P3dn instances on Amazon SageMaker
Let’s start from this notebook, which uses the built-in image classification algorithm to train a model on the Caltech-256 dataset. All I have to do to use a p3dn.24xlarge instance on Amazon SageMaker is to set train_instance_type to 'ml.p3dn.24xlarge', and train!

ic = sagemaker.estimator.Estimator(training_image,
                                         role, 
                                         train_instance_count=1, 
                                         train_instance_type='ml.p3dn.24xlarge',
                                         input_mode='File',
                                         output_path=s3_output_location,
                                         sagemaker_session=sess)
...
ic.fit(...)

I ran some quick tests on this notebook, and I got a sweet 20% training speedup out of the box (your mileage may vary!). I’m using 'File' mode here, meaning that the full dataset is copied to the training instance: the faster network (100 Gbit, up from 25 Gbit) and storage (local NVMe instead of Amazon EBS) are certainly helping!

When working with large data sets, you could put 100 Gbit networking to good use either by streaming data from Amazon Simple Storage Service (S3) with Pipe Mode, or by storing it in Amazon Elastic File System or Amazon FSx for Lustre. It would also help with distributed training (using Horovod, maybe), as instances would be able to exchange parameter updates faster.
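For example, switching the estimator above from File mode to Pipe mode is a one-line change. This is just a sketch: your data set would also need to be stored in a Pipe-mode-compatible format such as RecordIO.

ic = sagemaker.estimator.Estimator(training_image,
                                   role,
                                   train_instance_count=1,
                                   train_instance_type='ml.p3dn.24xlarge',
                                   input_mode='Pipe',  # stream data from S3 instead of copying it first
                                   output_path=s3_output_location,
                                   sagemaker_session=sess)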

In short, the Amazon SageMaker and P3dn tag team packs quite a punch, and it should deliver a significant performance improvement for large-scale deep learning workloads.

Now available!
P3dn instances are available on Amazon SageMaker in the US East (N. Virginia) and US West (Oregon) regions. If you are ready to get started, please contact your AWS account team or use the Contact Us page to make a request.

As always, we’d love to hear your feedback, either on the AWS Forum for Amazon SageMaker, or through your usual AWS contacts.

New languages for Amazon Translate: Greek, Hungarian, Romanian, Thai, Ukrainian, Urdu and Vietnamese

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/new-languages-for-amazon-translate-greek-hungarian-romanian-thai-ukrainian-urdu-and-vietnamese/

Technical Evangelists travel quite a lot, and the number one question that we get from customers when presenting Amazon Translate is: “Is my native language supported?“. Well, I’m happy to announce that starting today, we’ll be able to answer “yes” if your language is Greek, Hungarian, Romanian, Thai, Ukrainian, Urdu and Vietnamese. In fact, using Amazon Translate, we could even say “ναί”, “igen”, “da”, “ใช่”, “так”, “جی ہاں” and “có”… hopefully with a decent accent!

With these additions, Amazon Translate now supports 32 languages: Arabic, Chinese (Simplified), Chinese (Traditional), Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Thai, Turkish, Ukrainian, Urdu and Vietnamese.

Between these languages, the service supports 987 translation combinations: you can see the full list of supported language pairs on this documentation page.

Using Amazon Translate
Amazon Translate is extremely simple to use. Let’s quickly test it in the AWS console on one of my favourite poems:

Developers will certainly prefer to invoke the TranslateText API. Here’s an example with the AWS CLI.

$ aws translate translate-text --source-language-code auto --target-language-code hu --text "Les sanglots longs des violons de l’automne blessent mon coeur d’une langueur monotone"
{
    "TranslatedText": "Az őszi hegedű hosszú zokogása monoton bágyadtsággal fáj a szívem",
    "SourceLanguageCode": "fr",
    "TargetLanguageCode": "hu"
}

Of course, this API is also available in any of the AWS SDKs. In the continued spirit of language diversity, how about an example in C++? Here’s a short program translating a text file stored on disk.

#include <aws/core/Aws.h>
#include <aws/core/utils/Outcome.h>
#include <aws/translate/TranslateClient.h>
#include <aws/translate/model/TranslateTextRequest.h>
#include <aws/translate/model/TranslateTextResult.h>

#include <fstream>
#include <iostream>
#include <string>

#define MAX_LINE_LENGTH 5000

int main(int argc, char **argv) {
  if (argc != 4) {
    std::cout << "Usage: translate_text_file 'target language code' 'input file' 'output file'"
         << std::endl;
    return -1;
  }

  const Aws::String target_language = argv[1];
  const std::string input_file = argv[2];
  const std::string output_file = argv[3];

  std::ifstream fin(input_file.c_str(), std::ios::in);
  if (!fin.good()) {
    std::cerr << "Input file is invalid." << std::endl;
    return -1;
  }

  std::ofstream fout(output_file.c_str(), std::ios::out);
  if (!fout.good()) {
    std::cerr << "Output file is invalid." << std::endl;
    return -1;
  }

  Aws::SDKOptions options;
  Aws::InitAPI(options);
  {
    Aws::Translate::TranslateClient translate_client;
    Aws::Translate::Model::TranslateTextRequest request;
    request = request.WithSourceLanguageCode("auto").WithTargetLanguageCode(target_language);

    Aws::String line;
    while (getline(fin, line)) {
      if (line.empty()) {
        continue;
      }

      if (line.length() > MAX_LINE_LENGTH) {
        std::cerr << "Line is too long." << std::endl;
        break;
      }

      request.SetText(line);
      auto outcome = translate_client.TranslateText(request);

      if (outcome.IsSuccess()) {
        auto translation = outcome.GetResult().GetTranslatedText();
        fout << translation << std::endl;
      } else {
        std::cout << "TranslateText error: " << outcome.GetError().GetExceptionName()
             << " - " << outcome.GetError().GetMessage() << std::endl;
        break;
      }
    }
  }
  Aws::ShutdownAPI(options);
}

Once the code has been built, let’s translate the full poem to Thai:

$ translate_text_file th verlaine.txt verlaine-th.txt

$ cat verlaine-th.txt

“เสียงสะอื้นยาวของไวโอลินฤดูใบไม้ร่วงทำร้ายหัวใจของฉันด้วยความอ่อนเพลียที่น่าเบื่อ ทั้งหมดหายใจไม่ออกและซีดเมื่อชั่วโมงดังผมจำได้ว่าวันเก่าและร้องไห้ และฉันไปที่ลมเลวร้ายที่พาฉันออกไปจากที่นี่ไกลกว่าเช่นใบไม้ที่ตายแล้ว” - Paul Verlaine บทกวีของดาวเสาร์

As you can see, it’s extremely simple to integrate Amazon Translate into your own applications. A single API call is really all that it takes!
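If Python is more your thing, here’s a sketch of the same call with the AWS SDK for Python, reusing the Verlaine verse from the CLI example above:

import boto3

translate = boto3.client('translate')
result = translate.translate_text(
    Text='Les sanglots longs des violons de l’automne blessent mon coeur d’une langueur monotone',
    SourceLanguageCode='auto',
    TargetLanguageCode='th')
print(result['TranslatedText'])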

Available Now!
These new languages are available today in all regions where Amazon Translate is available. The free tier offers 2 million characters per month for the first 12 months, starting from your first translation request.

We’re looking forward to your feedback! Please post it to the AWS Forum for Amazon Translate, or send it to your usual AWS support contacts.

Julien

New Issue of Architecture Monthly: Games

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/new-issue-of-architecture-monthly-games/

This month’s Architecture Monthly magazine is all about games—not Scrabble, not Uno, not Twister, and certainly not hide-and-seek.

No, we’re talking the big business of online, multiplayer games. And did you know that approximately 90% of large, public game companies are running on the AWS cloud? Yep, I’m talking Epic (ever heard of Fortnite?), Ubisoft, Nintendo, and more. I had the opportunity to sit down with a senior tech leader for AWS Games, who talked about why companies are moving to the cloud from on-premises, and it’s about a whole lot more than just games for entertainment. We got into the big-money world of competitive eSports as well as the gamification of learning processes and economics.

Consider Twitch, often defined as Amazon’s live streaming platform for gamers. But Twitch is much more than a gaming platform. For example, AWS Live Video on Twitch offers live streaming about everything from how to develop serverless apps and robots to interactive quiz shows that help you prepare for AWS Certification exams. And of course, you can also learn about the technology that powers your favorite video games.

September’s Issue

For September’s issue, we’ve assembled architectural best practices about games from all over AWS, and we’ve made sure that a broad audience can appreciate it.

How to Access the Magazine

We hope you’re enjoying Architecture Monthly, and we’d like to hear from you—leave us a star rating and comment on the Amazon Kindle page or contact us anytime at [email protected].

Introducing Batch Mode Processing for Amazon Comprehend Medical

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/introducing-batch-mode-processing-for-amazon-comprehend-medical/

Launched at AWS re:Invent 2018, Amazon Comprehend Medical is a HIPAA-eligible natural language processing service that makes it easy to use machine learning to extract relevant medical information from unstructured text.

For example, customers like Roche Diagnostics and The Fred Hutchinson Cancer Research Center can quickly and accurately extract information, such as medical condition, medication, dosage, strength, and frequency from a variety of sources like doctors’ notes, clinical trial reports, and patient health records. They can also identify protected health information (PHI) present in these documents in order to anonymize it before data exchange.

In a previous blog post, I showed you how to use the Amazon Comprehend Medical API to extract entities and detect PHI in a single document. Today we’re happy to announce that this API can now process batches of documents stored in an Amazon Simple Storage Service (S3) bucket. Let’s do a demo!

Introducing the Batch Mode API
First, we need to grab some data to test batch mode: MT Samples is a great collection of real-life anonymized medical transcripts that are free to use and distribute. I picked a few transcripts, and converted them to the simple JSON format that Amazon Comprehend Medical expects: in a production workflow, converting documents to this format could easily be done by your application code, or by one of our analytics services such as AWS Glue.

{"Text": " VITAL SIGNS: The patient was afebrile. He is slightly tachycardic, 105,
but stable blood pressure and respiratory rate.GENERAL: The patient is in no distress.
Sitting quietly on the gurney. HEENT: Unremarkable. His oral mucosa is moist and well
hydrated. Lips and tongue look normal. Posterior pharynx is clear. NECK: Supple. His
trachea is midline.There is no stridor. LUNGS: Very clear with good breath sounds in
all fields. There is no wheezing. Good air movement in all lung fields.
CARDIAC: Without murmur. Slight tachycardia. ABDOMEN: Soft, nontender.
SKIN: Notable for a confluence erythematous, blanching rash on the torso as well
as more of a blotchy papular, macular rash on the upper arms. He noted some on his
buttocks as well. Remaining of the exam is unremarkable."}

Then, I simply upload the samples to an Amazon S3 bucket located in the same region as the service… and yes, ‘esophagogastroduodenoscopy’ is a word.
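If you’re curious what that preparation step could look like in code, here is a minimal sketch with boto3; the bucket name and local file names are hypothetical placeholders:

import json
import boto3

s3 = boto3.client("s3")
bucket = "my-medical-bucket"  # hypothetical placeholder

# Wrap each transcript in the {"Text": ...} JSON structure shown above,
# and store it under the 'input/' prefix.
for name in ("rash.txt", "consult.txt"):  # hypothetical local files
    with open(name, encoding="utf-8") as f:
        payload = json.dumps({"Text": f.read()})
    s3.put_object(
        Bucket=bucket,
        Key="input/" + name.replace(".txt", ".json"),
        Body=payload,
    )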

Now let’s head to the AWS console and create an entity detection job. The rest of the process would be identical for PHI.


Samples are stored under the ‘input/’ prefix, and I’m expecting results under the ‘output/’ prefix. Of course, you could use different buckets if you were so inclined. Optionally, you could also use AWS Key Management Service (KMS) to encrypt output results. For the sake of brevity, I won’t set up KMS here, but you’d certainly want to consider it for production workflows.

I also need to provide a data access role in AWS Identity and Access Management (IAM), allowing Amazon Comprehend Medical to access the relevant S3 bucket(s). You can use a role that you previously set up in IAM, or you can use the wizard in the Amazon Comprehend Medical console. For detailed information on permissions, please refer to the documentation.

Then, I create the batch job, and wait for it to complete. After a few minutes, the job is done.
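The console isn’t the only way to do this: the same job can be started programmatically. Here is a hedged sketch using boto3’s start_entities_detection_v2_job; the bucket, prefixes, job name, and role ARN are placeholders:

import time
import boto3

cm = boto3.client("comprehendmedical")

# Bucket, prefixes, job name, and role ARN are placeholders.
job = cm.start_entities_detection_v2_job(
    InputDataConfig={"S3Bucket": "my-medical-bucket", "S3Key": "input/"},
    OutputDataConfig={"S3Bucket": "my-medical-bucket", "S3Key": "output/"},
    DataAccessRoleArn="arn:aws:iam::123456789012:role/ComprehendMedicalS3Access",
    JobName="entity-detection-demo",
    LanguageCode="en",
)

# Poll until the job finishes.
while True:
    props = cm.describe_entities_detection_v2_job(JobId=job["JobId"])
    status = props["ComprehendMedicalAsyncJobProperties"]["JobStatus"]
    if status in ("COMPLETED", "PARTIAL_SUCCESS", "FAILED", "STOPPED"):
        break
    time.sleep(30)

print("Job finished with status:", status)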

Results are available at the output location: one output for each input, containing a JSON-formatted description of entities and their relationships.

A manifest also includes global information: number of processed documents, total amount of data, etc. Paths are edited out for clarity.

{
"Summary" : {
    "Status" : "COMPLETED",
    "JobType" : "EntitiesDetection",
    "InputDataConfiguration" : {
        "Bucket" : "jsimon-comprehend-medical-uswest2",
        "Path" : "input/"
    },
    "OutputDataConfiguration" : {
        "Bucket" : "jsimon-comprehend-medical-uswest2",
        "Path" : ...
    },
    "InputFileCount" : 4,
    "TotalMeteredCharacters" : 3636,
    "UnprocessedFilesCount" : 0,
    "SuccessfulFilesCount" : 4,
    "TotalDurationSeconds" : 366,
    "SuccessfulFilesListLocation" : ... ,
    "UnprocessedFilesListLocation" : ...
}
}

After retrieving the ‘rash.json.out‘ object from S3, I can use a JSON editor to view its contents. Here are some of the entities that have been detected.
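For illustration, here is a short sketch that loads one of these output files and prints the detected entities. It assumes each per-document output follows the same shape as the DetectEntitiesV2 response, i.e. an ‘Entities’ array carrying text, category, type, and a confidence score:

import json

# 'rash.json.out' is the output object retrieved from S3 above.
with open("rash.json.out", encoding="utf-8") as f:
    doc = json.load(f)

# Print each detected entity with its category, type, and confidence score.
for entity in doc.get("Entities", []):
    print(f'{entity["Text"]}: {entity["Category"]}/{entity["Type"]}'
          f' (score {entity["Score"]:.2f})')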

Of course, this data is not meant to be read by humans. In a production workflow, it would be processed automatically by the Amazon Comprehend Medical APIs. Results would then be stored in an AWS backend, and made available to healthcare professionals through a business application.

Now Available!
As you can see, it’s extremely easy to use Amazon Comprehend Medical in batch mode, even at very large scale. Zero machine learning work and zero infrastructure work required!

The service is available today in the following AWS regions:

  • US East (North Virginia), US East (Ohio), US West (Oregon),
  • Canada (Central),
  • EU (Ireland), EU (London),
  • Asia Pacific (Sydney).

The free tier covers 25,000 units of text (2.5 million characters) for the first three months when you start using the service, either with entity extraction or with PHI detection.

As always, we’d love to hear your feedback: please post it to the AWS forum for Amazon Comprehend, or send it through your usual AWS contacts.

Julien

Learn about AWS Services & Solutions – September AWS Online Tech Talks

Post Syndicated from Jenny Hang original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-september-aws-online-tech-talks/


Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

 

Compute:

September 23, 2019 | 11:00 AM – 12:00 PM PT – Build Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.

September 26, 2019 | 1:00 PM – 2:00 PM PT – Self-Hosted WordPress: It’s Easier Than You Think – Learn how you can easily build a fault-tolerant WordPress site using Amazon Lightsail.

October 3, 2019 | 11:00 AM – 12:00 PM PT – Lower Costs by Right Sizing Your Instance with Amazon EC2 T3 General Purpose Burstable Instances – Get an overview of T3 instances, understand what workloads are ideal for them, and understand how the T3 credit system works so that you can lower your EC2 instance costs today.

 

Containers:

September 26, 2019 | 11:00 AM – 12:00 PM PT – Develop a Web App Using Amazon ECS and AWS Cloud Development Kit (CDK) – Learn how to build your first app using CDK and AWS container services.

 

Data Lakes & Analytics:

September 26, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Provisioning Amazon MSK Clusters and Using Popular Apache Kafka-Compatible Tooling – Learn best practices on running Apache Kafka production workloads at a lower cost on Amazon MSK.

 

Databases:

September 25, 2019 | 1:00 PM – 2:00 PM PT – What’s New in Amazon DocumentDB (with MongoDB compatibility) – Learn what’s new in Amazon DocumentDB, a fully managed MongoDB compatible database service designed from the ground up to be fast, scalable, and highly available.

October 3, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Enterprise-Class Security, High-Availability, and Scalability with Amazon ElastiCache – Learn about new enterprise-friendly Amazon ElastiCache enhancements like customer managed key and online scaling up or down to make your critical workloads more secure, scalable and available.

 

DevOps:

October 1, 2019 | 9:00 AM – 10:00 AM PT – CI/CD for Containers: A Way Forward for Your DevOps Pipeline – Learn how to build CI/CD pipelines using AWS services to get the most out of the agility afforded by containers.

 

Enterprise & Hybrid:

September 24, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: How to Monitor and Manage Your AWS Costs – Learn how to visualize and manage your AWS cost and usage in this virtual hands-on workshop.

October 2, 2019 | 1:00 PM – 2:00 PM PT – Accelerate Cloud Adoption and Reduce Operational Risk with AWS Managed Services – Learn how AMS accelerates your migration to AWS, reduces your operating costs, improves security and compliance, and enables you to focus on your differentiating business priorities.

 

IoT:

September 25, 2019 | 9:00 AM – 10:00 AM PT – Complex Monitoring for Industrial with AWS IoT Data Services – Learn how to solve your complex event monitoring challenges with AWS IoT Data Services.

 

Machine Learning:

September 23, 2019 | 9:00 AM – 10:00 AM PT – Training Machine Learning Models Faster – Learn how to train machine learning models quickly and with a single click using Amazon SageMaker.

September 30, 2019 | 11:00 AM – 12:00 PM PT – Using Containers for Deep Learning Workflows – Learn how containers can help address challenges in deploying deep learning environments.

October 3, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: Getting Hands-On with Machine Learning and Ready to Race in the AWS DeepRacer League – Join DeClercq Wentzel, Senior Product Manager for AWS DeepRacer, for a presentation on the basics of machine learning and how to build a reinforcement learning model that you can use to join the AWS DeepRacer League.

 

AWS Marketplace:

September 30, 2019 | 9:00 AM – 10:00 AM PT – Advancing Software Procurement in a Containerized World – Learn how to deploy applications faster with third-party container products.

 

Migration:

September 24, 2019 | 11:00 AM – 12:00 PM PT – Application Migrations Using AWS Server Migration Service (SMS) – Learn how to use AWS Server Migration Service (SMS) for automating application migration and scheduling continuous replication, from your on-premises data centers or Microsoft Azure to AWS.

 

Networking & Content Delivery:

September 25, 2019 | 11:00 AM – 12:00 PM PT – Building Highly Available and Performant Applications using AWS Global Accelerator – Learn how to build highly available and performant architectures for your applications with AWS Global Accelerator, now with source IP preservation.

September 30, 2019 | 1:00 PM – 2:00 PM PT – AWS Office Hours: Amazon CloudFront – Just getting started with Amazon CloudFront and Lambda@Edge? Get answers directly from our experts during AWS Office Hours.

 

Robotics:

October 1, 2019 | 11:00 AM – 12:00 PM PT – Robots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.

 

Security, Identity & Compliance:

October 1, 2019 | 1:00 PM – 2:00 PM PT – Deep Dive on Running Active Directory on AWS – Learn how to deploy Active Directory on AWS and start migrating your Windows workloads.

 

Serverless:

October 2, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.

 

Storage:

September 24, 2019 | 9:00 AM – 10:00 AM PT – Optimize Your Amazon S3 Data Lake with S3 Storage Classes and Management Tools – Learn how to use the Amazon S3 Storage Classes and management tools to better manage your data lake at scale and to optimize storage costs and resources.

October 2, 2019 | 11:00 AM – 12:00 PM PT – The Great Migration to Cloud Storage: Choosing the Right Storage Solution for Your Workload – Learn more about AWS storage services and identify which service is the right fit for your business.

 

 

Managed Spot Training: Save Up to 90% On Your Amazon SageMaker Training Jobs

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/managed-spot-training-save-up-to-90-on-your-amazon-sagemaker-training-jobs/

Amazon SageMaker is a fully-managed, modular machine learning (ML) service that enables developers and data scientists to easily build, train, and deploy models at any scale. With a choice of using built-in algorithms, bringing your own, or choosing from algorithms available in AWS Marketplace, it’s never been easier and faster to get ML models from experimentation to scale-out production.

One of the key benefits of Amazon SageMaker is that it frees you from infrastructure management, no matter the scale you’re working at. For instance, instead of having to set up and manage complex training clusters, you simply tell Amazon SageMaker which Amazon Elastic Compute Cloud (EC2) instance type to use, and how many you need: the appropriate instances are then created on-demand, configured, and terminated automatically once the training job is complete. As customers have quickly understood, this means that they will never pay for idle training instances, a simple way to keep costs under control.

Introducing Managed Spot Training
Going one step further, we’re extremely happy to announce Managed Spot Training for Amazon SageMaker, a new feature based on Amazon EC2 Spot Instances that will help you lower ML training costs by up to 90% compared to using on-demand instances in Amazon SageMaker. Launched almost 10 years ago, Spot Instances have since been one of the cornerstones of building scalable and cost-optimized IT platforms on AWS. Starting today, not only will your Amazon SageMaker training jobs run on fully-managed infrastructure, they will also benefit from fully-managed cost optimization, letting you achieve much more with the same budget. Let’s dive in!

Managed Spot Training is available in all training configurations supported by Amazon SageMaker.

Setting it up is extremely simple, as it should be when working with a fully-managed service:

  • If you’re using the console, just switch the feature on.
  • If you’re working with the Amazon SageMaker SDK, just set the train_use_spot_instances parameter to True in the Estimator constructor, as shown in the sketch below.

That’s all it takes: do this, and you’ll save up to 90%. Pretty cool, don’t you think?
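For example, here is a minimal sketch with the Amazon SageMaker Python SDK (v1.x parameter names); the training image, role, and bucket are placeholders:

from sagemaker.estimator import Estimator

# All values below are placeholders for illustration.
estimator = Estimator(
    image_name="<training-image-uri>",   # your algorithm container
    role="<execution-role-arn>",         # IAM role for the training job
    train_instance_count=1,
    train_instance_type="ml.p3.2xlarge",
    train_use_spot_instances=True,       # enable Managed Spot Training
    train_max_run=3600,                  # maximum training time, in seconds
    train_max_wait=7200,                 # training time plus time spent waiting for capacity
    output_path="s3://<bucket>/output/",
)

estimator.fit({"train": "s3://<bucket>/train/"})

Note that train_max_wait must be at least as large as train_max_run, since it bounds the total duration of the job, waiting included.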

Interruptions and Checkpointing
There’s an important difference when working with Managed Spot Training. Unlike on-demand training instances that are expected to be available until a training job completes, Managed Spot Training instances may be reclaimed at any time if we need more capacity.

With Amazon Elastic Compute Cloud (EC2) Spot Instances, you would receive a termination notification 2 minutes in advance, and would have to take appropriate action yourself. Don’t worry, though: as Amazon SageMaker is a fully-managed service, it will handle this process automatically, interrupting the training job, obtaining adequate spot capacity again, and either restarting or resuming the training job. This makes Managed Spot Training particularly interesting when you’re flexible on job starting time and job duration. You can also use the MaxWaitTimeInSeconds parameter to control the total duration of your training job (actual training time plus waiting time).

To avoid restarting a training job from scratch should it be interrupted, we strongly recommend that you implement checkpointing, a technique that saves the model in training at periodic intervals. Thanks to this, you can resume a training job from a well-defined point in time, continuing from the most recent partially trained model:

  • Built-in frameworks and custom models: you have full control over the training code. Just make sure that you use the appropriate APIs to save model checkpoints to Amazon Simple Storage Service (S3) regularly, using the location you defined in the CheckpointConfig parameter passed to the SageMaker Estimator (see the sketch after this list). Please note that TensorFlow uses checkpoints by default. For other frameworks, you’ll find examples in our sample notebooks and in the documentation.
  • Built-in algorithms: computer vision algorithms support checkpointing (Object Detection, Semantic Segmentation, and very soon Image Classification). As they tend to train on large data sets and run for longer than other algorithms, they have a higher likelihood of being interrupted. Other built-in algorithms do not support checkpointing for now.
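To make this concrete, here is a hedged sketch of the training-script side of checkpointing, using TensorFlow/Keras as an example. Amazon SageMaker syncs the local checkpoint directory with the S3 location you configure (checkpoint_s3_uri in the Python SDK, which maps to CheckpointConfig), so saving regularly to that directory is all the script needs to do; file names and the training call are illustrative:

import os
import tensorflow as tf

# Amazon SageMaker syncs this local directory with the S3 location you
# configure; /opt/ml/checkpoints is the default local path.
checkpoint_dir = "/opt/ml/checkpoints"
os.makedirs(checkpoint_dir, exist_ok=True)

# Save a checkpoint at the end of every epoch, so that a resumed job can
# reload the most recent one instead of starting from scratch.
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=os.path.join(checkpoint_dir, "model-{epoch:02d}.h5"),
)

# model.fit(x_train, y_train, epochs=20, callbacks=[checkpoint_callback])
# (hypothetical training call; the model and data are defined elsewhere)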

Alright, enough talk, time for a quick demo!

Training a Built-in Object Detection Model with Managed Spot Training
Starting from this sample notebook, let’s use the AWS console to train the same job with Managed Spot Training instead of on-demand training. As explained before, I only need to take care of two things:

  • Enable Managed Spot Training (obviously).
  • Set MaxWaitTimeInSeconds.

First, let’s name our training job, and make sure it has appropriate AWS Identity and Access Management (IAM) permissions (no change).

Then, I select the built-in algorithm for object detection.

Then, I select the instance count and instance type for my training job, making sure I have enough storage for the checkpoints.

The next step is to set the hyperparameters, and I’ll use the same ones as in the notebook. I then define the location and properties of the training data set.

I do the same for the validation data set.

I also define where model checkpoints should be saved. This is where Amazon SageMaker will pick them up to resume my training job should it be interrupted.

This is where the final model artifact should be saved.

Good things come to those who wait! This is where I enable Managed Spot Training, configuring a very relaxed 48 hours of maximum wait time.

I’m done, let’s train this model. Once training is complete, cost savings are clearly visible in the console.

As you can see, my training job ran for 2423 seconds, but I’m only billed for 837 seconds, saving 65% thanks to Managed Spot Training! While we’re on the topic, let me explain how pricing works.

Pricing
A Managed Spot Training job is priced for the duration for which it actually ran before completing or being terminated.
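As a quick back-of-the-envelope check using the demo numbers above:

# From the demo: the job ran for 2423 seconds but was billed for 837.
total_seconds = 2423
billable_seconds = 837
savings = 1 - billable_seconds / total_seconds
print(f"Savings: {savings:.0%}")  # prints "Savings: 65%"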

For built-in algorithms and AWS Marketplace algorithms that don’t use checkpointing, we’re enforcing a maximum training time of 60 minutes (MaxWaitTimeInSeconds parameter).

Last but not least, no matter how many times the training job restarts or resumes, you only get charged for data download time once.

Now Available!
This new feature is available in all regions where Amazon SageMaker is available, so don’t wait and start saving now!

As always, we’d love to hear your feedback: please post it to the AWS forum for Amazon SageMaker, or send it through your usual AWS contacts.

Julien