
AWS X-Ray Now Supports Amazon API Gateway and New Sampling Rules API

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/apigateway-xray/

My colleague Jeff first introduced us to AWS X-Ray almost 2 years ago in his post from AWS re:Invent. If you’re not already aware, AWS X-Ray helps developers analyze and debug everything from simple web apps to large and complex distributed microservices, both in production and in development. Since X-Ray became generally available in 2017, we’ve iterated rapidly on customer feedback and continued to make enhancements to the service like encryption with AWS Key Management Service (KMS), new SDKs and language support (Python!), open sourcing the daemon, and latency visualization tools. Today, we’re adding two new features:

    • Support for Amazon API Gateway, making it easier to trace and analyze requests as they travel through your APIs to the underlying services.
    • Support for controlling sampling rules in the AWS X-Ray console and API, which we launched recently.

Let me show you how to enable tracing for an API.

Enabling X-Ray Tracing

I’ll start with a simple API deployed to API Gateway, with two endpoints: one to push records into Amazon Kinesis Data Streams and one to invoke a simple AWS Lambda function. It looks something like this:

After deploying my API, I can go to the Stages sub-console and select a specific stage, like “dev” or “production”. From there, I can enable X-Ray tracing by navigating to the Logs/Tracing tab, selecting Enable X-Ray Tracing, and clicking Save Changes.
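
The same setting can be flipped programmatically, since tracing is a property of the stage. Here’s a minimal sketch with boto3, assuming a hypothetical REST API ID:

import boto3

apigw = boto3.client("apigateway")

# Enable active X-Ray tracing on the "dev" stage of a hypothetical API.
apigw.update_stage(
    restApiId="a1b2c3d4e5",  # hypothetical REST API ID
    stageName="dev",
    patchOperations=[
        {"op": "replace", "path": "/tracingEnabled", "value": "true"}
    ],
)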

After tracing is enabled, I can hop over to the X-Ray console to look at my sampling rules in the new Sampling interface.

I can modify the rules in the console and, of course, with the CLI, SDKs, or API. Let’s take a brief interlude to talk about sampling rules.

Sampling Rules

The sampling rules allow me to customize, at a very granular level, the requests and traces I want to record. This allows me to control the amount of data that I record on-the-fly, across code running anywhere (AWS Lambda, Amazon ECS, Amazon Elastic Compute Cloud (EC2), or even on-premises) – all without having to rewrite any code or redeploy an application.

The default rule pictured above states that it will record the first request each second, and five percent of any additional requests. We refer to that one request each second as the reservoir, which ensures that at least one trace is recorded each second. The five percent of additional requests is what we refer to as the fixed rate. Both the reservoir and the fixed rate are configurable. If I set the reservoir size to 50 and the fixed rate to 10%, then if 100 requests per second match the rule, the total number of requests sampled is 55 requests per second (the 50-request reservoir plus 10% of the remaining 50).

Configuring my X-Ray recorders to read sampling rules from the X-Ray service allows the X-Ray service to maintain the sampling rate and reservoir across all of my distributed compute. To enable this functionality, I just install the latest version of the X-Ray SDK and daemon on my instances. At the moment only the generally available SDKs are supported, with support for Ruby and Go on the way. With services like API Gateway and Lambda, I can configure everything right in the X-Ray console or API. There’s a lot more detail on this feature in the documentation, and I suggest taking the time to check it out.
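If I want to manage rules from code, here’s a hedged sketch with boto3 — the rule name, priority, and URL path below are hypothetical:

import boto3

xray = boto3.client("xray")

# A rule that keeps at least 5 traces/second for a checkout API,
# plus 10% of any additional matching requests.
xray.create_sampling_rule(
    SamplingRule={
        "RuleName": "checkout-api",  # hypothetical rule name
        "Priority": 100,             # lower numbers are evaluated first
        "ReservoirSize": 5,
        "FixedRate": 0.10,
        "ServiceName": "*",
        "ServiceType": "*",
        "Host": "*",
        "HTTPMethod": "*",
        "URLPath": "/checkout/*",    # hypothetical path to match
        "ResourceARN": "*",
        "Version": 1,
    }
)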

While I can, of course, use the sampling rules to control costs, the dynamic nature and the granularity of the rules is also extremely powerful for debugging production systems. If I know one particular URL or service is going to need extra monitoring I can specify that as part of the sampling rule. I can filter on individual stages of APIs, service types, service names, hosts, ARNs, HTTP methods, segment attributes, and more. This lets me quickly examine distributed microservices at 30,000 feet, identify issues, adjust some rules, and then dive deep into production requests. I can use this to develop insights about problems occurring in the 99th percentile of my traffic and deliver a better overall customer experience. I remember building and deploying a lot of ad-hoc instrumentation over the years, at various companies, to try to support something like this, and I don’t think I was ever particularly successful. Now that I can just deploy X-Ray and adjust sampling rules centrally, it feels like I have a debugging crystal ball. I really wish I’d had this tool 5 years ago.

Ok, enough reminiscing, let’s hop back to the walkthrough.

I’ll stick with the default sampling rule for now. Since tracing is enabled and I’ve got some requests running, after about 30 seconds I can refresh my service map and look at the results. I can click on any node to view the traces directly or drop into the Traces sub-console to look at all of the traces.

From there, I can see the individual URLs being triggered, the source IPs, and various other useful metrics.

If I want to dive deeper, I can write some filtering rules in the search bar and find a particular trace. An API Gateway segment has a few useful annotations that I can use to filter and group traces, such as the API ID and stage. This is what a typical API Gateway trace might look like.

Adding API Gateway support to X-Ray gives us end-to-end production traceability in serverless environments, and sampling rules give us the ability to adjust our tracing in real time without redeploying any code. I had the pleasure of speaking with Ashley Sole from Skyscanner about how they use AWS X-Ray at the AWS Summit in London last year, and these were both features he asked me about earlier that day. I hope this release makes it easier for Ashley and other developers to debug and analyze their production applications.

Available Now

Support for both of these features is available, today, in all public regions that have both API Gateway and X-Ray. In fact, X-Ray launched their new console and API last week so you may have already seen it! You can start using it right now. As always, let us know what you think on Twitter or in the comments below.

Randall

Extending AWS CloudFormation with AWS Lambda Powered Macros

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/cloudformation-macros/

Today I’m really excited to show you a powerful new feature of AWS CloudFormation called Macros. CloudFormation Macros allow developers to extend the native syntax of CloudFormation templates by calling out to AWS Lambda-powered transformations. This is the same technology that powers the popular Serverless Application Model (SAM) functionality, but the transforms run in your own account, on your own Lambda functions, and they’re completely customizable. If you’re new to AWS, CloudFormation is an absolutely essential tool for modeling and defining your infrastructure as code (YAML or JSON). It is a core building block for all of AWS, and many of our services depend on it.

There are two major steps to using macros. First, we define a macro, which, of course, we do with a CloudFormation template. Second, to use the created macro in another template, we add it as a transform for the entire template or call it directly. Throughout this post, I use the terms macro and transform somewhat interchangeably. Ready to see how this works?

Creating a CloudFormation Macro

Creating a macro has two components: a definition and an implementation. To create the definition of a macro, we create a CloudFormation resource of type AWS::CloudFormation::Macro that outlines which Lambda function to use and what the macro should be called.

Type: "AWS::CloudFormation::Macro"
Properties:
  Description: String
  FunctionName: String
  LogGroupName: String
  LogRoleARN: String
  Name: String

The Name of the macro must be unique within the region, and the Lambda function referenced by FunctionName must be in the same region the macro is being created in. When you execute the macro’s template, it makes that macro available for other templates to use. The implementation of the macro is fulfilled by a Lambda function. Macros can live in their own templates or be grouped with others, but you won’t be able to use a macro in the same template you’re registering it in. The Lambda function receives a JSON payload that looks something like this:

{
    "region": "us-east-1",
    "accountId": "$ACCOUNT_ID",
    "fragment": { ... },
    "transformId": "$TRANSFORM_ID",
    "params": { ... },
    "requestId": "$REQUEST_ID",
    "templateParameterValues": { ... }
}

The fragment portion of the payload contains either the entire template or the relevant fragments of the template – depending on how the transform is invoked from the calling template. The fragment will always be in JSON, even if the template is in YAML.

The Lambda function is expected to return a simple JSON response:

{
    "requestId": "$REQUEST_ID",
    "status": "success",
    "fragment": { ... }
}

The requestId needs to be the same as the one received in the input payload, and if status contains any value other than success (case-insensitive) then the changeset will fail to create. The fragment must contain valid CloudFormation JSON for the transformed template. Even if your function performs no action, it still needs to return the fragment for it to be included in the final template.

Using CloudFormation Macros

To use a macro, we simply call out to Fn::Transform with the required parameters. If we want a macro to process the whole template, we can include it in the template’s list of transforms the same way we would with SAM: Transform: [Echo]. When we execute this template, CloudFormation collects the transforms into a changeset by calling out to each macro’s specified function and assembling the final template.

Let’s imagine we have a dummy Lambda function called EchoFunction that just logs the data passed into it and returns the fragment unchanged. We define the macro as a normal CloudFormation resource, like this:

EchoMacro:
  Type: "AWS::CloudFormation::Macro"
  Properties:
    FunctionName: arn:aws:lambda:us-east-1:1234567:function:EchoFunction
    Name: EchoMacro

The code for the Lambda function could be as simple as this:

def lambda_handler(event, context):
    print(event)
    return {
        "requestId": event['requestId'],
        "status": "success",
        "fragment": event["fragment"]
    }

Then, after deploying this function and executing the macro template, we can invoke the macro in a transform at the top level of any other template like this:

AWSTemplateFormatVersion: 2010-09-09
Transform: [EchoMacro, AWS::Serverless-2016-10-31]
Resources:
  FancyTable:
    Type: AWS::Serverless::SimpleTable

The CloudFormation service creates a changeset for the template by first calling the Echo macro we defined and then the AWS::Serverless transform. Macros in the transform list are executed in the order they appear.

We could also invoke the macro using the Fn::Transform intrinsic function which allows us to pass in additional parameters. For example:

AWSTemplateFormatVersion: 2010-09-09
Resources:
  MyS3Bucket:
    Type: 'AWS::S3::Bucket'
    Fn::Transform:
      Name: EchoMacro
      Parameters:
        Key: Value

The inline transform will have access to all of its sibling nodes and all of its children nodes. Transforms are processed from deepest to shallowest which means top-level transforms are executed last. Since I know most of you are going to ask: no you cannot include macros within macros – but nice try.

When you execute the CloudFormation template, it will ask you to create a changeset, and you can preview the output before deploying.
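
One note for scripted deployments: templates containing macros must go through a changeset and require the CAPABILITY_AUTO_EXPAND capability, which grants CloudFormation permission to run the transforms. A minimal sketch with boto3, using a hypothetical stack name and template file:

import boto3

cfn = boto3.client("cloudformation")

with open("template.yaml") as f:  # hypothetical template file
    template_body = f.read()

cfn.create_change_set(
    StackName="echo-demo",        # hypothetical stack name
    ChangeSetName="echo-demo-1",
    ChangeSetType="CREATE",
    TemplateBody=template_body,
    Capabilities=["CAPABILITY_AUTO_EXPAND"],  # required for macros
)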

Example Macros

We’re launching a number of reference macros to help developers get started and I expect many people will publish others. These four are the winners from a little internal hackathon we had prior to releasing this feature:

  • PyPlate – Allows you to inline Python in your templates (Jay McConnel, Partner SA)
  • ShortHand – Defines a short-hand syntax for common CloudFormation resources (Steve Engledow, Solutions Builder)
  • StackMetrics – Adds CloudWatch metrics to stacks (Steve Engledow and Jason Gregson, Global SA)
  • String Functions – Adds common string functions to your templates (Jay McConnel, Partner SA)

Here are a few ideas I thought of that might be fun for someone to implement:

If you end up building something cool I’m more than happy to tweet it out!

Available Now

CloudFormation Macros are available today in all AWS regions that have AWS Lambda. There is no additional CloudFormation charge for macros, meaning you are billed only the normal AWS Lambda charges for the transform functions. The documentation has more information that may be helpful.

This is one of my favorite new features for CloudFormation and I’m excited to see some of the amazing things our customers will build with it. The real power here is that you can extend your existing infrastructure as code with code. The possibilities enabled by this new functionality are virtually unlimited.

Randall

Amazon Kinesis Data Streams Adds Enhanced Fan-Out and HTTP/2 for Faster Streaming

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/

A few weeks ago, we launched two significant performance improving features for Amazon Kinesis Data Streams (KDS): enhanced fan-out and an HTTP/2 data retrieval API. Enhanced fan-out allows developers to scale up the number of stream consumers (applications reading data from a stream in real-time) by offering each stream consumer its own read throughput. Meanwhile, the HTTP/2 data retrieval API allows data to be delivered from producers to consumers in 70 milliseconds or better (a 65% improvement) in typical scenarios. These new features enable developers to build faster, more reactive, highly parallel, and latency-sensitive applications on top of Kinesis Data Streams.

Kinesis actually refers to a family of streaming services: Kinesis Video Streams, Kinesis Data Firehose, Kinesis Data Analytics, and the topic of today’s blog post, Kinesis Data Streams (KDS). Kinesis Data Streams allows developers to easily and continuously collect, process, and analyze streaming data in real-time with a fully-managed and massively scalable service. KDS can capture gigabytes of data per second from hundreds of thousands of sources – everything from website clickstreams and social media feeds to financial transactions and location-tracking events.

Kinesis Data Streams are scaled using the concept of a shard. One shard provides an ingest capacity of 1MB/second or 1000 records/second and an output capacity of 2MB/second. It’s not uncommon for customers to have thousands or tens of thousands of shards supporting tens of GB/sec of ingest and egress.

Before the enhanced fan-out capability, that 2MB/second of output per shard was shared between all of the applications consuming data from the stream. With enhanced fan-out, developers can register stream consumers that each receive their own 2MB/second pipe of read throughput per shard, and this throughput automatically scales with the number of shards in a stream. Prior to this launch, customers would frequently fan their data out to multiple streams to support the desired read throughput for their downstream applications. That sounds like undifferentiated heavy lifting to us, and that’s something we decided our customers shouldn’t need to worry about.

Customers pay for enhanced fan-out based on the amount of data retrieved from the stream using enhanced fan-out and the number of consumers registered per shard. You can find additional info on the pricing page.

Before we jump into a description of the new API, let’s cover a few quick notes about HTTP/2 and how we use that with the new SubscribeToShard API.

HTTP/2

HTTP/2 is a major revision to the HTTP network protocol that introduces a new method for framing and transporting data between clients and servers. It’s a binary protocol, and it enables many new features focused on decreasing latency and increasing throughput. The first gain is the use of HPACK to compress headers. Another useful feature is connection multiplexing, which allows us to use a single TCP connection for multiple parallel non-blocking requests. Additionally, instead of the traditional request-response semantics of HTTP, the communication pipe is bidirectional: a server using HTTP/2 can push multiple responses to a client without waiting for the client to request those resources.

Kinesis’s SubscribeToShard API takes advantage of this server push feature to deliver new records to consumers, and it makes use of another HTTP/2 feature called flow control. Kinesis pushes data to the consumer and keeps track of the number of bytes that remain unacknowledged. The client acknowledges bytes received by sending WINDOW_UPDATE frames to the server. If the client can’t handle the rate of data, then Kinesis will pause the flow of data until a new WINDOW_UPDATE frame is received or until the 5-minute subscription expires.

Now that we have a grasp on SubscribeToShard and HTTP/2 let’s cover how we use this to take advantage of enhanced fan-out!

Using Enhanced Fan-out

The easiest way to make use of enhanced fan-out is to use the updated Kinesis Client Library 2.0 (KCL). KCL will automatically register itself as a consumer of the stream. Then KCL will enumerate the shards and subscribe to them using the new SubscribeToShard API. It will also continuously call SubscribeToShard whenever the underlying connections are terminated. Under the hood, KCL handles checkpointing and state management of a distributed app with an Amazon DynamoDB table it creates in your AWS account. You can see an example of this in the documentation.

The general process for using enhanced fan-out is:

  1. Call RegisterStreamConsumer and provide the StreamARN and ConsumerName (commonly the application name). Save the ConsumerARN returned by this API call. As soon as the consumer is registered, enhanced fan-out is enabled and billing for consumer-shard-hours begins.
  2. Enumerate stream shards and call SubscribeToShard on each of them with the ConsumerARN returned by RegisterStreamConsumer. This establishes an HTTP/2 connection, and KDS will push SubscribeToShardEvents to the listening client. These connections are terminated by KDS every 5 minutes, so the client will need to call SubscribeToShard again to continue receiving events. Bytes pushed to the client using enhanced fan-out are billed under enhanced fan-out data retrieval rates.
  3. Finally, remember to call DeregisterStreamConsumer when you’re no longer using the consumer since it does have an associated cost.

You can see some example code walking through this process in the documentation.
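
Steps 1 and 3 are plain request/response calls, so they’re easy to script. Here’s a minimal boto3 sketch, assuming a hypothetical stream ARN and application name; step 2, the HTTP/2 SubscribeToShard subscription itself, is best left to KCL 2.0:

import boto3

kinesis = boto3.client("kinesis")

stream_arn = "arn:aws:kinesis:us-east-1:123456789012:stream/my-stream"  # hypothetical

# Step 1: register the consumer; enhanced fan-out billing starts here.
consumer = kinesis.register_stream_consumer(
    StreamARN=stream_arn,
    ConsumerName="my-analytics-app",  # hypothetical application name
)["Consumer"]
print(consumer["ConsumerARN"])  # pass this ARN to SubscribeToShard

# Step 3: deregister when finished to stop consumer-shard-hour charges.
kinesis.deregister_stream_consumer(
    StreamARN=stream_arn,
    ConsumerName="my-analytics-app",
)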

You can view Amazon CloudWatch metrics and manage consumer applications in the console, including deregistering them.

Available Now

Enhanced fan-out and the new HTTP/2 SubscribeToShard API are both available now in all regions, for new and existing streams. There’s a lot more information in the documentation than I’ve covered in this blog post. There is a per-stream limit of 5 consumer applications (e.g., 5 different KCL applications) reading from all shards, but this can be increased with a support ticket. I’m excited to see customers take advantage of these new features to reduce the complexity of managing multiple stream consumers and to increase the speed and parallelism of their real-time applications.

As always feel free to leave comments below or on Twitter.

Randall

Aurora Serverless MySQL Generally Available

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aurora-serverless-ga/

You may have heard of Amazon Aurora, a custom-built, MySQL- and PostgreSQL-compatible database born and built in the cloud. You may have also heard of serverless, which allows you to build and run applications and services without thinking about instances. These are two pieces of the growing AWS technology story that we’re really excited to be working on. Last year, at AWS re:Invent, we announced a preview of a new capability for Aurora called Aurora Serverless. Today, I’m pleased to announce that Aurora Serverless for Aurora MySQL is generally available. Aurora Serverless is on-demand, auto-scaling, serverless Aurora. You don’t have to think about instances or scaling, and you pay only for what you use.

This paradigm is great for applications with unpredictable load or infrequent demand. I’m excited to show you how this all works, so let me show you how to launch a serverless cluster.

Creating an Aurora Serverless Cluster

First, I’ll navigate to the Amazon Relational Database Service (RDS) console and select the Clusters sub-console. From there, I’ll click the Create database button in the top right corner to get to this screen.

From the screen above, I select my engine type and click Next. For now, only Aurora MySQL 5.6 is supported.

Now comes the fun part. I specify my capacity type as Serverless and all of the instance selection and configuration options go away. I only have to give my cluster a name and a master username/password combo and click next.

From here I can select a number of options. I can specify the minimum and maximum number of Aurora Capacity Units (ACUs) to be consumed. These are billed per second, with a 5-minute minimum, and my cluster will autoscale between the specified minimum and maximum number of ACUs. The rules and metrics for autoscaling are created automatically by Aurora Serverless and include CPU utilization and the number of connections. When Aurora Serverless detects that my cluster needs additional capacity, it grabs capacity from a warm pool of resources to meet the need. This new capacity can start serving traffic in seconds because of the separation of the compute layer and storage layer intrinsic to the design of Aurora.
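
The same configuration is available from code. Here’s a hedged sketch with boto3 — the identifier and credentials are placeholders, and in practice the password should come from somewhere safer than source code:

import boto3

rds = boto3.client("rds")

rds.create_db_cluster(
    DBClusterIdentifier="my-serverless-cluster",  # hypothetical name
    Engine="aurora",                              # Aurora MySQL 5.6-compatible
    EngineMode="serverless",
    MasterUsername="admin",
    MasterUserPassword="REPLACE_ME",              # use a secret store instead
    ScalingConfiguration={
        "MinCapacity": 2,             # ACUs
        "MaxCapacity": 16,            # ACUs
        "AutoPause": True,
        "SecondsUntilAutoPause": 300, # pause after 5 idle minutes
    },
)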

The cluster can even automatically scale down to zero if my cluster isn’t seeing any activity. This is perfect for development databases that might go long periods of time with little or no use. When the cluster is paused, I’m only charged for the underlying storage. If I want to manually scale up or down, preempting a large spike in traffic, I can easily do that with a single API call.
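
A sketch of that single call with boto3, again using a hypothetical cluster identifier:

import boto3

rds = boto3.client("rds")

# Scale to 16 ACUs ahead of an anticipated traffic spike. If Aurora can't
# find a safe scaling point within the timeout, force the change anyway.
rds.modify_current_db_cluster_capacity(
    DBClusterIdentifier="my-serverless-cluster",
    Capacity=16,
    SecondsBeforeTimeout=300,
    TimeoutAction="ForceApplyCapacityChange",
)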

Finally, I click Create database in the bottom right and wait for my cluster to become available – which happens quite quickly. For now we only support a limited number of cluster parameters with plans to enable more customized options as we iterate on customer feedback.

Now, the console provides a wealth of data, similar to any other RDS database.

From here, I can connect to my cluster like any other MySQL database. I could run a tool like sysbench or mysqlslap to generate some load and trigger a scaling event or I could just wait for the service to scale down and pause.

If I scroll down or select the Events sub-console, I can see a few different autoscaling events happening, including pausing the instance at one point.

The best part about this? When I’m done writing the blog post I don’t need to remember to shut this server down! When I’m ready to use it again I just make a connection request and my cluster starts responding in seconds.

How Aurora Serverless Works

I want to dive a bit deeper into what exactly is happening behind the scenes to enable this functionality. When you provision an Aurora Serverless database the service does a few things:

  • It creates an Aurora storage volume replicated across multiple AZs.
  • It creates an endpoint in your VPC for the application to connect to.
  • It configures a network load balancer (invisible to the customer) behind that endpoint.
  • It configures multi-tenant request routers to route database traffic to the underlying instances.
  • It provisions the initial minimum instance capacity.


When the cluster needs to autoscale up or down or resume after a pause, Aurora grabs capacity from a pool of already available nodes and adds them to the request routers. This process takes almost no time and since the storage is shared between nodes Aurora can scale up or down in seconds for most workloads. The service currently has autoscaling cooldown periods of 1.5 minutes for scaling up and 5 minutes for scaling down. Scaling operations are transparent to the connected clients and applications since existing connections and session state are transferred to the new nodes. The only difference with pausing and resuming is a higher latency for the first connection, typically around 25 seconds.

Available Now

Aurora Serverless for Aurora MySQL is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland). If you’re interested in learning more about the Aurora engine, there’s a great design paper available. If you’re interested in diving a bit deeper into exactly how Aurora Serverless works, look forward to more detail in future posts!

I personally believe this is one of the really exciting points in the evolution of the database story and I can’t wait to see what customers build with it!

Randall

Amazon SageMaker Adds Batch Transform Feature and Pipe Input Mode for TensorFlow Containers

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/sagemaker-nysummit2018/

At the New York Summit a few days ago we launched two new Amazon SageMaker features: a new batch inference feature called Batch Transform that allows customers to make predictions in non-real-time scenarios across petabytes of data, and Pipe Input Mode support for TensorFlow containers. SageMaker remains one of my favorite services, and we’ve covered it extensively on this blog and the machine learning blog. In fact, the rapid pace of innovation from the SageMaker team is a bit hard to keep up with. Since our last post on SageMaker’s Automatic Model Tuning with Hyperparameter Optimization, the team has launched 4 new built-in algorithms and tons of new features. Let’s take a look at the new Batch Transform feature.

Batch Transform

The Batch Transform feature is a high-performance and high-throughput method for transforming data and generating inferences. It’s ideal for scenarios where you’re dealing with large batches of data, don’t need sub-second latency, or need to both preprocess and transform the training data. The best part? You don’t have to write a single additional line of code to make use of this feature. You can take all of your existing models and start batch transform jobs based on them. This feature is available at no additional charge and you pay only for the underlying resources.

Let’s take a look at how we would do this for the built-in Object Detection algorithm. I followed the example notebook to train my object detection model. Now I’ll go to the SageMaker console and open the Batch Transform sub-console.

From there I can start a new batch transform job.

Here I can name my transform job, select which of my models I want to use, and choose the number and type of instances to use. Additionally, I can configure the specifics around how many records to send to each inference call concurrently and the size of the payload. If I don’t manually specify these, SageMaker will select some sensible defaults.

Next I need to specify my input location. I can either use a manifest file or just load all the files in an S3 location. Since I’m dealing with images here I’ve manually specified my input content-type.

Finally, I’ll configure my output location and start the job!

Once the job is running, I can open the job detail page and follow the links to the metrics and the logs in Amazon CloudWatch.

I can see the job is running and if I look at my results in S3 I can see the predicted labels for each image.

The transform generated one output JSON file per input file containing the detected objects.

From here it would be easy to create a table for the bucket in AWS Glue and either query the results with Amazon Athena or visualize them with Amazon QuickSight.

Of course, it’s also possible to start these jobs programmatically from the SageMaker API.
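
As a hedged sketch, the boto3 equivalent of the console walkthrough above looks roughly like this; the model name, bucket, and instance type are hypothetical:

import boto3

sm = boto3.client("sagemaker")

sm.create_transform_job(
    TransformJobName="object-detection-batch-1",
    ModelName="my-object-detection-model",  # hypothetical model name
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/images/",
            }
        },
        "ContentType": "image/jpeg",  # manually specified input content-type
    },
    TransformOutput={"S3OutputPath": "s3://my-bucket/predictions/"},
    TransformResources={
        "InstanceType": "ml.p2.xlarge",
        "InstanceCount": 1,
    },
)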

You can find a lot more detail on how to use batch transforms in your own containers in the documentation.

Pipe Input Mode for TensorFlow

Pipe input mode allows customers to stream their training dataset directly from Amazon Simple Storage Service (S3) into Amazon SageMaker using a highly optimized multi-threaded background process. This mode offers significantly better read throughput than the File input mode that must first download the data to the local Amazon Elastic Block Store (EBS) volume. This means your training jobs start sooner, finish faster, and use less disk space, lowering the costs associated with training your models. It has the added benefit of letting you train on datasets beyond the 16 TB EBS volume size limit.

Earlier this year, we ran some experiments with Pipe Input Mode and found that startup times were reduced up to 87% on a 78 GB dataset, with throughput twice as fast in some benchmarks, ultimately resulting in up to a 35% reduction in total training time.

By adding support for Pipe Input Mode to TensorFlow we’re making it easier for customers to take advantage of the same increased speed available to the built-in algorithms. Let’s look at how this works in practice.

First, I need to make sure I have the sagemaker-tensorflow-extensions package available for my training job. This gives us the new PipeModeDataset class, which takes a channel and a record format as inputs and returns a TensorFlow dataset. We can use this in our input_fn for the TensorFlow estimator and read from the channel. The code sample below shows a simple example.

import tensorflow as tf

from sagemaker_tensorflow import PipeModeDataset

def input_fn(channel):
    # Simple example data - a labeled vector.
    features = {
        'data': tf.FixedLenFeature([], tf.string),
        'labels': tf.FixedLenFeature([], tf.int64),
    }
    
    # A function to parse record bytes to a labeled vector record
    def parse(record):
        parsed = tf.parse_single_example(record, features)
        return ({
            'data': tf.decode_raw(parsed['data'], tf.float64)
        }, parsed['labels'])

    # Construct a PipeModeDataset reading from a 'training' channel, using
    # the TF Record encoding.
    ds = PipeModeDataset(channel=channel, record_format='TFRecord')

    # The PipeModeDataset is a TensorFlow Dataset and provides standard Dataset methods
    ds = ds.repeat(20)
    ds = ds.prefetch(10)
    ds = ds.map(parse, num_parallel_calls=10)
    ds = ds.batch(64)
    
    return ds

Then you can define your model the same way you would for a normal TensorFlow estimator. When it comes time to create the estimator, you just need to pass in input_mode='Pipe' as one of the parameters.
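
For example, a sketch of the estimator creation might look like this, assuming a hypothetical training script, an execution role already in scope, and a hypothetical S3 bucket:

from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train.py",               # hypothetical training script
    role=role,                            # execution role defined earlier
    train_instance_count=1,
    train_instance_type="ml.p3.2xlarge",
    framework_version="1.8.0",
    input_mode="Pipe",                    # stream from S3 instead of downloading
)
estimator.fit("s3://my-bucket/training-data/")  # hypothetical bucket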

Available Now

Both of these new features are available now at no additional charge, and I’m looking forward to seeing what customers can build with the batch transform feature. I can already tell you that it will help us with some of our internal ML workloads here in AWS Marketing.

As always, let us know what you think of this feature in the comments or on Twitter!

Randall

Amazon Translate Adds Support for Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-translate-adds-support-for-new-languages/

Today I’m excited to announce that Amazon Translate has added support for Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech! Amazon Translate is a translation API that delivers fast, high-quality, and affordable language translation. Translate was originally released in preview at AWS re:Invent in 2017 and my colleague, Tara, wrote about the service in depth.

Since the initial preview, we’ve continued to iterate on customer feedback by adding features like automatic source language inference via Amazon Comprehend, Amazon CloudWatch metrics, and support for larger pieces of text in each TranslateText call. In April, we made the service generally available and have continued to collect feature requests and feedback from customers.

Working with Amazon Translate

You can use the API explorer in the Amazon Translate Console to try out the new languages immediately.

You can also use any of the SDKs. I wrote a quick Python sample below.

import boto3
translate = boto3.client("translate")
lang_flag_pairs = [("ja", "🇯🇵"), ("ru", "🇷🇺"),
                   ("it", "🇮🇹"), ("zh-TW", "🇹🇼"),
                   ("tr", "🇹🇷"), ("cs", "🇨🇿")]
for lang, flag in lang_flag_pairs:  # fixed: name must match the list above
    print(flag)
    print(translate.translate_text(
        Text="Hello, World!",
        SourceLanguageCode="en",
        TargetLanguageCode=lang
    )['TranslatedText'])

Which gave me a cool result:

🇯🇵
ハローワールド!
🇷🇺
Привет, Мир!
🇮🇹
Ciao, Mondo!
🇹🇼
你好,世界杯!
🇹🇷
Merhaba, Dünya!
🇨🇿
Ahoj, světe!

You can check out the examples page in the documentation for more interesting ways to use Amazon Translate.

I had the chance to chat with a few of our customers about how they’re using Amazon Translate, and I found out that it’s enabling a lot of innovative use cases: customers are building real-time chat translation in games, capturing and translating tweets for analytics, and, as you can see on this blog, translating blog posts and creating spoken versions of them with Polly.

Additional Info

Amazon Translate is available in US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and AWS GovCloud (US). It has a useful free tier of 2 million characters per month for the first 12 months; after that, translation costs $15 per million characters.

The team wanted me to let you know that they’re hard at work on some additional functionality and that they’re hoping, later this year, to add support for Danish, Dutch, Finnish, Hebrew, Polish, and Swedish.

As always, let us know if you have any feedback on this service in the comments or on Twitter!

Randall

Amazon Kinesis Video Streams Adds Support For HLS Output Streams

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-kinesis-video-streams-adds-support-for-hls-output-streams/

Today I’m excited to announce and demonstrate the new HTTP Live Streaming (HLS) output feature for Amazon Kinesis Video Streams (KVS). If you’re not already familiar with KVS, Jeff covered the release for AWS re:Invent in 2017. In short, Amazon Kinesis Video Streams is a service for securely capturing, processing, and storing video for analytics and machine learning – from one device or millions. Customers are using Kinesis Video with machine learning algorithms to power everything from home automation and smart cities to industrial automation and security.

After iterating on customer feedback, we’ve launched a number of features in the past few months, including a plugin for GStreamer, the popular open source multimedia framework, and Docker containers that make it easy to start streaming video to Kinesis. We could talk about each of those features at length, but today is all about the new HLS output feature! Fair warning: there are a few pictures of my incredibly messy office in this post.

HLS output is a new feature that allows customers to create HLS endpoints for their Kinesis Video Streams, which is convenient for building custom UIs and tools that can play back live and on-demand video. The HLS-based playback capability is fully managed, so you don’t have to build any infrastructure to transmux the incoming media. You simply create a new streaming session, up to 5 at a time (for now), with the new GetHLSStreamingSessionURL API and you’re off to the races. The great thing about HLS is that it’s already an industry standard and really easy to leverage in existing web players like JW Player, hls.js, Video.js, and Google’s Shaka Player, or to render natively in mobile apps with Android’s ExoPlayer and iOS’s AVFoundation. Let’s take a quick look at the API – feel free to skip to the walkthrough below as well.

Kinesis Video HLS Output API

The documentation covers this in more detail than we can go over in this blog post, but here are the broad components:

  1. Get an endpoint with the GetDataEndpoint API
  2. Use that endpoint to get an HLS streaming URL with the GetHLSStreamingSessionURL API
  3. Render the content in the HLS URL with whatever tools you want!

This is pretty easy in a Jupyter notebook with a quick bit of Python and boto3.

import boto3
STREAM_NAME = "RandallDeepLens"
kvs = boto3.client("kinesisvideo")
# Grab the endpoint from GetDataEndpoint
endpoint = kvs.get_data_endpoint(
    APIName="GET_HLS_STREAMING_SESSION_URL",
    StreamName=STREAM_NAME
)['DataEndpoint']
# Grab the HLS Stream URL from the endpoint
kvam = boto3.client("kinesis-video-archived-media", endpoint_url=endpoint)
url = kvam.get_hls_streaming_session_url(
    StreamName=STREAM_NAME,
    PlaybackMode="LIVE"
)['HLSStreamingSessionURL']

You can even visualize everything right away in Safari which can render HLS streams natively.

from IPython.display import HTML
HTML(data='<video src="{0}" autoplay="autoplay" controls="controls" width="300" height="400"></video>'.format(url)) 

We can also stream directly from an AWS DeepLens with just a bit of code:

import DeepLens_Kinesis_Video as dkv
import time
aws_access_key = "super_fake"
aws_secret_key = "even_more_fake"
region = "us-east-1"
stream_name ="RandallDeepLens"
retention = 1  # in minutes
wait_time_sec = 60 * 300  # the number of seconds to stream the data
# the stream will be created if it does not already exist
producer = dkv.createProducer(aws_access_key, aws_secret_key, "", region)
my_stream = producer.createStream(stream_name, retention)
my_stream.start()
time.sleep(wait_time_sec)
my_stream.stop()

How to use Kinesis Video Streams HLS Output Streams

We definitely need a Kinesis Video Stream, which we can create easily in the Kinesis Video Streams Console.

Now, we need to get some content into the stream. We have a few options here. Perhaps the easiest is the Docker container. I decided to take the more adventurous route and compile the GStreamer plugin locally on my Mac, following the scripts on GitHub. Be warned: compiling this plugin takes a while and can cause your computer to transform into a space heater.

With our freshly compiled GStreamer binaries, like gst-launch-1.0 and the kvssink plugin, we can stream directly from my MacBook’s webcam, or any other GStreamer source, into Kinesis Video Streams. I just use the kvssink output plugin and my data will wind up in the video stream. There are a few parameters to configure around this, so pay attention.

Here’s an example command that I ran to stream my MacBook’s webcam to Kinesis Video Streams:

gst-launch-1.0 autovideosrc ! videoconvert \
! video/x-raw,format=I420,width=640,height=480,framerate=30/1 \
! vtenc_h264_hw allow-frame-reordering=FALSE realtime=TRUE max-keyframe-interval=45 bitrate=500 \
! h264parse \
! video/x-h264,stream-format=avc,alignment=au,width=640,height=480,framerate=30/1 \
! kvssink stream-name="BlogStream" storage-size=1024 aws-region=us-west-2 log-config=kvslog

Now that we’re streaming some data into Kinesis, I can use the getting started sample static website to test my HLS stream with a few different video players. I just fill in my AWS credentials and ask it to start playing. The GetHLSStreamingSessionURL API supports a number of parameters, so you can play both on-demand segments and live streams from various timestamps.
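
For example, an on-demand session over a recent time window might look like this, reusing the kvam client and STREAM_NAME from the earlier snippet:

from datetime import datetime, timedelta

# Play back the last ten minutes of video rather than the live stream.
url = kvam.get_hls_streaming_session_url(
    StreamName=STREAM_NAME,
    PlaybackMode="ON_DEMAND",
    HLSFragmentSelector={
        "FragmentSelectorType": "SERVER_TIMESTAMP",
        "TimestampRange": {
            "StartTimestamp": datetime.utcnow() - timedelta(minutes=10),
            "EndTimestamp": datetime.utcnow(),
        },
    },
)['HLSStreamingSessionURL']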

Additional Info

Data consumed from Kinesis Video Streams using HLS is charged at $0.0119 per GB in US East (N. Virginia) and US West (Oregon); pricing for other regions is available on the service pricing page. This feature is available now in all regions where Kinesis Video Streams is available.

The Kinesis Video team told me they’re working hard on getting more integration with the AWS Media services, like MediaLive, which will make it easier to serve Kinesis Video Stream content to larger audiences.

As always, let us know what you think on Twitter or in the comments. I’ve had a ton of fun playing around with this feature over the past few days, and I’m excited to see customers build some new tools with it!

Randall

AWS Lambda Adds Amazon Simple Queue Service to Supported Event Sources

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-lambda-adds-amazon-simple-queue-service-to-supported-event-sources/

We can now use Amazon Simple Queue Service (SQS) to trigger AWS Lambda functions! This is a stellar update with some key functionality that I’ve personally been looking forward to for more than 4 years. I know our customers are excited to take it for a spin, so feel free to skip to the walkthrough section below if you don’t want a trip down memory lane.

SQS was the first service we ever launched with AWS back in 2004, 14 years ago. For some perspective, the largest commercial hard drives in 2004 were around 60GB, PHP 5 came out, Facebook had just launched, the TV show Friends ended, GMail was brand new, and I was still in high school. Looking back, I can see some of the tenets that make AWS what it is today were present even very early on in the development of SQS: fully managed, network accessible, pay-as-you-go, and no minimum commitments. Today, SQS is one of our most popular services used by hundreds of thousands of customers at absolutely massive scales as one of the fundamental building blocks of many applications.

AWS Lambda, by comparison, is a relative new kid on the block having been released at AWS re:Invent in 2014 (I was in the crowd that day!). Lambda is a compute service that lets you run code without provisioning or managing servers and it launched the serverless revolution back in 2014. It has seen immediate adoption across a wide array of use-cases from web and mobile backends to IT policy engines to data processing pipelines. Today, Lambda supports Node.js, Java, Go, C#, and Python runtimes letting customers minimize changes to existing codebases and giving them flexibility to build new ones. Over the past 4 years we’ve added a large number of features and event sources for Lambda making it easier for customers to just get things done. By adding support for SQS to Lambda we’re removing a lot of the undifferentiated heavy lifting of running a polling service or creating an SQS to SNS mapping.

Let’s take a look at how this all works.

Triggering Lambda from SQS

First, I’ll need an existing SQS standard queue or I’ll need to create one. I’ll go over to the AWS Management Console and open up SQS to create a new queue. Let’s give it a fun name. At the moment the Lambda triggers only work with standard queues and not FIFO queues.

Now that I have a queue I want to create a Lambda function to process it. I can navigate to the Lambda console and create a simple new function that just prints out the message body with some Python code like this:

def lambda_handler(event, context):
    for record in event['Records']:
        print(record['body'])

Next, I need to add the trigger to the Lambda function, which I can do by selecting the SQS trigger on the left side of the Lambda console. I can select which queue I want to use to invoke the Lambda function and the maximum number of records a single invocation will process (up to 10, based on the SQS ReceiveMessage API).
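
The same trigger can be created programmatically. A minimal sketch with boto3, using hypothetical queue and function names:

import boto3

lambda_client = boto3.client("lambda")

# Map a standard SQS queue to a Lambda function; BatchSize may be 1-10.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-2:123456789012:my-queue",  # hypothetical
    FunctionName="my-sqs-processor",                               # hypothetical
    BatchSize=10,
)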

Lambda will automatically scale out horizontally to invoke one function for each new message in my queue. For example, if I had 100 messages in my queue and enabled the trigger with a batch size of 1, Lambda would execute 100 invocations of my function to process the messages concurrently; with my batch size set to 10, it would invoke 10 functions, each receiving a batch of up to 10 messages. As the queue traffic fluctuates, the Lambda service will scale the polling operations up and down based on the number of inflight messages (I’ve covered this behavior in more detail in the additional information section at the bottom of this post). To control this behavior, I can increase or decrease the concurrent execution limit for my function.

For each batch of messages processed, if the function returns successfully then those messages will be removed from the queue. If the function errors out or times out, the messages will return to the queue after the queue’s visibility timeout expires. As a quick note, the Lambda function’s timeout has to be lower than the queue’s visibility timeout in order to create the event mapping from SQS to Lambda.

After adding the trigger, I can make any other changes I want to the function and save it. If we hop back over to the SQS console we can see the trigger is registered. I can configure and edit the trigger from the SQS console as well.

I’ll use the AWS CLI to enqueue a simple message and test the functionality:

aws sqs send-message --queue-url https://sqs.us-east-2.amazonaws.com/054060359478/SQS_TO_LAMBDA_IS_FINALLY_HERE_YAY --message-body "hello, world"

My Lambda receives the message and executes the code printing the message payload into my Amazon CloudWatch logs.

Of course all of this works with AWS SAM out of the box.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Example of processing messages on an SQS queue with Lambda
Resources:
  MySQSQueueFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.6
      CodeUri: src/
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt MySqsQueue.Arn
            BatchSize: 10

  MySqsQueue:
    Type: AWS::SQS::Queue

Additional Information

First of all, there are no additional charges for this feature, but because the Lambda service is continuously long-polling the SQS queue, the account will be charged for those API calls at the standard SQS pricing rates.

So, a quick deep dive on concurrency and automatic scaling here – just keep in mind that this behavior could change. The automatic scaling behavior of Lambda is designed to keep polling costs low when a queue is empty while simultaneously letting us scale up to high throughput when the queue is being used heavily. When an SQS event source mapping is initially created and enabled, or when messages first appear after a period with no traffic, then the Lambda service will begin polling the SQS queue using five parallel long-polling connections. The Lambda service monitors the number of inflight messages, and when it detects that this number is trending up, it will increase the polling frequency by 20 ReceiveMessage requests per minute and the function concurrency by 60 calls per minute. As long as the queue remains busy it will continue to scale until it hits the function concurrency limits. As the number of inflight messages trends down Lambda will reduce the polling frequency by 10 ReceiveMessage requests per minute and decrease the concurrency used to invoke our function by 30 calls per-minute.

The documentation is up to date with more info than what’s contained in this post. You can find an example SQS event payload there as well. You can find more details from the SQS side in their documentation as well.

As always we’re excited to hear feedback about this feature either on Twitter or in the comments below.

Randall

Amazon Comprehend Launches Asynchronous Batch Operations

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-comprehend-launches-asynchronous-batch-operations/

My colleague Jeff Barr last wrote about Amazon Comprehend, a service for discovering insights and relationships in text, when it launched at AWS re:Invent in 2017. Today, after iterating on customer feedback, we’re releasing a new asynchronous batch inference feature for Comprehend. Asynchronous batch operations work on documents stored in Amazon Simple Storage Service (S3) buckets and can perform all of the normal Comprehend operations, like entity recognition, key phrase extraction, sentiment analysis, and language detection. These new asynchronous batch APIs support significantly larger documents than the single-document and synchronous batch APIs, reducing the need for customers to truncate their documents for the service. Of course, all of the single-document and synchronous batch API operations remain available for real-time results. The addition of asynchronous operations allows developers to choose the tools best suited for their applications. Let’s take a deeper look at the new API.

Asynchronous API Operations

The new batch APIs follow the same asynchronous call structure as Amazon Comprehend’s TopicDetection API. To analyze a collection of documents we first call one of the Start* APIs like StartDominantLanguageDetectionJob, StartEntitiesDetectionJob, StartKeyPhrasesDetectionJob, or StartSentimentDetectionJob.

Each of these APIs takes an InputDataConfig and an OutputDataConfig that specify the incoming data format and location, as well as where in S3 the results should be stored. The InputDataConfig specifies whether the input data should be treated as one document per file or one document per line.

Additionally, we can name the job and include a unique request identifier for synchronization purposes. If we don’t supply these, the Comprehend service will generate them automatically.

At the time of this writing the asynchronous operations support individual documents of up to 100KB for entity and key phrase detection, 1MB for language detection, and 5KB for sentiment detection. The total size of all files in the batch must be under 5GB and we cannot submit more than 1 million individual files per batch.
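
Here’s a hedged sketch of kicking off one of these jobs with boto3; the bucket, prefix, and IAM role are hypothetical:

import boto3

comprehend = boto3.client("comprehend")

response = comprehend.start_entities_detection_job(
    JobName="entities-batch-1",
    LanguageCode="en",
    DataAccessRoleArn="arn:aws:iam::123456789012:role/ComprehendS3Access",  # hypothetical
    InputDataConfig={
        "S3Uri": "s3://my-bucket/documents/",
        "InputFormat": "ONE_DOC_PER_FILE",  # or ONE_DOC_PER_LINE
    },
    OutputDataConfig={"S3Uri": "s3://my-bucket/results/"},
)
print(response["JobId"])  # use this to poll the job's status later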

Now that we’ve seen what the API is doing, let’s take a look at the updated console and start a job!

Amazon Comprehend Analysis Console

First I’ll navigate to the AWS Management Console and open Amazon Comprehend. Next I’ll select the new Analysis console.

From here I can create a new analysis job by clicking the Create button in the top right of the console. I’ll create an entities detection job and select English as my document language. Then I’ll point the console at some sample data.

Now I’ll configure my output data location and make sure the service role has access to that S3 bucket. Then I’ll start the job!

Here I can see the operation started in the console and I can wait until it’s complete to view the detailed results.

Here on the job page I can see the status of the job and the output location. If I download the results from the S3 location I can take a look at the detected entities in the sample text.

I’ve truncated the results here but mostly they look something like this:

{
  "Entities": [
    {
      "BeginOffset": 875,
      "EndOffset": 899,
      "Score": 0.9936646223068237,
      "Text": "University of California",
      "Type": "ORGANIZATION"
    },
    {
      "BeginOffset": 903,
      "EndOffset": 911,
      "Score": 0.9519965648651123,
      "Text": "Berkeley",
      "Type": "LOCATION"
    },
    {
      "BeginOffset": 974,
      "EndOffset": 992,
      "Score": 0.9981470108032227,
      "Text": "Christopher Monroe",
      "Type": "PERSON"
    },
    {
      "BeginOffset": 997,
      "EndOffset": 1010,
      "Score": 0.9992995262145996,
      "Text": "Mikhail Lukin",
      "Type": "PERSON"
    },
    {
      "BeginOffset": 1095,
      "EndOffset": 1099,
      "Score": 0.9990954399108887,
      "Text": "2017",
      "Type": "DATE"
    }
  ],
  "File": "Sample.txt",
  "Line": 8
}

Pretty cool! We could go through similar steps for sentiment detection or key phrase detection. The fact that we can submit up to 5GB of data in a single batch means customers will spend less time transforming and truncating their documents.

I personally recommend a tool like AWS Step Functions for checking on the status of jobs programmatically. It’s very easy to set up and build programmatic analysis pipelines.
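
For a quick script, polling the matching Describe* call is enough — a minimal sketch, assuming job_id holds the ID returned by the Start* call above:

import boto3

comprehend = boto3.client("comprehend")

# job_id comes from the earlier start_entities_detection_job response.
# JobStatus moves through SUBMITTED / IN_PROGRESS to COMPLETED or FAILED.
job = comprehend.describe_entities_detection_job(JobId=job_id)
print(job["EntitiesDetectionJobProperties"]["JobStatus"])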

You could also use AWS Glue to call Comprehend as part of your regular ETL operations, as we mentioned in this blog post by Roy Hasson.

Additional Information

You can find detailed information on these new APIs in the documentation and learn more about the limits and best practices.

As previously mentioned, the synchronous APIs are still available and work best for smaller sets of documents and smaller document sizes.

As always, don’t hesitate to share your feedback here or on Twitter.

Randall

New – Redis 4.0 Compatibility in Amazon ElastiCache

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/new-redis-4-0-compatibility-in-amazon-elasticache/

Amazon ElastiCache makes it easy for you to set up a fully managed in-memory data store and cache with Redis or Memcached. Today we’re pleased to launch compatibility with Redis 4.0 in ElastiCache. You can now launch Redis 4.0-compatible ElastiCache nodes or clusters in all commercial AWS regions. ElastiCache Redis clusters can scale to terabytes of memory and millions of reads/writes per second to serve the most demanding needs of games, IoT devices, financial applications, and web applications.

Launching a Redis cluster in the AWS Management Console or AWS Command Line Interface (CLI) remains simple. I’m going to create a small cluster to play with the new Redis 4.0 features. To use the new version, I just select a 4.0 release in “Engine version compatibility”. At the time of this writing, this launches a 4.0.10-compatible cluster.

New Features

  • Least Frequently Used (LFU) cache eviction policy – Redis 4.0 launched with a number of caching improvements, including a new LFU cache eviction algorithm; customers may see better performance from LFU than from Least Recently Used (LRU). Antirez’s blog has a deep dive on some of the changes.
  • Asynchronous FLUSHDB, FLUSHALL, and UNLINK – using the ASYNC option of the FLUSH commands allows users to make a non-blocking call to clear databases. Using UNLINK instead of DEL allows users to asynchronously delete individual keys. There’s also the SWAPDB command, which can be useful to atomically switch between entire datasets.
  • Active memory defragmentation – Redis can now defragment memory while running, which allows more efficient utilization of memory for customer data. This is off by default, but you can modify the parameter group to turn it on. Customers should probably only turn it on if they’re running into fragmentation issues.
  • Online Cluster Resizing and Encryption in transit – with Redis 4.0 you can now use encryption in transit and online cluster resizing at the same time. With Online Cluster Resizing, you can add and remove shards from a running cluster to dynamically scale out or scale in your Redis cluster and adapt to changes on demand. Previously this feature couldn’t be used with encryption in transit, but now you can use both features simultaneously. This helps with workloads that require encryption for compliance purposes.
  • MEMORY commands – a whole new family of memory commands: DOCTOR, USAGE, STATS, PURGE, and MALLOC-STATS are available for gathering statistics or usage information on your Redis nodes. Running MEMORY DOCTOR will tell you about any memory issues (and it will give you a nice sci-fi easter egg if no problems are detected). The MEMORY STATS command will return some useful statistics, like “bytes-per-key”, that aren’t available in the INFO commands. A short sketch exercising a few of these commands follows this list.
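
Here’s that sketch, using the redis-py client from Python against a hypothetical ElastiCache endpoint:

import redis

# Hypothetical ElastiCache endpoint; substitute your cluster's address.
r = redis.StrictRedis(host="my-cluster.abc123.0001.use1.cache.amazonaws.com", port=6379)

# Non-blocking deletes, new in Redis 4.0.
r.execute_command("UNLINK", "big-key")
r.execute_command("FLUSHDB", "ASYNC")

# Memory introspection commands.
print(r.execute_command("MEMORY", "DOCTOR"))
print(r.execute_command("MEMORY", "USAGE", "some-key"))
print(r.execute_command("MEMORY", "STATS"))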

Additional Resources

You can find more information in the documentation and in antirez’s blogs/release notes.

We hope customers can take advantage of these new features right away. As always, feel free to leave any comments below or reach out to us on Twitter!

Randall

Amazon SageMaker Automatic Model Tuning: Using Machine Learning for Machine Learning

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/sagemaker-automatic-model-tuning/

Today I’m excited to announce the general availability of Amazon SageMaker Automatic Model Tuning. Automatic Model Tuning eliminates the undifferentiated heavy lifting required to search the hyperparameter space for more accurate models. This feature allows developers and data scientists to save significant time and effort in training and tuning their machine learning models. A Hyperparameter Tuning job launches multiple training jobs, with different hyperparameter combinations, based on the results of completed training jobs. SageMaker trains a “meta” machine learning model, based on Bayesian Optimization, to infer hyperparameter combinations for our training jobs. Let’s dive a little deeper.

Model Tuning in the Machine Learning Process

A developer’s typical machine learning process comprises 4 steps: exploratory data analysis (EDA), model design, model training, and model evaluation. SageMaker already makes each of those steps easy with access to powerful Jupyter notebook instances, built-in algorithms, and model training within the service. Focusing on the training portion of the process, we typically work with data and feed it into a model where we evaluate the model’s prediction against our expected result. We keep a portion of our overall input data, the evaluation data, away from the training data used to train the model. We can use the evaluation data to examine the behavior of our model on data it has never seen. In many cases after we’ve chosen an algorithm or built a custom model, we will need to search the space of possible hyperparameter configurations of that algorithm for the best results for our input data.

Hyperparameters control how our underlying algorithms operate and influence the performance of the model. They can be things like: the number of epochs to train for, the number of layers in the network, the learning rate, the optimization algorithms, and more. Typically, you start with random values, or common values for other problems, and iterate through adjustments as you begin to see what effect the changes have. In the past this was a painstakingly manual process. However, thanks to the work of some very talented researchers we can use SageMaker to eliminate almost all of the manual overhead. A user only needs to select the hyperparameters to tune, a range for each parameter to explore, and the total number of training jobs to budget. Let’s see how this works in practice.

Hyperparameter Tuning

To demonstrate this feature we’ll work with the standard MNIST dataset, the Apache MXNet framework, and the SageMaker Python SDK. Everything you see below is available in the SageMaker example notebooks.

First, I’ll create a traditional MXNet estimator using the SageMaker Python SDK on a Notebook Instance:


import boto3
import sagemaker
from sagemaker.mxnet import MXNet
role = sagemaker.get_execution_role()
region = boto3.Session().region_name
train_data_location = 's3://sagemaker-sample-data-{}/mxnet/mnist/train'.format(region)
test_data_location = 's3://sagemaker-sample-data-{}/mxnet/mnist/test'.format(region)
estimator = MXNet(entry_point='mnist.py',
                  role=role,
                  train_instance_count=1,
                  train_instance_type='ml.m4.xlarge',
                  sagemaker_session=sagemaker.Session(),
                  base_job_name='HPO-mxnet',
                  hyperparameters={'batch_size': 100})

This is probably quite similar to what you’ve seen in other SageMaker examples.

Now, we can import some tools for Automatic Model Tuning and create our hyperparameter ranges.


from sagemaker.tuner import HyperparameterTuner, IntegerParameter, CategoricalParameter, ContinuousParameter
hyperparameter_ranges = {'optimizer': CategoricalParameter(['sgd', 'Adam']),
                         'learning_rate': ContinuousParameter(0.01, 0.2),
                         'num_epoch': IntegerParameter(10, 50)}

The tuning job will select parameters from these ranges and use those to determine the best place to focus training efforts. There are a few types of parameters:

  • Categorical parameters use one value from a discrete set.
  • Continuous parameters can use any real number value between the minimum and maximum value.
  • Integer parameters can use any integer within the bounds specified.

Now that we have our ranges defined, we want to define our success metric and a regular expression for finding that metric in the training job logs.


objective_metric_name = 'Validation-accuracy'
metric_definitions = [{'Name': 'Validation-accuracy',
                       'Regex': 'Validation-accuracy=([0-9\\.]+)'}]

Now, with just these few things defined we can start our tuning job!


tuner = HyperparameterTuner(estimator,
                            objective_metric_name,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=9,
                            max_parallel_jobs=3)
tuner.fit({'train': train_data_location, 'test': test_data_location})

Now, we can open up the SageMaker console, select the Hyperparameter tuning jobs sub-console and check out all our tuning jobs.

We can click on the job we just created to get some more detail and explore the results of the tuning.

By default the console will show us the best job and the parameters used but we can also check out each of the other jobs.

Hopping back over to our notebook instance, we have a handy analytics object from tuner.analytics() that we can use to visualize the results of the training with some Bokeh plots. Some examples of this are provided in the SageMaker example notebooks.
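
If you'd rather work with the raw numbers, a minimal sketch might look like the following (this assumes the tuning job above has completed and that pandas is available on the notebook instance, since the analytics object returns a pandas DataFrame):

# Fetch the tuning results as a pandas DataFrame
df = tuner.analytics().dataframe()

# Sort by the objective metric to see which hyperparameter combinations won
print(df.sort_values('FinalObjectiveValue', ascending=False).head())

# The SDK can also report the best training job directly
print(tuner.best_training_job())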

This feature works for built-in algorithms, jobs created with the SageMaker Python SDK, or even bring-your-own training jobs in Docker.

We can even create tuning jobs right in the console by clicking Create hyperparameter tuning job.

First, we select a name for our job, an IAM role, and which VPC it should run in, if any.

Next, we configure the training job. We can use built-in algorithms or a custom Docker image. If we're using a custom image, this is where we would define the regex to find the objective metric in the logs. For now we'll just select XGBoost and click next.

Now we’ll configure our tuning job parameters just like in the notebook example. I’ll select the area under the curve (AUC) as the objective metric to optimize. Since this is a builtin algorithm the regex for that metric was already filled in by the previous step. I’ll set the minimum and maximum number of rounds and click next.

In the next screen we can configure the input channels that our algorithm is expecting as well as the location to output the models. We’d typically have more than just the “train” channel and would have an “eval” channel as well.

Finally, we can configure the resource limits for this tuning job.

Now we’re off to the races tuning!

Additional Resources

To take advantage of automatic model tuning there are really only a few things users have to define: the hyperparameter ranges, the success metric and a regex to find it, the number of jobs to run in parallel, and the maximum number of jobs to run. For the built-in algorithms we don’t even need to define the regex. There’s a small trade-off between the number of parallel jobs used and the accuracy of the final model. Increasing max_parallel_jobs will cause the tuning job to finish much faster but a lower parallelism will generally provide a slightly better final result.

Amazon SageMaker Automatic Model Tuning is provided at no additional charge; you pay only for the underlying resources used by the training jobs that the tuning job launches. This feature is available now in all regions where SageMaker is available. The feature is accessible through the API, and training jobs launched by automatic model tuning are visible in the console. You can find out more by reading the documentation.

I really think this feature will save developers a lot of time and effort and I’m excited to see what customers do with it. As always, we welcome your feedback in the comments or on Twitter!

Randall

Amazon SageMaker Updates – Tokyo Region, CloudFormation, Chainer, and Greengrass ML

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/sagemaker-tokyo-summit-2018/

Today, at the AWS Summit in Tokyo we announced a number of updates and new features for Amazon SageMaker. Starting today, SageMaker is available in Asia Pacific (Tokyo)! SageMaker also now supports CloudFormation. A new machine learning framework, Chainer, is now available in the SageMaker Python SDK, in addition to MXNet and TensorFlow. Finally, support for running Chainer models on several devices was added to AWS Greengrass Machine Learning.

Amazon SageMaker Chainer Estimator


Chainer is a popular, flexible, and intuitive deep learning framework. Chainer networks work on a “Define-by-Run” scheme, where the network topology is defined dynamically via forward computation. This is in contrast to many other frameworks which work on a “Define-and-Run” scheme where the topology of the network is defined separately from the data. A lot of developers enjoy the Chainer scheme since it allows them to write their networks with native Python constructs and tools.

Luckily, using Chainer with SageMaker is just as easy as using a TensorFlow or MXNet estimator. In fact, it might even be a bit easier since it’s likely you can take your existing scripts and use them to train on SageMaker with very few modifications. With TensorFlow or MXNet, users have to implement a train function with a particular signature. With Chainer your scripts can be a little more portable, as you can simply read from a few environment variables like SM_MODEL_DIR, SM_NUM_GPUS, and others. We can wrap our existing script in an if __name__ == '__main__': guard and invoke it locally or on SageMaker.


import argparse
import os

if __name__ == '__main__':

    parser = argparse.ArgumentParser()

    # hyperparameters sent by the client are passed as command-line arguments to the script.
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--batch-size', type=int, default=64)
    parser.add_argument('--learning-rate', type=float, default=0.05)

    # Data, model, and output directories
    parser.add_argument('--output-data-dir', type=str, default=os.environ['SM_OUTPUT_DATA_DIR'])
    parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
    parser.add_argument('--train', type=str, default=os.environ['SM_CHANNEL_TRAIN'])
    parser.add_argument('--test', type=str, default=os.environ['SM_CHANNEL_TEST'])

    args, _ = parser.parse_known_args()

    # ... load from args.train and args.test, train a model, write model to args.model_dir.

Then, we can run that script locally or use the SageMaker Python SDK to launch it on some GPU instances in SageMaker. The hyperparameters will get passed to the script as command-line arguments, and the environment variables above will be auto-populated. When we call fit, the input channels we pass will be populated in the SM_CHANNEL_* environment variables.


import sagemaker
from sagemaker.chainer.estimator import Chainer

# The estimator needs an IAM role to run training jobs on our behalf
role = sagemaker.get_execution_role()

# Create my estimator
chainer_estimator = Chainer(
    entry_point='example.py',
    role=role,
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge',
    hyperparameters={'epochs': 10, 'batch-size': 64}
)
# Train my estimator (train_input and test_input are the S3 locations of our data)
chainer_estimator.fit({'train': train_input, 'test': test_input})

# Deploy my estimator to a SageMaker Endpoint and get a Predictor
predictor = chainer_estimator.deploy(
    instance_type="ml.m4.xlarge",
    initial_instance_count=1
)
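
Once the endpoint is up, invoking it is a single call. Here's a quick sketch (the input below is a placeholder; shape it to whatever your inference script expects), plus cleanup so the endpoint doesn't keep accruing charges:

# Placeholder input; use whatever shape your inference code expects
sample_data = [[0.0] * 784]
print(predictor.predict(sample_data))

# Tear down the endpoint when finished to stop incurring charges
predictor.delete_endpoint()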

Now, instead of bringing your own Docker container for training and hosting with Chainer, you can just maintain your script. You can see the full sagemaker-chainer-containers on GitHub. One of my favorite features of the new container is built-in ChainerMN support for easy multi-node distribution of your Chainer training jobs.

There’s a lot more documentation and information available in both the README and the example notebooks.

AWS Greengrass ML with Chainer

AWS Greengrass ML now includes a pre-built Chainer package for devices powered by Intel Atom, NVIDIA Jetson TX2, and Raspberry Pi. So, Greengrass ML now provides pre-built packages for TensorFlow, Apache MXNet, and Chainer! You can train your models on SageMaker and then easily deploy them to any Greengrass-enabled device using Greengrass ML.

JAWS UG

I want to give a quick shout out to all of our wonderful and inspirational friends in the JAWS UG who attended the AWS Summit in Tokyo today. I’ve very much enjoyed seeing your pictures of the summit. Thanks for making Japan an amazing place for AWS developers! I can’t wait to visit again and meet with all of you.

Randall

Amazon Neptune Generally Available

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-neptune-generally-available/

Amazon Neptune is now Generally Available in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland). Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. At the core of Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latencies. Neptune supports two popular graph models, Property Graph and RDF, through Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune can be used to power everything from recommendation engines and knowledge graphs to drug discovery and network security. Neptune is fully-managed with automatic minor version upgrades, backups, encryption, and fail-over. I wrote about Neptune in detail for AWS re:Invent last year and customers have been using the preview and providing great feedback that the team has used to prepare the service for GA.

Now that Amazon Neptune is generally available, there are a few changes from the preview.

Launching an Amazon Neptune Cluster

Launching a Neptune cluster is as easy as navigating to the AWS Management Console and clicking create cluster. Of course you can also launch with CloudFormation, the CLI, or the SDKs.

You can monitor your cluster health and the health of individual instances through Amazon CloudWatch and the console.
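
Once the cluster is available, I can start running Gremlin traversals against it. Here's a rough sketch using the open source gremlinpython driver (the endpoint below is a placeholder for your own cluster endpoint):

from gremlin_python.driver import client

# Placeholder endpoint; substitute your Neptune cluster endpoint
neptune = client.Client(
    'wss://my-neptune-cluster.cluster-xxxxxx.us-east-1.neptune.amazonaws.com:8182/gremlin',
    'g'
)

# Add a vertex, then count all vertices in the graph
neptune.submit("g.addV('person').property('name', 'randall')").all().result()
print(neptune.submit('g.V().count()').all().result())

neptune.close()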

Additional Resources

We’ve created two repos with some additional tools and examples here. You can expect continuous development on these repos as we add additional tools and examples.

  • Amazon Neptune Tools Repo
    This repo has a useful tool for converting GraphML files into Neptune-compatible CSVs for bulk loading from S3.
  • Amazon Neptune Samples Repo
    This repo has a really cool example of building a collaborative filtering recommendation engine for video game preferences.

Purpose Built Databases

There’s an industry trend of moving more and more toward purpose-built databases. Developers and businesses want to access their data in the format that makes the most sense for their applications. As cloud resources make transforming large datasets easier with tools like AWS Glue, we have a lot more options than we used to for accessing our data. With tools like Amazon Redshift, Amazon Athena, Amazon Aurora, Amazon DynamoDB, and more we get to choose the best database for the job or even enable entirely new use-cases. Amazon Neptune is perfect for workloads where the data is highly connected across data-rich edges.

I’m really excited about graph databases and I see a huge number of applications. Looking for ideas of cool things to build? I’d love to build a web crawler in AWS Lambda that uses Neptune as the backing store. You could further enrich it by running Amazon Comprehend or Amazon Rekognition on the text and images found and creating a search engine on top of Neptune.

As always, feel free to reach out in the comments or on Twitter to provide any feedback!

Randall

Introducing the AWS Machine Learning Competency for Consulting Partners

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/introducing-the-aws-machine-learning-competency-for-consulting-partners/

Today I’m excited to announce a new Machine Learning Competency for Consulting Partners in the Amazon Partner Network (APN). This AWS Competency program allows APN Consulting Partners to demonstrate deep expertise in machine learning on AWS by providing solutions that enable machine learning and data science workflows for their customers. This new AWS Competency is in addition to the Machine Learning Competency for our APN Technology Partners, which we launched at the re:Invent 2017 partner summit.

These APN Consulting Partners help organizations solve their machine learning and data challenges through:

  • Providing data services that help data scientists and machine learning practitioners prepare their enterprise data for training.
  • Platform solutions that provide data scientists and machine learning practitioners with tools to take their data, train models, and make predictions on new data.
  • SaaS and API solutions to enable predictive capabilities within customer applications.

Why work with an AWS Machine Learning Competency Partner?

The AWS Competency Program helps customers find the most qualified partners with deep expertise. AWS Machine Learning Competency Partners undergo a strict validation of their capabilities to demonstrate technical proficiency and proven customer success with AWS machine learning tools.

If you’re an AWS customer interested in machine learning workloads on AWS, check out our AWS Machine Learning launch partners below:

Interested in becoming an AWS Machine Learning Competency Partner?

APN Partners with experience in Machine Learning can learn more about becoming an AWS Machine Learning Competency Partner here. To learn more about the benefits of joining the AWS Partner Network, see our APN Partner website.

Thanks to the AWS Partner Team for their help with this post!
Randall

New .BOT gTLD from Amazon

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/new-bot-gtld-from-amazon/

Today, I’m excited to announce the launch of .BOT, a new generic top-level domain (gTLD) from Amazon. Customers can use .BOT domains to provide an identity and portal for their bots. Fitness bots, slack bots, e-commerce bots, and more can all benefit from an easy-to-access .BOT domain. The phrase “bot” was the 4th most registered domain keyword within the .COM TLD in 2016 with more than 6000 domains per month. A .BOT domain allows customers to provide a definitive internet identity for their bots as well as enhancing SEO performance.

At the time of this writing .BOT domains start at $75 each and must be verified and published with a supported tool like: Amazon Lex, Botkit Studio, Dialogflow, Gupshup, Microsoft Bot Framework, or Pandorabots. You can expect support for more tools over time and if your favorite bot framework isn’t supported feel free to contact us here: [email protected].

Below, I’ll walk through the experience of registering and provisioning a domain for my bot, whereml.bot. Then we’ll look at setting up the domain as a hosted zone in Amazon Route 53. Let’s get started.

Registering a .BOT domain

First, I’ll head over to https://amazonregistry.com/bot, type in a new domain, and click the magnifying glass to make sure my domain is available and get taken to the registration wizard.

Next, I have the opportunity to choose how I want to verify my bot. I build all of my bots with Amazon Lex so I’ll select that in the drop down and get prompted for instructions specific to AWS. If I had my bot hosted somewhere else I would need to follow the unique verification instructions for that particular framework.

To verify my Lex bot I need to give the Amazon Registry permissions to invoke the bot and verify its existence. I’ll do this by creating an AWS Identity and Access Management (IAM) cross-account role and attaching the AmazonLexReadOnly managed policy to that role. This is easily accomplished in the AWS Console. Be sure to provide the account number and external ID shown on the registration page.

Now I’ll add read-only permissions to our Amazon Lex bots.

I’ll give my role a fancy name like DotBotCrossAccountVerifyRole and a description so it’s easy to remember why I made it, then I’ll click Create to create the role and be transported to the role summary page.

Finally, I’ll copy the ARN from the created role and save it for my next step.

Here I’ll add all the details of my Amazon Lex bot. If you haven’t made a bot yet you can follow the tutorial to build a basic bot. I can refer to any alias I’ve deployed but if I just want to grab the latest published bot I can pass in $LATEST as the alias. Finally I’ll click Validate and proceed to registering my domain.

Amazon Registry works with a partner, EnCirca, to register our domains, so we’ll select them and optionally grab Site Builder. I know how to sling some HTML and JavaScript together, so I’ll pass on the Site Builder side of things.

After I click continue we’re taken to EnCirca’s website to finalize the registration and with any luck within a few minutes of purchasing and completing the registration we should receive an email with some good news:

Alright, now that we have a domain name let’s find out how to host things on it.

Using Amazon Route53 with a .BOT domain

Amazon Route 53 is a highly available and scalable DNS service with robust APIs, health checks, service discovery, and many other features. I definitely want to use this to host my new domain. The first thing I’ll do is navigate to the Route53 console and create a hosted zone with the same name as my domain.


Great! Now, I need to take the Name Server (NS) records that Route53 created for me and use EnCirca’s portal to add these as the authoritative nameservers on the domain.

Now I just add my records to my hosted zone and I should be able to serve traffic! Way cool, I’ve got my very own .bot domain for @WhereML.
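
If you'd rather script that last step, here's a minimal sketch with boto3 (the hosted zone ID and IP address below are placeholders):

import boto3

route53 = boto3.client('route53')

# Placeholder hosted zone ID and record value; use your own
route53.change_resource_record_sets(
    HostedZoneId='Z1EXAMPLE',
    ChangeBatch={
        'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'whereml.bot',
                'Type': 'A',
                'TTL': 300,
                'ResourceRecords': [{'Value': '203.0.113.10'}]
            }
        }]
    }
)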

Next Steps

  • I could and should improve the security of my site by creating TLS certificates so that visitors can reach my domain over HTTPS. Luckily, with AWS Certificate Manager (ACM) this is extremely straightforward, and I’ve got my subdomains and root domain verified in just a few clicks.
  • I could create a CloudFront distribution to front a static single-page application in S3 that hosts my entire chatbot and invokes Amazon Lex with an Amazon Cognito identity right from the browser.

Randall

AWS Certificate Manager Launches Private Certificate Authority

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-certificate-manager-launches-private-certificate-authority/

Today we’re launching a new feature for AWS Certificate Manager (ACM), Private Certificate Authority (CA). This new service allows ACM to act as a private subordinate CA. Previously, if a customer wanted to use private certificates, they needed specialized infrastructure and security expertise that could be expensive to maintain and operate. ACM Private CA builds on ACM’s existing certificate capabilities to help you easily and securely manage the lifecycle of your private certificates with pay-as-you-go pricing. This enables developers to provision certificates in just a few simple API calls while administrators have a central CA management console and fine-grained access control through granular IAM policies. ACM Private CA keys are stored securely in AWS-managed hardware security modules (HSMs) that adhere to FIPS 140-2 Level 3 security standards. ACM Private CA automatically maintains certificate revocation lists (CRLs) in Amazon Simple Storage Service (S3) and lets administrators generate audit reports of certificate creation with the API or console. This service is packed full of features, so let’s jump in and provision a CA.

Provisioning a Private Certificate Authority (CA)

First, I’ll navigate to the ACM console in my region and select the new Private CAs section in the sidebar. From there I’ll click Get Started to start the CA wizard. For now, I only have the option to provision a subordinate CA so we’ll select that and use my super secure desktop as the root CA and click Next. This isn’t what I would do in a production setting but it will work for testing out our private CA.

Now, I’ll configure the CA with some common details. The most important thing here is the Common Name which I’ll set as secure.internal to represent my internal domain.

Now I need to choose my key algorithm. You should choose the best algorithm for your needs, but know that ACM has a limitation today that it can only manage certificates that chain up to RSA CAs. For now, I’ll go with RSA 2048 bit and click Next.

In this next screen, I’m able to configure my certificate revocation list (CRL). CRLs are essential for notifying clients in the case that a certificate has been compromised before certificate expiration. ACM will maintain the revocation list for me, and I have the option of fronting my S3 bucket with a custom domain. In this case I’ll create a new S3 bucket to store my CRL in and click Next.

Finally, I’ll review all the details to make sure I didn’t make any typos and click Confirm and create.

A few seconds later and I’m greeted with a fancy screen saying I successfully provisioned a certificate authority. Hooray! I’m not done yet though. I still need to activate my CA by creating a certificate signing request (CSR) and signing that with my root CA. I’ll click Get started to begin that process.

Now I’ll copy the CSR or download it to a server or desktop that has access to my root CA (or potentially another subordinate – so long as it chains to a trusted root for my clients).

Now I can use a tool like openssl to sign my cert and generate the certificate chain.


$ openssl ca -config openssl_root.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in csr/CSR.pem -out certs/subordinate_cert.pem
Using configuration from openssl_root.cnf
Enter pass phrase for /Users/randhunt/dev/amzn/ca/private/root_private_key.pem:
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
stateOrProvinceName   :ASN.1 12:'Washington'
localityName          :ASN.1 12:'Seattle'
organizationName      :ASN.1 12:'Amazon'
organizationalUnitName:ASN.1 12:'Engineering'
commonName            :ASN.1 12:'secure.internal'
Certificate is to be certified until Mar 31 06:05:30 2028 GMT (3650 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

After that I’ll copy my subordinate_cert.pem and certificate chain back into the console and click Next.

Finally, I’ll review all the information and click Confirm and import. I should see a screen like the one below that shows my CA has been activated successfully.

Now that I have a private CA, we can provision private certificates by hopping back to the ACM console and creating a new certificate. After clicking Create a new certificate, I’ll select the Request a private certificate radio button, then click Request a certificate.

From there, the process is just like provisioning a normal certificate in ACM.
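
If I wanted to skip the console entirely, requesting a private certificate is a single API call. Here's a rough boto3 sketch (the CA ARN below is a placeholder for the private CA we just created):

import boto3

acm = boto3.client('acm')

# Placeholder ARN; use the ARN of your own private CA
response = acm.request_certificate(
    DomainName='api.secure.internal',
    CertificateAuthorityArn='arn:aws:acm-pca:us-east-1:123456789012:certificate-authority/11111111-2222-3333-4444-555555555555'
)
print(response['CertificateArn'])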

Now I have a private certificate that I can bind to my ELBs, CloudFront Distributions, API Gateways, and more. I can also export the certificate for use on embedded devices or outside of ACM managed environments.

Available Now
ACM Private CA is a service in and of itself and it is packed full of features that won’t fit into a blog post. I strongly encourage interested readers to go through the developer guide and familiarize themselves with certificate-based security. ACM Private CA is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt) and EU (Ireland). Private CAs cost $400 per month (prorated) for each private CA. You are not charged for certificates created and maintained in ACM, but you are charged for certificates where you have access to the private key (exported or created outside of ACM). The pricing per certificate is tiered, starting at $0.75 per certificate for the first 1000 certificates and going down to $0.001 per certificate after 10,000 certificates.

I’m excited to see administrators and developers take advantage of this new service. As always please let us know what you think of this service on Twitter or in the comments below.

Randall

AWS Secrets Manager: Store, Distribute, and Rotate Credentials Securely

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-secrets-manager-store-distribute-and-rotate-credentials-securely/

Today we’re launching AWS Secrets Manager which makes it easy to store and retrieve your secrets via API or the AWS Command Line Interface (CLI) and rotate your credentials with built-in or custom AWS Lambda functions. Managing application secrets like database credentials, passwords, or API Keys is easy when you’re working locally with one machine and one application. As you grow and scale to many distributed microservices, it becomes a daunting task to securely store, distribute, rotate, and consume secrets. Previously, customers needed to provision and maintain additional infrastructure solely for secrets management which could incur costs and introduce unneeded complexity into systems.

AWS Secrets Manager

Imagine that I have an application that takes incoming tweets from Twitter and stores them in an Amazon Aurora database. Previously, I would have had to request a username and password from my database administrator and embed those credentials in environment variables or, in my race to production, even in the application itself. I would also need to have our social media manager create the Twitter API credentials and figure out how to store those. This is a fairly manual process, involving multiple people, that I have to restart every time I want to rotate these credentials. With Secrets Manager my database administrator can provide the credentials in Secrets Manager once and subsequently rely on a Secrets Manager-provided Lambda function to automatically update and rotate those credentials. My social media manager can put the Twitter API keys in Secrets Manager, which I can then access with a simple API call, and I can even rotate these programmatically with a custom Lambda function calling out to the Twitter API. My secrets are encrypted with the KMS key of my choice, and each of these administrators can explicitly grant access to these secrets with granular IAM policies for individual roles or users.

Let’s take a look at how I would store a secret using the AWS Secrets Manager console. First, I’ll click Store a new secret to get to the new secrets wizard. For my RDS Aurora instance it’s straightforward to simply select the instance and provide the initial username and password to connect to the database.

Next, I’ll fill in a quick description and a name to access my secret by. You can use whatever naming scheme you want here.

Next, we’ll configure rotation to use the Secrets Manager-provided Lambda function to rotate our password every 10 days.

Finally, we’ll review all the details and check out our sample code for storing and retrieving our secret!

Once everything is saved, I can review the secrets in the console.

Now, if I needed to access these secrets I’d simply call the API.

import json
import boto3

secrets = boto3.client("secretsmanager")
# SecretString is a JSON document, so parse it to get a dict back
rds = json.loads(secrets.get_secret_value(SecretId="prod/TwitterApp/Database")['SecretString'])
print(rds)

Which would give me the following values:


{'engine': 'mysql',
 'host': 'twitterapp2.abcdefg.us-east-1.rds.amazonaws.com',
 'password': '-)Kw>THISISAFAKEPASSWORD:lg{&sad+Canr',
 'port': 3306,
 'username': 'ranman'}

More than passwords

AWS Secrets Manager works for more than just passwords. I can store OAuth credentials, binary data, and more. Let’s look at storing my Twitter OAuth application keys.

Now, I can define the rotation for these third-party OAuth credentials with a custom AWS Lambda function that can call out to Twitter whenever we need to rotate our credentials.

Custom Rotation

One of the niftiest features of AWS Secrets Manager is custom AWS Lambda functions for credential rotation. This allows you to define completely custom workflows for credentials. Secrets Manager will call your Lambda function with a payload that includes a Step which specifies which step of the rotation you’re in, a SecretId which specifies which secret the rotation is for, and importantly a ClientRequestToken which is used to ensure idempotency in any changes to the underlying secret.

When you’re rotating secrets you go through a few different steps:

  1. createSecret
  2. setSecret
  3. testSecret
  4. finishSecret

The advantage of these steps is that you can add any kind of approval steps you want for each phase of the rotation. For more details on custom rotation check out the documentation.
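
To make that concrete, here's a bare-bones skeleton of what a rotation Lambda function might look like. The branch bodies are left as comments, since the real logic depends entirely on the secret being rotated:

import boto3

def lambda_handler(event, context):
    # Secrets Manager passes these three fields on every invocation
    step = event['Step']
    secret_id = event['SecretId']
    token = event['ClientRequestToken']  # keeps retried changes idempotent

    client = boto3.client('secretsmanager')

    if step == 'createSecret':
        # Generate a candidate secret (e.g. with get_random_password)
        # and stage it as a new version with put_secret_value
        pass
    elif step == 'setSecret':
        # Push the staged secret to the downstream service (database, API, etc.)
        pass
    elif step == 'testSecret':
        # Verify the staged secret actually works before finalizing
        pass
    elif step == 'finishSecret':
        # Move the AWSCURRENT staging label to the new version
        pass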

Available Now
AWS Secrets Manager is available today in US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (São Paulo). Secrets are priced at $0.40 per month per secret and $0.05 per 10,000 API calls. I’m looking forward to seeing more users adopt rotating credentials to secure their applications!

Randall

Amazon Transcribe Now Generally Available

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-transcribe-now-generally-available/


At AWS re:Invent 2017 we launched Amazon Transcribe in private preview. Today we’re excited to make Amazon Transcribe generally available for all developers. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capabilities to their applications. We’ve iterated on customer feedback in the preview to make a number of enhancements to Amazon Transcribe.

New Amazon Transcribe Features in GA

To start off, we’ve made the SampleRate parameter optional, which means you only need to know the file type of your media and the input language. We’ve added two new features – the ability to differentiate multiple speakers in the audio to provide more intelligible transcripts (“who spoke when”), and a custom vocabulary to improve the accuracy of speech recognition for product names, industry-specific terminology, or names of individuals. To refresh our memories on how Amazon Transcribe works, let’s look at a quick example. I’ll convert this audio in my S3 bucket.

import boto3

transcribe = boto3.client("transcribe")
transcribe.start_transcription_job(
    TranscriptionJobName="TranscribeDemo",
    LanguageCode="en-US",
    MediaFormat="mp3",
    Media={"MediaFileUri": "https://s3.amazonaws.com/randhunt-transcribe-demo-us-east-1/out.mp3"},
    # Turn on speaker identification so the output below includes
    # speaker_labels (two speakers in this example)
    Settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": 2}
)

This will output JSON similar to this (I’ve stripped out most of the response) with individual speakers identified:

{
  "jobName": "reinvent",
  "accountId": "1234",
  "results": {
    "transcripts": [
      {
        "transcript": "Hi, everybody, i'm randall ..."
      }
    ],
    "speaker_labels": {
      "speakers": 2,
      "segments": [
        {
          "start_time": "0.000000",
          "speaker_label": "spk_0",
          "end_time": "0.010",
          "items": []
        },
        {
          "start_time": "0.010000",
          "speaker_label": "spk_1",
          "end_time": "4.990",
          "items": [
            {
              "start_time": "1.000",
              "speaker_label": "spk_1",
              "end_time": "1.190"
            },
            {
              "start_time": "1.190",
              "speaker_label": "spk_1",
              "end_time": "1.700"
            }
          ]
        }
      ]
    },
    "items": [
      {
        "start_time": "1.000",
        "end_time": "1.190",
        "alternatives": [
          {
            "confidence": "0.9971",
            "content": "Hi"
          }
        ],
        "type": "pronunciation"
      },
      {
        "alternatives": [
          {
            "content": ","
          }
        ],
        "type": "punctuation"
      },
      {
        "start_time": "1.190",
        "end_time": "1.700",
        "alternatives": [
          {
            "confidence": "1.0000",
            "content": "everybody"
          }
        ],
        "type": "pronunciation"
      }
    ]
  },
  "status": "COMPLETED"
}
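
One thing to keep in mind is that start_transcription_job is asynchronous, so in practice you poll for completion and then fetch the transcript from the URI the service hands back. A minimal sketch:

import time
import boto3

transcribe = boto3.client("transcribe")

# Poll until the job finishes one way or the other
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName="TranscribeDemo")
    status = job["TranscriptionJob"]["TranscriptionJobStatus"]
    if status in ("COMPLETED", "FAILED"):
        break
    time.sleep(10)

if status == "COMPLETED":
    # The JSON shown above lives at this URI
    print(job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])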

Custom Vocabulary

Now if I needed to have a more complex technical discussion with a colleague I could create a custom vocabulary. A custom vocabulary is specified as an array of strings passed to the CreateVocabulary API, and you can include your custom vocabulary in a transcription job by passing in the name as part of the Settings in a StartTranscriptionJob API call. An individual vocabulary can be as large as 50KB and each phrase must be less than 256 characters. If I wanted to transcribe the recordings of my high school AP Biology class I could create a custom vocabulary in Python like this:

import boto3

transcribe = boto3.client("transcribe")
transcribe.create_vocabulary(
    LanguageCode="en-US",
    VocabularyName="APBiology",
    Phrases=[
        "endoplasmic-reticulum",
        "organelle",
        "cisternae",
        "eukaryotic",
        "ribosomes",
        "hepatocytes",
        "cell-membrane"
    ]
)

I can refer to this vocabulary later on by the name APBiology and update it programmatically based on any errors I may find in the transcriptions.

Available Now

Amazon Transcribe is available now in US East (N. Virginia), US West (Oregon), US East (Ohio) and EU (Ireland). Transcribe’s free tier gives you 60 minutes of transcription per month for the first 12 months, with a pay-as-you-go price of $0.0004 per second of transcribed audio after that and a minimum charge of 15 seconds per request.

When combined with other tools and services, I think Transcribe opens up entirely new opportunities for application development. I’m excited to see what technologies developers build with this new service.

Randall