Tag Archives: Amazon Elastic Transcoder

Implementing Serverless Video Subtitles

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/implementing-serverless-video-subtitles/

This post is courtesy of Maxime Thomas, DevOps Partner Solutions Architect – AWS

This story begins when I joined AWS at the beginning of the year. I had a hard time during my ramp-up period trying to handle the amount of information coming from all directions. Technical training, meetings, new colleagues, in a worldwide company—the volume of information was overwhelming. However, my first priority was to get my AWS Certified Solutions Architect — Professional certification. This gave me plenty of opportunities to learn and focus on all of the new domains I had never heard about.

This intensive self-paced training quickly gave me a way to get experience. I was opening the AWS Management Console, diving deep into the service documentation, and comparing to my own experience and understanding of production constraints. I wasn’t disappointed by the scope of the platform and its various capabilities.

However, as a native French speaker, I struggled a bit because all of the training videos were in English. Okay, it’s not a problem when you speak another language for 20 minutes a day, but 6 hours every day was exhausting. (It did help me to learn the language faster.) I looked at all of those training videos, and I thought: It would be so much easier if they had French subtitles!

But they didn’t. I continued my deep dive into the serverless world, which led me to another consideration: It would be cool to have a service that could generate subtitles from a video in any language.

Wait–the AWS platform has everything we need to do that!

Video: Playing a video after subtitle generation

I mean, what is the process of translation when you watch a video? It’s basically the following:

  • Listen
  • Extract the information
  • Translate

Proof of concept

I decided to focus on this subject to understand how I could build that kind of system. My pitch was this: The system can receive a video input, extract the audio track, transcribe it, and generate different subtitle files for your video. Since AWS re:Invent 2017, AWS has announced several services that helped me with my proof of concept:

Finally, the way to define subtitles has been specified by the World Wide Web Consortium under the WebVTT format, providing a simple way to produce subtitles for online videos.

I proved the concept in barely 20 minutes with a video file, an Amazon S3 bucket, some AWS IAM roles, and access to the beta versions of the different services. It was going to work, so I decided to transform it into a demo project.


The fun part of this project was doing it in a serverless way using AWS Lambda and AWS Step Functions. I could have developed it in other ways, but I eliminated them quickly: a custom code base on Amazon EC2 would take too long to code and was excessive computation for what I needed; a container with the code base on Amazon Elastic Container Service would be better, but still was overkill from a compute perspective.

So, Lambda was the solution of choice for compute. Step Functions would take care of coordinating the workflow of the application and the different Lambda functions, so I didn’t need to build that logic into the functions themselves. I split the solution into two parts:

  • The backend processes an MP4 file and outputs the same file plus WebVTT files for each language
  • The frontend provides a web interface to submit the video and render the result in a fancy way

The following image shows the solution’s architecture.


The solution consists of a Step Functions state machine that executes the following sequence triggered by an Amazon S3 event notification:

  1. Transcode the file with Elastic Transcoder using its API.
  2. Wait two minutes, which is enough time for transcoding.
  3. Submit the file to Amazon Transcribe and enter the following loop:
    1. Wait for 30 seconds.
    2. Check the API to know if transcription is over. If it is, go to step 4; otherwise, go back to step 3.1.
  4. Process the transcript to become a VTT file, which goes to Amazon Translate several times to get a version of the file in another language.
  5. Clean and wrap up.

The following image shows this sequence as a Step Functions state machine.

The power of Step Functions appears in the integration of such a sequence. You can set up different Lambda functions at each stage of the sequence, put them in parallel if you need to, and handle errors with a retry and fallback. Everything is declarative in the JSON that defines the state machine. The input object that the state machine evaluates between each transition is the one that you provide at the first call. You can enrich it as the state machine executes and gathers more information at later steps.

For instance, if you pass a JSON object as input, it goes through all the way through, and each step can add information that wasn’t there at the beginning of the workflow. This is useful when your decision tree is creating elements and you need to refer to it in other steps.

I also set up an Amazon DynamoDB table to store the state of each file for further processing on the front end.


The front end’s setup is easy: an Amazon S3 bucket with the static website feature on and a combination of HTML, AWS SDK for JavaScript in the Browser, and a JavaScript framework to handle calls to the AWS Platform. The sequence has the following steps:

  1. Load HTML, CSS, and JavaScript from a bucket in Amazon S3.
  2. Specific JavaScript for this project does the following:
    • Sets up the AWS SDK
    • Connects to Amazon Cognito against a predefined identity pool set up for anonymous users
    • Loads a custom IAM role that gives access to an Amazon S3 bucket
  3. The user uploads an MP4 file to the bucket, and the backend process starts.
  4. A JavaScript loop checks the DynamoDB table where the state of the process is stored and do the following:
    • Add a description of the video process and show the state of the process.
    • Update the progress bar in the description block to inform the user what the process is doing
    • Update the video links when the process is over.
  5. When the process completes, the user can choose the list item to get an HTML5 video player with the VTT files loaded.


Keep the following points in mind:

  • This isn’t a production solution. Don’t use it as is.
  • The solution is designed for videos where a person speaks clearly. I tried with non- native English-speaking people, and results are poor at the moment.
  • The solution is adapted for videos without background noise or music. I checked with different types of videos (movie scenes, music videos, and ads), and results are poor.
  • Processing time depends on the length of the original video.
  • The frontend check is basic. Improve it by implementing WebSockets to avoid polling from the browser, which it doesn’t scale.

What’s next?

Feel free to try out the code yourself and customize it for your own needs! This project is open source. To download the project files, see Serverless Subtitles on the AWSLabs GitHub website. Feel free to contribute (Pull Requests only).

Invoking AWS Lambda from Amazon MQ

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/invoking-aws-lambda-from-amazon-mq/

Contributed by Josh Kahn, AWS Solutions Architect

Message brokers can be used to solve a number of needs in enterprise architectures, including managing workload queues and broadcasting messages to a number of subscribers. Amazon MQ is a managed message broker service for Apache ActiveMQ that makes it easy to set up and operate message brokers in the cloud.

In this post, I discuss one approach to invoking AWS Lambda from queues and topics managed by Amazon MQ brokers. This and other similar patterns can be useful in integrating legacy systems with serverless architectures. You could also integrate systems already migrated to the cloud that use common APIs such as JMS.

For example, imagine that you work for a company that produces training videos and which recently migrated its video management system to AWS. The on-premises system used to publish a message to an ActiveMQ broker when a video was ready for processing by an on-premises transcoder. However, on AWS, your company uses Amazon Elastic Transcoder. Instead of modifying the management system, Lambda polls the broker for new messages and starts a new Elastic Transcoder job. This approach avoids changes to the existing application while refactoring the workload to leverage cloud-native components.

This solution uses Amazon CloudWatch Events to trigger a Lambda function that polls the Amazon MQ broker for messages. Instead of starting an Elastic Transcoder job, the sample writes the received message to an Amazon DynamoDB table with a time stamp indicating the time received.

Getting started

To start, navigate to the Amazon MQ console. Next, launch a new Amazon MQ instance, selecting Single-instance Broker and supplying a broker name, user name, and password. Be sure to document the user name and password for later.

For the purposes of this sample, choose the default options in the Advanced settings section. Your new broker is deployed to the default VPC in the selected AWS Region with the default security group. For this post, you update the security group to allow access for your sample Lambda function. In a production scenario, I recommend deploying both the Lambda function and your Amazon MQ broker in your own VPC.

After several minutes, your instance changes status from “Creation Pending” to “Available.” You can then visit the Details page of your broker to retrieve connection information, including a link to the ActiveMQ web console where you can monitor the status of your broker, publish test messages, and so on. In this example, use the Stomp protocol to connect to your broker. Be sure to capture the broker host name, for example:


You should also modify the Security Group for the broker by clicking on its Security Group ID. Click the Edit button and then click Add Rule to allow inbound traffic on port 8162 for your IP address.

Deploying and scheduling the Lambda function

To simplify the deployment of this example, I’ve provided an AWS Serverless Application Model (SAM) template that deploys the sample function and DynamoDB table, and schedules the function to be invoked every five minutes. Detailed instructions can be found with sample code on GitHub in the amazonmq-invoke-aws-lambda repository, with sample code. I discuss a few key aspects in this post.

First, SAM makes it easy to deploy and schedule invocation of our function:

	Type: AWS::Serverless::Function
		CodeUri: subscriber/
		Handler: index.handler
		Runtime: nodejs6.10
		Role: !GetAtt SubscriberFunctionRole.Arn
		Timeout: 15
				HOST: !Ref AmazonMQHost
				LOGIN: !Ref AmazonMQLogin
				PASSWORD: !Ref AmazonMQPassword
				QUEUE_NAME: !Ref AmazonMQQueueName
				WORKER_FUNCTIOn: !Ref WorkerFunction
				Type: Schedule
					Schedule: rate(5 minutes)

Type: AWS::Serverless::Function
		CodeUri: worker/
		Handler: index.handler
		Runtime: nodejs6.10
Role: !GetAtt WorkerFunctionRole.Arn
				TABLE_NAME: !Ref MessagesTable

In the code, you include the URI, user name, and password for your newly created Amazon MQ broker. These allow the function to poll the broker for new messages on the sample queue.

The sample Lambda function is written in Node.js, but clients exist for a number of programming languages.

stomp.connect(options, (error, client) => {
	if (error) { /* do something */ }

	let headers = {
		destination: ‘/queue/SAMPLE_QUEUE’,
		ack: ‘auto’

	client.subscribe(headers, (error, message) => {
		if (error) { /* do something */ }

		message.readString(‘utf-8’, (error, body) => {
			if (error) { /* do something */ }

			let params = {
				FunctionName: MyWorkerFunction,
				Payload: JSON.stringify({
					message: body,
					timestamp: Date.now()

			let lambda = new AWS.Lambda()
			lambda.invoke(params, (error, data) => {
				if (error) { /* do something */ }

Sending a sample message

For the purpose of this example, use the Amazon MQ console to send a test message. Navigate to the details page for your broker.

About midway down the page, choose ActiveMQ Web Console. Next, choose Manage ActiveMQ Broker to launch the admin console. When you are prompted for a user name and password, use the credentials created earlier.

At the top of the page, choose Send. From here, you can send a sample message from the broker to subscribers. For this example, this is how you generate traffic to test the end-to-end system. Be sure to set the Destination value to “SAMPLE_QUEUE.” The message body can contain any text. Choose Send.

You now have a Lambda function polling for messages on the broker. To verify that your function is working, you can confirm in the DynamoDB console that the message was successfully received and processed by the sample Lambda function.

First, choose Tables on the left and select the table name “amazonmq-messages” in the middle section. With the table detail in view, choose Items. If the function was successful, you’ll find a new entry similar to the following:

If there is no message in DynamoDB, check again in a few minutes or review the CloudWatch Logs group for Lambda functions that contain debug messages.

Alternative approaches

Beyond the approach described here, you may consider other approaches as well. For example, you could use an intermediary system such as Apache Flume to pass messages from the broker to Lambda or deploy Apache Camel to trigger Lambda via a POST to API Gateway. There are trade-offs to each of these approaches. My goal in using CloudWatch Events was to introduce an easily repeatable pattern familiar to many Lambda developers.


I hope that you have found this example of how to integrate AWS Lambda with Amazon MQ useful. If you have expertise or legacy systems that leverage APIs such as JMS, you may find this useful as you incorporate serverless concepts in your enterprise architectures.

To learn more, see the Amazon MQ website and Developer Guide. You can try Amazon MQ for free with the AWS Free Tier, which includes up to 750 hours of a single-instance mq.t2.micro broker and up to 1 GB of storage per month for one year.

AWS Hot Startups – April 2017

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/aws-hot-startups-april-2017/

Spring is here, the flowers are blooming and Tina Barr is back with more great startups for you to check out!


Welcome back to another month of hot AWS-powered startups! Today we have three exciting startups:

  • Beekeeper – simplifying employee communication in the workplace.
  • Betterment – making investing easier for everyone.
  • ClearSlide – a leading sales engagement platform.

Be sure to check out our March hot startups in case you missed them.

Beekeeper (Zurich, Switzerland)
Beekeeper logoFlavio Pfaffhauser and Christian Grossmann, both graduates of ETH Zurich, were passionate about building a technology that would connect and bring people together. What started as a student’s social community soon turned into Beekeeper – a communication platform for the workplace that allows employees to interact wherever they are. As Flavio and Christian learned how to build a social platform that engaged people properly, businesses began requesting a platform that could be adapted to their specific processes and needs. The platform started with the concept of helping people feel as if they are sitting right next to each other, whether they’re at a desk or in the field. Founded in 2012, Beekeeper is focused on improving information sharing, communication and peer collaboration, and the company strongly believes that listening to employees is crucial for organizations.

The “Mobile First, Desktop Friendly” platform has a simple and intuitive interface that easily integrates multiple operating systems into one ecosystem. The interface can be styled and customized to match a company’s brand and identity. Employees can connect with their colleagues anytime and anywhere with private and group chats, video and file sharing, and feedback surveys. With Beekeeper’s analytical dashboard leadership teams can identify trending topics of discussion and track employee engagement and app usage in real-time. Beekeeper is currently connecting users in 137 countries across industries including hospitality, construction, transportation, and more.

Beekeeper likes using AWS because it allows their engineers to focus on the things that really matter; solving customer issues. The company builds its infrastructure using services like Amazon EC2, Amazon S3, and Amazon RDS, all of which allow the technical teams to offload administrative tasks. Amazon Elastic Transcoder and Amazon QuickSight are used to build analytical dashboards and Amazon Redshift for data warehousing.

Check out the Beekeeper blog to keep up with their latest news!

Betterment (New York, NY)
Betterment logo
Betterment is on a mission to make investing easier and more accessible for everyone, no matter their financial goal. In 2008, Jon Stein founded Betterment with the intent to reinvent the industry and save future investors from making the same common mistakes he had been making. At that time, most people only had a couple of options when it came to investing their money – either do it yourself or hire another person to do it for you. Unfortunately, financial advisors are sometimes paid to recommend certain investments even if it’s not what is best for their clients. Betterment only chooses investments that are in their customer’s best interest and align with their financial goals. Today, they are the largest, independent online investment advisor managing more than $8 billion in assets for over 240,000 customers.

Betterment uses technology to make investing easier and more efficient, while also helping to increase after-tax returns. They offer a wide range of financial planning services that are personalized to their customer’s life goals. To start an investment plan, customers can input their age, retirement status, and annual income and Betterment will recommend how much money to invest and which type of account is the right choice. They will invest and manage it in a way that many traditional investment services can’t at a lower cost.

The engineers at Betterment are constantly working to build industry-changing technology as quickly as possible to help customers maximize their money. AWS gives Betterment the flexibility to easily provision infrastructure and offload functions to various services that once required entire teams to manage. When they first started in the cloud, Betterment was using standard implementations of Amazon EC2, Amazon RDS, and Amazon S3. Since they’ve gone all in with AWS, they have been leveraging services like Amazon Redshift, AWS Lambda, AWS Database Migration Service, Amazon Kinesis, Amazon DynamoDB, and more. Today, they are using over 20 AWS services to develop, test, and deploy features and enhancements on a daily basis.

Learn more about Betterment here.

ClearSlide (San Francisco, CA)
ClearSlide is one of today’s leading sales engagement platforms, offering a complete and integrated tool that makes every customer interaction successful. Since their founding in 2009, ClearSlide has looked for ways to improve customer experiences and have developed numerous enablement tools for sales leaders and teams, marketing, customer support teams, and more. The platform puts content, communication channels, and insights at their customer’s fingertips to help drive better decisions and manage opportunities. ClearSlide serves thousands of companies including Comcast, the Sacramento Kings, The Economist, and so far their customers have generated over 750 million minutes of engagement!

ClearSlide offers a solution for all parts of the sales process. For sales leaders, ClearSlide provides engagement dashboards to improve deal visibility, coaching, and sales forecast accuracy. For marketing and sales enablement teams, they guide sellers to the right content, at the right time, in the right context, and provide insight to maximize content ROI. For sales reps, ClearSlide integrates communications, content, and analytics in a single platform experience. Communications can be made across email, in-person or online meetings, web, or social. Today, ClearSlide customers report a 10-20% increase in closed deals, 25% decrease in onboarding time for new reps, and a 50-80% reduction in selling costs.

ClearSlide uses a range of AWS services, but Amazon EC2 and Amazon RDS have made the biggest impact on their business. EC2 enables them to easily scale compute capacity, which is critical for a fast-growing startup. It also provides consistency during deployment – from development and integration to staging and production. RDS reduces overhead and allows ClearSlide to scale their database infrastructure. Since AWS takes care of time-consuming database management tasks, ClearSlide sees a reduction in operations costs and can focus on being more strategic with their customers.

Watch this video to learn how LiveIntent reduced sales cycles by 22% using ClearSlide. Get all the latest updates by following them on Twitter!

Thanks for checking out another month of awesome AWS-powered startups!



AWS Week in Review – September 27, 2016

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-week-in-review-september-27-2016/

Fourteen (14) external and internal contributors worked together to create this edition of the AWS Week in Review. If you would like to join the party (with the possibility of a free lunch at re:Invent), please visit the AWS Week in Review on GitHub.


September 26


September 27


September 28


September 29


September 30


October 1


October 2

New & Notable Open Source

  • dynamodb-continuous-backup sets up continuous backup automation for DynamoDB.
  • lambda-billing uses NodeJS to automate billing to AWS tagged projects, producing PDF invoices.
  • vyos-based-vpc-wan is a complete Packer + CloudFormation + Troposphere powered setup of AMIs to run VyOS IPSec tunnels across multiple AWS VPCs, using BGP-4 for dynamic routing.
  • s3encrypt is a utility that encrypts and decrypts files in S3 with KMS keys.
  • lambda-uploader helps to package and upload Lambda functions to AWS.
  • AWS-Architect helps to deploy microservices to Lambda and API Gateway.
  • awsgi is an WSGI gateway for API Gateway and Lambda proxy integration.
  • rusoto is an AWS SDK for Rust.
  • EBS_Scripts contains some EBS tricks and triads.
  • landsat-on-aws is a web application that uses Amazon S3, Amazon API Gateway, and AWS Lambda to create an infinitely scalable interface to navigate Landsat satellite data.

New SlideShare Presentations

New AWS Marketplace Listings

Upcoming Events

Help Wanted

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

AWS Week in Review – September 19, 2016

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-week-in-review-september-19-2016/

Eighteen (18) external and internal contributors worked together to create this edition of the AWS Week in Review. If you would like to join the party (with the possibility of a free lunch at re:Invent), please visit the AWS Week in Review on GitHub.


September 19


September 20


September 21


September 22


September 23


September 24


September 25

New & Notable Open Source

  • ecs-refarch-cloudformation is reference architecture for deploying Microservices with Amazon ECS, AWS CloudFormation (YAML), and an Application Load Balancer.
  • rclone syncs files and directories to and from S3 and many other cloud storage providers.
  • Syncany is an open source cloud storage and filesharing application.
  • chalice-transmogrify is an AWS Lambda Python Microservice that transforms arbitrary XML/RSS to JSON.
  • amp-validator is a serverless AMP HTML Validator Microservice for AWS Lambda.
  • ecs-pilot is a simple tool for managing AWS ECS.
  • vman is an object version manager for AWS S3 buckets.
  • aws-codedeploy-linux is a demo of how to use CodeDeploy and CodePipeline with AWS.
  • autospotting is a tool for automatically replacing EC2 instances in AWS AutoScaling groups with compatible instances requested on the EC2 Spot Market.
  • shep is a framework for building APIs using AWS API Gateway and Lambda.

New SlideShare Presentations

New Customer Success Stories

  • NetSeer significantly reduces costs, improves the reliability of its real-time ad-bidding cluster, and delivers 100-millisecond response times using AWS. The company offers online solutions that help advertisers and publishers match search queries and web content to relevant ads. NetSeer runs its bidding cluster on AWS, taking advantage of Amazon EC2 Spot Fleet Instances.
  • New York Public Library revamped its fractured IT environment—which had older technology and legacy computing—to a modernized platform on AWS. The New York Public Library has been a provider of free books, information, ideas, and education for more than 17 million patrons a year. Using Amazon EC2, Elastic Load Balancer, Amazon RDS and Auto Scaling, NYPL is able to build scalable, repeatable systems quickly at a fraction of the cost.
  • MakerBot uses AWS to understand what its customers need, and to go to market faster with new and innovative products. MakerBot is a desktop 3-D printing company with more than 100 thousand customers using its 3-D printers. MakerBot uses Matillion ETL for Amazon Redshift to process data from a variety of sources in a fast and cost-effective way.
  • University of Maryland, College Park uses the AWS cloud to create a stable, secure and modern technical environment for its students and staff while ensuring compliance. The University of Maryland is a public research university located in the city of College Park, Maryland, and is the flagship institution of the University System of Maryland. The university uses AWS to migrate all of their datacenters to the cloud, as well as Amazon WorkSpaces to give students access to software anytime, anywhere and with any device.

Upcoming Events

Help Wanted

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

Amazon Elastic Transcoder Update – Support for MPEG-DASH

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-elastic-transcoder-update-support-for-mpeg-dash/

Amazon Elastic Transcoder converts media files (audio and video) from one format to another. The service is robust, scalable, cost-effective, and easy to use. You simply create a processing pipeline (pointing to a pair of S3 buckets for input and output in the process), and then create transcoding jobs. Each job reads a specific file from the input bucket, transcodes it to the desired format(s) as specified in the job, and then writes the output to the output bucket. You pay for only what you transcode, with price points for Standard Definition (SD) video, High Definition (HD) video, and audio. We launched the service with support for an initial set of transcoding presets (combinations of output formats and relevant settings). Over time, in response to customer demand and changes in encoding technologies, we have added additional presets and formats. For example, we added support for the VP9 Codec earlier this year.

Support for MPEG-DASH
Today we are adding support for transcoding to the MPEG-DASH format. This International Standard format supports high-quality audio and video streaming from HTTP servers, and has the ability to adapt to changes in available network throughput using a technique known as adaptive streaming. It was designed to work well across multiple platforms and at multiple bitrates, simplifying the transcoding process and sidestepping the need to create output in multiple formats.

During the MPEG-DASH transcoding process, the content is transcoded into segmented outputs at the different bitrates and a playlist is created that references these outputs. The client (most often a video player) downloads the playlist to initiate playback. Then it monitors the effective network bandwidth and latency, requests video segments as needed. If network conditions change during the playback process, the player will take action, upshifting or downshifting as needed.

You can serve up the transcoded content directly from S3 or you can use Amazon CloudFront to get the content even closer to your users. Either way, you need to create a CORS policy that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">

If you are using CloudFront, you need to enable the OPTIONS method, and allow it to be cached:

You also need to add three headers to the whitelist for the distribution:

Transcoding With MPEG-DASH
To make use of the adaptive bitrate feature of MPEG-DASH, you create a single transcoding job and specify multiple outputs, each with a different preset. Here are your choices (4 for video and 1 for audio):

When you use this format, you also need to choose a suitable segment duration (in seconds). A shorter duration produces a larger number of smaller segments and allows the client to adapt to changes more quickly.

You can create a single playlist that contains all of the bitrates, or you can choose the bitrates that are most appropriate for your customers and your content. You can also create your own presets, using an existing one as a starting point:

Available Now
MPEG-DASH support is available now in all Regions where Amazon Elastic Transcoder is available. There is no extra charge for this use of this format (see Elastic Transcoder Pricing to learn more).