When taking photos, most of us simply like to press the shutter button on our cameras and phones so that a viewable image is produced almost instantaneously, usually encoded in the well-known JPEG format. However, there are some applications where a little more control over the production of that JPEG is desirable. For instance, you may want more or less de-noising, or you may feel that the colours are not being rendered quite right.
This is where raw (sometimes RAW) files come in. A raw image in this context is a direct capture of the pixels output from the image sensor, with no additional processing. Normally this is in a relatively standard format known as a Bayer image, named after Bryce Bayer who pioneered the technique back in 1974 while working for Kodak. The idea is not to let the on-board hardware ISP (Image Signal Processor) turn the raw Bayer image into a viewable picture, but instead to do it offline with an additional piece of software, often referred to as a raw converter.
A Bayer image records only one colour at each pixel location, arranged in a repeating 2×2 pattern of red, green and blue filters, with green sampled twice as often as red or blue.
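To make this concrete, here is a minimal numpy sketch that splits a Bayer mosaic into its four colour planes. The RGGB channel order is assumed purely for illustration; the actual ordering depends on the sensor.
import numpy as np

def split_bayer_rggb(raw):
    # Split a Bayer mosaic into its four colour planes.
    # An RGGB layout is assumed here purely for illustration; the real
    # channel order depends on the sensor.
    r  = raw[0::2, 0::2]   # red photo-sites
    g1 = raw[0::2, 1::2]   # green photo-sites on the red rows
    g2 = raw[1::2, 0::2]   # green photo-sites on the blue rows
    b  = raw[1::2, 1::2]   # blue photo-sites
    return r, g1, g2, b

# A synthetic 4x4 mosaic stands in for real sensor data.
mosaic = np.arange(16).reshape(4, 4)
print([plane.shape for plane in split_bayer_rggb(mosaic)])  # four 2x2 planes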
The raw image is sometimes likened to the old photographic negative, and whilst many camera vendors use their own proprietary formats, the most portable form of raw file is the Digital Negative (or DNG) format, defined by Adobe in 2004. The question at hand is how to obtain DNG files from Raspberry Pi, in such a way that we can process them using our favourite raw converters.
Obtaining a raw image from Raspberry Pi
Many readers will be familiar with the raspistill application, which captures JPEG images from the attached camera. raspistill includes the -r option, which appends all the raw image data to the end of the JPEG file. JPEG viewers will still display the file as normal but ignore the (many megabytes of) raw data tacked on the end. Such a “JPEG+RAW” file can be captured using the terminal command:
raspistill -r -o image.jpg
Unfortunately this JPEG+RAW format is merely what comes out of the camera stack and is not supported by any raw converters. So to make use of it we will have to convert it into a DNG file.
PyDNG
This Python utility converts the Raspberry Pi’s native JPEG+RAW files into DNGs. PyDNG can be installed from github.com/schoolpost/PyDNG, where more complete instructions are available. In brief, we need to perform the following steps:
git clone https://github.com/schoolpost/PyDNG
cd PyDNG
pip3 install src/. # note that PyDNG requires Python3
PyDNG can be used as part of larger Python scripts, or it can be run stand-alone. Continuing the raspistill example from before, we can enter in a terminal window:
python3 examples/utility.py image.jpg
The resulting DNG file can be processed by a variety of raw converters. Some are free (such as RawTherapee or dcraw, though the latter is no longer officially developed or supported), and there are many well-known proprietary options (Adobe Camera Raw or Lightroom, for instance). Perhaps users will post in the comments any that they feel have given them good results.
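For batch work, the same utility can be wrapped in a larger Python script, as mentioned above. Below is a minimal sketch that simply shells out to PyDNG’s command-line utility for every JPEG+RAW capture in the current directory; the path to the cloned repository is an assumption and should be adjusted to your setup.
import subprocess
from pathlib import Path

# Path to the cloned PyDNG repository -- an assumption; adjust as needed.
UTILITY = Path.home() / "PyDNG" / "examples" / "utility.py"

# Convert every JPEG+RAW capture in the current directory to a DNG by
# invoking the same command shown above, once per file.
for jpeg in sorted(Path(".").glob("*.jpg")):
    print(f"Converting {jpeg} ...")
    subprocess.run(["python3", str(UTILITY), str(jpeg)], check=True)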
White balancing and colour matrices
Now, one of the bugbears of processing Raspberry Pi raw files up to this point has been the problem of getting sensible colours. Previously, the images were rendered with a sickly green cast, simply because no colour balancing was being done and green is normally the most sensitive colour channel. In fact it’s even worse than this, as the RGB values in the raw image merely reflect the sensitivity of the sensor’s photo-sites to different wavelengths, and do not a priori have more than a general correlation with the colours as perceived by our own eyes. This is where we need white balancing and colour matrices.
Correct white balance multipliers are required if neutral parts of the scene are to look, well, neutral. We can use raspistill’s guesstimate of them, found in the JPEG+RAW file (or you can measure your own on a neutral part of the scene, like a grey card). Matrices and look-up tables are then required to convert colour from ‘camera’ space to the final colour space of choice, most commonly sRGB or Adobe RGB.
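To illustrate the arithmetic involved, the sketch below applies white balance gains followed by a 3×3 colour matrix to a demosaiced camera-space RGB image. The gain values and matrix entries are placeholders chosen only so the example runs; they are not the calibrated numbers shipped with PyDNG.
import numpy as np

def camera_rgb_to_srgb(rgb, wb_gains, ccm):
    # rgb      : HxWx3 demosaiced camera-space image, values in [0, 1]
    # wb_gains : per-channel multipliers (placeholder values below)
    # ccm      : camera-to-sRGB colour matrix (placeholder values below)
    balanced = rgb * wb_gains                 # neutral greys become grey
    srgb_linear = balanced @ ccm.T            # rotate into sRGB primaries
    return np.clip(srgb_linear, 0.0, 1.0)     # clip out-of-gamut values

# Placeholder numbers purely for illustration -- not a real calibration.
wb_gains = np.array([1.9, 1.0, 1.5])
ccm = np.array([[ 1.60, -0.45, -0.15],
                [-0.25,  1.45, -0.20],
                [ 0.05, -0.55,  1.50]])

image = np.random.rand(4, 4, 3)               # stand-in for demosaiced data
out = camera_rgb_to_srgb(image, wb_gains, ccm)
Each row of the placeholder matrix sums to 1, so neutral (R = G = B) inputs remain neutral after the matrix is applied, which is the usual sanity check for a camera-to-output colour matrix.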
My thanks go to forum contributors Jack Hogan for measuring these colour matrices, and to Csaba Nagy for implementing them in the PyDNG tool. The results speak for themselves.
Results
Previous attempts at raw conversion are on the left; the results using the updated PyDNG are on the right.
Images 2 and 3 courtesy of Csaba Nagy; images 4 and 5 courtesy of Jack Hogan
DCP files
For those familiar with DNG files, we include links to DCP (DNG Camera Profile) files (warning: binary format). You can try different ones out in raw converters, and we would encourage users to experiment, to perhaps create their own, and to share their results!
This is a basic colour profile baked into PyDNG, and is the one shown in the results above. It’s sufficiently small that we can view it as a JSON file.
Note, however, that these files come with a few caveats. Specifically:
The calibration is only for a single Raspberry Pi High Quality Camera rather than a known average or “typical” module.
The illuminants used for the calibration are merely the ones that we had to hand — the D65 lamp in particular appears to be some way off.
The calibration only really works when the colour temperature lies between, or not too far from, the two calibration illuminants, approximately 2900K to 6000K in our case.
So there remains room for improvement. Nevertheless, results across a number of modules have shown these parameters to be a significant step forward.
Acknowledgements
My thanks again to Jack Hogan for performing the colour matrix calibration with DCamProf, and to Csaba Nagy for adding these new features to PyDNG.
Further reading
There are many resources explaining how a raw (Bayer) image is converted into a viewable RGB or YUV image, among them Jack’s blog post.
To understand the role of the colour matrices in a DNG file, please refer to the DNG specification. Chapter 6 in particular describes how they are used.
By Aditya Mavlankar, Jan De Cock¹, Cyril Concolato, Kyle Swanson, Anush Moorthy and Anne Aaron
TL;DR
We need an alternative to JPEG that a) is widely supported, b) has better compression efficiency and c) has a wider feature set. We believe AV1 Image File Format (AVIF) has the potential to fill that need. Using the framework we have open sourced, AVIF compression efficiency can be seen at work and compared against a whole range of image codecs that came before it.
Image compression at Netflix
Netflix is enjoyed by its members on a variety of devices — smart TVs, phones, tablets, personal computers and streaming devices connected to TV screens. The user interface (UI), intended for browsing the catalog and serving up recommendations, is rich in images and graphics across all device categories. Shown below are screenshots of the Netflix app on iOS as an example.
Screenshots showing the Netflix UI on iOS (iPhone 7) at the time of this writing.
Image assets might be based on still frames from the title, special on-set photography or a combination thereof. Assets could also stem from art generated during the production of the feature.
As seen above, image assets typically have gradients, text and graphics, for example the Netflix symbol or other title-specific symbols such as “The Witcher” insignia, composited on the image. Such special treatments lead to a variety of peculiarities which do not necessarily arise in natural images. Hard edges, including those with chroma differences on either side of the edge, are common and require good detail preservation, since they typically occur at salient locations and convey important information. Further, there is typically a character or a face in salient locations with a smooth, uncluttered background. Again, preservation of detail on the character’s face is of primary importance. In some cases, the background is textured and complex, exhibiting a wide range of frequencies.
After an image asset is ingested, the compression pipeline kicks in and prepares compressed image assets meant for delivery to devices. The goal is to have the compressed image look as close to the original as possible while reducing the number of bytes required. Given the image-heavy nature of the UI, compressing these images well is of primary importance. This involves picking, among other things, the right combination of color subsampling, codec, encoder parameters and encoding resolution.
Compressed image assets destined for various client devices and various spaces in the UI are created from corresponding “pristine” image sources.
Let us take color subsampling as an example. Choosing 420 subsampling, over the original 444 format, halves the number of samples (counting across all 3 color planes) that need to be encoded while relying on the fact that the human visual system is more sensitive to luma than chroma. However, 420 subsampling can introduce color bleeding and jaggies in locations with color transitions. Below we toggle between the original source in 444 and the source converted to 420 subsampling. The toggling shows loss introduced just by the color subsampling, even before the codec enters the picture.
Toggling between the original source image with 444 subsampling and after converting to 420 subsampling. Showing the top part of the artwork only. The reader may zoom in on the webpage to view jaggies around the Netflix logo appearing due to 420 subsampling.
Nevertheless, there are source images where the loss due to 420 subsampling is not obvious to human perception and in such cases it can be advantageous to use 420 subsampling. Ideally, a codec should be able to support both subsampling formats. However, there are a few codecs that only support 420 subsampling — webp, discussed below, is one such popular codec.
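To make the sample-count argument concrete, here is a minimal sketch of 444-to-420 conversion that simply averages each 2×2 block of the chroma planes; real converters use proper resampling filters and defined chroma siting, so this is illustrative only.
import numpy as np

def yuv444_to_420(y, u, v):
    # Convert 444 to 420 by averaging each 2x2 block of the chroma planes.
    # Plain block averaging is shown only to make the sample count concrete.
    def down2x2(plane):
        h, w = plane.shape
        return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, down2x2(u), down2x2(v)

y = np.random.rand(8, 8)
u = np.random.rand(8, 8)
v = np.random.rand(8, 8)
y420, u420, v420 = yuv444_to_420(y, u, v)
# 444: 3 * 64 = 192 samples; 420: 64 + 16 + 16 = 96 samples -- exactly half.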
Brief overview of image coding formats
The JPEG format was introduced in 1992 and is widely popular. It supports various color subsamplings including 420, 422 and 444. JPEG can ingest RGB data and transform it to a luma-chroma representation before performing lossy compression. The discrete cosine transform (DCT) is employed as the decorrelating transform on 8×8 blocks of samples. This is followed by quantization and entropy coding. However, JPEG is restricted to 8-bit imagery and lacks support for alpha channel. The more recent JPEG-XT standard extends JPEG to higher bit-depths, support for alpha channel, lossless compression and more in a backwards compatible way.
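To make the transform-and-quantize stage concrete, below is a small sketch of a JPEG-style round trip on a single 8×8 block: level shift, 2-D DCT, quantization with the luminance table from Annex K of the JPEG standard, and the inverse steps. Entropy coding is omitted, so this illustrates where the loss is introduced rather than being a JPEG encoder.
import numpy as np
from scipy.fft import dctn, idctn

# Luminance quantization table from Annex K of the JPEG standard.
Q_LUMA = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def jpeg_block_roundtrip(block, q=Q_LUMA):
    # Forward DCT, quantize, dequantize, inverse DCT for one 8x8 block.
    coeffs = dctn(block - 128.0, norm="ortho")    # level shift + 2-D DCT
    quantized = np.round(coeffs / q)              # the lossy step
    return idctn(quantized * q, norm="ortho") + 128.0

block = np.random.randint(0, 256, (8, 8)).astype(float)
recon = jpeg_block_roundtrip(block)
print(np.abs(block - recon).max())                # quantization error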
The JPEG 2000 format, based on the discrete wavelet transform (DWT), was introduced as a successor to JPEG in the year 2000. It brought a whole range of additional features such as spatial scalability, region of interest coding, range of supported bit-depths, flexible number of color planes, lossless coding, etc. With the motion extension, it was accepted as the video coding standard for digital cinema in 2004.
The webp format was introduced by Google around 2010. Google added decoding support on Android devices and Chrome browser and also released libraries that developers could add to their apps on other platforms, for example iOS. Webp is based on intra-frame coding from the VP8 video coding format. Webp does not have all the flexibilities of JPEG 2000. It does, however, support lossless coding and also a lossless alpha channel, making it a more efficient and faster alternative to PNG in certain situations.
High-Efficiency Video Coding (HEVC) is the successor of H.264, a.k.a. Advanced Video Coding (AVC) format. HEVC intra-frame coding can be encapsulated in the High-Efficiency Image File Format (HEIF). This format is most notably used by Apple devices to store recorded imagery.
Similarly, AV1 Image File Format (AVIF) allows encapsulating AV1 intra-frame coded content, thus taking advantage of excellent compression gains achieved by AV1 over predecessors. We touch upon some appealing technical features of AVIF in the next section.
The JPEG committee is pursuing a coding format called JPEG XL which includes features aimed at helping the transition from legacy JPEG format. Existing JPEG files can be losslessly transcoded to JPEG XL while achieving file size reduction. Also included is a lightweight conversion process back to JPEG format in order to serve clients that only support legacy JPEG.
AVIF technical features
Although modern video codecs were developed primarily with video in mind, the intra-frame coding tools in a video codec are not significantly different from image compression tooling. Given the huge compression gains of modern video codecs, they are compelling as image coding formats. There is a potential benefit in reusing the hardware in place for video compression/decompression. Image decoding in hardware may not be a primary motivator, given the peculiarities of OS-dependent UI composition, and the architectural implications of moving uncompressed image pixels around.
In the area of image coding formats, the Moving Picture Experts Group (MPEG) has standardized a codec-agnostic and generic image container format: the ISO/IEC 23008-12 standard (a.k.a. HEIF). HEIF has been used to store most notably HEVC-encoded images (in its HEIC variant) but is also capable of storing AVC-encoded images or even JPEG-encoded images. The Alliance for Open Media (AOM) has recently extended this format to specify the storage of AV1-encoded images in its AVIF format. The base HEIF format offers typical features expected from an image format such as: support for any image codec, ability to use a lossy or a lossless mode for compression, support for varied subsampling and bit-depths, etc. Furthermore, the format also allows the storage of a series of animated frames (offering an efficient and long-awaited alternative to animated GIFs), and the ability to specify an alpha channel (which sees tremendous use in UIs). Further, since the HEIF format borrows learnings from next-generation video compression, the format allows for preserving metadata such as color gamut and high dynamic range (HDR) information.
Image compression comparison framework
We have open sourced a Docker based framework for comparing various image codecs. Salient features include:
Encode orchestration (with parallelization) and insights generation using Python 3
Easy reproducibility of results and
Easy control of target quality range(s).
Since the framework allows one to specify a target quality (using a certain metric) for the target codec(s), and stores these results in a local database, one can easily utilize the Bjontegaard-Delta (BD) rate to compare across codecs. The target points can be restricted to a useful or meaningful quality range, instead of blindly sweeping across the encoder parameter range (such as a quality factor) with fixed parameter values and landing on arbitrary quality points.
As an example, below are the calls that would produce compressed images for the choice of codecs at the specified SSIM and VMAF values, with the desired tolerance in target quality:
For the various codecs and configurations involved in the ensuing comparison, the reader can view the actual command lines in the shared repository. We have attempted to get the best compression efficiency out of every codec / configuration compared here. The reader is free to experiment with changes to encoding commands within the framework. Furthermore, newer versions of the respective software implementations might have been released since the results below were gathered. For example, a newer version of the Kakadu demo apps is available than the one in the framework snapshot on github that was used to gather these results.
Visual examples
This is the section where we get to admire the work of the compression community over the last 3 decades by looking at visual examples comparing JPEG and the state-of-the-art.
The encoded images shown below are illustrative and meant to compare visual quality at various target bitrates. Please note that the quality of the illustrative encodes is not representative of the high quality bar that Netflix employs for streaming image assets on the actual service, and is meant to be purely educative in nature.
Shown below is one original source image from the Kodak dataset and the corresponding result with JPEG 444 @ 20,429 bytes and with AVIF 444 @ 19,788 bytes. The JPEG encode shows very obvious blocking artifacts in the sky, in the pond as well as on the roof. The AVIF encode is much better, with less blocking artifacts, although there is some blurriness and loss of texture on the roof. It is still a remarkable result, given the compression factor of around 59x (original image has dimensions 768×512, thus requiring 768x512x3 bytes compared to the 20k bytes of the compressed image).
An original image from the Kodak dataset; JPEG 444 @ 20,429 bytes; AVIF 444 @ 19,788 bytes
For the same source, shown below is the comparison of JPEG 444 @ 40,276 bytes and AVIF 444 @ 39,819 bytes. The JPEG encode still has visible blocking artifacts in the sky, along with ringing around the roof edges and chroma bleeding in several locations. The AVIF image however, is now comparable to the original, with a compression factor of 29x.
JPEG 444 @ 40,276 bytes; AVIF 444 @ 39,819 bytes
Shown below is another original source image from the Kodak dataset and the corresponding result with JPEG 444 @ 13,939 bytes and with AVIF 444 @ 4,176 bytes. The JPEG encode shows blocking artifacts around most edges, particularly around the slanting edge, as well as color distortions. The AVIF encode looks “cleaner” even though it is one-third the size of the JPEG encode. It is not a perfect rendition of the original, but with a compression factor of 282x, this is commendable.
Another original source image from the Kodak dataset; JPEG 444 @ 13,939 bytes; AVIF 444 @ 4,176 bytes
Shown below are results for the same image with slightly higher bit-budget; JPEG 444 @ 19,787 bytes versus AVIF 444 @ 20,120 bytes. The JPEG encode still shows blocking artifacts around the slanting edge whereas the AVIF encode looks nearly identical to the source.
JPEG 444 @ 19,787 bytes; AVIF 444 @ 20,120 bytes
Shown below is an original image from the Netflix (internal) 1142×1600 resolution “boxshots-1” dataset, followed by JPEG 444 @ 69,445 bytes and AVIF 444 @ 40,811 bytes. Severe banding and blocking artifacts along with color distortions are visible in the JPEG encode; these are far less apparent in the AVIF encode, which is actually 29 kB smaller.
An original source image from the Netflix (internal) boxshots-1 dataset; JPEG 444 @ 69,445 bytes; AVIF 444 @ 40,811 bytes
Shown below are results for the same image with slightly increased bit-budget. JPEG 444 @ 80,101 bytes versus AVIF 444 @ 85,162 bytes. The banding and blocking is still visible in the JPEG encode whereas the AVIF encode looks very close to the original.
JPEG 444 @ 80,101 bytes; AVIF 444 @ 85,162 bytes
Shown below is another source image from the same boxshots-1 dataset along with JPEG 444 @ 81,745 bytes versus AVIF 444 @ 76,087 bytes. Blocking artifacts overall and mosquito artifacts around text can be seen in the JPEG encode.
Another original source image from the Netflix (internal) boxshots-1 dataset; JPEG 444 @ 81,745 bytes; AVIF 444 @ 76,087 bytes
Shown below is another source image from the boxshots-1 dataset along with JPEG 444 @ 80,562 bytes versus AVIF 444 @ 80,432 bytes. There is visible banding, blocking and mosquito artifacts in the JPEG encode whereas the AVIF encode looks very close to the original source.
Another original source image from the Netflix (internal) boxshots-1 dataset; JPEG 444 @ 80,562 bytes; AVIF 444 @ 80,432 bytes
Overall results
Shown below are results over public datasets as well as Netflix-internal datasets. The reference codec used is JPEG from the JPEG-XT reference software, using the standard quantization matrix defined in Annex K of the JPEG standard. Following are the codecs and/or configurations tested and reported against the baseline in the form of BD rate.
The encoding resolution in these experiments is the same as the source resolution. For 420 subsampling encodes, the quality metrics were computed in 420 subsampling domain. Likewise, for 444 subsampling encodes, the quality metrics were computed in 444 subsampling domain. Along with BD rates associated with various quality metrics, such as SSIM, MS-SSIM, VIF and PSNR, we also show rate-quality plots using SSIM as the metric.
Kodak dataset; 24 images; 768×512 resolution
We have uploaded the source images in PNG format here for easy reference. We give the necessary attribution to Kodak as the source of this dataset.
Given a quality metric, for each image, we consider two separate rate-quality curves. One curve associated with the baseline (JPEG) and one curve associated with the target codec. We compare the two and compute the BD-rate which can be interpreted as the average percentage rate reduction for the same quality over the quality region being considered. A negative value implies rate reduction and hence is better compared to the baseline. As a last step, we report the arithmetic mean of BD rates over all images in the dataset. We also highlight the best performer in the tables below.
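For reference, here is a minimal sketch of the classic BD-rate calculation described above: fit a low-order polynomial of log-rate as a function of quality for each codec, integrate both fits over the overlapping quality interval, and convert the average log-rate difference into a percentage. The rate/SSIM points at the bottom are made up purely to exercise the function.
import numpy as np

def bd_rate(rates_ref, quality_ref, rates_test, quality_test):
    # Average % rate difference of the test codec vs. the reference at
    # equal quality (negative = the test codec needs fewer bits).
    log_ref, log_test = np.log10(rates_ref), np.log10(rates_test)

    # Cubic fit of log-rate as a function of the quality metric.
    p_ref = np.polyfit(quality_ref, log_ref, 3)
    p_test = np.polyfit(quality_test, log_test, 3)

    # Integrate both fits over the overlapping quality interval.
    lo = max(min(quality_ref), min(quality_test))
    hi = min(max(quality_ref), max(quality_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

    avg_diff = (int_test - int_ref) / (hi - lo)   # mean log10 rate difference
    return (10 ** avg_diff - 1) * 100             # percent rate change

# Hypothetical (rate in bytes, SSIM) points, for illustration only.
jpeg_rates, jpeg_ssim = [20000, 40000, 80000, 160000], [0.92, 0.95, 0.97, 0.985]
avif_rates, avif_ssim = [12000, 25000, 50000, 100000], [0.93, 0.96, 0.975, 0.99]
print(bd_rate(jpeg_rates, jpeg_ssim, avif_rates, avif_ssim))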
CLIC dataset; 303 images; 2048×1320 resolution
We selected a subset of images from the dataset made public as part of the workshop and challenge on learned image compression (CLIC), held in conjunction with CVPR. We have uploaded our selected 303 source images in PNG format here for easy reference with appropriate attribution to CLIC.
Billboard dataset (Netflix-internal); 223 images; 2048×1152 resolution
Billboard images generally occupy a larger canvas than the thumbnail-like boxshot images and are generally horizontal. There is room to overlay text or graphics on one of the sides, either left or right, with salient characters/scenery/art being located on the other side. An example can be seen below. The billboard source images are internal to Netflix and hence do not constitute a public dataset.
A sample original source image from the billboard dataset
Unlike billboard images, boxshot images are vertical and typically boxshot images representing different titles are displayed side-by-side in the UI. Examples from this dataset are showcased in the section above on visual examples. The boxshots-1 source images are internal to Netflix and hence do not constitute a public dataset.
The boxshots-2 dataset also has vertical box art but of lower resolution. The boxshots-2 source images are internal to Netflix and hence do not constitute a public dataset.
At this point, it might be prudent to discuss the omission of VMAF as a quality metric here. In previous work we have shown that for JPEG-like distortions and datasets similar to “boxshots” and “billboards”, VMAF has high correlation with perceived quality. However, VMAF, as of today, is a metric trained and developed to judge encoded videos rather than static images. The range of distortions associated with the range of image codecs in our tests is broader than what was considered in the VMAF development process and to that end, it may not be an accurate measure of image quality for those codecs. Further, today’s VMAF model is not designed to capture chroma artifacts and hence would be unable to distinguish between 420 and 444 subsampling, for instance, apart from other chroma artifacts (this is also true of some other measures we’ve used, but given the lack of alternatives, we’ve leaned on the side of using the most well tested and documented image quality metrics). This is not to say that VMAF is grossly inaccurate for image quality, but to say that we would not use it in our evaluation of image compression algorithms with such a wide diversity of codecs at this time. We have some exciting upcoming work to improve the accuracy of VMAF for images, across a variety of codecs, and resolutions, including chroma channels in the score. Having said that, the code in the repository computes VMAF and the reader is encouraged to try it out and see that AVIF also shines judging by VMAF as is today.
PSNR does not correlate as strongly with perceptual quality over a wide quality range as the perceptually driven metrics discussed above. However, if encodes are made with a high PSNR target then one overspends bits but can rest assured that a high PSNR score implies closeness to the original. With perceptually driven metrics, we sometimes see failure manifest in rare cases where the score is undeservingly high but visual quality is lacking.
Interesting observation regarding subsampling
In addition to the above quality calculations, we have the following observation, which reveals an encouraging trend among modern codecs. After performing an encode with 420 subsampling, let’s assume we decode the image, up-convert it to 444 subsampling and then compute various metrics by comparing against the original source in 444 format. We call this configuration “444u” to distinguish it from the above cases where “encode-subsampling” and “quality-computation-subsampling” match. Among the chosen metrics, PSNR_AVG is one that takes all 3 channels (1 luma and 2 chroma) into account. With an older codec like JPEG, the bit-budget is spread thin over more samples when encoding 444 subsampling compared to encoding 420 subsampling. This shows up as poorer PSNR_AVG when encoding JPEG with 444 subsampling compared to 420 subsampling, as shown below. However, given a rate target, with modern codecs like HEVC and AVIF, it is simply better to encode 444 subsampling over a wide range of bitrates.
It is simply better to encode with 444 subsampling with a modern codec such as AVIF judging by PSNR_AVG as the metric
We see that modern codecs yield a higher PSNR_AVG when encoding 444 subsampling than 420 subsampling over the entire region of “practical” rates, even for the other, more practical, datasets such as boxshots-1. Interestingly, with JPEG, we see a crossover; i.e., after crossing a certain rate, it starts being more efficient to encode 444 subsampling. Such crossovers are analogous to rate-quality curves crossing over when encoding over multiple spatial resolutions. Shown below are rate-quality curves for two different source images from the boxshots-1 dataset, comparing JPEG and AVIF in both 444u and 444 configurations.
It is simply better to encode with 444 subsampling with a modern codec such as AVIF, judging by PSNR_AVG as the metric
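As a reference for the “444u” configuration described above, the sketch below up-converts decoded 420 chroma back to 444 and pools the squared error over all three planes into a single PSNR figure. Pooling the MSE equally across Y, U and V is one plausible reading of PSNR_AVG; the exact definition and weighting used in the framework may differ.
import numpy as np

def upsample_chroma(plane):
    # Nearest-neighbour 2x chroma upsampling (illustrative only).
    return plane.repeat(2, axis=0).repeat(2, axis=1)

def psnr_avg_444u(orig_yuv444, decoded_yuv420, peak=255.0):
    # Pool MSE over Y, U, V after up-converting the decoded 420 chroma
    # to 444, then report a single PSNR value.
    y0, u0, v0 = orig_yuv444
    y1, u1, v1 = decoded_yuv420
    u1, v1 = upsample_chroma(u1), upsample_chroma(v1)
    errors = np.concatenate([(y0 - y1).ravel(),
                             (u0 - u1).ravel(),
                             (v0 - v1).ravel()])
    mse = np.mean(errors ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# Toy planes standing in for a real 444 source and a decoded 420 encode.
orig = tuple(np.random.rand(8, 8) * 255 for _ in range(3))
dec = (orig[0] + np.random.randn(8, 8),
       orig[1][::2, ::2] + np.random.randn(4, 4),
       orig[2][::2, ::2] + np.random.randn(4, 4))
print(psnr_avg_444u(orig, dec))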
AVIF support and next steps
Although AVIF provides superior compression efficiency, it is still at an early deployment stage. Various tools exist to produce and consume AVIF images. The Alliance for Open Media is notably developing an open-source library, called libavif, that can encode and decode AVIF images. The goal of this library is to ease the integration in software from the image community. Such integration has already started, for example, in various browsers, such as Google Chrome, and we expect to see broad support for AVIF images in the near future. Major efforts are also ongoing, in particular from the dav1d team, to make AVIF image decoding as fast as possible, including for 10-bit images. It is conceivable that we will soon test AVIF images on Android following on the heels of our recently announced AV1 video adoption efforts on Android.
The datasets used above have standard dynamic range (SDR) 8-bit imagery. At Netflix, we are also working on HDR images for the UI and are planning to use AVIF for encoding these HDR image assets. This is a continuation of our previous efforts where we experimented with JPEG 2000 as the compression format for HDR images and we are looking forward to the superior compression gains afforded by AVIF.
Acknowledgments
We would like to thank Marjan Parsa, Pierre Lemieux, Zhi Li, Christos Bampis, Andrey Norkin, Hunter Ford, Igor Okulist, Joe Drago, Benbuck Nason, Yuji Mano, Adam Rofer and Jeff Watts for all their contributions and collaborations.
¹as part of his work while he was affiliated with Netflix
Last year I suggested that you Get Ready for the AWS Serverless Application Repository and gave you a sneak peek. The Repository is designed to make it as easy as possible for you to discover, configure, and deploy serverless applications and components on AWS. It is also an ideal venue for AWS partners, enterprise customers, and independent developers to share their serverless creations.
Now Available
After a well-received public preview, the AWS Serverless Application Repository is now generally available and you can start using it today!
As a consumer, you will be able to tap in to a thriving ecosystem of serverless applications and components that will be a perfect complement to your machine learning, image processing, IoT, and general-purpose work. You can configure and consume them as-is, or you can take them apart, add features, and submit pull requests to the author.
As a publisher, you can publish your contribution in the Serverless Application Repository with ease. You simply enter a name and a description, choose some labels to increase discoverability, select an appropriate open source license from a menu, and supply a README to help users get started. Then you enter a link to your existing source code repo, choose a SAM template, and designate a semantic version.
Let’s take a look at both operations…
Consuming a Serverless Application
The Serverless Application Repository is accessible from the Lambda Console. I can page through the existing applications or I can initiate a search:
A search for “todo” returns some interesting results:
I simply click on an application to learn more:
I can configure the application and deploy it right away if I am already familiar with the application:
I can expand each of the sections to learn more. The Permissions section tells me which IAM policies will be used:
And the Template section displays the SAM template that will be used to deploy the application:
I can inspect the template to learn more about the AWS resources that will be created when the template is deployed. I can also use the templates as a learning resource in preparation for creating and publishing my own application.
The License section displays the application’s license:
To deploy todo, I name the application and click Deploy:
Deployment starts immediately and is done within a minute (application deployment time will vary, depending on the number and type of resources to be created):
I can see all of my deployed applications in the Lambda Console:
There’s currently no way for a SAM template to indicate that an API Gateway function returns binary media types, so I set this up by hand and then re-deploy the API:
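One way to script that manual step is sketched below with boto3; the REST API ID and stage name are placeholders, and depending on your template you may prefer a narrower media type than the wildcard shown.
import boto3

# Placeholders -- substitute the REST API ID and stage of the deployed app.
REST_API_ID = "abcde12345"
STAGE = "Prod"

apigw = boto3.client("apigateway")

# Register a wildcard binary media type ('/' is escaped as '~1' in the
# patch path), then re-deploy the stage so the change takes effect.
apigw.update_rest_api(
    restApiId=REST_API_ID,
    patchOperations=[{"op": "add", "path": "/binaryMediaTypes/*~1*"}],
)
apigw.create_deployment(restApiId=REST_API_ID, stageName=STAGE)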
Following the directions in the Readme, I open the API Gateway Console and find the URL for the app in the API Gateway Dashboard:
I visit the URL and enter some items into my list:
Publishing a Serverless Application
Publishing applications is a breeze! I visit the Serverless App Repository page and click on Publish application to get started:
Then I assign a name to my application, enter my own name, and so forth:
I can choose from a long list of open-source friendly SPDX licenses:
I can create an initial version of my application at this point, or I can do it later. Either way, I simply provide a version number, a URL to a public repository containing my code, and a SAM template:
Available Now
The AWS Serverless Application Repository is available now and you can start using it today, paying only for the AWS resources consumed by the serverless applications that you deploy.
You can deploy applications in the US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (São Paulo) Regions. You can publish from the US East (N. Virginia) or US East (Ohio) Regions for global availability.
We can’t believe that there are just a few days left before re:Invent 2017. If you are attending this year, you’ll want to check out our Big Data sessions! The Big Data and Machine Learning categories are bigger than ever. As in previous years, you can find these sessions in various tracks, including Analytics & Big Data, Deep Learning Summit, Artificial Intelligence & Machine Learning, Architecture, and Databases.
We have great sessions from organizations and companies like Vanguard, Cox Automotive, Pinterest, Netflix, FINRA, Amtrak, AmazonFresh, Sysco Foods, Twilio, American Heart Association, Expedia, Esri, Nextdoor, and many more. All sessions are recorded and made available on YouTube. In addition, all slide decks from the sessions will be available on SlideShare.net after the conference.
This post highlights the sessions that will be presented as part of the Analytics & Big Data track, as well as relevant sessions from other tracks like Architecture, Artificial Intelligence & Machine Learning, and IoT. If you’re interested in Machine Learning sessions, don’t forget to check out our Guide to Machine Learning at re:Invent 2017.
This year’s session catalog contains the following breakout sessions.
Raju Gulabani, VP, Database, Analytics and AI at AWS, will discuss the evolution of database and analytics services in AWS, the new database and analytics services and features we launched this year, and our vision for continued innovation in this space. We are witnessing an unprecedented growth in the amount of data collected, in many different forms. Storage, management, and analysis of this data require database services that scale and perform in ways not possible before. AWS offers a collection of database and other data services—including Amazon Aurora, Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon ElastiCache, Amazon Kinesis, and Amazon EMR—to process, store, manage, and analyze data. In this session, we provide an overview of AWS database and analytics services and discuss how customers are using these services today.
Deep dive customer use cases
ABD401 – How Netflix Monitors Applications in Near Real-Time with Amazon Kinesis Thousands of services work in concert to deliver millions of hours of video streams to Netflix customers every day. These applications vary in size, function, and technology, but they all make use of the Netflix network to communicate. Understanding the interactions between these services is a daunting challenge both because of the sheer volume of traffic and the dynamic nature of deployments. In this session, we first discuss why Netflix chose Kinesis Streams to address these challenges at scale. We then dive deep into how Netflix uses Kinesis Streams to enrich network traffic logs and identify usage patterns in real time. Lastly, we cover how Netflix uses this system to build comprehensive dependency maps, increase network efficiency, and improve failure resiliency. From this session, you will learn how to build a real-time application monitoring system using network traffic logs and get real-time, actionable insights.
In this session, learn how Nextdoor replaced their home-grown data pipeline based on a topology of Flume nodes with a completely serverless architecture based on Kinesis and Lambda. By making these changes, they improved both the reliability of their data and the delivery times of billions of records of data to their Amazon S3–based data lake and Amazon Redshift cluster. Nextdoor is a private social networking service for neighborhoods.
ABD205 – Taking a Page Out of Ivy Tech’s Book: Using Data for Student Success Data speaks. Discover how Ivy Tech, the nation’s largest singly accredited community college, uses AWS to gather, analyze, and take action on student behavioral data for the betterment of over 3,100 students. This session outlines the process from inception to implementation across the state of Indiana and highlights how Ivy Tech’s model can be applied to your own complex business problems.
ABD207 – Leveraging AWS to Fight Financial Crime and Protect National Security Banks aren’t known to share data and collaborate with one another. But that is exactly what the Mid-Sized Bank Coalition of America (MBCA) is doing to fight digital financial crime—and protect national security. Using the AWS Cloud, the MBCA developed a shared data analytics utility that processes terabytes of non-competitive customer account, transaction, and government risk data. The intelligence produced from the data helps banks increase the efficiency of their operations, cut labor and operating costs, and reduce false positive volumes. The collective intelligence also allows greater enforcement of Anti-Money Laundering (AML) regulations by helping members detect internal risks—and identify the challenges to detecting these risks in the first place. This session demonstrates how the AWS Cloud supports the MBCA to deliver advanced data analytics, provide consistent operating models across financial institutions, reduce costs, and strengthen national security.
ABD208 – Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores New Innovation with Amazon Kinesis Firehose In this session, learn how Cox Automotive is using Splunk Cloud for real time visibility into its AWS and hybrid environments to achieve near instantaneous MTTI, reduce auction incidents by 90%, and proactively predict outages. We also introduce a highly anticipated capability that allows you to ingest, transform, and analyze data in real time using Splunk and Amazon Kinesis Firehose to gain valuable insights from your cloud resources. It’s now quicker and easier than ever to gain access to analytics-driven infrastructure monitoring using Splunk Enterprise & Splunk Cloud.
ABD209 – Accelerating the Speed of Innovation with a Data Sciences Data & Analytics Hub at Takeda Historically, silos of data, analytics, and processes across functions, stages of development, and geography created a barrier to R&D efficiency. Gathering the right data necessary for decision-making was challenging due to issues of accessibility, trust, and timeliness. In this session, learn how Takeda is undergoing a transformation in R&D to increase the speed-to-market of high-impact therapies to improve patient lives. The Data and Analytics Hub was built, with Deloitte, to address these issues and support the efficient generation of data insights for functions such as clinical operations, clinical development, medical affairs, portfolio management, and R&D finance. In the AWS hosted data lake, this data is processed, integrated, and made available to business end users through data visualization interfaces, and to data scientists through direct connectivity. Learn how Takeda has achieved significant time reductions—from weeks to minutes—to gather and provision data that has the potential to reduce cycle times in drug development. The hub also enables more efficient operations and alignment to achieve product goals through cross functional team accountability and collaboration due to the ability to access the same cross domain data.
ABD210 – Modernizing Amtrak: Serverless Solution for Real-Time Data Capabilities As the nation’s only high-speed intercity passenger rail provider, Amtrak needs to know critical information to run their business such as: Who’s onboard any train at any time? How are booking and revenue trending? Amtrak was faced with unpredictable and often slow response times from existing databases, ranging from seconds to hours; existing booking and revenue dashboards were spreadsheet-based and manual; multiple copies of data were stored in different repositories, lacking integration and consistency; and operations and maintenance (O&M) costs were relatively high. Join us as we demonstrate how Deloitte and Amtrak successfully went live with a cloud-native operational database and analytical datamart for near-real-time reporting in under six months. We highlight the specific challenges and the modernization of architecture on an AWS native Platform as a Service (PaaS) solution. The solution includes cloud-native components such as AWS Lambda for microservices, Amazon Kinesis and AWS Data Pipeline for moving data, Amazon S3 for storage, Amazon DynamoDB for a managed NoSQL database service, and Amazon Redshift for near-real time reports and dashboards. Deloitte’s solution enabled “at scale” processing of 1 million transactions/day and up to 2K transactions/minute. It provided flexibility and scalability, largely eliminated the need for system management, and dramatically reduced operating costs. Moreover, it laid the groundwork for decommissioning legacy systems, anticipated to save at least $1M over 3 years.
ABD211 – Sysco Foods: A Journey from Too Much Data to Curated Insights In this session, we detail Sysco’s journey from a company focused on hindsight-based reporting to one focused on insights and foresight. For this shift, Sysco moved from multiple data warehouses to an AWS ecosystem, including Amazon Redshift, Amazon EMR, AWS Data Pipeline, and more. As the team at Sysco worked with Tableau, they gained agile insight across their business. Learn how Sysco decided to use AWS, how they scaled, and how they became more strategic with the AWS ecosystem and Tableau.
ABD217 – From Batch to Streaming: How Amazon Flex Uses Real-time Analytics to Deliver Packages on Time Reducing the time to get actionable insights from data is important to all businesses, and customers who employ batch data analytics tools are exploring the benefits of streaming analytics. Learn best practices to extend your architecture from data warehouses and databases to real-time solutions. Learn how to use Amazon Kinesis to get real-time data insights and integrate them with Amazon Aurora, Amazon RDS, Amazon Redshift, and Amazon S3. The Amazon Flex team describes how they used streaming analytics in their Amazon Flex mobile app, used by Amazon delivery drivers to deliver millions of packages each month on time. They discuss the architecture that enabled the move from a batch processing system to a real-time system, overcoming the challenges of migrating existing batch data to streaming data, and how to benefit from real-time analytics.
ABD218 – How EuroLeague Basketball Uses IoT Analytics to Engage Fans IoT and big data have made their way out of industrial applications, general automation, and consumer goods, and are now a valuable tool for improving consumer engagement across a number of industries, including media, entertainment, and sports. The low cost and ease of implementation of AWS analytics services and AWS IoT have allowed AGT, a leader in IoT, to develop their IoTA analytics platform. Using IoTA, AGT brought a tailored solution to EuroLeague Basketball for real-time content production and fan engagement during the 2017-18 season. In this session, we take a deep dive into how this solution is architected for secure, scalable, and highly performant data collection from athletes, coaches, and fans. We also talk about how the data is transformed into insights and integrated into a content generation pipeline. Lastly, we demonstrate how this solution can be easily adapted for other industries and applications.
ABD222 – How to Confidently Unleash Data to Meet the Needs of Your Entire Organization Where are you on the spectrum of IT leaders? Are you confident that you’re providing the technology and solutions that consistently meet or exceed the needs of your internal customers? Do your peers at the executive table see you as an innovative technology leader? Innovative IT leaders understand the value of getting data and analytics directly into the hands of decision makers, and into their own. In this session, Daren Thayne, Domo’s Chief Technology Officer, shares how innovative IT leaders are helping drive a culture change at their organizations. See how transformative it can be to have real-time access to all of the data that is relevant to YOUR job (including a complete view of your entire AWS environment), as well as understand how it can help you lead the way in applying that same pattern throughout your entire company.
ABD303 – Developing an Insights Platform – Sysco’s Journey from Disparate Systems to Data Lake and Beyond Sysco has nearly 200 operating companies across its multiple lines of business throughout the United States, Canada, Central/South America, and Europe. As the global leader in food services, Sysco identified the need to streamline the collection, transformation, and presentation of data produced by the distributed units and systems, into a central data ecosystem. Sysco’s Business Intelligence and Analytics team addressed these requirements by creating a data lake with scalable analytics and query engines leveraging AWS services. In this session, Sysco will outline their journey from a hindsight reporting focused company to an insights driven organization. They will cover solution architecture, challenges, and lessons learned from deploying a self-service insights platform. They will also walk through the design patterns they used and how they designed the solution to provide predictive analytics using Amazon Redshift Spectrum, Amazon S3, Amazon EMR, AWS Glue, Amazon Elasticsearch Service and other AWS services.
ABD309 – How Twilio Scaled Its Data-Driven Culture As a leading cloud communications platform, Twilio has always been strongly data-driven. But as headcount and data volumes grew—and grew quickly—they faced many new challenges. One-off, static reports work when you’re a small startup, but how do you support a growth stage company to a successful IPO and beyond? Today, Twilio’s data team relies on AWS and Looker to provide data access to 700 colleagues. Departments have the data they need to make decisions, and cloud-based scale means they get answers fast. Data delivers real-business value at Twilio, providing a 360-degree view of their customer, product, and business. In this session, you hear firsthand stories directly from the Twilio data team and learn real-world tips for fostering a truly data-driven culture at scale.
ABD310 – How FINRA Secures Its Big Data and Data Science Platform on AWS FINRA uses big data and data science technologies to detect fraud, market manipulation, and insider trading across US capital markets. As a financial regulator, FINRA analyzes highly sensitive data, so information security is critical. Learn how FINRA secures its Amazon S3 Data Lake and its data science platform on Amazon EMR and Amazon Redshift, while empowering data scientists with tools they need to be effective. In addition, FINRA shares AWS security best practices, covering topics such as AMI updates, micro segmentation, encryption, key management, logging, identity and access management, and compliance.
ABD331 – Log Analytics at Expedia Using Amazon Elasticsearch Service Expedia uses Amazon Elasticsearch Service (Amazon ES) for a variety of mission-critical use cases, ranging from log aggregation to application monitoring and pricing optimization. In this session, the Expedia team reviews how they use Amazon ES and Kibana to analyze and visualize Docker startup logs, AWS CloudTrail data, and application metrics. They share best practices for architecting a scalable, secure log analytics solution using Amazon ES, so you can add new data sources almost effortlessly and get insights quickly.
ABD316 – American Heart Association: Finding Cures to Heart Disease Through the Power of Technology Combining disparate datasets and making them accessible to data scientists and researchers is a prevalent challenge for many organizations, not just in healthcare research. American Heart Association (AHA) has built a data science platform using Amazon EMR, Amazon Elasticsearch Service, and other AWS services, that corrals multiple datasets and enables advanced research on phenotype and genotype datasets, aimed at curing heart diseases. In this session, we present how AHA built this platform and the key challenges they addressed with the solution. We also provide a demo of the platform, and leave you with suggestions and next steps so you can build similar solutions for your use cases.
ABD319 – Tooling Up for Efficiency: DIY Solutions @ Netflix At Netflix, we have traditionally approached cloud efficiency from a human standpoint, whether it be in-person meetings with the largest service teams or manually flipping reservations. Over time, we realized that these manual processes are not scalable as the business continues to grow. Therefore, in the past year, we have focused on building out tools that allow us to make more insightful, data-driven decisions around capacity and efficiency. In this session, we discuss the DIY applications, dashboards, and processes we built to help with capacity and efficiency. We start at the ten thousand foot view to understand the unique business and cloud problems that drove us to create these products, and discuss implementation details, including the challenges encountered along the way. Tools discussed include Picsou, the successor to our AWS billing file cost analyzer; Libra, an easy-to-use reservation conversion application; and cost and efficiency dashboards that relay useful financial context to 50+ engineering teams and managers.
ABD312 – Deep Dive: Migrating Big Data Workloads to AWS Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premise deployments to AWS in order to save costs, increase availability, and improve performance. AWS offers a broad set of analytics services, including solutions for batch processing, stream processing, machine learning, data workflow orchestration, and data warehousing. This session will focus on identifying the components and workflows in your current environment; and providing the best practices to migrate these workloads to the right AWS data analytics product. We will cover services such as Amazon EMR, Amazon Athena, Amazon Redshift, Amazon Kinesis, and more. We will also feature Vanguard, an American investment management company based in Malvern, Pennsylvania with over $4.4 trillion in assets under management. Ritesh Shah, Sr. Program Manager for Cloud Analytics Program at Vanguard, will describe how they orchestrated their migration to AWS analytics services, including Hadoop and Spark workloads to Amazon EMR. Ritesh will highlight the technical challenges they faced and overcame along the way, as well as share common recommendations and tuning tips to accelerate the time to production.
ABD402 – How Esri Optimizes Massive Image Archives for Analytics in the Cloud Petabyte-scale archives of satellite, plane, and drone imagery continue to grow exponentially. They mostly exist as semi-structured data, but they are only valuable when accessed and processed by a wide range of products for both visualization and analysis. This session provides an overview of how ArcGIS indexes and structures data so that any part of it can be quickly accessed, processed, and analyzed by reading only the minimum amount of data needed for the task. In this session, we share best practices for structuring and compressing massive datasets in Amazon S3, so it can be analyzed efficiently. We also review a number of different image formats, including GeoTIFF (used for the Public Datasets on AWS program, Landsat on AWS), cloud optimized GeoTIFF, MRF, and CRF as well as different compression approaches to show the effect on processing performance. Finally, we provide examples of how this technology has been used to help image processing and analysis for the response to Hurricane Harvey.
ABD329 – A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Massive Scale Amazon’s consumer business continues to grow, and so does the volume of data and the number and complexity of the analytics done in support of the business. In this session, we talk about how Amazon.com uses AWS technologies to build a scalable environment for data and analytics. We look at how Amazon is evolving the world of data warehousing with a combination of a data lake and parallel, scalable compute engines such as Amazon EMR and Amazon Redshift.
ABD327 – Migrating Your Traditional Data Warehouse to a Modern Data Lake In this session, we discuss the latest features of Amazon Redshift and Redshift Spectrum, and take a deep dive into its architecture and inner workings. We share many of the recent availability, performance, and management enhancements and how they improve your end user experience. You also hear from 21st Century Fox, who presents a case study of their fast migration from an on-premises data warehouse to Amazon Redshift. Learn how they are expanding their data warehouse to a data lake that encompasses multiple data sources and data formats. This architecture helps them tie together siloed business units and get actionable 360-degree insights across their consumer base.
MCL202 – Ally Bank & Cognizant: Transforming Customer Experience Using Amazon Alexa Given the increasing popularity of natural language interfaces such as Voice as User technology or conversational artificial intelligence (AI), Ally® Bank was looking to interact with customers by enabling direct transactions through conversation or voice. They also needed to develop a capability that allows third parties to connect to the bank securely for information sharing and exchange, using oAuth, an authentication protocol seen as the future of secure banking technology. Cognizant’s Architecture team partnered with Ally Bank’s Enterprise Architecture group and identified the right product for oAuth integration with Amazon Alexa and third-party technologies. In this session, we discuss how building products with conversational AI helps Ally Bank offer an innovative customer experience; increase retention through improved data-driven personalization; increase the efficiency and convenience of customer service; and gain deep insights into customer needs through data analysis and predictive analytics to offer new products and services.
MCL317 – Orchestrating Machine Learning Training for Netflix Recommendations At Netflix, we use machine learning (ML) algorithms extensively to recommend relevant titles to our 100+ million members based on their tastes. Everything on the member home page is an evidence-driven, A/B-tested experience that we roll out backed by ML models. These models are trained using Meson, our workflow orchestration system. Meson distinguishes itself from other workflow engines by handling more sophisticated execution graphs, such as loops and parameterized fan-outs. Meson can schedule Spark jobs, Docker containers, bash scripts, gists of Scala code, and more. Meson also provides a rich visual interface for monitoring active workflows and inspecting execution logs. It has a powerful Scala DSL for authoring workflows as well as a REST API. In this session, we focus on how Meson trains recommendation ML models in production, and how we have re-architected it to scale up for a growing need of broad ETL applications within Netflix. As a driver for this change, we have had to evolve the persistence layer for Meson. We talk about how we migrated from Cassandra to Amazon RDS backed by Amazon Aurora.
MCL350 – Humans vs. the Machines: How Pinterest Uses Amazon Mechanical Turk’s Worker Community to Improve Machine Learning Ever since the term “crowdsourcing” was coined in 2006, it’s been a buzzword for technology companies and social institutions. In the technology sector, crowdsourcing is instrumental for verifying machine learning algorithms, which, in turn, improves the user’s experience. In this session, we explore how Pinterest adapted to an increased reliance on human evaluation to improve their product, with a focus on how they’ve integrated with Mechanical Turk’s platform. This presentation is aimed at engineers, analysts, program managers, and product managers who are interested in how companies rely on Mechanical Turk’s human evaluation platform to better understand content and improve machine learning algorithms. The discussion focuses on the analysis and product decisions related to building a high quality crowdsourcing system that takes advantage of Mechanical Turk’s powerful worker community.
ABD201 – Big Data Architectural Patterns and Best Practices on AWS In this session, we simplify big data processing as a data bus comprising various stages: collect, store, process, analyze, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architectures, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
ABD202 – Best Practices for Building Serverless Big Data Applications Serverless technologies let you build and scale applications and services rapidly without the need to provision or manage servers. In this session, we show you how to incorporate serverless concepts into your big data architectures. We explore the concepts behind and benefits of serverless architectures for big data, looking at design patterns to ingest, store, process, and visualize your data. Along the way, we explain when and how you can use serverless technologies to streamline data processing, minimize infrastructure management, and improve agility and robustness. We also share a reference architecture using a combination of cloud and open source technologies to solve your big data problems. Topics include: use cases and best practices for serverless big data applications; leveraging AWS technologies such as Amazon DynamoDB, Amazon S3, Amazon Kinesis, AWS Lambda, Amazon Athena, and Amazon EMR; and serverless ETL, event processing, ad hoc analysis, and real-time analytics.
ABD206 – Building Visualizations and Dashboards with Amazon QuickSight Just as a picture is worth a thousand words, a visual is worth a thousand data points. A key aspect of our ability to gain insights from our data is to look for patterns, and these patterns are often not evident when we simply look at data in tables. The right visualization will help you gain a deeper understanding in a much quicker timeframe. In this session, we will show you how to quickly and easily visualize your data using Amazon QuickSight. We will show you how you can connect to data sources, generate custom metrics and calculations, create comprehensive business dashboards with various chart types, and set up filters and drill-downs to slice and dice the data.
ABD203 – Real-Time Streaming Applications on AWS: Use Cases and Patterns To win in the marketplace and provide differentiated customer experiences, businesses need to be able to use live data in real time to facilitate fast decision making. In this session, you learn common streaming data processing use cases and architectures. First, we give an overview of streaming data and AWS streaming data capabilities. Next, we look at a few customer examples and their real-time streaming applications. Finally, we walk through common architectures and design patterns of top streaming data use cases.
ABD213 – How to Build a Data Lake with AWS Glue Data Catalog As data volumes grow and customers store more data on AWS, they often have valuable data that is not easily discoverable and available for analytics. The AWS Glue Data Catalog provides a central view of your data lake, making data readily available for analytics. We introduce key features of the AWS Glue Data Catalog and its use cases. Learn how crawlers can automatically discover your data, extract relevant metadata, and add it as table definitions to the AWS Glue Data Catalog. We will also explore the integration between AWS Glue Data Catalog and Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
ABD214 – Real-time User Insights for Mobile and Web Applications with Amazon Pinpoint With customers demanding relevant and real-time experiences across a range of devices, digital businesses are looking to gather user data at scale, understand this data, and respond to customer needs instantly. This requires tools that can record large volumes of user data in a structured fashion, and then instantly make this data available to generate insights. In this session, we demonstrate how you can use Amazon Pinpoint to capture user data in a structured yet flexible manner. Further, we demonstrate how this data can be set up for instant consumption using services like Amazon Kinesis Firehose and Amazon Redshift. We walk through example data based on real world scenarios, to illustrate how Amazon Pinpoint lets you easily organize millions of events, record them in real-time, and store them for further analysis.
ABD223 – IT Innovators: New Technology for Leveraging Data to Enable Agility, Innovation, and Business Optimization Companies of all sizes are looking for technology to efficiently leverage data and their existing IT investments to stay competitive and understand where to find new growth. Regardless of where companies are in their data-driven journey, they face greater demands for information by customers, prospects, partners, vendors and employees. All stakeholders inside and outside the organization want information on-demand or in “real time”, available anywhere on any device. They want to use it to optimize business outcomes without having to rely on complex software tools or human gatekeepers to relevant information. Learn how IT innovators at companies such as MasterCard, Jefferson Health, and TELUS are using Domo’s Business Cloud to help their organizations more effectively leverage data at scale.
ABD301 – Analyzing Streaming Data in Real Time with Amazon Kinesis Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. In this session, we present an end-to-end streaming data solution using Kinesis Streams for data ingestion, Kinesis Analytics for real-time processing, and Kinesis Firehose for persistence. We review in detail how to write SQL queries using streaming data and discuss best practices to optimize and monitor your Kinesis Analytics applications. Lastly, we discuss how to estimate the cost of the entire system.
ABD302 – Real-Time Data Exploration and Analytics with Amazon Elasticsearch Service and Kibana In this session, we use Apache web logs as an example and show you how to build an end-to-end analytics solution. First, we cover how to configure an Amazon ES cluster and ingest data using Amazon Kinesis Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data. Then we demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we review approaches for generating custom, ad-hoc reports.
ABD304 – Best Practices for Data Warehousing with Amazon Redshift & Redshift Spectrum Most companies are overrun with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we take an in-depth look at how modern data warehousing blends and analyzes all your data, inside and outside your data warehouse without moving the data, to give you deeper insights to run your business. We will cover best practices on how to design optimal schemas, load data efficiently, and optimize your queries to deliver high throughput and performance.
ABD305 – Design Patterns and Best Practices for Data Analytics with Amazon EMR Amazon EMR is one of the largest Hadoop operators in the world, enabling customers to run ETL, machine learning, real-time processing, data science, and low-latency SQL at petabyte scale. In this session, we introduce you to Amazon EMR design patterns such as using Amazon S3 instead of HDFS, taking advantage of both long and short-lived clusters, and other Amazon EMR architectural best practices. We talk about lowering cost with Auto Scaling and Spot Instances, and security best practices for encryption and fine-grained access control. Finally, we dive into some of our recent launches to keep you current on our latest features.
ABD307 – Deep Analytics for Global AWS Marketing Organization To meet the needs of the global marketing organization, the AWS marketing analytics team built a scalable platform that allows the data science team to deliver custom econometric and machine learning models for end user self-service. To meet data security standards, we use end-to-end data encryption and different AWS services such as Amazon Redshift, Amazon RDS, Amazon S3, Amazon EMR with Apache Spark and Auto Scaling. In this session, you see real examples of how we have scaled and automated critical analysis, such as calculating the impact of marketing programs like re:Invent and prioritizing leads for our sales teams.
ABD311 – Deploying Business Analytics at Enterprise Scale with Amazon QuickSight One of the biggest tradeoffs customers usually make when deploying BI solutions at scale is agility versus governance. Large-scale BI implementations with the right governance structure can take months to design and deploy. In this session, learn how you can avoid making this tradeoff using Amazon QuickSight. Learn how to easily deploy Amazon QuickSight to thousands of users using Active Directory and Federated SSO, while securely accessing your data sources in Amazon VPCs or on-premises. We also cover how to control access to your datasets, implement row-level security, create scheduled email reports, and audit access to your data.
ABD315 – Building Serverless ETL Pipelines with AWS Glue Organizations need to gain insight and knowledge from a growing number of Internet of Things (IoT), APIs, clickstreams, unstructured and log data sources. However, organizations are also often limited by legacy data warehouses and ETL processes that were designed for transactional data. In this session, we introduce key ETL features of AWS Glue and cover common use cases ranging from scheduled nightly data warehouse loads to near real-time, event-driven ETL flows for your data lake. We discuss how to build scalable, efficient, and serverless ETL pipelines using AWS Glue. Additionally, Merck will share how they built an end-to-end ETL pipeline for their application release management system, and launched it in production in less than a week using AWS Glue.
ABD318 – Architecting a data lake with Amazon S3, Amazon Kinesis, and Amazon Athena Learn how to architect a data lake where different teams within your organization can publish and consume data in a self-service manner. As organizations aim to become more data-driven, data engineering teams have to build architectures that can cater to the needs of diverse users – from developers, to business analysts, to data scientists. Each of these user groups employs different tools, has different data needs, and accesses data in different ways. In this talk, we will dive deep into assembling a data lake using Amazon S3, Amazon Kinesis, Amazon Athena, Amazon EMR, and AWS Glue. The session will feature Mohit Rao, Architect and Integration lead at Atlassian, the maker of products such as JIRA, Confluence, and Stride. First, we will look at a couple of common architectures for building a data lake. Then we will show how Atlassian built a self-service data lake, where any team within the company can publish a dataset to be consumed by a broad set of users.
Companies have valuable data that they may not be analyzing due to the complexity, scalability, and performance issues of loading the data into their data warehouse. However, with the right tools, you can extend your analytics to query data in your data lake—with no loading required. Amazon Redshift Spectrum extends the analytic power of Amazon Redshift beyond data stored in your data warehouse to run SQL queries directly against vast amounts of unstructured data in your Amazon S3 data lake. This gives you the freedom to store your data where you want, in the format you want, and have it available for analytics when you need it. Join a discussion with AWS solution architects to ask questions.
ABD330 – Combining Batch and Stream Processing to Get the Best of Both Worlds Today, many architects and developers are looking to build solutions that integrate batch and real-time data processing, and deliver the best of both approaches. Lambda architecture (not to be confused with the AWS Lambda service) is a design pattern that leverages both batch and real-time processing within a single solution to meet the latency, accuracy, and throughput requirements of big data use cases. Come join us for a discussion on how to implement Lambda architecture (batch, speed, and serving layers) and best practices for data processing, loading, and performance tuning.
ABD335 – Real-Time Anomaly Detection Using Amazon Kinesis Amazon Kinesis Analytics offers a built-in machine learning algorithm that you can use to easily detect anomalies in your VPC network traffic and improve security monitoring. Join us for an interactive discussion on how to stream your VPC Flow Logs to Amazon Kinesis Streams and identify anomalies using Kinesis Analytics.
ABD339 – Deep Dive and Best Practices for Amazon Athena Amazon Athena is an interactive query service that enables you to process data directly from Amazon S3 without the need for infrastructure. Since its launch at re:Invent 2016, several organizations have adopted Athena as the central tool to process all their data. In this talk, we dive deep into the most common use cases, including working with other AWS services. We review the best practices for creating tables and partitions and performance optimizations. We also dive into how Athena handles security, authorization, and authentication. Lastly, we hear from a customer who has reduced costs and improved time to market by deploying Athena across their organization.
We look forward to meeting you at re:Invent 2017!
About the Author
Roy Ben-Alta is a solution architect and principal business development manager at Amazon Web Services in New York. He focuses on Data Analytics and ML Technologies, working with AWS customers to build innovative data-driven products.
Contributed by Tiffany Jernigan, Developer Advocate for Amazon ECS
Get ready for takeoff!
We made sure that this year’s re:Invent is chock-full of containers: there are over 40 sessions! New to containers? No problem, we have several introductory sessions for you to dip your toes in. Been using containers for years and know the ins and outs? Don’t miss our technical deep-dives and interactive chalk talks led by container experts.
If you can’t make it to Las Vegas, you can catch the keynotes and session recaps from our livestream and on Twitch.
Session types
Not everyone learns the same way, so we have multiple types of breakout content:
Birds of a Feather An interactive discussion with industry leaders about containers on AWS.
Breakout sessions 60-minute presentations about building on AWS. Sessions are delivered by both AWS experts and customers and span all content levels.
Workshops 2.5-hour, hands-on sessions that teach how to build on AWS. AWS credits are provided. Bring a laptop, and have an active AWS account.
Chalk Talks 1-hour, highly interactive sessions with a smaller audience. They begin with a short lecture delivered by an AWS expert, followed by a discussion with the audience.
Session levels
Whether you’re new to containers or you’ve been using them for years, you’ll find useful information at every level.
Introductory Sessions are focused on providing an overview of AWS services and features, with the assumption that attendees are new to the topic.
Advanced Sessions dive deeper into the selected topic. Presenters assume that the audience has some familiarity with the topic, but may or may not have direct experience implementing a similar solution.
Expert Sessions are for attendees who are deeply familiar with the topic, have implemented a solution on their own already, and are comfortable with how the technology works across multiple services, architectures, and implementations.
Session locations
All container sessions are located in the Aria Resort.
MONDAY 11/27
Breakout sessions
Level 200 (Introductory)
CON202 – Getting Started with Docker and Amazon ECS By packaging software into standardized units, Docker gives code everything it needs to run, ensuring consistency from your laptop all the way into production. But once you have your code ready to ship, how do you run and scale it in the cloud? In this session, you become comfortable running containerized services in production using Amazon ECS. We cover container deployment, cluster management, service auto-scaling, service discovery, secrets management, logging, monitoring, security, and other core concepts. We also cover integrated AWS services and supplementary services that you can take advantage of to run and scale container-based services in the cloud.
Chalk talks
Level 200 (Introductory)
CON211 – Reducing your Compute Footprint with Containers and Amazon ECS Tomas Riha, platform architect for Volvo, shows how Volvo transitioned its WirelessCar platform from using Amazon EC2 virtual machines to containers running on Amazon ECS, significantly reducing cost. Tomas dives deep into the architecture that Volvo used to achieve the migration in under four months, including Amazon ECS, Amazon ECR, Elastic Load Balancing, and AWS CloudFormation.
CON212 – Anomaly Detection Using Amazon ECS, AWS Lambda, and Amazon EMR Learn about the architecture that Cisco CloudLock uses to enable automated security and compliance checks throughout the entire development lifecycle, from the first line of code through runtime. It includes integration with IAM roles, Amazon VPC, and AWS KMS.
Level 400 (Expert)
CON410 – Advanced CICD with Amazon ECS Control Plane Mohit Gupta, product and engineering lead for Clever, demonstrates how to extend the Amazon ECS control plane to optimize management of container deployments and how the control plane can be broadly applied to take advantage of new AWS services. This includes ark—an AWS CLI-based deployment to Amazon ECS, Dapple—a slack-based automation system for deployments and notifications, and Kayvee—log and event routing libraries based on Amazon Kinesis.
Workshops
Level 200 (Introductory)
CON209 – Interstella 8888: Learn How to Use Docker on AWS Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. Join this workshop to get hands-on experience with Docker as you containerize Interstella 8888’s aging monolithic application and deploy it using Amazon ECS.
CON213 – Hands-on Deployment of Kubernetes on AWS In this workshop, attendees get hands-on experience using Kubernetes and Kops (Kubernetes Operations), as described in our recent blog post. Attendees learn how to provision a cluster, assign role-based permissions and security, and launch a container. If you’re interested in learning best practices for running Kubernetes on AWS, don’t miss this workshop.
TUESDAY 11/28
Breakout Sessions
Level 200 (Introductory)
CON206 – Docker on AWS In this session, Docker Technical Staff Member Patrick Chanezon discusses how Finnish Rail, the national train system for Finland, is using Docker on Amazon Web Services to modernize their customer-facing applications, from ticket sales to reservations. Patrick also shares the state of Docker development and adoption on AWS, including the opportunities and implications of efforts such as Project Moby and Docker EE, and how developers can use and contribute to Docker projects.
CON208 – Building Microservices on AWS Increasingly, organizations are turning to microservices to help them empower autonomous teams, letting them innovate and ship software faster than ever before. But implementing a microservices architecture comes with a number of new challenges that need to be dealt with. Chief among these is finding an appropriate platform to help manage a growing number of independently deployable services. In this session, Sam Newman, author of Building Microservices and a renowned expert in microservices strategy, discusses strategies for building scalable and robust microservices architectures. He also tells you how to choose the right platform for building microservices, and about common challenges and mistakes organizations make when they move to microservices architectures.
Level 300 (Advanced)
CON302 – Building a CICD Pipeline for Containers on AWS Containers can make it easier to scale applications in the cloud, but how do you set up your CICD workflow to automatically test and deploy code to containerized apps? In this session, we explore how developers can build effective CICD workflows to manage their containerized code deployments on AWS.
Ajit Zadgaonkar, Director of Engineering and Operations at Edmunds walks through best practices for CICD architectures used by his team to deploy containers. We also deep dive into topics such as how to create an accessible CICD platform and architect for safe blue/green deployments.
CON307 – Building Effective Container Images Sick of getting paged at 2am and wondering “where did all my disk space go?” New Docker users often start with a stock image in order to get up and running quickly, but this can cause problems as your application matures and scales. Creating efficient container images is important to maximize resources, and deliver critical security benefits.
In this session, AWS Sr. Technical Evangelist Abby Fuller covers how to create effective images to run containers in production. This includes an in-depth discussion of how Docker image layers work, things you should think about when creating your images, working with Amazon ECR, and mise en place for installing dependencies. Prakash Janakiraman, Co-Founder and Chief Architect at Nextdoor, discusses high-level and language-specific best practices for building images and how Nextdoor uses these practices to successfully scale their containerized services with a small team.
CON309 – Containerized Machine Learning on AWS Image recognition is a field of deep learning that uses neural networks to recognize the subject and traits for a given image. In Japan, Cookpad uses Amazon ECS to run an image recognition platform on clusters of GPU-enabled EC2 instances. In this session, hear from Cookpad about the challenges they faced building and scaling this advanced, user-friendly service to ensure high-availability and low-latency for tens of millions of users.
CON320 – Monitoring, Logging, and Debugging for Containerized Services As containers become more embedded in the platform, debugging tools, traces, and logs become increasingly important. Nare Hayrapetyan, Senior Software Engineer, and Calvin French-Owen, Senior Technical Officer at Segment, discuss the principles of monitoring and debugging containers and the tools Segment has implemented and built for logging, alerting, metric collection, and debugging of containerized services running on Amazon ECS.
Chalk Talks
Level 300 (Advanced)
CON314 – Automating Zero-Downtime Production Cluster Upgrades for Amazon ECS Containers make it easy to deploy new code into production to update the functionality of a service, but what happens when you need to update the Amazon EC2 compute instances that your containers are running on? In this talk, we’ll deep dive into how to upgrade the Amazon EC2 infrastructure underlying a live production Amazon ECS cluster without affecting service availability. Matt Callanan, Engineering Manager at Expedia, walks through Expedia’s “PRISM” project that safely relocates hundreds of tasks onto new Amazon EC2 instances with zero downtime to applications.
CON322 – Maximizing Amazon ECS for Large-Scale Workloads Head of Mobfox DevOps, David Spitzer, shows how Mobfox used Docker and Amazon ECS to scale the Mobfox services and development teams to achieve low-latency networking and automatic scaling. This session covers Mobfox’s ecosystem architecture, comparing where it stood in 2015 with where it is today, the challenges Mobfox faced in growing their platform, and how they overcame them.
CON323 – Microservices Architectures for the Enterprise Salva Jung, Principal Engineer for Samsung Mobile, shares how Samsung Connect is architected as microservices running on Amazon ECS to securely, stably, and efficiently handle requests from millions of mobile and IoT devices around the world.
CON324 – Windows Containers on Amazon ECS Docker containers are commonly regarded as powerful and portable runtime environments for Linux code, but Docker also offers API and toolchain support for running Windows Server in containers. In this talk, we discuss the various options for running Windows-based applications in containers on AWS.
CON326 – Remote Sensing and Image Processing on AWS Learn how Encirca services by DuPont Pioneer uses Amazon ECS powered by GPU instances and Amazon EC2 Spot Instances to run proprietary image-processing algorithms against satellite imagery. Mark Lanning and Ethan Harstad, engineers at DuPont Pioneer, show how this architecture has allowed them to process satellite imagery multiple times a day for each agricultural field in the United States in order to identify crop health changes.
Workshops
Level 300 (Advanced)
CON317 – Advanced Container Management at Catsndogs.lol Catsndogs.lol is a (fictional) company that needs help deploying and scaling its container-based application. During this workshop, attendees join the new DevOps team at CatsnDogs.lol and help the company manage its applications using Amazon ECS and release new features to make its customers happier than ever. Attendees get hands-on with service and container-instance auto-scaling, spot-fleet integration, container placement strategies, service discovery, secrets management with AWS Systems Manager Parameter Store, time-based and event-based scheduling, and automated deployment pipelines. If you are a developer interested in learning more about how Amazon ECS can accelerate your application development and deployment workflows, or if you are a systems administrator or DevOps person interested in understanding how Amazon ECS can simplify the operational model associated with running containers at scale, then this workshop is for you. You should have basic familiarity with Amazon ECS, Amazon EC2, and IAM.
Additional requirements:
The AWS CLI or AWS Tools for PowerShell installed
An AWS account with administrative permissions (including the ability to create IAM roles and policies) created at least 24 hours in advance.
WEDNESDAY 11/29
Birds of a Feather (BoF)
CON01 – Birds of a Feather: Containers and Open Source at AWS Cloud native architectures take advantage of on-demand delivery, global deployment, elasticity, and higher-level services to enable developer productivity and business agility. Open source is a core part of making cloud native possible for everyone. In this session, we welcome thought leaders from the CNCF, Docker, and AWS to discuss the cloud’s direction for growth and enablement of the open source community. We also discuss how AWS is integrating open source code into its container services and its contributions to open source projects.
Breakout Sessions
Level 300 (Advanced)
CON308 – Mastering Kubernetes on AWS Much progress has been made on how to bootstrap a cluster since Kubernetes’ first commit, and it now takes only a matter of minutes to go from zero to a running cluster on Amazon Web Services. However, evolving a simple Kubernetes architecture to be ready for production in a large enterprise can quickly become overwhelming with options for configuration and customization.
In this session, Arun Gupta, Open Source Strategist for AWS, and Raffaele Di Fazio, software engineer at leading European fashion platform Zalando, show the common practices for running Kubernetes on AWS and share insights from experience in operating tens of Kubernetes clusters in production on AWS. We cover options and recommendations on how to install and manage clusters, configure high availability, perform rolling upgrades and handle disaster recovery, as well as continuous integration and deployment of applications, logging, and security.
CON310 – Moving to Containers: Building with Docker and Amazon ECS If you’ve ever considered moving part of your application stack to containers, don’t miss this session. We cover best practices for containerizing your code, implementing automated service scaling and monitoring, and setting up automated CI/CD pipelines with fail-safe deployments. Manjeeva Silva and Thilina Gunasinghe show how McDonald’s implemented their home delivery platform in four months using Docker containers and Amazon ECS to serve tens of thousands of customers.
Level 400 (Expert)
CON402 – Advanced Patterns in Microservices Implementation with Amazon ECS Scaling a microservice-based infrastructure can be challenging in terms of both technical implementation and developer workflow. In this talk, AWS Solutions Architect Pierre Steckmeyer is joined by Will McCutchen, Architect at BuzzFeed, to discuss Amazon ECS as a platform for building a robust infrastructure for microservices. We look at the key attributes of microservice architectures and how Amazon ECS supports these requirements in production, from configuration to sophisticated workload scheduling to networking capabilities to resource optimization. We also examine what it takes to build an end-to-end platform on top of the wider AWS ecosystem, and what it’s like to migrate a large engineering organization from a monolithic approach to microservices.
CON404 – Deep Dive into Container Scheduling with Amazon ECS As your application’s infrastructure grows and scales, well-managed container scheduling is critical to ensuring high availability and resource optimization. In this session, we deep dive into the challenges and opportunities around container scheduling, as well as the different tools available within Amazon ECS and AWS to carry out efficient container scheduling. We discuss patterns for container scheduling available with Amazon ECS, the Blox scheduling framework, and how you can customize and integrate third-party scheduler frameworks to manage container scheduling on Amazon ECS.
Chalk Talks
Level 300 (Advanced)
CON312 – Building a Selenium Fleet on the Cheap with Amazon ECS with Spot Fleet Roberto Rivera and Matthew Wedgwood, engineers at RetailMeNot, give a practical overview of setting up a fleet of Selenium nodes running on Amazon ECS with Spot Fleet. They discuss the challenges of running Selenium with high availability at minimum cost, using Amazon ECS container introspection to connect the Selenium Hub with its nodes.
CON315 – Virtually There: Building a Render Farm with Amazon ECS Learn how 8i Corp scales its multi-tenanted, volumetric render farm up to thousands of instances using AWS, Docker, and an API-driven infrastructure. This render farm enables them to turn the video footage from an array of synchronized cameras into a photo-realistic hologram capable of playback on a range of devices, from mobile phones to high-end head mounted displays. Join Owen Evans, VP of Engineering for 8i, as they dive deep into how 8i’s rendering infrastructure is built and maintained by just a handful of people and powered by Amazon ECS.
CON325 – Developing Microservices – from Your Laptop to the Cloud Wesley Chow, Staff Engineer at Adroll, shows how his team extends Amazon ECS by enabling local development capabilities. Hologram, Adroll’s local development program, brings the capabilities of the Amazon EC2 instance metadata service to non-EC2 hosts, so that developers can run the same software on local machines with the same credentials source as in production.
CON327 – Patterns and Considerations for Service Discovery Roven Drabo, head of cloud operations at Kaplan Test Prep, illustrates Kaplan’s complete container automation solution using Amazon ECS along with how his team uses NGINX and HashiCorp Consul to provide an automated approach to service discovery and container provisioning.
CON328 – Building a Development Platform on Amazon ECS Quinton Anderson, Head of Engineering for Commonwealth Bank of Australia, walks through how they migrated their internal development and deployment platform from Mesos/Marathon to Amazon ECS. The platform uses a custom DSL to abstract a layered application architecture, in a way that makes it easy to plug or replace new implementations into each layer in the stack.
Workshops
Level 300 (Advanced)
CON318 – Interstella 8888: Monolith to Microservices with Amazon ECS Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. Join this workshop to get hands-on experience deploying Docker containers as you break Interstella 8888’s aging monolithic application into containerized microservices. Using Amazon ECS and an Application Load Balancer, you create API-based microservices and deploy them leveraging integrations with other AWS services.
CON332 – Build a Java Spring Application on Amazon ECS This workshop teaches you how to lift and shift existing Spring and Spring Cloud applications onto the AWS platform. Learn how to build a Spring application container, understand bootstrap secrets, push container images to Amazon ECR, and deploy the application to Amazon ECS. Then, learn how to configure the deployment for production.
THURSDAY 11/30
Breakout Sessions
Level 200 (Introductory)
CON201 – Containers on AWS – State of the Union Just over four years after the first public release of Docker, and three years to the day after the launch of Amazon ECS, the use of containers has surged to run a significant percentage of production workloads at startups and enterprise organizations. Join Deepak Singh, General Manager of Amazon Container Services, as he covers the state of containerized application development and deployment trends, new container capabilities on AWS that are available now, options for running containerized applications on AWS, and how AWS customers successfully run container workloads in production.
Level 300 (Advanced)
CON304 – Batch Processing with Containers on AWS Batch processing is useful to analyze large amounts of data. But configuring and scaling a cluster of virtual machines to process complex batch jobs can be difficult. In this talk, we show how to use containers on AWS for batch processing jobs that can scale quickly and cost-effectively. We also discuss AWS Batch, our fully managed batch-processing service. You also hear from GoPro and Here about how they use AWS to run batch processing jobs at scale, including best practices for ensuring efficient scheduling, fine-grained monitoring, compute resource automatic scaling, and security for your batch jobs.
Level 400 (Expert)
CON406 – Architecting Container Infrastructure for Security and Compliance While organizations gain agility and scalability when they migrate to containers and microservices, they also benefit from compliance and security, advantages that are often overlooked. In this session, Kelvin Zhu, lead software engineer at Okta, joins Mitch Beaumont, enterprise solutions architect at AWS, to discuss security best practices for containerized infrastructure. Learn how Okta built their development workflow with an emphasis on security through testing and automation. Dive deep into how containers enable automated security and compliance checks throughout the development lifecycle. Also understand best practices for implementing AWS security and secrets management services for any containerized service architecture.
Chalk Talks
Level 300 (Advanced)
CON329 – Full Software Lifecycle Management for Containers Running on Amazon ECS Learn how The Washington Post uses Amazon ECS to run Arc Publishing, a digital journalism platform that powers The Washington Post and a growing number of major media websites. Amazon ECS enabled The Washington Post to containerize their existing microservices architecture, avoiding a complete rewrite that would have delayed the platform’s launch by several years. In this session, Jason Bartz, Technical Architect at The Washington Post, discusses the platform’s architecture. He addresses the challenges of optimizing Arc Publishing’s workload, and managing the application lifecycle to support 2,000 containers running on more than 50 Amazon ECS clusters.
CON330 – Running Containerized HIPAA Workloads on AWS Nihar Pasala, Engineer at Aetion, discusses the Aetion Evidence Platform, a system for generating the real-world evidence used by healthcare decision makers to implement value-based care. This session discusses the architecture Aetion uses to run HIPAA workloads using containers on Amazon ECS, best practices, and learnings.
Level 400 (Expert)
CON408 – Building a Machine Learning Platform Using Containers on AWS DeepLearni.ng develops and implements machine learning models for complex enterprise applications. In this session, Thomas Rogers, Engineer for DeepLearni.ng, discusses how they worked with Scotiabank to leverage Amazon ECS, Amazon ECR, Docker, GPU-accelerated Amazon EC2 instances, and TensorFlow to develop a retail risk model that helps manage payment collections for millions of Canadian credit card customers.
Workshops
Level 300 (Advanced)
CON319 – Interstella 8888: CICD for Containers on AWS Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. Join this workshop to learn how to set up a CI/CD pipeline for containerized microservices. You get hands-on experience deploying Docker container images using Amazon ECS, AWS CloudFormation, AWS CodeBuild, and AWS CodePipeline, automating everything from code check-in to production.
FRIDAY 12/1
Breakout Sessions
Level 400 (Expert)
CON405 – Moving to Amazon ECS – the Not-So-Obvious Benefits If you ask 10 teams why they migrated to containers, you will likely get answers like ‘developer productivity’, ‘cost reduction’, and ‘faster scaling’. But teams often find there are several other ‘hidden’ benefits to using containers for their services. In this talk, Franziska Schmidt, Platform Engineer at Mapbox, and Yaniv Donenfeld from AWS discuss the obvious and not-so-obvious benefits of moving to a containerized architecture. These include using Docker and Amazon ECS to achieve shared libraries for dev teams, separating private infrastructure from shareable code, and making it easier for non-ops engineers to run services.
Chalk Talks
Level 300 (Advanced)
CON331 – Deploying a Regulated Payments Application on Amazon ECS Travelex discusses how they built an FCA-compliant international payments service using a microservices architecture on AWS. This chalk talk covers the challenges of designing and operating an Amazon ECS-based PaaS in a regulated environment using a DevOps model.
Workshops
Level 400 (Expert)
CON407 – Interstella 8888: Advanced Microservice Operations Interstella 8888 is an intergalactic trading company that deals in rare resources, but their antiquated monolithic logistics systems are causing the business to lose money. In this workshop, you help Interstella 8888 build a modern microservices-based logistics system to save the company from financial ruin. We give you the hands-on experience you need to run microservices in the real world. This includes implementing advanced container scheduling and scaling to deal with variable service requests, implementing a service mesh, issue tracing with AWS X-Ray, container and instance-level logging with Amazon CloudWatch, and load testing.
Know before you go
Want to brush up on your container knowledge before re:Invent? Here are some helpful resources to get started:
Three researchers from Michigan State University have developed a low-cost, open-source fingerprint reader which can detect fake prints. They call it RaspiReader, and they’ve built it using a Raspberry Pi 3 and two Camera Modules. Joshua and his colleagues have just uploaded all the info you need to build your own version — let’s go!
Sadly not the real output of the RaspiReader
Falsified fingerprints
We’ve probably all seen a movie in which a burglar crosses a room full of laser tripwires and then enters the safe full of loot by tricking the fingerprint-secured lock with a fake print. Turns out, the second part is not that unrealistic: you can fake fingerprints using a range of materials, such as glue or latex.
The RaspiReader team collected live and fake fingerprints to test the device
If the spoof print layer capping the spoofer’s finger is thin enough, it can even fool readers that detect blood flow, pulse, or temperature. This is becoming a significant security risk, not least for anyone who unlocks their smartphone using a fingerprint.
The RaspiReader
This is where Anil K. Jain comes in: Professor Jain leads a biometrics research group. Under his guidance, Joshua J. Engelsma and Kai Cao set out to develop a fingerprint reader with improved spoof-print detection. Ultimately, they aim to help the development of more secure commercial technologies. With their project, the team has also created an amazing resource for anyone who wants to build their own fingerprint reader.
So that replicating their device would be easy, they wanted to make it using inexpensive, readily available components, which is why they turned to Raspberry Pi technology.
The RaspiReader and its output
Inside the RaspiReader’s 3D-printed housing, LEDs shine light through an acrylic prism, on top of which the user rests their finger. The prism refracts the light so that the two Camera Modules can take images from different angles. The Pi receives these images via a Multi Camera Adapter Module feeding into the CSI port. Collecting two images means the researchers’ spoof detection algorithm has more information to work with.
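As a rough illustration of the capture side, here is a minimal Python sketch using the RPi.GPIO and picamera libraries. The selection pin and switching details are placeholders (the real control lines depend on the particular multi camera adapter board), and this is not the researchers’ code:

import time
import RPi.GPIO as GPIO
from picamera import PiCamera

# Placeholder selection pin; the actual pins (and any I2C setup) depend on
# the multi camera adapter board being used.
SELECT_PIN = 7

GPIO.setmode(GPIO.BOARD)
GPIO.setup(SELECT_PIN, GPIO.OUT)

def capture_view(level, filename):
    # Route one of the two Camera Modules to the CSI port, then grab a frame.
    GPIO.output(SELECT_PIN, level)
    time.sleep(0.1)  # give the multiplexer a moment to settle
    with PiCamera() as camera:
        camera.capture(filename)

capture_view(GPIO.LOW, 'view_a.jpg')
capture_view(GPIO.HIGH, 'view_b.jpg')
GPIO.cleanup()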
Real on the left, fake on the right
RaspiReader software
The camera adapter uses the RPi.GPIO Python package. The RaspiReader performs image processing, and its spoof detection takes image colour and 3D friction ridge patterns into account. The detection algorithm extracts colour local binary patterns … please don’t ask me to explain! You can have a look at the researchers’ manuscript if you want to get stuck into the fine details of their project.
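To give a flavour of what a colour local binary pattern feature could look like, here is a small sketch using scikit-image and NumPy. It is only a loose approximation of the idea, not the team’s actual feature pipeline:

import numpy as np
from skimage.feature import local_binary_pattern

def colour_lbp_features(rgb_image, points=8, radius=1):
    # Compute a uniform LBP histogram for each colour channel and
    # concatenate them into a single feature vector.
    features = []
    for channel in range(3):
        lbp = local_binary_pattern(rgb_image[:, :, channel],
                                   points, radius, method='uniform')
        hist, _ = np.histogram(lbp, bins=points + 2,
                               range=(0, points + 2), density=True)
        features.append(hist)
    return np.concatenate(features)

A classifier trained on such vectors from live and spoofed fingers could then flag suspicious captures.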
Build your own fingerprint reader
I’ve had my eyes glued to my inbox waiting for Josh to send me links to instructions and files for this build, and here they are (thanks, Josh)! Check out the video tutorial, which walks you through how to assemble the RaspiReader:
Building a cost-effective, open-source, and spoof-resilient fingerprint reader for $160* in under an hour.
Code: https://github.com/engelsjo/RaspiReader
Links to parts:
1. PRISM – https://www.amazon.com/gp/product/B00WL3OBK4/ref=oh_aui_detailpage_o05_s00?ie=UTF8&psc=1 (Better fit) https://www.thorlabs.com/thorproduct.cfm?partnumber=PS611
2. RaspiCams – https://www.amazon.com/gp/product/B012V1HEP4/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1
3. Camera Multiplexer – https://www.amazon.com/gp/product/B012UQWOOQ/ref=oh_aui_detailpage_o04_s01?ie=UTF8&psc=1
4. Raspberry Pi Kit – https://www.amazon.com/CanaKit-Raspberry-Clear-Power-Supply/dp/B01C6EQNNK/ref=sr_1_6?ie=UTF8&qid=1507058509&sr=8-6&keywords=raspberry+pi+3b
Whitepaper: https://arxiv.org/abs/1708.07887
* Prices can vary based on Amazon’s pricing.
You can find a parts list with links to suppliers in the video description — the whole build costs around $160. All the STL files for the housing and the Python scripts you need to run on the Pi are available on Josh’s GitHub.
Enhance your home security
The RaspiReader is a great resource for researchers, and it would also be a terrific project to build at home! Is there a more impressive way to protect a treasured possession, or secure access to your computer, than with a DIY fingerprint scanner?
Check out this James-Bond-themed blog post for Raspberry Pi resources to help you build a high-security lair. If you want even more inspiration, watch this video about a laser-secured cookie jar which Estefannie made for us. And be sure to share your successful fingerprint scanner builds with us via social media!
Sam Dengler, Amazon Web Services Solutions Architect
Serverless architectures allow solution builders to focus on solving challenges particular to their business, without assuming the overhead of managing infrastructure in AWS. AWS Lambda is a service that lets you run code without provisioning or managing servers.
When using Lambda in a serverless architecture, the goal should be to design tightly focused functions that do one thing and do it well. When these functions are composed to accomplish larger goals in microservice architectures, the complexity shifts from the internal components to the external communication between components. It’s all too easy to accidentally back into an architecture that is rigid to change because components are too knowledgeable of each other via the communication paths between them.
Solution builders can address this architectural challenge by using messaging patterns, resulting in loosely coupled communication between highly cohesive components to manage complexity in serverless architectures. As introduced in the recent Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox post, a common approach when one component wishes to deliver the same message to multiple receivers is to use the fanout publish/subscribe messaging pattern.
The fanout pattern for message communication can be implemented in code. However, depending on your requirements, alternative solutions exist to offload this undifferentiated responsibility from the application. Amazon SNS is a fully managed pub/sub messaging service that lets you fan out messages to large numbers of recipients.
In this post, I review a serverless architecture from PlayOn! Sports as a case study for migration of fanout functionality from application code to SNS.
PlayOn! Sports serverless video processing platform
PlayOn! Sports is one of the nation’s leading high school sports media companies. They operate a comprehensive technology platform, enabling high-quality, low-cost productions of live sports events for the NFHS High School Sports Network.
At the 2014 AWS re:Invent conference, Lambda was announced. The PlayOn! Sports technology team recognized the parallels between the serverless demos, which featured image processing with ImageMagick, and their own video processing with ffmpeg.
At the time, PlayOn! Sports was broadcasting live video with adaptive bit rates, requiring a transcoding of the video stream to multiple quality levels for consumption on desktop, mobile, and connected devices. This is not unusual for an internet media company. However, with over 50,000 live broadcasts produced in 2014, the traditional media and entertainment technological approaches and pricing models would not work.
After some consultation with the Lambda team to validate support for custom binary execution, PlayOn! Sports moved forward with the development of a new serverless video processing platform according to the architecture diagram below.
In the architecture, a laptop in the field captures the video from a camera source and divides it into small fragments, according to the HLS protocol. The fragments are published to Amazon S3, through an Amazon CloudFront distribution for accelerated upload. When the file has been written to S3, it triggers a Lambda function to initiate the video segment processing.
Video transcoding fanout implementation in Lambda
Given Lambda’s integration growth across AWS, it’s easy to forget that it did not include managed integration with SNS when it was announced in November 2014.
PlayOn! Sports was actively experimenting with approaches to video processing to address quality control, audience growth, and cost constraints. Lambda was a great tool for rapid innovation. A goal of the architecture design was the ability to add and remove video processing alternatives to the workflow, using the fanout pattern to identify optimal solutions. Below is example code from the initial implementation:
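The snippet itself is not included in this copy of the post. As a rough sketch of what such a fanout handler could look like (the downstream function names here are placeholders, with media_info borrowed from the description that follows), consider:

import json
import boto3

lambda_client = boto3.client('lambda')

# Hypothetical downstream processing functions; media_info is mentioned in
# this post, the others are placeholders.
TARGET_FUNCTIONS = ['media_info', 'transcode_low', 'transcode_high']

def lambda_handler(event, context):
    # Invoke each target asynchronously with the same S3 event payload.
    for function_name in TARGET_FUNCTIONS:
        lambda_client.invoke(FunctionName=function_name,
                             InvocationType='Event',
                             Payload=json.dumps(event))
    return 'done'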
Each Lambda function is invoked asynchronously, injecting the same S3 event that triggered the original Lambda function. For example, the media_info Lambda function could be scaffolded similar to the following code snippet:
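That snippet is likewise missing here. Judging from the refactored version shown a little further down, the pre-SNS scaffold would have looked roughly like this (treat it as a reconstruction, not the team’s actual code):

import json
import logging

logger = logging.getLogger('boto3')
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    # The S3 event is passed straight through by the fanout function.
    logger.info(json.dumps(event))
    # MediaInfo processing would happen here.
    return 'done'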
The PlayOn! Sports development team was familiar with SNS, but had not used it previously to support system-to-system messaging patterns. After the announcement of SNS triggering of Lambda functions, the PlayOn! Sports team planned to migrate to the new feature to offload the overhead of managing the fanout Lambda function.
When invoking a Lambda function, SNS wraps the original event with SNSEvent. The Lambda function can be refactored by adding a function to parse the S3 event from SNSEvent, as seen in the following code:
import json
import logging
import boto3

logger = logging.getLogger('boto3')
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    s3_event = parse_event(event)
    logger.info(json.dumps(s3_event))
    # MediaInfo Processing
    # see: https://aws.amazon.com/blogs/compute/extracting-video-metadata-using-lambda-and-mediainfo/
    return 'done'

def parse_event(event):
    record = event['Records'][0]
    if 'EventSource' in record and record['EventSource'] == 'aws:sns':
        return json.loads(record['Sns']['Message'])
    return event
This Lambda function modification can be authored, tested, and deployed before enabling the SNS integration to verify that the existing Lambda fanout execution path continues to operate as before. The Lambda function invocation can now be transferred from the fanout Lambda function to SNS without disruption to S3 processing.
As the diagram below shows, the resulting architecture is similar to the original. The exception is that objects written to S3 now trigger a message to be published to an SNS topic. This sends the S3 event to multiple Lambda functions to be processed independently.
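To make the wiring concrete, here is a small boto3 sketch of subscribing the processing functions to the topic; the ARNs are placeholders, and the sample CloudFormation template in the next section takes care of this kind of plumbing for you:

import boto3

sns = boto3.client('sns')
lambda_client = boto3.client('lambda')

TOPIC_ARN = 'arn:aws:sns:us-east-1:123456789012:video-segments'  # placeholder
FUNCTION_ARNS = [
    'arn:aws:lambda:us-east-1:123456789012:function:media_info',  # placeholder
    'arn:aws:lambda:us-east-1:123456789012:function:transcode',   # placeholder
]

for index, function_arn in enumerate(FUNCTION_ARNS):
    # Allow SNS to invoke the function, then subscribe it to the topic.
    lambda_client.add_permission(FunctionName=function_arn,
                                 StatementId='sns-invoke-{}'.format(index),
                                 Action='lambda:InvokeFunction',
                                 Principal='sns.amazonaws.com',
                                 SourceArn=TOPIC_ARN)
    sns.subscribe(TopicArn=TOPIC_ARN,
                  Protocol='lambda',
                  Endpoint=function_arn)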
Sample architecture deployment using AWS CloudFormation
AWS CloudFormation gives developers and system administrators an easy way to create and manage a collection of related AWS resources. CloudFormation provisions and updates resources in an orderly and predictable fashion. To launch the CloudFormation stack for the sample fanout architecture, choose the following button:
Follow these steps to complete the architecture deployment:
If necessary, sign into the console for your account when prompted.
On the Select Template page, choose Next.
Under Parameters, for S3BucketName, enter a globally unique name.
For SnsTopicName, enter a region-unique name.
Choose Next, Next.
Select the checkbox for I acknowledge that AWS CloudFormation might create IAM resources, and choose Create.
After the stack has completed creation, you can test both paths of execution by uploading files to the S3 bucket that you created. Uploading a file to the “/uploads/lambda/” directory in S3 triggers the Lambda fanout function. Uploading a file to the “/uploads/sns/” directory in S3 triggers the SNS fanout execution path. You can verify execution by monitoring the Lambda function outputs in CloudWatch Logs.
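If you prefer to script the test, a short boto3 snippet (with the bucket name replaced by the S3BucketName you chose earlier) can exercise both prefixes:

import boto3

s3 = boto3.client('s3')
bucket = 'your-fanout-sample-bucket'  # placeholder: the S3BucketName you entered above

# Exercise the Lambda fanout path and the SNS fanout path.
s3.upload_file('sample.txt', bucket, 'uploads/lambda/sample.txt')
s3.upload_file('sample.txt', bucket, 'uploads/sns/sample.txt')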
Conclusion
In this post, I reviewed the fanout messaging pattern and options for its inclusion in a serverless architecture using Lambda application code and SNS. Using the PlayOn! Sports serverless video processing pipeline use case, I demonstrated how easy it is to refactor an existing application to use the SNS fanout approach.
I also provided a sample architecture in CloudFormation that you can run in your own account. Try it out and expand the sample architecture by adding other Lambda functions to the SNS topic, to demonstrate the flexibility of the fanout messaging pattern!
You can get started with SNS using the AWS Management Console, or the SDK of your choice. For more information about how to use SNS fanout messaging, see the following resources:
G’MIC is a generic, extensible framework for image processing, often used as a plug-in for GIMP. Version 2.0 has been released. “One of the major new features of this version 2.0 is the re-implementation of the plug-in code, from scratch. The repository G’MIC-Qt developed by Sébastien (an experienced member of the team) is a Qt-based version of the plug-in interface, being as independent as possible of the widget API provided by GIMP.” The announcement has many more details about G’MIC and how it can be used. LWN looked at G’MIC in August 2014.
Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS
Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS
Deriving insights from data is foundational to nearly every organization, and many customers process high volumes of data every day. One common requirement of customers in life sciences is the need to analyze these data in a high-throughput fashion without sacrificing time-to-insight.
One such use case is genomic sequencing. Modern DNA sequencers, such as Illumina’s NovaSeq, can produce multiple terabytes of raw data each day. The data must then be processed into meaningful information that clinicians and researchers can act on in a timely fashion. This processing of genomic data is commonly referred to as "secondary analysis".
Most common secondary analysis workflows take raw reads generated from sequencers and then process them in a multi-step workflow to identify the variation in a biological sample compared to a standard genome reference. The individual steps are normally similar to the following:
DNA sequences are mapped to a standard genome reference by use of an alignment algorithm, such as Smith-Waterman or Burrows-Wheeler.
After the sequences are mapped to the reference, the differences are identified as single nucleotide variations, insertions, deletions, or complex structural variation in a process known as variant calling.
The resulting variants are often combined with other information to identify genomic variants highly correlated with disease or drug response. They might also be analyzed in the context of clinical data such as to identify disease susceptibility or state for a patient.
Along the way, quality metrics are collected or computed to ensure that the generated data is of the appropriate quality for use in requisite research or clinical settings.
In this post series, you can build a secondary analysis workflow similar to the one just described. Here is a diagram of the workflow:
Secondary analysis is a batch workflow
At its core, a genomics pipeline is similar to a series of Extract Transform and Load (ETL) steps that convert raw files from a DNA sequencer to a list of variants for one or more individuals. Each step extracts a set of input files from a data source, processes them as a compute-intensive workload (transform), and then loads the output into another location for subsequent storage or analysis.
These steps are often chained together to build a flexible genomics processing workflow. The files can then be used for downstream analysis, such as population scale analytics with Amazon Athena or Spark on Amazon EMR. These ETL processes can be represented as individual batch steps in an overall workflow.
When we discuss batch processing with customers, we often focus on the following three layers:
Jobs (analytical modules): These jobs are individual units of work that take a set of inputs, run a single process, and produce a set of outputs. In this series, you use Docker containers to define these analytical modules. For genomics, these commonly include alignment, variant calling, quality control, or another module in your workflow. Amazon ECS is an AWS service that orchestrates and runs these Docker containerized modules on top of Amazon EC2 instances.
Batch engine: This is a framework for submitting individual analytical modules with the appropriate requirements for computational resources, such as memory and number of CPUs. Each step of the analysis pipeline requires a definition of how to run a job:
Computational resources (disk, memory, CPU)
The compute environment to run it in (Docker container, runtime parameters)
Information about the priority of different jobs
Any dependencies between jobs
You can leverage concepts such as container placement and bin packing to maximize performance of your genomic pipeline while concurrently optimizing for cost. We will use AWS Batch for this layer. AWS Batch dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the submitted batch jobs.
Workflow orchestration: This layer sits on top of the batch engine and allows you to decouple your workflow definition from the execution of the individual jobs. You can envision this layer as a state machine where you define a series of steps and pass appropriate metadata between states. We will use AWS Lambda to submit the jobs to AWS Batch and use AWS Step Functions to define and orchestrate the workflow.
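To make the layers concrete, here is a minimal sketch of the orchestration glue: a Lambda handler that submits one containerized step to AWS Batch with boto3. The queue, job definition, and event field names are placeholders for illustration, not the code used later in this series.

import boto3

batch = boto3.client('batch')

def lambda_handler(event, context):
    # Submit one analytical module (for example, alignment) as an AWS Batch job.
    depends_on = [{'jobId': event['prior_job_id']}] if event.get('prior_job_id') else []
    response = batch.submit_job(
        jobName='align-sample-{}'.format(event['sample_id']),
        jobQueue='genomics-default',    # placeholder queue name
        jobDefinition='bwa-mem:1',      # placeholder job definition
        dependsOn=depends_on,
        containerOverrides={
            'vcpus': 8,
            'memory': 32000,
            'environment': [
                {'name': 'INPUT_PREFIX', 'value': event['input_prefix']},
                {'name': 'OUTPUT_PREFIX', 'value': event['output_prefix']},
            ],
        },
    )
    # A Step Functions state machine can poll on this job ID to track progress.
    return {'jobId': response['jobId']}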
Architecture
In the next three posts, you build a genome analysis pipeline using the following architecture. You don’t explicitly build the grayed-out sections, but we wanted to include them in the diagram because they are natural extensions to the core architecture. We discuss these and other extensions briefly in the concluding post. All code related to this blog series can be found in the associated GitHub repository.
Part 2 covers the job layer. We demonstrate how you can package bioinformatics applications in Docker containers, and discuss best practices when developing these containers for use in a multitenant batch environment.
Part 3 dives deep into the batch (data processing) layer. We discuss common considerations for deploying Docker containers for batch analysis and demonstrate how you can use AWS Batch as a scalable and elastic batch engine.
Part 4 dives into the workflow orchestration layer. We show how you might architect that layer with AWS services. You take the components built in parts 2 and 3 and combine them into a complete secondary analysis workflow that manages dependencies between jobs and continually checks the progress of running jobs. We conclude by running a secondary analysis end to end for under $1 and discuss some extensions you can build on top of this core workflow.
What you can expect to learn
At the end of this series, you will have built a scalable and elastic solution to process genomes on AWS, as well as gained a general understanding of architectural choices available to you. The solution is generally applicable to other workloads, such as image processing. Even non-life science workloads such as trade analytics in financial services can benefit.
In your solution, you use Amazon EC2 Spot Instances to optimize for cost. Spot Instances allow you to bid on spare EC2 compute capacity, which can save you up to 90% compared to traditional On-Demand prices. In many cases, this translates into the ability to analyze genomes at scale for as little as $1 per analysis.
It is also fun to challenge yourself to employ fuzzers in non-conventional ways. Two canonical examples are having your fuzzing target call abort() whenever two libraries that are supposed to implement the same algorithm produce different outputs for identical input data, or whenever a single library produces different outputs when asked to encode or decode the same data several times in a row.
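The experiments described here used AFL against C targets, but the underlying trick is easy to illustrate. Here is a toy Python sketch of an output-stability check built around zlib, purely as an example and not one of the harnesses discussed in this post:

import os
import sys
import zlib

data = sys.stdin.buffer.read()

# Encoding the same input twice with identical settings should be deterministic.
first = zlib.compress(data, 6)
second = zlib.compress(data, 6)
if first != second:
    os.abort()   # unstable output: crash so the fuzzer flags this input

# A compress/decompress round trip should reproduce the original data exactly.
if zlib.decompress(first) != data:
    os.abort()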
Such tricks may sound fanciful, but they actually find interesting bugs. In one case, AFL-based equivalence fuzzing revealed a bunch of fairly rudimentary flaws in common bignum libraries, with some theoretical implications for crypto apps. Another time, output stability checks revealed long-lived issues in IJG jpeg and other widely-used image processing libraries, leaking data across web origins.
In one of my recent experiments, I decided to fuzz brotli, an innovative compression library used in Chrome. But since it has already been fuzzed for many CPU-years, I wanted to do it with a twist: stress-test the compression routines, rather than the usually targeted decompression side. The latter is a far more fruitful target for security research, because decompression code is written with well-formed inputs in mind and can be tripped up by malformed data, whereas compression code is meant to accept arbitrary data and not think about it too hard. That said, the low likelihood of flaws also means that the compression bits are a relatively unexplored surface that may be worth poking with a stick every now and then.
In this case, the library held up admirably – save for a handful of computationally intensive plaintext inputs (that are now easy to spot due to recent improvements to AFL). But the output corpus synthesized by AFL, after being seeded with just a single file containing “0”, featured quite a few peculiar finds:
Strings that looked like viable bits of HTML or XML: <META HTTP-AAA IDEAAAA, DATA="IIA DATA="IIA DATA="IIADATA="IIA, </TD>.
Nonsensical but undeniably English sentences: them with them m with them with themselves, in the fix the in the pin th in the tin, amassize the the in the in the [email protected] in, he the themes where there the where there, size at size at the tie.
Bogus but semi-legible URLs: CcCdc.com/.com/m/ /00.com/.com/m/ /00(0(000000CcCdc.com/.com/.com
Snippets of Lisp code: )))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))).
The results are quite unexpected, given that they are just a product of randomly mutating a single-byte input file and observing the code coverage in a simple compression tool. The explanation is that brotli, in addition to more familiar binary coding methods, uses a static dictionary constructed by analyzing common types of web content. Somehow, by observing the behavior of the program, AFL was able to incrementally reconstruct quite a few of these hardcoded keywords – and then put them together in various semi-interesting ways. Not bad.
As we finish up the month of February, Tina Barr is back with some awesome startups.
-Ana
This month we are bringing you five innovative hot startups:
GumGum – Creating and popularizing the field of in-image advertising.
Jiobit – Smart tags to help parents keep track of kids.
Parsec – Offers flexibility in hardware and location for PC gamers.
Peloton – Revolutionizing indoor cycling and fitness classes at home.
Tendril – Reducing energy consumption for homeowners.
If you missed any of our January startups, make sure to check them out here.
GumGum (Santa Monica, CA) GumGum is best known for inventing and popularizing the field of in-image advertising. Founded in 2008 by Ophir Tanz, the company is on a mission to unlock the value held within the vast content produced daily via social media, editorials, and broadcasts in a variety of industries. GumGum powers campaigns across more than 2,000 premium publishers, which are seen by over 400 million users.
In-image advertising was pioneered by GumGum and has given companies a platform to deliver highly visible ads to a place where the consumer’s attention is already focused. Using image recognition technology, GumGum delivers targeted placements as contextual overlays on related pictures, as banners that fit on all screen sizes, or as In-Feed placements that blend seamlessly into the surrounding content. Using Visual Intelligence, GumGum can scour social media and broadcast TV for all images and videos related to a brand, allowing companies to gain a stronger understanding of their audience and how they are relating to that brand on social media.
GumGum relies on AWS for its image processing and ad serving operations. Using AWS infrastructure, GumGum currently processes 13 million requests per minute across the globe and generates 30 TB of new data every day. The company uses a suite of services including, but not limited to, Amazon EC2, Amazon S3, Amazon Kinesis, Amazon EMR, AWS Data Pipeline, and Amazon SNS. AWS edge locations allow GumGum to serve its customers in the US, Europe, Australia, and Japan, and the company plans to further expand its infrastructure in Australia and the APAC region in the future.
Jiobit (Chicago, IL) Jiobit was inspired by a real event that took place in a crowded Chicago park. A couple of summers ago, John Renaldi experienced every parent’s worst nightmare – he lost track of his then 6-year-old son in a public park for almost 30 minutes. John knew he wasn’t the only parent with this problem. After months of research, he determined that over 50% of parents have had a similar experience and an even greater percentage are actively looking for a way to prevent it.
Jiobit is the world’s smallest and longest-lasting smart tag that helps parents keep track of their kids in every location – indoors and outdoors. The small device is kid-proof: lightweight, durable, and waterproof. It acts as a virtual “safety harness,” using a combination of Bluetooth, Wi-Fi, multiple cellular networks, GPS, and sensors to provide accurate locations in real time. Jiobit can automatically learn routes and locations, and will send parents an alert if their child does not arrive at their destination on time. The talented team of experienced engineers, designers, marketers, and parents has over 150 patents and has shipped dozens of hardware and software products worldwide.
The Jiobit team is utilizing a number of AWS services in the development of their product. Security is critical to the overall product experience, and they are over-engineering security on both the hardware and software sides with the help of AWS. Jiobit is also working toward being the first child-monitoring device to implement an Alexa Skill for the Amazon Echo. The devices use AWS IoT to send and receive data from the Jio Cloud over the MQTT protocol. Once data is received, they use AWS Lambda to parse it and take appropriate actions, including storing relevant data using Amazon DynamoDB and sending location data to Amazon Machine Learning processing jobs.
Parsec (New York, NY) Parsec operates under the notion that everyone should have access to the best computing in the world because access to technology creates endless opportunities. Founded in 2016 by Benjy Boxer and Chris Dickson, Parsec aims to eliminate the burden of hardware upgrades that users frequently experience by building the technology to make a computer in the cloud available anywhere, at any time. Today, they are using their technology to enable greater flexibility in the hardware and location that PC gamers choose to play their favorite games on. Check out this interview with Benjy and our Startups team for a look at how Parsec works.
Parsec built their first product to improve the gaming experience; gamers no longer have to purchase consoles or expensive PCs to access the entertainment they love. Their low latency video streaming and networking technologies allow gamers to remotely access their gaming rig and play on any Windows, Mac, Android, or Raspberry Pi device. With the global reach of AWS, Parsec is able to deliver cloud gaming to the median user in the US and Europe with less than 30 milliseconds of network latency.
Parsec users currently have two options available to start gaming with cloud resources. They can either set up their own machines with the Parsec AMI in their region or rely on Parsec to manage everything for a seamless experience. In either case, Parsec uses the g2.2xlarge EC2 instance type. Parsec is using Amazon Elastic Block Store (Amazon EBS) to store games, Amazon DynamoDB for scalability, and Amazon EC2 for its web servers and various APIs. They also deal with a high volume of logs and take advantage of the Amazon Elasticsearch Service to analyze the data.
Be sure to check out Parsec’s blog to keep up with the latest news.
Peloton (New York, NY) The idea for Peloton was born in 2012 when John Foley, Founder and CEO, and his wife Jill started realizing the challenge of balancing work, raising young children, and keeping up with personal fitness. This is a common challenge people face – they want to work out, but there are a lot of obstacles that stand in their way. Peloton offers a solution that enables people to join indoor cycling and fitness classes anywhere, anytime.
Peloton has created a cutting-edge indoor bike that streams up to 14 hours of live classes daily and has over 4,000 on-demand classes. Users can access live classes from world-class instructors from the convenience of their home or gym. The bike tracks progress with in-depth ride metrics and allows people to compete in real-time with other users who have taken a specific ride. The live classes even feature top DJs that play current playlists to keep users motivated.
With an aggressive marketing campaign that has included high-visibility TV advertising, Peloton made the decision to run its entire platform in the cloud. Most recently, they ran an ad during an NFL playoff game, and the request rate to their site jumped from roughly 2,000 to 32,200 requests per minute within 60 seconds. As they continue to grow and diversify, they are utilizing services such as Amazon S3 for thousands of hours of archived on-demand video content, Amazon Redshift for data warehousing, and Application Load Balancer for intelligent request routing.
Tendril (Denver, CO) Tendril was founded in 2004 with the goal of helping homeowners better manage and reduce their energy consumption. Today, electric and gas utilities use Tendril’s data analytics platform on more than 140 million homes to deliver a personalized energy experience for consumers around the world. Using the latest technology in decision science and analytics, Tendril can gain access to real-time, ever evolving data about energy consumers and their homes so they can improve customer acquisition, increase engagement, and orchestrate home energy experiences. In turn, Tendril helps its customers unlock the true value of energy interactions.
AWS helps Tendril run its services globally, while scaling capacity up and down as needed, and in real-time. This has been especially important in support of Tendril’s newest solution, Orchestrated Energy, a continuous demand management platform that calculates a home’s thermal mass, predicts consumer behavior, and integrates with smart thermostats and other connected home devices. This solution allows millions of consumers to create a personalized energy plan for their home based on their individual needs.
With the explosion of device types used to access the Internet, each with different capabilities, screen sizes, and resolutions, developers must often provide images in an array of sizes to ensure a great user experience. This can become complex to manage and can drive up costs.
Images stored using Amazon S3 are often processed into multiple sizes to fit within the design constraints of a website or mobile application. It’s a common approach to use S3 event notifications and AWS Lambda for eager processing of images when a new object is created in a bucket.
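For reference, the eager approach usually looks something like the following sketch, where the Lambda function is subscribed to s3:ObjectCreated events; the target dimensions are only examples, and the Pillow library is assumed to be packaged with the function:

import io
import urllib.parse

import boto3
from PIL import Image

s3 = boto3.client("s3")
TARGET_SIZES = [(100, 100), (300, 300), (1024, 768)]   # example dimensions only

def resize_image(original_bytes, width, height):
    # thumbnail() resizes in place while preserving the aspect ratio.
    image = Image.open(io.BytesIO(original_bytes))
    image.thumbnail((width, height))
    out = io.BytesIO()
    image.save(out, format=image.format or "JPEG")
    return out.getvalue()

def handler(event, context):
    # Each record describes one newly created object in the bucket.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        original = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        for width, height in TARGET_SIZES:
            s3.put_object(
                Bucket=bucket,
                Key=f"{width}x{height}/{key}",
                Body=resize_image(original, width, height),
            )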
In this post, I explore a different approach and outline a method of lazily generating images, in which a resized asset is only created if a user requests that specific size.
Resizing on the fly
Instead of processing and resizing images into all necessary sizes upon upload, the approach of processing images on the fly has several upsides:
Increased agility
Reduced storage costs
Resilience to failure
Increased agility
When you redesign your website or application, you can add new dimensions on the fly, rather than working to reprocess the entire archive of images that you have stored.
Running a batch process to resize all original images into new, resized dimensions can be time-consuming, costly, and error-prone. With the on-the-fly approach, a developer can instead specify a new set of dimensions and lazily generate new assets as customers use the new website or application.
Reduced storage costs
With eager image processing, the resized images must be stored indefinitely as the operation only happens one time. The approach of resizing on-demand means that developers do not need to store images that are not accessed by users.
Because a user request initiates resizing, this approach also unlocks options for optimizing the storage costs of resized image assets, such as S3 lifecycle rules that expire older images and can be tuned to an application’s specific access patterns. If a user attempts to access a resized image that has been removed by a lifecycle rule, the API resizes it on demand to fulfill the request.
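For example, such a rule can be applied with boto3; this is a sketch only, and the prefix and retention period are assumptions you would tune to your own access patterns:

import boto3

s3 = boto3.client("s3")

# Expire resized assets 30 days after creation; objects under other prefixes are untouched.
s3.put_bucket_lifecycle_configuration(
    Bucket="YOUR_BUCKET_NAME_HERE",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-resized-images",
                "Filter": {"Prefix": "resized/"},
                "Status": "Enabled",
                "Expiration": {"Days": 30},
            }
        ]
    },
)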
Resilience to failure
“Design for failure and nothing will fail.” When building distributed services, developers should be pessimistic and assume that failures will occur.
If image processing is designed to occur only once, upon object creation, an intermittent failure in that process, or any loss of the processed images, could cause continual failures for future users. When resizing images on demand, each request initiates processing if a resized image is not found, meaning that future requests can recover from a previous failure automatically.
Architecture overview
Here’s the process:
A user requests a resized asset from an S3 bucket through its static website hosting endpoint. The bucket has a routing rule configured to redirect to the resize API any request for an object that cannot be found (a sketch of this rule appears after the list).
Because the resized asset does not exist in the bucket, the request is temporarily redirected to the resize API method.
The user’s browser follows the redirect and requests the resize operation via API Gateway.
The API Gateway method is configured to trigger a Lambda function to serve the request.
The Lambda function downloads the original image from the S3 bucket, resizes it, and uploads the resized image back into the bucket as the originally requested key.
When the Lambda function completes, API Gateway permanently redirects the user to the file stored in S3.
The user’s browser requests the now-available resized image from the S3 bucket. Subsequent requests from this and other users are served directly from S3 and bypass the resize operation. If the resized image is deleted in the future, the process repeats and the resized image is re-created and placed back into the S3 bucket.
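To make step 1 concrete, here is one way to configure that routing rule with boto3; the API Gateway host name, stage, and query-string convention are placeholders that depend on how you set up the resize API:

import boto3

s3 = boto3.client("s3")

# Serve the bucket as a static website, and redirect any request that returns
# a 404 to the resize API, carrying the originally requested key along.
s3.put_bucket_website(
    Bucket="YOUR_BUCKET_NAME_HERE",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "RoutingRules": [
            {
                "Condition": {"HttpErrorCodeReturnedEquals": "404"},
                "Redirect": {
                    "Protocol": "https",
                    "HostName": "YOUR_API_ID.execute-api.us-east-1.amazonaws.com",
                    "HttpRedirectCode": "307",
                    "ReplaceKeyPrefixWith": "prod/resize?key=",
                },
            }
        ],
    },
)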
To configure your function, for Environment variables, add two variables (a sketch of how the function uses them follows this list):
For Key, enter BUCKET; for Value, enter the bucket name that you created above.
For Key, enter URL; for Value, enter the endpoint field that you noted above, prefixed with http://.
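To show how the function might use these two variables, here is a minimal Python sketch of the resize handler, assuming the Pillow library is packaged with the function and that requested keys look like 300x300/image.jpg; the serverless-image-resizing repository mentioned later contains a complete, tested implementation.

import io
import os
import re

import boto3
from PIL import Image

s3 = boto3.client("s3")
BUCKET = os.environ["BUCKET"]
URL = os.environ["URL"]

def handler(event, context):
    key = event["queryStringParameters"]["key"]   # e.g. "300x300/blue-marble.jpg"
    match = re.match(r"(\d+)x(\d+)/(.+)", key)
    if not match:
        return {"statusCode": 400, "body": "Invalid key format"}
    width, height, original_key = int(match.group(1)), int(match.group(2)), match.group(3)

    # Fetch the original, resize it, and store the result under the requested key.
    obj = s3.get_object(Bucket=BUCKET, Key=original_key)
    image = Image.open(io.BytesIO(obj["Body"].read()))
    image.thumbnail((width, height))   # resizes in place, preserving aspect ratio
    buffer = io.BytesIO()
    image.save(buffer, format=image.format or "JPEG")
    s3.put_object(Bucket=BUCKET, Key=key, Body=buffer.getvalue(),
                  ContentType=obj.get("ContentType", "image/jpeg"))

    # Redirect the browser back to the now-existing object in S3.
    return {"statusCode": 301, "headers": {"Location": f"{URL}/{key}"}, "body": ""}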
To define the execution role permissions for the function, for Role, choose Create a custom role. Choose View Policy Document, Edit, Ok.
Replace YOUR_BUCKET_NAME_HERE with the name of the bucket that you’ve created and copy the following code into the policy document. Note that any leading spaces in your policy may cause a validation error.
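A minimal policy along the following lines should work, assuming the function only needs to read and write objects in this bucket and to write its logs to CloudWatch Logs; treat it as a sketch to adapt rather than a definitive policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME_HERE/*"
    }
  ]
}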
Upload a test image into your bucket for testing. The blue marble is a great sample image because it is large and square. Once uploaded, try to retrieve resized versions of the image through your bucket’s static website hosting endpoint, using a key that encodes the requested dimensions (for example, a path such as 300x300/blue-marble.jpg, assuming that is the naming convention your resize function expects).
You should see a smaller version of the test photo. If not, choose Monitoring in your Lambda function and check CloudWatch Logs for troubleshooting. You can also refer to the serverless-image-resizing GitHub repo for a working example that you can deploy to your account.
Conclusion
The solution I’ve outlined is a simplified example of how to implement this functionality. For example, a real-world implementation would likely include a list of permitted sizes to prevent a requester from filling your bucket with arbitrarily sized images. Further cost optimizations could be employed, such as using S3 lifecycle rules on a bucket dedicated to resized images to expire them after a given amount of time.
This approach allows you to lazily generate resized images while taking advantage of serverless architecture. This means you have no operating systems to manage, secure, or patch; no servers to right-size, monitor, or scale; no risk of over-spending by over-provisioning; and no risk of delivering a poor user experience due to poor performance by under-provisioning.
If you have any questions, feedback, or suggestions about this approach, please let us know in the comments!