Tag Archives: Stream

Add Watermarks to your Cloudflare Stream Video Uploads

Post Syndicated from Rachel Chen original https://blog.cloudflare.com/add-watermarks-to-your-cloudflare-stream-video-uploads/

Add Watermarks to your Cloudflare Stream Video Uploads

Add Watermarks to your Cloudflare Stream Video Uploads

Since the launch of Cloudflare Stream, our customers have been asking for a programmatic way to add watermarks to their videos. We built the Watermarks API to support a wide range of use cases: from customers who simply want to tell Stream “can you put this watermark image to the top right of my video?” to customers with more detailed asks such as “can you put this watermark image in a way it doesn’t take up more than 10% of the original video and with 20% opacity?” All that and more is now available at no additional cost through the Watermarks API.

What is Cloudflare Stream?

Cloudflare Stream provides out-of-the-box video infrastructure so developers can bring their app ideas to market faster. While building a video streaming app, developers must ask themselves questions like

  • Where do we store the videos affordably?
  • How do we encode the videos to support users with varying Internet speeds?
  • How do we maintain our video pipeline in the long term?”

Cloudflare Stream is a single product that handles video encoding, storage, delivery and presentation (with the Stream Player.) Stream lets developers launch their ideas faster while having the confidence the video infrastructure will scale with their app’s growth.

How the Watermark API works

The Watermark API lets you add a watermark to a video at the time of uploading. It consists of two new features to the Stream API:

  • A new /stream/watermarks endpoint that lets you create watermark profiles and returns a uid, a unique identifier for each watermark profile
  • Support for a watermark object containing the uid of the watermark profile that can be passed at the time of upload

Step 1: Creating a Watermark Profile

A watermark profile describes the nature of the watermark, including the image to use as a watermark and properties such as its positioning, padding and scale.

Add Watermarks to your Cloudflare Stream Video Uploads

In this example, we are going to create a watermark profile that places the Cloudflare logo to the lower left of the video:

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/stream/watermarks \
  --header 'content-type: application/json' \
  --header 'x-auth-email: $CLOUDFLARE_EMAIL \
  --header 'x-auth-key: $CLOUDFLARE_KEY \
  --data '{
  "url": "https://storage.googleapis.com/zaid-test/Watermarks%20Demo/cf-icon.png",
  "name": "Cloudflare Icon",
  "opacity": 0.5,
  "padding": 0.05,
  "scale": 0.1,
  "position": "lowerLeft"

The response contains information about the watermark profile, including a uid that we will use in the next step

  "result": {
    "uid": "a85d289c2e3f82701103620d16cd2408",
    "size": 9165,
    "height": 504,
    "width": 600,
    "created": "2020-09-03T20:43:56.337486Z",
    "downloadedFrom": "REDACTED_VIDEO_URL",
    "name": "Cloudflare Icon",
    "opacity": 0.5,
    "padding": 0.05,
    "scale": 0.1,
    "position": "lowerLeft"
  "success": true,
  "errors": [],
  "messages": []

Step 2: Apply the Watermark

We’ve created the watermark and are ready to use it. Below is a screengrab from the Built For This commercial. It contains no watermark:

Add Watermarks to your Cloudflare Stream Video Uploads

We are going to upload the commercial and request Stream to add the logo from the previous step as a watermark:

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/stream/copy \
  --header 'content-type: application/json' \
  --header 'x-auth-email: $EMAIL \
  --header 'x-auth-key: $AUTH_KEY' \
  --data '{
  "url": "https://storage.googleapis.com/zaid-test/Watermarks%20Demo/The%20Internet%20was%20BuiltForThis.mp4",
  "watermark": {
    "uid": "a85d289c2e3f82701103620d16cd2408"

Step 3: Your video, now with a watermark!

You’re done! You can watch the video with a watermark:

What’s next

Read the detailed Watermark API docs covering different use cases.

In future iterations, we plan to add support for animated watermarks. Additionally, we want to add Watermark support to the Stream Dashboard so you have a UI to manage and add watermarks.

Building Cloudflare TV from scratch

Post Syndicated from Oliver Yu original https://blog.cloudflare.com/building-cloudflare-tv-from-scratch/

Building Cloudflare TV from scratch

Building Cloudflare TV from scratch

Cloudflare TV is inspired by television shows of the 90s that shared the newest, most exciting developments in computing and music videos. We had three basic requirements for Cloudflare TV:

  1. Guest participation should be as simple as joining a video call
  2. There should be 24×7 programming. Something interesting should be playing all the time
  3. Everything should happen in the cloud and we should never have to ask anyone “to leave their computer on” to have the stream running 24 hours a day
Building Cloudflare TV from scratch

We didn’t set out to build Cloudflare TV from scratch

Building a lot of the technology behind Cloudflare TV from scratch was not part of the plan, especially given our aggressive timeline. So why did we decide to pursue it? After evaluating multiple live streaming solutions, we reached the following conclusion:

  • 24×7 linear streaming is not something that is a priority for most video streaming platforms. This makes sense: the rise of video-on-demand and event-based live streaming has come at the expense of linear streaming.
  • Most broadcasting platforms have their own guest apps which must be downloaded and set up in advance. This introduces unnecessary friction compared to clicking a link in the calendar invite to join a video call.

“Wait! Can we just use Zoom + Cloudflare?”

When we discovered that Zoom lets you push live video to any RTMP end point, we started experimenting with the feature.

“RTMP” stands for Real-Time Messaging Protocol and was originally developed to facilitate low-latency communication using TCP via Macromedia Flash. RTMP has outlived Flash and is widely used by platforms, including YouTube, to enable live video streaming. RTMP is a push protocol and platforms like YouTube provide RTMP endpoints which are simply URLs. Most video broadcast apps will let you configure multiple RTMP endpoints, which tells the app “hey send my live video feed from my phone or computer to these services.” If you find yourself watching a live video that is being broadcasted on multiple services, it is very likely made possible by RTMP.

Building Cloudflare TV from scratch

Zoom lets you provide RTMP endpoints and instruct it to send the live video feed of Zoom calls to, in our case, Cloudflare TV’s RTMP. Before we could use this feature, we needed to be able to ingest RTMP video feeds.

First, we set up an NGINX server with the RTMP module:

apt-get install build-essential libpcre3 libpcre3-dev libssl-dev git zlib1g-dev -y
mkdir ~/build && cd ~/build
git clone git://github.com/arut/nginx-rtmp-module.git
wget http://nginx.org/download/nginx-1.14.1.tar.gz
tar xzf nginx-1.14.1.tar.gz
cd nginx-1.14.1
sudo ./configure --with-http_ssl_module --add-module=../nginx-rtmp-module
sudo make
sudo make install

Next, we configured nginx.conf so NGINX can not only ingest the RTMP feed, but also make it streamable to the end user. A browser typically can’t stream from an RTMP source. We need NGINX to take the RTMP feed and create HLS/DASH segments.

We defined an application called live inside nginx.conf. Within the live application, we can add directives to ingest RTMP and output HLS:

rtmp {
    server {
        application live {
            allow play all;
            live on;

            # sample HLS
            hls on;
            hls_path /mnt/hls/;
            hls_fragment 1;
            hls_playlist_length 4;
            hls_sync 100ms;

Once we had NGINX set up to ingest RTMP and HLS, we followed Zoom’s instructions on Custom Live Streaming. And soon enough, we had a basic prototype of live streaming Zoom calls using the Cloudflare network!

Transitions without interruption

So we met our number one requirement of making the guest experience as easy as joining a video call. But Cloudflare TV isn’t going to be one never-ending call. We needed a way to smoothly transition between multiple calls over the course of the day, and to replay some of our favorite segments.

For example, we may have live programming from 1000 to 1100 followed by two hours of pre-recorded (or replayed) content. When the live programming ends at 1100, the video experience would break and the user would need to hit refresh to see the next show on the schedule.

So how do we fix this? We determined we needed the following:

  1. Ability to set the programming (the “what plays when?”) many days in advance
  2. Have “virtual rooms” ingesting video from different sources (live events, pre-recorded videos stored using our Cloudflare Stream product)

Once we have a schedule and “virtual rooms”, we can dynamically switch what is currently playing on-air to the appropriate “virtual room” streaming the content.

To implement this, we used Contentful, Workers, and Brave (an open-source video editor).

Building Cloudflare TV from scratch


Building Cloudflare TV from scratch

Brave is an open-source project started by the BBC. Using Brave, we were able to set up multiple virtual rooms and smoothly make any virtual room go on-air.

Under the hood, Brave is doing two key things:

  1. pulling multiple video feeds from various sources and placing them in virtual rooms
  2. pushing the final (“on air feed”) to NGINX every second of the day


Contentful is a headless content management platform designed to be API-first; it eliminated the need for a database and helped us build our scheduling feature rapidly.

Most of the necessary fields are pretty straightforward for a CMS: title, presenters, and, of course, the time slot. Each of these is automatically synced with the publicly-facing schedule at cloudflare.tv/schedule.

We are able to use Workers to fetch events from Contentful:

export async function fetchEventRaw(id: string) {
  let r = await fetch(`${CONTENTFUL_API}/entries/${id}`, {
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${CONTENTFUL_ADMIN}`,
  return unwrap(r, 'Failed to retrieve event')

The more complex piece was integrating this with Zoom. Each segment needs its own Zoom meeting, and it’d be pretty arduous to create these manually. So when we publish in Contentful, Contentful makes a call to a Worker endpoint. The Worker endpoint automatically generates a Zoom meeting — and provides the Programming Team with the customized invite to send to the guest.

For example, when a new event is added to Contentful, Contentful notifies our Worker endpoint which creates a new meeting and configures it so it is being pushed to Cloudflare TV:

export async function createMeeting(ev: TVEvent) {
  const headers = await zoomHeaders()

  const alternative_hosts = ev.altHosts ? ev.altHosts.join(',') : ''

  ev.zoomPassword = genPassword()

  let r = await fetch(`https://api.zoom.us/v2/users/${ev.studio}@cloudflare.com/meetings`, {
    method: 'POST',
    body: JSON.stringify({
      topic: ev.title,
      type: 2,
      start_time: ev.start,
      duration: ev.duration,
      timezone: 'UTC',
      agenda: ev.description,
      password: ev.zoomPassword,
      settings: {
        host_video: true,
        participant_video: false,
        cn_meeting: false,
        in_meeting: false,
        join_before_host: true,
        mute_upon_entry: true,
        watermark: false,
        use_pmi: false,
        approval_type: 2,
        audio: 'both',
        auto_recording: 'cloud',
        enforce_login: false,
  let data = await unwrap(r, 'Failed to create ZOOM meeting')
  log('zoom: ', data)

  ev.meetingId = data.id
  ev.zoomUrl = data.join_url

  // push livestream configuration data to meeting
  r = await fetch(`https://api.zoom.us/v2/meetings/${ev.meetingId}/livestream`, {
    method: 'PATCH',
    body: JSON.stringify({
      //TODO: make configurable
      stream_url: CFTV_RTMP_ENDPOINT,
      stream_key: ev.studio,
      page_url: 'https://cloudflare.tv',
  await unwrap(r, 'Failed to update LiveStream config')

  return ev

The other upside to using Contentful is that many members of our team already have familiarity with it, so it reduces the overhead of learning a new tool.


So far, we’ve described the different pieces of the backend (NGINX, Brave, Contentful) that make Cloudflare TV possible. How do we bring them all together? Cloudflare Workers serves as the glue that brings these systems together. The Cloudflare TV frontend is built on Worker Sites. The frontend calls our Worker endpoints to fetch data, such as the programming calendar.

Thinking Ahead…

We’re just getting started with Cloudflare TV. We have a long wish list of features we’d really like to see. Here are some of the features we can’t wait to work on:

  • Improve the viewing experience by adding closed-caption support
  • Enable our viewers to call in and ask questions and contribute to the conversation
  • Bring Cloudflare TV to platforms like Apple TV and Roku

Making Video Intuitive: An Explainer

Post Syndicated from Kyle Boutette original https://blog.cloudflare.com/making-video-intuitive-an-explainer/

Making Video Intuitive: An Explainer

On the Stream team at Cloudflare, we work to provide a great viewing experience while keeping our service affordable. That involves a lot of small tweaks to our video pipeline that can be difficult to discern by most people. And that makes the results of those tweaks less intuitive.

In this post, let’s have some fun. Instead of fine-grained optimization work, we’ll do the opposite. Today we’ll make it easy to see changes between different versions of a video: we’ll start with a high-quality video and ruin it. Instead of aiming for perfection, let’s see the impact of various video coding settings. We’ll go on a deep dive on how to make some victim video look gloriously bad and learn on the way.

Everyone agrees that video on the Internet should look good, start playing fast, and never rebuffer regardless of the device they’re on. People can prefer one version of a video over another and say it looks better. Most people, though, would have difficulty elaborating on what ‘better’ means. That’s not an issue when you’re just consuming video. However, when you’re storing, encoding, and distributing it, how that video looks determines how happy your viewers are.

To determine what looks better, video engineers can use a variety of techniques. The most accessible is the most obvious: compare two versions of a video by having people look at them—a subjective comparison. We’ll apply eyeballs here.

So, who’s our sacrificial video? We’re going to use a classic video for the demonstration here—perhaps too classic for people that work with video—Big Buck Bunny. This is an open-source film by Sacha Goedegebure available under the permissive Creative Commons Attribution 3.0 license. We’re only going to work with 17 seconds of it to save some time. This is what the video looks like when downloaded from https://peach.blender.org/download/. Take a moment to savor the quality since we’re only getting worse from here.

For brevity, we’ll evaluate our results by two properties: smooth motion and looking ‘crisp’. The video shouldn’t stutter and its important features should be distinguishable.

It’s worth mentioning that video is a hack of your brain. Every video is just an optimized series of pictures— a very sophisticated flipbook. Display those pictures quickly enough and you can fool the brain into interpreting motion. If you show enough points of light close together, they meld into a continuous image. Then, change the color of those lights frequently enough and you end up with smooth motion.

Making Video Intuitive: An Explainer

Frame rate

Not stuttering is covered by framerate, measured in frames-per-second (fps). fps is the number of individual pictures displayed in a single second; many videos are encoded at somewhere between 24 and 30fps. One way to describe fps is in terms of how long a frame is shown for—commonly called the frame time. At 24fps, each frame is shown for about 41 milliseconds. At 2fps, that jumps to 500ms. Lowering fps causes frames to trend rapidly towards persisting for the full second. Smooth motion mostly comes down to the single knob of fps. Mucking about with framerate isn’t a sporting way to achieve our goal. It’s extremely easy to tank the framerate and ruin the experience. Humans have a low tolerance for janky motion. To get the idea, here’s what our original clip reduced to 2fps looks like; 500ms per-frame is a long time.

ffmpeg -v info -y -hide_banner -i source.mp4 -r 2 -c:v h264 -c:a copy 2fps.mp4


Making tiny features distinguishable has many more knobs. Choices you can make include what codec, level, profile, bitrate, resolution, color space, or keyframe frequency, to name a few. Each of these also influences factors apart from perceived quality, such as how large the resulting file is plus what devices it is compatible with. There’s no universal right answer for what parameters to encode a video with. For the best experience while not wasting resources, the same video intended for a modern 4k display should be tailored differently for a 2007 iPod Nano. We’ll spend our time here focusing on what impacts a video’s crispness since that’s what largely determines the experience.

We’re going to use FFmpeg to make this happen. This is the sonic screwdriver of the video world; a near-universal command-line tool for converting and manipulating media. FFmpeg is almost two decades old, has hundreds of contributors, and can do essentially any digital video-related task. Its flexibility also makes it rather complex to work with. For each version of the video, we’ll show the command used to generate it as we go.

Let’s figure out exactly what we want to change about the video to make it a bad experience.

Making Video Intuitive: An Explainer

You may have heard about resolution and bitrate. To explain them, let’s use an analogy. Resolution provides pixels. Pixels are buckets for information. Bitrate is the information that fills those buckets. How full a given bucket is determines how well a pixel can represent content. With too few bits of information for a bucket, the pixel will get less and less accurate to the original source. In practice, their numerical relationship is complicated. These are what we’ll be varying.

The decision of which bucket should get how many bits of information is determined by software called a video encoder. The job of the encoder is to use the bits budgeted for it as efficiently as possible to display the best quality video. We’ll be changing the bitrate budget to influence the resulting bitrate. Like people with money, budgeting is a good idea for our encoder. Uncompressed video can use a byte, or more, per-pixel for each of the red, green, and blue(RGB) channels. For a 1080p video, that means 1920×1080 pixels multiplied by 3 bytes to get 6.2MB per frame. We’ll talk about frames later but 6.2 MB is a lot— at this rate, a DVD disc would only fit about 50 seconds of video.

With our variables chosen, we’re good to go. For every variation we encode, we’ll show a comparison to this table. Our source video is encoded in H.264 at 24fps with a variety of other settings, those features will not change. Expect these numbers to get significantly smaller as we poke around to see what changes.

Resolution Bitrate File Size
Source 1280×720 7.5Mbps 16MB

To start, let’s change just resolution and see what impact that has. The lowest resolution most people are exposed to is usually 140p, so let’s reencode our source video targeting that. Since many video platforms have this as an option, we’re not expecting an unwatchable experience quite yet.

ffmpeg -v info -y -hide_banner -i source.mp4 -vf scale=-2:140 -c:v h264 -b:v 6000k -c:a copy scaled-140.mp4

Resolution Bitrate File Size
Source 1280×720 7.5Mbps 16MB
Scaled to 140p 248×140 2.9Mbps 6.1MB

By the numbers, we find some curious results. We didn’t ask for a different bitrate from the source but our encoder gave us one that is roughly a third. Given that the number of pixels was dramatically reduced, the encoder had fewer buckets to put the information in our bitrate. Despite its best attempt at using the entire bitrate budget provided to it, our encoder filled all the buckets we provided. What did it do with the leftover information? Since it isn’t in the video, it tossed it.

This would probably be an acceptable experience on a 4in phone screen. You wouldn’t notice the sort-of grainy result on a small display. On a 40in TV, it’d be blocky and unpleasant. At 40in, 140 rows of pixels become individually distinguishable which doesn’t fool the brain and ruins the magic.


Bitrate is the density of information for a given period of time, typically a second. This interacts with framerate to give us a per frame bitrate budget. Our source having a bitrate of 7.5Mbps (millions of bits-per-second) and framerate of 24fps means we have an average of 7500Kbps / 24fps = 312.5Kb of information per frame.

Different kinds of frames

Making Video Intuitive: An Explainer

There are different ways a frame can be encoded. It doesn’t make sense to use the same technique for a sequence of frames of a single color and most of the sequences in Big Buck Bunny. There’s differing information density and distribution between those sequences. Different ways of representing frames take advantage of those differing patterns. As a result, the 312Kb average for each frame is both lower than the size of the larger frames and greater than the size of the smallest frames. Some frames contain just changes relative to other frames – these are P or B frames – those could be far smaller than 312Kb. However, some frames contain full images – these are I frames – and tend to be far larger than 312Kb. Since we’re viewing the video holistically as multiple seconds, we don’t need to worry about them since we’re concerned with the overall experience. Knowing about frames is useful for their impact on bitrate for different types of content, which we’ll discuss later.

Our starting bitrate is extremely large and has more information than we actually need. Let’s be aggressive and cut it down to 1/75th while maintaining the source’s resolution.

ffmpeg -v info -y -hide_banner -i source.mp4 -c:v h264 -b:v 100k -c:a copy bitrate-100k.mp4

Resolution Bitrate File Size
Source 1280×720 7.5Mbps 16MB
Scaled to 140p 248×140 2.9Mbps 6.1MB
Targeted to 100Kbps 1280×720 102Kbps 217KB

When you take a look at the video, fur and grass become blobs. There’s just not enough information to accurately represent the fine details.

Making Video Intuitive: An Explainer
Source Video
Making Video Intuitive: An Explainer
100 Kbps budget

We provided a bitrate budget of 100Kbps but the encoder doesn’t seem to have quite hit it. When we changed the resolution, we had a lower bitrate than we asked for, here we have a higher bitrate. Why would that be the case?

We have so many buckets that there’s some minimum amount the encoder wants in each. Since it can play with the bitrate, it ends up favoring slightly more full buckets since that’s easier. This is somewhat the reverse of why our previous experiment had a lower bitrate than expected.

We can influence how the encoder budgets bitrate using rate control modes. We’re going to stick with the default ‘Average-Bitrate’ mode to keep things easy. This mode is sub-optimal since it lets the encoder spend a bunch of budget up front to its detriment later. However, it’s easy to reason about.

Resolution + Bitrate

Targeting a bitrate of 100Kbps got us an unpleasant video but not something completely unwatchable. We haven’t quite ruined our video yet. We might as well take bitrate down to an even further extreme of 20Kbps while keeping the resolution constant.

ffmpeg -v info -y -hide_banner -i source.mp4 -c:v h264 -b:v 20k -c:a copy bitrate-20k.mp4

Resolution Bitrate File Size
Source 1280×720 7.5Mbps 16MB
Scaled to 140p 248×140 2.9Mbps 6.1MB
Targeted to 100Kbps 1280×720 102Kbps 217KB
Targeted to 20Kbps 1280×720 35Kbps 81KB

Now, this is truly unwatchable! There’s sometimes color but the video mostly devolves into grayscale rectangles roughly approximating the silhouettes of what we’re expecting. At slightly less than a third the bitrate of the previous trial, this definitely looks like it has less than a third of the information.

As before, we didn’t hit our bitrate target and for the same reason that our pixel buckets were insufficiently filled with information. The encoder needed to start making hard decisions at some point between 102 and 35Kbps. Most of the color and the comprehensibility of the scene were sacrificed.

We’ll discuss why there’s moving grayscale rectangles and patches of color in a bit. They’re giving us a hint about how the encoder works under the hood.

What if we go just one step further and combine our tiny resolution with the absurdly low bitrate? That should be an even worse experience, right?

ffmpeg -v info -y -hide_banner -i source.mp4 -vf scale=-2:140 -c:v h264 -b:v 20k -c:a copy scaled-140_bitrate-20k.mp4

Resolution Bitrate File Size
Source 1280×720 7.5Mbps 16MB
Scaled to 140p 248×140 2.9Mbps 6.1MB
Targeted to 100Kbps 1280×720 102Kbps 217KB
Targeted to 20Kbps 1280×720 35Kbps 81KB
Scaled to 140p and Targeted to 20Kbps 248×140 19Kbps 48KB

Wait a minute, that’s actually not too bad at all. It’s almost like a tinier version of 1280 by 720 at 100Kbps. Why doesn’t this look terrible? Having a lower bitrate means there’s less information, which implies that the video should look worse. A lower resolution means the image should be less detailed. The numbers got smaller, so the video shouldn’t look better!

Thinking back to buckets and information, we now have less information but fewer discrete places for that information to live. This specific combination of low bitrate and low resolution means the buckets are nicely filled. The encoder exactly hit our target bitrate which is a reasonable indicator that it was at least somewhat satisfied with the final result.

This isn’t going to be a fun experience on a 4k display but it is fine enough for an iPod Nano from 2007. A 3rd generation iPod Nano has a 320×240 display spread across a 2in screen. Our 140p video will be nearly indistinguishable from a much higher quality video. Even more, 48KB for 17 seconds of video makes fantastic use of the limited storage – 4GB on some models. In a resource-constrained environment, this low video quality can be a large quality of experience improvement.

Making Video Intuitive: An Explainer
CC BY 2.0image by nez

We should have a decent intuition for the relationship between bitrate and resolution plus what the tradeoffs are. There’s a lingering question, though, do we need to make tradeoffs? There has to be some ratio of bitrate to pixel-count in order to get the best quality for a given resolution at a minimal file size.

In fact, there are such perfect ratios. In ruining the video, we ended up testing a few candidates of this ratio for our source video.

Resolution Bitrate File Size Bits/Pixel
Source 1280×720 7.5Mbps 16MB 8.10
Scaled to 140p 248×140 2.9Mbps 6.1MB 83.5
Targeted to 100Kbps 1280×720 102Kbps 217KB 0.11
Targeted to 20Kbps 1280×720 35Kbps 81KB 0.03
Scaled to 140p and Targeted to 20Kbps 248×140 19Kbps 48KB 0.55

However, there are some complications.

The biggest caveat is that the optimal ratio depends on your source video. Each video has a different amount of information required to be displayed. There are a couple of reasons for that.

If a frame has many details then it takes more information to represent. Frames in chronological order that visually differ significantly (think of an action movie) take more information than a set of visually similar frames (like a security camera outside a quiet warehouse). The former can’t use as many B or P frames which occupy less space. Animated content with flat colors require encoders to make fewer trade offs that cause visual degradation than live-action.

Thinking back to the settings that resulted in grayscale rectangles and patches of color, we can learn a bit more. We saw that the rectangles and color seem to move, as though the encoder was playing a shell game with tiny boxes of pictures.

What is happening is that the encoder is recognizing repeated patterns within and between frames. Then, it can reference those patterns to move them around without needing to actually duplicate them. The P and B frames mentioned earlier are mainly composed of these shifted patterns. This is similar, at least in spirit, to other compression algorithms that use dictionaries to refer to previous content. In most video codecs, the bits of picture that can be shifted are called ‘macroblocks’, which subdivide each frame with NxN squares of pixels. The less stingy the bitrate, the less obvious the macroblock shell game.

To see this effect more clearly, we can ask FFmpeg to show us decisions it makes. Specifically, it can show us what it decides is ‘motion’ moving the macroblocks. The video here is 140p for the motion vector arrows to be easier to see.

ffmpeg -v info -y -hide_banner -flags2 +export_mvs -i source.mp4 -vf scale=-2:140,codecview=mv=pf+bf+bb -c:v h264 -b:v 6000k -c:a copy motion-vector.mp4

Even worse is that flat color and noise might only be seen in two different scenes in the same video. That forces you to either waste your bitrate budget in one scene or look terrible in the other. We give the encoder a bitrate budget it can use. How it uses it is the result of a feedback loop during encoding.

Yet another caveat is that your resulting bitrate is influenced by all those knobs that were listed earlier, the most impactful being codec choice followed by bitrate budget. We explored the relationship between bitrate and resolution but every knob has an impact on the quality and a single knob frequently interacts with other knobs.

So far we’ve taken a look at some of the knobs and settings that affect visual quality in a video. Every day, video engineers and encoders make tough decisions to optimize for the human eye, while keeping file sizes at a minimum. Modern encoding schemes use techniques such as per title encoding to narrow down the best resolution-bitrate combinations. Those schemes look somewhat similar to what we’ve done here: test various settings and see what gives the desired result.

With every example, we’ve included an FFmpeg command you can use to replicate the output above and experiment with your own videos. We encourage you to try improving the video quality while reducing file sizes on your own and to find other levers that will help you on this journey!

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Post Syndicated from Zaid Farooqui original https://blog.cloudflare.com/remote-work-isnt-just-video-conferencing-how-we-built-cloudflaretv/

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

At Cloudflare, we produce all types of video content, ranging from recordings of our Weekly All-Hands to product demos. Being able to stream video on demand has two major advantages when compared to live video:

  1. It encourages asynchronous communication within the organization
  2. It extends the life time value of the shared knowledge

Historically, we haven’t had a central, secure repository of all video content that could be easily accessed from the browser. Various teams choose their own platform to share the content. If I wanted to find a recording of a product demo, for example, I’d need to search Google Drive, Gmail and Google Chat with creative keywords. Very often, I would need to reach out to individual teams to finally locate the content.

So we decided we wanted to build CloudflareTV, an internal Netflix-like application that can only be accessed by Cloudflare employees and has all of our videos neatly organized and immediately watchable from the browser.

We wanted to achieve the following when building CloudflareTV:

  • Security: make sure the videos are access controlled and not publicly accessible
  • Authentication: ensure the application can only be accessed by Cloudflare employees
  • Tagging: allow the videos to be categorized so they can be found easily
  • Originless: build the entire backend using Workers and Stream so we don’t need separate infrastructure for encoding, storage and delivery

Securing the videos using signed URLs

Every video uploaded to Cloudflare Stream can be locked down by requiring signed URLs. A Stream video can be marked as requiring signed URLs using the UI or by making an API call:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Once locked down in this way videos can’t be accessed directly. Instead, they can only be accessed using a temporary token.

In order to create signed tokens, we must first make an API call to create a key:

curl -X POST -H "X-Auth-Email: {$EMAIL}" -H "X-Auth-Key: {$AUTH_KEY}"  "https://api.cloudflare.com/client/v4/accounts/{$ACCOUNT_ID}/media/keys"

The API call will return a JSON object similar to this:

  "result": {
    "id": "...",
    "pem": "...",
    "jwk": "...",
    "created": "2020-03-10T18:17:00.075188052Z"
  "success": true,
  "errors": [],
  "messages": []

We can use the id and pem values in a Workers script that takes a video ID and returns a signed token that expires after 1 hour:

async function generateToken(video_id) {
var exp_time = Math.round((new Date()).getTime() / 1000)+3600;

    const key_data = {
        'id': '{$KEY_ID}',
        'pem': '{$PEM}',
        'exp': exp_time

    let response = await fetch('https://util.cloudflarestream.com/sign/'+video_id, {
        method: 'POST',
        body: JSON.stringify(key_data)
    let token_value = await response.text();
    return token_value;

The returned signed token should look something like this:


Stream provides an embed code for each video. The “src” attribute of the embed code typically contains the video ID. But if the video is private, instead of setting the “src” attribute to the video ID, you set it to the signed token value:

<stream src="eyJhbGciOiJSUzI1NiIsImtpZCI6IjExZDM5ZjEwY2M0NGY1NGE4ZDJlMjM5OGY3YWVlOGYzIn0.eyJzdWIiOiJiODdjOWYzOTkwYjE4ODI0ZTYzMTZlMThkOWYwY2I1ZiIsImtpZCI6IjExZDM5ZjEwY2M0NGY1NGE4ZDJlMjM5OGY3YWVlOGYzIiwiZXhwIjoiMTUzNzQ2MDM2NSIsIm5iZiI6IjE1Mzc0NTMxNjUifQ.C1BEveKi4XVeZk781K8eCGsMJrhbvj4RUB-FjybSm2xiQntFi7AqJHmj_ws591JguzOqM1q-Bz5e2dIEpllFf6JKK4DMK8S8B11Vf-bRmaIqXQ-QcpizJfewNxaBx9JdWRt8bR00DG_AaYPrMPWi9eH3w8Oim6AhfBiIAudU6qeyUXRKiolyXDle0jaP9bjsKQpqJ10K5oPWbCJ4Nf2QHBzl7Aasu6GK72hBsvPjdwTxdD5neazdxViMwqGKw6M8x_L2j2bj93X0xjiFTyHeVwyTJyj6jyPwdcOT5Bpuj6raS5Zq35qgvffXWAy_bfrWqXNHiQdSMOCNa8MsV8hljQsh" controls></stream>
<script data-cfasync="false" defer type="text/javascript" src="https://embed.videodelivery.net/embed/r4xu.fla9.latest.js"></script>

Tagging videos

We would like to categorize videos uploaded to Stream by tagging them. This can be done by updating the video object’s meta field and passing it arbitrary JSON data. To categorize a video, we simply update the meta field with a comma-delimited list of tags:

curl -X POST  -d '{"uid": "VIDEO_ID", "meta": {"tags": "All Hands,Stream"}}' "https://api.cloudflare.com/client/v4/accounts/{$ACCOUNT_ID}/stream/{$VIDEO_ID}"  -H "X-Auth-Email: {$EMAIL}"  -H "X-Auth-Key: {$ACCOUNT_KEY}"  -H "Content-Type: application/json"

Later, we will create a getVideos Worker function to fetch a list of videos and all associated data so we can render the UI. The tagging data we just set for this video will be included in the video data returned by the Worker.

Fetching Video Data using Workers

The heart of the UI is a list of videos. How do we get this list of videos programmatically? Stream provides an endpoint that returns all the videos and any metadata associated with them.

First, we set up environment variables for our Worker:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Next, we wrote a simple Workers function to call the Stream API and return a list of videos, eliminating the need for an origin:

async function getVideos() {
    const headers = {
        'X-Auth-Key': CF_KEY,
        'X-Auth-Email': CF_EMAIL

    let response = await fetch(“https://api.cloudflare.com/client/v4/accounts/” + CF_ACCOUNT_ID + '/stream', {
        headers: headers
    let video_list = await response.text();
    return video_list;

Lastly, we set up a zone and within the zone, we set up a Worker routes pointing to our Workers script. This can be done from the Workers tab:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

Authenticating using Cloudflare Access

Finally, we want to restrict access to CloudflareTV to people within the organization. We can do this using Cloudflare Access, available under the Access tab.

To restrict access to CloudflareTV, we must do two things:

  1. Add a new login method
  2. Add an access policy

To add a new login method, click the “+” icon and choose your identity provider. In our case, we chose Google:

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

You will see a pop up asking for information including Client ID and Client Secret, both key pieces of information required to set up Google as the identity provider.

Once we add an identity provider, we want to tell Access “who specifically should be allowed to access our application?” This is done by creating an Access Policy.

Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV
Remote Work Isn’t Just Video Conferencing: How We Built CloudflareTV

We set up an Access Policy to only allow emails ending in our domain name. This effectively makes CloudflareTV only accessible by our team!

What’s next?

If you have interesting ideas around video, Cloudflare Stream lets you focus on your idea while it handles storage, encoding and the viewing experience for your users. Coupled that with Access and Workers, you can build powerful applications. Here are the docs to help you get started:

Live video just got more live: Introducing Concurrent Streaming Acceleration

Post Syndicated from Jon Levine original https://blog.cloudflare.com/introducing-concurrent-streaming-acceleration/

Live video just got more live: Introducing Concurrent Streaming Acceleration

Live video just got more live: Introducing Concurrent Streaming Acceleration

Today we’re excited to introduce Concurrent Streaming Acceleration, a new technique for reducing the end-to-end latency of live video on the web when using Stream Delivery.

Let’s dig into live-streaming latency, why it’s important, and what folks have done to improve it.

How “live” is “live” video?

Live streaming makes up an increasing share of video on the web. Whether it’s a TV broadcast, a live game show, or an online classroom, users expect video to arrive quickly and smoothly. And the promise of “live” is that the user is seeing events as they happen. But just how close to “real-time” is “live” Internet video?

Delivering live video on the Internet is still hard and adds lots of latency:

  1. The content source records video and sends it to an encoding server;
  2. The origin server transforms this video into a format like DASH, HLS or CMAF that can be delivered to millions of devices efficiently;
  3. A CDN is typically used to deliver encoded video across the globe
  4. Client players decode the video and render it on the screen

Live video just got more live: Introducing Concurrent Streaming Acceleration

And all of this is under a time constraint — the whole process need to happen in a few seconds, or video experiences will suffer. We call the total delay between when the video was shot, and when it can be viewed on an end-user’s device, as “end-to-end latency” (think of it as the time from the camera lens to your phone’s screen).

Traditional segmented delivery

Video formats like DASH, HLS, and CMAF work by splitting video into small files, called “segments”. A typical segment duration is 6 seconds.

If a client player needs to wait for a whole 6s segment to be encoded, sent through a CDN, and then decoded, it can be a long wait! It takes even longer if you want the client to build up a buffer of segments to protect against any interruptions in delivery. A typical player buffer for HLS is 3 segments:

Live video just got more live: Introducing Concurrent Streaming Acceleration
Clients may have to buffer three 6-second chunks, introducing at least 18s of latency‌‌

When you consider encoding delays, it’s easy to see why live streaming latency on the Internet has typically been about 20-30 seconds. We can do better.

Reduced latency with chunked transfer encoding

A natural way to solve this problem is to enable client players to start playing the chunks while they’re downloading, or even while they’re still being created. Making this possible requires a clever bit of cooperation to encode and deliver the files in a particular way, known as “chunked encoding.” This involves splitting up segments into smaller, bite-sized pieces, or “chunks”. Chunked encoding can typically bring live latency down to 5 or 10 seconds.

Confusingly, the word “chunk” is overloaded to mean two different things:

  1. CMAF or HLS chunks, which are small pieces of a segment (typically 1s) that are aligned on key frames
  2. HTTP chunks, which are just a way of delivering any file over the web

Live video just got more live: Introducing Concurrent Streaming Acceleration
Chunked Encoding splits segments into shorter chunks

HTTP chunks are important because web clients have limited ability to process streams of data. Most clients can only work with data once they’ve received the full HTTP response, or at least a complete HTTP chunk. By using HTTP chunked transfer encoding, we enable video players to start parsing and decoding video sooner.

CMAF chunks are important so that decoders can actually play the bits that are in the HTTP chunks. Without encoding video in a careful way, decoders would have random bits of a video file that can’t be played.

CDNs can introduce additional buffering

Chunked encoding with HLS and CMAF is growing in use across the web today. Part of what makes this technique great is that HTTP chunked encoding is widely supported by CDNs – it’s been part of the HTTP spec for 20 years.

CDN support is critical because it allows low-latency live video to scale up and reach audiences of thousands or millions of concurrent viewers – something that’s currently very difficult to do with other, non-HTTP based protocols.

Unfortunately, even if you enable chunking to optimise delivery, your CDN may be working against you by buffering the entire segment. To understand why consider what happens when many people request a live segment at the same time:

Live video just got more live: Introducing Concurrent Streaming Acceleration

If the file is already in cache, great! CDNs do a great job at delivering cached files to huge audiences. But what happens when the segment isn’t in cache yet? Remember – this is the typical request pattern for live video!

Typically, CDNs are able to “stream on cache miss” from the origin. That looks something like this:

Live video just got more live: Introducing Concurrent Streaming Acceleration

But again – what happens when multiple people request the file at once? CDNs typically need to pull the entire file into cache before serving additional viewers:

Live video just got more live: Introducing Concurrent Streaming Acceleration
Only one viewer can stream video, while other clients wait for the segment to buffer at the CDN

This behavior is understandable. CDN data centers consist of many servers. To avoid overloading origins, these servers typically coordinate amongst themselves using a “cache lock” (mutex) that allows only one server to request a particular file from origin at a given time. A side effect of this is that while a file is being pulled into cache, it can’t be served to any user other than the first one that requested it. Unfortunately, this cache lock also defeats the purpose of using chunked encoding!

To recap thus far:

  • Chunked encoding splits up video segments into smaller pieces
  • This can reduce end-to-end latency by allowing chunks to be fetched and decoded by players, even while segments are being produced at the origin server
  • Some CDNs neutralize the benefits of chunked encoding by buffering entire files inside the CDN before they can be delivered to clients

Cloudflare’s solution: Concurrent Streaming Acceleration

As you may have guessed, we think we can do better. Put simply, we now have the ability to deliver un-cached files to multiple clients simultaneously while we pull the file once from the origin server.

Live video just got more live: Introducing Concurrent Streaming Acceleration

This sounds like a simple change, but there’s a lot of subtlety to do this safely. Under the hood, we’ve made deep changes to our caching infrastructure to remove the cache lock and enable multiple clients to be able to safely read from a single file while it’s still being written.

The best part is – all of Cloudflare now works this way! There’s no need to opt-in, or even make a config change to get the benefit.

We rolled this feature out a couple months ago and have been really pleased with the results so far. We measure success by the “cache lock wait time,” i.e. how long a request must wait for other requests – a direct component of Time To First Byte.  One OTT customer saw this metric drop from 1.5s at P99 to nearly 0, as expected:

Live video just got more live: Introducing Concurrent Streaming Acceleration

This directly translates into a 1.5-second improvement in end-to-end latency. Live video just got more live!


New techniques like chunked encoding have revolutionized live delivery, enabling publishers to deliver low-latency live video at scale. Concurrent Streaming Acceleration helps you unlock the power of this technique at your CDN, potentially shaving precious seconds of end-to-end latency.

If you’re interested in using Cloudflare for live video delivery, contact our enterprise sales team.

And if you’re interested in working on projects like this and helping us improve live video delivery for the entire Internet, join our engineering team!