All posts by Pat Patterson

Media Transcoding With Backblaze B2 and Vultr Optimized Cloud Compute

2022-03-17 Pat Patterson

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/media-transcoding-with-backblaze-b2-and-vultr-optimized-cloud-compute/

Since announcing the Backblaze + Vultr partnership last year, we’ve seen our mutual customers build a wide variety of applications combining Vultr’s Infrastructure Cloud with Backblaze B2 Cloud Storage, taking advantage of zero-cost data transfer between Vultr and Backblaze. This week, Vultr announced Optimized Cloud Compute instances, virtual machines pairing dedicated best-in-class AMD CPUs with just the right amount of RAM and NVMe SSDs.

To mark the occasion, I built a demonstration that both showcases this new capability and gives you an example application to adapt to your own use cases.

Imagine you’re creating the next big video sharing site—CatTube—a spin-off of Catblaze, your feline-friendly backup service. You’re planning all sorts of amazing features, but the core of the user experience is very familiar:

A user uploads a video from their mobile or desktop device.
The user’s video is available for viewing on a wide variety of devices, from anywhere in the world.

Let’s take a high-level look at how this might work…

Transcoding Explained: How Video Sharing Sites Make Videos Shareable

The user will upload their video to a web application from their browser or a mobile app. The web application must store the uploaded user videos in a highly scalable, highly available service—enter Backblaze B2 Cloud Storage. Our customers store, in the aggregate, petabytes of media data including video, audio, and still images.

But, those videos may be too large for efficient sharing and streaming. Today’s mobile devices can record video with stunning quality at 4K resolution, typically 3840 × 2160 pixels. While 4K video looks great, the issue is that even with compression, it’s a lot of data—about 1MB per second. Not all of your viewers will have that kind of bandwidth available, particularly if they’re on the move.

So, CatTube, in common with other popular video sharing sites, will need to convert raw uploaded video to one or more standard, lower-resolution formats, a process known as transcoding.

Transcoding is a very different workload from running a web application’s backend. Where an application server requires high I/O capability, but relatively little CPU power, transcoding is extremely CPU-intensive. You decide that you’ll need two sets of machines for CatTube—application servers and workers. The worker machines can be optimized for the transcoding task, taking advantage of the fastest available CPUs.

For these tasks, you need appropriate cloud compute instances. I’ll walk you through how I implemented CatTube as a very simple video sharing site with Backblaze B2 and Vultr’s Infrastructure Cloud using Vultr’s Cloud Compute instances for the application servers and their new Optimized Cloud Compute instances for the transcoding workers.

Building a Video Sharing Site With Backblaze B2 + Vultr

The video sharing example comprises a web application, written in Python using the Django web framework, and a worker application, also written in Python, but using the Flask framework.

Here’s how the pieces fit together:

The user uploads a video from their browser to the web app.
The web app uploads the raw video to a private bucket on Backblaze B2.
The web app sends a message to the worker instructing it to transcode the video.
The worker downloads the raw video to local storage and transcodes it, also creating a thumbnail image.
The worker uploads the transcoded video and thumbnail to Backblaze B2.
The worker sends a message to the web app with the addresses of the input and output files in Backblaze B2.
Viewers around the world can enjoy the video.

These steps are illustrated in the diagram below.

There’s a more detailed description in the Backblaze B2 Video Sharing Example GitHub repository, as well as all of the code for the web application and the worker. Feel free to fork the repository and use the code as a starting point for your own projects.

Here’s a short video of the system in action:

Some Caveats:

Note that this is very much a sample implementation. The web app and the worker communicate via HTTP—this works just fine for a demo, but doesn’t account for the worker being too busy to receive the message. Nor does it scale to multiple workers. In a production implementation, these issues would be addressed by the components communicating via an asynchronous messaging system such as Kafka. Similarly, this sample transcodes to a single target format: 720p. A real video sharing site would transcode the raw video to a range of formats and resolutions.

Want to Try It for Yourself?

Vultr’s new Cloud Compute Optimized instances are a perfect match for CPU-intensive tasks such as media transcoding. Zero-cost ingress and egress between Backblaze B2 and Vultr’s Infrastructure Cloud allow you to build high performance, scalable applications to satisfy a global audience. Sign up for Backblaze B2 and Vultr’s Infrastructure Cloud today, and get to work!

The post Media Transcoding With Backblaze B2 and Vultr Optimized Cloud Compute appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Building a Multiregion Origin Store With Backblaze B2 + Fastly Compute@Edge

2022-03-02 Pat Patterson

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/building-a-multiregion-origin-store-with-backblaze-b2-fastly-computeedge/

Backblaze B2 Cloud Storage customers have long leveraged our partner Fastly’s Deliver@Edge CDN as an essential component of a modern, scalable web architecture. Complementing Deliver@Edge, Compute@Edge is a serverless computing environment built on the same caching platform to provide a general-purpose compute layer between the cloud and end users. Today, we’re excited to celebrate Fastly’s announcement of its Compute@Edge partner ecosystem.

Serverless computing is quickly gaining popularity among developers for its simplicity, agility, and functionality. In the serverless model, cloud providers allocate resources to applications on demand, managing the compute infrastructure on behalf of their customers. The term, “serverless,” is a little misleading: The servers are actually still there, but customers don’t have to get involved in their provisioning, configuration, maintenance, or scaling.

Fastly’s Compute@Edge represents the next generation of serverless computing—purpose-built for better performance, reduced latency, and enhanced visibility and security. Using Fastly’s tools, a developer can create an edge application, test it locally, then with one command, deploy it to the Compute@Edge platform. When a request for that application reaches any of Fastly’s global network of edge servers, the application is launched and running in microseconds and can instantly scale to tens of thousands of requests per second.

It’s difficult to overstate the power and flexibility this puts in your hands as a developer—your application can be running on every edge server, with access to every attribute of its incoming requests, assembling responses in any way you choose. For an idea of the possibilities, check out the Compute@Edge demos, in particular, the implementation of the video game classic, “Doom.”

We don’t have space in a single blog post to explore an edge application of that magnitude, but read on for a simple example of how you can combine Fastly’s Compute@Edge with Backblaze B2 to improve your website’s user experience, directing requests to the optimal origin store end point based on the user’s location.

The Case for a Multiregion Origin Store

Although the CDN caches resources to improve performance, if a requested resource is not present in the edge server cache, it must be fetched from the origin store. When the edge server is close to the origin store, the increase in latency is minimal. If, on the other hand, the edge server is on a different continent from the origin store, it can take significantly longer to retrieve uncached content. In most cases, this additional delay is hardly noticeable, but for websites with many resources that are frequently updated, it can add up to a sluggish experience for users. A solution is for the origin store to maintain multiple copies of a website’s content, each at an end point in a different region. This approach can dramatically reduce the penalty for cache misses, improving the user experience.

There is a problem here, though: How do we ensure that a given CDN edge server directs requests to the “best” end point? The answer: build an application that uses the edge server’s location to select the end point. I’ll explain how I did just that, creating a Fastly Compute@Edge application to proxy requests to Backblaze B2 buckets.

Creating an Application on Fastly Compute@Edge

The Fastly Compute@Edge developer documentation did a great job of walking me through creating a Compute@Edge application. As part of the process, I had to choose a starter kit—a simple working application targeting a specific use case. The Static Content starter kit was the ideal basis for my application—it demonstrates many useful techniques, such as generating an AWS V4 Signature and manipulating the request’s Host HTTP header to match the origin store.

The core of the application is just a few lines written in the Rust programming language:

#[fastly::main] fn main(mut req: Request) -> Result<Response, Error> { // 1. Where is the application running? let pop = get_pop(&req);


 // 2. Choose the origin based on the edge server (pop) -

// default to US if there is no match on the pop

let origin = POP_ORIGIN.get(pop.as_str()).unwrap_or(&US_ORIGIN);
 // 3. Remove the query string to improve cache hit ratio

req.remove_query();
 // 4. Set the `Host` header to the bucket name + host rather than

// our Compute@Edge endpoint

let host = format!("{}.{}", origin.bucket_name, origin.endpoint);

req.set_header(header::HOST, &host);
 // 5. Copy the modified client request to form the backend request

let mut bereq = req.clone_without_body();
 // 6. Set the AWS V4 authentication headers

set_authentication_headers(&mut bereq, &origin);
 // 7. Send the request to the backend and assign its response to `beresp`

let mut beresp = bereq.send(origin.backend_name)?;
 // 8. Set a response header indicating the origin that we used

beresp.set_header("X-B2-Host", &host);

// 9. Return the response to the client return Ok(beresp); }

In step one, the get_pop function returns the three-letter abbreviation for the edge server, or point of presence (POP). For the purposes of testing, you can specify a POP as a query parameter in your HTTP request. For example, https://three.interesting.words.edgecompute.app/image.png?pop=AMS will simulate the application running on the Amsterdam POP. Next, in step two, the application looks up the POP in a mapping of POPs to Backblaze B2 end points. There are about a hundred Fastly POPs spread around the world; I simply took the list generated by running the Fastly command-line tool with the POPs argument, and assigned POPs to Backblaze B2 end points based on their location:

POPs in North America, South America, and Asia/Pacific map to the U.S. end point.
POPs in Europe and Africa map to the EU end point.

I won’t step through the rest of the logic in detail here—the comments in the code sample above cover the basics; feel free to examine the code in detail on GitHub if you’d like a closer look.

Serve Your Own Data From Multiple Backblaze B2 Regions

As you can see in the screenshot above, Fastly has implemented a Deploy to Fastly button. You can use this to create your own copy of the Backblaze B2 Compute@Edge demo application in just a couple of minutes. You’ll need to gather a few prerequisites before you start:

You must create Backblaze B2 accounts in both the U.S. and EU regions. If you have an existing account and you’re not sure which region it’s in, just take a look at the end point for one of your buckets. For example, this bucket is in the U.S. West region:

To create your second account, go to the Sign Up page, and click the Region drop-down on the right under the big, red Sign Up button:

Pick the region in which you don’t already have an account, and enter an email and password. Remember, your new account comes with 10GB of storage, free of charge, so there’s no need to enter your credit card details.

Note: You’ll need to use a different email address from your existing account. If you don’t have a second email address, you can use the plus trick (officially known as sub-addressing) and reuse an existing email address. For example, if you used [email protected] for your existing B2 Cloud Storage account in the U.S. region, you can use [email protected] for your new EU account. Mail will be routed to the same inbox, and Backblaze B2 will be satisfied that it’s a different email address. This technique isn’t limited to Gmail, by the way, it works with many email providers.
Create a private bucket in each account, and use your tool of choice to copy the same data into each of them. Make a note of the end point for each bucket.
Create an application key with read access to each bucket.
Sign up for a free Fastly account if you don’t already have one. Right now, this includes free credits for Compute@Edge.
Sign up for a free GitHub account.
Go to the Backblaze B2/Fastly Compute@Edge Demo GitHub repository, click the Deploy to Fastly button, and follow the prompts. The repository will be forked to your GitHub account and then deployed to Fastly.
Important: There is one post-deploy step you must complete before your application will work! In your new GitHub repository, navigate to src/config.rs and hit the pencil icon near the top right to edit the file. Change the origin configuration in lines 18-31 to match your buckets and their end points. Alternatively, you can, of course, clone the repository to your local machine, edit it there, and push the changes back to GitHub.

Once you have your accounts and buckets created, it takes just a few minutes to deploy the application. Watch me walk through the process:

What Can You Do With Fastly’s Compute@Edge and Backblaze B2?

My simple demo application only scratches the surfaces of Compute@Edge. How could you combine Fastly’s edge computing platform with Backblaze B2 to create a new capability for your website? Check out Fastly’s collection of over 100 Compute@Edge code samples for inspiration. If you come up with something neat and share it on GitHub, let me know in the comments and I’ll round up a bundle of Backblaze-branded goodies, just for you!

The post Building a Multiregion Origin Store With Backblaze B2 + Fastly Compute@Edge appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Explore the Backblaze S3 Compatible API With Our New Postman Collection

2022-02-24 Pat Patterson

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/explore-the-backblaze-s3-compatible-api-with-our-new-postman-collection/

Postman is a platform for building and using APIs. API providers such as Backblaze can use Postman to build API documentation and provide a live environment for developers to experiment with those APIs. Today, you can interact with Backblaze B2 Cloud Storage via our new Postman Collection for the Backblaze S3 Compatible API.

Using the Backblaze S3 Compatible API

The Backblaze S3 Compatible API implements the most commonly used S3 operations, allowing applications to integrate with Backblaze B2 in exactly the same way they do with Amazon S3. Many of our Alliance Partners have used the S3 Compatible API in integrating their products and services with Backblaze B2. Often, integration is as simple as allowing the user to specify a custom endpoint, for example, https://s3.us-west-001.backblazeb2.com, alongside their API credentials in the S3 settings, and verifying that the application works as expected with Backblaze B2.

The Backblaze B2 Native API, introduced alongside Backblaze B2 back in 2015, provides a low-level interface to B2 Cloud Storage. We generally recommend that developers use the S3 Compatible API when writing new applications and integrations, as it is supported by a wider range of SDKs and libraries, and many developers already have experience with Amazon S3. You can use the Backblaze B2 web console or the B2 Native API to access functionality, such as application key management and lifecycle rules, that is not covered by the S3 Compatible API.

Our post on the B2 Native and S3 Compatible APIs provides a more detailed comparison.

Most applications and scripts use one of the AWS SDKs or the S3 commands in the AWS CLI to access Backblaze B2. All of the SDKs, and the CLI, allow you to override the default Amazon S3 endpoint in favor of Backblaze B2. Sometimes, though, you might want to interact directly with Backblaze B2 via the S3 Compatible API, perhaps in debugging an issue, or just to better understand how the service works.

Exploring the Backblaze S3 Compatible API in Postman

Our new Backblaze S3 Compatible API Documentation page is the definitive reference for developers wishing to access Backblaze B2 directly via the S3 Compatible API.

In addition to reading the documentation, you can click the Run in Postman button on the top right of the page, log in to the Postman website or desktop app (creating a Postman account is free), and interact with the API.

Integrate With Backblaze B2

Whether you are backing up, archiving data, or serving content via the web, Backblaze B2 is an easy to use and, at a quarter of the cost of Amazon S3, cost-effective cloud object storage solution. If you’re not already using Backblaze B2, sign up now and try it out—your first 10GB of storage is free!

The post Explore the Backblaze S3 Compatible API With Our New Postman Collection appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Free Image Hosting With Cloudflare Transform Rules and Backblaze B2

2022-02-08 Pat Patterson

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/free-image-hosting-with-cloudflare-transform-rules-and-backblaze-b2/

Before I dive into using Cloudflare Transform Rules to implement image hosting on Backblaze B2 Cloud Storage, I’d like to take a moment to introduce myself. I’m Pat Patterson, recently hired by Backblaze as chief developer evangelist. I’ve been working with technology and technical communities for close to two decades, at companies such as Sun Microsystems and Salesforce. I’ll be creating and delivering technical content for you, our Backblaze B2 community, and advocating on your behalf within Backblaze. Feel free to follow my journey and reach out to me via Twitter or LinkedIn.

Cloudflare Transform Rules

Now, on with the show! Cloudflare Transform Rules give you access to HTTP traffic at the CDN edge server, allowing you to manipulate the URI path, query string, and HTTP headers of incoming requests and outgoing responses. Where Cloudflare Workers allows you to write JavaScript code that executes in the same environment, Transform Rules give you much of the same power without the semi-colons and curly braces.

Let’s look at a specific use case: implementing image hosting on top of a cloud object store. Backblaze power user James Ross wrote an excellent blog post back in August 2019, long before the introduction of Transform Rules, explaining how to do this with Cloudflare Workers and Backblaze B2. We’ll see how much of James’ solution we can recreate with Transform Rules, without writing any code. We’ll also discover how the combination of Cloudflare and Backblaze allows you to create your own, personal 10GB image hosting site for free.

Implementing Image Hosting on a Cloud Object Store

James’ requirements were simple:

Serve image files from a custom domain, such as files.example.com, rather than the cloud storage provider’s domain.
Remove the bucket name, and any other extraneous information, from the URL.
Remove extraneous headers, such as the object ID, from the HTTP response.
Improve caching (both browser and edge cache) for images.
Add basic CORS headers to allow embedding of images on external sites.

I’ll work through each of these requirements in this blog post, and wrap up by explaining why Backblaze B2 might be a better long term provider for this and many other cloud object storage use cases than other cloud object stores.

It’s worth noting that nothing here is Backblaze B2-specific—the user’s browser is requesting objects from a B2 Cloud Storage public bucket via their URLs, just as it would with any other cloud object store. The techniques are exactly the same on Amazon S3, for example.

Prerequisites

You’ll need accounts with both Cloudflare and Backblaze. You can get started for free with both:

You’ll also need your own DNS domain, which I’ll call example.com in this article, on which you can create subdomains such as files.example.com. If you’ve read this far, you likely already have at least one. Otherwise, you can register a new domain at Cloudflare for a few dollars a year, or your local equivalent.

Create a Bucket for Your Images

If you already have a B2 Cloud Storage bucket you want to use for your image store, you can skip this section. Note: It doesn’t matter whether you created the bucket and its objects via the B2 Native API, the Backblaze S3 Compatible API, or any other mechanism—your objects are accessible to Cloudflare via their friendly URLs.

Log in to Backblaze, and click Buckets on the left under B2 Cloud Storage, then Create a Bucket. You will need to give your bucket a unique name, and make it public. Leave the other settings with their default values.

Note that the bucket name must be globally unique within Backblaze B2, so you can’t just call it something like “myfiles.” You’ll hide the bucket name from public view, so you can call it literally anything, as long as there isn’t already a Backblaze B2 bucket with that name.

Finally, click Upload/Download and upload a test file to your new bucket.

Click the file to see its details, including its various URLs.

In the next step, you’ll rewrite requests that use your custom subdomain, for example, https://files.example.com/smiley.png, to the friendly URL of the form, https://f004.backblazeb2.com/file/metadaddy-public/smiley.png.

Make a note of the hostname in the friendly URL. As you can see in the previous paragraph, mine is f004.backblazeb2.com.

Create a DNS Subdomain for Your Image Host

You will need to activate your domain (example.com, rather than files.example.com) in your Cloudflare account, if you have not already done so.

Now, in the Cloudflare dashboard, create your subdomain by adding a DNS CNAME record pointing to the bucket hostname you made a note of earlier.

I created files.superpat.com, which points to my bucket’s hostname, f004.backblazeb2.com.

If you test this right now by going to your test file’s URL in your custom subdomain, for example, https://files.example.com/file/my-unique-bucket-name/smiley.png, after a few seconds you will see a 522 “connection timed out” error from Cloudflare:

This is because, by default, Cloudflare accesses the upstream server via plain HTTP, rather than HTTPS. Backblaze only supports secure HTTPS connections, so the HTTP request fails. To remedy this, in the SSL/TLS section of the Cloudflare dashboard, change the encryption mode from “Flexible” to “Full (strict),” so that Cloudflare connects to Backblaze via HTTPS, and requires a CA-issued certificate.

Now you should be able to access your test file in your custom subdomain via a URL of the form https://files.example.com/file/my-unique-bucket-name/smiley.png. The next task is to create the first Transform Rule to remove /file/my-unique-bucket-name from the URL.

Rewrite the URL Path on Incoming Requests

There are three varieties of Cloudflare Transform Rules:

URL Rewrite Rules: Rewrite the URL path and query string of an HTTP request.
HTTP Request Header Modification Rules: Set the value of an HTTP request header or remove a request header.
HTTP Response Header Modification Rules: Set the value of an HTTP response header or remove a response header.

Click Rules on the left of the Cloudflare dashboard, then Transform Rules. You’ll see that the Cloudflare free plan includes 10 Transform Rules—plenty for our purposes. Click Create Transform Rule, then Rewrite URL.

It’s useful to pause for a moment and think about what we need to ask Cloudflare to do. Users will be requesting URLs of the form https://files.example.com/smiley.png, and we want the request to Backblaze B2 to be like https://f004.backblazeb2.com/file/metadaddy-public/smiley.png. We’ve already taken care of the domain part of the URL, so it becomes clear that all we need to do is prefix the outgoing URL with /file/<bucket name>.

Give your rule a descriptive name such as “Add file and bucket name.”

There is an opportunity to set a condition that incoming requests must match to fire the trigger. In James’ article, he tested that the path did not already begin with the /file/<bucket name> prefix, so that you can refer to a file with either the short or long URL.

At first glance, the Cloudflare dashboard doesn’t offer “does not start with” as an operator.

However, clicking Edit expression reveals a more powerful way of specifying the condition:

The Cloudflare Rules language allows us to express our condition precisely:

Moving on, Cloudflare offers static and dynamic options for rewriting the path. A static rewrite would apply the same value to the URL path of every request. This use case requires a dynamic rewrite, where, for each request, Cloudflare evaluates the value as an expression which yields the path.

Your expression would prepend the existing path with /file/<bucket name>, like this:

Save the Transform Rule, and try to access your test file again, this time without the /file/<bucket name> prefix in the URL path, for example: https://files.example.com/smiley.png.

You should see your test file, as expected:

Great! Now, let’s take a look at those HTTP headers in the response.

Remove HTTP Headers From the Response

You could use Chrome Developer Tools to view the response headers, but I prefer the curl command line tool. I used the --head argument to show the HTTP headers without the response body, since my terminal would not be happy with binary image data!

Note: I’ve removed some extraneous headers from this and subsequent HTTP responses for clarity and length.

% curl --head https://files.superpat.com/smiley.png HTTP/2 200 date: Thu, 20 Jan 2022 01:26:10 GMT content-type: image/png content-length: 23889 x-bz-file-name: smiley.png x-bz-file-id: 4_zf1f51fb913357c4f74ed0c1b_f1163cc3f37a60613_d20220119_m204457_c004_v0402000_t0044 x-bz-content-sha1: 3cea1118fbaab607a7afd930480670970b278586 x-bz-upload-timestamp: 1642625097000 x-bz-info-src_last_modified_millis: 1642192830529 cache-control: max-age=14400 cf-cache-status: MISS last-modified: Thu, 20 Jan 2022 01:26:10 GMT

Our goal is to remove all the x-bz headers. Create a Modify Response Header rule and set its name to something like “Remove Backbaze B2 Headers.” We want this rule to apply to all traffic, so the match expression is simple:

Unfortunately there isn’t a way to tell Cloudflare to remove all the headers that are prefixed x-bz, so we just have to list them all:

Save the rule, and request your test file again. You should see fewer headers:

% curl --head https://files.superpat.com/smiley.png HTTP/2 200 date: Thu, 20 Jan 2022 01:57:01 GMT content-type: image/png content-length: 23889 x-bz-info-src_last_modified_millis: 1642192830529 cache-control: max-age=14400 cf-cache-status: HIT age: 1851 last-modified: Thu, 20 Jan 2022 01:26:10 GMT

Note: As you can see, for some reason Cloudflare does not remove the x-bz-info-src_last_modified_millis header. I’ve reported this to Cloudflare as a bug.

Optimize Cache Efficiency via the ETag and Cache-Control HTTP Headers

We can follow James’ lead in making caching more efficient by leveraging the ETag header. As explained in the MDN Web Docs for ETag:

The ETag (or entity tag) HTTP response header is an identifier for a specific version of a resource. It lets caches be more efficient and save bandwidth, as a web server does not need to resend a full response if the content was not changed.

Essentially, a cache can just request the HTTP headers for a resource and only proceed to fetch the resource body if the ETag has changed.

James constructed the ETag by using one of x-bz-content-sha1, x-bz-info-src_last_modified_millis, or x-bz-file-id, in that order. If none of those headers are set, then neither is ETag. It’s not possible to express this level of complexity in a Transform Rule, but we can apply a little lateral thinking to the problem. We can easily concatenate the three headers to create a result that will change when any one or more of them changes:

concat(http.response.headers["x-bz-content-sha1"][0], http.response.headers["x-bz-info-src_last_modified_millis"][0], http.response.headers["x-bz-file-id"][0])

Note that it’s possible for there to be multiple values of a given HTTP header, so http.response.headers["<header-name>"] is an array. http.response.headers["<header-name>"][0] yields the first, and in most cases only, element of the array.

Edit the Transform Rule you just created, update its name to something like “Remove Backblaze B2 Headers, set ETag,” and add a header with a dynamic value:

Don’t worry about the ordering; Cloudflare will reorder the operations so that “set” occurs before “remove.” Also, if none of those headers are present in the response, resulting in an empty value for the ETag header, Cloudflare will not set that header at all. Exactly the behavior we need!

Another test shows the result. Note that HTTP headers are not case-sensitive, so etag has just the same meaning as ETag:

% curl --head https://files.superpat.com/smiley.png HTTP/2 200 date: Thu, 20 Jan 2022 02:01:19 GMT content-type: image/png content-length: 23889 x-bz-info-src_last_modified_millis: 1642192830529 cache-control: max-age=14400 cf-cache-status: HIT age: 2198 last-modified: Thu, 20 Jan 2022 01:24:41 GMT etag: 3cea1118fbaab607a7afd930480670970b27858616421928305294_zf1f51fb913357c4f74ed0c1b_f1163cc3f37a60613_d20220119_m204457_c004_v0402000_t0044

The other cache-related header is Cache-Control, which tells the browser how to cache the resource. As you can see in the above responses, Cloudflare sets Cache-Control to a max-age of 14400 seconds, or four hours.

James’ code, on the other hand, sets Cache-Control according to whether or not the request to B2 Cloud Storage is successful. For an HTTP status code of 200, Cache-Control is set to public, max-age=31536000, instructing the browser to cache the response for 31,536,000 seconds; in other words, a year. For any other HTTP status, Cache-Control is set to public, max-age=300, so the browser only caches the response for five minutes. In both cases, the public directive indicates that the response can be cached in a shared cache, even if the request contained an Authorization header field.

Note: We’re effectively assuming that once created, files on the image host are immutable. This is often true for this use case, but you should think carefully about cache policy when you build your own solutions.

At present, Cloudflare Transform Rules do not give access to the HTTP status code, but, again, we can satisfy the requirement with a little thought and investigation. As mentioned above, for successful operations, Cloudflare sets Cache-Control to max-age=14400, or four hours. For failed operations, for example, requesting a non-existent object, Cloudflare passes back the Cache-Control header from Backblaze B2 of max-age=0, no-cache, no-store. With this information, it’s straightforward to construct a Transform Rule to increase max-age from 14400 to 31536000 for the successful case:

Again, we need to use [0] to select the first matching HTTP header. Notice that this rule uses a static value for the header—it’s the same for every matching response.

We’ll leave the header as it’s set by B2 Cloud Storage for failure cases, though it would be just as easy to override it.

Another test shows the results of our efforts:

% curl --head https://files.superpat.com/smiley.png HTTP/2 200 date: Thu, 20 Jan 2022 02:31:38 GMT content-type: image/png content-length: 23889 x-bz-info-src_last_modified_millis: 1642192830529 cache-control: public, max-age=31536000 cf-cache-status: HIT age: 4017 last-modified: Thu, 20 Jan 2022 01:24:41 GMT etag: 3cea1118fbaab607a7afd930480670970b27858616421928305294_zf1f51fb913357c4f74ed0c1b_f1163cc3f37a60613_d20220119_m204457_c004_v0402000_t0044

Checking the failure case—notice that there is no ETag header, since B2 Cloud Storage did not return any x-bz headers:

% curl --head https://files.superpat.com/badname.png HTTP/2 404 date: Thu, 20 Jan 2022 02:32:35 GMT content-type: application/json;charset=utf-8 content-length: 94 cache-control: max-age=0, no-cache, no-store cf-cache-status: BYPASS

Success! Browsers and caches will aggressively cache responses, reducing the burden on Cloudflare and Backblaze B2.

Set a CORS Header for Image Files

We’re almost done! Our final requirement is to set a cross-origin resource sharing (CORS) header for images so that they can be manipulated in web pages from any domain on the web.

The Transform Rule must match a range of file extensions, and set the Access-Control-Allow-Origin HTTP response header to allow any webpage to access resources:

Upload a text file and run a final couple of tests to see the results. First, the image:

% curl --head https://files.superpat.com/smiley.png HTTP/2 200 date: Thu, 20 Jan 2022 02:50:52 GMT content-type: image/png content-length: 23889 x-bz-info-src_last_modified_millis: 1642192830529 cache-control: public, max-age=31536000 cf-cache-status: HIT age: 4459 last-modified: Thu, 20 Jan 2022 01:36:33 GMT etag: 3cea1118fbaab607a7afd930480670970b27858616421928305294_zf1f51fb913357c4f74ed0c1b_f1163cc3f37a60613_d20220119_m204457_c004_v0402000_t0044 access-control-allow-origin: *

The Access-Control-Allow-Origin header is present, as expected.

Finally, the text file, without an Access-Control-Allow-Origin header. You can use the --include argument rather than --head to see the file content as well as the headers:

% curl --include https://files.superpat.com/hello.txt HTTP/2 200 date: Thu, 20 Jan 2022 02:48:51 GMT content-type: text/plain content-length: 14 accept-ranges: bytes x-bz-info-src_last_modified_millis: 1642646740075 cf-cache-status: DYNAMIC etag: 60fde9c2310b0d4cad4dab8d126b04387efba28916426467400754_zf1f51fb913357c4f74ed0c1b_f1092902424a40504_d20220120_m024635_c004_v0402003_t0000

Hello, World!

Troubleshooting

The most frequent issue I encountered while getting all this working was mixing up request and response when referencing HTTP headers. If things are not working as expected, double check that you don’t have http.response.headers["<header-name>"] where you need http.request.headers["<header-name>"] or vice versa.

Can I Really Do This Free of Charge?

Backblaze B2 pricing is very simple:

Storage

The first 10GB of storage is free of charge.
Above 10GB, we charge $0.005/GB/month, around a quarter of the cost of other leading cloud object stores (cough, S3, cough).
Storage cost is calculated hourly, with no minimum retention requirement, and billed monthly.

Downloaded Data

The first 1GB of data downloaded each day is free.
Above 1GB, we charge $0.01/GB, but…
Downloads through our CDN and compute partners, of which Cloudflare is one, are free.

Transactions

Each download operation counts as one class B transaction.
The first 2,500 class B transactions each day are free.
Beyond 2,500 class B transactions, they are charged at a rate of $0.004 per 10,000.

No Surprise Bills

If you already signed up for Backblaze B2, you might have noticed that you didn’t have to provide a credit card number. Your 10GB of free storage never expires, and there is no chance of you unexpectedly incurring any charges.

By serving your images via Cloudflare’s global CDN and optimizing your cache configuration as described above, you will incur no download costs from B2 Cloud Storage, and likely stay well within the 2,500 free download operations per day. Similarly, Cloudflare’s free plan does not require a credit card for activation, and there are no data or transaction limits.

Sign up for Backblaze B2 today, deploy your own personal image host, explore our off-the-shelf integrations, and consider what you can create with an affordable, S3-compatible cloud object storage platform.

The post Free Image Hosting With Cloudflare Transform Rules and Backblaze B2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Noise

All posts by Pat Patterson

The collective thoughts of the interwebz