Tag Archives: Cloudflare Workers KV

Workers KV – free to try, with increased limits!

Post Syndicated from Greg McKeon original https://blog.cloudflare.com/workers-kv-free-tier/

In May 2019, we launched Workers KV, letting developers store key-value data and make that data globally accessible from Workers running in Cloudflare’s over 200 data centers.

Today, we’re announcing a Free Tier for Workers KV that opens up global, low-latency data storage to every developer on the Workers platform. Additionally, to expand Workers KV’s use cases even further, we’re also raising the maximum value size from 10 MB to 25 MB. You can now write an application that serves larger static files or JSON blobs directly from KV.

Together with our announcement of the Durable Objects limited beta last month, the Workers platform continues to move toward storage solutions that make globally deployed applications as easy to build as an application running in a single data center is today.

What are the new free tier limits?

The free tier includes 100,000 read operations and 1,000 each of write, list, and delete operations per day, resetting daily at 00:00 UTC, with a maximum total storage size of 1 GB. Operations that exceed these limits will fail with an error.

Additional KV usage costs $0.50 per million read operations, $5.00 per million list, write, and delete operations, and $0.50 per GB of stored data.
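To make the pricing concrete with a purely hypothetical example: an application that, in a given month, performs 10 million reads, 1 million writes, and stores 5 GB beyond the free allotment would owe roughly 10 × $0.50 + 1 × $5.00 + 5 × $0.50 = $12.50 for that month.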

We intentionally chose these limits to prioritize use cases where KV works well – infrequently written data that may be frequently read around the globe.

What is the new KV value size limit?

We’re raising the value size limit in Workers KV from 10 MB to 25 MB. Users frequently store static assets in Workers KV to then be served by Workers code. To make it as easy as possible to deploy your entire site on Workers, we’re raising the value size limit to handle even larger assets.

Since Workers Sites hosts your site from Workers KV, the increased size limit also means Workers Sites assets can now be as large as 25 MB.

How does Workers KV work?

Workers KV stores key-value pairs and caches hot keys in Cloudflare’s data centers around the world. When a request hits a Worker that uses KV, it retrieves the KV pair from Cloudflare’s local cache with low latency if the pair has been accessed recently.

While some programs running on the Workers platform are stateless, it is often necessary to distribute files or configuration data to running Workers. Workers KV allows you to persist data and access it across multiple Workers calls.

For example, let’s say I wanted to serve a static text file from Cloudflare’s edge. I could provision my own object storage, host it on my own domain, and put that domain behind Cloudflare.

With Workers KV, however, this reduces to a few simple steps. First, I create a KV namespace for my Worker with Wrangler.

wrangler kv:namespace create "BUCKET"

Then, in my wrangler.toml, I add the new namespace ID to associate it with my Worker.

kv_namespaces = [
 {binding = "BUCKET", id = <insert-id-here>}
]

I can upload a new text file from the command line using Wrangler:

$ wrangler kv:key put --binding=BUCKET "my-file" value.txt --path

And then serve that file from my Workers script with low latency from any of Cloudflare’s points of presence around the globe!

addEventListener('fetch', event => {
  event.respondWith(handleEvent(event))
})

async function handleEvent(event) {
  let txt = await BUCKET.get("my-file")
  return new Response(txt, {
    headers: {
      "content-type": "text/plain"
    }
  })
}
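One detail worth noting: get() resolves to null when the key doesn’t exist. A slightly more defensive version of the handler above (a sketch, not part of the original example) might return a 404 in that case:

async function handleEvent(event) {
  let txt = await BUCKET.get("my-file")
  if (txt === null) {
    // the key was never written (or has been deleted)
    return new Response("Not found", { status: 404 })
  }
  return new Response(txt, {
    headers: {
      "content-type": "text/plain"
    }
  })
}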

Beyond file hosting, Workers users have built many other types of applications with Workers KV:

  • Mass redirects – handle billions of HTTP redirects.
  • Access control rules – validate user requests to your API.
  • Translation keys – dynamically localize your web pages.
  • Configuration data – manage who can access your origin.

While Workers KV provides low latency access across the globe, it may not return the most up-to-date data if updates to keys are happening more than once a minute or from multiple data centers simultaneously. For use cases that cannot tolerate stale data, Durable Objects is a better solution.

Get started with Workers KV today, for free

The free tier and increased limits are live now!

You can get started with Workers and Workers KV in the Cloudflare dash. To see an example of how to use Workers KV, check out the tutorial in the Workers documentation.

Migrating cdnjs to serverless with Workers KV

Post Syndicated from Tyler Caslin original https://blog.cloudflare.com/migrating-cdnjs-to-serverless-with-workers-kv/

Cloudflare powers cdnjs, an open-source project that accelerates websites by delivering popular JavaScript libraries and resources via Cloudflare’s network. Since our major update in December, we have focused on remodelling cdnjs for scalability and resilience. Today, we are excited to announce how Cloudflare delivers cdnjs—a migration to a serverless infrastructure using Cloudflare Workers and its distributed key-value store, Workers KV!

What is cdnjs and why do I care?

For those unfamiliar, cdnjs is an acronym describing a Content Delivery Network (CDN) for JavaScript (JS). A CDN simply refers to a geographically distributed network of servers that provide Internet content, whether it is memes, cat videos, or HTML pages. In our case, the CDN refers to Cloudflare’s ever expanding network of over 200 globally distributed data centers.

And here’s why this is relevant to you: it makes page load times lightning-fast. Virtually every website you visit needs to fetch JS libraries in order to load, including this one. Let’s say you visit a Sydney-based website that serves its own local copy of jQuery, a popular library found in 76.2% of websites. If you are located in New York, you may notice a delay, as it can easily exceed 300ms to fetch the file—not to mention the time it takes for the round trips involved with the TLS handshake. However, if the website references jQuery using cdnjs.cloudflare.com, you can retrieve the file from the closest Cloudflare data center in Buffalo, reducing the latency to a blazing 20ms.

While cdnjs operates behind the scenes, it is used by over 11% of websites, making the Internet a much faster and more reliable place. In July, cdnjs served almost 190 billion requests—an enormous 3.46PB of data.

Where are the files stored?

While cdnjs speeds up the Internet, it certainly isn’t magic!

Historically, a number of load-balanced machines at one of Cloudflare’s core data centers would periodically pull cdnjs files from a backing store and act as the origin for cdnjs.cloudflare.com. When a new file was requested, it was cached by Cloudflare, allowing it to be fetched quickly from any of our data centers.

The backing store is a catalogue of JS, CSS, and other web libraries in the form of an open-source GitHub repository. What this means is that anyone—including you—can contribute to it, subject to review and other processes.

However, until recently, these existing operations were very labor intensive and fragile.

This blog post will explain why we changed the infrastructure behind cdnjs to make it faster, more reliable, and easier to maintain. First, we will discuss how the community used to contribute to cdnjs, outlining the pains and concerns of the old system. Then, we will explore the benefits of migrating to Workers KV. After, we will dive into the new architecture, as well as upgrades to the website and cdnjs API. Finally, we will review the history of cdnjs, and where it is headed in the future.

If you think you know how to make a PR, think again

For the non-technical reader, a pull request (PR) is a request to merge changes you’ve made to a repository. Traditionally, if you wanted to include your JavaScript library in cdnjs, you would first create a PR on GitHub to cdnjs/cdnjs with a JSON file describing your package and additional files for any version you wished to include. Once your PR was approved by our old bot, manually reviewed, and then merged by a maintainer, your package would be integrated with cdnjs.

Sounds easy, right? You can just fork the repo, clone it, and copy paste a few files, no?

Exactly. Contributing was easy if you had several hours to burn, a case-sensitive file system, and a couple hundred gigabytes of free disk space to git clone the 300GB repo. If you were short on time—no problem, you could always use your advanced knowledge of git sparse-checkout to get the job done. Don’t know git? Just add one file at a time manually through GitHub’s UI.

I think you get the point. I know I certainly did when I naively spent 10 hours cloning the repo, only to discover that macOS is case-insensitive by default.

However, updating cdnjs was not only difficult for the contributors, but also the maintainers. Historically, the community was able to contribute version files directly, which could potentially be malicious. This created lots of work for maintainers, requiring them to inspect each file manually, diffing files against the official library source and running malware checks.

So how did packages update once they were in cdnjs? In the JSON file describing each package, there was an optional auto-update definition telling the bot where to look for new versions of the library. If present, when your package released a new version on npm or GitHub, the bot would download it, pushing the files to cdnjs/cdnjs and their computed Subresource Integrity (SRI) hashes to cdnjs/SRIs. If the auto-update property was missing, it would be your responsibility to make manual PRs to update cdnjs with any future versions.
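For illustration, an auto-update definition in one of those package JSON files looked roughly like the following (the exact field names varied over time and are shown here only as an example):

{
  "name": "my-library",
  "autoupdate": {
    "source": "npm",
    "target": "my-library",
    "fileMap": [
      { "basePath": "dist", "files": ["**/*.js"] }
    ]
  }
}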

A wake-up call for cdnjs

In April, during maintenance at one of our core data centers, a technician accidentally disconnected the cables supplying all external connections to our other data centers, causing the data center to go offline for approximately four hours. This incident served as the first wake-up call for cdnjs, especially since the affected data center housed the primary cdnjs origin web servers. In this case, we did have a backup running on an external provider, but what really saved us was Cloudflare’s global cache, which minimized the impact of the outage as only uncached assets failed to load.

We started to think about how we could improve both the reliability and the performance of serving cdnjs. We went straight to Cloudflare Workers, our own platform for developing on the edge. One powerful tool built into Workers is Workers KV—a low-latency, globally distributed key-value store optimized for high-read applications.

We put two and two together, realizing that instead of pulling the cdnjs/cdnjs repository and serving files from disk, we could cut the physical machines out entirely, distributing the data around the world and serving files straight from the edge. That way, cdnjs would be able to recover from any origin data center failure, while also increasing its scalability.

Workers KV to the rescue

At first glance, the decision to use Workers KV was a no-brainer. Since files in cdnjs never change but require frequent reads, Workers KV was a perfect fit.

However, as we planned our migration, we became concerned that with over 7 million assets in cdnjs, there would undoubtedly exist files that exceed Workers KV’s 10MiB value limit. After investigating, we discovered that several hundred cdnjs files were oversized, the majority being JavaScript Source Maps.

Then the idea hit us. We could store compressed versions of cdnjs files in Workers KV, not only solving our oversized file issue, but also optimizing how we serve files.

If you pay the Internet bill, you’ll know that bandwidth is expensive! For this reason, all modern browsers will try to fetch compressed web content whenever it is available. Similarly, within Cloudflare we often experiment with on-the-fly compression to reduce our bandwidth, always serving compressed content to the eyeball when it is accepted. As a result, we decided to compress all cdnjs files ahead of time, writing them to Workers KV in both optimal Brotli and gzip forms. That way, we could use a higher compression level than on-the-fly compression allows, since we no longer had its latency constraints.
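As a rough sketch of the idea (not the actual cdnjs tooling), pre-compressing a file into both forms with Node’s built-in zlib module could look like this, with the compression effort turned all the way up since it runs ahead of time rather than in the request path:

const fs = require('fs');
const zlib = require('zlib');

const raw = fs.readFileSync('jquery.min.js');

// maximum-effort compression is fine here because it happens offline
const brotli = zlib.brotliCompressSync(raw, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 }
});
const gzipped = zlib.gzipSync(raw, { level: 9 });

// both variants are then written to the files namespace in Workers KV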

This means we now serve cdnjs files faster and smaller!

A complete makeover for cdnjs

Today, if you want to include your JavaScript library in cdnjs, you first create a PR on GitHub to our new repository cdnjs/packages. The repo is easily cloneable at 50MB and consists of thousands of JSON files, each describing a cdnjs package and how it is auto-updated from npm or git. Once your file is validated by our automated CI—powered by a new bot—and merged by a maintainer, your package is automatically enrolled in our auto-update service.

In the new system, security and maintainability are prioritized. For starters, cdnjs version files are created by our bot, minimizing the possibility of human error when merging a new version. While the JSON files in cdnjs/packages are added by error-prone humans, they are inspected by our bot before being approved by a maintainer. Each file is automatically validated against a JSON schema, as well as checked for popularity on npm or GitHub.

When the bot discovers a new release, it pushes Brotli and gzip-compressed versions of the files to a files namespace in Workers KV. With each entry, the bot writes some metadata in Workers KV for the ETag and Last-Modified HTTP headers. Similar to before, the bot also computes Subresource Integrity (SRI) hashes of the uncompressed files, but now pushes them to an SRIs namespace in Workers KV instead.

Then, when a new file is requested from cdnjs.cloudflare.com, a Cloudflare Worker inspects the client’s Accept-Encoding header, fetching either the Brotli or gzip-compressed version with its ETag and Last-Modified metadata from Workers KV. As the compressed file travels back through Cloudflare, it is cached for future requests and decompressed on the fly if needed.

At the moment, there are still a handful of files exceeding Workers KV’s size limit. Consequently, if the Cloudflare Worker fails to retrieve a file from Workers KV, it is fetched from the origin backed by the original git repo. In the coming months, we plan on gradually removing this infrastructure.
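A simplified sketch of that request path might look like the following. This is not the production cdnjs Worker: the FILES binding, the key naming scheme, and the metadata field names are all hypothetical, and the exact handling of pre-compressed bodies depends on the runtime’s Content-Encoding behavior.

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const path = new URL(request.url).pathname
  const acceptEncoding = request.headers.get('Accept-Encoding') || ''
  const encoding = acceptEncoding.includes('br') ? 'br' : 'gzip'

  // fetch the pre-compressed body and its header metadata from KV
  const { value, metadata } = await FILES.getWithMetadata(`${encoding}:${path}`, 'arrayBuffer')

  if (value === null) {
    // oversized or missing in KV: fall back to the legacy origin
    return fetch(request)
  }

  const headers = {
    'Content-Encoding': encoding,
    'Content-Type': 'application/javascript'
  }
  if (metadata) {
    headers['ETag'] = metadata.etag
    headers['Last-Modified'] = metadata.lastModified
  }
  return new Response(value, { headers })
}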

Scaling the website and API

Besides the core cdnjs infrastructure, many of its other components received upgrades as well!

On the cdnjs project’s homepage, you will be greeted by a slick new beta website built by Matt. Constructed with Vue and Nuxt, the beta website is powered entirely by the cdnjs API. As a result, it is always up-to-date with the latest package information and takes very few resources to serve—the site runs completely on the client side after the first page load—helping us scale with cdnjs’s never-ending growth.

In fact, the cdnjs API also became more scalable, benefitting from a serverless architecture similar to the one we have seen with cdnjs and Workers KV.

Before migrating to Workers KV, the cdnjs API relied on a regularly scheduled process that generated about 300MB of metadata. The cdnjs API’s backend would then load this enormous "package.min.js" file into memory and use it to operate the API. If you are curious, the file is still being hosted here, but be warned—it may lag your browser! Similarly, file SRIs were pushed to cdnjs/SRIs, which the API cloned locally to serve SRI responses.

After all cdnjs files (within the permitted size limit) were moved to Workers KV, these legacy processes became unsustainable, requiring millions of reads and an unreasonable amount of time. Therefore, we decided to upload all of this metadata into Workers KV as well. We split the metadata into four namespaces—one for package-level metadata, one for version-specific metadata, one containing aggregated metadata, and one for file SRIs.

Similar to cdnjs’s serverless design, a Cloudflare Worker sits on top of metadata.speedcdnjs.com, serving data from Workers KV using several public endpoints. Currently, the cdnjs API is fully integrated with these endpoints, which provide an elegant solution as cdnjs continues to scale.

Transparency and the future of cdnjs

Since its birth in January 2011, cdnjs has always been deeply rooted in transparency, deriving its strength from the community. Even when cdnjs exploded in size and its founders Ryan Kirkman and Thomas Davis teamed up with us in June 2011, the project remained entirely open-source on GitHub.

As the years passed, it became harder for the founders to stay active, heavily depending on the community for support. With a nearly nonexistent budget and little access to the repository, core cdnjs maintainers were challenged every day to keep the project alive.

Last year, this led us to contact the founders, who were happy to have our assistance with the project. With Cloudflare’s increased role, cdnjs is as stable as ever, with active members from both Cloudflare and the community.

However, as we remove our reliance on the legacy system and store files in Workers KV, there are concerns that cdnjs will become proprietary. Don’t worry, we are working hard to ensure that cdnjs remains as transparent and open-source as possible. To help the community audit updates to Workers KV, there is a new repository, cdnjs/logs, which is used by the bot to log all Workers KV-related events. Furthermore, anyone can validate the integrity of cdnjs files by fetching SRIs from the cdnjs API.

Conclusion

Overall, this past year has been a turbulent time for cdnjs, but all of its shortcomings have acted as red flags to help us build a better system. Most recently, we have mitigated the risks of depending on physical machines at a single location, migrating cdnjs to a serverless infrastructure where its files are stored in Workers KV.

Today, cdnjs is in good hands, and is not going away anytime soon. Shout out especially to the maintainers Sven and Matt for creating tons of momentum with the project, working on everything from scaling cdnjs to editing this post.

Moving forward, we are committed to making cdnjs as transparent as possible. As we continue to improve cdnjs, we will release more blog posts to keep the community up to date. If you are interested, please subscribe to our blog. After all, it is the community that makes cdnjs possible! A special thanks to our active GitHub contributors and members of the cdnjs Community Forum for sticking with us!

Catching up with Workers KV

Post Syndicated from Steve Klabnik original https://blog.cloudflare.com/catching-up-with-workers-kv/

The Workers Distributed Data team has been hard at work since we gave you an update last November. Today, we’d like to share with you some of the stuff that has recently shipped in Workers KV: a new feature and an internal change that should significantly improve latency in some cases. Let’s dig in!

KV Metadata

Workers KV has a fairly straightforward interface: you can put keys and values into KV, and then fetch the value back out by key:

await contents.put("index.html", someHtmlContent);
await contents.put("index.css", someCssContent);
await contents.put("index.js", someJsContent);

// later

let index = await contents.get("index.html");

Pretty straightforward. But as you can see from this example, you may store different kinds of content in KV, even when the underlying value type is identical. All of the values are strings, but one is HTML, one is CSS, and one is JavaScript. If we were going to serve this content to users, we would have to construct a response. And when we do, we have to let the client know what the content type of that response is: text/html for HTML, text/css for CSS, and text/javascript for JavaScript. If we serve the incorrect content type to our clients, they won’t display the pages correctly.

One possible solution to this problem is using the mime package from npm. This lets us write code that looks like this:

// pathKey is a variable with a value like "index.html"
const mimeType = mime.getType(pathKey) || 'text/plain'

Nice and easy. But there are some drawbacks. First, we have to detect the content type at runtime, which means we’re figuring this out on every request; it would be nicer to figure it out only once. Second, if we look at how the package implements getType, it does so by including an array of possible extensions and their types. This means that this array is included in our Worker, taking up 9 KB of space. That’s also less than ideal.

But now, we have a better way. Workers KV will now allow you to add some extra JSON to each key/value pair, to use however you’d like. So we could start inserting the contents of those files like this, instead:

await contents.put("index.html", someHtmlContent, { metadata: { "Content-Type": "text/html" } });
await contents.put("index.css", someCssContent, { metadata: { "Content-Type": "text/css" } });
await contents.put("index.js", someJsContent, { metadata: { "Content-Type": "text/javascript" } });

You could determine these content types in various ways: by looking at the file extension, as the mime package does, or by using a library like libmagic that inspects the file’s contents to figure out its type. Regardless, the type would be stored in KV alongside the contents of the file. This way, there’s no need to recompute the type on every request. Additionally, the detection code would live in your uploading tool, not in your Worker, creating a smaller bundle. Win-win!
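For instance, an upload step (a sketch only, reusing the mime package from above and the same kind of KV binding as the earlier snippets; a standalone tool would go through Wrangler or the KV API instead) could compute the type once and attach it as metadata when writing:

const mime = require('mime')

// compute the content type once, at write time, and store it alongside
// the value so the Worker never has to guess it per request
async function uploadFile(namespace, pathKey, contents) {
  const contentType = mime.getType(pathKey) || 'text/plain'
  await namespace.put(pathKey, contents, {
    metadata: { 'Content-Type': contentType }
  })
}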

The worker code would pass along this metadata by using a new method:

let { value, metadata } = await contents.getWithMetadata("index.js");

Here, value holds the contents, like before, and metadata contains the JSON that was stored alongside it: metadata["Content-Type"] would return "text/javascript". You’ll also see this metadata come back when you make a list request.
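For example, listing keys returns each key’s metadata alongside its name (a small sketch):

const listing = await contents.list()
for (const key of listing.keys) {
  // key.name is the key itself; key.metadata is the stored JSON, if any
  console.log(key.name, key.metadata && key.metadata["Content-Type"])
}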

Given that you can store arbitrary JSON, it’s useful for more than just content types: we’ve had folks post to the forums asking about etags, for example. We’re excited to see what you do with this new capability!

Significantly faster writes

Our documentation states:

Very infrequently read values are stored centrally, while more popular values are maintained in all of our data centers around the world.

This is why Workers KV is optimized for higher read volumes than write volumes. We distribute popular data across the globe, close to users wherever they are, while infrequently accessed data is stored in a central location until it is requested. Each write (and delete) must go back to the central data store, as do reads of less popular values. The central store was located in the United States, so write speed varied: in the US it would be much faster than in, say, Europe or Asia.

Recently, we rolled out a major internal change: we added a second source of truth on the European continent. These two sources of truth coordinate between themselves, ensuring that any data you write or update will be available in both places as soon as possible. But latencies from Europe, and from places closer to Europe than to the United States, should be much lower, as writes no longer have to travel all the way to the US.

How much faster? Well, it will depend on your workload. Several other Cloudflare products use Workers KV, and here’s a graph of response times from one of them:

[Graph: response times before and after the switchover]

As you can see, there’s a sharp drop in the graph when the switchover happened.

We can also measure this time across all customers:

[Graph: Workers KV response times across all customers]

The long tail has been significantly shortened. (We’ve redacted the exact numbers, but you can still see the magnitude of the changes.)

More to come

The distributed data team has been working on some additional things, but we’re not quite ready to share them with you yet! We hope that you’ll find these changes make Workers KV even better for you, and we’ll be sharing more updates on the blog as we ship.

Introducing Secrets and Environment Variables to Cloudflare Workers

Post Syndicated from John Donmoyer original https://blog.cloudflare.com/workers-secrets-environment/

The Workers team here at Cloudflare has been hard at work shipping a bunch of new features in the last year and we’ve seen some amazing things built with the tools we’ve provided. However, as my uncle once said, with great serverless platform growth comes great responsibility.

One of the ways we can help is by ensuring that deploying and maintaining your Workers scripts is a low-risk endeavor. Rotating a set of API keys shouldn’t require risking downtime through code edits and redeployments, and in some cases it may not make sense for the developer writing the script to know the actual API key value at all. To help tackle this problem, we’re releasing Secrets and Environment Variables to the Wrangler CLI and Workers Dashboard.

Supporting secrets

As we started to design support for secrets in Workers, we had a sense that this was already a big concern for a lot of our users, but we wanted to learn about all of the use cases to ensure we were building the right thing. We headed to the community forums, Twitter, and the inbox of Louis Grace, business development representative extraordinaire, for some anecdotes about Secrets usage. We also sent out a survey to our existing users to learn about use cases and pain points.

We learned that even though there was already a way to store secrets without exposing them (by keeping them in Workers KV), the solution was not very intuitive, nor did it meet all the needs of our users. Many users didn’t even know we had an interim solution in place. Recognizing that we were not the first platform to encounter this problem, we surveyed the existing landscape of Platform as a Service offerings to get a better sense of what our users would expect of us.

Deciding on a solution

One of the first things we found was that not all environment variables are created equal. While the simplest use case for having a defined environment variable may be storing a piece of text that can be updated no matter where it is referenced in a script, sometimes those variables may have higher stakes associated with them. If you’re storing an API key that controls access to an important system, you may not want to allow anyone with dashboard access to see it, maybe not even the developers themselves.

With this in mind, we had to ensure the feature covered two different use cases: one for storing variables in plain text where you could see the variable being referenced and make edits to it and another where the variable would be encrypted as soon as you save it, never to be seen again. This way, we were able to serve both needs of our users, side by side, without one compromising for the other.
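As a quick sketch of how the two kinds of variables end up being used (the names here are made up for illustration): a plain-text variable lives in wrangler.toml where anyone with access can read and edit it, while a secret is entered interactively so its value never appears in your configuration or code.

[vars]
API_HOST = "api.example.com"

$ wrangler secret put API_KEY

Both are then available as global variables inside the Worker:

async function handleRequest(request) {
  return fetch(`https://${API_HOST}/items`, {
    headers: { "Authorization": `Bearer ${API_KEY}` }
  })
}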

Testing our prototypes

Once we had a fairly good idea of what we wanted to build, we built some prototypes and rough implementations in staging environments so we would be able to perform some usability testing. We wrangled up some developers and observed them as they performed a series of tasks where they were asked to add some secrets and plain-text environment variables, reference them in one of their Workers, and bind their Worker to a Workers KV namespace.

Along the way we also asked questions to understand each developer’s professional background, familiarity with the product, and the use cases they’d had for Workers in the past, along with any pain points they experienced.

While we were testing the new dashboard interface we also began testing the usability of the Wrangler CLI. We had Wrangler users perform the same tasks as the Workers dashboard users to help us find out whether users expect different things from their command-line tooling.

Findings and fixes

Through our testing we were able to make a number of changes before the final release. Some of the smaller changes included things like adjusting the behavior of form fields to ensure users knew which variable would be associated with each value. We also made larger changes like electing to separate the KV namespace bindings from the other environment variables as a way to emphasize that KV namespace bindings are not the keys and values themselves but a reference to a namespace where those keys are stored.

Cina, one of our engineers, put together a proposal to align some of our terminology with the terms that our developers were naturally using to describe their workflow. In Wrangler users were accustomed to referencing their KV namespaces by adding a KV namespace binding so when they came to the Workers dashboard interface and saw a field called “KV Variables” they were often confused, thinking they were adding keys and values to the namespace itself instead of establishing a variable that could be used to reference the namespace. As a fix, we decided to call it a “KV namespace binding” throughout the experience.
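To make the distinction concrete, here’s a small wrangler.toml sketch (illustrative names only): the [vars] entries are the values themselves, while the kv_namespaces entry is just a binding, a name that points at a namespace whose keys and values live elsewhere.

[vars]
GREETING = "hello"

kv_namespaces = [
  { binding = "MY_KV", id = "<namespace-id>" }
]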

Try it out

Environment variables are available now with the Wrangler CLI and in the Workers Dashboard so go ahead and give them a shot today!

[Screenshot: adding a secret with Wrangler]

[Screenshot: managing environment variables and KV bindings in the Workers Dashboard]

As we continue to build out the Workers platform we’d love to hear from you. Let us know if you’re interested in participating in user research or if you just have something to say.