Tag Archives: Developers

Making magic: Reimagining Developer Experience for the World of Serverless

Post Syndicated from Rita Kozlov original https://blog.cloudflare.com/making-magic-reimagining-developer-experiences-for-the-world-of-serverless/

This week we’ve talked about how Workers provides a step function improvement in the TTFB (time to first byte) of applications, by running lightweight isolates in over 200 cities around the world, free of cold starts. Today I’m going to talk about another metric, one that’s arguably even more important: TTFD, or time to first dopamine, and announce a huge improvement to the Workers development experience — wrangler dev, our edge-based development environment with all the perks of a local environment.

There’s nothing quite like the rush of getting your first few lines of code to work — no matter how many times you’ve done it before, there’s something so magical about the computer understanding exactly what you wanted it to do and doing it!

This is the kind of magic I expected of “serverless”, and while it’s true that most serverless offerings today get you to that feeling faster than setting up a virtual server ever would, I still can’t help but be disappointed with how lackluster developing with most serverless platforms is today.

Some of my disappointment can be attributed to the leaky nature of the abstraction: the journey to getting you to the point of writing code is drawn out by forced decision making about servers (regions, memory allocation, etc). Servers, however, are not the only thing holding developers back from getting to the delightful magical feeling in the serverless world today.

The “serverless” experience on AWS Lambda today looks like this: between configuring the right access policy to invoke my own test application, and deciding whether an HTTP or REST API was better suited for my needs, 30 minutes had easily passed, and I still didn’t have a URL I could call to invoke my application. I did, however, spin up five different services, and was already worrying about cleaning them up lest I be charged for them.

That doesn’t feel like magic!

In building what we believe to be the serverless platform of the future — a promise that feels very magical —  we wanted to bring back that magical feeling to every step of the development journey. If serverless is about empowering developers, then they should be empowered every step of the way: from proof of concept to MVP and beyond.

We’re excited to share with you today our approach to making our developer experience delightful — we recognize we still have plenty of room to continue to grow and innovate (and we can’t wait to tell you about everything we have currently in the works as well!), but we’re proud of all the progress we’ve made in making Workers the easiest development platform for developers to use.

Defining “developer experience”

To get us started, let’s look at what the journey of a developer entails. Today, we’ll be defining the developer experience as the following four stages:

  • Getting started: All the steps we have to take before putting in some keystrokes
  • Iteration: Does my code do what I expect it to do? What do I need to do to get it there?
  • Release: I’ve tested what I can — time to hit the big red button!
  • Observe: Is anything broken? And how do I fix it?
When approaching each stage of development, we wanted to reimagine the experience the way we’ve always wanted our development flow to work, and fix the places along the way where existing platforms have let us down.

Zero to Hello World

With Workers, we want to get you to that aforementioned delightful feeling as quickly as possible, and remove every obstacle in the way of writing and deploying your code. The first deployment experience is really important — if you’ve done it once and haven’t given up along the way, you can do it again.

We’re very proud to say our TTFD — even for a new user without a Cloudflare account — is as low as three minutes. If you’re an existing customer, you can have your first Worker running in seconds. No regions to choose, no IAM rules to configure, and no API Gateways to set up or worry about paying for.

If you’re new to Workers and still trying to get a feel for it, you can instantly deploy your Worker to 200 cities around the world within seconds, with the simple click of a button.

If you’ve already decided on Workers as the choice for building your next application, we want to make you feel at home by allowing you to use all of your favorite IDEs, be it vim or emacs or VSCode (we don’t care!).

With the release of wrangler, the official command-line tool for Workers, getting started is as easy as:

wrangler generate hello
cd hello
wrangler publish

Again, in seconds your code is up and running, and easily accessible all over the world.
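
If you’re curious what that generated Worker actually contains, it is only a handful of lines. Here is a minimal sketch of a “Hello World” Worker (the exact contents of the template may differ slightly):

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // Respond to every request with a friendly greeting
  return new Response('Hello World!', {
    headers: { 'content-type': 'text/plain' },
  })
}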

“Hello, World!”, of course, doesn’t have to be quite so literal. We provide a range of tutorials to help get you started and get familiar with developing with Workers.

To save you that last bit of time in getting started, our template gallery provides starter templates so you can dive straight into building the products you’re excited about — whether it’s a new GraphQL server or a brand new static site, we’ve got you covered.

Local(ish) development: code, test, repeat

We can’t promise to get the code right on your behalf, but we can promise to do everything we can to get you the feedback you need to help you get your code right.

The development journey requires lots of experimentation, trial and error, and debugging. If my Computer Science degree came with instructions on the back of the bottle, they would read: “code, print, repeat.”

Getting code right is an extremely iterative, feedback-driven process. We would all love to get code right the first time around and move on, but the reality is, computers are bad mind-readers, and you’ve ended up with an extraneous parenthesis or a stray comma in your JSON, so your code is not going to run. Found where the loose parenthesis was introduced? Great! Now your code is running, but the output is not right — time to go find that off-by-one error.

Local development has traditionally been the way for developers to get a tight feedback loop during the development process. The crucial components that make an effective local development environment a great testing ground are: a fast feedback loop, a sandboxed environment (the ability to develop without affecting production), and accuracy.

As we started thinking about accomplishing all three of those goals, we realized that being local wasn’t actually the requirement — speed was. Running on the client had simply been the only way to achieve acceptable speed for a good-enough feedback loop.

One option was to provide a traditional local development environment, but one thing didn’t sit well with us: we could ship a local version of the Workers runtime, but we knew there was more to handling a request than just the runtime, which would compromise accuracy. We didn’t want to set our users up to fail with code that works on their machine but not ours.

Shipping the rest of our edge infrastructure to the user would pose its own challenges in keeping it up to date, and it would require the user to install hundreds of unnecessary dependencies, all potentially to end up with the most frustrating experience of all: hitting some installation bug with no explanation to be found on StackOverflow. That experience didn’t sit right with us either.

As it turns out, this is a very similar problem to one we commonly solve for our customers: Running code on the client is fast, but it doesn’t give me the control I need; running code on the server gives me the control I need, but it requires a slow round-trip to the origin. All we had to do was take our own advice and run it on the edge! It’s the best of both worlds: your code runs so close to your end user that you get the same performance as running it on the client, without having to lose control.

To provide developers access to this tight feedback loop, we introduced wrangler dev earlier this year!

wrangler dev has the look and feel of a local development environment: it runs on localhost but tunnels to the edge, and provides output directly to your IDE of choice. Since wrangler dev now runs on the edge, it works on your machine and ours exactly the same!

Our release candidate for wrangler dev is live and waiting for you to take it for a test drive, as easily as:

npm i @cloudflare/wrangler@beta -g
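
Once installed, taking it for a spin inside a Wrangler project is just a couple of commands (assuming wrangler dev’s default local address of 127.0.0.1:8787):

wrangler dev
curl http://127.0.0.1:8787/

Requests to that local address are tunneled to a Cloudflare edge location, and the output streams straight back to your terminal.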

Let us know what you think.

Release

After writing all the code, testing every edge case imaginable, and going through code review, at some point the code needs to be released for the rest of the world to reap the fruits of your hard labor and enjoy the features you’ve built.

For smaller, quick applications, it’s exciting to hit the “Save & deploy” button and let fate take the wheel.

For production level projects, however, the process of deploying to production may be a bit different. Different organizations adopt different processes for code release. For those using GitHub, last year we introduced our GitHub Action, to make it easy to configure an integrated release process.

With Wrangler, you can configure Workers to deploy using your existing CI, to automate deployments, and minimize human intervention.
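
What that looks like depends on your CI system, but the core of it is just a couple of commands. Here is a rough sketch (the secret variable name on the CI side is an assumption about your setup; wrangler itself reads the CF_API_TOKEN environment variable for non-interactive authentication):

# Sketch of a CI deploy step: assumes wrangler is already installed
# and an API token is stored in your CI's secret store
export CF_API_TOKEN="$CI_SECRET_CF_API_TOKEN"
wrangler publish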

When deploying to production, again, feedback becomes extremely important. Some platforms today still take as long as a few minutes to deploy your code. A few minutes may seem trivial, but a few minutes of nervously refreshing, wondering whether your code is live yet, and which version of your code your users are seeing is stressful. This is especially true in a rollback or a bug-fix situation where you want the new version to be live ASAP.

New Workers are deployed globally in less than five seconds, which means new changes are instantaneous. Better yet, since Workers runs on lightweight isolates, newly deployed Workers don’t experience dreaded cold starts, which means you can release code as frequently as you’re able to ship it, without having to invest additional time in auxiliary gadgets to pre-warm your Worker — more time for you to start working on your next feature!

Observe & Resolve

The big red button has been pushed. Dopamine has been replaced with adrenaline, and the instant questions on your mind are: “Did I break anything? And if so, what, and how do I fix it?” These questions are at the core of what the industry calls “observability”.

There are different ways things can break and incidents can manifest themselves: increases in errors, drops in traffic, even a drop in performance could be considered a regression.

To identify these kinds of issues, you need to be able to spot a trend. Raw data, however, is not a very useful medium for spotting trends — humans simply cannot parse raw lines of logs to identify a subtle increase in errors.

This is why we took a two-pronged approach to helping developers identify and fix issues: exposing trend data through analytics, while also providing the ability to tail production logs for forensics and investigation.

Earlier this year, we introduced Workers Metrics: an easy way for developers to identify trends in their production traffic.

With requests metrics, you can easily spot any increases in errors, or drastic changes in traffic patterns after a given release.

Additionally, sometimes new code can introduce unforeseen regressions in the overall performance of the application. With CPU time metrics, our developers are now able to spot changes in the performance of their Worker, as well as use that information to guide and optimize their code.

Once you’ve identified a regression, we wanted to provide the tools needed to find your bug and fix it, which is why we also recently launched wrangler tail: production logs in a single command.

wrangler tail can help diagnose where code is failing or why certain customers are getting unexpected outcomes because it exposes console.log() output and exceptions. By having access to this output, developers can immediately diagnose, fix, and resolve any issues occurring in production.
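
As a quick sketch of what that looks like in practice, here is a Worker that logs a little diagnostic context; none of this output is visible to end users, but wrangler tail streams it to your terminal as requests hit production:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // Streamed by wrangler tail, never shown to the end user
  console.log('handling', request.method, new URL(request.url).pathname)
  try {
    const data = await request.json()   // throws on a malformed body
    return new Response(JSON.stringify({ ok: true, received: data }), {
      headers: { 'content-type': 'application/json' },
    })
  } catch (err) {
    console.log('request failed:', err.message)
    return new Response('Invalid JSON', { status: 400 })
  }
}

Running wrangler tail in the project directory then streams each of those console.log lines, along with any uncaught exceptions, as the deployed Worker handles traffic.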

We know how precious every moment can be when a bad code deploy impacts customer traffic. Luckily, once you’ve found and fixed your bug, it’s only a matter of seconds for users to start benefiting from the fix — unlike other platforms which make you wait as long as five minutes, Workers get deployed globally within five seconds.

Repeat

As you’re thinking about your next feature, you check out a new branch, and the cycle begins all over. We’re excited for you to check out all the improvements we’ve made to the development experience with Workers, all to reduce your time to first dopamine (TTFD).

We are always working on improving it further, looking for places where we can remove every additional bit of friction, and we’d love to hear your feedback as we do so.

Workers Security

Post Syndicated from Kenton Varda original https://blog.cloudflare.com/workers-security/

Hello, I’m an engineer on the Workers team, and today I want to talk to you about security.

Cloudflare is a security company, and the heart of Workers is, in my view, a security project. Running code written by third parties is always a scary proposition, and the primary concern of the Workers team is to make that safe.

For a project like this, it is not enough to pass a security review and say "ok, we’re secure" and move on. It’s not even enough to consider security at every stage of design and implementation. For Workers, security in and of itself is an ongoing project, and that work is never done. There are always things we can do to reduce the risk and impact of future vulnerabilities.

Today, I want to give you an overview of our security architecture, and then address two specific issues that we are frequently asked about: V8 bugs, and Spectre.

Architectural Overview

Let’s start with a quick overview of the Workers Runtime architecture.

There are two fundamental parts of designing a code sandbox: secure isolation and API design.

Isolation

First, we need to create an execution environment where code can’t access anything it’s not supposed to.

For this, our primary tool is V8, the JavaScript engine developed by Google for use in Chrome. V8 executes code inside "isolates", which prevent that code from accessing memory outside the isolate — even within the same process. Importantly, this means we can run many isolates within a single process. This is essential for an edge compute platform like Workers where we must host many thousands of guest apps on every machine, and rapidly switch between these guests thousands of times per second with minimal overhead. If we had to run a separate process for every guest, the number of tenants we could support would be drastically reduced, and we’d have to limit edge compute to a small number of big enterprise customers who could pay a lot of money. With isolate technology, we can make edge compute available to everyone.

Sometimes, though, we do decide to schedule a worker in its own private process. We do this if it uses certain features that we feel need an extra layer of isolation. For example, when a developer uses the devtools debugger to inspect their worker, we run that worker in a separate process. This is because historically, in the browser, the inspector protocol has only been usable by the browser’s trusted operator, and therefore has not received as much security scrutiny as the rest of V8. In order to hedge against the increased risk of bugs in the inspector protocol, we move inspected workers into a separate process with a process-level sandbox. We also use process isolation as an extra defense against Spectre, which I’ll describe later in this post.

Additionally, even for isolates that run in a shared process with other isolates, we run multiple instances of the whole runtime on each machine, which we call "cordons". Workers are distributed among cordons by assigning each worker a level of trust and separating less-trusted workers from those we trust more highly. As one example of this in operation: a customer who signs up for our free plan will not be scheduled in the same process as an enterprise customer. This provides some defense-in-depth in case a zero-day security vulnerability is found in V8. But I’ll talk more about V8 bugs, and how we address them, later in this post.

At the whole-process level, we apply another layer of sandboxing for defense in depth. The "layer 2" sandbox uses Linux namespaces and seccomp to prohibit all access to the filesystem and network. Namespaces and seccomp are commonly used to implement containers. However, our use of these technologies is much stricter than what is usually possible in container engines, because we configure namespaces and seccomp after the process has started (but before any isolates have been loaded). This means, for example, we can (and do) use a totally empty filesystem (mount namespace) and use seccomp to block absolutely all filesystem-related system calls. Container engines can’t normally prohibit all filesystem access because doing so would make it impossible to use exec() to start the guest program from disk; in our case, our guest programs are not native binaries, and the Workers runtime itself has already finished loading before we block filesystem access.

The layer 2 sandbox also totally prohibits network access. Instead, the process is limited to communicating only over local Unix domain sockets, to talk to other processes on the same system. Any communication to the outside world must be mediated by some other local process outside the sandbox.

One such process in particular, which we call the "supervisor", is responsible for fetching worker code and configuration from disk or from other internal services. The supervisor ensures that the sandbox process cannot read any configuration except that which is relevant to the workers that it should be running.

For example, when the sandbox process receives a request for a worker it hasn’t seen before, that request includes the encryption key for that worker’s code (including attached secrets). The sandbox can then pass that key to the supervisor in order to request the code. The sandbox cannot request any worker for which it has not received the appropriate key. It cannot enumerate known workers. It also cannot request configuration it doesn’t need; for example, it cannot request the TLS key used for HTTPS traffic to the worker.

Aside from reading configuration, the other reason for the sandbox to talk to other processes on the system is to implement APIs exposed to Workers. Which brings us to API design.

API Design

There is a saying: "If a tree falls in the forest, but no one is there to hear it, does it make a sound?" I have a related saying: "If a Worker executes in a fully-isolated environment in which it is totally prevented from communicating with the outside world, does it actually run?"

Complete code isolation is, in fact, useless. In order for Workers to do anything useful, they have to be allowed to communicate with users. At the very least, a Worker needs to be able to receive requests and respond to them. It would also be nice if it could send requests to the world, safely. For that, we need APIs.

In the context of sandboxing, API design takes on a new level of responsibility. Our APIs define exactly what a Worker can and cannot do. We must be very careful to design each API so that it can only express operations which we want to allow, and no more. For example, we want to allow Workers to make and receive HTTP requests, while we do not want them to be able to access the local filesystem or internal network services.

Let’s dig into the easier example first. Currently, Workers does not allow any access to the local filesystem. Therefore, we do not expose a filesystem API at all. No API means no access.

But, imagine if we did want to support local filesystem access in the future. How would we do that? We obviously wouldn’t want Workers to see the whole filesystem. Imagine, though, that we wanted each Worker to have its own private directory on the filesystem where it can store whatever it wants.

To do this, we would use a design based on capability-based security. Capabilities are a big topic, but in this case, what it would mean is that we would give the worker an object of type Directory, representing a directory on the filesystem. This object would have an API that allows creating and opening files and subdirectories, but does not permit traversing "up" the tree to the parent directory. Effectively, each worker would see its private Directory as if it were the root of their own filesystem.
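
To make that concrete, here is a purely hypothetical sketch of what such an API could look like from the Worker’s point of view. Workers has no filesystem API today, so every name below (Directory, openDirectory, createFile) is invented for illustration:

// Hypothetical API sketch — no such filesystem capability exists in Workers today.
// The worker is handed a Directory capability for its own private subtree.
async function saveLog(dir, name, contents) {
  const logs = await dir.openDirectory('logs')  // traversing *down* is allowed
  const file = await logs.createFile(name)      // creating files inside it is allowed
  await file.write(contents)
}
// Crucially, Directory has no parent() method and accepts no absolute paths,
// so "escape to the rest of the filesystem" simply cannot be expressed.
// What the API cannot express, the code cannot do.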

How would such an API be implemented? As described above, the sandbox process cannot access the real filesystem, and we’d prefer to keep it that way. Instead, file access would be mediated by the supervisor process. The sandbox talks to the supervisor using Cap’n Proto RPC, a capability-based RPC protocol. (Cap’n Proto is an open source project currently maintained by the Cloudflare Workers team.) This protocol makes it very easy to implement capability-based APIs, so that we can strictly limit the sandbox to accessing only the files that belong to the Workers it is running.

Now what about network access? Today, Workers are allowed to talk to the rest of the world only via HTTP — both incoming and outgoing. There is no API for other forms of network access, therefore it is prohibited (though we plan to support other protocols in the future).

As mentioned before, the sandbox process cannot connect directly to the network. Instead, all outbound HTTP requests are sent over a Unix domain socket to a local proxy service. That service implements restrictions on the request. For example, it verifies that the request is either addressed to a public Internet service, or to the Worker’s zone’s own origin server, not to internal services that might be visible on the local machine or network. It also adds a header to every request identifying the worker from which it originates, so that abusive requests can be traced and blocked. Once everything is in order, the request is sent on to our HTTP caching layer, and then out to the Internet.

Similarly, inbound HTTP requests do not go directly to the Workers Runtime. They are first received by an inbound proxy service. That service is responsible for TLS termination (the Workers Runtime never sees TLS keys), as well as identifying the correct Worker script to run for a particular request URL. Once everything is in order, the request is passed over a Unix domain socket to the sandbox process.

V8 bugs and the "patch gap"

Every non-trivial piece of software has bugs, and sandboxing technologies are no exception. Virtual machines have bugs, containers have bugs, and yes, isolates (which we use) also have bugs. We can’t live life pretending that no further bugs will ever be discovered; instead, we must assume they will and plan accordingly.

We rely heavily on isolation provided by V8, the JavaScript engine built by Google for use in Chrome. This has good sides and bad sides. On one hand, V8 is an extraordinarily complicated piece of technology, creating a wider "attack surface" than virtual machines. More complexity means more opportunities for something to go wrong. On the bright side, though, an extraordinary amount of effort goes into finding and fixing V8 bugs, owing to its position as arguably the most popular sandboxing technology in the world. Google regularly pays out 5-figure bounties to anyone finding a V8 sandbox escape. Google also operates fuzzing infrastructure that automatically finds bugs faster than most humans can. Google’s investment does a lot to minimize the danger of V8 "zero-days" — bugs that are found by the bad guys and not known to Google.

But, what happens after a bug is found and reported by the good guys? V8 is open source, so fixes for security bugs are developed in the open and released to everyone at the same time — good guys and bad guys. It’s important that any patch be rolled out to production as fast as possible, before the bad guys can develop an exploit.

The time between publishing the fix and deploying it is known as the "patch gap". Earlier this year, Google announced that Chrome’s patch gap had been reduced from 33 days to 15 days.

Fortunately, we have an advantage here, in that we directly control the machines on which our system runs. We have automated almost our entire build and release process, so the moment a V8 patch is published, our systems automatically build a new release of the Workers Runtime and, after one-click sign-off from the necessary (human) reviewers, automatically push that release out to production.

As a result, our patch gap is now under 24 hours. A patch published by V8’s team in Munich during their work day will usually be in production before the end of our work day in the US.

Spectre: Introduction

We get a lot of questions about Spectre. The V8 team at Google has stated in no uncertain terms that V8 itself cannot defend against Spectre. Since Workers relies on V8 for sandboxing, many have asked if that leaves Workers vulnerable. However, we do not need to depend on V8 for this; the Workers environment presents many alternative approaches to mitigating Spectre.

Spectre is complicated and nuanced, and there’s no way I can cover everything there is to know about it or how Workers addresses it in a single blog post. But, hopefully I can clear up some of the confusion and concern.

What is it?

Spectre is a class of attacks in which a malicious program can trick the CPU into "speculatively" performing computation using data that the program is not supposed to have access to. The CPU eventually realizes the problem and does not allow the program to see the results of the speculative computation. However, the program may be able to derive bits of the secret data by looking at subtle side effects of the computation, such as the effects on cache.

For more background about Spectre, check out our Learning Center page on the topic.

Why does it matter for Workers?

Spectre encompasses a wide variety of vulnerabilities present in modern CPUs. The specific vulnerabilities vary by architecture and model, and it is likely that many vulnerabilities exist which haven’t yet been discovered.

These vulnerabilities are a problem for every cloud compute platform. Any time you have more than one tenant running code on the same machine, Spectre attacks come into play. However, the "closer together" the tenants are, the more difficult it can be to mitigate specific vulnerabilities. Many of the known issues can be mitigated at the kernel level (protecting processes from each other) or at the hypervisor level (protecting VMs), often with the help of CPU microcode updates and various tricks (many of which, unfortunately, come with serious performance impact).

In Cloudflare Workers, we isolate tenants from each other using V8 isolates — not processes nor VMs. This means that we cannot necessarily rely on OS or hypervisor patches to "solve" Spectre for us. We need our own strategy.

Why not use process isolation?

Cloudflare Workers is designed to run your code in every single Cloudflare location, of which there are currently 200 worldwide and growing.

We wanted Workers to be a platform that is accessible to everyone — not just big enterprise customers who can pay megabucks for it. We need to handle a huge number of tenants, where many tenants get very little traffic.

Combine these two points, and things get tricky.

A typical, non-edge serverless provider could handle a low-traffic tenant by sending all of that tenant’s traffic to a single machine, so that only one copy of the application needs to be loaded. If the machine can handle, say, a dozen tenants, that’s plenty. That machine can be hosted in a mega-datacenter with literally millions of machines, achieving economies of scale. However, this centralization incurs latency and worldwide bandwidth costs when the users don’t happen to be nearby.

With Workers, on the other hand, every tenant, regardless of traffic level, currently runs in every Cloudflare location. And in our quest to get as close to the end user as possible, we sometimes choose locations that only have space for a limited number of machines. The net result is that we need to be able to host thousands of active tenants per machine, with the ability to rapidly spin up inactive ones on-demand. That means that each guest cannot take more than a couple megabytes of memory — hardly enough space for a call stack, much less everything else that a process needs.

Moreover, we need context switching to be extremely cheap. Many Workers resident in memory will only handle an event every now and then, and many Workers spend only a fraction of a millisecond on any particular event. In this environment, a single core can easily find itself switching between thousands of different tenants every second. On top of that, to handle one event, a significant amount of communication needs to happen between the guest application and its host, meaning still more switching and communications overhead. If each tenant lives in its own process, all this overhead is orders of magnitude larger than if many tenants live in a single process. When using strict process isolation in Workers, we find the CPU cost can easily be 10x what it is with a shared process.

In order to keep Workers inexpensive, fast, and accessible to everyone, we must solve these issues, and that means we must find a way to host multiple tenants in a single process.

There is no "fix" for Spectre

A dirty secret that the industry doesn’t like to admit: no one has "fixed" Spectre. Not even when using heavyweight virtual machines. Everyone is still vulnerable.

The current approach being taken by most of the industry is essentially a game of whack-a-mole. Every couple months, researchers uncover a new Spectre vulnerability. CPU vendors release some new microcode, OS vendors release kernel patches, and everyone has to update.

But is it enough to merely deploy the latest patches?

It is abundantly clear that many more vulnerabilities exist, but haven’t yet been publicized. Who might know about those vulnerabilities? Most of the bugs being published are being found by (very smart) graduate students on a shoestring budget. Imagine, for a minute, how many more bugs a well-funded government agency, able to buy the very best talent in the world, could be uncovering.

To truly defend against Spectre, we need to take a different approach. It’s not enough to block individual known vulnerabilities. We must address the entire class of vulnerabilities at once.

We can’t stop it, but we can slow it down

Unfortunately, it’s unlikely that any catch-all "fix" for Spectre will be found. But for the sake of argument, let’s try.

Fundamentally, all Spectre vulnerabilities use side channels to detect hidden processor state. Side channels, by definition, involve observing some non-deterministic behavior of a system. Conveniently, most software execution environments try hard to eliminate non-determinism, because non-deterministic execution makes applications unreliable.

However, there are a few sorts of non-determinism that are still common. The most obvious among these is timing. The industry long ago gave up on the idea that a program should take the same amount of time every time it runs, because deterministic timing is fundamentally at odds with heuristic performance optimization. Sure enough, most Spectre attacks focus on timing as a way to detect the hidden microarchitectural state of the CPU.

Some have proposed that we can solve this by making timers inaccurate or adding random noise. However, it turns out that this does not stop attacks; it only makes them slower. If the timer tracks real time at all, then anything you can do to make it inaccurate can be overcome by running an attack multiple times and using statistics to filter out the noise.

Many security researchers see this as the end of the story. What good is slowing down an attack, if the attack is still possible? Once the attacker gets your private key, it’s game over, right? What difference does it make if it takes them a minute or a month?

Cascading Slow-downs

We find that, actually, measures that slow down an attack can be powerful.

Our key insight is this: as an attack becomes slower, new techniques become practical to make it even slower still. The goal, then, is to chain together enough techniques that an attack becomes so slow as to be uninteresting.

Much of cryptography, after all, is technically vulnerable to "brute force" attacks — technically, with enough time, you can break it. But when the time required is thousands (or even billions) of years, we decide that this is good enough.

So, what do we do to slow down Spectre attacks to the point of meaninglessness?

Freezing a Spectre Attack

Step 0: Don’t allow native code

We do not allow our customers to upload native-code binaries to run on our network. We only accept JavaScript and WebAssembly. Of course, many other languages, like Python, Rust, or even Cobol, can be compiled or transpiled to one of these two formats; the important point is that we do another pass on our end, using V8, to convert these formats into true native code.

This, in itself, doesn’t necessarily make Spectre attacks harder. However, I present this as step 0 because it is fundamental to enabling everything else below.

Accepting native code programs implies being beholden to an existing CPU architecture (typically, x86). In order to execute code with reasonable performance, it is usually necessary to run the code directly on real hardware, severely limiting the host’s control over how that execution plays out. For example, a kernel or hypervisor has no ability to prohibit applications from invoking the CLFLUSH instruction, an instruction which is very useful in side channel attacks and almost nothing else.

Moreover, supporting native code typically implies supporting whole existing operating systems and software stacks, which bring with them decades of expectations about how the architecture works under them. For example, x86 CPUs allow a kernel or hypervisor to disable the RDTSC instruction, which reads a high-precision timer. Realistically, though, disabling it will break many programs because they are implemented to use RDTSC any time they want to know the current time.

Supporting native code would bind our hands in terms of mitigation techniques. By using an abstract intermediate format, we have much greater freedom.

Step 1: Disallow timers and multi-threading

In Workers, you can get the current time using the JavaScript Date API, for example by calling Date.now(). However, the time value returned by this is not really the current time. Instead, it is the time at which the network message was received which caused the application to begin executing. While the application executes, time is locked in place. For example, say an attacker writes:

let start = Date.now();
for (let i = 0; i < 1e6; i++) {
  doSpectreAttack();
}
let end = Date.now();

The values of start and end will always be exactly the same. The attacker cannot use Date to measure the execution time of their code, which they would need to do to carry out an attack.

As an aside: This is a measure we actually implemented in mid-2017, long before Spectre was announced (and before we knew about it). We implemented this measure because we were worried about timing side channels in general. Side channels have been a concern of the Workers team from day one, and we have designed our system from the ground up with this concern in mind.

Related to our taming of Date, we also do not permit multi-threading or shared memory in Workers. Everything related to the processing of one event happens on the same thread — otherwise, it would be possible to "race" threads in order to "MacGyver" an implicit timer. We don’t even allow multiple Workers operating on the same request to run concurrently. For example, if you have installed a Cloudflare App on your zone which is implemented using Workers, and your zone itself also uses Workers, then a request to your zone may actually be processed by two Workers in sequence. These run in the same thread.

So, we have prevented code execution time from being measured locally. However, that doesn’t actually prevent it from being measured: it can still be measured remotely. For example, the HTTP client that is sending a request to trigger the execution of the Worker can measure how long it takes for the Worker to respond. Of course, such a measurement is likely to be very noisy, since it would have to traverse the Internet. Such noise can be overcome, in theory, by executing the attack many times and taking an average.

Another aside: Some people have suggested that if a serverless platform like Workers were to completely reset an application’s state between requests, so that every request "starts fresh", this would make attacks harder. That is, imagine that a Worker’s global variables were reset after every request, meaning you cannot store state in globals in one request and then read that state in the next. Then, doesn’t that mean the attack has to start over from scratch for every request? If each request is limited to, say, 50ms of CPU time, does that mean that a Spectre attack isn’t possible, because there’s not enough time to carry it out? Unfortunately, it’s not so simple. State doesn’t have to be stored in the Worker; it could instead be stored in a conspiring client. The server can return its state to the client in each response, and the client can send it back to the server in the next request.
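
A hedged sketch of that "conspiring client" pattern, just to illustrate why resetting per-request state doesn’t help (the per-request "work" here is a stand-in, not a real attack):

// Illustrative only: the Worker stores nothing between requests.
// The client carries the accumulated state and sends it back in each time.
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const state = await request.json()               // state so far, supplied by the client
  state.iterations = (state.iterations || 0) + 1   // stand-in for one more step of work
  return new Response(JSON.stringify(state), {
    headers: { 'content-type': 'application/json' },
  })
}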

But is an attack based on remote timers really feasible in practice? In adversarial testing, with help from leading Spectre experts, we have not been able to develop an attack that actually works in production.

However, we don’t feel the lack of a working attack means we should stop building defenses. Instead, we’re currently testing some more advanced measures, which we plan to roll out in the coming weeks.

Step 2: Dynamic Process Isolation

We know that if an attack is possible at all, it would take a very long time to run — hours at the very least, maybe as long as weeks. But once an attack has been running even for a second, we have a huge amount of new data that we can use to trigger further measures.

Spectre attacks, you see, do a lot of "weird stuff" that you wouldn’t usually expect to see in a normal program. These attacks intentionally try to create pathological performance scenarios in order to amplify microarchitectural effects. This is especially true when the attack has already been forced to run billions of times in a loop in order to overcome other mitigations, like those discussed above. This tends to show up in metrics like CPU performance counters.

Now, the usual problem with using performance metrics to detect Spectre attacks is that sometimes you get false positives. Sometimes, a legitimate program behaves really badly. You can’t go around shutting down every app that has bad performance.

Luckily, we don’t have to. Instead, we can choose to reschedule any Worker with suspicious performance metrics into its own process. As I described above, we can’t do this with every Worker, because the overhead would be too high. But, it’s totally fine to process-isolate just a few Workers, defensively. If the Worker is legitimate, it will keep operating just fine, albeit with a little more overhead. Fortunately for us, the nature of our platform is such that we can reschedule a Worker into its own process at basically any time.

In fact, fancy performance-counter based triggering may not even be necessary here. If a Worker merely uses a large amount of CPU time per event, then the overhead of isolating it in its own process is relatively less, because it switches context less often. So, we might as well use process isolation for any Worker that is CPU-hungry.

Once a Worker is isolated, we can rely on the operating system’s Spectre defenses, just as, for example, most desktop web browsers now do.

Over the past year we’ve been working with the experts at Graz Technical University to develop this approach. (TU Graz’s team co-discovered Spectre itself, and has been responsible for a huge number of the follow-on discoveries since then.) We have developed the ability to dynamically isolate workers, and we have identified metrics which reliably detect attacks. The whole system is currently undergoing testing to work out any remaining bugs, and we expect to roll it out fully within the next several weeks.

But wait, didn’t I say earlier that even process isolation isn’t a complete defense, because it only addresses known vulnerabilities? Yes, this is still true. However, the trend over time is that new Spectre attacks tend to be slower and slower to carry out, and hence we can reasonably guess that by imposing process isolation we have further slowed down even attacks that we don’t know about yet.

Step 3: Periodic Whole-Memory Shuffling

After Step 2, we already think we’ve prevented all known attacks, and we’re only worried about hypothetical unknown attacks. How long does a hypothetical unknown attack take to carry out? Well, obviously, nobody knows. But with all the mitigations in place so far, and considering that new attacks have generally been slower than older ones, we think it’s reasonable to guess attacks will take days or longer.

On a time scale of a day, we have new things we can do. In particular, it’s totally reasonable to restart the entire Workers runtime on a daily basis, which resets the locations of everything in memory, forcing attacks to restart the process of discovering the locations of secrets.

We can also reschedule Workers across physical machines or cordons, so that the window to attack any particular neighbor is limited.

In general, because Workers are fundamentally preemptible (unlike containers or VMs), we have a lot of freedom to frustrate attacks.

Once we have dynamic process isolation fully deployed, we plan to develop these ideas next. We see this as an ongoing investment, not something that will ever be "done".

Conclusion

Phew. You just read twelve pages about Workers security. Hopefully I’ve convinced you that designing a secure sandbox is only the beginning of building a secure compute platform, and the real work is never done. Popular security culture often dwells on clever hacks and clean fixes. But for the difficult real-world problems, often there is no right answer or simple fix, only the hard work of building defenses thicker and thicker.

The Migration of Legacy Applications to Workers

Post Syndicated from Kirk Schwenkler original https://blog.cloudflare.com/the-migration-of-legacy-applications-to-workers/

As Cloudflare Workers, and other Serverless platforms, continue to drive down costs while making it easier for developers to stand up globally scaled applications, the migration of legacy applications is becoming increasingly common. In this post, I want to show how easy it is to migrate such an application onto Workers. To demonstrate, I’m going to use a common migration scenario: moving a legacy application — on an old compute platform behind a VPN or in a private cloud — to a serverless compute platform behind zero-trust security.

Wait but why?

Before we dive further into the technical work, however, let me just address up front: why would someone want to do this? What benefits would they get from such a migration? In my experience, there are two sets of reasons: (1) factors that are “pushing” off legacy platforms, or the constraints and problems of the legacy approach; and (2) factors that are “pulling” onto serverless platforms like Workers, which speaks to the many benefits of this new approach. In terms of the push factors, we often see three core ones:

  • Legacy compute resources are not flexible and must be constantly maintained, often leading to capacity constraints or excess cost;
  • Maintaining VPN credentials is cumbersome, and can introduce security risks if not done properly;
  • VPN client software can be challenging for non-technical users to operate.

Similarly, there are some very key benefits “pulling” folks onto Serverless applications and zero-trust security:

  • Instant scaling, up or down, depending on usage. No capacity constraints, and no excess cost;
  • No separate credentials to maintain, users can use Single Sign On (SSO) across many applications;
  • VPN hardware, private cloud infrastructure, and existing compute can be retired to simplify operations and reduce cost

While the benefits to this more modern end-state are clear, there’s one other thing that causes organizations to pause: the costs in time and migration effort seem daunting. Often what organizations find is that migration is not as difficult as they fear. In the rest of this post, I will show you how Cloudflare Workers, and the rest of the Cloudflare platform, can greatly simplify migrations and help you modernize all of your applications.

Getting Started

To take you through this, we will use a contrived application I’ve written in Node.js to illustrate the steps we would take with a real, and far more complex, example. The goal is to show the different tools and features you can use at each step, and how our platform design supports development and cutover of an application. We’ll use four key Cloudflare technologies as we see how to move this application off of my laptop and into the cloud:

  1. Serverless Compute through Workers
  2. Robust Developer-focused Tooling for Workers via Wrangler
  3. Zero-Trust security through Access
  4. Instant, Secure Origin Tunnels through Argo Tunnels

Our example application for today is called Post Process, and it performs business logic on input provided in an HTTP POST body. It takes the input data from authenticated clients, performs a processing task, and responds with the result in the body of an HTTP response. The server runs in Node.js on my laptop.

Since the example application is written in Node.js, we will be able to directly copy some of the JavaScript assets for our new application. You could follow this “direct port” method not only for JavaScript applications, but even for applications in our other WASM-supported languages. For other languages, you first need to rewrite or transpile into one with WASM support.

Into our Application

Our basic example will perform only simple text processing, so that we can focus on the broad features of the migration. I’ve set up an unauthenticated copy (using Workers, to give us a scalable and reliable place to host it) at https://postprocess-workers.kirk.workers.dev/postprocess where you can see how it operates. Here is an example cURL:

curl -X POST https://postprocess-workers.kirk.workers.dev/postprocess --data '{"operation":"2","data":"Data-Gram!"}'

The relevant takeaways from the code itself are pretty simple:

  • There are two code modules, which conveniently split the application logic completely from the Preprocessing / HTTP interface.
  • The application logic module exposes one function postProcess(object) where object is the parsed JSON of the POST body. It returns a JavaScript object, ready to be encoded into a string in the JSON HTTP response. This module can be run on Workers JavaScript, with no changes. It only needs a new preprocessing / HTTP interface.
  • The Preprocessing / HTTP interface runs on raw Node.js and exposes a local HTTP server. The server does not directly take inbound traffic from the Internet, but sits behind a gateway which controls access to the service.

Code snippet from Node.js HTTP module

const server = http.createServer((req, res) => {
    if (req.url == '/postprocess') {
        if(req.method == 'POST') {
                gatherPost(req, data => {
                        try{
                                jsonData = JSON.parse(data)
                        } catch (e) {
                                res.end('Invalid JSON payload! \n')
                                return
                        }
                        result = postProcess(jsonData)
                        res.write(JSON.stringify(result) + '\n');
                        res.end();
                })
        } else {
                res.end('Invalid Method, only POST is supported! \nPlease send a POST with data in format {"operation":"1","data":"Data-Gram!"} \n')
        }
    } else {
        res.end('Invalid request. Did you mean to POST to /postprocess? \n');
    }
});

Code snippet from Node.js logic module

function postProcess (postJson) {
        const ServerVersion = "2.5.17"
        if(postJson != null && 'operation' in postJson && 'data' in postJson){
                var output
                var operation = postJson['operation']
                var data = postJson['data']
                switch(operation){
                        case "1":
                              output = String(data).toLowerCase()
                              break
                        case "2":
                              d = data + "\n"
                              output = d + d + d
                              break
                        case "3":
                              output = ServerVersion
                              break
                        default:
                              output = "Invalid Operation"
                }
                return {'Output': output}
        }
        else{
                return {'Error':'Invalid request, invalid JSON format'}
        }
}

Current State Application Architecture

Design Decisions

With all this information in hand, we can arrive at the details of our new Cloudflare-based design:

  1. Keep the business logic completely intact, and specifically use the same .js asset
  2. Build a new preprocessing layer in Workers to replace the Node.js module
  3. Use Cloudflare Access to authenticate users to our application

Target State Application Architecture

Finding the first win

One good way to make a migration successful is to find a quick win early on: a useful task which can be executed while other work is still ongoing. It is even better if the quick win also benefits the eventual cutover. We can find a quick win here if we solve the zero-trust security problem ahead of the compute problem, by putting Cloudflare’s security in front of the existing application.

We will do this by using Cloudflare’s Argo Tunnel feature to securely connect to the existing application, and Access for zero-trust authentication. Below, you can see how easy this process is for any command-line user, with our cloudflared tool.

I open up a terminal and use cloudflared tunnel login, which presents me with an authentication flow. I then use the cloudflared tunnel --hostname postprocess.kschwenkler.com --url localhost:8080 command to connect an Argo Tunnel between the “url” (my local server) and the “hostname” (the new, public address we will use on my Cloudflare zone).

Next I flip over to my Cloudflare dashboard, and attach an Access Policy to the “hostname” I specified before. We will be using the Service Token mode of Access, which generates a client-specific security token that the client can attach to each HTTP POST. Other modes are better suited to interactive browser use cases.

Now, without using the VPN, I can send a POST to the service, still running on Node.js on my laptop, from any Internet-connected device which has the correct token! It has taken only a few minutes to add zero-trust security to this application and safely expose it to the Internet, while still running on a legacy compute platform (my laptop!).
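
For reference, a request carrying a service token looks something like the sketch below. The header names are the ones Cloudflare Access expects for service tokens; the values are placeholders for the credentials generated in the dashboard:

curl -X POST https://postprocess.kschwenkler.com/postprocess \
  -H "CF-Access-Client-Id: <client-id-from-the-dashboard>" \
  -H "CF-Access-Client-Secret: <client-secret-from-the-dashboard>" \
  --data '{"operation":"2","data":"Data-Gram!"}'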

“Quick Win” Architecture

Beyond the direct benefit of a huge security upgrade, we’ve also made our eventual application migration much easier by putting the traffic through the target-state API gateway already. We will see later how we can surgically move traffic to the new application for testing, in this state.

Lift to the Cloud

With our zero-trust security benefits in hand, and our traffic running through Cloudflare, we can now proceed with the migration of the application itself to Workers. We’ll be using the Wrangler tooling to make this process very easy.

As noted when we first looked at the code, this contrived application exposes a very clean interface between the Node.js-specific HTTP module, which we need to replace, and the business logic postprocess module which we can use as is with Workers. We’ll first need to re-write the HTTP module, and then bundle it with the existing business logic into a new Workers application.

Here is a handwritten example of the basic pattern we’ll use for the HTTP module. We can see how the Service Workers API makes it very easy to grab the POST body with await, and how the JSON interface lets us easily pass the data to the postprocess module we took directly from the initial Node.js app.

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  let requestData
  try {
    // Grab and parse the POST body
    requestData = await request.json()
  } catch (e) {
    return new Response("Invalid JSON", {status: 500})
  }
  // Hand the parsed object straight to the unchanged business logic module
  const response = new Response(JSON.stringify(postProcess(requestData)))
  return response
}

For our work on the mock application, we will go a slightly different route, more in line with a real application that would be more complex. Instead of writing this by hand, we will use Wrangler and our Router template to build the new front end from a robust framework.

We’ll run wrangler generate post-process-workers https://github.com/cloudflare/worker-template-router to initialize a new Wrangler project with the Router template. Most of the configuration for this template will work as is; we just have to update account_id in our wrangler.toml and make a few small edits to the code in index.js.

Below is our index.js after my edits. Note the line const postProcess = require('./postProcess.js') at the start of the new HTTP module – this tells Wrangler to include the original business logic from the legacy app’s postProcess.js module, which I will copy into our working directory.

const Router = require('./router')
const postProcess = require('./postProcess.js')

addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request))
})

async function handler(request) {
    const init = {
        headers: { 'content-type': 'application/json' },
    }
    const body = JSON.stringify(postProcess(await request.json()))
    return new Response(body, init)
}

async function handleRequest(request) {
    const r = new Router()
    r.post('.*/postprocess*', request => handler(request))
    r.get('/', () => new Response('Hello worker!')) // return a default message for the root route

    const resp = await r.route(request)
    return resp
}

Now we can simply run wrangler publish to put our application on workers.dev for testing! The Router template’s defaults, and the small edits made above, are all we need. Since Wrangler automatically exposes the test application to the Internet (note that we can also put the test application behind Access, with a slightly modified method), we can easily send test traffic from any device.


Shift, Safely!

With our application up for testing on workers.dev, we finally come to the last and most daunting migration step: cutting over traffic from the legacy application to the new one without any service interruption.

Luckily, we had our quick win earlier and are already routing our production traffic through the Cloudflare network (to the legacy application via Argo Tunnels). This provides huge benefits now that we are at the cutover step. Without changing our IP address, SSL configuration, or any other client-facing properties, we can route traffic to the new application with just one wrangler command.

Seamless cutover from Transition to Target state


We simply modify wrangler.toml to indicate the production domain/route we’d like the application to operate on, and run wrangler publish. As soon as Cloudflare receives this update, it sends production traffic to our new application instead of the Argo Tunnel. We have configured the application to send a ‘version’ header, which lets us verify the cutover easily using curl.
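
If you would rather script that check than run curl by hand, the same verification takes only a few lines of JavaScript; the production URL and the header’s value below are illustrative.

// Read the 'version' response header to confirm which backend served the request.
// The URL and expected value are placeholders.
const res = await fetch('https://api.example.com/postprocess', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({ records: [1, 2, 3] }),
})
console.log(res.headers.get('version')) // e.g. "workers" once the cutover is live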


Rollback, if it is needed, is also very easy. We can either set wrangler.toml back to the workers.dev-only mode and run wrangler publish again, or delete our route manually. Either will send traffic back to the Argo Tunnel.


In Conclusion

Clearly, a real application will be more complex than our example above. It may have multiple components, with complex interactions, which must each be handled in turn. Argo Tunnel might remain in use, to connect to a data store or other application outside of our network. We might use WASM to support modules written in other languages. In any of these scenarios, Cloudflare’s Wrangler tooling and Serverless capabilities will help us work through the complexities and achieve success.

I hope that this simple example has helped you to see how Wrangler, cloudflared, Workers, and our entire global network can work together to make migrations as quick and hassle-free as possible. Whether it is this case of an old application behind a VPN, or another application that has outgrown its current home, our Workers platform and Wrangler tooling will scale to meet your business needs.

Trailblazing a Development Environment for Workers

Post Syndicated from Avery Harnish original https://blog.cloudflare.com/trailblazing-a-development-environment-for-workers/


When I arrived at Cloudflare for an internship in the summer of 2018, I was taken on a tour, introduced to my mentor who took me out for coffee (shoutout to Preston), and given a quick whiteboard overview of how Cloudflare works. Each of the interns would work on a small project of their own and they’d try to finish them by the end of the summer. The description of the project I was given on my very first day read something along the lines of “implementing signed exchanges in a Cloudflare Worker to fix the AMP URL attribution problem,” which was a lot to take in at once. I asked so many questions those first couple of weeks. What are signed exchanges? Can I put these stickers on my laptop? What’s a Cloudflare Worker? Is there a limit to how much Topo Chico I can take from the fridge? What’s the AMP URL attribution problem? Where’s the bathroom?

I got the answers to all of those questions (and more!) and eventually landed a full-time job at Cloudflare. Here’s the story of my internship and working on the Workers Developer Experience team at Cloudflare.

Getting Started with Workers in 2018

After doing a lot of reading, and asking a lot more questions, it was time to start coding. I set up a Cloudflare account with a Workers subscription, and was greeted with a page that looked something like this:


I was able to change the code in the text area on the left, click “Update”, and the changes would be reflected on the right — fairly self-explanatory. There was also a testing tab which allowed me to handcraft HTTP requests with different methods and custom headers. So far so good.

As my project evolved, it became clear that I needed to leave the Workers editor behind. Anything more than a one-off script tends to require JavaScript modules and multiple files. I spent some time setting up a local development environment for myself with npm and webpack (see, purgatory: a place or state of temporary suffering. merriam-webster.com).

After I finally got everything working, my iteration cycle looked a bit like this:

  1. Make a change to my code
  2. Run npm run build (which ran webpack and bundled my code in a single script)
  3. Open ./dist/worker.min.js (the output from my build step)
  4. Copy the entire contents of the built Worker to my clipboard
  5. Switch to the Cloudflare Workers Dashboard
  6. Paste my script into the Workers editor
  7. Click update
  8. Investigate the behavior of my recently modified script
  9. Rinse and repeat

There were two main things here that were decidedly not a fantastic developer experience:

  1. Inspecting the value of a variable by adding a console.log statement would take me ~2-3 minutes and involve lots of manual steps to perform a full rebuild.
  2. I was unable to use familiar HTTP clients such as cURL and Postman without deploying to production. This was because the Workers Preview UI was an iframe nested in the dashboard.

Luckily for me, Cloudflare Workers deploy globally incredibly quickly, so I could push the latest iteration of my Worker, wait just a few seconds for it to go live, and cURL away.

A Better Workers Developer Experience in 2019

Shortly after we shipped AMP Real URL, Cloudflare released Wrangler, the official CLI tool for developing Workers, and I was hired full time to work on it. Wrangler came with a feature that automated steps 2-7 of my workflow by running the command wrangler preview, which was a significant improvement. Running the command would build my Worker and open the browser automatically for me so I could see log messages and test out HTTP requests. That summer, our intern Matt Alonso created wrangler preview --watch. This command automatically updates the Workers preview window when changes are made to your code. You can read more about that here. This was, yet again, another improvement over my old friend Build and Open and Copy and Switch Windows and Paste Forever and Ever, Amen. But there was still no way that I could test my Worker with any HTTP client I wanted without deploying to production — I was still locked in to using the nested iframe.

A few months ago we decided it was time to do something about it. To the whiteboard!

Enter wrangler dev

Most web developers are familiar with developing their applications on localhost, and since Wrangler is written in Rust, we could start up a server on localhost to handle requests to a Worker. The idea was to start a server on localhost, transform incoming requests, and send them off to a preview session running on a Cloudflare server.

Proof of Concept

What we came up with ended up looking a little something like this — when a developer runs wrangler dev, do the following:


  1. Build the Worker
  2. Upload the Worker via the Cloudflare API as a previewable Worker
  3. The Cloudflare API takes the uploaded script and creates a preview session, and returns an access token
  4. Start listening for incoming HTTP requests at localhost:8787

Top secret fact: 8787 spells out Rust on a phone numpad. Happy Easter!

  5. All incoming requests to localhost:8787 are modified:

  • All headers are prepended with cf-ew-raw- (for instance, X-Auth-Header would become cf-ew-raw-X-Auth-Header)
  • The URL is changed to https://rawhttp.cloudflareworkers.com/${path}
  • The Host header is changed to rawhttp.cloudflareworkers.com
  • The cf-ew-preview header is added with the access token returned from the API in step 3

  6. After sending this request, the response is modified (a sketch of both rewrites follows this list):

  • All headers not prefixed with cf-ew-raw- are discarded and headers with the prefix have it removed (for instance, cf-ew-raw-X-Auth-Success would become X-Auth-Success)
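
To make those two rewrites concrete, here is a rough JavaScript sketch of the proxy logic. Wrangler’s real implementation is in Rust; everything below simply restates the steps above (the preview host, header prefix, and session token), so treat it as illustrative rather than the actual code.

// Illustrative only: JavaScript sketch of the request/response rewriting that
// wrangler dev performs (the real code lives in Wrangler, written in Rust).
async function proxyToPreview(localRequest, accessToken) {
  const url = new URL(localRequest.url)

  // Prefix every incoming header and attach the preview session token.
  const headers = new Headers()
  for (const [name, value] of localRequest.headers) {
    headers.set(`cf-ew-raw-${name}`, value)
  }
  headers.set('cf-ew-preview', accessToken)

  // Pointing at the preview hostname also takes care of the Host header.
  const upstream = await fetch(`https://rawhttp.cloudflareworkers.com${url.pathname}`, {
    method: localRequest.method,
    headers,
    body: localRequest.body,
  })

  // Keep only the cf-ew-raw- headers on the way back, stripping the prefix.
  const responseHeaders = new Headers()
  for (const [name, value] of upstream.headers) {
    if (name.startsWith('cf-ew-raw-')) {
      responseHeaders.set(name.slice('cf-ew-raw-'.length), value)
    }
  }
  return new Response(upstream.body, { status: upstream.status, headers: responseHeaders })
}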

The hard part here was already done — the Workers Core team had already implemented the API to support the Preview UI. We just needed to gently nudge Wrangler and the API to be the best of friends. After some investigation into Rust’s HTTP ecosystem, we settled on using the HTTP library hyper, which I highly recommend if you’re in need of a low level HTTP library — it’s fast, correct, and the ergonomics are constantly improving. After a bit of work, we got a prototype working and carved Wrangler ❤️ Cloudflare API into the old oak tree down by Lady Bird Lake.

Usage

Let’s say I have a Workers script that looks like this:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  let message = "Hello, World!"
  return new Response(message)
}

If I created a Wrangler project with this code and ran wrangler dev, this is what it looked like:

$ wrangler dev
👂  Listening on http://127.0.0.1:8787

In another terminal session, I could run the following:

$ curl localhost:8787
Hello, World!

It worked! Hooray!

Just the Right Amount of Scope Creep

At this point, our initial goal was complete: any HTTP client could test out a Worker before it was deployed. However, wrangler dev was still missing crucial functionality. When running wrangler preview, it’s possible to view console.log output in the browser editor. This is incredibly useful for debugging Workers applications, and something with a name like wrangler dev should include a way to view those logs as well. “This will be easy,” I said, not yet knowing what I was signing up for. Buckle up!

console.log, V8, and the Chrome Devtools Protocol, Oh My!

My first goal was to get a Hello, World! message streamed to my terminal session so that developers can debug their applications using wrangler dev. Let’s take the script from earlier and add a console.log statement to it:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  let message = "Hello, World!"
  console.log(message) // this line is new
  return new Response(message)
}

If you’d like to follow along, you can paste that script into the editor at cloudflareworkers.com using Google Chrome.

This is what the Preview editor looks like when that script is run:


You can see that Hello, World! has been printed to the console. This may not be the most useful example, but in more complex applications logging different variables is helpful for debugging. If you’re following along, try changing console.log(message) to something more interesting, like console.log(request.url).

The console may look familiar to you if you’re a web developer because it’s the same interface you see when you open the Developer Tools in Google Chrome. Since Cloudflare Workers is built on top of V8 (more info about that here and here), the Workers runtime is able to create a WebSocket that speaks the Chrome Devtools Protocol. This protocol allows the client (your browser, Wrangler, or anything else that supports WebSockets) to send and receive messages that contain information about the script that is running.

In order to see the messages that are being sent back and forth between our browser and the Workers runtime:

  1. Open Chrome Devtools
  2. Click the Network tab at the top of the inspector
  3. Click the filter icon underneath the Network tab (it looks like a funnel and is nested between the cancel icon and the search icon)
  4. Click WS to filter out all requests but WebSocket connections

Your inspector should look like this:


Then, reload the page, and select the /inspect item to view its messages. It should look like this:


Hey look at that! We can see messages that our browser sent to the Workers runtime to enable different portions of the developer tools for this Worker, and we can see that the runtime sent back our Hello, World! Pretty cool!

On the Wrangler side of things, all we had to do to get started was initialize a WebSocket connection for the current Worker, and send a message with the method Runtime.enable so the Workers runtime would enable the Runtime domain and start sending console.log messages from our script.
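
That first exchange is small enough to sketch. The snippet below is illustrative JavaScript (Wrangler does this in Rust); the inspector URL is a placeholder, and the message shapes follow the public Chrome Devtools Protocol.

// Illustrative sketch: enable the Runtime domain over a Devtools Protocol
// WebSocket and print console.log output. The URL below is a placeholder.
const ws = new WebSocket('wss://example.cloudflareworkers.com/inspect')

ws.addEventListener('open', () => {
  // Ask the runtime to start emitting Runtime.* events, including console calls.
  ws.send(JSON.stringify({ id: 1, method: 'Runtime.enable' }))
})

ws.addEventListener('message', ({ data }) => {
  const message = JSON.parse(data)
  if (message.method === 'Runtime.consoleAPICalled') {
    // Primitive arguments carry their value directly; objects need more unwrapping.
    console.log(...message.params.args.map(arg => arg.value))
  }
})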

After those initial steps, it quickly became clear that a lot more work was needed to get to a useful developer tool. There’s a lot that goes into the Chrome Devtools Inspector, and most of the libraries for interacting with it are written in languages other than Rust (which we use for Wrangler). We spent a lot of time switching WebSocket libraries due to incompatibilities across operating systems (turns out TLS is hard) and implementing the parts of the Chrome Devtools Protocol that we needed in Rust. There’s a lot of work that still needs to be done to make wrangler dev a top notch developer tool, but we wanted to get it into the hands of developers as quickly as possible.

Try it Out!

wrangler dev is currently in alpha, and we’d love it if you could try it out! You should first check out the Quick Start and then move on to wrangler dev. If you run into issues or have any feedback, please let us know!

Signing Off

I’ve come a long way from where I started in 2018 and so has the Workers ecosystem. It’s been awesome helping to improve the developer experience of Workers for future interns, internal Cloudflare teams, and of course our customers. I can’t wait to see what we do next. I have some ideas for what’s next with Wrangler, so stay posted!

P.S. Wrangler is also open source, and we are more than happy to field bug reports, feedback, and community PRs. Check out our Contribution Guide if you want to help out!

Migrating to React land: Gatsby

Post Syndicated from Victoria Bernard original https://blog.cloudflare.com/migrating-to-react-land-gatsby/


I am an engineer who loves docs. Well, OK, I don’t love all docs, but I believe docs are a crucial, yet often neglected, element of a great developer experience. I work on the developer experience team for Cloudflare Workers, focusing on several components of Workers, particularly the docs that we recently migrated to Gatsby.


Through porting our documentation site to Gatsby I learned a lot. In this post, I share some of the learnings that could’ve saved my former self from several headaches. This will hopefully help others considering a move to Gatsby or another static site generator.

Why Gatsby?

Prior to our migration to Gatsby, we used Hugo for our developer documentation. There is a lot to like about working with Hugo – fast build times, fast load times – which made a simple static site a great use case for it. Things started to turn sour when we began making our docs more interactive and expanding the content being generated.

Going from writing JSX with TypeScript back to string-based templating languages is difficult. Trying to perform complicated tasks, like generating a sidebar, cost me – a developer who knows nothing about liquid code or Go templating (though I do have Golang experience) – several tears, not even to implement the feature but just to understand what was happening.

Here is the code to template an item in the sidebar in Hugo:

<!-- templates -->
{{ define "section-tree-nav" }}
{{ $currentNode := .currentnode }}
{{ with .sect }}
 {{ if not .Params.Hidden }}
  {{ if .IsSection }}
    {{safeHTML .Params.head}}
    <li data-nav-id="{{.URL}}" class="dd-item
        {{ if .IsAncestor $currentNode }}parent{{ end }}
        {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
        {{ if .Params.alwaysopen}}parent{{ end }}
        {{ if .Params.alwaysopen}}always-open{{ end }}
        ">
      <a href="{{ .RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
 
        {{ if .Params.new }}
          <span class="new-badge">NEW</span>
        {{ end }}
 
        {{ $numberOfPages := (add (len .Pages) (len .Sections)) }}
        {{ if ne $numberOfPages 0 }}
 
          {{ if or (.IsAncestor $currentNode) (.Params.alwaysopen)  }}
            <i class="triangle-up"></i>
          {{ else }}
            <i class="triangle-down"></i>
          {{ end }}
 
        {{ end }}
      </a>
      {{ if ne $numberOfPages 0 }}
        <ul>
          {{ .Scratch.Set "pages" .Pages }}
          {{ if .Sections}}
          {{ .Scratch.Set "pages" (.Pages | union .Sections) }}
          {{ end }}
          {{ $pages := (.Scratch.Get "pages") }}
 
        {{ if eq .Site.Params.ordersectionsby "title" }}
          {{ range $pages.ByTitle }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ else }}
          {{ range $pages.ByWeight }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ end }}
        </ul>
      {{ end }}
    </li>
  {{ else }}
    {{ if not .Params.Hidden }}
      <li data-nav-id="{{.URL}}" class="dd-item
     {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
      ">
        <a href="{{.RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
        {{ if .Params.new }}
          <span class="new-badge">NEW</span>
        {{ end }}
 
        </a></li>
     {{ end }}
  {{ end }}
 {{ end }}
{{ end }}
{{ end }}

Whoa. I may be exceptionally oblivious, but I had to squint at the snippet above for an hour before I realized this was the code for a sidebar item (the li element was the eventual giveaway, but it took some parsing to discover where the logic actually started).

(Disclaimer: I am in no way a pro at Hugo, and in any situation there are always several ways to code a solution. I am not claiming this was the only way to write the template, nor am I chastising the author of the code. I am just displaying the differences in the pieces of code I came across.)

Now, here is what the TSX (I will get into the JS later in the article) for the Gatsby project using the exact same styling would look like:

 <li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>
   <Link className="" to={pathToServe} title="Docs Home" activeClassName="active">
     {title || 'No title'}
     {numberOfPages ? <Triangle isAncestor={isAncestor} alwaysopen={showChildren} /> : ''}
     {showNew ? <span className="new-badge">NEW</span> : ''}
   </Link>
   {showChildren ? (
     <ul>
       {' '}
       {myChildren.map((child: mdx) => {
         return (
           <SidebarLi
             frontmatter={child.frontmatter}
             fields={child.fields}
             depth={++depth}
             key={child.frontmatter.title}
           />
         )
       })}
     </ul>
   ) : (
     ''
   )}
 </li>

This code is clean and compact because Gatsby is a static content generation tool based on React. It’s loved for a myriad of reasons, but my honest main reason to migrate to it was to make the Hugo code above much less ugly.

For our purposes, less ugly was important because we had dreams of redesigning our docs to be interactive with support for multiple coding languages and other features.

For example, the template gallery would be a place to go to for how-to recipes and examples. The templates themselves would live in a template registry service and turn into static pages via an API.

We wanted the docs to not be constrained by Go templating. The Hugo docs admit their templates aren’t the best for complicated logic:

Go Templates provide an extremely simple template language that adheres to the belief that only the most basic of logic belongs in the template or view layer.

Gatsby and React enable the more complex logic we were looking for. After our team built workers.cloudflare.com and Built with Workers on Gatsby, I figured this was my shot to really give Gatsby a try on our Workers developer docs.

Decision to Migrate over Starting from Scratch

I’m normally not a fan of fixing things that aren’t broken. I didn’t like working with Hugo, I did love working in React, and I had all the reasons to switch, but I was still timid about being the one in charge of switching from Hugo. I was scared. I hated looking at the liquid code of Go templates. I didn’t want to have to port all the existing templates to React without truly understanding what I might be missing.

There comes a point with tech debt though where you have to tackle the tech debt you are most scared of.

The easiest solution would be of course to throw the Hugo code away. Start from scratch. A clean slate. But this means taking something that was not broken and breaking it. The styling, SEO, tagging, and analytics of the site took small iterations over the course of a few years to get right and I didn’t want to be the one to break them. Instead of throwing all the styling and logic tied in for search, SEO, etc…, our plan was to maintain as much of the current design and logic as possible while converting it to React piece-by-piece, component-by-component.

There were also existing developer docs at Cloudflare still using Hugo, maintained by other teams (e.g. Access, Argo Tunnel, etc.). I wanted any team at Cloudflare to be able to import their existing markdown files with frontmatter into the Gatsby repo and preserve the existing design.

I wanted to migrate instead of teleport to Gatsby.

How-to: Hugo to Gatsby

In this blog post, I go through some but not all of the steps of how I ported to Gatsby from Hugo for our complex doc site. The few examples here help to convey the issues that caused the most pain.

Let’s start with getting the markdown files to turn into HTML pages.

Markdown

One goal was to keep all the existing markdown and frontmatter we had set up in Hugo as similar as possible. The reasoning for this was to not break existing content and also maintain the version history of each doc.

Gatsby is built on top of GraphQL. All the data and most of the content for Gatsby is loaded into GraphQL during startup, usually via a plugin, and Gatsby then queries for this data when it actually creates pages. This is quite different from Hugo’s much more abstract model of putting all your content in a folder named content and letting Hugo figure out which template to apply based on the logic in the template.

MDX is a sophisticated tool that parses markdown into Gatsby so it can later be rendered as HTML (it can actually do much more than that, but I won’t get into it here). I started with Gatsby’s MDX plugin to create nodes from my markdown files. Here is the code to set up the plugin to get all the markdown files (files ending in .md and .mdx) I had in the src/content folder into GraphQL:

gatsby-config.js

const path = require('path')
 
module.exports = {
 plugins: [
   {
     resolve: `gatsby-source-filesystem`,
     options: {
       name: `mdx-pages`,
       path: `${__dirname}/src/content`,
       ignore: [`**/CONTRIBUTING*`, '/styles/**'],
     },
   },
   {
     resolve: `gatsby-plugin-mdx`,
     options: {
       extensions: [`.mdx`, `.md`],
     },
   }, 
]}

Now that Gatsby knows about these files as nodes, we can create pages for them. In gatsby-node.js, I tell Gatsby to grab these MDX pages and use a template markdownTemplate.tsx to create pages for them:

const path = require(`path`)
const { createFilePath } = require(`gatsby-source-filesystem`)
exports.createPages = async ({ actions, graphql, reporter }) => {
 const { createPage } = actions
 
 const markdownTemplate = path.resolve(`src/templates/markdownTemplate.tsx`)
 
 const result = await graphql(`
   {
     allMdx(limit: 1000) {
       edges {
         node {
           fields {
             pathToServe
           }
           frontmatter {
             alwaysopen
             weight
           }
           fileAbsolutePath
         }
       }
     }
   }
 `)
 // Handle errors
 if (result.errors) {
   reporter.panicOnBuild(`Error while running GraphQL query.`)
   return
 }
 result.data.allMdx.edges.forEach(({ node }) => {
   return createPage({
     path: node.fields.pathToServe,
     component: markdownTemplate,
     context: {
       parent: node.fields.parent,
       weight: node.frontmatter.weight,
     }, // additional data can be passed via context, can use as variable on query
   })
 })
}
exports.onCreateNode = ({ node, getNode, actions }) => {
 const { createNodeField } = actions
 // Ensures we are processing only markdown files
 if (node.internal.type === 'Mdx') {
   // Use `createFilePath` to turn markdown files in our `content` directory into `/workers/`pathToServe
   const originalPath = node.fileAbsolutePath.replace(
     node.fileAbsolutePath.match(/.*content/)[0],
     ''
   )
   let pathToServe = createFilePath({
     node,
     getNode,
     basePath: 'content/',
   })
   let parentDir = path.dirname(pathToServe)
   if (pathToServe.includes('index')) {
     pathToServe = parentDir
     parentDir = path.dirname(parentDir) // "/" dirname will = "/"
   }
   pathToServe = pathToServe.replace(/\/+$/, '/') // always end the path with a slash
   // Creates new query'able field with name of 'pathToServe', 'parent'..
   // for allMdx edge nodes
   createNodeField({
     node,
     name: 'pathToServe',
     value: `/workers${pathToServe}`,
   })
   createNodeField({
     node,
     name: 'parent',
     value: parentDir,
   })
   createNodeField({
     node,
     name: 'filePath',
     value: originalPath,
   })
 }
}

Now every time Gatsby runs, it starts running through each node on onCreateNode. If the node is MDX, it passes the node’s content (the markdown, fileAbsolutePath, etc.) and all the node fields (filePath, parent and pathToServe) to the markdownTemplate.tsx component so that the component can render the appropriate information for that markdown file.

The barebone component for a page that renders a React component from the MDX node looks like this:

markdownTemplate.tsx

import React from "react"
import { graphql } from "gatsby"
import { MDXRenderer } from "gatsby-plugin-mdx"
 
export default function PageTemplate({ data: { mdx } }) {
 return (
   <div>
     <h1>{mdx.frontmatter.title}</h1>
     <MDXRenderer>{mdx.body}</MDXRenderer>
   </div>
 )
}
 
export const pageQuery = graphql`
 query BlogPostQuery($id: String) {
   mdx(id: { eq: $id }) {
     id
     body
     frontmatter {
       title
     }
   }
 }
`

A Complex Component: Sidebar

Now let’s get into where I wasted the most time, but learned hard lessons upfront: turning the Hugo template into a React component. At the beginning of this article, I showed that scary sidebar.

To set up the li element, the Hugo logic looks like this:

{{ define "section-tree-nav" }}
{{ $currentNode := .currentnode }}
{{ with .sect }}
 {{ if not .Params.Hidden }}
  {{ if .IsSection }}
    {{safeHTML .Params.head}}
    <li data-nav-id="{{.URL}}" class="dd-item
        {{ if .IsAncestor $currentNode }}parent{{ end }}
        {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
        {{ if .Params.alwaysopen}}parent{{ end }}
        {{ if .Params.alwaysopen}}always-open{{ end }}
        ">

I see that the code is defining some section-tree-nav component-like thing and taking in some currentNode. To be honest, I still don’t know exactly what the variables .sect, IsSection, Params.head, Params.Hidden mean. Although I can take a wild guess, they’re not that important for understanding what the logic is doing. The logic is setting the classes on the li element which is all I really care about: parent, always-open and active.

When focusing on those three classes, we can port them to React in a much more readable way by defining a variable string ddClass:

 let ddClass = ''
 let isAncestor = numberOfPages > 0
 if (isAncestor) {
   ddClass += ' parent'
 }
 if (frontmatter.alwaysopen) {
   ddClass += ' parent alwaysOpen'
 }
 return (
   <Location>
     {({ location }) => {
       const currentPathActive = location.pathname === pathToServe
       if (currentPathActive) {
         ddClass += ' active'
       }
       return (
         <li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>

There are actually a few nice things about the Hugo code, I admit. Using the Location component in React was probably less intuitive than Hugo’s ability to access currentNode to get the active page. Also, isAncestor is predefined in Hugo as “whether the current page is an ancestor of the given page.” For me though, having to track down the definitions of the predefined variables was frustrating, and I appreciate the local explicitness of the definition; but I admit I’m a bit jaded.

Children

The most complex part of the sidebar is getting the children. Now this is a story that really gets me starting to appreciate GraphQL.

Here’s getting the children for the sidebar in Hugo:

    {{ $numberOfPages := (add (len .Pages) (len .Sections)) }}
        {{ if ne $numberOfPages 0 }}
 
          {{ if or (.IsAncestor $currentNode) (.Params.alwaysopen)  }}
            <i class="triangle-up"></i>
          {{ else }}
            <i class="triangle-down"></i>
          {{ end }}
 
        {{ end }}
      </a>
      {{ if ne $numberOfPages 0 }}
        <ul>
          {{ .Scratch.Set "pages" .Pages }}
          {{ if .Sections}}
          {{ .Scratch.Set "pages" (.Pages | union .Sections) }}
          {{ end }}
          {{ $pages := (.Scratch.Get "pages") }}
 
        {{ if eq .Site.Params.ordersectionsby "title" }}
          {{ range $pages.ByTitle }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ else }}
          {{ range $pages.ByWeight }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ end }}
        </ul>
      {{ end }}
    </li>
  {{ else }}
    {{ if not .Params.Hidden }}
      <li data-nav-id="{{.URL}}" class="dd-item
     {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
      ">
        <a href="{{.RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
        {{ if .Params.new }}
          <span class="new-badge">NEW</span>
        {{ end }}
 
        </a></li>
     {{ end }}
  {{ end }}
 {{ end }}
{{ end }}
{{ end }}

This is just the first layer of children. No grandbabies, sorry. And I won’t even get into all that is going on there exactly. When I started porting this over, I realized a lot of that logic was not even being used.

In React, we grab all the markdown pages and see which have parents that match the current page:

 const topLevelMarkdown: markdownRemarkEdge[] = useStaticQuery(
    graphql`
     {
       allMdx(limit: 1000) {
         edges {
           node {
             frontmatter {
               title
               alwaysopen
               hidden
               showNew
               weight
             }
             fileAbsolutePath
             fields {
               pathToServe
               parent
               filePath
             }
           }
         }
       }
     }
   `
 ).allMdx.edges
 const myChildren: mdx[] = topLevelMarkdown
   .filter(
     edge =>
       fields.pathToServe === '/workers' + edge.node.fields.parent &&
       fields.pathToServe !== edge.node.fields.pathToServe
   )
   .map(child => child.node)
   .filter(child => !child.frontmatter.hidden)
   .sort(sortByWeight)
 const numberOfPages = myChildren.length
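
One small piece not shown in the post is the sortByWeight helper used above. A minimal version, assuming each page’s frontmatter carries a numeric weight and that unweighted pages should sort last, might look like this:

// Hypothetical helper: order sidebar entries by their frontmatter weight,
// pushing pages without a weight to the end.
function sortByWeight(a, b) {
  const weightA = a.frontmatter.weight ?? Number.MAX_SAFE_INTEGER
  const weightB = b.frontmatter.weight ?? Number.MAX_SAFE_INTEGER
  return weightA - weightB
}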

And then we render the children, so the full JSX becomes:

<li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>
   <Link
     to={pathToServe}
     title="Docs Home"
     activeClassName="active"
   >
     {title || 'No title'}
     {numberOfPages ? (
       <Triangle isAncestor={isAncestor} alwaysopen={showChildren} />
     ) : (
       ''
     )}
     {showNew ? <span className="new-badge">NEW</span> : ''}
   </Link>
   {showChildren ? (
     <ul>
       {' '}
       {myChildren.map((child: mdx) => {
         return (
           <SidebarLi
             frontmatter={child.frontmatter}
             fields={child.fields}
             depth={++depth}
             key={child.frontmatter.title}
           />
         )
       })}
     </ul>
   ) : (
     ''
   )}
 </li>

OK, now that we have a component, and we have Gatsby creating the pages from the markdown, I can go back to my PageTemplate component and render the sidebar:

import React from 'react'
import { MDXRenderer } from 'gatsby-plugin-mdx'
import Sidebar from './Sidebar'

export default function PageTemplate({ data: { mdx } }) {
  return (
    <div>
      <Sidebar />
      <h1>{mdx.frontmatter.title}</h1>
      <MDXRenderer>{mdx.body}</MDXRenderer>
    </div>
  )
}

I don’t have to pass any props to Sidebar because the GraphQL static query in Sidebar.tsx gets all the data about all the pages that I need. I don’t even maintain state because Location is used to determine which path is active. Gatsby generates pages using the above component for each page that’s a markdown MDX node.

Wrapping up

This was just the beginning of the full migration to Gatsby. I repeated the process above for turning templates, partials, and other HTML component-like parts in Hugo into React, which was actually pretty fun, though turning vanilla JS that once manipulated the DOM into React would probably be a nightmare if I wasn’t somewhat comfortable working in React.

Main lessons learned:

  • Being careful about breaking things and being scared to break things are two very different things. Being careful is good; being scared is bad. If I were to complete this migration again, I would’ve used the Hugo templates as a reference but not as a source of truth. Staging environments are what testing is for. Don’t sacrifice writing things the right way to comply with the old way.
  • When doing a migration like this on a static site, get just a few pages working before moving the content over to avoid intermediate PRs from breaking. It seems obvious but, with the large amounts of content we had, a lot of things broke when porting over content. Get everything polished with each type of page before moving all your content over.
  • When doing a migration like this, it’s OK to compromise some features of the old design until you determine whether to add them back in; just make sure to test this with real users first. For example, I made the mistake of assuming others wouldn’t mind being without anchor tags. (Note: Hugo templates create anchor tags for headers automatically, whereas in Gatsby you have to use MDX to customize markdown components.) Test this on a single, popular page with real users first to see if it matters before giving it up.
  • Even for those with React background, the ramp up with GraphQL and setting up Gatsby isn’t as simple as it seems at first. But once you’re set up it’s pretty dang nice.

Overall, the process of moving to Gatsby was well worth the effort. As we implement a redesign in React, it’s much easier to apply the designs in this cleaner code base. Also, though Hugo was already very performant with a nice SEO score, in Gatsby we are able to improve performance and SEO further thanks to the framework’s flexibility.

Lastly, working with the Gatsby team was awesome and they even give free T-shirts for your first PR!

An Update on CDNJS

Post Syndicated from Zack Bloom original https://blog.cloudflare.com/an-update-on-cdnjs/


When you loaded this blog, a file was delivered to your browser called jquery-3.2.1.min.js. jQuery is a library which makes it easier to build websites, and was at one point included on as many as 74.1% of all websites. A full eighteen million sites include jQuery and other libraries using one of the most popular tools on Earth: CDNJS. Beginning about a month ago Cloudflare began to take a more active role in the operation of CDNJS. This post is here to tell you more about CDNJS’ history and explain why we are helping to manage CDNJS.

What CDNJS Does

Virtually every site is composed of not just the code written by its developers, but also dozens or hundreds of libraries. These libraries make it possible for websites to extend what a web browser can do on its own. For example, libraries can allow a site to include powerful data visualizations, respond to user input, or even get more performant.

These libraries created wondrous and magical new capabilities for web browsers, but they can also cause the size of a site to explode. Particularly a decade ago, connections were not always fast enough to permit the use of many libraries while maintaining performance. But if so many websites are all including the same libraries, why was it necessary for each of them to load their own copy?

If we all load jQuery from the same place the browser can do a much better job of not actually needing to download it for every site. When the user visits the first jQuery-powered site it will have to be downloaded, but it will already be cached on the user’s computer for any subsequent jQuery-powered site they might visit.


The first visit might take time to load:


But any future visit to any website pointing to this common URL would already be cached:


<!-- Loaded only on my site, will need to be downloaded by every user -->
<script src="./jquery.js"></script>

<!-- Loaded from a common location across many sites -->
<script src="https://cdnjs.cloudflare.com/jquery.js"></script>

Beyond the performance advantage, including files this way also made it very easy for users to experiment and create. When using a web browser as a creation tool users often didn’t have elaborate build systems (this was also before npm), so being able to include a simple script tag was a boon. It’s worth noting that it’s not clear a massive performance advantage was ever actually provided by this scheme. It is becoming even less of a performance advantage now that browser vendors are beginning to use separate caches for each website you visit, but with millions of sites using CDNJS there’s no doubt it is a critical part of the web.

A CDN for all of us

My first Pull Request into the CDNJS project was in 2013. Back then, if you created a JavaScript project it wasn’t possible to have it included in the jQuery CDN, or the ones provided by large companies like Google and Microsoft. Those were only for big, important projects. Of course, even the biggest project starts small. The community needed a CDN which would agree to host nearly all JavaScript projects, even the ones which weren’t world-changing (yet). In 2011, that project was launched by Ryan Kirkman and Thomas Davis as CDNJS.

The project was quickly wildly successful, far beyond their expectations. Their CDN bill quickly began to skyrocket (it would now be over a million dollars a year on AWS). Under the threat of having to shut down the service, Cloudflare was approached by the CDNJS team to see if we could help. We agreed to support their efforts and created cdnjs.cloudflare.com which serves the CDNJS project free of charge.

CDNJS has been astonishingly successful. The project is currently installed on over eighteen million websites (10% of the Internet!), offers files totaling over 1.5 billion lines of code, and serves over 173 billion requests a month. CDNJS only gets more popular as sites get larger, with 34% of the top 10k websites using the service. Each month we serve almost three petabytes of JavaScript, CSS, and other resources which power the web via cdnjs.cloudflare.com.

Spikes can happen when a very large or popular site installs CDNJS, or when a misbehaving web crawler discovers a CDNJS link.

The future value of CDNJS is now in doubt, as web browsers are beginning to use a separate cache for every website you visit. It is currently used on such a wide swath of the web, however, it is unlikely it will be disappearing any time soon.

How CDNJS Works

CDNJS starts with a GitHub repo. That project contains every file served by CDNJS, at every version it has ever offered. That’s 182 GB without the commit history, over five million files, and over 1.5 billion lines of code.

Given that it stores and delivers versioned code files, in many ways it was the Internet’s first JavaScript package manager. Unlike other package managers and even other CDNs, everything CDNJS serves is publicly versioned. All 67,724 commits! This means you as a user can verify that you are being served files which haven’t been tampered with.

To make changes to CDNJS a commit has to be made. For new projects being added to CDNJS, or when projects change significantly, these commits are made by humans, and get reviewed by other humans. When projects just release new versions there is a bot made by Peter and maintained by Sven which sucks up changes from npm and automatically creates commits.

Within Cloudflare’s infrastructure there is a set of machines which are responsible for pulling the latest version of the repo periodically. Those machines then become the origin for cdnjs.cloudflare.com, with Cloudflare’s Global Load Balancer automatically handling failures. Cloudflare’s cache automatically stores copies of many of the projects making it possible for us to deliver them quickly from all 195 of our data centers.


The Internet on a Shoestring Budget

The CDNJS project has always been administered independently of Cloudflare. In addition to the founders, the project has been maintained by exceptionally hard-working caretakers like Peter and Matt Cowley. Maintaining a single repo of nearly every frontend project on Earth is no small task, and it has required a substantial amount of both manual work and bot development.

Unfortunately approximately thirty days ago one of those bots stopped working, preventing updated projects from appearing in CDNJS. The bot’s open-source maintainer was not able to invest the time necessary to keep the bot running. After several weeks we were asked by the community and the CDNJS founders to take over maintenance of the CDNJS repo itself. This means the Cloudflare engineering team is taking responsibility for keeping the contents of github.com/cdnjs/cdnjs up to date, in addition to ensuring it is correctly served on cdnjs.cloudflare.com.

We agreed to do this because we were, frankly, scared. Like so many open-source projects CDNJS was a critical part of our world, but wasn’t getting the attention it needed to survive. The Internet relies on CDNJS as much as on any other single project, losing it or allowing it to be commandeered would be catastrophic to millions of websites and their visitors. If it began to fail, some sites would adapt and update, others would be broken forever.

CDNJS has always been, and remains, a project for and by the community. We are invested in making all decisions in a transparent and inclusive manner. If you are interested in contributing to CDNJS or in the topics we’re currently discussing please visit the CDNJS Github Issues page.


A Plan for the Future

One example of an area where we could use your help is in charting a path towards a CDNJS which requires less manual moderation. Nothing can replace the intelligence and creativity of a human (yet), but for a task like managing what resources go into a CDN, relying on manual review is error prone and time consuming. At present a human has to review every new project to be included, and often has to take additional steps to include new versions of a project.

As a part of our analysis of the project we examined a snapshot of the still-open PRs made against CDNJS for several months:


The vast majority of these PRs were changes which ultimately passed the automated review but nevertheless couldn’t be merged without manual review.

There is consensus that we should move to a model which does not require human involvement in most cases. We would love your input and collaboration on the best way for that to be solved. If this is something you are passionate about, please contribute here.

Our plan is to support the CDNJS project in whichever ways it requires for as long as the Internet relies upon it. We invite you to use CDNJS in your next project with the full assurance that it is backed by the same network and team who protect and accelerate over twenty million of your favorite websites across the Internet. We are also planning more posts diving further into the CDNJS data, subscribe to this blog if you would like to be notified upon their release.

Introducing the GraphQL Analytics API: exactly the data you need, all in one place

Post Syndicated from Filipp Nisenzoun original https://blog.cloudflare.com/introducing-the-graphql-analytics-api-exactly-the-data-you-need-all-in-one-place/


Today we’re excited to announce a powerful and flexible new way to explore your Cloudflare metrics and logs, with an API conforming to the industry-standard GraphQL specification. With our new GraphQL Analytics API, all of your performance, security, and reliability data is available from one endpoint, and you can select exactly what you need, whether it’s one metric for one domain or multiple metrics aggregated for all of your domains. You can ask questions like “How many cached bytes have been returned for these three domains?” Or, “How many requests have all the domains under my account received?” Or even, “What effect did changing my firewall rule an hour ago have on the responses my users were seeing?”

The GraphQL standard also has strong community resources, from extensive documentation to front-end clients, making it easy to start creating simple queries and progress to building your own sophisticated analytics dashboards.

From many APIs…

Providing insights has always been a core part of Cloudflare’s offering. After all, by using Cloudflare, you’re relying on us for key parts of your infrastructure, and so we need to make sure you have the data to manage, monitor, and troubleshoot your website, app, or service. Over time, we developed a few key data APIs, including ones providing information regarding your domain’s traffic, DNS queries, and firewall events. This multi-API approach was acceptable while we had only a few products, but we started to run into some challenges as we added more products and analytics. We couldn’t expect users to adopt a new analytics API every time they started using a new product. In fact, some of the customers and partners that were relying on many of our products were already becoming confused by the various APIs.

Following the multi-API approach was also affecting how quickly we could develop new analytics within the Cloudflare dashboard, which is used by more people for data exploration than our APIs. Each time we built a new product, our product engineering teams had to implement a corresponding analytics API, which our user interface engineering team then had to learn to use. This process could take up to several months for each new set of analytics dashboards.

…to one

Our new GraphQL Analytics API solves these problems by providing access to all Cloudflare analytics. It offers a standard, flexible syntax for describing exactly the data you need and provides predictable, matching responses. This approach makes it an ideal tool for:

  1. Data exploration. You can think of it as a way to query your own virtual data warehouse, full of metrics and logs regarding the performance, security, and reliability of your Internet property.
  2. Building amazing dashboards, which allow for flexible filtering, sorting, and drilling down or rolling up. Creating these kinds of dashboards would normally require paying thousands of dollars for a specialized analytics tool. You get them as part of our product and can customize them for yourself using the API.

In a companion post that was also published today, my colleague Nick discusses using the GraphQL Analytics API to build dashboards. So, in this post, I’ll focus on examples of how you can use the API to explore your data. To make the queries, I’ll be using GraphiQL, a popular open-source querying tool that takes advantage of GraphQL’s capabilities.

Introspection: what data is available?

The first thing you may be wondering: if the GraphQL Analytics API offers access to so much data, how do I figure out what exactly is available, and how I can ask for it? GraphQL makes this easy by offering “introspection,” meaning you can query the API itself to see the available data sets, the fields and their types, and the operations you can perform. GraphiQL uses this functionality to provide a “Documentation Explorer,” query auto-completion, and syntax validation. For example, here is how I can see all the data sets available for a zone (domain):


If I’m writing a query, and I’m interested in data on firewall events, auto-complete will help me quickly find relevant data sets and fields:


Querying: examples of questions you can ask

Let’s say you’ve made a major product announcement and expect a surge in requests to your blog, your application, and several other zones (domains) under your account. You can check if this surge materializes by asking for the requests aggregated under your account, in the 30 minutes after your announcement post, broken down by the minute:

{
  viewer {
    accounts(filter: {accountTag: $accountTag}) {
      httpRequests1mGroups(limit: 30, filter: {datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T20:30:00Z"}, orderBy: [datetimeMinute_ASC]) {
        dimensions {
          datetimeMinute
        }
        sum {
          requests
        }
      }
    }
  }
}

Here is the first part of the response, showing requests for your account, by the minute:


Now, let’s say you want to compare the traffic coming to your blog versus your marketing site over the last hour. You can do this in one query, asking for the number of requests to each zone:

{
  viewer {
    zones(filter: {zoneTag_in: [$zoneTag1, $zoneTag2]}) {
      httpRequests1hGroups(limit: 2, filter: {datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T21:00:00Z"}) {
        sum {
          requests
        }
      }
    }
  }
}

Here is the response:


Finally, let’s say you’re seeing an increase in error responses. Could this be correlated to an attack? You can look at error codes and firewall events over the last 15 minutes, for example:

{
  viewer {
    zones(filter: {zoneTag: $zoneTag}) {
      httpRequests1mGroups(limit: 100, filter: {datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z"}) {
        sum {
          responseStatusMap {
            edgeResponseStatus
            requests
          }
        }
      }
      firewallEventsAdaptiveGroups(limit: 100, filter: {datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z"}) {
        dimensions {
          action
        }
        count
      }
    }
  }
}

Notice that, in this query, we’re looking at multiple datasets at once, using a common zone identifier to “join” them. Here are the results:


By examining both data sets in parallel, we can see a correlation: 31 requests were “dropped” or blocked by the Firewall, which is exactly the same as the number of “403” responses. So, the 403 responses were a result of Firewall actions.
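
To automate this kind of check instead of eyeballing it in GraphiQL, the same query can be sent from any HTTP client. Below is a sketch in JavaScript; the api.cloudflare.com/client/v4/graphql endpoint and bearer-token auth follow Cloudflare’s standard API conventions, and the response traversal, the numeric 403 status, and the ‘drop’ action value are assumptions to confirm against the schema.

// Sketch only: issue the error/firewall query above programmatically and
// compare the counts. Endpoint, auth, and field details are assumptions to verify.
const API_TOKEN = 'YOUR_API_TOKEN' // placeholder
const ZONE_TAG = 'YOUR_ZONE_TAG'   // placeholder

const query = `{
  viewer {
    zones(filter: { zoneTag: "${ZONE_TAG}" }) {
      httpRequests1mGroups(limit: 100, filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }) {
        sum { responseStatusMap { edgeResponseStatus requests } }
      }
      firewallEventsAdaptiveGroups(limit: 100, filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }) {
        dimensions { action }
        count
      }
    }
  }
}`

async function compareErrorsToFirewallDrops() {
  const res = await fetch('https://api.cloudflare.com/client/v4/graphql', {
    method: 'POST',
    headers: { Authorization: `Bearer ${API_TOKEN}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
  })
  const { data } = await res.json()
  const zone = data.viewer.zones[0]

  // Total 403 responses over the window.
  const status403 = zone.httpRequests1mGroups
    .flatMap(group => group.sum.responseStatusMap)
    .filter(entry => entry.edgeResponseStatus === 403)
    .reduce((total, entry) => total + entry.requests, 0)

  // Requests the Firewall dropped over the same window.
  const dropped = zone.firewallEventsAdaptiveGroups
    .filter(group => group.dimensions.action === 'drop')
    .reduce((total, group) => total + group.count, 0)

  console.log({ status403, dropped }) // if these match, the 403s came from Firewall actions
}

compareErrorsToFirewallDrops()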

Try it today

To learn more about the GraphQL Analytics API and start exploring your Cloudflare data, follow the “Getting started” guide in our developer documentation, which also has details regarding the current data sets and time periods available. We’ll be adding more data sets over time, so take advantage of the introspection feature to see the latest available.

Finally, to make way for the new API, the Zone Analytics API is now deprecated and will be sunset on May 31, 2020. The data that Zone Analytics provides is available from the GraphQL Analytics API. If you’re currently using the API directly, please follow our migration guide to change your API calls. If you get your analytics using the Cloudflare dashboard or our Datadog integration, you don’t need to take any action.

One more thing….

In the API examples above, if you find it helpful to get analytics aggregated for all the domains under your account, we have something else you may like: a brand new Analytics dashboard (in beta) that provides this same information. If your account has many zones, the dashboard is helpful for knowing summary information on metrics such as requests, bandwidth, cache rate, and error rate. Give it a try and let us know what you think using the feedback link above the new dashboard.

Introducing Flan Scan: Cloudflare’s Lightweight Network Vulnerability Scanner

Post Syndicated from Nadin El-Yabroudi original https://blog.cloudflare.com/introducing-flan-scan/


Today, we’re excited to open source Flan Scan, Cloudflare’s in-house lightweight network vulnerability scanner. Flan Scan is a thin wrapper around Nmap that converts this popular open source tool into a vulnerability scanner with the added benefit of easy deployment.

We created Flan Scan after two unsuccessful attempts at using “industry standard” scanners for our compliance scans. A little over a year ago, we were paying a big vendor for their scanner until we realized it was one of our highest security costs and many of its features were not relevant to our setup. It became clear we were not getting our money’s worth. Soon after, we switched to an open source scanner and took on the task of managing its complicated setup. That made it difficult to deploy to our entire fleet of more than 190 data centers.

We had a deadline at the end of Q3 to complete an internal scan for our compliance requirements but no tool that met our needs. Given our history with existing scanners, we decided to set off on our own and build a scanner that worked for our setup. To design Flan Scan, we worked closely with our auditors to understand the requirements of such a tool. We needed a scanner that could accurately detect the services on our network and then lookup those services in a database of CVEs to find vulnerabilities relevant to our services. Additionally, unlike other scanners we had tried, our tool had to be easy to deploy across our entire network.

We chose Nmap as our base scanner because, unlike other network scanners which sacrifice accuracy for speed, it prioritizes detecting services, thereby reducing false positives. We also liked Nmap because of the Nmap Scripting Engine (NSE), which allows scripts to be run against the scan results. We found that the “vulners” script, available on NSE, mapped the detected services to relevant CVEs from a database, which is exactly what we needed.

The next step was to make the scanner easy to deploy while ensuring it outputted actionable and valuable results. We added three features to Flan Scan which helped package up Nmap into a user-friendly scanner that can be deployed across a large network.

  • Easy Deployment and Configuration – To create a lightweight scanner with easy configuration, we chose to run Flan Scan inside a Docker container. As a result, Flan Scan can be built and pushed to a Docker registry and maintains the flexibility to be configured at runtime. Flan Scan also includes sample Kubernetes configuration and deployment files with a few placeholders so you can get up and scanning quickly.
  • Pushing results to the Cloud – Flan Scan adds support for pushing results to a Google Cloud Storage Bucket or an S3 bucket. All you need to do is set a few environment variables and Flan Scan will do the rest. This makes it possible to run many scans across a large network and collect the results in one central location for processing.
  • Actionable Reports – Flan Scan generates actionable reports from Nmap’s output so you can quickly identify vulnerable services on your network, the applicable CVEs, and the IP addresses and ports where these services were found. The reports are useful for engineers following up on the results of the scan as well as auditors looking for evidence of compliance scans.
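
As a rough illustration of that deployment model – the image name and the mounted paths below are placeholders rather than Flan Scan’s actual configuration, so check the project README for the real options – running a containerized scanner like this with Docker could look something like:

# Build the image, then run a scan with the target list and report
# directory mounted into the container (paths are placeholders).
docker build -t flan-scan .
docker run --rm \
    -v "$(pwd)/ips.txt:/shared/ips.txt" \
    -v "$(pwd)/reports:/shared/reports" \
    flan-scan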

Introducing Flan Scan: Cloudflare’s Lightweight Network Vulnerability Scanner
Sample run of Flan Scan from start to finish. 

How has Flan Scan improved Cloudflare’s network security?

By the end of Q3, not only had we completed our compliance scans, we also used Flan Scan to tangibly improve the security of our network. At Cloudflare, we pin the software version of some services in production because it allows us to prioritize upgrades by weighing the operational cost of upgrading against the improvements of the latest version. Flan Scan’s results revealed that our FreeIPA nodes, used to manage Linux users and hosts, were running an outdated version of Apache with several medium severity vulnerabilities. As a result, we prioritized their update. Flan Scan also found a vulnerable instance of PostgreSQL leftover from a performance dashboard that no longer exists.

Flan Scan is part of a larger effort to expand our vulnerability management program. We recently deployed osquery to our entire network to perform host-based vulnerability tracking. By complementing osquery’s findings with Flan Scan’s network scans we are working towards comprehensive visibility of the services running at our edge and their vulnerabilities. With two vulnerability trackers in place, we decided to build a tool to manage the increasing number of vulnerability sources. Our tool sends alerts on new vulnerabilities, filters out false positives, and tracks remediated vulnerabilities. Flan Scan’s valuable security insights were a major impetus for creating this vulnerability tracking tool.

How does Flan Scan work?

Introducing Flan Scan: Cloudflare’s Lightweight Network Vulnerability Scanner

The first step of Flan Scan is running an Nmap scan with service detection. Flan Scan’s default Nmap command runs the following scans:

  1. ICMP ping scan – Nmap determines which of the IP addresses given are online.
  2. SYN scan – Nmap scans the 1000 most common ports of the IP addresses which responded to the ICMP ping. Nmap marks ports as open, closed, or filtered.
  3. Service detection scan – To detect which services are running on open ports, Nmap performs TCP handshake and banner-grabbing scans.

Other types of scanning, such as UDP scans and scans of IPv6 addresses, are also possible with Nmap. Flan Scan lets users run these and any other extended features of Nmap by passing in Nmap flags at runtime.

Introducing Flan Scan: Cloudflare’s Lightweight Network Vulnerability Scanner
Sample Nmap output

Flan Scan adds the “vulners” script to its default Nmap command so that the output includes a list of vulnerabilities applicable to the services detected. The vulners script works by making API calls to a service run by vulners.com, which returns any known vulnerabilities for the given service.
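
Taken together, the default scan is roughly equivalent to an Nmap invocation along these lines (a sketch for orientation, not Flan Scan’s exact command line; the target range is a placeholder):

nmap -sS -sV --script vulners -oX scan-results.xml 198.51.100.0/24

Here -sS performs the SYN scan of the 1,000 most common ports, -sV enables service detection, --script vulners runs the vulners NSE script against the detected services, and -oX writes the structured XML output that the reporting step consumes.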

Introducing Flan Scan: Cloudflare’s Lightweight Network Vulnerability Scanner
Sample Nmap output with Vulners script

The next step of Flan Scan uses a Python script to convert the structured XML of Nmap’s output into an actionable report. The reports of the previous scanner we used listed each of the IP addresses scanned and presented the vulnerabilities applicable to that location. Since we had multiple IP addresses running the same service, the report would repeat the same list of vulnerabilities under each of these IP addresses. This meant scrolling back and forth through documents hundreds of pages long to obtain a list of all IP addresses with the same vulnerabilities. The results were impossible to digest.

Flan Scan’s results are structured around services. The report enumerates all vulnerable services, with a list beneath each one of the relevant vulnerabilities and all IP addresses running that service. This structure makes the report shorter and more actionable, since the services that need to be remediated can be clearly identified. Flan Scan reports are made using LaTeX, because who doesn’t like nicely formatted reports that can be generated with a script? The raw LaTeX file that Flan Scan outputs can be converted to a beautiful PDF using tools like pdflatex or TeXShop.
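
For example, from the directory containing the generated report:

pdflatex report.tex

(The report filename here is illustrative; use whatever filename Flan Scan produced for your scan.)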

Introducing Flan Scan: Cloudflare’s Lightweight Network Vulnerability Scanner
Sample Flan Scan report

What’s next?

Cloudflare’s mission is to help build a better Internet for everyone, not just Internet giants who can afford to buy expensive tools. We’re open sourcing Flan Scan because we believe it shouldn’t cost tons of money to have strong network security.

You can get started running a vulnerability scan on your network in a few minutes by following the instructions on the README. We welcome contributions and suggestions from the community.

Improve Your App Testing With Amplify Console’s Pull Requests Previews and Cypress Testing

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/improve-your-app-testing-with-amplify-consoles-pull-requests-previews-and-cypress-testing/

Amplify Console allows developers to easily configure a Git-based workflow for continuous deployment and hosting of fullstack serverless web apps. Fullstack serverless apps comprise backend resources such as GraphQL APIs, Data and File Storage, Authentication, or Analytics, integrated with a frontend framework such as React, Gatsby, or Angular. You can read more about the Amplify Console in a previous article I wrote.

Today, we are announcing the ability to create preview URLs and to run end-to-end tests on pull requests before releasing code to production.

Pull Request previews
You can now configure Amplify Console to deploy your application to a unique URL every time a developer submits a pull request to your Git repository. The preview URL is completely different from the one used by the production site, so you can see how changes look before merging the pull request into the main branch of your code repository and triggering a new release in the Amplify Console. For fullstack apps with backend environments provisioned via the Amplify CLI, every pull request spins up an ephemeral backend that is deleted when the pull request is closed, letting you test changes in complete isolation from the production environment. Amplify Console creates backend infrastructure for pull requests on private Git repositories only; this avoids incurring extra costs from unsolicited pull requests.

To learn how it works, let’s start a web application with a cloud-based authentication backend, and deploy it on Amplify Console. I first create a React application (check here to learn how to install React).

npx create-react-app amplify-console-demo                                                
cd amplify-console-demo

I initialize the Amplify environment (learn how to install the Amplify CLI first). I add a cloud-based authentication backend powered by Amazon Cognito, accepting all the default answers proposed by the Amplify CLI.

npm install aws-amplify aws-amplify-react
amplify init
amplify add auth
amplify push

I then modify src/App.js to add the front-end authentication user interface. The code is available in the AWS Amplify documentation. Once ready, I start the local development server to test the application locally.

npm run start
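
For reference, the change to src/App.js amounts to roughly the following (a minimal sketch of the withAuthenticator approach from aws-amplify-react, not the exact code from the documentation):

// src/App.js – wrap the existing App component with the withAuthenticator
// higher-order component so Amplify renders the sign-up / sign-in UI.
import React from 'react';
import Amplify from 'aws-amplify';
import { withAuthenticator } from 'aws-amplify-react';
import awsconfig from './aws-exports'; // generated by `amplify init` / `amplify push`

Amplify.configure(awsconfig);

function App() {
  return <div className="App">Hello from behind the authenticator!</div>;
}

// The second argument adds the greeting bar with a sign-out button.
export default withAuthenticator(App, true);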

I point my browser to http://localhost:8080 to verify the scaffolding (the below screenshot is taken from my AWS Cloud9 development environment). I click Create account to create a user, verify the sign-up flow, and authenticate to the app.

After signing up, I see the application page.

There are two important details to note. First, I am using a private GitHub repository: Amplify Console only creates backend infrastructure on pull requests for private repositories, to avoid creating unnecessary infrastructure for unsolicited pull requests. Second, the Amplify Console build process looks for dependencies in package-lock.json only, which is why I added the Amplify packages with npm and not with yarn.

When I am happy with my app, I push the code to a GitHub repo (let’s assume I already did git remote add origin ...).

git add amplify
git commit -am "initial commit"
git push origin master

The next step consists of configuring Amplify Console to build and deploy my app on every Git commit. I log in to the Amplify Console, click Connect App, choose GitHub as the repository, and click Continue (the first time I do this, I need to authenticate on GitHub, using my GitHub username and password).

I select my repository and the branch I want to use as source:

Amplify Console detects the type of project and proposes a build file. I select the name of my environment (dev). The first time I use Amplify Console, I follow the instructions to create a new service role. This role authorizes Amplify Console to access AWS backend services on my behalf.

I click Next. I review the settings and click Save and Deploy. After a few seconds or minutes, my application is ready. I can point my browser to the deployment URL and verify the app is working correctly.

Now, let’s enable previews for pull requests. I click Preview on the left menu and then Enable Previews. To enable previews, Amplify Console requires an app to be installed in my GitHub account, so I follow the instructions provided by the console to configure my GitHub account. Once set up, I select a branch and click Manage to enable or disable pull request previews. (At any time, I can uninstall the Amplify app from my GitHub account by visiting the Applications section of my GitHub account’s settings.)

Now that the mechanism is in place, let’s create a pull request.

I edit App.js directly on GitHub. I customize the withAuthenticator component to change the color of the Sign In button from orange to green. I save the changes and I create a pull request.

On the Pull Request detail page, I click Show all checks to get the status of the Amplify Console test. I see AWS Amplify Console Web Preview in progress. Amplify Console creates a full backend environment to test the pull request, to build and to deploy the frontend.

Eventually, I see All checks have passed and a green mark. I click Details to get the preview URL. In case of an error, you can see the detailed log file of the build phase in the Amplify Console.

I can also check the status of the preview in the Amplify Console.

I point my browser to the preview URL to test my change. I can see the green Sign In button instead of the orange one.

When I try to authenticate using the username and password I created previously, I receive a User does not exist error message, because this preview URL points to a different backend than the main application. I can see two Cognito user pools in the Cognito console, one for each environment.

I can control who can access the preview URL using access control settings similar to those I use for the main URL.

When I am happy with the proposed changes, I merge the pull request on GitHub to trigger a new build and deploy the change to the production environment. Upon merging, Amplify Console deletes the preview environment, and the ephemeral backend environment created for the pull request is deleted as well.

Cypress testing
In addition to previewing changes before merging them to the main branch, we also added the capability to run end-to-end tests during your build process. You can use your favorite test framework to add unit or end-to-end tests to your application and automatically run the tests during the build phase. When you use the Cypress test framework, Amplify Console detects the tests in your source tree and automatically adds the testing phase to your application build process.

Only builds that pass all tests are pushed down your pipeline to the deployment phase. You can learn more about this and follow the step-by-step instructions we posted a few weeks ago.

These two additions to Amplify Console allow you to gain higher confidence in the robustness of your pipeline and the quality of the code delivered to your production environment.

Availability
Web previews are available in all Regions where AWS Amplify Console is available today, at no additional cost on top of the regular Amplify Console pricing. With the AWS Free Usage Tier, you can get started for free. Upon sign up, new AWS customers receive 1,000 build minutes per month for the build and deploy feature, and 15 GB served per month and 5 GB data storage per month for the hosting.

— seb

Experiment with HTTP/3 using NGINX and quiche

Post Syndicated from Alessandro Ghedini original https://blog.cloudflare.com/experiment-with-http-3-using-nginx-and-quiche/

Experiment with HTTP/3 using NGINX and quiche

Experiment with HTTP/3 using NGINX and quiche

Just a few weeks ago we announced the availability on our edge network of HTTP/3, the new revision of HTTP intended to improve security and performance on the Internet. Everyone can now enable HTTP/3 on their Cloudflare zone and experiment with it using Chrome Canary as well as curl, among other clients.

We have previously made available an example HTTP/3 server as part of the quiche project to allow people to experiment with the protocol, but it’s quite limited in the functionality that it offers, and was never intended to replace other general-purpose web servers.

We are now happy to announce that our implementation of HTTP/3 and QUIC can be integrated into your own installation of NGINX as well. This is made available as a patch to NGINX that can be applied to and built directly against the upstream NGINX codebase.

Experiment with HTTP/3 using NGINX and quiche

It’s important to note that this is not officially supported or endorsed by the NGINX project; it is just something that we, Cloudflare, want to make available to the wider community to help push adoption of QUIC and HTTP/3.

Building

The first step is to download and unpack the NGINX source code. Note that the HTTP/3 and QUIC patch only works with the 1.16.x release branch (the latest stable release being 1.16.1).

 % curl -O https://nginx.org/download/nginx-1.16.1.tar.gz
 % tar xvzf nginx-1.16.1.tar.gz

As well as quiche, the underlying implementation of HTTP/3 and QUIC:

 % git clone --recursive https://github.com/cloudflare/quiche

Next you’ll need to apply the patch to NGINX:

 % cd nginx-1.16.1
 % patch -p01 < ../quiche/extras/nginx/nginx-1.16.patch

And finally build NGINX with HTTP/3 support enabled:

 % ./configure                          	\
   	--prefix=$PWD                       	\
   	--with-http_ssl_module              	\
   	--with-http_v2_module               	\
   	--with-http_v3_module               	\
   	--with-openssl=../quiche/deps/boringssl \
   	--with-quiche=../quiche
 % make

The above command instructs the NGINX build system to enable HTTP/3 support (--with-http_v3_module) by using the quiche library found in the path it was previously downloaded into (--with-quiche=../quiche), as well as TLS and HTTP/2. Additional build options can be added as needed.

You can check out the full instructions here.

Running

Once built, NGINX can be configured to accept incoming HTTP/3 connections by adding the quic and reuseport options to the listen configuration directive.

Here is a minimal configuration example that you can start from:

events {
    worker_connections  1024;
}

http {
    server {
        # Enable QUIC and HTTP/3.
        listen 443 quic reuseport;

        # Enable HTTP/2 (optional).
        listen 443 ssl http2;

        ssl_certificate      cert.crt;
        ssl_certificate_key  cert.key;

        # Enable all TLS versions (TLSv1.3 is required for QUIC).
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2 TLSv1.3;
    }
}

This will enable both HTTP/2 and HTTP/3 on the TCP/443 and UDP/443 ports respectively.

You can then use one of the available HTTP/3 clients (such as Chrome Canary, curl or even the example HTTP/3 client provided as part of quiche) to connect to your NGINX instance using HTTP/3.
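
For example, with a curl build that has HTTP/3 support enabled, a quick check looks like this (replace the hostname with your own):

 % curl --http3 -I https://example.com/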

We are excited to make this available for everyone to experiment and play with HTTP/3, but it’s important to note that the implementation is still experimental and is likely to have bugs as well as limitations in functionality. Feel free to submit a ticket to the quiche project if you run into problems or find any bugs.

Terraforming Cloudflare: in quest of the optimal setup

Post Syndicated from Guest Author original https://blog.cloudflare.com/terraforming-cloudflare/

Terraforming Cloudflare: in quest of the optimal setup

This is a guest post by Dimitris Koutsourelis and Alexis Dimitriadis, working for the Security Team at Workable, a company that makes software to help companies find and hire great people.

Terraforming Cloudflare: in quest of the optimal setup

This post is about our introductory journey into the practice of infrastructure as code: managing Cloudflare configuration in a declarative and version-controlled way. We’d like to share the experience we’ve gained during this process: our pain points, the limitations we faced, the different approaches we took, and parts of our solution and experiments.

Terraform world

Terraform is a great tool that fulfills our requirements, and fortunately, Cloudflare maintains its own provider that allows us to manage its service configuration hassle-free.

On top of that, Terragrunt is a thin wrapper that provides extra commands and functionality for keeping Terraform configurations DRY and managing remote state.

The combination of the two leads to a more modular and reusable structure for Cloudflare resources (configuration), by utilizing Terraform and Terragrunt modules.
We’ve chosen to use the latest version of both tools (Terraform v0.12 & Terragrunt v0.19, respectively) and constantly upgrade to take advantage of valuable new features and functionality which, at this point in time, remove important limitations.

Workable context

Our set up includes multiple domains that are grouped in two distinct Cloudflare organisations: production & staging. Our environments have their own purposes and technical requirements (i.e.: QA, development, sandbox and production) which translates to slightly different sets of Cloudflare zone configuration.

Our approach

Our main goal was to have a modular set up with the ability to manage any configuration for any zone, while keeping code repetition to a minimum. This is more complex than it sounds; we have repeatedly changed our Terraform folder structure – and other technical aspects – during the development period. The following sections illustrate a set of alternatives through our path, along with pros & cons.

Structure

Terraform configuration is based on the project’s directory structure, so this is the place to start.
Instead of retaining the Cloudflare organisation structure (production & staging as root-level directories containing the zones that belong to each organisation), our decision was to group zones that share common configuration under the same directory. This helps keep the code DRY and the setup consistent and readable.
On the downside, this structure adds an extra layer of complexity, as two different sets of credentials need to be handled conditionally, and two state files (at the environments/ root level) must be managed and isolated using workspaces.
On top of that, we used Terraform modules, to keep sets of common configuration across zone groups into a single place.
Terraform modules repository

modules/
│    ├── firewall/
│        ├── main.tf
│        ├── variables.tf
│    ├── zone_settings/
│        ├── main.tf
│        ├── variables.tf
│    └── [...]
└──

Terragrunt modules repository

environments/
│    ├── [...]
│    ├── dev/
│    ├── qa/
│    ├── demo/
│        ├── zone-8/ (production)
│            └── terragrunt.hcl
│        ├── zone-9/ (staging)
│            └── terragrunt.hcl
│        ├── config.tfvars
│        ├── main.tf
│        └── variables.tf
│    ├── config.tfvars
│    ├── secrets.tfvars
│    ├── main.tf
│    ├── variables.tf
│    └── terragrunt.hcl
└──

The Terragrunt modules tree gives flexibility, since we are able to apply configuration at the zone, zone group, or organisation level (which is in line with Cloudflare’s configuration capabilities – i.e. custom error pages can also be configured at the organisation level).

Resource types

We decided to implement Terraform resources in different ways, to cover our requirements more efficiently.

1. Static resource

The first thought that came to mind was having one or multiple .tf files implementing all the resources, with hardcoded values assigned to each attribute. It’s simple and straightforward, but can have a high maintenance cost if it leads to code copy/paste between environments.
So, common settings seem to be a good use case; we chose to implement the access_rules Terraform resources accordingly:
modules/access_rules/main.tf

resource "cloudflare_access_rule" "no_17" {
notes		= "this is a description"
mode 	= "blacklist"
configuration = {
target	= "ip"
value 	= "x.x.x.x"
}
}
[...]

2. Parametrized resources

Our next step was to add variables to gain flexibility. This is useful when a few attributes of a shared resource configuration differ between multiple zones. Most of the configuration remains the same (as described above) and the variable instantiation is added in the Terraform module, while the values are fed through the Terragrunt module as input variables or as entries inside .tfvars files. The zone_settings_override resource was implemented accordingly:
modules/zone_settings/main.tf

resource "cloudflare_zone_settings_override" "zone_settings" {
zone_id = var.zone_id
settings {
always_online		= "on"
always_use_https		= "on"
[...]
browser_check		= var.browser_check
mobile_redirect {
mobile_subdomain	= var.mobile_redirect_subdomain
status			= var.mobile_redirect_status
strip_uri			= var.mobile_redirect_uri
}
[...]
waf			= "on"
webp		= "off"
websockets		= "on"
}
}

environments/qa/main.tf

module "zone_settings" {
source		= "[email protected]:foo/modules/zone_settings"
zone_name		= var.zone_name
browser_check	= var.zone_settings_browser_check
[...]
}

environments/qa/config.tfvars

#zone settings
zone_settings_browser_check = "off"
[...]

3. Dynamic resource

At that point, we thought that a more interesting approach would be to create generic resource templates to manage all instances of a given resource in one place. A template is implemented as a Terraform module and creates each resource dynamically, based on its input: data fed through the Terragrunt modules (/environments in our case), or entries in the tfvars files.
We chose to implement the account_member resource this way.
modules/account_members/variables.tf

variable "users" {
description	= "map of users - roles"
type        	= map(list(string))
}
variable "member_roles" {
description 	= "account role ids"
type        	= map(string)
}

modules/account_members/main.tf


resource "cloudflare_account_member" "account_member" {
for_each     		= var.users
email_address	= each.key
role_ids     		= [for role in each.value : lookup(var.member_roles, role)]
lifecycle {
prevent_destroy = true
}
}

We feed the template with a list of users (list of maps). Each member is assigned a number of roles. To make code more readable, we mapped users to role names instead of role ids:
environments/config.tfvars


member_roles = {
  admin       = "000013091sds0193jdskd01d1dsdjhsd1"
  admin_ro    = "0000ds81hd131bdsjd813hh173hds8adh"
  analytics   = "0000hdsa8137djahd81y37318hshdsjhd"
  [...]
  super_admin = "00001534sd1a2123781j5gj18gj511321"
}

users = {
  "[email protected]" = ["super_admin"]
  "[email protected]" = ["analytics", "audit_logs", "cache_purge", "cf_workers"]
  "[email protected]" = ["cf_stream"]
  [...]
  "[email protected]" = ["cf_stream"]
}

Another interesting case we dealt with was the rate_limit resource; the variable declaration (a list of objects) and the implementation go as follows:
modules/rate_limit/variables.tf

variable "rate_limits" {
description	= "list of rate limits"
default	= []
type		= list(object(
{
disabled	= bool,
threshold	= number,
description	= string,
period	= number,
match	= object({
request	= object({
url_pattern	= map(string),
schemes		= list(string),
methods 		= list(string)
}),
response 		= object({
statuses		= list(number),
origin_traffic	= bool
})
}),
action	= object({
mode	= string,
timeout	= number
})
}))
}

modules/rate_limit/main.tf

locals {
  […]
}

data "cloudflare_zones" "zone" {
  filter {
    name   = var.zone_name
    status = "active"
    paused = false
  }
}

resource "cloudflare_rate_limit" "rate_limit" {
  count       = length(var.rate_limits)
  zone_id     = lookup(data.cloudflare_zones.zone.zones[0], "id")
  disabled    = var.rate_limits[count.index].disabled
  threshold   = var.rate_limits[count.index].threshold
  description = var.rate_limits[count.index].description
  period      = var.rate_limits[count.index].period

  match {
    request {
      url_pattern = local.url_patterns[count.index]
      schemes     = var.rate_limits[count.index].match.request.schemes
      methods     = var.rate_limits[count.index].match.request.methods
    }
    response {
      statuses       = var.rate_limits[count.index].match.response.statuses
      origin_traffic = var.rate_limits[count.index].match.response.origin_traffic
    }
  }

  action {
    mode    = var.rate_limits[count.index].action.mode
    timeout = var.rate_limits[count.index].action.timeout
  }
}

environments/qa/rate_limit.tfvars

[
  {
    #1
    disabled    = false
    threshold   = 50
    description = "sample description"
    period      = 60
    match = {
      request = {
        url_pattern = {
          "subdomain" = "foo"
          "path"      = "/api/v1/bar"
        }
        schemes = [ "_ALL_", ]
        methods = [ "GET", "POST", ]
      }
      response = {
        statuses       = []
        origin_traffic = true
      }
    }
    action = {
      mode    = "simulate"
      timeout = 3600
    }
  },
  [...]
]

The biggest advantage of this approach is that all common rate_limit rules are in one place, and each environment can include its own rules in its .tfvars file. Combining them with Terraform’s built-in concat() function achieves a two-layer join of the two lists (common and unique rules). So we wanted to give it a try:

locals {
  rate_limits = concat(var.common_rate_limits, var.unique_rate_limits)
}

There is, however, a drawback: .tfvars files can only contain static values. Since all URL attributes – which include the zone name itself – have to be set explicitly in each environment’s data, every time a URL needs to change, the value has to be copied across all environments and the zone name changed to match each environment.
The solution we came up with, in order to make the zone name dynamic, was to split the URL attribute into three parts: subdomain, domain, and path. This is effective for the .tfvars files, but the added complexity of handling the new variables is non-negligible. The corresponding code illustrates the issue:
modules/rate_limit/main.tf

locals {
rate_limits  	= concat(var.common_rate_limits, var.unique_rate_limits)
url_patterns 	= [for rate_limit in local.rate_limits:  "${lookup(rate_limit.match.request.url_pattern, "subdomain", null) != null ? "${lookup(rate_limit.match.request.url_pattern, "subdomain")}." : ""}"${lookup(rate_limit.match.request.url_pattern, "domain", null) != null ? "${lookup(rate_limit.match.request.url_pattern, "domain")}" : ${var.zone_name}}${lookup(rate_limit.match.request.url_pattern, "path", null) != null ? lookup(rate_limit.match.request.url_pattern, "path") : ""}"]
}

Readability vs. functionality: although flexibility is increased and code duplication is reduced, the URL transformations have an impact on the code’s readability and ease of debugging (it took us several minutes to spot a typo). You can imagine this gets even worse if you attempt to implement a more complex resource this way (such as page_rule, which is a list of maps with four URL attributes).
The underlying issue here is that at the point we were implementing our resources, we had to choose maps over objects due to their capability to omit attributes, using the lookup() function (by setting default values). This is a requirement for certain resources such as page_rules: only certain attributes need to be defined (and others ignored).
In the end, the context will determine if more complex resources can be implemented with dynamic resources.

4. Sequential resources

The Cloudflare page rule resource has a specific peculiarity that differentiates it from other types of resources: the priority attribute.
When a page rule is applied, it gets a unique id and a priority number which corresponds to the order in which it was submitted. Although the Cloudflare API and Terraform provider give you the ability to explicitly specify the priority, there is a catch.
Terraform doesn’t respect the order of resources inside a .tf file (not even in a for_each loop!); each resource is picked up in arbitrary order and then applied to the provider. So, if page_rule priority is important – as in our case – the submission order counts. The solution is to lock the sequence in which the resources are created through the depends_on meta-argument:

resource "cloudflare_page_rule" "no_3" {
depends_on 	= [cloudflare_page_rule.no_2]
zone_id    	= lookup(data.cloudflare_zones.zone.zones[0], "id")
target     	= "www.${var.zone_name}/foo"
status     	= "active"
priority   	= 3
actions {
forwarding_url {
status_code 	= 301
url        		 = "https://www.${var.zone_name}"
}
}
}
resource "cloudflare_page_rule" "no_2" {
depends_on = [cloudflare_page_rule.no_1]
zone_id   	= lookup(data.cloudflare_zones.zone.zones[0], "id")
target    	= "www.${var.zone_name}/lala*"
status     	= "active"
priority   	= 24
actions {
ssl                 		= "flexible"
cache_level         		= "simplified"
resolve_override    		= "bar.${var.zone_name}"
host_header_override 	= "new.domain.com"
}
}
resource "cloudflare_page_rule" "page_rule_1" {
zone_id    	= lookup(data.cloudflare_zones.zone.zones[0], "id")
target   	= "*.${var.zone_name}/foo/*"
status   	= "active"
priority 	= 1
actions {
forwarding_url {
status_code 	= 301
url         		= "https://foo.${var.zone_name}/$1/$2"
}
}
}

So we had to go with a more static resource configuration, because the depends_on meta-argument only takes static values (not values calculated dynamically at runtime).

Conclusion

After changing our minds several times along the way on Terraform structure and other technical details, we believe that there isn’t a single best solution. It all comes down to the requirements and keeping a balance between complexity and simplicity. In our case, a mixed approach is good middle ground.
Terraform is evolving quickly, but at this point it lacks some common coding capabilities, so over-engineering can be a trap (one we fell into too many times). Keep it simple and as DRY as possible. 🙂

Learn about AWS Services & Solutions – September AWS Online Tech Talks

Post Syndicated from Jenny Hang original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-september-aws-online-tech-talks/

Learn about AWS Services & Solutions – September AWS Online Tech Talks

AWS Tech Talks

Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

 

Compute:

September 23, 2019 | 11:00 AM – 12:00 PM PT – Build Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.

September 26, 2019 | 1:00 PM – 2:00 PM PT – Self-Hosted WordPress: It’s Easier Than You Think – Learn how you can easily build a fault-tolerant WordPress site using Amazon Lightsail.

October 3, 2019 | 11:00 AM – 12:00 PM PT – Lower Costs by Right Sizing Your Instance with Amazon EC2 T3 General Purpose Burstable Instances – Get an overview of T3 instances, understand what workloads are ideal for them, and understand how the T3 credit system works so that you can lower your EC2 instance costs today.

 

Containers:

September 26, 2019 | 11:00 AM – 12:00 PM PT – Develop a Web App Using Amazon ECS and AWS Cloud Development Kit (CDK) – Learn how to build your first app using CDK and AWS container services.

 

Data Lakes & Analytics:

September 26, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Provisioning Amazon MSK Clusters and Using Popular Apache Kafka-Compatible Tooling – Learn best practices on running Apache Kafka production workloads at a lower cost on Amazon MSK.

 

Databases:

September 25, 2019 | 1:00 PM – 2:00 PM PT – What’s New in Amazon DocumentDB (with MongoDB compatibility) – Learn what’s new in Amazon DocumentDB, a fully managed MongoDB compatible database service designed from the ground up to be fast, scalable, and highly available.

October 3, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Enterprise-Class Security, High-Availability, and Scalability with Amazon ElastiCache – Learn about new enterprise-friendly Amazon ElastiCache enhancements like customer managed key and online scaling up or down to make your critical workloads more secure, scalable and available.

 

DevOps:

October 1, 2019 | 9:00 AM – 10:00 AM PT – CI/CD for Containers: A Way Forward for Your DevOps Pipeline – Learn how to build CI/CD pipelines using AWS services to get the most out of the agility afforded by containers.

 

Enterprise & Hybrid:

September 24, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: How to Monitor and Manage Your AWS Costs – Learn how to visualize and manage your AWS cost and usage in this virtual hands-on workshop.

October 2, 2019 | 1:00 PM – 2:00 PM PT – Accelerate Cloud Adoption and Reduce Operational Risk with AWS Managed Services – Learn how AMS accelerates your migration to AWS, reduces your operating costs, improves security and compliance, and enables you to focus on your differentiating business priorities.

 

IoT:

September 25, 2019 | 9:00 AM – 10:00 AM PT – Complex Monitoring for Industrial with AWS IoT Data Services – Learn how to solve your complex event monitoring challenges with AWS IoT Data Services.

 

Machine Learning:

September 23, 2019 | 9:00 AM – 10:00 AM PT – Training Machine Learning Models Faster – Learn how to train machine learning models quickly and with a single click using Amazon SageMaker.

September 30, 2019 | 11:00 AM – 12:00 PM PT – Using Containers for Deep Learning Workflows – Learn how containers can help address challenges in deploying deep learning environments.

October 3, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: Getting Hands-On with Machine Learning and Ready to Race in the AWS DeepRacer League – Join DeClercq Wentzel, Senior Product Manager for AWS DeepRacer, for a presentation on the basics of machine learning and how to build a reinforcement learning model that you can use to join the AWS DeepRacer League.

 

AWS Marketplace:

September 30, 2019 | 9:00 AM – 10:00 AM PT – Advancing Software Procurement in a Containerized World – Learn how to deploy applications faster with third-party container products.

 

Migration:

September 24, 2019 | 11:00 AM – 12:00 PM PT – Application Migrations Using AWS Server Migration Service (SMS) – Learn how to use AWS Server Migration Service (SMS) for automating application migration and scheduling continuous replication, from your on-premises data centers or Microsoft Azure to AWS.

 

Networking & Content Delivery:

September 25, 2019 | 11:00 AM – 12:00 PM PT – Building Highly Available and Performant Applications using AWS Global Accelerator – Learn how to build highly available and performant architectures for your applications with AWS Global Accelerator, now with source IP preservation.

September 30, 2019 | 1:00 PM – 2:00 PM PT – AWS Office Hours: Amazon CloudFront – Just getting started with Amazon CloudFront and Lambda@Edge? Get answers directly from our experts during AWS Office Hours.

 

Robotics:

October 1, 2019 | 11:00 AM – 12:00 PM PT – Robots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.

 

Security, Identity & Compliance:

October 1, 2019 | 1:00 PM – 2:00 PM PT – Deep Dive on Running Active Directory on AWS – Learn how to deploy Active Directory on AWS and start migrating your Windows workloads.

 

Serverless:

October 2, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.

 

Storage:

September 24, 2019 | 9:00 AM – 10:00 AM PT – Optimize Your Amazon S3 Data Lake with S3 Storage Classes and Management Tools – Learn how to use the Amazon S3 Storage Classes and management tools to better manage your data lake at scale and to optimize storage costs and resources.

October 2, 2019 | 11:00 AM – 12:00 PM PT – The Great Migration to Cloud Storage: Choosing the Right Storage Solution for Your Workload – Learn more about AWS storage services and identify which service is the right fit for your business.

 

 

Amplify Console – Hosting for Fullstack Serverless Web Apps

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/amplify-console-hosting-for-fullstack-serverless-web-apps/

AWS Amplify Console is a fullstack web app hosting service, with continuous deployment from your preferred source code repository. Amplify Console was introduced at AWS re:Invent in November 2018. Since then, the team has been listening to customer feedback and iterating quickly to release several new features; here is a short re:Cap.

Instant Cache Invalidation
Amplify Console allows you to host single-page web apps or static sites with serverless backends via a content delivery network, or CDN. A CDN is a network of distributed servers that cache files at edge locations across the world, enabling low-latency distribution of your web assets.

Previously, updating content on the CDN required manually invalidating the cache and waiting 15-20 minutes for changes to propagate globally. To make frequent updates, developers found workarounds such as setting lower time-to-live (TTL) values on asset headers, which enables faster updates but adversely impacts performance. Now, you no longer have to make a tradeoff between faster deployments and fast performance. On every code commit to your repository, the Amplify Console builds and deploys changes to the CDN that are viewable immediately in the browser.

“Deploy To Amplify Console” Button

Deploy To Amplify Console

When publishing your project source code on GitHub, you can make it easy for other developers to build and deploy your application by providing a “Deploy To Amplify Console” button in the Readme document. Clicking on that button will open Amplify Console and propose a three-step process to deploy your code.

You can test this yourself with these example projects and have a look at the documentation. Adding a button to your own code repository is as easy as adding this line in your Readme document (be sure to replace the username and repository name in the GitHub URL):

[![amplifybutton](https://oneclick.amplifyapp.com/button.svg)](https://console.aws.amazon.com/amplify/home#/deploy?repo=https://github.com/username/repository)

Manual Deploy
I think it is a good idea to version control everything, including a simple web site where you are the only developer. But just in case you do not want to use a source code repository as the source for your deployment, Amplify Console allows you to deploy a zip file, a local folder on your laptop, an Amazon S3 bucket, or any HTTPS URL, such as a shared repository on Dropbox.

When creating a new Amplify Console project, select the Deploy without Git provider option.
Then choose your source (your laptop, Amazon S3, or an HTTPS URI).

AWS CloudFormation Integration
Developers love automation. Deploying code or infrastructure is no different: you must ensure your infrastructure deployments are automated and repeatable. AWS CloudFormation allows you to automate the creation of infrastructure in the cloud based on a YAML or JSON description. Amplify Console added three new resource types to AWS CloudFormation:

  • AWS::Amplify::App
  • AWS::Amplify::Branch
  • AWS::Amplify::Domain

These allow you, respectively, to create a new Amplify Console app, to define the Git branch, and to define the DNS domain name to use.

AWS CloudFormation connects to your source code repository to add a webhook to it. You need to include your GitHub personal access token to allow this to happen; this blog post has all the details. Remember not to hardcode credentials (or OAuth tokens) into your CloudFormation templates; use parameters instead, as sketched below.
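
As a rough sketch of what that looks like (the property names below are from memory and worth double-checking against the AWS::Amplify::* resource documentation; the repository URL and app name are placeholders):

Parameters:
  GitHubOAuthToken:
    Type: String
    NoEcho: true            # keep the token out of console output and logs

Resources:
  AmplifyApp:
    Type: AWS::Amplify::App
    Properties:
      Name: my-amplify-app
      Repository: https://github.com/username/repository
      OauthToken: !Ref GitHubOAuthToken

  AmplifyBranch:
    Type: AWS::Amplify::Branch
    Properties:
      AppId: !GetAtt AmplifyApp.AppId
      BranchName: master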

Deploy Multiple Git Branches
We believe your CI/CD tools must adapt to your team workflow, not the other way around. Amplify Console supports branch pattern deployments, allowing you to automatically deploy branches that match a specific pattern without any extra configuration. Pattern matching is based on regular expressions.

When you want to test a new feature, you typically create a new branch in Git. Amplify Console and the Amplify CLI now detect this and will provision a separate backend and hosting infrastructure for your serverless app.

To enable branch detection, use the left menu, click on General > Edit and turn on Branch Autodetection:

Custom HTTP Headers
You can configure Amplify Console to send custom HTTP response headers. Response headers can be used for debugging, security, or informational purposes. To add your custom headers, you select App Settings > Build Settings and then edit the buildspec. For example, to enforce TLS transport and prevent XSS attacks, you can add the following headers:

customHeaders:
  - pattern: '**/*'
    headers:
      - key: 'Strict-Transport-Security'
        value: 'max-age=31536000; includeSubDomains'
      - key: 'X-Frame-Options'
        value: 'SAMEORIGIN'
      - key: 'X-XSS-Protection'
        value: '1; mode=block'
      - key: 'X-Content-Type-Options'
        value: 'nosniff'
      - key: 'Content-Security-Policy'
        value: "default-src 'self'"

The documentation has more details.

Custom Containers for Build
Last but not least, we made several changes to the build environment. Amplify Console uses AWS CodeBuild behind the scenes. The default build container image is now based on Amazon Linux 2 and has the Serverless Application Model (SAM) CLI pre-installed. If, for whatever reason, you want to use your own container for the build, you can configure Amplify Console to do so. Select App Settings > Build Settings:

And then edit the build image setting

There are a few requirements on the container image: it has to have cURL, Git, OpenSSH, and, if you are building Node.js projects, Node and npm. As usual, the details are in the documentation.

Each of these new features has been driven by your feedback, so please continue to tell us what is important to you, and expect to see more changes coming in the second part of the year and beyond.

— seb

Building a GraphQL server on the edge with Cloudflare Workers

Post Syndicated from Kristian Freeman original https://blog.cloudflare.com/building-a-graphql-server-on-the-edge-with-cloudflare-workers/

Building a GraphQL server on the edge with Cloudflare Workers

Building a GraphQL server on the edge with Cloudflare Workers

Today, we’re open-sourcing an exciting project that showcases the strengths of our Cloudflare Workers platform: workers-graphql-server is a batteries-included Apollo GraphQL server, designed to get you up and running quickly with GraphQL.

Building a GraphQL server on the edge with Cloudflare Workers
Testing GraphQL queries in the GraphQL Playground

As a full-stack developer, I’m really excited about GraphQL. I love building user interfaces with React, but as a project gets more complex, it can become really difficult to keep track of how your data is managed inside an application. GraphQL makes that really easy – instead of having to recall the REST URL structure of your backend API, or remember when your backend server doesn’t quite follow REST conventions – you just tell GraphQL what data you want, and it takes care of the rest.

Cloudflare Workers is uniquely suited to hosting a GraphQL server. Because your code is running on Cloudflare’s servers around the world, the average latency for your requests is extremely low, and by using Wrangler, our open-source command line tool for building and managing Workers projects, you can deploy new versions of your GraphQL server around the world within seconds.
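
If you want to try that flow yourself, the rough shape of it with Wrangler (assuming Wrangler is already installed and configured with your account, and the project name is just an example) is:

wrangler generate graphql-on-workers https://github.com/cloudflare/workers-graphql-server
cd graphql-on-workers
wrangler publish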

If you’d like to try the GraphQL server, check out a demo GraphQL playground, deployed on Workers.dev. This optional add-on to the GraphQL server allows you to experiment with GraphQL queries and mutations, giving you a super powerful way to understand how to interface with your data, without having to hop into a codebase.

If you’re ready to get started building your own GraphQL server with our new open-source project, we’ve added a new tutorial to our Workers documentation to help you get up and running – check it out here!

Finally, if you’re interested in how the project works, or want to help contribute – it’s open-source! We’d love to hear your feedback and see your contributions. Check out the project on GitHub.

Amazon Transcribe Streaming Now Supports WebSockets

Post Syndicated from Brandon West original https://aws.amazon.com/blogs/aws/amazon-transcribe-streaming-now-supports-websockets/

I love services like Amazon Transcribe. They are the kind of just-futuristic-enough technology that excites my imagination the same way that magic does. It’s incredible that we have accurate, automatic speech recognition for a variety of languages and accents, in real-time. There are so many use-cases, and nearly all of them are intriguing. Until now, the Amazon Transcribe Streaming API has been available using HTTP/2 streaming. Today, we’re adding WebSockets as another integration option for bringing real-time voice capabilities to the things you build.

In this post, we are going to transcribe speech in real-time using only client-side JavaScript in a browser. But before we can build, we need a foundation. We’ll review just enough information about Amazon Transcribe, WebSockets, and the Amazon Transcribe Streaming API to broadly explain the demo. For more detailed information, check out the Amazon Transcribe docs.

If you are itching to see things in action, you can head directly to the demo, but I recommend taking a quick read through this post first.

What is Amazon Transcribe?

Amazon Transcribe applies machine learning models to convert speech in audio to text transcriptions. One of the most powerful features of Amazon Transcribe is the ability to perform real-time transcription of audio. Until now, this functionality has been available via HTTP/2 streams. Today, we’re announcing the ability to connect to Amazon Transcribe using WebSockets as well.

For real-time transcription, Amazon Transcribe currently supports British English (en-GB), US English (en-US), French (fr-FR), Canadian French (fr-CA), and US Spanish (es-US).

What are WebSockets?

WebSockets are a protocol built on top of TCP, like HTTP. While HTTP is great for short-lived requests, it hasn’t historically been good at handling situations that require persistent real-time communications. While an HTTP connection is normally closed at the end of the message, a WebSocket connection remains open. This means that messages can be sent bi-directionally with no bandwidth or latency added by handshaking and negotiating a connection. WebSocket connections are full-duplex, meaning that the server and client can both transmit data at the same time. They were also designed for cross-domain usage, so there’s no messing around with cross-origin resource sharing (CORS) as there is with HTTP.

HTTP/2 streams solve a lot of the issues that HTTP had with real-time communications, and the first Amazon Transcribe Streaming API released uses HTTP/2. WebSocket support opens Amazon Transcribe Streaming up to a wider audience, and makes integrations easier for customers that might have existing WebSocket-based integrations or knowledge.

How the Amazon Transcribe Streaming API Works

Authorization

The first thing we need to do is authorize an IAM user to use Amazon Transcribe Streaming WebSockets. In the AWS Management Console, attach the following policy to your user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "transcribestreaming",
            "Effect": "Allow",
            "Action": "transcribe:StartStreamTranscriptionWebSocket",
            "Resource": "*"
        }
    ]
}

Authentication

Transcribe uses AWS Signature Version 4 to authenticate requests. For WebSocket connections, a pre-signed URL is used: all of the necessary information is passed as query parameters in the URL. This gives us an authenticated endpoint that we can use to establish our WebSocket connection.

Required Parameters

All of the required parameters are included in our pre-signed URL as part of the query string. These are:

  • language-code: The language code. One of en-US, en-GB, fr-FR, fr-CA, es-US.
  • sample-rate: The sample rate of the audio, in Hz. Max of 16000 for en-US and es-US, and 8000 for the other languages.
  • media-encoding: Currently only pcm is valid.
  • vocabulary-name: Amazon Transcribe allows you to define custom vocabularies for uncommon or unique words that you expect to see in your data. To use a custom vocabulary, reference it here.
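
Put together, the pre-signed URL used to open the WebSocket ends up looking roughly like this (the region, credential scope, and signature below are placeholders; the X-Amz-* query parameters are the standard Signature Version 4 ones):

wss://transcribestreaming.us-east-1.amazonaws.com:8443/stream-transcription-websocket
    ?language-code=en-US
    &media-encoding=pcm
    &sample-rate=16000
    &X-Amz-Algorithm=AWS4-HMAC-SHA256
    &X-Amz-Credential=AKIAEXAMPLE%2F20190626%2Fus-east-1%2Ftranscribe%2Faws4_request
    &X-Amz-Date=20190626T000000Z
    &X-Amz-Expires=300
    &X-Amz-SignedHeaders=host
    &X-Amz-Signature=...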

Audio Data Requirements

There are a few things that we need to know before we start sending data. First, Transcribe expects audio to be encoded as PCM data. The sample rate of a digital audio file relates to the quality of the captured audio. It is the number of times per second (Hz) that the analog signal is checked in order to generate the digital signal. For high-quality data, a sample rate of 16,000 Hz or higher is recommended. For lower-quality audio, such as a phone conversation, use a sample rate of 8,000 Hz. Currently, US English (en-US) and US Spanish (es-US) support sample rates up to 48,000 Hz. Other languages support rates up to 16,000 Hz.

In our demo, the file lib/audioUtils.js contains a downsampleBuffer() function for reducing the sample rate of the incoming audio bytes from the browser, and a pcmEncode() function that takes the raw audio bytes and converts them to PCM.
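
For reference, a PCM encoder of the sort the demo uses amounts to roughly the following (a sketch, not the demo’s exact code): each floating-point sample from the browser’s audio APIs is clamped to [-1, 1] and written out as a little-endian signed 16-bit integer.

function pcmEncode(input) {
  // input: Float32Array of audio samples in the range [-1, 1]
  const buffer = new ArrayBuffer(input.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < input.length; i++) {
    // clamp, scale to the 16-bit signed range, and store little-endian
    const s = Math.max(-1, Math.min(1, input[i]));
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}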

Request Format

Once we’ve got our audio encoded as PCM data with the right sample rate, we need to wrap it in an envelope before we send it across the WebSocket connection. Each message consists of three headers, followed by the PCM-encoded audio bytes in the message body. The entire message is then encoded as a binary event stream message and sent. If you’ve used the HTTP/2 API before, there’s one difference that I think makes using WebSockets a bit more straightforward: you don’t need to cryptographically sign each chunk of audio data you send.

Response Format

The messages we receive follow the same general format: they are binary-encoded event stream messages, with three headers and a body. But instead of audio bytes, the message body contains a Transcript object. Partial responses are returned until a natural stopping point in the audio is determined. For more details on how this response is formatted, check out the docs and have a look at the handleEventStreamMessage() function in main.js.

Let’s See the Demo!

Now that we’ve got some context, let’s try out a demo. I’ve deployed it using AWS Amplify Console – take a look, or push the button to deploy your own copy. Enter the Access ID and Secret Key for the IAM User you authorized earlier, hit the Start Transcription button, and start speaking into your microphone.

Deploy to Amplify Console

The complete project is available on GitHub. The most important file is lib/main.js. This file defines all our required dependencies, wires up the buttons and form fields in index.html, accesses the microphone stream, and pushes the data to Transcribe over the WebSocket. The code has been thoroughly commented and will hopefully be easy to understand, but if you have questions, feel free to open issues on the GitHub repo and I’ll be happy to help. I’d like to extend a special thanks to Karan Grover, Software Development Engineer on the Transcribe team, for providing the code that formed the basis of this demo.

Join Cloudflare & Moz at our next meetup, Serverless in Seattle!

Post Syndicated from Giuliana DeAngelis original https://blog.cloudflare.com/join-cloudflare-moz-at-our-next-meetup-serverless-in-seattle/

Join Cloudflare & Moz at our next meetup, Serverless in Seattle!
Photo by oakie / Unsplash

Join Cloudflare & Moz at our next meetup, Serverless in Seattle!

Cloudflare is organizing a meetup in Seattle on Tuesday, June 25th and we hope you can join. We’ll be bringing together members of the developer community and Cloudflare users for an evening of discussion about serverless compute and the infinite number of use cases for deploying code at the edge.

To kick things off, our guest speaker Devin Ellis will share how Moz uses Cloudflare Workers to reduce time to first byte 30-70% by caching dynamic content at the edge. Kirk Schwenkler, Solutions Engineering Lead at Cloudflare, will facilitate this discussion and share his perspective on how to grow and secure businesses at scale.

Next up, Developer Advocate Kristian Freeman will take you through a live demo of Workers and highlight new features of the platform. This will be an interactive session where you can try out Workers for free and develop your own applications using our new command-line tool.

Food and drinks will be served til close so grab your laptop and a friend and come on by!

View Event Details & Register Here

Agenda:

  • 5:00 pm Doors open, food and drinks
  • 5:30 pm Customer use case by Devin and Kirk
  • 6:00 pm Workers deep dive with Kristian
  • 6:30 – 8:30 pm Networking, food and drinks

Join Cloudflare & Moz at our next meetup, Serverless in Seattle!

Post Syndicated from Giuliana DeAngelis original https://blog.cloudflare.com/join-cloudflare-moz-at-our-next-meetup-serverless-in-seattle/

Join Cloudflare & Moz at our next meetup, Serverless in Seattle!
Photo by oakie / Unsplash

Join Cloudflare & Moz at our next meetup, Serverless in Seattle!

Cloudflare is organizing a meetup in Seattle on Tuesday, June 25th and we hope you can join. We’ll be bringing together members of the developers community and Cloudflare users for an evening of discussion about serverless compute and the infinite number of use cases for deploying code at the edge.

To kick things off, our guest speaker Devin Ellis will share how Moz uses Cloudflare Workers to reduce time to first byte 30-70% by caching dynamic content at the edge. Kirk Schwenkler, Solutions Engineering Lead at Cloudflare, will facilitate this discussion and share his perspective on how to grow and secure businesses at scale.

Next up, Developer Advocate Kristian Freeman will take you through a live demo of Workers and highlight new features of the platform. This will be an interactive session where you can try out Workers for free and develop your own applications using our new command-line tool.

Food and drinks will be served til close so grab your laptop and a friend and come on by!

View Event Details & Register Here

Agenda:

  • 5:00 pm Doors open, food and drinks
  • 5:30 pm Customer use case by Devin and Kirk
  • 6:00 pm Workers deep dive with Kristian
  • 6:30 – 8:30 pm Networking, food and drinks

A free Argo Tunnel for your next project

Post Syndicated from Sam Rhea original https://blog.cloudflare.com/a-free-argo-tunnel-for-your-next-project/

A free Argo Tunnel for your next project

Argo Tunnel lets you expose a server to the Internet without opening any ports. The service runs a lightweight process on your server that creates outbound tunnels to the Cloudflare network. Instead of managing DNS, network, and firewall complexity, Argo Tunnel helps administrators serve traffic from their origin through Cloudflare with a single command.

We built Argo Tunnel to remove the burden of securing and connecting servers to the Internet. This new model makes it easier to run a service in multi-cloud and hybrid deployments by replacing manual and error-prone work with a process that adds intelligence to the last-mile between Cloudflare and your origins or clusters. However, the service was previously only available to users with Cloudflare accounts. We want to make Argo Tunnel more accessible for any project.

Starting today, any user, even those without a Cloudflare account, can try this new method of connecting their server to the Internet. Argo Tunnel can now be used in a free model that creates a new URL, known only to you, which proxies traffic to your server. We’re excited to make connecting a server to the Internet more accessible for everyone.

What is Argo Tunnel?

Argo Tunnel replaces legacy models of connecting a server to the Internet with a secure, persistent connection to Cloudflare. Since Cloudflare first launched in 2009, customers have added their site to our platform by changing their name servers at their domain’s registrar to ones managed by Cloudflare. Administrators then create a DNS record in our dashboard that routes visitors of the domain to their origin server.

When requests are made for those domains, the queries hit our data centers first. We’re able to use that position to block malicious traffic like DDoS attacks. However, if attackers discovered that origin IP, they could bypass Cloudflare’s security features and attack the server directly. Adding additional protections against that risk introduced more hassle and configuration.

A free Argo Tunnel for your next project

One year ago, Cloudflare launched Argo Tunnel to solve those problems. Argo Tunnel connects your origin server to the Cloudflare network by running a lightweight daemon on your machine that only makes outbound calls. The process generates DNS records in the dashboard for you, removing the need to manually configure records and origin IP addresses.

Most importantly, Argo Tunnel helps shield your origin by simplifying the firewall rules you need to configure. Argo Tunnel makes outbound calls to the Cloudflare network and proxies requests back to your server. You can then disable all ingress to the machine and ensure that Cloudflare’s security features always stand between your server and the rest of the Internet. In addition to making it secure, we made it fast. The connection uses our Argo Smart Routing technology to find the most performant path from your visitors to your origin.

How can I use the free version?

Argo Tunnel is now available to all users without a Cloudflare account. All that is needed is the Cloudflare daemon, cloudflared, running on your machine. With a single command, cloudflared will generate a random subdomain of “trycloudflare.com” and begin proxying traffic to your server.

  1. Install cloudflared on your web server or laptop; instructions are available here. If you have an older copy, you’ll first need to update to the latest release (2019.6.0).
  2. Launch a web server.
  3. Run the terminal command below to start a free tunnel. cloudflared will begin proxying requests to your localhost server; no additional flags needed.

$ cloudflared tunnel

The command above will proxy traffic to port 8080 by default, but you can specify a different port with the --url flag:

$ cloudflared tunnel --url localhost:7000

cloudflared will generate a random subdomain when connecting to the Cloudflare network and print it in the terminal for you to use. This will make whatever server you are running on your local machine accessible to the world through a public URL only you know. The output will resemble the following:

A free Argo Tunnel for your next project

How can I use it?

  • Run a web server on your laptop to share a project with collaborators on different networks (see the example after this list)
  • Test mobile browser compatibility for a new site
  • Perform speed tests from different regions
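
For example, a minimal way to share a local project directory (assuming Python 3 is available; any local web server listening on the port you pass to --url works):

# Serve the current directory on port 8080 with any local web server
$ python3 -m http.server 8080

# In a second terminal, expose that server through a free tunnel
$ cloudflared tunnel --url localhost:8080

cloudflared will print the generated trycloudflare.com URL, which you can send to whoever needs access.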

Why is it free?

We want more users to experience the speed and security improvements of Argo Tunnel (and Argo Smart Routing). We hope you’ll feel the same way about those benefits after testing it with the free version and that you’ll start using it for your production sites.

We also don’t guarantee any SLA or uptime for the free service; we plan to test new Argo Tunnel features and improvements on these free tunnels. This provides us with a group of connections to test before we deploy to production customers. Free tunnels are meant to be used for testing and development, not for deploying a production website.

What’s next?

You can read our guide here to start using the free version of Argo Tunnel. Got feedback? Please send it here.

Announcing the New Cloudflare Partner Platform

Post Syndicated from Garrett Galow original https://blog.cloudflare.com/announcing-the-new-cloudflare-partner-platform/

Announcing the New Cloudflare Partner Platform

Announcing the New Cloudflare Partner Platform

When I first started at Cloudflare over two years ago, one of the first things I was tasked with was to help evolve our partner platform to support the changes in our service and the expanding needs of our partners and customers. Cloudflare’s existing partner platform was released in 2010. It is a testament to those who built it that it was, and still is, in use today, but it was also clear that the landscape had substantially changed.

Since the launch of the existing partner platform, we had built and expanded multi-user access, and launched many new products: Argo, Load Balancing, and Cloudflare Workers, to name a few. Retrofitting the existing offering was not practical. Cloudflare needed a new partner platform that could meet the needs of partners and their customers.

As the team started to develop a new solution, we needed to find a partner who could keep us on the right path. The number of hypotheticals was infinite and we needed a first customer to ground ourselves. Lo and behold, not long after I had begun putting pen to paper, we found the perfect partner for the new platform.

The IBM Partnership

IBM was looking for a partner to bring various edge services to market quickly, and our suite of capabilities fit the bill. If you are not familiar with our partnership with IBM, you can learn a bit more about it in our blog post and on the IBM Cloud Internet Services landing page. We signed the contract in November 2017, and we had to be ready to launch by IBM Think the following February. Given that IBM’s engineering team needed time to integrate with us, we were on a tight timeline to deliver.

A number of team members and I jumped on a plane and flew to Austin, Texas (Hook ‘em!) to work with IBM and determine the minimum viable product (MVP). Over kolaches (for the Czech readers at home: Klobásník), IBM and Cloudflare nailed down the MVP requirements. Briefly, they were as follows:

  1. Full API integration to provision the building blocks of using Cloudflare.
    • This included:
      1. Accounts: The container of resources – typically zones
      2. Users: The way in which we partition access to accounts
  2. The ability to sell and provision Cloudflare’s paid services and package them in a way that made sense for IBM’s customers.
    • Our existing partner platform only supported zone plans and none of our newer offerings, such as Argo or load balancing.
    • IBM had specific requirements around how they could package and sell to customers, so our solution needed to be flexible enough to support that.
  3. Ensure that what we built was re-usable.
    • Cloudflare makes it a point to solve problems for scale. While we were focused on ensuring our first partner would be successful, we knew that long term we would need to be able to scale this solution to additional partners. Nothing we built could prevent us from doing that.

Over the next couple of months, many teams at Cloudflare came together to deliver this solution at breakneck speed. Given that the midpoint of this effort happened over the holiday season, I’m personally proud that our company did not sacrifice employees’ time with their friends and families in order to deliver. Even when it feels like a sprint, it is still a marathon.

During this time, the engineering team we were working with at IBM felt like another team at Cloudflare. Their ability to move quickly, integrate, and validate our work was critical to the success of the project. At THINK in February 2018, we were able to announce the Beta of IBM CIS (Cloud Internet Services) powered by Cloudflare!

Following the initial release, we continued to add functionality to further enrich the IBM CIS offering, while behind the scenes we continued our work to redefine Cloudflare’s partner platform.

The New Partner Platform

Over the past year we have expanded the capabilities and completed the necessary work to enable more partners to use what we initially built for the IBM partnership. Out of that work comes the new partner platform we are announcing today. It allows Cloudflare partners to sell and provision Cloudflare services for their customers in a scalable fashion.

Our new partner platform is the combination of two systems designed to fulfill specific needs:

1. Tenants: an abstraction on top of our existing accounts and users for easier management
2. Subscriptions: a new way of packaging and provisioning services

Tenants

An absolute necessity for partners is the ability to provision accounts for each of their customers. Normally the only way to get a Cloudflare account is to sign up on the dashboard. We needed a way for partners to be able to create end customer accounts at their discretion to support their specific onboarding needs. This also ensures proper separation of ownership between customers and allows end customers to access the Cloudflare dashboard directly.

With the introduction of tenants, our data model now looks like the following:

Announcing the New Cloudflare Partner Platform
Cloudflare Resource Data Model

Tenants provide partners the ability to create and manage the accounts for their customers. Each account created is a separate container of resources (zones, Workers, etc.) for each customer. Users can be invited to each account as necessary for self-service management, while the partner retains control of the capabilities enabled for each account. How a partner manages those capabilities brings us to the second major system that makes up the new partner platform.

Subscriptions

While not as obvious as the need for account provisioning, the ability to package and provision services is critical to providing differentiated offerings for partners of Cloudflare. One drawback of our old partner platform was the difficulty in ensuring new products and services were available to those partners. As Cloudflare grew, it reached the point where new paid services could not be added into the existing partner platform.

With subscriptions, this is no longer the case. What started as just a way to provision services for IBM has now grown into the standard for how all customer services are provisioned at Cloudflare. Whether you purchase services through IBM CIS or buy Cloudflare Workers in our dashboard, behind the scenes, Subscriptions is what ensures you get exactly the right services enabled.

Enough talk, let’s show things in action!

The Partner Platform in Action

The full details of using the new partner platform can be found in our Provisioning API docs, but here we provide a walkthrough of a typical use case.

Using the new partner platform involves 4 steps:

  1. Provisioning Customer Accounts
  2. Granting Customer Access
  3. Enabling Services
  4. Service Configuration

1) Provisioning Customer Accounts

When onboarding customers, you want each to have their own Cloudflare account. This ensures one customer cannot affect any resources belonging to another. By making a `POST /accounts` request, you can create an account for an individual customer.

Request:

curl -X POST \
    https://api.cloudflare.com/client/v4/accounts \
    -H 'Content-Type: application/json' \
    -H 'x-auth-email: <x-auth-email>' \
    -H 'x-auth-key: <x-auth-key>' \
    -d '{ "name": "Customer Account", 
          "type": "standard" 
        }'

Response:

{
    "result": {
        "id": "2bab6ace8c72ed3f09b9eca6db1396bb",
        "name": "Customer Account",
        "type": "standard",
        "settings": {
            "enforce_twofactor": false
        }
    },
    "success": true,
    "errors": [],
    "messages": []
}

This new account is owned by the partner. It can be managed by API, or in the UI by the partner or any additional administrators that are invited.

2) Granting Customer Access

Now that the customer’s account is created, let’s give them access to it. This step uses existing APIs, and if you have shared access to a Cloudflare account before, then you have already done this.

Request:

curl -X POST \
    'https://api.cloudflare.com/client/v4/accounts/2bab6ace8c72ed3f09b9eca6db1396bb/members' \
    -H 'Content-Type: application/json' \
    -H 'x-auth-email: <x-auth-email>' \
    -H 'x-auth-key: <x-auth-key>' \
    -d '{ "email": "[email protected]",
          "roles": ["05784afa30c1afe1440e79d9351c7430"],
          "status": "accepted" 
        }'

Response:

{
    "result": {
        "id": "47bd8083af8516a20c410090d2f53655",
        "user": {
            "id": "fccad3c46f26dc2d6ba47ad19f639707",
            "first_name": null,
            "last_name": null,
            "email": "[email protected]",
            "two_factor_authentication_enabled": false
        },
        "status": "pending",
        "roles": [
            {
                "id": "05784afa30c1afe1440e79d9351c7430",
                "name": "Administrator",
                "description": "Can access the full account, except for membership management and billing.",
                "permissions": {
                    "organization": {
                        "read": true,
                        "edit": true
                    },
                    "zone": {
                        "read": true,
                        "edit": true
                    },
                    truncated...
                }
            }
        ]
    },
    "success": true,
    "errors": [],
    "messages": []
}

Alternatively, you can do this in the UI, from the Members section for the newly created account.

3) Enabling Services

Now the fun part! With the ability to provision subscriptions, you can enable paid services for your customers. Before we do that, though, we will create a zone so we can attach a zone subscription to it.

Adding a zone as a partner is no different than adding a zone as a regular customer. It can also be done by the customer.

Request:

curl -X POST \
    https://api.cloudflare.com/client/v4/zones \
    -H 'Content-Type: application/json' \
    -H 'x-auth-email: <x-auth-email>' \
    -H 'x-auth-key: <x-auth-key>' \
    -d '{ "name": "theircompany.com",
            "account": { "id": "2bab6ace8c72ed3f09b9eca6db1396bb" }
        }'

Response:

{
    "result": {
        "id": "cae181e41197e2eb875d9bcb9396abe7",
        "name": "theircompany.com",
        "status": "pending",
        "paused": false,
        "type": "full",
        "development_mode": 0,
        "name_servers": [
            "lana.ns.cloudflare.com",
            "lynn.ns.cloudflare.com"
        ],
        "original_name_servers": null,
        "original_registrar": "cloudflare, inc.",
        "original_dnshost": null,
        "modified_on": "2019-05-30T17:51:08.510558Z",
        "created_on": "2019-05-30T17:51:08.510558Z",
        "activated_on": null,
        "meta": {
            "step": 4,
            "wildcard_proxiable": false,
            "custom_certificate_quota": 0,
            "page_rule_quota": 3,
            "phishing_detected": false,
            "multiple_railguns_allowed": false
        },
        "owner": {
            "id": null,
            "type": "user",
            "email": null
        },
        "account": {
            "id": "2bab6ace8c72ed3f09b9eca6db1396bb",
            "name": "Customer Account"
        },
        "permissions": [
            "#access:edit",
            "#access:read",
            ...truncated
        ],
        "plan": {
            "id": "0feeeeeeeeeeeeeeeeeeeeeeeeeeeeee",
            "name": "Free Website",
            "price": 0,
            "currency": "USD",
            "frequency": "",
            "is_subscribed": true,
            "can_subscribe": false,
            "legacy_id": "free",
            "legacy_discount": false,
            "externally_managed": false
        }
    },
    "success": true,
    "errors": [],
    "messages": []
}

For this customer, we will provision a Pro plan for the newly created zone. If you are not familiar with our zone plans, you can read about them here. To provision the plan, we make a call to the subscriptions service.

Request:

curl -X POST \
    https://api.cloudflare.com/client/v4/zones/cae181e41197e2eb875d9bcb9396abe7/subscription \
  -H 'Content-Type: application/json' \
  -H 'X-Auth-Email: <x-auth-email>' \
  -H 'X-Auth-Key: <x-auth-key>' \
  -d '{"rate_plan": {
          "id": "PARTNERS_PRO"}
      }'

Response:

{
    "success": true,
    "result": {
        "id": "ff563a93e11c46e7b278be46f49cdd2f",
        "product": {
            "name": "partners_cloudflare_zones",
            "period": "",
            "billing": "",
            "public_name": "CloudFlare Services",
            "duration": 0
        },
        "rate_plan": {
            "id": "partners_pro",
            "public_name": "Partners Professional Plan",
            "currency": "USD",
            "scope": "zone",
            "externally_managed": false,
            "sets": [
                "zone",
                "partner"
            ],
            "is_contract": true
        },
        "component_values": [
            {
                "name": "dedicated_certificates",
                "value": 0,
                "price": 0
            },
            {
                "name": "dedicated_certificates_custom",
                "value": 0,
                "price": 0
            },
            {
                "name": "page_rules",
                "value": 20,
                "default": 20,
                "price": 0
            },
            {
                "name": "zones",
                "value": 1,
                "default": 1,
                "price": 0
            }
        ],
        "zone": {
            "id": "cae181e41197e2eb875d9bcb9396abe7",
            "name": "theircompany.com"
        },
        "frequency": "monthly",
        "currency": "USD",
        "app": {
            "install_id": null
        },
        "entitled": true
    },
    "messages": null,
    "api_version": "2.0.0"
}

Now that the customer is set up with an account, zone, and zone subscription, the only thing left is configuring the resources appropriately.

4) Service Configuration

Service configuration can be done by either you, the partner, or the end customer. Most commonly, DNS records need to be added, security settings verified and updated, and customizations made. These can all be done either through our Client v4 APIs or the Cloudflare Dashboard.
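
As an example, adding a DNS record for the customer’s new zone is a single call to the DNS records endpoint of the Client v4 API (the record name and IP address below are placeholders):

curl -X POST \
    https://api.cloudflare.com/client/v4/zones/cae181e41197e2eb875d9bcb9396abe7/dns_records \
    -H 'Content-Type: application/json' \
    -H 'x-auth-email: <x-auth-email>' \
    -H 'x-auth-key: <x-auth-key>' \
    -d '{ "type": "A",
          "name": "www.theircompany.com",
          "content": "203.0.113.10",
          "proxied": true
        }'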

Once that is done, the customer is all set!

This is just the beginning

With our announcement today, partners can protect and accelerate their customers’ internet services with Cloudflare’s partner platform. We have battle-tested the underlying systems over the last year and are excited to partner with others to help make a better internet. We are not done yet, though. We will be continually investing in the tenant and subscription services to expand their capabilities and simplify usage.

Announcing the New Cloudflare Partner Platform
Some of the latest partners using the new partner platform

If you are interested in partnering with Cloudflare, then reach out to [email protected]. If building the future of how Cloudflare’s partners and customers use our service sounds interesting, then take a look at our career page.


Enhancing the Optimizely Experimentation Platform with Cloudflare Workers

Post Syndicated from Remy Guercio original https://blog.cloudflare.com/enhancing-optimizely-with-cloudflare-workers/

Enhancing the Optimizely Experimentation Platform with Cloudflare Workers

Enhancing the Optimizely Experimentation Platform with Cloudflare Workers

This is a joint post by Whelan Boyd, Senior Product Manager at Optimizely and Remy Guercio, Product Marketing Manager for Cloudflare Workers.

Experimentation is an important ingredient in driving business growth: whether you’re iterating on a product or testing new messaging, there’s no substitute for the data and insights gathered from conducting rigorous experiments in the wild.

Optimizely is the world’s leading experimentation platform, with thousands of customers worldwide running tests for over 140 million visitors daily. If Optimizely were a website, it would be the third most trafficked in the US.  And when it came time to experiment with reinvigorating their own platform, Optimizely chose Cloudflare Workers.

Improving Performance and Agility with Cloudflare Workers

Cloudflare Workers is a globally distributed serverless compute platform that runs across Cloudflare’s network of 180 locations worldwide. Workers are designed for flexibility, with many different use cases ranging from customizing configuration of Cloudflare services and features to building full, independent applications.

In this post, we’re going to focus on how Workers can be used to improve performance and increase agility for more complex applications. One of the key benefits of Workers is that they allow developers to move decision logic and data into a highly efficient runtime operating in close proximity to end users — resulting in significant performance benefits and flexibility. Which brings us to Optimizely…

How Optimizely Works

Every week Optimizely delivers billions of experiences to help teams A/B test new products, de-risk new feature launches, and validate alternative designs. Optimizely lets companies test client-side changes like layouts and copy, as well as server-side changes like algorithms and feature rollouts.

Let’s explore how both have challenges that can be overcome with Workers, starting with Optimizely’s client-side A/B testing, or Optimizely Web, product.

Use Case: Optimizely Web

The main benefit of Optimizely Web — Optimizely’s client-side testing framework — is that it supports A/B testing via straightforward insertion of a JavaScript tag on the web page. The test is designed via the Optimizely WYSIWYG editor, and is live within minutes. Common use cases include style updates, image swaps, headlines and other text changes. You can also write any custom JavaScript or CSS you want.

With client-side A/B testing, the browser downloads JavaScript that modifies the page as it’s loading.  To avoid “flash-of-unstyled-content” (FOUC), developers need to implement this JavaScript synchronously in their <head> tag.  This constraint, though, can lead to page performance issues, especially on slower connections and devices.  Downloading and executing JavaScript in the browser has a cost, and this cost increases if the amount of JavaScript is large.  With a normal Optimizely Web implementation, all experiments are included in the JavaScript loaded on every page.
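
A typical synchronous implementation is roughly a single blocking script tag at the top of the page’s <head> (the project ID below is a placeholder):

<head>
  <!-- Blocking: the browser downloads and executes this before rendering the page -->
  <script src="https://cdn.optimizely.com/js/PROJECT_ID.js"></script>
</head>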

Enhancing the Optimizely Experimentation Platform with Cloudflare Workers
A traditional Optimizely implementation

With Workers, Optimizely can support many of these same use cases, but hoists critical logic to the edge to avoid much of the performance cost. Here’s how it works:

Enhancing the Optimizely Experimentation Platform with Cloudflare Workers
Implementing tests with Optimizely and Cloudflare Workers

This diagram shows how Optimizely customers can execute experiments created in the point-and-click UI through a Cloudflare Worker.  Rather than the browser downloading a large JavaScript file, your Worker handling HTTP/S requests calls out to Optimizely’s Worker.  Optimizely’s Worker determines which A/B tests should be active on this page and returns a small amount of JavaScript back to your Worker.  In fact, it is the JavaScript required to execute A/B test variations on just that specific page load.  Your Worker inlines the code in the page and returns it to the visitor’s browser.  

Not only does this avoid a browser bottleneck downloading a lot of data, the amount of code to execute is a fraction of a normal client-side implementation.  Since the experiments are set up inside the Optimizely interface just like any other Web experiment, you can run as many as you want without waiting for code deploy cycles.  Better yet, your non-technical (e.g. marketing) teams can still run these without depending on developers for each test.  It’s a one-time implementation.
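
As a rough sketch of what that one-time implementation could look like, the Worker below fetches the page and the experiment code in parallel and inlines the returned JavaScript into the response (the snippet endpoint URL is hypothetical; Optimizely provides the actual Worker your Worker calls out to):

addEventListener('fetch', (event: any) => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request: Request): Promise<Response> {
  // Fetch the origin page and, in parallel, ask Optimizely's Worker for the
  // JavaScript needed to run the experiments active on this specific URL
  const [page, snippetResponse] = await Promise.all([
    fetch(request),
    // Hypothetical endpoint for illustration only
    fetch('https://optimizely-edge.example.com/snippet?url=' + encodeURIComponent(request.url)),
  ])
  const snippet = await snippetResponse.text()

  // Inline the returned code into <head> so variations apply before first paint
  return new HTMLRewriter()
    .on('head', {
      element(head) {
        head.prepend(`<script>${snippet}</script>`, { html: true })
      },
    })
    .transform(page)
}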

Use Case: Going Further with Feature Rollouts

Optimizely Full Stack is Optimizely’s server-side experimentation and feature flagging platform for websites, mobile apps, chatbots, APIs, smart devices, and anything else with a network connection.  You can deploy code behind feature flags, experiment with A/B tests, and roll out or roll back features immediately.  Optimizely Rollouts is a free version of Full Stack that supports key feature rollout capabilities.

Full Stack SDKs are often implemented and instantiated directly in application code.

Enhancing the Optimizely Experimentation Platform with Cloudflare Workers
An Optimizely full stack experimentation setup

The main blocker to high velocity server-side testing is that experiments and feature rollouts must go through the code-deploy cycle — and to further add to the headache, many sites cache content on CDNs, so experiments or rollouts running at the origin never execute.  

In this example, we’ll consider a new feature you’d like to roll out gradually, exposing it to more and more users over time between code deploys. With Workers, you can implement feature rollouts by running the Optimizely JavaScript SDK at the edge. The Worker is effectively a decision service. Instead of installing the JS SDK inside each application service where you might need to gate or roll out features, you centralize instantiation in a single Worker.

From your application, simply hit the Worker and the response will tell you whether a feature is enabled for that particular user. In the example below, we supply a userId, feature, and account-specific SDK key via query parameters, and the Worker responds with its decision in the result. Below is a sample Cloudflare Worker:

/// <reference lib="es2015" />
/// <reference lib="webworker" />

import { createManager } from '../index'

addEventListener('fetch', (event: any) => {
  event.respondWith(handleRequest(event.request))
})

/**
 * Return the feature decision for the user identified in the query string
 * @param {Request} request
 */
async function handleRequest(request: Request): Promise<Response> {
  const url = new URL(request.url)
  const key = url.searchParams.get('key')
  const userId = url.searchParams.get('userId')
  const feature = url.searchParams.get('feature')
  if (!feature || !key || !userId) {
    throw new Error('must supply "feature", "userId" and "key"')
  }

  try {
    // Instantiate the SDK with the account-specific key supplied in the request
    const manager = createManager({
      sdkKey: key,
    })

    // Wait for the SDK to finish loading; failures fall through to the catch below
    await manager.onReady()
    const client = manager.getClient()

    // Ask the SDK whether this feature is enabled for this user
    const result = await client.feature({
      key: feature,
      userId,
    })

    return new Response(JSON.stringify(result))
  } catch (e) {
    return new Response(JSON.stringify({ status: 'error' }))
  }
}

This kind of setup is common for React applications, which may update store values based on decisions returned by the Worker. No need to force a request all the way back to origin.
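
In application code, checking a flag can then be as simple as the sketch below (the Worker URL is a placeholder, and the exact response shape depends on the SDK; here we assume it includes an enabled flag):

// Hypothetical helper: ask the decision Worker whether a feature is enabled for this user
async function isFeatureEnabled(sdkKey: string, userId: string, feature: string): Promise<boolean> {
  const params = new URLSearchParams({ key: sdkKey, userId, feature })
  // Placeholder Worker URL for illustration only
  const response = await fetch(`https://feature-flags.example.workers.dev/?${params}`)
  const result = (await response.json()) as { enabled?: boolean }
  // Treat anything other than an explicit enabled flag as "off"
  return result.enabled === true
}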

All in all, using Workers as a centralized decision service can reduce the complexity of your Full Stack implementation and support applications that rely on heavy caching.

How to Improve Your Experimentation Setup

Both of the examples above demonstrate how Workers can provide speed and flexibility to experimentation and feature flagging. But this is just the tip of the iceberg! There are plenty of other ways you can use these two technologies together. We’d love to hear from you and explore them with you!

Are you a developer looking for a feature flagging or server-side testing solution? The Optimizely Rollouts product is free and ready for you to sign up!

Or does your marketing team need a high performance A/B testing solution? The Optimizely Web use case is in developer preview.

  • Cloudflare Enterprise Customers: Reach out to your dedicated Cloudflare account manager to learn more and start the process.
  • Optimizely Customers and Cloudflare Customers (who aren’t on an enterprise plan): Reach out to your Optimizely contact to learn more and start the process.

You can sign up for and learn more about using Cloudflare Workers here!