Tag Archives: javascript

Building scheduling system with Workers and Durable Objects

Post Syndicated from Adam Janiš original https://blog.cloudflare.com/building-scheduling-system-with-workers-and-durable-objects/

We rely on technology to help us on a daily basis – if you are not good at keeping track of time, your calendar can remind you when it’s time to prepare for your next meeting. If you made a reservation at a really nice restaurant, you don’t want to miss it! You appreciate the app reminding you the day before about your dinner plans for the next evening.

However, who tells the application when it’s the right time to send you a notification? For this, we generally rely on scheduled events. And when you are relying on them, you really want to make sure that they occur. Turns out, this can get difficult. The scheduler and storage backend need to be designed with scale in mind – otherwise you may hit limitations quickly.

Workers, Durable Objects, and Alarms are a perfect match for this type of workload. Thanks to the distributed architecture of Durable Objects and their storage, they are a reliable and scalable option. Each Durable Object has access to its own isolated storage and alarm scheduler, both automatically replicated, with failover in case of failures.

There are many use cases where having a reliable scheduler can come in handy: running a webhook service, sending emails to your customers a week after they sign up to keep them engaged, sending invoice reminders, and more!

Today, we’re going to show you how to build a scalable service that will schedule HTTP requests on a recurring schedule or as a one-off at a specific time, as a way to guide you through any use case that requires scheduled events.

Quick intro into the application stack

Before we dive in, here are some of the tools we’re going to be using today:

  • Cloudflare Workers and Wrangler
  • Durable Objects with their storage and Alarms
  • TypeScript

The application is going to have the following components:

  • Scheduling system API to accept scheduled requests and manage Durable Objects
  • Unique Durable Object per scheduled request, each with
    • Storage – keeping the request metadata, such as URL, body, or headers.
    • Alarm – a timer (trigger) to wake the Durable Object up.

While we will focus on building the application, the Cloudflare global network will take care of the rest – storing and replicating our data, and making sure to wake our Durable Objects up when the time’s right. Let’s build it!

Initialize new Workers project

Get started by generating a completely new Workers project using the wrangler init command, which makes creating new projects quick & easy.

wrangler init -y durable-objects-requests-scheduler

For more information on how to install, authenticate, or update Wrangler, check out https://developers.cloudflare.com/workers/wrangler/get-started

Preparing TypeScript types

From my personal experience, having at least a draft of your TypeScript types prepared significantly helps you be more productive down the road, so let’s describe our scheduled request in advance. Create a file types.ts in the src directory and paste in the following TypeScript definitions.

src/types.ts

export interface Env {
    DO_REQUEST: DurableObjectNamespace
}
 
export interface ScheduledRequest {
  url: string // URL of the request
  triggerEverySeconds?: number // optional, number of seconds between recurring triggers
  triggerAt?: number // optional, unix timestamp in milliseconds, defaults to `new Date()`
  requestInit?: RequestInit // optional, includes method, headers, body
}

A scheduled request Durable Object class & alarm

Based on our architecture design, each scheduled request will be saved into its own Durable Object, effectively separating storage and alarms from each other and allowing our scheduling system to scale horizontally – there is no limit to the number of Durable Objects we create.

In the end, the Durable Object class is a matter of a couple of lines. The code snippet below accepts and saves the request body to persistent storage and sets the alarm timer. The Workers runtime will wake the Durable Object up and call the alarm() method to process the request.

The alarm method reads the scheduled request data from the storage, then processes the request, and in the end reschedules itself in case it’s configured to be executed on a recurring schedule.

src/request-durable-object.ts

import { ScheduledRequest } from "./types";
 
export class RequestDurableObject {
  id: string|DurableObjectId
  storage: DurableObjectStorage
 
  constructor(state:DurableObjectState) {
    this.storage = state.storage
    this.id = state.id
  }
 
  async fetch(request:Request) {
    // read scheduled request from request body
    const scheduledRequest:ScheduledRequest = await request.json()

    // save scheduled request data to Durable Object storage, set the alarm, and return Durable Object id
    this.storage.put("request", scheduledRequest)
    this.storage.setAlarm(scheduledRequest.triggerAt || new Date())
    return new Response(JSON.stringify({
      id: this.id.toString()
    }), {
      headers: {
        "content-type": "application/json"
      }
    })
  }

  async alarm() {
    // read the scheduled request from Durable Object storage
    const scheduledRequest:ScheduledRequest|undefined = await this.storage.get("request")

    // call fetch on the scheduled request URL with the optional requestInit
    if (scheduledRequest) {
      await fetch(scheduledRequest.url, scheduledRequest.requestInit)

      if (scheduledRequest.triggerEverySeconds) {
        // reschedule the alarm to keep the recurring schedule going
        this.storage.setAlarm(Date.now() + scheduledRequest.triggerEverySeconds * 1000)
      } else {
        // cleanup the one-off scheduled request once done
        this.storage.deleteAll()
      }
    }
  }
}

Wrangler configuration

Once we have the Durable Object class, we need to create a Durable Object binding by instructing Wrangler where to find it and what the exported class name is.

wrangler.toml

name = "durable-objects-request-scheduler"
main = "src/index.ts"
compatibility_date = "2022-08-02"
 
# added Durable Objects configuration
[durable_objects]
bindings = [
  { name = "DO_REQUEST", class_name = "RequestDurableObject" },
]
 
[[migrations]]
tag = "v1"
new_classes = ["RequestDurableObject"]

Scheduling system API

The API Worker will accept POST HTTP methods only, expecting a JSON body with the scheduled request data – what URL to call and, optionally, what method, headers, or body to send. Any method other than POST will return a 405 Method Not Allowed HTTP error.

HTTP POST /:scheduledRequestId? will create or override a scheduled request, where :scheduledRequestId is an optional Durable Object ID previously returned by the scheduling system API.

src/index.ts

import { Env } from "./types"
export { RequestDurableObject } from "./request-durable-object"
 
export default {
    async fetch(
        request: Request,
        env: Env
    ): Promise<Response> {
        if (request.method !== "POST") {
            return new Response("Method Not Allowed", {status: 405})
        }
 
        // parse the URL and get Durable Object ID from the URL
        const url = new URL(request.url)
        const idFromUrl = url.pathname.slice(1)
 
        // construct the Durable Object ID, use the ID from pathname or create a new unique id
        const doId = idFromUrl ? env.DO_REQUEST.idFromString(idFromUrl) : env.DO_REQUEST.newUniqueId()
 
        // get the Durable Object stub for our Durable Object instance
        const stub = env.DO_REQUEST.get(doId)
 
        // pass the request to Durable Object instance
        return stub.fetch(request)
    },
}

It’s worth mentioning that the script above does not implement any listing of scheduled or processed requests. Depending on how the scheduling system is integrated, you can save each created Durable Object ID to your existing backend, or write your own registry using one of the Workers storage options.
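
For example, such a registry could be built on Workers KV. The sketch below is an illustration only – the KV binding and the stored value shape are assumptions, not part of the project above:

// sketch: a KV-backed registry of scheduled request IDs
// assumes a KV namespace bound in wrangler.toml and passed in as `kv`
import { ScheduledRequest } from "./types"

export async function registerScheduledRequest(kv: KVNamespace, id: string, scheduledRequest: ScheduledRequest) {
  // store the Durable Object ID together with the request metadata
  await kv.put(id, JSON.stringify(scheduledRequest))
}

export async function listScheduledRequestIds(kv: KVNamespace): Promise<string[]> {
  // list the IDs of all registered scheduled requests
  const { keys } = await kv.list()
  return keys.map(key => key.name)
}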

Starting a local development server and testing our application

We are almost done! Before we publish our scheduler system to the Cloudflare edge, let’s start Wrangler in a completely local mode to run a couple of tests against it and to see it in action – which will work even without an Internet connection!

wrangler dev --local

The development server is listening on localhost:8787, which we will use to schedule our first request. The JSON request payload should match the TypeScript schema we defined in the beginning – a required url, and an optional triggerEverySeconds number or triggerAt unix timestamp. When only the required URL is passed, the request will be dispatched right away.

An example of a request payload that will send a GET request to https://example.com every 30 seconds:

{
	"url": "https://example.com",
	"triggerEverySeconds": 30
}
> curl -X POST -d '{"url": "https://example.com", "triggerEverySeconds": 30}' http://localhost:8787
{"id":"000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f"}

From the Wrangler logs, we can see the scheduled request ID 000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f being triggered in 30-second intervals.

Need to double the interval? No problem, just send a new POST request and pass the request ID as a pathname.

> curl -X POST -d '{"url": "https://example.com", "triggerEverySeconds": 60}' http://localhost:8787/000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f
{"id":"000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f"}

Every scheduled request gets a unique Durable Object ID with its own storage and alarm. As we demonstrated, the ID comes in handy when you need to change the settings of a scheduled request, or to deschedule it completely.
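
Descheduling isn’t implemented in the code above, but it only takes a few extra lines. As a sketch – assuming the API Worker is also updated to forward DELETE requests instead of returning a 405 – the Durable Object could clear its alarm and storage like this:

// sketch: extending RequestDurableObject to support descheduling
export class RequestDurableObject {
  // ... constructor and scheduling logic as shown earlier

  async fetch(request: Request) {
    if (request.method === "DELETE") {
      // cancel the pending alarm and remove the stored request metadata
      await this.storage.deleteAlarm()
      await this.storage.deleteAll()
      return new Response(null, { status: 204 })
    }
    // ... the existing POST scheduling logic shown earlier
  }
}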

Publishing to the network

The following command will bundle our Workers application, export and bind Durable Objects, and deploy it to our workers.dev subdomain.

wrangler publish

That’s it, we are live! 🎉 The URL of your deployment is shown in the Workers logs. In a reasonably short period of time we managed to write our own scheduling system that is ready to handle requests at scale.

You can check out the full source code in the Workers templates repository, or experiment from your browser, without installing any dependencies locally, using the StackBlitz template.

Simplifying serverless best practices with AWS Lambda Powertools for TypeScript

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/simplifying-serverless-best-practices-with-aws-lambda-powertools-for-typescript/

This blog post is written by Sara Gerion, Senior Solutions Architect.

Development teams must have a shared understanding of the workloads they own and their expected behaviors to deliver business value fast and with confidence. The AWS Well-Architected Framework and its Serverless Lens provide architectural best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the AWS Cloud.

Developers should design and configure their workloads to emit information about their internal state and current status. This allows engineering teams to ask arbitrary questions about the health of their systems at any time. For example, emitting metrics, logs, and traces with useful contextual information enables situational awareness and allows developers to filter and select only what they need.

Following such practices reduces the number of bugs, accelerates remediation, and speeds up the path into production. These practices can help mitigate deployment risks, offer more accurate production-readiness assessments, and enable more informed decisions about deploying systems and changes.

AWS Lambda Powertools for TypeScript

AWS Lambda Powertools provides a suite of utilities for AWS Lambda functions to ease the adoption of serverless best practices. The idea was inspired by AWS Hero Yan Cui’s initial implementation, DAZN Lambda Powertools.

Following the community’s adoption of AWS Lambda Powertools for Python and AWS Lambda Powertools for Java, we are excited to announce the general availability of the AWS Lambda Powertools for TypeScript.

AWS Lambda Powertools for TypeScript provides a suite of utilities for Node.js runtimes, which you can use in both JavaScript and TypeScript code bases. The library follows a modular approach similar to the AWS SDK v3 for JavaScript. Each utility is installed as a standalone NPM package.

Today, the library is ready for production use with three observability features: distributed tracing (Tracer), structured logging (Logger), and asynchronous business and application metrics (Metrics).

You can instrument your code with Powertools in three different ways:

  • Manually. It provides the most granular control. It’s the most verbose approach, with the added benefit of no additional dependency and no refactoring to TypeScript Classes.
  • Middy middleware. It is the best choice if your existing code base relies on the Middy middleware engine. Powertools offers compatible Middy middleware to make this integration seamless.
  • Method decorator. Use TypeScript method decorators if you prefer writing your business logic using TypeScript Classes. If you aren’t using Classes, this requires the most significant refactoring.

The examples in this blog post use the Middy approach. To follow the examples, ensure that middy is installed:

npm i @middy/core

Logger

Logger provides an opinionated logger with output structured as JSON. Its key features include:

  • Capturing key fields from the Lambda context and cold starts, and structuring logging output as JSON.
  • Logging Lambda invocation events when instructed (disabled by default).
  • Printing all the logs only for a percentage of invocations via log sampling (disabled by default).
  • Appending additional keys to structured logs at any point in time.
  • Providing a custom log formatter (Bring Your Own Formatter) to output logs in a structure compatible with your organization’s Logging RFC.

To install, run:

npm install @aws-lambda-powertools/logger

Usage example:

import { Logger, injectLambdaContext } from '@aws-lambda-powertools/logger';
import middy from '@middy/core';

const logger = new Logger({
    logLevel: 'INFO',
    serviceName: 'shopping-cart-api',
});

const lambdaHandler = async (): Promise<void> => {
    logger.info('This is an INFO log with some context');
};

export const handler = middy(lambdaHandler)
    .use(injectLambdaContext(logger));

In Amazon CloudWatch, the structured log emitted by your application looks like:

{
    "cold_start": true,
    "function_arn": "arn:aws:lambda:eu-west-1:123456789012:function:shopping-cart-api-lambda-prod-eu-west-1",
    "function_memory_size": 128,
    "function_request_id": "c6af9ac6-7b61-11e6-9a41-93e812345678",
    "function_name": "shopping-cart-api-lambda-prod-eu-west-1",
    "level": "INFO",
    "message": "This is an INFO log with some context",
    "service": "shopping-cart-api",
    "timestamp": "2021-12-12T21:21:08.921Z",
    "xray_trace_id": "abcdef123456abcdef123456abcdef123456"
}
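
The log sampling and additional keys features listed above are opt-in. A small sketch of both (the sample rate and key values here are arbitrary examples):

import { Logger } from '@aws-lambda-powertools/logger';

// print all log levels for roughly half of all invocations
const logger = new Logger({
    logLevel: 'INFO',
    serviceName: 'shopping-cart-api',
    sampleRateValue: 0.5,
});

// append an extra key to every subsequent structured log entry
logger.appendKeys({ orderId: 'order-12345' });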

Logs generated by Powertools can also be ingested and analyzed by any third-party SaaS vendor that supports JSON.

Tracer

Tracer is an opinionated thin wrapper for the AWS X-Ray SDK for Node.js.

Its key features include:

  • Auto-capturing cold start and service name as annotations, and responses or full exceptions as metadata.
  • Automatically tracing HTTP(S) clients and generating segments for each request.
  • Supporting tracing functions via decorators, middleware, and manual instrumentation.
  • Supporting tracing AWS SDK v2 and v3 via AWS X-Ray SDK for Node.js.
  • Auto-disabling tracing when not running in the Lambda environment.

To install, run:

npm install @aws-lambda-powertools/tracer

Usage example:

import { Tracer, captureLambdaHandler } from '@aws-lambda-powertools/tracer';
import middy from '@middy/core';

const tracer = new Tracer({
    serviceName: 'shopping-cart-api'
});

const lambdaHandler = async (): Promise<void> => {
    /* ... Something happens ... */
};

export const handler = middy(lambdaHandler)
    .use(captureLambdaHandler(tracer));
AWS X-Ray segments and subsegments emitted by Powertools

Example service map generated with Powertools

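Besides the middleware, Tracer also supports manual instrumentation. A sketch of adding your own annotations and metadata to the current trace (the key names here are illustrative examples):

import { Tracer } from '@aws-lambda-powertools/tracer';

const tracer = new Tracer({ serviceName: 'shopping-cart-api' });

// annotations are indexed by X-Ray and can be used to filter traces
tracer.putAnnotation('successfulBooking', true);

// metadata is not indexed, but is attached to the trace for context
tracer.putMetadata('bookingResponse', { id: 'abc123', status: 'confirmed' });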

Metrics

The Metrics utility creates custom metrics asynchronously by logging metrics to standard output, following the Amazon CloudWatch Embedded Metric Format (EMF). These metrics can be visualized through CloudWatch dashboards or used to trigger alerts.

Its key features include:

  • Aggregating up to 100 metrics using a single CloudWatch EMF object (large JSON blob).
  • Validating your metrics against common metric definition mistakes (for example, metric unit, values, max dimensions, max metrics).
  • Metrics are created asynchronously by the CloudWatch service. You do not need any custom stacks, and there is no impact to Lambda function latency.
  • Creating a one-off metric with different dimensions.

To install, run:

npm install @aws-lambda-powertools/metrics

Usage example:

import { Metrics, MetricUnits, logMetrics } from '@aws-lambda-powertools/metrics';
import middy from '@middy/core';

const metrics = new Metrics({
    namespace: 'serverlessAirline',
    serviceName: 'orders'
});

const lambdaHandler = async (): Promise<void> => {
    metrics.addMetric('successfulBooking', MetricUnits.Count, 1);
};

export const handler = middy(lambdaHandler)
    .use(logMetrics(metrics));

In CloudWatch, the custom metric emitted by your application looks like:

{
    "successfulBooking": 1.0,
    "_aws": {
        "Timestamp": 1592234975665,
        "CloudWatchMetrics": [
            {
                "Namespace": "serverlessAirline",
                "Dimensions": [
                    [
                        "service"
                    ]
                ],
                "Metrics": [
                    {
                        "Name": "successfulBooking",
                        "Unit": "Count"
                    }
                ]
            }
        ]
    },
    "service": "orders"
}
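
The one-off metric feature from the list above lets you emit a metric with its own dimensions, without affecting the shared Metrics instance. A sketch, reusing the metrics instance from the usage example above (the dimension name and value are illustrative):

// emit a single metric with a dimension that only applies to it
const singleMetric = metrics.singleMetric();
singleMetric.addDimension('environment', 'prod');
singleMetric.addMetric('coldStart', MetricUnits.Count, 1);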

Serverless TypeScript demo application

The Serverless TypeScript Demo shows how to use Lambda Powertools for TypeScript. You can find instructions on how to deploy and load test this application in the repository.

Serverless TypeScript Demo architecture

The code for the Get Products Lambda function shows how to use the utilities. The function is instrumented with Logger, Metrics and Tracer to emit observability data.

// blob/main/src/api/get-products.ts
import { APIGatewayProxyEvent, APIGatewayProxyResult} from "aws-lambda";
import { DynamoDbStore } from "../store/dynamodb/dynamodb-store";
import { ProductStore } from "../store/product-store";
import { logger, tracer, metrics } from "../powertools/utilities"
import middy from "@middy/core";
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer';
import { injectLambdaContext } from '@aws-lambda-powertools/logger';
import { logMetrics, MetricUnits } from '@aws-lambda-powertools/metrics';

const store: ProductStore = new DynamoDbStore();
const lambdaHandler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {

  logger.appendKeys({
    resource_path: event.requestContext.resourcePath
  });

  try {
    const result = await store.getProducts();

    logger.info('Products retrieved', { details: { products: result } });
    metrics.addMetric('productsRetrieved', MetricUnits.Count, 1);

    return {
      statusCode: 200,
      headers: { "content-type": "application/json" },
      body: `{"products":${JSON.stringify(result)}}`,
    };
  } catch (error) {
      logger.error('Unexpected error occurred while trying to retrieve products', error as Error);

      return {
        statusCode: 500,
        headers: { "content-type": "application/json" },
        body: JSON.stringify(error),
      };
  }
};

const handler = middy(lambdaHandler)
    .use(captureLambdaHandler(tracer))
    .use(logMetrics(metrics, { captureColdStartMetric: true }))
    .use(injectLambdaContext(logger, { clearState: true, logEvent: true }));

export {
  handler
};

The Logger utility adds useful context to the application logs. Structuring your logs as JSON allows you to search on your structured data using Amazon CloudWatch Logs Insights. This allows you to filter out the information you don’t need.

For example, use the following query to search for any errors for the serverless-typescript-demo service.

fields resource_path, message, timestamp
| filter service = 'serverless-typescript-demo'
| filter level = 'ERROR'
| sort @timestamp desc
| limit 20
CloudWatch Logs Insights showing errors for the serverless-typescript-demo service.

The Tracer utility adds custom annotations and metadata during the function invocation, which it sends to AWS X-Ray. Annotations allow you to search for and filter traces by business or application contextual information such as product ID, or cold start.

You can see the duration of the putProduct method and the ColdStart and Service annotations attached to the Lambda handler function.

putProduct trace view

The Metrics utility simplifies the creation of complex high-cardinality application data. Including structured data along with your metrics allows you to search or perform additional analysis when needed.

In this example, you can see how many times per second a product is created, deleted, or queried. You could configure alarms based on the metrics.

Metrics view

Code examples

You can use Powertools with many Infrastructure as Code or deployment tools. The project contains source code and supporting files for serverless applications that you can deploy with the AWS Cloud Development Kit (AWS CDK) or AWS Serverless Application Model (AWS SAM).

The AWS CDK lets you build reliable and scalable applications in the cloud with the expressive power of a programming language, including TypeScript. The AWS SAM CLI is a tool that makes it easier to create and manage serverless applications.

You can use the sample applications provided in the GitHub repository to understand how to use the library quickly and experiment in your own AWS environment.

Conclusion

AWS Lambda Powertools for TypeScript can help simplify, accelerate, and scale the adoption of serverless best practices within your team and across your organization.

The library implements best practices recommended as part of the AWS Well-Architected Framework, without you needing to write much custom code.

Since the library relieves the operational burden needed to implement these functionalities, you can focus on the features that matter the most, shortening the Software Development Life Cycle and reducing the Time To Market.

The library helps both individual developers and engineering teams to standardize their organizational best practices. Utilities are designed to be incrementally adoptable for customers at any stage of their serverless journey, from startup to enterprise.

To get started with AWS Lambda Powertools for TypeScript, see the official documentation. For more serverless learning resources, visit Serverless Land.

Making Page Shield malicious code alerts more actionable

Post Syndicated from Simon Wijckmans original https://blog.cloudflare.com/making-page-shield-malicious-code-alerts-more-actionable/

Last year during CIO week, we announced Page Shield in general availability. Today, we talk about improvements we’ve made to help Page Shield users focus on the highest impact scripts and get more value out of the product. In this post we go over improvements to script status, metadata and categorization.

What is Page Shield?

Page Shield protects website owners and visitors against malicious 3rd party JavaScript. JavaScript can be leveraged in a number of malicious ways: browser-side crypto mining, data exfiltration and malware injection to mention a few.

For example, a single hijacked JavaScript file can expose millions of users’ credit card details across a range of websites to a malicious actor. The bad actor would scrape details by leveraging a compromised JavaScript library, skimming inputs to a form, and exfiltrating them to a 3rd party endpoint under their control.

Today Page Shield partially relies on Content Security Policies (CSP), a browser native framework that can be used to control and gain visibility of which scripts are allowed to load on pages (while also reporting on any violations). We use these violation reports to provide detailed information in the Cloudflare dashboard regarding scripts being loaded by end-user browsers.
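
As an illustration, a report-only policy that asks browsers to report script loads without blocking them might look like the following header (the reporting endpoint shown is a placeholder, not Cloudflare’s actual endpoint):

Content-Security-Policy-Report-Only: script-src 'none'; report-uri https://csp-reporting.example.com/reports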

Page Shield users, via the dashboard, can see which scripts are active on their website and on which pages. Users can be alerted in case a script performs malicious activity, and monitor code changes of the script.

Script status

To help identify malicious scripts, and make it easier to take action on live threats, we have introduced a status field.

When going to the Page Shield dashboard, the customer will now only see scripts with a status of active. Active scripts are those that were seen in the last seven days and didn’t get served through the “cdn-cgi” endpoint (which is managed by Cloudflare).

We also introduced other statuses:

  • infrequent scripts are scripts that have been seen in only a negligible number of CSP reports over a period of time. They normally indicate a single user’s browser using a compromised browser extension. The goal of this status is to reduce noise caused by browser plugins that inject their JavaScript into the HTML.
  • inactive scripts are those that have not been reported for seven days and therefore have likely since been removed or replaced.
  • cdn-cgi are scripts served from the ‘/cdn-cgi/’ endpoint, which is managed by Cloudflare. These scripts are related to Cloudflare products like our analytics or Bot Management features. Cloudflare closely monitors these scripts, and they are fairly static, so they shouldn’t require close monitoring by a customer and are therefore hidden by default unless our detections find anything suspicious.

If the customer wishes to see the full list of scripts including the non-active scripts they can still do so by clicking ‘All scripts’.

Script metadata in alerts

Customers that opt into the enterprise add-on version of Page Shield can also choose to set up notifications for malicious scripts. In the previous version, we offered the script URL, first seen on, and last seen on data. These alerts have been revamped to improve the experience for security analysts. Our goal is to provide all the data a security analyst would manually look up in order to validate a script. With this update, we’ve made a significant step in that direction by using insights delivered by Cloudflare Radar.

At the top of the email alert you will now find where the script was seen along with other information regarding when the script was seen and the full URL (not clickable for security purposes).

As part of the enterprise add-on version of Page Shield, we review the scripts through a custom machine learning classifier and a range of domain and URL threat feeds. A very common question with any machine learning scoring system is “why did it score the way it did?” Because of this, the machine learning score generated by our system has now been split up to show the components that made up the score, currently: obfuscation and data exfiltration values. This should improve a customer’s ability to review a script in case of a false positive.

Threat feeds can be very helpful in detecting new attack styles that our classifier hasn’t been trained for yet. Some of these feeds offer us a range of categories of malicious endpoints such as ‘malware’, ‘spyware’ or ‘phishing’. Our enterprise add-on customers will now be able to see the categorization in our alerts (as shown above) and on the dashboard. The goal is to provide more context on why a script is considered malicious.

We also now provide information on script changes along with the “malicious code score” for each version.

Right below the most essential information, we have added WHOIS information on the domain that is providing the script. In some cases the registrar may hide relevant information, such as the organization’s name; however, information on the date of registration and expiration can be very useful in detecting unexpected changes in ownership. For example, we often see malicious scripts being hosted under newly registered domains.

We also now offer details regarding the SSL certificates issued for this domain through certificate transparency monitoring. This can be useful in detecting a potential takeover. For example, if a 3rd party script endpoint usually uses a DigiCert certificate, but recently a Let’s Encrypt certificate has been issued, this could be an indicator that another party is trying to take over the domain.

Last but not least, we have improved our dashboard link to take users directly to the specific script details page provided by the Page Shield UI.

What’s next?

There are many ways to perform malicious activity through JavaScript. Because of this, it is important that we build attack-type-specific detection mechanisms, as well as overarching tools that will help detect anomalies. We are currently building a new component, purpose-built to look for signs of malicious intent in data endpoints, by leveraging the connect-src CSP directive. The goal is to improve the accuracy of our Magecart-style attack detection.

We are also working on providing the ability to generate CSP policies directly through Page Shield, making it easy to perform positive blocking and maintain rules over time.

Another feature we are working on is offering the ability to block scripts from accessing a user’s webcam, microphone or location through a single click.

More about this in future blog posts. Many more features to come!

XSS in JSON: Old-School Attacks for Modern Applications

Post Syndicated from Julius Callahan original https://blog.rapid7.com/2022/05/04/xss-in-json-old-school-attacks-for-modern-applications/

I recently wrote a blog post on injection-type vulnerabilities and how they were knocked down a few spots from 1 to 3 on the new OWASP Top 10 for 2022. The main focus of that article was to demonstrate how stack traces could be — and still are — used via injection attacks to gather information about an application to further an attacker’s goal. In that post, I skimmed over one of my all time favorite types of injections: cross-site scripting (XSS).

In this post, I’ll cover this gem of an exploit in much more depth, highlighting how it has managed to adapt to the newer environments of today’s modern web applications, specifically the API and Javascript Object Notation (JSON).

I know the term API is thrown around a lot when referencing web applications these days, but for this post, I will specifically be referencing requests made from the front end of a web application to the back end via ajax (Asynchronous JavaScript and XML) or more modern approaches like the fetch method in JavaScript.

Before we begin, I’d like to give a quick recap of what XSS is and how a legacy application might handle these types of requests that could trigger XSS, then dive into how XSS still thrives today in modern web applications via the methods mentioned so far.

What is cross-site scripting?

There are many types of XSS, but for this post, I’ll only be focusing on persistent XSS, which is sometimes referred to as stored XSS.

XSS is a type of injection attack, in which malicious scripts are injected into otherwise benign and trusted websites. XSS attacks occur when an attacker uses a web application to execute malicious code — generally in the form of a browser-side script like JavaScript, for example — against an unsuspecting end user. Flaws that allow these attacks to succeed are quite widespread and occur anywhere a web application accepts an input from a user without sanitizing, validating, escaping, or encoding it.

Because the end user’s browser has no way to know not to trust the malicious script, the browser will execute the script. Because of this broken trust, attackers typically leverage these vulnerabilities to steal victims’ cookies, session tokens, or other sensitive information retained by the browser. They could also redirect to other malicious sites, install keyloggers or crypto miners, or even change the content of the website.

Now for the “stored” part. As the name implies, stored XSS generally occurs when the malicious payload has been stored on the target server, usually in a database, from input that has been submitted in a message forum, visitor log, comment field, form, or any parameter that lacks proper input sanitization.

What makes this type of XSS so much more damaging is that, unlike reflected XSS – which only affects specific targets via cleverly crafted links – stored XSS affects any and everyone visiting the compromised site. This is because the XSS has been stored in the application’s database, allowing for a much larger attack surface.

Old-school apps

Now that we’ve established a basic understanding of stored XSS, let’s go back in time a few decades to when web apps were much simpler in their communications between the front-end and back-end counterparts.

Let’s say you want to change some personal information on a website, like your email address on a contacts page. When you enter your email address and click the update button, it triggers the POST method to send the form data to the back end to update that value in a database. The database updates the value in a table, then pushes a response back to the web application’s front end, or UI, for you to see. This would usually result in the entire page having to reload to display only a very minimal change in content. While very inefficient, the information would nonetheless be added and updated for the end user to consume.

In the example below, clicking the update button submits a POST form request to the back-end database where the application updates and stores all the values, then provides a response back to the webpage with the updated info.
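
Conceptually, the old-school version is nothing more than a plain HTML form whose submission reloads the whole page. A minimal sketch (the action path and field names are illustrative):

<!-- classic full-page-reload contact form -->
<form action="/contact" method="POST">
  <label for="email">Email address</label>
  <input type="text" id="email" name="email" value="user@example.com">
  <button type="submit">Update</button>
</form>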

Old-school XSS

As mentioned in my previous blog post on injection, I give an example where an attacker enters in a payload of <script>alert(“This is XSS”)</script> instead of their email address and clicks the update button. Again, this triggers the POST method to take our payload and send it to the back-end database to update the email table, then pushes a response back to the front end, which gets rendered back to the UI in HTML. However, this time the email value being stored and displayed is my XSS payload, <script>alert(“This is XSS”)</script>, not an actual email address.

Clicking the “update” button submits the POST form data to the back end, where the database stores the values, then pushes back a response to update the UI as HTML.

However, because our payload is not being sanitized properly, our malicious JavaScript gets executed by the browser, which causes our alert box to pop up.

While the payload used in the above example is harmless, the point to drive home here is that we were able to get the web application to store and execute our JavaScript all through a simple contact form. Anyone visiting my contact page will see this alert pop up because my XSS payload has been stored in the database and gets executed every time the page loads. From this point on, the possible damage that could be done here is endless and only limited by the attacker’s imagination… well, and their coding skills.  

New-school apps

In the first example I gave, when you updated the email address on the contact page and the request was fulfilled by the backend, the entire page would reload in order to display the newly created or updated information. You can see how inefficient this is, especially if the only thing changing on the page is a single line or a few lines of text. Here is where ajax and/or the fetch method comes in.

Ajax, or the fetch method, can be used to get data from or post data to a remote source, then update the front-end UI of that web application without having to refresh the page. Only the content from the specific request is updated, not the entire page, and that is the key difference between our first example and this one.

And a very popular format for said data being sent and received is JavaScript Object Notation, most commonly known as JSON. (Don’t worry, I’ll get back to those curly braces in just a bit.)

New-school XSS

(Well, not really, but it sounds cool.)

Now, let’s pretend we’ve traveled back to the future and our contact page has been rewritten to use ajax or the fetch method to send and receive data to and from the database. From the user’s point of view, nothing has changed — still the same ol’ form. But this time, when the email address is being updated, only the contact form refreshes. The entire page and all of its contents do not refresh like in the previous version, which is a major win for efficiency and user experience.

Below is an example of what a POST might look like formatted in JSON.

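A sketch of the request itself, using the fetch method (the endpoint and field names are illustrative):

// update the email address without a full page reload
fetch("/api/contact", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ email: "user@example.com" })
})
  .then((response) => response.json())
  .then((data) => {
    // update only the contact form in the UI, not the entire page
    document.querySelector("#email-display").textContent = data.email;
  });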

“What is JSON?” you might ask. Short for JavaScript Object Notation, it is a lightweight text format for storing and transferring data and is most commonly used when sending data to and from servers. Remember those curly braces I mentioned earlier? Well, one quick and easy way to spot JSON is the formatting and the use of curly braces.

In the example above, you can see what our new POST looks like using ajax or the fetch method in JavaScript. While the end result is no different than before, the method used to update the page is quite different. The key difference here is that the data we want to update is being treated as just that – data – but in the form of JSON as opposed to HTML.

Now, let’s inject the same XSS payload into the same email field and hit update. As shown below, our POST request has been wrapped in curly braces, using JSON, and is formatted a bit differently before being sent to the back end to be processed.

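The JSON request body would look something like this (the field name is illustrative):

{
  "email": "<script>alert(\"This is XSS\")</script>"
}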

In the example above, you can see that the application is allowing my email address to be the XSS payload in its entirety. However, the JavaScript here is only being displayed, not executed as code by the browser, so the alert “pop” message never gets triggered as in the previous example. That again is the key difference between the original way we fulfilled requests and our new, more modern way – or in short, using JSON instead of HTML.

Now you might be asking yourself: what’s wrong with allowing the XSS payload to be the email address if it’s only being displayed and not executed as JavaScript by the browser? That is a valid question, but hear me out.

See, I’ve been working in this industry long enough to know that the two most common responses to a question or statement regarding cybersecurity begin with either “that depends…” or “what if…” I’m going to go with the latter here and throw a couple of what-ifs at you.

Now that my XSS is stored in your database, it’s only a matter of time before this ticking time bomb goes off. Just because my XSS is being treated as JSON and not HTML now does not mean that will always be the case, and attackers are betting on this.

Here are a few scenarios.

Scenario 1

What if team B handles this data differently from team A? What if team B still uses more traditional methods of sending and receiving data to and from the back end and does leverage the use of HTML and not JSON?

In that case, the XSS would most likely eventually get executed. It might not affect the website that the XSS was originally injected into, but the stored data can be (and usually is) also used elsewhere. The XSS stored in that database is probably going to be shared and used by multiple other teams and applications at some point. The odds of all those different teams leveraging the exact same standards and best practices are slim to none, and attackers know this.  

Scenario 2

What if, down the road, a developer using more modern techniques like ajax or the fetch method to send and receive data to and from the back end decides to use the .innerHTML property instead of .innerText to load that JSON into the UI? All bets are off, and the stored XSS that was previously being protected by those lovely curly braces will now most likely get executed by the browser.
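
A sketch of the difference (element IDs are illustrative; note that browsers won’t execute <script> tags inserted via innerHTML, but event-handler payloads such as the one below will fire):

// assume this value arrived as JSON and contains a stored XSS payload
const payload = '<img src=x onerror="alert(\'This is XSS\')">';

// safe: the payload is rendered as inert text
document.querySelector("#email-safe").innerText = payload;

// dangerous: the payload is parsed as HTML and the onerror handler runs
document.querySelector("#email-unsafe").innerHTML = payload;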

Scenario 3

Lastly, what if the current app had been developed to use server-side rendering, but a decision from higher up has been made that some costs need to be cut and that the company could actually save money by recoding some of their web apps to be client-side rather than server-side?

Previously, the back end was doing all the work, including sanitizing all user input, but now the shift will be for the browser to do all the heavy lifting. Good luck spotting all the XSS stored in the DB — in its previous state, it was “harmless,” but now it could get rendered to the UI as HTML, allowing the browser to execute said stored XSS. In this scenario, a decision that was made upstream will have an unexpected security impact downstream, both figuratively and literally — a situation that is all too well-known these days.

Final thoughts

Part of my job as a security advisor is to, well, advise. And it’s these types of situations that keep me up at night. I come across XSS in applications every day, and while I may not see as many fun and exciting “pops” as in years past, I see something a bit more troubling.

This type of XSS is what I like to call a “sleeper vuln” – laying dormant, waiting for the right opportunity to be woken up. If I didn’t know any better, I’d say XSS has evolved and is aware of its new surroundings. Of course, XSS hasn’t evolved, but the applications in which it lives have.

At the end of the day, we’re still talking about the same XSS from its conception, the same XSS that has been on the OWASP Top 10 for decades — what we’re really concerned about is the lack of sanitization or handling of user input. But now, with the massive adoption of JavaScript frameworks like Angular, libraries like React, the use of APIs, and the heavy reliance on them to handle the data properly, we’ve become complacent in our duties to harden applications the proper way.

There seems to be a division in camps around XSS in JSON. On the one hand, some feel that since the JavaScript isn’t being executed by the browser, everything is fine. Who cares if an email address (or any data for that matter) is potentially dangerous — as long as it’s not being executed by the browser. And on the other hand, you have the more fundamentalist, dare I say philosophical thought that all user input should never be trusted: It should always be sanitized, regardless of whether it’s treated as data or not — and not solely because of following best coding and security practices, but also because of the “that depends” and “what if” scenarios in the world.  

I’d like to point out in my previous statement above that “as long as” is vastly different from “cannot.” “As long as” implies situational awareness and that a certain set of criteria need to be met for it to be true or false, while “cannot” is definite and fixed, regardless of the situation or criteria. “As long as the XSS is wrapped in curly braces” means it does not pose a risk in its current state but could in other states. But if input is sanitized and escaped properly, the XSS would never exist in the first place, and thus it “cannot” be executed by the browser, ever.

I guess I cannot really complain too much about these differences of opinion, though. The fact that I’m even having these conversations with others is already a step in the right direction. But what does concern me is that it’s 2022, we’re still seeing XSS rampant in applications, and somehow wrapping it in JSON makes it acceptable. One of the core fundamentals of my job is to find and prioritize risk, then report. And while there is always room for discussion around the severity of these types of situations – lots of factors have to be taken into consideration, and a spade isn’t always a spade in application security, or cybersecurity in general for that matter. But you can rest assured that if I find XSS in JSON in your environment, I will be calling it out.

I hope there will be a future where I can look back and say, “Remember that one time when curly braces were all that prevented your website from getting hacked?” Until then, JSON or not, never trust user data, and sanitize all user input (and output for that matter). A mere { } should never be the difference between your site getting hacked or not.

Announcing native support for Stripe’s JavaScript SDK in Cloudflare Workers

Post Syndicated from Kristian Freeman original https://blog.cloudflare.com/announcing-stripe-support-in-workers/

This post is also available in 日本語, 简体中文.

Handling payments inside your apps is crucial to building a business online. For many developers, the leading choice for handling payments is Stripe. Since my first encounter with Stripe about seven years ago, the service has evolved far beyond simple payment processing. In the e-commerce example application I shared last year, Stripe managed a complete seller marketplace, using the Connect product. Stripe’s product suite is great for developers looking to go beyond accepting payments.

Earlier versions of Stripe’s SDK had core Node.js dependencies, like many popular JavaScript packages. In Stripe’s case, it interacted directly with core Node.js libraries like net/http, to handle HTTP interactions. For Cloudflare Workers, a V8-based runtime, this meant that the official Stripe JS library didn’t work; you had to fall back to using Stripe’s (very well-documented) REST API. By doing so, you’d lose the benefits of using Stripe’s native JS library — things like automatic type-checking in your editor, and the simplicity of function calls like stripe.customers.create(), instead of manually constructed HTTP requests, to interact with Stripe’s various pieces of functionality.

In April, we wrote that we were focused on increasing the number of JavaScript packages that are compatible with Workers:

We won’t stop until our users can import popular Node.js libraries seamlessly. This effort will be large-scale and ongoing for us, but we think it’s well worth it.

I’m excited to announce general availability for the stripe JS package for Cloudflare Workers. You can now use the native Stripe SDK directly in your projects! To get started, install stripe in your project: npm i stripe.

This also opens up a great opportunity for developers who have deployed sites to Cloudflare Pages to begin using Stripe right alongside their applications. With this week’s announcement of serverless function support in Cloudflare Pages, you can accept payments for your digital product, or handle subscriptions for your membership site, with a few lines of JavaScript added to your Pages project. There’s no additional configuration, and it scales automatically, just like your Pages site.

To showcase this, we’ve prepared an example open-source repository, showing how you can integrate Stripe Checkout into your Pages application. There’s no additional configuration needed — as we announced yesterday, Pages’ new Workers Functions features allows you to deploy infinitely-scalable functions just by adding a new functions folder to your project. See it in action at stripe.pages.dev, or check out the open-source repository on GitHub.
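
As a sketch of what that can look like in a Pages project – the file path, binding name, and price ID below are placeholders, not taken from the example repository:

// functions/create-checkout-session.ts
import Stripe from "stripe";

export const onRequestPost: PagesFunction<{ STRIPE_API_KEY: string }> = async ({ env }) => {
  const stripe = new Stripe(env.STRIPE_API_KEY, {
    httpClient: Stripe.createFetchHttpClient(),
  });

  // create a Checkout session and send the customer to Stripe's hosted page
  const session = await stripe.checkout.sessions.create({
    line_items: [{ price: "price_placeholder", quantity: 1 }],
    mode: "payment",
    success_url: "https://example.com/success",
    cancel_url: "https://example.com/cancel",
  });

  return Response.redirect(session.url!, 303);
};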

With the SDK installed, you can begin accepting payments directly in your applications. The below example shows how to initiate a new Checkout session and redirect to Stripe’s hosted checkout page:

const Stripe = require("stripe");

const stripe = Stripe(STRIPE_API_KEY, {
  httpClient: Stripe.createFetchHttpClient()
});

export default {
  async fetch(request) {
    const session = await stripe.checkout.sessions.create({
      line_items: [{
        price_data: {
          currency: 'usd',
          product_data: {
            name: 'T-shirt',
          },
          unit_amount: 2000,
        },
        quantity: 1,
      }],
      payment_method_types: [
        'card',
      ],
      mode: 'payment',
      success_url: `${YOUR_DOMAIN}/success.html`,
      cancel_url: `${YOUR_DOMAIN}/cancel.html`,
    });

    return Response.redirect(session.url)
  }
}

With support for the Stripe SDK natively in Cloudflare Workers, you aren’t just limited to payment processing either. Any JavaScript example that’s currently in Stripe’s extensive documentation works directly in Workers, without needing to make any changes.

In particular, using Workers to handle the multitude of available Stripe webhooks means that you can get better visibility into how your existing projects are working, without needing to spin up any new infrastructure. The below example shows how you can securely validate incoming webhook requests to a Workers function, and perform business logic by parsing the data inside that webhook:

const Stripe = require("stripe");

const stripe = Stripe(STRIPE_API_KEY, {
  httpClient: Stripe.createFetchHttpClient()
});

export default {
  async fetch(request) {
    const body = await request.text()
    const sig = request.headers.get('stripe-signature')
    // `secret` is your endpoint's webhook signing secret from the Stripe dashboard
    const event = stripe.webhooks.constructEvent(body, sig, secret)

    // Handle the event
    switch (event.type) {
      case 'payment_intent.succeeded':
        const paymentIntent = event.data.object;
        // Then define and call a method to handle the successful payment intent.
        // handlePaymentIntentSucceeded(paymentIntent);
        break;
      case 'payment_method.attached':
        const paymentMethod = event.data.object;
        // Then define and call a method to handle the successful attachment of a PaymentMethod.
        // handlePaymentMethodAttached(paymentMethod);
        break;
      // ... handle other event types
      default:
        console.log(`Unhandled event type ${event.type}`);
    }

    // Return a response to acknowledge receipt of the event
    return new Response(JSON.stringify({ received: true }), {
      headers: { 'Content-type': 'application/json' }
    })
  }
}

We’re also announcing a new Workers template in partnership with Stripe. The template will help you get up and running with Stripe and Workers, using our best practices.

In less than five minutes, you can begin accepting payments for your next digital product or membership business. You can also handle and validate incoming webhooks at the edge, from a single codebase. This comes with no upfront cost, server provisioning, or any of the standard scaling headaches. Take your serverless functions, deploy them to Cloudflare’s edge, and start making money!

“We’re big fans of Cloudflare Workers over here at Stripe. Between the wild performance at the edge and fantastic serverless development experience, we’re excited to see what novel ways you all use Stripe to make amazing apps.”
Brian Holt, Stripe

We’re thrilled with this incredible update to Stripe’s JavaScript SDK. We’re also excited to see what you’ll build with native Stripe support in Workers. Join our Discord server and share what you’ve built in the #what-i-built channel — get your invite here.

JavaScript modules are now supported on Cloudflare Workers

Post Syndicated from Ashcon Partovi original https://blog.cloudflare.com/workers-javascript-modules/

We’re excited to announce that JavaScript modules are now supported on Cloudflare Workers. If you’ve ever taken a look at an example Worker written in JavaScript, you might recognize the following code snippet, which has been floating around the Internet the past few years:

addEventListener("fetch", (event) => {
  event.respondWith(new Response("Hello Worker!"));
});

The above syntax is known as the “Service Worker” API, and it was proposed to be standardized for use in web browsers. The idea is that you can attach a JavaScript file to a web page to modify its HTTP requests and responses, acting like a virtual endpoint. It was exactly what we needed for Workers, and it even integrated well with standard Web APIs like fetch() and caches.

Before introducing modules, we want to make it clear that we will continue to support the Service Worker API. No developer wants to get an email saying that you need to rewrite your code because an API or feature is being deprecated; and you won’t be getting one from us. If you’re interested in learning why we made this decision, you can read about our commitment to backwards-compatibility for Workers.

What are JavaScript modules?

JavaScript modules, also known as ECMAScript (abbreviated “ES”) modules, are the standard API for importing and exporting code in JavaScript. Modules were introduced by the “ES6” language specification for JavaScript, and have been implemented by most Web browsers, Node.js, Deno, and now Cloudflare Workers. Here’s an example to demonstrate how it works:

// filename: ./src/util.js
export function getDate(time = Date.now()) {
  return new Date(time).toISOString().split("T")[0]; // "YYYY-MM-DD"
}

The “export” keyword indicates that the “getDate” function should be exported from the current module. Then, you can use “import” from another module to use that function.

// filename: ./src/index.js
import { getDate } from "./util.js"

console.log("Today’s date:", getDate());

Those are the basics, but there’s a lot more you can do with modules. It allows you to organize, maintain, and re-use your code in an elegant way that just works. While we can’t go over every aspect of modules here, if you’d like to learn more we’d encourage you to read the MDN guide on modules, or a more technical deep-dive by Lin Clark.

How can I use modules in Workers?

You can export a default module, which will represent your Worker. Instead of using “addEventListener,” each event handler is defined as a function on that module. Today, we support “fetch” for HTTP and WebSocket requests and “scheduled” for cron triggers.

export default {
  async fetch(request, environment, context) {
    return new Response("I’m a module!");
  },
  async scheduled(controller, environment, context) {
    // await doATask();
  }
}

You may also notice some other differences, such as the parameters on each event handler. Instead of a single “Event” object, the parameters that you need the most are spread out on their own. The first parameter is specific to the event type: for “fetch” it’s the Request object, and for “scheduled” it’s a controller that contains the cron schedule.

The second parameter is an object that contains your environment variables (also known as “bindings”). Previously, each variable was inserted into the global scope of the Worker. While a simple solution, it was confusing to have variables magically appear in your code. Now, with an environment object, you can control which modules and libraries get access to your environment variables. This mechanism is more secure, as it can prevent a compromised or nosy third-party library from enumerating all your variables or secrets.

The third parameter is a context object, which allows you to register background tasks using waitUntil(). This is useful for tasks like logging or error reporting that should not block the execution of the event.

When you put that all together, you can import and export multiple modules, as well as use the new event handler syntax.

// filename: ./src/error.js
export async function logError(url, error) {
  await fetch(url, {
     method: "POST",
     body: error.stack
  })
}

// filename: ./src/worker.js
import { logError } from "./error.js"

export default {
  async fetch(request, environment, context) {
    try {
       return await fetch(request);
    } catch (error) {
       context.waitUntil(logError(environment.ERROR_URL, error));
       return new Response("Oops!", { status: 500 });
    }
  }
}

Let’s not forget about Durable Objects, which became generally available earlier this week! You can also export classes, which is how you define a Durable Object class. Here’s another example with a “Counter” Durable Object, that responds with an incrementing value.

// filename: ./src/counter.js
export class Counter {
  value = 0;
  fetch() {
    this.value++;
    return new Response(this.value.toString());
  }
}

// filename: ./src/worker.js
// We need to re-export the Durable Object class in the Worker module.
export { Counter } from "./counter.js"

export default {
  async fetch(request, environment) {
    const clientId = request.headers.get("cf-connecting-ip");
    const counterId = environment.Counter.idFromName(clientId);
    // Each IP address gets its own Counter.
    const counter = environment.Counter.get(counterId);
    return counter.fetch("https://counter.object/increment");
  }
}

Are there non-JavaScript modules?

Yes! While modules are primarily for JavaScript, we also support other module types, some of which are not yet standardized.

For instance, you can import WebAssembly as a module. Previously, with the Service Worker API, WebAssembly was included as a binding. We think that was a mistake, since WebAssembly should be represented as code and not an external resource. With modules, here’s the new way to import WebAssembly:

import module from "./lib/hello.wasm"

export default {
  async fetch(request) {
    const instance = await WebAssembly.instantiate(module);
    const result = instance.exports.hello();
    return new Response(result);
  }
}

While not supported today, we look forward to a future where WebAssembly and JavaScript modules can be more tightly integrated, as outlined in this proposal. The ergonomics improvement, demonstrated below, could go a long way toward making WebAssembly a first-class citizen of the JavaScript ecosystem.

import { hello } from "./lib/hello.wasm"

export default {
  async fetch(request) {
    return new Response(hello());
  }
}

We’ve also added support for text and binary modules, which allow you to import a String and ArrayBuffer, respectively. While not standardized, it allows you to easily import resources like an HTML file or an image.

<!-- filename: ./public/index.html -->
<!DOCTYPE html>
<html><body>
<p>Hello!</p>
</body></html>

import html from "../public/index.html"

export default {
  fetch(request) {
    if (request.url.endsWith("/index.html")) {
       return new Response(html, {
          headers: { "Content-Type": "text/html" }
       });
    }
    return fetch(request);
  }
}

How can I get started?

There are many ways to get started with modules.

First, you can try out modules in your browser using our playground (which doesn’t require an account) or by using the dashboard quick editor. Your browser will automatically detect when you’re using modules to allow you to seamlessly switch from the Service Worker API. For now, you’ll only be able to create one JavaScript module in the browser, though supporting multiple modules is something we’ll be improving soon.

If you’re feeling adventurous and want to start a new project using modules, you can try out the beta release of wrangler 2.0, the next-generation of the command-line interface (CLI) for Workers.

For existing projects, we still recommend using wrangler 1.0 (release 1.17 or later). To enable modules, you can adapt your “wrangler.toml” configuration to the following example:

name = "my-worker"
type = "javascript"
workers_dev = true

[build.upload]
format = "modules"
dir = "./src"
main = "./worker.js" # becomes "./src/worker.js"

[[build.upload.rules]]
type = "ESModule"
globs = "**/*.js"

# Uncomment if you have a build script.
# [build]
# command = "npm run build"

We’ve updated our documentation to provide more details about modules (and TypeScript as a bonus!), though some examples will still use the Service Worker API as we transition to showing both formats.

If you experience an issue or notice something strange with modules, please let us know, and we’ll take a look. Happy coding, and we’re excited to see what you build with modules!

JavaScript modules are now supported on Cloudflare Workers

How We Build Micro Frontends With Lattice

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/how-we-build-micro-frontends-with-lattice-22b8635f77ea

Written by Michael Possumato, Nick Tomlin, Jordan Andree, Andrew Shim, and Rahul Pilani.

As we continue to grow here at Netflix, the needs of Revenue and Growth Engineering are rapidly evolving, and our tools must evolve just as rapidly. The Revenue and Growth Tools (RGT) team decided to set off on a journey to build tools in an abstract manner, so that solutions would be readily available within our organization. We identified common design patterns and architectures scattered across various tools, all duplicating efforts in some way or another.

We needed to consolidate these tools in a way that scaled with the teams we served. It needed the agility of a micro frontend and the extensibility of a framework to empower our stakeholders to extend our tools. We would abstract the common parts, which anyone can then customize or extend to meet their specific business or technical requirements. The end result is Lattice: RGT’s pluggable framework for micro frontends.

A Different Approach to Our Tools

A UI composed of other dependencies is nothing new; it’s something all modern web applications do today. The traditional approach of bundling dependencies at build time lacks the flexibility we need to empower our stakeholders. We want external dependencies to be resolved on-demand from any number of sources, from another application to an engineer’s laptop.

This led us to the following high level objectives:

  • Low Friction Adoption: Encourage reuse of existing front end code and avoid creating new packages that encapsulate UI functionality. Applications can be difficult to manage when functionality must be shared across packages. We would leverage an approach that enabled applications to extend their core functionality using common, and familiar, React paradigms.
  • Weak Dependencies: Host applications could reference modules over HTTPS to a remote bundle hosted internally within Netflix. These bundles could be owned by teams outside of RGT and built with already-adopted standards such as Webpack Module Federation or native JavaScript Modules.
  • Highly Aligned, Loosely Coupled: fully align with the standard frameworks and libraries used within Netflix. Plugins should be focused on delivering their core functionality without unnecessary boilerplate and have the freedom to implement without cumbersome API wrappers.
  • Metadata Driven: Plugin modules are defined from a configuration which could be injected at any point in the application lifecycle. The framework must be flexible enough to register, and unregister, plugins such that the extensions only apply when necessary.
  • Rapid Development: Reduce the development cycle by avoiding unnecessary builds and deployments. Plugins would be developed in a manner in which all of the context is available to them ahead of time via TypeScript declarations. By designing to rigid interfaces defined by a host application, both the plugin and host can be developed in parallel.

A Theoretical Example

Example Developer Dashboard Application with Embedded Lattice Plugins

Let’s take the above example — it renders and controls its own header and content areas to expose specific functionality to users. Upon release, we receive feedback that it would be nice if we could include information presented from other tools within this application. Using our new framework, Lattice, we are able to embed the existing functionality from the other applications.

A Lattice Plugin Host (which we’ll dive into later) allows us to extend the existing application by referencing two external plugins, Workflows and Spinnaker. In our example, we need to define two areas that can be extended — the application content for portal components and configurable routing.

The sequence of events in order to accomplish the above rendering process would be handled by three components — our new framework Lattice and the two plugins:

Dispatch Cycle within Lattice

First, Lattice will load both plugins asynchronously.

Next, the framework will dispatch events as they flow through the application.

In our example, Workflows will register its routes and Spinnaker will add its overlays.

An Implementation with React

In order to accomplish the above scenario, the Host Application needs to include the Lattice library and add a new PluginHost with a configuration referencing the external plugins. This host requires information about the specific application and the configuration indicating which plugins to load:

Enhancing a React Application with a Lattice Plugin Host

We’ve mocked this implementation in the example above with a useFetchPluginConfiguration hook to retrieve the metadata from an external service. Owners can then choose to add or remove plugins dynamically, outside of the application source code.
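The snippet itself is embedded as an image in the original post; the sketch below shows the general shape under stated assumptions: PluginHost and useFetchPluginConfiguration are the names used above, while the import path, props, and config endpoint are invented for illustration.

// Hypothetical host setup; the import path and prop names are illustrative,
// not Lattice's actual API.
import React from "react";
import { PluginHost, useFetchPluginConfiguration } from "@netflix/lattice";
import { AppContent } from "./AppContent";

export function App() {
  // Plugin metadata comes from an external service, so owners can add or
  // remove plugins without touching application source code.
  const plugins = useFetchPluginConfiguration("https://config.example.net/plugins");

  return (
    <PluginHost application="developer-dashboard" plugins={plugins}>
      <AppContent />
    </PluginHost>
  );
}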

Allowing plugins access to the routing can be done using hooks defined by the Lattice framework. The usePluggableState hook will retrieve the default application routes and pass them through the Lattice framework. If any plugin responds to this AppRoutes identifier, they can choose to inject their specific routes:

Extending Existing Application State with Lattice Hooks
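The embedded snippet is also an image; conceptually, with usePluggableState and the AppRoutes identifier taken from the text, and everything else (including the react-router components) assumed, it might look like:

function AppRoutes() {
  // Hypothetical: the host seeds its default routes; any plugin responding
  // to the "AppRoutes" identifier can inject additional ones.
  const defaultRoutes = [{ path: "/", component: Dashboard }];
  const [routes] = usePluggableState("AppRoutes", defaultRoutes);

  return (
    <Switch>
      {routes.map((r) => (
        <Route key={r.path} path={r.path} component={r.component} />
      ))}
    </Switch>
  );
}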

Plugins can inject any React element into the page with the <Pluggable /> component as illustrated below. This will allow plugins to render within this AppContent area:

Rendering Custom Children with Lattice Pluggable
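As a sketch (the id matches the AppContent area described earlier; the default child component is invented):

function AppContent() {
  // Hypothetical: expose a mount point that plugins can render into or
  // override, with the children serving as the default content.
  return (
    <Pluggable id="AppContent">
      <DefaultPortalContent />
    </Pluggable>
  );
}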

The final example application snippet combines all of the pieces above.

Under the Hood

Lattice is a tiny framework that provides an abstraction layer for React web applications to leverage.

Using Lattice, developers can focus on their core product, and simply wrap areas of their application that are customizable by external plugins. Developers can also extend components to use external state by using Lattice hooks.

Lattice Plugin Modules are JavaScript functions implemented by remote applications. These functions act as the “glue” between the host application and the remote component(s) being shared. Modules declare which components within their application should be exposed and how they should be rendered based on information the host provides.

A Lattice Pluggable Component allows a host application to expose a mount point through a standard React component that plugins can manipulate or override with their own content.

Lattice Custom Hooks are used to manipulate state using a state reducer pattern. These hooks allow host applications to maintain their own initial state, and modify accordingly, while also allowing plugins the opportunity to inject their own data.

Lattice Plugins

Lattice Functionality within a Host Application

The core of Lattice provides the ability to asynchronously load remote modules via Webpack Module Federation, Native ES Modules, or a custom implementation defined outside of the framework. The host application provides Lattice with basic application context and a configuration which defines the remote plugin modules to load. Once loaded, references to these plugins are stored internally within a React Context instance.

Exposing Functionality to Lattice as a Federated Module

Plugin modules can then provide new functionality, or change existing functionality, to the host application. Standard identifiers are used that all Lattice-enabled applications should implement to allow plugins to universally work across different applications. Most extensions will choose to extend existing application functionality, which will not be universal, and requires knowledge of the host’s design.

Lattice requires constant identifier values (aka “magic strings”) to understand what is being rendered. The Lattice Plugin Host will dispatch this identifier through all of the plugins which have been registered and loaded. Plugin responses are composed together, and the final returned value is what gets rendered in the component tree. Through this model, plugins can decide to extend, change, or simply ignore the event. Think of this process as an approach similar to that of Redux or Express Middleware functions.
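As a rough mental model of that middleware-style composition (not Lattice’s actual implementation), each loaded plugin gets a chance to transform the value associated with an identifier:

// Simplified model: a plugin is an object whose keys are identifiers.
function dispatch(plugins, identifier, initialValue, context) {
  return plugins.reduce((value, plugin) => {
    const handler = plugin[identifier];
    // A plugin may extend or replace the value, or ignore the event entirely.
    return handler ? handler(value, context) : value;
  }, initialValue);
}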

Lattice can also be used to extend existing application functionality. In order to accomplish this, Plugins must be aware of the host identifiers and data shapes used in the host application lifecycle. While this might sound like an impossible task to maintain, we encourage host applications to publish a TypeScript declarations project which is shared between the host and plugins. Think of us as having a DefinitelyTyped repository for all of the Netflix internal tools that embrace extending via Lattice.

Using this approach, we are able to provide developers with a highly aligned, loosely coupled development environment shared between host applications and plugins. Plugins can be developed in a silo, simply adhering to the interface which has been declared.

The Possibilities are Endless

While our original approach was to extend core functionality within an application, we have found that we can leverage Lattice in other ways. Instead of writing a simple if statement, we take a step back, consider which domain in our organization should be responsible for that logic, and move the logic into the respective plugin.

We have also found that we can easily model more fine-grained areas within an application. For example, we can render individual form components using Lattice identifiers and have plugins be responsible for the specific UI elements. This empowers us to build these generic tools backed by metadata models and a default out-of-box experience which others can choose to override.

Most importantly, we are able to easily, and quickly, respond to conflicting requirements by simply implementing different plugins.

What’s Next?

We are only getting started with Lattice and are currently gauging interest internally from other teams. By dogfooding our approach within RGT, we can work out the kinks, squash some bugs, and build a robust process for building micro frontends with Lattice. The developer experience is crucial for Lattice to be successful. Empowering developers to understand the lifecycle of Lattice events within an application, verify functionality prior to deployments, handle versioning, and develop end-to-end test suites, along with general best practices, are all critical to our success.


How We Build Micro Frontends With Lattice was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Pin, Unpin, and why Rust needs them

Post Syndicated from Adam Chalmers original https://blog.cloudflare.com/pin-and-unpin-in-rust/

Pin, Unpin, and why Rust needs them

Pin, Unpin, and why Rust needs them

Using async Rust libraries is usually easy. It’s just like using normal Rust code, with a little async or .await here and there. But writing your own async libraries can be hard. The first time I tried this, I got really confused by arcane, esoteric syntax like T: ?Unpin and Pin<&mut Self>. I had never seen these types before, and I didn’t understand what they were doing. Now that I understand them, I’ve written the explainer I wish I could have read back then. In this post, we’re gonna learn

  • What Futures are
  • What self-referential types are
  • Why they were unsafe
  • How Pin/Unpin made them safe
  • Using Pin/Unpin to write tricky nested futures

What are Futures?

A few years ago, I needed to write some code which would take some async function, run it and collect some metrics about it, e.g. how long it took to resolve. I wanted to write a type TimedWrapper that would work like this:

// Some async function, e.g. polling a URL with reqwest (https://docs.rs/reqwest)
// Remember, Rust futures do nothing until you .await them, so this isn't
// actually making an HTTP request yet.
let async_fn = reqwest::get("http://adamchalmers.com");

// Wrap the async function in my hypothetical wrapper.
let timed_async_fn = TimedWrapper::new(async_fn);

// Call the async function, which will send an HTTP request and time it.
let (resp, time) = timed_async_fn.await;
println!("Got a HTTP {} in {}ms", resp.unwrap().status(), time.as_millis())

I like this interface, it’s simple and should be easy for the other programmers on my team to use. OK, let’s implement it! I know that, under the hood, Rust’s async functions are just regular functions that return a Future. The Future trait is pretty simple. It just means a type which:

  • Can be polled
  • When it’s polled, it might return “Pending” or “Ready”
  • If it’s pending, you should poll it again later
  • If it’s ready, it responds with a value. We call this “resolving”.

Here’s a really easy example of implementing a Future. Let’s make a Future that returns a random u16.

use std::{future::Future, pin::Pin, task::{Context, Poll}};

/// A future which returns a random number when it resolves.
#[derive(Default)]
struct RandFuture;

impl Future for RandFuture {
	// Every future has to specify what type of value it returns when it resolves.
	// This particular future will return a u16.
	type Output = u16;

	// The `Future` trait has only one method, named "poll".
	fn poll(self: Pin<&mut Self>, _cx: &mut Context) -> Poll<Self::Output> {
		Poll::Ready(rand::random())
	}
}

Not too hard! I think we’re ready to implement TimedWrapper.

Trying and failing to use nested Futures

Let’s start by defining the type.

pub struct TimedWrapper<Fut: Future> {
	start: Option<Instant>,
	future: Fut,
}

OK, so a TimedWrapper is generic over a type Fut, which must be a Future. And it will store a future of that type as a field. It’ll also have a start field which will record when it was first polled. Let’s write a constructor:

impl<Fut: Future> TimedWrapper<Fut> {
	pub fn new(future: Fut) -> Self {
		Self { future, start: None }
	}
}

Nothing too complicated here. The new function takes a future and wraps it in the TimedWrapper. Of course, we have to set start to None, because it hasn’t been polled yet. So, let’s implement the poll method, which is the only thing we need to implement Future and make it .awaitable.

impl<Fut: Future> Future for TimedWrapper<Fut> {
	// This future will output a pair of values:
	// 1. The value from the inner future
	// 2. How long it took for the inner future to resolve
	type Output = (Fut::Output, Duration);

	fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
		// Call the inner poll, measuring how long it took.
		let start = self.start.get_or_insert_with(Instant::now);
		let inner_poll = self.future.poll(cx);
		let elapsed = self.elapsed();

		match inner_poll {
			// The inner future needs more time, so this future needs more time too
			Poll::Pending => Poll::Pending,
			// Success!
			Poll::Ready(output) => Poll::Ready((output, elapsed)),
		}
	}
}

OK, that wasn’t too hard. There’s just one problem: this doesn’t work.

Pin, Unpin, and why Rust needs them

So, the Rust compiler reports an error on self.future.poll(cx), which is “no method named poll found for type parameter Fut in the current scope”. This is confusing, because we know Fut is a Future, so surely it has a poll method? OK, but Rust continues: Fut doesn’t have a poll method, but Pin<&mut Fut> has one. What is this weird type?

Well, we know that methods have a “receiver”, which is the way they access self. The receiver might be self, &self or &mut self, which mean “take ownership of self,” “borrow self,” and “mutably borrow self” respectively. So this is just a new, unfamiliar kind of receiver. Rust is complaining because we have Fut and we really need a Pin<&mut Fut>. At this point I have two questions:

  1. What is Pin?
  2. If I have a T value, how do I get a Pin<&mut T>?

The rest of this post is going to be answering those questions. I’ll explain some problems in Rust that could lead to unsafe code, and why Pin safely solves them.

Self-reference is unsafe

Pin exists to solve a very specific problem: self-referential datatypes, i.e. data structures which have pointers into themselves. For example, a binary search tree might have self-referential pointers, which point to other nodes in the same struct.

Self-referential types can be really useful, but they’re also hard to make memory-safe. To see why, let’s use this example type with two fields, an i32 called val and a pointer to an i32 called pointer.

Pin, Unpin, and why Rust needs them

So far, everything is OK. The pointer field points to the val field in memory address A, which contains a valid i32. All the pointers are valid, i.e. they point to memory that does indeed encode a value of the right type (in this case, an i32). But the Rust compiler often moves values around in memory. For example, if we pass this struct into another function, it might get moved to a different memory address. Or we might Box it and put it on the heap. Or if this struct was in a Vec<MyStruct>, and we pushed more values in, the Vec might outgrow its capacity and need to move its elements into a new, larger buffer.

Pin, Unpin, and why Rust needs them

When we move it, the struct’s fields change their address, but not their value. So the pointer field is still pointing at address A, but address A now doesn’t have a valid i32. The data that was there was moved to address B, and some other value might have been written there instead! So now the pointer is invalid. This is bad — at best, invalid pointers cause crashes, at worst they cause hackable vulnerabilities. We only want to allow memory-unsafe behaviour in unsafe blocks, and we should be very careful to document this type and tell users to update the pointers after moves.

Unpin and !Unpin

To recap, all Rust types fall into two categories.

  1. Types that are safe to move around in memory. This is the default, the norm. For example, this includes primitives like numbers, strings, bools, as well as structs or enums entirely made of them. Most types fall into this category!
  2. Self-referential types, which are not safe to move around in memory. These are pretty rare. An example is the intrusive linked list inside some Tokio internals. Another example is most types which implement Future and also borrow data, for reasons explained in the Rust async book.

Types in category (1) are totally safe to move around in memory. You won’t invalidate any pointers by moving them around. But if you move a type in (2), then you invalidate pointers and can get undefined behaviour, as we saw before. In earlier versions of Rust, you had to be really careful using these types to not move them, or if you moved them, to use unsafe and update all the pointers. But since Rust 1.33, the compiler can automatically figure out which category any type is in, and make sure you only use it safely.

Any type in (1) implements a special auto trait called Unpin. Weird name, but its meaning will become clear soon. Again, most “normal” types implement Unpin, and because it’s an auto trait (like Send or Sync or Sized), you don’t have to worry about implementing it yourself. If you’re unsure if a type can be safely moved, just check it on docs.rs and see if it impls Unpin!

Types in (2) are creatively named !Unpin (the ! in a trait means “does not implement”). To use these types safely, we can’t use regular pointers for self-reference. Instead, we use special pointers that “pin” their values into place, ensuring they can’t be moved. This is exactly what the Pin type does.

Pin, Unpin, and why Rust needs them

Pin wraps a pointer and stops its value from moving. The only exception is if the value impls Unpin — then we know it’s safe to move. Voila! Now we can write self-referential structs safely! This is really important, because as discussed above, many Futures are self-referential, and we need them for async/await.

Using Pin

So now we understand why Pin exists, and why our Future poll method takes a Pin<&mut Self> instead of a regular &mut self. So let’s get back to the problem we had before: I need a pinned reference to the inner future. More generally: given a pinned struct, how do we access its fields?

The solution is to write helper functions which give you references to the fields. These references might be normal Rust references like &mut, or they might also be pinned. You can choose whichever one you need. This is called projection: if you have a pinned struct, you can write a projection method that gives you access to all its fields.

Projecting is really just getting data into and out of Pins. For example, we get the start: Option<Instant> field from the Pin<&mut Self>, and we need to put the future: Fut into a Pin so we can call its poll method. If you read the Pin methods you’ll see this is always safe if it points to an Unpin value, but requires unsafe otherwise.

// Putting data into Pin
pub        fn new          <P: Deref<Target: Unpin>>(pointer: P) -> Pin<P>;
pub unsafe fn new_unchecked<P>                     (pointer: P) -> Pin<P>;

// Getting data from Pin
pub        fn into_inner          <P: Deref<Target: Unpin>>(pin: Pin<P>) -> P;
pub unsafe fn into_inner_unchecked<P>                      (pin: Pin<P>) -> P;

I know unsafe can be a bit scary, but it’s OK to write unsafe code! I think of unsafe as the compiler saying “hey, I can’t tell if this code follows the rules here, so I’m going to rely on you to check for me.” The Rust compiler does so much work for us, it’s only fair that we do some of the work every now and then. If you want to learn how to write your own projection methods, I can highly recommend this fasterthanli.me blog post on the topic. But we’re going to take a little shortcut.

Using pin-project instead

So, OK, look, it’s time for a confession: I don’t like using unsafe. I know I just explained why it’s OK, but still, given the option, I would rather not.

I didn’t start writing Rust because I wanted to carefully think about the consequences of my actions, damnit, I just want to go fast and not break things. Luckily, someone sympathized with me and made a crate which generates totally safe projections! It’s called pin-project and it’s awesome. All we need to do is change our definition:

#[pin_project::pin_project] // This generates a `project` method
pub struct TimedWrapper<Fut: Future> {
	// For each field, we need to choose whether `project` returns an
	// unpinned (&mut T) or pinned (Pin<&mut T>) reference to the field.
	// By default, it assumes unpinned:
	start: Option<Instant>,
	// Opt into pinned references with this attribute:
	#[pin]
	future: Fut,
}

For each field, you have to choose whether its projection should be pinned or not. By default, you should use a normal reference, just because they’re easier and simpler. But if you know you need a pinned reference — for example, because you want to call .poll(), whose receiver is Pin<&mut Self> — then you can do that with #[pin].

Now we can finally poll the inner future!

fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
	// This returns a type with all the same fields, with all the same types,
	// except that the fields defined with #[pin] will be pinned.
	let mut this = self.project();
	
	// Call the inner poll, measuring how long it took.
	let start = this.start.get_or_insert_with(Instant::now);
	let inner_poll = this.future.as_mut().poll(cx);
	let elapsed = start.elapsed();

	match inner_poll {
		// The inner future needs more time, so this future needs more time too
		Poll::Pending => Poll::Pending,
		// Success!
		Poll::Ready(output) => Poll::Ready((output, elapsed)),
	}
}

Finally, our goal is complete — and we did it all without any unsafe code.

Summary

If a Rust type has self-referential pointers, it can’t be moved safely. After all, moving doesn’t update the pointers, so they’ll still be pointing at the old memory address, so they’re now invalid. Rust can automatically tell which types are safe to move (and will auto impl the Unpin trait for them). If you have a Pin-ned pointer to some data, Rust can guarantee that nothing unsafe will happen (if it’s safe to move, you can move it, if it’s unsafe to move, then you can’t). This is important because many Future types are self-referential, so we need Pin to safely poll a Future. You probably won’t have to poll a future yourself (just use async/await instead), but if you do, use the pin-project crate to simplify things.

I hope this helped — if you have any questions, please ask me on Twitter. And if you want to get paid to talk to me about Rust and networking protocols, my team at Cloudflare is hiring, so be sure to visit careers.cloudflare.com.

Micro-frontend Architectures on AWS

Post Syndicated from Bryant Bost original https://aws.amazon.com/blogs/architecture/micro-frontend-architectures-on-aws/

A microservice architecture is characterized by independent services that are focused on a specific business function and maintained by small, self-contained teams. Microservice architectures are used frequently for web applications developed on AWS, and for good reason. They offer many well-known benefits such as development agility, technological freedom, targeted deployments, and more. Despite the popularity of microservices, many frontend applications are still built in a monolithic style. For example, they have one large code base that interacts with all backend microservices, and is maintained by a large team of developers.

Monolith Frontend

Figure 1. Microservice backend with monolith frontend

What is a Micro-frontend?

The micro-frontend architecture introduces microservice development principles to frontend applications. In a micro-frontend architecture, development teams independently build and deploy “child” frontend applications. These applications are combined by a “parent” frontend application that acts as a container to retrieve, display, and integrate the various child applications. In this parent/child model, the user interacts with what appears to be a single application. In reality, they are interacting with several independent applications, published by different teams.

Micro Frontend

Figure 2. Microservice backend with micro-frontend

Micro-frontend Benefits

Compared to a monolith frontend, a micro-frontend offers the following benefits:

  • Independent artifacts: A core tenet of microservice development is that artifacts can be deployed independently, and this remains true for micro-frontends. In a micro-frontend architecture, teams should be able to independently deploy their frontend applications with minimal impact to other services. Those changes will be reflected by the parent application.
  • Autonomous teams: Each team is the expert in its own domain. For example, the billing service team members have specialized knowledge. This includes the data models, the business requirements, the API calls, and user interactions associated with the billing service. This knowledge allows the team to develop the billing frontend faster than a larger, less specialized team.
  • Flexible technology choices: Autonomy allows each team to make technology choices that are independent from other teams. For instance, the billing service team could develop their micro-frontend using Vue.js and the profile service team could develop their frontend using Angular.
  • Scalable development: Micro-frontend development teams are smaller and are able to operate without disrupting other teams. This allows us to quickly scale development by spinning up new teams to deliver additional frontend functionality via child applications.
  • Easier maintenance: Keeping frontend repositories small and specialized allows them to be more easily understood, and this simplifies long-term maintenance and testing. For instance, if you want to change an interaction on a monolith frontend, you must isolate the location and dependencies of the feature within the context of a large codebase. This type of operation is greatly simplified when dealing with the smaller codebases associated with micro-frontends.

Micro-frontend Challenges

Conversely, a micro-frontend presents the following challenges:

  • Parent/child integration: A micro-frontend introduces the task of ensuring the parent application displays the child application with the same consistency and performance expected from a monolith application. This point is discussed further in the next section.
  • Operational overhead: Instead of managing a single frontend application, a micro-frontend application involves creating and managing separate infrastructure for all teams.
  • Consistent user experience: In order to maintain a consistent user experience, the child applications must use the same UI components, CSS libraries, interactions, error handling, and more. Maintaining consistency in the user experience can be difficult for child applications that are at different stages in the development lifecycle.

Building Micro-frontends

The most difficult challenge with the micro-frontend architecture pattern is integrating child applications with the parent application. Prioritizing the user experience is critical for any frontend application. In the context of micro-frontends, this means ensuring a user can seamlessly navigate from one child application to another inside the parent application. We want to avoid disruptive behavior such as page refreshes or multiple logins. At its most basic definition, parent/child integration involves the parent application dynamically retrieving and rendering child applications when the parent app is loaded. Rendering the child application depends on how the child application was built, and this can be done in a number of ways. Two of the most popular methods of parent/child integration are:

  1. Building each child application as a web component.
  2. Importing each child application as an independent module. These modules either declare a function to render themselves or are dynamically imported by the parent application (such as with module federation).

Registering child apps as web components:

<html>
    <head>
        <script src="https://shipping.example.com/shipping-service.js"></script>
        <script src="https://profile.example.com/profile-service.js"></script>
        <script src="https://billing.example.com/billing-service.js"></script>
        <title>Parent Application</title>
    </head>
    <body>
        <shipping-service />
        <profile-service />
        <billing-service />
    </body>
</html>

Registering child apps as modules:

<html>
    <head>
        <script src="https://shipping.example.com/shipping-service.js"></script>
        <script src="https://profile.example.com/profile-service.js"></script>
        <script src="https://billing.example.com/billing-service.js"></script>
        <title>Parent Application</title>
    </head>
    <body>
    </body>
    <script>
        // Load and render the child applications from their JS bundles.
    </script>
</html>
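The loader script is left as a comment in the snippet above; a minimal sketch of what it could do, assuming each child bundle registers a global render function (the function names here are hypothetical):

// Hypothetical loader: mount each child application into its own container.
["renderShippingService", "renderProfileService", "renderBillingService"]
  .forEach((renderFn) => {
    const container = document.createElement("div");
    document.body.appendChild(container);
    window[renderFn](container);
  });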

The following diagram shows an example micro-frontend architecture built on AWS.

AWS Micro Frontend

Figure 3. Micro-frontend architecture on AWS

In this example, each service team is running a separate, identical stack to build their application. They use the AWS Developer Tools and deploy the application to Amazon Simple Storage Service (S3) with Amazon CloudFront. The CI/CD pipelines use shared components such as CSS libraries, API wrappers, or custom modules stored in AWS CodeArtifact. This helps drive consistency across parent and child applications.

When you retrieve the parent application, it should prompt you to log in to an identity provider and retrieve JWTs. In this example, the identity provider is an Amazon Cognito User Pool. After a successful login, the parent application retrieves the child applications from CloudFront and renders them inside the parent application. Alternatively, the parent application can elect to render the child applications on demand, when you navigate to a specific route. The child applications should not require you to log in again to the Amazon Cognito user pool. They should be configured to use the JWT obtained by the parent app or silently retrieve a new JWT from Amazon Cognito.
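With the web component approach, for example, the parent could hand the JWT down to each child as a property. A sketch, assuming an amazon-cognito-identity-js session object and a hypothetical token property on the child elements:

// Hypothetical: after a successful login, share the parent's JWT with
// each child application so they don't have to log in again.
const jwt = session.getIdToken().getJwtToken();
document
  .querySelectorAll("shipping-service, profile-service, billing-service")
  .forEach((child) => { child.token = jwt; });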

Conclusion

Micro-frontend architectures introduce many of the familiar benefits of microservice development to frontend applications. A micro-frontend architecture also simplifies the process of building complex frontend applications by allowing you to manage small, independent components.

Save 2 clicks, test data preprocessing

Post Syndicated from Aigars Kadiķis original https://blog.zabbix.com/save-2-clicks-test-data-preprocessing/13249/

This topic is related to template development from scratch, bulk data input, and a lot of dependable items having different preprocessing steps each.

If these keywords resonate with you, keep reading.

The story starts back in the day when the “Test now” button was introduced in the item preprocessing section. With it, we can simulate the entire preprocessing stack. A very cool feature to have.

Nevertheless, we tend to copy the data input over and over again:

While this is fine for small projects with simple preprocessing steps that match our knowledge, it is not so fine when we have the ambition to solve the impossible: figuring out the data preprocessing rule(s) that suit our needs.

For the template development process, the solution is to skip the data input and inject a static value in the very first preprocessing step. Let me introduce the concept.

JavaScript preprocessing step 1:

return 'this is input text';

JavaScript preprocessing step 2:

return value.replace("text","data");

Now we have a static input, so there is no need to spend time “clicking in” the input data.

Sometimes the input is not just one line but multiple lines, with tabs, spaces, double quotes, single quotes, and special characters. To preserve all of these, we must get our hands dirty with the base64 format.

To prepare the input data as a base64 string on Windows systems, Notepad++ makes it easy: select all text, then choose “Plugin commands” => “Base64 Encode” (this functionality is not included in the lite version of Notepad++):

After that, we need to copy all the content to the clipboard:

Create the first JavaScript preprocessing step with the content from the clipboard. Here is the same example:

return 'PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTE2Ij8+DQo8am9ibG9nPg0KICA8am9iX2xvZ192ZXJzaW9uIHZlcnNpb249IjIuMCIvPg0KICA8aGVhZGVyPg0KICAgIDxzdGFydF90aW1lPkpvYiBzdGFydGVkOiBNb25kYXksIEF1Z3VzdCAxMCwgMjAyMCBhdCAxOjAwOjA1IFBNPC9zdGFydF90aW1lPg0KICA8L2hlYWRlcj4NCiAgPGZvb3Rlcj4NCiAgICA8ZW5kX3RpbWU+Sm9iIGVuZGVkOiBNb25kYXksIEF1Z3VzdCAxMCwgMjAyMCBhdCAzOjE3OjUwIFBNPC9lbmRfdGltZT4NCiAgICA8T3BlcmF0aW9uRXJyb3JzIFR5cGU9ImpvYmZ0cl9qb2Jjb21wbF9zaHV0ZG93biI+Sm9iIGNvbXBsZXRpb24gc3RhdHVzOiBDYW5jZWxlZCBieSBzZXJ2aWNlIHNodXRkb3duPC9PcGVyYXRpb25FcnJvcnM+DQogICAgPGNvbXBsZXRlU3RhdHVzPjE8L2NvbXBsZXRlU3RhdHVzPg0KICAgIDxhYm9ydFVzZXJOYW1lPlRoZSBqb2Igd2FzIGNhbmNlbGVkIGJlY2F1c2UgdGhlIHJlc3BvbnNlIHRvIGEgbWVkaWEgcmVxdWVzdCBhbGVydCB3YXMgQ2FuY2VsLCBvciBiZWNhdXNlIHRoZSBhbGVydCB3YXMgY29uZmlndXJlZCB0byBhdXRvbWF0aWNhbGx5IHJlc3BvbmQgd2l0aCBDYW5jZWwsIG9yIGJlY2F1c2UgdGhlIEJhY2t1cCBFeGVjIEpvYiBFbmdpbmUgc2VydmljZSB3YXMgc3RvcHBlZC48L2Fib3J0VXNlck5hbWU+DQogIDwvZm9vdGVyPg0KPC9qb2Jsb2c+DQo=';

The next step must decode it. Copy the code below 1:1 and configure it as the second preprocessing step:

var k = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
// Decode a base64 string into text. Zabbix preprocessing runs in an embedded
// JavaScript engine with no built-in atob(), so we decode by hand.
function d(e) {
    var t, n, o, r, a = "",
        i = "",
        c = "",
        l = 0;
    // Strip anything outside the base64 alphabet, then turn every four
    // input characters into up to three output characters.
    for (e = e.replace(/[^A-Za-z0-9+/=]/g, ""); t = k.indexOf(e.charAt(l++)) << 2 | (o = k.indexOf(e.charAt(l++))) >> 4, n = (15 & o) << 4 | (r = k.indexOf(e.charAt(l++))) >> 2, i = (3 & r) << 6 | (c = k.indexOf(e.charAt(l++))), a += String.fromCharCode(t), 64 != r && (a += String.fromCharCode(n)), 64 != c && (a += String.fromCharCode(i)), t = n = i = "", o = r = c = "", l < e.length;);
    return unescape(a)
}
return d(value);

This is how it looks:

Go to the testing section and ensure the data in Zabbix is the same as it was in Notepad++:

The data has been successfully decoded. Multiple lines, quite original stuff. The tabs are not visible to the naked eye, but they are there, I promise!

Now we can “play” with the next preprocessing steps and try out different things:
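For example, a third JavaScript step could extract the job completion status from the decoded XML (the element name comes from the decoded sample above; the fallback value is arbitrary):

// Pull the numeric completion status out of the decoded job log.
var m = value.match(/<completeStatus>(\d+)<\/completeStatus>/);
return m ? m[1] : 'unknown';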

Once one preprocessing pipeline has been figured out, just clone the item and start developing the next one. Of course, if our ambition succeeds, we will need to spend 5 minutes going through all the items, removing the first 2 steps, and linking each item to the master key 😉

Ok. That is it for today. Bye.

By the way, on a Linux system, producing the base64 string only requires:

  1. A command whose output interests us
  2. Piping it to ‘base64 -w0’
systemctl list-unit-files --type=service | base64 -w0

Publishing private npm packages with AWS CodeArtifact

Post Syndicated from Ryan Sonshine original https://aws.amazon.com/blogs/devops/publishing-private-npm-packages-aws-codeartifact/

This post demonstrates how to create, publish, and download private npm packages using AWS CodeArtifact, allowing you to share code across your organization without exposing your packages to the public.

The ability to control CodeArtifact repository access using AWS Identity and Access Management (IAM) removes the need to manage additional credentials for a private npm repository when developers already have IAM roles configured.

You can use private npm packages for a variety of use cases, such as:

  • Reducing code duplication
  • Configuration such as code linting and styling
  • CLI tools for internal processes

This post shows how to easily create a sample project in which we publish an npm package and install the package from CodeArtifact. For more information about pipeline integration, see AWS CodeArtifact and your package management flow – Best Practices for Integration.

Solution overview

The following diagram illustrates this solution.

Diagram showing npm package publish and install with CodeArtifact

In this post, you create a private scoped npm package containing a sample function that can be used across your organization. You create a second project to download the npm package. You also learn how to structure your npm package to make logging in to CodeArtifact automatic when you want to build or publish the package.

The code covered in this post is available on GitHub:

Prerequisites

Before you begin, you need to complete the following:

  1. Create an AWS account.
  2. Install the AWS Command Line Interface (AWS CLI). CodeArtifact is supported in these CLI versions:
    1. 1.18.83 or later: install the AWS CLI version 1
    2. 2.0.54 or later: install the AWS CLI version 2
  3. Create a CodeArtifact repository.
  4. Add required IAM permissions for CodeArtifact.

Creating your npm package

You can create your npm package in three easy steps: set up the project, create your npm script for authenticating with CodeArtifact, and publish the package.

Setting up your project

Create a directory for your new npm package. We name this directory my-package because it serves as the name of the package. We use an npm scope for this package, where @myorg represents the scope all of our organization’s packages are published under. This helps us distinguish our internal private package from external packages. See the following code:

npm init --scope=@myorg -y

{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

The package.json file specifies that the main file of the package is called index.js. Next, we create that file and add our package function to it:

module.exports.helloWorld = function() {
  console.log('Hello world!');
}

Creating an npm script

To create your npm script, complete the following steps:

  1. On the CodeArtifact console, choose the repository you created as part of the prerequisites.

If you haven’t created a repository, create one before proceeding.

CodeArtifact repository details console

  2. Select your CodeArtifact repository and choose Details to view the additional details for your repository.

We use two items from this page:

  • Repository name (my-repo)
  • Domain (my-domain)
  3. Create a script named co:login in our package.json. The package.json contains the following code:
{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Running this script updates your npm configuration to use your CodeArtifact repository and sets your authentication token, which expires after 12 hours.

  4. To test our new script, enter the following command:

npm run co:login

The following code is the output:

> aws codeartifact login --tool npm --repository my-repo --domain my-domain
Successfully configured npm to use AWS CodeArtifact repository https://my-domain-<ACCOUNT ID>.d.codeartifact.us-east-1.amazonaws.com/npm/my-repo/
Login expires in 12 hours at 2020-09-04 02:16:17-04:00
  5. Add a prepare script to our package.json to run our login command:
{
  "name": "@myorg/my-package",
  "version": "1.0.0",
  "description": "A sample private scoped npm package",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

This configures our project to automatically authenticate and generate an access token anytime npm install or npm publish run on the project.
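Under the hood, the login command points npm at the CodeArtifact endpoint and stores a temporary token; conceptually, the resulting npm configuration looks something like the following (the exact entries are managed for you by the AWS CLI):

registry=https://my-domain-<ACCOUNT ID>.d.codeartifact.us-east-1.amazonaws.com/npm/my-repo/
//my-domain-<ACCOUNT ID>.d.codeartifact.us-east-1.amazonaws.com/npm/my-repo/:_authToken=<TEMPORARY TOKEN>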

If you see an error containing Invalid choice, valid choices are:, you need to update the AWS CLI according to the versions listed in the prerequisites of this post.

Publishing your package

To publish our new package for the first time, run npm publish.

The following screenshot shows the output.

Terminal showing npm publish output

If we navigate to our CodeArtifact repository on the CodeArtifact console, we now see our new private npm package ready to be downloaded.

CodeArtifact console showing published npm package

Installing your private npm package

To install your private npm package, you first set up the project and add the CodeArtifact configs. After you install your package, it’s ready to use.

Setting up your project

Create a directory for a new application and name it my-app. This is a sample project to download our private npm package published in the previous step. You can apply this pattern to all repositories you intend on installing your organization’s npm packages in.

npm init -y

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "A sample application consuming a private scoped npm package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Adding CodeArtifact configs

Copy the npm scripts prepare and co:login created earlier to your new project:

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "A sample application consuming a private scoped npm package",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  }
}

Installing your new private npm package

Enter the following command:

npm install @myorg/my-package

Your package.json should now list @myorg/my-package in your dependencies:

{
  "name": "my-app",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "prepare": "npm run co:login",
    "co:login": "aws codeartifact login --tool npm --repository my-repo --domain my-domain",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "dependencies": {
    "@myorg/my-package": "^1.0.0"
  }
}

Using your new npm package

In our my-app application, create a file named index.js to run code from our package containing the following:

const { helloWorld } = require('@myorg/my-package');

helloWorld();

Run node index.js in your terminal to see the console print the message from our @myorg/my-package helloWorld function.

Cleaning Up

If you created a CodeArtifact repository for the purposes of this post, delete it when you are finished, for example from the CodeArtifact console or with the AWS CLI.

Also remove the changes made to your user profile’s npm configuration by running npm config delete registry; this removes the CodeArtifact repository as your default npm registry.

Conclusion

In this post, you successfully published a private scoped npm package stored in CodeArtifact, which you can reuse across multiple teams and projects within your organization. You can use npm scripts to streamline the authentication process and apply this pattern to save time.

About the Author

Ryan Sonshine

Ryan Sonshine is a Cloud Application Architect at Amazon Web Services. He works with customers to drive digital transformations while helping them architect, automate, and re-engineer solutions to fully leverage the AWS Cloud.


Building, bundling, and deploying applications with the AWS CDK

Post Syndicated from Cory Hall original https://aws.amazon.com/blogs/devops/building-apps-with-aws-cdk/

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to model and provision your cloud application resources using familiar programming languages.

The post CDK Pipelines: Continuous delivery for AWS CDK applications showed how you can use CDK Pipelines to deploy a TypeScript-based AWS Lambda function. In that post, you learned how to add additional build commands to the pipeline to compile the TypeScript code to JavaScript, which is needed to create the Lambda deployment package.

In this post, we dive deeper into how you can perform these build commands as part of your AWS CDK build process by using the native AWS CDK bundling functionality.

If you’re working with Python, TypeScript, or JavaScript-based Lambda functions, you may already be familiar with the PythonFunction and NodejsFunction constructs, which use the bundling functionality. This post describes how to write your own bundling logic for instances where a higher-level construct either doesn’t already exist or doesn’t meet your needs. To illustrate this, I walk through two different examples: a Lambda function written in Golang and a static site created with Nuxt.js.

Concepts

A typical CI/CD pipeline contains steps to build and compile your source code, bundle it into a deployable artifact, push it to artifact stores, and deploy to an environment. In this post, we focus on the building, compiling, and bundling stages of the pipeline.

The AWS CDK has the concept of bundling source code into a deployable artifact. As of this writing, this works for two main types of assets: Docker images published to Amazon Elastic Container Registry (Amazon ECR) and files published to Amazon Simple Storage Service (Amazon S3). For files published to Amazon S3, this can be as simple as pointing to a local file or directory, which the AWS CDK uploads to Amazon S3 for you.
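For instance, pointing the AWS CDK at a local directory is enough for it to zip and publish the contents. A minimal sketch using the aws-s3-assets module, inside a Stack constructor (the directory name is illustrative):

import * as path from 'path';
import * as assets from '@aws-cdk/aws-s3-assets';

// Stages ./config in the cloud assembly, then zips and uploads it to the
// CDK assets bucket during deployment.
const configAsset = new assets.Asset(this, 'ConfigFiles', {
  path: path.join(__dirname, 'config'),
});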

When you build an AWS CDK application (by running cdk synth), a cloud assembly is produced. The cloud assembly consists of a set of files and directories that define your deployable AWS CDK application. In the context of the AWS CDK, it might include the following:

  • AWS CloudFormation templates and instructions on where to deploy them
  • Dockerfiles, corresponding application source code, and information about where to build and push the images to
  • File assets and information about which S3 buckets to upload the files to

Use case

For this use case, our application consists of front-end and backend components. The example code is available in the GitHub repo. In the repository, I have split the example into two separate AWS CDK applications. The repo also contains the Golang Lambda example app and the Nuxt.js static site.

Golang Lambda function

To create a Golang-based Lambda function, you must first create a Lambda function deployment package. For Go, this consists of a .zip file containing a Go executable. Because we don’t commit the Go executable to our source repository, our CI/CD pipeline must perform the necessary steps to create it.

In the context of the AWS CDK, when we create a Lambda function, we have to tell the AWS CDK where to find the deployment package. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  runtime: lambda.Runtime.GO_1_X,
  handler: 'main',
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-go-executable')),
});

In the preceding code, the lambda.Code.fromAsset() method tells the AWS CDK where to find the Golang executable. When we run cdk synth, it stages this Go executable in the cloud assembly, which it zips and publishes to Amazon S3 as part of the PublishAssets stage.

If we’re running the AWS CDK as part of a CI/CD pipeline, this executable doesn’t exist yet, so how do we create it? One method is CDK bundling. The lambda.Code.fromAsset() method takes a second optional argument, AssetOptions, which contains the bundling parameter. With this bundling parameter, we can tell the AWS CDK to perform steps prior to staging the files in the cloud assembly.

Breaking down the BundlingOptions parameter further, we can perform the build inside a Docker container or locally.

Building inside a Docker container

For this to work, we need to make sure that we have Docker running on our build machine. In AWS CodeBuild, this means setting privileged: true. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-source-code'), {
    bundling: {
      image: lambda.Runtime.GO_1_X.bundlingDockerImage,
      command: [
        'bash', '-c', [
          'go test -v',
          'GOOS=linux go build -o /asset-output/main',
        ].join(' && '),
      ],
    },
  }),
  ...
});

We specify two parameters:

  • image (required) – The Docker image to perform the build commands in
  • command (optional) – The command to run within the container

The AWS CDK mounts the folder specified as the first argument to fromAsset at /asset-input inside the container, and mounts the asset output directory (where the cloud assembly is staged) at /asset-output inside the container.

After we perform the build commands, we need to make sure we copy the Golang executable to the /asset-output location (or specify it as the build output location like in the preceding example).

This is the equivalent of running something like the following code:

docker run \
  --rm \
  -v folder-containing-source-code:/asset-input \
  -v cdk.out/asset.1234a4b5/:/asset-output \
  lambci/lambda:build-go1.x \
  bash -c 'GOOS=linux go build -o /asset-output/main'

Building locally

To build locally (not in a Docker container), we have to provide the local parameter. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-source-code'), {
    bundling: {
      image: lambda.Runtime.GO_1_X.bundlingDockerImage,
      command: [],
      local: {
        tryBundle(outputDir: string) {
          // Verify that go is installed before attempting a local build
          const check = spawnSync('go', ['version']);
          if (check.status !== 0) {
            return false;
          }

          // Build the Linux executable directly into the asset output directory
          spawnSync('go', ['build', '-o', path.join(outputDir, 'main')], {
            env: { ...process.env, GOOS: 'linux' },
          });
          return true;
        },
      },
    },
  }),
  ...
});

The local parameter must implement the ILocalBundling interface. The tryBundle method is passed the asset output directory, and expects you to return a boolean (true or false). If you return true, the AWS CDK doesn’t try to perform Docker bundling. If you return false, it falls back to Docker bundling. Just like with Docker bundling, you must make sure that you place the Go executable in the outputDir.

Typically, you should perform some validation steps to ensure that you have the required dependencies installed locally to perform the build. This could be checking to see if you have go installed, or checking a specific version of go. This can be useful if you don’t have control over what type of build environment this might run in (for example, if you’re building a construct to be consumed by others).
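For example, a version check could look something like the following sketch (the helper and the minimum version here are illustrative, not part of the AWS CDK):

import { spawnSync } from 'child_process';

// Illustrative helper: returns true only if `go version` reports go1.13 or later
function hasCompatibleGo(): boolean {
  const result = spawnSync('go', ['version']);
  if (result.status !== 0) {
    return false; // go is not installed or not on the PATH
  }
  // Output looks like: "go version go1.13.4 linux/amd64"
  const match = /go(\d+)\.(\d+)/.exec(result.stdout.toString());
  if (!match) {
    return false;
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major > 1 || (major === 1 && minor >= 13);
}

You could then call a helper like this at the top of tryBundle and return false when the local toolchain doesn’t qualify, falling back to Docker bundling.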

If we run cdk synth on this, we see a new message telling us that the AWS CDK is bundling the asset. If we include additional commands like go test, we also see the output of those commands. This is especially useful if you want to fail a build when tests fail. See the following output:

$ cdk synth
Bundling asset GolangLambdaStack/MyGoFunction/Code/Stage...
✓  . (9ms)
✓  clients (5ms)

DONE 8 tests in 11.476s
✓  clients (5ms) (coverage: 84.6% of statements)
✓  . (6ms) (coverage: 78.4% of statements)

DONE 8 tests in 2.464s

Cloud Assembly

If we look at the cloud assembly that was generated (located at cdk.out), we see something like the following:

$ ls cdk.out
GolangLambdaStack.assets.json
GolangLambdaStack.template.json
asset.01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952
manifest.json
tree.json

It contains our GolangLambdaStack CloudFormation template that defines our Lambda function, as well as our Golang executable, bundled at asset.01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952/main.

Let’s look at how the AWS CDK uses this information. The GolangLambdaStack.assets.json file contains all the information necessary for the AWS CDK to know where and how to publish our assets (in this use case, our Golang Lambda executable). See the following code:

{
  "version": "5.0.0",
  "files": {
    "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952": {
      "source": {
        "path": "asset.01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952",
        "packaging": "zip"
      },
      "destinations": {
        "current_account-current_region": {
          "bucketName": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}",
          "objectKey": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip",
          "assumeRoleArn": "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/cdk-hnb659fds-file-publishing-role-${AWS::AccountId}-${AWS::Region}"
        }
      }
    }
  }
}

The file contains information about where to find the source files (source.path) and what type of packaging (source.packaging). It also tells the AWS CDK where to publish this .zip file (bucketName and objectKey) and what AWS Identity and Access Management (IAM) role to use (assumeRoleArn). In this use case, we only deploy to a single account and Region, but if you have multiple accounts or Regions, you see multiple destinations in this file.
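For example, a hypothetical pipeline that publishes to two environments might produce destination entries like the following (the account IDs and Regions are made up for illustration; the asset hash stays the same because the artifact is identical):

"destinations": {
  "111111111111-us-east-1": {
    "bucketName": "cdk-hnb659fds-assets-111111111111-us-east-1",
    "objectKey": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip",
    "assumeRoleArn": "arn:aws:iam::111111111111:role/cdk-hnb659fds-file-publishing-role-111111111111-us-east-1"
  },
  "222222222222-eu-west-1": {
    "bucketName": "cdk-hnb659fds-assets-222222222222-eu-west-1",
    "objectKey": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip",
    "assumeRoleArn": "arn:aws:iam::222222222222:role/cdk-hnb659fds-file-publishing-role-222222222222-eu-west-1"
  }
}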

The GolangLambdaStack.template.json file that defines our Lambda resource looks something like the following code:

{
  "Resources": {
    "MyGoFunction0AB33E85": {
      "Type": "AWS::Lambda::Function",
      "Properties": {
        "Code": {
          "S3Bucket": {
            "Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
          },
          "S3Key": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip"
        },
        "Handler": "main",
        ...
      }
    },
    ...
  }
}

The S3Bucket and S3Key match the bucketName and objectKey from the assets.json file. By default, the S3Key is generated by calculating a hash of the contents of the folder that you pass to lambda.Code.fromAsset() (for this post, folder-containing-source-code). This means that any time we update our source code, this calculated hash changes and a new Lambda function deployment is triggered.
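If you would rather have the hash (and therefore the redeployment trigger) depend on the bundled output instead of the source folder, the AWS CDK exposes an assetHashType option. A minimal sketch, assuming the AssetHashType.OUTPUT value from @aws-cdk/core (called BUNDLE in some earlier CDK releases):

import * as cdk from '@aws-cdk/core';

new lambda.Function(this, 'MyGoFunction', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-source-code'), {
    // Hash the bundled executable rather than the source folder
    assetHashType: cdk.AssetHashType.OUTPUT,
    bundling: {
      // ... same bundling options as shown earlier
    },
  }),
  ...
});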

Nuxt.js static site

In this section, I walk through building a static site using the Nuxt.js framework. You can apply the same logic to any static site framework that requires you to run a build step prior to deploying.

To deploy this static site, we use the BucketDeployment construct. This is a construct that allows you to populate an S3 bucket with the contents of .zip files from other S3 buckets or from a local disk.

Typically, we simply tell the BucketDeployment construct where to find the files that it needs to deploy to the S3 bucket. See the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-directory')),
  ],
  destinationBucket: myBucket
});

To deploy a static site built with a framework like Nuxt.js, we need to first run a build step to compile the site into something that can be deployed. For Nuxt.js, we run the following two commands:

  • yarn install – Installs all our dependencies
  • yarn generate – Builds the application and generates every route as an HTML file (used for static hosting)

This creates a dist directory, which you can deploy to Amazon S3.

Just like with the Golang Lambda example, we can perform these steps as part of the AWS CDK through either local or Docker bundling.

Building inside a Docker container

To build inside a Docker container, use the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-nuxtjs-project'), {
      bundling: {
        image: cdk.BundlingDockerImage.fromRegistry('node:lts'),
        command: [
          'bash', '-c', [
            'yarn install',
            'yarn generate',
            'cp -r /asset-input/dist/* /asset-output/',
          ].join(' && '),
        ],
      },
    }),
  ],
  ...
});

For this post, we build inside the publicly available node:lts image hosted on Docker Hub. Inside the container, we run our build commands yarn install && yarn generate, and copy the generated dist directory to our output directory (the cloud assembly).

The parameters are the same as described in the Golang example we walked through earlier.

Building locally

To build locally, use the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-nuxtjs-project'), {
      bundling: {
        local: {
          tryBundle(outputDir: string) {
            // Verify that yarn is installed before attempting a local build
            const check = spawnSync('yarn', ['--version']);
            if (check.status !== 0) {
              return false;
            }

            spawnSync('yarn', ['install']);
            spawnSync('yarn', ['generate']);

            // Copy the generated site into the asset output directory
            // (fs.copySync is provided by the fs-extra package)
            fs.copySync(path.join(__dirname, 'path-to-nuxtjs-project', 'dist'), outputDir);
            return true;
          },
        },
        image: cdk.BundlingDockerImage.fromRegistry('node:lts'),
        command: [],
      },
    }),
  ],
  ...
});

Building locally works the same as the Golang example we walked through earlier, with one exception. We have one additional command to run that copies the generated dist folder to our output directory (cloud assembly).

Conclusion

This post showed how you can easily compile your backend and front-end applications using the AWS CDK. You can find the example code for this post in this GitHub repo. If you have any questions or comments, please comment on the GitHub repo. If you have any additional examples you want to add, we encourage you to create a Pull Request with your example!

Our code also contains examples of deploying the applications using CDK Pipelines, so if you’re interested in deploying the example yourself, check out the example repo.

 

About the author

Cory Hall

Cory is a Solutions Architect at Amazon Web Services with a passion for DevOps and is based in Charlotte, NC. Cory works with enterprise AWS customers to help them design, deploy, and scale applications to achieve their business goals.

How to add authentication to a single-page web application with Amazon Cognito OAuth2 implementation

Post Syndicated from George Conti original https://aws.amazon.com/blogs/security/how-to-add-authentication-single-page-web-application-with-amazon-cognito-oauth2-implementation/

In this post, I’ll be showing you how to configure Amazon Cognito as an OpenID provider (OP) with a single-page web application.

This use case describes using Amazon Cognito to integrate with an existing authorization system following the OpenID Connect (OIDC) specification. OIDC is an identity layer on top of the OAuth 2.0 protocol to enable clients to verify the identity of users. Amazon Cognito lets you add user sign-up, sign-in, and access control to your web and mobile apps quickly and easily. Some key reasons customers select Amazon Cognito include:

  • Simplicity of implementation: The console is very intuitive; it takes a short time to understand how to configure and use Amazon Cognito. Amazon Cognito also has key out-of-the-box functionality, including social sign-in, multi-factor authentication (MFA), forgotten password support, and infrastructure as code (AWS CloudFormation) support.
  • Ability to customize workflows: Amazon Cognito offers the option of a hosted UI where users can sign-in directly to Amazon Cognito or sign-in via social identity providers such as Amazon, Google, Apple, and Facebook. The Amazon Cognito hosted UI and workflows help save your team significant time and effort.
  • OIDC support: Amazon Cognito can securely pass user profile information to an existing authorization system following the OIDC authorization code flow. The authorization system uses the user profile information to secure access to the app.

Amazon Cognito overview

Amazon Cognito follows the OIDC specification to authenticate users of web and mobile apps. Users can sign in directly through the Amazon Cognito hosted UI or through a federated identity provider, such as Amazon, Facebook, Apple, or Google. The hosted UI workflows include sign-in and sign-up, password reset, and MFA. Since not all customer workflows are the same, you can customize Amazon Cognito workflows at key points with AWS Lambda functions, allowing you to run code without provisioning or managing servers. After a user authenticates, Amazon Cognito returns standard OIDC tokens. You can use the user profile information in the ID token to grant your users access to your own resources or you can use the tokens to grant access to APIs hosted by Amazon API Gateway. You can also exchange the tokens for temporary AWS credentials to access other AWS services.

Figure 1: Amazon Cognito sign-in flow

OAuth 2.0 and OIDC

OAuth 2.0 is an open standard that allows a user to delegate access to their information to other websites or applications without handing over credentials. OIDC is an identity layer on top of OAuth 2.0 that uses OAuth 2.0 flows. OAuth 2.0 defines a number of flows to manage the interaction between the application, user, and authorization server. The right flow to use depends on the type of application.

The client credentials flow is used in machine-to-machine communications. You can use the client credentials flow to request an access token to access your own resources, which means you can use this flow when your app is requesting the token on its own behalf, not on behalf of a user. The authorization code grant flow is used to return an authorization code that is then exchanged for user pool tokens. Because the tokens are never exposed directly to the user, they are less likely to be shared broadly or accessed by an unauthorized party. However, a custom application is required on the back end to exchange the authorization code for user pool tokens. For security reasons, we recommend the authorization code flow with Proof Key for Code Exchange (PKCE) for public clients, such as single-page apps or native mobile apps.

The following table shows recommended flows per application type.

Application | Flow | Description
Machine | Client credentials | Use this flow when your application is requesting the token on its own behalf, not on behalf of the user
Web app on a server | Authorization code grant | A regular web app on a web server
Single-page app | Authorization code grant with PKCE | An app running in the browser, such as JavaScript
Mobile app | Authorization code grant with PKCE | iOS or Android app

Securing the authorization code flow

Amazon Cognito can help you achieve compliance with regulatory frameworks and certifications, but it’s your responsibility to use the service in a way that remains compliant and secure. You need to determine the sensitivity of the user profile data in Amazon Cognito; adhere to your company’s security requirements, applicable laws and regulations; and configure your application and corresponding Amazon Cognito settings appropriately for your use case.

Note: You can learn more about regulatory frameworks and certifications at AWS Services in Scope by Compliance Program. You can download compliance reports from AWS Artifact.

We recommend that you use the authorization code flow with PKCE for single-page apps. Applications that use PKCE generate a random code verifier that’s created for every authorization request. Proof Key for Code Exchange by OAuth Public Clients has more information on use of a code verifier. In the following sections, I will show you how to set up the Amazon Cognito authorization endpoint for your app to support a code verifier.
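To make this concrete, the authorization request that the sample app constructs later in this post carries the code challenge and challenge method as query parameters. Broken out for readability, it looks like this (placeholder values in capitals):

https://YOUR_COGNITO_DOMAIN_PREFIX.auth.YOUR_REGION.amazoncognito.com/oauth2/authorize
    ?response_type=code
    &state=STATE
    &client_id=YOUR_APPCLIENT_ID
    &redirect_uri=YOUR_REDIRECT_URI
    &scope=openid
    &code_challenge_method=S256
    &code_challenge=CODE_CHALLENGE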

The authorization code flow

In OpenID terms, the app is the relying party (RP) and Amazon Cognito is the OP. The flow for the authorization code flow with PKCE is as follows:

  1. The user enters the app home page URL in the browser and the browser fetches the app.
  2. The app generates the PKCE code challenge and redirects the request to the Amazon Cognito OAuth2 authorization endpoint (/oauth2/authorize).
  3. Amazon Cognito responds back to the user’s browser with the Amazon Cognito hosted sign-in page.
  4. The user signs in with their user name and password, signs up as a new user, or signs in with a federated sign-in. After a successful sign-in, Amazon Cognito returns the authorization code to the browser, which redirects the authorization code back to the app.
  5. The app sends a request to the Amazon Cognito OAuth2 token endpoint (/oauth2/token) with the authorization code, its client credentials, and the PKCE verifier.
  6. Amazon Cognito authenticates the app with the supplied credentials, validates the authorization code, validates the request with the code verifier, and returns the OpenID tokens, access token, ID token, and refresh token.
  7. The app validates the OpenID ID token and then uses the user profile information (claims) in the ID token to provide access to resources. (Optional) The app can use the access token to retrieve the user profile information from the Amazon Cognito user information endpoint (/userInfo).
  8. Amazon Cognito returns the user profile information (claims) about the authenticated user to the app. The app then uses the claims to provide access to resources.

The following diagram shows the authorization code flow with PKCE.

Figure 2: Authorization code flow

Implementing an app with Amazon Cognito authentication

Now that you’ve learned about Amazon Cognito OAuth implementation, let’s create a working example app that uses Amazon Cognito OAuth implementation. You’ll create an Amazon Cognito user pool along with an app client, the app, an Amazon Simple Storage Service (Amazon S3) bucket, and an Amazon CloudFront distribution for the app, and you’ll configure the app client.

Step 1. Create a user pool

Start by creating your user pool with the default configuration.

Create a user pool:

  1. Go to the Amazon Cognito console and select Manage User Pools. This takes you to the User Pools Directory.
  2. Select Create a user pool in the upper corner.
  3. Enter a Pool name, select Review defaults, and select Create pool.
  4. Copy the Pool ID, which will be used later to create your single-page app. It will be something like region_xxxxx. You will use it to replace the variable YOUR_USERPOOL_ID in a later step. (Optional) You can add additional features to the user pool, but this demonstration uses the default configuration. For more information, see the Amazon Cognito documentation.

The following figure shows you entering the user pool name.

Figure 3: Enter a name for the user pool

The following figure shows the resulting user pool configuration.

Figure 4: Completed user pool configuration

Step 2. Create a domain name

The Amazon Cognito hosted UI lets you use your own domain name or you can add a prefix to the Amazon Cognito domain. This example uses an Amazon Cognito domain with a prefix.

Create a domain name:

  1. Sign in to the Amazon Cognito console, select Manage User Pools, and select your user pool.
  2. Under App integration, select Domain name.
  3. In the Amazon Cognito domain section, add your Domain prefix (for example, myblog).
  4. Select Check availability. If your domain isn’t available, change the domain prefix and try again.
  5. When your domain is confirmed as available, copy the Domain prefix to use when you create your single-page app. You will use it to replace the variable YOUR_COGNITO_DOMAIN_PREFIX in a later step.
  6. Choose Save changes.

The following figure shows creating an Amazon Cognito hosted domain.

Figure 5: Creating an Amazon Cognito hosted UI domain

Step 3. Create an app client

Now create the app client user pool. An app client is where you register your app with the user pool. Generally, you create an app client for each app platform. For example, you might create an app client for a single-page app and another app client for a mobile app. Each app client has its own ID, authentication flows, and permissions to access user attributes.

Create an app client:

  1. Sign in to the Amazon Cognito console, select Manage User Pools, and select your user pool.
  2. Under General settings, select App clients.
  3. Choose Add an app client.
  4. Enter a name for the app client in the App client name field.
  5. Uncheck Generate client secret and accept the remaining default configurations.

    Note: The client secret is used to authenticate the app client to the user pool. Generate client secret is unchecked because you don’t want to send the client secret on the URL using client-side JavaScript. The client secret is used by applications that have a server-side component that can secure the client secret.

  6. Choose Create app client as shown in the following figure.

    Figure 6: Create and configure an app client

  7. Copy the App client ID. You will use it to replace the variable YOUR_APPCLIENT_ID in a later step.

The following figure shows the App client ID which is automatically generated when the app client is created.

Figure 7: App client configuration

Step 4. Create an Amazon S3 website bucket

Amazon S3 is an object storage service that offers industry-leading scalability, data availability, security, and performance. We use Amazon S3 here to host a static website.

Create an Amazon S3 bucket with the following settings:

  1. Sign in to the AWS Management Console and open the Amazon S3 console.
  2. Choose Create bucket to start the Create bucket wizard.
  3. In Bucket name, enter a DNS-compliant name for your bucket. You will use this in a later step to replace the YOURS3BUCKETNAME variable.
  4. In Region, choose the AWS Region where you want the bucket to reside.

    Note: It’s recommended to create the Amazon S3 bucket in the same AWS Region as Amazon Cognito.

  5. Look up the region code from the region table (for example, US East (N. Virginia) has a region code of us-east-1). You will use the region code to replace the variable YOUR_REGION in a later step.
  6. Choose Next.
  7. Select the Versioning checkbox.
  8. Choose Next.
  9. Choose Next.
  10. Choose Create bucket.
  11. Select the bucket you just created from the Amazon S3 bucket list.
  12. Select the Properties tab.
  13. Choose Static website hosting.
  14. Choose Use this bucket to host a website.
  15. For the index document, enter index.html and then choose Save.

Step 5. Create a CloudFront distribution

Amazon CloudFront is a fast content delivery network service that helps securely deliver data, videos, applications, and APIs to customers globally with low latency and high transfer speeds—all within a developer-friendly environment. In this step, we use CloudFront to set up an HTTPS-enabled domain for the static website hosted on Amazon S3.

Create a CloudFront distribution (web distribution) with the following modified default settings:

  1. Sign into the AWS Management Console and open the CloudFront console.
  2. Choose Create Distribution.
  3. On the first page of the Create Distribution Wizard, in the Web section, choose Get Started.
  4. Choose the Origin Domain Name from the dropdown list. It will be YOURS3BUCKETNAME.s3.amazonaws.com.
  5. For Restrict Bucket Access, select Yes.
  6. For Origin Access Identity, select Create a New Identity.
  7. For Grant Read Permission on Bucket, select Yes, Update Bucket Policy.
  8. For the Viewer Protocol Policy, select Redirect HTTP to HTTPS.
  9. For Cache Policy, select Managed-Caching Disabled.
  10. Set the Default Root Object to index.html. (Optional) Add a comment. Comments are a good place to describe the purpose of your distribution, for example, “Amazon Cognito SPA.”
  11. Select Create Distribution. The distribution will take a few minutes to create and update.
  12. Copy the Domain Name. This is the CloudFront distribution domain name, which you will use in a later step as the DOMAINNAME value in the YOUR_REDIRECT_URI variable.

Step 6. Create the app

Now that you’ve created the Amazon S3 bucket for static website hosting and the CloudFront distribution for the site, you’re ready to use the code that follows to create a sample app.

Use the following information from the previous steps:

  1. YOUR_COGNITO_DOMAIN_PREFIX is from Step 2.
  2. YOUR_REGION is the AWS region you used in Step 4 when you created your Amazon S3 bucket.
  3. YOUR_APPCLIENT_ID is the App client ID from Step 3.
  4. YOUR_USERPOOL_ID is the Pool ID from Step 1.
  5. YOUR_REDIRECT_URI, which is https://DOMAINNAME/index.html, where DOMAINNAME is your domain name from Step 5.

Create userprofile.js

Use the following text to create the userprofile.js file. Substitute the values you copied in the preceding steps for the variables in the text.

var myHeaders = new Headers();
myHeaders.set('Cache-Control', 'no-store');
var urlParams = new URLSearchParams(window.location.search);
var tokens;
var domain = "YOUR_COGNITO_DOMAIN_PREFIX";
var region = "YOUR_REGION";
var appClientId = "YOUR_APPCLIENT_ID";
var userPoolId = "YOUR_USERPOOL_ID";
var redirectURI = "YOUR_REDIRECT_URI";

//Convert Payload from Base64-URL to JSON
const decodePayload = payload => {
  const cleanedPayload = payload.replace(/-/g, '+').replace(/_/g, '/');
  const decodedPayload = atob(cleanedPayload)
  const uriEncodedPayload = Array.from(decodedPayload).reduce((acc, char) => {
    const uriEncodedChar = ('00' + char.charCodeAt(0).toString(16)).slice(-2)
    return `${acc}%${uriEncodedChar}`
  }, '')
  const jsonPayload = decodeURIComponent(uriEncodedPayload);

  return JSON.parse(jsonPayload)
}

//Parse JWT Payload
const parseJWTPayload = token => {
    const [header, payload, signature] = token.split('.');
    const jsonPayload = decodePayload(payload)

    return jsonPayload
};

//Parse JWT Header
const parseJWTHeader = token => {
    const [header, payload, signature] = token.split('.');
    const jsonHeader = decodePayload(header)

    return jsonHeader
};

//Generate a Random String
const getRandomString = () => {
    const randomItems = new Uint32Array(28);
    crypto.getRandomValues(randomItems);
    // Use Array.from so each value maps to a two-character hex string
    // (map on a typed array would coerce the strings back to numbers)
    const binaryStringItems = Array.from(randomItems, dec => `0${dec.toString(16).substr(-2)}`);
    return binaryStringItems.join('');
}

//Hash a String with SHA-256
const encryptStringWithSHA256 = async str => {
    const PROTOCOL = 'SHA-256'
    const textEncoder = new TextEncoder();
    const encodedData = textEncoder.encode(str);
    return crypto.subtle.digest(PROTOCOL, encodedData);
}

//Convert Hash to Base64-URL
const hashToBase64url = arrayBuffer => {
    const items = new Uint8Array(arrayBuffer)
    const stringifiedArrayHash = items.reduce((acc, i) => `${acc}${String.fromCharCode(i)}`, '')
    const decodedHash = btoa(stringifiedArrayHash)

    const base64URL = decodedHash.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
    return base64URL
}

// Main Function
async function main() {
  var code = urlParams.get('code');

  //If code not present then request code else request tokens
  if (code == null){

    // Create random "state"
    var state = getRandomString();
    sessionStorage.setItem("pkce_state", state);

    // Create PKCE code verifier
    var code_verifier = getRandomString();
    sessionStorage.setItem("code_verifier", code_verifier);

    // Create code challenge
    var arrayHash = await encryptStringWithSHA256(code_verifier);
    var code_challenge = hashToBase64url(arrayHash);
    sessionStorage.setItem("code_challenge", code_challenge)

    // Redirect user-agent to /authorize endpoint
    location.href = "https://"+domain+".auth."+region+".amazoncognito.com/oauth2/authorize?response_type=code&state="+state+"&client_id="+appClientId+"&redirect_uri="+redirectURI+"&scope=openid&code_challenge_method=S256&code_challenge="+code_challenge;
  } else {

    // Verify state matches
    state = urlParams.get('state');
    if(sessionStorage.getItem("pkce_state") != state) {
        alert("Invalid state");
    } else {

    // Fetch OAuth2 tokens from Cognito
    code_verifier = sessionStorage.getItem('code_verifier');
    await fetch("https://"+domain+".auth."+region+".amazoncognito.com/oauth2/token?grant_type=authorization_code&client_id="+appClientId+"&code_verifier="+code_verifier+"&redirect_uri="+redirectURI+"&code="+code, {
      method: 'post',
      headers: {
        'Content-Type': 'application/x-www-form-urlencoded'
      }})
      .then((response) => {
        return response.json();
      })
      .then((data) => {

        // Verify id_token
        tokens = data;
        var idVerified = verifyToken(tokens.id_token);
        Promise.resolve(idVerified).then(function(value) {
          if (value.localeCompare("verified")) {
            alert("Invalid ID Token - " + value);
            return;
          }
        });

        // Display tokens
        document.getElementById("id_token").innerHTML = JSON.stringify(parseJWTPayload(tokens.id_token), null, '\t');
        document.getElementById("access_token").innerHTML = JSON.stringify(parseJWTPayload(tokens.access_token), null, '\t');
      });

    // Fetch the user profile from /userInfo
    await fetch("https://"+domain+".auth."+region+".amazoncognito.com/oauth2/userInfo", {
      method: 'post',
      headers: {
        'authorization': 'Bearer ' + tokens.access_token
      }})
      .then((response) => {
        return response.json();
      })
      .then((data) => {
        // Display user information
        document.getElementById("userInfo").innerHTML = JSON.stringify(data, null, '\t');
      });
    }
  }
}
main();

Create the verifier.js file

Use the following text to create the verifier.js file.

var key_id;
var keys;
var key_index;

//Verify a token issued by the Amazon Cognito user pool
async function verifyToken(token) {
  //Get the Amazon Cognito public keys
  var keys_url = 'https://cognito-idp.' + region + '.amazonaws.com/' + userPoolId + '/.well-known/jwks.json';
  await fetch(keys_url)
    .then((response) => {
      return response.json();
    })
    .then((data) => {
      keys = data['keys'];
    });

  //Get the kid (key id) from the token header
  var tokenHeader = parseJWTHeader(token);
  key_id = tokenHeader.kid;

  //Search for the kid in the Amazon Cognito keys
  const key = keys.find(key => key.kid === key_id);
  if (key === undefined) {
    return "Public key not found in Cognito jwks.json";
  }

  //Verify the JWT signature
  var keyObj = KEYUTIL.getKey(key);
  var isValid = KJUR.jws.JWS.verifyJWT(token, keyObj, {alg: ["RS256"]});
  if (!isValid) {
    return "Signature verification failed";
  }

  //Verify the token has not expired
  var tokenPayload = parseJWTPayload(token);
  if (Date.now() >= tokenPayload.exp * 1000) {
    return "Token expired";
  }

  //Verify the token was issued for this app client
  var n = tokenPayload.aud.localeCompare(appClientId);
  if (n != 0) {
    return "Token was not issued for this audience";
  }

  return "verified";
};

Create an index.html file

Use the following text to create the index.html file.

<!doctype html>

<html lang="en">
<head>
<meta charset="utf-8">

<title>MyApp</title>
<meta name="description" content="My Application">
<meta name="author" content="Your Name">
</head>

<body>
<h2>Cognito User</h2>

<p style="white-space:pre-line;" id="token_status"></p>

<p>Id Token</p>
<p style="white-space:pre-line;" id="id_token"></p>

<p>Access Token</p>
<p style="white-space:pre-line;" id="access_token"></p>

<p>User Profile</p>
<p style="white-space:pre-line;" id="userInfo"></p>
<script type="text/javascript" src="https://kjur.github.io/jsrsasign/jsrsasign-latest-all-min.js"></script>
<script src="js/verifier.js"></script>
<script src="js/userprofile.js"></script>
</body>
</html>

Upload the files into the Amazon S3 Bucket you created earlier

Upload the files you just created to the Amazon S3 bucket that you created in Step 4. If you’re using Chrome or Firefox browsers, you can choose the folders and files to upload and then drag and drop them into the destination bucket. Dragging and dropping is the only way that you can upload folders.

  1. Sign in to the AWS Management Console and open the Amazon S3 console.
  2. In the Bucket name list, choose the name of the bucket that you created earlier in Step 4.
  3. In a window other than the console window, select the index.html file to upload. Then drag and drop the file into the console window that lists the destination bucket.
  4. In the Upload dialog box, choose Upload.
  5. Choose Create Folder.
  6. Enter the name js and choose Save.
  7. Choose the js folder.
  8. In a window other than the console window, select the userprofile.js and verifier.js files to upload. Then drag and drop the files into the console window js folder.

    Note: The Amazon S3 bucket root will contain the index.html file and a js folder. The js folder will contain the userprofile.js and verifier.js files.

Step 7. Configure the app client settings

Use the Amazon Cognito console to configure the app client settings, including identity providers, OAuth flows, and OAuth scopes.

Configure the app client settings:

  1. Go to the Amazon Cognito console.
  2. Choose Manage your User Pools.
  3. Select your user pool.
  4. Select App integration, and then select App client settings.
  5. Under Enabled Identity Providers, select Cognito User Pool.(Optional) You can add federated identity providers. Adding User Pool Sign-in Through a Third-Party has more information about how to add federation providers.
  6. Enter the Callback URL(s) where the user is to be redirected after successfully signing in. The callback URL is the URL of your web app that will receive the authorization code. In our example, this will be the Domain Name for the CloudFront distribution you created earlier. It will look something like https://DOMAINNAME/index.html where DOMAINNAME is xxxxxxx.cloudfront.net.

    Note: HTTPS is required for the Callback URLs. For this example, I used CloudFront as an HTTPS endpoint for the app in Amazon S3.

  7. Next, select Authorization code grant from the Allowed OAuth Flows and OpenID from Allowed OAuth Scopes. The OpenID scope will return the ID token and grant access to all user attributes that are readable by the client.
  8. Choose Save changes.

Step 8. Show the app home page

Now that the Amazon Cognito user pool is configured and the sample app is built, you can test using Amazon Cognito as an OP from the sample JavaScript app you created in Step 6.

View the app’s home page:

  1. Open a web browser and enter the app’s home page URL, using the CloudFront distribution to serve the index.html page you created in Step 6 (https://DOMAINNAME/index.html). The app will redirect the browser to the Amazon Cognito /authorize endpoint.
  2. The /authorize endpoint redirects the browser to the Amazon Cognito hosted UI, where the user can sign in or sign up. The following figure shows the user sign-in page.

    Figure 8: User sign-in page

Step 9. Create a user

You can use the Amazon Cognito user pool to manage your users or you can use a federated identity provider. Users can sign in or sign up from the Amazon Cognito hosted UI or from a federated identity provider. If you configured a federated identity provider, users will see a list of federated providers that they can choose from. When a user chooses a federated identity provider, they are redirected to the federated identity provider sign-in page. After signing in, the browser is directed back to Amazon Cognito. For this post, Amazon Cognito is the only identity provider, so you will use the Amazon Cognito hosted UI to create an Amazon Cognito user.

Create a new user using Amazon Cognito hosted UI:

  1. Create a new user by selecting Sign up and entering a username, password, and email address. Then select the Sign up button. The following figure shows the sign up screen.

    Figure 9: Sign up with a new account

  2. The Amazon Cognito sign up workflow will verify the email address by sending a verification code to that address. The following figure shows the prompt to enter the verification code.

    Figure 10: Enter the verification code

  3. Enter the code from the verification email in the Verification Code text box.
  4. Select Confirm Account.

Step 10. View the Amazon Cognito tokens and profile information

After authentication, the app displays the tokens and user information. The following figure shows the OAuth2 access token and OIDC ID token that are returned from the /token endpoint and the user profile returned from the /userInfo endpoint. Now that the user has been authenticated, the application can use the user’s email address to look up the user’s account information in an application data store. Based on the user’s account information, the application can grant/restrict access to paid content or show account information like order history.

Figure 11: Token and user profile information

Note: Many browsers will cache redirects. If your browser is repeatedly redirecting to the index.html page, clear the browser cache.

Summary

In this post, we’ve shown you how easy it is to add user authentication to your web and mobile apps with Amazon Cognito.

We created a Cognito User Pool as our user directory, assigned a domain name to the Amazon Cognito hosted UI, and created an application client for our application. Then we created an Amazon S3 bucket to host our website. Next, we created a CloudFront distribution for our Amazon S3 bucket. Then we created our application and uploaded it to our Amazon S3 website bucket. From there, we configured the client app settings with our identity provider, OAuth flows, and scopes. Then we accessed our application and used the Amazon Cognito sign-in flow to create a username and password. Finally, we logged into our application to see the OAuth and OIDC tokens.

Amazon Cognito saves you time and effort when implementing authentication with an intuitive UI, OAuth2 and OIDC support, and customizable workflows. You can now focus on building features that are important to your core business.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon Cognito forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

George Conti

George is a Solution Architect for the AWS Financial Services team. He is passionate about technology and helping financial services companies build solutions with AWS services.

Eliminating cold starts with Cloudflare Workers

Post Syndicated from Ashcon Partovi original https://blog.cloudflare.com/eliminating-cold-starts-with-cloudflare-workers/

A “cold start” is the time it takes to load and execute a new copy of a serverless function for the first time. It’s a problem that’s both complicated to solve and costly to fix. Other serverless platforms make you choose between suffering from random increases in execution time, or paying your way out with synthetic requests to keep your function warm. Cold starts are a horrible experience, especially when serverless containers can take full seconds to warm up.

Unlike containers, Cloudflare Workers utilize isolate technology, which measures cold starts in single-digit milliseconds. Well, at least they did. Today, we’re removing the need to worry about cold starts entirely, by introducing support for Workers that have no cold starts at all – that’s right, zero. Forget about cold starts, warm starts, or… any starts, with Cloudflare Workers you get always-hot, raw performance in more than 200 cities worldwide.

Why is there a cold start problem?

It’s impractical to keep everyone’s functions warm in memory all the time. Instead, serverless providers only warm up a function after the first request is received. Then, after a period of inactivity, the function becomes cold again and the cycle continues.

For Workers, this has never been much of a problem. In contrast to containers that can spend full seconds spinning up a new containerized process for each function, the isolate technology behind Workers allows it to warm up a function in under 5 milliseconds.

Learn more about how isolates enable Cloudflare Workers to be performant and secure here.

Cold starts are ugly. They’re unexpected, unavoidable, and cause unpredictable code execution times. You shouldn’t have to compromise your customers’ experience to enjoy the benefits of serverless. In a collaborative effort between our Workers and Network teams, we set out to create a solution where you never have to worry about cold starts, warm starts, or pre-warming ever again.

How is a zero cold start even possible?

Like many features at Cloudflare, security and encryption make our network more intelligent. Since 95% of Worker requests are securely handled over HTTPS, we engineered a solution that uses the Internet’s encryption protocols to our advantage.

Before a client can send an HTTPS request, it needs to establish a secure channel with the server. This process is known as “handshaking” in the TLS, or Transport Layer Security, protocol. Most clients also send a hostname (e.g. cloudflare.com) in that handshake, which is referred to as the SNI, or Server Name Indication. The server receives the handshake, sends back a certificate, and now the client is allowed to send its original request, encrypted.

Previously, Workers would only load and compile after the entire handshake process was complete, which involves two round-trips between the client and server. But wait, we thought, if the hostname is present in the handshake, why wait until the entire process is done to preload the Worker? Since the handshake takes some time, there is an opportunity to warm up resources during the waiting time before the request arrives.

With our newest optimization, when Cloudflare receives the first packet during TLS negotiation, the “ClientHello,” we hint the Workers runtime to eagerly load that hostname’s Worker. After the handshake is done, the Worker is warm and ready to receive requests. Since it only takes 5 milliseconds to load a Worker, and the average latency between a client and Cloudflare is more than that, the cold start is zero. The Worker starts executing code the moment the request is received from the client.

When are zero cold starts available?

Now, and for everyone! We’ve rolled out this optimization to all Workers customers and it is in production today. There’s no extra fee and no configuration change required. When you build on Cloudflare Workers, you build on an intelligent, distributed network that is constantly pushing the bounds of what’s possible in terms of performance.

For now, this is only available for Workers that are deployed to a “root” hostname like “example.com” and not specific paths like “example.com/path/to/something.” We plan to introduce more optimizations in the future that can preload specific paths.

What about performance beyond cold starts?

We also recognize that performance is more than just zero cold starts. That’s why we announced the beta of Workers Unbound earlier this week. Workers Unbound has the simplicity and performance of Workers with no limits, just raw performance.

Workers, equipped with zero cold starts, no CPU limits, and a network that spans over 200 cities is prime and ready to take on any serious workload. Now that’s performance.

Interested in deploying with Workers?

Cloudflare Workers Announces Broad Language Support

Post Syndicated from Cody Koeninger original https://blog.cloudflare.com/cloudflare-workers-announces-broad-language-support/

We initially launched Cloudflare Workers with support for JavaScript and languages that compile to WebAssembly, such as Rust, C, and C++. Since then, Cloudflare and the community have improved the usability of Typescript on Workers. But we haven’t talked much about the many other popular languages that compile to JavaScript. Today, we’re excited to announce support for Python, Scala, Kotlin, Reason and Dart.

You can build applications on Cloudflare Workers using your favorite language starting today.

Getting Started

Getting started is as simple as installing Wrangler, then running generate for the template for your chosen language: Python, Scala, Kotlin, Dart, or Reason. For Python, this looks like:

wrangler generate my-python-project https://github.com/cloudflare/python-worker-hello-world

Follow the installation instructions in the README inside the generated project directory, then run wrangler publish. You can see the output of your Worker at your workers.dev subdomain, e.g. https://my-python-project.cody.workers.dev/. You can sign up for a free Workers account if you don’t have one yet.

That’s it. It is really easy to write in your favorite languages. But, this wouldn’t be a very compelling blog post if we left it at that. Now, I’ll shift the focus to how we added support for these languages and how you can add support for others.

How it all works under the hood

Language features are important. For instance, it’s hard to give up the safety and expressiveness of pattern matching once you’ve used it. Familiar syntax matters to us as programmers.

You may also have existing code in your preferred language that you’d like to reuse. Just keep in mind that the advantages of running on V8 come with the limitation that if you use libraries that depend on native code or language-specific VM features, they may not translate to JavaScript. WebAssembly may be an option in that case. But for memory-managed languages you’re usually better off compiling to JavaScript, at least until the story around garbage collection for WebAssembly stabilizes.

I’ll walk through how the Worker language templates are made using a representative example of a dynamically typed language, Python, and a statically typed language, Scala. If you want to follow along, you’ll need to have Wrangler installed and configured with your Workers account. If it’s your first time using Workers it’s a good idea to go through the quickstart.

Dynamically typed languages: Python

You can generate a starter “hello world” Python project for Workers by running

wrangler generate my-python-project https://github.com/cloudflare/python-worker-hello-world

Wrangler will create a my-python-project directory and helpfully remind you to configure your account_id in the wrangler.toml file inside it.  The README.md file in the directory links to instructions on setting up Transcrypt, the Python to JavaScript compiler we’re using. If you already have Python 3.7 and virtualenv installed, this just requires running

cd my-python-project
virtualenv env
source env/bin/activate
pip install transcrypt
wrangler publish

The main requirement for compiling to JavaScript on Workers is the ability to produce a single js file that fits in our bundle size limit of 1MB. Transcrypt adds about 70k for its Python runtime in this case, which is well within that limit. But by default running Transcrypt on a Python file will produce multiple JS and source map files in a __target__ directory. Thankfully Wrangler has built in support for webpack. There’s a webpack loader for Transcrypt, making it easy to produce a single file. See the webpack.config.js file for the setup.
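As a rough sketch of what that setup looks like — the loader name and options here assume the transcrypt-loader package, so check the generated project’s webpack.config.js for the authoritative version:

// webpack.config.js (sketch)
module.exports = {
  entry: './index.py',
  module: {
    rules: [
      {
        // Hand .py files to Transcrypt so webpack can bundle the compiled output
        test: /\.py$/,
        loader: 'transcrypt-loader',
      },
    ],
  },
};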

The point of all this is to run some Python code, so let’s take a look at index.py:

def handleRequest(request):
   return __new__(Response('Python Worker hello world!', {
       'headers' : { 'content-type' : 'text/plain' }
   }))

addEventListener('fetch', (lambda event: event.respondWith(handleRequest(event.request))))

In most respects this is very similar to any other Worker hello world, just in Python syntax. Dictionary literals take the place of JavaScript objects, lambda is used instead of an anonymous arrow function, and so on. If using __new__ to create instances of JavaScript classes seems awkward, the Transcrypt docs discuss an alternative.

Clearly, addEventListener is not a built-in Python function, it’s part of the Workers runtime. Because Python is dynamically typed, you don’t have to worry about providing type signatures for JavaScript APIs. The downside is that mistakes will result in failures when your Worker runs, rather than when Transcrypt compiles your code. Transcrypt does have experimental support for some degree of static checking using mypy.
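As a small illustration of what static checking buys you, annotating a function with types gives mypy something to verify — this is a hypothetical snippet, not part of the template:

# With annotations, a call like greet(42) can be flagged by Transcrypt's
# experimental mypy integration at compile time instead of failing at runtime
def greet(name: str) -> str:
    return 'Hello, ' + name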

Statically typed languages: Scala

You can generate a starter “hello world” Scala project for Workers by running

wrangler generate my-scala-project https://github.com/cloudflare/scala-worker-hello-world

The Scala to JavaScript compiler we’re using is Scala.js. It has a plugin for the Scala build tool, so installing sbt and a JDK is all you’ll need.

Running sbt fullOptJS in the project directory will compile your Scala code to a single index.js file. The build configuration in build.sbt is set up to output to the root of the project, where Wrangler expects to find an index.js file. After that you can run wrangler publish as normal.
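The essential part of that configuration looks roughly like the following sketch; the exact setting names can vary across sbt and Scala.js versions, so treat the template’s build.sbt as authoritative:

// build.sbt (sketch)
enablePlugins(ScalaJSPlugin)

// Emit the fully optimized bundle as index.js in the project root,
// where Wrangler expects to find it
artifactPath in (Compile, fullOptJS) := baseDirectory.value / "index.js"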

Scala.js uses the Google Closure Compiler to optimize for size when running fullOptJS. For the hello world, the file size is 14k. A more realistic project involving async fetch weighs in around 100k, still well within Workers limits.

In order to take advantage of static type checking, you’re going to need type signatures for the JavaScript APIs you use. There are existing Scala signatures for fetch and service worker related APIs. You can see those being imported in the entry point for the Worker, Main.scala:

import org.scalajs.dom.experimental.serviceworkers.{FetchEvent}
import org.scalajs.dom.experimental.{Request, Response, ResponseInit}
import scala.scalajs.js

The import of scala.scalajs.js allows easy access to Scala equivalents of JavaScript types, such as js.Array or js.Dictionary. The remainder of Main looks fairly similar to a Typescript Worker hello world, with syntactic differences such as Unit instead of Void and square brackets instead of angle brackets for type parameters:

object Main {
  def main(args: Array[String]): Unit = {
    Globals.addEventListener("fetch", (event: FetchEvent) => {
      event.respondWith(handleRequest(event.request))
    })
  }

  def handleRequest(request: Request): Response = {
    new Response("Scala Worker hello world", ResponseInit(
        _headers = js.Dictionary("content-type" -> "text/plain")))
  }
}  

Request, Response and FetchEvent are defined by the previously mentioned imports. But what’s this Globals object? There are some Worker-specific extensions to JavaScript APIs. You can handle these in a statically typed language by either automatically converting existing Typescript type definitions for Workers or by writing type signatures for the features you want to use. Writing the type signatures isn’t hard, and it’s good to know how to do it, so I included an example in Globals.scala:

import scalajs.js
import js.annotation._

@js.native
@JSGlobalScope
object Globals extends js.Object {
  def addEventListener(`type`: String, f: js.Function): Unit = js.native
}

The annotation @js.native indicates that the implementation is in existing JavaScript code, not in Scala. That’s why the body of the addEventListener definition is just js.native. In a JavaScript Worker you’d call addEventListener as a top-level function in global scope. Here, the @JSGlobalScope annotation indicates that the function signatures we’re defining are available in the JavaScript global scope.

You may notice that the type of the function passed to addEventListener is just js.Function, rather than specifying the argument and return types. If you want more type safety, this could be done as js.Function1[FetchEvent, Unit].  If you’re trying to work quickly at the expense of safety, you could use def addEventListener(any: Any*): Any to allow anything.
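Putting that together, a fully typed version of the same facade would look like this sketch:

import org.scalajs.dom.experimental.serviceworkers.FetchEvent
import scalajs.js
import js.annotation._

@js.native
@JSGlobalScope
object Globals extends js.Object {
  // The listener must now accept a FetchEvent and return Unit
  def addEventListener(`type`: String,
                       f: js.Function1[FetchEvent, Unit]): Unit = js.native
}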

For more information on defining types for JavaScript interfaces, see the Scala.js docs.

Using Workers KV and async Promises

Let’s take a look at a more realistic example using Workers KV and asynchronous calls. The idea for the project is our own HTTP API to store and retrieve text values. For simplicity’s sake I’m using the first slash-separated component of the path for the key, and the second for the value. Usage of the finished project looks like PUT /meaning of life/42 or GET /meaning of life/:

bash$ curl -w "\n" -X PUT 'https://scala-kv-example.cody.workers.dev/meaning of life/42'

bash$ curl -w "\n" -X GET 'https://scala-kv-example.cody.workers.dev/meaning of life/'
42

The first thing I need is to add type signatures for the parts of the KV API that I’m using, in Globals.scala. My KV namespace binding in wrangler.toml is just going to be named KV, resulting in a corresponding global object:

object Globals extends js.Object {
  def addEventListener(`type`: String, f: js.Function): Unit = js.native
  
  val KV: KVNamespace = js.native
}

So what’s the definition of the KVNamespace type? It’s an interface, so it becomes a Scala trait with a @js.native annotation. The only methods I need to add right now are the simple versions of KV.get and KV.put that take and return strings. The return values are asynchronous, so they’re wrapped in a js.Promise. I’ll make that wrapped string a type alias, KVValue, just in case we want to deal with the array or stream return types in the future:

object KVNamespace {
  type KVValue = js.Promise[String]
}

@js.native
trait KVNamespace extends js.Object {
  import KVNamespace._
  
  def get(key: String): KVValue = js.native
  
  def put(key: String, value: String): js.Promise[Unit] = js.native
}

With type signatures complete, I’ll move on to Main.scala and how to handle interaction with JavaScript Promises. It’s possible to use js.Promise directly, but I’d prefer to use Scala semantics for asynchronous Futures. The methods toJSPromise and toFuture from js.JSConverters can be used to convert back and forth:

  def get(key: String): Future[Response] = {
    Globals.KV.get(key).toFuture.map { (value: String) =>
        new Response(value, okInit)
    } recover {
      case err =>
        new Response(s"error getting a value for '$key': $err", errInit)
    }
  }

The function for putting values makes similar use of toFuture to convert the return value from KV into a Future. I use map to transform the value into a Response, and recover to handle failures. If you prefer async / await syntax instead of using combinators, you can use scala-async.
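For example, a version of put written that way might look like the following sketch (this assumes the scala-async dependency is added to the build; note that scala-async does not allow await inside a try/catch, so failures are still handled with recover):

import scala.async.Async.{async, await}
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.scalajs.js.JSConverters._

def put(key: String, value: String): Future[Response] = async {
  // Suspend until KV acknowledges the write, then build the response
  await(Globals.KV.put(key, value).toFuture)
  new Response(s"stored a value for '$key'", okInit)
} recover {
  case err =>
    new Response(s"error storing a value for '$key': $err", errInit)
}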

Finally, the new definition for handleRequest is a good example of how pattern matching makes code more concise and less error-prone at the same time. We match on exactly the combinations of HTTP method and path components that we want, and default to an informative error for any other case:

  def handleRequest(request: Request): Future[Response] = {
    (request.method, request.url.split("/")) match {
      case (HttpMethod.GET, Array(_, _, _, key)) =>
        get(key)
      case (HttpMethod.PUT, Array(_, _, _, key, value)) =>
        put(key, value)
      case _ =>
        Future.successful(
          new Response("expected GET /key or PUT /key/value", errInit))
    }
  }

You can get the complete code for this example by running

wrangler generate projectname https://github.com/cloudflare/scala-worker-kv

How to contribute

I’m a fan of programming languages, and will continue to add more Workers templates. You probably know your favorite language better than I do, so pull requests are welcome for a simple hello world or more complex example.

And if you’re into programming languages check out the latest language rankings from RedMonk where Python is the first non-Java or JavaScript language ever to place in the top two of these rankings.

Stay tuned for the rest of Serverless Week!

The Migration of Legacy Applications to Workers

Post Syndicated from Kirk Schwenkler original https://blog.cloudflare.com/the-migration-of-legacy-applications-to-workers/

As Cloudflare Workers, and other Serverless platforms, continue to drive down costs while making it easier for developers to stand up globally scaled applications, the migration of legacy applications is becoming increasingly common. In this post, I want to show how easy it is to migrate such an application onto Workers. To demonstrate, I’m going to use a common migration scenario: moving a legacy application — on an old compute platform behind a VPN or in a private cloud — to a serverless compute platform behind zero-trust security.

Wait but why?

Before we dive further into the technical work, however, let me just address up front: why would someone want to do this? What benefits would they get from such a migration? In my experience, there are two sets of reasons: (1) factors that are “pushing” off legacy platforms, or the constraints and problems of the legacy approach; and (2) factors that are “pulling” onto serverless platforms like Workers, which speaks to the many benefits of this new approach. In terms of the push factors, we often see three core ones:

  • Legacy compute resources are not flexible and must be constantly maintained, often leading to capacity constraints or excess cost;
  • Maintaining VPN credentials is cumbersome, and can introduce security risks if not done properly;
  • VPN client software can be challenging for non-technical users to operate.

Similarly, there are some very key benefits “pulling” folks onto Serverless applications and zero-trust security:

  • Instant scaling, up or down, depending on usage. No capacity constraints, and no excess cost;
  • No separate credentials to maintain, users can use Single Sign On (SSO) across many applications;
  • VPN hardware, private cloud infrastructure, and existing compute can be retired to simplify operations and reduce cost

While the benefits of this more modern end-state are clear, there’s one other thing that causes organizations to pause: the costs in time and migration effort seem daunting. In practice, though, organizations often find that migration is not as difficult as they fear. In the rest of this post, I will show you how Cloudflare Workers, and the rest of the Cloudflare platform, can greatly simplify migrations and help you modernize all of your applications.

Getting Started

To take you through this, we will use a contrived application I’ve written in Node.js to illustrate the steps we would take with a real, and far more complex, example. The goal is to show the different tools and features you can use at each step, and how our platform design supports development and cutover of an application. We’ll use four key Cloudflare technologies as we see how to move this application off of my laptop and into the cloud:

  1. Serverless Compute through Workers
  2. Robust Developer-focused Tooling for Workers via Wrangler
  3. Zero-Trust security through Access
  4. Instant, Secure Origin Tunnels through Argo Tunnels

Our example application for today is called Post Process, and it performs business logic on input provided in an HTTP POST body. It takes the input data from authenticated clients, performs a processing task, and responds with the result in the body of an HTTP response. The server runs in Node.js on my laptop.

Since the example application is written in Node.js, we will be able to directly copy some of the JavaScript assets for our new application. You could follow this “direct port” method not only for JavaScript applications, but also for applications in our other WASM-supported languages. For other languages, you first need to rewrite or transpile them into one with WASM support.

Into our Application

Our basic example will perform only simple text processing, so that we can focus on the broad features of the migration. I’ve set up an unauthenticated copy (using Workers, to give us a scalable and reliable place to host it) at https://postprocess-workers.kirk.workers.dev/postprocess where you can see how it operates. Here is an example cURL:

curl -X POST https://postprocess-workers.kirk.workers.dev/postprocess --data '{"operation":"2","data":"Data-Gram!"}'

The relevant takeaways from the code itself are pretty simple:

  • There are two code modules, which conveniently split the application logic completely from the Preprocessing / HTTP interface.
  • The application logic module exposes one function postProcess(object) where object is the parsed JSON of the POST body. It returns a JavaScript object, ready to be encoded into a string in the JSON HTTP response. This module can be run on Workers JavaScript, with no changes. It only needs a new preprocessing / HTTP interface.
  • The Preprocessing / HTTP interface runs on raw Node.js and exposes a local HTTP server. The server does not directly take inbound traffic from the Internet, but sits behind a gateway which controls access to the service.

Code snippet from Node.js HTTP module

const server = http.createServer((req, res) => {
    if (req.url == '/postprocess') {
        if (req.method == 'POST') {
            gatherPost(req, data => {
                let jsonData
                try {
                    jsonData = JSON.parse(data)
                } catch (e) {
                    res.end('Invalid JSON payload! \n')
                    return
                }
                const result = postProcess(jsonData)
                res.write(JSON.stringify(result) + '\n');
                res.end();
            })
        } else {
            res.end('Invalid method, only POST is supported! \nPlease send a POST with data in format {"operation":"1","data":"Data-Gram!"} \n')
        }
    } else {
        res.end('Invalid request. Did you mean to POST to /postprocess? \n');
    }
});

Code snippet from Node.js logic module

function postProcess (postJson) {
        const ServerVersion = "2.5.17"
        if (postJson != null && 'operation' in postJson && 'data' in postJson) {
                var output
                var operation = postJson['operation']
                var data = postJson['data']
                switch (operation) {
                        case "1":
                              output = String(data).toLowerCase()
                              break
                        case "2":
                              var d = data + "\n"
                              output = d + d + d
                              break
                        case "3":
                              output = ServerVersion
                              break
                        default:
                              output = "Invalid Operation"
                }
                return {'Output': output}
        }
        else {
                return {'Error': 'Invalid request, invalid JSON format'}
        }
}

Current State Application Architecture

The Migration of Legacy Applications to Workers

Design Decisions

With all this information in hand, we can arrive at the details of our new Cloudflare-based design:

  1. Keep the business logic completely intact, and specifically use the same .js asset
  2. Build a new preprocessing layer in Workers to replace the Node.js module
  3. Use Cloudflare Access to authenticate users to our application

Target State Application Architecture

The Migration of Legacy Applications to Workers

Finding the first win

One good way to make a migration successful is to find a quick win early on; a useful task which can be executed while other work is still ongoing. It is even better if the quick win also benefits the eventual cutover. We can find a quick win here, if we solve the zero-trust security problem ahead of the compute problem by putting Cloudflare’s security in front of the existing application.

We will do this by using Cloudflare’s Argo Tunnel feature to securely connect to the existing application, and Access for zero-trust authentication. Below, you can see how easy this process is for any command-line user, with our cloudflared tool.

I open up a terminal and use cloudflared tunnel login, which presents me with an authentication flow. I then use the cloudflared tunnel --hostname postprocess.kschwenkler.com --url localhost:8080 command to connect an Argo Tunnel between the “url” (my local server) and the “hostname” (the new, public address we will use on my Cloudflare zone).

The Migration of Legacy Applications to Workers

Next I flip over to my Cloudflare dashboard, and attach an Access Policy to the “hostname” I specified before. We will be using the Service Token mode of Access, which generates a client-specific security token that the client can attach to each HTTP POST. Other modes are better suited to interactive browser use cases.
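
With the token created, a client attaches it to each request as a pair of HTTP headers. As a sketch, using the hostname from above and placeholder credentials, a POST through Access looks like this:

curl -X POST https://postprocess.kschwenkler.com/postprocess \
  -H "CF-Access-Client-Id: <service-token-client-id>" \
  -H "CF-Access-Client-Secret: <service-token-client-secret>" \
  --data '{"operation":"2","data":"Data-Gram!"}'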

The Migration of Legacy Applications to Workers

Now, without using the VPN, I can send a POST to the service, still running on Node.js on my laptop, from any Internet-connected device which has the correct token! It has taken only a few minutes to add zero-trust security to this application and safely expose it to the Internet while still running on a legacy compute platform (my laptop!).

The Migration of Legacy Applications to Workers

“Quick Win” Architecture

The Migration of Legacy Applications to Workers

Beyond the direct benefit of a huge security upgrade, we’ve also made our eventual application migration much easier by putting the traffic through the target-state API gateway already. We will see later how, in this state, we can surgically move traffic to the new application for testing.

Lift to the Cloud

With our zero-trust security benefits in hand, and our traffic running through Cloudflare, we can now proceed with the migration of the application itself to Workers. We’ll be using the Wrangler tooling to make this process very easy.

As noted when we first looked at the code, this contrived application exposes a very clean interface between the Node.js-specific HTTP module, which we need to replace, and the business logic postprocess module which we can use as is with Workers. We’ll first need to re-write the HTTP module, and then bundle it with the existing business logic into a new Workers application.

Here is a handwritten example of the basic pattern we’ll use for the HTTP module. We can see how the Service Workers API makes it very easy to grab the POST body with await, and how the JSON interface lets us easily pass the data to the postprocess module we took directly from the initial Node.js app.

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  let requestData
  try {
    requestData = await request.json()
  } catch (e) {
    return new Response("Invalid JSON", { status: 500 })
  }
  return new Response(JSON.stringify(postProcess(requestData)))
}

For our work on the mock application, we will go a slightly different route, more in line with a real application, which would be more complex. Instead of writing this by hand, we will use Wrangler and our Router template to build the new front end from a robust framework.

We’ll run wrangler generate post-process-workers https://github.com/cloudflare/worker-template-router to initialize a new Wrangler project with the Router template. Most of the configurations for this template will work as is; we just have to update account_id in our wrangler.toml and make a few small edits to the code in index.js.

Below is our index.js after my edits. Note the line const postProcess = require('./postProcess.js') at the start of the new HTTP module – this tells Wrangler to include the original business logic from the legacy app’s postProcess.js module, which I will copy to our working directory.

const Router = require('./router')
const postProcess = require('./postProcess.js')

addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request))
})

async function handler(request) {
    const init = {
        headers: { 'content-type': 'application/json' },
    }
    const body = JSON.stringify(postProcess(await request.json()))
    return new Response(body, init)
}

async function handleRequest(request) {
    const r = new Router()
    r.post('.*/postprocess*', request => handler(request))
    r.get('/', () => new Response('Hello worker!')) // return a default message for the root route

    const resp = await r.route(request)
    return resp
}

Now we can simply run wrangler publish to put our application on workers.dev for testing! The Router template’s defaults, and the small edits made above, are all we need. Since Wrangler automatically exposes the test application to the Internet (note that we can *also* put the test application behind Access, with a slightly modified method), we can easily send test traffic from any device.

The Migration of Legacy Applications to Workers

Shift, Safely!

With our application up for testing on workers.dev, we finally come to the last and most daunting migration step: cutting over traffic from the legacy application to the new one without any service interruption.

Luckily, we had our quick win earlier and are already routing our production traffic through the Cloudflare network (to the legacy application via Argo Tunnels). This provides huge benefits now that we are at the cutover step. Without changing our IP address, SSL configuration, or any other client-facing properties, we can route traffic to the new application with just one wrangler command.

Seamless cutover from Transition to Target state

The Migration of Legacy Applications to Workers

We simply modify wrangler.toml to indicate the production domain / route we’d like the application to operate on, and run wrangler publish again. As soon as Cloudflare receives this update, it will send production traffic to our new application instead of the Argo Tunnel. We have configured the application to send a ‘version’ header, which lets us verify this easily using curl.
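
As a sketch, assuming the hostname from earlier and placeholder account and zone IDs, the wrangler.toml for the cutover looks something like this:

name = "post-process-workers"
type = "webpack"
account_id = "<account-id>"
workers_dev = false
route = "postprocess.kschwenkler.com/*"
zone_id = "<zone-id>"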

The Migration of Legacy Applications to Workers
The Migration of Legacy Applications to Workers

Rollback, if it is needed, is also very easy. We can either set the wrangler.toml back to the workers.dev-only mode and run wrangler publish again, or delete our route manually. Either will send traffic back to the Argo Tunnel.

The Migration of Legacy Applications to Workers
The Migration of Legacy Applications to Workers
The Migration of Legacy Applications to Workers

In Conclusion

Clearly, a real application will be more complex than our example above. It may have multiple components, with complex interactions, which must each be handled in turn. Argo Tunnel might remain in use, to connect to a data store or other application outside of our network. We might use WASM to support modules written in other languages. In any of these scenarios, Cloudflare’s Wrangler tooling and Serverless capabilities will help us work through the complexities and achieve success.

I hope that this simple example has helped you to see how Wrangler, cloudflared, Workers, and our entire global network can work together to make migrations as quick and hassle-free as possible. Whether for this case of an old application behind a VPN, or another application that has outgrown its current home – our Workers platform, Wrangler tooling, and underlying platform will scale to meet your business needs.

Replacing web server functionality with serverless services

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/replacing-web-server-functionality-with-serverless-services/

Web servers bring together many useful services in traditional web development. Developers use servers like Apache and NGINX for many common tasks. Linux, Apache, MySQL, and PHP formed the LAMP stack to power a large percentage of the world’s websites. Other variants, like the MEAN stack (MongoDB, Express.js, AngularJS, Node.js), have also been popular.

In the migration to serverless, it’s important to understand where this functionality moves to. There are significant benefits in taking a serverless approach to developing web apps but there are differences in where developers spend their efforts. This blog post provides a guide to serverless development for traditional web developers to help with this transition.

Comparing a “Hello World” example

To run a “Hello World” example in a highly available configuration, using a traditional webserver approach you need more than one server in more than one Availability Zone. This server contains an operating system, runtime, and web server software, together with your code. You might build an Amazon Machine Image (AMI) to help with creating more servers.

Scalable "Hello World"

With a web framework like Express, the following code starts a server and listens on port 3000 for connections. For requests at the root URL, it responds with the “Hello World” greeting:
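
A minimal sketch of that Express server:

const express = require('express')
const app = express()

// Respond to requests at the root URL with the greeting
app.get('/', (req, res) => res.send('Hello World!'))

// Listen on port 3000 for connections
app.listen(3000)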

Hello World output

There is a reasonable amount of configuration and infrastructure needed to make this example work. Even creating a TLS connection requires you to maintain a certificate or install and maintain a service like Let’s Encrypt. Additionally, you must patch and maintain the underlying EC2 instance to keep this service running once it’s deployed.

The serverless equivalent is simpler. I can define the Hello World example using an AWS Serverless Application Model (SAM) template:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: hello-world

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function 
    Properties:
      Handler: index.lambdaHandler
      Runtime: nodejs12.x
      InlineCode: | 
        exports.lambdaHandler = async (event, context) => {
          return { 'statusCode': 200, 'body': 'Hello World!' }
        }
      Events:
        HelloWorld:
          Type: Api 
          Properties:
            Path: /hello
            Method: get

The SAM deployment creates an AWS Lambda function with an Amazon API Gateway endpoint:

Serverless Hello World

This is a highly available, scalable endpoint. The developer does not need to define VPCs, subnets or security groups, or install and manage a web server stack. A considerable part of the underlying infrastructure is managed for you, letting you focus primarily on the business logic of the application.

Additionally, using the default Service Quotas, this endpoint can handle millions of requests a day. To handle the equivalent load with a traditional web server, you may need EC2 Auto Scaling. Lambda manages the scaling automatically, and also scales down as needed without any intervention from the developer.

Implementing authentication in serverless web apps

Many traditional web servers use web frameworks like Python Flask or Express and implement session-based authentication. This allows the server to authenticate users, often with a user name and password validation scheme. The server is responsible for storing user lists, and hashing and salting passwords securely. There are also user administration flows required for tasks such as creating accounts and resetting passwords.

While you can implement all these within a Lambda function, there is another approach that can be more secure and reduce boilerplate code. You can implement authorization and authentication in serverless development by using open standard JSON Web Tokens (JWTs). API Gateway then authenticates the user at the service level using Amazon Cognito, a Lambda authorizer, or a JWT authorizer (with HTTP APIs).

You use an identity provider such as Amazon Cognito or Auth0 to generate the user token. You pass the token in the API request in the Authorization header. The API Gateway service then validates the token before the request is sent downstream to your application.
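
For example, a client might attach the token to a request like this; the API URL here is a placeholder, and the token is whatever your identity provider issued at sign-in:

async function fetchItems(token) {
  // The JWT travels in the Authorization header; API Gateway validates it
  // before the request reaches your application code
  const response = await fetch('https://api.example.com/items', {
    headers: { Authorization: `Bearer ${token}` },
  })
  return response.json()
}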

While you can use JWTs in server-based web applications, there are benefits to separating out this functionality using serverless services:

  • Failed requests do not put any additional load on your infrastructure. API Gateway also does not charge on authenticated routes when authorization headers are missing.
  • You eliminate custom code for handling and processing logins since this happens before reaching your business logic.
  • You can add support for social logins, multi-factor authentication (MFA) and OAuth without changing your code.

Additionally, as your application grows to more functions or across Regions, you are not relying on a single authentication point in your architecture. Each microservice validates a JWT independently and can verify the authorization claims that can be securely embedded in the token’s payload.

For web developers, one of the most common questions is how to handle the user interface elements related to authorization within the application. Auth0 offers a number of customizable components that you can integrate into any JavaScript application. Amplify Framework provides the Authenticator component that provides a wrapper for common flows for signing in users.

Amplify signin UI

Using either approach eliminates boilerplate user management code and helps provide a consistent and professional login experience for your users. To learn more about using Auth0’s integrated sign-in, see the Ask Around Me application code repo.

Generating HTML, CSS and front-end templates

Many web frameworks use templating languages like Jinja or Mustache to help developers inject dynamic content into static HTML and CSS layouts. Typically, the web server creates the entire page layout for each request. You can use the same approach with Lambda if preferred, having the function build the HTML response for the browser.
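
As a sketch of that approach, a Lambda function behind API Gateway can assemble and return the HTML itself; the handler name and greeting are illustrative:

// Minimal sketch: render the page server-side and return it as HTML
exports.lambdaHandler = async (event) => {
  const params = event.queryStringParameters || {}
  const name = params.name || 'World'
  return {
    statusCode: 200,
    headers: { 'content-type': 'text/html' },
    body: '<html><body><h1>Hello ' + name + '!</h1></body></html>',
  }
}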

However, single-page application (SPA) frameworks such as React, Vue.js, and AngularJS offer a different paradigm that works well for serverless development. The build process for SPA applications generates static HTML, CSS, and JavaScript files. When downloaded to the browser, they use JavaScript to fetch dynamic data and interact with the backend application:

SPA backend architecture

  1. The user visits the web application’s URL. The browser downloads the application’s HTML, CSS, and JavaScript files from Amazon S3 via Amazon CloudFront.
  2. The browser executes the application’s JavaScript.
  3. The application calls API Gateway endpoints to fetch and store dynamic data.

This architecture offers a number of benefits. First, serving the application’s assets is offloaded from your infrastructure to a global CDN. This reduces latency and increases scalability. Second, the HTML page building and rendering is managed entirely by the client browser, improving responsiveness and reducing network traffic with the application backend.

Uploading, processing, and saving binary files

Many web applications handle large binary files, such as user uploads. Processing these on web servers can be compute and network-intensive. You must also manage the amount of temporary space in use on the web server, and scale the fleet of servers appropriately during busy periods.

You can upload files serverlessly by using Amazon S3 directly. In this process, you request a presigned URL and upload the binary data directly to this endpoint. This reduces load on your infrastructure and increases scalability. The code is also simple to adapt for non-serverless applications that use S3. Watch this video to see how you can build an S3 uploader solution.
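
Here is a sketch of the server-side half using the AWS SDK for JavaScript v3; the bucket name is illustrative:

const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3')
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner')

const s3 = new S3Client({})

// Returns a time-limited URL the browser can PUT the file to directly
async function getUploadUrl(key) {
  const command = new PutObjectCommand({ Bucket: 'my-uploads-bucket', Key: key })
  return getSignedUrl(s3, command, { expiresIn: 300 })
}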

For processing binaries, you can use the S3 PutObject event to trigger serverless workflows. For example, you can process images, translate documents, or transcribe audio. For complex business workflows, the event can trigger AWS Step Functions workflows. This is a highly scalable way to bring automation and custom processing to binary uploads in your web applications.

When processing binary data, Lambda provides a 512 MB temporary file system (located at /tmp). You use this space for intermediate processing, not permanent storage, since the storage is ephemeral. For example, this can be useful for unzipping files or creating PDFs.
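
For instance, a function might stage intermediate output in /tmp; this is only a sketch:

const fs = require('fs')
const path = require('path')

exports.lambdaHandler = async () => {
  // /tmp is the only writable filesystem path, and it is ephemeral:
  // treat it as scratch space, not storage
  const scratch = path.join('/tmp', 'intermediate.json')
  fs.writeFileSync(scratch, JSON.stringify({ step: 'processing' }))
  return JSON.parse(fs.readFileSync(scratch, 'utf8'))
}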

When saving files permanently in serverless applications, S3 buckets are the most common storage choice. S3 is highly durable and highly available, provides robust encryption options, and is a scalable, cost-effective solution for many workloads.

Storing application state

In many traditional applications, the web server stores temporary, context-specific application state, and a relational database stores data permanently.

Serverless tools have a range of different options available for managing state. Lambda functions are ephemeral and stateless, and there is no guarantee of reusing the same instance of a Lambda function multiple times.

For functions that need a durable store of user data that can be rehydrated between invocations, Amazon DynamoDB tables provide a low-latency, cost-effective solution. For example, this is ideal for recalling shopping cart contents or user profiles.
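
A sketch of rehydrating a shopping cart between invocations, assuming a "carts" table keyed on userId:

const { DynamoDBClient } = require('@aws-sdk/client-dynamodb')
const { DynamoDBDocumentClient, GetCommand } = require('@aws-sdk/lib-dynamodb')

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}))

// Fetch the user's cart, or return an empty one on first visit
async function getCart(userId) {
  const { Item } = await ddb.send(
    new GetCommand({ TableName: 'carts', Key: { userId } })
  )
  return Item || { userId, items: [] }
}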

For more complex state, tracking long-lived or complex business workflows, the best practice is to use AWS Step Functions. You can model workflows in JSON that use parallel tasks, require human interaction, or take up to one year to complete.

Conclusion

In this post, I show how traditional web-server applications compare with their serverless counterparts. I show how the infrastructure is managed for you in serverless, and how code for serverless developers is primarily focused on business logic.

I look at how common web server tasks, such as authentication and authorization, are managed by scalable services. In single-page applications, front-end layouts are generated on the client-side, and the distribution is managed by a global CDN.

To learn more about how to build web applications with serverless, see the Ask Around Me application repo.

JAMstack at the Edge: How we built Built with Workers… on Workers

Post Syndicated from Kristian Freeman original https://blog.cloudflare.com/jamstack-at-the-edge-how-we-built-built-with-workers-on-workers/

JAMstack at the Edge: How we built Built with Workers… on Workers

I’m extremely stoked to announce Built with Workers today – it’s an awesome resource for exploring what you can build with Cloudflare Workers. As Adam explained in our launch post, showcasing developers building incredible projects with tools like Workers KV or our streaming HTML rewriter is a great way to celebrate users of our platform. It also helps encourage developers to try building their dream app on top of Workers. In this post, I’ll explore some of the architectural and implementation designs we made while building the site.

When we first started planning Built with Workers, we wanted to use the site as an opportunity to build a new greenfield application, showcasing the strength of the Workers platform. The Workers Developer Experience team is cross-functional: while we might spend most of our time improving our docs, or developing features for our command-line interface Wrangler, most of us have spent years developing on the web. The prospect of starting a new application is always fun, but in this instance, it was a prime chance to ask (and answer) the question, “If I could build this site on Workers with whatever tools I want, what would I choose?”

A guiding principle for the Workers platform is ease-of-use. The programming model is simple: it’s just JavaScript (or, via WASM, Rust, C, and C++), and you have complete control over the requests coming in and the requests going out from your Workers script. In the same way, while building Built with Workers, it was crucial to find a set of tools that could enable something like this throughout the process of building the entire application. To enable this, we’ve embraced JAMstack – a software stack comprised of JavaScript, APIs, and markup – with Built with Workers, deploying always up-to-date static builds of the site directly to the edge, using Workers Sites. Our framework of choice, Gatsby.js, provides a set of sane defaults to build a modern web application. To manage content and the layout of the site, we’ve chosen Sanity.io, a powerful headless CMS that allows us to model the entire website without needing to deploy any databases or spin up any additional infrastructure.

Personally, I’m excited about JAMstack as a methodology for building web applications because of this emphasis on reducing infrastructure: it’s incredibly similar to the motivations behind deploying serverless applications using Cloudflare Workers, and as we developed Built with Workers, we discovered a number of these philosophical similarities in JAMstack and Cloudflare Workers – exciting! To help encourage developers to explore building their own JAMstack applications on Workers, I’m also announcing today that we’ve made the Built with Workers codebase open-source on GitHub – you can check out how the application is developed, built and deployed from start to finish.

In this post, we’ll dig into Built with Workers, exploring how it works, the technical decisions we’ve made, and some of the most fascinating aspects of what it means to build applications on the edge.

JAMstack at the Edge: How we built Built with Workers… on Workers
A screenshot of the Built with Workers homepage

Exploring the JAMstack

My first encounter with tooling that would ultimately become part of “JAMstack” was in 2013. I noticed the huge proliferation of developers building personal “static” sites – taking blog posts written primarily in Markdown, and pushing them through frameworks like Jekyll to build full websites that could easily be deployed to a number of CDNs and file hosting platforms. These static sites were fast – they are just HTML, CSS, and JavaScript – and easy to update. The average developer spends their days maintaining large and complex software systems, so it was relaxing to just write Markdown, plug in some re-usable HTML and CSS, and deploy your website. The advent of static sites, of course, isn’t new – but after years of increasingly complex full-stack technology, the return to simplicity was a promising development for many kinds of websites that didn’t need databases, or any dynamic content.

In the last couple years, JAMstack has built upon that resurgence, and represents an approach to building complete, complex applications using the same tooling that has become the first choice for developers building their simple personal sites. JAMstack is comprised of three conceptual pieces – JavaScript, APIs, and Markup – each of which is a crucial aspect of simplifying our web applications and making them easy to write, build, and deploy.

J is for JavaScript

The JAMstack architecture relies heavily on the ubiquity of JavaScript as the language of the web. Many modern web applications use powerful, dynamic front-end frameworks like React and Vue to render user interfaces and process state on the client for users. On the backend, or in Workers’ case, on the edge, any dynamic functionality in your JAMstack application should be built on top of JavaScript, often working in the request-response model that full-stack developers are accustomed to.

The Workers platform is perfectly suited to this requirement! As a developer building on Workers, you have total control of incoming requests and outgoing responses, using the JavaScript Service Worker APIs you know and love. We built Workers Sites as an extension of the Workers platform (and Workers KV as a storage mechanism at the edge), making it possible to deploy your site assets using a single command in Wrangler: wrangler publish.

When your Workers Site receives a new request, we’ll execute JavaScript at the edge to look up a piece of content from Workers KV, and serve it back to the client at lightning speed. Remarkably, you can deploy JAMstack applications on Workers with no additional configuration besides generating your Workers Site. By design, Workers Sites is built to serve as an exceptional JAMstack deployment platform.

A is for APIs

The advent of static site tooling for personal sites makes sense: your site is a few pages: a few blog posts, for instance, and the classic “About” or “Contact” page. When it’s compiled to HTML, the footprint is quite small! This small footprint is what makes static sites easy to reason about: they’re trivial to host in terms of bandwidth and storage costs, and they rarely change, so they’re easily cacheable.

While that works for personal sites, complex applications actually have data requirements! We need to access user data in our databases, and analytics information in our data warehouses. JAMstack apps tackle this by definitively stating that these data sources should be accessible via HTTPS APIs, consumable by the application as a way to provide dynamic information to clients.

Workers is a fascinating platform in regards to JAMstack APIs. It can serve as a gateway to your data, or as a place to persist and return data itself. I can, for instance, expose an API endpoint via my Workers script without giving clients access to my origin. I can also use tooling like Workers KV to persist data directly on the edge, and when a user requests that data, I can resolve the data by returning JSON directly from my application.

This flexibility has been an unexpected part of the experience of developing Built with Workers. In a later section of this post, I’ll talk about how we developed an integral feature of the site based on the unique strengths of Workers as a way to host static assets and as a dynamic JavaScript execution platform. This has remarkable implications that blur the lines between classic static sites and dynamic applications, and I’m really excited about it.

M is for Markup

A breakthrough moment in my understanding of JAMstack came at the beginning of this year. I was working on a job board for frontend developers, using the static site framework Gatsby.js and Sanity.io, a headless CMS tool that allows developers to model content without maintaining a database or any infrastructure. (As a reminder – this set of tools is identical to what we ultimately used to develop Built with Workers. It’s a very good stack!)

SEO is crucial to a job board, and as I began to explore how to drive more traffic to my job board, I landed on the idea of generating a huge amount of search-oriented content automatically from the job data I already had. For instance, if I had job posts with the keywords “React”, “Europe”, and “Senior” (as in “senior developer”), what if I created pages with titles like “Senior React developer jobs in Europe”, or “Remote Angular jobs”? This approach would allow the site to begin ranking for a variety of job positions, locations, and experience levels, and as more jobs were posted on the site, each of these pages would be enriched with more useful information and relevant content, helping it rank higher in search.

“But static sites are… static!”, I told myself. Would I need to build an entire dynamic API on top of my static site, just to be able to serve these search-engine optimized pages? This led me to a “eureka” moment with Gatsby – I could define markup (the “M” in JAMstack), and when I’m building my site, I could look at all the available job data I had, cycling through every available keyword combination and inserting it into my markup to generate thousands of these pages. As I later learned, this idea is not necessarily unique to Gatsby – it is possible, for instance, to automate getting data from your API and writing it to data files in earlier static site frameworks like Hugo – but it is a first-class citizen in Gatsby. There are a ton of data sources available via Gatsby plugins, and because they’re all exposed via HTTPS, the workflow is standardized inside of the framework.

In Built with Workers, we connect to the Sanity.io CMS instance at build-time: crucially, by the time that the site has been deployed to Workers, the application effectively has no idea what Sanity even is! Our Gatsby application connects to Sanity.io via an HTTPS API, and using GraphQL, we look at all the data that we have in our CMS, and make decisions about what pages to generate and how to render the site’s interface, ultimately resulting in a statically-built application that is derived from dynamic data.

This emphasis on the build step in JAMstack is quite different from the classic web application. In the past, a user requested data, a web server looked at what the user was requesting, and the user waited as the server fetched that data and returned JSON, or interpolated it into templates written in tools like Pug or ERB. With JAMstack, the pages are already built, and the deployed application is just a collection of plain HTML, CSS, and JavaScript.

Why Cloudflare Workers?

Cloudflare’s network is a fascinating place to deploy JAMstack applications. Yes, Cloudflare’s edge network can act as a CDN for your static assets, like your CSS stylesheets, or your client-side JavaScript code. But with Workers, we now have the ability to run JavaScript side-by-side with our static assets. In most JAMstack applications, the CDN is simply a bucket where your application ends up. Usually, the CDN is the most boring part of the stack! With Cloudflare Workers, we don’t just have a CDN: we also have access to an extremely low-latency, fully-featured JavaScript runtime.

The implications of this on the standard JAMstack workflow are, frankly, mind-boggling, and as part of developing Built with Workers, we’ve been exploring what it means to have this runtime available side-by-side with our statically-built JAMstack application.

To demonstrate this, we’ve implemented a bookmarking feature, which allows users of Built with Workers to bookmark projects. If you look at a project’s usage of our streaming HTML rewriter and say “Wow, that’s cool!”, you can bookmark the project to show your support. This feature, rendered as a button tag, is deceptively simple: it’s a single piece of the user interface that makes use of the entirety of the Workers platform to provide user-specific dynamic functionality. We’ll explore this in greater detail later in the post – see “Enhancing static sites at the edge”.


A modern development and content workflow

In the announcement post for Workers Sites, Rita outlined the motivations behind launching Workers Sites as a modern way to deploy sites:

“Born on the edge, Workers Sites is what we think modern development on the web should look like, natively secure, fast, and massively scalable. Less of your time is spent on configuration, and more of your time is spent on your code, and content itself.”

A few months later, I can say definitively that Workers Sites has enabled us to develop Built with Workers and spend almost no time on configuration. Using our GitHub Action for deploying Workers applications with Wrangler, the site has been continuously deploying to a staging environment for the past couple weeks. The simplicity around this continuous deployment workflow has allowed us to focus on the important aspects of the project: development and content.

The static site framework ecosystem is fairly competitive, but as we considered our options for this site, I advocated strongly for Gatsby.js. It’s an incredible tool for building JAMstack applications, with a great set of defaults for performant applications. It’s common to see Gatsby sites with Lighthouse scores in the upper 90s, and the decision to use React for implementing the UI makes it straightforward to onboard new developers if they’re familiar with React.

As I mentioned in a previous section, Gatsby shines at build-time. Gatsby’s APIs for creating pages during the build process based on API data are incredibly powerful, allowing developers to concretely define every statically-generated page on their web application, as well as any relevant data that needs to be passed in.

With Gatsby decided upon as our static site framework, we needed to evaluate where our content would live. Built with Workers has two primary data models, used to generate the UI:

  • Projects: websites, applications, and APIs created by developers using Cloudflare Workers. For instance, Built with Workers!
  • Features: features available on the Workers platform used to build applications. For instance, Workers KV, or our streaming HTML rewriter/parser.

Given these requirements, there were a number of potential approaches to take to store this data, and make it accessible. Keeping in line with JAMstack, we know that we probably should expose it via an HTTPS API, but from where? In what format?

As a full-stack developer who’s comfortable with databases, it’s easy to envision a world where we spin up a PostgreSQL instance, write a REST API, and write all kinds of fetch('/api/projects') calls to get the information we need. This method works, but we can do better! In the same way we built Workers Sites to simplify the deployment process, it was worthwhile to explore the JAMstack ecosystem and see what solutions exist for modeling data without being on the hook for more infrastructure.

Of the different tools in the ecosystem – databases, whether SQL or NoSQL, key-value stores (such as our own, Workers KV), etc. – the growth of “headless CMS” tools has made the largest impact on my development workflow.

On CSS Tricks, Chris Coyier wrote about the rise of headless CMS tools back in March 2016, and summarizes their function well:

[Headless CMSes are] very related to The Big Conversation™ on the web the last many years. How are we going to handle bringing Our Stuff™ all these different devices/screens/inputs.
Responsive design says "let’s let our design and media accommodate as much variation in screens as possible."
Progressive enhancement says "let’s make the functionality of this site work no matter what."
Designing for accessibility says "let’s ensure everyone can use this regardless of their capabilities as a person."
A headless CMS says "let’s not tie our data to any one way of doing things."

Using our headless CMS, Sanity.io, we can get every project inside our dataset, and call Gatsby’s createPage function to create a new page for each project, using a pre-defined project template file:

// gatsby-node.js

exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions;

  const result = await graphql(`
    {
      allSanityProject {
        edges {
          node {
            slug
          }
        }
      }
    }
  `);

  if (result.errors) {
    throw result.errors;
  }

  const {
    data: { allSanityProject }
  } = result;

  const projects = allSanityProject.edges.map(({ node }) => node);
  projects.forEach((node, _index) => {
    const path = `/built-with/projects/${node.slug}`;

    createPage({
      path,
      component: require.resolve("./src/templates/project.js"),
      context: { slug: node.slug }
    });
  });
};

Using Sanity to drive the content for Built with Workers has been a huge win for our team. We’re no longer constrained by code deploys to make changes to content on the site – we don’t need to make a pull request to create a new project, and edits to a project’s name or description aren’t constrained by someone with the ability to deploy the project. Instead, we can empower members of our team to log in directly to the CMS and make changes, and be confident that once the corresponding deploy has completed (see “The CDN is the deployment platform” below), their changes will be live on the site.

Dynamic JAMstack layouts

As our team got up and running with Sanity.io, we found that the flexibility of a headless CMS platform was useful not just for creating our original data requirements – projects and features – but in rethinking and innovating on how we actually format the application itself.

With our previous objective in mind – empowering non-technical folks to make changes to the site without deploying any code – we’ve also taken the entire homepage of Built with Workers and defined it as an instance of the “layout” data model in Sanity.io. By doing this, we can define corresponding “collections”, which are sets of projects. When a layout has many collections defined inside of the CMS, we can rapidly re-order, re-arrange, and experiment with new collections on the homepage, seeing the updated version of the site reflected immediately, and live on the production site within only a few minutes, after our continuous deployment process has finished.

JAMstack at the Edge: How we built Built with Workers… on Workers
Updating the layout of Built with Workers live from Sanity’s studio

With this work implemented, it’s easy to envision a world where our React code is purely concerned with rendering each individual aspect of the application’s interface – for instance, the project title component, or the “card” for an individual project – and the CMS drives the entire layout of the site. In the future, I’d like to continue exploring this idea in other pages in Built with Workers, including the project pages and any other future content we put on the site.

Enhancing static sites at the edge

Much of what we’ve discussed so far can be thought of as features and workflows that have great DX (developer experience), but are not specific to Workers. Gatsby and Sanity.io are great, and although Workers Sites is a great platform for deploying JAMstack applications due to the Workers platform’s low-latency and performance characteristics, you could deploy the site to a number of different providers with no real differentiating features.

As we began building Built with Workers as a JAMstack application, we also wanted to explore how the Workers platform could allow developers to combine the simplicity of static site deployments with the dynamism of having a JavaScript runtime immediately available.

In particular, our recently-released streaming HTML rewriter seems like a perfect fit for “enhancing” our static sites. Our application is being served by Workers Sites, which itself is a Workers template that can be customized. By passing each HTML page through the HTML rewriter on its way to the client, we had an opportunity to customize the content without any negative performance implications.

As I mentioned previously, we landed on a first exploration of this platform advantage via the “bookmark” button. Users of Built with Workers can “bookmark” a project – this action sends a request back up to the Workers application, storing the bookmark data as JSON in Workers KV.

// User-specific data stored in Workers KV, representing
// per-project bookmark information

{
  "bytesized_scraper_bookmarked": false,
  "web_scraper_bookmarked": true
}

When a user returns to Built with Workers, we can make a request to Workers KV, looking for corresponding data for that user and the project they’re currently viewing. If that data exists, we can embed the “edge state” directly into the HTML using the streaming HTML rewriter.

// workers-site/index.js

import { getAssetFromKV } from "@cloudflare/kv-asset-handler"

addEventListener("fetch", event => { 
  event.respondWith(handleEvent(event)) 
})

class EdgeStateEmbed {
  constructor(state) {
    this._state = state
  }
  
  element(element) {
    const edgeStateElement = `
      <script id='edge_state' type='application/json'>
        ${JSON.stringify(this._state)}
      </script>
    `
    element.prepend(edgeStateElement, { html: true })
  }
}

const hydrateEdgeState = async ({ state, response }) => {
  const rewriter = new HTMLRewriter().on(
    "body",
    new EdgeStateEmbed(await state)
  )
  return rewriter.transform(await response)
}
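
// Note: "options" and "transformBookmark" are assumed to be defined elsewhere
// in the project: "options" configures getAssetFromKV, and "transformBookmark"
// looks up the current user's bookmark state in Workers KV.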

async function handleEvent(event) {
  return hydrateEdgeState({
    response: getAssetFromKV(event, options),
    // Get associated state for a request, based on the user and URL
    state: transformBookmark(event.request),
  })
}

When the React application is rendered on the client, it can then check for that embedded edge state, which influences how the “bookmark” icon is rendered – either as “bookmarked” or “not bookmarked”. To support this, we’ve leaned on React’s useContext, which allows any component inside of the application component tree to pull out the edge state and use it inside of the component:

// edge_state.js

import React from "react"
import { useSSR } from "../utils"

const parseDocumentState = () => {
  const edgeStateElement = document.querySelector("#edge_state")
  return edgeStateElement ? JSON.parse(edgeStateElement.innerText) : {}
}

export const EdgeStateContext = React.createContext([{}, () => {}])
export const EdgeStateProvider = ({ children }) => {
  const { isBrowser } = useSSR()
  if (!isBrowser) {
    return <>{children}</>
  }
  
  const edgeState = parseDocumentState()
  const [state, setState] = React.useState(edgeState)
  const updateState = (newState, options = { immutable: true }) => options.immutable
    ? setState(Object.assign({}, state, newState))
    : setState(newState)
  
  return (
    <EdgeStateContext.Provider value={[state, updateState]}>
      {children}
    </EdgeStateContext.Provider>
  )
}

// Inside of a React component
const Bookmark = ({ bookmarked, project, setBookmarked, setLoaded }) => {
  const [state, setState] = React.useContext(EdgeStateContext)
  // `bookmarked` value is a simplification of actual code
  return <BookmarkButton bookmarked={state[project.id]} />
}

The combination of a straightforward JAMstack deployment platform with dynamic key-value object storage and a streaming HTML rewriter is really, really cool. This is an initial exploration into what I consider to be a platform-defining feature, and if you’re interested in this stuff and want to continue to explore how this will influence how we write web applications, get in touch with me on Twitter!

The CDN is the deployment platform

While it doesn’t appear in the acronym, an unsung hero of the JAMstack architecture is deployment. In my local terminal, when I run gatsby build inside of the Built with Workers project, the result is a folder of static HTML, CSS, and JavaScript. It should be easy to deploy!

The recent release of GitHub Actions has proven to be a great companion to building JAMstack applications with Cloudflare Workers – we’ve open-sourced our own wrangler-action, which allows developers to build their Workers applications and deploy directly from GitHub.

The standard workflows in the continuous deployment world – deploy every hour, deploy on new changes to the master branch, etc. – are possible and already being used by many developers who make use of our wrangler-action workflow in their projects. Particular to JAMstack and to headless CMS tools is the idea of “build-on-change”: namely, when someone publishes a change in Sanity.io, we want to do a new deploy of the site to immediately reflect our new content in production.
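
As a sketch, a deployment workflow combining these triggers might look like the following; the secret name, action version, and build script are placeholders:

name: Deploy

# Deploy on pushes to master, hourly, and on repository_dispatch events
# (the webhook below triggers the latter when CMS content changes)
on:
  push:
    branches:
      - master
  schedule:
    - cron: '0 * * * *'
  repository_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Build site
        run: npm ci && npm run build
      - name: Publish to Workers
        uses: cloudflare/wrangler-action@1.1.0
        with:
          apiToken: ${{ secrets.CF_API_TOKEN }}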

The versatility of Workers as a place to deploy JavaScript code comes to the rescue, again! By telling Sanity.io to make a GET request to a deployed Workers webhook, we can trigger a repository_dispatch event on GitHub Actions for our repository, allowing new deploys to happen immediately after a change is detected in the CMS:

const headers = {
  Accept: 'application/vnd.github.everest-preview+json',
  Authorization: 'Bearer $token',
}

const body = JSON.stringify({ event_type: 'repository_dispatch' })

const url = `https://api.github.com/repos/cloudflare/built-with-workers/dispatches`

const handleRequest = async () => {
  await fetch(url, { method: 'POST', headers, body })
  return new Response('OK')
}

// Respond via event.respondWith so the Worker returns our Response
addEventListener('fetch', event => {
  event.respondWith(handleRequest())
})

In doing this, we’ve made it possible to completely abstract away every deployment task around the Built with Workers project. Not only does the site deploy on a schedule, and on new commits to master, but it can also do additional deploys as the content changes, so that the site is always reflective of the current content in our CMS.

JAMstack at the Edge: How we built Built with Workers… on Workers
The GitHub Actions deployment workflow for Built with Workers

Conclusion

We’re super excited about Built with Workers, not only because it will serve as a great place to showcase the incredible things people are building with the Cloudflare Workers platform, but because it also has allowed us to explore what the future of web development may look like. I’ve been advocating for what I’ve seen referred to as “full-stack serverless” development throughout 2019, and I couldn’t be happier to start 2020 with launching a project like Built with Workers. The full-stack serverless stack feels like the future, and it’s actually fun to build with on a daily basis!

If you’re building something awesome with Cloudflare Workers, we’re looking for submissions to the site! Get in touch with us via this form – we’re excited to speak with you!

Finally, if topics like JAMstack on Cloudflare Workers, “edge state” and dynamic static site hydration, or continuous deployment interest you, the Built with Workers repository is open-source! Check it out, and if you’re inspired to build something cool with Workers after checking out the code, make sure to let us know!

JavaScript Libraries Are Almost Never Updated Once Installed

Post Syndicated from Zack Bloom original https://blog.cloudflare.com/javascript-libraries-are-almost-never-updated/

JavaScript Libraries Are Almost Never Updated Once Installed

Cloudflare helps run CDNJS, a very popular way of including JavaScript and other frontend resources on web pages. With the CDNJS team’s permission we collect anonymized and aggregated data from CDNJS requests which we use to understand how people build on the Internet. Our analysis today is focused on one question: once installed on a site, do JavaScript libraries ever get updated?

Let’s consider jQuery, the most popular JavaScript library on Earth. This chart shows the number of requests made for a selected list of jQuery versions over the past 12 months:

JavaScript Libraries Are Almost Never Updated Once Installed

Spikes in the CDNJS data as you see with version 3.3.1 are not uncommon as very large sites add and remove CDNJS script tags.

We see a steady rise of version 3.4.1 following its release on May 2nd, 2019. What we don’t see is a substantial decline of old versions. Version 3.2.1 shows an average popularity of 36M requests at the beginning of our sample, and 29M at the end, a decline of approximately 20%. This aligns with a corpus of research which shows the average website lasts somewhere between two and four years. Declines in old versions never come close to matching the growth of new versions when they’re released. In fact the release of 3.4.1, as popular as it quickly becomes, doesn’t change the trend of old version deprecation at all.

If you’re curious, the oldest version of jQuery CDNJS includes is 1.10.0, released on May 25, 2013. That version still gets an average of 100k requests per day, and the sites which use it are growing in popularity:

JavaScript Libraries Are Almost Never Updated Once Installed

To confirm our theory, let’s consider another project, TweenMax:

JavaScript Libraries Are Almost Never Updated Once Installed

As this package isn’t as popular as jQuery, the data has been smoothed with a one week trailing average to make it easier to identify trends.

Version 1.20.4 begins the year with 18M requests, and ends it with 14M, a decline of about 23%, again in alignment with the loss of websites on the Internet. The growth of 2.1.3 shows clear evidence that the release of a new version has almost no bearing on the popularity of old versions: the trend line for those older versions doesn’t change even as 2.1.3 grows to 29M requests per day.

JavaScript Libraries Are Almost Never Updated Once Installed

Clearly the new version cannot be replacing many of the legacy installations.

The clear conclusion is that whatever libraries you publish will exist on websites forever. The underlying web platform must consequently support aged conventions indefinitely if it is to continue supporting the full breadth of the web. Cloudflare is also, of course, very interested in how we can contribute to a web which is kept up-to-date. Please make suggestions in the comments below.

An Update on CDNJS

Post Syndicated from Zack Bloom original https://blog.cloudflare.com/an-update-on-cdnjs/

An Update on CDNJS

When you loaded this blog, a file was delivered to your browser called jquery-3.2.1.min.js. jQuery is a library which makes it easier to build websites, and was at one point included on as many as 74.1% of all websites. A full eighteen million sites include jQuery and other libraries using one of the most popular tools on Earth: CDNJS. About a month ago, Cloudflare began to take a more active role in the operation of CDNJS. This post is here to tell you more about CDNJS’ history and explain why we are helping to manage CDNJS.

What CDNJS Does

Virtually every site is composed of not just the code written by its developers, but also dozens or hundreds of libraries. These libraries make it possible for websites to extend what a web browser can do on its own. For example, libraries can allow a site to include powerful data visualizations, respond to user input, or even improve its performance.

These libraries created wondrous and magical new capabilities for web browsers, but they can also cause the size of a site to explode. Particularly a decade ago, connections were not always fast enough to permit the use of many libraries while maintaining performance. But if so many websites are all including the same libraries, why was it necessary for each of them to load their own copy?

If we all load jQuery from the same place the browser can do a much better job of not actually needing to download it for every site. When the user visits the first jQuery-powered site it will have to be downloaded, but it will already be cached on the user’s computer for any subsequent jQuery-powered site they might visit.

An Update on CDNJS

The first visit might take time to load:

An Update on CDNJS

But any future visit to any website pointing to this common URL would already be cached:

An Update on CDNJS

<!-- Loaded only on my site, will need to be downloaded by every user -->
<script src="./jquery.js"></script>

<!-- Loaded from a common location across many sites -->
<script src="https://cdnjs.cloudflare.com/jquery.js"></script>

Beyond the performance advantage, including files this way also made it very easy for users to experiment and create. When using a web browser as a creation tool, users often didn’t have elaborate build systems (this was also before npm), so being able to include a simple script tag was a boon. It’s worth noting that it’s not clear a massive performance advantage was ever actually provided by this scheme. It is becoming even less of a performance advantage now that browser vendors are beginning to use separate caches for each website you visit, but with millions of sites using CDNJS there’s no doubt it is a critical part of the web.

A CDN for all of us

My first Pull Request into the CDNJS project was in 2013. Back then, if you created a JavaScript project it wasn’t possible to have it included in the jQuery CDN, or the ones provided by large companies like Google and Microsoft. Those were only for big, important projects. Of course, even the biggest project starts small. The community needed a CDN which would agree to host nearly all JavaScript projects, even the ones which weren’t world-changing (yet). In 2011, Ryan Kirkman and Thomas Davis launched that project as CDNJS.

The project quickly became wildly successful, far beyond its founders’ expectations. Their CDN bill began to skyrocket (it would now be over a million dollars a year on AWS). Under the threat of having to shut down the service, the CDNJS team approached Cloudflare to see if we could help. We agreed to support their efforts and created cdnjs.cloudflare.com, which serves the CDNJS project free of charge.

CDNJS has been astonishingly successful. The project is currently installed on over eighteen million websites (10% of the Internet!), offers files totaling over 1.5 billion lines of code, and serves over 173 billion requests a month. CDNJS only gets more popular as sites get larger, with 34% of the top 10k websites using the service. Each month we serve almost three petabytes of JavaScript, CSS, and other resources which power the web via cdnjs.cloudflare.com.

An Update on CDNJS
Spikes can happen when a very large or popular site installs CDNJS, or when a misbehaving web crawler discovers a CDNJS link.

The future value of CDNJS is now in doubt, as web browsers are beginning to use a separate cache for every website you visit. It is currently used on such a wide swath of the web, however, that it is unlikely to disappear any time soon.

How CDNJS Works

CDNJS starts with a GitHub repo. That repo contains every file served by CDNJS, at every version it has ever offered. That’s 182 GB without the commit history, over five million files, and over 1.5 billion lines of code.

Given that it stores and delivers versioned code files, CDNJS was in many ways the Internet’s first JavaScript package manager. Unlike other package managers, and even other CDNs, everything CDNJS serves is publicly versioned. All 67,724 commits! This means you as a user can verify that you are being served files which haven’t been tampered with.
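That public versioning is what makes independent verification possible. As a minimal sketch (assuming Node.js and a locally saved copy of the file; the file name is illustrative and the comparison against the committed source is left to you), you can compute the same kind of digest used by Subresource Integrity and check it against the version in the repo:

// Sketch: compute an SRI-style digest of a library file with Node.js.
const crypto = require('crypto');
const fs = require('fs');

const file = fs.readFileSync('./jquery-3.2.1.min.js');
const digest = crypto.createHash('sha384').update(file).digest('base64');

// Printed in the format used by <script integrity="..."> attributes.
// A mismatch against the digest of the committed source means the file
// was altered somewhere along the way.
console.log(`sha384-${digest}`);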

To make changes to CDNJS, a commit has to be made. For new projects being added to CDNJS, or when projects change significantly, these commits are made by humans and reviewed by other humans. When projects simply release new versions, a bot made by Peter and maintained by Sven sucks up the changes from npm and automatically creates commits.
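To give a flavor of what such a bot does, here is a hedged sketch (not the actual bot’s code; the package name is a placeholder and the commit step is only outlined) of polling the public npm registry and comparing the latest published version against what CDNJS last committed:

// Sketch: check npm for a newer version of a package
// (Node.js 18+, which provides a global fetch).
async function checkForUpdate(pkg) {
  const res = await fetch(`https://registry.npmjs.org/${pkg}`);
  const meta = await res.json();
  const latest = meta['dist-tags'].latest;
  // ...compare against the last version committed to the CDNJS repo and,
  // if newer, fetch the release files and create a commit with them.
  return latest;
}

checkForUpdate('jquery').then((v) => console.log(`latest on npm: ${v}`));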

Within Cloudflare’s infrastructure there is a set of machines which are responsible for periodically pulling the latest version of the repo. Those machines then become the origin for cdnjs.cloudflare.com, with Cloudflare’s Global Load Balancer automatically handling failures. Cloudflare’s cache automatically stores copies of many of the projects, making it possible for us to deliver them quickly from all 195 of our data centers.
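As a simplified illustration of that origin-refresh loop (the repo path and interval here are invented for the example, not Cloudflare’s actual configuration):

// Sketch: periodically fast-forward a local clone of the CDNJS repo so
// this machine can act as an up-to-date origin.
const { execFile } = require('child_process');

setInterval(() => {
  execFile('git', ['-C', '/srv/cdnjs', 'pull', '--ff-only'], (err, stdout) => {
    if (err) {
      // An unhealthy origin can be routed around by the load balancer.
      console.error('repo refresh failed:', err.message);
      return;
    }
    console.log(stdout.trim());
  });
}, 5 * 60 * 1000); // e.g. every five minutes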

An Update on CDNJS

The Internet on a Shoestring Budget

The CDNJS project has always been administered independently of Cloudflare. Beyond the founders, the project has been maintained by exceptionally hard-working caretakers like Peter and Matt Cowley. Maintaining a single repo of nearly every frontend project on Earth is no small task, and it has required a substantial amount of both manual work and bot development.

Unfortunately, approximately thirty days ago one of those bots stopped working, preventing updated projects from appearing in CDNJS. The bot’s open-source maintainer was not able to invest the time necessary to keep it running. After several weeks, we were asked by the community and the CDNJS founders to take over maintenance of the CDNJS repo itself. This means the Cloudflare engineering team is taking responsibility for keeping the contents of github.com/cdnjs/cdnjs up to date, in addition to ensuring it is correctly served on cdnjs.cloudflare.com.

We agreed to do this because we were, frankly, scared. Like so many open-source projects, CDNJS was a critical part of our world but wasn’t getting the attention it needed to survive. The Internet relies on CDNJS as much as on any other single project; losing it, or allowing it to be commandeered, would be catastrophic to millions of websites and their visitors. If it began to fail, some sites would adapt and update, but others would be broken forever.

CDNJS has always been, and remains, a project for and by the community. We are invested in making all decisions in a transparent and inclusive manner. If you are interested in contributing to CDNJS, or in the topics we’re currently discussing, please visit the CDNJS GitHub Issues page.

An Update on CDNJS

A Plan for the Future

One example of an area where we could use your help is in charting a path towards a CDNJS which requires less manual moderation. Nothing can replace the intelligence and creativity of a human (yet), but for a task like managing what resources go into a CDN, manual review is error prone and time consuming. At present a human has to review every new project to be included, and often has to take additional steps to include new versions of a project.

As a part of our analysis of the project we examined a snapshot of the still-open PRs made against CDNJS for several months:

An Update on CDNJS

The vast majority of these PRs were changes which ultimately passed the automated review but nevertheless couldn’t be merged without manual review.

There is consensus that we should move to a model which does not require human involvement in most cases. We would love your input and collaboration on the best way to get there. If this is something you are passionate about, please contribute here.
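As a starting point for that discussion, here is one hedged sketch of a policy that could keep humans out of the loop for routine updates (the PR shape and predicate names are hypothetical, not an existing CDNJS API):

// Sketch: auto-merge a version-bump PR only when every automated
// safety condition holds; anything else still goes to a human.
function canAutoMerge(pr) {
  return (
    pr.checksPassed &&            // CI validated file names, sizes, and hashes
    pr.onlyAddsFiles &&           // nothing existing is modified or deleted
    pr.packageAlreadyInCdnjs &&   // the project was human-reviewed at first inclusion
    pr.filesMatchNpmTarball       // contents are byte-identical to the npm release
  );
}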

Our plan is to support the CDNJS project in whichever ways it requires for as long as the Internet relies upon it. We invite you to use CDNJS in your next project with the full assurance that it is backed by the same network and team who protect and accelerate over twenty million of your favorite websites across the Internet. We are also planning more posts diving further into the CDNJS data; subscribe to this blog if you would like to be notified when they are released.