Tag Archives: serverless

Troubleshooting Amazon API Gateway with enhanced observability variables

2020-09-04 Eric Johnson

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/troubleshooting-amazon-api-gateway-with-enhanced-observability-variables/

Amazon API Gateway is often used for managing access to serverless applications. Additionally, it can help developers reduce code and increase security with features like AWS WAF integration and authorizers at the API level.

Because more is handled by API Gateway, developers tell us they would like to see more data points on the individual parts of the request. This data helps developers understand each phase of the API request and how it affects the request as a whole. In response to this request, the API Gateway team has added new enhanced observability variables to the API Gateway access logs. With these new variables, developers can troubleshoot on a more granular level to quickly isolate and resolve request errors and latency issues.

The phases of an API request

API Gateway divides requests into phases, reflected by the variables that have been added. Depending upon the features configured for the application, an API request goes through multiple phases. The phases appear in a specific order as follows:

Phases of an API request

WAF: the WAF phase only appears when an AWS WAF web access control list (ACL) is configured for enhanced security. During this phase, WAF rules are evaluated and a decision is made on whether to continue or cancel the request.
Authenticate: the authenticate phase is only present when AWS Identity and Access Management (IAM) authorizers are used. During this phase, the credentials of the signed request are verified. Access is granted or denied based on the client’s right to assume the access role.
Authorizer: the authorizer phase is only present when a Lambda, JWT, or Amazon Cognito authorizer is used. During this phase, the authorizer logic is processed to verify the user’s right to access the resource.
Authorize: the authorize phase is only present when a Lambda or IAM authorizer is used. During this phase, the results from the authenticate and authorizer phase are evaluated and applied.
Integration: during this phase, the backend integration processes the request.

Each phrase can add latency to the request, return a status, or raise an error. To capture this data, API Gateway now provides enhanced observability variables based on each phase. The variables are named according to the phase they occur in and follow the naming structure, $context.phase.property. Therefore, you can get data about WAF latency by using $context.waf.latency.

Some existing variables have also been given aliases to match this naming schema. For example, $context.integrationErrorMessage has a new alias of $context.integration.error. The resulting list of variables is as follows:

Phases and variables for API Gateway requests

API Gateway provides status, latency, and error data for each phase. In the authorizer and integration phases, there are additional variables you can use in logs. The $context.phase.requestId provides the request ID from that service and the $context.phase.integrationStatus provide the status code.

For example, when using an AWS Lambda function as the integration, API Gateway receives two status codes. The first, $context.integration.integrationStatus, is the status of the Lambda service itself. This is usually 200, unless there is a service or permissions error. The second, $context.integration.status, is the status of the Lambda function and reports on the success or failure of the code.

A full list of access log variables is in the documentation for REST APIs, WebSocket APIs, and HTTP APIs.

A troubleshooting example

In this example, an application is built using an API Gateway REST API with a Lambda function for the backend integration. The application uses an IAM authorizer to require AWS account credentials for application access. The application also uses an AWS WAF ACL to rate limit requests to 100 requests per IP, per five minutes. The demo application and deployment instructions can be found in the Sessions With SAM repository.

Because the application involves an AWS WAF and IAM authorizer for security, the request passes through four phases: waf, authenticate, authorize, and integration. The access log format is configured to capture all the data regarding these phases:

{
  "requestId":"$context.requestId",
  "waf-error":"$context.waf.error",
  "waf-status":"$context.waf.status",
  "waf-latency":"$context.waf.latency",
  "waf-response":"$context.wafResponseCode",
  "authenticate-error":"$context.authenticate.error",
  "authenticate-status":"$context.authenticate.status",
  "authenticate-latency":"$context.authenticate.latency",
  "authorize-error":"$context.authorize.error",
  "authorize-status":"$context.authorize.status",
  "authorize-latency":"$context.authorize.latency",
  "integration-error":"$context.integration.error",
  "integration-status":"$context.integration.status",
  "integration-latency":"$context.integration.latency",
  "integration-requestId":"$context.integration.requestId",
  "integration-integrationStatus":"$context.integration.integrationStatus",
  "response-latency":"$context.responseLatency",
  "status":"$context.status"
}

Once the application is deployed, use Postman to test the API with a sigV4 request.

Configuring Postman authorization

To show troubleshooting with the new enhanced observability variables, the first request sent through contains invalid credentials. The user receives a 403 Forbidden error.

Client response view with invalid tokens

The access log for this request is:

{
    "requestId": "70aa9606-26be-4396-991c-405a3671fd9a",
    "waf-error": "-",
    "waf-status": "200",
    "waf-latency": "8",
    "waf-response": "WAF_ALLOW",
    "authenticate-error": "-",
    "authenticate-status": "403",
    "authenticate-latency": "17",
    "authorize-error": "-",
    "authorize-status": "-",
    "authorize-latency": "-",
    "integration-error": "-",
    "integration-status": "-",
    "integration-latency": "-",
    "integration-requestId": "-",
    "integration-integrationStatus": "-",
    "response-latency": "48",
    "status": "403"
}

The request passed through the waf phase first. Since this is the first request and the rate limit has not been exceeded, the request is passed on to the next phase, authenticate. During the authenticate phase, the user’s credentials are verified. In this case, the credentials are invalid and the request is rejected with a 403 response before invoking the downstream phases.

To correct this, the next request uses valid credentials, but those credentials do not have access to invoke the API. Again, the user receives a 403 Forbidden error.

Client response view with unauthorized tokens

The access log for this request is:

{
  "requestId": "c16d9edc-037d-4f42-adf3-eaadf358db2d",
  "waf-error": "-",
  "waf-status": "200",
  "waf-latency": "7",
  "waf-response": "WAF_ALLOW",
  "authenticate-error": "-",
  "authenticate-status": "200",
  "authenticate-latency": "8",
  "authorize-error": "The client is not authorized to perform this operation.",
  "authorize-status": "403",
  "authorize-latency": "0",
  "integration-error": "-",
  "integration-status": "-",
  "integration-latency": "-",
  "integration-requestId": "-",
  "integration-integrationStatus": "-",
  "response-latency": "52",
  "status": "403"
}

This time, the access logs show that the authenticate phase returns a 200. This indicates that the user credentials are valid for this account. However, the authorize phase returns a 403 and states, “The client is not authorized to perform this operation”. Again, the request is rejected with a 403 response before invoking downstream phases.

The last request for the API contains valid credentials for a user that has rights to invoke this API. This time the user receives a 200 OK response and the requested data.

Client response view with valid request

The log for this request is:

{
  "requestId": "ac726ce5-91dd-4f1d-8f34-fcc4ae0bd622",
  "waf-error": "-",
  "waf-status": "200",
  "waf-latency": "7",
  "waf-response": "WAF_ALLOW",
  "authenticate-error": "-",
  "authenticate-status": "200",
  "authenticate-latency": "1",
  "authorize-error": "-",
  "authorize-status": "200",
  "authorize-latency": "0",
  "integration-error": "-",
  "integration-status": "200",
  "integration-latency": "16",
  "integration-requestId": "8dc58335-fa13-4d48-8f99-2b1c97f41a3e",
  "integration-integrationStatus": "200",
  "response-latency": "48",
  "status": "200"
}

This log contains a 200 status code from each of the phases and returns a 200 response to the user. Additionally, each of the phases reports latency. This request had a total of 48 ms of latency. The latency breaks down according to the following:

Request latency breakdown

Developers can use this information to identify the cause of latency within the API request and adjust accordingly. While some phases like authenticate or authorize are immutable, optimizing the integration phase of this request could remove a large chunk of the latency involved.

Conclusion

This post covers the enhanced observability variables, the phases they occur in, and the order of those phases. With these new variables, developers can quickly isolate the problem and focus on resolving issues.

When configured with the proper access logging variables, API Gateway access logs can provide a detailed story of API performance. They can help developers to continually optimize that performance. To learn how to configure logging in API Gateway with AWS SAM, see the demonstration app for this blog.

#ServerlessForEveryone

Introducing larger state payloads for AWS Step Functions

2020-09-04 Rob Sutter

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/introducing-larger-state-payloads-for-aws-step-functions/

AWS Step Functions allows you to create serverless workflows that orchestrate your business processes. Step Functions stores data from workflow invocations as application state. Today we are increasing the size limit of application state from 32,768 characters to 256 kilobytes of data per workflow invocation. The new limit matches payload limits for other commonly used serverless services such as Amazon SNS, Amazon SQS, and Amazon EventBridge. This means you no longer need to manage Step Functions payload limitations as a special case in your serverless applications.

Faster, cheaper, simpler state management

Previously, customers worked around limits on payload size by storing references to data, such as a primary key, in their application state. An AWS Lambda function then loaded the data via an SDK call at runtime when the data was needed. With larger payloads, you now can store complete objects directly in your workflow state. This removes the need to persist and load data from data stores such as Amazon DynamoDB and Amazon S3. You do not pay for payload size, so storing data directly in your workflow may reduce both cost and execution time of your workflows and Lambda functions. Storing data in your workflow state also reduces the amount of code you need to write and maintain.

AWS Management Console and workflow history improvements

Larger state payloads mean more data to visualize and search. To help you understand that data, we are also introducing changes to the AWS Management Console for Step Functions. We have improved load time for the Execution History page to help you get the information you need more quickly. We have also made backwards-compatible changes to the GetExecutionHistory API call. Now if you set includeExecutionData to false, GetExecutionHistory excludes payload data and returns only metadata. This allows you to debug your workflows more quickly.

Doing more with dynamic parallelism

A larger payload also allows your workflows to process more information. Step Functions workflows can process an arbitrary number of tasks concurrently using dynamic parallelism via the Map State. Dynamic parallelism enables you to iterate over a collection of related items applying the same process to each item. This is an implementation of the map procedure in the MapReduce programming model.

When to choose dynamic parallelism

Choose dynamic parallelism when performing operations on a small collection of items generated in a preliminary step. You define an Iterator, which operates on these items individually. Optionally, you can reduce the results to an aggregate item. Unlike with parallel invocations, each item in the collection is related to the other items. This means that an error in processing one item typically impacts the outcome of the entire workflow.

Example use case

Ecommerce and line of business applications offer many examples where dynamic parallelism is the right approach. Consider an order fulfillment system that receives an order and attempts to authorize payment. Once payment is authorized, it attempts to lock each item in the order for shipment. The available items are processed and their total is taken from the payment authorization. The unavailable items are marked as pending for later processing.

The following Amazon States Language (ASL) defines a Map State with a simplified Iterator that implements the order fulfillment steps described previously.


    "Map": {
      "Type": "Map",
      "ItemsPath": "$.orderItems",
      "ResultPath": "$.packedItems",
      "MaxConcurrency": 40,
      "Next": "Print Label",
      "Iterator": {
        "StartAt": "Lock Item",
        "States": {
          "Lock Item": {
            "Type": "Pass",
            "Result": "Item locked!",
            "Next": "Pull Item"
          },
          "Pull Item": {
            "Type": "Pass",
            "Result": "Item pulled!",
            "Next": "Pack Item"
          },
          "Pack Item": {
            "Type": "Pass",
            "Result": "Item packed!",
            "End": true
          }
        }
      }
    }

The following image provides a visualization of this workflow. A preliminary state retrieves the collection of items from a data store and loads it into the state under the orderItems key. The triple dashed lines represent the Map State which attempts to lock, pull, and pack each item individually. The result of processing each individual item impacts the next state, Print Label. As more items are pulled and packed, the total weight increases. If an item is out of stock, the total weight will decrease.

A visualization of a portion of an AWS Step Functions workflow that implements dynamic parallelism

Dynamic parallelism or the “Map State”

Larger state payload improvements

Without larger state payloads, each item in the $.orderItems object in the workflow state would be a primary key to a specific item in a DynamoDB table. Each step in the “Lock, Pull, Pack” workflow would need to read data from DynamoDB for every item in the order to access detailed item properties.

With larger state payloads, each item in the $.orderItems object can be a complete object containing the required fields for the relevant items. Not only is this faster, resulting in a better user experience, but it also makes debugging workflows easier.

Pricing and availability

Larger state payloads are available now in all commercial and AWS GovCloud (US) Regions where Step Functions is available. No changes to your workflows are required to use larger payloads, and your existing workflows will continue to run as before. The larger state is available however you invoke your Step Functions workflows, including the AWS CLI, the AWS SDKs, the AWS Step Functions Data Science SDK, and Step Functions Local.

Larger state payloads are included in existing Step Functions pricing for Standard Workflows. Because Express Workflows are priced by runtime and memory, you may see more cost on individual workflows with larger payloads. However, this increase may also be offset by the reduced cost of Lambda, DynamoDB, S3, or other AWS services.

Conclusion

Larger Step Functions payloads simplify and increase the efficiency of your workflows by eliminating function calls to persist and retrieve data. Larger payloads also allow your workflows to process more data concurrently using dynamic parallelism.

With larger payloads, you can minimize the amount of custom code you write and focus on the business logic of your workflows. Get started building serverless workflows today!

Building a serverless document scanner using Amazon Textract and AWS Amplify

2020-09-03 Moheeb Zara

Post Syndicated from Moheeb Zara original https://aws.amazon.com/blogs/compute/building-a-serverless-document-scanner-using-amazon-textract-and-aws-amplify/

This guide demonstrates creating and deploying a production ready document scanning application. It allows users to manage projects, upload images, and generate a PDF from detected text. The sample can be used as a template for building expense tracking applications, handling forms and legal documents, or for digitizing books and notes.

The frontend application is written in Vue.js and uses the Amplify Framework. The backend is built using AWS serverless technologies and consists of an Amazon API Gateway REST API that invokes AWS Lambda functions. Amazon Textract is used to analyze text from uploaded images to an Amazon S3 bucket. Detected text is stored in Amazon DynamoDB.

An architectural diagram of the application.

Prerequisites

You need the following to complete the project:

Node.js and npm installed on a computer.
An AWS account. This project can be completed using the AWS Free Tier.

Deploy the application

The solution consists of two parts, the frontend application and the serverless backend. The Amplify CLI deploys all the Amazon Cognito authentication, and hosting resources for the frontend. The backend requires the Amazon Cognito user pool identifier to configure an authorizer on the API. This enables an authorization workflow, as shown in the following image.

A diagram showing how an Amazon Cognito authorization workflow works

First, configure the frontend. Complete the following steps using a terminal running on a computer or by using the AWS Cloud9 IDE. If using AWS Cloud9, create an instance using the default options.

From the terminal:

Install the Amplify CLI by running this command.
```
npm install -g @aws-amplify/cli
```
Configure the Amplify CLI using this command. Follow the guided process to completion.
```
amplify configure
```

Clone the project from GitHub.

git clone https://github.com/aws-samples/aws-serverless-document-scanner.git

Navigate to the amplify-frontend directory and initialize the project using the Amplify CLI command. Follow the guided process to completion.
```
cd aws-serverless-document-scanner/amplify-frontend

amplify init
```
Deploy all the frontend resources to the AWS Cloud using the Amplify CLI command.
```
amplify push
```
After the resources have finishing deploying, make note of the StackName and UserPoolId properties in the amplify-frontend/amplify/backend/amplify-meta.json file. These are required when deploying the serverless backend.

Next, deploy the serverless backend. While it can be deployed using the AWS SAM CLI, you can also deploy from the AWS Management Console:

Navigate to the document-scanner application in the AWS Serverless Application Repository.
In Application settings, name the application and provide the StackName and UserPoolId from the frontend application for the UserPoolID and AmplifyStackName parameters. Provide a unique name for the BucketName parameter.
Choose Deploy.
Once complete, copy the API endpoint so that it can be configured on the frontend application in the next section.

Configure and run the frontend application

Create a file, amplify-frontend/src/api-config.js, in the frontend application with the following content. Include the API endpoint and the unique BucketName from the previous step. The s3_region value must be the same as the Region where your serverless backend is deployed.
```
const apiConfig = {
	"endpoint": "<API ENDPOINT>",
	"s3_bucket_name": "<BucketName>",
	"s3_region": "<Bucket Region>"
};

export default apiConfig;
```
In a terminal, navigate to the root directory of the frontend application and run it locally for testing.
```
cd aws-serverless-document-scanner/amplify-frontend

npm install

npm run serve
```
You should see an output like this:
To publish the frontend application to cloud hosting, run the following command.
```
amplify publish
```
Once complete, a URL to the hosted application is provided.

Using the frontend application

Once the application is running locally or hosted in the cloud, navigating to it presents a user login interface with an option to register. The registration flow requires a code sent to the provided email for verification. Once verified you’re presented with the main application interface.

Once you create a project and choose it from the list, you are presented with an interface for uploading images by page number.

On mobile, it uses the device camera to capture images. On desktop, images are provided by the file system. You can replace an image and the page selector also lets you go back and change an image. The corresponding analyzed text is updated in DynamoDB as well.

Each time you upload an image, the page is incremented. Choosing “Generate PDF” calls the endpoint for the GeneratePDF Lambda function and returns a PDF in base64 format. The download begins automatically.

You can also open the PDF in another window, if viewing a preview in a desktop browser.

Understanding the serverless backend

An architecture diagram of the serverless backend.

In the GitHub project, the folder serverless-backend/ contains the AWS SAM template file and the Lambda functions. It creates an API Gateway endpoint, six Lambda functions, an S3 bucket, and two DynamoDB tables. The template also defines an Amazon Cognito authorizer for the API using the UserPoolID passed in as a parameter:

Parameters:
  UserPoolID:
    Type: String
    Description: (Required) The user pool ID created by the Amplify frontend.

  AmplifyStackName:
    Type: String
    Description: (Required) The stack name of the Amplify backend deployment. 

  BucketName:
    Type: String
    Default: "ds-userfilebucket"
    Description: (Required) A unique name for the user file bucket. Must be all lowercase.  


Globals:
  Api:
    Cors:
      AllowMethods: "'*'"
      AllowHeaders: "'*'"
      AllowOrigin: "'*'"

Resources:

  DocumentScannerAPI:
    Type: AWS::Serverless::Api
    Properties:
      StageName: Prod
      Auth:
        DefaultAuthorizer: CognitoAuthorizer
        Authorizers:
          CognitoAuthorizer:
            UserPoolArn: !Sub 'arn:aws:cognito-idp:${AWS::Region}:${AWS::AccountId}:userpool/${UserPoolID}'
            Identity:
              Header: Authorization
        AddDefaultAuthorizerToCorsPreflight: False

This only allows authenticated users of the frontend application to make requests with a JWT token containing their user name and email. The backend uses that information to fetch and store data in DynamoDB that corresponds to the user making the request.

Two DynamoDB tables are created. A Project table, which tracks all the project names by user, and a Pages table, which tracks pages by project and user. The DynamoDB tables are created by the AWS SAM template with the partition key and range key defined for each table. These are used by the Lambda functions to query and sort items. See the documentation to learn more about DynamoDB table key schema.

ProjectsTable:
    Type: AWS::DynamoDB::Table
    Properties: 
      AttributeDefinitions: 
        - 
          AttributeName: "username"
          AttributeType: "S"
        - 
          AttributeName: "project_name"
          AttributeType: "S"
      KeySchema: 
        - AttributeName: username
          KeyType: HASH
        - AttributeName: project_name
          KeyType: RANGE
      ProvisionedThroughput: 
        ReadCapacityUnits: "5"
        WriteCapacityUnits: "5"

  PagesTable:
    Type: AWS::DynamoDB::Table
    Properties: 
      AttributeDefinitions: 
        - 
          AttributeName: "project"
          AttributeType: "S"
        - 
          AttributeName: "page"
          AttributeType: "N"
      KeySchema: 
        - AttributeName: project
          KeyType: HASH
        - AttributeName: page
          KeyType: RANGE
      ProvisionedThroughput: 
        ReadCapacityUnits: "5"
        WriteCapacityUnits: "5"

When an API Gateway endpoint is called, it passes the user credentials in the request context to a Lambda function. This is used by the CreateProject Lambda function, which also receives a project name in the request body, to create an item in the Project Table and associate it with a user.

The endpoint for the FetchProjects Lambda function is called to retrieve the list of projects associated with a user. The DeleteProject Lambda function removes a specific project from the Project table and any associated pages in the Pages table. It also deletes the folder in the S3 bucket containing all images for the project.

When a user enters a Project, the API endpoint calls the FetchPageCount Lambda function. This returns the number of pages for a project to update the current page number in the upload selector. The project is retrieved from the path parameters, as defined in the AWS SAM template:

FetchPageCount:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.8
      CodeUri: lambda_functions/fetchPageCount/
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref PagesTable
      Environment:
        Variables:
          PAGES_TABLE_NAME: !Ref PagesTable
      Events:
        GetResource:
          Type: Api
          Properties:
            RestApiId: !Ref DocumentScannerAPI
            Path: /pages/count/{project+}
            Method: get

The template creates an S3 bucket and two AWS IAM managed policies. The policies are applied to the AuthRole and UnauthRole created by Amplify. This allows users to upload images directly to the S3 bucket. To understand how Amplify works with Storage, see the documentation.

The template also sets an S3 event notification on the bucket for all object create events with a “.png” suffix. Whenever the frontend uploads an image to S3, the object create event invokes the ProcessDocument Lambda function.

The function parses the object key to get the project name, user, and page number. Amazon Textract then analyzes the text of the image. The object returned by Amazon Textract contains the detected text and detailed information, such as the positioning of text in the image. Only the raw lines of text are stored in the Pages table.

import os
import json, decimal
import boto3
import urllib.parse
from boto3.dynamodb.conditions import Key, Attr

client = boto3.resource('dynamodb')
textract = boto3.client('textract')

tableName = os.environ.get('PAGES_TABLE_NAME')

def handler(event, context):

  table = client.Table(tableName)

  print(table.table_status)
 
  key = urllib.parse.unquote(event['Records'][0]['s3']['object']['key'])
  bucket = event['Records'][0]['s3']['bucket']['name']
  project = key.split('/')[3]
  page = key.split('/')[4].split('.')[0]
  user = key.split('/')[2]
  
  response = textract.detect_document_text(
    Document={
        'S3Object': {
            'Bucket': bucket,
            'Name': key
        }
    })
    
  fullText = ""
  
  for item in response["Blocks"]:
    if item["BlockType"] == "LINE":
        fullText = fullText + item["Text"] + '\n'
  
  print(fullText)

  table.put_item(Item= {
    'project': user + '/' + project,
    'page': int(page), 
    'text': fullText
    })

  # print(response)
  return

The GeneratePDF Lambda function retrieves the detected text for each page in a project from the Pages table. It combines the text into a PDF and returns it as a base64-encoded string for download. This function can be modified if your document structure differs.

Understanding the frontend

In the GitHub repo, the folder amplify-frontend/src/ contains all the code for the frontend application. In main.js, the Amplify VueJS modules are configured to use the resources defined in aws-exports.js. It also configures the endpoint and S3 bucket of the serverless backend, defined in api-config.js.

In components/DocumentScanner.vue, the API module is imported and the API is defined.

API calls are defined as Vue methods that can be called by various other components and elements of the application.

In components/Project.vue, the frontend uses the Storage module for Amplify to upload images. For more information on how to use S3 in an Amplify project see the documentation.

Conclusion

This blog post shows how to create a multiuser application that can analyze text from images and generate PDF documents. This guide demonstrates how to do so in a secure and scalable way using a serverless approach. The example also shows an event driven pattern for handling high volume image processing using S3, Lambda, and Amazon Textract.

The Amplify Framework simplifies the process of implementing authentication, storage, and backend integration. Explore the full solution on GitHub to modify it for your next project or startup idea.

To learn more about AWS serverless and keep up to date on the latest features, subscribe to the YouTube channel.

#ServerlessForEveryone

Rendering React on the Edge with Flareact and Cloudflare Workers

2020-09-03 Guest Author

Post Syndicated from Guest Author original https://blog.cloudflare.com/rendering-react-on-the-edge-with-flareact-and-cloudflare-workers/

Rendering React on the Edge with Flareact and Cloudflare Workers

The following is a guest post from Josh Larson, Engineer at Vox Media.

Imagine you’re the maintainer of a high-traffic media website, and your DNS is already hosted on Cloudflare.

Page speed is critical. You need to get content to your audience as quickly as possible on every device. You also need to render ads in a speedy way to maintain a good user experience and make money to support your journalism.

One solution would be to render your site statically and cache it at the edge. This would help ensure you have top-notch delivery speed because you don’t need a server to return a response. However, your site has decades worth of content. If you wanted to make even a small change to the site design, you would need to regenerate every single page during your next deploy. This would take ages.

Another issue is that your site would be static — and future updates to content or new articles would not be available until you deploy again.

That’s not going to work.

Another solution would be to render each page dynamically on your server. This ensures you can return a dynamic response for new or updated articles.

However, you’re going to need to pay for some beefy servers to be able to handle spikes in traffic and respond to requests in a timely manner. You’ll also probably need to implement a system of internal caches to optimize the performance of your app, which could lead to a more complicated development experience. That also means you’ll be at risk of a thundering herd problem if, for any reason, your cache becomes invalidated.

Neither of these solutions are great, and you’re forced to make a tradeoff between one of these two approaches.

Thankfully, you’ve recently come across a project like Next.js which offers a hybrid approach: static-site generation along with incremental regeneration. You’re in love with the patterns and developer experience in Next.js, but you’d also love to take advantage of the Cloudflare Workers platform to host your site.

Cloudflare Workers allow you to run your code on the edge quickly, efficiently and at scale. Instead of paying for a server to host your code, you can host it directly inside the datacenter — reducing the number of network trips required to load your application. In a perfect world, we wouldn’t need to find hosting for a Next.js site, because Cloudflare offers the same JavaScript hosting functionality with the Workers platform. With their dynamic runtime and edge caching capabilities, we wouldn’t need to worry about making a tradeoff between static and dynamic for our site.

Unfortunately, frameworks like Next.js and Cloudflare Workers don’t mesh together particularly well due to technical constraints. Until now:

I’m excited to announce Flareact, a new open-source React framework built for Cloudflare Workers.

Rendering React on the Edge with Flareact and Cloudflare Workers

With Flareact, you don’t need to make the tradeoff between a static site and a dynamic application.

Flareact allows you to render your React apps at the edge rather than on the server. It is modeled after Next.js, which means it supports file-based page routing, dynamic page paths and edge-side data fetching APIs.

Not only are Flareact pages rendered at the edge — they’re also cached at the edge using the Cache API. This allows you to provide a dynamic content source for your app without worrying about traffic spikes or response times.

With no servers or origins to deal with, your site is instantly available to your audience. Cloudflare Workers gives you a 0ms cold start and responses from the edge within milliseconds.

You can check out the docs and get started now by clicking the button below:

To get started manually, install the latest wrangler, and use the handy wrangler generate command below to create your first project:

npm i @cloudflare/wrangler -g
wrangler generate my-project https://github.com/flareact/flareact-template

What’s the big deal?

Hosting React apps on Cloudflare Workers Sites is not a new concept. In fact, you’ve always been able to deploy a create-react-app project to Workers Sites in addition to static versions of other frameworks like Gatsby and Next.js.

However, Flareact renders your React application at the edge. This allows you to provide an initial server response with HTML markup — which can be helpful for search engine crawlers. You can also cache the response at the edge and optionally invalidate that cache on a timed basis — meaning your static markup will be regenerated if you need it to be fresh.

This isn’t a new pattern: Next.js has done the hard work in defining the shape of this API with SSG support and Incremental Static Regeneration. While there are nuanced differences in the implementation between Flareact and Next.js, they serve a similar purpose: to get your application to your end-user in the quickest and most-scalable way possible.

A focus on developer experience

A magical developer experience is a crucial ingredient to any successful product.

As a longtime fan and user of Next.js, I wanted to experiment with running the framework on Cloudflare Workers. However, Next.js and its APIs are framed around the Node.js HTTP Server API, while Cloudflare Workers use V8 isolates and are modeled after the FetchEvent type.

Since we don’t have typical access to a filesystem inside V8 isolates, it’s tough to mimic the environment required to run a dynamic Next.js server at the edge. Though projects like Fab have come up with workarounds, I decided to approach the project with a clean slate and use existing patterns established in Next.js in a brand-new framework.

As a developer, I absolutely love the simplicity of exporting an asynchronous function from my page to have it supply props to the component. Flareact implements this pattern by allowing you to export a getEdgeProps function. This is similar to getStaticProps in Next.js, and it matches the expected return shape of that function in Next.js — including a revalidate parameter. Learn more about data fetching in Flareact.

I was also inspired by the API Routes feature of Next.js when I implemented the API Routes feature of Flareact — enabling you to write standard Cloudflare Worker scripts directly within your React app.

I hope porting over an existing Next.js project to Flareact is a breeze!

How it works

When a FetchEvent request comes in, Flareact inspects the URL pathname to decide how to handle it:

If the request is for a page or for page props, it checks the cache for that request and returns it if there’s a hit. If there is a cache miss, it generates the page request or props function, stores the result in the cache, and returns the response.

If the request is for an API route, it sends the entire FetchEvent along to the user-defined API function, allowing the user to respond as they see fit.

If you want your cached page to be revalidated after a certain amount of time, you can return an additional revalidate property from getEdgeProps(). This instructs Flareact to cache the endpoint for that number of seconds before generating a new response.

Finally, if the request is for a static asset, it returns it directly from the Workers KV.

The Worker

The core responsibilities of the Worker — or in a traditional SSR framework, the server — are to:

Render the initial React page component into static HTML markup.
Provide the initial page props as a JSON object, embedded into the static markup in a script tag.
Load the client-side JavaScript bundles and stylesheets necessary to render the interactive page.

One challenge with building Flareact is that the Webpack targets the webworker output rather than the node output. This makes it difficult to inform the worker which pages exist in the filesystem, since there is no access to the filesystem.

To get around this, Flareact leverages require.context, a Webpack-specific API, to inspect the project and build a manifest of pages on the client and the worker. I’d love to replace this with a smarter bundling strategy on the client-side eventually.

The Client

In addition to handling incoming Worker requests, Flareact compiles a client bundle containing the code necessary for routing, data fetching and more from the browser.

The core responsibilities of the client are to:

Listen for routing events
Fetch the necessary page component and its props from the worker over AJAX

Building a client router from scratch has been a challenge. It listens for changes to the internal route state, updates the URL pathname with pushState, makes an AJAX request to the worker for the page props, and then updates the current component in the render tree with the requested page.

It was fun building a flareact/link component similar to next/link:

import Link from "flareact/link";

export default function Index() {
  return (
    <div>
      <Link href="/about">
        <a>Go to About</a>
      </Link>
    </div>
  );
}

I also set out to build a custom version of next/head for Flareact. As it turns out, this was non-trivial! With lots of interesting stuff going on behind the scenes to support SSR and client-side routing events, I decided to make flareact/head a simple wrapper around react-helmet instead:

import Head from "flareact/head";

export default function Index() {
  return (
    <div>
      <Head>
        <title>My page title</title>
      </Head>
      <h1>Hello, world.</h1>
    </div>
  );
}

Local Development

The local developer experience of Flareact leverages the new wrangler dev command, sending server requests through a local tunnel to the Cloudflare edge and back to your machine.

This is a huge win for productivity, since you don’t need to manually build and deploy your application to see how it will perform in a production environment.

It’s also a really exciting update to the serverless toolchain. Running a robust development environment in a serverless world has always been a challenge, since your code is executing in a non-traditional context. Tunneling local code to the edge and back is such a great addition to Cloudflare’s developer experience.

Use cases

Flareact is a great candidate for a lot of Jamstack-adjacent applications, like blogs or static marketing sites.

It could also be used for more dynamic applications, with robust API functions and authentication mechanisms — all implemented using Cloudflare Workers.

Imagine building a high-traffic e-commerce site with Flareact, where both site reliability and dynamic rendering for things like price changes and stock availability are crucial.

There are also untold possibilities for integrating the Workers KV into your edge props or API functions as a first-class database solution. No need to reach for an externally-hosted database!

While the project is still in its early days, here are a couple real-world examples:

The Flareact docs site, powered by Markdown files
A blog site, powered by a headless WordPress API

The road ahead

I have to be honest: creating a server-side rendered React framework with little prior knowledge was very difficult. There’s still a ton to learn, and Flareact has a long way to go to reach parity with Next.js in the areas of optimization and production-readiness.

Here’s what I’m hoping to add to Flareact in the near future:

Smarter client bundling and Webpack chunks to reduce individual page weight
A more feature-complete client-side router
The ability to extend and customize the root document of the app
Support for more style frameworks (CSS-in-JS, Sass, CSS modules, etc)
A more stable development environment
Documentation and support for environment variables, secrets and KV namespaces
A guide for deploying from GitHub Actions and other CI tools

If the project sounds interesting to you, be sure to check out the source code on GitHub. Contributors are welcome!

Jump-starting your serverless development environment

2020-09-02 Benjamin Smith

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/jump-starting-your-serverless-development-environment/

Developers building serverless applications often wonder how they can jump-start their local development environment. This blog post provides a broad guide for those developers wanting to set up a development environment for building serverless applications.

AWS and open source tools for a serverless development environment .

To use AWS Lambda and other AWS services, create and activate an AWS account.

Command line tooling

Command line tools are scripts, programs, and libraries that enable rapid application development and interactions from within a command line shell.

The AWS CLI

The AWS Command Line Interface (AWS CLI) is an open source tool that enables developers to interact with AWS services using a command line shell. In many cases, the AWS CLI increases developer velocity for building cloud resources and enables automating repetitive tasks. It is an important piece of any serverless developer’s toolkit. Follow these instructions to install and configure the AWS CLI on your operating system.

AWS enables you to build infrastructure with code. This provides a single source of truth for AWS resources. It enables development teams to use version control and create deployment pipelines for their cloud infrastructure. AWS CloudFormation provides a common language to model and provision these application resources in your cloud environment.

AWS Serverless Application Model (AWS SAM CLI)

AWS Serverless Application Model (AWS SAM) is an extension for CloudFormation that further simplifies the process of building serverless application resources.

It provides shorthand syntax to define Lambda functions, APIs, databases, and event source mappings. During deployment, the AWS SAM syntax is transformed into AWS CloudFormation syntax, enabling you to build serverless applications faster.

The AWS SAM CLI is an open source command line tool used to locally build, test, debug, and deploy serverless applications defined with AWS SAM templates.

Install AWS SAM CLI on your operating system.

Test the installation by initializing a new quick start project with the following command:

$ sam init

Choose 1 for the “Quick Start Templates”
Choose 1 for the “Node.js runtime”
Use the default name.

The generated /sam-app/template.yaml contains all the resource definitions for your serverless application. This includes a Lambda function with a REST API endpoint, along with the necessary IAM permissions.

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: hello-world/
      Handler: app.lambdaHandler
      Runtime: nodejs12.x
      Events:
        HelloWorld:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /hello
            Method: get

Deploy this application using the AWS SAM CLI guided deploy:

$ sam deploy -g

Local testing with AWS SAM CLI

The AWS SAM CLI requires Docker containers to simulate the AWS Lambda runtime environment on your local development environment. To test locally, install Docker Engine and run the Lambda function with following command:

$ sam local invoke "HelloWorldFunction" -e events/event.json

The first time this function is invoked, Docker downloads the lambci/lambda:nodejs12.x container image. It then invokes the Lambda function with a pre-defined event JSON file.

Helper tools

There are a number of open source tools and packages available to help you monitor, author, and optimize your Lambda-based applications. Some of the most popular tools are shown in the following list.

Template validation tooling

CloudFormation Linter is a validation tool that helps with your CloudFormation development cycle. It analyses CloudFormation YAML and JSON templates to resolve and validate intrinsic functions and resource properties. By analyzing your templates before deploying them, you can save valuable development time and build automated validation into your deployment release cycle.

Follow these instructions to install the tool.

Once, installed, run the cfn-lint command with the path to your AWS SAM template provided as the first argument:

cfn-lint template.yaml

AWS SAM template validation with cfn-lint

The following example shows that the template is not valid because the !GettAtt function does not evaluate correctly.

IDE tooling

Use AWS IDE plugins to author and invoke Lambda functions from within your existing integrated development environment (IDE). AWS IDE toolkits are available for PyCharm, IntelliJ. Visual Studio.

The AWS Toolkit for Visual Studio Code provides an integrated experience for developing serverless applications. It enables you to invoke Lambda functions, specify function configurations, locally debug, and deploy—all conveniently from within the editor. The toolkit supports Node.js, Python, and .NET.

The AWS Toolkit for Visual Studio Code

From Visual Studio Code, choose the Extensions icon on the Activity Bar. In the Search Extensions in Marketplace box, enter AWS Toolkit and then choose AWS Toolkit for Visual Studio Code as shown in the following example. This opens a new tab in the editor showing the toolkit’s installation page. Choose the Install button in the header to add the extension.

AWS Toolkit extension for Visual Studio Code

AWS Cloud9

Another option to build a development environment without having to install anything locally is to use AWS Cloud9. AWS Cloud9 is a cloud-based integrated development environment (IDE) for writing, running, and debugging code from within the browser.

It provides a seamless experience for developing serverless applications. It has a preconfigured development environment that includes AWS CLI, AWS SAM CLI, SDKs, code libraries, and many useful plugins. AWS Cloud9 also provides an environment for locally testing and debugging AWS Lambda functions. This eliminates the need to upload your code to the Lambda console. It allows developers to iterate on code directly, saving time, and improving code quality.

Follow this guide to set up AWS Cloud9 in your AWS environment.

Advanced tooling

Efficient configuration of Lambda functions is critical when expecting optimal cost and performance of your serverless applications. Lambda allows you to control the memory (RAM) allocation for each function.

Lambda charges based on the number of function requests and the duration, the time it takes for your code to run. The price for duration depends on the amount of RAM you allocate to your function. A smaller RAM allocation may reduce the performance of your application if your function is running compute-heavy workloads. If performance needs outweigh cost, you can increase the memory allocation.

Cost and performance optimization tooling

AWS Lambda power tuner is an open source tool that uses an AWS Step Functions state machine to suggest cost and performance optimizations for your Lambda functions. It invokes a given function with multiple memory configurations. It analyzes the execution log results to determine and suggest power configurations that minimize cost and maximize performance.

To deploy the tool:

Clone the repository as follows:

$ git clone https://github.com/alexcasalboni/aws-lambda-power-tuning.git

Create an Amazon S3 bucket and enter the deployment configurations in /scripts/deploy.sh:

# config
BUCKET_NAME=your-sam-templates-bucket
STACK_NAME=lambda-power-tuning
PowerValues='128,512,1024,1536,3008'

Run the deploy.sh script from your terminal, this uses the AWS SAM CLI to deploy the application:
```
$ bash scripts/deploy.sh
```

Run the power tuning tool from the terminal using the AWS CLI:

aws stepfunctions start-execution \
--state-machine-arn arn:aws:states:us-east-1:0123456789:stateMachine:powerTuningStateMachine-Vywm3ozPB6Am \
--input "{\"lambdaARN\": \"arn:aws:lambda:us-east-1:1234567890:function:testytest\", \"powerValues\":[128,256,512,1024,2048],\"num\":50,\"payload\":{},\"parallelInvocation\":true,\"strategy\":\"cost\"}" \
--output json

The Step Functions execution output produces a link to a visual summary of the suggested results:

AWS Lambda power tuning results

Monitoring and debugging tooling

Sls-dev-tools is an open source serverless tool that delivers serverless metrics directly to the terminal. It provides developers with feedback on their serverless application’s metrics and key bindings that deploy, open, and manipulate stack resources. Bringing this data directly to your terminal or IDE, reduces context switching between the developer environment and the web interfaces. This can increase application development speed and improve user experience.

Follow these instructions to install the tool onto your development environment.

To open the tool, run the following command:

$ Sls-dev-tools

Follow the in-terminal interface to choose which stack to monitor or edit.

The following example shows how the tool can be used to invoke a Lambda function with a custom payload from within the IDE.

Invoke an AWS Lambda function with a custom payload using sls-dev-tools

Serverless database tooling

NoSQL Workbench for Amazon DynamoDB is a GUI application for modern database development and operations. It provides a visual IDE tool for data modeling and visualization with query development features to help build serverless applications with Amazon DynamoDB tables. Define data models using one or more tables and visualize the data model to see how it works in different scenarios. Run or simulate operations and generate the code for Python, JavaScript (Node.js), or Java.

Choose the correct operating system link to download and install NoSQL Workbench on your development machine.

The following example illustrates a connection to a DynamoDB table. A data scan is built using the GUI, with Node.js code generated for inclusion in a Lambda function:

Connecting to an Amazon DynamoBD table with NoSQL Workbench for AmazonDynamoDB

Connecting to an Amazon DynamoDB table with NoSQL Workbench for Amazon DynamoDB

Generating query code with NoSQL Workbench for Amazon DynamoDB

Conclusion

Building serverless applications allows developers to focus on business logic instead of managing and operating infrastructure. This is achieved by using managed services. Developers often struggle with knowing which tools, libraries, and frameworks are available to help with this new approach to building applications. This post shows tools that builders can use to create a serverless developer environment to help accelerate software development.

This list represents AWS and open source tools but does not include our APN Partners. For partner offers, check here.

Read more to start building serverless applications.

Using serverless backends to iterate quickly on web apps – part 3

2020-08-31 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-serverless-backends-to-iterate-quickly-on-web-apps-part-3/

This series is about building flexible backends for web applications. The example is Happy Path, a web app that allows park visitors to upload and share maps and photos, to help replace printed materials.

In part 1, I show how the application works and walk through the backend architecture. In part 2, you deploy a basic image-processing workflow. To do this, you use an AWS Serverless Application Model (AWS SAM) template to create an AWS Step Functions state machine.

In this post, I show how to deploy progressively more complex workflow functionality while using minimal code. This solution in this web app is designed for 100,000 monthly active users. Using this approach, you can introduce more sophisticated workflows without affecting the existing backend application.

The code and instructions for this application are available in the GitHub repo.

Introducing image moderation

In the first version, users could upload any images to parks on the map. One of the most important feature requests is to enable moderation, to prevent inappropriate images from appearing on the app. To handle a large number of uploaded images, using human moderation would be slow and labor-intensive.

In this section, you use Amazon ML services to automate analyzing the images for unsafe content. Amazon Rekognition provides an API to detect if an image contains moderation labels. These labels are categorized into different types of unsafe content that would not be appropriate for this app.

Version 2 of the workflow uses this API to automate the process of checking images. To install version 2:

From a terminal window, delete the v1 workflow stack:
aws cloudformation delete-stack --stack-name happy-path-workflow-v1
Change directory to the version 2 AWS SAM template in the repo:
cd .\workflows\templates\v2
Build and deploy the solution:
sam build
sam deploy --guided
The deploy process prompts you for several parameters. Enter happy-path-workflow-v2 as the Stack Name. The other values are the outputs from the backend deployment process, detailed in the repo’s README. Enter these to complete the deployment.

From VS Code, open the v2 state machine in the repo from workflows/statemachines/v2.asl.json. Choose the Render graph option in the CodeLens to see the workflow visualization.

This new workflow introduces a Moderator step. This invokes a Moderator Lambda function that uses the Amazon Rekognition API. If this API identifies any unsafe content labels, it returns these as part of the function output.

The next step in the workflow is a Moderation result choice state. This evaluates the output of the previous function – if the image passes moderation, the process continues to the Resizer function. If it fails, execution moves to the RecordFailState step.

Step Functions integrates directly with some AWS services so that you can call and pass parameters into the APIs of those services. The RecordFailState uses an Amazon DynamoDB service integration to write the workflow failure to the application table, using the arn:aws:states:::dynamodb:updateItem resource.

Testing the workflow

To test moderation, I use an unsafe image with suggestive content. This is an image that is not considered appropriate for this application. To test the deployed v2 workflow:

Open the frontend application at https://localhost:8080 in your browser.
Select a park location, choose Show Details, and then choose Upload images.
Select an unsafe image to upload.
Navigate to the Step Functions console. This shows the v2StateMachine with one successful execution:
Select the state machine, and choose the execution to display more information. Scroll down to the Visual workflow panel.

This shows that the moderation failed and the path continued to log the failed state in the database. If you open the Output resource, this displays more details about why the image is considered unsafe.

Checking the image size and file type

The upload component in the frontend application limits file selection to JPG images but there is no check to reject images that are too small. It’s prudent to check and enforce image types and sizes on the backend API in addition to the frontend. This is because it’s possible to upload images via the API without using the frontend.

The next version of the workflow enforces image sizes and file types. To install version 3:

From a terminal window, delete the v2 workflow stack:
aws cloudformation delete-stack --stack-name happy-path-workflow-v2
Change directory to the version 3 AWS SAM template in the repo:
cd .\workflows\templates\v3
Build and deploy the solution:
sam build
sam deploy --guided
The deploy process prompts you for several parameters. Enter happy-path-workflow-v3 as the Stack Name. The other values are the outputs from the backend deployment process, detailed in the repo’s README. Enter these to complete the deployment.

From VS Code, open the v3 state machine in the repo from workflows/statemachines/v3.asl.json. Choose the Render graph option in the CodeLens to see the workflow visualization.

This workflow changes the starting point of the execution, introducing a Check Dimensions step. This invokes a Lambda function that checks the size and types of the Amazon S3 object using the image-size npm package. This function uses environment variables provided by the AWS SAM template to compare against a minimum size and allowed type array.

The output is evaluated by the Dimension Result choice state. If the image is larger than the minimum size allowed, execution continues to the Moderator function as before. If not, it passes to the RecordFailState step to log the result in the database.

Testing the workflow

To test, I use an image that’s narrower than the mixPixels value. To test the deployed v3 workflow:

Open the frontend application at https://localhost:8080 in your browser.
Select a park location, choose Show Details, and then choose Upload images.
Select an image with a width smaller than 800 pixels. After a few seconds, a rejection message appears:
Navigate to the Step Functions console. This shows the v3StateMachine with one successful execution. Choose the execution to show more detail.

The execution shows that the Check Dimension step added the image dimensions to the event object. Next, the Dimensions Result choice state rejected the image, and logged the result at the RecordFailState step. The application’s DynamoDB table now contains details about the failed upload:

Pivoting the application to a new concept

Until this point, the Happy Path web application is designed to help park visitors share maps and photos. This is the development team’s original idea behind the app. During the product-market fit stage of development, it’s common for applications to pivot substantially from the original idea. For startups, it can be critical to have the agility to modify solutions quickly to meet the needs of customers.

In this scenario, the original idea has been less successful than predicted, and park visitors are not adopting the app as expected. However, the business development team has identified a new opportunity. Restaurants would like an app that allows customers to upload menus and food photos. How can the development team create a new proof-of-concept app for restaurant customers to test this idea?

In this version, you modify the application to work for restaurants. While features continue to be added to the parks workflow, it now supports business logic specifically for the restaurant app.

To create the v4 workflow and update the frontend:

From a terminal window, delete the v3 workflow stack:
aws cloudformation delete-stack --stack-name happy-path-workflow-v3
Change directory to the version 4 AWS SAM template in the repo:
cd .\workflows\templates\v4
Build and deploy the solution:
sam build
sam deploy --guided
The deploy process prompts you for several parameters. Enter happy-path-workflow-v4 as the Stack Name. The other values are the outputs from the backend deployment process, detailed in the repo’s README. Enter these to complete the deployment.
Open frontend/src/main.js and update the businessType variable on line 63. Set this value to ‘restaurant’.
Start the local development server:
npm run serve
Open the application at http://localhost:8080. This now shows restaurants in the local area instead of parks.

In the Step Functions console, select the v4StateMachine to see the latest workflow, then open the Definition tab to see the visualization:

This workflow starts with steps that apply to both parks and restaurants – checking the image dimensions. Next, it determines the place type from the placeId record in DynamoDB. Depending on place type, it now follows a different execution path:

Parks continue to run the automated moderator process, then resizer and publish the result.
Restaurants now use Amazon Rekognition to determine the labels in the image. Any photos containing people are rejected. Next, the workflow continues to the resizer and publish process.
Other business types go to the RecordFailState step since they are not supported.

Testing the workflow

To test the deployed v4 workflow:

Open the frontend application at https://localhost:8080 in your browser.
Select a restaurant, choose Show Details, and then choose Upload images.
Select an image from the test photos dataset. After a few seconds, you see a message confirming the photo has been added.
Next, select an image that contains one or more people. The new restaurant workflow rejects this type of photo:
In the Step Functions console, select the last execution for the v4StateMachine to see how the Check for people step rejected the image:

If other business types are added later to the application, you can extend the Step Functions workflow accordingly. The cost of Step Functions is based on the number of transitions in a workflow, not the number of total steps. This means you can branch by business type in the Happy Path application. This doesn’t affect the overall cost of running an execution, if the total transitions are the same per execution.

Conclusion

Previously in this series, you deploy a simple workflow for processing image uploads in the Happy Path web application. In this post, you add progressively more complex functionality by deploying new versions of workflows.

The first iteration introduces image moderation using Amazon Rekognition, providing the ability to automate the evaluation of unsafe content. Next, the workflow is modified to check image size and file type. This allows you to reject any images that are too small or do not meet the type requirements. Finally, I show how to expand the logic further to accept other business types with their own custom workflows.

To learn more about building serverless web applications, see the Ask Around Me series.

Asynchronous HTMLRewriter for Cloudflare Workers

2020-08-28 Ashcon Partovi

Post Syndicated from Ashcon Partovi original https://blog.cloudflare.com/asynchronous-htmlrewriter-for-cloudflare-workers/

Asynchronous HTMLRewriter for Cloudflare Workers

Last year, we launched HTMLRewriter for Cloudflare Workers, which enables developers to make streaming changes to HTML on the edge. Unlike a traditional DOM parser that loads the entire HTML document into memory, we developed a streaming parser written in Rust. Today, we’re announcing support for asynchronous handlers in HTMLRewriter. Now you can perform asynchronous tasks based on the content of the HTML document: from prefetching fonts and image assets to fetching user-specific content from a CMS.

How can I use HTMLRewriter?

We designed HTMLRewriter to have a jQuery-like experience. First, you define a handler, then you assign it to a CSS selector; Workers does the rest for you. You can look at our new and improved documentation to see our supported list of selectors, which now include nth-child selectors. The example below changes the alternative text for every second image in a document.

async function editHtml(request) {
  return new HTMLRewriter()
     .on("img:nth-child(2)", new ElementHandler())
     .transform(await fetch(request))
}

class ElementHandler {
   element(e) {
      e.setAttribute("alt", "A very interesting image")
   }
}

Since these changes are applied using streams, we maintain a low TTFB (time to first byte) and users never know the HTML was transformed. If you’re interested in how we’re able to accomplish this technically, you can read our blog post about HTML parsing.

What’s new with HTMLRewriter?

Now you can define an async handler which allows any code that uses await. This means you can make dynamic HTML injection, based on the contents of the document, without having prior knowledge of what it contains. This allows you to customize HTML based on a particular user, feature flag, or even an integration with a CMS.

class UserCustomizer {
   // Remember to add the `async` keyword to the handler method
   async element(e) {
      const user = await fetch(`https://my.api.com/user/${e.getAttribute("user-id")}/online`)
      if (user.ok) {
         // Add the user’s name to the element
         e.setAttribute("user-name", await user.text())
      } else {
         // Remove the element, since this user not online
         e.remove()
      }
   }
}

What can I build with HTMLRewriter?

To illustrate the flexibility of HTMLRewriter, I wrote an example that you can deploy on your own website. If you manage a website, you know that old links and images can expire with time. Here’s an excerpt from a years’ old post I wrote on the Cloudflare Blog:

As you might see, that missing image is not the prettiest sight. However, we can easily fix this using async handlers in HTMLRewriter. Using a service like the Internet Archive API, we can check if an image no longer exists and rewrite the URL to use the latest archive. That means users don’t see an ugly placeholder and won’t even know the image was replaced.

async function fetchAndFixImages(request) {
   return new HTMLRewriter()
      .on("img", new ImageFixer())
      .transform(await fetch(request))
}

class ImageFixer {
   async element(e) {
    var url = e.getAttribute("src")
    var response = await fetch(url)
    if (!response.ok) {
       var archive = await fetch(`https://archive.org/wayback/available?url=${url}`)
       if (archive.ok) {
          var snapshot = await archive.json()
          e.setAttribute("src", snapshot.archived_snapshots.closest.url)
       } else {
          e.remove()
       }
    }
  }
}

Using the Workers Playground, you can view a working sample of the above code. A more complex example could even alert a service like Sentry when a missing image is detected. Using the previous missing image, now you can see the image is restored and users are none of the wiser.

If you’re interested in deploying this to your own website, click on the button below:

What else can I build with HTMLRewriter?

We’ve been blown away by developer projects using HTMLRewriter. Here are a few projects that caught our eye and are great examples of the power of Cloudflare Workers and HTMLRewriter:

If you’re interested in using HTMLRewriter, check out our documentation. Also be sure to share any creations you’ve made with @CloudflareDev, we love looking at the awesome projects you build.

Building Salesforce integrations with Amazon EventBridge and Amazon AppFlow

2020-08-26 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-salesforce-integrations-with-amazon-eventbridge/

This post is courtesy of Den Delimarsky, Senior Product Manager, and Vinay Kondapi, Senior Product Manager.

The integration between Amazon EventBridge and Amazon AppFlow enables customers to receive and react to events from Salesforce in their event-driven applications. In this blog post, I show you how to set up the integration, and route Salesforce events to an AWS Lambda function for processing.

Amazon AppFlow is a fully managed integration service that enables you to securely transfer data between software as a service (SaaS) applications like Salesforce, Marketo, Slack, and ServiceNow, and AWS services like Amazon S3 and Amazon Redshift.

EventBridge SaaS integrations make it easier for customers to receive events from over 30 different SaaS providers. Salesforce is a popular SaaS provider among AWS customers, so it has been one of the most anticipated event sources for EventBridge. Customers want to build rich applications that can react to events that track campaigns, contracts, opportunities, and order changes.

The ability to receive these events allows you to build workflows where you can start a variety of processes. For example, you could notify a broad range of subscribers about the changes, or enrich the data with information from another service. Or you could route the event to an order delivery system.

Previously, to connect Salesforce to your application, you must write custom API polling code that routes events either directly to an application or to an event bus. With the Salesforce integration with EventBridge and Amazon AppFlow, the integration is built in minutes directly through the AWS Management Console, with no code required.

The solution outlined in this blog post is structured as follows:

Setting up the event source

To set up the event source:

Open the Amazon AppFlow console, and create a new flow. Choose Create flow button on the service landing page. Give your flow a unique name, and choose Next.
In the Source name list, select Salesforce, and then choose Connect. Select the Salesforce environment you are using, and provide a unique connection name.
Choose Continue. When prompted, provide your Salesforce credentials. These are the credentials that are associated with the specific Salesforce environment selected in the previous step.
Select Salesforce events from the list of available options for the flow, and choose the event that you want to route to EventBridge. This ensures that Amazon AppFlow can route specific events that are coming from Salesforce to an EventBridge event bus.
With the source set up, you can now specify the destination. In the Destination name list, select EventBridge.

To send Salesforce events to EventBridge, Amazon AppFlow creates a new partner event source that is associated with a partner event bus.

To create a partner event source:

Select an existing partner event source, or create a new one by choosing the list of partner event sources.
When creating a new event source, you can optionally customize the name, to make it easier for you to identify it later.
Choose an Amazon S3 bucket for large events. For events that are larger than 256 KB, Amazon AppFlow sends a URL for the S3 object to the event bus instead of the event payload.
Define a flow trigger, which determines when the flow is started. Because we are tracking events, we want to react to those as they come in. Using the default Run flow on event enables this scenario as changes occur in Salesforce.

With Amazon AppFlow, you can also configure data field mapping, validation rules, and filters. These enable you to enrich and modify event data before it is sent to the event bus.

Once you create the flow, you must activate the event source that you created. To complete this step:

Open the EventBridge console.
Associate a partner event source with an event bus by following the link in the Amazon AppFlow integration dialog box, or navigating directly to the partner event sources view. You can see a partner event source with a Pending state.
Select the event source and choose Associate with event bus.
Confirm the settings and choose on Associate.
Return to the Amazon AppFlow console, and open the flow you were creating. Choose Activate flow.

Your integration is now complete, and EventBridge can start receiving Salesforce events from the configured flow.

Routing Salesforce events to Lambda function

The associated partner event bus receives all events of the configured type from the connected Salesforce accounts. Now your application can react to these events with the help of rules in EventBridge. Rules allow you to set conditions for event routing that determine what targets receive event payloads. You can learn more about this functionality in the EventBridge documentation.

To create a new rule:

Go to the rules view in the EventBridge console, and choose Create rule.
Provide a unique name and an optional description for your rule.
Select the Event pattern option in the Define pattern section. With event pattern configuration, you can define parts of the event payload that EventBridge must look at to determine where to route the event.
For this exercise, start by capturing every Salesforce event that goes through the partner event bus. The only events routed through this bus are from the partner event source. In this case, it is Amazon AppFlow connected to Salesforce.
Set the event matching pattern to Pre-defined pattern by service, with the service provider being All Events. The default setting allows you to receive all events that are coming through the partner event bus.
Select the event bus that the rule should be associated with. Choose Custom or partner event bus and select the event bus that you associated with the Amazon AppFlow event source. Every rule in EventBridge is associated with an event bus.

When rules are triggered, the event can be routed to other AWS services. Additionally, every rule can have up to five different AWS targets. You can read more about available targets in the EventBridge documentation. For this blog post, we use an AWS Lambda function as a target for Salesforce events received from Amazon AppFlow.

To configure targets for your rule:

From the list of targets, select Lambda function, and select an existing function. If you do not yet have a function available, you can create one in the AWS Lambda console.
Choose Create. You have now completed the rule setup.

Now, Salesforce events that match the configured type are routed directly to a Lambda function in your account.

Testing the integration

To test the integration:

Open the Lambda view in the AWS Management Console.
Choose the function that is handling the events from EventBridge.

In the Function code section, update the code to:

exports.handler = async (event) => {
    console.log(event);
    const response = {
        statusCode: 200,
        body: JSON.stringify('Hello from Lambda!'),
    };
    return response;
};

Choose Save.
Open your Salesforce instance, and take an action that is associated with the event you configured earlier. For example, you could update a contract or create an order.
Go back to your function in AWS Management Console, and choose the Monitoring tab.
Scroll to CloudWatch Logs Insights section.
Choose the latest log stream. Make sure that the timestamp approximately matches the time when you triggered the action in Salesforce.
Choose the log stream.
Observe log events that contain Salesforce event data.

You have completed your first Salesforce integration with EventBridge and Amazon AppFlow. You are now able to build decoupled and highly scalable applications that integrate with Salesforce.

Conclusion

Building decoupled and scalable cross-service applications is more relevant than ever with requirements for high availability, consistency, and reliability. This blog post demonstrates a solution that connects Salesforce to an event-driven application that uses EventBridge and Amazon AppFlow to route events. The application uses events from Salesforce as a starting point for a custom processing workflow in a Lambda function.

To learn more about EventBridge, visit the EventBridge documentation or EventBridge Learning Path.

To learn more about Amazon AppFlow, visit the Amazon AppFlow documentation.

Announcing wrangler dev — the Edge on localhost

2020-08-26 Rita Kozlov

Post Syndicated from Rita Kozlov original https://blog.cloudflare.com/announcing-wrangler-dev-the-edge-on-localhost/

Announcing wrangler dev — the Edge on localhost

Cloudflare Workers — our serverless platform — allows developers around the world to run their applications from our network of 200 datacenters, as close as possible to their users.

A few weeks ago we announced a release candidate for wrangler dev — today, we’re excited to take wrangler dev, the world’s first edge-based development environment, to GA with the release of wrangler 1.11.

Think locally, develop globally

It was once assumed that to successfully run an application on the web, one had to go and acquire a server, set it up (in a data center that hopefully you had access to), and then maintain it on an ongoing basis. Luckily for most of us, that assumption was challenged with the emergence of the cloud. The cloud was always assumed to be centralized — large data centers in a single region (“us-east-1”), reserved for compute. The edge? That was for caching static content.

Again, assumptions are being challenged.

Cloudflare Workers is about moving compute from a centralized location to the edge. And it makes sense: if users are distributed all over the globe, why should all of them be routed to us-east-1, on the opposite side of the world, causing latency and degrading user experience?

But challenging one assumption caused others to come into view. One of the most obvious ones was: would a local development environment actually provide the best experience for someone looking to test their Worker code? Trying to fit the entire Cloudflare edge, with all its dependencies onto a developer’s machine didn’t seem to be the best approach. Especially given that the place the developer was going to run that code in production was mere milliseconds away from the computer they were running on.

When I was in college, getting started with programming, one of the biggest barriers to entry was installing all the dependencies required to run a single library. I would go as far as to say that the third, and often forgotten hardest problem in computer science is dependency management.

We’ve not the first to try and unify development environments across machines — tools such as Docker aim to solve this exact problem by providing a prepackaged development environment.

Yet, packaging up the Workers runtime is not quite so simple.

Beyond the Workers runtime, there are many components that make up Cloudflare’s edge, including DNS resolution, the Cloudflare cache — all of those parts are what makes Cloudflare Workers so powerful. That means that without those components, a standalone runtime is insufficient to represent the behavior of Worker request handling. The reason to develop locally first is to have the opportunity to experiment without affecting production. Thus, having a local development environment that truly reflects production is a requirement.

`wrangler dev`

wrangler dev provides all the convenience of a local development environment, without the headache of trying to reproduce the reality of production locally — and then having to keep the two environments in sync.

By running at the edge, it provides a high fidelity, consistent experience for all developers, without sacrificing the speedy feedback loop of a local development environment.

Live reloading

Announcing wrangler dev — the Edge on localhost

As you update your code, wrangler dev will detect changes, and push the new version of your code to the edge.

console.log() at your fingertips

Previously to extract your console logs from the Workers runtime, you had to have the Workers Preview open in a browser window at all times. With wrangler dev, you can receive your own logs, directly to your terminal of choice.

Cache API, KV, and more!

Since wrangler dev runs on the edge, you can now easily test the state of a cache.put(), without having to deploy your Worker to production.

wrangler dev will spin up a new KV namespace for development, so you don’t have to worry about affecting your production data.

And if you’re looking to test out some of the features provided on request.cf that provide rich information about the request such as geo-location — they will all be provided from the Cloudflare data center.

Get started

wrangler dev is now available in the latest version of Wrangler, the official Cloudflare Workers CLI.

To get started, follow our installation instructions here.

What’s next?

wrangler dev is just our first foray into giving our developers more visibility and agility with their development process.

We recognize that we have a lot more work to do to meet our developers needs, including providing an easy testing framework for Workers, and allowing our customers to observe their Workers’ behavior in production.

Just as wrangler dev provides a quick feedback loop between our developers and their code, we love to have a tight feedback loop between our developers and our product. We love to hear what you’re building, how you’re building it, and how we can help you build it better.

Building storage-first serverless applications with HTTP APIs service integrations

2020-08-25 Eric Johnson

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/building-storage-first-applications-with-http-apis-service-integrations/

Over the last year, I have been talking about “storage first” serverless patterns. With these patterns, data is stored persistently before any business logic is applied. The advantage of this pattern is increased application resiliency. By persisting the data before processing, the original data is still available, if or when errors occur.

Common pattern for serverless API backend

Using Amazon API Gateway as a proxy to an AWS Lambda function is a common pattern in serverless applications. The Lambda function handles the business logic and communicates with other AWS or third-party services to route, modify, or store the processed data. One option is to place the data in an Amazon Simple Queue Service (SQS) queue for processing downstream. In this pattern, the developer is responsible for handling errors and retry logic within the Lambda function code.

The storage first pattern flips this around. It uses native error handling with retry logic or dead-letter queues (DLQ) at the SQS layer before any code is run. By directly integrating API Gateway to SQS, developers can increase application reliability while reducing lines of code.

Storage first pattern for serverless API backend

Previously, direct integrations require REST APIs with transformation templates written in Velocity Template Language (VTL). However, developers tell us they would like to integrate directly with services in a simpler way without using VTL. As a result, HTTP APIs now offers the ability to directly integrate with five AWS services without needing a transformation template or code layer.

The first five service integrations

This release of HTTP APIs direct integrations includes Amazon EventBridge, Amazon Kinesis Data Streams, Simple Queue Service (SQS), AWS System Manager’s AppConfig, and AWS Step Functions. With these new integrations, customers can create APIs and webhooks for their business logic hosted in these AWS services. They can also take advantage of HTTP APIs features like authorizers, throttling, and enhanced observability for securing and monitoring these applications.

Amazon EventBridge

HTTP APIs service integration with Amazon EventBridge

The HTTP APIs direct integration for EventBridge uses the PutEvents API to enable client applications to place events on an EventBridge bus. Once the events are on the bus, EventBridge routes the event to specific targets based upon EventBridge filtering rules.

This integration is a storage first pattern because data is written to the bus before any routing or logic is applied. If the downstream target service has issues, then EventBridge implements a retry strategy with incremental back-off for up to 24 hours. Additionally, the integration helps developers reduce code by filtering events at the bus. It routes to downstream targets without the need for a Lambda function as a transport layer.

Use this direct integration when:

Different tasks are required based upon incoming event details
Only data ingestion is required
Payload size is less than 256 kb
Expected requests per second are less than the Region quotas.

Amazon Kinesis Data Streams

HTTP APIs service integration with Amazon Kinesis Data Streams

The HTTP APIs direct integration for Kinesis Data Streams offers the PutRecord integration action, enabling client applications to place events on a Kinesis data stream. Kinesis Data Streams are designed to handle up to 1,000 writes per second per shard, with payloads up to 1 mb in size. Developers can increase throughput by increasing the number of shards in the data stream. You can route the incoming data to targets like an Amazon S3 bucket as part of a data lake or a Kinesis data analytics application for real-time analytics.

This integration is a storage first option because data is stored on the stream for up to seven days until it is processed and routed elsewhere. When processing stream events with a Lambda function, errors are handled at the Lambda layer through a configurable error handling strategy.

Use this direct integration when:

Ingesting large amounts of data
Ingesting large payload sizes
Order is important
Routing the same data to multiple targets

Amazon SQS

HTTP APIs service integration with Amazon SQS

The HTTP APIs direct integration for Amazon SQS offers the SendMessage, ReceiveMessage, DeleteMessage, and PurgeQueue integration actions. This integration differs from the EventBridge and Kinesis integrations in that data flows both ways. Events can be created, read, and deleted from the SQS queue via REST calls through the HTTP API endpoint. Additionally, a full purge of the queue can be managed using the PurgeQueue action.

This pattern is a storage first pattern because the data remains on the queue for four days by default (configurable to 14 days), unless it is processed and removed. When the Lambda service polls the queue, the messages that are returned are hidden in the queue for a set amount of time. Once the calling service has processed these messages, it uses the DeleteMessage API to remove the messages permanently.

When triggering a Lambda function with an SQS queue, the Lambda service manages this process internally. However, HTTP APIs direct integration with SQS enables developers to move this process to client applications without the need for a Lambda function as a transport layer.

Use this direct integration when:

Data must be received as well as sent to the service
Downstream services need reduced concurrency
The queue requires custom management
Order is important (FIFO queues)

AWS AppConfig

HTTP APIs service integration with AWS Systems Manager AppConfig

The HTTP APIs direct integration for AWS AppConfig offers the GetConfiguration integration action and allows applications to check for application configuration updates. By exposing the systems parameter API through an HTTP APIs endpoint, developers can automate configuration changes for their applications. While this integration is not considered a storage first integration, it does enable direct communication from external services to AppConfig without the need for a Lambda function as a transport layer.

Use this direct integration when:

Access to AWS AppConfig is required.
Managing application configurations.

AWS Step Functions

HTTP APIs service integration with AWS Step Functions

The HTTP APIs direct integration for Step Functions offers the StartExecution and StopExecution integration actions. These actions allow for programmatic control of a Step Functions state machine via an API. When starting a Step Functions workflow, JSON data is passed in the request and mapped to the state machine. Error messages are also mapped to the state machine when stopping the execution.

This pattern provides a storage first integration because Step Functions maintains a persistent state during the life of the orchestrated workflow. Step Functions also supports service integrations that allow the workflows to send and receive data without needing a Lambda function as a transport layer.

Use this direct integration when:

Orchestrating multiple actions.
Order of action is required.

Building HTTP APIs direct integrations

HTTP APIs service integrations can be built using the AWS CLI, AWS SAM, or through the API Gateway console. The console walks through contextual choices to help you understand what is required for each integration. Each of the integrations also includes an Advanced section to provide additional information for the integration.

Creating an HTTP APIs service integration

Once you build an integration, you can export it as an OpenAPI template that can be used with infrastructure as code (IaC) tools like AWS SAM. The exported template can also include the API Gateway extensions that define the specific integration information.

Exporting the HTTP APIs configuration to OpenAPI

OpenAPI template

An example of a direct integration from HTTP APIs to SQS is located in the Sessions With SAM repository. This example includes the following architecture:

AWS SAM template resource architecture

The AWS SAM template creates the HTTP APIs, SQS queue, Lambda function, and both Identity and Access Management (IAM) roles required. This is all generated in 58 lines of code and looks like this:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: HTTP API direct integrations

Resources:
  MyQueue:
    Type: AWS::SQS::Queue
    
  MyHttpApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      DefinitionBody:
        'Fn::Transform':
          Name: 'AWS::Include'
          Parameters:
            Location: './api.yaml'
          
  MyHttpApiRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service: "apigateway.amazonaws.com"
            Action: 
              - "sts:AssumeRole"
      Policies:
        - PolicyName: ApiDirectWriteToSQS
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              Action:
              - sqs:SendMessage
              Effect: Allow
              Resource:
                - !GetAtt MyQueue.Arn
                
  MyTriggeredLambda:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambdaHandler
      Runtime: nodejs12.x
      Policies:
        - SQSPollerPolicy:
            QueueName: !GetAtt MyQueue.QueueName
      Events:
        SQSTrigger:
          Type: SQS
          Properties:
            Queue: !GetAtt MyQueue.Arn

Outputs:
  ApiEndpoint:
    Description: "HTTP API endpoint URL"
    Value: !Sub "https://${MyHttpApi}.execute-api.${AWS::Region}.amazonaws.com"

The OpenAPI template handles the route definitions for the HTTP API configuration and configures the service integration. The template looks like this:

openapi: "3.0.1"
info:
  title: "my-sqs-api"
paths:
  /:
    post:
      responses:
        default:
          description: "Default response for POST /"
      x-amazon-apigateway-integration:
        integrationSubtype: "SQS-SendMessage"
        credentials:
          Fn::GetAtt: [MyHttpApiRole, Arn]
        requestParameters:
          MessageBody: "$request.body.MessageBody"
          QueueUrl:
            Ref: MyQueue
        payloadFormatVersion: "1.0"
        type: "aws_proxy”
        connectionType: "INTERNET"
x-amazon-apigateway-importexport-version: "1.0"

Because the OpenAPI template is included in the AWS SAM template via a transform, the API Gateway integration can reference the roles and services created within the AWS SAM template.

Conclusion

This post covers the concept of storage first integration patterns and how the new HTTP APIs direct integrations can help. I cover the five current integrations and possible use cases for each. Additionally, I demonstrate how to use AWS SAM to build and manage the integrated applications using infrastructure as code.

Using the storage first pattern with direct integrations can help developers build serverless applications that are more durable with fewer lines of code. A Lambda function is no longer required to transport data from the API endpoint to the desired service. Instead, use Lambda function invocations for differentiating business logic.

To learn more join us for the HTTP API service integrations session of Sessions With SAM!

#ServerlessForEveryone

Fundbox: Simplifying Ways to Query and Analyze Data by Different Personas

2020-08-25 Annik Stahl

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/fundbox-simplifying-ways-to-query-and-analyze-data-by-different-personas/

Fundbox is a leading technology platform focused on disrupting the $21 trillion B2B commerce market by building the world’s first B2B payment and credit network. With Fundbox, sellers of all sizes can quickly increase average order volumes (AOV) and improve close rates by offering more competitive net terms and payment plans to their SMB buyers. With heavy investments in machine learning and the ability to quickly analyze the transactional data of SMB’s, Fundbox is reimagining B2B payments and credit products in new category-defining ways.

Learn how how the company simplified the way different personas in the organization query and analyze data by building a self-service data orchestration platform. The platform architecture is entirely serverless, which simplifies the ability to scale and adopt to unpredictable demand. The platform was built using AWS Step Functions, AWS Lambda, Amazon API Gateway, Amazon DynamoDB, AWS Fargate, and other AWS Serverless managed services.

For more content like this, subscribe to our YouTube channels This is My Architecture, This is My Code, and This is My Model, or visit the This is My Architecture on AWS, which has search functionality and the ability to filter by industry, language, and service.

Improving the Wrangler Startup Experience

2020-08-25 Joshua Johnson

Post Syndicated from Joshua Johnson original https://blog.cloudflare.com/improving-the-wrangler-startup-experience/

Improving the Wrangler Startup Experience

Today I’m excited to announce wrangler login, an easy way to get started with Wrangler! This summer for my internship on the Workers Developer Productivity team I was tasked with helping improve the Wrangler user experience. For those who don’t know, Workers is Cloudflare’s serverless platform which allows users to deploy their software directly to Cloudflare’s edge network.

This means you can write any behaviour on requests heading to your site or even run fully fledged applications directly on the edge. Wrangler is the open-source CLI tool used to manage your Workers and has a big focus on enabling a smooth developer experience.

When I first heard I was working on Wrangler, I was excited that I would be working on such a cool product but also a little nervous. This was the first time I would be writing Rust in a professional environment, the first time making meaningful open-source contributions, and on top of that the first time doing all of this remotely. But thanks to lots of guidance and support from my mentor and team, I was able to help make the Wrangler and Workers developer experience just a little bit better.

The Problem

The main improvement I focused on this summer was the experience when getting started with Wrangler. For many of the commands to publish and develop live Workers, the user first needs to authenticate with Cloudflare. This is mainly done through the wrangler config command which has the user go create an API token and paste it into Wrangler. Creating a token involves going to the Cloudflare dashboard, going to your profile, going to the API tokens page, selecting a token template, adding your zones and accounts, and finally creating the token. While this is a completely valid authentication flow, it’s not as easy as it could be.

It could be frustrating to users who have to leave Wrangler and then possibly get lost in the wrong dashboard page or use the wrong settings for their token. When a group of intern candidates were given the task of using Wrangler, most of them got stuck on this step! Many users might forgo using Workers altogether if this is the first thing they encounter when sitting down to develop. Instead we wanted an experience where users could use their Cloudflare login (ie. their username, password, and possible two-factor authentication) and immediately be ready to go.

No OAuth? No Problem

What we came up with was a way to create and transfer API Tokens for a user, similar to how Argo Tunnel does their login.

An overview of the process is shown above, which starts with Wrangler. When the user types wrangler login in their terminal, they will be prompted to open the Cloudflare dashboard in their browser. All dashboard pages require the user to sign in before loading and once the user is signed in, all actions taken by the dashboard page will use the authentication of that user.

This means we can make a dashboard page which automatically creates an API token configured to manage Workers. Then when the user loads this page, a properly configured API token will be created for that user. Our dashboard page will then hand off the token to EdgeWorker Config Service (EWC) which will temporarily store it. While this is all going on Wrangler will be polling EWC waiting for the token to appear and once it does, Wrangler will retrieve the token and authenticate the user. With this, we have a seamless way to authenticate a Cloudflare user.

Security

One thing we had to be mindful of was security, these are users’ tokens after all. If someone was listening to network traffic and saw the request to the Cloudflare dashboard page, nothing would be stopping them from polling EWC themselves and stealing the token away from the user to wreak havoc on their Workers and zones. To solve this problem we used asymmetric RSA encryption. Asymmetric encryption lets us create two separate but mathematically connected keys. One is a private key which can encrypt and decrypt information and one is a public key which can only encrypt information.

Wrangler will generate a public-private key pair and pass off the public key to our dashboard page. Once the dashboard page is finished creating our token, EWC will then encrypt the token using the public key before storing. This means in the previous scenario where someone takes the token from our user, all they will have is an encrypted token they can’t use. The only way to decrypt it would be with the private key held by Wrangler.

In the end, this solution results in a smooth experience for Workers users. Now instead of rummaging through dashboard pages you can get started with Wrangler in only a few seconds, sometimes without having to leave the comfort of your own terminal.

Try out wrangler login in the 1.11.0 release of Wrangler and let us know how you like it. Also I would like to thank the Workers team for helping make this possible and giving me an awesome experience this summer! In order to implement this feature I had to touch different parts of Cloudflare like EWC and Stratus (Cloudflare’s front end monorepo) and work in areas unfamiliar to me such as frontend TypeScript and React. The responsiveness and encouragement I received helped get this feature created and helped make for a great summer!

Using serverless backends to iterate quickly on web apps – part 2

2020-08-24 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-serverless-backends-to-iterate-quickly-on-web-apps-part-2/

This series is about building flexible solutions that can adapt as user requirements change. One of the challenges of building modern web applications is that requirements can change quickly. This is especially true for new applications that are finding their product-market fit. Many development teams start building a product with one set of requirements, and quickly find they must build a product with different features.

For both start-ups and enterprises, it’s often important to find a development methodology and architecture that allows flexibility. This is the surest way to keep up with feature requests in evolving products and innovate to delight your end-users. In this post, I show how to build sophisticated workflows using minimal custom code.

Part 1 introduces the Happy Path application that allows park visitors to share maps and photos with other users. In that post, I explain the functionality, how to deploy the application, and walk through the backend architecture.

The Happy Path application accepts photo uploads from users’ smartphones. The application architecture must support 100,000 monthly active users. These binary uploads are typically 3–9 MB in size and must be resized and optimized for efficient distribution.

Using a serverless approach, you can develop a robust low-code solution that can scale to handle millions of images. Additionally, the solution shown here is designed to handle complex changes that are introduced in subsequent versions of the software. The code and instructions for this application are available in the GitHub repo.

Architecture overview

After installing the backend in the previous post, the architecture looks like this:

In this design, the API, storage, and notification layers exist as one application, and the business logic layer is a separate application. These two applications are deployed using AWS Serverless Application Model (AWS SAM) templates. This architecture uses Amazon EventBridge to pass events between the two applications.

In the business logic layer:

The workflow starts when events are received from EventBridge. Each time a new object is uploaded by an end-user, the PUT event in the Amazon S3 Upload bucket triggers this process.
After the workflow is completed successfully, processed images are stored in the Distribution bucket. Related metadata for the object is also stored in the application’s Amazon DynamoDB table.

By separating the architecture into two independent applications, you can replace the business logic layer as needed. Providing that the workflow accepts incoming events and then stores processed images in the S3 bucket and DynamoDB table, the workflow logic becomes interchangeable. Using the pattern, this workflow can be upgraded to handle new functionality.

Introducing AWS Step Functions for workflow management

One of the challenges in building distributed applications is coordinating components. These systems are composed of separate services, which makes orchestrating workflows more difficult than working with a single monolithic application. As business logic grows more complex, if you attempt to manage this in custom code, it can become quickly convoluted. This is especially true if it handles retries and error handling logic, and it can be hard to test and maintain.

AWS Step Functions is designed to coordinate and manage these workflows in distributed serverless applications. To do this, you create state machine diagrams using Amazon States Language (ASL). Step Functions renders a visualization of your state machine, which makes it simpler to see the flow of data from one service to another.

Each state machine consists of a series of steps. Each step takes an input and produces an output. Using ASL, you define how this data progresses through the state machine. The flow from step to step is called a transition. All state machines transition from a Start state towards an End state.

The Step Functions service manages the state of individual executions. The service also supports versioning, which makes it easier to modify state machines in production systems. Executions continue to use the version of a state machine when they were started, so it’s possible to have active executions on multiple versions.

For developers using VS Code, the AWS Toolkit extension provides support for writing state machines using ASL. It also renders visualizations of those workflows. Combined with AWS Serverless Application Model (AWS SAM) templates, this provides a powerful way to deploy and maintain applications based on Step Functions. I refer to this IDE and AWS SAM in this walkthrough.

Version 1: Image resizing

The Happy Path application uses Step Functions to manage the image-processing part of the backend. The first version of this workflow resizes the uploaded image.

To see this workflow:

In VS Code, open the workflows/statemachines folder in the Explorer panel.
Choose the v1.asl.sjon file.
Choose the Render graph option in the CodeLens. This opens the workflow visualization.

In this basic workflow, the state machine starts at the Resizer step, then progresses to the Publish step before ending:

In the top-level attributes in the definition, StartsAt sets the Resizer step as the first action.
The Resizer step is defined as a task with an ARN of a Lambda function. The Next attribute determines that the Publish step is next.
In the Publish step, this task defines a Lambda function using an ARN reference. It sets the input payload as the entire JSON payload. This step is set as the End of the workflow.

Deploying the Step Functions workflow

To deploy the state machine:

In the terminal window, change directory to the workflows/templates/v1 folder in the repo.
Execute these commands to build and deploy the AWS SAM template:
```
sam build
sam deploy –guided
```
The deploy process prompts you for several parameters. Enter happy-path-workflow-v1 as the Stack Name. The other values are the outputs from the backend deployment process, detailed in the repo’s README. Enter these to complete the deployment.

Testing and inspecting the deployed workflow

Now the workflow is deployed, you perform an integration test directly from the frontend application.

To test the deployed v1 workflow:

Open the frontend application at https://localhost:8080 in your browser.
Select a park location, choose Show Details, and then choose Upload images.
Select an image from the sample photo dataset.
After a few seconds, you see a pop-up message confirming that the image has been added:
Select the same park location again, and the information window now shows the uploaded image:

To see how the workflow processed this image:

Navigate to the Steps Functions console.
Here you see the v1StateMachine with one execution in the Succeeded column.
Choose the state machine to display more information about the start and end time.
Select the execution ID in the Executions panel to open details of this single instance of the workflow.

This view shows important information that’s useful for understanding and debugging an execution. Under Input, you see the event passed into Step Functions by EventBridge:

This contains detail about the S3 object event, such as the bucket name and key, together with the placeId, which identifies the location on the map. Under Output, you see the final result from the state machine, shows a successful StatusCode (200) and other metadata:

Using AWS SAM to define and deploy Step Functions state machines

The AWS SAM template defines both the state machine, the trigger for executions, and the permissions needed for Step Functions to execute. The AWS SAM resource for a Step Functions definition is AWS::Serverless::StateMachine.

In this example:

DefinitionUri refers to an external ASL definition, instead of embedding the JSON in the AWS SAM template directly.
DefinitionSubstitutions allow you to use tokens in the ASL definition that refer to resources created in the AWS SAM template. For example, the token ${ResizerFunctionArn} refers to the ARN of the resizer Lambda function.
Events define how the state machine is invoked. Here it defines an EventBridge rule. If an event matches this source and detail-type, it triggers an execution.
Policies: the Step Functions service must have permission to invoke the services that perform tasks in the state machine. AWS SAM policy templates provide a convenient shorthand for common execution policies, such as invoking a Lambda function.

This workflow application is separate from the main backend template. As more functionality is added to the workflow, you deploy the subsequent AWS SAM templates in the same way.

Conclusion

Using AWS SAM, you can specify serverless resources, configure permissions, and define substitutions for the ASL template. You can deploy a standalone Step Functions-based application using the AWS SAM CLI, separately from other parts of your application. This makes it easier to decouple and maintain larger applications. You can visualize these workflows directly in the VS Code IDE in addition to the AWS Management Console.

In part 3, I show how to build progressively more complex workflows and how to deploy these in-place without affecting the other parts of the application.

To learn more about building serverless web applications, see the Ask Around Me series.