Tag Archives: serverless

New – Using Step Functions to Orchestrate Amazon EMR workloads

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-using-step-functions-to-orchestrate-amazon-emr-workloads/

AWS Step Functions allows you to add serverless workflow automation to your applications. The steps of your workflow can run anywhere, including in AWS Lambda functions, on Amazon Elastic Compute Cloud (EC2), or on-premises. To simplify building workflows, Step Functions is directly integrated with multiple AWS Services: Amazon ECS, AWS Fargate, Amazon DynamoDB, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), AWS Batch, AWS Glue, Amazon SageMaker, and (to run nested workflows) with Step Functions itself.

Starting today, Step Functions connects to Amazon EMR, enabling you to create data processing and analysis workflows with minimal code, saving time, and optimizing cluster utilization. For example, building data processing pipelines for machine learning is time-consuming and hard. With this new integration, you have a simple way to orchestrate workflow capabilities, including parallel executions and dependencies from the result of a previous step, and handle failures and exceptions when running data processing jobs.

Specifically, a Step Functions state machine can now:

  • Create or terminate an EMR cluster, including the ability to change the cluster termination protection. In this way, you can reuse an existing EMR cluster for your workflow, or create one on-demand during execution of a workflow.
  • Add or cancel an EMR step for your cluster. Each EMR step is a unit of work that contains instructions to manipulate data for processing by software installed on the cluster, including tools such as Apache Spark, Hive, or Presto.
  • Modify the size of an EMR cluster instance fleet or group, allowing you to manage scaling programmatically depending on the requirements of each step of your workflow. For example, you may increase the size of an instance group before adding a compute-intensive step, and reduce the size just after it has completed.

When you create or terminate a cluster or add an EMR step to a cluster, you can use synchronous integrations to move to the next step of your workflow only when the corresponding activity has completed on the EMR cluster.

Reading the configuration or the state of your EMR clusters is not part of the Step Functions service integration. If you need that, you can access the EMR List* and Describe* APIs using Lambda functions as tasks.
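
As a sketch of that approach, a Lambda function used as a workflow task could call the EMR DescribeCluster API with the AWS SDK for Python (Boto3). The handler below is illustrative; the shape of the input event is an assumption, not part of the service integration:

import boto3

emr = boto3.client("emr")

def lambda_handler(event, context):
    # Assumes the state machine passes the cluster ID in the input,
    # for example {"ClusterId": "j-..."}.
    response = emr.describe_cluster(ClusterId=event["ClusterId"])
    # Return only what the next state needs, such as the cluster status.
    return {"ClusterState": response["Cluster"]["Status"]["State"]}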

Building a Workflow with EMR and Step Functions
On the Step Functions console, I create a new state machine. The console renders the workflow visually, which makes it much easier to understand:

To create the state machine, I use the following definition using the Amazon States Language (ASL):

{
  "StartAt": "Should_Create_Cluster",
  "States": {
    "Should_Create_Cluster": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.CreateCluster",
          "BooleanEquals": true,
          "Next": "Create_A_Cluster"
        },
        {
          "Variable": "$.CreateCluster",
          "BooleanEquals": false,
          "Next": "Enable_Termination_Protection"
        }
      ],
      "Default": "Create_A_Cluster"
    },
    "Create_A_Cluster": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
      "Parameters": {
        "Name": "WorkflowCluster",
        "VisibleToAllUsers": true,
        "ReleaseLabel": "emr-5.28.0",
        "Applications": [{ "Name": "Hive" }],
        "ServiceRole": "EMR_DefaultRole",
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "LogUri": "s3://aws-logs-123412341234-eu-west-1/elasticmapreduce/",
        "Instances": {
          "KeepJobFlowAliveWhenNoSteps": true,
          "InstanceFleets": [
            {
              "InstanceFleetType": "MASTER",
              "TargetOnDemandCapacity": 1,
              "InstanceTypeConfigs": [
                {
                  "InstanceType": "m4.xlarge"
                }
              ]
            },
            {
              "InstanceFleetType": "CORE",
              "TargetOnDemandCapacity": 1,
              "InstanceTypeConfigs": [
                {
                  "InstanceType": "m4.xlarge"
                }
              ]
            }
          ]
        }
      },
      "ResultPath": "$.CreateClusterResult",
      "Next": "Merge_Results"
    },
    "Merge_Results": {
      "Type": "Pass",
      "Parameters": {
        "CreateCluster.$": "$.CreateCluster",
        "TerminateCluster.$": "$.TerminateCluster",
        "ClusterId.$": "$.CreateClusterResult.ClusterId"
      },
      "Next": "Enable_Termination_Protection"
    },
    "Enable_Termination_Protection": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:setClusterTerminationProtection",
      "Parameters": {
        "ClusterId.$": "$.ClusterId",
        "TerminationProtected": true
      },
      "ResultPath": null,
      "Next": "Add_Steps_Parallel"
    },
    "Add_Steps_Parallel": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "Step_One",
          "States": {
            "Step_One": {
              "Type": "Task",
              "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
              "Parameters": {
                "ClusterId.$": "$.ClusterId",
                "Step": {
                  "Name": "The first step",
                  "ActionOnFailure": "CONTINUE",
                  "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                      "hive-script",
                      "--run-hive-script",
                      "--args",
                      "-f",
                      "s3://eu-west-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
                      "-d",
                      "INPUT=s3://eu-west-1.elasticmapreduce.samples",
                      "-d",
                      "OUTPUT=s3://MY-BUCKET/MyHiveQueryResults/"
                    ]
                  }
                }
              },
              "End": true
            }
          }
        },
        {
          "StartAt": "Wait_10_Seconds",
          "States": {
            "Wait_10_Seconds": {
              "Type": "Wait",
              "Seconds": 10,
              "Next": "Step_Two (async)"
            },
            "Step_Two (async)": {
              "Type": "Task",
              "Resource": "arn:aws:states:::elasticmapreduce:addStep",
              "Parameters": {
                "ClusterId.$": "$.ClusterId",
                "Step": {
                  "Name": "The second step",
                  "ActionOnFailure": "CONTINUE",
                  "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                      "hive-script",
                      "--run-hive-script",
                      "--args",
                      "-f",
                      "s3://eu-west-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
                      "-d",
                      "INPUT=s3://eu-west-1.elasticmapreduce.samples",
                      "-d",
                      "OUTPUT=s3://MY-BUCKET/MyHiveQueryResults/"
                    ]
                  }
                }
              },
              "ResultPath": "$.AddStepsResult",
              "Next": "Wait_Another_10_Seconds"
            },
            "Wait_Another_10_Seconds": {
              "Type": "Wait",
              "Seconds": 10,
              "Next": "Cancel_Step_Two"
            },
            "Cancel_Step_Two": {
              "Type": "Task",
              "Resource": "arn:aws:states:::elasticmapreduce:cancelStep",
              "Parameters": {
                "ClusterId.$": "$.ClusterId",
                "StepId.$": "$.AddStepsResult.StepId"
              },
              "End": true
            }
          }
        }
      ],
      "ResultPath": null,
      "Next": "Step_Three"
    },
    "Step_Three": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
      "Parameters": {
        "ClusterId.$": "$.ClusterId",
        "Step": {
          "Name": "The third step",
          "ActionOnFailure": "CONTINUE",
          "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
              "hive-script",
              "--run-hive-script",
              "--args",
              "-f",
              "s3://eu-west-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
              "-d",
              "INPUT=s3://eu-west-1.elasticmapreduce.samples",
              "-d",
              "OUTPUT=s3://MY-BUCKET/MyHiveQueryResults/"
            ]
          }
        }
      },
      "ResultPath": null,
      "Next": "Disable_Termination_Protection"
    },
    "Disable_Termination_Protection": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:setClusterTerminationProtection",
      "Parameters": {
        "ClusterId.$": "$.ClusterId",
        "TerminationProtected": false
      },
      "ResultPath": null,
      "Next": "Should_Terminate_Cluster"
    },
    "Should_Terminate_Cluster": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.TerminateCluster",
          "BooleanEquals": true,
          "Next": "Terminate_Cluster"
        },
        {
          "Variable": "$.TerminateCluster",
          "BooleanEquals": false,
          "Next": "Wrapping_Up"
        }
      ],
      "Default": "Wrapping_Up"
    },
    "Terminate_Cluster": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster.sync",
      "Parameters": {
        "ClusterId.$": "$.ClusterId"
      },
      "Next": "Wrapping_Up"
    },
    "Wrapping_Up": {
      "Type": "Pass",
      "End": true
    }
  }
}

I let the Step Functions console create a new AWS Identity and Access Management (IAM) role for the executions of this state machine. The role automatically includes all permissions required to access EMR.

This state machine can either use an existing EMR cluster, or create a new one. I can use the following input to create a new cluster that is terminated at the end of the workflow:

{
  "CreateCluster": true,
  "TerminateCluster": true
}

To use an existing cluster, I need to provide the cluster ID in the input, using this syntax:

{
  "CreateCluster": false,
  "TerminateCluster": false,
  "ClusterId": "j-..."
}
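
For reference, this is how I could start an execution with that input from the AWS CLI; the state machine name in the ARN is a placeholder:

$ aws stepfunctions start-execution \
    --state-machine-arn arn:aws:states:eu-west-1:123412341234:stateMachine:MyEMRWorkflow \
    --input '{"CreateCluster": false, "TerminateCluster": false, "ClusterId": "j-..."}'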

Let’s see how that works. As the workflow starts, the Should_Create_Cluster Choice state looks into the input to decide if it should enter the Create_A_Cluster state or not. There, I use a synchronous call (elasticmapreduce:createCluster.sync) to wait for the new EMR cluster to reach the WAITING state before progressing to the next workflow state. The AWS Step Functions console shows the resource that is being created with a link to the EMR console:

After that, the Merge_Results Pass state merges the input state with the cluster ID of the newly created cluster to pass it to the next step in the workflow.

Before starting to process any data, I use the Enable_Termination_Protection state (elasticmapreduce:setClusterTerminationProtection) to help ensure that the EC2 instances in my EMR cluster are not shut down by accident or error.

Now I am ready to do something with the EMR cluster. I have three EMR steps in the workflow. For the sake of simplicity, these steps are all based on this Hive tutorial. For each step, I use Hive’s SQL-like interface to run a query on some sample CloudFront logs and write the results to Amazon Simple Storage Service (S3). In a production use case, you’d probably have a combination of EMR tools processing and analyzing your data in parallel (two or more steps running at the same time) or with some dependencies (the output of one step is required by another step). Let’s try to do something similar.

First I execute Step_One and Step_Two inside a Parallel state:

  • Step_One is running the EMR step synchronously as a job (elasticmapreduce:addStep.sync). That means that the execution waits for the EMR step to be completed (or cancelled) before moving on to the next step in the workflow. You can optionally add a timeout to monitor that the execution of the EMR step happens within an expected time frame, as shown in the sketch after this list.
  • Step_Two is adding an EMR step asynchronously (elasticmapreduce:addStep). In this case, the workflow moves to the next step as soon as EMR replies that the request has been received. After a few seconds, to try another integration, I cancel Step_Two (elasticmapreduce:cancelStep). This integration can be really useful in production use cases. For example, you can cancel an EMR step if you get an error from another step running in parallel that would make it useless to continue with the execution of this step.
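
As a sketch of that timeout, the standard TimeoutSeconds field can be added directly to the Task state; the ten-minute value below is an arbitrary example, and the Parameters are elided for brevity:

"Step_One": {
  "Type": "Task",
  "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
  "TimeoutSeconds": 600,
  "Parameters": { ... },
  "End": true
}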

After those two steps have both completed and produced their results, I execute Step_Three as a job, similarly to what I did for Step_One. When Step_Three has completed, I enter the Disable_Termination_Protection step, because I am done using the cluster for this workflow.

Depending on the input state, the Should_Terminate_Cluster Choice state is going to enter the Terminate_Cluster state (elasticmapreduce:terminateCluster.sync) and wait for the EMR cluster to terminate, or go straight to the Wrapping_Up state and leave the cluster running.

Finally, I have a state for Wrapping_Up. I am not actually doing much in this final state, but you can't end a workflow from a Choice state.

In the EMR console I see the status of my cluster and of the EMR steps:

Using the AWS Command Line Interface (CLI), I find the results of my query in the S3 bucket configured as output for the EMR steps:

aws s3 ls s3://MY-BUCKET/MyHiveQueryResults/
...

Based on my input, the EMR cluster is still running at the end of this workflow execution. I follow the resource link in the Create_A_Cluster step to go to the EMR console and terminate it. In case you are following along with this demo, be careful to not leave your EMR cluster running if you don’t need it.

Available Now
Step Functions integration with EMR is available in all AWS Regions. There is no additional cost for using this feature on top of the usual Step Functions and EMR pricing.

You can now use Step Functions to quickly build complex workflows for executing EMR jobs. A workflow can include parallel executions, dependencies, and exception handling. Step Functions makes it easy to retry failed jobs and terminate workflows after critical errors, because you can specify what happens when something goes wrong. Let me know what you are going to use this feature for!

Danilo

Serverless at AWS re:Invent 2019

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/serverless-at-aws-reinvent-2019/

Our annual AWS re:Invent conference is just two weeks away! We can’t wait to meet you for an AWSome week in Las Vegas. The Serverless team is now hard at work preparing to deliver over 130 sessions at re:Invent. Come meet us and learn how to use the newest Serverless innovations to build and architect modern applications.

re:Invent 2019

Breakouts, Talks, Builders, & Demos!

To find any Serverless session, you can search our agenda for the keyword “SVS” or you can visit our re:Invent 2019 Session Catalog. Let’s take a look at some of the Architecture-focused sessions you might want to join:

Workshops

  • SVS305-R: How to secure your Serverless APIs
    You’ll get hands-on with Amazon API Gateway and learn how to architect for scale and security.
  • SVS303-R: Monolith to Serverless
    This workshop shows you how to re-architect monolithic applications to AWS Lambda-based microservices.

Breakouts

  • SVS308: Moving to event-driven architectures
    Learn about the new event-driven world and how our newest tools help you develop event-centric applications.
  • SVS407: Architecting and operating resilient Serverless systems
    This is an excellent session to learn best practice patterns for building reliable applications.
  • SVS401: Optimizing your Serverless applications
    Learn how to choose the correct services in your architecture and how to design your Lambda functions and APIs for security and scale.

Chalk Talks

  • SVS338: API Patterns and architectures (REST vs GraphQL APIs)
    We’ll help you evaluate your choices for modern APIs. Come learn how to choose between REST and GraphQL APIs.
  • SVS213: Thinking Serverless
    How do you go from a flowchart to a Serverless application? Come to this session to learn the techniques you can use to design Serverless architectures.
  • SVS323: Mastering AWS Lambda streaming event sources
    This talk will go in depth on the common architecture patterns for consuming and scaling Amazon Kinesis and Amazon DynamoDB streams with AWS Lambda.

Builders Sessions

  • SVS330: Build secure Serverless mobile or web applications
    Get hands on experience building a serverless web application using AWS AppSync, AWS Lambda, Amazon API Gateway, and Amazon DynamoDB.

Come Meet Us

Don’t forget to stop by our Serverless expert booth in the main Expo Hall. We will have many people from the Serverless team ready to speak with you!

Our Serverless team, including specialist solutions architects and developer advocates, will be onsite throughout the week. We’d love to meet you, hear about your projects, and help with any architecture questions. Reach out to Sam Dengler, Brian McNamara, Chris Munns, Eric Johnson, James Beswick, and me, George Mao. See you onsite!

See You in Las Vegas!

I can’t wait to meet you in Las Vegas and hear about your projects. Please reach out to us and let’s chat about Serverless! As a side note, reserved seating is available for all sessions, so be sure to log in to your re:Invent account to reserve a seat and join us for all kinds of Serverless architecture discussions and hands-on training.

Java 11 runtime now available in AWS Lambda

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/java-11-runtime-now-available-in-aws-lambda/

We are excited to announce that you can now develop your AWS Lambda functions using the Java 11 runtime. Start using this runtime today by specifying a runtime parameter value of java11 when creating or updating your Lambda functions.

The Java 11 runtime does not introduce any changes in Lambda’s programming model, such as handler definition or logging statements. Customers can continue authoring their Lambda functions in Java as they have in the past while benefitting from the new features of Java 11.

New features in Java 11 runtime

Java 11 is a long-term support release and brings with it several new features, including a Java-native HTTP client with HTTP/2 support and the var keyword. The Java 11 runtime also benefits from Amazon Corretto running on Amazon Linux 2.

HTTP client (standard)

Java 11 introduces a native HTTP client, HttpClient. Previous versions of Java provided the HttpURLConnection class for accessing HTTP resources but, for more complex use cases, developers typically had to select and import a third-party library. HttpClient supports both synchronous and asynchronous HTTP requests.

Example: Synchronous HTTP request

Synchronous requests block execution while the HTTP client waits for a response. This is a common programming model for Lambda functions that are invoked synchronously themselves, for example, via Amazon API Gateway.

package helloworld;

import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpResponse.BodyHandlers;
import java.net.URI;
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

/**
 * Handler for requests to Lambda function.
 */
public class App implements RequestHandler<Object, Object> {

    public Object handleRequest(final Object input, final Context context) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "application/json");
        headers.put("X-Custom-Header", "application/json");
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .GET()
            .version(HttpClient.Version.HTTP_2)
            .uri(URI.create("https://checkip.amazonaws.com"))
            .timeout(Duration.ofSeconds(15))
            .build();

        try {
            HttpResponse<String> response =
                client.send(request, BodyHandlers.ofString());

            String output = String.format("{ \"message\": \"hello world\", \"location\": \"%s\" }", response.body());
            // GatewayResponse is the sample project's simple response POJO
            // (body, headers, and status code), assumed to be defined elsewhere.
            return new GatewayResponse(output, headers, response.statusCode());
        } catch (Exception e) {
            return new GatewayResponse("{}", headers, 500);
        }
    }
}
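
HttpClient also supports asynchronous requests through sendAsync, which returns a CompletableFuture instead of blocking. A minimal sketch, reusing the client and request objects from the handler above:

        // Send the request without blocking; join() is used here only to
        // retrieve the result for the example.
        String body = client.sendAsync(request, BodyHandlers.ofString())
            .thenApply(HttpResponse::body)
            .join();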


The var keyword

The var keyword allows you to declare local variables and infer their type at compile time. This helps reduce verbosity, especially with composite types, as you no longer have to explicitly define type information on both sides of the equal sign. For example, to create a map of key/value string pairs, you can now do:

var map = new HashMap<String, String>();

Corretto benefits

The Java 11 runtime benefits from Amazon Corretto. Corretto is a no-cost, multiplatform, production-ready distribution of the Open Java Development Kit (OpenJDK). Corretto comes with long-term support that will include performance enhancements and security fixes. Amazon runs Corretto internally on thousands of production services.

Special considerations

Developers migrating to the new runtimes should consider the following known issues.

Java 8 to Java 11 migration

After migrating from Java 8 to Java 11, using internal packages such as sun.misc.* or sun.* now produces compiler errors instead of warnings.

Amazon Linux 2

Java 11, like Python 3.8 and Node.js 10 and 12, is based on an Amazon Linux 2 execution environment. Amazon Linux 2 provides a secure, stable, and high-performance execution environment to develop and run cloud and enterprise applications.

Next steps

Get started building with Java 11 today by specifying a runtime parameter value of java11 when creating or updating your Lambda functions.
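
For example, this is how you could switch an existing function (the function name is a placeholder) to the new runtime with the AWS CLI:

$ aws lambda update-function-configuration \
    --function-name my-function \
    --runtime java11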

Hope you enjoy building with the new features in Java 11!

Node.js 12.x runtime now available in AWS Lambda

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/node-js-12-x-runtime-now-available-in-aws-lambda/

We are excited to announce that you can now develop AWS Lambda functions using the Node.js 12.x runtime, which is the current Long Term Support (LTS) version of Node.js. Start using this new version today by specifying a runtime parameter value of nodejs12.x when creating or updating functions.

Language Updates

Here is a quick primer that highlights just some of the new or improved features that come with Node.js 12:

  • Updated V8 engine
  • Public class fields
  • Private class fields
  • TLS improvements

Updated V8 engine

Node.js 12.x is powered by V8 7.4, which is a significant upgrade from V8 6.8 powering the previous Node.js 10.x. This upgrade brings with it performance improvements for faster JavaScript execution, better memory management, and broader support for ECMAScript.

Public class fields

With the upgraded V8 version comes support for public class fields. This enhancement allows for public fields to be defined at the class level, providing cleaner code.

Before:

class User {
	constructor(user){
		this.firstName = user.firstName
		this.lastName = user.lastName
		this.id = idGenerator()
	}
}

After:

class User {
	id = idGenerator()

	constructor(user){
		this.firstName = user.firstName
		this.lastName = user.lastName
	}
}

Private class fields

In addition to public class fields, Node.js 12 also supports the use of private fields in a class. These fields are not accessible outside of the class, and any attempt to access them throws a SyntaxError. To mark a field as private in a class, simply start the name of the field with a ‘#’.

class User {
	#loginAttempt = 0;
	
	increment() {
		this.#loginAttempt++;
	}
	
	get loginAttemptCount() {
		return this.#loginAttempt;
	}
}

const user = new User()
console.log(user.loginAttemptCount)  // returns 0
user.increment()
console.log(user.loginAttemptCount)  // returns 1
console.log(user.#loginAttempt)      // SyntaxError: Private field '#loginAttempt'
                                     // must be declared in an enclosing class

TLS improvements

As a security improvement, Node.js 12 has also added support for TLS 1.3. This increases the security of TLS connections by removing hard-to-configure and often vulnerable features like SHA-1, RC4, DES, and AES-CBC. Performance is also increased, because TLS 1.3 requires only a single round trip for a TLS handshake, compared to earlier versions requiring at least two.

For more information, see the AWS Lambda Developer Guide.

Runtime Updates

Multi-line log events in Node.js 12 will work the same way they did in Node.js 8.10 and before. Node.js 12 will also support exception stack traces in AWS X-Ray helping you to debug and optimize your application. Additionally, to help keep Lambda functions secure, AWS will update Node.js 12 with all minor updates released by the Node.js community.

Deprecation schedule

AWS will be deprecating Node.js 8.10 according to the end-of-life schedule provided by the community. Node.js 8.10 will reach end of life on December 31, 2019. After January 6, 2020, you can no longer create a Node.js 8.10 Lambda function, and the ability to update will be disabled after February 3, 2020. More information can be found here.

Existing Node.js 8.10 functions can be migrated to the new runtime by making any necessary changes to code for compatibility with Node.js 12, and changing the function’s runtime configuration to “nodejs12.x”. Lambda functions running on Node.js 12 will have 2 full years of support.

Amazon Linux 2

Node.js 12, like Node.js 10, Java 11, and Python 3.8, is based on an Amazon Linux 2 execution environment. Amazon Linux 2 provides a secure, stable, and high-performance execution environment to develop and run cloud and enterprise applications.

Next steps

Get started building with Node.js 12 today by specifying a runtime parameter value of nodejs12.x when creating or updating your Lambda functions.

Happy coding with Node.js 12!

Python 3.8 runtime now available in AWS Lambda

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/python-3-8-runtime-now-available-in-aws-lambda/

You can now develop your AWS Lambda functions using the Python 3.8 runtime. Start using this runtime today by specifying a runtime parameter value of python3.8 when creating or updating Lambda functions.

New Python runtime features

Python 3.8 is a stable release and brings several new features, including assignment expressions, positional-only arguments, and vectorcall.

Assignment expressions

Assignment expressions provide a way to assign values to variables within expressions using the new notation NAME := expr. This is informally known as the “walrus operator,” as it looks like the eyes and tusks of a walrus on its side.

Before:

>>> walrus = True
>>> print(walrus)
True

After:

>>> print(walrus := True)
True

Previously, assignment was only available in statement form. With Python 3.8, it is available in list comprehensions and other expression contexts.
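
For instance, an assignment expression lets a list comprehension filter on a computed value without calling the function twice; f and data here are hypothetical:

>>> data = [1, 2, 3, 4]
>>> def f(x):
...     return x * x - 5
...
>>> [y for x in data if (y := f(x)) > 0]
[4, 11]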

For examples, usage, and limitations, see PEP-572.

Positional-only arguments

Positional-only parameters give more control to library authors by introducing a new function parameter syntax `/` to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments.

When describing APIs, they can be used to better express the intended usage and allow the API to evolve in a safe, backward-compatible way. They also make the Python language more consistent with existing documentation and the behavior of various “builtin” and standard library functions.
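
A small sketch of the syntax, with a hypothetical function: parameters declared before the / must be passed positionally.

>>> def greet(name, /, greeting="Hello"):
...     return f"{greeting}, {name}!"
...
>>> greet("Ada")
'Hello, Ada!'
>>> greet(name="Ada")  # raises TypeError: 'name' is positional-only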

For a full specification, including syntax and examples, see PEP-570 and PEP-457.

f-strings support

An f-string is a formatted string literal that lets you embed variables, and even expressions, inside curly braces. An f-string is evaluated at runtime and the result is included in the string; it is recognized by the leading f:

>>> city = "London"
>>> f"Living in {city} is amazing"
'Living in London is amazing'

Python 3.8 adds the ability to use assignment expressions inside f-strings, which are evaluated from left to right.

>>> import math
>>> radius = 3.8

>>> f"With a diameter of {( diameter := 2 * radius)}, the circumference is {math.pi * diameter:.2f}"
' With a diameter of 7.6 the circumference is 23.88'

Vectorcall

Vectorcall is a new protocol and calling convention based on the “fastcall” convention, which was previously used internally by CPython. The new features can be used by any user-defined extension class. Vectorcall generalizes the calling convention used internally for Python and builtin functions so that all calls can benefit from better performance. It is designed to remove the overhead of temporary object creation and multiple indirections.

For additional information on vectorcall, see PEP-590.

Amazon Linux 2

Python 3.8, like Node.js 10 and 12, and Java 11, is based on an Amazon Linux 2 execution environment. Amazon Linux 2 provides a secure, stable, and high-performance execution environment to develop and run cloud and enterprise applications.

Next steps

Get started building with Python 3.8 today by specifying a runtime parameter value of python3.8 when creating or updating your Lambda functions.

Enjoy, go build with Python 3.8!

Designing durable serverless apps with DLQs for Amazon SNS, Amazon SQS, AWS Lambda

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/designing-durable-serverless-apps-with-dlqs-for-amazon-sns-amazon-sqs-aws-lambda/

This post is courtesy of Otavio Ferreira, Sr Manager, SNS.

In a postal system, a dead-letter office is a facility for processing undeliverable mail. In pub/sub messaging, a dead-letter queue (DLQ) is a queue to which messages published to a topic can be sent, in case those messages cannot be delivered to a subscribed endpoint.

Amazon SNS supports DLQs, making your applications more resilient and durable in the face of message delivery failures.

Understanding message delivery failures and retries

The delivery of a message fails when it’s not possible for Amazon SNS to access the subscribed endpoint. There are two reasons why this might happen:

  • Client errors, where the client is SNS (the message sender).
  • Server errors, where the server is the system that hosts the subscription endpoint (the message receiver), such as Amazon SQS or AWS Lambda.

Client errors

Client errors happen when SNS has stale subscription metadata. One common cause of client errors is when you (the endpoint owner) delete the endpoint. For example, you might delete the SQS queue that is subscribed to your SNS topic, without also deleting the SNS subscription corresponding to the queue. Another common cause is when you change the resource policy attached to your endpoint in a way that prevents SNS from delivering messages to that endpoint.

These errors are considered client errors because the client has attempted the delivery of a message to a destination that, from the client’s perspective, is no longer accessible. SNS does not retry the delivery of messages that failed as the result of client errors.

Server errors

Server errors happen when the system that powers the subscribed endpoint is unavailable, or when it returns an exception response indicating that it failed to process a valid request from SNS.

When server errors occur, SNS retries the failed deliveries according to a backoff function, which can be either linear or exponential. When a server error occurs for an AWS managed endpoint, backed by either SQS or Lambda, SNS retries the delivery up to 100,015 times, over 23 days.

Server errors can also happen with customer managed endpoints, namely HTTP, SMS, email, and mobile push endpoints. SNS also retries the delivery for these types of endpoints. HTTP endpoints support customer-defined retry policies, while SNS sets an internal delivery retry policy for SMS, email, and mobile push endpoints to 50 times, over 6 hours.

Delivery retries

SNS may receive a client error, or continue to receive a server error for a message beyond the number of retries defined by the corresponding retry policy. In that event, SNS discards the message. Setting a DLQ to your SNS subscription enables you to keep this message, regardless of the type of error, either client or server. DLQs give you more control over messages that cannot be delivered.

For more information on the delivery retry policy for each delivery protocol supported by SNS, see Amazon SNS Message Delivery Retry.

Using DLQs for AWS services

SNS, SQS, and Lambda support DLQs, addressing different failure modes. All DLQs are regular queues powered by SQS.

In SNS, DLQs store the messages that failed to be delivered to subscribed endpoints. For more information, see Amazon SNS Dead-Letter Queues.

In SQS, DLQs store the messages that failed to be processed by your consumer application. This failure mode can happen when producers and consumers fail to interpret aspects of the protocol that they use to communicate. In that case, the consumer receives the message from the queue, but fails to process it, as the message doesn’t have the structure or content that the consumer expects. The consumer can’t delete the message from the queue either. After exhausting the receive count in the redrive policy, SQS can sideline the message to the DLQ. For more information, see Amazon SQS Dead-Letter Queues.

In Lambda, DLQs store the messages that resulted in failed asynchronous executions of your Lambda function. An execution can result in an error for several reasons. Your code might raise an exception, time out, or run out of memory. The runtime executing your code might encounter an error and stop. Your function might hit its concurrency limit and be throttled. Regardless of the error type, when the error occurs, your code might have run completely, partially, or not at all. By default, Lambda retries an asynchronous execution twice. After exhausting the retries, Lambda can sideline the message to the DLQ. For more information, see AWS Lambda Dead-Letter Queues.

When you have a fan-out architecture, with SQS queues and Lambda functions subscribed to an SNS topic, we recommend that you set DLQs to your SNS subscriptions, and to your destination queues and functions as well. This approach gives your application resilience against message delivery failures, message processing failures, and function execution failures too.

Applying DLQs in a use case

Here’s how everything comes together. The following diagram shows a serverless backend architecture that supports a car rental application. This is a durable serverless architecture based on DLQs for SNS, SQS, and Lambda.

Dead Letter Queue - DLQ SNS use case with architecture diagram

When a customer places an order to rent a car, the application sends that request to an API, which is powered by Amazon API Gateway. The REST API is backed by an SNS topic named Rental-Orders, and deployed onto an Amazon VPC subnet. The topic then fans out that order to the following two subscribed endpoints, for parallel processing:

  • An SQS queue, named Rental-Fulfilment, which feeds the integration with an internal fulfilment system hosted on Amazon EC2.
  • A Lambda function, named Rental-Billing, which processes and loads the customer order into a third-party billing system, also hosted on Amazon EC2.

To increase the durability of this serverless backend API, the following DLQs have been set up:

  • Two SNS DLQs, namely Rental-Fulfilment-Fanout-DLQ and Rental-Billing-Fanout-DLQ, which store the order in case either the subscribed SQS queue or Lambda function ever becomes unreachable.
  • An SQS DLQ, named Rental-Fulfilment-DLQ, which stores the order when the fulfilment system fails to process the order.
  • A Lambda DLQ, named Rental-Billing-DLQ, which stores the order when the function fails to process and load the order into the billing system.

When the DLQ captures the message, you can inspect the message for troubleshooting purposes. After you address the error at hand, you can poll the DLQ to retry the processing of the message.
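
As a sketch of that recovery step, you could poll the DLQ with the AWS SDK for Python (Boto3); the queue URL is a placeholder:

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/Rental-Fulfilment-Fanout-DLQ"

# Receive up to 10 messages from the DLQ for inspection.
response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10)

for message in response.get("Messages", []):
    print(message["Body"])  # inspect the failed order, then reprocess it
    # After successful reprocessing, remove the message from the DLQ.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])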

Setting up DLQs for subscriptions, queues, and functions can be done using the AWS Management Console, SDK, CLI, API, or AWS CloudFormation. You can use the SDK, CLI, and API for polling the DLQs as well.

Configuring DLQs for subscriptions

You can attach a DLQ to an SNS subscription by setting the subscription’s RedrivePolicy parameter. The policy is a JSON object that refers to the DLQ ARN. The ARN must point to an SQS queue in the same AWS account as that of the SNS subscription. Also, both the DLQ and the subscription must be in the same AWS Region.

Here’s how you can configure one of the SNS DLQs applied in the car rental application example, presented earlier.

The following JSON object is a CloudFormation template that subscribes the SQS queue Rental-Fulfilment to the SNS topic Rental-Orders. The template also sets a RedrivePolicy that targets Rental-Fulfilment-Fanout-DLQ as a DLQ.

Lastly, the template sets a FilterPolicy value. It makes SNS deliver a message to the subscribed queue only if the published message carries an attribute named order-status with value set to either confirmed or canceled. As Amazon SNS Message Filtering happens before message delivery, messages that are filtered out aren’t sent to that subscription’s DLQ.

Internally, the CloudFormation template uses the SNS Subscribe API action for deploying the subscription and setting both policies, all part of the same API request.

{  
   "Resources": {
      "mySubscription": {
         "Type" : "AWS::SNS::Subscription",
         "Properties" : {
            "Protocol": "sqs",
            "Endpoint": "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment",
            "TopicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
            "RedrivePolicy": {
               "deadLetterTargetArn": 
                  "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ"
            },
            "FilterPolicy": { 
               "order-status": [ "confirmed", "canceled" ]
            }
         }
      }
   }
}

Maybe the SNS topic and subscription are already deployed. In that case, you can use the SNS SetSubscriptionAttributes API action to set the RedrivePolicy, as shown by the following code examples, based on the AWS CLI and the AWS SDK for Java.

$ aws sns set-subscription-attributes \
   --region us-east-1 \
   --subscription-arn arn:aws:sns:us-east-1:123456789012:Rental-Orders:44019880-ffa0-4067-9cb4-b974443bcck2 \
   --attribute-name RedrivePolicy \
   --attribute-value '{"deadLetterTargetArn":"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ"}'

AmazonSNS sns = AmazonSNSClientBuilder.defaultClient();

String subscriptionArn = "arn:aws:sns:us-east-1:123456789012:Rental-Orders:44019880-ffa0-4067-9cb4-b974443bcck2";

String redrivePolicy = "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ\"}";

SetSubscriptionAttributesRequest request = new SetSubscriptionAttributesRequest(
  subscriptionArn, 
  "RedrivePolicy", 
  redrivePolicy
);

sns.setSubscriptionAttributes(request);

Monitoring DLQs

You can use Amazon CloudWatch metrics and alarms to monitor the DLQs associated with your SNS subscriptions. In the car rental example, you can monitor the DLQs to be notified when the API failed to distribute any car rental order to the fulfillment or billing systems.

Like regular SQS queues, the DLQs in SNS emit a number of metrics to CloudWatch, in 5-minute data points, such as NumberOfMessagesSent, NumberOfMessagesReceived, and NumberOfMessagesDeleted. You can use these SQS metrics to be notified upon activity in your DLQs in SNS, so that you can trigger a message recovery protocol.

You might have a case where you expect the DLQ to be always empty. In that case, create a CloudWatch alarm on NumberOfMessagesSent, set the alarm threshold to zero, and provide a separate SNS topic to be notified when the alarm goes off. The SNS topic, in turn, can deliver your alarm notification to any endpoint type that you choose, such as an email address, phone number, or mobile pager app.
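
A sketch of such an alarm with the AWS CLI; the queue name and the alarm topic ARN are placeholders. It fires whenever any message lands in the DLQ within a 5-minute period:

$ aws cloudwatch put-metric-alarm \
    --alarm-name Rental-Fulfilment-Fanout-DLQ-not-empty \
    --namespace AWS/SQS \
    --metric-name NumberOfMessagesSent \
    --dimensions Name=QueueName,Value=Rental-Fulfilment-Fanout-DLQ \
    --statistic Sum \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 0 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:DLQ-Alarm-Topic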

Additionally, SNS itself provides its own set of metrics that are relevant to DLQs. Specifically, SNS metrics include the following:

  • NumberOfNotificationsRedrivenToDlq – Used when sending the message to the DLQ succeeds.
  • NumberOfNotificationsFailedToRedriveToDlq – Used when sending the message to the DLQ fails. This can happen because the DLQ either doesn’t exist anymore or doesn’t have the required access permissions to allow SNS to send messages to it. For more information about setting up the required access policy, see Giving Permissions for Amazon SNS to Send Messages to Amazon SQS.

Debugging with DLQs

Use CloudWatch Logs to see the exceptions that caused your SNS deliveries to fail and your messages to be sidelined to DLQs. In the car rental example, you can inspect the rental orders in the DLQs, as well as the logs associated with these queues. Then you can understand why those orders failed to be fanned out to the fulfilment or billing systems.

SNS can log both successful and failed deliveries in CloudWatch. You can enable Amazon SNS Delivery Status Logging by setting three SNS topic attributes, which are delivery protocol-specific. As an example, for SNS deliveries to SQS queues, you must set the following topic attributes: SQSSuccessFeedbackRoleArn, SQSFailureFeedbackRoleArn, and SQSSuccessFeedbackSampleRate.

The following JSON object represents a successful SNS delivery in a CloudWatch Logs entry. The status code logged is 200 (SUCCESS). The attribute RedrivePolicy shows that the SNS subscription in question had its DLQ set.

{
  "notification": {
    "messageMD5Sum": "7bb3327ac55e49485bad42e159ca4d4b",
    "messageId": "e8c2bb09-235c-5f5d-b583-efd8df0f7d74",
    "topicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
    "timestamp": "2019-10-04 05:13:55.876"
  },
  "delivery": {
    "deliveryId": "6adf232e-fb12-5062-a564-27ff3741051f",
    "redrivePolicy": "{\"deadLetterTargetArn\": \"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ\"}",
    "destination": "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment",
    "providerResponse": "{\"sqsRequestId\":\"b2608a46-ccc4-51cc-003d-de972097debc\",\"sqsMessageId\":\"05fecd22-60a1-4d7d-bb79-026d49700b5a\"}",
    "dwellTimeMs": 58,
    "attempts": 1,
    "statusCode": 200
  },
  "status": "SUCCESS"
}

The following JSON object represents a failed SNS delivery in CloudWatch Logs. In the following code example, the subscribed queue doesn’t exist. Because this is a client error, the status code logged is 400 (FAILURE). Again, the RedrivePolicy attribute refers to a DLQ.

{
  "notification": {
    "messageMD5Sum": "81c395cbd350da6bedfe3b24db9517b0",
    "messageId": "9959db9d-25c8-57a6-9439-8e5be8f71a1f",
    "topicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
    "timestamp": "2019-10-04 05:16:51.116"
  },
  "delivery": {
    "deliveryId": "be743821-4c2c-5acc-a586-6cf0807f6fb1",
    "redrivePolicy": "{\"deadLetterTargetArn\": \"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ\"}",
    "destination": "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment",
    "providerResponse": "{\"ErrorCode\":\"AWS.SimpleQueueService.NonExistentQueue\", \"ErrorMessage\":\"The specified queue does not exist or you do not have access to it.\",\"sqsRequestId\":\"Unrecoverable\"}",
    "dwellTimeMs": 53,
    "attempts": 1,
    "statusCode": 400
  },
  "status": "FAILURE"
}

When the message delivery fails and there is a DLQ attached to the subscription, the message is sent to the DLQ and an additional entry is logged in CloudWatch. This new entry is specific to the delivery to the DLQ and refers to the DLQ ARN as the destination, as shown in the following JSON object.

{
  "notification": {
    "messageMD5Sum": "81c395cbd350da6bedfe3b24db9517b0",
    "messageId": "8959db9d-25c8-57a6-9439-8e5be8f71a1f",
    "topicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
    "timestamp": "2019-10-04 05:16:52.876"
  },
  "delivery": {
    "deliveryId": "a877c79f-a3ee-5105-9bbd-92596eae0232",
    "destination":"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ",
    "providerResponse": "{\"sqsRequestId\":\"8cef1af5-e86a-519e-ad36-4f33252aa5ec\",\"sqsMessageId\":\"2b742c5c-0750-4ec5-a717-b95897adda8e\"}",
    "dwellTimeMs": 51,
    "attempts": 1,
    "statusCode": 200
  },
  "status": "SUCCESS"
}

By analyzing Amazon CloudWatch Logs entries, you can understand why an SNS message was moved to a DLQ, and then take the required set of steps to recover the message. When you enable delivery status logging in SNS, you can configure the sample rate in which deliveries are logged, from 0% to 100%.

Encrypting DLQs

When your SNS subscription targets an SQS encrypted queue, you probably want your DLQ to be an SQS encrypted queue as well. This configuration keeps things consistent: your messages are encrypted at rest in both queues.

To follow this security recommendation, give the CMK you used to encrypt your DLQ a key policy that grants the SNS service principal access to AWS KMS API actions. For example, see the following sample key policy:

{
    "Sid": "GrantSnsAccessToKms",
    "Effect": "Allow",
    "Principal": { "Service": "sns.amazonaws.com" },
    "Action": [ "kms:Decrypt", "kms:GenerateDataKey*" ],
    "Resource": "*"
}

If you have an SNS encrypted topic, but a subscription in this topic points to a DLQ that isn’t an SQS encrypted queue, then messages sidelined to the DLQ aren’t encrypted at rest.

For more information, see Enabling Server-Side Encryption (SSE) for an Amazon SNS Topic with an Amazon SQS Encrypted Queue Subscribed.

Summary

DLQs for SNS, SQS, and Lambda increase the resiliency and durability of your applications. These DLQs address different failure modes, and can be used together.

  • SNS DLQs store messages that failed to be delivered to subscribed endpoints.
  • SQS DLQs store messages that the consumer system failed to process.
  • Lambda DLQs store the messages that resulted in failed asynchronous executions of your functions.

Setting up DLQs for subscriptions, queues, and functions can be done using the AWS Management Console, SDK, CLI, API, or CloudFormation. DLQs are available in all AWS Regions. Start today by running the tutorials in the documentation.

ICYMI: Serverless Q3 2019

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q3-2019/

This post is courtesy of Julian Wood, Senior Developer Advocate – AWS Serverless

Welcome to the seventh edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

ICYMI calendar

Launches/New products

Amazon EventBridge was technically launched in this quarter, although we were so excited to let you know that we squeezed it into the Q2 2019 update. If you missed it, EventBridge is the serverless event bus that connects application data from your own apps, SaaS, and AWS services. This allows you to create powerful event-driven serverless applications using a variety of event sources.

The AWS Bahrain Region has opened; the official name is Middle East (Bahrain), and the API name is me-south-1. The AWS Cloud now spans 22 geographic Regions with 69 Availability Zones around the world.

AWS Lambda

In September we announced dramatic improvements in cold starts for Lambda functions inside a VPC. With this announcement, you see faster function startup performance and more efficient usage of elastic network interfaces, drastically reducing VPC cold starts.

VPC to VPC NAT

These improvements are rolling out to all existing and new VPC functions at no additional cost. Rollout is ongoing; you can track the status from the announcement post.

AWS Lambda now supports a custom batch window for Kinesis and DynamoDB event sources, which helps you fine-tune Lambda invocation for cost optimization.

You can now deploy Amazon Machine Images (AMIs) and Lambda functions together from the AWS Marketplace using AWS CloudFormation with just a few clicks.

AWS IoT Events actions now support AWS Lambda as a target. Previously you could only define actions to publish messages to SNS and MQTT. Now you can define actions to invoke AWS Lambda functions and even more targets, such as Amazon Simple Queue Service and Amazon Kinesis Data Firehose, and republish messages to IoT Events.

The AWS Lambda Console now shows recent invocations using CloudWatch Logs Insights. From the monitoring tab in the console, you can view duration, billing, and memory statistics for the 10 most recent invocations.

AWS Step Functions

AWS Step Functions example

AWS Step Functions has now been extended to support probably its most requested feature, Dynamic Parallelism, which allows steps within a workflow to be executed in parallel, with a new Map state type.

One way to use the new Map state is for fan-out or scatter-gather messaging patterns in your workflows:

  • Fan-out is applied when delivering a message to multiple destinations, and can be useful in workflows such as order processing or batch data processing. For example, you can retrieve arrays of messages from Amazon SQS and Map sends each message to a separate AWS Lambda function (see the sketch after this list).
  • Scatter-gather broadcasts a single message to multiple destinations (scatter), and then aggregates the responses back for the next steps (gather). This is useful in file processing and test automation. For example, you can transcode ten 500-MB media files in parallel, and then join to create a 5-GB file.
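
A minimal sketch of a Map state for that fan-out pattern, assuming the workflow input carries a messages array; the state, field, and function names are illustrative:

"Process_Messages": {
  "Type": "Map",
  "ItemsPath": "$.messages",
  "MaxConcurrency": 10,
  "Iterator": {
    "StartAt": "Process_One_Message",
    "States": {
      "Process_One_Message": {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {
          "FunctionName": "ProcessMessage",
          "Payload.$": "$"
        },
        "End": true
      }
    }
  },
  "End": true
}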

Another important update is that AWS Step Functions now supports nested workflows, which allows you to orchestrate more complex processes by composing modular, reusable workflows.

AWS Amplify

A new Predictions category has been added to the Amplify Framework to quickly add machine learning capabilities to your web and mobile apps.

Amplify framework

With a few lines of code you can add and configure AI/ML services in your app to:

  • Identify text, entities, and labels in images using Amazon Rekognition, or identify text in scanned documents to get the contents of fields in forms and information stored in tables using Amazon Textract.
  • Convert text into a different language using Amazon Translate, text to speech using Amazon Polly, and speech to text using Amazon Transcribe.
  • Interpret text to find the dominant language, the entities, the key phrases, the sentiment, or the syntax of unstructured text using Amazon Comprehend.

AWS Amplify CLI (part of the open source Amplify Framework) has added local mocking and testing. This allows you to mock some of the most common cloud services and test your application 100% locally.

For this first release, you can start the local mocks from the Amplify CLI with:

amplify mock

AWS CloudFormation

The CloudFormation team has released the much-anticipated CloudFormation Coverage Roadmap.

Styled after the popular AWS Containers Roadmap, the CloudFormation Coverage Roadmap provides transparency about our priorities, and the opportunity to provide your input.

The roadmap contains four columns:

  • Shipped – Available for use in production in all public AWS Regions.
  • Coming Soon – Generally a few months out.
  • We’re Working On It – Work in progress, but further out.
  • Researching – We’re thinking about the right way to implement the coverage.

AWS CloudFormation roadmap

Amazon DynamoDB

NoSQL Workbench for Amazon DynamoDB has been released in preview. This is a free, client-side application available for Windows and macOS. It helps you more easily design and visualize your data model, run queries on your data, and generate the code for your application.

Amazon Aurora

Amazon Aurora Serverless is a dynamically scaling version of Amazon Aurora. It automatically starts up, shuts down, and scales up or down, based on your application workload.

Aurora Serverless has had a MySQL-compatible edition for a while; now we’re excited to bring more serverless joy to databases, with the PostgreSQL-compatible version reaching general availability.

We also have a useful post on Reducing Aurora PostgreSQL storage I/O costs.

AWS Serverless Application Repository

The AWS Serverless Application Repository has had some useful SAR apps added by Serverless Developer Advocate James Beswick.

  • S3 Auto Translator, which automatically converts uploaded objects into other languages specified by the user, using Amazon Translate.
  • Serverless S3 Uploader allows you to upload JPG files to Amazon S3 buckets from your web applications using presigned URLs.

Serverless posts

July

August

September

Tech talks

We hold several AWS Online Tech Talks covering serverless topics throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page.

Here are the ones from Q3:

Twitch

July

August

September

There are also a number of other helpful video series covering Serverless available on the AWS Twitch Channel.

AWS re:Invent

AWS re:Invent

December 2 – 6 in Las Vegas, Nevada is peak AWS learning time with AWS re:Invent 2019. Join tens of thousands of AWS customers to learn, share ideas, and see exciting keynote announcements.

Be sure to take a look at the growing catalog of serverless sessions this year. Make sure to book time for Builders Sessions, Chalk Talks, and Workshops, as these sessions will fill up quickly. The schedule is updated regularly, so if your session is currently fully booked, a repeat may be scheduled.

Register for AWS re:Invent now!

What did we do at AWS re:Invent 2018? Check out our recap here: AWS re:Invent 2018 Recap at the San Francisco Loft.

Our friends at IOPipe have written 5 tips for avoiding serverless FOMO at this year’s re:Invent.

AWS Serverless Heroes

We are excited to welcome some new AWS Serverless Heroes to help grow the serverless community. We look forward to some amazing content to help you with your serverless journey.

Still looking for more?

The Serverless landing page has much more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.


What’s new with Workers KV?

Post Syndicated from Steve Klabnik original https://blog.cloudflare.com/whats-new-with-workers-kv/

The Storage team here at Cloudflare shipped Workers KV, our global, low-latency, key-value store, earlier this year. As people have started using it, we’ve gotten some feature requests, and have shipped some new features in response! In this post, we’ll talk about some of these use cases and how these new features enable them.

New KV APIs

We’ve shipped some new APIs, both via api.cloudflare.com, as well as inside of a Worker. The first one provides the ability to upload and delete more than one key/value pair at once. Given that Workers KV is great for read-heavy, write-light workloads, a common pattern when getting started with KV is to write a bunch of data via the API, and then read that data from within a Worker. You can now do these bulk uploads without needing a separate API call for every key/value pair. This feature is available via api.cloudflare.com, but is not yet available from within a Worker.

For example, say we’re using KV to redirect legacy URLs to their new homes. We have a list of URLs to redirect, and where they should redirect to. We can turn this list into JSON that looks like this:

[
  {
    "key": "/old/post/1",
    "value": "/new-post-slug-1"
  },
  {
    "key": "/old/post/2",
    "value": "/new-post-slug-2"
  }
]

And then POST this JSON to the new bulk endpoint, /storage/kv/namespaces/:namespace_id/bulk. This will add both key/value pairs to our namespace.
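
A sketch of that call with curl; the account ID, namespace ID, and API token are placeholders, and the exact authentication headers depend on how your Cloudflare API credentials are set up:

$ curl -X POST \
    "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/bulk" \
    -H "Authorization: Bearer $CF_API_TOKEN" \
    -H "Content-Type: application/json" \
    --data '[{"key": "/old/post/1", "value": "/new-post-slug-1"},
             {"key": "/old/post/2", "value": "/new-post-slug-2"}]'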

Likewise, if we wanted to drop support for these redirects, we could issue a DELETE that has this body:

[
    "/old/post/1",
    "/old/post/2"
]

to /storage/kv/namespaces/:namespace_id/bulk, and we’d delete both key/value pairs in a single call to the API.

The bulk upload API has one more trick up its sleeve: not all data is a string. For example, you may have an image as a value, which is just a bag of bytes. If you need to write some binary data, you’ll have to base64-encode the value’s contents so that it’s valid JSON. You’ll also need to set one more key:

[
  {
    "key": "profile-picture",
    "value": "aGVsbG8gd29ybGQ=",
    "base64": true
  }
]

Workers KV will decode the value from base64, and then store the resulting bytes.
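As a sketch, preparing such an entry in Node might look like this (the filename is hypothetical):

// Read an image from disk and base64-encode it for the bulk body.
const fs = require('fs')

const bytes = fs.readFileSync('profile-picture.png')
const entry = {
  key: 'profile-picture',
  value: bytes.toString('base64'),
  base64: true
}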

Beyond bulk upload and delete, we’ve also given you the ability to list all of the keys you’ve stored in any of your namespaces, from both the API and within a Worker. For example, if you wrote a blog powered by Workers + Workers KV, you might have each blog post stored as a key/value pair in a namespace called “contents”. Most blogs have some sort of “index” page that lists all of the posts that you can read. To create this page, we need to get a listing of all of the keys, since each key corresponds to a given post. We could do this from within a Worker by calling list() on our namespace binding:

const value = await contents.list()

But what we get back isn’t only a list of keys. The object looks like this:

{
  keys: [
    { name: "Title 1" },
    { name: "Title 2" }
  ],
  list_complete: false,
  cursor: "6Ck1la0VxJ0djhidm1MdX2FyD"
}

We’ll talk about this “cursor” stuff in a second, but if we wanted to get the list of titles, we’d have to iterate over the keys property, and pull out the names:

const keyNames = value.keys.map(e => e.name)

keyNames would be an array of strings:

["Title 1", "Title 2", "Title 3", "Title 4", "Title 5"]

We could then take the titles in keyNames and build our page.

So what’s up with the list_complete and cursor properties? Well, imagine that we’ve been a very prolific blogger, and we’ve now written thousands of posts. The list API is paginated, meaning that it will only return the first thousand keys. To see if there are more pages available, you can check the list_complete property. If it is false, you can use the cursor to fetch another page of results. The value of cursor is an opaque token that you pass to another call to list:

const value = await NAMESPACE.list()
const cursor = value.cursor
const next_value = await NAMESPACE.list({"cursor": cursor})

This will give us another page of results, and we can repeat this process until list_complete is true.
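Putting that together, here is a sketch of a helper that pages through every key in a namespace binding; the helper name is ours, not part of the API:

// Collect all key names, following the cursor until list_complete is true.
async function listAllKeys(namespace) {
  let names = []
  let cursor
  while (true) {
    const page = await namespace.list(cursor ? { cursor } : {})
    names = names.concat(page.keys.map(k => k.name))
    if (page.list_complete) break
    cursor = page.cursor
  }
  return names
}

Calling listAllKeys(contents) would return the full array of key names, however many pages deep the namespace goes.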

Listing keys has one more trick up its sleeve: you can also return only keys that have a certain prefix. Imagine we want to have a list of posts, but only the posts that were made in October of 2019. While Workers KV is only a key/value store, we can use the prefix functionality to do interesting things by filtering the list. In our original implementation, we stored only the title of each post as its key:

  • Title 1
  • Title 2

We could change this to include the date in YYYY-MM-DD format, with a colon separating the two:

  • 2019-09-01:Title 1
  • 2019-10-15:Title 2

We can now ask for a list of all posts made in 2019:

const value = await NAMESPACE.list({"prefix": "2019"})

Or a list of all posts made in October of 2019:

const value = await NAMESPACE.list({"prefix": "2019-10"})

These calls will only return keys with the given prefix, which in our case, corresponds to a date. This technique can let you group keys together in interesting ways. We’re looking forward to seeing what you all do with this new functionality!
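As a small sketch of where this can go, here is how the October index could be rebuilt from those prefixed keys, using the contents binding from earlier:

// List October 2019 posts and split each key back into date and title.
const value = await contents.list({ prefix: '2019-10' })
const posts = value.keys.map(k => {
  const [date, title] = k.name.split(':')
  return { date, title }
})
// posts => [ { date: '2019-10-15', title: 'Title 2' } ]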

Relaxing limits

For various reasons, there are a few hard limits on what you can do with Workers KV. We’ve decided to raise some of these limits, which expands what you can do.

The first is the limit of the number of namespaces any account could have. This was previously set at 20, but some of you have made a lot of namespaces! We’ve decided to relax this limit to 100 instead. This means you can create five times the number of namespaces you previously could.

Additionally, we had a two megabyte maximum size for values. We’ve increased the limit for values to ten megabytes. With the release of Workers Sites, folks are keeping things like images inside of Workers KV, and two megabytes felt a bit cramped. While Workers KV is not a great fit for truly large values, ten megabytes gives you the ability to store larger images easily. As an example, a 4k monitor has a native resolution of 4096 x 2160 pixels. If we had an image at this resolution as a lossless PNG, for example, it would be just over five megabytes in size.

KV browser

Finally, you may have noticed that there’s now a KV browser in the dashboard! Needing to type out a cURL command just to see what’s in your namespace was a real pain, so we’ve given you the ability to check out the contents of your namespaces right on the web. When you look at a namespace, you’ll see a table of its keys and values.

The browser has grown with a bunch of useful features since it initially shipped. You can not only see your keys and values, but also add new ones, edit existing ones, upload files, and even download them.

As we ship new features in Workers KV, we’ll be expanding the browser to include them too.

Wrangler integration

The Workers Developer Experience team has also been shipping some features related to Workers KV. Specifically, you can now fully interact with your namespaces and the key/value pairs inside them directly from Wrangler.

For example, my personal website is running on Workers Sites. I have a Wrangler project named “website” to manage it. If I wanted to add another namespace, I could do this:

$ wrangler kv:namespace create new_namespace
Creating namespace with title "website-new_namespace"
Success: WorkersKvNamespace {
    id: "<id>",
    title: "website-new_namespace",
}

Add the following to your wrangler.toml:

kv-namespaces = [
    { binding = "new_namespace", id = "<id>" }
]

I’ve redacted the namespace IDs here, but Wrangler let me know that the creation was successful, and provided me with the configuration I need to put in my wrangler.toml. Once I’ve done that, I can add new key/value pairs:

$ wrangler kv:key put "hello" "world" --binding new_namespace
Success

And read it back out again:

$ wrangler kv:key get "hello" --binding new_namespace
world

If you’d like to learn more about the design of these features, “How we design features for Wrangler, the Cloudflare Workers CLI” discusses them in depth.

More to come

The Storage team is working hard at improving Workers KV, and we’ll keep shipping new features, with more regular updates to come. If there’s something you’d particularly like to see, please reach out!

Serverlist October: GitHub Actions, Deployment Best Practices, and more

Post Syndicated from Connor Peshek original https://blog.cloudflare.com/serverlist-9th-edition/

Check out our ninth edition of The Serverlist below. Get the latest scoop on the serverless space, get your hands dirty with new developer tutorials, engage in conversations with other serverless developers, and find upcoming meetups and conferences to attend.

Sign up below to have The Serverlist sent directly to your mailbox.


Update: Issue affecting HashiCorp Terraform resource deletions after the VPC Improvements to AWS Lambda

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/update-issue-affecting-hashicorp-terraform-resource-deletions-after-the-vpc-improvements-to-aws-lambda/

On September 3, 2019, we announced an exciting update that improves the performance, scale, and efficiency of AWS Lambda functions when working with Amazon VPC networks. You can learn more about the improvements in the original blog post. These improvements represent a significant change in how elastic network interfaces (ENIs) are configured to connect to your VPCs. With this new model, we identified an issue where VPC resources, such as subnets, security groups, and VPCs, can fail to be destroyed via HashiCorp Terraform. More information about the issue can be found here. In this post, we help you identify whether this issue affects you and walk through the steps to resolve it.

How do I know if I’m affected by this issue?

This issue only affects you if you use HashiCorp Terraform to destroy environments. Versions of Terraform AWS Provider that are v2.30.0 or older are impacted by this issue. With these versions you may encounter errors when destroying environments that contain AWS Lambda functions, VPC subnets, security groups, and Amazon VPCs. Typically, terraform destroy fails with errors similar to the following:

Error deleting subnet: timeout while waiting for state to become 'destroyed' (last state: 'pending', timeout: 20m0s)

Error deleting security group: DependencyViolation: resource sg-<id> has a dependent object
        	status code: 400, request id: <guid>

Depending on which AWS Regions the VPC improvements have been rolled out to, you may encounter these errors in some Regions and not others.

How do I resolve this issue if I am affected?

You have two options to resolve this issue. The recommended option is to upgrade your Terraform AWS Provider to v2.31.0 or later. To learn more about upgrading the Provider, visit the Terraform AWS Provider Version 2 Upgrade Guide. You can find information and source code for the latest releases of the AWS Provider on this page. The latest version of the Terraform AWS Provider contains a fix for this issue as well as changes that improve the reliability of the environment destruction process. We highly recommend that you upgrade the Provider version as the preferred option to resolve this issue.

If you are unable to upgrade the Provider version, you can mitigate the issue by making changes to your Terraform configuration. You need to make the following sets of changes to your configuration:

  1. Add an explicit dependency, using a depends_on argument, to the aws_security_group and aws_subnet resources that you use with your Lambda functions. The dependency has to be added on the aws_security_group or aws_subnet and must target the aws_iam_role_policy_attachment resource that attaches the VPC access policy to the IAM role configured on the Lambda function. See the example below for more details.
  2. Override the delete timeout for all aws_security_group and aws_subnet resources. The timeout should be set to 40 minutes.

The following configuration file shows an example where these changes have been made:

provider "aws" {
    region = "eu-central-1"
}
 
resource "aws_iam_role" "lambda_exec_role" {
  name = "lambda_exec_role"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}
 
data "aws_iam_policy" "LambdaVPCAccess" {
  arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}
 
resource "aws_iam_role_policy_attachment" "sto-lambda-vpc-role-policy-attach" {
  role       = "${aws_iam_role.lambda_exec_role.name}"
  policy_arn = "${data.aws_iam_policy.LambdaVPCAccess.arn}"
}
 
resource "aws_security_group" "allow_tls" {
  name        = "allow_tls"
  description = "Allow TLS inbound traffic"
  vpc_id      = "vpc-<id>"
 
  ingress {
    # TLS (change to whatever ports you need)
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    # Please restrict your ingress to only necessary IPs and ports.
    # Opening to 0.0.0.0/0 can lead to security vulnerabilities.
    cidr_blocks = ["0.0.0.0/0"]
  }
 
  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "tcp"
    cidr_blocks     = ["0.0.0.0/0"]
  }
  
  timeouts {
    delete = "40m"
  }
  depends_on = ["aws_iam_role_policy_attachment.sto-lambda-vpc-role-policy-attach"]  
}
 
resource "aws_subnet" "main" {
  vpc_id     = "vpc-<id>"
  cidr_block = "172.31.68.0/24"

  timeouts {
    delete = "40m"
  }
  depends_on = ["aws_iam_role_policy_attachment.sto-lambda-vpc-role-policy-attach"]
}
 
resource "aws_lambda_function" "demo_lambda" {
    function_name = "demo_lambda"
    handler = "index.handler"
    runtime = "nodejs10.x"
    filename = "function.zip"
    source_code_hash = "${filebase64sha256("function.zip")}"
    role = "${aws_iam_role.lambda_exec_role.arn}"
    vpc_config {
     subnet_ids         = ["${aws_subnet.main.id}"]
     security_group_ids = ["${aws_security_group.allow_tls.id}"]
  }
}

The key block to note here is the following, which can be seen in both the “allow_tls” security group and “main” subnet resources:

timeouts {
  delete = "40m"
}
depends_on = ["aws_iam_role_policy_attachment.sto-lambda-vpc-role-policy-attach"]

These changes should be made to your Terraform configuration files before destroying your environments for the first time.

Can I delete resources remaining after a failed destroy operation?

Destroying environments without upgrading the provider or making the configuration changes outlined above may result in failures. As a result, you may have ENIs in your account that remain due to a failed destroy operation. These ENIs can be manually deleted a few minutes after the Lambda functions that use them have been deleted (typically within 40 minutes). Once the ENIs have been deleted, you can re-run terraform destroy.

Improving the Getting Started experience with AWS Lambda

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/improving-the-getting-started-experience-with-aws-lambda/

A common question from developers is, “How do I get started with creating serverless applications?” Frequently, I point developers to the AWS Lambda console where they can create a new Lambda function and immediately see it working.

While you can learn the basics of a Lambda function this way, it does not encompass the full serverless experience. It does not allow you to take advantage of best practices like infrastructure as code (IaC) or continuous integration and continuous delivery (CI/CD). A full-on serverless application could include a combination of services like Amazon API Gateway, Amazon S3, and Amazon DynamoDB.

To help you start right with serverless, AWS has added a Create application experience to the Lambda console. This enables you to create serverless applications from ready-to-use sample applications, which follow these best practices:

  • Use infrastructure as code (IaC) for defining application resources
  • Provide a continuous integration and continuous deployment (CI/CD) pipeline for deployment
  • Exemplify best practices in serverless application structure and methods

IaC

Using IaC allows you to automate deployment and management of your resources. When you define and deploy your IaC architecture, you can standardize infrastructure components across your organization. You can rebuild your applications quickly and consistently without having to perform manual actions. You can also enforce best practices such as code reviews.

When you’re building serverless applications on AWS, you can use AWS CloudFormation directly, or choose the AWS Serverless Application Model, also known as AWS SAM. AWS SAM is an open source framework for building serverless applications that makes it easier to build applications quickly. AWS SAM provides a shorthand syntax to express APIs, functions, databases, and event source mappings. Because AWS SAM is built on CloudFormation, you can specify any other AWS resources using CloudFormation syntax in the same template.

Through this new experience, AWS provides an AWS SAM template that describes the entire application. You have instant access to modify the resources and security as needed.
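To give a feel for that shorthand, here is a minimal sketch of a SAM template, written in JSON to match the other definitions in this post; the resource name, handler path, and route are illustrative, not the exact contents of the sample application:

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Transform": "AWS::Serverless-2016-10-31",
  "Resources": {
    "GetAllItemsFunction": {
      "Type": "AWS::Serverless::Function",
      "Properties": {
        "Handler": "get-all-items.handler",
        "Runtime": "nodejs10.x",
        "CodeUri": "src/",
        "Events": {
          "Api": {
            "Type": "Api",
            "Properties": { "Path": "/items", "Method": "GET" }
          }
        }
      }
    }
  }
}

Because the Transform line hands the template to SAM, the single AWS::Serverless::Function resource expands into the Lambda function, its IAM role, and the API Gateway wiring at deploy time.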

CI/CD

When you edit a Lambda function in the console, it’s live the moment the function is saved. This works when developing against test environments, but risks introducing untested, faulty code into production environments. That’s a stressful atmosphere for developers, with the unneeded overhead of manually testing code on each change.

Developers say that they are looking for an automated process for consistently testing and deploying reliable code. What they need is a CI/CD pipeline.

CI/CD pipelines are more than just a convenience; they can be critical in helping development teams succeed. CI/CD pipelines provide code integration, testing, multiple environment deployments, notifications, rollbacks, and more. The functionality depends on how you choose to configure it.

When you create a new application through the Lambda console, you also create a CI/CD pipeline that provides a framework for automated testing and deployment.

Best practices

Like any other development pattern, there are best practices for serverless applications. These include testing strategies, local development, IaC, and CI/CD. When you create a Lambda function using the console, most of this is abstracted away. A common request from developers learning about serverless is for opinionated examples of best practices.

When you choose Create application, the application uses many best practices, including:

  • Managing IaC architectures
  • Managing deployment with a CI/CD pipeline
  • Runtime-specific test examples
  • Runtime-specific dependency management
  • A Lambda execution role with permissions boundaries
  • Application security with managed policies

Create an application

Now, let’s walk through creating your first application.

  1. Open the Lambda console, and choose Applications, Create application.
  2. Choose Serverless API backend. The next page shows the architecture, services used, and development workflow of the chosen application.
  3. Choose Create and then configure your application settings.
    • For Application name and Application description, enter values.
    • For Runtime, the preview supports Node.js 10.x. Stay tuned for more runtimes.
    • For Source Control Service, I chose CodeCommit for this example, but you can choose either. If you choose GitHub, you are asked to connect to your GitHub account for authorization.
    • For Repository Name, feel free to use whatever you want.
    • Under Permissions, check Create roles and permissions boundary.
  4. Choose Create.

Exploring the application

That’s it! You have just created a new serverless application from the Lambda console. It takes a few moments for all the resources to be created. Take a moment to review what you have done so far.

Across the top of the application, you can see four tabs:

  • Overview—Shows the current page, including a Getting started section, and application and toolchain resources of the application
  • Code—Shows the code repository and instructions on how to connect
  • Deployments—Links to the deployment pipeline and a deployment history
  • Monitoring—Reports on the application health and performance

The Resources section lists all the resources specific to the application. This application includes three Lambda functions, a DynamoDB table, and the API.

Finally, the Infrastructure section lists all the resources for the CI/CD pipeline, including the AWS Identity and Access Management (IAM) roles, the permissions boundary policy, the S3 bucket, and more.

About Permissions Boundaries

This new Create application experience utilizes an IAM permissions boundary to help further secure the function that gets created and to prevent an overly permissive function policy from being created later on. The boundary is a separate policy that acts as a maximum bound on the permissions that an IAM policy for your function can be created to have. This model, considered a best practice, allows developers to build out the security model of their application while still meeting requirements that are often put in place to prevent overly permissive policies. By default, the permissions boundary that is created limits the application’s access to just the resources that are included in the example template. In order to expand the permissions of the application, you’ll first need to extend what is defined in the permissions boundary to allow it.
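For illustration only, a permissions boundary is an ordinary IAM policy document used as a ceiling rather than a grant; a sketch of one scoped to a single hypothetical DynamoDB table might look like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Scan"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/SampleTable"
    }
  ]
}

Even if the function’s execution role is later granted broader permissions, its effective permissions can never exceed what the boundary allows.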

A quick test

Now that you have an application up and running, try a quick test to see if it works.

  1. In the Lambda console, in the left navigation pane, choose Applications.
  2. For Applications, choose Start Right application.
  3. On the Endpoint details card, copy your endpoint.
  4. From a terminal, run the following command:
    curl -d '{"id":"id1", "name":"name1"}' -H "Content-Type: application/json" -X POST <YOUR-ENDPOINT>

You can find tips like this, and other getting started hints in the README.md file of your new serverless application.

Outside of the console

With the introduction of the Create application function, there is now a closer tie between the Lambda console and local development. Before this feature, you would get started in the Lambda console or with a framework like AWS SAM. Now, you can start the project in the console and then move to local development.

You have already walked through the steps of creating an application; now pull it down locally and make some changes.

  1. In the Lambda console, in the left navigation pane, choose Applications.
  2. Select your application from the list and choose the Code tab.
  3. If you used CodeCommit, choose Connect instructions to configure your local git client. To copy the URL, choose the SSH squares icon.
  4. If you used GitHub, click on the SSH squares icon.
  5. In a terminal window, run the following command:
    git clone <your repo>
  6. Update one of the Lambda function files and save it.
  7. In the terminal window, commit and push the changes:
    git commit -am "simple change"
    git push
  8. In the Lambda console, under Deployments, choose View in CodePipeline.

The build has started and the application is being deployed.

Caveats

This feature is currently available in US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo). The feature is in beta and, as such, is not a full representation of the final experience. We know this is limited in scope and request your feedback. Let us know your thoughts about any future enhancements you would like to see. The best way to give feedback is to use the feedback button in the console.

Conclusion

With the addition of the Create application feature, you can now start right with full serverless applications from within the Lambda console. This delivers the simplicity and ease of the console while still offering the power of an application built on best practices.

Until next time: Happy coding!

Serverlist Sept. Wrap-up: Static sites, serverless costs, and more

Post Syndicated from Connor Peshek original https://blog.cloudflare.com/serverlist-8th-edition/

Check out our eighth edition of The Serverlist below. Get the latest scoop on the serverless space, get your hands dirty with new developer tutorials, engage in conversations with other serverless developers, and find upcoming meetups and conferences to attend.

Sign up below to have The Serverlist sent directly to your mailbox.


Learn more about Workers Sites at Austin & San Francisco Meetups

Post Syndicated from Andrew Fitch original https://blog.cloudflare.com/learn-more-about-workers-sites-at-austin-san-francisco-meetups/

Last Friday, at the end of Cloudflare’s 9th birthday week, we announced Workers Sites.

Now, using the Wrangler CLI, you can deploy entire websites directly to the Cloudflare Network using Cloudflare Workers and Workers KV. If you can statically generate the assets for your site, think create-react-app, Jekyll, or even the WP2Static plugin, you can deploy it to our global network, which spans 194 cities in more than 90 countries.

If you’d like to learn more about how it was built, you can read more about this in the technical blog post. Additionally, I wanted to give you an opportunity to meet some of the developers who contributed to this product and hear directly from them about their process, potential use cases, and what it took to build it.

Check out these events. If you’re based in Austin or San Francisco (more cities coming soon!), join us on-site. If you’re based somewhere else, you can watch the recording of the events afterwards.

Growing Dev Platforms at Scale & Deploying Static Websites

Talk 1: Inspiring with Content: How to Grow Developer Platforms at Scale

Serverless platforms like Cloudflare Workers provide benefits like scalability, high performance, and lower costs. However, when talking to developers, one of the most common reactions is, “this sounds interesting, but what do I build with it?”

In this talk, we’ll cover how at Cloudflare we’ve been able to answer this question at scale with Workers Sites. We’ll go over why this product exists and how the implementation leads to some unintended discoveries.

Speaker Bio:
Victoria Bernard is a full-stack, product-minded engineer focused on Cloudflare Workers Developer Experience. She started her career working at large firms in hardware sales, moved throughout Cloudflare from support to product to development, and is passionate about building products that make developers’ lives easier and more productive.

Talk 2:  Extending a Serverless Platform: How to Fake a File System…and Get Away With It

When building a platform for developers, you can’t anticipate every use case. So, how do you build new functionality into a platform in a sustainable way, and inspire others to do the same?

Let’s talk about how we took a globally distributed serverless platform (Cloudflare Workers) and key-value store (Workers KV) intended to store short-lived data and turned them into a way to easily deploy static websites. It wasn’t a straightforward journey, but join as we overcome roadblocks and learn a few lessons along the way.

Speaker Bio:
Ashley Lewis headed the development of the features that became Workers Sites. She’s process- and collaboration-oriented and focuses on user experience first at every level of the stack. Ashley proudly tops the leaderboard for most LOC deleted.

Agenda:

  • 6:00pm – Doors open
  • 6:30pm – Talk 1: Inspiring with Content: How to Grow Developer Platforms at Scale
  • 7:00pm – Talk 2:  Extending a Serverless Platform: How to Fake a File System…and Get Away With It
  • 7:30pm – Networking over food and drinks
  • 8:00pm – Event conclusion

Austin, Texas Meetup

Register Here »

San Francisco, California Meetup

Register Here »

While you’re at it, check out our monthly developer newsletter: The Serverlist


Have you built something interesting with Workers? Let us know @CloudflareDev!

Not so static… Introducing the HTMLRewriter API Beta to Cloudflare Workers

Post Syndicated from Rita Kozlov original https://blog.cloudflare.com/html-rewriter-beta/

Today, we’re excited to announce the HTMLRewriter beta — a streaming HTML parser with an easy-to-use, selector-based JavaScript API for DOM manipulation, available in the Cloudflare Workers runtime.

For those of you who are unfamiliar, Cloudflare Workers is a lightweight serverless platform that allows developers to leverage Cloudflare’s network to augment existing applications or create entirely new ones without configuring or maintaining infrastructure.

Static Sites to Dynamic Applications

On Friday we announced Workers Sites: a static site deployment workflow built into the Wrangler CLI tool. Now, paired with the HTMLRewriter API, you can perform DOM transformations on top of your static HTML, right on the Cloudflare edge.

You could previously do this by ingesting the entire body of the response into the Worker; however, that method was prone to a few issues. First, parsing a large file was bound to run into memory or CPU limits. Additionally, it would impact your TTFB (time to first byte), as the body could no longer be streamed, and the browser would be prevented from doing any speculative parsing to load subsequent assets.

HTMLRewriter was the missing piece to having your application fully live on the edge – soup to nuts. You can build your API on Cloudflare Workers as a serverless function, have the static elements of your frontend hosted on Workers Sites, and dynamically tie them together using the HTMLRewriter API.

Enter JAMStack

You may be thinking “wait!”, JavaScript, serverless APIs… this is starting to sound a little familiar. It sounded familiar to us too.

Is this JAMStack?

First, let’s answer the question — what is JAMStack? JAMStack is a term coined by Mathias Biilmann, that stands for JavaScript, APIs, and Markup. JAMStack applications are intended to be very easy to scale since they rely on simplified static site deployment. They are also intended to simplify the web development workflow, especially for frontend developers, by bringing data manipulation and rendering that traditionally happened on the backend to the front-end and interacting with the backend only via API calls.

So to that extent, yes, this is JAMStack. However, HTMLRewriter takes this idea one step further.

The Edge: Not Quite Client, Not Quite Server

Most JAMStack applications rely on client-side calls to third-party APIs, where the rendering can be handled client-side using JavaScript, allowing front-end developers to work with toolchains and languages they are already familiar with. However, this means that with every page load the client has to go to the origin, wait for HTML and JS, and then, after those are parsed and loaded, make multiple calls to APIs. Additionally, all of this happens on client-side devices, which are inevitably less powerful than servers and have potentially flaky last-mile connections.

With HTMLRewriter in Workers, you can make those API calls from the edge, where failures are significantly less likely than on client device connections, and results can often be cached. Better yet, you can write the APIs themselves in Workers and can incorporate the results directly into the HTML — all on the same powerful edge machine. Using these machines to perform “edge-side rendering” with HTMLRewriter always happens as close as possible to your end users, without happening on the device itself, and it eliminates the latency of traveling all the way to the origin.

What does the HTMLRewriter API look like?

The HTMLRewriter class is a jQuery-like experience directly inside of your Workers application, allowing developers to build deeply functional applications, leaning on a powerful JavaScript API to parse and transform HTML.

Below is an example of how you can use the HTMLRewriter to rewrite links on a webpage from HTTP to HTTPS.

const rewriteUrl = (element, attribute) => {
  const value = element.getAttribute(attribute);
  if (value && value.startsWith('http://')) {
    element.setAttribute(attribute, value.replace('http://', 'https://'));
  }
};

async function handleRequest(req) {
  const res = await fetch(req);
  return new HTMLRewriter()
    .on('a', { element: e => rewriteUrl(e, 'href') })
    .on('img', { element: e => rewriteUrl(e, 'src') })
    .transform(res);
}

In the example above, we define a small rewriteUrl helper, create a new instance of HTMLRewriter, and use selectors to find all a and img elements, calling rewriteUrl on their href and src attributes respectively.
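Going back to the edge-side rendering idea from earlier, here is a sketch of a Worker that fetches the page and an API in parallel, then injects the API result into the HTML stream; the div#weather selector and the API URL are hypothetical:

// Inject API data into the HTML response at the edge.
class WeatherHandler {
  constructor(data) { this.data = data }
  element(el) {
    el.setInnerContent(`Current temperature: ${this.data.temp}`)
  }
}

async function handleRequest(request) {
  const [page, apiRes] = await Promise.all([
    fetch(request),                           // the static page
    fetch('https://api.example.com/weather')  // hypothetical API
  ])
  const data = await apiRes.json()
  return new HTMLRewriter()
    .on('div#weather', new WeatherHandler(data))
    .transform(page)
}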

Internationalization and localization tutorial: If you’d like to take things further, we have a full tutorial on how to make your application i18n friendly using HTMLRewriter.

Getting started

If you’re already using Cloudflare Workers, you can simply get started with the HTMLRewriter by consulting our documentation (no sign up or anything else required!). If you’re new to Cloudflare Workers, we recommend starting out by signing up here.

You are, of course, not limited to Workers Sites only. Since Cloudflare Workers can be deployed as a proxy in front of any application you can use the HTMLRewriter as an elegant way to augment your existing site, and easily add dynamic elements, regardless of backend.

If you’re interested in the nitty-gritty details of how the HTMLRewriter works, and learning more than you’ve ever wanted to know about parsing the DOM, stay tuned. We’re excited to share the details with you in a future post.

We love to hear from you!

We’re always iterating and working to improve our product based on customer feedback! Please help us out by filling out our survey about your experience.


Have you built something interesting with Workers? Let us know @CloudflareDev!

Birthday Week 2019 Wrap-up

Post Syndicated from Jake Anderson original https://blog.cloudflare.com/birthday-week-2019-wrap-up/

This week we celebrated Cloudflare’s 9th birthday by launching a variety of new offerings that support our mission: to help build a better Internet.  Below is a summary recap of how we celebrated Birthday Week 2019.

Cleaning up bad bots

Every day Cloudflare protects over 20 million Internet properties from malicious bots, and this week you were invited to join in the fight!  Now you can enable “bot fight mode” in the Firewall settings of the Cloudflare Dashboard and we’ll start deploying CPU-intensive code to traffic originating from malicious bots.  This wastes the bots’ CPU resources and makes it more difficult and costly for perpetrators to deploy malicious bots at scale. We’ll also share the IP addresses of malicious bot traffic with our Bandwidth Alliance partners, who can help kick malicious bots offline. Join us in the battle against bad bots – and, as you can read here – you can help the climate too!

Browser Insights

Speed matters, and if you manage a website or app, you want to make sure that you’re delivering a high performing website to all of your global end users. Now you can enable Browser Insights in the Speed section of the Cloudflare Dashboard to analyze website performance from the perspective of your users’ web browsers.  

WARP, the wait is over

Several months ago we announced WARP, a free mobile app purpose-built to address the security and performance challenges of the mobile Internet, while also respecting user privacy.  After months of testing and development, this week we (finally) rolled out WARP to approximately 2 million wait-list customers.  We also enabled Warp Plus, a WARP experience that uses Argo routing technology to route your mobile traffic across faster, less-congested routes through the Internet.  Warp and Warp Plus (Warp+) are now available in the iOS and Android App stores and we can’t wait for you to give it a try!

HTTP/3 Support

Last year we announced early support for QUIC, a UDP based protocol that aims to make everything on the Internet work faster, with built-in encryption. The IETF subsequently decided that QUIC should be the foundation of the next generation of the HTTP protocol, HTTP/3. This week, Cloudflare was the first to introduce support for HTTP/3 in partnership with Google Chrome and Mozilla.

Workers Sites

Finally, to wrap up our birthday week announcements, we announced Workers Sites. The Workers serverless platform continues to grow and evolve, and every day we discover new and innovative ways to help developers build and optimize their applications. Workers Sites enables developers to easily deploy lightweight static sites across Cloudflare’s global cloud platform without having to build out the traditional backend server infrastructure to support these sites.

We look forward to Birthday Week every year, as a chance to showcase some of our exciting new offerings — but we all know building a better Internet is about more than one week.  It’s an effort that takes place all year long, and requires the help of our partners, employees and especially you — our customers. Thank you for being a customer, providing valuable feedback and helping us stay focused on our mission to help build a better Internet.

Can’t get enough of this week’s announcements, or want to learn more? Register for next week’s Birthday Week Recap webinar to get the inside scoop on every announcement.

Workers Sites: Extending the Workers platform with our own serverless building blocks

Post Syndicated from Ashley Williams original https://blog.cloudflare.com/extending-the-workers-platform/

As of today, with the Wrangler CLI, you can now deploy entire websites directly to Cloudflare Workers and Workers KV. If you can statically generate the assets for your site, think create-react-app, Jekyll, or even the WP2Static plugin, you can deploy it to our entire global network, which spans 194 cities in more than 90 countries.

While you could deploy an entire site directly to Workers before, it wasn’t the easiest process. So, the Workers Developer Experience Team came up with a solution to make deploying static assets a significantly better experience.

Using our Workers command-line tool Wrangler, we’ve made it possible to deploy any static site to Workers in three easy steps: run wrangler init --site, configure the newly created wrangler.toml file with your account and project details, and then publish it to Cloudflare’s edge with wrangler publish. If you want to explore how this works, check out our new Workers Sites tutorial for create-react-app, where we cover how this new functionality allows you to deploy without needing to write any additional code!

While in hindsight the path we took to get to this point might not seem the most straightforward, it really highlights the flexibility of the entire Workers platform to easily support use cases that we didn’t originally envision. With this in mind, I’ll walk you through the implementation and thinking we did to get to this point. I’ll also talk a bit about how the flexibility of the Workers platform has us excited, both for the ethos it represents, and the future it enables.

So, what went into building Workers Sites?

“Filesystem?! Where we’re going, we don’t need a filesystem!”

The Workers platform is built on V8 isolates, which, while awesome, lack a filesystem. If you’ve ever deployed a static site via FTP, uploaded it to object storage, or used a computer, you’d probably agree that filesystems are important. For many use cases, like building an API or routing, you don’t need a filesystem, but as the vision for Workers grew and our audience grew with it, it became clear to us that this was a limitation we needed to address for new features.

Welcome to the simulation

Without a filesystem, we decided to simulate one on top of Workers KV! Workers KV provides access to a secure key-value store that runs across Cloudflare’s Edge alongside Workers.

When running wrangler preview or wrangler publish, we check your wrangler.toml for the site key. The site key points to a bucket: the KV namespace we’ll use to store your static assets. We then upload each of your assets, where the path relative to the entry directory is the key, and the blob of the file is the value.

When a request from a user comes in, the Worker reads the request’s URI and looks up the asset that matches the segment requested. For example, if a user fetches “my-site.com/about.html”, the Worker looks up the “about.html” key in KV and returns the blob. Behind the scenes, we’ll also detect the mime-type of the requested asset and return the response with the correct content-type headers.
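In code, a first cut of that lookup might look something like this sketch, which assumes a KV namespace binding named STATIC_ASSETS and uses a hypothetical mimeTypeFor() helper in place of full mime-type detection:

// Serve a static asset out of Workers KV, keyed by request path.
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // "/about.html" becomes the KV key "about.html".
  const key = new URL(request.url).pathname.slice(1) || 'index.html'
  const body = await STATIC_ASSETS.get(key, 'arrayBuffer')
  if (body === null) {
    return new Response('Not found', { status: 404 })
  }
  return new Response(body, { headers: { 'Content-Type': mimeTypeFor(key) } })
}

// Hypothetical helper mapping file extensions to content types.
function mimeTypeFor(key) {
  if (key.endsWith('.html')) return 'text/html'
  if (key.endsWith('.css')) return 'text/css'
  if (key.endsWith('.js')) return 'application/javascript'
  return 'application/octet-stream'
}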

For folks who are used to building static sites or sites with a static asset serving component, this could feel deeply overengineered. Others may argue that, indeed, this is just how filesystems are built! The interesting thing for us is that we had to build one; there wasn’t just one there waiting for us.

It was great that we could put this together with Workers KV, but we still had a problem…

Cache rules everything around me

Workers KV is a database, and so it’s set up for both read and write operations. However, it’s primarily tuned for read-heavy workloads on entries that don’t generally have a long life span. This works well for applications where data is accessed frequently and often updated. But, for static websites, assets are generally written once, and then they are never (or infrequently) written to again. Static site content should be cached for a very long time, if not forever (long live Space Jam). This means we need to cache data much longer than KV is used to.

To fix this, on publish or preview, Wrangler walks the entry-point directory you’ve declared in your wrangler.toml and creates an asset manifest: a map of your filenames to a hash of their content. We use this asset manifest to map requests for a particular filename, say index.html, to the content hash of the most recently uploaded static asset.

You may be familiar with the concept of an asset manifest from using tools like create-react-app. Asset manifests help maintain asset fingerprints for caching in the browser. We took this idea and implemented it in Workers Sites, so that we can leverage the edge cache as well!

This allows us, after the first read per location, to cache the static assets in the Cloudflare cache so that they can be stored on the edge indefinitely. This reduces reads to KV to almost nothing; we want to use KV for durability purposes, but a longer caching strategy for performance. Let’s dive into exactly what this looks like:

How it works

When a new asset is created, wrangler publish pushes the new asset to KV, and pushes an asset manifest to the edge alongside your Worker.

When someone first accesses your page, the Cloudflare location closest to them will run your Worker. The Worker script will determine the content hash of the asset they’ve requested by looking up that asset in the asset manifest. It will use the filename and content hash as the key to fetch the asset’s contents from KV. At this time it will also insert the asset’s contents into Cloudflare’s edge cache, again keyed by filename and content hash. It will then respond to the request with the asset.

On subsequent requests, the Worker script will look up the content hash in the asset manifest, and check the cache to see if the asset is there. Since this is a subsequent request, it will find your asset in the cache on the edge and return a response containing the asset without having to fetch the asset contents from KV.
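Sketched out, that read path might look like the following, assuming the Worker has an ASSET_MANIFEST lookup object and a STATIC_ASSETS KV binding available (both names are ours for illustration):

// Try the edge cache first; on a miss, read from KV and populate the cache.
async function serveAsset(request) {
  const url = new URL(request.url)
  const path = url.pathname.slice(1) || 'index.html'
  const hashedKey = ASSET_MANIFEST[path] // e.g. "index.abc123.html"
  const cacheKey = new Request(`${url.origin}/${hashedKey}`)
  const cache = caches.default

  let response = await cache.match(cacheKey)
  if (!response) {
    const body = await STATIC_ASSETS.get(hashedKey, 'arrayBuffer')
    response = new Response(body, {
      headers: { 'Cache-Control': 'max-age=31536000, immutable' }
    })
    // cache.put consumes the body, so store a clone.
    await cache.put(cacheKey, response.clone())
  }
  return response
}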

So what happens when you update your index.html, or any of your static assets? The process is very similar to what happens on the upload of a new asset. You run wrangler publish with your new asset on your local machine. Wrangler walks your asset directory and uploads the assets to KV. At the same time, it creates a new asset manifest containing the filename and a content hash representing the new contents of the asset. When a request comes into your Worker, your Worker looks into the asset manifest and retrieves the new content hash for that asset. The Worker then looks in the cache for the new hash; since it isn’t there yet, it fetches the new asset from KV, populates the cache, and returns the new file to your end user.

Edge caching happens per location across 194 cities around the world, ensuring that the most frequently accessed content on your page is cached in a location closest to those requesting content, reducing latency. All of this happens in addition to the browser cache, which means that your assets are nearly always incredibly close to end users!

By being on the edge, a Worker is in a unique position to be able to cache not only static assets like JS, CSS, and images, but also HTML assets! Traditional static site solutions utilize your site’s HTML as an entry point to the static site generator’s asset manifest. With that method of caching your HTML, it would be impossible to bust the cache, because there is no entry point other than the HTML itself for managing your assets’ fingerprints. However, in a Worker the entry point is your Worker! We can then leverage the Wrangler asset manifest to look up and fetch accurate, cacheable HTML, while at the same time cache-busting on content hash.

Making the possible imaginable

“What we have is a crisis of imagination. Albert Einstein said that you cannot solve a problem with the same mind-set that created it.” – Peter Buffett

When building a brand new developer platform, there’s often a vast number of possible applications. However, the sheer number of possibilities often makes each one difficult to imagine. That’s why we think the most important part of any platform is its flexibility to adapt to previously unimagined use cases. And we don’t mean that just for us. It’s important that everyone has the ability to customize the platform to new and interesting use cases!

At face value, the work we did to implement this feature might seem like another solution for a previously solved problem. However, it’s a great example of how a group of dedicated developers can improve the platform experience for others.

We hope that by paving a way to include static assets in a Worker, developers can use the extra cognitive space to conceive of even more new ways to use Workers that may have been hard to imagine before.

Workers Sites isn’t the end goal, but a stepping stone that pushes us to keep thinking critically about what it means to build a web application. We’re excited to give developers the space to explore how simple static applications can grow and evolve when combined with the dynamic power of edge computing.

Go forth and build something awesome!


Have you built something interesting with Workers? Let us know @CloudflareDev!

Workers Sites: deploy your website directly to our network

Post Syndicated from Rita Kozlov original https://blog.cloudflare.com/workers-sites/

Performance on the web has always been a battle against the speed of light — accessing a site from London that is served from Seattle, WA means every single asset request has to travel over seven thousand miles. The first breakthrough in the web performance battle was HTTP/1.1 connection keep-alive and browsers opening multiple connections. The next breakthrough was the CDN, bringing your static assets closer to your end users by caching them in data centers closer to them. Today, with Workers Sites, we’re excited to announce the next big breakthrough — entire sites distributed directly onto the edge of the Internet.

Deploying to the edge of the network

Why isn’t just caching assets sufficient? Yes, caching improves performance, but significant performance improvement comes with a series of headaches. The CDN can make a guess at which assets it should cache, but that is just a guess. Configuring your site for maximum performance has always been an error-prone process, requiring a wide collection of esoteric rules and headers. Even when perfectly configured, almost nothing is cached forever, precious requests still often need to travel all the way to your origin (wherever it may be). Cache invalidation is, after all, one of the hardest problems in computer science.

This raises the question: rather than clumsily moving bytes from the origin to the edge bit by bit, why not push the whole origin to the edge?

Workers Sites: Extending the Workers platform

Two years ago for Birthday Week, we announced Cloudflare Workers, a way for developers to write and run JavaScript and WebAssembly on our network in 194 cities around the world. A year later, we released Workers KV, our distributed key-value store that gave developers the ability to store state at the edge in those same cities.

Workers Sites leverages the power of Workers and Workers KV by allowing developers to upload their sites directly to the edge, and closer to the end users. Born on the edge, Workers Sites is what we think modern development on the web should look like, natively secure, fast, and massively scalable. Less of your time is spent on configuration, and more of your time is spent on your code, and content itself.

How it works

Workers Sites are deployed with a few terminal commands, and can serve a site generated by any static site generator, such as Hugo, Gatsby, or Jekyll. Using Wrangler (our CLI), you can upload your site’s assets directly into KV. When a request hits your Workers Site, the Cloudflare Worker generated by Wrangler reads and serves the asset from KV with the appropriate headers (no need to worry about Content-Type or Cache-Control; we’ve got you covered).

Workers Sites can be used to deploy any static site, such as a blog, a marketing site, or a portfolio. If you ever decide your site needs to become a little less static, your Worker is just code: edit and extend it until you have a dynamic site running all around the world.

Getting started

To get started with Workers Sites, you first need to sign up for Workers. After selecting your workers.dev subdomain, choose the Workers Unlimited plan (starting at $5 / month) to get access to Workers KV and the ability to deploy Workers Sites.

After signing up for Workers Unlimited, you’ll need to install Wrangler, the CLI for Workers. Wrangler can be installed from either NPM or Cargo:

# NPM Installation
npm i @cloudflare/wrangler -g
# Cargo Installation
cargo install wrangler

Once you install Wrangler, you are ready to deploy your static site, with the following steps:

  1. Run wrangler init --site in the directory that contains your static site’s built assets
  2. Fill in the newly created wrangler.toml file with your account and project details
  3. Publish your site with wrangler publish

You can also check out our Workers Sites reference documentation or follow the full tutorial for create-react-app in the docs.

If you’d prefer to get started by watching a video, we’ve got you covered! This video will walk you through creating and deploying your first Workers Site.


Blazing fast: from Atlanta to Zagreb

In addition to improving the developer experience, we did a lot of work behind the scenes making sure that both deploys and the sites themselves are blazing fast — we’re excited to share the how with you in our technical blog post.

To test the performance of Workers Sites, we took one of our personal sites and deployed it to run some benchmarks. This test was for our site, but your results may vary.

One common way to benchmark the performance of your site is using Google Lighthouse, which you can do directly from the Audits tab of your Chrome browser.

So we passed the first test with flying colors — 100! However, running a benchmark from your own computer introduces a bias: your users are not necessarily where you are. In fact, your users are increasingly not where you are.

Where you’re benchmarking from is really important: running tests from different locations will yield different results. Benchmarking from Seattle and hitting a server on the West Coast says very little about your global performance.

We decided to use a tool called Catchpoint to run benchmarks from cities around the world. To see how we compare, we deployed the site to three different static site deployment platforms including Workers Sites.

Since providers offer data center regions on the coasts of the United States or in central Europe, it’s common to see good performance in regions such as North America, and we’ve got you covered there.

But what about your users in the rest of the world? Performance is even more critical in those regions: the first users are not going to be connecting to your site on a MacBook Pro, on a blazing fast connection. Workers Sites allows you to reach those regions without any additional effort on your part — every time our map grows, your global presence grows with it.

We’ve done the work of running benchmarks from different parts of the world for you, and we’re pleased to share the results.

One last thing…

Deploying your next site with Workers Sites is easy and leads to great performance, so we thought it was only right that we deploy with Workers Sites ourselves. With this announcement, we are also open-sourcing the Cloudflare Workers docs! They are now served from a Cloudflare data center near you using Workers Sites.

We can’t wait to see what you deploy with Workers Sites!


Have you built something interesting with Workers or Workers Sites? Let us know @CloudflareDev!

New – Step Functions Support for Dynamic Parallelism

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-step-functions-support-for-dynamic-parallelism/

Microservices make applications easier to scale and faster to develop, but coordinating the components of a distributed application can be a daunting task. AWS Step Functions is a fully managed service that makes coordinating tasks easier by letting you design and run workflows that are made of steps, each step receiving as input the output of the previous step. For example, Novartis Institutes for Biomedical Research is using Step Functions to empower scientists to run image analysis without depending on cluster experts.

Step Functions added some very interesting capabilities recently, such as callback patterns, to simplify the integration of human activities and third-party services, and nested workflows, to assemble together modular, reusable workflows. Today, we are adding support for dynamic parallelism within a workflow!

How Dynamic Parallelism Works
State machines are defined using the Amazon States Language, a JSON-based, structured language. The Parallel state can be used to execute a fixed number of branches, defined in the state machine, in parallel. Now, Step Functions supports a new Map state type for dynamic parallelism.

To configure a Map state, you define an Iterator, which is a complete sub-workflow. When a Step Functions execution enters a Map state, it will iterate over a JSON array in the state input. For each item, the Map state will execute one sub-workflow, potentially in parallel. When all sub-workflow executions complete, the Map state will return an array containing the output for each item processed by the Iterator.

You can configure an upper bound on how many concurrent sub-workflows Map executes by adding the MaxConcurrency field. The default value is 0, which places no limit on parallelism, and iterations are invoked as concurrently as possible. A MaxConcurrency value of 1 invokes the Iterator one element at a time, in the order of their appearance in the input, and does not start an iteration until the previous iteration has completed.
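For example, here is a minimal sketch of a Map state with bounded concurrency; the state name, field values, and function ARN are illustrative, and the full walkthrough later in this post shows a complete definition:

"ProcessAllItems": {
  "Type": "Map",
  "InputPath": "$.detail",
  "ItemsPath": "$.items",
  "MaxConcurrency": 3,
  "Iterator": {
    "StartAt": "CheckAvailability",
    "States": {
      "CheckAvailability": {
        "Type": "Task",
        "Resource": "arn:aws:lambda:us-west-2:123456789012:function:checkAvailability",
        "End": true
      }
    }
  },
  "ResultPath": "$.detail.processedItems",
  "End": true
}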

One way to use the new Map state is to leverage fan-out or scatter-gather messaging patterns in your workflows:

  • Fan-out is applied when delivering a message to multiple destinations, and can be useful in workflows such as order processing or batch data processing. For example, you can retrieve arrays of messages from Amazon SQS and have Map send each message to a separate AWS Lambda function.
  • Scatter-gather broadcasts a single message to multiple destinations (scatter) and then aggregates the responses back for the next steps (gather). This can be useful in file processing and test automation. For example, you can transcode ten 500 MB media files in parallel and then join them to create a 5 GB file.

Like Parallel and Task states, Map supports Retry and Catch fields to handle service and custom exceptions. You can also apply Retry and Catch to states inside your Iterator to handle exceptions. If any Iterator execution fails, because of an unhandled error or by transitioning to a Fail state, the entire Map state is considered to have failed and all its iterations are stopped. If the error is not handled by the Map state itself, Step Functions stops the workflow execution with an error.
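For example, adding a Catch field like this inside a Map state (HandleFailure is a placeholder state defined elsewhere in the state machine) routes any error escaping the iterations to a recovery step instead of failing the whole execution:

"Catch": [
  {
    "ErrorEquals": ["States.ALL"],
    "ResultPath": "$.error",
    "Next": "HandleFailure"
  }
]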

Using the Map State
Let’s build a workflow to process an order and, by using the Map state, to work on the items in the order in parallel. The tasks executed as part of this workflow are all Lambda functions, but with Step Functions you can use other AWS service integrations and have code running on EC2 instances, containers, or on-premises infrastructure.

Here’s our sample order, expressed as a JSON document, for a few books, plus some coffee to drink while reading them. The order has a detail section, where there is a list of items that are part of the order.

{
  "orderId": "12345678",
  "orderDate": "20190820101213",
  "detail": {
    "customerId": "1234",
    "deliveryAddress": "123, Seattle, WA",
    "deliverySpeed": "1-day",
    "paymentMethod": "aCreditCard",
    "items": [
      {
        "productName": "Agile Software Development",
        "category": "book",
        "price": 60.0,
        "quantity": 1
      },
      {
        "productName": "Domain-Driven Design",
        "category": "book",
        "price": 32.0,
        "quantity": 1
      },
      {
        "productName": "The Mythical Man Month",
        "category": "book",
        "price": 18.0,
        "quantity": 1
      },
      {
        "productName": "The Art of Computer Programming",
        "category": "book",
        "price": 180.0,
        "quantity": 1
      },
      {
        "productName": "Ground Coffee, Dark Roast",
        "category": "grocery",
        "price": 8.0,
        "quantity": 6
      }
    ]
  }
}

To process this order, I am using a state machine defining how the different tasks should be executed. The Step Functions console creates a visual representation of the workflow I am building:

  • First, I validate and check the payment.
  • Then, I process the items in the order, potentially in parallel, to check their availability, prepare for delivery, and start the delivery process.
  • At the end, a summary of the order is sent to the customer.
  • In case the payment check fails, I intercept that, for example to send a notification to the customer.

 

Here is the same state machine definition expressed as a JSON document. The ProcessAllItems state uses Map to process the items in the order in parallel. In this case, I limit concurrency to 3 using the MaxConcurrency field. Inside the Iterator, I can put a sub-workflow of arbitrary complexity. In this case, there are three steps, to CheckAvailability, PrepareForDelivery, and StartDelivery of the item. Each of these steps can use Retry and Catch to make the sub-workflow execution more reliable, for example in case of integrations with external services.

{
  "StartAt": "ValidatePayment",
  "States": {
    "ValidatePayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:validatePayment",
      "Next": "CheckPayment"
    },
    "CheckPayment": {
      "Type": "Choice",
      "Choices": [
        {
          "Not": {
            "Variable": "$.payment",
            "StringEquals": "Ok"
          },
          "Next": "PaymentFailed"
        }
      ],
      "Default": "ProcessAllItems"
    },
    "PaymentFailed": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:paymentFailed",
      "End": true
    },
    "ProcessAllItems": {
      "Type": "Map",
      "InputPath": "$.detail",
      "ItemsPath": "$.items",
      "MaxConcurrency": 3,
      "Iterator": {
        "StartAt": "CheckAvailability",
        "States": {
          "CheckAvailability": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-west-2:123456789012:function:checkAvailability",
            "Retry": [
              {
                "ErrorEquals": [
                  "TimeOut"
                ],
                "IntervalSeconds": 1,
                "BackoffRate": 2,
                "MaxAttempts": 3
              }
            ],
            "Next": "PrepareForDelivery"
          },
          "PrepareForDelivery": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-west-2:123456789012:function:prepareForDelivery",
            "Next": "StartDelivery"
          },
          "StartDelivery": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-west-2:123456789012:function:startDelivery",
            "End": true
          }
        }
      },
      "ResultPath": "$.detail.processedItems",
      "Next": "SendOrderSummary"
    },
    "SendOrderSummary": {
      "Type": "Task",
      "InputPath": "$.detail.processedItems",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:sendOrderSummary",
      "ResultPath": "$.detail.summary",
      "End": true
    }
  }
}

The Lambda functions used by this workflow are not aware of the overall structure of the order JSON document. They just need to know the part of the input state they are going to process. This is a best practice that makes those functions easily reusable in multiple workflows. The state machine definition manipulates the paths used for the input and the output of the functions using JsonPath syntax via the InputPath, ItemsPath, ResultPath, and OutputPath fields (see the sketch after this list):

  • InputPath is used to filter the data in the input state, for example to only pass the detail of the order to the Iterator.
  • ItemsPath is specific to the Map state and is used to identify where, in the input, the array field to process is found, for example to process the items inside the detail of the order.
  • ResultPath makes it possible to add the output of a task to the input state, and not overwrite it completely, for example to add a summary to the detail of the order.
  • I am not using OutputPath this time, but it could be useful to filter out unwanted information and pass only the portion of JSON that you care about to the next state. For example, to send as output only the detail of the order.
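To make these fields concrete: with the sample order above, InputPath set to "$.detail" and ItemsPath set to "$.items" mean that each iteration of the Iterator receives a single item as its input, for example:

{
  "productName": "Domain-Driven Design",
  "category": "book",
  "price": 32.0,
  "quantity": 1
}

The array returned by the Map state is then inserted at $.detail.processedItems by ResultPath, leaving the rest of the order document untouched.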

Optionally, the Parameters field can be used to customize the raw input used for each iteration. For example, the deliveryAddress is in the detail of the order, but not in each item. To give each iteration an index of the item being processed, plus access to the deliveryAddress, I can add this to the Map state:

"Parameters": {
  "index.$": "$$.Map.Item.Index",
  "item.$": "$$.Map.Item.Value",
  "deliveryAddress.$": "$.deliveryAddress"
}
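With these Parameters, each iteration receives the item itself, its zero-based index, and the delivery address. For example, the second book in the sample order would produce an iteration input like this:

{
  "index": 1,
  "item": {
    "productName": "Domain-Driven Design",
    "category": "book",
    "price": 32.0,
    "quantity": 1
  },
  "deliveryAddress": "123, Seattle, WA"
}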

Available Now
This new feature is available today in all regions where Step Functions is offered. Dynamic parallelism was probably the most requested feature for Step Functions. It unblocks the implementation of new use cases and can help optimize existing ones. Let us know what you are going to use it for!

How We Design Features for Wrangler, the Cloudflare Workers CLI

Post Syndicated from Ashley M Lewis original https://blog.cloudflare.com/how-we-design-features-for-wrangler/


The most recent update to Wrangler, version 1.3.1, introduces important new features for developers building Cloudflare Workers — from built-in deployment environments to first class support for Workers KV. Wrangler is Cloudflare’s first officially supported CLI. Branching into this field of software has been a novel experience for us engineers and product folks on the Cloudflare Workers team.

As part of the 1.3.1 release, the folks on the Workers Developer Experience team dove into the thought process that goes into building out features for a CLI and thinking like users. Because while we wish building a CLI were as easy as our teammate Avery tweeted…


… it brings design challenges that many of us have never encountered. To overcome these challenges successfully requires deep empathy for users across the entire team, as well as the ability to address ambiguous questions related to how developers write Workers.

Wrangler, meet Workers KV

Our new KV functionality introduced a host of new features, from creating KV namespaces to bulk uploading key-value pairs for use within a Worker. This new functionality primarily consisted of logic for interacting with the Workers KV API, meaning that the technical work under the hood was relatively straightforward. Figuring out how to cleanly represent these new features to Wrangler users, however, became the fundamental question of this release.

Designing the invocations for new KV functionality unsurprisingly required multiple iterations, and taught us a lot about usability along the way!

Attempt 1

For our initial pass, the path originally seemed so obvious. (Narrator: It really, really wasn’t). We hypothesized that having Wrangler support familiar commands — like ls and rm — would be a reasonable mapping of familiar command line tools to Workers KV, and ended up with the following set of invocations below:

# creates a new KV namespace
$ wrangler kv add myNamespace

# sets a string key that doesn't expire
$ wrangler kv set myKey="someStringValue"

# sets many keys
$ wrangler kv set myKey="someStringValue" myKey2="someStringValue2" ...

# sets a volatile (expiring) key that expires in 60 s
$ wrangler kv set myVolatileKey=path/to/value --ttl 60s

# deletes three keys
$ wrangler kv rm myNamespace myKey1 myKey2 myKey3

# lists all your namespaces
$ wrangler kv ls

# lists all the keys for a namespace
$ wrangler kv ls myNamespace

# removes all keys from a namespace, then removes the namespace
$ wrangler kv rm -r myNamespace

While these commands invoked familiar shell utilities, they made interacting with your KV namespace feel more like working with a filesystem than with a key-value store. The juxtaposition of a well-known command like ls with a non-utility like set was confusing. Additionally, mapping preexisting command line tools to KV actions was not a good one-to-one fit (especially for rm -r; there is no need to recursively delete a KV namespace like a directory when you can just delete the namespace!).

This draft also surfaced use cases we needed to support: namely, easy bulk uploads from a file. Requiring users to enter every key-value pair on the command line instead of reading from a file of key-value pairs was a non-starter.

Finally, these KV subcommands caused confusion about actions to different resources. For example, the command for listing your Workers KV namespaces looked a lot like the command for listing keys within a namespace.

Going forward, we needed to meet these newly identified needs.

Attempt 2

Our next attempt shed the shell utilities in favor of simple, declarative subcommands like create, list, and delete. It also addressed the need for easy-to-use bulk uploads by allowing users to pass a JSON file of keys and values to Wrangler.

# create a namespace
$ wrangler kv create namespace <title>

# delete a namespace
$ wrangler kv delete namespace <namespace-id>

# list namespaces
$ wrangler kv list namespace

# write key-value pairs to a namespace, with an optional expiration flag
$ wrangler kv write key <namespace-id> <key> <value> --ttl 60s

# delete a key from a namespace
$ wrangler kv delete key <namespace-id> <key>

# list all keys in a namespace
$ wrangler kv list key <namespace-id>

# write bulk kv pairs; takes a JSON file or a directory. For a directory, keys are file paths from the root and values are the file contents
$ wrangler kv write bulk ./path/to/assets

# delete bulk pairs; same input functionality as above
$ wrangler kv delete bulk ./path/to/assets

Given the breadth of new functionality we planned to introduce, we also built out a taxonomy of new subcommands to ensure that invocations for different resources — namespaces, keys, and bulk sets of key-value pairs — were consistent:

[Diagram: taxonomy of the new KV subcommands]

Designing invocations with taxonomies became a crucial part of our development process going forward, and gave us a clear look at the “big picture” of our new KV features.

This approach was closer to what we wanted. It offered bulk put and bulk delete operations that would read multiple key-value pairs from a JSON file. After specifying an action subcommand (e.g. delete), users now explicitly stated which resource an action applied to (namespace, key, bulk), which reduced confusion about which action applied to which KV component.

This draft, however, was still not as explicit as we wanted it to be. The distinction between operations on namespaces versus keys was not as obvious as we would have liked, and we still feared that similar-looking delete operations could accidentally produce unwanted deletes (a possibly disastrous outcome!).

Attempt 3

We really wanted to help differentiate where in the hierarchy of structs a user was operating at any given time. Were they operating on namespaces, keys, or bulk sets of keys in a given operation, and how could we make that as clear as possible? We looked around, comparing the ways CLIs from kubectl to Heroku’s handled commands affecting different objects. We landed on a pleasing pattern inspired by Heroku’s CLI: colon-delimited command namespacing:

plugins:install PLUGIN    # installs a plugin into the CLI
plugins:link [PATH]       # links a local plugin to the CLI for development
plugins:uninstall PLUGIN  # uninstalls or unlinks a plugin
plugins:update            # updates installed plugins

So we adopted kv:namespace, kv:key, and kv:bulk to semantically separate our commands:

# namespace commands operate on namespaces
$ wrangler kv:namespace create <title> [--env]
$ wrangler kv:namespace delete <binding> [--env]
$ wrangler kv:namespace rename <binding> <new-title> [--env]
$ wrangler kv:namespace list [--env]
# key commands operate on individual keys
$ wrangler kv:key write <binding> <key>=<value> [--env | --ttl | --exp]
$ wrangler kv:key delete <binding> <key> [--env]
$ wrangler kv:key list <binding> [--env]
# bulk commands take a user-generated JSON file as an argument
$ wrangler kv:bulk write <binding> ./path/to/data.json [--env]
$ wrangler kv:bulk delete <binding> ./path/to/data.json [--env]

And ultimately ended up with this topology:

[Diagram: final topology of the kv:namespace, kv:key, and kv:bulk commands]

We were even closer to our desired usage pattern; the object acted upon was explicit to users, and the action applied to the object was also clear.

There was one usage issue left. Supplying a namespace ID (the field that specifies which Workers KV namespace an action applies to) required users to dig up their clunky KV namespace ID (a string like 06779da6940b431db6e566b4846d64db) and pass it via the namespace-id option. This namespace ID is what the Workers KV API expects in requests, but it would be cumbersome for users to find and provide, let alone use frequently.

The solution we came to takes advantage of the wrangler.toml present in every Wrangler-generated Worker. To publish a Worker that uses a Workers KV store, the following field is needed in the Worker’s wrangler.toml:

kv-namespaces = [
	{ binding = "TEST_NAMESPACE", id = "06779da6940b431db6e566b4846d64db" }
]

This field specifies a Workers KV namespace that is bound to the name TEST_NAMESPACE, such that a Worker script can access it with logic like:

TEST_NAMESPACE.get("my_key");

We also decided to take advantage of this wrangler.toml field to allow users to specify a KV binding name instead of a KV namespace id. Upon providing a KV binding name, Wrangler could look up the associated id in wrangler.toml and use that for Workers KV API calls.

Wrangler users performing actions on KV namespaces could simply provide --binding TEST_NAMESPACE for their KV calls and let Wrangler retrieve the corresponding ID from wrangler.toml. Users can still specify --namespace-id directly if they do not have namespaces specified in their wrangler.toml.
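For example, with the wrangler.toml above, the following two invocations would be equivalent (sketched using the write syntax from this design; the exact flags have evolved across Wrangler releases):

# resolve the namespace id from the binding name in wrangler.toml
$ wrangler kv:key write --binding TEST_NAMESPACE my_key "some value"

# or pass the raw namespace id directly
$ wrangler kv:key write --namespace-id 06779da6940b431db6e566b4846d64db my_key "some value"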

Finally, we reached our happy point: Wrangler’s new KV subcommands were explicit, offered functionality for both individual and bulk actions with Workers KV, and felt ergonomic for Wrangler users to integrate into their day-to-day operations.

Lessons Learned

Throughout this design process, we identified the following takeaways to carry into future Wrangler work:

  1. Taxonomies of your CLI’s subcommands and invocations are a great way to ensure consistency and clarity. CLI users tend to anticipate similar semantics and workflows within a CLI, so visually documenting all paths for the CLI can greatly help with identifying where new work can be consistent with older semantics. Drawing out these taxonomies can also expose missing features that seem like a fundamental part of the “big picture” of a CLI’s functionality.
  2. Use other CLIs for inspiration and sanity checking. Drawing logic from popular CLIs helped us confirm our assumptions about what users like, and learn established patterns for complex CLI invocations.
  3. Avoid logic that requires passing in raw ID strings. Testing CLIs a lot means that remembering and re-pasting ID values gets very tedious very quickly. Emphasizing a set of purely human-readable CLI commands and arguments makes for a far more intuitive experience. When possible, taking advantage of configuration files (like we did with wrangler.toml) offers a straightforward way to provide mappings of human-readable names to complex IDs.

We’re excited to continue using these design principles we’ve learned and documented as we grow Wrangler into a one-stop Cloudflare Workers shop.

If you’d like to try out Wrangler, check it out on GitHub and let us know what you think! We would love your feedback.


Learn about AWS Services & Solutions – September AWS Online Tech Talks

Post Syndicated from Jenny Hang original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-september-aws-online-tech-talks/


Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

 

Compute:

September 23, 2019 | 11:00 AM – 12:00 PM PT – Build Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.

September 26, 2019 | 1:00 PM – 2:00 PM PT – Self-Hosted WordPress: It’s Easier Than You Think – Learn how you can easily build a fault-tolerant WordPress site using Amazon Lightsail.

October 3, 2019 | 11:00 AM – 12:00 PM PT – Lower Costs by Right Sizing Your Instance with Amazon EC2 T3 General Purpose Burstable Instances – Get an overview of T3 instances, understand what workloads are ideal for them, and understand how the T3 credit system works so that you can lower your EC2 instance costs today.

 

Containers:

September 26, 2019 | 11:00 AM – 12:00 PM PT – Develop a Web App Using Amazon ECS and AWS Cloud Development Kit (CDK) – Learn how to build your first app using CDK and AWS container services.

 

Data Lakes & Analytics:

September 26, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Provisioning Amazon MSK Clusters and Using Popular Apache Kafka-Compatible Tooling – Learn best practices on running Apache Kafka production workloads at a lower cost on Amazon MSK.

 

Databases:

September 25, 2019 | 1:00 PM – 2:00 PM PT – What’s New in Amazon DocumentDB (with MongoDB compatibility) – Learn what’s new in Amazon DocumentDB, a fully managed MongoDB compatible database service designed from the ground up to be fast, scalable, and highly available.

October 3, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Enterprise-Class Security, High-Availability, and Scalability with Amazon ElastiCache – Learn about new enterprise-friendly Amazon ElastiCache enhancements like customer managed key and online scaling up or down to make your critical workloads more secure, scalable and available.

 

DevOps:

October 1, 2019 | 9:00 AM – 10:00 AM PT – CI/CD for Containers: A Way Forward for Your DevOps Pipeline – Learn how to build CI/CD pipelines using AWS services to get the most out of the agility afforded by containers.

 

Enterprise & Hybrid:

September 24, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: How to Monitor and Manage Your AWS Costs – Learn how to visualize and manage your AWS cost and usage in this virtual hands-on workshop.

October 2, 2019 | 1:00 PM – 2:00 PM PT – Accelerate Cloud Adoption and Reduce Operational Risk with AWS Managed Services – Learn how AMS accelerates your migration to AWS, reduces your operating costs, improves security and compliance, and enables you to focus on your differentiating business priorities.

 

IoT:

September 25, 2019 | 9:00 AM – 10:00 AM PT – Complex Monitoring for Industrial with AWS IoT Data Services – Learn how to solve your complex event monitoring challenges with AWS IoT Data Services.

 

Machine Learning:

September 23, 2019 | 9:00 AM – 10:00 AM PT – Training Machine Learning Models Faster – Learn how to train machine learning models quickly and with a single click using Amazon SageMaker.

September 30, 2019 | 11:00 AM – 12:00 PM PT – Using Containers for Deep Learning Workflows – Learn how containers can help address challenges in deploying deep learning environments.

October 3, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: Getting Hands-On with Machine Learning and Ready to Race in the AWS DeepRacer League – Join DeClercq Wentzel, Senior Product Manager for AWS DeepRacer, for a presentation on the basics of machine learning and how to build a reinforcement learning model that you can use to join the AWS DeepRacer League.

 

AWS Marketplace:

September 30, 2019 | 9:00 AM – 10:00 AM PT – Advancing Software Procurement in a Containerized World – Learn how to deploy applications faster with third-party container products.

 

Migration:

September 24, 2019 | 11:00 AM – 12:00 PM PT – Application Migrations Using AWS Server Migration Service (SMS) – Learn how to use AWS Server Migration Service (SMS) for automating application migration and scheduling continuous replication, from your on-premises data centers or Microsoft Azure to AWS.

 

Networking & Content Delivery:

September 25, 2019 | 11:00 AM – 12:00 PM PT – Building Highly Available and Performant Applications using AWS Global Accelerator – Learn how to build highly available and performant architectures for your applications with AWS Global Accelerator, now with source IP preservation.

September 30, 2019 | 1:00 PM – 2:00 PM PT – AWS Office Hours: Amazon CloudFront – Just getting started with Amazon CloudFront and Lambda@Edge? Get answers directly from our experts during AWS Office Hours.

 

Robotics:

October 1, 2019 | 11:00 AM – 12:00 PM PT – Robots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.

 

Security, Identity & Compliance:

October 1, 2019 | 1:00 PM – 2:00 PM PT – Deep Dive on Running Active Directory on AWS – Learn how to deploy Active Directory on AWS and start migrating your Windows workloads.

 

Serverless:

October 2, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.

 

Storage:

September 24, 2019 | 9:00 AM – 10:00 AM PT – Optimize Your Amazon S3 Data Lake with S3 Storage Classes and Management Tools – Learn how to use the Amazon S3 Storage Classes and management tools to better manage your data lake at scale and to optimize storage costs and resources.

October 2, 2019 | 11:00 AM – 12:00 PM PT – The Great Migration to Cloud Storage: Choosing the Right Storage Solution for Your Workload – Learn more about AWS storage services and identify which service is the right fit for your business.