Building serverless multi-Region WebSocket APIs

2022-03-10 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-multi-region-websocket-apis/

This post is written by Ben Freiberg, Senior Solutions Architect, and Marcus Ziller, Senior Solutions Architect.

Many modern web applications use the WebSocket protocol for bidirectional communication between frontend clients and backends. The fastest way to get started with WebSockets on AWS is to use WebSocket APIs powered by Amazon API Gateway.

This serverless solution allows customers to get started with WebSockets without having the complexity of running a WebSocket API. WebSocket APIs are a Regional service bound to a single Region, which may affect latency and resilience for some workloads.

This post shows how to build a multi-regional WebSocket API for a global real-time chat application.

Overview of the solution

This solution uses AWS Cloud Development Kit (CDK). This is an open source software development framework to model and provision cloud application resources. Using the CDK can reduce the complexity and amount of code needed to automate the deployment of resources.

This solution uses AWS Lambda, Amazon API Gateway, Amazon DynamoDB, and Amazon EventBridge.

This diagram outlines the workflow implemented in this blog:

Users across different Regions establish WebSocket connections to an API endpoint in a Region. For every connection, the respective API Gateway invokes the ConnectionHandler Lambda function, which stores the connection details in a Regional DynamoDB table.
User A sends a chat message via the established WebSocket connection. The API Gateway invokes the ClientMessageHandler Lambda function with the received message. The Lambda function publishes an event to an EventBridge event bus that contains the message and the connectionId of the message sender.
The event bus invokes the EventBusMessageHandler Lambda function, which pushes the received message to all other clients connected in the Region. It also replicates the event into us-west-1.
EventBusMessageHandler in us-west-1 receives the event and sends it out to all connected clients in the Region via the same mechanism.

Walkthrough

The following walkthrough explains the required components, their interactions and how the provisioning can be automated via CDK.

For this walkthrough, you need:

An AWS account
Installed Node.js
Installed git
AWS CDK installed

Checkout and deploy the sample stack:

After completing the prerequisites, clone the associated GitHub repository by running the following command in a local directory:
```
git clone git@github.com:aws-samples/multi-region-websocket-api.git
```
Open the repository in your preferred editor and review the contents of the src and cdk folders.
Follow the instructions in the README.md to deploy the stack.

The following components are deployed in your account for every specified Region. If you didn’t change the default, the Regions are eu-west-1 and us-west-1.

API Gateway for WebSocket connectivity

API Gateway is a fully managed service that makes it easier for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the “front door” for applications to access data, business logic, or functionality from your backend services. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications.

WebSocket APIs serve as a stateful frontend for an AWS service, in this case AWS Lambda. A Lambda function is used for the WebSocket endpoint that maintains a persistent connection to handle message transfer between the backend service and clients. The WebSocket API invokes the backend based on the content of the messages that it receives from client apps.

There are three predefined routes that can be used: $connect, $disconnect, and $default.

const connectionLambda = new lambda.Function(..);
const requestHandlerLambda = new lambda.Function(..);

const webSocketApi = new apigwv2.WebSocketApi(this, 'WebsocketApi', {
      apiName: 'WebSocketApi',
      description: 'A regional Websocket API for the multi-region chat application sample',
      connectRouteOptions: {
        integration: new WebSocketLambdaIntegration('connectionIntegration', connectionLambda.fn),
      },
      disconnectRouteOptions: {
        integration: new WebSocketLambdaIntegration('disconnectIntegration', connectionLambda.fn),
      },
      defaultRouteOptions: {
        integration: new WebSocketLambdaIntegration('defaultIntegration', requestHandlerLambda.fn),
      },
});

const websocketStage = new apigwv2.WebSocketStage(this, 'WebsocketStage', {
      webSocketApi,
      stageName: 'dev',
      autoDeploy: true,
});

$connect and $disconnect are used by clients to initiate or end a connection with the API Gateway. Each route has a backend integration that is invoked for the respective event. In this example, a Lambda function gets invoked with details of the event. The following code snippet shows how you can track each of the connected clients in an Amazon DynamoDB table. Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale.

// Simplified example for brevity
// Visit GitHub repository for complete code

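async function connectionHandler(event: APIGatewayEvent) {
  // eventType and connectionId are provided in the request context of the event
  const { eventType, connectionId } = event.requestContext;

  if (eventType === 'CONNECT') {
    await dynamoDbClient.put({
      TableName: process.env.TABLE_NAME!,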
      Item: {
        connectionId,
        chatId: 'DEFAULT',
        ttl: Math.round(Date.now() / 1000 + 3600) // TTL of one hour
      },
    });
  }

  if (eventType === 'DISCONNECT') {
    await dynamoDbClient.delete({
      TableName: process.env.TABLE_NAME!,
      Key: {
        connectionId,
        chatId: 'DEFAULT',
      },
    })
  }

  return ..
}
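The ttl attribute in the snippet above is an epoch timestamp in seconds. As a rough illustration, sketched here in Python, the item construction and an expiry check could look like the following. The helper names build_connection_item and is_expired are hypothetical and are not part of the sample repository.

```python
import time

ONE_HOUR_SECONDS = 3600


def build_connection_item(connection_id, chat_id="DEFAULT", now=None):
    """Build the item stored for a new WebSocket connection.

    `ttl` is an epoch timestamp in seconds, one hour after `now`.
    """
    now = time.time() if now is None else now
    return {
        "connectionId": connection_id,
        "chatId": chat_id,
        "ttl": round(now + ONE_HOUR_SECONDS),
    }


def is_expired(item, now=None):
    """Return True once the item's TTL timestamp has passed."""
    now = time.time() if now is None else now
    return item["ttl"] <= now


item = build_connection_item("GXLKAfX1FiACG1w=", now=1_632_812_813)
print(item["ttl"], is_expired(item, now=1_632_812_814))
```

DynamoDB removes expired items in the background rather than at the exact expiry time, so a reader that must not see stale connections may want to apply a check like is_expired when querying.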

The $default route is used when the route selection expression produces a value that does not match any of the other route keys in your API routes. For this post, we use it as a default route for all messages sent to the API Gateway by a client. For each message, a Lambda function is invoked with an event of the following format.

{
     "requestContext": {
         "routeKey": "$default",
         "messageId": "GXLKJfX4FiACG1w=",
         "eventType": "MESSAGE",
         "messageDirection": "IN",
         "connectionId": "GXLKAfX1FiACG1w=",
         "apiId": "3m4dnp0wy4",
         "requestTimeEpoch": 1632812813588,
         // some fields omitted for brevity
     },
     "body": "{ .. }",
     "isBase64Encoded": false
}

EventBridge for cross-Region message distribution

The Lambda function uses the AWS SDK to publish the message data in event.body to EventBridge. EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale. It delivers a stream of real-time data from event sources to targets. You can set up routing rules to determine where to send your data to build application architectures that react in real time to your data sources with event publishers and consumers decoupled.
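As a minimal sketch of this publishing step, written in Python rather than the TypeScript used in the repository, the function below turns an incoming WebSocket event into an EventBridge PutEvents entry. The bus name ChatEventBus and the fixed chatId are assumptions for illustration. The detail keys mirror the fields the message handler reads later in this post.

```python
import base64
import json


def build_chat_event(api_event, event_bus_name="ChatEventBus"):
    """Turn an API Gateway WebSocket event into a PutEvents entry.

    `event_bus_name` is a placeholder; use the name of your own event bus.
    """
    body = api_event.get("body") or ""
    if api_event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    detail = {
        "message": body,
        "senderConnectionId": api_event["requestContext"]["connectionId"],
        "chatId": "DEFAULT",  # assumption: a single default chat room
    }
    return {
        "EventBusName": event_bus_name,
        "Source": "ChatApplication",
        "DetailType": "ChatMessageReceived",
        "Detail": json.dumps(detail),  # PutEvents expects Detail as a JSON string
    }


entry = build_chat_event({
    "requestContext": {"connectionId": "GXLKAfX1FiACG1w="},
    "body": "{\"text\": \"hello\"}",
    "isBase64Encoded": False,
})
print(entry["DetailType"])
```

With boto3, such an entry could then be sent using boto3.client("events").put_events(Entries=[entry]).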

The following CDK code defines a routing rule on the event bus that is applied to every event with source ChatApplication and detail type ChatMessageReceived.

    new events.Rule(this, 'ProcessRequest', {
      eventBus,
      enabled: true,
      ruleName: 'ProcessChatMessage',
      description: 'Invokes a Lambda function for each chat message to push the event via websocket and replicates the event to event buses in other regions.',
      eventPattern: {
        detailType: ['ChatMessageReceived'],
        source: ['ChatApplication'],
      },
      targets: [
        new LambdaFunction(processLambda.fn),
        ...additionalEventBuses,
      ],
    });
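To make the effect of the event pattern more concrete, here is a deliberately simplified model of the matching, sketched in Python. It only handles exact matches on top-level string fields. EventBridge itself supports much richer patterns, so treat this as an illustration rather than a reimplementation.

```python
# The pattern as it appears in the event JSON: each key lists the accepted values.
PATTERN = {
    "detail-type": ["ChatMessageReceived"],
    "source": ["ChatApplication"],
}


def matches(pattern, event):
    """Return True if, for every pattern key, the event's value is one of the listed values."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


chat_event = {
    "source": "ChatApplication",
    "detail-type": "ChatMessageReceived",
    "detail": {"message": "hello"},
}
other_event = {"source": "aws.ec2", "detail-type": "EC2 Instance State-change Notification"}

print(matches(PATTERN, chat_event), matches(PATTERN, other_event))
```

Only events that match every key in the pattern are delivered to the rule's targets.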

Intra-Region message delivery

The first target is a Lambda function that sends the message out to clients connected to the API Gateway endpoint in the same Region where the message was received.

To that end, the function first uses the AWS SDK to query DynamoDB for active connections for a given chatId in its AWS Region. It then removes the connectionId of the message sender from the list and calls postToConnection(..) for the remaining connection ids to push the message to the respective clients.

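export async function handler(event: EventBridgeEvent<'EventResponse', ResponseEventDetails>): Promise<any> {
  const connections = await getConnections(event.detail.chatId);
  // Push the message to every connection except the sender and wait for all sends to finish.
  // Assumes postToConnection returns a promise; with AWS SDK for JavaScript v2, append .promise().
  await Promise.all(
    connections
      .filter((cId: string) => cId !== event.detail.senderConnectionId)
      .map((connectionId: string) => gatewayClient.postToConnection({
        ConnectionId: connectionId,
        Data: JSON.stringify({ data: event.detail.message }),
      })),
  );
}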
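The same fan-out logic can be sketched in Python in a way that is testable without AWS access. In this sketch, post is an injected callable, which is an assumption made for illustration. Collecting failed connection IDs is also an addition that the sample handler above does not show.

```python
import json


def fan_out(connection_ids, sender_connection_id, message, post):
    """Push `message` to every connection except the sender.

    `post(connection_id, data)` performs the actual send. Returns the IDs for
    which `post` raised, so the caller can clean up stale connections.
    """
    payload = json.dumps({"data": message}).encode("utf-8")
    failed = []
    for connection_id in connection_ids:
        if connection_id == sender_connection_id:
            continue  # do not echo the message back to its sender
        try:
            post(connection_id, payload)
        except Exception:  # for example, a client that vanished without $disconnect
            failed.append(connection_id)
    return failed


delivered = []
failed = fan_out(["a", "b", "c"], "a", "hello", lambda cid, data: delivered.append(cid))
print(delivered, failed)
```

With boto3, post could wrap the post_to_connection(ConnectionId=..., Data=...) call of an apigatewaymanagementapi client created with the endpoint URL of the Regional WebSocket API.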

Inter-Region message delivery

To send messages across Regions, this solution uses EventBridge’s cross-Region event routing capability. Cross-Region event routing allows you to replicate events across Regions by adding an event bus in another Region as the target of a rule. In this case, the architecture is a mesh of event buses in all Regions so that events in every event bus are replicated to all other Regions.

A message sent to an event bus in one Region is replicated to the event buses in the other Regions and triggers the intra-Region workflow described earlier. EventBridge implements circuit breaker logic that stops event buses from sending the same message back and forth indefinitely. Thus, for a replicated event, only ProcessRequestLambda is invoked as a rule target. The function receives the message via its invocation event and looks up the active WebSocket connections in its Region. It then pushes the message to all relevant clients.

This process happens in every Region so that the initial message is delivered to every connected client with at-least-once semantics.
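At-least-once delivery means a client may occasionally receive the same message twice. One possible way to handle this on the receiving side is a small, bounded deduplication cache, sketched below in Python. It assumes that every message carries a unique identifier (for example, one derived from the messageId in the request context), which the sample application does not necessarily forward to clients.

```python
from collections import OrderedDict


class Deduplicator:
    """Remember the most recent message IDs and flag repeats."""

    def __init__(self, max_entries=1000):
        self._seen = OrderedDict()
        self._max_entries = max_entries

    def is_new(self, message_id):
        """Return True the first time an ID is seen, False for a repeat."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = True
        if len(self._seen) > self._max_entries:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


dedup = Deduplicator()
for message_id in ["m1", "m2", "m1"]:
    if dedup.is_new(message_id):
        print("display", message_id)
```

The cache is bounded so that long-running clients do not accumulate IDs indefinitely. The trade-off is that a duplicate arriving after its ID has been evicted is shown again.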

Improving resilience

The architecture of this solution is resilient to service disruptions in a Region. In such an event, all clients connected to the affected Region reconnect to an unaffected Region and continue to receive events. Although this isn’t covered in the CDK code, you can also set up Amazon Route 53 health checks to automate DNS failover to a healthy Region.

Testing the workflow

You can use any WebSocket client to test the application. Here you can see three clients, one connected to the us-west-1 API Gateway endpoint and two connected to the eu-west-1 endpoint. Each one sends a message to the application and every other client receives it, regardless of the Region it is connected to.

Cleaning up

Most services used in this blog post have an allowance in the AWS Free Tier. Be sure to check potential costs of this solution and delete the stack if you don’t need it anymore. Instructions on how to do this are included inside the README in the repository.

Conclusion

This blog post shows how to use the AWS serverless platform to build a multi-regional chat application over WebSockets. With the cross-Region event routing of EventBridge the architecture is resilient as well as extensible.

For more resources on how to get the most out of the AWS serverless platform, visit Serverless Land.

Composing AWS Step Functions to abstract polling of asynchronous services

2022-03-09 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/composing-aws-step-functions-to-abstract-polling-of-asynchronous-services/

This post is written by Nicolas Jacob Baer, Senior Cloud Application Architect, Goeksel Sarikaya, Senior Cloud Application Architect, and Shukhrat Khodjaev, Engagement Manager, AWS ProServe.

AWS Step Functions workflows can use the three main integration patterns when calling AWS services. A common integration pattern is to call a service and wait for a response before proceeding to the next step.

While this works well with AWS services, there is no built-in support for third-party, on-premises, or custom-built services running on AWS. When a workflow integrates with such a service in an asynchronous fashion, it requires a primary step to invoke the service. There are additional steps to wait and check the result, and handle possible error scenarios, retries, and fallbacks.

Although this approach may work well for small workflows, it does not scale to multiple interactions in different steps of the workflow and becomes repetitive. This may result in a complex workflow, which makes it difficult to build, test, and modify. In addition, large workflows with many repeated steps are more difficult to troubleshoot and understand.

This post demonstrates how to decompose the custom service integration by nesting Step Functions workflows. One basic workflow is dedicated to handling the asynchronous communication that offers modularity. It can be re-used as a building block. Another workflow is used to handle the main process by invoking the nested workflow for service interaction, where all the repeated steps are now hidden in multiple executions of the nested workflow.

Overview

Consider a custom service that provides an asynchronous interface, where an action is initially triggered by an API call. After a few minutes, the result is available to be polled by the caller. The following diagram shows a basic workflow interacting with this custom service that encapsulates the service communication in a workflow:

Call Custom Service API – calls a custom service, in this case through an API. Potentially, this could use a service integration or use AWS Lambda if there is custom code.
Wait – waits for the service to prepare a result. This depends on the service that the workflow is interacting with, and could vary from seconds to days to months.
Poll Result – attempts to poll the result from the custom service.
Choice – repeats the polling if the result is not yet available, or moves on to the Fail or Success state once a result is retrieved. A timeout should also be in place in case the result is not available within the expected time range. Otherwise, this might lead to an infinite loop.
Fail – fails the workflow if a timeout or a threshold for the number of retries with error conditions is reached.
Transform Result – transforms the result or adds additional meta information to provide further information to the caller (for example, runtime or retries).
Success – finishes the workflow successfully.
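The steps above can be sketched as a plain polling loop. The Python below is only a model of the control flow, not Step Functions code. The callables start_job and poll_result, the status strings, and the default wait and timeout values are assumptions made for illustration.

```python
import time


class PollingTimeout(Exception):
    """Raised when no result arrives within the allowed time."""


def run_polling_workflow(start_job, poll_result, wait_seconds=10,
                         timeout_seconds=300, sleep=time.sleep,
                         clock=time.monotonic):
    """Start a job, then poll until it succeeds, fails, or times out.

    start_job() returns a job ID; poll_result(job_id) returns a
    (status, result) tuple with status RUNNING, SUCCEEDED, or FAILED.
    """
    started = clock()
    job_id = start_job()                          # Call Custom Service API
    attempts = 0
    while True:
        sleep(wait_seconds)                       # Wait
        attempts += 1
        status, result = poll_result(job_id)      # Poll Result
        if status == "SUCCEEDED":                 # Choice -> Transform Result -> Success
            return {"result": result, "attempts": attempts,
                    "runtime_seconds": clock() - started}
        if status == "FAILED":                    # Choice -> Fail
            raise RuntimeError(f"job {job_id} failed")
        if clock() - started >= timeout_seconds:  # Choice -> Fail (timeout)
            raise PollingTimeout(f"job {job_id} not finished after {attempts} polls")
```

The returned dictionary plays the role of the Transform Result step by adding the number of attempts and the runtime to the service result. Passing sleep and clock as parameters keeps the loop testable without real waiting.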

If you build a larger workflow that interacts with this custom service in multiple steps in a workflow, you can reduce the complexity by using the Step Functions integration to call the nested workflow with a Wait state.

An illustration of this can be found in the following diagram, where the nested workflow is called three times sequentially. Likewise, you can build a more complex workflow that adds additional logic through more steps or interacts with a custom service in parallel. The polling logic is hidden in the nested workflow.

Walkthrough

To get started with AWS Step Functions and Amazon API Gateway using the AWS Management Console:

Go to AWS Step Functions in the AWS Management Console.
Choose Run a sample project and choose Start a workflow within a workflow.
Scroll down to the sample projects, which are defined using Amazon States Language (ASL).
Review the example definition, then choose Next.
Choose Deploy resources. The deployment can take up to 10 minutes. After deploying the resources, you can edit the sample ASL code to define steps in the state machine.
The deployment creates two state machines: NestingPatternMainStateMachine and NestingPatternAnotherStateMachine. NestingPatternMainStateMachine orchestrates the other nested state machines sequentially or in parallel.
Select a state machine, then choose Edit to edit the workflow. In the NestingPatternMainStateMachine, the first state triggers the nested workflow NestingPatternAnotherStateMachine. You can pass necessary parameters to the nested workflow by using Parameters and Input as shown in the example below with Parameter1 and Parameter2. Once the first nested workflow completes successfully, the second nested workflow is triggered. If the result of the first nested workflow is not successful, the NestingPatternMainStateMachine fails with the Fail state.
Select nested workflow NestingPatternAnotherStateMachine, and then select Edit to add AWS Lambda functions to start a job and poll the state of the jobs. This can be any asynchronous job that needs to be polled to query its state. Based on expected job duration, the Wait state can be configured for 10-20 seconds. If the workflow is successful, the main workflow returns a successful result.
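To show roughly what the main state machine contains, the following Python sketch builds an ASL definition as a dictionary and prints it as JSON. The state names, the parameter values, and the ARN with its placeholder account ID are assumptions. The definition generated by the sample project may differ in detail.

```python
import json


def nested_workflow_task(state_machine_arn, parameters, next_state):
    """Build a Task state that runs a nested workflow and waits for it to finish."""
    return {
        "Type": "Task",
        # The .sync:2 variant waits for completion and returns the output as JSON.
        "Resource": "arn:aws:states:::states:startExecution.sync:2",
        "Parameters": {
            "StateMachineArn": state_machine_arn,
            "Input": parameters,
        },
        # Assumption: route any error of the nested execution to a Fail state.
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Fail"}],
        "Next": next_state,
    }


def main_definition(nested_arn):
    """Chain two nested workflow executions, failing fast on the first error."""
    params = {"Parameter1": "value1", "Parameter2": "value2"}
    return {
        "StartAt": "FirstNestedWorkflow",
        "States": {
            "FirstNestedWorkflow": nested_workflow_task(nested_arn, params, "SecondNestedWorkflow"),
            "SecondNestedWorkflow": nested_workflow_task(nested_arn, params, "Succeed"),
            "Succeed": {"Type": "Succeed"},
            "Fail": {"Type": "Fail"},
        },
    }


# Placeholder ARN for illustration only.
ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:NestingPatternAnotherStateMachine"
print(json.dumps(main_definition(ARN), indent=2))
```

Generating the definition from a helper function keeps each nested call to a single line, which mirrors how the nested workflow hides the polling steps from the main workflow.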

Use cases and limitations

This approach allows encapsulation of workflows consisting of multiple sequential or parallel services. Therefore, it provides flexibility that can be used for different use cases. Services can be part of distributed applications, part of automated business processes, big data or machine learning pipelines using AWS services.

Each nested workflow is responsible for an individual step in the main workflow, providing flexibility and scalability. Hundreds of nested workflows can run and be monitored in parallel with the main workflow (see AWS Step Functions Service Quotas).

The approach described here is not applicable to custom service interactions that complete in less than 1 second, since 1 second is the minimum configurable value for a Wait state.

Nested workflow encapsulation

Similar to the principle of encapsulation in object-oriented programming, you can use a nested workflow for different interactions with a custom service. You can dynamically pass input parameters to the nested workflow during workflow execution and receive return values. This way, you can define a clear interface between the nested workflow and the parent workflow with different actions and integrations. Depending on the use-case, a custom service may offer a variety of different actions that must be integrated into workflows that run in Step Functions, but can all be combined into a single workflow.

Debugging and tracing

Additionally, debugging and tracing can be done through Execution Event History in the State Machine Management Console. In the Resource column, you can find a link to the executed nested step function. It can be debugged in case of any error in the nested Step Functions workflow.

However, debugging can be challenging in case of multiple parallel nested workflows. In such cases, AWS X-Ray can be enabled to visualize the components of a state machine, identify performance bottlenecks, and troubleshoot requests that have led to an error.

To enable AWS X-Ray in AWS Step Functions:

Open the Step Functions console and choose Edit state machine.
Scroll down to Tracing settings, and choose Enable X-Ray tracing.

For detailed information regarding AWS X-Ray and AWS Step Functions please refer to the following documentation: https://docs.aws.amazon.com/step-functions/latest/dg/concepts-xray-tracing.html

Conclusion

This blog post describes how to compose a nested Step Functions workflow, which asynchronously manages a custom service using the polling mechanism.

To learn more about how to use AWS Step Functions workflows for serverless microservices orchestration, visit Serverless Land.

Building a serverless image catalog with AWS Step Functions Workflow Studio

2022-03-08 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-serverless-image-catalog-with-aws-step-functions-workflow-studio/

This post is written by Pascal Vogel, Associate Solutions Architect, and Benjamin Meyer, Sr. Solutions Architect.

Workflow Studio is a low-code visual workflow designer for AWS Step Functions that enables the orchestration of serverless workflows through a guided interactive interface. With the integration of Step Functions and the AWS SDK, you can now access more than 200 AWS services and over 9,000 API actions in your state machines.

This walkthrough uses Workflow Studio to implement a serverless image cataloging pipeline. It includes content moderation, automated tagging, and parallel image processing. Workflow Studio allows you to set up API integrations to other AWS services quickly with drag and drop actions, without writing custom application code.

Solution overview

Photo sharing websites often allow users to publish user-generated content such as text, images, or videos. Manual content review and categorization can be challenging. This solution enables the automation of these tasks.

In this workflow:

An image stored in Amazon S3 is checked for inappropriate content using the Amazon Rekognition DetectModerationLabels API.
Based on the result of this check, appropriate images are forwarded to image processing while inappropriate ones trigger an email notification.
Appropriate images undergo two processing steps in parallel: the detection of objects and text in the image via Amazon Rekognition’s DetectLabels and DetectText APIs. The results of both processing steps are saved in an Amazon DynamoDB table.
An inappropriate image triggers an email notification for manual content moderation via the Amazon Simple Notification Service (SNS).

Prerequisites

To follow this walkthrough, you need:

An AWS account.
An AWS user with AdministratorAccess (see the instructions on the AWS Identity and Access Management (IAM) console).
The AWS CLI, installed by following the AWS CLI installation instructions.
The AWS Serverless Application Model (AWS SAM) CLI, installed by following the AWS SAM CLI installation instructions.

Initial project setup

Get started by cloning the project repository from GitHub:

git clone https://github.com/aws-samples/aws-step-functions-image-catalog-blog.git

The cloned repository contains two AWS SAM templates.

The starter directory contains a template. It deploys AWS resources and permissions that you use later for building the image cataloging workflow.
The solution directory contains a template that deploys the finished image cataloging pipeline. Use this template if you want to skip ahead to the finished solution.

Both templates deploy the following resources to your AWS account:

An Amazon S3 bucket that holds the image files for the catalog.
A DynamoDB table as the data store of the image catalog.
An SNS topic and subscription that allow you to send an email notification.
A Step Functions state machine that defines the processing steps in the cataloging pipeline.

To follow the walkthrough, deploy the AWS SAM template in the starter directory using the AWS SAM CLI:

cd aws-step-functions-image-catalog-blog/starter
sam build
sam deploy --guided

Configure the AWS SAM deployment as follows. Input your email address for the parameter ModeratorEmailAddress:

During deployment, you receive an email asking you to confirm the subscription to notifications generated by the Step Functions workflow. In the email, choose Confirm subscription to receive these notifications.

Confirm successful resource creation by going to the AWS CloudFormation console. Open the serverless-image-catalog-starter stack and choose the Stack info tab:

View the Outputs tab of the CloudFormation stack. You reference these items later in the walkthrough:

Implementing the image cataloging pipeline

Accessing Step Functions Workflow Studio

To access Workflow Studio in Step Functions:

Access the Step Functions console.
In the list of State machines, select image-catalog-workflow-starter.
Choose the Edit button.
Choose Workflow Studio.

Workflow Studio consists of three main areas:

The Canvas lets you modify the state machine graph via drag and drop.
The States Browser lets you browse and search more than 9,000 API Actions from over 200 AWS services.
The Inspector panel lets you configure the properties of state machine states and displays the Step Functions definition in the Amazon States Language (ASL).

For the purpose of this walkthrough, you can delete the Pass state present in the state machine graph. Right click on it and choose Delete state.

Auto-moderating content with Amazon Rekognition and the Choice State

Use Amazon Rekognition’s DetectModerationLabels API to detect inappropriate content in the images processed by the workflow:

In the States browser, search for the DetectModerationLabels API action.
Drag and drop the API action on the state machine graph on the canvas.

In the Inspector panel, select the Configuration tab and add the following API Parameters:

{
  "Image": {
    "S3Object": {
      "Bucket.$": "$.bucket",
      "Name.$": "$.key"
    }
  }
}

Switch to the Output tab and check the box next to Add original input to output using ResultPath. This allows you to pass both the original input and the task’s output on to the next state on the state machine graph.

Input the following ResultPath:

$.moderationResult
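The effect of this ResultPath can be modeled in a few lines of Python. This is a simplified sketch that only supports plain dotted paths such as $.moderationResult. The function name apply_result_path is hypothetical.

```python
import copy


def apply_result_path(state_input, task_result, result_path):
    """Insert `task_result` into a copy of `state_input` at a simple `$.a.b` path."""
    if result_path == "$":
        return task_result  # the result replaces the whole input
    if not result_path.startswith("$."):
        raise ValueError("only simple '$.field' paths are supported in this sketch")
    output = copy.deepcopy(state_input)
    node = output
    *parents, leaf = result_path[2:].split(".")
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = task_result
    return output


state_input = {"bucket": "my-bucket", "key": "cat.jpeg"}
task_result = {"ModerationLabels": []}
print(apply_result_path(state_input, task_result, "$.moderationResult"))
```

The next state therefore still sees bucket and key, with the Amazon Rekognition response nested under moderationResult.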

Step Functions enables you to make decisions based on the output of previous task states via the choice state. Use the result of the DetectModerationLabels API action to decide how to proceed with the image:

In the States browser, choose the Flow tab.
Drag and drop a Choice state onto the state machine graph below the DetectModerationLabels API action.
Select the added Choice state.
In the Inspector panel, choose Rule #1 and select Edit.
Choose Add conditions.
For Variable, enter $.moderationResult.ModerationLabels[0].
For Operator, choose is present.
Choose Save conditions.
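The rule configured above checks whether the first element of the ModerationLabels list exists. As an illustration of that check, the Python sketch below evaluates a small subset of the path syntax (dotted names and numeric indexes). It is a model of the is present operator for this one use case, not a general JSONPath implementation.

```python
import re

# Matches either ".name" or "[index]" at the current position of the path.
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def is_present(document, path):
    """Return True if `path` (for example `$.a.b[0]`) resolves to a value in `document`."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    node = document
    pos = 1
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos}")
        name, index = match.group(1), match.group(2)
        if name is not None:
            if not isinstance(node, dict) or name not in node:
                return False
            node = node[name]
        else:
            if not isinstance(node, list) or int(index) >= len(node):
                return False
            node = node[int(index)]
        pos = match.end()
    return True


PATH = "$.moderationResult.ModerationLabels[0]"
clean = {"moderationResult": {"ModerationLabels": []}}
flagged = {"moderationResult": {"ModerationLabels": [{"Name": "Example"}]}}
print(is_present(clean, PATH), is_present(flagged, PATH))
```

An empty ModerationLabels list means the rule does not match, so the image continues along the Default branch.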

If Amazon Rekognition detects inappropriate content, the workflow notifies content moderators to inspect the image manually:

In the States browser, find the SNS Publish API Action.
Drag the Action into the Rule #1 branch of the Choice state.
For API Parameters, select the SNS topic that is visible in the Outputs of the serverless-image-catalog-starter stack in the CloudFormation console.

Speeding up image cataloging with the Parallel state

Appropriate images should be processed and included in the image catalog. In this example, processing includes the automated generation of tags based on objects and text identified in the image.

To accelerate this, instruct Step Functions to perform these tasks concurrently via a Parallel state:

In the States browser, select the Flow tab.
Drag and drop a Parallel state onto the Default branch of the previously added Choice state.
Search for the Amazon Rekognition DetectLabels API action in the States browser.
Drag and drop it inside the parallel state.

Configure the following API parameters:

{
  "Image": {
    "S3Object": {
      "Bucket.$": "$.bucket",
      "Name.$": "$.key"
    }
  }
}

Switch to the Output tab and check the box next to Add original input to output using ResultPath. Set the ResultPath to $.output.

Record the results of the Amazon Rekognition DetectLabels API Action to the DynamoDB database:

Place a DynamoDB UpdateItem API Action inside the Parallel state below the Amazon Rekognition DetectLabels API action.
Configure the following API Parameters to save the tags to the DynamoDB table. Input the name of the DynamoDB table visible in the Outputs of the serverless-image-catalog-starter stack in the CloudFormation console:

{
  "TableName": "<DynamoDB table name>",
  "Key": {
    "Id": {
      "S.$": "$.key"
    }
  },
  "UpdateExpression": "set detectedObjects=:o",
  "ExpressionAttributeValues": {
    ":o": {
      "S.$": "States.JsonToString($.output.Labels)"
    }
  }
}

This API parameter definition makes use of an intrinsic function to convert the list of objects identified by Amazon Rekognition from JSON to String.
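For readers who prefer to see the resulting request, the following Python sketch builds equivalent UpdateItem parameters. Here json.dumps stands in for States.JsonToString, which is an approximation, since the exact whitespace of the two serializations may differ. The table name and the label data are placeholders.

```python
import json


def update_item_params(table_name, key, labels):
    """Build UpdateItem parameters that store the labels as a JSON string."""
    return {
        "TableName": table_name,
        "Key": {"Id": {"S": key}},
        "UpdateExpression": "set detectedObjects=:o",
        "ExpressionAttributeValues": {
            # DynamoDB string attribute holding the serialized list of labels
            ":o": {"S": json.dumps(labels)},
        },
    }


labels = [{"Name": "Cat", "Confidence": 98.7}]  # made-up example data
params = update_item_params("<DynamoDB table name>", "cat.jpeg", labels)
print(params["ExpressionAttributeValues"][":o"]["S"])
```

Storing the labels as a single string keeps the state machine simple. A consumer of the table needs to parse the string back into JSON before using it.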

In addition to objects, you also want to identify text in images and store it in the database. To do so:

Drag and drop an Amazon Rekognition DetectText API action into the Parallel state next to the DetectLabels Action.
Configure the API Parameters and ResultPath identically to the DetectLabels API Action.
Place another DynamoDB UpdateItem API Action inside the Parallel state below the Amazon Rekognition DetectText API Action. Set the following API Parameters and input the same DynamoDB table name as before.

{
  "TableName": "<DynamoDB table name>",
  "Key": {
    "Id": {
      "S.$": "$.key"
    }
  },
  "UpdateExpression": "set detectedText=:t",
  "ExpressionAttributeValues": {
    ":t": {
      "S.$": "States.JsonToString($.output.TextDetections)"
    }
  }
}
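Conceptually, the Parallel state gives every branch the same input, runs the branches concurrently, and collects their outputs as a list in branch order. A minimal Python model of that behavior, using threads and hypothetical branch functions, might look like this:

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def run_parallel(branches, state_input):
    """Run every branch with its own copy of the input; return outputs in branch order."""
    with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
        futures = [pool.submit(branch, copy.deepcopy(state_input)) for branch in branches]
        # result() re-raises a branch's exception, so one failing branch fails the whole call
        return [future.result() for future in futures]


def detect_labels_branch(state_input):
    """Stand-in for the DetectLabels and UpdateItem steps."""
    return {"branch": "labels", "key": state_input["key"]}


def detect_text_branch(state_input):
    """Stand-in for the DetectText and UpdateItem steps."""
    return {"branch": "text", "key": state_input["key"]}


results = run_parallel([detect_labels_branch, detect_text_branch],
                       {"bucket": "<S3-bucket-name>", "key": "cat.jpeg"})
print(results)
```

Because the two branches do not depend on each other, running them side by side shortens the overall processing time to roughly that of the slower branch.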

To save the state machine:

Choose Apply and exit.
Choose Save.
Choose Save anyway.

Finishing up and testing the image cataloging workflow

To test the image cataloging workflow, upload an image to the S3 bucket created as part of the initial project setup. Find the name of the bucket in the Outputs of the serverless-image-catalog-starter stack in the CloudFormation console.

Select the image-catalog-workflow-starter state machine in the Step Functions console.
Choose Start execution.

Paste the following test event (use your S3 bucket name):

{
    "bucket": "<S3-bucket-name>",
    "key": "<Image-name>.jpeg"
}

Choose Start execution.

Once the execution has started, you can follow the state of the state machine live in the Graph inspector. For an appropriate image, the result will look as follows:

Next, repeat the test process with an image that Amazon Rekognition classifies as inappropriate. Find out more about inappropriate content categories here. This produces the following result:

You receive an email notifying you regarding the inappropriate image and its properties.

Cleaning up

To clean up the resources provisioned as part of the solution run the following command in the aws-step-functions-image-catalog-blog/starter directory:

sam delete

Conclusion

This blog post demonstrates how to implement a serverless image cataloging pipeline using Step Functions Workflow Studio. By orchestrating AWS API actions and flow states via drag and drop, you can process user-generated images. This example checks images for appropriateness and generates tags based on their content without custom application code.

You can now expand and improve this workflow by triggering it automatically each time an image is uploaded to the Amazon S3 bucket or by adding a manual approval step for submitted content. To find out more about Workflow Studio, visit the AWS Step Functions Developer Guide.

For more serverless learning resources, visit Serverless Land.

Decoding protobuf messages using AWS Lambda

2022-03-07 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/decoding-protobuf-messages-using-aws-lambda/

This post is written by Ennio Pastore, Data Lab Architect.

Protobuf is short for protocol buffers, which are language- and platform-neutral mechanisms for serializing structured data. Compared to XML or JSON the size of the messages is smaller, so the network transfer is faster, reducing latency in the interactions between applications. They are commonly used in communications protocols like RPC systems, for persistent storage of data in a variety of storage systems, and in use-cases ranging from data analysis pipelines to mobile clients.

Since protobuf messages are encoded in a binary format, they are not human-readable and must be decoded before they can be processed. You define how you want your data to be structured once, then you can use generated source code to read and write structured data more easily. You can use a variety of languages to read and write data from a variety of data streams. Currently the supported languages are C++, C#, Dart, Go, Java, Kotlin, and Python.
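To give a feel for why the binary encoding is compact, the following Python sketch encodes two fields by hand using the protobuf wire format (a varint tag followed by the value). It is for illustration only and covers just non-negative integers and strings. In practice you use the generated code rather than encoding messages manually.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number, value):
    """Tag with wire type 0 (varint), then the value."""
    return encode_varint(field_number << 3) + encode_varint(value)


def encode_string_field(field_number, text):
    """Tag with wire type 2 (length-delimited), then length and UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# A message with an integer field 1 and a string field 2
message = encode_int_field(1, 1) + encode_string_field(2, "test")
as_json = json.dumps({"id": 1, "name": "test"}).encode("utf-8")
print(message.hex(), len(message), len(as_json))
```

The field names never appear in the encoded message, only their numbers. This is a large part of the size saving, and it is also why a reader needs the proto definition to interpret the bytes.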

This blog post shows you how to decode protobuf messages in a data stream processing application using AWS Lambda functions.

Overview

This example assumes you are already receiving protobuf messages in an Amazon Kinesis data stream.

You will learn how to deploy a Lambda function that decodes protobuf messages and stores them in JSON format in an Amazon S3 bucket.

To achieve this, create an AWS Lambda layer (step 1) containing the protobuf libraries that are required for the decoding. You can use any development environment where you can install Python 3.x and pip to create the Lambda layers.

After creating the layer, you can include it in the Lambda function (step 2) and you can implement the logic to decode the messages.
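The decoding flow described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code from the solution repository. The function names and the `decode` and `store` callables are hypothetical: in a deployed function, `decode` would wrap the generated protobuf class from the layer and `store` would write to the S3 bucket. The sketch relies only on the standard Kinesis event shape, where each record carries a base64-encoded payload in `kinesis.data`.

```python
import base64
import json
from typing import Callable, List


def extract_payloads(event: dict) -> List[bytes]:
    """Return the raw bytes of every record in a Kinesis Lambda event."""
    return [
        base64.b64decode(record["kinesis"]["data"])
        for record in event.get("Records", [])
    ]


def process_event(
    event: dict,
    decode: Callable[[bytes], dict],
    store: Callable[[str, str], None],
) -> int:
    """Decode each record and pass the JSON text to store(key, body).

    `decode` and `store` are injected so the sketch stays independent of
    protobuf and of any AWS SDK; both are assumptions for illustration.
    """
    records = event.get("Records", [])
    for record, payload in zip(records, extract_payloads(event)):
        key = f"{record['kinesis']['sequenceNumber']}.json"
        store(key, json.dumps(decode(payload)))
    return len(records)


# Usage with stand-ins: a fake decoder and an in-memory "bucket".
sample_event = {
    "Records": [
        {
            "kinesis": {
                "sequenceNumber": "1",
                "data": base64.b64encode(b"\x08\x01").decode("ascii"),
            }
        }
    ]
}
bucket = {}
process_event(sample_event, decode=lambda raw: {"size": len(raw)}, store=bucket.__setitem__)
print(bucket)
```

Keeping the decoder and the storage call as parameters makes the record handling testable without a stream or a bucket.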

Prerequisites

You need the following prerequisites to deploy the solution:

AWS account
AWS CLI
AWS Serverless Application Model (AWS SAM) CLI
Python 3.9
An AWS Identity and Access Management (IAM) role with appropriate access.
Python source code for protobuf

To generate the Python source code required to decode protobuf data, you need a development environment with Python (3.x) and pip already installed.

You can use a local machine, an Amazon EC2 instance, or if you cannot install Python locally, use AWS Cloud9.

Generation of the Python source code for protobuf

Generate the Python source code required for the protobuf encoding and decoding, starting from the proto definition file. This code can be generated using the protobuf compiler from the proto definition file.

Create the proto definition file:

cat > /home/ec2-user/environment/demo.proto << ENDOFFILE
syntax = "proto3";
message demo {
  optional int32 id = 1;
  optional string name = 2;
  optional int32 timevalue = 3;
  optional string event = 4;
}
ENDOFFILE
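
Before compiling, it can help to see what a `demo` message looks like on the wire. The following sketch hand-encodes and decodes the four fields using only the Python standard library. It illustrates the protobuf wire format and is not the approach used in this post, which relies on protoc-generated code. The helper names are hypothetical, and the sketch handles only non-negative int32 values and UTF-8 strings.

```python
from typing import Dict, Tuple, Union

# Field numbers from demo.proto mapped to their names.
FIELD_NAMES = {1: "id", 2: "name", 3: "timevalue", 4: "event"}

VARINT, LENGTH_DELIMITED = 0, 2  # the two wire types that demo.proto needs


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next_position) for the varint starting at pos."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_demo(message_id: int, name: str, timevalue: int, event: str) -> bytes:
    """Serialize the four demo fields as key/value pairs in field order."""
    out = bytearray()
    for number, value in ((1, message_id), (2, name), (3, timevalue), (4, event)):
        if isinstance(value, int):
            out += encode_varint((number << 3) | VARINT) + encode_varint(value)
        else:
            raw = value.encode("utf-8")
            out += encode_varint((number << 3) | LENGTH_DELIMITED)
            out += encode_varint(len(raw)) + raw
    return bytes(out)


def decode_demo(data: bytes) -> Dict[str, Union[int, str]]:
    """Walk the byte string and rebuild a dict keyed by field name."""
    fields: Dict[str, Union[int, str]] = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == VARINT:
            value, pos = decode_varint(data, pos)
        elif wire_type == LENGTH_DELIMITED:
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields[FIELD_NAMES.get(number, str(number))] = value
    return fields


payload = encode_demo(message_id=1, name="sensor", timevalue=1646611200, event="start")
print(payload.hex())
print(decode_demo(payload))
```

The generated Python code performs the same job for any message defined in the proto file, including the cases this sketch leaves out.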

Compile this file with the protobuf compiler (protoc) to generate the Python source code required for the protobuf encoding/decoding. The generated code only works for the classes defined in the proto definition file.

wget https://github.com/protocolbuffers/protobuf/releases/download/v3.19.1/protoc-3.19.1-linux-x86_64.zip

unzip protoc-3.19.1-linux-x86_64.zip

mkdir /home/ec2-user/environment/output

/home/ec2-user/environment/bin/protoc -I=/home/ec2-user/environment/ --python_out=/home/ec2-user/environment/output demo.proto

Create the Lambda layer

In your development environment, in the output directory, create a new directory named protobuf. Install the protobuf libraries locally:
```
mkdir -p ~/environment/output/protobuf
cd ~/environment/output/protobuf
mkdir python
cd python
pip3 install protobuf --target .
```

Include the Python source code to the libraries installed locally:

mkdir custom
cd custom
cp ~/environment/output/demo_pb2.py .
echo 'custom' >> ~/environment/output/protobuf/python/protobuf-3.19.1.dist-info/namespace_packages.txt
echo 'custom/demo_pb2.py' >> ~/environment/output/protobuf/python/protobuf-3.19.1.dist-info/RECORD
echo 'custom' >> ~/environment/output/protobuf/python/protobuf-3.19.1.dist-info/top_level.txt

Zip the Python folder:

cd ~/environment/output/protobuf
zip -r protobuf.zip .

The Lambda layer is ready. If you built it on a remote instance, you must download it to your local machine.
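
Lambda extracts layer contents under /opt, and the Python runtimes look for libraries in the `python` folder of the layer. A quick check of the archive layout before uploading can catch packaging mistakes. The following is a minimal sketch using the standard `zipfile` module; the function name is hypothetical, and the in-memory archive stands in for the real protobuf.zip.

```python
import io
import zipfile
from typing import BinaryIO, List, Union


def entries_outside_python_dir(archive: Union[str, BinaryIO]) -> List[str]:
    """List archive entries that are not under the top-level python/ folder."""
    with zipfile.ZipFile(archive) as layer:
        return [name for name in layer.namelist() if not name.startswith("python/")]


# Usage: build a small in-memory archive that mimics the layer layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as layer_zip:
    layer_zip.writestr("python/custom/demo_pb2.py", "# generated code placeholder")
    layer_zip.writestr("python/google/protobuf/__init__.py", "")
buffer.seek(0)

misplaced = entries_outside_python_dir(buffer)
print("misplaced entries:", misplaced)
```

For the real archive, pass the path to protobuf.zip instead of the in-memory buffer. An empty list means every file sits under `python/`.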

Adding the protobuf layer to Lambda

Add the layer created in the previous steps to Lambda:

From the AWS Management Console select the Lambda service and choose Create a Layer:
Enter the name protobuf-lambda and upload the protobuf.zip that you created in the previous step.
Once the upload is complete, select x86_64 compatible architecture and select the corresponding Python runtime versions.

Implementation

The full source of the solution is in the GitHub repository and is deployed with AWS SAM.

Clone the solution repository using git:

git clone https://github.com/aws-samples/lambda-protobuf-decoder

Build the AWS SAM project:
```
sam build
```
When you deploy the project with the AWS SAM CLI, follow the prompts, entering:
1. The name of the Kinesis data stream containing the protobuf messages
2. The name of the S3 bucket that will be used to store the decoded messages
3. The name of your previously created AWS Lambda layer

For all other prompts, select “Y”.

Deploy the project using AWS SAM:

sam deploy --guided --capabilities CAPABILITY_NAMED_IAM

The stack is complete when the message “Successfully created/updated stack” appears. If the stack fails, find the resources that failed to create and troubleshoot any issues.

Testing the AWS SAM stack

Once the AWS SAM stack is successfully deployed, navigate to the Lambda service and choose “protobuf-decoder-lambda”.
Choose the “Monitoring” tab, then “View logs in CloudWatch”:
Select the top Log stream from the list. The logs show for each message the original protobuf message and the decoded message:

Check that all the messages are stored correctly in JSON format in the S3 bucket:

Navigate to the Amazon S3 console and find the destination bucket you specified in the AWS SAM template.
There are multiple files. Select one and choose Actions -> Query with S3 Select.
In the “Input settings” panel and “Output settings” panels, for the “Format” options, select the value JSON.
In the “SQL query” panel, using the default query, choose Run SQL Query. You can see that the content of the object in the S3 bucket is a JSON message.

Cleaning up

If you have generated any events, empty the S3 bucket before deleting the entire stack. If you do not, the data will not be deleted.

To delete the stack, use the AWS SAM CLI. Assuming the stack name is protodecoder, run:

sam delete --stack-name protodecoder

Conclusion

This post shows how to create a Lambda function to decode protobuf messages in real time. You import the proto message definition in a development environment and compile it to generate the Python source code.

You create the Lambda layer for the protobuf decoding Lambda function, integrating the Python source code previously created with the protobuf libraries. Using AWS SAM, you create the Lambda function including the protobuf libraries.

If you want to dig deeper into Lambda functions, see What is AWS Lambda? To extend the Lambda function to interact with multiple AWS services, see the Boto3 documentation.

For more serverless learning resources, visit Serverless Land.

Migrating a monolithic .NET REST API to AWS Lambda

2022-03-02 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/migrating-a-monolithic-net-rest-api-to-aws-lambda/

This post is written by James Eastham, Cloud Infrastructure Architect.

There are many ways to deploy a .NET application to AWS, from a single-process ASP.NET Core web API hosted on an EC2 instance to a serverless API backed by AWS Lambda. This post explains key topics to simplify your move from monolith to serverless.

The .NET Framework launched in 2002. This means that there are years’ worth of existing .NET application code that can benefit from moving to a serverless architecture. With the release of the AWS Porting Assistant for .NET and the AWS Microservice Extractor for .NET, AWS tooling can assist directly with this modernization.

These tools help modernization but don’t migrate the compute layer from traditional servers to serverless technology.

Hexagonal architecture

The hexagonal architecture pattern proposes the division of a system into loosely coupled and interchangeable components. The application and business logic sit at the core of the application.

The next layer up is a set of interfaces that handle bidirectional communication from the core business logic layer. Implementation details are moved to the outside. The inputs (API controllers, UI, consoles, test scripts) and outputs (database implementations, message bus interactions) are at the perimeter.

The chosen compute layer becomes an implementation detail, not a core part of the system. It allows a cleaner process for migrating any integrations, from the frontend, to the compute layer and underlying database engine.
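
The pattern can be sketched in a few lines. The example below is written in Python for brevity, while the post's own code is C#. All names are hypothetical, and the "confirmed" filter is an invented business rule used only to give the core something to do. The point is that the core depends on a port (an interface), while storage and the entry point are adapters at the perimeter.

```python
from typing import Dict, List, Protocol


class BookingRepository(Protocol):
    """Port: what the core needs from storage, with no storage details."""

    def list_for_customer(self, customer_id: str) -> List[dict]:
        ...


class InMemoryBookingRepository:
    """Outbound adapter: one interchangeable implementation of the port."""

    def __init__(self, bookings: Dict[str, List[dict]]) -> None:
        self._bookings = bookings

    def list_for_customer(self, customer_id: str) -> List[dict]:
        return self._bookings.get(customer_id, [])


class BookingService:
    """Core: business logic that knows only the port."""

    def __init__(self, repository: BookingRepository) -> None:
        self._repository = repository

    def confirmed_bookings(self, customer_id: str) -> List[dict]:
        # Hypothetical business rule, for illustration only.
        return [
            booking
            for booking in self._repository.list_for_customer(customer_id)
            if booking.get("status") == "confirmed"
        ]


def console_adapter(service: BookingService, customer_id: str) -> str:
    """Inbound adapter: turns a console request into a call on the core."""
    return "\n".join(b["bookingId"] for b in service.confirmed_bookings(customer_id))


repository = InMemoryBookingRepository(
    {"c-1": [{"bookingId": "b-1", "status": "confirmed"},
             {"bookingId": "b-2", "status": "cancelled"}]}
)
print(console_adapter(BookingService(repository), "c-1"))
```

Swapping the in-memory adapter for a database-backed one, or the console adapter for a web controller or a Lambda handler, leaves `BookingService` untouched.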

Code examples

The GitHub repo contains the code examples from this post with instructions for deploying the migrated serverless application.

The repository contains a .NET core REST API. It uses MySQL for its database engine and relies on an external API as part of its business logic. It also contains a migrated serverless version of the same application that you can deploy to your AWS account. This uses a combination of the AWS Cloud Development Kit (CDK) and the AWS Serverless Application Model (AWS SAM) CLI.

The architecture of the deployed monolithic application is:

After migrating the application to Lambda, the architecture is:

Integrations

Modern web applications rely on databases, file systems, and even other applications. With first class support for dependency injection in .NET Core, managing these integrations is simpler.

The following code snippet is taken from the BookingController.cs file. It shows how required interfaces are injected into the constructor of the controller. One of the controller methods uses the injected interface to list bookings from the BookingRepository.

    [ApiController]
    [Route("[controller]")]
    public class BookingController : ControllerBase
    {
        private readonly ILogger<BookingController> _logger;
        private readonly IBookingRepository _bookingRepository;
        private readonly ICustomerService _customerService;

        public BookingController(ILogger<BookingController> logger,
            IBookingRepository bookingRepository,
            ICustomerService customerService)
        {
            this._logger = logger;
            this._bookingRepository = bookingRepository;
            this._customerService = customerService;
        }

        /// <summary>
        /// HTTP GET endpoint to list all bookings for a customer.
        /// </summary>
        /// <param name="customerId">The customer id to list for.</param>
        /// <returns>All <see cref="Booking"/> for the given customer.</returns>
        [HttpGet("customer/{customerId}")]
        public async Task<IActionResult> ListForCustomer(string customerId)
        {
            this._logger.LogInformation($"Received request to list bookings for {customerId}");

            return this.Ok(await this._bookingRepository.ListForCustomer(customerId));
        }
    }

The implementation of the IBookingRepository is configured at startup using dependency injection in the Startup.cs file.

services.AddTransient<IBookingRepository, BookingRepository>();

This works when using an ASP.NET Core Web API project, since the framework abstracts much of the complexity and configuration. But it’s possible to apply the same practices for .NET core code running in Lambda.

Configuring dependency injection in AWS Lambda

The startup logic is moved to a standalone DotnetToLambda.Serverless.Config library. This allows you to share the dependency injection configuration between multiple Lambda functions. This library contains a single static class named ServerlessConfig.

There is little difference between this file and the Startup.cs file:

public void ConfigureServices(IServiceCollection services)
{
	var databaseConnection =
		new DatabaseConnection(this.Configuration.GetConnectionString("DatabaseConnection"));
	
	services.AddSingleton<DatabaseConnection>(databaseConnection);
	
	services.AddDbContext<BookingContext>(options =>
		options.UseMySQL(databaseConnection.ToString()));

	services.AddTransient<IBookingRepository, BookingRepository>();
	services.AddHttpClient<ICustomerService, CustomerService>();
	
	services.AddControllers();
}

And the configuration method in the ServerlessConfig class:


public static void ConfigureServices()
{
	var client = new AmazonSecretsManagerClient();
	
	var serviceCollection = new ServiceCollection();

	var connectionDetails = LoadDatabaseSecret(client);

	serviceCollection.AddDbContext<BookingContext>(options =>
		options.UseMySQL(connectionDetails.ToString()));
	
	serviceCollection.AddHttpClient<ICustomerService, CustomerService>();
	serviceCollection.AddTransient<IBookingRepository, BookingRepository>();
	serviceCollection.AddSingleton<DatabaseConnection>(connectionDetails);
	serviceCollection.AddSingleton<IConfiguration>(LoadAppConfiguration());

	serviceCollection.AddLogging(logging =>
	{
		logging.AddLambdaLogger();
		logging.SetMinimumLevel(LogLevel.Debug);
	});

	Services = serviceCollection.BuildServiceProvider();
}

The key addition is the manual creation of the ServiceCollection object and the call to BuildServiceProvider. In .NET Core, the framework abstracts away this manual object initialization. The created ServiceProvider is then exposed as a read-only property of the ServerlessConfig class. All we have done is take the boilerplate code that an ASP.NET Core web API performs behind the scenes and bring it into the foreground.

This allows you to copy and paste large parts of the startup configuration directly from the web API and re-use it in your Lambda functions.

Lambda API controllers

For the function code, follow a similar process. For example, here is the ListForCustomer endpoint re-written for Lambda:

public class Function
{
	private readonly IBookingRepository _bookingRepository;
	private readonly ILogger<Function> _logger;
	
	public Function()
	{
		ServerlessConfig.ConfigureServices();

		this._bookingRepository = ServerlessConfig.Services.GetRequiredService<IBookingRepository>();
		this._logger = ServerlessConfig.Services.GetRequiredService<ILogger<Function>>();
	}
	
	public async Task<APIGatewayProxyResponse> FunctionHandler(APIGatewayProxyRequest apigProxyEvent, ILambdaContext context)
	{
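		// PathParameters is null when the request carries no path parameters.
		if (apigProxyEvent.PathParameters == null || !apigProxyEvent.PathParameters.ContainsKey("customerId"))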
		{
			return new APIGatewayProxyResponse
			{
				StatusCode = 400,
				Headers = new Dictionary<string, string> { { "Content-Type", "application/json" } }
			};
		}

		var customerId = apigProxyEvent.PathParameters["customerId"];
		
		this._logger.LogInformation($"Received request to list bookings for: {customerId}");

		var customerBookings = await this._bookingRepository.ListForCustomer(customerId);
		
		return new APIGatewayProxyResponse
		{
			Body = JsonSerializer.Serialize(customerBookings),
			StatusCode = 200,
			Headers = new Dictionary<string, string> { { "Content-Type", "application/json" } }
		};
	}
}

The function constructor calls the startup configuration. This allows the initial configuration to be re-used while the Lambda execution environment is still active. Once the services have been configured, any required interfaces can be retrieved from the Services property of the ServerlessConfig class.

The second key difference is the mapping of the inbound request, and of the response back to API Gateway. The HTTP request arrives as an event and the contents must be manually parsed out of the raw HTTP data. The same applies to the HTTP response, which must be constructed manually. Other than these two differences, it’s a copy from the original BookingController.
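
The same two ideas, one-time initialization and manual request and response mapping, can be sketched in Python for comparison. This is an illustration rather than a port of the repository code. The repository class and `get_services` are hypothetical stand-ins for BookingRepository and ServerlessConfig. The event and response shapes follow the API Gateway Lambda proxy format, where `pathParameters` may be null.

```python
import functools
import json
from typing import Any, Dict, List, Optional


class InMemoryBookingRepository:
    """Hypothetical stand-in for the post's BookingRepository."""

    def __init__(self) -> None:
        self._bookings = {"c-1": [{"bookingId": "b-100", "customerId": "c-1"}]}

    def list_for_customer(self, customer_id: str) -> List[dict]:
        return self._bookings.get(customer_id, [])


@functools.lru_cache(maxsize=None)
def get_services() -> Dict[str, Any]:
    """Build shared services once; later invocations reuse the cached result."""
    return {"booking_repository": InMemoryBookingRepository()}


def build_response(status_code: int, body: Optional[Any] = None) -> Dict[str, Any]:
    """Construct the proxy response by hand, as the C# handler does."""
    response: Dict[str, Any] = {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
    }
    if body is not None:
        response["body"] = json.dumps(body)
    return response


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # pathParameters is None when the request has no path parameters.
    path_parameters = event.get("pathParameters") or {}
    customer_id = path_parameters.get("customerId")
    if customer_id is None:
        return build_response(400)

    repository = get_services()["booking_repository"]
    return build_response(200, repository.list_for_customer(customer_id))


print(handler({"pathParameters": {"customerId": "c-1"}}, None))
print(handler({"pathParameters": None}, None))
```

Caching the service container at module level mirrors what the C# constructor achieves: the setup cost is paid once per execution environment rather than once per request.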

Application configuration

An ASP.NET Core Web API contains an appsettings.json file, which contains runtime specific configuration. The framework handles loading the file and exposing it as an injectable IConfiguration interface. It’s also possible to load settings from environment variables.

This is still possible when using Lambda. You can package an appsettings.json file with the compiled code and load it manually at runtime. However, when using Lambda as the compute layer, there are AWS-specific options for managing configuration.

Environment variables

Lambda environment variables are used to add runtime configuration, as shown in the template.yaml file:

Environment:
  Variables:
    SERVICE: bookings
    DATABASE_CONNECTION_SECRET_ID: !Ref SecretArn

This AWS SAM configuration adds an environment variable named DATABASE_CONNECTION_SECRET_ID. You can access this in Lambda the same way an environment variable is accessed in any C# application:

var databaseConnectionSecret = client.GetSecretValueAsync(new GetSecretValueRequest()
{
    SecretId = Environment.GetEnvironmentVariable("DATABASE_CONNECTION_SECRET_ID"),
}).Result;

This is the simplest way to add runtime configuration. The variables are stored in plaintext and any change requires a redeployment or manual interaction.

External configuration services

AWS has services that allow you to move application configuration outside of the function code. These include AWS Systems Manager Parameter Store, AWS AppConfig and AWS Secrets Manager.

You can use Parameter Store to store plaintext parameters that can also be encrypted using the AWS Key Management Service. The contents of the appsettings.json file from the ASP.NET Core API are copied directly into the parameter string and deployed using the AWS CDK.

var parameter = new StringParameter(this, "dev-configuration", new StringParameterProps()
{
	ParameterName = "dotnet-to-lambda-dev",
	StringValue = "{\"CustomerApiEndpoint\": \"https://jsonplaceholder.typicode.com/users\"}",
	DataType = ParameterDataType.TEXT,
	Tier = ParameterTier.STANDARD,
	Type = ParameterType.STRING,
	Description = "Dev configuration for dotnet to lambda",
});

This JSON data is loaded as part of the startup configuration. The IConfiguration implementation is then built manually using the parameter string.

private static IConfiguration LoadAppConfiguration()
{
	var client = new AmazonSimpleSystemsManagementClient();
	var param = client.GetParameterAsync(new GetParameterRequest()
	{
		Name = "dotnet-to-lambda-dev"
	}).Result;
	
	// UTF-8 preserves any non-ASCII characters in the stored JSON.
	return new ConfigurationBuilder()
		.AddJsonStream(new MemoryStream(Encoding.UTF8.GetBytes(param.Parameter.Value)))
		.Build();
}

The second configuration mechanism is Secrets Manager. This helps protect secrets and provides easier rotation and management of database credentials.

Amazon RDS is integrated with Secrets Manager. When creating a new RDS instance, the database connection details can be automatically encrypted and persisted as a secret. The details for the MySQL instance are stored in Secrets Manager and are not exposed. These connection details can be accessed as part of the startup configuration using the Secrets Manager SDK.

private static DatabaseConnection LoadDatabaseSecret(AmazonSecretsManagerClient client)
{
	var databaseConnectionSecret = client.GetSecretValueAsync(new GetSecretValueRequest()
	{
		SecretId = Environment.GetEnvironmentVariable("DATABASE_CONNECTION_SECRET_ID"),
	}).Result;

	return JsonSerializer
		.Deserialize<DatabaseConnection>(databaseConnectionSecret.SecretString);
}

The Lambda functions require IAM permissions to access both Secrets Manager and Parameter Store. AWS SAM includes pre-defined policy templates that you can add to the template. Four lines of YAML apply the required Secrets Manager and SSM permissions:

Policies:
  - AWSSecretsManagerGetSecretValuePolicy:
      SecretArn: !Ref SecretArn
  - SSMParameterReadPolicy:
      ParameterName: dotnet-to-lambda-dev

For a full list, see the policy template list.

Networking

The final architectural component is the network. By default, Lambda functions are deployed into a VPC owned by the service. The function can access anything available on the public internet, such as other AWS services, HTTPS endpoints for APIs, or services and endpoints outside AWS. However, the function has no way to connect to your private resources inside your VPC.

When deploying an RDS instance into AWS, it’s best practice to place the database in a private subnet with no external ingress. If Lambda uses RDS, you must create a connection between the Lambda service VPC and your VPC. The details of this networking component can be found in this blog article.

The AWS SAM template defines this networking configuration:

VpcConfig:
  SubnetIds:
    - !Ref PrivateSubnet1
    - !Ref PrivateSubnet2
  SecurityGroupIds:
    - !Ref SecurityGroup

In this example, the networking configuration is applied globally. This means that the same configuration is applied to all Lambda functions in the template. The functions here are deployed across two subnets and one security group. Learn more about the steps for configuring the subnets and security groups for RDS access in this article.

The specific values for the subnets and security groups are taken from environment variables. When running locally, you can provide these variables manually. When deploying via CICD, these variables can be changed dynamically based on the stage of the pipeline.

PrivateSubnet1:
  Description: 'Required. Private subnet 1. Output from cdk deploy'
  Type: 'String'
PrivateSubnet2:
  Description: 'Required. Private subnet 2. Output from cdk deploy'
  Type: 'String'
SecurityGroup:
  Description: 'Required. Security group. Output from cdk deploy'
  Type: 'String'

Conclusion

This blog post shows the required considerations for migrating a .NET core REST API to AWS Lambda. You can now start to look at your existing code base and make an informed decision whether Lambda is for you. With the right abstractions and configuration, you can migrate a .NET core API to Lambda compute with copy and paste.

For more serverless learning resources, visit Serverless Land.

Using DevOps Automation to Deploy Lambda APIs across Accounts and Environments

2022-02-25 Subrahmanyam Madduru

Post Syndicated from Subrahmanyam Madduru original https://aws.amazon.com/blogs/architecture/using-devops-automation-to-deploy-lambda-apis-across-accounts-and-environments/

by Subrahmanyam Madduru – Global Partner Solutions Architect Leader, AWS, Sandipan Chakraborti – Senior AWS Architect, Wipro Limited, Abhishek Gautam – AWS Developer and Solutions Architect, Wipro Limited, Arati Deshmukh – AWS Architect, Infosys

As more and more enterprises adopt serverless technologies to deliver their business capabilities in a more agile manner, it is imperative to automate release processes. Multiple AWS Accounts are needed to separate and isolate workloads in production versus non-production environments. Release automation becomes critical when you have multiple business units within an enterprise, each consisting of a number of AWS accounts that are continuously deploying to production and non-production environments.

As a DevOps best practice, the DevOps engineering team responsible for build-test-deploy in a non-production environment should not release the application and infrastructure code on to both non-production and production environments. This risks introducing errors in application and infrastructure deployments in production environments. This in turn results in significant rework and delays in delivering functionalities and go-to-market initiatives. Deploying the code in a repeatable fashion while reducing manual error requires automating the entire release process. In this blog, we show how you can build a cross-account code pipeline that automates the releases across different environments using AWS CloudFormation templates and AWS cross-account access.

The cross-account code pipeline enables an AWS Identity and Access Management (IAM) user to assume an IAM production role using AWS Security Token Service (AWS STS) to switch between non-production and production deployments as required. An automated release pipeline goes through all the release stages, from source to build to deploy, in the non-production AWS account and then calls the STS AssumeRole API (cross-account access) to get a temporary token and access to the AWS production account for deployment. This follows the least privilege model for granting role-based access through IAM policies, which ensures the secure automation of the production pipeline release.

Solution Overview

In this blog post, we will show how a cross-account IAM assume role can be used to deploy AWS Lambda Serverless API code into pre-production and production environments. We are building on the process outlined in this blog post: Building a CI/CD pipeline for cross-account deployment of an AWS Lambda API with the Serverless Framework by programmatically automating the deployment of Amazon API Gateway using CloudFormation templates. For this use case, we are assuming a single-tenant customer with separate AWS accounts to isolate pre-production and production workloads. In Figure 1, we have represented the code pipeline workflow diagrammatically for our use case.

Figure 1. AWS cross-account CodePipeline for production and non-production workloads

Let us describe the code pipeline workflow in detail for each step noted in the preceding diagram:

An IAM user belonging to the DevOps engineering team logs in to the AWS Command Line Interface (AWS CLI) from a local machine using an IAM access key and secret key.
Next, the IAM user assumes the IAM role for the corresponding activities (AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline execution) and deploys the code for pre-production.
A typical AWS CodePipeline comprises build, test, and deploy stages. In the build stage, the AWS CodeBuild service generates the CloudFormation template stack (template-export.yaml) and stores it in Amazon S3.
In the deploy stage, AWS CodePipeline uses a CloudFormation template (a YAML file) to deploy the code from an S3 bucket containing the application API endpoints via Amazon API Gateway in the pre-production environment.
The final step in the pipeline workflow is to deploy the application code changes onto the production environment by assuming the production IAM role through AWS STS.

Since the AWS CodePipeline is fully automated, we can use the same pipeline by switching between pre-production and production accounts. These accounts assume the IAM role appropriate to the target environment and deploy the validated build to that environment using CloudFormation templates.
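
The cross-account step in this workflow can be sketched as follows. This is a minimal illustration, not the pipeline's own code. It assumes an STS client object that exposes `assume_role` with the `RoleArn` and `RoleSessionName` parameters, as the boto3 STS client does. The account ID, the session name, and the fake client are placeholders so the sketch runs without AWS credentials. The role name is the one created later in this post.

```python
from typing import Any, Dict


def role_arn(account_id: str, role_name: str) -> str:
    """Build the ARN of an IAM role in the target account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def assume_production_role(
    sts_client: Any,
    account_id: str,
    role_name: str,
    session_name: str = "pipeline-release",  # placeholder session name
) -> Dict[str, str]:
    """Call AssumeRole and return temporary credentials as client keyword arguments."""
    reply = sts_client.assume_role(
        RoleArn=role_arn(account_id, role_name),
        RoleSessionName=session_name,
    )
    credentials = reply["Credentials"]
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }


class FakeStsClient:
    """Stand-in for a real STS client; returns dummy values for the demo."""

    def assume_role(self, RoleArn: str, RoleSessionName: str) -> Dict[str, Any]:
        return {
            "Credentials": {
                "AccessKeyId": "EXAMPLEKEY",
                "SecretAccessKey": "example-secret",
                "SessionToken": f"token-for-{RoleArn}",
            }
        }


temporary = assume_production_role(
    FakeStsClient(), "111122223333", "CodePipelineCrossAccountRole"
)
print(sorted(temporary))
```

With boto3 installed, a real STS client can be passed in place of the fake one, and the returned dictionary can be unpacked into another client constructor to act in the production account for the lifetime of the token.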

Prerequisites

Here are the prerequisites before you get started with the implementation.

A user with appropriate privileges (for example: Project Admin) in a production AWS account
A user with appropriate privileges (for example: Developer Lead) in a pre-production AWS account such as development
A CloudFormation template for deploying infrastructure in the pre-production account
Ensure your local machine has AWS CLI installed and configured

Implementation Steps

In this section, we show how you can use AWS CodePipeline to release a serverless API in a secure manner to pre-production and production environments. Amazon CloudWatch logging will be used to monitor the events on the AWS CodePipeline.

1. Create Resources in a pre-production account

In this step, we create the required resources such as a code repository, an S3 bucket, and a KMS key in a pre-production environment.

Clone the code repository into your CodeCommit repository. Make the necessary changes to index.js and ensure that buildspec.yaml is present to build the artifacts.
- Using the codebase (Lambda APIs) as input, the build outputs a CloudFormation template and environment configuration JSON files (used for configuring production and other non-production environments such as dev and test). The build artifacts are packaged using the AWS Serverless Application Model into a zip file and uploaded to an S3 bucket created for storing artifacts. Make a note of the repository name, as it is required later.
Create an S3 bucket in a Region (example: us-east-2). The pipeline uses this bucket to get and put artifacts. Make a note of the bucket name.
- Make sure you edit the bucket policy to include your production account ID and the bucket name. Refer to the Amazon S3 bucket policy documentation to make changes to Amazon S3 bucket policies and permissions.
Navigate to AWS Key Management Service (KMS) and create a symmetric key.
Then create a new secret, configure the KMS key, and provide access to the development and production accounts. Make a note of the ARN for the key.
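
As a rough guide to the bucket policy edit mentioned above, the following sketch generates a policy document that lets a production account read, write, and list artifacts. It is an assumption about what such a policy could look like, not the policy shipped with the sample repository. The statement IDs, the chosen actions, the bucket name, and the account ID are all placeholders to adapt.

```python
import json
from typing import Any, Dict


def artifact_bucket_policy(bucket_name: str, production_account_id: str) -> Dict[str, Any]:
    """Build a bucket policy granting the production account artifact access."""
    principal = {"AWS": f"arn:aws:iam::{production_account_id}:root"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access applies to keys inside the bucket.
                "Sid": "ProductionObjectAccess",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                # Listing applies to the bucket itself.
                "Sid": "ProductionListBucket",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
        ],
    }


policy = artifact_bucket_policy("my-artifact-bucket", "111122223333")
print(json.dumps(policy, indent=2))
```

Generating the document from the two values you noted down avoids hand-editing ARNs, which is where typos in the account ID or bucket name tend to creep in.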

2. Create IAM Roles in the Production Account and required policies

In this step, we create roles and policies required to deploy the code.

Create a cross-account access IAM role (CodePipelineCrossAccountRole) in the Production account.
Create a read/write policy to access the S3 bucket containing the resource artifacts.
Create a policy to access the KMS key from the AWS Production account:

{
"Version": "2012-10-17",
"Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kms:DescribeKey",
        "kms:GenerateDataKey*",
        "kms:Encrypt",
        "kms:ReEncrypt*",
        "kms:Decrypt"
      ],
      "Resource": [
        "Your KMS Key ARN you created in Development Account"
      ]
    }
]
}

Once you’ve created both policies, attach them to the previously created cross-account role.

3. Create a CloudFormation Deployment role

In this step, you need to create another IAM role, “CloudFormationDeploymentRole” for Application deployment. Then attach the following four policies to it.

Policy 1: For CloudFormation to deploy the application in the Production account

{
"Version": "2012-10-17",
"Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "cloudformation:DetectStackDrift",
        "cloudformation:CancelUpdateStack",
        "cloudformation:DescribeStackResource",
        "cloudformation:CreateChangeSet",
        "cloudformation:ContinueUpdateRollback",
        "cloudformation:DetectStackResourceDrift",
        "cloudformation:DescribeStackEvents",
        "cloudformation:UpdateStack",
        "cloudformation:DescribeChangeSet",
        "cloudformation:ExecuteChangeSet",
        "cloudformation:ListStackResources",
        "cloudformation:SetStackPolicy",
        "cloudformation:ListStacks",
        "cloudformation:DescribeStackResources",
        "cloudformation:DescribePublisher",
        "cloudformation:GetTemplateSummary",
        "cloudformation:DescribeStacks",
        "cloudformation:DescribeStackResourceDrifts",
        "cloudformation:CreateStack",
        "cloudformation:GetTemplate",
        "cloudformation:DeleteStack",
        "cloudformation:TagResource",
        "cloudformation:UntagResource",
        "cloudformation:ListChangeSets",
        "cloudformation:ValidateTemplate"
      ],
      "Resource": "arn:aws:cloudformation:us-east-2:940679525002:stack/DevOps-Automation-API*/*"
    }
]
}

Policy 2: For CloudFormation to perform required IAM actions

{
"Version": "2012-10-17",
"Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "iam:GetRole",
        "iam:GetPolicy",
        "iam:TagRole",
        "iam:DeletePolicy",
        "iam:CreateRole",
        "iam:DeleteRole",
        "iam:AttachRolePolicy",
        "iam:PutRolePolicy",
        "iam:TagPolicy",
        "iam:CreatePolicy",
        "iam:PassRole",
        "iam:DetachRolePolicy",
        "iam:DeleteRolePolicy"
      ],
      "Resource": "*"
    }
]
}

Policy 3: Lambda function service invocation policy

{
"Version": "2012-10-17",
"Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "lambda:CreateFunction",
        "lambda:UpdateFunctionCode",
        "lambda:AddPermission",
        "lambda:InvokeFunction",
        "lambda:GetFunction",
        "lambda:DeleteFunction",
        "lambda:PublishVersion",
        "lambda:CreateAlias"
      ],
      "Resource": "arn:aws:lambda:us-east-2:Your_Production_AccountID:function:SampleApplication*"
    }
]
}

Policy 4: API Gateway service invocation policy

{
"Version": "2012-10-17",
"Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:POST",
        "apigateway:GET"
      ],
      "Resource": [
        "arn:aws:apigateway:*::/restapis/*/deployments/*",
        "arn:aws:apigateway:*::/restapis/*/stages/*",
        "arn:aws:apigateway:*::/clientcertificates",
        "arn:aws:apigateway:*::/restapis/*/models",
        "arn:aws:apigateway:*::/restapis/*/resources/*",
        "arn:aws:apigateway:*::/restapis/*/models/*",
        "arn:aws:apigateway:*::/restapis/*/gatewayresponses/*",
        "arn:aws:apigateway:*::/restapis/*/stages",
        "arn:aws:apigateway:*::/restapis/*/resources",
        "arn:aws:apigateway:*::/restapis/*/gatewayresponses",
        "arn:aws:apigateway:*::/clientcertificates/*",
        "arn:aws:apigateway:*::/account",
        "arn:aws:apigateway:*::/restapis/*/deployments",
        "arn:aws:apigateway:*::/restapis"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:POST",
        "apigateway:GET"
      ],
      "Resource": "arn:aws:apigateway:*::/restapis/*/resources/*/methods/*/responses/*"
    },
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:GET"
      ],
      "Resource": "arn:aws:apigateway:*::/restapis/*"
    },
    {
      "Sid": "VisualEditor3",
      "Effect": "Allow",
      "Action": [
        "apigateway:DELETE",
        "apigateway:PATCH",
        "apigateway:GET"
      ],
      "Resource": "arn:aws:apigateway:*::/restapis/*/resources/*/methods/*"
    }
]
}

Also attach the S3 read/write access and KMS policies created in Step 2 to the CloudFormationDeploymentRole.

4. Set up and launch CodePipeline

You can launch the CodePipeline either manually in the AWS console using “Launch Stack” or programmatically from the command line using the AWS CLI.

On your local machine, open a terminal or command prompt and run this command:

aws cloudformation deploy --template-file <Path to pipeline.yaml> --region us-east-2 --stack-name <Name_Of_Your_Stack> --capabilities CAPABILITY_IAM --parameter-overrides ArtifactBucketName=<Your_Artifact_Bucket_Name> ArtifactEncryptionKeyArn=<Your_KMS_Key_ARN> ProductionAccountId=<Your_Production_Account_ID> ApplicationRepositoryName=<Your_Repository_Name> RepositoryBranch=master

If you have configured a named profile in the AWS CLI, append it to the command:

--profile <your_profile_name>
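
If you launch the pipeline from a script rather than by hand, the same command can be assembled programmatically. The following is a minimal sketch in TypeScript, not part of the original walkthrough: the `PipelineDeployOptions` interface, the `buildDeployArgs` function, and every value passed to it are hypothetical placeholders. Only the flag names and parameter keys come from the command above.

```typescript
// Illustrative helper; the interface and function names are hypothetical
// and not part of the AWS CLI.
interface PipelineDeployOptions {
  templateFile: string;
  region: string;
  stackName: string;
  parameters: { [name: string]: string };
  profile?: string; // optional named AWS CLI profile
}

function buildDeployArgs(options: PipelineDeployOptions): string[] {
  // Turn the parameter map into Key=Value pairs for --parameter-overrides.
  const overrides = Object.keys(options.parameters).map(
    (name) => name + "=" + options.parameters[name]
  );
  const args = [
    "cloudformation", "deploy",
    "--template-file", options.templateFile,
    "--region", options.region,
    "--stack-name", options.stackName,
    "--capabilities", "CAPABILITY_IAM",
    "--parameter-overrides",
  ].concat(overrides);
  // Append the profile only when one was supplied.
  if (options.profile !== undefined) {
    args.push("--profile", options.profile);
  }
  return args;
}

// All values below are placeholders.
const deployArgs = buildDeployArgs({
  templateFile: "pipeline.yaml",
  region: "us-east-2",
  stackName: "example-pipeline-stack",
  parameters: {
    ArtifactBucketName: "example-artifact-bucket",
    ArtifactEncryptionKeyArn: "arn:aws:kms:us-east-2:111111111111:key/example",
    ProductionAccountId: "222222222222",
    ApplicationRepositoryName: "example-repository",
    RepositoryBranch: "master",
  },
  profile: "example-profile",
});

console.log("aws " + deployArgs.join(" "));
```

Keeping the arguments as an array rather than one string avoids quoting problems if the list is later passed to a process launcher.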

After you launch the pipeline, your serverless API is deployed to both the pre-production and production accounts. To check the deployment in either account, open API Gateway in the AWS console and look for your API in the Region where it was deployed.

Figure 2. Check your deployment in pre-production/production environment

Then select your API and navigate to stages, to view the published API with an endpoint. Then validate your API response by selecting the API link.

Figure 3. Check whether your API is being published in pre-production/production environment

Alternatively, you can reach your API from the CloudFormation stack of the deployed application by selecting the API link in the Resources tab.

Cleanup

If you are trying this out in your AWS accounts, make sure to delete all the resources created during this exercise to avoid incurring any AWS charges.

Conclusion

In this blog, we showed how to build a cross-account code pipeline to automate releases across different environments using AWS CloudFormation templates and AWS Cross Account Access. You also learned how serverless APIs can be securely deployed across pre-production and production accounts. This helps enterprises automate release deployments in a repeatable and agile manner, reduce manual errors, and deliver business capabilities more quickly.

Introducing the .NET 6 runtime for AWS Lambda

2022-02-24 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-the-net-6-runtime-for-aws-lambda/

This post is written by Norm Johanson, Senior Software Dev Engineer.

You can now use the .NET 6 runtime to build AWS Lambda functions. The new managed runtime supports both x86 and Arm/Graviton2 processors. You can get started with .NET 6 and Lambda using your tool of choice, including Visual Studio 2022 with the AWS Toolkit for Visual Studio, the .NET CLI with the Amazon.Lambda.Tools global tool, and the AWS Serverless Application Model CLI (AWS SAM CLI).

.NET 6 has many new features for .NET developers including support for C# 10 and F# 6. In addition to these features in .NET 6, this blog post explains new features added to the .NET Lambda experience. You can use these to improve diagnostics and performance and use new coding patterns.

Improved logging

Logging in .NET Lambda functions has been improved for .NET 6, providing better traceability and more control over what is logged. If you prefer the style of logging in previous .NET managed runtimes, set the environment variable AWS_LAMBDA_HANDLER_LOG_FORMAT to Unformatted.

Request ID

One of the most commonly requested features for the previous .NET Lambda runtime was adding the Lambda request ID to logs for better traceability. This is available in the .NET 6 runtime, making the .NET logging format similar to other Lambda runtimes.

Log levels

.NET 6 logging uses log levels. The ILambdaLogger is accessed from the ILambdaContext and has the following new logging APIs:

LogCritical(string message)
LogError(string message)
LogWarning(string message)
LogInformation(string message)
LogDebug(string message)
LogTrace(string message)
Log(LogLevel level, string message)

Levels for log messages are visible in Amazon CloudWatch Logs, like the request id. This makes it easier to filter and search the logs for particular types of messages, such as errors or warnings.

Console.WriteLine calls are written to CloudWatch Logs as an info level message; Console.Error.WriteLine calls are written as error level.

The following example shows using info messages for logging the fetched user object. It writes a warning message if the user is not found:

public APIGatewayProxyResponse Get(APIGatewayProxyRequest request, ILambdaContext context)
{
    User user = null;
    try
    {
        var id = request.PathParameters["id"];

        context.Logger.LogInformation($"Loading user {id}");
        user = FetchUser(id);
        context.Logger.LogInformation($"User: {user.Name}");
    }
    catch(Exception e)
    {
        context.Logger.LogWarning($"Unable to find user: {e.Message}");
    }

    ...

}

When the user cannot be fetched, the resulting log messages show the log level and request ID.

By default, info level messages or higher are written to CloudWatch Logs. You can adjust the level written to CloudWatch Logs using the AWS_LAMBDA_HANDLER_LOG_LEVEL environment variable. Set the environment variable to one of the values of the LogLevel enum.

With this new filtering, you can instrument Lambda functions with additional logging using the debug and trace log levels. This allows you to turn on additional logging from Lambda functions for troubleshooting, without redeploying new code.
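
The threshold behavior described above can be modeled in a few lines. This is a language-neutral sketch written in TypeScript, not the runtime's actual implementation: the `shouldWrite` helper is hypothetical, and the ordering of levels (Trace lowest, Critical highest) is an assumption based on the logging methods listed earlier.

```typescript
// Level names follow the logging methods listed above; the ordering
// (Trace lowest, Critical highest) is an assumption of this sketch.
const LEVEL_ORDER = ["Trace", "Debug", "Information", "Warning", "Error", "Critical"];

// Hypothetical helper: decide whether a message passes the configured threshold.
function shouldWrite(messageLevel: string, minimumLevel: string): boolean {
  const messageRank = LEVEL_ORDER.indexOf(messageLevel);
  const minimumRank = LEVEL_ORDER.indexOf(minimumLevel);
  if (messageRank === -1 || minimumRank === -1) {
    throw new Error("Unknown log level");
  }
  return messageRank >= minimumRank;
}

// With the default threshold, debug output is dropped.
console.log(shouldWrite("Debug", "Information"));   // false
console.log(shouldWrite("Warning", "Information")); // true

// Lowering the threshold to Debug lets the same message through.
console.log(shouldWrite("Debug", "Debug"));         // true
```

Because the threshold comes from configuration rather than code, changing it does not require a new deployment package.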

Using source generator for JSON serialization

C# 9 provides source generators, which allow code generation during compilation. This can reduce the use of reflection APIs and improve application startup time. .NET 6 updated the native JSON library System.Text.Json to use source generators, allowing JSON parsing without requiring reflection APIs.

When targeting .NET 6 support, you can take advantage of System.Text.Json’s source generator support to improve cold start performance. This is done using the Amazon.Lambda.Serialization.SystemTextJson package that handles the serialization of Lambda events and responses to .NET types.

To use the source generator, you must define a new empty class in your project that derives from System.Text.Json.Serialization.JsonSerializerContext. This class must be a partial class because the source generator adds code to this class to handle serialization. On the empty partial class, add the JsonSerializable attribute for each .NET type the source generator must generate the serialization code for.

Here is an example called HttpApiJsonSerializerContext that registers the Amazon API Gateway HTTP API event and response types to have the serialization code generated:

[JsonSerializable(typeof(APIGatewayHttpApiV2ProxyRequest))]
[JsonSerializable(typeof(APIGatewayHttpApiV2ProxyResponse))]
public partial class HttpApiJsonSerializerContext : JsonSerializerContext
{
}

Lambda functions using Amazon.Lambda.Serialization.SystemTextJson use the Amazon.Lambda.Core.LambdaSerializer attribute to register the serializer. Most commonly the DefaultLambdaJsonSerializer type is specified. To use the source generator, you must register SourceGeneratorLambdaJsonSerializer, passing the previously defined JsonSerializerContext subclass as the generic parameter.

Here is an example of registering the serializer using the HttpApiJsonSerializerContext type:

[assembly: LambdaSerializer(typeof(SourceGeneratorLambdaJsonSerializer<APIGatewayExampleImage.HttpApiJsonSerializerContext>))]

After these steps, Lambda uses the source-generated JSON serialization code to handle all of the serialization of Lambda events and responses. Reflection API calls are not used for serialization, improving the Lambda function’s cold start performance.

Below is a full example of an API Gateway-based Lambda function using the source generator.

using System.Collections.Generic;
using System.Net;
using System.Text.Json.Serialization;


using Amazon.Lambda.Core;
using Amazon.Lambda.APIGatewayEvents;
using Amazon.Lambda.Serialization.SystemTextJson;

[assembly: LambdaSerializer(typeof(SourceGeneratorLambdaJsonSerializer<SourceGeneratorExample.HttpApiJsonSerializerContext>))]

namespace SourceGeneratorExample;

[JsonSerializable(typeof(APIGatewayHttpApiV2ProxyRequest))]
[JsonSerializable(typeof(APIGatewayHttpApiV2ProxyResponse))]
public partial class HttpApiJsonSerializerContext : JsonSerializerContext
{
}


public class Functions
{
    public APIGatewayHttpApiV2ProxyResponse Get(APIGatewayHttpApiV2ProxyRequest request, ILambdaContext context)
    {
        context.Logger.LogInformation("Get Request");

        var response = new APIGatewayHttpApiV2ProxyResponse
        {
            StatusCode = (int)HttpStatusCode.OK,
            Body = "Hello AWS Serverless",
            Headers = new Dictionary<string, string> { { "Content-Type", "text/plain" } }
        };

        return response;
    }
}

Top-level statements

The new .NET 6 Lambda runtime adds support for writing Lambda functions using C# 9’s top-level statements feature. Top-level statements allow you to remove much of the initial boilerplate code for a .NET project.

In a typical hello world example:

using System;

namespace Application
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Enjoying .NET 6 in AWS Lambda");
        }
    }
}

With top-level statements, you can write this in one line, removing brackets, indentations, namespaces, and type declarations:

Console.WriteLine("Enjoying .NET 6 in AWS Lambda");

At a high level, the C# compiler generates the .NET assembly’s Main() method, with your top-level code within it.

Executable assemblies

With top-level statements, the Main() method is generated by the compiler. This is different from the traditional way of writing .NET Lambda functions. Traditionally, a Lambda project is a class library and the Lambda function handler is set to the assembly, type, and method name that the Lambda runtime client invokes.

Here is an example of a .NET Lambda function handler string:

LambdaProject::LambdaProject.Function::FunctionHandler

And here is what the code for this function handler could look like:

using System.IO;
using System.Threading.Tasks;

using Amazon.Lambda.Core;
using Amazon.Lambda.S3Events;
using Amazon.S3;

// Assembly attribute to enable the Lambda function’s JSON input to be converted into a .NET class.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]

namespace LambdaProject
{
    public class Function
    {
        IAmazonS3 _s3Client;

        public Function()
        {
            _s3Client = new AmazonS3Client();
        }

        public async Task FunctionHandler(S3Event evnt, ILambdaContext context)
        {
            foreach (var record in evnt.Records)
            {
                using var response = await _s3Client.GetObjectAsync(record.S3.Bucket.Name, record.S3.Object.Key);
                using var reader = new StreamReader(response.ResponseStream);
                // Run business logic on the text contents of the S3 object
            }
        }
    }
}

Using reflection, the .NET Lambda runtime client uses the function handler string to identify the method to call in the .NET assembly.

When using top-level statements, you instead tell Lambda to run the assembly, which runs the top-level statements. To indicate that you want Lambda to run the assembly, set the Lambda function handler to the assembly name only. Using the previous example, the .NET Lambda function handler string is LambdaProject.
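
To make the difference between the two handler string forms concrete, here is a small sketch in TypeScript. It is not how the Lambda runtime client is implemented; `parseHandlerString` and the `HandlerTarget` type are hypothetical names used only to illustrate the two shapes described above.

```typescript
// Hypothetical types for illustration only; not part of any AWS library.
type HandlerTarget =
  | { kind: "class-library"; assembly: string; typeName: string; method: string }
  | { kind: "executable"; assembly: string };

function parseHandlerString(handler: string): HandlerTarget {
  const parts = handler.split("::");
  const [assembly = "", typeName = "", method = ""] = parts;
  // Assembly::Type::Method identifies a method in a class library.
  if (parts.length === 3 && assembly !== "" && typeName !== "" && method !== "") {
    return { kind: "class-library", assembly, typeName, method };
  }
  // An assembly name on its own means "run this executable assembly".
  if (parts.length === 1 && assembly !== "") {
    return { kind: "executable", assembly };
  }
  throw new Error("Unrecognized handler string: " + handler);
}

const classLibrary = parseHandlerString("LambdaProject::LambdaProject.Function::FunctionHandler");
const executable = parseHandlerString("LambdaProject");

console.log(classLibrary.kind); // class-library
console.log(executable.kind);   // executable
```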

With the .NET assembly containing the Lambda function being run at startup, instead of the Lambda runtime client, your function code must start the Lambda runtime client so that Lambda events are sent to your code.

To start the Lambda runtime client:

Add the Amazon.Lambda.RuntimeSupport NuGet package to your project.
In the file that defines all of your top-level statements add to the end of the file the code to start the Lambda runtime client. The exact code is shown at the end of the example below.

This is a full example of a C# Lambda function using top-level statements that processes Lambda events:

using Amazon.Lambda.Core;
using Amazon.Lambda.RuntimeSupport;
using Amazon.Lambda.Serialization.SystemTextJson;
using Amazon.Lambda.S3Events;
using Amazon.S3;

// Code outside of the handler will be executed during Lambda initialization
var s3Client = new AmazonS3Client();

// The function handler that will be called for each Lambda event
var handler = async (S3Event evnt, ILambdaContext context) =>
{
    foreach(var record in evnt.Records)
    {
        using var response = await s3Client.GetObjectAsync(record.S3.Bucket.Name, record.S3.Object.Key);
        using var reader = new StreamReader(response.ResponseStream);
        // Run business logic on the text contents of the S3 object
    }
};

// Build the Lambda runtime client passing in the handler to call for each
// event and the JSON serializer to use for translating Lambda JSON documents
// to .NET types.
await LambdaBootstrapBuilder.Create(handler, new DefaultLambdaJsonSerializer())
        .Build()
        .RunAsync();

ASP.NET Core minimal APIs

Since the first .NET Lambda runtime, you have been able to run ASP.NET Core applications as Lambda functions using the Amazon.Lambda.AspNetCoreServer NuGet package.

.NET 6 introduces a new style of writing ASP.NET Core applications called Minimal APIs. These take advantage of C# 9’s top-level statement support simplifying the initialization of an ASP.NET Core application, allowing you to define an entire ASP.NET Core application in a single file.

To deploy an ASP.NET Core application using Minimal APIs to Lambda:

Add the Amazon.Lambda.AspNetCoreServer.Hosting NuGet package to your project.
Add a call to AddAWSLambdaHosting in your application when the services are being defined for the application. The argument for AddAWSLambdaHosting is the event source for the Lambda function. This can be an API Gateway REST or HTTP API, or an Application Load Balancer.

When the ASP.NET Core project is run locally, AddAWSLambdaHosting does nothing, allowing the normal .NET Kestrel web server to handle the local experience. When running in Lambda, AddAWSLambdaHosting swaps out Kestrel with Amazon.Lambda.AspNetCoreServer allowing Lambda and API Gateway to act as the web server instead of Kestrel. Since Minimal APIs take advantage of top-level statements, AddAWSLambdaHosting also starts the Lambda runtime client.

This example shows a Minimal API ASP.NET Core application. There is one Lambda-specific line calling AddAWSLambdaHosting that configures the project for Lambda support:

using Amazon.S3;
using Microsoft.AspNetCore.Mvc;

var builder = WebApplication.CreateBuilder(args);

// Add Swagger/OpenAPI support
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

builder.Services.AddControllers();

// Add S3 service client to dependency injection container
builder.Services.AddAWSService<IAmazonS3>();

// Add AWS Lambda support.
builder.Services.AddAWSLambdaHosting(LambdaEventSource.HttpApi);

var app = builder.Build();

app.UseSwagger();
app.UseSwaggerUI();

// Add support for controllers defined in other files
app.MapControllers();

// Example GET route
app.MapGet("/document/{name}", async ([FromServices] IAmazonS3 s3Client, string name) =>
{
    using var response = await s3Client.GetObjectAsync(app.Configuration["S3Bucket"], name);
    using var reader = new StreamReader(response.ResponseStream);
    var content = await reader.ReadToEndAsync();

    // Run business logic on the text contents of the S3 object

    return content;
});

app.Run();

You must deploy as an executable assembly so the function handler string is set to the assembly name only. For example, this is how the preceding ASP.NET Core application is defined in AWS CloudFormation:

   ...
    
   "AspNetCoreFunction": {
      "Type": "AWS::Serverless::Function",
      "Properties": {
        "Handler": "AspNetCoreMinimalApiExample", // The assembly name only
        "Runtime": "dotnet6",
        "MemorySize": 256,
        "Timeout": 30,
        "Role": null,
        "Policies": [
          "AWSLambda_FullAccess",
          "AmazonS3ReadOnlyAccess"
        ],
        "Events": {
          "ProxyResource": {
            "Type": "HttpApi",
            "Properties": {
              "Path": "/{proxy+}",
              "Method": "ANY"
            }
          },
          "RootResource": {
            "Type": "HttpApi",
            "Properties": {
              "Path": "/",
              "Method": "ANY"
            }
          }
        }
      }
    }
  },
  
  ...

Open source Lambda runtime client

Over the last few years, AWS has open sourced more components of Lambda to help the community contribute to the Lambda experience. For .NET, you can find all the AWS client libraries in the aws/aws-lambda-dotnet GitHub repository.

For .NET 6, the managed runtime now uses the open source Lambda runtime client from the aws/aws-lambda-dotnet repository. Previously, the open source Lambda runtime client was used for functions that used Lambda’s custom runtime or container-image based support.

Now you have a consistent and transparent Lambda runtime client experience in all environments whether that is the managed runtime, container images or using the Lambda runtime client for .NET custom runtimes. The switch from the previous runtime client to the open source runtime client is transparent as Lambda functions are migrated to .NET 6.

The open source Lambda runtime client has different performance characteristics than the .NET Core 3.1 Lambda runtime client. This is because the open source client uses all managed code, whereas the .NET Core 3.1 client uses a mix of managed and native code. In our testing, cold starts for basic “Hello, world!” functions may be slightly faster in .NET Core 3.1. However, for Lambda functions that do real world work, the testing shows a significant cold start improvement in .NET 6. For example, a .NET 6 Lambda function that uses the AWS .NET SDK to retrieve an item from DynamoDB showed a 25% performance improvement.

Migrating to .NET 6

To migrate existing .NET Lambda functions to the new .NET 6 runtime:

Open the csproj or fsproj file. Set the TargetFramework element to net6.0.
Open the aws-lambda-tools-defaults.json file, if it exists:
1. Set the function-runtime field to dotnet6
2. Set the framework field to net6.0. If you remove the field, the value is inferred from the project file.
If it exists, open the serverless.template file. For any AWS::Lambda::Function or AWS::Serverless::Function resource, set the Runtime property to dotnet6.
Update all Amazon.Lambda.* NuGet package references to the latest versions.
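
As an illustration of the aws-lambda-tools-defaults.json step, the sketch below applies the two field changes to an in-memory copy of the file's contents. It is written in TypeScript as a neutral scripting example. The `migrateToolsDefaults` function is hypothetical, and the sample "before" values are assumptions about what an existing project might contain.

```typescript
// Hypothetical shape: the defaults file is treated as a flat JSON object.
type ToolsDefaults = { [field: string]: unknown };

// Return a copy with the two fields named in the migration steps updated.
function migrateToolsDefaults(defaults: ToolsDefaults): ToolsDefaults {
  return { ...defaults, "function-runtime": "dotnet6", framework: "net6.0" };
}

// Sample contents; the "before" values are illustrative assumptions.
const before: ToolsDefaults = {
  region: "us-east-2",
  configuration: "Release",
  "function-runtime": "dotnetcore3.1",
  framework: "netcoreapp3.1",
};

const migrated = migrateToolsDefaults(before);
console.log(JSON.stringify(migrated, null, 2));
```

The function returns a new object instead of mutating its input, so the original settings remain available for comparison.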

Conclusion

We are excited to add support for .NET 6 to Lambda. It’s fast to get started or migrate existing functions to .NET 6, with many new features in .NET 6 to take advantage of. Read the Lambda Developer Guide for more getting started information.

To provide feedback for .NET on AWS Lambda, contact the AWS .NET team on the .NET Lambda GitHub repository.

For more serverless learning resources, visit Serverless Land.

Building TypeScript projects with AWS SAM CLI

2022-02-23 Eric Johnson

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/building-typescript-projects-with-aws-sam-cli/

This post is written by Dan Fox, Principal Specialist Solutions Architect, and Roman Boiko, Senior Specialist Solutions Architect.

The AWS Serverless Application Model (AWS SAM) CLI provides developers with a local tool for managing serverless applications on AWS. This command line tool allows developers to initialize and configure applications, build and test locally, and deploy to the AWS Cloud. Developers can also use AWS SAM from IDEs like Visual Studio Code, JetBrains, or WebStorm. TypeScript is a superset of JavaScript and adds static typing, which reduces errors during development and runtime.

On February 22, 2022 we announced the beta of AWS SAM CLI support for TypeScript. These improvements simplify TypeScript application development by allowing you to build and deploy serverless TypeScript projects using AWS SAM CLI commands. To install the latest version of the AWS SAM CLI, refer to the installation section of the AWS SAM page.

In this post, I initialize a TypeScript project using an AWS SAM template. Then I build a TypeScript project using the AWS SAM CLI. Next, I use AWS SAM Accelerate to speed up the development and test iteration cycles for your TypeScript project. Last, I measure the impact of bundling, tree shaking, and minification on deployment package size.

Initializing a TypeScript template

This walkthrough requires:

Node.js 14.x
AWS SAM CLI

AWS SAM now provides the capability to create a sample TypeScript project using a template. Since this feature is still in preview, you can enable this by one of the following methods:

Set the environment variable `SAM_CLI_BETA_ESBUILD=1`

Add the following parameters to your samconfig.toml

[default.build.parameters]
beta_features = true
[default.sync.parameters]
beta_features = true

Use the --beta-features option with sam build and sam sync. I use this approach in the following examples.
Choose option ‘y’ when CLI prompts you about using beta features.

To create a new project:

Run sam init
In the wizard, select the following options:
1. AWS Quick Start Templates
2. Hello World Example
3. nodejs14.x – TypeScript
4. Zip
5. Keep the name of the application as sam-app

sam init wizard steps

Open the created project in a text editor. In the root, you see a README.MD file with the project description and a template.yaml. This is the specification that defines the serverless application.

In the hello-world folder is an app.ts file written in TypeScript. This project also includes a unit test in Jest and sample configurations for ESLint, Prettier, and TypeScript compilers.

Project structure

Building and deploying a TypeScript project

Previously, to use TypeScript with AWS SAM CLI, you needed custom steps. These transform the TypeScript project into a JavaScript project before running the build.

Today, you can use the sam build command to transpile code from TypeScript to JavaScript. This bundles local dependencies and symlinks, and minifies files to reduce asset size.

AWS SAM uses the popular open source bundler esbuild to perform these tasks. esbuild does not perform type checking, but you can use the tsc CLI for that. Once you have built the TypeScript project, use the sam deploy command to deploy to the AWS Cloud.

The following shows how this works.

Navigate to the root of sam-app.
Run sam build. This command uses esbuild to transpile and package app.ts.

sam build wizard
Customize the esbuild properties by editing the Metadata section in the template.yaml file.

Esbuild configuration
After a successful build, run sam deploy --guided to deploy the application to your AWS account.
Accept all the default values in the wizard, except this question:
HelloWorldFunction may not have authorization defined, Is this okay? [y/N]: y

sam deploy wizard
After successful deployment, test that the function is working by querying the API Gateway endpoint displayed in the Outputs section.

sam deploy output

Using AWS SAM Accelerate with TypeScript

AWS SAM Accelerate is a set of features that reduces development and test cycle latency by enabling you to test code quickly against AWS services in the cloud. AWS SAM Accelerate released beta support for TypeScript. Use the template from the last example to use SAM Accelerate with TypeScript.

Use AWS SAM Accelerate to build and deploy your code upon changes.

Run sam sync --stack-name sam-app --watch.
Open your browser with the API Gateway endpoint from the Outputs section.

Update the handler function in app.ts file to:

export const lambdaHandler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
    let response: APIGatewayProxyResult;
    try {
        response = {
            statusCode: 200,
            body: JSON.stringify({
                message: 'hello SAM',
            }),
        };
    } catch (err) {
        console.log(err);
        response = {
            statusCode: 500,
            body: JSON.stringify({
                message: 'some error happened',
            }),
        };
    }

    return response;
};

Save changes. AWS SAM automatically rebuilds and syncs the application code to the cloud.

AWS SAM Accelerate output
Refresh the browser to see the updated message.

Deployment package size optimizations

One additional benefit of the TypeScript build process is that it reduces your deployment package size through bundling, tree shaking, and minification. The bundling process removes dependency files not referenced in the control flow. Tree shaking is the term used for unused code elimination. It is a compiler optimization that removes unreachable code within files.

Minification reduces file size by removing white space, rewriting syntax to be more compact, and renaming local variables to be shorter. The sam build process performs bundling and tree shaking by default. Configure minification, a feature typically used in production environments, within the Metadata section of the template.yaml file.

Measure the impact of these optimizations by the reduced deployment package size. For example, measure the before and after size of an application, which includes the AWS SDK for JavaScript v3 S3 Client as a dependency.

To begin, change the package.json file to include the @aws-sdk/client-s3 as a dependency:

From the application root, cd into the hello-world directory.
Run the command:
npm install @aws-sdk/client-s3
Delete all the devDependencies except for esbuild to get a more accurate comparison

package.json contents
Run the following command to build your dependency library:
npm install
From the application root, run the following command to measure the size of the application directory contents:
du -sh hello-world
The current application is approximately 50 MB.
Turn on minification by setting the Minify value to true in the template.yaml file

Metadata section of template.yaml
Now run the following command to build your project using bundling, tree shaking, and minification.
sam build
Your deployment package is now built in the .aws-sam directory. You can measure the size of the package with the following command:
du -sh .aws-sam

The new package size is approximately 2.8 MB. That represents a 94% reduction in uncompressed application size.
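
The percentage follows directly from the two measurements. As a quick check, this TypeScript sketch computes it from the approximate sizes reported above (the `percentReduction` helper is an illustrative name):

```typescript
// Relative size reduction, expressed as a percentage of the original size.
function percentReduction(beforeMb: number, afterMb: number): number {
  return ((beforeMb - afterMb) / beforeMb) * 100;
}

// Approximate sizes from the walkthrough: 50 MB before, 2.8 MB after.
const reduction = percentReduction(50, 2.8);
console.log(reduction.toFixed(1) + "% smaller"); // 94.4% smaller
```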

Conclusion

This post reviews several new features that can improve the development experience for TypeScript developers. I show how to create a sample TypeScript project using sam init. I build and deploy a TypeScript project using the AWS SAM CLI. I show how to use AWS SAM Accelerate with your TypeScript project. Last, I measure the impact of bundling, tree shaking, and minification on a sample project. We invite the serverless community to help improve AWS SAM. AWS SAM is an open source project and you can contribute to the repository here.

For more serverless content, visit Serverless Land.

How to secure API Gateway HTTP endpoints with JWT authorizer

2022-02-14 Siva Rajamani

Post Syndicated from Siva Rajamani original https://aws.amazon.com/blogs/security/how-to-secure-api-gateway-http-endpoints-with-jwt-authorizer/

This blog post demonstrates how you can secure Amazon API Gateway HTTP endpoints with JSON web token (JWT) authorizers. Amazon API Gateway helps developers create, publish, and maintain secure APIs at any scale, helping manage thousands of API calls. There are no minimum fees, and you only pay for the API calls you receive.

Based on customer feedback and lessons learned from building the REST and WebSocket APIs, AWS launched HTTP APIs for Amazon API Gateway, a service built to be fast, low cost, and simple to use. HTTP APIs offer a solution for building APIs, as well as multiple mechanisms for controlling and managing access through AWS Identity and Access Management (IAM) authorizers, AWS Lambda authorizers, and JWT authorizers.

This post includes step-by-step guidance for setting up JWT authorizers using Amazon Cognito as the identity provider, configuring HTTP APIs to use JWT authorizers, and examples to test the entire setup. If you want to protect HTTP APIs using Lambda and IAM authorizers, you can refer to Introducing IAM and Lambda authorizers for Amazon API Gateway HTTP APIs.

Prerequisites

Before you can set up a JWT authorizer using Cognito, you first need to create three Lambda functions. You should create each Lambda function using the following configuration settings, permissions, and code:

The first Lambda function (Pre-tokenAuthLambda) is invoked before the token generation, allowing you to customize the claims in the identity token.
The second Lambda function (LambdaForAdminUser) acts as the HTTP API Gateway integration target for /AdminUser HTTP API resource route.
The third Lambda function (LambdaForRegularUser) acts as the HTTP API Gateway integration target for /RegularUser HTTP API resource route.

IAM policy for Lambda function

You first need to create an IAM role using the following IAM policy for each of the three Lambda functions:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": "logs:CreateLogGroup",
			"Resource": "arn:aws:logs:us-east-1:<AWS Account Number>:*"
		},
		{
			"Effect": "Allow",
			"Action": [
				"logs:CreateLogStream",
				"logs:PutLogEvents"
			],
			"Resource": [
				"arn:aws:logs:us-east-1:<AWS Account Number>:log-group:/aws/lambda/<Name of the Lambda functions>:*"
			]
		}
	]
}
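
Because the same policy is needed for three functions, it can help to generate it from the function name. The following TypeScript sketch fills in the placeholders of the policy above. The `buildLoggingPolicy` function is hypothetical, and the account number shown is a dummy value.

```typescript
// Hypothetical helper that fills in the placeholders of the policy above.
function buildLoggingPolicy(accountId: string, functionName: string, region: string = "us-east-1") {
  const logsArn = "arn:aws:logs:" + region + ":" + accountId;
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: "logs:CreateLogGroup",
        Resource: logsArn + ":*",
      },
      {
        Effect: "Allow",
        Action: ["logs:CreateLogStream", "logs:PutLogEvents"],
        Resource: [logsArn + ":log-group:/aws/lambda/" + functionName + ":*"],
      },
    ],
  };
}

// One policy document per Lambda function; the account number is a placeholder.
const functionNames = ["Pre-tokenAuthLambda", "LambdaForAdminUser", "LambdaForRegularUser"];
const policies = functionNames.map((name) => buildLoggingPolicy("123456789012", name));

console.log(JSON.stringify(policies[1], null, 2));
```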

Settings for the required Lambda functions

For the three Lambda functions, use these settings:

Function name

Enter an appropriate name for the Lambda function, for example:

Pre-tokenAuthLambda for the first Lambda
LambdaForAdminUser for the second
LambdaForRegularUser for the third

Runtime

Choose Node.js 12.x

Permissions

Choose Use an existing role and select the role you created with the IAM policy in the Prerequisites section above.

Pre-tokenAuthLambda code

The first Lambda function, Pre-tokenAuthLambda, converts the authenticated user’s Cognito group details into the scope claim of the id_token returned by Cognito.

	exports.lambdaHandler = async (event, context) => {
		// Suffix each Cognito group name with the app client ID to form the scope values
		let newScopes = event.request.groupConfiguration.groupsToOverride.map(item => `${item}-${event.callerContext.clientId}`);
		event.response = {
			"claimsOverrideDetails": {
				"claimsToAddOrOverride": {
					"scope": newScopes.join(" "),
				}
			}
		};
		return event;
	};
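
To see what the resulting scope claim looks like, the transformation can be isolated as a pure function. This TypeScript sketch mirrors the mapping in the handler above. The `buildScopeClaim` name, the group names, and the client ID are made-up values for illustration.

```typescript
// Mirrors the mapping in the handler: "<group>-<clientId>", joined by spaces.
function buildScopeClaim(groups: string[], clientId: string): string {
  return groups.map((group) => group + "-" + clientId).join(" ");
}

// Group names and client ID are placeholders.
const scopeClaim = buildScopeClaim(["AdminGroup", "RegularGroup"], "exampleClientId");
console.log(scopeClaim); // AdminGroup-exampleClientId RegularGroup-exampleClientId
```

Each group becomes one space-separated scope value, so a user who belongs to several groups receives several scopes in a single claim.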

LambdaForAdminUser code

This Lambda code, LambdaForAdminUser, acts as the HTTP API Gateway integration target and sends back the response Hello from Admin User when the /AdminUser resource path is invoked in API Gateway.

	exports.handler = async (event) => {

		const response = {
			statusCode: 200,
			body: JSON.stringify('Hello from Admin User'),
		};
		return response;
	};

LambdaForRegularUser code

This Lambda code, LambdaForRegularUser, acts as the HTTP API Gateway integration target and sends back the response Hello from Regular User when the /RegularUser resource path is invoked in API Gateway.

	exports.handler = async (event) => {

		const response = {
			statusCode: 200,
			body: JSON.stringify('Hello from Regular User'),
		};
		return response;
	};

Deploy the solution

To secure the API Gateway resources with JWT authorizer, complete the following steps:

Create an Amazon Cognito User Pool with an app client that acts as the JWT authorizer
Create API Gateway resources and secure them using the JWT authorizer based on the configured Amazon Cognito User Pool and app client settings.

The procedures below will walk you through the step-by-step configuration.

Set up JWT authorizer using Amazon Cognito

The first step to set up the JWT authorizer is to create an Amazon Cognito user pool.

To create an Amazon Cognito user pool

Go to the Amazon Cognito console.
Choose Manage User Pools, then choose Create a user pool.

Figure 1: Create a user pool
Enter a Pool name, then choose Review defaults.

Figure 2: Review defaults while creating the user pool
Choose Add app client.

Figure 3: Add an app client for the user pool
Enter an app client name. For this example, keep the default options. Choose Create app client to finish.

Figure 4: Review the app client configuration and create it
Choose Return to pool details, and then choose Create pool.

Figure 5: Complete the creation of user pool setup

To configure Cognito user pool settings

Now you can configure app client settings:

On the left pane, choose App client settings. In Enabled Identity Providers, select the identity providers you want for the apps you configured in the App Clients tab.
Enter the Callback URLs you want, separated by commas. These URLs apply to all selected identity providers.
Under OAuth 2.0, select from the following options.
- For Allowed OAuth Flows, select Authorization code grant.
- For Allowed OAuth Scopes, select phone, email, openid, and profile.
Choose Save changes.

Figure 6: Configure app client settings
Now add the domain prefix to use for the sign-in pages hosted by Amazon Cognito. On the left pane, choose Domain name and enter the appropriate domain prefix, then Save changes.

Figure 7: Choose a domain name prefix for the Amazon Cognito domain
Next, create the pre-token generation trigger. On the left pane, choose Triggers and under Pre Token Generation, select the Pre-tokenAuthLambda Lambda function you created in the Prerequisites procedure above, then choose Save changes.

Figure 8: Configure Pre Token Generation trigger Lambda for user pool
Finally, create two Cognito groups named admin and regular. Create two Cognito users named adminuser and regularuser. Assign adminuser to both the admin and regular groups. Assign regularuser to the regular group.

Figure 9: Create groups and users for user pool

Configuring HTTP endpoints with JWT authorizer

The first step to configure HTTP endpoints is to create the API in the API Gateway management console.

To create the API

Go to the API Gateway management console and choose Create API.

Figure 10: Create an API in API Gateway management console
Choose HTTP API and select Build.

Figure 11: Choose Build option for HTTP API
Under Create and configure integrations, enter JWTAuth for the API name and choose Review and Create.

Figure 12: Create Integrations for HTTP API
Once you’ve created the API JWTAuth, choose Routes on the left pane.

Figure 13: Navigate to Routes tab
Choose Create a route and select GET method. Then, enter /AdminUser for the path.

Figure 14: Create the first route for HTTP API
Repeat step 5 and create a second route using the GET method and /RegularUser for the path.

Figure 15: Create the second route for HTTP API

To create API integrations

Now that the two routes are created, select Integrations from the left pane.

Figure 16: Navigate to Integrations tab
Select GET for the /AdminUser resource path, and choose Create and attach an integration.

Figure 17: Attach an integration to first route
To create an integration, select the following values
Integration type: Lambda function
Integration target: LambdaForAdminUser
Choose Create.
NOTE: LambdaForAdminUser is the Lambda function you previously created as part of the Prerequisites procedure LambdaForAdminUser code.

Figure 18: Create an integration for first route
Next, select GET for the /RegularUser resource path and choose Create and attach an integration.

Figure 19: Attach an integration to second route
To create an integration, select the following values
Integration type: Lambda function
Integration target: LambdaForRegularUser
Choose Create.
NOTE: LambdaForRegularUser is the Lambda function you previously created as part of the Prerequisites procedure LambdaForRegularUser code.

Figure 20: Create an integration for the second route

To configure API authorization

Select Authorization from the left pane, select /AdminUser path and choose Create and attach an authorizer.

Figure 21: Navigate to Authorization left pane option to create an authorizer
For Authorizer type select JWT and under Authorizer settings enter the following details:

Name: JWTAuth

Identity source: $request.header.Authorization

Issuer URL: https://cognito-idp.us-east-1.amazonaws.com/<your_userpool_id>

Audience: <app_client_id_of_userpool>
Choose Create.

Figure 22: Create and attach an authorizer to HTTP API first route
In the Authorizer for route GET /AdminUser screen, choose Add scope in the Authorization Scope section and enter scope name as admin-<app_client_id> and choose Save.

Figure 23: Add authorization scopes to first route of HTTP API
Now select the /RegularUser path and from the dropdown, select the JWTAuth authorizer you created in step 3. Choose Attach authorizer.

Figure 24: Attach an authorizer to HTTP API second route
Choose Add scope and enter the scope name as regular-<app_client_id> and choose Save.

Figure 25: Add authorization scopes to second route of HTTP API
Enter Test as the Name and then choose Create.

Figure 26: Create a stage for HTTP API
Under Select a stage, enter Test, and then choose Deploy to stage.

Figure 27: Deploy HTTP API to stage

Test the JWT authorizer

You can use the following examples to test the API authentication. We use curl in this example, but you can use any HTTP client.

To test the API authentication

Send a GET request to the /RegularUser HTTP API resource without specifying any authorization header.
```
curl -s -X GET https://a1b2c3d4e5.execute-api.us-east-1.amazonaws.com/RegularUser
```
API Gateway returns a 401 Unauthorized response, as expected.

{"message":"Unauthorized"}

The required $request.header.Authorization identity source is not provided, so the JWT authorizer is not called. Supply a valid Authorization header key and value. You authenticate as the regularuser, using the aws cognito-idp initiate-auth AWS CLI command.

aws cognito-idp initiate-auth --auth-flow USER_PASSWORD_AUTH --client-id <Cognito User Pool App Client ID> --auth-parameters USERNAME=regularuser,PASSWORD=<Password for regularuser>

CLI Command response:


{
	"ChallengeParameters": {},
	"AuthenticationResult": {
		"AccessToken": "6f5e4d3c2b1a111112222233333xxxxxzz2yy",
		"ExpiresIn": 3600,
		"TokenType": "Bearer",
		"RefreshToken": "xyz123abc456dddccc0000",
		"IdToken": "aaabbbcccddd1234567890"
	}
}

The command response contains a JWT (IdToken) that contains information about the authenticated user. Use this token as the Authorization header value.
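
A JWT consists of three base64url-encoded segments separated by dots, so you can inspect the claims in an IdToken before sending it. The sketch below decodes the payload only and does not verify the signature. `decodeJwtPayload` is a hypothetical helper, and the token is an unsigned stand-in built inside the example rather than one issued by Cognito.

```javascript
// Decode (not verify) the payload segment of a JWT.
function decodeJwtPayload(token) {
  const segments = token.split('.');
  if (segments.length !== 3) {
    throw new Error('Expected a JWT with three dot-separated segments');
  }
  // Convert base64url to standard base64 before decoding.
  const base64 = segments[1].replace(/-/g, '+').replace(/_/g, '/');
  return JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
}

// Build an unsigned stand-in token so the example runs without Cognito.
const toBase64Url = (obj) =>
  Buffer.from(JSON.stringify(obj)).toString('base64')
    .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
const demoToken = [
  toBase64Url({ alg: 'none' }),
  toBase64Url({ 'cognito:username': 'regularuser', scope: 'regular-abc123' }),
  'signature-placeholder',
].join('.');

console.log(decodeJwtPayload(demoToken).scope); // regular-abc123
```

Decoding is useful for debugging only. API Gateway performs the actual signature and claim validation when the token is presented.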

curl -H "Authorization: aaabbbcccddd1234567890" -s -X GET https://a1b2c3d4e5.execute-api.us-east-1.amazonaws.com/RegularUser

API Gateway returns the response Hello from Regular User. Now test access for the /AdminUser HTTP API resource with the JWT token for the regularuser.
```
curl -H "Authorization: aaabbbcccddd1234567890" -s -X GET "https://a1b2c3d4e5.execute-api.us-east-1.amazonaws.com/AdminUser"
```
API Gateway returns a 403 – Forbidden response.
{"message":"Forbidden"}
The JWT token for the regularuser does not have the authorization scope defined for the /AdminUser resource, so API Gateway returns a 403 – Forbidden response.
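
The scope check behind this response can be pictured as follows. This is an illustrative sketch of the rule that a token must carry at least one of the scopes configured on the route, not API Gateway's actual implementation. The function name and the client ID `abc123` are placeholders.

```javascript
// Returns true when the token's space-delimited scope claim contains
// at least one of the scopes configured on the route.
function hasRequiredScope(scopeClaim, routeScopes) {
  const granted = new Set((scopeClaim || '').split(' ').filter(Boolean));
  return routeScopes.some((scope) => granted.has(scope));
}

const regularUserScopes = 'regular-abc123';
const adminUserScopes = 'admin-abc123 regular-abc123';

console.log(hasRequiredScope(regularUserScopes, ['admin-abc123'])); // false
console.log(hasRequiredScope(adminUserScopes, ['admin-abc123']));   // true
console.log(hasRequiredScope(adminUserScopes, ['regular-abc123'])); // true
```
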
Next, log in as adminuser and validate that you can successfully access both /RegularUser and /AdminUser resource. You use the cognito-idp initiate-auth AWS CLI command.

aws cognito-idp initiate-auth --auth-flow USER_PASSWORD_AUTH --client-id <Cognito User Pool App Client ID> --auth-parameters USERNAME=adminuser,PASSWORD=<Password for adminuser>

CLI Command response:


{
	"ChallengeParameters": {},
	"AuthenticationResult": {
		"AccessToken": "a1b2c3d4e5c644444555556666Y2X3Z1111",
		"ExpiresIn": 3600,
		"TokenType": "Bearer",
		"RefreshToken": "xyz654cba321dddccc1111",
		"IdToken": "a1b2c3d4e5c6aabbbcccddd"
	}
}

Using curl, you can validate that the adminuser JWT token now has access to both the /RegularUser resource and the /AdminUser resource. This is possible because adminuser is part of both Cognito groups, so the JWT token contains both authorization scopes.
```
curl -H "Authorization: a1b2c3d4e5c6aabbbcccddd" -s -X GET https://a1b2c3d4e5.execute-api.us-east-1.amazonaws.com/RegularUser
```
API Gateway returns the response Hello from Regular User
```
curl -H "Authorization: a1b2c3d4e5c6aabbbcccddd" -s -X GET https://a1b2c3d4e5.execute-api.us-east-1.amazonaws.com/AdminUser
```
API Gateway returns the following response Hello from Admin User

Conclusion

AWS enabled the ability to manage access to an HTTP API in API Gateway in multiple ways: with Lambda authorizers, IAM roles and policies, and JWT authorizers. This post demonstrated how you can secure API Gateway HTTP API endpoints with JWT authorizers. We configured a JWT authorizer using Amazon Cognito as the identity provider (IdP). You can achieve the same results with any IdP that supports OAuth 2.0 standards. API Gateway validates the JWT that the client submits with API requests. API Gateway allows or denies requests based on token validation along with the scope of the token. You can configure distinct authorizers for each route of an API, or use the same authorizer for multiple routes.

Building custom connectors using the Amazon AppFlow Custom Connector SDK

2022-02-14 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-custom-connectors-using-the-amazon-appflow-custom-connector-sdk/

This post is written by Kamen Sharlandjiev, Sr. Specialist SA, Integration, Ray Jang, Principal PMT, Amazon AppFlow, and Dhiraj Mahapatro, Sr. Specialist SA, Serverless.

Amazon AppFlow is a fully managed integration service that enables you to transfer data securely between software as a service (SaaS) applications like Salesforce, SAP, Zendesk, Slack, ServiceNow, and AWS services like Amazon S3 and Amazon Redshift. Amazon AppFlow lets you run enterprise-scale data flows on a schedule, in response to business events, or on-demand.

As a managed service, Amazon AppFlow replaces the heavy lifting of developing, maintaining, and updating connectors. It supports bidirectional integration between external SaaS applications and AWS services.

The Custom Connector Software Development Kit (SDK) now makes it easier to integrate with private API endpoints, proprietary applications, or other cloud services. It provides access to all available managed integrations and the ability to build your own custom integration as part of the integrated experience. The SDK is open-source and available for Java or Python.

You can deploy custom connectors built with the SDK in different ways:

Private – The connector is available only inside the AWS account where deployed.
Shared – The connector can be shared for use with other AWS accounts.
Public – Publish connectors on the AWS Marketplace for free or charge a subscription fee. For more information, refer to Sharing AppFlow connectors via AWS Marketplace.

Overview

This blog takes you through building and deploying your own Amazon AppFlow Custom Connector using the Java SDK. The sample application shows how to build your first custom connector with Amazon AppFlow.

The process of building, deploying, and using a custom connector is:

Create a custom connector as an AWS Lambda function using the Amazon AppFlow Custom Connector SDK.
Deploy the custom connector Lambda function, which provides the serverless compute for the connector.
The Lambda function integrates with a SaaS application or private API.
Register the custom connector with Amazon AppFlow.
Users can now use this custom connector in the Amazon AppFlow service.

Building an Amazon AppFlow custom connector

The sample application used in this blog creates a new custom connector that implements a MySQL JDBC driver. With this connector, you can connect to a remote MySQL or MariaDB instance to read and write data.

The SDK allows you to build custom connectors and use the service’s built-in authentication support for: OAuth2, API key, and basic auth. For other use cases, such as JDBC, you must create your own custom authentication implementation.

The SDK includes the source code for an example Salesforce connector. This highlights a complete use case for a source and destination Amazon AppFlow connector using OAuth2 as authentication.

Details

There are three mandatory Java interfaces that a connector must implement:

ConfigurationHandler.java: Defines the functionality to implement connector configurations, and credentials-related operations.
MetadataHandler.java: Represents the functionality to implement for objects metadata.
RecordHandler.java: Defines functionality to implement record-related CRUD operations.

Prerequisites

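Ensure that the required software is installed on your workstation, including a Java 11 JDK, Apache Maven, and the AWS SAM CLI, which the build and deploy steps in this post rely on.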

To run the sample application:

Clone the code repository:

git clone https://github.com/aws-samples/amazon-appflow-custom-jdbc-connector.git

cd amazon-appflow-custom-jdbc-connector

After cloning the sample application, visit these Java classes for more details:

JDBCConnectorConfigurationHandler.java: A configuration handler validates credentials for the JDBC client connection, validates connector runtime settings, and describes connector configuration.
JDBCConnectorMetadataHandler.java: A metadata handler that describes and lists all entities for the JDBC connector metadata.
JDBCConnectorRecordsHandler.java: A record handler that defines how to implement CRUD operations for this JDBC connector.

To add JDBC clients for other database engines, implement the JDBCClient.java interface. The custom connector uses a Lambda function as a POJO class to handle requests. The SDK provides an abstract BaseLambdaConnectorHandler class, which you extend as follows:

import com.amazonaws.appflow.custom.connector.lambda.handler.BaseLambdaConnectorHandler;

public class JDBCConnectorLambdaHandler extends BaseLambdaConnectorHandler {

  public JDBCConnectorLambdaHandler() {
    super(
      new JDBCConnectorMetadataHandler(),
      new JDBCConnectorRecordHandler(),
      new JDBCConnectorConfigurationHandler()
    );
  }
}

Local testing and debugging

While developing the connector-specific functionality, developers need a local testing capability to build and debug faster. The SDK and the example connector provide examples of testing custom connectors.

Additionally, you can experiment with JUnit and the DSL builders provided by the SDK. The JUnit tests allow you to test this implementation locally by simulating an appropriate request to the Lambda function. You can set breakpoints and step through the code implementation from start to end using the built-in IDE debugger. The sample application comes with example JUnit tests that can be used with breakpoints.

Credentials management

Amazon AppFlow stores all sensitive information in AWS Secrets Manager. The secret is created when you create a connector profile. The secret ARN is passed in the ConnectorContext that forms part of the Lambda function’s invocation request.

To test locally:

Mock the CredentialsProvider and stub out the response of the GetCredentials API. Note that the CredentialsProvider provides several different GetCredentials methods, depending on the authentication used.
Create a secret in AWS Secrets Manager. Configure an IAM user with programmatic access and sufficient permissions to allow the secretsmanager:GetSecretValue action and let the CredentialsProvider call Secrets Manager locally. When you initialize a new service client without supplying any arguments, the SDK attempts to find AWS credentials by using the default credential provider chain.

For more information, read Working with AWS Credentials (SDK for Java) and Creating an IAM user with programmatic access.

Deploying the Lambda function in an AWS account

This example connector package provides an AWS Serverless Application Model (AWS SAM) template in the project folder. It describes the following resources:

The Lambda function containing the custom connector code.
The AWS IAM policy, allowing the function to read secrets from AWS Secrets Manager.
The AWS Lambda policy permission allowing Amazon AppFlow to invoke the Lambda function.

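The sample application’s AWS SAM template declares two resources: the Lambda function, which carries the Secrets Manager IAM policy inline, and the Lambda permission for Amazon AppFlow: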

AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: Template to deploy the lambda connector in your account.
Resources:
  ConnectorFunction:
    Type: 'AWS::Serverless::Function'
    Properties:
      Handler: "org.custom.connector.jdbc.handler.JDBCConnectorLambdaHandler::handleRequest"
      CodeUri: "./target/appflow-custom-jdbc-connector-jdbc-1.0.jar"
      Description: "AppFlow custom JDBC connector example"
      Runtime: java11
      Timeout: 30
      MemorySize: 1024
      Policies:
        Version: '2012-10-17'
        Statement:
          Effect: Allow
          Action: 'secretsmanager:GetSecretValue'
          Resource: !Sub 'arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:appflow!${AWS::AccountId}-*'

  PolicyPermission:
    Type: 'AWS::Lambda::Permission'
    Properties:
      FunctionName: !GetAtt ConnectorFunction.Arn
      Action: lambda:InvokeFunction
      Principal: 'appflow.amazonaws.com'
      SourceAccount: !Ref 'AWS::AccountId'
      SourceArn: !Sub 'arn:aws:appflow:${AWS::Region}:${AWS::AccountId}:*'

Deploy this custom connector by using the following command from the amazon-appflow-custom-jdbc-connector base directory:

mvn package && sam deploy --guided

Once deployment completes, follow the steps below to register and use the connector.

Registering the custom connector

There are two ways to register the custom connector.

1. Register through the AWS Management Console

From the AWS Management Console, navigate to Amazon AppFlow. Select Connectors on the left-side menu. Choose Register New Connector.
Register the connector by selecting your Lambda function and typing in the connector label.
The newly created custom connector Lambda function appears in the list if you deployed using AWS SAM by following the steps in this tutorial. If you deployed the Lambda function manually, ensure that appropriate Lambda permissions are set, as described in the Lambda Permissions and Resource Policy section.
Provide a label for the connector. The label must be unique per account per Region. Choose Register.
The connector appears in the list of custom connectors.

2. Register with the API

Invoke the registerConnector public API endpoint with the following request payload:

{
   "connectorLabel":"TestCustomConnector",
   "connectorProvisioningType":"LAMBDA",
   "connectorProvisioningConfig":{
      "lambda":{
         "lambdaArn":"arn:aws:lambda:<region>:<aws_account_id>:function:<lambdaFunctionName>"
      }
   }
}

For connectorLabel, use a unique label. Currently, the only supported connectorProvisioningType is LAMBDA.
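
If you register connectors from a script, the payload can be assembled programmatically. The following sketch only builds the request body shown above. `buildRegisterConnectorRequest` is a hypothetical helper, the ARN check is a loose illustrative pattern, the ARN itself is a placeholder, and sending the request is left out.

```javascript
// Hypothetical helper that assembles the registerConnector request body.
function buildRegisterConnectorRequest(connectorLabel, lambdaArn) {
  // Loose sanity check on the ARN shape; not an exhaustive validation.
  if (!/^arn:aws[\w-]*:lambda:[\w-]+:\d{12}:function:.+$/.test(lambdaArn)) {
    throw new Error(`Not a Lambda function ARN: ${lambdaArn}`);
  }
  return {
    connectorLabel,
    connectorProvisioningType: 'LAMBDA', // the only supported type per the text above
    connectorProvisioningConfig: { lambda: { lambdaArn } },
  };
}

const registerRequest = buildRegisterConnectorRequest(
  'TestCustomConnector',
  'arn:aws:lambda:us-east-1:123456789012:function:ConnectorFunction' // placeholder ARN
);
console.log(JSON.stringify(registerRequest, null, 2));
```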

Using the new custom connector

Navigate to the Connections link from the left-menu. Select the registered connector from the drop-down.
Choose Create Connection.
Complete the connector-specific setup.
Proceed with creating a flow and selecting your new connection.
Check the Lambda function’s Amazon CloudWatch Logs to troubleshoot any errors during connector registration, connector profile creation, and flow execution.

Production considerations

This example is a proof of concept. To build a production-ready solution, review the non-exhaustive list of differences between sample and production-ready solutions.

If you plan to use the custom connector with high concurrency, review AWS Lambda quotas and limitations.

Cleaning up the custom connector stack

To delete the connector:

Delete all flows in Amazon AppFlow that you created as part of this tutorial.
Delete any connector profiles.
Unregister the custom connector.
To delete the stack, run the following command from the amazon-appflow-custom-jdbc-connector base directory:
```
sam delete
```

Conclusion

This blog post shows how to extend the Amazon AppFlow service to move data between SaaS endpoints and custom APIs. You can now build custom connectors using the Amazon AppFlow Custom Connector SDK.

Using custom connectors in Amazon AppFlow allows you to integrate siloed applications with minimal code. For example, different business units using legacy applications in an organization can now integrate their services via the Amazon AppFlow Custom Connectors SDK.

Depending on your choice of framework you can use the open source Python SDK or Java SDK from GitHub. To learn more, refer to the Custom Connector SDK Developer Guide.

For more serverless learning resources, visit Serverless Land.

Introducing AWS Virtual Waiting Room

2022-02-10 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-aws-virtual-waiting-room/

This post is written by Justin Pirtle, Principal Solutions Architect, Joan Morgan, Software Developer Engineer, and Jim Thario, Software Developer Engineer.

Today, AWS is introducing an official AWS Virtual Waiting Room solution. You can integrate this new, open-source solution with existing web and mobile applications. It can help buffer users during times of peak demand and sudden bursts of traffic, preventing systems from resource exhaustion.

Events commonly use virtual waiting rooms where there is either unknown demand or expected large bursts of traffic. Examples of such events include concert ticket sales, Black Friday promotions, COVID-19 vaccine registrations, and more. Virtual waiting rooms allow a quota of users to view, select, and complete their transactions directly. They shield the application’s backend environment from traffic by buffering users in a waiting room until it is their turn in line.

Like any real-life queuing system, a user enters the AWS Virtual Waiting Room and requests a number in line. After receiving a number corresponding to the unique device ID, the browser then polls regularly for updates. Each update provides the number currently being served and the anticipated time until the user reaches the front of the line.

After reaching the front of the line, the user can exchange the number and device ID for a secure session token. This is included with their downstream requests to authenticate users securely.
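
The queueing behavior described above can be modeled in a few lines. The class below is an in-memory illustration only, with hypothetical names throughout. The real solution exposes this logic through its APIs and issues signed session tokens rather than a boolean.

```javascript
// In-memory model of the "take a number, poll, enter" flow.
class WaitingRoomModel {
  constructor() {
    this.lastIssued = 0;      // highest number handed out so far
    this.servingCounter = 0;  // highest number currently allowed through
    this.numbers = new Map(); // device ID -> number in line
  }

  // A device requests its number in line; repeat requests return the same number.
  requestNumber(deviceId) {
    if (!this.numbers.has(deviceId)) {
      this.lastIssued += 1;
      this.numbers.set(deviceId, this.lastIssued);
    }
    return this.numbers.get(deviceId);
  }

  // What a polling client would see for its device.
  status(deviceId) {
    const position = this.numbers.get(deviceId);
    return { position, serving: this.servingCounter, canEnter: position <= this.servingCounter };
  }

  // Admit more users by advancing the serving counter.
  incrementServingCounter(by) {
    this.servingCounter += by;
  }
}

const room = new WaitingRoomModel();
room.requestNumber('device-a');
room.requestNumber('device-b');
console.log(room.status('device-b').canEnter); // false
room.incrementServingCounter(2);
console.log(room.status('device-b').canEnter); // true
```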

If a user discovers the backend endpoint and tries to send requests, they are redirected into the waiting room. The API requests are denied access until they have a valid token. This prevents the backend from needing to scale to accommodate all users at a single time.

Integrating the AWS Virtual Waiting Room into your application

Integration steps depend on the integration pattern for your application. You can decide if all users are routed through the waiting room or only during periods of excessive traffic. You can also choose to protect only the web host serving the backend webpages or one or more APIs powering backend commerce services.

There are four common patterns supported for integrating the waiting room into your application:

Upstream redirection of all traffic from the main target site to flow through AWS Virtual Waiting Room. This option sends all user traffic through the waiting room with the initial capacity of users permitted to the protected system. The traffic passes through transparently, then it buffers the remaining users. It admits new users as capacity becomes available. The target system is only accessible by users who pass through the waiting room.
Downstream redirection to the virtual waiting room from the target site. This option sends all traffic to the target site. The target site conditionally redirects requests that need to enter the waiting room. No DNS or upstream modifications are needed. The target site must be able to handle the initial user requests and redirection responses.
Direct target site API integration for buffering users from an existing website without any redirection. Your web or mobile application integrates the virtual waiting room at the API-level. This does not need any redirection to a different waiting room endpoint or site. This can offer a seamless user experience but may require more development for the integration.
OpenID Connect (OIDC) adapter. This option offers no-code native integration of the waiting room with OpenID Connect-enabled system components, such as the AWS Application Load Balancer (ALB). Users are redirected by the load balancer or similar component to the waiting room. They are buffered until issued a signed, time-limited JSON Web Token (JWT). Once the user’s JWT token is issued, the load balancer then forwards user requests to the target backend systems.

Overview of the AWS Virtual Waiting Room solution

The AWS Virtual Waiting Room solution implementation includes three main components:

Core APIs. The main resources deployed include two Amazon API Gateway deployments, a VPC, several AWS Lambda functions, an Amazon DynamoDB table, and an Amazon ElastiCache cluster. These APIs provide the basic mechanisms for tracking clients entering the waiting room. Clients use them to request the status of the line progression and to obtain an authentication token to enter the protected target site.
Waiting room front-end website. The waiting room static site is shown to users awaiting their turn. This site dynamically updates the position being served and their place in line on a configurable interval. You customize this site’s HTML, CSS, and JavaScript to match your frontend styling and theme.
Lambda authorizer for protected target system. The Lambda authorizer wraps and protects the downstream protected target system’s APIs. This ensures that all user invocations have a validated time-limited token issued by the waiting room core API. It helps to prevent users from bypassing the waiting room.
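
To make the "time-limited" part of the token concrete, the following sketch checks only the validity window of claims that have already been decoded, using the standard JWT `exp` and `nbf` claims. It is not the solution's authorizer code: the function name and timestamps are hypothetical, and signature verification, which a real authorizer must also perform, is deliberately omitted.

```javascript
// Checks only the time window of already-decoded JWT claims.
// Signature verification is omitted from this illustration.
function isWithinValidityWindow(claims, nowSeconds = Math.floor(Date.now() / 1000)) {
  if (typeof claims.exp !== 'number') return false; // no expiry: reject
  if (nowSeconds >= claims.exp) return false;       // expired
  if (typeof claims.nbf === 'number' && nowSeconds < claims.nbf) return false; // not yet valid
  return true;
}

const issuedAt = 1700000000; // arbitrary example timestamp, in seconds
const exampleClaims = { iat: issuedAt, nbf: issuedAt, exp: issuedAt + 300 };

console.log(isWithinValidityWindow(exampleClaims, issuedAt + 60));  // true
console.log(isWithinValidityWindow(exampleClaims, issuedAt + 600)); // false
```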

The Virtual Waiting Room CloudFormation template deploys the following infrastructure:

An Amazon CloudFront distribution to deliver public API calls for the client.
Amazon API Gateway public API resources to process queue requests from the virtual waiting room, track the queue position, and support validation of tokens that allow access to the target website.
An Amazon Simple Queue Service (Amazon SQS) queue to regulate traffic to the AWS Lambda function that processes the queue messages. Instead of invoking the Lambda function for each request, the SQS queue batches the incoming bursts of requests.
API Gateway private API resources to support administrative functions.
Lambda functions to validate and process public and private API requests, and return the appropriate responses.
Amazon Virtual Private Cloud (VPC) to host the Lambda functions that interact directly with the Amazon ElastiCache for Redis cluster. VPC endpoints allow Lambda functions in the VPC to communicate with services within the solution.
An Amazon CloudWatch rule to invoke a Lambda function that works with a custom Amazon EventBridge bus to periodically broadcast status updates.
An Amazon DynamoDB table to store token data.
AWS Secrets Manager to store keys for token operations and other sensitive data.
(Optional) Authorizer component consisting of an AWS Identity and Access Management (IAM) role and a Lambda function to validate signatures for your API calls. The only requirement for the authorizer to protect your API is to use API Gateway.
(Optional) Amazon Simple Notification Service (Amazon SNS), CloudWatch, and Lambda functions to support two inlet strategies.
(Optional) OpenID adaptor component with API Gateway and Lambda functions to allow an OpenID provider to authenticate users to your website. This component also includes a CloudFront distribution with an Amazon Simple Storage Service (Amazon S3) bucket for its waiting room page.
(Optional) A CloudFront distribution with Amazon S3 origin bucket for the optional sample waiting room web application.

Deploying the AWS Virtual Waiting Room

To get started with the AWS Virtual Waiting Room, deploy the Getting Started stack. This deploys the Core APIs stack, the Authorizers stack, and a sample application CloudFormation stack:

Launch the Getting Started CloudFormation stack. The template launches in the US East (N. Virginia) Region by default. To launch the solution in a different AWS Region, use the Region selector in the console navigation bar.
On the Create stack page, verify that the correct template URL is in the Amazon S3 URL text box and choose Next.
On the Specify stack details page, assign a name to your solution stack, and accept all default parameter values. For information about naming character limitations, refer to IAM and STS Limits in the AWS Identity and Access Management User Guide. Choose Next.
On the Configure stack options page, choose Next.
On the Review page, review and confirm the settings. Check the box acknowledging that the template creates AWS Identity and Access Management (IAM) resources.
Choose Create stack to deploy the stack.
You can view the status of the stack in the AWS CloudFormation Console in the Status column. You should receive a CREATE_COMPLETE status in approximately 30 minutes.
Once successfully deployed, browse to the Outputs tab.
Copy the ControlPanelURL and WaitingRoomURL to a scratch pad file for later use.

Configuring the AWS Virtual Waiting Room

After deploying the three stacks, test the waiting room using the sample application:

Navigate to the IAM console. Create a new IAM user or select an existing IAM user in the same account where you deployed the waiting room stack.
Grant the selected IAM user programmatic access. Download the key file or copy the access key ID and secret access key values to your scratch pad for later use.
Add the IAM user to the ProtectedAPIGroup IAM user group created by the getting started template.
Open the control panel in a new tab or browser window using the ControlPanelURL output you saved earlier.
In the control panel, expand the Configuration section.
Enter the access key ID and secret access key that you saved earlier. These keys are used to call the IAM-secured APIs. The endpoints and event ID are filled in from the URL parameters.
Choose Use. The button activates after you have supplied the credentials.
You now see the status “Connected”, along with the various metrics reported.

Test the sample waiting room

Browse to the sample waiting room in a new browser tab. Use the WaitingRoomURL you captured previously from the CloudFormation stack output values.
Select Reserve to enter the waiting room. If you are unable to proceed with your transaction, your assigned number is not yet reached.
Navigate back to the browser tab with the control panel.
Under Increment Serving Counter, select Change. This manually increments the serving counter and allows 100 users to move on from the waiting room to the target site.
Navigate back to the waiting room and choose Check out now! You are redirected to the target site since your serving number is eligible to proceed beyond the waiting room.
Select Purchase now to finish your transaction at the target site. This page represents the protected system beyond the waiting room. Replace this with the actual system you are protecting.
After the simulated purchase is complete, you can see that the transaction is successful. This transaction is authorized using the time-limited authorization token, which came from the waiting room previously. If a user bypasses the waiting room, they would not be successful in completing a transaction.
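The queueing rule exercised in these steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the function and variable names are hypothetical and are not taken from the solution's source code, and the real waiting room also issues a time-limited token, which is omitted here.

```python
def can_proceed(queue_position: int, serving_counter: int) -> bool:
    """Return True once the serving counter has reached the client's position."""
    return queue_position <= serving_counter


# Hypothetical values: the client holds position 42 and the counter starts at 0.
serving_counter = 0
position = 42
before = can_proceed(position, serving_counter)

# Incrementing the serving counter by 100 admits positions 1 through 100.
serving_counter += 100
after = can_proceed(position, serving_counter)
print(before, after)
```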

Customizing the AWS Virtual Waiting Room for your application

The sample browser client demonstrates an entire user flow frontend with the AWS Virtual Waiting Room flow for an ecommerce purchase. You can use this code as a starting point for your waiting room or reference the API communication code for integrating the waiting room into your existing website.

This sample code is built with Vue.js and Bootstrap to render the user interface. It uses the Axios and Axios-Retry packages to make API calls to the virtual waiting room stack. The sample code uses the Axios-Retry package to show how to handle throttling conditions and exponential backoff in high-traffic situations.
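To illustrate the idea of exponential backoff under throttling, the following Python sketch computes retry delays that double on each attempt, with random jitter. It is not the Axios-Retry implementation; the function name, the base delay, and the cap are assumptions chosen for the example.

```python
import random
from typing import Optional


def backoff_delays(retries: int, base_seconds: float = 0.1,
                   cap_seconds: float = 10.0,
                   rng: Optional[random.Random] = None) -> list:
    """Return one jittered delay (in seconds) per retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        # The ceiling doubles with every attempt until it reaches the cap.
        ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
        # Jitter spreads clients out so they do not retry in lockstep.
        delays.append(rng.uniform(0, ceiling))
    return delays


# A seeded generator keeps the example repeatable.
example_delays = backoff_delays(5, rng=random.Random(0))
print(example_delays)
```

A client would sleep for each delay in turn before re-sending a throttled request, and give up once the list is exhausted.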

The control panel client is used to make requests to the private waiting room API that requires IAM-based authorization. The control panel client demonstrates how to construct and sign requests to the private API. It can be used in production or customized further. All of the waiting room source code is available on GitHub, including the sample user client and control panel client.

Conclusion

The AWS Virtual Waiting Room solution is available today at no additional cost, provided as open source under the Apache 2 license. It supports customized integration with any front-end application via a variety of integration techniques. You can also customize how and when the waiting room allows users to progress into the protected target system using a variety of strategies.

To learn more about the AWS Virtual Waiting Room solution, visit the solution implementation and implementation guide.

For more serverless learning resources, visit Serverless Land.

Capturing client events using Amazon API Gateway and Amazon EventBridge

2022-02-08 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/capturing-client-events-using-amazon-api-gateway-and-amazon-eventbridge/

This post is written by Tim Bruce, Senior Solutions Architect, DevAx.

Event producers are one of the three main components in an event-driven architecture. Event producers create and publish events to event routers, which send them to event consumers. Any portion of a system, including a mobile or web client, can be an event producer.

To extend the event model to your mobile and web clients, you must implement standards for security, messaging formats, and event storage.

This post shows how to build a client-enabled event-handling solution. It uses Amazon EventBridge, Amazon API Gateway, AWS Lambda, and Amazon Cognito. This architecture supports routing client events to internal and external destinations. It provides a blueprint that you can use to simplify the integration.

Overview

This example creates a RESTful API using API Gateway. It sends events directly to EventBridge without the need for compute services. In production, you have more requirements than only receiving and forwarding events. Additional requirements include security, user identification, validation, enrichment, transformation, event forwarding, and storing.

In this example, API Gateway provides security and user identification by invoking a Lambda authorizer. The authorizer generates a policy and returns client identification to API Gateway. API Gateway then performs request validation and message enrichment before forwarding the events to EventBridge.

EventBridge evaluates the events against rules and forwards the events to targets. The rules apply transformation to the events and forward an event to up to five targets. Targets include AWS services, such as Amazon Kinesis Data Firehose, and many third-party solutions, such as Zendesk, with HTTPS endpoints.

Lastly, Kinesis Data Firehose provides a cost-effective solution to store events into an Amazon S3 bucket. Before storing the events, Kinesis Data Firehose transforms records via Lambda transformers. It also partitions records using data in the record or calculated data via a Lambda function. Kinesis Data Firehose uses this partitioning data to create keys in the bucket and store matching records within the keys.

Example architecture

The example consists of the following resources defined in the AWS SAM template:

An API Gateway instance to receive the messages.
A Lambda authorizer to validate requests.
An EventBridge event bus to receive events.
An EventBridge rule to forward all events to Kinesis Data Firehose.
An EventBridge rule to forward specific events to Zendesk.
An EventBridge API destination to connect to your Zendesk.
A Kinesis Data Firehose to transform, partition, and store events in an S3 bucket.
A Lambda Kinesis Data Firehose data transformation.
An S3 bucket to store event data.

Data flow

Application clients collect or generate the events.
The client sends the events to API Gateway as URL-encoded JSON. The client includes the user’s JWT in an authorization header with the request for validation.
The Lambda authorizer validates the JWT with Amazon Cognito and returns the user’s unique clientID value to API Gateway.
API Gateway transforms the request into events, appending clientId, the bus name, and environment.
API Gateway sends the events to EventBridge.
EventBridge rules match the events and:
1. Forward all client events to Kinesis Data Firehose.
2. Forward client events with detail.eventType of “loyaltypurchase” to Zendesk.
Kinesis Data Firehose receives the records.
The Kinesis Data Firehose data transformation processes each record, moving the client ID to the detail object.
Kinesis Data Firehose partitions the records and stores them in an S3 bucket.

Overall design

The following sections discuss details of the solution, starting from the event in a web or mobile client. This solution requires the client to create an HTTPS request, including the user’s JWT as an authorization header.

{"entries": [{"entry": "{\"eventType\": \"searching\", \"schemaVersion\":1, \"data\": {\"searchTerm\":\"games\"}}"}]}

The preceding JSON shows a sample request body for this solution. The top-level item “entries” is an array of “entry” items. API Gateway will translate each “entry” to the event-detail field in EventBridge events. The client must escape the data for “entry” to prevent translation errors.
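A client could produce this doubly encoded body as in the following Python sketch. The helper name build_request_body is hypothetical; the only behavior taken from the post is that each “entry” value is itself a JSON string.

```python
import json


def build_request_body(events: list) -> str:
    """Serialize each event to a JSON string, then wrap the strings in "entries"."""
    # json.dumps on the inner event performs the escaping the API expects.
    entries = [{"entry": json.dumps(event)} for event in events]
    return json.dumps({"entries": entries})


body = build_request_body([
    {"eventType": "searching", "schemaVersion": 1, "data": {"searchTerm": "games"}}
])
print(body)

# Decoding reverses the two layers of encoding.
decoded = json.loads(json.loads(body)["entries"][0]["entry"])
```

Letting the JSON library escape the inner document avoids hand-written backslashes, which are an easy source of translation errors.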

API Gateway and Lambda authorizer

API Gateway receives the request and validates the JWT by invoking the Lambda authorizer. The authorizer generates a policy allowing the request for valid tokens. It adds the Amazon Cognito “custom:clientId” custom attribute to the response context before returning the response to API Gateway. The “custom:clientId” attribute is a unique client identifier in the form of a UUID that downstream systems can use to retrieve data about the customer.
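The shape of an authorizer response can be sketched as follows. This Python example omits token validation entirely and only builds the policy and context; the function name, the context key clientId, and the sample ARN are assumptions, not the code from the repository.

```python
def build_authorizer_response(principal_id: str, client_id: str,
                              method_arn: str) -> dict:
    """Build an allow policy plus context for an API Gateway Lambda authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": "Allow",
                    "Resource": method_arn,
                }
            ],
        },
        # Context values are made available to API Gateway mapping templates.
        "context": {"clientId": client_id},
    }


# Hypothetical values for illustration only.
response = build_authorizer_response(
    principal_id="example-user",
    client_id="00000000-0000-0000-0000-000000000000",
    method_arn="arn:aws:execute-api:us-east-1:123456789012:exampleapi/dev/POST/events",
)
```

For an invalid token, an authorizer would instead return a policy with "Effect" set to "Deny" or reject the request.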

API Gateway validates the request by matching the request body against a model. Models represent what a request should look like. A mapping template then transforms valid requests to the format required by EventBridge. Mapping templates use velocity templating language (VTL) to do this.

This mapping template uses a #foreach loop to process the array “entries” from the request body. The process enriches each event with the user’s “custom:clientId” and stage variables for bus name and environment from API Gateway.

The preceding API Gateway AWS integration enables API Gateway to send the events to EventBridge without using compute services, such as Lambda or Amazon EC2. The integration and IAM execution role enable API Gateway to call the EventBridge PutEvents API to do this.
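The enrichment that the mapping template performs can be approximated in Python as follows. This is a sketch of the idea, not a translation of the actual VTL: the Source, DetailType, and Resources values, and where the client ID and environment are placed, are assumptions. Only the PutEvents entry field names come from the EventBridge API.

```python
import json


def to_put_events_entries(body: dict, client_id: str, bus_name: str,
                          environment: str) -> list:
    """Build one PutEvents entry per item in the request's "entries" array."""
    entries = []
    for item in body["entries"]:
        entries.append({
            "EventBusName": bus_name,
            # Assumption: the environment is carried in Source.
            "Source": f"clientevents.{environment}",
            # Assumption: a fixed, hypothetical detail type.
            "DetailType": "clientevent",
            # Assumption: the client ID travels in Resources.
            "Resources": [client_id],
            # Detail must be a JSON string, which "entry" already is.
            "Detail": item["entry"],
        })
    return entries


request_body = {"entries": [{"entry": json.dumps({"eventType": "searching"})}]}
put_events_entries = to_put_events_entries(
    request_body, client_id="example-client-id",
    bus_name="example-bus", environment="dev")
```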

EventBridge rules and transformations

EventBridge rules match events against criteria, transform the events, and forward the events to targets. There are two rules in this example. One processes events for Zendesk tickets and the other forwards data to Kinesis Data Firehose to store events for triage and analytics.

This example creates service tickets in the Zendesk ticketing system. The tickets trigger agents to contact customers who are expecting a call to complete their purchases. By sending the event directly, the software client reduces time-to-action for back-office processes and helps improve customer satisfaction.

This rule matches client event messages for loyalty purchases and forwards details to the Zendesk API. The rule includes a transformation, which selects a portion of the event before sending the information to the target.
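The matching step can be illustrated with a small Python function that handles exact-value matching on nested fields. This covers only a simplified subset of EventBridge event patterns, and the pattern shown is an assumption based on the detail.eventType value described earlier, not the rule from the template.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Return True if every field in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of allowed values.
            return False
    return True


loyalty_pattern = {"detail": {"eventType": ["loyaltypurchase"]}}
loyalty_event = {"source": "example", "detail": {"eventType": "loyaltypurchase"}}
search_event = {"source": "example", "detail": {"eventType": "searching"}}

print(matches(loyalty_pattern, loyalty_event), matches(loyalty_pattern, search_event))
```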

EventBridge uses an API destination to store details about the HTTP endpoint and usage policies. Additionally, an EventBridge connection and an AWS Secrets Manager secret store details. These include the authentication policy and authentication credentials to connect to the API destination.

Successfully processed events open tickets in Zendesk using the API destination. Agents now have a list of customers to contact.

Enterprises often require storing the events for troubleshooting or analytics. EventBridge does not include a newline between records when forwarding events to Kinesis Data Firehose. Because of this, it may be more challenging to discern each record when analyzing the data.

A rule for all client events changes this behavior. This AWS CloudFormation snippet defines the rule that will transform each event, adding a new line after each. The “\n” character in the InputTemplate field adds the separator between records before forwarding the data to Kinesis Data Firehose.

Afterward, Kinesis Data Firehose receives each record separated by a new line, enabling both triage and analytics without extra overhead.
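The effect of the separator is easy to see in a short Python sketch. The sample events are made up for illustration.

```python
import json

events = [{"eventType": "searching"}, {"eventType": "loyaltypurchase"}]

# Without a separator, the records run together on a single line.
concatenated = "".join(json.dumps(e) for e in events)

# With a trailing "\n" after each record, every line is one JSON document.
newline_delimited = "".join(json.dumps(e) + "\n" for e in events)
parsed = [json.loads(line) for line in newline_delimited.splitlines()]

print(len(concatenated.splitlines()), len(parsed))
```

Newline-delimited JSON can be read line by line, which is the layout many analytics tools expect.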

Kinesis Data Firehose to S3

Kinesis Data Firehose is a cost-effective way to batch and write records to S3. It offers optional transformation capabilities by invoking a Lambda function. This example uses a Lambda function that moves the “clientID” field to the detail section of the event record.
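A transformation function of this kind could look like the following Python sketch. It follows the Kinesis Data Firehose transformation contract (base64-encoded data in, recordId, result, and data out), but the location of the client ID in the incoming record (a top-level clientId key here) is an assumption, and this is not the function from the repository.

```python
import base64
import json


def lambda_handler(event, context):
    """Move a top-level client ID into the "detail" object of each record."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        # Assumption: the client ID arrives as a top-level "clientId" key.
        client_id = payload.pop("clientId", None)
        if client_id is not None:
            payload.setdefault("detail", {})["clientId"] = client_id
        # Re-append the newline so records stay separated in S3.
        encoded = base64.b64encode(
            (json.dumps(payload) + "\n").encode("utf-8")).decode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": encoded,
        })
    return {"records": output}
```

Each returned record must reuse the incoming recordId so that Kinesis Data Firehose can pair the transformed data with the original record.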

Kinesis Data Firehose also supports dynamic partitioning of records when writing to S3. It selects data from the records or data calculated by a Lambda function. In this example, it selects data from the records to store data in separate folders in S3.

Event durability considerations

You can extend this example using an EventBridge archive and Amazon Kinesis Data Streams. Archiving allows you to create an encrypted archive of matching events. You can define the data retention in days, from one through indefinite. You can replay events from your archive when you must re-process data.

Kinesis Data Streams is a serverless data streaming solution. The EventBridge rule for all records can forward data to Kinesis Data Streams instead of Kinesis Data Firehose. Multiple applications can consume the Kinesis Data Streams. Kinesis Data Firehose would consume this stream of data and store it in S3.

Prerequisites

You need the following prerequisites to deploy the example solution:

AWS account
AWS CLI
AWS Serverless Application Model (AWS SAM) CLI
Python 3.9
An AWS Identity and Access Management (IAM) role with appropriate access.
A Zendesk trial account
A Zendesk API key

Implementation

The full source of the solution is in the GitHub repository and is deployed with AWS SAM.

Create a Secrets Manager secret using the AWS CLI:
aws secretsmanager create-secret --name proto/Zendesk --secret-string '{"username":"<YOUR EMAIL>","apiKey":"<YOUR APIKEY>"}'
Clone the solution repository using git:
git clone https://github.com/aws-samples/client-event-sample
Build the AWS SAM project:
sam build --use-container
Deploy the project using AWS SAM:
sam deploy --guided --capabilities CAPABILITY_NAMED_IAM

From the outputs from the deployment, set the following shell variables:

APPCLIENTID=<output APPCLIENTID>
APIID=<output APIID>
REGION=<region you deployed to>

Create a user in Amazon Cognito using the AWS CLI:
aws cognito-idp sign-up --client-id $APPCLIENTID --username <YOUR USER ID> --password <YOUR PASSWORD> --user-attributes Name=email,Value=<YOUR EMAIL>
After you receive the confirmation code, confirm the user using the AWS CLI:
aws cognito-idp confirm-sign-up --client-id $APPCLIENTID --username <userid> --confirmation-code <confirmation code>
Test the user login with the AWS CLI:
aws cognito-idp initiate-auth --auth-flow USER_PASSWORD_AUTH --client-id $APPCLIENTID --auth-parameters USERNAME=<YOUR USER ID>,PASSWORD=<YOUR PASSWORD>

If successful, this returns a JSON web token (JWT).

Testing the client event solution

The sample repository includes an event generator in the util directory. The generator uses your credentials and simulates events from a user’s software client. From the utils directory, run the generator:
python3 generator.py --minutes <minutes to run generator> --batch <batch size from 1-10> --errors <True|False> --userid <YOUR USER ID> --password <YOUR PASSWORD> --region $REGION --appclientid $APPCLIENTID --apiid $APIID
Log in to your Zendesk console and view the created tickets.
After five minutes, review the “clientevents” bucket to view the event records.

Cleaning up

To remove the example:

Delete the data stored in the clientevents buckets created from the template.
Delete the stack using the command:
sam delete --stack-name clientevents
Delete the secret using the command:
aws secretsmanager delete-secret --secret-id <arn of secret>

Conclusion

This post shows how to send client events to an API and EventBridge to enable new customer experiences. The example covers enabling new experiences by creating a way for software clients to send events with minimal custom code. This blueprint shows how you can include client events in your solution, featuring validation, enrichment, transformation, and storage.

You can modify the example code provided here for use in your organization. This enables your client software to register events without modifying backend code.

For more serverless learning resources, visit Serverless Land.

Mocking service integrations with AWS Step Functions Local

2022-01-31 Benjamin Smith

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/mocking-service-integrations-with-aws-step-functions-local/

This post is written by Sam Dengler, Principal Specialist Solutions Architect, and Dhiraj Mahapatro, Senior Specialist Solutions Architect.

AWS Step Functions now supports over 200 AWS Service integrations via AWS SDK Integration. Developers want to build and test control flow logic for workflows using branching logic, error handling, and retries. This allows for precise workflow execution with deterministic results. Additionally, developers use Step Functions’ input and output processing features to transform data as it enters and exits tasks.

Developers can test their state machines locally using Step Functions Local before deploying them to an AWS account. However, state machines that use service integrations like AWS Lambda, Amazon SQS, or Amazon SNS require Step Functions Local to perform calls to AWS service endpoints. Often, developers want to test the control and data flows of their state machine executions in isolation, without any dependency on service integration availability.

Today, AWS is releasing Mocked Service Integrations for Step Functions Local. This allows developers to define sample outputs from AWS service integrations. You can combine them into test case scenarios to validate workflow control and data flow definitions. You can find the code used in this post in the Step Functions examples GitHub repository.

Sales lead generation sample workflow

In this example, new sales leads are created in a customer relationship management system. This triggers the sample workflow execution using input data, which provides information about the contact.

Using the sales lead data, the workflow first validates the contact’s identity and address. If valid, it uses Step Functions’ AWS SDK integration for Amazon Comprehend to call the DetectSentiment API. It uses the sales lead’s comments as input for sentiment analysis.

If the comments have a positive sentiment, it adds the sales lead’s information to a DynamoDB table for follow-up. The event is published to Amazon EventBridge to notify subscribers.

If the sales lead data is invalid or a negative sentiment is detected, it publishes events to EventBridge for notification. No record is added to the Amazon DynamoDB table. The following Step Functions Workflow Studio diagram shows the control logic:

The full workflow definition is available in the code repository. Note the workflow task names in the diagram, such as DetectSentiment, which are important when defining the mocked responses.

Sentiment analysis test case

In this example, you test a scenario in which:

The identity and address are successfully validated using a Lambda function.
A positive sentiment is detected using the Comprehend.DetectSentiment API after three retries.
A contact item is written to a DynamoDB table successfully.
An event is published to an EventBridge event bus successfully.

The execution path for this test scenario is shown in the following diagram (the red and green numbers have been added). 0 represents the first execution; 1, 2, and 3 represent the max retry attempts (MaxAttempts), in case of an InternalServerException.

Mocked response configuration

To use service integration mocking, create a mock configuration file with sections specifying mock AWS service responses. These are grouped into test cases that can be activated when executing state machines locally. The following example provides code snippets and the full mock configuration is available in the code repository.

To mock a successful Lambda function invocation, define a mock response that conforms to the Lambda.Invoke API response elements. Associate it to the first request attempt:

"CheckIdentityLambdaMockedSuccess": {
  "0": {
    "Return": {
      "StatusCode": 200,
      "Payload": {
        "statusCode": 200,
        "body": "{\"approved\":true,\"message\":\"identity validation passed\"}"
      }
    }
  }
}

To mock the DetectSentiment retry behavior, define failure and successful mock responses that conform to the Comprehend.DetectSentiment API call. Associate the failure mocks to three request attempts, and associate the successful mock to the fourth attempt:

"DetectSentimentRetryOnErrorWithSuccess": {
  "0-2": {
    "Throw": {
      "Error": "InternalServerException",
      "Cause": "Server Exception while calling DetectSentiment API in Comprehend Service"
    }
  },
  "3": {
    "Return": {
      "Sentiment": "POSITIVE",
      "SentimentScore": {
        "Mixed": 0.00012647535,
        "Negative": 0.00008031699,
        "Neutral": 0.0051454515,
        "Positive": 0.9946478
      }
    }
  }
}

Note that Step Functions Local does not validate the structure of the mocked responses. Ensure that your mocked responses conform to actual responses before testing. To review the structure of service responses, either perform the actual service calls using Step Functions or view the documentation for those services.
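To make the attempt keys concrete, the following Python sketch resolves which mocked outcome applies to a given attempt number, treating a key such as "0-2" as an inclusive range. It illustrates how the configuration reads, not how Step Functions Local is implemented; the function name resolve_mock is hypothetical.

```python
def resolve_mock(mocked_response: dict, attempt: int) -> dict:
    """Return the mocked outcome whose key covers the given attempt number."""
    for key, outcome in mocked_response.items():
        first, _, last = key.partition("-")
        low = int(first)
        # A key without a dash covers a single attempt.
        high = int(last) if last else low
        if low <= attempt <= high:
            return outcome
    raise KeyError(f"no mocked response for attempt {attempt}")


detect_sentiment_mock = {
    "0-2": {"Throw": {"Error": "InternalServerException",
                      "Cause": "Server Exception while calling DetectSentiment API in Comprehend Service"}},
    "3": {"Return": {"Sentiment": "POSITIVE"}},
}

outcomes = [next(iter(resolve_mock(detect_sentiment_mock, n))) for n in range(4)]
print(outcomes)
```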

Next, associate the mocked responses to a test case identifier:

"RetryOnServiceExceptionTest": {
  "Check Identity": "CheckIdentityLambdaMockedSuccess",
  "Check Address": "CheckAddressLambdaMockedSuccess",
  "DetectSentiment": "DetectSentimentRetryOnErrorWithSuccess",
  "Add to FollowUp": "AddToFollowUpSuccess",
  "CustomerAddedToFollowup": "CustomerAddedToFollowupSuccess"
}

With the test case and mock responses configured, you can use them for testing with Step Functions Local.
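Because Step Functions Local does not validate the configuration, a small pre-flight check can catch typos in response names. The Python sketch below assumes the mock configuration file has top-level StateMachines and MockedResponses sections, with test cases nested under TestCases; verify this layout against the full file in the code repository. The function name is hypothetical.

```python
def unresolved_references(config: dict) -> list:
    """List (state machine, test case, state, name) tuples with no matching mock."""
    defined = set(config.get("MockedResponses", {}))
    missing = []
    for machine, spec in config.get("StateMachines", {}).items():
        for test_case, states in spec.get("TestCases", {}).items():
            for state, response_name in states.items():
                if response_name not in defined:
                    missing.append((machine, test_case, state, response_name))
    return missing


# Abbreviated example: the second reference is deliberately left undefined.
sample_config = {
    "StateMachines": {
        "LeadGenerationStateMachine": {
            "TestCases": {
                "RetryOnServiceExceptionTest": {
                    "Check Identity": "CheckIdentityLambdaMockedSuccess",
                    "DetectSentiment": "DetectSentimentRetryOnErrorWithSuccess",
                }
            }
        }
    },
    "MockedResponses": {
        "CheckIdentityLambdaMockedSuccess": {"0": {"Return": {"StatusCode": 200}}},
    },
}

problems = unresolved_references(sample_config)
print(problems)
```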

Test case execution using Step Functions Local

The Step Functions Developer Guide describes the steps used to set up Step Functions Local on your workstation and create a state machine.

After these steps are complete, you can run a workflow locally using the start-execution AWS CLI command. Activate the mocked responses by appending a pound sign and the test case identifier to the state machine ARN:

aws stepfunctions start-execution \
  --endpoint http://localhost:8083 \
  --state-machine arn:aws:states:us-east-1:123456789012:stateMachine:LeadGenerationStateMachine#RetryOnServiceExceptionTest \
  --input file://events/sfn_valid_input.json

Test case validation

To validate that the workflow executed correctly in the test case, examine the state machine execution events using the StepFunctions.GetExecutionHistory API. This ensures that the correct states are used. There are a variety of validation tools available. This post shows how to achieve this using the AWS CLI filtering feature with JMESPath syntax.

In this test case, you validate the TaskFailed and TaskSucceeded events match the retry definition for the DetectSentiment task, which specifies three retries. Use the following AWS CLI command to get the execution history and filter on the execution events:

aws stepfunctions get-execution-history \
  --endpoint http://localhost:8083 \
  --execution-arn <ExecutionArn> \
  --query 'events[?(type==`TaskFailed` && contains(taskFailedEventDetails.cause, `Server Exception while calling DetectSentiment API in Comprehend Service`)) || (type==`TaskSucceeded` && taskSucceededEventDetails.resource==`comprehend:detectSentiment`)]'

The results include matching events:

{
  "timestamp": "2022-01-13T17:24:32.276000-05:00",
  "type": "TaskFailed",
  "id": 19,
  "previousEventId": 18,
  "taskFailedEventDetails": {
    "error": "InternalServerException",
    "cause": "Server Exception while calling DetectSentiment API in Comprehend Service"
  }
}

These results should be compared to the test acceptance criteria to verify the execution behavior. Test cases, acceptance criteria, and validation expressions vary by customer and use case. These techniques are flexible to accommodate various happy path and error scenarios. To explore additional sample test cases and examples, visit the example code repository.
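One way to compare the filtered events with the acceptance criteria is to count them by type, as in this Python sketch. The event list is abbreviated, illustrative data shaped like the CLI output, not results from a real run, and the function name is hypothetical.

```python
from collections import Counter


def count_task_events(events: list) -> Counter:
    """Count filtered execution history events by their type."""
    return Counter(event["type"] for event in events)


# Illustrative, abbreviated events: three failed attempts, then one success.
filtered_events = [
    {"id": 1, "type": "TaskFailed"},
    {"id": 2, "type": "TaskFailed"},
    {"id": 3, "type": "TaskFailed"},
    {"id": 4, "type": "TaskSucceeded"},
]

counts = count_task_events(filtered_events)

# Acceptance criteria for this test case: three retries before success.
meets_criteria = counts["TaskFailed"] == 3 and counts["TaskSucceeded"] == 1
print(meets_criteria)
```

In a test suite, the event list would come from parsing the JSON returned by the get-execution-history command.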

Conclusion

This post introduces a new robust way to test AWS Step Functions state machines in isolation. With mocking, developers get more control over the type of scenarios that a state machine can handle, leading to assertion of multiple behaviors. Testing a state machine with mocks can also be part of the software release. Asserting on behaviors like error handling, branching, parallel, dynamic parallel (map state) helps test the entire state machine’s behavior. For any new behavior in the state machine, such as a new type of exception from a state, you can mock and add as a test.

See the Step Functions Developer Guide for more information on service mocking with Step Functions Local. The sample application covers basic scenarios of testing a state machine. You can use a similar approach for complex scenarios including other Step Functions flows, like map and wait.

For more serverless learning resources, visit Serverless Land.

Using the circuit breaker pattern with AWS Step Functions and Amazon DynamoDB

2022-01-31 Eric Johnson

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/using-the-circuit-breaker-pattern-with-aws-step-functions-and-amazon-dynamodb/

This post is written by Anitha Deenadayalan, Developer Specialist SA, DevAx.

Modern applications use microservices as an architectural and organizational approach to software development, where the application comprises small independent services that communicate over well-defined APIs.

When multiple microservices collaborate to handle requests, one or more services may become unavailable or exhibit a high latency. Microservices communicate through remote procedure calls, and it is always possible that transient errors could occur in the network connectivity, causing failures.

This can degrade performance across the entire application during synchronous execution, because cascading timeouts or failures cause a poor user experience. When complex applications use microservices, an outage in one microservice can lead to application failure. This post shows how to use the circuit breaker design pattern to help with a graceful service degradation.

Introducing circuit breakers

Michael Nygard popularized the circuit breaker pattern in his book, Release It. This design pattern can prevent a caller service from retrying a call to a callee service that has previously caused repeated timeouts or failures. It can also detect when the callee service is functional again.

Fallacies of distributed computing are a set of assertions made by Peter Deutsch and others at Sun Microsystems. They say that programmers new to distributed applications invariably make false assumptions. Assumptions about network reliability, zero latency, and bandwidth result in software applications written with minimal error handling for network errors.

During a network outage, applications may indefinitely wait for a reply and continually consume application resources. Failure to retry the operations when the network becomes available can also lead to application degradation. If API calls to a database or an external service time-out due to network issues, repeated calls with no circuit breaker can affect cost and performance.

The circuit breaker pattern

In the circuit breaker pattern, a circuit breaker object routes the calls from the caller to the callee. For example, in an ecommerce application, the order service can call the payment service to collect the payments. When there are no failures, the order service routes all calls to the payment service through the circuit breaker:

Circuit breaker with no failures

If the payment service times out, the circuit breaker can detect the timeout and track the failure. If the timeouts exceed a specified threshold, the application opens the circuit:

Circuit breaker with payment service failure

Once the circuit is open, the circuit breaker object does not route the calls to the payment service. It returns an immediate failure when the order service calls the payment service:

Circuit breaker stops routing to payment service

The circuit breaker object periodically tries to see if the calls to the payment service are successful:

Circuit breaker retries payment service

When the call to payment service succeeds, the circuit is closed, and all further calls are routed to the payment service again:

Circuit breaker with working payment service again
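The behavior described in this section can be sketched as a small in-memory Python class. This is a generic illustration of the pattern, not the Step Functions implementation that follows; the class name, the failure threshold, and the reset timeout are assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=20.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open circuit: fail immediately without calling the callee.
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # Open (or re-open) the circuit on a failed trial or at the threshold.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The order service would wrap each payment call, for example breaker.call(collect_payment, order), where collect_payment is a hypothetical function, and treat CircuitOpenError as a signal to degrade gracefully.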

Architecture overview

This example uses AWS Step Functions, AWS Lambda, and Amazon DynamoDB to implement the circuit breaker pattern:

Circuit breaker architecture

The Step Functions workflow provides circuit breaker capabilities. When a service wants to call another service, it starts the workflow with the name of the callee service.

The workflow gets the circuit status from the CircuitStatus DynamoDB table, which stores the currently degraded services. If the CircuitStatus table contains a record for the service called, then the circuit is open. The Step Functions workflow returns an immediate failure and exits with a FAIL state.

If the CircuitStatus table does not contain an item for the called service, then the service is operational. The ExecuteLambda step in the state machine definition invokes the Lambda function sent through a parameter value. The Step Functions workflow exits with a SUCCESS state, if the call succeeds.

The items in the DynamoDB table have the following attributes:

DynamoDB items list

If the service call fails or a timeout occurs, the application retries with exponential backoff for a defined number of times. If the service call fails after the retries, the workflow inserts a record in the CircuitStatus table for the service with the CircuitStatus as OPEN, and the workflow exits with a FAIL state. Subsequent calls to the same service return an immediate failure as long as the circuit is open.

The workflow inserts the item with an associated time-to-live (TTL) value so that the item expires at the defined time, which ensures that connection retries eventually happen. DynamoDB’s time to live (TTL) allows you to define a per-item timestamp to determine when an item is no longer needed. Shortly after the date and time of the specified timestamp, DynamoDB deletes the item from your table without consuming write throughput.

For example, if you set the TTL value to 60 seconds to check a service status after a minute, DynamoDB deletes the item from the table after 60 seconds. The workflow invokes the service to check for availability when a new call comes in after the item has expired.
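The workflow logic described above can be modeled in plain Python, with a dictionary standing in for the CircuitStatus table. This is a sketch under assumptions, not the state machine definition: the function name and retry count are hypothetical, and the backoff sleep between retries is omitted. Comparing ExpireTimeStamp in code also covers any window before an expired item is physically removed.

```python
import time


def run_through_circuit_breaker(table: dict, service: str, call,
                                ttl_seconds: int = 60, max_attempts: int = 3,
                                now=time.time) -> str:
    """Return "SUCCESS" or "FAIL", mirroring the workflow's two exit states."""
    item = table.get(service)
    if item is not None and item["ExpireTimeStamp"] > now():
        # An unexpired item means the circuit is open: fail immediately.
        return "FAIL"
    for _ in range(max_attempts):
        try:
            call()
            return "SUCCESS"
        except Exception:
            continue  # a real workflow waits with exponential backoff here
    # All attempts failed: open the circuit until the TTL timestamp passes.
    table[service] = {
        "CircuitStatus": "OPEN",
        "ExpireTimeStamp": int(now()) + ttl_seconds,
    }
    return "FAIL"
```

Once the timestamp has passed, the next call reaches the service again, which is how the circuit gets the chance to close.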

Circuit breaker Step Function

Prerequisites

For this walkthrough, you need:

An AWS account and an AWS user with AdministratorAccess (see the instructions on the AWS Identity and Access Management (IAM) console)
Access to the following AWS services: AWS Lambda, AWS Step Functions, and Amazon DynamoDB.
AWS SAM CLI, installed using the instructions here.
.NET Core 3.1 SDK installed
JetBrains Rider or Microsoft Visual Studio 2017 or later (or Visual Studio Code)

Setting up the environment

Use the .NET Core 3.1 code in the GitHub repository and the AWS SAM template to create the AWS resources for this walkthrough. These include IAM roles, DynamoDB table, the Step Functions workflow, and Lambda functions.

You need an AWS access key ID and secret access key to configure the AWS Command Line Interface (AWS CLI). To learn more about configuring the AWS CLI, follow these instructions.
Clone the repo:
git clone https://github.com/aws-samples/circuit-breaker-netcore-blog
After cloning, this is the folder structure:

Project file structure

Deploy using Serverless Application Model (AWS SAM)

The AWS Serverless Application Model (AWS SAM) CLI provides developers with a local tool for managing serverless applications on AWS.

The sam build command processes your AWS SAM template file, application code, and applicable language-specific files and dependencies. The command copies build artifacts in the format and location expected for subsequent steps in your workflow. Run these commands to process the template file:
```
cd circuit-breaker
sam build
```
After you build the application, deploy it using the sam deploy command. AWS SAM deploys the application to AWS and displays the output in the terminal.
```
sam deploy --guided
```
Output from sam deploy
You can also view the output in the AWS CloudFormation console.

Output in CloudFormation console
The Step Functions workflow provides the circuit-breaker function. Refer to the circuitbreaker.asl.json file in the statemachine folder for the state machine definition in the Amazon States Language (ASL).

To deploy with the CDK, refer to the GitHub page.

Running the service through the circuit breaker

To provide circuit breaker capabilities to the Lambda microservice, you must send the name or function ARN of the Lambda function to the Step Functions workflow:

{
  "TargetLambda": "<Name or ARN of the Lambda function>"
}

Successful run

To simulate a successful run, use the HelloWorld Lambda function provided by passing the name or ARN of the Lambda function the stack has created. Your input appears as follows:

{
  "TargetLambda": "circuit-breaker-stack-HelloWorldFunction-pP1HNkJGugQz"
}

During the successful run, the Get Circuit Status step checks the circuit status against the DynamoDB table. Suppose that the circuit is CLOSED, which is indicated by zero records for that service in the DynamoDB table. In that case, the Execute Lambda step runs the Lambda function and exits the workflow successfully.

Step Function with closed circuit

Service timeout

To simulate a timeout, use the TestCircuitBreaker Lambda function by passing the name or ARN of the Lambda function the stack has created. Your input appears as:

{
  "TargetLambda": "circuit-breaker-stack-TestCircuitBreakerFunction-mKeyyJq4BjQ7"
}

Again, the circuit status is checked against the DynamoDB table by the Get Circuit Status step in the workflow. The circuit is CLOSED during the first pass, and the Execute Lambda step runs the Lambda function, which times out.

The workflow retries based on the retry count and the exponential backoff values, and finally returns a timeout error. It then runs the Update Circuit Status step, which inserts a record in the DynamoDB table for that service with a predefined time-to-live value specified by the TTL attribute ExpireTimeStamp.

Step Function with open circuit

Repeat timeout

As long as there is an item for the service in the DynamoDB table, the circuit breaker workflow returns an immediate failure to the calling service. When you re-execute the call to the Step Functions workflow for the TestCircuitBreaker Lambda function within 20 seconds, the circuit is still open. The workflow immediately fails, ensuring the stability of the overall application performance.

Step Function workflow immediately fails until retry

The item in the DynamoDB table expires after 20 seconds, and the workflow retries the service again. This time, the workflow retries with exponential backoffs, and if it succeeds, the workflow exits successfully.
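
The three runs above can be summarized in a few lines of Python. This is a minimal sketch of the pattern, not the Step Functions state machine from the repository: the circuit_table dictionary stands in for the DynamoDB table, and the function name, parameters, and status tuples are hypothetical choices made for illustration.

```python
import time

# In-memory stand-in for the DynamoDB table: service name -> expiry timestamp.
# (Assumption: the real workflow stores an item with the ExpireTimeStamp TTL attribute.)
circuit_table = {}


def run_through_circuit_breaker(service, func, retries=3, base_delay=0.0,
                                open_seconds=20, now=time.time, sleep=time.sleep):
    """Call func() for a service unless its circuit is currently open."""
    # Get Circuit Status: an unexpired item means the circuit is OPEN.
    expires_at = circuit_table.get(service)
    if expires_at is not None:
        if expires_at > now():
            return ("FAILED", "CircuitOpen")
        del circuit_table[service]  # mimic the TTL expiry of the item

    # Execute Lambda: retry on timeout with exponential backoff.
    for attempt in range(retries):
        try:
            return ("SUCCEEDED", func())
        except TimeoutError:
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)

    # Update Circuit Status: open the circuit for open_seconds, then fail.
    circuit_table[service] = now() + open_seconds
    return ("FAILED", "Timeout")
```

Passing now and sleep as parameters keeps the sketch easy to exercise with a fake clock instead of waiting for the 20-second window to pass.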

Cleaning up

To avoid incurring additional charges, clean up all the created resources. Run the following command from a terminal window. This command deletes the created resources that are part of this example.

sam delete --stack-name circuit-breaker-stack --region <region name>

Conclusion

This post showed how to implement the circuit breaker pattern using Step Functions, Lambda, DynamoDB, and .NET Core 3.1. This pattern can help prevent system degradation in service failures or timeouts. Step Functions and the TTL feature of DynamoDB can make it easier to implement the circuit breaker capabilities.

To learn more about developing microservices on AWS, refer to the whitepaper on microservices. To learn more about serverless and AWS SAM, visit the Sessions with SAM series and find more resources at Serverless Land.

Codacy Measures Developer Productivity using AWS Serverless

2022-01-27 Catarina Gralha

Post Syndicated from Catarina Gralha original https://aws.amazon.com/blogs/architecture/codacy-measures-developer-productivity-using-aws-serverless/

Codacy is a DevOps insights company based in Lisbon, Portugal. Since its launch in 2012, Codacy has helped software development and engineering teams reduce defects, keep technical debt in check, and ship better code, faster.

Codacy’s latest product, Pulse, is a service that helps understand and improve the performance of software engineering teams. This includes measuring metrics such as deployment frequency, lead time for changes, or mean time to recover. Codacy’s main platform is built on top of AWS products like Amazon Elastic Kubernetes Service (EKS), but they have taken Pulse one step further with AWS serverless.

In this post, we explore Pulse’s requirements, architecture, and the services it is built on, including AWS Lambda, Amazon API Gateway, and Amazon DynamoDB.

Pulse prototype requirements

Codacy had three clear requirements for their initial Pulse prototype.

The solution must enable the development team to iterate quickly and have minimal time-to-market (TTM) to validate the idea.
The solution must be easily scalable and match the demands of both startups and large enterprises alike. This was of special importance, as Codacy wanted to onboard Pulse with some of their existing customers. At the time, these customers already had massive amounts of information.
The solution must be cost-effective, particularly during the early stages of the product development.

Enter AWS serverless

Codacy could have built Pulse on top of Amazon EC2 instances. However, this brings the undifferentiated heavy lifting of having to provision, secure, and maintain the instances themselves.

AWS serverless technologies are fully managed services that abstract the complexity of infrastructure maintenance away from developers and operators, so they can focus on building products.

Serverless applications also scale elastically and automatically behind the scenes, so customers don’t need to worry about capacity provisioning. Furthermore, these services are highly available by design and span multiple Availability Zones (AZs) within the Region in which they are deployed. This gives customers higher confidence that their systems will continue running even if one Availability Zone is impaired.

AWS serverless technologies are cost-effective too, as they are billed per unit of value, as opposed to billing per provisioned capacity. For example, billing is calculated by the amount of time a function takes to complete or the number of messages published to a queue, rather than how long an EC2 instance runs. Customers only pay when they are getting value out of the services, for example when serving an actual customer request.

Overview of Pulse’s solution architecture

An event is generated when a developer performs a specific action as part of their day-to-day tasks, such as committing code or merging a pull request. These events are the foundational data that Pulse uses to generate insights and are thus processed by multiple Pulse components called modules.

Let’s take a detailed look at a few of them.

Ingestion module

Figure 1. Pulse ingestion module architecture

Figure 1 shows the ingestion module, which is the entry point of events into the Pulse platform and is built on AWS serverless applications as follows:

The ingestion API is exposed to customers using Amazon API Gateway. This defines REST, HTTP, and WebSocket APIs with sophisticated functionality such as request validation, rate limiting, and more.
The actual business logic of the API is implemented as AWS Lambda functions. Lambda can run custom code in a fully managed way. You only pay for the time that the function takes to run, in 1-millisecond increments. Lambda natively supports multiple languages, but customers can also bring their own runtimes or container images as needed.
API requests are authorized with keys, which are stored in Amazon DynamoDB, a key-value NoSQL database that delivers single-digit millisecond latency at any scale. API Gateway invokes a Lambda function that validates the key against those stored in DynamoDB (this is called a Lambda authorizer).
While API Gateway provides a default domain name for each API, Codacy customizes it with Amazon Route 53, a service that registers domain names and configures DNS records. Route 53 offers a service level agreement (SLA) of 100% availability.
Events are stored in raw format in Pulse’s data lake, which is built on top of AWS’ object storage service, Amazon Simple Storage Service (S3). With Amazon S3, you can store massive amounts of information at low cost using simple HTTP requests. The data is highly available and durable.
Whenever a new event is ingested by the API, a message is published in Pulse’s message bus. (More information later in this post.)
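
To make the Lambda authorizer step more concrete, here is a minimal sketch of a token-based authorizer. It is not Codacy's implementation: the API_KEYS dictionary is a hypothetical stand-in for the DynamoDB lookup, and the key and customer names are invented for the example.

```python
# Hypothetical stand-in for the DynamoDB key table: API key -> customer ID.
API_KEYS = {"example-key-123": "customer-a"}


def build_policy(principal_id, effect, resource):
    """Build the IAM policy document that API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
    }


def lambda_handler(event, context):
    # A TOKEN authorizer receives the caller's token and the ARN of the method.
    key = event.get("authorizationToken", "")
    customer = API_KEYS.get(key)  # a real function would query DynamoDB here
    effect = "Allow" if customer else "Deny"
    return build_policy(customer or "anonymous", effect, event["methodArn"])
```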

Events module

Figure 2. Pulse events module architecture

The events module, shown in Figure 2, handles the aggregation and storage of events for actual consumption by customers:

Events are consumed from the message bus and processed with a Lambda function, which stores them in Amazon Redshift.
Amazon Redshift is AWS’ managed data warehouse, and enables Pulse’s users to get insights and metrics by running analytical (OLAP) queries with the highest performance.
These metrics are exposed to customers via another API (the public API), which is also built on API Gateway.
The business logic for this API is implemented using Lambda functions, like the Ingestion module.

Message bus

Figure 3. Message bus architecture

We mentioned earlier that Pulse’s modules communicate messages with each other via the “message bus.” When something occurs at a specific component, a message (event) is published to the bus. At the same time, developers create subscriptions for each module that should receive these messages. This is known as the publisher/subscriber pattern (pub/sub for short), and is a fundamental piece of event-driven architectures.

With the message bus, you can decouple all modules from each other. In this way, a publisher does not need to worry about how many or who their subscribers are, or what to do if a new one arrives. This is all handled by the message bus.

Pulse’s message bus is built like this, shown in Figure 3:

Events are published via Amazon Simple Notification Service (SNS), using a construct called a topic. Topics are the basic unit of message publication and consumption. Components are subscribed to this topic, and you can filter out unwanted messages.
Developers configure Amazon SNS subscriptions to have the events sent to a queue, which provides a buffering layer from which workers can process messages. At the same time, queues also ensure that messages are not lost if there is an error. In Pulse’s case, these queues are implemented with Amazon Simple Queue Service (SQS).
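
The topic, filter, and queue relationship can be illustrated with a small in-memory model. This is only a sketch of the pub/sub idea, assuming a simplified filter that matches message attributes against lists of allowed values; the Topic class and the queue names are hypothetical and do not use the SNS or SQS APIs.

```python
from collections import deque


class Topic:
    """In-memory model of a topic that fans out to subscribed queues."""

    def __init__(self):
        self._subscriptions = []  # (queue, filter policy) pairs

    def subscribe(self, queue, filter_policy=None):
        # An empty filter policy means the subscriber receives every message.
        self._subscriptions.append((queue, filter_policy or {}))

    def publish(self, body, attributes):
        for queue, policy in self._subscriptions:
            # Deliver only if every filtered attribute has an allowed value.
            if all(attributes.get(name) in allowed for name, allowed in policy.items()):
                queue.append(body)


# The publisher knows only the topic, never the subscribers behind it.
topic = Topic()
commit_queue, audit_queue = deque(), deque()
topic.subscribe(commit_queue, {"type": ["commit"]})
topic.subscribe(audit_queue)
topic.publish("commit abc123", {"type": "commit"})
topic.publish("pull request 42 merged", {"type": "pull_request"})
```

After these two publishes, commit_queue holds only the commit message, while audit_queue holds both.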

Other modules

There are other parts of Pulse architecture that also use AWS serverless. For example, user authentication and sign-up are handled by Amazon Cognito, and Pulse’s frontend application is hosted on Amazon S3. This app is served to customers worldwide with low latency using Amazon CloudFront, a content delivery network.

Summary and next steps

By using AWS serverless, Codacy has been able to reduce the time required to bring Pulse to market by staying focused on developing business logic, rather than managing servers. Furthermore, Codacy is confident they can handle Pulse’s growth, as this serverless architecture will scale automatically according to demand.

Learn more about Serverless on AWS.
Visit Codacy to find out more about Pulse.

Migrating AWS Lambda functions to Arm-based AWS Graviton2 processors

2022-01-24 Julian Wood

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/migrating-aws-lambda-functions-to-arm-based-aws-graviton2-processors/

AWS Lambda now allows you to configure new and existing functions to run on Arm-based AWS Graviton2 processors in addition to x86-based functions. Using this processor architecture option allows you to get up to 34% better price performance. This blog post highlights some considerations when moving from x86 to arm64 as the migration process is code and workload dependent.

Functions using the Arm architecture benefit from the performance and security built into the Graviton2 processor, which is designed to deliver up to 19% better performance for compute-intensive workloads. Workloads using multithreading and multiprocessing, or performing many I/O operations, can experience lower invocation time, which reduces costs.

Duration charges, billed with millisecond granularity, are 20 percent lower when compared to current x86 pricing. This also applies to duration charges when using Provisioned Concurrency. Compute Savings Plans supports Lambda functions powered by Graviton2.
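
As a rough worked example of how the 20 percent lower duration price interacts with function duration, the following sketch compares relative cost. It assumes the same memory configuration on both architectures and ignores per-request charges; the function names are illustrative.

```python
ARM_PRICE_RATIO = 0.8  # arm64 duration price relative to x86 (20 percent lower)


def relative_cost(duration_ratio, price_ratio=ARM_PRICE_RATIO):
    """Cost of the arm64 function relative to x86 (1.0 means equal cost).

    duration_ratio is arm64 duration divided by x86 duration.
    """
    return duration_ratio * price_ratio


def break_even_duration_ratio(price_ratio=ARM_PRICE_RATIO):
    """How much slower arm64 can run before it costs the same as x86."""
    return 1 / price_ratio
```

With these assumptions, a function that runs equally fast on both architectures costs 0.8 times as much on arm64, and arm64 remains cheaper until its duration is 1.25 times the x86 duration.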

The architecture change does not affect the way your functions are invoked or how they communicate their responses back. Integrations with APIs, services, applications, or tools are not affected by the new architecture and continue to work as before.

The following runtimes, which use Amazon Linux 2, are supported on Arm:

Node.js 12 and 14
Python 3.8 and 3.9
Java 8 (java8.al2) and 11
.NET Core 3.1
Ruby 2.7
Custom runtime (provided.al2)

Lambda@Edge does not support Arm as an architecture option.

You can create and manage Lambda functions powered by Graviton2 processor using the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS CloudFormation, AWS Serverless Application Model (AWS SAM), and AWS Cloud Development Kit (AWS CDK). Support is also available through many AWS Lambda Partners.

Understanding Graviton2 processors

AWS Graviton processors are custom built by AWS. Generally, you don’t need to know about the specific Graviton processor architecture, unless your applications can benefit from specific features.

The Graviton2 processor uses the Neoverse-N1 core and supports Arm V8.2 (including CRC and crypto extensions) plus several other architectural extensions. In particular, Graviton2 supports the Large System Extensions (LSE), which improve locking and synchronization performance across large systems.

Migrating x86 Lambda functions to arm64

Many Lambda functions may only need a configuration change to take advantage of the price/performance of Graviton2. Other functions may require repackaging the Lambda function using Arm-specific dependencies, or rebuilding the function binary or container image.

You may not require an Arm processor on your development machine to create Arm-based functions. You can build, test, package, compile, and deploy Arm Lambda functions on x86 machines using AWS SAM and Docker Desktop. If you have an Arm-based system, such as an Apple M1 Mac, you can natively compile binaries.

Functions without architecture-specific dependencies or binaries

If your functions don’t use architecture-specific dependencies or binaries, you can switch from one architecture to the other with a single configuration change. Many functions using interpreted languages such as Node.js and Python, or functions compiled to Java bytecode, can switch without any changes. Ensure you check binaries in dependencies, Lambda layers, and Lambda extensions.

To switch functions from x86 to arm64, you can change the Architecture within the function runtime settings using the Lambda console.

Edit AWS Lambda function Architecture

If you want to display or log the processor architecture from within a Lambda function, you can use language-specific calls, for example, Node.js process.arch or Python platform.machine().
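
For Python, a minimal handler that logs the architecture could look like the following sketch. The handler name and return shape are illustrative.

```python
import platform


def lambda_handler(event, context):
    # On Linux, platform.machine() typically reports "aarch64" on arm64
    # and "x86_64" on x86.
    arch = platform.machine()
    print(f"Running on {arch}")
    return {"architecture": arch}
```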

When using the AWS CLI to create a Lambda function, specify the --architectures option. If you do not specify the architecture, the default value is x86_64. For example, to create an arm64 function, specify --architectures arm64.

aws lambda create-function \
    --function-name MyArmFunction \
    --runtime nodejs14.x \
    --architectures arm64 \
    --memory-size 512 \
    --zip-file fileb://MyArmFunction.zip \
    --handler lambda.handler \
    --role arn:aws:iam::123456789012:role/service-role/MyArmFunction-role

When using AWS SAM or CloudFormation, add or amend the Architectures property within the function configuration.

MyArmFunction:
  Type: AWS::Lambda::Function
  Properties:
    Runtime: nodejs14.x
    Code: src/
    Architectures:
      - arm64
    Handler: lambda.handler
    MemorySize: 512

When initiating an AWS SAM application, you can specify:

sam init --architecture arm64

When building Lambda layers, you can specify CompatibleArchitectures.

MyArmLayer:
  Type: AWS::Lambda::LayerVersion
  Properties:
    ContentUri: layersrc/
    CompatibleArchitectures:
      - arm64

Building function code for Graviton2

If you have dependencies or binaries in your function packages, you must rebuild the function code for the architecture you want to use. Many packages and dependencies have arm64 equivalent versions. Test your own workloads against arm64 packages to see if your workloads are good migration candidates. Not all workloads show improved performance due to the different processor architecture features.

For compiled languages like Rust and Go, you can use the provided.al2 custom runtime, which supports Arm. You provide a binary that communicates with the Lambda Runtime API.

When compiling for Go, set GOARCH to arm64.

GOOS=linux GOARCH=arm64 go build

When compiling for Rust, set the target and, optionally, the target CPU.

RUSTFLAGS="-C target-cpu=neoverse-n1" cargo build --release --target aarch64-unknown-linux-gnu

The default installation of Python pip on some Linux distributions is out of date (<19.3). To install binary wheel packages released for Graviton, upgrade the pip installation using:

sudo python3 -m pip install --upgrade pip

The Arm software ecosystem is continually improving. As a general rule, use later versions of compilers and language runtimes whenever possible. The AWS Graviton Getting Started GitHub repository includes known recent changes to popular packages that improve performance, including ffmpeg, PHP, .Net, PyTorch, and zlib.

You can use https://pkgs.org/ as a package repository search tool.

Sometimes code includes architecture specific optimizations. These can include code optimized in assembly using specific instructions for CRC, or enabling a feature that works well on particular architectures. One way to see if any optimizations are missing for arm64 is to search the code for __x86_64__ ifdefs and see if there is corresponding arm64 code included. If not, consider alternative solutions.
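
One way to automate that search is a short script like the following sketch. It flags C and C++ source files that mention __x86_64__ but never __aarch64__ (the macro that GCC and Clang predefine for arm64). The function name and the list of file suffixes are assumptions, and a flagged file is only a candidate for manual review.

```python
import re
from pathlib import Path

X86_GUARD = re.compile(r"__x86_64__")
ARM_GUARD = re.compile(r"__aarch64__")
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".h", ".hpp"}


def find_x86_only_files(root):
    """Return source files that mention __x86_64__ but never __aarch64__."""
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            text = path.read_text(errors="ignore")
            if X86_GUARD.search(text) and not ARM_GUARD.search(text):
                flagged.append(path)
    return flagged


if __name__ == "__main__":
    for source_file in find_x86_only_files("."):
        print(source_file)
```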

For additional language-specific considerations, see the links within the GitHub repository.

The Graviton performance runbook is a performance profiling reference by the Graviton team to help you benchmark, debug, and optimize application code.

Building function packages as container images

Functions packaged as container images must be built for the architecture (x86 or arm64) they are going to use. There are arm64 architecture versions of the AWS provided base images for Lambda. To specify a container image for arm64, use the arm64 specific image tag, for example, for Node.js 14:

public.ecr.aws/lambda/nodejs:14-arm64
public.ecr.aws/lambda/nodejs:latest-arm64
public.ecr.aws/lambda/nodejs:14.2021.10.01.16-arm64

Arm64 Images are also available from Docker Hub.

You can also use arbitrary Linux base images in addition to the AWS provided Amazon Linux 2 images. Images that support arm64 include Alpine Linux 3.12.7 or later, Debian 10 and 11, Ubuntu 18.04 and 20.04. For more information and details of other supported Linux versions, see Operating systems available for Graviton based instances.

Migrating a function

Here is an example of how to migrate a Lambda function from x86 to arm64 and take advantage of newer software versions to improve price and performance. You can follow a similar approach to test your own code.

I have an existing Lambda function as part of an AWS SAM template configured without an Architectures property, which defaults to x86_64.

  Imagex86Function:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambda_handler
      Runtime: python3.9

The Lambda function code performs some compute intensive image manipulation. The code uses a dependency configured with the following version:

{
  "dependencies": {
    "imagechange": "^1.1.1"
  }
}

I duplicate the Lambda function within the AWS SAM template using the same source code and specify arm64 as the Architectures.

  ImageArm64Function:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambda_handler
      Runtime: python3.9
      Architectures:
        - arm64

I use AWS SAM to build both Lambda functions. I specify the --use-container flag to build each function within its architecture-specific build container.

sam build --use-container

I can use sam local invoke to test the arm64 function locally even on an x86 system.

AWS SAM local invoke

I then use sam deploy to deploy the functions to the AWS Cloud.

The AWS Lambda Power Tuning open-source project runs your functions using different settings to suggest a configuration to minimize costs and maximize performance. The tool allows you to compare two results on the same chart and incorporate arm64-based pricing. This is useful to compare two versions of the same function, one using x86 and the other arm64.

I compare the performance of the x86 and arm64 Lambda functions and see that the arm64 Lambda function is 12% cheaper to run:

Compare x86 and arm64 with dependency version 1.1.1

I then upgrade the package dependency to use version 1.2.1, which has been optimized for arm64 processors.

{
  "dependencies": {
    "imagechange": "^1.2.1"
  }
}

I use sam build and sam deploy to redeploy the updated Lambda functions with the updated dependencies.

I compare the original x86 function with the updated arm64 function. Using arm64 with a newer dependency code version increases the performance by 30% and reduces the cost by 43%.

Compare x86 and arm64 with dependency version 1.2.1

You can use Amazon CloudWatch to view performance metrics such as duration, using statistics. You can then compare average and p99 duration between the two architectures. Due to the Graviton2 architecture, functions may be able to use less memory. This could allow you to right-size function memory configuration, which also reduces costs.

Deploying arm64 functions in production

Once you have confirmed your Lambda function performs successfully on arm64, you can migrate your workloads. You can use function versions and aliases with weighted aliases to control the rollout. Traffic gradually shifts to the arm64 version or rolls back automatically if any specified CloudWatch alarms trigger.

AWS SAM supports gradual Lambda deployments with a feature called Safe Lambda deployments using AWS CodeDeploy. You can compile package binaries for arm64 using a number of CI/CD systems. AWS CodeBuild supports building Arm based applications natively. CircleCI also has Arm compute resource classes for deployment. GitHub Actions allows you to use self-hosted runners. You can also use AWS SAM within GitHub Actions and other CI/CD pipelines to create arm64 artifacts.

Conclusion

Lambda functions using the Arm/Graviton2 architecture provide up to 34 percent price performance improvement. This blog discusses a number of considerations to help you migrate functions to arm64.

Many functions can migrate seamlessly with a configuration change, while others need to be rebuilt to use arm64 packages. I show how to migrate a function and how updating software to newer versions may improve your function performance on arm64. You can test your own functions using the Lambda Power Tuning tool.

Start migrating your Lambda functions to Arm/Graviton2 today.

For more serverless learning resources, visit Serverless Land.

Introducing AWS Lambda batching controls for message broker services

2022-01-20 Julian Wood

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-aws-lambda-batching-controls-for-message-broker-services/

This post is written by Mithun Mallick, Senior Specialist Solutions Architect.

AWS Lambda now supports configuring a maximum batch window for instance-based message broker services to fine tune when Lambda invocations occur. This feature gives you an additional control on batching behavior when processing data. It applies to Amazon Managed Streaming for Apache Kafka (Amazon MSK), self-hosted Apache Kafka, and Amazon MQ for Apache ActiveMQ and RabbitMQ.

Apache Kafka is an open source event streaming platform used to support workloads such as data pipelines and streaming analytics. It is conceptually similar to Amazon Kinesis. Amazon MSK is a fully managed, highly available service that simplifies the setup, scaling, and management of clusters running Kafka.

Amazon MQ is a managed, highly available message broker service for Apache ActiveMQ and RabbitMQ that makes it easier to set up and operate message brokers on AWS. Amazon MQ reduces your operational responsibilities by managing the provisioning, setup, and maintenance of message brokers for you.

Amazon MSK, self-hosted Apache Kafka and Amazon MQ for ActiveMQ and RabbitMQ are all available as event sources for AWS Lambda. You configure an event source mapping to use Lambda to process items from a stream or queue. This allows you to use these message broker services to store messages and asynchronously integrate them with downstream serverless workflows.

In this blog, I explain how message batching works. I show how to use the new maximum batching window control for the managed message broker services and self-managed Apache Kafka.

Understanding batching

For event source mappings, the Lambda service internally polls for new records or messages from the event source, and then synchronously invokes the target Lambda function. Lambda reads the messages in batches and provides these to your function as an event payload. Batching allows higher throughput message processing, up to 10,000 messages in a batch. The payload limit of a single invocation is 6 MB.

Previously, you could only use batch size to configure the maximum number of messages Lambda would poll for. Once a defined batch size is reached, the poller invokes the function with the entire set of messages. This feature is ideal when handling a low volume of messages or batches of data that take time to build up.

Batching window

The new Batch Window control allows you to set the maximum amount of time, in seconds, that Lambda spends gathering records before invoking the function. This brings similar batching functionality that AWS supports with Amazon SQS to Amazon MQ, Amazon MSK and self-managed Apache Kafka. The Lambda event source mapping batching functionality can be described as follows.

Batching controls with Lambda event source mapping

Using MaximumBatchingWindowInSeconds, you can set your function to wait up to 300 seconds for a batch to build before processing it. This allows you to create bigger batches if there are enough messages. You can manage the average number of records processed by the function with each invocation. This increases the efficiency of each invocation, and reduces the frequency.

Setting MaximumBatchingWindowInSeconds to 0 invokes the target Lambda function as soon as the Lambda event source receives a message from the broker.

Message broker batching behavior

For ActiveMQ, the Lambda event source mapping uses the Java Message Service (JMS) API to receive messages. For RabbitMQ, Lambda uses a RabbitMQ client library to get messages from the queue.

The Lambda event source mappings act as a consumer when polling the queue. The batching pattern for all instance-based message broker services is the same. As soon as a message is received, the batching window timer starts. If there are more messages, the consumer makes additional calls to the broker and adds them to a buffer. It keeps a count of the number of messages and the total size of the payload.

The batch is considered complete if the addition of a new message makes the batch size equal to or greater than 6 MB, or the batch window timeout is reached. If the batch size is greater than 6 MB, the last message is returned to the broker.

Lambda then invokes the target Lambda function synchronously and passes on the batch of messages to the function. The Lambda event source continues to poll for more messages and as soon as it retrieves the next message, the batching window starts again. Polling and invocation of the target Lambda function occur in separate processes.
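
The batching rules described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the Lambda poller itself: messages are given with precomputed arrival times, a batch is flushed lazily when the next message arrives rather than by a real timer, and the Message type and build_batches function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

MAX_PAYLOAD_BYTES = 6 * 1024 * 1024  # 6 MB invocation payload limit


@dataclass
class Message:
    arrival: float  # seconds since the start of the simulation
    size: int       # payload size in bytes


def build_batches(messages: List[Message], batch_size: int,
                  window_seconds: float) -> List[List[Message]]:
    """Group messages (sorted by arrival time) into invocation batches."""
    batches, current, total, window_start = [], [], 0, 0.0
    for msg in messages:
        # The batching window expired before this message arrived.
        if current and msg.arrival - window_start >= window_seconds:
            batches.append(current)
            current, total = [], 0
        # Adding this message would exceed 6 MB: it goes back to the broker,
        # which in this model means it starts the next batch.
        if current and total + msg.size > MAX_PAYLOAD_BYTES:
            batches.append(current)
            current, total = [], 0
        if not current:
            window_start = msg.arrival  # the timer starts with the first message
        current.append(msg)
        total += msg.size
        # The batch is complete at the count limit or once it reaches 6 MB.
        if len(current) >= batch_size or total >= MAX_PAYLOAD_BYTES:
            batches.append(current)
            current, total = [], 0
    if current:
        batches.append(current)
    return batches
```

In this model, a window of 0 places every message in its own batch, which mirrors the behavior described for MaximumBatchingWindowInSeconds set to 0.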

Kafka uses a distributed append log architecture to store messages. This works differently from ActiveMQ and RabbitMQ as messages are not removed from the broker once they have been consumed. Instead, consumers must maintain an offset to the last record or message that was consumed from the broker. Kafka provides several options in the consumer API to simplify the tracking of offsets.

Amazon MSK and Apache Kafka store data in multiple partitions to provide higher scalability. Lambda reads the messages sequentially for each partition and a batch may contain messages from different partitions. Lambda then commits the offsets once the target Lambda function is invoked successfully.

Configuring the maximum batching window

To reduce Lambda function invocations for existing or new functions, set the MaximumBatchingWindowInSeconds value close to 300 seconds. A longer batching window can introduce additional latency. For latency-sensitive workloads, set the MaximumBatchingWindowInSeconds value to an appropriate setting.

To configure Maximum Batching on a function in the AWS Management Console, navigate to the function in the Lambda console. Create a new Trigger, or edit an existing one. Along with the Batch size, you can configure a Batch window. The Trigger Configuration page is similar across the broker services.

Max batching trigger window

You can also use the AWS CLI to configure the --maximum-batching-window-in-seconds parameter.

For example, with Amazon MQ:

aws lambda create-event-source-mapping --function-name my-function \
--maximum-batching-window-in-seconds 300 --batch-size 100 --starting-position AT_TIMESTAMP \
--event-source-arn arn:aws:mq:us-east-1:123456789012:broker:ExampleMQBroker:b-24cacbb4-b295-49b7-8543-7ce7ce9dfb98

You can use AWS CloudFormation to configure the parameter. The following example configures the MaximumBatchingWindowInSeconds as part of the AWS::Lambda::EventSourceMapping resource for Amazon MQ:

  LambdaFunctionEventSourceMapping:
    Type: AWS::Lambda::EventSourceMapping
    Properties:
      BatchSize: 10
      MaximumBatchingWindowInSeconds: 300
      Enabled: true
      Queues:
        - "MyQueue"
      EventSourceArn: !GetAtt MyBroker.Arn
      FunctionName: !GetAtt LambdaFunction.Arn
      SourceAccessConfigurations:
        - Type: BASIC_AUTH
          URI: !Ref secretARNParameter

You can also use AWS Serverless Application Model (AWS SAM) to configure the parameter as part of the Lambda function event source.

MQReceiverFunction:
      Type: AWS::Serverless::Function 
      Properties:
        FunctionName: MQReceiverFunction
        CodeUri: src/
        Handler: app.lambda_handler
        Runtime: python3.9
        Events:
          MQEvent:
            Type: MQ
            Properties:
              Broker: !Ref brokerARNParameter
              BatchSize: 10
              MaximumBatchingWindowInSeconds: 300
              Queues:
                - "workshop.queueC"
              SourceAccessConfigurations:
                - Type: BASIC_AUTH
                  URI: !Ref secretARNParameter

Error handling

If your function times out or returns an error for any of the messages in a batch, Lambda retries the whole batch until processing succeeds or the messages expire.

When a function encounters an unrecoverable error, the event source mapping is paused and the consumer stops processing records. Any other consumers can continue processing, provided that they do not encounter the same error. If your Lambda event records exceed the allowed size limit of 6 MB, they can go unprocessed.

For Amazon MQ, you can redeliver messages when there’s a function error. You can configure dead-letter queues (DLQs) for both Apache ActiveMQ and RabbitMQ. For RabbitMQ, you can set a per-message TTL to move failed messages to a DLQ.

Since the same event may be received more than once, functions should be designed to be idempotent. This means that receiving the same event multiple times does not change the result beyond the first time the event was received.
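
A minimal sketch of an idempotent handler follows. It assumes an ActiveMQ-style event in which each entry of messages carries a messageID and base64-encoded data. The processed_ids set is a stand-in for a durable store such as a database table, and the process function is a hypothetical placeholder for real business logic.

```python
import base64

processed_ids = set()  # stand-in for a durable store shared across invocations
results = []


def process(body):
    # Hypothetical business logic with a visible side effect.
    results.append(body.upper())


def lambda_handler(event, context):
    handled = 0
    for message in event.get("messages", []):
        message_id = message["messageID"]
        if message_id in processed_ids:
            continue  # already handled on an earlier delivery or retry
        body = base64.b64decode(message["data"]).decode("utf-8")
        process(body)
        processed_ids.add(message_id)
        handled += 1
    return {"handled": handled}
```

Because Lambda retries the whole batch after a failure, recording each message ID only after it is processed lets a retry skip the messages that already succeeded.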

Conclusion

Lambda supports a number of event sources including message broker services like Amazon MQ and Amazon MSK. This post explains how batching works with the event sources and how messages are sent to the Lambda function.

Previously, you could only control the batch size. The new Batch Window control allows you to set the maximum amount of time, in seconds, that Lambda spends gathering records before invoking the function. This can increase the overall throughput of message processing and reduce Lambda invocations, which may improve cost.

For more serverless learning resources, visit Serverless Land.

Using Amazon Aurora Global Database for Low Latency without Application Changes

2022-01-11 Roneel Kumar

Post Syndicated from Roneel Kumar original https://aws.amazon.com/blogs/architecture/using-amazon-aurora-global-database-for-low-latency-without-application-changes/

Deploying global applications has many challenges, especially when accessing a database to build custom pages for end users. One example is an application using AWS Lambda@Edge. Two main challenges include performance and availability.

This blog explains how you can optimally deploy a global application with fast response times and without application changes.

The Amazon Aurora Global Database enables a single database cluster to span multiple AWS Regions by asynchronously replicating your data with subsecond latency. This provides fast, low-latency local reads in each Region. It also enables disaster recovery from Region-wide outages using multi-Region writer failover. These capabilities shorten the recovery time after a cluster failure and limit the data lost during it, which helps you meet both your recovery time objective (RTO) and your recovery point objective (RPO).

However, there are some implementation challenges. Most applications are designed to connect to a single hostname that provides atomicity, consistency, isolation, and durability (ACID) guarantees. But Global Aurora clusters provide reader hostname endpoints in each Region. In the primary Region, there are two endpoints, one for writes, and one for reads. To achieve strong data consistency, a global application requires the ability to:

Choose the optimal reader endpoints
Change writer endpoints on a database failover
Intelligently select the reader with the most up-to-date, freshest data

These capabilities typically require additional development.

The Heimdall Proxy coupled with Amazon Route 53 allows edge-based applications to access the Aurora Global Database seamlessly, without application changes. Features include automated Read/Write split with ACID compliance and edge results caching.

Figure 1. Heimdall Proxy architecture

The architecture in Figure 1 shows the Aurora Global Database primary Region in AP-SOUTHEAST-2, and secondary Regions in AP-SOUTH-1 and US-WEST-2. The Heimdall Proxy uses latency-based routing to determine the closest Reader Instance for read traffic, and redirects all write traffic to the Writer Instance. The Heimdall Configuration stores the Amazon Resource Name (ARN) of the global cluster. It automatically detects failover and cross-Region changes on the cluster, and directs traffic accordingly.

With an Aurora Global Database, there are two approaches to failover:

Managed planned failover. To relocate your primary database cluster to one of the secondary Regions in your Aurora global database, see Managed planned failovers with Amazon Aurora Global Database. With this feature, RPO is 0 (no data loss) and it synchronizes secondary DB clusters with the primary before making any other changes. RTO for this automated process is typically less than that of the manual failover.
Manual unplanned failover. To recover from an unplanned outage, you can manually perform a cross-Region failover to one of the secondaries in your Aurora Global Database. The RTO for this manual process depends on how quickly you can manually recover an Aurora global database from an unplanned outage. The RPO is typically measured in seconds, but this is dependent on the Aurora storage replication lag across the network at the time of the failure.

The Heimdall Proxy automatically detects Amazon Relational Database Service (RDS) / Amazon Aurora configuration changes based on the ARN of the Aurora Global cluster. Therefore, both managed planned and manual unplanned failovers are supported.

Solution benefits for global applications

Implementing the Heimdall Proxy has many benefits for global applications:

An Aurora Global Database has a primary DB cluster in one Region and up to five secondary DB clusters in different Regions. But the Heimdall Proxy deployment does not have this limitation. This allows for a larger number of endpoints to be globally deployed. Combined with Amazon Route 53 latency-based routing, new connections have a shorter establishment time. They can use connection pooling to connect to the database, which reduces overall connection latency.
SQL results are cached to the application for faster response times.
The proxy intelligently routes non-cached queries. When safe to do so, the closest (lowest latency) reader will be used. When not safe to access the reader, the query will be routed to the global writer. Proxy nodes globally synchronize their state to ensure that volatile tables are locked to provide ACID compliance.
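
The routing rule in the last point can be sketched in a few lines of Python. This is not the Heimdall Proxy's implementation. The function name, the endpoint names, the idea of a set of "dirty" (recently written) tables, and the naive substring match on table names are all assumptions made for illustration.

```python
def choose_endpoint(sql, reader_latency_ms, writer, dirty_tables):
    """Return the endpoint a query should go to under a simple read/write split.

    reader_latency_ms maps reader endpoint names to measured latency.
    dirty_tables holds tables written so recently that readers may lag behind.
    """
    statement = sql.strip().lower()
    is_read = statement.startswith("select")
    # Naive check (assumption): a table counts as touched if its name appears.
    touches_dirty = any(table.lower() in statement for table in dirty_tables)
    if not is_read or touches_dirty or not reader_latency_ms:
        # Writes, and reads that are not yet safe on a replica, go to the writer.
        return writer
    # Otherwise pick the closest (lowest latency) reader.
    return min(reader_latency_ms, key=reader_latency_ms.get)
```

For example, with readers measured at 40 ms and 120 ms, a SELECT on an untouched table goes to the 40 ms reader, while an UPDATE, or a SELECT on a table listed in dirty_tables, goes to the writer.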

For more information on configuring the Heimdall Proxy and Amazon Route 53 for a global database, read the Heimdall Proxy for Aurora Global Database Solution Guide.

Download a free trial from the AWS Marketplace.

Resources:

AWS Blog: How to Split Reads and Writes for Amazon RDS
AWS Blog: Automated Query Caching
AWS Blog: Advanced Connection Pooling
Contact: [email protected]

Heimdall Data, based in the San Francisco Bay Area, is an AWS Advanced ISV partner. They have AWS Service Ready designations for Amazon RDS and Amazon Redshift. Heimdall Data offers a database proxy that offloads SQL, improving database scale. Deployment does not require code changes.

Using Node.js ES modules and top-level await in AWS Lambda

2022-01-06 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-node-js-es-modules-and-top-level-await-in-aws-lambda/

This post is written by Dan Fox, Principal Specialist Solutions Architect, Serverless.

AWS Lambda now enables the use of ECMAScript (ES) modules in Node.js 14 runtimes. This feature allows Lambda customers to use dependency libraries that are configured as ES modules, or to designate their own function code as an ES module. It provides customers the benefits of ES module features like import/export operators, language-level support for modules, strict mode by default, and improved static analysis and tree shaking. ES modules also enable top-level await, a feature that can lower cold start latency when used with Provisioned Concurrency.

This blog post shows how to use ES modules in a Lambda function. It also provides guidance on how to use top-level await with Provisioned Concurrency to improve cold start performance for latency sensitive workloads.

Designating a function handler as an ES module

You may designate function code as an ES module in one of two ways. The first way is to specify the “type” in the function’s package.json file. By setting the type to “module”, you designate all “.js” files in the package to be treated as ES modules. Set the “type” as “commonjs” to specify the package contents explicitly as CommonJS modules:

// package.json
{
  "name": "ec-module-example",
  "type": "module",
  "description": "This package will be treated as an ES module.",
  "version": "1.0",
  "main": "index.js",
  "author": "Dan Fox",
  "license": "ISC"
}

// index.js – this file will inherit the type from 
// package.json and be treated as an ES module.

import { double } from './lib.mjs';

export const handler = async () => {
    let result = double(6); // 12
    return result;
};

// lib.mjs

export function double(x) {
    return x + x;
}

The second way to designate a function as either an ES module or a CommonJS module is by using the file name extension. File name extensions override the package type directive.

File names ending in .cjs are always treated as CommonJS modules. File names ending in .mjs are always treated as ES modules. File names ending in .js inherit their type from the package. You may mix ES modules and CommonJS modules within the same package. Packages are designated as CommonJS by default:

// this file is named index.mjs – it will always be treated as an ES module
import { square } from './lib.mjs';

export async function handler() {
    let result = square(6); // 36
    return result;
}

// lib.mjs
export function square(x) {
    return x * x;
}
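
The resolution rules above can be summarized as a small decision function. The following Python sketch only restates those rules. The function name and the return values are chosen for illustration and are not part of Node.js or Lambda.

```python
from pathlib import PurePosixPath


def module_format(file_name, package_type=None):
    """Return "module" or "commonjs" for a file, following the rules above.

    package_type is the "type" field from package.json, or None if it is unset.
    """
    suffix = PurePosixPath(file_name).suffix
    if suffix == ".mjs":
        return "module"      # the extension always wins over the package type
    if suffix == ".cjs":
        return "commonjs"    # the extension always wins over the package type
    if suffix == ".js":
        # .js files inherit from the package, which is CommonJS by default.
        return package_type or "commonjs"
    raise ValueError(f"unsupported extension: {suffix!r}")
```

For instance, module_format("index.js", "module") returns "module", while module_format("index.js") returns "commonjs" because no package type is set.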

Understanding Provisioned Concurrency

When a Lambda function scales out, the process of allocating and initializing new runtime environments may increase latency for end users. Provisioned Concurrency gives customers more control over cold start performance by enabling them to create runtime environments in advance.

In addition to creating execution environments, Provisioned Concurrency also performs initialization tasks defined by customers. Customer initialization code performs a variety of tasks including importing libraries and dependencies, retrieving secrets and configurations, and initializing connections to other services. According to an AWS analysis of Lambda service usage, customer initialization code is the largest contributor to cold start latency.

Provisioned Concurrency runs both environment setup and customer initialization code. This enables runtime environments to be ready to respond to invocations with low latency and reduces the impact of cold starts for end users.

Reviewing the Node.js event loop

Node.js has an event loop that causes it to behave differently than other runtimes. Specifically, it uses a non-blocking input/output model that supports asynchronous operations. This model enables it to perform efficiently in most cases.

For example, if a Node.js function makes a network call, that request may be designated as an asynchronous operation and placed into a callback queue. The function may continue to process other operations within the main call stack without getting blocked by waiting for the network call to return. Once the network call is returned, the callback is run and then removed from the callback queue.

This non-blocking model affects the Lambda execution environment lifecycle. Asynchronous functions written in the initialization block of a Node.js Lambda function may not complete before handler invocation. In fact, it is possible for function handlers to be invoked with open items remaining in the callback queue.

Typically, JavaScript developers use the await keyword to instruct a function to block and force it to complete before moving on to the next step. However, await is not permitted in the initialization block of a CommonJS JavaScript function. This behavior limits the amount of asynchronous initialization code that can be run by Provisioned Concurrency before the invocation cycle.

Improving cold start performance with top-level await

With ES modules, developers may use top-level await within their functions. This allows developers to use the await keyword in the top level of the file. With this feature, Node.js functions may now complete asynchronous initialization code before handler invocations. This maximizes the effectiveness of Provisioned Concurrency as a mechanism for limiting cold start latency.

Consider a Lambda function that retrieves a parameter from the AWS Systems Manager Parameter Store. Previously, using CommonJS syntax, you placed the await operator in the body of the handler function:

// method1 – CommonJS

// CommonJS require syntax
const { SSMClient, GetParameterCommand } = require("@aws-sdk/client-ssm"); 

const ssmClient = new SSMClient();
const input = { "Name": "/configItem" };
const command = new GetParameterCommand(input);
const init_promise = ssmClient.send(command);

exports.handler = async () => {
    const parameter = await init_promise; // await inside handler
    console.log(parameter);

    const response = {
        "statusCode": 200,
        "body": parameter.Parameter.Value
    };
    return response;
};

When you designate code as an ES module, you can use the await keyword at the top level of the code. As a result, the code that makes a request to the AWS Systems Manager Parameter Store now completes before the first invocation:

// method2 – ES module

// ES module import syntax
import { SSMClient, GetParameterCommand } from "@aws-sdk/client-ssm"; 

const ssmClient = new SSMClient();
const input = { "Name": "/configItem" };
const command = new GetParameterCommand(input);
const parameter = await ssmClient.send(command); // top-level await

export async function handler() {
    const response = {
        "statusCode": 200,
        "body": parameter.Parameter.Value
    };
    return response;
}

With on-demand concurrency, an end user is unlikely to see much difference between these two methods. But when you run these functions using Provisioned Concurrency, you may see performance improvements. Using top-level await, Provisioned Concurrency fetches the parameter during its startup period instead of during the handler invocation. This reduces the duration of the handler execution and improves end user response latency for cold invokes.
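
To make the shift in waiting time concrete, here is a rough analogy using Python's asyncio rather than Node.js. It is not Lambda code. The simulated fetch, its 50 ms delay, and the function names are assumptions. The sketch measures how long the "handler" part waits when the fetch is only started during initialization, compared with when it is awaited during initialization.

```python
import asyncio
import time

FETCH_SECONDS = 0.05  # assumed latency of the simulated parameter fetch


async def fetch_parameter():
    await asyncio.sleep(FETCH_SECONDS)  # stands in for the network call
    return "config-value"


async def await_in_handler():
    """Analogue of method 1: init starts the fetch, the handler waits for it."""
    pending = asyncio.create_task(fetch_parameter())  # init phase ends here
    start = time.perf_counter()                       # handler begins
    value = await pending
    return value, time.perf_counter() - start


async def await_at_init():
    """Analogue of method 2: init waits, so the handler finds the value ready."""
    value = await fetch_parameter()                   # init phase waits here
    start = time.perf_counter()                       # handler begins
    return value, time.perf_counter() - start


handler_wait_method1 = asyncio.run(await_in_handler())[1]
handler_wait_method2 = asyncio.run(await_at_init())[1]
```

In this analogy the total work is the same in both cases. Only the phase that absorbs the wait changes, which is why the benefit appears when initialization runs ahead of time under Provisioned Concurrency.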

Performing benchmark testing

You can perform benchmark tests to measure the impact of top-level await. I have created a project that contains two Lambda functions, one that contains an ES module and one that contains a CommonJS module.

Both functions are configured to respond to a single API Gateway endpoint. Both functions retrieve a parameter from AWS Systems Manager Parameter Store and are configured to use Provisioned Concurrency. The ES module uses top-level await to retrieve the parameter. The CommonJS function awaits the parameter retrieval in the handler.

Before deploying the solution, you need:

An AWS account (sign up for an account if you don’t have one).
The AWS SAM CLI installed.
Node.js installed (version 14.8 minimum).

To deploy:

From a terminal window, clone the git repo:
git clone https://github.com/aws-samples/aws-lambda-es-module-performance-benchmark
Change directory:
cd ./aws-lambda-es-module-performance-benchmark
Build the application:
sam build
Deploy the application to your AWS account:
sam deploy --guided
Take note of the API Gateway URL in the Outputs section.

This post uses Artillery, a popular open source tool, for load testing. To perform load tests:

Open the config.yaml document in the /load_test directory and replace the target string with the URL of the API Gateway:
target: "Put API Gateway url string here"
From a terminal window, navigate to the /load_test directory:
cd load_test
Download and install dependencies:
npm install
Begin the load test for the CommonJS function:
./test_commonjs.sh
Begin the load test for the ES module function:
./test_esmodule.sh

Reviewing the results

Here is a side-by-side comparison of the results of two load tests of 600 requests each. The left shows the results for the CommonJS module and the right shows the results for the ES module. The p99 response time reflects the cold start durations when the Lambda service scales up the function due to load. The p99 for the CommonJS module is 603 ms while the p99 for the ES module is 340.5 ms, a performance improvement of 43.5% (262.5 ms) for the p99 of this comparison load test.
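
The improvement figure can be reproduced from the two p99 values quoted above. This short Python check uses only those numbers. The function name is chosen for illustration.

```python
def improvement(baseline_ms, improved_ms):
    """Return the absolute saving in ms and the saving as a percentage of baseline."""
    saved_ms = baseline_ms - improved_ms
    return saved_ms, saved_ms / baseline_ms * 100


saved_ms, percent = improvement(603, 340.5)
print(f"{saved_ms} ms saved, {percent:.1f}% lower p99")  # 262.5 ms saved, 43.5% lower p99
```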

Cleaning up

To delete the sample application, use the latest version of the AWS SAM CLI and run:

sam delete

Conclusion

Lambda functions now support ES modules in Node.js 14.x runtimes. ES modules support await at the top level of function code. Using top-level await maximizes the effectiveness of Provisioned Concurrency and can reduce the latency experienced by end users during cold starts.

This post demonstrates a sample application that can be used to perform benchmark tests that measure the impact of top-level await.

For more serverless content, visit Serverless Land.

Validating addresses with AWS Lambda and the Amazon Location Service

2022-01-06 James Beswick

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/validating-addresses-with-aws-lambda-and-the-amazon-location-service/

This post is written by Matthew Nightingale, Associate Solutions Architect.

Traditional methods of performing address validation on geospatial datasets can be expensive and time consuming. Using Amazon Location Service with AWS Lambda in a serverless data processing pipeline, you may achieve significant performance improvements and cost savings on address validation jobs that use geospatial data.

This blog contains a deployable AWS Serverless Application Model (AWS SAM) template. It also uses sample data sourced from publicly available datasets that you can deploy and use to test the application. This blog offers a starting point to build out a serverless address validation pipeline in your own AWS account.

Overview

This application implements a serverless scatter/gather architecture using Lambda and Amazon S3, performing address validation with the Amazon Location Service. An S3 PUT event triggers each Lambda function to run data processing jobs along each step of the pipeline.

To test the application, a user uploads a .CSV file to S3. This dataset is labeled with fields that are recognized by the 2waygeocoder Lambda function. The application returns a processed dataset to S3 appended with location information from the Amazon Location Places API.

The Scatter Lambda function takes a dataset from the S3 bucket labeled input and splits it into equally sized shards.
The Process Lambda function takes each shard from the pre-processed bucket. It performs address validation in parallel with a 2waygeocoder function calling the Amazon Location Service Places API.
The Gather Lambda function takes each shard from the post-processed bucket. It appends the data into a complete dataset with additional address information.
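
The scatter and gather steps can be sketched with plain Python lists. This is a simplified illustration and not the code from the sample repository. The function names are assumptions, and a real pipeline would read and write the shards as objects in S3.

```python
def split_into_shards(rows, shard_count):
    """Split rows into shard_count shards whose sizes differ by at most one."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    base, extra = divmod(len(rows), shard_count)
    shards, start = [], 0
    for index in range(shard_count):
        # The first `extra` shards take one additional row each.
        size = base + (1 if index < extra else 0)
        shards.append(rows[start:start + size])
        start += size
    return shards


def gather(shards):
    """Concatenate processed shards back into one dataset, preserving order."""
    return [row for shard in shards for row in shard]
```

Splitting ten rows into four shards produces shards of 3, 3, 2, and 2 rows, and gathering them returns the original ten rows in order.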

Amazon Location Service

Amazon Location Service sources high-quality geospatial data from HERE and Esri to support searches by using a place index resource.

With the Amazon Location Places API, you can convert addresses and other textual queries into geographic coordinates (also known as geocoding). You can also convert geographic positions into addresses and place descriptions (known as reverse geocoding).

The example application includes a 2waygeocoder capable of both geocoding and reverse geocoding. The next section shows examples of the call and response from the Amazon Location Places API for both geocoding and reverse geocoding.

Geocoding with Amazon Location Service

Here is an example of calling the Amazon Location Service Places API using the AWS SDK for Python (Boto3). This uses the search_place_index_for_text method:

import boto3

location = boto3.client("location")

response = location.search_place_index_for_text(
    # The place index is created using Amazon Location Service
    IndexName="explore.place",
    Text="Boston, MA",
)
location_response = response["Results"]
print(location_response)

Example response:

Example reverse-geocoding with Amazon Location Service

Here is another example of calling the Amazon Location Service Places API using the AWS SDK for Python (Boto3). This uses the search_place_index_for_position method:

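import boto3

location = boto3.client("location")

response = location.search_place_index_for_position(
    # The place index is created using Amazon Location Service
    IndexName="explore.place",
    # Position is a list of numbers in [longitude, latitude] order
    Position=[-71.056739, 42.358660],
)
location_response = response["Results"]
print(location_response)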

Example response:

Design considerations

Processing data with Lambda in parallel using a serverless scatter/gather pipeline helps provide performance efficiency at lower cost. To provide even greater performance, you can optimize your Lambda configuration for higher throughput. There are several strategies you can implement to do this and a few key topics to keep in mind.

Increase the allocated memory for your Lambda function

The simplest way to increase throughput is to increase the allocated memory of the Lambda function.

Faster Lambda functions can process more data and increase throughput. This works even if a Lambda function’s memory utilization is low. This is because increasing memory also increases vCPUs in proportion to the amount configured. Each function supports up to 10 GB of memory and you can access up to six vCPUs per function.

To see the average cost and execution speed for each memory configuration, the Lambda Power Tuning tool helps to visualize the tradeoffs.

Optimize shard size

Another method for increasing performance in a serverless scatter/gather architecture is to optimize the total number of shards created by the scatter function. Increasing the total number of shards consequently reduces the size of any single shard, allowing Lambda to process each shard faster.

When scaling with Lambda, one instance of a function handles one request at a time. When the number of requests increases, Lambda creates more instances of the function to process traffic. Because S3 invokes Lambda asynchronously, there is an internal queue buffering requests between the event source and the Lambda service.

In a serverless scatter/gather architecture, having more shards results in more concurrent invocations of the process Lambda function. For more information about scaling and concurrency with Lambda, see this blog post. Increasing concurrency with Lambda can lead to API request throttling.

Consider API request throttling with your concurrent Lambda functions

In a serverless scatter/gather architecture, the rate at which your code calls APIs increases by a factor equal to the number of concurrent Lambda functions. This means API request limits can quickly be exceeded. You must consider Service Quotas and API request limits when trying to increase the performance of your serverless scatter/gather architecture.

For example, the Amazon Location Places API called in the processing function of this application has a default limit of 50 API requests per second. The 2waygeocoder makes about 12 API calls per second on average. Splitting the application into more than four shards may cause API throttling exception errors in this case. Requests to increase Service Quotas can be made through your AWS account.
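
The four-shard figure follows from dividing the quota by the per-function call rate. The Python sketch below works through that arithmetic with the numbers quoted above. The function name is chosen for illustration, and the calculation assumes every concurrent function calls the API at the same average rate.

```python
def max_shards_within_quota(quota_per_second, calls_per_function_per_second):
    """Largest number of concurrent functions whose combined rate fits the quota."""
    return int(quota_per_second // calls_per_function_per_second)


limit = max_shards_within_quota(50, 12)   # 4 shards: 4 * 12 = 48 requests per second
over_quota = (limit + 1) * 12 > 50        # True: 5 * 12 = 60 requests per second
```

Because 12 calls per second is an average, short bursts can still exceed the quota at four shards, so retries with backoff remain worthwhile.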

Deploying the solution

You need the following prerequisites to deploy the example application:

AWS account.
AWS SAM CLI.
Python 3.9.
An AWS Identity and Access Management (IAM) role with appropriate access.

Deploy the example application:

Clone the repository and download the sample source code to your environment where AWS SAM is installed:
git clone https://github.com/aws-samples/amazon-location-service-serverless-address-validation
Change into the project directory containing the template.yaml file:
cd ~/environment/amazon-location-service-serverless-address-validation
Build the application using AWS SAM:
sam build
Deploy the application to your account using AWS SAM. Be sure to follow proper S3 naming conventions providing globally unique names for S3 buckets:
sam deploy --guided

Testing the application

Testing geocoding

To test the application, download the dataset that is linked in Testing the Application section of the GitHub repository. These tests demonstrate both the geocoding and reverse-geocoding capabilities of the application.

First, test the geocoding capabilities. You perform address validation on the City of Hartford Business Listing dataset linked in the GitHub repository. The dataset contains a listing of all the active businesses registered in the city of Hartford, CT, and each business address. The GitHub repo links to an external website where you can download the dataset.

Download the .csv version of the City of Hartford Business Listing dataset. The link is found in the Testing the Application section of the README file on GitHub.
Open the file locally to explore its contents.
Ensure that the .csv file contains columns labeled as “Address”, “City”, and “State”. The 2waygeocoder deployed as part of the AWS SAM template recognizes these columns to perform geocoding.
Before testing the application’s geocoding capabilities, explore the pricing of Amazon Location Service. In order to save money, you can trim the length of the dataset for testing by removing rows. Once the dataset is trimmed to a desired length, navigate to S3 in the AWS Management Console.
Upload the dataset to the S3 bucket labeled “input”. This triggers the scatter function.
Navigate to the S3 bucket labeled “raw” to view the shards of your dataset created by the scatter function.
Navigate to Lambda, select the 2waygeocoder function, and view its CloudWatch Logs to see information returned by the function code in near real time.
Once the data is processed, navigate to the S3 bucket labeled “destination” to view the complete processed dataset that is created by the gather function. It may take several minutes for your dataset to finish processing.

Congratulations! You have successfully geocoded a dataset using Amazon Location Service with a serverless address validation pipeline.

Testing reverse-geocoding

Next, test the reverse-geocoding capabilities of the application. You perform address validation on the Miami Housing Dataset linked in the GitHub repository. This dataset contains information on 13,932 single-family homes sold in Miami. The repo links to an external website where you can download the dataset.

Before testing, explore the pricing of Amazon Location Service. To start the test:

Download the zip file containing the .csv version of the dataset. The link is found in the Testing the Application section of the README file on GitHub.
Open the file locally to explore its contents.
Ensure the .csv file contains columns A and B labeled “Latitude” and “Longitude”. You must edit these column headers to match the correct format that is recognized by the 2waygeocoder to perform reverse-geocoding. Only the “L” should be capitalized.
To minimize cost, trim the length of the dataset for testing by removing rows. At the full size of ~13,933 rows, the dataset takes approximately 5 minutes to process.
Once the dataset is trimmed to a desired length and both column A and B are labeled as “Latitude” and “Longitude” respectively, navigate to S3 in the AWS Management Console, and upload the dataset to your S3 bucket labeled “input”.
Navigate to the S3 bucket labeled “raw” to view the shards of your dataset.
Navigate to Lambda, select the 2waygeocoder function, and view its CloudWatch Logs to see information returned by the function code in near real time.
Navigate to the S3 bucket labeled “destination” to view the complete processed dataset that is created by the gather function. It may take several minutes for your dataset to finish processing.

Congratulations! You have successfully reverse-geocoded a dataset with Amazon Location Service using a serverless scatter/gather pipeline. You can move on to the conclusion, or continue to test the geocoding capabilities of the application with additional datasets.

Next steps

To get started testing your own datasets, use the AWS SAM template from GitHub deployed as part of this blog. Ensure that the columns in your dataset are labeled to match the constructs used in this blog post. The 2waygeocoder recognizes columns labeled “Latitude” and “Longitude” to perform reverse-geocoding, and “Address”, “City”, and “State” to perform geocoding.
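
Before uploading your own file, you can check its header row against these column names. The following Python sketch is a hypothetical pre-flight check and not part of the 2waygeocoder. The function name is an assumption, and so is the choice to prefer reverse-geocoding when both sets of columns are present.

```python
import csv
import io

GEOCODE_COLUMNS = {"Address", "City", "State"}
REVERSE_GEOCODE_COLUMNS = {"Latitude", "Longitude"}


def detect_mode(csv_text):
    """Return "reverse-geocode", "geocode", or None based on the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    columns = {name.strip() for name in header}
    # Column names are case-sensitive, matching the labels described above.
    if REVERSE_GEOCODE_COLUMNS <= columns:
        return "reverse-geocode"   # assumption: preferred when both sets exist
    if GEOCODE_COLUMNS <= columns:
        return "geocode"
    return None
```

A header of "Latitude,Longitude,Price" is detected as reverse-geocode, while "latitude,longitude" returns None because only the “L” should be capitalized.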

Now that the data has been geocoded by Amazon Location Service and is in S3, you can use Amazon QuickSight geospatial charts to quickly and easily create interactive charts. For information on how to create a Dataset in QuickSight using Amazon S3 Files, check out the QuickSight User Guide.

Below is an example using QuickSight geospatial charts to map the Miami housing dataset. The map shows average sale price by ZIP code:

This example uses QuickSight geospatial charts to map the City of Hartford Business dataset. The map shows DBA (doing business as) by latitude and longitude:

Conclusion

This blog post performs address validation with the Amazon Location Service, demonstrating both geocoding and reverse geocoding capabilities.

Using a serverless architecture with S3 and Lambda, you can achieve both cost optimization and performance improvement compared with traditional methods of address validation. Using this application, your organization can better understand and harness geospatial data.

For more serverless learning resources, visit Serverless Land.

Name:	JWTAuth
Identity source:	$request.header.Authorization
Issuer URL:	https://cognito-idp.us-east1.amazonaws.com/<your_userpool_id>
Audience:	<app_client_id_of_userpool>