All posts by James Beswick

Understanding database options for your serverless web applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/understanding-database-options-for-your-serverless-web-applications/

Many web developers use relational databases to store and manage data in their web applications. As you migrate to a serverless approach for web development, there are also other options available. These can help improve the scale, performance, and cost-effectiveness of your workloads. In this blog post, I highlight use-cases for different serverless database services, and common patterns useful for web applications.

Using Amazon RDS with serverless web applications

You can access Amazon RDS directly from your AWS Lambda functions. The RDS database, such as Amazon Aurora, is configured within the customer VPC. The Lambda function must be configured with access to the same VPC:

Lambda connecting to RDS

There are special considerations for this design in busy serverless applications. It’s common for popular web applications to experience “spiky” usage, where traffic volumes shift rapidly and unpredictably. Serverless services such as AWS Lambda and Amazon API Gateway are designed to automatically scale to meet these traffic increases.

However, relational databases are connection-based, so they are intended to work with a few long-lived clients, such as web servers. By contrast, Lambda functions are ephemeral and short-lived, so their database connections are numerous and brief. If Lambda scales up to hundreds or thousands of instances, you may overwhelm downstream relational databases with connection requests. This is typically only an issue for moderately busy applications. If you are using a Lambda function for low-volume tasks, such as running daily SQL reports, you do not experience this behavior.

The Amazon RDS Proxy service is built to solve the high-volume use-case. It pools the connections between the Lambda service and the downstream RDS database. This means that a scaling Lambda function is able to reuse connections via the proxy. As a result, the relational database is not overwhelmed with connection requests from individual Lambda functions. This does not require code changes in many cases. You only need to replace the database endpoint with the proxy endpoint in your Lambda function.

Lambda to RDS Proxy to RDS diagram

As a result, if you need to use a relational database in a high-volume web application, you can use RDS Proxy with minimal changes required.
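
As a minimal sketch, assuming the mysql2 client library and hypothetical environment variables for the connection details, a Lambda function querying through RDS Proxy looks no different from one connecting directly to the database:

const mysql = require('mysql2/promise')

exports.handler = async (event) => {
  // The proxy endpoint simply replaces the database endpoint;
  // no other code changes are needed
  const connection = await mysql.createConnection({
    host: process.env.PROXY_ENDPOINT,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME
  })

  const [rows] = await connection.execute(
    'SELECT id, name FROM users WHERE id = ?',
    [event.userId]
  )
  await connection.end()
  return rows
}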

Using Amazon DynamoDB as a high-performance operational database

Amazon DynamoDB is a high-performance key-value and document database that operates with single-digit millisecond response times at any scale. This is a NoSQL database that is a natural fit for many serverless workloads, especially web applications. It can operate equally well for low and high usage workloads. Unlike relational databases, the performance of a well-architected DynamoDB table is not adversely affected by heavy usage or large amounts of data storage.

For web applications, DynamoDB tables are ideal for storing common user configuration and application data. When integrated with Amazon Cognito, you can restrict row-level access to the current user context. This makes it a frequent choice for multi-tenant web applications that host data for many users.
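
For example, a sketch of an IAM policy using the dynamodb:LeadingKeys condition (the table ARN and account ID are illustrative) grants a Cognito-authenticated user access only to items whose partition key matches their identity ID:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
    "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/myDDBtable",
    "Condition": {
      "ForAllValues:StringEquals": {
        "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
      }
    }
  }]
}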

DynamoDB tables can be useful for lookups of key-based information, in addition to geo-spatial queries in many cases. DynamoDB is not connection-based, so this integration works even if a Lambda function scales up to hundreds or thousands of concurrent executions. You can query directly from Lambda with minimal code:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION
const documentClient = new AWS.DynamoDB.DocumentClient()

exports.handler = async (event) => {
  // Construct params for the put operation
  const params = {
    TableName: 'myDDBtable',
    Item: {
      partitionKey: 'user-123',
      sortKey: Date.now(),
      name: 'Alice',
      cartItems: 3
    }
  }

  // Store the item in DynamoDB and return the result
  return documentClient.put(params).promise()
}

Using advanced patterns in DynamoDB, it’s possible to build equivalent features frequently found in relational schemas. For example, one-to-many tables, many-to-many tables, and ACID transactions can all be modeled in a single DynamoDB table.
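
As an illustration, assuming a hypothetical table design where a string sort key encodes the item hierarchy (for example, ORDER#1001#ITEM#2), a one-to-many relationship can be queried with a sort key prefix:

const AWS = require('aws-sdk')
const documentClient = new AWS.DynamoDB.DocumentClient()

exports.handler = async (event) => {
  // Fetch every item belonging to one order within a customer's partition:
  // a one-to-many relationship modeled with a sort key prefix
  const params = {
    TableName: 'myDDBtable',
    KeyConditionExpression: 'partitionKey = :pk AND begins_with(sortKey, :prefix)',
    ExpressionAttributeValues: {
      ':pk': 'customer-123',
      ':prefix': 'ORDER#1001'
    }
  }
  const results = await documentClient.query(params).promise()
  return results.Items
}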

Combining DynamoDB with RDS

While DynamoDB remains highly performant for high volumes of traffic, you need to understand data access patterns for your application before designing the schema. There are times where you need to perform ad hoc queries, or where downstream application users must use SQL-based tools to interact with databases.

In this case, combining both DynamoDB and RDS in your architecture can provide a resilient and flexible solution. For example, for a high-volume transactional web application, you can use DynamoDB to ingest data from your frontend application. For ad hoc SQL-based analytics, you could also use Amazon Aurora.

By using DynamoDB streams, you can process updates to a DynamoDB table using a Lambda function. In a simple case, this function can update tables in RDS, keeping the two databases synchronized. For example, when sales transactions are saved in DynamoDB, a Lambda function can post the sales information to transaction tables in Aurora.
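
A minimal sketch of that stream processing function follows, assuming the mysql2 client library and illustrative table, column, and environment variable names:

const AWS = require('aws-sdk')
const mysql = require('mysql2/promise')

exports.handler = async (event) => {
  const connection = await mysql.createConnection({
    host: process.env.DB_HOST,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME
  })

  for (const record of event.Records) {
    if (record.eventName === 'INSERT') {
      // Convert the DynamoDB-typed image into a plain JavaScript object
      const item = AWS.DynamoDB.Converter.unmarshall(record.dynamodb.NewImage)

      // Illustrative target table and columns
      await connection.execute(
        'INSERT INTO transactions (id, amount) VALUES (?, ?)',
        [item.transactionId, item.amount]
      )
    }
  }
  await connection.end()
}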

DynamoDB to RDS architecture

Both the Lambda function and RDS database operate within the customer’s VPC, while DynamoDB is outside the VPC. DynamoDB Streams can invoke Lambda functions configured to access the VPC. In this model, RDS users can then run ad hoc SQL queries without impacting operational data managed by DynamoDB.

High-volume ETL processes between DynamoDB and RDS

For high-volume workloads capturing large numbers of transactions in DynamoDB, Lambda may still scale rapidly and exhaust the RDS connection pool. To process these flows, you may introduce Amazon Kinesis Data Firehose to help with data replication between DynamoDB and RDS.

ETL processing with DynamoDB and RDS

  1. New and updated items in DynamoDB are sent to a DynamoDB stream. The stream invokes a stream processing Lambda function, sending batches of records to Kinesis Data Firehose.
  2. Kinesis buffers incoming messages and performs data transformations using a Lambda function. It then writes the output to Amazon S3, buffering by size (1–128 MB) or interval (60–900 seconds).
  3. The Kinesis Data Firehose transformation uses a custom Lambda function for processing records as needed (a sketch of this function follows the list).
  4. Amazon S3 is a durable store for these batches of transformed records. As objects are written, S3 invokes a Lambda function.
  5. The Lambda function loads the objects from S3, then connects to RDS and imports the data.
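
The transformation function in step 3 might look like the following sketch, which assumes each record is a JSON-encoded item and keeps only illustrative fields. Kinesis Data Firehose delivers records base64-encoded and expects the same shape in the response:

exports.handler = async (event) => {
  const output = event.records.map((record) => {
    // Firehose delivers each record base64-encoded
    const item = JSON.parse(Buffer.from(record.data, 'base64').toString('utf8'))

    // Illustrative transformation: keep only the fields the RDS table needs
    const transformed = JSON.stringify({ id: item.partitionKey, amount: item.amount }) + '\n'

    return {
      recordId: record.recordId,
      result: 'Ok',
      data: Buffer.from(transformed).toString('base64')
    }
  })
  return { records: output }
}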

This approach supports high transaction volumes, enabling table item transformation before loading into RDS. The RDS concurrent connection pool is optimized by upstream batching and buffering, which reduces the number of concurrent Lambda functions and RDS connections.

Conclusion

Web developers commonly use relational databases in building their applications. When migrating to serverless architectures, a web developer can continue to use databases like RDS, or take advantage of other options available. RDS Proxy enables developers to pool database connections and use connection-based databases with ephemeral functions.

DynamoDB provides high-performance, low-latency NoSQL support, which is ideal for many busy web applications with spiky traffic volumes. However, it’s also possible to use both services to take advantage of the throughput of DynamoDB, together with the flexibility of ad hoc SQL queries in RDS.

For extremely high traffic volumes, you can introduce Kinesis Data Firehose to batch and transform data between DynamoDB and RDS. In this case, you separate the operational database from the analytics database. This solution uses multiple serverless services to handle scaling automatically.

To learn more about AWS serverless database solutions for web developers, visit https://aws.amazon.com/products/databases/.

Building a serverless tokenization solution to mask sensitive data

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-serverless-tokenization-solution-to-mask-sensitive-data/

This post is courtesy of Anuj Gupta, Senior Solutions Architect, and Steven David, Senior Solutions Architect.

Customers tell us that security and compliance are top priorities regardless of industry or location. Government and industry regulations are regularly updated, and companies must move quickly to remain compliant. Organizations must balance the need to generate value from data with the need to ensure data privacy. There are many situations where it is prudent to obfuscate data to reduce the risk of exposure, while also improving the ability to innovate.

This blog discusses data obfuscation and how it can be used to reduce the risk of unauthorized access. It can also simplify PCI DSS compliance by reducing the number of components to which this compliance applies.

Comparing tokenization and encryption

There is a difference between encryption and tokenization. Encryption is the process of using an algorithm to transform plaintext into ciphertext. An algorithm and an encryption key are required to decrypt the original plaintext.

Tokenization is the process of transforming a piece of data into a random string of characters called a token. It does not have direct meaningful value in relation to the original data. Tokens serve as a reference to the original data, but cannot be used to derive that data.

Unlike encryption, tokenization does not use a mathematical process to transform the sensitive information into the token. Instead, tokenization uses a database, often called a token vault, which stores the relationship between the sensitive value and the token. The real data in the vault is then secured, often via encryption. The token value can be used in various applications as a substitute for the original data.

For example, when processing a recurring credit card payment, the application submits the token to the vault. The vault uses the token as an index to fetch the original data for use in the authorization process. Tokens are also increasingly used to secure other types of sensitive or personally identifiable information. This includes data like social security numbers (SSNs), telephone numbers, and email addresses.

Overview

In this blog, we show how to design a secure, reliable, scalable, and cost-optimized tokenization solution. It can be integrated with applications to generate tokens, store ciphertext in an encrypted token vault, and exchange tokens for the original text.

In an example use-case, a data analyst needs access to a customer database. The database includes the customer’s name, SSN, credit card, order history, and preferences. Some of the customer information qualifies as sensitive data. To enforce the required information security policy, you must apply methods such as column-level access control, role-based access control, column-level encryption, and protection from unauthorized access.

Providing access to the customer database increases the complexity of managing fine-grained access policies. Tokenization replaces the sensitive data with random unique tokens, which are stored in an application database. This lowers the complexity and the cost of managing access, while helping with data protection.

Walkthrough

This serverless application uses Amazon API Gateway, AWS Lambda, Amazon Cognito, Amazon DynamoDB, and AWS Key Management Service (AWS KMS).

Serverless architecture diagram

The client authenticates with Amazon Cognito and receives an authorization token. This token is used to validate calls to the Customer Order Lambda function. The function calls the tokenization layer, providing sensitive information in the request. This layer includes the logic to generate unique random tokens and store encrypted text in a cipher database.

Lambda calls KMS to obtain an encryption key. It then uses the DynamoDB client-side encryption library to encrypt the original text and store the ciphertext in the cipher database. The Lambda function retrieves the generated token in the response from the tokenization layer. This token is then stored in the application database for future reference.

AWS KMS makes it easy to create and manage cryptographic keys. It provides logs of all key usage to help you meet regulatory and compliance needs.

One of the most important decisions when using the DynamoDB Encryption Client is selecting a cryptographic materials provider (CMP). The CMP determines how encryption and signing keys are generated, whether new key materials are generated for each item or are reused. It also sets the encryption and signing algorithms that are used. To identify a CMP for your workload, refer to this documentation.

The current solution selects the Direct KMS Provider as the CMP. This cryptographic materials provider returns a unique encryption key and signing key for every table item. To do this, it calls KMS every time you encrypt or decrypt an item.

The KMS process

  • To generate encryption materials, the Direct KMS Provider asks AWS KMS to generate a unique data key for each item using a customer master key (CMK) that you specify. It derives encryption and signing keys for the item from the plaintext copy of the data key, and then returns the encryption and signing keys, along with the encrypted data key, which is stored in the material description attribute of the item.
  • The item encryptor uses the encryption and signing keys and removes them from memory as soon as possible. Only the encrypted copy of the data key from which they were derived is saved in the encrypted item.
  • To generate decryption materials, the Direct KMS Provider asks AWS KMS to decrypt the encrypted data key. Then, it derives verification and signing keys from the plaintext data key, and returns them to the item encryptor.

The item encryptor verifies the item and, if verification succeeds, decrypts the encrypted values. Finally, it removes the keys from memory as soon as possible.

For enhanced security, the example creates the Lambda function inside a VPC with a security group that allows incoming HTTPS traffic from private IPs only. The Lambda function connects to DynamoDB and KMS via VPC endpoints instead of going through the public internet. It connects to DynamoDB using a gateway VPC endpoint and to KMS using an interface VPC endpoint, providing a highly available and secure connection.

Additionally, VPC endpoints can use endpoint policies to allow only permitted operations for KMS and DynamoDB over this connection. To further control the management of encryption keys, the KMS master key has a resource-based policy. It allows the Lambda layer to generate data keys for encryption and decryption, and restricts any administrative activity on the master key.

To deploy this solution, follow the instructions in the aws-serverless-tokenization GitHub repo. The AWS Serverless Application Model (AWS SAM) template allows you to quickly deploy this solution into your AWS account.

Understanding the code

The solution uses the tokenizer package, deployed as a Lambda layer. It uses Python UUID4 to generate random values. You can optionally update the logic in hash_gen.py to use your own tokenization technique. For example, you could generate tokens with the same length as the original text, preserving the format in the generated token.

The ddb_encrypt_item.py file contains the logic for encrypting DynamoDB items and uses a DynamoDB client-side encryption library. To learn more about how this library works, refer to this documentation.

There are three methods used in the application logic:

  • Encrypt_item encrypts the plaintext using the KMS customer managed key. In the AttributeActions object, you can specify attributes that you don’t want to encrypt. For example, you might exclude keys in the JSON input from being encrypted. The method also requires a partition key to index the encrypted text in the DynamoDB table. The hash key is used as the name of the partition key in the DynamoDB table, and its value is the UUID token generated in the previous step.
import boto3
from dynamodb_encryption_sdk.encrypted.table import EncryptedTable
from dynamodb_encryption_sdk.identifiers import CryptoAction
from dynamodb_encryption_sdk.material_providers.aws_kms import AwsKmsCryptographicMaterialsProvider
from dynamodb_encryption_sdk.structures import AttributeActions

def encrypt_item(plaintext_item, table_name):
    table = boto3.resource('dynamodb').Table(table_name)

    # aws_cmk_id is the ARN of the KMS customer managed key, defined elsewhere in the module
    aws_kms_cmp = AwsKmsCryptographicMaterialsProvider(key_id=aws_cmk_id)

    # Encrypt and sign all attributes except Account_Id, which stays in plaintext
    actions = AttributeActions(
        default_action=CryptoAction.ENCRYPT_AND_SIGN,
        attribute_actions={'Account_Id': CryptoAction.DO_NOTHING}
    )

    encrypted_table = EncryptedTable(
        table=table,
        materials_provider=aws_kms_cmp,
        attribute_actions=actions
    )
    response = encrypted_table.put_item(Item=plaintext_item)
    return response
  • Get_decrypted_item gets the plaintext for a given partition key (for example, the UUID token), using the KMS customer managed key.
  • Get_Item gets the obfuscated text (for example, the ciphertext) stored in the DynamoDB table for the provided partition key.

The dynamodb-encryption-sdk requires the cryptography library as a dependency. Both of these libraries are platform-dependent and must be installed for a specific operating system. Since Lambda functions use Amazon Linux, you must install these libraries for Amazon Linux even if you are developing application code on a different operating system. To do this, use the get_AMI_packages_cryptography.sh script to download the Docker image, install dependencies within the image, and export the files to be used by the Lambda layer.

If you are processing DynamoDB items at a high frequency and large scale, you might exceed the AWS KMS requests-per-second limit, causing processing delays. You can use tools such as JMeter to test the required throughput based on the expected traffic for this serverless application. If you need to exceed a quota, you can request a quota increase in Service Quotas. Use the Service Quotas console or the RequestServiceQuotaIncrease operation. For details, see Requesting a quota increase in the Service Quotas User Guide. If Service Quotas for AWS KMS are not available in the AWS Region, create a case in the AWS Support Center.

After following this walkthrough, to avoid incurring future charges, delete the resources following step 7 of the README file.

Conclusion

This post shows how to use AWS Serverless services to design a secure, reliable, and cost-optimized tokenization solution. It can be integrated with applications to protect sensitive information and manage access using strict controls with less operational overhead.

Replacing web server functionality with serverless services

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/replacing-web-server-functionality-with-serverless-services/

Web servers bring together many useful services in traditional web development. Developers use servers like Apache and NGINX for many common tasks. Linux, Apache, MySQL, and PHP formed the LAMP stack to power a large percentage of the world’s websites. Other variants, like the MEAN stack (MongoDB, Express.js, AngularJS, Node.js), have also been popular.

In the migration to serverless, it’s important to understand where this functionality moves to. There are significant benefits in taking a serverless approach to developing web apps but there are differences in where developers spend their efforts. This blog post provides a guide to serverless development for traditional web developers to help with this transition.

Comparing a “Hello World” example

To run a “Hello World” example in a highly available configuration using a traditional web server approach, you need more than one server in more than one Availability Zone. Each server contains an operating system, runtime, and web server software, together with your code. You might build an Amazon Machine Image (AMI) to help with creating more servers.

Scalable "Hello World"

With a web framework like Express, the following code starts a server and listens on port 3000 for connections. For requests at the root URL, it responds with the “Hello World” greeting:
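
A minimal Express example of this kind:

const express = require('express')
const app = express()
const port = 3000

// Respond to GET requests at the root URL
app.get('/', (req, res) => res.send('Hello World!'))

app.listen(port, () => console.log(`Listening on port ${port}`))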

Hello World output

There is a reasonable amount of configuration and infrastructure needed to make this example work. Even creating a TLS connection requires you to maintain a certificate or install and maintain a service like Let’s Encrypt. Additionally, you must patch and maintain the underlying EC2 instance to keep this service running once it’s deployed.

The serverless equivalent is simpler. I can define the Hello World example using an AWS Serverless Application Model (SAM) template:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: hello-world

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function 
    Properties:
      Handler: index.lambdaHandler
      Runtime: nodejs12.x
      InlineCode: | 
        exports.lambdaHandler = async (event, context) => {
          return { 'statusCode': 200, 'body': 'Hello World!' }
        }
      Events:
        HelloWorld:
          Type: Api 
          Properties:
            Path: /hello
            Method: get

The SAM deployment creates an AWS Lambda function with an Amazon API Gateway endpoint:

Serverless Hello World

This is a highly available, scalable endpoint. The developer does not need to define VPCs, subnets or security groups, or install and manage a web server stack. A considerable part of the underlying infrastructure is managed for you, letting you focus primarily on the business logic of the application.

Additionally, using the default Service Quotas, this endpoint can handle millions of requests per day. To handle the equivalent load with a traditional web server, you may need EC2 Auto Scaling. Lambda manages the scaling automatically, and also scales down as needed without any intervention from the developer.

Implementing authentication in serverless web apps

Many traditional web servers use web frameworks like Python Flask or Express and implement session-based authentication. This allows the server to authenticate users, often with a user name and password validation scheme. The server is responsible for storing user lists, and hashing and salting passwords securely. There are also user administration flows required for tasks such as creating accounts and resetting passwords.

While you can implement all these within a Lambda function, there is another approach that can be more secure and reduce boilerplate code. You can implement authorization and authentication in serverless development by using open standard JSON Web Tokens (JWTs). API Gateway then authenticates the user at the service level using Amazon Cognito, a Lambda authorizer, or with a JWT authorizer with HTTP APIs.

You use an identity provider such as Amazon Cognito or Auth0 to generate the user token. You pass the token in the API request in the Authorization header. The API Gateway service then validates the token before the request is sent downstream to your application.
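
As a sketch, assuming an illustrative API endpoint and a token already obtained from the identity provider, a browser-side call passes the token like this:

// Browser-side sketch: pass the JWT in the Authorization header.
// The endpoint URL and idToken are illustrative.
async function getHello(idToken) {
  const response = await fetch('https://abc123.execute-api.us-east-1.amazonaws.com/prod/hello', {
    headers: { Authorization: `Bearer ${idToken}` }
  })
  return response.json()
}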

While you can use JWTs in server-based web applications, there are benefits to separating out this functionality using serverless services:

  • Failed requests do not put any additional load on your infrastructure. API Gateway also does not charge for requests on authenticated routes when authorization headers are missing.
  • You eliminate custom code for handling and processing logins since this happens before reaching your business logic.
  • You can add support for social logins, multi-factor authentication (MFA) and OAuth without changing your code.

Additionally, as your application grows to more functions or across Regions, you are not relying on a single authentication point in your architecture. Each microservice validates a JWT independently and can verify the authorization claims that can be securely embedded in the token’s payload.

For web developers, one of the most common questions is how to handle the user interface elements related to authorization within the application. Auth0 offers a number of customizable components that you can integrate into any JavaScript application. Amplify Framework provides the Authenticator component that provides a wrapper for common flows for signing in users.

Amplify signin UI

Using either approach eliminates boilerplate user management code and helps provide a consistent and professional login experience for your users. To learn more about using Auth0’s integrated sign-in, see the Ask Around Me application code repo.

Generating HTML, CSS and front-end templates

Many web frameworks use templating languages like Jinja or Mustache to help developers inject dynamic content into static HTML and CSS layouts. Typically, the web server creates the entire page layout for each request. You can use the same approach with Lambda if preferred, having the function build the HTML response for the browser.

However, single-page application (SPA) frameworks such as React, Vue.js, and AngularJS offer a different paradigm that works well for serverless development. The build process for SPA applications generates static HTML, CSS, and JavaScript files. When downloaded to the browser, they use JavaScript to fetch dynamic data and interact with the backend application:

SPA backend architecture

  1. The user visits the web application’s URL. The browser downloads the application’s HTML, CSS, and JavaScript files from Amazon S3 via Amazon CloudFront.
  2. The browser executes the application’s JavaScript.
  3. The application calls API Gateway endpoints to fetch and store dynamic data.

This architecture offers a number of benefits. First, serving the application’s assets is offloaded from your infrastructure to a global CDN. This reduces latency and increases scalability. Second, the HTML page building and rendering is managed entirely by the client browser, improving responsiveness and reducing network traffic with the application backend.

Uploading, processing, and saving binary files

Many web applications handle large binary files, such as user uploads. Processing these on web servers can be compute and network-intensive. You must also manage the amount of temporary space in use on the web server, and scale the fleet of servers appropriately during busy periods.

You can upload files serverlessly by using Amazon S3 directly. In this process, you request a presigned URL and upload the binary data directly to this endpoint. This reduces load on your infrastructure and increases scalability. The code is also simple to adapt for non-serverless applications that use S3. Watch this video to see how you can build an S3 uploader solution.
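
A minimal sketch of the Lambda function behind such an endpoint, assuming an illustrative bucket environment variable, key, expiry, and content type:

const AWS = require('aws-sdk')
const s3 = new AWS.S3()

exports.handler = async () => {
  // Generate a presigned PUT URL valid for five minutes
  const key = `uploads/${Date.now()}.jpg`
  const uploadURL = await s3.getSignedUrlPromise('putObject', {
    Bucket: process.env.UploadBucket,
    Key: key,
    Expires: 300,
    ContentType: 'image/jpeg'
  })
  return { statusCode: 200, body: JSON.stringify({ uploadURL, key }) }
}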

For processing binaries, you can use the S3 PutObject event to trigger serverless workflows. For example, you can process images, translate documents, or transcribe audio. For complex business workflows, the event can trigger AWS Step Functions workflows. This is a highly scalable way to bring automation and custom processing to binary uploads in your web applications.

When processing binary data, Lambda provides a 512 MB temporary file system (located at /tmp). You use this space for intermediate processing, not permanent storage, since the storage is ephemeral. For example, this can be useful for unzipping files or creating PDFs.

When saving files permanently in serverless applications, S3 buckets are the most common storage choice. S3 is highly durable and highly available, provides robust encryption options, and is a scalable, cost-effective solution for many workloads.

Storing application state

In many traditional applications, the web server stores temporary, context-specific application state, and a relational database stores data permanently.

Serverless tools have a range of different options available for managing state. Lambda functions are ephemeral and stateless, and there is no guarantee of reusing the same instance of a Lambda function multiple times.

For functions that need a durable store of user data that can be rehydrated between invocations, Amazon DynamoDB tables provide a low-latency, cost-effective solution. For example, this is ideal for recalling shopping cart contents or user profiles.

For more complex state, such as tracking long-lived or complex business workflows, the best practice is to use AWS Step Functions. You can model workflows in JSON that use parallel tasks, require human interaction, or take up to one year to complete.

Conclusion

In this post, I show how traditional web server applications compare with their serverless counterparts. I show how the infrastructure is managed for you in serverless, and how code for serverless developers is primarily focused on business logic.

I look at how common web server tasks, such as authentication and authorization, are managed by scalable services. In single-page applications, front-end layouts are generated on the client-side, and the distribution is managed by a global CDN.

To learn more about how to build web applications with serverless, see the Ask Around Me application repo.

Building deep learning inference with AWS Lambda and Amazon EFS

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-deep-learning-inference-with-aws-lambda-and-amazon-efs/

This post is courtesy of Giuseppe Angelo Porcelli, Principal ML Specialist SA, and Diego Natali, Solutions Architect.

Amazon EFS for AWS Lambda makes it easier to build serverless applications requiring persistent file storage or access to large amounts of reference data. Previously, applications had to download data from an object store or database to local ephemeral storage, limited to 512 MB, for processing. This requires more code, causes slower startup behavior, and slows data processing. Customers also faced challenges when loading large code packages and models for ML inference.

Recently, AWS announced Amazon EFS support for AWS Lambda. It enables customers to easily share data across function invocations. It also allows you to read large reference data files, and write function output to a persistent and shared data store. Customers can now use Lambda to build data-intensive applications, and load larger libraries and models. They can process larger amounts of data in a highly distributed manner, and share data across functions, containers, and instances.

In this blog post, we show how you can use EFS to store deep learning (DL) framework libraries and models to load from Lambda to execute inferences. We provide a code example on executing serverless inferences with TensorFlow 2.

Using EFS and Lambda for deep learning inference requires two steps:

  1. Storing the deep learning libraries and model on EFS
  2. Creating a Lambda function for inference, which loads the libraries and model from the EFS file system

In the next sections, we share some best practices to implement these steps, and then discuss a full, working example.

Prerequisites

This post assumes experience with Lambda and EFS, plus general knowledge of Python programming, DL, and DL frameworks. To help you get started, read the blog post and documentation.

1. Storing the deep learning libraries and model on Amazon EFS

To populate EFS with the DL framework Python libraries and the DL model, there are different options. You can use EC2 instances, third-party tools like lambci, or AWS CodeBuild. AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages for deployment.

This blog post uses an AWS CodeBuild project, configured as follows:

  • The build environment is a Docker container replicating the Lambda runtime environment. To make sure that the packages work in Lambda, it uses the lambci/lambda build container images on Docker Hub.
  • The EFS file system is mounted to the CodeBuild environment.
  • Build commands are used to install the DL framework and download the model to specific paths of the file system.

After the build completes, the EFS file system contains the Python libraries and the model in specific paths. It is attached to the Lambda function, which loads those libraries at runtime and executes inference.

For this example, these are the CodeBuild commands to install the TensorFlow 2 framework and an SSD (Single Shot MultiBox Detector) pre-trained object detection model from TensorFlow Hub:

'echo "Downloading and copying model..."',
'mkdir -p $CODEBUILD_EFS1/lambda/model',
'curl https://storage.googleapis.com/tfhub-modules/google/openimages_v4/ssd/mobilenet_v2/1.tar.gz --output /tmp/1.tar.gz',
'tar zxf /tmp/1.tar.gz -C $CODEBUILD_EFS1/lambda/model',

'echo "Installing virtual environment..."',
'mkdir -p $CODEBUILD_EFS1/lambda',
'python3 -m venv $CODEBUILD_EFS1/lambda/tensorflow',
'echo "Installing Tensorflow..."',
'source $CODEBUILD_EFS1/lambda/tensorflow/bin/activate && pip3 install ' +
              (props.installPackages ? props.installPackages : "tensorflow"),

'echo "Changing folder permissions..."',
'chown -R 1000:1000 $CODEBUILD_EFS1/lambda/'

Considerations

  • The approach described can also work for other ML/DL frameworks.
  • The EFS file system can be attached to multiple Lambda functions. This means it can share the DL framework libraries with multiple inference functions (up to 25,000 connections for each file system).
  • There are alternatives to using EFS for model storage. If the model fits within the Lambda deployment package, you could optimize the first invocation since the function doesn’t need to download the model. You can also use the function’s initializer to load the model, since the first mount of EFS only takes a few hundred milliseconds.

2. Creating a Lambda function for inference

After attaching the EFS file system, you may structure the Lambda code as follows:

Lambda code structure

The code outside the handler method first adds the local mount path to the Python path. It then imports the frameworks, and loads the model into memory. Executing those operations outside of the function’s handler ensures that those objects remain initialized and reused in subsequent invocations of the same Lambda function instance. The code inside the handler runs the inference flow by reading inputs, executing the actual inference, and returning the results to the caller.

For hosting the TensorFlow 2 object detection model in the example, this is the function code:

import sys
import os

# Setting library paths.
efs_path = "/mnt/python"
python_pkg_path = os.path.join(efs_path, "tensorflow/lib/python3.8/site-packages")
sys.path.append(python_pkg_path)

import json
import string
import time
import io
import requests

# Importing TensorFlow
import tensorflow as tf

# Loading model
model_path = os.path.join(efs_path, 'model/')
loaded_model = tf.saved_model.load(model_path)
detector = loaded_model.signatures['default']

def lambda_handler(event, context):
    r = requests.get(event['url'])
    img = tf.image.decode_jpeg(r.content, channels=3)

    # Executing inference.
    converted_img  = tf.image.convert_image_dtype(img, tf.float32)[tf.newaxis, ...]
    start_time = time.time()
    result = detector(converted_img)
    end_time = time.time()

    obj = {
        'detection_boxes' : result['detection_boxes'].numpy().tolist(),
        'detection_scores': result['detection_scores'].numpy().tolist(),
        'detection_class_entities': [el.decode('UTF-8') for el in result['detection_class_entities'].numpy()] 
    }    

    return {
        'statusCode': 200,
        'body': json.dumps(obj)
    }

When invoked, the response looks like this:

{
    "statusCode": 200,
    "body": "{
    \"detection_boxes\": This field contains the relative position of the bounding boxes,
    \"detection_class_entities\": This field returns the class labels,
    \"detection_scores\": This field returns the detection confidences
    }"
}

Running the example

This working example is provided to set up and run ML/AI inference on Lambda using EFS. To run it, you must have the AWS CDK installed. Execute the following commands:

# clone repository
$ git clone https://github.com/aws-samples/lambda-efs-deep-learning-inference.git
$ cd lambda-efs-ml-demo

# Install the CDK and bootstrap the target account (if this was never done before)
$ npm install -g aws-cdk
$ cdk bootstrap aws://{account_id}/{region}

# Install packages for the project, build and deploy
$ cd cdk/
$ npm install
$ npm run build
$ cdk deploy

After deployment, note the output:

Outputs:
LambdaEFSMLDemo.LambdaFunctionName = LambdaEFSMLDemo-LambdaEFSMLExecuteInference17332C2-0546aa45dfXXXXXX

It takes a few minutes for AWS CodeBuild to deploy the libraries and framework to EFS. To test the Lambda function, run this command, replacing the function name:

$ aws lambda invoke \
    --function-name LambdaEFSMLDemo-LambdaEFSMLExecuteInference17332C2-0546aa45dfXXXXXX \
    --region us-east-1 \
    --cli-binary-format raw-in-base64-out \
    --payload '{"url": "https://images.pexels.com/photos/310983/pexels-photo-310983.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=650&w=940"}' \
    /tmp/return.json

This is the output:

{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}

Here you can check the inference result:

$ tail /tmp/return.json


Inference result

The following image shows the bounding boxes created from the inference output.

Image with bounding boxes

To generate this image with the bounding boxes, use the Jupyter notebook from the repository. We reduce the number of bounding boxes to the most relevant classes:

  • Bicycle: 91%
  • Wheel: 48%
  • Person: 45%
  • Wheel: 44%
  • Man: 40%
  • Bicycle wheel: 37%
  • Bicycle wheel: 30%

To clean up the deployment, run:

$ cdk destroy

Performance considerations

When planning for ML inference, you must keep three main aspects in mind: the type of compute resources required for inference; the model size and memory footprint; and function initialization and cold starts.

Lambda is best suited for CPU-based inferencing, which meets the needs for most ML/DL inference use cases. Lambda’s memory can be set between 128 MB and 3008 MB. This means that large models (for example, FasterRCNN models) that may require more memory or dedicated GPUs are not a good fit.

It’s important to understand how Lambda invocations affect performance. The first request to a function instance is called a “cold start”. This is when the function is provisioned, the code is downloaded, and the initializer is executed to load libraries. In this example, it takes about 40 seconds to load the full TensorFlow 2 libraries from EFS, and another 8 seconds to load the model into memory.

Subsequent calls to the same Lambda function instance don’t incur cold start latency if the request is handled by an existing execution environment. Customers who want to reduce this one-time cold start can use Provisioned Concurrency. This feature provides customers with greater control over performance of their serverless applications at any scale.

The EFS mount operation only takes a few hundred milliseconds and only happens once, during function provisioning. EFS supports up to 25,000 connections, so it is ideal for functions that scale up. We recommend using EFS provisioned throughput with Provisioned Concurrency for better performance. To learn more, read the documentation about Amazon EFS performance and monitoring Amazon EFS.

Conclusion

This post shows how you can use EFS for Lambda to deploy large DL libraries and models into a function for synchronous invocations. The same approach can be applied to asynchronous invocations. For example, you could perform object detection on images stored in Amazon S3, or process streaming data from Amazon Kinesis and Amazon DynamoDB.

EFS for Lambda enables many new use cases. To learn more about how to use EFS for Lambda, see the AWS News Blog post and read the documentation.

Modeling business logic flows in serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/modeling-business-logic-flows-in-serverless-applications/

Serverless applications can help you develop more agile applications that can scale automatically. Using serverless services in your architecture reduces the amount of boilerplate code and helps offload complex tasks to specialized services. As a result, a well-designed serverless application can be modified easily to deliver new feature requests, while maintaining high availability for existing production users.

Serverless applications often combine serverless services, using AWS Lambda to integrate those services and transform data. This blog post shows how to model business logic in serverless architectures, combining the services available. I also discuss how to handle state across invocations, and manage complex workflows.

Developing architecture around evolving features

Many modern applications evolve quickly to reflect the needs of their users. Developers iterate on features, releasing new versions daily or weekly, and the feature set is guided by feedback. This user-centric approach can make early architectural plans hard to develop, since developers have limited knowledge about future requirements.

Well-designed serverless applications are inherently flexible, making it faster to add new functionality as user requirements change. This is because individual parts of the workflow are specialized and loosely coupled. This can help support iterative development and also help reduce the amount of rewritten code when the design changes.

In the following example, I show how a serverless architecture can evolve due to changing requirements. In this scenario, a custom serverless application processes customer reviews in an e-commerce website.

Storing customer reviews

The serverless application saves customer reviews submitted from a webpage. The initial design of the architecture only commits the review to a database.

Version 1 architecture

The user submits the review on the webpage, which calls Amazon API Gateway. This API invokes a Lambda function to store the review in an Amazon DynamoDB table.

Version 2: Translating reviews to a common language

After gathering feedback, there is a new requirement. Incoming reviews are currently submitted in multiple languages. These must be converted into a single language to help customer support teams.

Version 2 architecture

To solve this problem, the developer adds a new Lambda function invoked from a DynamoDB stream, using Amazon Translate to process the translation work. It stores the result back in DynamoDB.

Version 3: Responding to negative reviews

Customer support needs assistance in identifying negative reviews more quickly. In this version, the application analyzes the sentiment of the review. Comments that score negatively are emailed to a manager to take further action.

Responding to negative reviews

The developer adds a second Lambda function, invoked by the same DynamoDB stream. This sends the comment text to Amazon Comprehend to run a sentiment analysis. If the resulting score is under a threshold, the function uses Amazon SNS to email a manager.

Version 4: Enabling users to upload photos with the reviews

In addition to text comments, users need to upload photos. To support this feature, there is a new API Gateway endpoint that invokes a Lambda function to generate an Amazon S3 presigned URL. The browser then uploads the media object directly to the S3 bucket.

Uploading photos

By the latest version, the application is considerably different from the original, but the developer has been able to keep pace with new customer requests. As users create more requirements in each iteration, this serverless architecture evolves.

By using services effectively, the developer is able to assemble serverless building blocks to complete tasks quickly. By creating specialized units of custom code, new features can be added without changing existing code from previous versions. This makes it faster to test and deploy new functionality.

Managing local state in a stateless environment

In serverless applications, Lambda functions are ephemeral and should be designed to operate without storing state. This means that each time a function is invoked, it has no knowledge of previous interactions. Each request is handled independently.

Yet there are many tasks in an application that rely on existing state. For example, in an e-commerce application, a user may add items to a shopping cart. In a traditional web server environment, a load balancer may use sticky sessions. This setting ensures that a single user’s requests are routed to the same server. The state of the cart is kept in memory on that web server, and the process is stateful.

Since Lambda functions can scale up automatically and can run in multiple Availability Zones, each invocation may happen in a different execution environment. There is no guarantee that subsequent invocations are served by the same instance of a function. By using a stateless approach, your application is unaffected if this happens, and there is no conceptual equivalent of sticky sessions.

For single Lambda functions that need to retain state, the function should fetch the state each time the function is invoked. In the e-commerce example, if a Lambda function adds an item to a shopping cart, it can fetch relevant state information from a DynamoDB table before adding the item.
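
A sketch of this pattern, with a hypothetical cartTable and event shape:

const AWS = require('aws-sdk')
const documentClient = new AWS.DynamoDB.DocumentClient()

exports.handler = async (event) => {
  // Rehydrate the current cart state before modifying it
  const { Item } = await documentClient.get({
    TableName: 'cartTable',
    Key: { userId: event.userId }
  }).promise()

  const cartItems = (Item && Item.cartItems) || []
  cartItems.push(event.newItem)

  // Persist the updated state so any future invocation can fetch it
  await documentClient.put({
    TableName: 'cartTable',
    Item: { userId: event.userId, cartItems }
  }).promise()

  return { statusCode: 200, body: JSON.stringify(cartItems) }
}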

Local function state with DynamoDB

Due to the low-latency performance of DynamoDB, this step adds minimal overhead but gives the application significantly better scaling capability. It can also make functions easier to unit test and debug, because there is less variability caused by local caching between invocations.

Orchestrating stateful workflows in a serverless environment

Beyond managing the state of shopping carts, it’s common for business applications to model complex workflows reflecting a custom business process. These workflows may persist for long periods of time, require human intervention, or call out to third-party systems.

For example, when a customer places an order in an e-commerce application, this triggers a workflow. There are many different potential steps and paths involved across different systems. From processing a payment to scheduling a delivery, managing the lifecycle of an order is a complex task. This gets even more difficult when you consider error states such as a failed payment or a rescheduled delivery.

It’s possible to write custom code to orchestrate these flows, using a database to save the state of each process. Your code must manage retries for downstream failures and undo previous steps if a process fails. While it’s possible to engineer this for your solution, it can result in significant amounts of convoluted code. It can quickly become fragile and difficult to modify as processes change.

AWS Step Functions is designed to manage workflows, and is usually a better approach. You can model the workflow in JSON, replacing complex custom logic in your code. Step Functions can manage individual executions in a workflow that last up to one year. It can also handle different versions of workflows, making it easier to change processes without impacting in-flight executions.
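
For example, a fragment of a hypothetical order workflow in Amazon States Language, with illustrative function ARNs, defines retry logic declaratively:

{
  "StartAt": "ProcessPayment",
  "States": {
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessPayment",
      "Retry": [{
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 5,
        "MaxAttempts": 2
      }],
      "Next": "ScheduleDelivery"
    },
    "ScheduleDelivery": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ScheduleDelivery",
      "End": true
    }
  }
}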

Moving workflow logic into Step Functions can help create more resilient applications. You can explicitly define parallel processes, error handling, and retry logic. The service scales to handle almost any volume of individual executions, letting you focus on the custom logic specific to your application.

Workflows across different systems and beyond AWS

It’s common to build applications that need to interact with other existing applications within your organization. Increasingly, customers also use software as a service (SaaS) providers for significant parts of their systems. In traditional approaches, interoperability with these systems can be complicated.

Amazon EventBridge is a serverless event bus service designed to make it easier to communicate across systems. It integrates with third-party SaaS providers like Auth0, PagerDuty and Datadog. It also receives and routes events from AWS services and your own custom applications.

In a traditional architecture, you typically use a polling mechanism or develop a webhook to retrieve data from a third-party service. This API is publicly accessible and must be secured. You must also scale the API if the third-party service increases the volume of data sent to your endpoint.

Using an EventBridge integration, this data arrives directly in your AWS account as an event. You configure rules in EventBridge to define targets where events are routed to. The data from the third party does not use the public internet and the delivery, security, and error handling is managed by the service.
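
Your own applications publish to the same bus with a single API call. A minimal sketch, using illustrative source and detail-type values:

const AWS = require('aws-sdk')
const eventBridge = new AWS.EventBridge()

exports.handler = async (event) => {
  // Publish a custom application event for EventBridge rules to route
  await eventBridge.putEvents({
    Entries: [{
      Source: 'myapp.orders',
      DetailType: 'OrderCreated',
      Detail: JSON.stringify({ orderId: event.orderId }),
      EventBusName: 'default'
    }]
  }).promise()

  return { statusCode: 200 }
}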

Conclusion

Serverless applications are distributed applications, using a combination of managed services and custom code. Understanding how to assemble and use these services can help improve agility and your team’s ability to deliver software features quickly.

Writing less code and creating more specialized functions also helps you layer additional features and functionality into subsequent versions. It can improve testing, reduce rewritten code, and help reduce deployment complexity.

While serverless applications should be stateless at the function level, you can use DynamoDB to manage state locally. This offers low-latency, durable state management while enabling your functions to scale.

AWS provides a number of services for building and managing workflows. EventBridge enables you to connect data between different applications and from SaaS providers. Step Functions makes it faster to develop and orchestrate complex workflows that make your application resilient.

Using these services effectively can reduce custom code and help you deliver solutions more quickly. To see this working in a sample serverless project, see the Ask Around Me blog series.

Creating low-latency, high-volume APIs with Provisioned Concurrency

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/creating-low-latency-high-volume-apis-with-provisioned-concurrency/

The AWS Lambda service runs customer code on-demand in response to events. It works by creating a new execution environment and downloading your code. This initial setup is commonly called a “cold start” and introduces latency to the total execution time of the function.

Cold starts happen when you first invoke a function, or when a function is invoked after being inactive for an extended period. They also happen when Lambda scales up a function, since each new instance of the function is a new execution environment.

The serverless community has previously created “function warmer” libraries to help improve the likelihood of a Lambda invocation using an existing execution environment. This is a good approach for development and test workloads, or where you do not need hyper-ready performance. The Provisioned Concurrency feature is designed for workloads needing predictable low-latency.

This blog post shows how to eliminate cold starts in architectures supporting web applications. I reference code from the Ask Around Me example application. This allows users to ask and answer questions in their local geographic area in real time. To learn more, refer to part 1 of the blog application series.

Cold starts and web applications

The Ask Around Me application uses the following backend architecture:

Ask Around Me backend architecture

This represents a typical web application. Some Lambda functions are invoked by Amazon API Gateway while others are invoked by services further down the application stack. API Gateway invokes Lambda functions synchronously, meaning the caller is blocked until the function returns a value.

Functions invoked by services like Amazon SQS and Amazon DynamoDB are called asynchronously. This means that the caller continues with other work during execution and the function does not return a value. This application uses both types of invocation:

Sync and async parts of the backend

Generally, cold starts are less impactful in asynchronous executions. The latency overhead in starting the execution environment usually has less impact on overall performance of the application in this case. For web applications in particular, cold starts are most noticeable in synchronous applications closer to the frontend. This is where the speed of the API request has the most influence on the user experience of your application.

In Ask Around Me, there are four Lambda functions supporting the API endpoints for the application. Three of these are lightweight functions that put messages in SQS queues and retrieve data from a DynamoDB table. The most complex function is GetQuestions, which fetches questions based upon the latitude and longitude of the user. It is also expected to receive the most usage, at around 50,000 queries per hour, so it’s the most important candidate for performance optimization.

Measuring the existing Lambda function performance

In a previous blog post on load testing this application, the GetQuestions API shows considerable variability in performance. In an API load test for 30 seconds with 20 requests per second, the median response is 175 ms while the slowest is 2149 ms:

Load testing performance output

In this application, the frontend application waits until this synchronous API call is completed. The median performance is likely acceptable to a user, whereas any response time over one second makes interaction with the application appear slow.

To gain more insight into the performance of this function, I turn on AWS X-Ray for this function. From the Lambda console, I select the GetQuestions function and check the Active tracing check box in the AWS X-Ray panel. After saving the function, X-Ray is now enabled.

Enable active tracing in Lambda

I re-run the load test for this function and navigate to the X-Ray console. In the Analytics menu, the Response time distribution panel graphs the performance of the function invocations:

Response time distribution

I select all the invocations in the graph after the p95 marker, representing the slowest 5% of all requests. This filters 34 slow requests, which correspond to the number of Concurrent Executions for the function shown in the function’s Metrics console:

Concurrent executions

X-Ray lists the individual traces for the 34 slowest calls, and selecting the slowest single invocation breaks down the durations of each segment:

Slowest single invocation

This analysis shows that this function’s performance is impacted by cold starts. The initialization of the execution environment and the function code is contributing over 1 second of latency in this example. Each of the 34 slowest invocations corresponds with scaling up events for this function.

Configuring Provisioned Concurrency for a Lambda function

Provisioned concurrency is a Lambda feature that allows you to prepare execution environments before receiving traffic. In addition to downloading the function’s code, it also runs the initialization code outside of the main Lambda handler. This provides a reliable way to keep functions ready to respond within double-digit millisecond latency.

While all Provisioned Concurrency functions start more quickly than the existing on-demand Lambda execution style, this is particularly beneficial for certain function profiles. Runtimes like C# and Java have much slower initialization times than Node.js or Python, but faster execution times once initialized. With Provisioned Concurrency turned on, these runtimes benefit from both the consistent low latency of the function’s start-up and the performance during execution.

To enable Provisioned Concurrency for a Lambda function:

  1. Go to the AWS Lambda console and then choose your existing Lambda function.
  2. Provisioned concurrency settings must be applied to a published version or an alias. Go to the Actions drop-down and choose Publish new version.

    Publish new version

  3. Choose Publish. Scroll down to the Concurrency panel and choose Add Configuration.
    Configure Provisioned Concurrency
  4. Enter your preferred concurrency and choose Save.
  5. After a few minutes, Lambda has prepared the execution environments and the Status shows Ready in the console.

    Status is ready

It’s important to remember that the feature is applied explicitly to a function version or alias. Ensure that your invocation method is calling this alias, and not the $LATEST version. Provisioned Concurrency cannot be applied to the $LATEST version.

When configuring Provisioned Concurrency, you select the amount of capacity to provision. During usage, if you exceed this level, any additional function invocations use the on-demand model. These invocations exhibit the more typical Lambda start-up performance profile, but you are not throttled or limited from running invocations at high levels of throughput.

Using Amazon CloudWatch Logs or the Monitoring tab for your function in the Lambda console, you can see metrics for the number of Provisioned Concurrency invocations, compared with the total. This can help identify when total load is above the amount of concurrency, and you can make changes accordingly.

You can also use Application Auto Scaling to help you automate provisioning the appropriate capacity. Instead of reserving a fixed amount of capacity, this increases the amount of concurrency during peak loads, and decreases as load reduces. You can configure this in both the AWS CLI and AWS Serverless Application Model.
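You can also configure this with the AWS SDK. As an illustration, here is a sketch of the target tracking approach using the AWS SDK for JavaScript, where the alias, capacity bounds, and 70% utilization target are example assumptions:

const AWS = require('aws-sdk')
const autoscaling = new AWS.ApplicationAutoScaling({ region: 'us-east-1' })

async function configureAutoScaling () {
  // Register the function alias as a scalable target for Provisioned Concurrency
  await autoscaling.registerScalableTarget({
    ServiceNamespace: 'lambda',
    ResourceId: 'function:GetQuestions:live',   // hypothetical function:alias
    ScalableDimension: 'lambda:function:ProvisionedConcurrency',
    MinCapacity: 5,
    MaxCapacity: 100
  }).promise()

  // Scale towards 70% utilization of the provisioned capacity
  await autoscaling.putScalingPolicy({
    PolicyName: 'pc-target-tracking',
    ServiceNamespace: 'lambda',
    ResourceId: 'function:GetQuestions:live',
    ScalableDimension: 'lambda:function:ProvisionedConcurrency',
    PolicyType: 'TargetTrackingScaling',
    TargetTrackingScalingPolicyConfiguration: {
      TargetValue: 0.7,
      PredefinedMetricSpecification: {
        PredefinedMetricType: 'LambdaProvisionedConcurrencyUtilization'
      }
    }
  }).promise()
}

configureAutoScaling().catch(console.error)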

Comparing performance before and after Provisioned Concurrency

I run the same load test on the same function, now using Provisioned Concurrency – 20 requests per second over 2 minutes. The results show a median latency of 165 ms, a p95 time of 202 ms, and a slowest execution of 532 ms:

Load test result after Provisioned Concurrency

In X-Ray, the latest Response time distribution graph shows the significantly improved performance across the 2400 requests:

Load test result in X-Ray

By enabling Provisioned Concurrency for this Lambda function, the slowest performance has been improved by 75%. The function can serve 1200 requests per minute with a much more consistent performance for users.

Function warmers and Provisioned Concurrency

The broader serverless community offers open source libraries to “warm” Lambda functions via a pinging mechanism. This approach uses Amazon CloudWatch Events to invoke the function every minute to help keep the execution environment active. As a result, this can increase the likelihood of using a warm environment when you invoke the function.
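As a sketch of how such a warmer typically works, the function handler short-circuits the scheduled ping before running any business logic. The event-matching condition below is an assumption — each warming library defines its own payload format:

exports.handler = async (event) => {
  // Scheduled CloudWatch Events pings arrive with this source and detail-type
  if (event.source === 'aws.events' && event['detail-type'] === 'Scheduled Event') {
    return { warmed: true }   // exit early, keeping the execution environment active
  }

  // ... normal request processing continues here
}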

However, this is not a guaranteed way to reduce cold starts. It does not help in production environments when functions scale up to meet traffic. It also does not work if the Lambda service runs your function in another Availability Zone as part of normal load balancing operations. Additionally, the Lambda service reaps execution environments regularly to keep these fresh, so it’s possible to invoke a function in between pings. In all of these cases, you experience cold starts despite using a warming library.

This approach might be adequate for development and test environments, or low-traffic or low-priority workloads. However, if you need predictable function start times for your workload, Provisioned Concurrency is the recommended solution. It keeps your functions initialized and ready to respond in double-digit milliseconds at the scale you need.

Conclusion

This post examines how cold starts impact performance in serverless backends for web applications. It shows how the most important focus area is usually synchronous APIs called by the frontend application. I explain options available for targeting cold starts in the Lambda service.

Using the Ask Around Me application, I apply Provisioned Concurrency to the most latency-sensitive Lambda function. I compare the load testing performance before and after enabling this feature. This shows predictable start-up times and a 75% reduction in the slowest execution time. Finally, I show when you might use function warmers, and how Provisioned Concurrency is more suitable for latency-sensitive and production workloads.

To learn about the cost of using this feature, visit the Lambda pricing page.

Integrating Amazon EventBridge and Amazon ECS

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/integrating-amazon-eventbridge-and-amazon-ecs/

This post is courtesy of Jakub Narloch, Senior Software Development Engineer.

Today, AWS announced support for Amazon API Gateway as an event target in Amazon EventBridge. This feature enables new integration scenarios for web applications and services. It allows customers to seamlessly connect their infrastructure, SaaS services, and APIs hosted in AWS.

With API Gateway as a target for EventBridge, this creates new integration capabilities for new or existing web applications. This post explains how developers can now deliver events directly to applications hosted on Amazon ECS, Amazon Elastic Kubernetes Service (EKS), or Amazon EC2 using EventBridge and API Gateway. I show how to build an event-driven application running on ECS with AWS Fargate that processes events from EventBridge using the API Gateway integration.

EventBridge is a serverless event bus that makes it easy to connect applications together. It uses data from your own applications, integrated software as a service (SaaS) applications, and AWS services. This simplifies the process of building event-driven architectures by decoupling event producers from event consumers. This allows producers and consumers to be scaled, updated, and deployed independently. Loose coupling improves developer agility in addition to application resiliency.

API Gateway helps developers to create, publish, and maintain secure APIs at any scale. When used with EventBridge, API Gateway authenticates and authorizes API calls. It also acts as an HTTP proxy layer for integrating other AWS services or third-party web applications.

Previously, EventBridge customers could consume events in ECS via Amazon SNS or Amazon SQS, or by triggering an ECS task directly. API Gateway as a target provides an alternative to these approaches and adds API Gateway features like authentication and rate limiting. This can help you build more resilient and feature-rich integrations. API Gateway throttling limits the maximum number of events delivered at the same time, while EventBridge retries event delivery for up to 24 hours.
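Under the hood, this integration is an EventBridge rule with an API Gateway endpoint as its target. The sample application creates these resources with the AWS CDK, but as a sketch, the equivalent AWS SDK for JavaScript calls look like this (the ARNs are placeholders):

const AWS = require('aws-sdk')
const eventbridge = new AWS.EventBridge({ region: 'us-west-2' })

async function createRule () {
  // Match the ecommerce CreateOrder events shown later in this post
  await eventbridge.putRule({
    Name: 'create-order-rule',
    EventPattern: JSON.stringify({ source: ['ecommerce'], 'detail-type': ['CreateOrder'] })
  }).promise()

  // Route matching events to the API Gateway endpoint
  await eventbridge.putTargets({
    Rule: 'create-order-rule',
    Targets: [{
      Id: 'orders-api',
      Arn: 'arn:aws:execute-api:us-west-2:123456789012:abcd123/prod/PUT/orders', // hypothetical API ARN
      RoleArn: 'arn:aws:iam::123456789012:role/eventbridge-invoke-api'           // role allowed to invoke the API
    }]
  }).promise()
}

createRule().catch(console.error)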

This blog post uses an ecommerce application as an example of a custom integration. The application is responsible for processing customer orders. The following diagram illustrates the interaction of the components of the system. The application itself is hosted as an ECS service running on AWS Fargate.

Architecture diagram

To achieve high availability, the application cluster is distributed across subnets in different Availability Zones. The Application Load Balancer ensures that the incoming traffic is distributed across the nodes in the cluster. API Gateway is responsible for authenticating requests and routing to the backend. The application logic is responsible for receiving the event and persisting it in Amazon DocumentDB.

The order event is modeled as follows:

{
  "version": "0",
  "region": "us-east-1",
  "account": "123456789012",
  "id": "4236e18e-f858-4e2b-a8e8-9b772525e0b2",
  "source": "ecommerce",
  "detail-type": "CreateOrder",
  "resources": [],
  "detail": {
    "order_id": "ce4fe8b7-9911-4377-a795-29ecca9d7a3d",
    "create_date": "2020-06-02T13:01:00Z",
    "items": [
      {
        "product_id": "b8575571-5e91-4521-8a29-4af4a8aaa6f6",
        "quantity": 1,
        "price": "9.99",
        "currency": "CAD"
      }
    ],
    "customer": {
      "customer_id": "5d22899e-3ff5-4ce0-a2a3-480cfce39a56"
    },
    "payment": {
      "payment_id": "fb563473-bef4-4965-ad78-e37c6c9c0c2a",
    },
    "delivery_address": {
      "street": "510 W Georgia St",
      "city": "Vancouver",
      "state": "BC",
      "country": "Canada",
      "zip_code": "V6B0M7"
    }
  }
}

Application layer
The application that processes the orders is implemented using a reactive stack through Spring Boot. A reactive application design can help build a scalable application capable of handling thousands of transactions per second from a single instance. This is important for applications with high throughput and can help in achieving economies of scale.

The resource handler
The application defines an EventResource, which acts as the entry handler for receiving events from EventBridge and processing them. The handler logic is responsible for unmarshaling the event and retrieving the order details from the event detail. The order is then persisted in DocumentDB using a dedicated DAO instance.

@Slf4j
@RequestMapping("/orders")
@RestController
public class EventResource {
 
    private final OrderRepository orderRepository;
 
    public EventResource(OrderRepository orderRepository) {
        this.orderRepository = Objects.requireNonNull(orderRepository);
    }
 
    @RequestMapping(method = RequestMethod.PUT)
    public Mono<ResponseEntity<Object>> onEvent(@Valid @RequestBody Event<Order> event) {
 
        log.info("Processing order {}", event.getDetail().getOrderId());
 
        return orderRepository.save(event.getDetail())
                .map(order -> ResponseEntity.created(UriComponentsBuilder.fromPath("/events/{id}")
                        .build(order.getOrderId())).build())
                .onErrorResume(error -> {
                    log.error("Unhandled error occurred", error);
                    return Mono.just(ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build());
                });
    }
}

The handler is mapped to the ‘/orders’ path. The implementation unmarshals the event payload and stores it in DocumentDB. Upon successful execution, the service responds with a 201 Created HTTP status code.

You can store EventBridge events in a document database like Amazon DocumentDB. This is a non-relational database that allows you to store JSON content directly. This example uses DocumentDB, making it easy to write the event payload directly. It also supports general querying of the event content.

Prerequisites
To build and deploy the application, you must have AWS CDK and JDK 11 installed. Start by cloning the GitHub repository. The repository contains the example code and supporting infrastructure for deploying to AWS.

Step 1: Create an Amazon ECR repository
Start by creating a dedicated Amazon ECR repository, where Docker images are uploaded. There is an AWS CDK template in the application code repo for this purpose.

First, install Node.js dependencies needed to execute the CDK command:

cd ../eventbridge-integration-solution-aws-api-cdk
npm install

Next, compile the CDK TypeScript template.

npm run build

Next, synthesize the CloudFormation stack.

cdk synth "*"

Now bootstrap CloudFormation resources needed to deploy the remaining templates.

cdk bootstrap

Finally, deploy the stack that creates the Amazon ECR registry.

cdk deploy EventsRegistry

Step 2: Build the application

Before the application is deployed, it must be built and uploaded to Amazon ECR.
To get started, compile the source code and build the application distribution.

cd ../eventbridge-integration-solution-aws-api
./gradlew clean build

Step 3: Containerize the application
The build system is configured to include the task for containerizing the artifacts and creating the Docker image. To create a new Docker image from the build artifact, run the following command:

./gradlew dockerBuildImage

The build task generates the Dockerfile using the provided settings. It then executes the docker build command to create a new Docker image named eventbridge-integration-solution-aws-api.

Step 4: Upload the image to Amazon ECR
You can now upload the image directly to Amazon ECR. First, log in to the Amazon ECR registry through Docker. Replace AWS_ACCOUNT_ID with your AWS account ID.

aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin "${AWS_ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com"

Before uploading the image to ECR, tag it with the expected remote repository name. To do that, first list all of the Docker images.

docker images

Copy the image ID of the eventbridge-integration-solution-aws-api image and use it as DOCKER_IMAGE in the following tag command, again replacing AWS_ACCOUNT_ID.

docker tag $DOCKER_IMAGE "${AWS_ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/eventbridge-integration-solution-aws-api"

Finally, push the Docker image to ECR, replacing AWS_ACCOUNT_ID with your AWS account ID.

docker push "${AWS_ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/eventbridge-integration-solution-aws-api"

Step 5: Deploy the application stack
Once the application image is uploaded to Amazon ECR, you can deploy the entire application stack using CDK. The stack creates multiple resources including a VPC, DocumentDB cluster, ECS TaskDefinition and Service, Application Load Balancer, API Gateway and EventBridge rule. You can inspect the resources created in the CDK definition by opening the TypeScript files in the eventbridge-integration-solution-aws-api-cdk/lib directory.

At this point, you can proceed with deploying the CloudFormation stack.

cd ../eventbridge-integration-solution-aws-api-cdk
cdk deploy "*"

Step 6: Test the running application
Now, test the end-to-end event delivery by publishing a sample event to the EventBridge PutEvents API. Create a file named event.json and paste the following code:

[
  {
    "Source": "ecommerce",
    "DetailType": "CreateOrder",
    "Detail": "{\"order_id\": \"ce4fe8b7-9911-4377-a795-29ecca9d7a3d\",\"create_date\": \"2020-06-02T13:01:00Z\",\"items\": [{\"product_id\": \"b8575571-5e91-4521-8a29-4af4a8aaa6f6\",\"quantity\": 1,\"price\": \"9.99\",\"currency\": \"CAD\"}],\"customer\": {\"customer_id\": \"5d22899e-3ff5-4ce0-a2a3-480cfce39a56\"},\"payment\": {\"payment_id\": \"fb563473-bef4-4965-ad78-e37c6c9c0c2a\"},\"delivery_address\": {\"street\": \"510 W Georgia St\",\"city\": \"Vancouver\",\"state\": \"BC\",\"country\": \"Canada\",\"zip_code\": \"V6B0M7\"}}"
  }
]

Publish this event with the following AWS CLI command.

aws events put-events --entries file://event.json

EventBridge delivers the event to API Gateway and the application persists it in DocumentDB.

Step 7: Cleanup
Delete all the resources created in this tutorial by running this CDK command:

cdk destroy "*"

Additional considerations
The demo application is simplified for the purpose of showcasing the EventBridge integration with API Gateway. In production, it’s recommended that you isolate the DocumentDB cluster in a private subnet. Additionally, the Application Load Balancer can be hidden from public access and connected to API Gateway through VPC Link.

Conclusion

This post demonstrates how to set up a sample application for consuming events directly from EventBridge into a custom application hosted in ECS. This integration uses EventBridge’s native support for API Gateway as a target, which allows you to integrate any HTTP-based web application.

Learn more from the EventBridge documentation.

Load testing a web application’s serverless backend

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/load-testing-a-web-applications-serverless-backend/

Many web applications experience high levels of traffic and spiky load patterns. The goal of load testing is to ensure that the architecture and system design works for the amount of traffic expected. It can help developers find bottlenecks and unexpected behavior before a system is deployed to production. This post uses the Ask Around Me application as an example to show how to test load in a serverless architecture.

In Ask Around Me, users ask and answer questions in their local geographic area. The expected hourly load is 1,000 new questions, 10,000 new answers, and 50,000 question lookup queries. I use these numbers as a baseline for the tests. This is the architecture of the Ask Around Me backend application:

Ask Around Me backend architecture

Focus areas for load testing

In serverless architectures using AWS services, you can perform a round-trip test from an API endpoint. You can also isolate areas in the design where you should test performance. API testing provides the best approximation of the performance that users experience, but it may not always be possible. You can also isolate microservices that consume from SQS queues or receive events from Amazon EventBridge, and test only those parts of the infrastructure.

While AWS services are built to withstand high levels of traffic, it’s important to consider the effect of Service Quotas on your application. Service Quotas are applied at the Region and account levels depending upon the service. You can view all your quotas in one place from the Service Quotas console. These are designed to protect you and other customers if your applications use more resources than planned. These quotas consist of hard and soft limits. For soft limits, you can request quota increases by opening a support ticket.
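If you script your load tests, you can also check the relevant quotas programmatically. A minimal sketch using the AWS SDK for JavaScript (note that results are paginated for services with many quotas):

const AWS = require('aws-sdk')
const quotas = new AWS.ServiceQuotas({ region: 'us-east-1' })

// List the Lambda quotas that apply in this Region and account
async function listLambdaQuotas () {
  const result = await quotas.listServiceQuotas({ ServiceCode: 'lambda' }).promise()
  result.Quotas.forEach(q => console.log(`${q.QuotaName}: ${q.Value}`))
}

listLambdaQuotas().catch(console.error)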

You must also consider downstream services. While serverless services like Lambda scale on your behalf, you may use downstream services that could be overwhelmed when traffic increases. Load testing can help identify these areas. You can implement mechanisms like queuing, caching, or pooling to protect those non-serverless parts of your infrastructure. If you are using Amazon RDS, for example, you might implement Amazon RDS Proxy to help pool and scale resources.

Finally, load testing can help identify custom code in Lambda functions that may not run efficiently as traffic scales up. Typically, these issues are caused by the code itself or the function configuration. For example, code may not process event batches efficiently, or a function may not be configured with the appropriate concurrency or memory settings. Frequently, these issues go unnoticed in development but surface in a load test.

Load testing tools

Load testing serverless infrastructure can be both inexpensive and systematic. There are several tools available for serverless developers to perform this task. One of the most popular is Artillery Community Edition, an open-source tool for testing serverless APIs. You configure the number of requests per second and the overall test duration, and it runs the test flows against your API endpoints.

The performance report measures the round-trip time from the client device, so it can be affected by your machine’s performance and network. One way to eliminate your local network’s impact on the results is to use AWS Cloud9 to run the tests remotely.

For Artillery, the maximum number of concurrent tests is constrained by your local computing resources and network. To achieve higher throughput, you can use Serverless Artillery, which runs the Artillery package on Lambda functions. As a result, this tool can scale up to a significantly higher number of tests.

The Ask Around Me application is deployed in my AWS account – see the application’s blog series to learn more about the deployment process. I use an AWS Cloud9 instance to run these API tests:

  1. Adding 1,000 questions per hour using the POST /questions API.
  2. Adding 10,000 answers per hour using the POST /answers API.
  3. Fetching 50,000 questions per hour based upon random geo-location using the GET /questions API.

You can find the test scripts and Artillery configurations in the testing directory of the application’s GitHub repo.

Artillery also enables you to specify custom functions to provide randomized data and custom query parameters, as required by your API. The loadTestFunction.js file contains a function to return randomized geo-point and rating data per test:

// Sets a bounding box around an area in Virginia, USA
const bounds = {
  latMax: 40.898677,
  latMin: 38.735083,
  lngMax: -77.109339,
  lngMin: -81.587841
}

const generateRandomData = (userContext, events, done) => {
  const randomLat = ((bounds.latMax-bounds.latMin) * Math.random()) + bounds.latMin
  const randomLng = ((bounds.lngMax-bounds.lngMin) * Math.random()) + bounds.lngMin

  const id = parseInt(Math.random()*1000000)+1  // returns 1-1000000
  const rating = parseInt(Math.random()*5)+1    // returns 1-5

  userContext.vars.lat = randomLat.toFixed(7)
  userContext.vars.lng = randomLng.toFixed(7)
  userContext.vars.id = id
  userContext.vars.rating = rating

  return done()
}

module.exports = { generateRandomData }

Test #1: Adding 1,000 questions per hour

The POST questions API has the following architecture:

POST questions architecture

The Artillery configuration file 1-test.yaml is set to create three requests per second over a 5-minute duration. This equates to 10,800 questions per hour, significantly higher than the estimated load for this function. The scenario specifies the JSON payload expected by the questions API:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 300
      arrivalRate: 3
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>'
scenarios:
  - flow:
    - function: "generateRandomData"
    - post:
        url: "/questions"
        json:
          question: "This is a load test question - #{{ id }}"
          type: "Star rating"
          position:
            latitude: {{ lat }}
            longitude: {{ lng }}
    - log: "Sent POST request to / with {{ lat }}, {{ lng }}"

You execute the Artillery test with the command artillery run ./1-test.yaml. My test concludes with the following results:

Artillery test results

Over the course of the test, the median response time is 114 ms. The p95 response time shows that 95% of all responses are served within 376 ms. The slowest response of 1401 ms is caused by cold starts, which occur when the Lambda service scales up the underlying function due to load.

As this process writes to a DynamoDB table, I can also see how many write capacity units (WCUs) are consumed by the test. From the DynamoDB console, select the table aamQuestions, then choose the Metrics tab. This shows the Write capacity metric:

CloudWatch DynamoDB metrics

Test #2: Adding 10,000 answers per hour

The POST answers API has the following architecture:

POST answers architecture

The Artillery configuration in 2-test.yaml creates 10 answers per second over a 5-minute duration. This equates to 36,000 answers per hour, much higher than the estimated load. The scenario defines the randomized rating used by the testing process:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 300
      arrivalRate: 10
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>'
scenarios:
  - flow:
    - function: "generateRandomData"
    - post:
        url: "/answers"
        json:
          type: "Star"
          rating: "{{ rating }}"
          question: 
            type: "Star"
            latitude: 39.08259127440097
            longitude: -77.46246339003038
            rangeKey: "testuser|1-1589380702281"
    - log: "Sent POST request to / with {{ rating }}"

The test results show a median response time of 111 ms with a p95 time of 218 ms. In the worst case, a request took 1102 ms to complete:

Artillery summary report

Checking the Metrics tab for the aaAnswers table, this test consumed just under 11 WCUs at peak:

CloudWatch DynamoDB metrics

Test #3: Fetching 50,000 questions per hour

The GET questions API invokes a Lambda function that uses the Geo Library for Amazon DynamoDB:

GET questions architecture

This process is read-intensive on the underlying DynamoDB table. The testing configuration simulates 20 queries per second over 2 minutes for random locations in a bounding box around Virginia, USA:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 120
      arrivalRate: 20
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>'
scenarios:
  - flow:
    - function: "generateRandomData"
    - get:
        url: "/questions"
        qs:
          lat: "{{ lat }}"
          lng: "{{ lng }}"
    - log: "Sent POST request to / with {{ lat }}, {{ lng }}"

This is a synchronous API so the performance directly impacts the user’s experience of the application. This test shows that the median response time is 165 ms with a p95 time of 201 ms:

Artillery performance results

This level of load equates to 72,000 queries per hour, almost 50% above the expected usage. The DynamoDB metrics show a peak consumption of 82 read capacity units:

CloudWatch monitoring details

Testing authenticated routes

These API routes are protected from public access and require authorization. This application uses HTTP APIs, which accept JWT tokens, and it uses Auth0 in the frontend application to generate these tokens. When you are load testing API Gateway routes with custom authorizers, you have a number of options.

At the early development stage, you may choose to remove the authentication to perform load tests. This simplifies the process but is not recommended beyond research and prototyping. If you turn off authentication for testing, there is a risk that it is not enabled again for production. This would leave your routes open to the public.

A better approach is to create a test user in your identity provider and use the JWT token for testing. Auth0 allows you to obtain a token manually, and use this in the Artillery configuration for the authorization header:

Embedding authorization token in test script

Since custom code frequently uses the decoded identity in processing, supplying a test token provides the closest simulation of actual usage. You must refresh this token in the test scripts periodically, and you can change scopes as needed.
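One way to automate the token refresh is to request a fresh token before each test run. This sketch assumes a machine-to-machine application configured in Auth0 — the tenant domain, credentials, and audience are placeholders:

const axios = require('axios')

// Request a client credentials token from Auth0 for the test user
async function getTestToken () {
  const response = await axios.post('https://your-tenant.auth0.com/oauth/token', {
    grant_type: 'client_credentials',
    client_id: process.env.AUTH0_CLIENT_ID,
    client_secret: process.env.AUTH0_CLIENT_SECRET,
    audience: 'https://ask-around-me-api'   // hypothetical API identifier
  })
  return response.data.access_token         // paste into the Artillery Authorization header
}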

The testing directory in the GitHub repo also includes a script for testing functions that consume from SQS queues. This allows you to test microservices further down in your infrastructure stack. This script injects messages into the SQS queue, simulating upstream processes.
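The script in the repo handles this for you, but as a sketch of the general approach, you can inject synthetic messages with the AWS SDK for JavaScript. The queue URL and message shape are assumptions for illustration:

const AWS = require('aws-sdk')
const sqs = new AWS.SQS({ region: 'us-east-1' })

// Send a batch of up to 10 synthetic messages to simulate upstream load
async function injectMessages () {
  const entries = Array.from({ length: 10 }, (_, i) => ({
    Id: `msg-${i}`,
    MessageBody: JSON.stringify({ question: `Load test question #${i}`, lat: 38.9, lng: -77.0 })
  }))

  await sqs.sendMessageBatch({
    QueueUrl: process.env.QUEUE_URL,   // the queue consumed by the downstream Lambda function
    Entries: entries
  }).promise()
}

injectMessages().catch(console.error)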

Conclusion

In this post, I discuss focus areas for load testing of serverless applications, and highlight two tools commonly used. I show how to configure Artillery with customized functions, and how to run tests to simulate load on the Ask Around Me application.

I cover some of the options for testing authenticated API Gateway routes and how you can use JWT tokens in your load testing configuration. You can also test microservices within a serverless architecture by injecting messages into SQS queues to simulate upstream load.

To learn more about the Ask Around Me serverless applications, read the blog series.

Using AWS ParallelCluster serverless API for AWS Batch

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-aws-parallelcluster-serverless-api-for-aws-batch/

This post is courtesy of Dario La Porta, Senior Consultant, HPC.

This blog is a continuation of a series of posts demonstrating how to create serverless architectures to support HPC workloads run with AWS ParallelCluster.

The first post, Using AWS ParallelCluster with a serverless API, explains how to create a serverless API for the AWS ParallelCluster command line interface. The second post, Amazon API Gateway for HPC job submission, shows how to submit jobs to a cluster that uses a Slurm job scheduler through a similar serverless API. In this post, I create a serverless API of the AWS Batch command line interface inside ParallelCluster. This uses AWS ParallelCluster, Amazon API Gateway, and AWS Lambda.

The integration of ParallelCluster with AWS Batch replaces the need for third-party batch processing solutions. It also natively integrates with the AWS Cloud.

Many use cases can benefit from this approach. The financial services industry can automate the resourcing and scheduling of the jobs to accelerate decision-making and reduce cost. Life sciences companies can discover new drugs in a more efficient way.

Submitting HPC workloads through a serverless API enables additional workflows. You can extend on-premises clusters to run specific jobs on AWS’ scalable infrastructure to leverage its elasticity and scale. For example, you can create event-driven workflows that run in response to new data being stored in an S3 bucket.

Using a serverless API as described in this post can improve security by removing the need to log in to EC2 instances to use the AWS Batch CLI in AWS ParallelCluster.

Together, this class of workflow can further improve the security of your infrastructure and data. It can also help optimize researchers’ time and efficiency.

In this post, I show how to create the AWS Batch cluster using AWS ParallelCluster. I then explain how to build the serverless API used for the interaction with the cluster. Finally, I explain how to use the API to query the resources of the cluster and submit jobs.

This diagram shows the different components of the solution.

Architecture diagram

AWS ParallelCluster configuration

AWS ParallelCluster is an open source cluster management tool to deploy and manage HPC clusters in the AWS Cloud.

The same procedure, described in the Using AWS ParallelCluster with a serverless API post, is used to create the AWS Batch cluster in the new template.yml and pcluster.conf files. The template.yml file contains the required policies for the Lambda function to build the AWS Batch cluster. Be sure to modify <AWS ACCOUNT ID> and <REGION> to match the values for your account.

The pcluster.conf file contains the AWS ParallelCluster configuration to build a cluster using AWS Batch as the job scheduler. The master_subnet_id is the ID of the public subnet created earlier, and the compute_subnet_id is the ID of the private one. More information about ParallelCluster configuration options and syntax is available in the ParallelCluster documentation.
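As a rough sketch, the relevant pcluster.conf entries look like the following. The section names follow ParallelCluster 2.x conventions and the subnet IDs are placeholders — refer to the file in the repo for the full configuration:

[cluster default]
scheduler = awsbatch
vpc_settings = default

[vpc default]
master_subnet_id = subnet-0123456789abcdef0
compute_subnet_id = subnet-0fedcba9876543210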

Deploy the API with AWS SAM

The code used for this example can be downloaded from this repo. Inside the repo:

  • The sam-app folder in the aws-sample repository contains the code required to build the AWS ParallelCluster serverless API for AWS Batch.
  • sam-app/template.yml contains the policy required for the Lambda function for the creation of the AWS Batch cluster. Be sure to modify <AWS ACCOUNT ID> and <REGION> to match the values for your account.

AWS Identity and Access Management Roles in AWS ParallelCluster contains the latest version of the policy. See the ParallelClusterInstancePolicy section related to the awsbatch scheduler.

To deploy the application, run the following commands:

cd sam-app
sam build
sam deploy --guided

From here, provide parameter values for the SAM deployment wizard for your preferred Region and AWS account. After the deployment, note the outputs:

Deployment output

SAM deploying:
SAM deployment output

The API Gateway endpoint URL is used to interact with the API. It has the following format:

https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch

Interact with the AWS Batch cluster using the deployed API

The deployed pclusterbatch API requires some parameters:

  • command – the pcluster Batch command to execute. A detailed list of available commands is in the AWS ParallelCluster CLI Commands for AWS Batch page.
  • cluster – the name of the cluster.
  • jobid – the jobid string.
  • compute_node – parameter used to retrieve the output of the specified compute node number in an MPI job.
  • --data-binary "$(base64 /path/to/script.sh)" – parameter used to pass the job script to the API.
  • -H "additional_parameters: <param1> <param2> <...>" – used to pass additional parameters.

The cluster’s queue can be listed with the following:

$ curl --request POST -H "additional_parameters: "  "https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch?command=awsbqueues&cluster=cluster1"

Job output
A cluster job can be submitted with the following command. The job_script.sh is an example script used for the job.

$ curl --request POST -H "additional_parameters: -jn hello" --data-binary "$(base64 /path/to/job_script.sh)" "https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch?command=awsbsub&cluster=cluster1"

Job output
This command is used to check the status of the job:

$ curl --request POST -H "additional_parameters: "  "https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch?command=awsbstat&cluster=cluster1&jobid=3d3e092d-ca12-4070-a53a-9a1ec5c98ca0"

Job output
The output of the job can be retrieved with the following:

$ curl --request POST -H "additional_parameters: "  "https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch?command=awsbout&cluster=cluster1&jobid=3d3e092d-ca12-4070-a53a-9a1ec5c98ca0"

Job output

The following command can be used to list the cluster’s hosts:

$ curl --request POST -H "additional_parameters: "  "https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch?command=awsbhosts&cluster=cluster1"

Job output
You can also use the API to submit MPI jobs to the AWS Batch cluster. The mpi_job_script.sh can be used for the following three-node MPI job:

curl --request POST -H "additional_parameters: -n 3" --data-binary "$(base64 mpi_script.sh)" "https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch?command=awsbsub&cluster=cluster1"

Job output
Retrieve the job output from the first node using the following:

$ curl --request POST -H "additional_parameters: "  "https://<ServerlessRestApi>.execute-api.eu-west-1.amazonaws.com/Prod/pclusterbatch?command=awsbout&cluster=cluster1&jobid=085b8e31-21cc-4f8e-8ab5-bdc1aff960d9&compute_node=0"

Job output

Teardown

You can destroy the resources by deleting the CloudFormation stacks created during installation. Deleting a Stack on the AWS CloudFormation Console explains the required steps.

Conclusion

In this post, I show how to integrate the AWS Batch CLI provided by AWS ParallelCluster with API Gateway. I explain the lifecycle of job submission with AWS Batch using this API. API Gateway and Lambda run a serverless implementation of the CLI, facilitating programmatic integration of AWS ParallelCluster with your on-premises or AWS Cloud applications.

You can also use this approach to integrate with the previous APIs developed in the Using AWS ParallelCluster with a serverless API and Amazon API Gateway for HPC job submission posts. By combining these different APIs, it is possible to create event-driven workflows for HPC. You can create scriptable workflows to extend on-premises infrastructure. You can also improve the security of HPC clusters by avoiding the need to use IAM roles and security groups that must otherwise be granted to individual users.

To learn more, read more about how to use AWS ParallelCluster and AWS Batch.

Managing backend requests and frontend notifications in serverless web apps

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/managing-backend-requests-and-frontend-notifications-in-serverless-web-apps/

Web and mobile applications usually interact with a backend service, often via an API. Many front-end applications pass requests for processing, wait for a result, and then display this to the user. This synchronous approach is only one way to handle messages, but modern applications have alternatives to provide a better user experience.

There are three common ways to make and manage requests from your frontend. This blog post explains the benefits and use-cases of each approach. This post references the Ask Around Me example application, which allows users to ask and answer questions in their local geographic area in real time. To learn more, refer to part 1 of the blog application series.

The synchronous model

The synchronous call is the most common API request pattern, where the caller makes a request to an API and then waits for the response:

Synchronous API example

This type of request is easy to implement and understand because it mirrors the function call-response pattern many developers are familiar with. The requestor is blocked until the call completes, so it is well suited to simple requests with short execution times. Use cases include retrieving the contents of a shopping cart, looking up a value in a database, or submitting an email address from a web form.

However, when your API interacts with other services or external workflows, the synchronous model can have limitations. In the following ecommerce example, any slow performance in the downstream services delays the entire roundtrip performance. Additionally, any outages in one of those services may result in an apparent failure of the entire service.

Tight coupling of services

For services with lengthy workflows, you may reach API Gateway’s 29-second integration timeout. In the following ride sharing example, the service responsible for finding available drivers may have a highly variable response time. This request may time out. It also provides a poor user experience, as there is no feedback to the user for a considerable period.

Lengthy request example

Synchronous requests also have other limitations. You cannot receive more than one response per request, nor can you subscribe to future changes in data. In this example, the API request can only inform callers about drivers at the end of this lengthy process.

The asynchronous model

Asynchronous tasks are common in serverless and distributed applications. They allow separate parts of an application to communicate without waiting for a synchronous response. Asynchronous workloads often use queues between services to help manage throughput and assist with retry logic.

With asynchronous tasks, the caller hands off the event and continues on to the next task after receiving the acknowledgment response. The caller does not wait for the entire task to complete. The downstream service works on this event while the caller continues servicing other requests. The ecommerce example, converted to an asynchronous flow, looks like this:

Asynchronous request example

In this example, a caller submits an order and receives a response from API Gateway almost immediately. With service integrations, API Gateway then stores the request directly in a durable store such as Amazon SQS or DynamoDB, before any processing has occurred. This results in a relatively consistent caller response time, regardless of downstream service processing time.

The downstream services fetch messages from the SQS queue or the DynamoDB table for processing. If there is a downstream outage, messages are persisted in the queue and may be retried later. From the user’s perspective, the request has been successfully submitted.

The Ask Around Me application handles the publishing of both questions and answers asynchronously. The API passes the user data to a Lambda function that stores the message in an SQS queue. SQS responds immediately to indicate that the message has been stored successfully, ending the API response. Another Lambda function then takes the messages from the SQS queue and processes these independently.

Ask Around Me save question example

Both synchronous and asynchronous requests are useful for different functions in web applications, so it can be helpful to compare their features and behaviors:

Synchronous requests | Asynchronous requests
The caller waits until the end of processing for a response. | The caller receives an acknowledgment quickly while processing continues.
Waiting may incur cost. | Minimizes the cost of waiting.
Downstream slowness or outages affect the overall request. | Queuing separates ingestion of the request from its processing.
Passes payloads between steps. | More often passes transaction identifiers.
Failure affects the entire request. | Failure only affects a segment of the request.
Easy to implement. | Moderate complexity in implementation.

Handling response values and state for asynchronous requests

With asynchronous processes, you cannot pass a return value back to the caller in the same way as you can for synchronous processes. Beyond the initial acknowledgment that the request has been received, there is no return path to provide further information. There are a couple of options available to web and mobile developers to track the state of inflight requests:

  • Polling: the initial request returns a tracking identifier. You create a second API endpoint for the frontend to check the status of the request, referencing the tracking ID. Use DynamoDB or another data store to track the state of the request.
  • WebSocket: this is a bidirectional connection between the frontend client and the backend service. It allows you to send additional information after the initial request is completed. Your backend services can continue to send data back to the client by using a WebSocket connection.

Polling is a simple mechanism to implement for many systems but can result in many empty calls. There is also a delay between data availability and the client being notified. WebSockets provide notifications that are closer to real time and reduce the number of messages between the client and backend system. However, implementing WebSockets is often more complex.
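To make the trade-off concrete, here is a minimal polling sketch in JavaScript. The status endpoint, response shape, and polling interval are assumptions for illustration:

// Poll a hypothetical status endpoint until the backend finishes processing
async function pollStatus (trackingId, intervalMs = 2000) {
  for (;;) {
    const res = await fetch(`https://api.example.com/status/${trackingId}`)
    const { state, result } = await res.json()
    if (state === 'COMPLETE') return result
    // Each incomplete response here is a wasted call - the cost of the polling model
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
}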

Using AWS IoT Core for real-time messaging

In both the synchronous and asynchronous models, it’s assumed that the caller makes a request and is only interested in the final result of that request. This doesn’t allow for partial information, such as the percentage of a task complete, or being notified continuously as data changes.

Modern web applications commonly use the publish-subscribe pattern to receive notifications as data changes. From receiving alerts when new email arrives to providing dashboard analytics, this method allows for much richer streams of events from backend systems.

In Ask Around Me, the application uses this pattern when listening for new questions from the user’s local area. The frontend subscribes to the geohash value of the user’s location via the AWS SDK. It then waits for messages published by the backend to this topic.

AWS IoT Core between frontend and backend

The SDK automatically manages the WebSocket connection and also handles many common connectivity issues in web apps. The messages are categorized using topics, which are strings defining channels of messages.

The AWS IoT Core service manages broadcasts between backend publishers and frontend subscribers. This enables fan-out functionality, which occurs when multiple subscribers are listening to the same topic. You can broadcast messages to thousands of frontend devices using this mechanism. For web application integration, this is preferable to using Amazon SNS for implementing publish-subscribe.

The IotData class in the AWS SDK returns a client that uses the MQTT protocol. Once the frontend application establishes the connection, it returns messages, errors, and the connection status via callbacks:

        mqttClient.on('connect', function () {
          console.log('mqttClient connected')
        })

        mqttClient.on('error', function (err) {
          console.log('mqttClient error: ', err)
        })

        mqttClient.on('message', function (topic, payload) {
          const msg = JSON.parse(payload.toString())
          console.log('IoT msg: ', topic, msg)
        })

For more details on how to implement MQTT WebSocket connectivity for your application, see the Ask Around Me sample application code.

Combining multiple approaches for your frontend application

Many frontend applications can combine these models depending on the request type. The Ask Around Me application uses multiple approaches in managing the state of user questions:

Combining multiple models in one application

  1. When the application starts, it retrieves an initial set of questions from the synchronous API endpoint. This returns the available list of questions up to this point in time.
  2. Simultaneously, the frontend subscribes to the geohash topic via AWS IoT Core. Any new questions for this geohash location are sent from the backend processing service to the frontend via this topic. This allows the frontend to receive new questions without subsequent API calls.
  3. When a new question is posted, it is saved to the relevant SQS queue and acknowledged. The question is processed asynchronously by a backend process, which sends updates to the topic.

There are several benefits to combining synchronous, asynchronous, and real-time messaging approaches like this. Most importantly, the user experience remains consistent. The user receives immediate feedback to posting new questions and answers, while longer-running processes are managed asynchronously.

When new information becomes available, the frontend is notified in near-real time. This happens without needing to poll an API endpoint or have the user refresh the user interface. This also reduces the number of unnecessary API calls on the backend service, reducing the cost of running this application. Finally, this uses scalable managed services so the frontend application can support large numbers of users without impacting performance.

Conclusion

Web applications commonly use synchronous APIs when communicating with backend services. For longer-running processes, asynchronous workflows can offer an improved user experience and help manage scaling. By using durable message stores like SQS or DynamoDB, you can separate the request ingestion and response from the request processing.

In this post, I show how modern web applications use real-time messaging via WebSockets to improve the user experience. This provides a transport mechanism for pushing state updates from the backend to the frontend client. The AWS IoT Core service can fan out messages using topics, broadcasting messages to large numbers of frontend subscribers.

To see these three methods in an example frontend application, read more about the Ask Around Me example application.

Implementing geohashing at scale in serverless web applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/implementing-geohashing-at-scale-in-serverless-web-applications/

Many web and mobile applications use geospatial data, often used with map overlays. This results in dataset queries based upon proximity, for questions such as “How far is the nearest business?” or “How many users are nearby?” Applications with significant traffic need an efficient way to handle geolocation queries. This blog post explores a simple geohashing solution for serverless applications, and how this can work at scale.

Geohashing is a popular public domain geocode system that converts geographic information into an alphanumeric hash. A geohash is used to identify a rectangular area around a fixed point. The length of the hash determines the precision of the area identified. This allows you to use a hierarchical search where the length of the geohash corresponds to the size of a search area.

This blog post references the Ask Around Me example application, which helps users ask and answer questions in their local geographic area. This post explores a solution suited for the expected volumes in this web application. See part 1 of the blog application series to learn more.

Choosing geohashing precision for your application

One of the most important considerations in using geohashing is understanding the expected distribution of the locations and the size of the search area. Selecting the correct geohash length is important for the efficiency of the search. There is a balance between the number of cells searched and the number of items within each cell.

A geohash corresponds to a grid cell but a typical search corresponds to multiple overlapping cells. In the following diagram, the search radius overlaps nine distinct geohash cells. The query returns all location pins within the cells but only the green pins are relevant to the radial search. The gray pins are outside of the overlapping geohash cells so are immediately discarded in the query:

Discarded pins from a query

If the geohash is too precise, meaning the cell is small, the search radius may include many more cells:

Search radius with too many cells

If the geohash is too coarse, meaning the cell is too large, the query may return too many results for a single cell for the search to be efficient. You may also find a large number of results in those cells are not within the search radius:

Too many results in a single cell

The geohashing algorithm creates a hash that makes it easy to find the neighboring eight cells of any target cell. For optimal performance, you should ideally select a hashing resolution where most queries are resolved by searching only the target cell and its immediate neighbors.

In the canonical implementation of geohash, there are some areas on the globe where physically neighboring cells are not logically close. This can cause errors in these edge cases where there are “seams”. The S2 geometry library solves this problem by using a spherical reference, meaning you can use this approach anywhere in the world. The library has been ported to TypeScript and is available as an npm package called nodes2ts.

Using Amazon DynamoDB for geohashing queries

Amazon DynamoDB is a serverless NoSQL database that offers single-digit millisecond performance at any scale. For an application with moderate load, you can set read and write capacity to allocate a dedicated amount of throughput, or you can set the provisioning to on-demand. DynamoDB is well suited to key-based queries needing fast, consistent performance.

For web application developers using Node.js or JavaScript, there is an npm package called dynamodb-geo that ports the Java Geo Library for DynamoDB. Both packages are based on the S2 geometry library. This provides a simple interface to use DynamoDB for geospatial data. The library is a wrapper for DynamoDB and maintains an underlying table. After configuring the table and the library, you interact with the data using the library’s API.
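Setting up the library involves wrapping a DynamoDB client in a configuration object. A minimal sketch, where the table name and hash key length are example assumptions:

const AWS = require('aws-sdk')
const ddbGeo = require('dynamodb-geo')

const ddb = new AWS.DynamoDB({ region: 'us-east-1' })
const config = new ddbGeo.GeoDataManagerConfiguration(ddb, 'Questions') // hypothetical table name
config.hashKeyLength = 5 // controls the geohash precision of the hash key

const myGeoTableManager = new ddbGeo.GeoDataManager(config)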

For example, to add a new geospatial item:

    // Add a new location to the database
const result = await myGeoTableManager.putPoint({
        RangeKeyValue: { S: 'location-id-1234' }, // unique ID
        GeoPoint: { 
            latitude: 40.6892534,
            longitude: -74.0466891
        },
        PutItemInput: { 
            Item: { 
                country: { S: 'USA' },
                state: { S: 'New York' },
                pointOfInterest: { S: 'Statue of Liberty' }
            }
        }
    }).promise()

The library also provides methods for updating and deleting data points, and supports DynamoDB batch operations. Querying the data via the API can only be done via geo-point requests. You can retrieve a dataset of items based around a rectangular or radial area:

    // Querying 10km (~6.2 miles) around Boston, Massachusetts.

    const result = await myGeoTableManager.queryRadius({
        RadiusInMeter: 10000,
        CenterPoint: {
            latitude: 42.3145186,
            longitude: -71.1103666
        }
    }).promise()

    // Outputs results as an array of DynamoDB.AttributeMaps
    console.log(result)

Moving locations and moving consumers

In Ask Around Me, questions have a fixed latitude and longitude – once a question is asked, the location never changes. The dynamodb-geo library uses the hashKey as a primary key in the underlying DynamoDB table. To update the primary key of a DynamoDB item, you must delete and recreate the entire item.

Many geolocation implementations use static data, such as a list of retail store locations. But if you need to track moving locations, such as the location of mobile users, this is not the best approach. As a result, this library works well for static data (or data where the location rarely changes) but is not suitable where locations change frequently.

In the Ask Around Me app, users receive alerts when new questions appear around their current location. Real-time messaging is implemented using AWS IoT Core, where publish-subscribe topics connect the frontend and backend.

The application retrieves the current latitude and longitude via the browser and uses the S2 geometry library to convert this to a geohash key. It then subscribes to this geohash topic. On the backend, when new questions are saved, these are published to a topic using the geohash key. As a result, users in the same geohash area receive notifications when new questions are asked nearby.

This broadcast pattern allows the application to fan out a single new question in the DynamoDB table to thousands of live users in the frontend application. As users move, the frontend application can detect if they have moved from one geohash cell to another. If so, it unsubscribes from the outdated geohash identifier and subscribes to the new one. This ensures that moving consumers are always listening for questions near their current location. For more information on the broadcast pattern, see the AWS whitepaper, Designing MQTT Topics for AWS IoT Core.
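As a sketch, the frontend logic for switching subscriptions looks like the following. The topic naming convention is an assumption for illustration:

// When the user's location resolves to a different geohash, switch topics
function onLocationChange (mqttClient, currentGeohash, newGeohash) {
  if (newGeohash !== currentGeohash) {
    mqttClient.unsubscribe(`questions/${currentGeohash}`)  // stop listening to the old cell
    mqttClient.subscribe(`questions/${newGeohash}`)        // listen to the new cell
  }
  return newGeohash
}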

Sorting, aging, and expiring location data

For an application with many writes to the underlying geolocation table, the user may expect to see newer data first. You also need a strategy for removing stale data from the table. Even with a highly performant database like DynamoDB, you must ensure the first page of results returns the most relevant items.

The dynamodb-geo library uses the geohash identifier as a primary key, but allows the developer to choose an identifying range key. You need this key to uniquely identify items that have the same geohash. However, you can also use this key for sorting and paging data returned by the library’s search APIs.

The Ask Around Me app uses a concatenated userId-timestamp pattern as a range key, for example jbeswick-1589202456. This helps in implementing two data access patterns (see the query sketch after this list):

  • Finding by user: using the begins_with operator, you can identify questions asked by a specific user. For example, begins_with(‘jbeswick’) returns all the questions for this user.
  • Sorting results by time added: as query results are always sorted by the sort key value, the Unix timestamp ensures that these are returned from oldest to newest. You can reverse this order to return newest first by setting ScanIndexForward to false in the DynamoDB query.
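Both patterns translate to a standard DynamoDB query on the underlying table. A sketch, where the key names are dynamodb-geo defaults and the hash key value stands in for a computed geohash:

const docClient = new AWS.DynamoDB.DocumentClient()

// Find one user's questions within a geohash cell, newest first
const result = await docClient.query({
  TableName: 'Questions',
  KeyConditionExpression: 'hashKey = :h AND begins_with(rangeKey, :user)',
  ExpressionAttributeValues: {
    ':h': 12345,          // placeholder for the computed geohash key
    ':user': 'jbeswick'
  },
  ScanIndexForward: false // reverse the sort order: newest first
}).promise()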

In this application, you might decide that questions more than a year old should be expired from the table. You can have DynamoDB expire items automatically using the time to live (TTL) feature.

To use this, create a custom numeric attribute in the table that contains the expiration time of the item, set as a Unix timestamp. For example, an item expiring at midnight on January 1, 2023 uses the timestamp 1672531200.

When you enable TTL on a DynamoDB table, you specify which attribute contains the expiration value. A background job checks the TTL values against the current time. For any items found where the TTL timestamp is older than the current time, it expires those items within a 48-hour time window.
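Enabling the feature is a one-time table update. A sketch using the AWS SDK for JavaScript, where the attribute name is an assumption:

const dynamodb = new AWS.DynamoDB({ region: 'us-east-1' })

// Tell DynamoDB which numeric attribute holds the expiration timestamp
await dynamodb.updateTimeToLive({
  TableName: 'Questions',
  TimeToLiveSpecification: {
    Enabled: true,
    AttributeName: 'expiresAt' // hypothetical attribute containing a Unix timestamp
  }
}).promise()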

Conclusion

This blog post explores how you can solve geolocation queries using geohashing. I discuss how you should decide on the resolution of a geohash for your specific workload.

I explain why DynamoDB is a good fit for many geohashing applications. I cover how you can use the dynamodb-geo library to easily implement location queries in your web applications. Using AWS IoT Core, you can also use MQTT topics to fan out updates to moving subscribers.

Finally, I show how to use the DynamoDB table’s range key to help with sorting data by age and supporting additional access patterns. For applications with many writes, you can also automatically expire items using DynamoDB’s TTL feature.

To learn more about how the Ask Around Me application implements geolocation queries, see the blog series.

Using Amazon EFS for AWS Lambda in your serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-amazon-efs-for-aws-lambda-in-your-serverless-applications/

Serverless applications are event-driven, using ephemeral compute functions to integrate services and transform data. While AWS Lambda includes a 512-MB temporary file system for your code, this is an ephemeral scratch resource not intended for durable storage.

Amazon EFS is a fully managed, elastic, shared file system designed to be consumed by other AWS services, such as Lambda. With the release of Amazon EFS for Lambda, you can now easily share data across function invocations. You can also read large reference data files, and write function output to a persistent and shared store. There is no additional charge for using file systems from your Lambda function within the same VPC.

EFS for Lambda makes it simpler to use a serverless architecture to implement many common workloads. It opens new capabilities, such as building and importing large code libraries directly into your Lambda functions. Since the code is loaded dynamically, you can also ensure that the latest version of these libraries is always used by every new execution environment. For appending to existing files, EFS is also a better option than Amazon S3, since S3 objects cannot be appended to in place.

This blog post shows how to enable EFS for Lambda in your AWS account, and walks through some common use-cases.

Capabilities and behaviors of Lambda with EFS

EFS is built to scale on demand to petabytes of data, growing and shrinking automatically as files are written and deleted. When used with Lambda, your code has low-latency access to a file system where data is persisted after the function terminates.

EFS is a highly reliable NFS-based regional service, with all data stored durably across multiple Availability Zones. It is cost-optimized, with no provisioning requirements and no purchase commitments. It uses built-in lifecycle management to optimize between an SSD-performance class and an infrequent access class that offers 92% lower cost.

EFS offers two performance modes – General Purpose and Max I/O. General Purpose is suitable for most Lambda workloads, providing lower operational latency and higher performance for individual files.

You also choose between two throughput modes – bursting and provisioned. The bursting mode uses a credit system to determine when a file system can burst. With bursting, your throughput is calculated based upon the amount of data you are storing. Provisioned throughput is useful when you need more throughput than provided by the bursting mode. Total throughput available is divided across the number of concurrent Lambda invocations.

The Lambda service mounts EFS file systems when the execution environment is prepared. This adds minimal latency when the function is invoked for the first time, often within hundreds of milliseconds. When the execution environment is already warm from previous invocations, the EFS mount is already available.

EFS can be used with Provisioned Concurrency for Lambda. When the reserved capacity is prepared, the Lambda service also configures and mounts the EFS file system. Since Provisioned Concurrency runs any initialization code ahead of invocations, any libraries or packages consumed from EFS are loaded at that point. In this use-case, it’s recommended to use provisioned throughput when configuring EFS.

The EFS file system is shared across Lambda functions as it scales up the number of concurrent executions. As files are written by one instance of a Lambda function, all other instances can access and modify this data, depending upon the access point permissions. The EFS file system scales with your Lambda functions, supporting up to 25,000 concurrent connections.

Creating an EFS file system

Configuring EFS for Lambda is straightforward. I show how to do this in the AWS Management Console, but you can also use the AWS CLI, AWS SDK, AWS Serverless Application Model (AWS SAM), or AWS CloudFormation. EFS file systems are always created within a customer VPC, so Lambda functions using the EFS file system must all reside in the same VPC.

To create an EFS file system:

  1. Navigate to the EFS console.
  2. Choose Create File System.
    EFS: Create File System
  3. On the Configure network access page, select your preferred VPC. Only resources within this VPC can access this EFS file system. Accept the default mount targets, and choose Next Step.
  4. On Configure file system settings, you can choose to enable encryption of data at rest. Review this setting, then accept the other defaults and choose Next Step. This uses bursting mode instead of provisioned throughput.
  5. On the Configure client access page, choose Add access point.
    EFS: Add access point
  6. Enter the following parameters. This configuration creates a file system with open read/write permissions – read more about settings to secure your access points. Choose Next Step.
    EFS: Access points
  7. On the Review and create page, check your settings and choose Create File System.
  8. In the EFS console, you see the new file system and its configuration. Wait until the Mount target state changes to Available before proceeding to the next steps.

Alternatively, you can use CloudFormation to create the EFS access point. With the AWS::EFS::AccessPoint resource, the preceding configuration is defined as follows:

  AccessPointResource:
    Type: 'AWS::EFS::AccessPoint'
    Properties:
      FileSystemId: !Ref FileSystemResource
      PosixUser:
        Uid: "1000"
        Gid: "1000"
      RootDirectory:
        CreationInfo:
          OwnerGid: "1000"
          OwnerUid: "1000"
          Permissions: "0777"
        Path: "/lambda"

For more information, see the example setup template in the code repository.

Working with AWS Cloud9 and Amazon EC2

You can mount EFS access points on Amazon EC2 instances. This can be useful for browsing file system contents and downloading files from other locations. The EFS console shows customized mount instructions directly under each created file system:

EFS customized mount instructions

The instance must have access to the same security group and reside in the same VPC as the EFS file system. After connecting via SSH to the EC2 instance, you mount the EFS mount target to a directory. You can also mount EFS in AWS Cloud9 instances using the terminal window.

Any files you write into the EFS file system are available to any Lambda functions using the same EFS file system. Similarly, any files written by Lambda functions are available to the EC2 instance.
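
As a simple illustration, a Lambda function with an EFS mount can work with these shared files using standard Node.js file system calls. The /mnt/efs mount path here is an assumption based on the earlier configuration:

const fs = require('fs').promises

// Assumed local mount path configured on the function
const efsPath = '/mnt/efs'

exports.handler = async () => {
  // Lists any files written by EC2, AWS Cloud9, or other Lambda functions
  const files = await fs.readdir(efsPath)
  console.log(files)
  return files
}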

Sharing large code packages with Lambda

EFS is useful for sharing software packages or binaries that are otherwise too large for Lambda layers. You can copy these to EFS and have Lambda use these packages as if they were installed in the Lambda deployment package.

For example, on EFS you can install Puppeteer, which runs a headless Chromium browser, using the following script run on an EC2 instance or AWS Cloud9 terminal:

  mkdir node && cd node
  npm init -y
  npm i puppeteer --save

Building packages in EC2 for EFS

You can then use this package from a Lambda function connected to this folder in the EFS file system. You include the Puppeteer package with the mount path in the require declaration:

const puppeteer = require('/mnt/efs/node/node_modules/puppeteer')

In Node.js, to avoid changing declarations manually, you can add the EFS mount path to the Node.js module search path by using app-module-path. Lambda functions support a range of other runtimes, including Python, Java, and Go. Many other runtimes offer similar ways to add the EFS path to the list of default package locations.
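
For example, a short sketch using app-module-path, assuming the package itself is available to the function and reusing the mount path from the earlier example:

// Add the EFS packages directory to the Node.js module search path first
require('app-module-path').addPath('/mnt/efs/node/node_modules')

// Subsequent require calls resolve from EFS without absolute paths
const puppeteer = require('puppeteer')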

There is an important difference between using packages in EFS compared with Lambda layers. When you use Lambda layers to include packages, these are downloaded to an immutable code package. Any changes to the underlying layer do not affect existing functions published using that layer.

Since EFS is a dynamic binding, any changes or upgrades to packages are available immediately to the Lambda function when the execution environment is prepared. This means you can output a build process to an EFS mount, and immediately consume any new versions of the build from a Lambda function.

Configuring AWS Lambda to use EFS

Lambda functions that access EFS must run from within a VPC. Read this guide to learn more about setting up Lambda functions to access resources from a VPC. There are also sample CloudFormation templates you can use to configure private and public VPC access.

The execution role for the Lambda function must provide access to the VPC and EFS. For development and testing purposes, this post uses the AWSLambdaVPCAccessExecutionRole and AmazonElasticFileSystemClientFullAccess managed policies in IAM. For production systems, you should use more restrictive policies to control access to EFS resources.

Once your Lambda function is configured to use a VPC, next configure EFS in Lambda:

  1. Navigate to the Lambda console and select your function from the list.
  2. Scroll down to the File system panel, and choose Add file system.
    EFS: Add file system
  3. In the File system configuration:
  • From the EFS file system dropdown, select the required file system. From the Access point dropdown, choose the required EFS access point.
  • In the Local mount path, enter the path your Lambda function uses to access this resource. Enter an absolute path.
  • Choose Save.
    EFS: Add file system

The File system panel now shows the configuration of the EFS mount, and the function is ready to use EFS. Alternatively, you can use an AWS Serverless Application Model (SAM) template to add the EFS configuration to a function resource:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  MyLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      ...
      FileSystemConfigs:
        - Arn: arn:aws:elasticfilesystem:us-east-1:xxxxxx:accesspoint/fsap-123abcdef12abcdef
          LocalMountPath: /mnt/efs

To learn more, see the SAM documentation on this feature.

Example applications

You can view and download these examples from this GitHub repository. To deploy, follow the instructions in the repo’s README.md file.

1. Processing large video files

The first example uses EFS to process a 60-minute MP4 video and create screenshots for each second of the recording. This uses the FFmpeg Linux package to process the video. After copying the MP4 to the EFS file location, invoke the Lambda function to create a series of JPG frames. This uses the following code to execute FFmpeg and pass the EFS mount path and input file parameters:

const inputFile = process.env.INPUT_FILE
const efsPath = process.env.EFS_PATH

const { exec } = require('child_process')

const execPromise = async (command) => {
  console.log(command)
  return new Promise((resolve, reject) => {
    const ls = exec(command, function (error, stdout, stderr) {
      if (error) {
        console.log('Error: ', error)
        reject(error)
      }
      console.log('stdout: ', stdout)
      console.log('stderr: ', stderr)
    })

    ls.on('exit', function (code) {
      console.log('Finished: ', code)
      resolve()
    })
  })
}

// The Lambda handler
exports.handler = async function (eventObject, context) {
  await execPromise(`/opt/bin/ffmpeg -loglevel error -i ${efsPath}/${inputFile} -s 240x135 -vf fps=1 ${efsPath}/%d.jpg`)
}

In this example, the process writes more than 2000 individual JPG files back to the EFS file system during a single invocation:

Console output from sample application

2. Archiving large numbers of files

Using the output from the first application, the second example creates a single archive file from the JPG files. The code uses the Node.js archiver package for processing:

const outputFile = process.env.OUTPUT_FILE
const efsPath = process.env.EFS_PATH

const fs = require('fs')
const archiver = require('archiver')

// The Lambda handler
exports.handler = async function (event) {

  const output = fs.createWriteStream(`${efsPath}/${outputFile}`)
  const archive = archiver('zip', {
    zlib: { level: 9 } // Sets the compression level.
  })

  // Resolve when the output stream closes, so the function
  // does not return before the archive is fully written
  const done = new Promise((resolve, reject) => {
    output.on('close', function () {
      console.log(archive.pointer() + ' total bytes')
      resolve()
    })
    archive.on('error', reject)
  })

  archive.pipe(output)

  // Append files from a glob pattern
  archive.glob(`${efsPath}/*.jpg`)
  archive.finalize()

  return done
}

After executing this Lambda function, the resulting ZIP file is written back to the EFS file system:

Console output from second sample application.

3. Unzipping archives with a large number of files

The last example shows how to unzip an archive containing many files. This uses the Node.js unzipper package for processing:

const inputFile = process.env.INPUT_FILE
const efsPath = process.env.EFS_PATH
const destinationDir = process.env.DESTINATION_DIR

const fs = require('fs')
const unzipper = require('unzipper')

// The Lambda handler
exports.handler = async function (event) {

  // Wait for the 'close' event, which signals extraction has finished
  await new Promise((resolve, reject) => {
    fs.createReadStream(`${efsPath}/${inputFile}`)
      .pipe(unzipper.Extract({ path: `${efsPath}/${destinationDir}` }))
      .on('close', resolve)
      .on('error', reject)
  })
}

Once this Lambda function is executed, the archive is unzipped into a destination directory in the EFS file system. This example shows the screenshots unzipped into the frames subdirectory:

Console output from third sample application.

Conclusion

EFS for Lambda allows you to share data across function invocations, read large reference data files, and write function output to a persistent and shared store. After configuring EFS, you provide the Lambda function with an access point ARN, allowing you to read and write to this file system. Lambda securely connects the function instances to the EFS mount targets in the same Availability Zone and subnet.

EFS opens a range of potential new use-cases for Lambda. In this post, I show how this enables you to access large code packages and binaries, and process large numbers of files. You can interact with the file system via EC2 or AWS Cloud9 and pass information to and from your Lambda functions.

EFS for Lambda is supported at launch in APN Partner solutions, including Epsagon, Lumigo, Datadog, HashiCorp Terraform, and Pulumi. To learn more about how to use EFS for Lambda, see the AWS News Blog post and read the documentation.

Upgrading to Amazon EventBridge from Amazon CloudWatch Events

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/upgrading-to-amazon-eventbridge-from-amazon-cloudwatch-events/

Amazon EventBridge was announced at the AWS New York Summit in 2019. It’s a serverless event bus service that uses the Amazon CloudWatch Events API, but also includes more functionality, like the ability to ingest events from SaaS apps. Event-based architectures can make it easier to decouple your application services and make your systems more extensible.

Events are emitted from services throughout AWS, and you can create custom events from your own applications. The popularity of event-based compute is why CloudWatch Events grew to process trillions of events every month. See this video for an overview of the features of the EventBridge service.

EventBridge is designed to extend the event model beyond AWS, bringing data from software as a service (SaaS) providers into your AWS environment. This means you can consume events from popular providers such as Zendesk, PagerDuty, and Auth0. You can use these in your applications with the same ease as any AWS-generated event.

This blog post explains the differences between CloudWatch Events and EventBridge, and the benefits of upgrading. I also provide additional resources to help you get started with the newer EventBridge service.

Integrated SaaS providers in EventBridge

Fetching data from third-party providers has typically relied on processes such as polling, or building custom webhooks where these are supported. Polling is compute-intensive and often wasteful, since many requests return no new data. Additionally, there is a lag between new data becoming available and your system receiving the information.

While webhooks offer improvements over polling and may approximate a real-time connection, these requests travel over the public internet. This means you must secure the webhook endpoint, and use a security mechanism between the two services. You must also scale up if the SaaS provider sends large numbers of messages.

EventBridge enables SaaS providers to publish data on an event bus within their AWS environment. These events are then routed to your AWS account, and appear on your partner event bus. All of this happens on the private AWS network, away from the public internet. The service manages scaling, security, and integration for you.

Configuring EventBridge in your SaaS provider account is easy. To learn how to set up the integration, see these videos for a full walkthrough.

A full list of EventBridge SaaS integrations is available on the EventBridge website. Additionally, if you want to integrate your own SaaS software with EventBridge, read more in the onboarding guide.

Custom event buses and enhanced rules

CloudWatch Events provides a default event bus that exists in every AWS account. All AWS events are routed via the default bus. You can also choose to publish your custom events to the default bus.

EventBridge introduces custom event buses you can use exclusively for your own workloads. These can be useful for restricting event access to a limited set of AWS accounts or custom applications. Custom event buses are free to set up. Watch this video to see how to set up a custom event bus.
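
As a hedged sketch, publishing a custom event to a custom bus uses the PutEvents API. The bus name, source, and detail fields here are illustrative:

const AWS = require('aws-sdk')
const eventbridge = new AWS.EventBridge()

const publishEvent = async () => {
  await eventbridge.putEvents({
    Entries: [{
      EventBusName: 'my-applications-bus', // illustrative custom bus name
      Source: 'myapp.orders',
      DetailType: 'OrderCreated',
      Detail: JSON.stringify({ orderId: '1234', price: 250 })
    }]
  }).promise()
}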

You can also use powerful advanced rules for routing events. With content-based filtering, you can use comparison operators to identify and filter values in events. This allows you to use EventBridge to handle more processing on behalf of your application, reducing the load on downstream services. See this blog post to learn more about using content-filtering to build advanced rules.
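
For example, here is a sketch of a rule using a numeric comparison operator to match only high-value events. The event source, field name, and threshold are illustrative:

const AWS = require('aws-sdk')
const eventbridge = new AWS.EventBridge()

const createRule = async () => {
  await eventbridge.putRule({
    Name: 'HighValueOrders',
    EventBusName: 'my-applications-bus', // illustrative custom bus name
    EventPattern: JSON.stringify({
      source: ['myapp.orders'],
      detail: {
        // Content-based filtering: match only orders over 100
        price: [{ numeric: ['>', 100] }]
      }
    })
  }).promise()
}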

Schema registry

One of the traditional challenges of working with event-based architectures is managing event structures. With different applications, services, and microservices publishing events, it can be hard to standardize event formats. These formats, or schemas, may also change when developers introduce new versions of their services.

The EventBridge Schema Registry allows you to automate the discovery and creation of schemas. You can find schemas for AWS services, integrated SaaS providers, and your own custom events. EventBridge infers the schemas and stores these in a registry. You can then download custom code bindings for popular type-safe programming languages. This accelerates the development process, making it easy to construct objects based on events.

Watch this video to see how to use the EventBridge Schema Registry, and get started with your own applications.

Migration and compatibility

Comparing the EventBridge console and CloudWatch Events console, EventBridge has a new design that makes it easier to build and manage rules and event buses. If you are new to using events within AWS, it’s recommended that you start with the EventBridge console.

The EventBridge service uses the CloudWatch Events API and it is fully backward compatible. Any rules you have configured in CloudWatch Events continue to work in EventBridge. You can access the default event bus and any configurations you created in CloudWatch Events from EventBridge immediately. If you’re a current CloudWatch Events customer, you can upgrade to EventBridge by simply opening the EventBridge console.

AWS continues to build new functionality to enhance the capabilities of event-based architectures. These features are only released via EventBridge, so it’s recommended that you upgrade to ensure you can take advantage of these new capabilities.

Additional EventBridge resources for serverless developers

Event-based architectures can help serverless developers create dynamic, decoupled applications. Additional resources with sample code repos are available to help you get started.

Conclusion

EventBridge is the evolution of the CloudWatch Events service. It brings new features, including the ability to integrate data from popular SaaS providers as events within AWS.

In this post, I discuss the new ability to create custom event buses, and how you can develop advanced rules for sophisticated event routing. I also discuss the new EventBridge Schema Registry, which automates event schema discovery and lets you download code bindings directly into your IDE.

New event-based features are now released in EventBridge. By migrating from CloudWatch Events, you can take advantage of new capabilities as they are released.

To learn more about using EventBridge for your AWS workloads, visit the EventBridge Learning Path, which includes a range of learning resources.

Visualizing Amazon API Gateway usage plans using Amazon QuickSight

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/visualizing-amazon-api-gateway-usage-plans-using-amazon-quicksight/

This post is courtesy of Roberto Iturralde, Solutions Architect.

Many customers build applications for their users accessible via HTTP API endpoints. Users provide unique keys in their requests for authentication, authorization, and optional metering by the service provider. Business and technical owners benefit from detailed analytics across the API endpoints and usage patterns across customers. This information helps understand product adoption and informs future features.

Amazon API Gateway can produce detailed access logs to show who has accessed the API. When using usage plans, a customer identifier is included in the log records. You can use these logs to populate a business intelligence service, such as Amazon QuickSight, to analyze and report on usage patterns across your APIs and customers.

Solution overview

QuickSight dashboard

Using enriched API Gateway access logs, you can analyze how customers are accessing your API products. This dashboard shows several visualizations in Amazon QuickSight based on traffic to a sample API Gateway endpoint.

  • The pie chart shows the share of month-to-date traffic across all APIs by usage plan.
  • The bar chart shows the top customers in the Enterprise usage plan by month-to-date traffic, with bar coloring by HTTP status code.
  • The pivot table shows the percent of traffic to each API endpoint by usage plan and customer.

The solution described in this post is meant for business intelligence (BI) analysis. A BI dashboard is useful for historical reporting and typically the data freshness ranges from hours to days.

Solution architecture

Solution architecture

Components:

  • API access logs stream – API access logs are streamed in real time from Amazon API Gateway to Amazon Kinesis Data Firehose. Kinesis Firehose buffers the records and enriches them with information from the API usage plans. It then writes the batches of enriched records to an Amazon S3 bucket for durable, secure storage.
  • Access logs indexing – Metadata about the API access logs is stored in an AWS Glue Data Catalog that is used by Amazon QuickSight for querying. A nightly AWS Glue crawler detects and indexes newly written access logs. The Glue crawler can run more frequently for fresher data in QuickSight.
  • Data visualization – Amazon QuickSight is configured with the S3 location of the access logs as a data source to feed a QuickSight analysis.

Implementation walk-through

This tutorial assumes you already have an API Gateway API with a usage plan configured. If you do not, follow this tutorial to create an API and follow this article to create a usage plan.

First, deploy an AWS SAM template into your account. This template creates an Amazon S3 bucket where the access logs are stored for analysis. It also creates an AWS Lambda function to enrich the API access logs.

Then you create a Kinesis Data Firehose delivery stream to receive access logs from API Gateway. The stream enriches the records using the Lambda function, buffers and batches the records, and writes them to the S3 bucket. Finally, you update a deployed API Gateway stage to write access logs to the Kinesis delivery stream.

Launch the AWS SAM Template

To create some of the resources referenced in this post, you can download the SAM template or choose the button below to launch the stack.

Launch Stack button

Choose Next on each screen of the CloudFormation stack creation process. Once the stack creation completes, note the names of the resources on the Outputs tab.

Stack outputs tab

The Lambda function created by the SAM template performs a few key tasks. During function initialization, it fetches API Gateway usage plan details into memory. On each invocation, it iterates through each access log record from Kinesis Firehose. Each record is decoded from base64 encoded binary and enriched with usage plan name and customer name. Each record is then converted back to base64 encoded binary to return to the Kinesis Firehose stream.
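
A hedged sketch of that transformation pattern follows. The enrichment lookup is simplified here with illustrative values; the real function resolves names from the usage plans fetched at initialization:

// Kinesis Data Firehose transformation handler (simplified sketch)
exports.handler = async (event) => {
  const records = event.records.map((record) => {
    // Decode the base64-encoded access log record
    const entry = JSON.parse(Buffer.from(record.data, 'base64').toString('utf8'))

    // Enrich with usage plan and customer names (illustrative values)
    const enriched = { ...entry, usagePlanName: 'Enterprise', customerName: 'Example Corp' }

    // Re-encode the enriched record for Firehose
    return {
      recordId: record.recordId,
      result: 'Ok',
      data: Buffer.from(JSON.stringify(enriched) + '\n').toString('base64')
    }
  })

  return { records }
}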

API access logs stream

  1. Navigate to the Kinesis Data Firehose console and choose Create delivery stream.
    Create delivery stream
  2. Under Delivery stream name, enter a name in the format amazon-apigateway-{your-delivery-stream-name}. It is required that your stream name begin with amazon-apigateway-.
    New delivery stream
  3. Leave the default Source setting of Direct PUT or other sources. Choose Next.
  4. Under Data transformation, select Enabled. In the Lambda function dropdown, select the function created earlier. Choose Next.
    Transform source records
  5. Select Amazon S3 as the Destination. In the S3 bucket dropdown, select the bucket created earlier.
    S3 destination
  6. Under S3 prefix, enter logs/year=!{timestamp:YYYY}/month=!{timestamp:MM}/day=!{timestamp:dd}/hour=!{timestamp:HH}/. This naming convention allows the AWS Glue crawler to automatically partition this data during indexing.
    S3 prefix
  7. Under S3 error prefix, enter errors/!{firehose:random-string}/!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}/. This will write errors encountered by the Firehose delivery stream to a folder named errors in the S3 bucket, followed by folders by error type and error timestamp. Choose Next.
    S3 error prefix
  8. Leave the default buffer, compression, and other settings. At the bottom of the screen, choose Create new to create a new IAM role for this delivery stream. In the window that opens, leave the default settings. Choose Allow.
    Setting permissions
  9. This will return you to the Kinesis Firehose delivery stream creation wizard. Choose Next.
  10. On the review page, verify the settings and choose Create delivery stream. Wait for the stream to be successfully created.
  11. You can now configure API Gateway to stream access logs to this Kinesis Firehose delivery stream. Follow these instructions to enable access logging on your API stages using the ARN of the Firehose delivery stream you created.
  12. Under Log Format, choose the fields to include in the access logs in JSON format. Find examples in the API Gateway documentation as well as the full set of available fields in the $context variable. The below fields and mapped names are required for the enrichment Lambda function. Choose Save Changes.
    {
      "apiId": "$context.apiId",
      "identity.apiKeyId": "$context.identity.apiKeyId",
      "stage": "$context.stage"
    }
  13. As the API stages where you enabled access logging receive traffic, you will see files written to your Amazon S3 bucket. Note that the Firehose delivery stream buffers data before writing to S3, so it may take some time before files appear.

Access logs for your API are now flowing to an Amazon S3 bucket enriched with usage plan information. You now need to index this data for querying and make it available in Amazon QuickSight for analysis.

Access Logs Indexing

  1. Navigate to the AWS Glue console. If this is your first time using AWS Glue, choose the Get Started button on the landing page. On the left side of the console select Crawlers. On the Crawlers tab, choose Add crawler.
  2. Enter a name for the Crawler and choose Next.
  3. On the Specify crawler source type page, choose Data stores. Choose Next.
  4. Select S3 as the data store and leave the Connection field empty. In the Include path section, use the folder icon to browse your existing S3 buckets. Use the plus sign to expand the folders beneath the S3 bucket created earlier. Select the logs folder and choose Select. If you don’t see the logs folder, you can add it manually in the next step.
    Choose S3 path
  5. If you did not see a logs folder on the prior screen, you can add it to the end of the S3 location in the input box. Choose Next.
    Crawl data options
  6. On the Add another data source screen, leave No selected and choose Next.
  7. Select Create an IAM Role and enter a name for the IAM role that AWS Glue uses to crawl the S3 bucket. Choose Next.
  8. Under the Frequency for scheduled crawling, select Daily and choose the time when you want to update your index of access logs. The crawl frequency can be modified later. Choose Next.
  9. On the crawler output selection page, select Add database to create a new metadata database for the API Gateway access logs. Name your metadata database and choose Create. Back on the output configuration screen, choose Next.
    Configure the crawler's output
  10. Choose Finish.
  11. In the Crawlers tab of the AWS Glue console, select the checkbox next to the crawler you created. Choose Run crawler.
    Run crawler
  12. After the crawler finishes, you see a table named logs in the Glue database. Navigate to the Tables page of the Glue console to view this table. Selecting the table name will show the metadata that the crawler populated, including the file format, number of records, and schema of the access logs records.
    Tables

You now have an AWS Glue database with metadata of the access logs stored in Amazon S3 and a scheduled Glue crawler. Lastly, you need to make this data available in Amazon QuickSight for visualization and analysis.

Data Visualization

  1. Navigate to the Amazon QuickSight console.
    First-time QuickSight users: Follow these instructions to create a QuickSight account.
    All users: Follow these instructions to update your S3 permissions to include the S3 bucket created earlier containing the API Gateway access logs.
  2. In the menu bar, select Manage data.
    Manage data
  3. On the top left of the Data Sets page, choose New data set.
  4. On the Create data set page, select Amazon Athena.
    Create a data set
  5. On the New Athena data source page, enter a name for this data source. Leave the Athena workgroup on the default setting and select Create data source.
  6. On the following page, use the Database section to select the Glue database you created earlier. Once selected, you will see the tables available inside that database. Select the database table you created earlier to hold the metadata for the access logs in S3.
  7. On the final data set creation page, select Direct query your data. You can change this option later to use QuickSight’s native data cache to improve performance. Choose Visualize.
  8. This will create a QuickSight analysis based on a data set of the API Gateway access logs data. You should see the logs data set selected and the access logs fields available in the Fields list. You can now create visuals based on the API Gateway access logs data.
    QuickSight create visuals menu

Conclusion

In this post, I walk through configuring streaming of API access logs from Amazon API Gateway to Amazon S3 via a Kinesis Firehose delivery stream. An AWS Glue crawler periodically updates metadata in an AWS Glue data catalog for the access logs in S3. This metadata is used by Amazon QuickSight to query the data in S3 to populate visuals in a QuickSight analysis. This allows business and technical owners of API-based products to analyze access trends by customers accessing their APIs.

To learn more, read about different types of visualizations available in QuickSight. As a performance and cost optimization, enable compression and format conversion from JSON to a columnar data format in your Kinesis Firehose delivery stream.

Building a location-based, scalable, serverless web app – part 3

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-location-based-scalable-serverless-web-app-part-3/

In part 2, I cover the API configuration, geohashing algorithm, and real-time messaging architecture used in the Ask Around Me web application. These are needed for receiving and processing questions and answers, and sending results back to users in real time.

In this post, I explain the backend processing architecture, how data is aggregated, and how to deploy the final application to production. The code and instructions for this application are available in the GitHub repo.

Processing questions

The frontend sends new user questions to the backend via the POST questions API. While the predicted volume of questions is only 1,000 per hour, it’s possible for usage to spike unexpectedly. To help handle this load, the PostQuestions Lambda function puts incoming questions onto an Amazon SQS queue. The ProcessQuestions function takes messages from the Questions queue in batches of 10, and loads these into the Questions table in Amazon DynamoDB.

Questions processing architecture

This asynchronous process smooths out traffic spikes, ensuring that the application is not throttled by DynamoDB. It also provides consistent response times to the front-end POST request, since the API call returns as soon as the message is durably persisted to the queue.
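
As a hedged sketch (not the repo’s exact code), the queue consumer pattern looks like this:

const AWS = require('aws-sdk')
AWS.config.update({ region: process.env.AWS_REGION })
const documentClient = new AWS.DynamoDB.DocumentClient()

// SQS-triggered handler: each event contains up to 10 queued questions
exports.handler = async (event) => {
  await Promise.all(event.Records.map((record) => {
    const question = JSON.parse(record.body)
    return documentClient.put({
      TableName: process.env.TableName, // assumed environment variable
      Item: question
    }).promise()
  }))
}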

Currently, the ProcessQuestions function does not parse or validate user questions. It would be easy to add message filtering at this stage, using Amazon Comprehend to detect sentiment or inappropriate language. These changes would increase the processing time per question, but by handling this asynchronously, the initial POST API latency is not adversely affected.

The ProcessQuestions function uses the Geo Library for Amazon DynamoDB, which converts the question’s latitude and longitude into a geohash. This geohash attribute is one of the indexes in the underlying DynamoDB table. The GetQuestions function uses the same library to efficiently query questions based on proximity to the user.

There are a couple of different mechanisms used to pass information between the frontend and backend applications. When the frontend first initializes, it retrieves the current location of the user from the browser. It then calls the questions API to get a list of active questions within 5 miles of the current location. This retrieves the state up to this point in time. To receive notifications of new messages posted in the user’s area, the frontend also subscribes to the geohash topic in AWS IoT Core.

Processing answers

Answers processing architecture

The application allows two types of question that have different answer types. First, the rating questions accept an answer with a 0–5 score range. Second, the geography questions accept a geo-point, which is a latitude and longitude representing a location.

Similar to the way questions are handled, answers are also queued before processing. However, the PostAnswers Lambda function sends answers to different queues, depending on question type. Ratings messages are sent to the StarAnswers queue, while geography messages are routed to the GeoAnswers queue. Star ratings are saved as raw data in the Answers table by the ProcessAnswerStar function. Geography answers are first converted to a geohash before they are stored.

It’s possible for users to submit updates to their answers. For a star rating, the processing function simply saves the new score. For geography answers, if the updated answer contains a latitude and longitude close enough to the original answer, it results in the same geohash and the stored aggregation is unchanged. If the location changes enough to produce a different geohash, the aggregation process must adjust the counts for both the old and new geohash, as described below.

Aggregating data

In this application, the users asking questions are seeking aggregated answers instead of raw data. For example, “How do you rate the park?” shows an average score from users instead of thousands of individual ratings. To maintain performance, this aggregation occurs when new answers are saved to the database, not when the application fetches the question list.

The Answers table emits updates to a DynamoDB stream whenever new items are inserted or updated. The StreamSpecification parameter in the table definition is set to NEW_AND_OLD_IMAGES, meaning the stream record contains both the new and old images of the item.

New answers to questions are new items in the table, so the stream record only contains the new image. If users update their answers, this creates an updated item in the table, and the stream record contains both the new and old images of the item.
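
As a hedged sketch of working with these images in a stream processor (attribute names are illustrative, not the repo’s actual schema):

const AWS = require('aws-sdk')

// Unmarshall both images from a stream record and compute the score change
const getScoreDelta = (record) => {
  const newImage = AWS.DynamoDB.Converter.unmarshall(record.dynamodb.NewImage)

  // INSERT events carry no old image, so treat the previous score as zero
  const oldImage = record.dynamodb.OldImage
    ? AWS.DynamoDB.Converter.unmarshall(record.dynamodb.OldImage)
    : { score: 0 }

  return newImage.score - oldImage.score
}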

For star ratings, when receiving an updated rating, the Aggregation function uses both images to calculate the delta in the score. For example, if the old rating was 2 and the user changes this to 5, then the delta is 3. The summary score related to the answer is updated in the Questions table, using a DynamoDB update expression:

    const result = await myGeoTableManager.updatePoint({
      RangeKeyValue: { S: item.ID }, 
      GeoPoint: {
        latitude: item.lat,
        longitude: item.lng
      },
      UpdateItemInput: {
        UpdateExpression: 'ADD answers :deltaAnswers, totalScore :deltaTotalScore',
        ExpressionAttributeValues: {
          ':deltaAnswers': { N: item.deltaAnswers.toString()},
          ':deltaTotalScore': { N: item.deltaValue.toString()}
        }
      }
    }).promise()

For geo-point ratings, the same approach is used but if the geohash changes, then the delta is -1 for the geohash in the old image, and +1 for the geohash in the new image. The update expression automatically creates a new geohash attribute on the DynamoDB item if it is not already present:

    const result = await myGeoTableManager.updatePoint({
      RangeKeyValue: { S: item.ID }, 
      GeoPoint: {
        latitude: item.lat,
        longitude: item.lng
      },
      UpdateItemInput: {
        UpdateExpression: `ADD ${item.geohash} :deltaAnswers, answers :deltaAnswers`,
        ExpressionAttributeValues: {
          ':deltaAnswers': { N: item.deltaAnswers.toString() }          
        }
      }
    }).promise()

By using a Lambda function as a DynamoDB stream processor, you can aggregate large amounts of data in near real time. The Questions and Answers tables have a one-to-many relationship – many answers belong to one question. As answers are saved, the aggregation process updates the summaries in the Questions table.

The Questions table also publishes updates to another DynamoDB stream. These are consumed by a Lambda function that sends the aggregated update to topics in AWS IoT Core. This is how updated scores are sent back to the frontend client application.

Publishing to production with Amplify Console

At this point, you can run the application on your local development machine and view the application via the localhost Vue.js server. Once you are ready to launch the application to users, you must deploy to production.

Single-page applications are easy to deploy publicly. The build process creates static HTML, JS, and CSS files. These can be served via Amazon S3 and Amazon CloudFront, together with any image and media assets used. The process of running the build process and managing the deployment can be automated using AWS Amplify Console.

In this walkthrough, I use GitHub as the repo provider. You can also use AWS CodeCommit, Bitbucket, GitLab, or upload the build directory from your machine.

To deploy the front end via Amplify Console:

  1. From the AWS Management Console, select the Services dropdown and choose AWS Amplify. From the initial splash screen, choose Get Started under Deploy.
    Amplify Console getting started
  2. Select GitHub as the repository provider, then choose Continue:
    Select GitHub as your code repo
  3. Follow the prompts to enable GitHub access, then select the repository dropdown and choose the repo. In the Branch dropdown, choose master. Choose Next.
    Add repository branch
  4. In the App build and test settings page, choose Next.
  5. In the Review page, choose Save and deploy.
  6. The final screen shows the deployment pipeline for the connected repo, starting at the Provision phase:
    Amplify Console deployment pipeline

After a few minutes, the Build, Deploy, and Verify steps show green checkmarks. Open the URL in a browser, and you see that the application is now served by the public URL:

Ask Around Me - Deployed application

Finally, before logging in, you must add the URL to the list of allowed URLs in the Auth0 settings:

  1. Log into Auth0 and navigate to the dashboard.
  2. Choose Applications in the menu, then select Ask Around Me from the list of applications.
  3. On the Settings tab, add the application’s URL to Allowed Callback URLs, Allowed Logout URLs, and Allowed Web Origins. Separate the new URL from the existing values using a comma.
    Updating the Auth0 configuration
  4. Choose Save changes. This allows the new published domain name to interact with Auth0 for authenticating your application’s users.

Anytime you push changes to the code repository, Amplify Console detects the commit and redeploys the application. If errors are detected, the existing version is presented to users. If there are no errors, the new version is served to visitors.

Conclusion

In the last part of this series, I show how the application queues posted questions and answers. I explain how this asynchronous approach smooths traffic spikes and helps maintain responsive APIs.

I cover how answers are collected from thousands of users and are aggregated using DynamoDB streams. These totals are saved as summaries in the Questions table, and live updates are pushed via AWS IoT Core back to the frontend.

Finally, I show how you can automate deployment using Amplify Console. By connecting the service directly with your code repository, it publishes and serves your application with no need to manually copy files.

To learn more about this application, see the accompanying GitHub repo.

Building a location-based, scalable, serverless web app – part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-location-based-scalable-serverless-web-app-part-2/

Part 1 introduces the Ask Around Me web application that allows users to send questions to other local users in real time. I explain the app’s functionality and how using a single-page application (SPA) framework complements a serverless backend. I configure Auth0 for authentication and show how to deploy the frontend and backend. I also introduce how SPA frontends can send and receive data using both a traditional API and real-time messaging via a WebSocket.

In this post, I review the backend architecture, Amazon API Gateway’s HTTP APIs, and the geohashing implementation. The code and instructions for this application are available in the GitHub repo.

Architecture overview

After deploying the application using the repo’s README.md instructions, the backend architecture looks like this:

Ask Around Me backend architecture

The Vue.js frontend primarily interacts with the backend via HTTP APIs using Amazon API Gateway. When users submit questions or answers, the data is sent via the POST API endpoints. When the frontend requests lists of questions or answers, this occurs via the GET API endpoints.

Incoming questions and answers are posted to separate Amazon SQS queues. These queues invoke AWS Lambda functions that process and store the data in the application’s Amazon DynamoDB tables. In the Questions table, the application saves geo-location data and aggregated statistics for each question. The Answers table maintains a record of user IDs and answers to ensure that each user can only post one answer per question.

When new answers are stored in the Answers tables, a DynamoDB stream triggers a Lambda aggregation function with the update. This calculates average scores for questions and aggregates data for the heat map, then stores the result in the main Questions table. When the Questions table is updated, this DynamoDB stream invokes the Publish Lambda function. This publishes updates to the relevant topic in AWS IoT Core, which the front-end application subscribes to.

Using HTTP APIs

API Gateway is a common integration service used between the frontend and backend of serverless web applications. You can choose between the standard REST APIs and the newer HTTP APIs. The choice depends upon which features you need, and cost considerations for your workload.

This application uses JWT authentication via Auth0 and Lambda proxy integration, and both are supported by HTTP APIs. Many advanced features like API key management, Amazon Cognito integration, and usage plans are not required in this application. It’s also important to compare the cost of each service:

API type             Hourly     Daily        Annually
PUT questions        1,000      24,000       8,760,000
GET questions        50,000     1,200,000    438,000,000
PUT answers          10,000     240,000      87,600,000
Total API requests                           534,360,000
REST APIs cost                               $1,870.26
HTTP APIs cost                               $534.36

Using the predicted API usage covered in part 1, you can compare the REST APIs and HTTP APIs overall cost. At an estimated $534 annually, the HTTP APIs option is approximately 30% of the cost of REST APIs.

The AWS Serverless Application Model (SAM) template in the repo defines the HTTP API resource and CORS configuration. It also includes the Auth0 authorizer used to validate each API request:

  MyApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      Auth:
        Authorizers:
          MyAuthorizer:
            JwtConfiguration:
              issuer: !Ref Auth0issuer
              audience:
                - https://auth0-jwt-authorizer
            IdentitySource: "$request.header.Authorization"
        DefaultAuthorizer: MyAuthorizer
      CorsConfiguration:
        AllowMethods:
          - GET
          - POST
          - DELETE
          - OPTIONS
        AllowHeaders:
          - "*"
        AllowOrigins:
          - "*"

With the HTTP API resource defined, each Lambda function has an event configuration referencing this resource. All the functions referencing the HTTP API resource automatically use the Auth0 authorizer.

  GetAnswersFunction: 
    Type: AWS::Serverless::Function
    Properties:
      Description: Get all answers for a question
      ... 
      Events:
        Get:
          Type: HttpApi
          Properties:
            Path: /answers/{Key}
            Method: get
            ApiId: !Ref MyApi    

Using geohashing in web applications

A key part of the functionality in Ask Around Me is the ability to find and answer questions near the user. Given the expected volume of questions in this system, this requires an efficient way to query based upon location that maintains performance as traffic grows.

In a naïve implementation, you might compare the current geographical position of the user with the geo-location of each question and answer in the database. But with an expected 1,000 questions per hour, this would soon become a slow operation with O(n) performance.

A more efficient solution is geohashing. This divides the geographical area of the planet into a series of grid cells that are identified by an alphanumeric hash. The first character of the hash identifies one of 32 cells in the grid, roughly 5000 km x 5000 km on the planet. The second character identifies one of 32 squares in that first cell, so combining the first two characters provides a resolution of approximately 1250 km x 1250 km. By the 12th character in the hash, you can identify an area as small as a couple of square inches on Earth. For a more detailed explanation, see this geohashing site.

When using this algorithm, it’s important to choose the correct level of resolution. For Ask Around Me, the frontend searches for questions within 5 miles of the user. You can identify these areas with a 5-character hash. This means you can compare the user’s current location using their geohash, to the geohash stored in the Questions table. This comparison allows you to immediately discard most questions from the search and quickly find the relevant items.
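
For illustration, the ngeohash npm package (not part of the application’s dependencies) shows how the precision parameter maps to cell size:

const geohash = require('ngeohash')

// A 5-character hash identifies a cell a few miles across,
// which suits the app's 5-mile search radius
const hash = geohash.encode(47.6205, -122.3493, 5)

// Nearby locations share the same 5-character hash
console.log(hash)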

This solution uses the Geo Library for Amazon DynamoDB npm package. Both the GET and POST questions APIs use this library to calculate the geohash when storing and fetching questions. The library requires a dedicated DynamoDB table, which is why user answers are stored in a separate table.

The GET questions API uses the latitude and longitude from the query parameters to query the underlying DynamoDB using this library:

const AWS = require('aws-sdk')
AWS.config.update({region: process.env.AWS_REGION})

const ddb = new AWS.DynamoDB() 
const ddbGeo = require('dynamodb-geo')
const config = new ddbGeo.GeoDataManagerConfiguration(ddb, process.env.TableName)
config.hashKeyLength = 5

const myGeoTableManager = new ddbGeo.GeoDataManager(config)
const SEARCH_RADIUS_METERS = 4000

exports.handler = async (event) => {

  const latitude = parseFloat(event.queryStringParameters.lat)
  const longitude = parseFloat(event.queryStringParameters.lng)

  // Get questions within geo range
  const result = await myGeoTableManager.queryRadius({
    RadiusInMeter: SEARCH_RADIUS_METERS,
    CenterPoint: {
      latitude,
      longitude
    }
  })

  return {
    statusCode: 200,
    body: JSON.stringify(result)
  }
}

The publish/subscribe pattern for real time in web apps

Modern web applications frequently use real-time notifications to keep users informed of state changes. You could achieve this with frequent polling of the APIs to fetch new information. However, this approach is usually wasteful, both in cost and compute terms, because most API calls do not return new information. Additionally, if updates are evenly distributed and you poll every n seconds, there is an average delay of n/2 seconds between data becoming available and your application receiving it.

Instead of polling, a better option for many web applications is a WebSocket. Data availability is closer to real time, and the messaging is less frequent. This can be important for web applications used on mobile devices where unnecessary messaging can impact battery life.

This approach uses the publish-subscribe pattern. The frontend makes subscriptions to a backend service, indicating topics of interest. The backend service receives messages from publishers, which are upstream processes in the application. It filters the messages and routes to the appropriate subscribers.

Although powerful, this can be complex to implement due to connectivity issues over networks. For a web application, users may turn off their devices, disconnect Wi-Fi, or become unreachable due to limited coverage. This pattern is generally forward-only, meaning you only receive messages after the point of subscription.

AWS IoT Core simplifies this process, and the JavaScript SDK handles the common reconnection issues. The backend application sends messages to topics in AWS IoT Core, and the frontend application subscribes to topics of interest. The service maintains the list of active publishers and subscribers, and routes messages between the two. It also automatically manages fan-out, which occurs when there are many subscribers to a single topic.

From a pricing perspective, this is also a cost-efficient approach. At the time of writing, AWS IoT Core costs $0.08 per million minutes of connection, and $1.00 per million messages. There are also no servers to manage, and the service scales automatically to handle your application’s load.

In the example application, the real-time connection is configured and managed in a single component, IoT.vue. This initiates a connection to an IoT endpoint when the application first starts, and listens for messages on subscribed topics. It passes data back to the global Vuex store so other components automatically receive updates with no dependency on the IoT component.
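
A hedged sketch of that subscription logic, using the aws-iot-device-sdk package (the endpoint variable, credential configuration, and topic name are illustrative):

const awsIot = require('aws-iot-device-sdk')

const client = awsIot.device({
  protocol: 'wss',
  host: process.env.IOT_ENDPOINT // assumed AWS IoT endpoint variable
  // ...credentials configuration omitted for brevity
})

client.on('connect', () => {
  // Subscribe to the user's current geohash topic (illustrative value)
  client.subscribe('dr5ru')
})

client.on('message', (topic, payload) => {
  console.log(`Update on ${topic}:`, payload.toString())
})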

Choosing publish-subscribe topics for web apps

In a typical synchronous API call, the client application makes a specific request and receives a response from a backend service. With a topic-based subscription, the topic itself is the equivalent of the request, but you usually don’t receive immediate information.

In this web application, there are a number of topics that are potentially important to users. Some topics are shared across multiple users, while others are private to a single user:

  • Account-level topic: messages relating only to a single user ID, such as billing and notifications. These are intended for any devices where that user is logged in.
  • Per-question topic: when a user asks a question, they need alerts when new answers arrive. Each question ID maps to an individual topic. Anyone who asks or watches a question subscribes to this topic.
  • Geo-fenced alert topic: a user receives alerts when new questions are asked in their local area. In this case, the geohash of their location is the topic identifier. New questions are published to their geohash topics, and users within the same geohash area receive those messages.
  • A system-wide topic: this is a single topic that all users subscribe to. This is reserved for important messages for all application users.

In web applications, you subscribe to some topics when the application initializes, such as account-level or system-wide topics. Other subscriptions are dynamic. For example, you subscribe to a question ID topic only after posting a question, or subscribe to different geo-fence hashes when the user’s location changes.

Conclusion

This post explores the backend architecture of the Ask Around Me application. I compare the cost and features in deciding between REST APIs and HTTP APIs in API Gateway. I introduce geohashing and the npm library used to handle geo-location queries in DynamoDB. And I show how you can build real-time messaging into your web applications using the publish-subscribe pattern with AWS IoT Core.

To learn more, visit the application’s code repo on GitHub.

Building a location-based, scalable, serverless web app – part 1

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-location-based-scalable-serverless-web-app-part-1/

Web applications represent a major category of serverless usage. When used with single-page application (SPA) frameworks for front-end development, you can create highly responsive apps. With a serverless backend, these apps can scale to hundreds of thousands of users without you managing a single server.

In this 3-part series, I demonstrate how to build an example serverless web application. The application includes authentication, real-time updates, and location-specific features. I explore the functionality, architecture, and design choices involved. I provide a complete code repository for both the front-end and backend. By the end of these posts, you can use these patterns and examples in your own web applications.

In this series:

  • Part 1: Deploy the frontend and backend applications, and learn about how SPA web applications interact with serverless backends.
  • Part 2: Review the backend architecture, Amazon API Gateway HTTP APIs, and the geohashing implementation.
  • Part 3: Understand the backend data processing and aggregation with Amazon DynamoDB, and the final deployment of the application to production.

The code uses the AWS Serverless Application Model (SAM), enabling you to deploy the application easily in your own AWS account. This walkthrough creates resources covered in the AWS Free Tier but you may incur cost for usage beyond development and testing.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file.

Introducing “Ask Around Me” – The app for finding answers from local users

Ask Around Me is a web application that allows you to ask questions to a community of local users. It’s designed to be used on a smartphone browser.

 

Ask Around Me front end application

The front-end uses Auth0 for authentication. For simplicity, it supports social logins with other identity providers. Once a user is logged in, the app displays their local area:

No questions in your area

Users can then post questions to the neighborhood. Questions can be ratings-based (“How relaxing is the park?”) or geography-based (“Where is the best coffee?”).

Ask a new question

Posted questions are published to users within a 5-mile radius. Any user in this area sees new questions appear in the list automatically:

New questions in Ask Around Me

Other users answer questions by providing a star-rating or dropping a pin on a map. As the question owner, you see real-time average scores or a heat map, depending on the question type:

Ask Around Me Heatmap

The app is designed to be fun and easy to use. It uses authentication to ensure that votes are only counted once per user ID. It uses geohashing to ensure that users only see and answer questions within their local area. It also keeps the question list and answers up to date in real time to create a sense of immediacy.

In terms of traffic, the app is expected to receive 1,000 questions and 10,000 answers posted per hour. The query that retrieves local questions is likely to receive 50,000 requests per hour. In the course of these posts, I explore the architecture and services chosen to handle this volume. All of this is built serverlessly with cost effectiveness in mind. The cost scales in line with usage, and I discuss how to make the best use of the app budget in this scenario.

SPA frameworks and serverless backends

While you can apply a serverless backend to almost any type of web or mobile framework, SPA frameworks can make development much easier. For modern web development, SPA frameworks like React.js, Vue.js, and Angular have grown in popularity. They have become the standard way to build complex, rich front-ends.

These frameworks offer benefits to both front-end developers and users. For developers, you can create the application within an IDE and test locally with hot reloading, which renders new content in the same context in the browser. For users, it creates a web experience that’s similar to a traditional application, with reactive content and faster interactive capabilities.

When you build a SPA-based application, the build process creates HTML, JavaScript, and CSS files. You serve these static assets from an Amazon CloudFront distribution with an Amazon S3 bucket set as the origin. CloudFront serves these files from 216 global points of presence, resulting in low latency downloads regardless of where the user is located.

CloudFront/S3 app distribution

You can also use AWS Amplify Console, which can automate the build and deployment process. Builds are triggered by events in your code repo, so when you commit code changes, they are automatically deployed to production.

A traditional web server typically serves both the application’s static assets and its dynamic content. In this model, you offload the serving of all of the application assets to a global CDN. The dynamic application is a serverless backend powered by Amazon API Gateway and AWS Lambda. By using a SPA framework with a serverless backend, you can create performant, highly scalable web applications that are also easy to develop.

Configuring Auth0

This application integrates Auth0 for user authentication. The front-end calls out to this service when users are not logged in, and Auth0 provides an open standard JWT token after the user is authenticated. Before you can install and use the application, you must sign up for an Auth0 account and configure the application:

  1. Navigate to https://auth0.com/ and choose Sign Up. Complete the account creation process.
  2. From the dashboard, choose Create Application. Enter AskAroundMe as the name and select Single Page Web Applications for the Application Type. Choose Create.
  3. In the next page, choose the Settings tab. Copy the Client ID and Domain values to a text editor – you need these for setting up the Vue.js application later.
  4. Further down on this same tab, enter the value http://localhost:8080 into the Allowed Logout URLs, Allowed Callback URLs, and Allowed Web Origins fields. Choose Save Changes.
  5. On the Connections tab, in the Social section, add google-oauth2 and twitter and ensure that the toggles are selected. This enables social sign-in for your application.

This configuration allows the application to interact with the Auth0 service from your local machine. In production, you must enter the domain name of the application in these fields. For more information, see Auth0’s documentation for Application Settings.

Deploying the application

In the code repo, there are separate directories for the front-end and backend applications. You must install the backend first. To complete this step, follow the detailed instructions in the repo’s README.md.

There are several important environment variables to note from the backend installation process:

  • IoT endpoint address and Cognito Pool ID: these are used for real-time messaging between the backend and frontend applications.
  • API endpoint: the base URL path for the backend’s APIs.
  • Region: the AWS Region where you have deployed the application.

Next, you deploy the Vue.js application from the frontend directory:

  1. The application uses the Google Maps API – sign up for a developer account and make a note of your API key.
  2. Open the main.js file in the src directory. Lines 45 through 62 contain the configuration section where you must add the environment variables above:

Ask Around Me Vue.js configuration

Ensure you complete the Auth0 configuration and remaining steps in the README.md file, then you are ready to test.

To launch the frontend application, run npm run serve to start the development server. The terminal shows the local URL where the application is now running:

Running the Vue.js app

Open a web browser and navigate to http://localhost:8080 to see the application.

How Vue.js applications work with a serverless backend

Unlike a traditional web application, a SPA loads in the user’s browser and starts executing JavaScript on the client side. The app loads assets and initializes itself before communicating with the serverless backend. This lifecycle and behavior is comparable to a conventional desktop or mobile application.

Vue.js is a component-based framework. Each component optionally contains a user interface with related code and styling. Overall application state may be managed by a store – this example uses Vuex. You can use many of the patterns employed in this application in your own apps.

Auth0 provides a Vue.js component that automates storing and parsing the JWT token in the local browser. Each time the app starts, this component verifies the token and makes it available to your code. This app uses Vuex to manage the timing between the token becoming available and the app needing to request data.

The application completes several initialization steps before querying the backend for a list of questions to display:

Initialization process for the app

Several components can request data from the serverless backend via API Gateway endpoints. In src/views/HomeView.vue, the component loads a list of questions when it determines the location of the user:

const token = await this.$auth.getTokenSilently()
const url = `${this.$APIurl}/questions?lat=${this.currentLat}&lng=${this.currentLng}`
console.log('URL: ', url)
// Make API request with JWT authorization
const { data } = await axios.get(url, {
  headers: {
    // send access token through the 'Authorization' header
    Authorization: `Bearer ${token}`   
  }
})

// Commit question list to global store
this.$store.commit('setAllQuestions', data)

This process uses the Axios library to manage the HTTP request and pass the authentication token in the Authorization header. The resulting dataset is saved in the Vuex store. Since SPA applications react to changes in data, any frontend component displaying data is automatically refreshed when it changes.

The src/components/IoT.vue component uses MQTT messaging via AWS IoT Core. This manages real-time updates published to the frontend. When a question receives a new answer, this component receives an update. The component updates the question status in the global store, and all other components watching this data automatically receive those updates:

mqttClient.on('message', function (topic, payload) {
  const msg = JSON.parse(payload.toString())

  if (topic === 'new-answer') {
    _store.commit('updateQuestion', msg)
  } else {
    _store.commit('saveQuestion', msg)
  }
})

The application uses both API Gateway synchronous queries and MQTT WebSocket updates to communicate with the backend application. As a result, you have considerable flexibility for tracking overall application state and providing your users with a responsive application experience.

Conclusion

In this post, I introduce the Ask Around Me example web application. I discuss the benefits of using single-page application (SPA) frameworks for both developers and users. I cover how they can create highly scalable and performant web applications when powered with a serverless backend.

You configure Auth0 and deploy the frontend and backend from the application’s GitHub repo. I also review the backend SAM template and the architecture it deploys.

In part 2, I will explain the backend architecture, the Amazon API Gateway configuration, and the geohashing implementation.

Building scalable serverless applications with Amazon S3 and AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-scalable-serverless-applications-with-amazon-s3-and-aws-lambda/

Well-designed serverless applications are typically a combination of managed services connected by custom business logic. One of the most powerful combinations for enterprise application development is Amazon S3 and AWS Lambda. S3 is a highly durable, highly available object store that scales to meet your storage needs. Lambda runs custom code in response to events, automatically scaling with the size of the workload. When you use the two services together, they can provide a scalable core for serverless solutions.

This blog post shows how to design and deploy serverless applications built around S3 events. The solutions presented use AWS services to create scalable serverless architectures, using minimal custom code. This post concludes a series showing how the S3-to-Lambda pattern can implement a range of business solutions.

Bringing the compute layer to the data

Much traditional software operates by bringing data to the compute layer. This means that processes run on batches of data in files, databases, and other sources. This is inherently harder to scale as data volumes grow, often needing a fleet of servers to scale out at peak times. For the developer, this creates operational overhead to ensure that the compute capacity is keeping pace with the data volume.

The S3-to-Lambda serverless pattern instead brings the compute layer to the data. As data arrives, the compute process scales up and down automatically to meet the demand. This allows developers to focus on building business logic for a single item of data, and the execution at scale is handled by the Lambda service.

The image optimization application is a good example for comparing the traditional and serverless approaches. For a busy media site capturing hundreds of images per minute in an S3 bucket, the operational overhead becomes clear. A script running on a server must scale out across multiple instances to keep pace with this level of traffic. Compare this to the Lambda-based approach, which scales on demand. The code itself does not change, whether it is used for a single image or thousands.

Receiving and processing events from S3 in custom code

S3 raises events when objects are put, copied, or deleted in a bucket. It also raises a broad number of other notifications, such as when lifecycle events occur. You can configure S3 to invoke Lambda from these events by using the S3 console, Lambda console, AWS CLI, or AWS Serverless Application Model (SAM) templates.

S3 passes details of the event, not the object itself, to the Lambda function in a JSON object. This object contains an array of records, so it’s possible to receive more than one S3 event per invocation:

S3 passes event details to Lambda
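
As an abbreviated sketch of the event shape (the bucket and key values are hypothetical examples):

{
  "Records": [
    {
      "eventSource": "aws:s3",
      "eventName": "ObjectCreated:Put",
      "s3": {
        "bucket": { "name": "my-example-bucket" },
        "object": { "key": "photos/image.jpg", "size": 1048576 }
      }
    }
  ]
}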

As the Lambda handler may receive more than one record, it should iterate through the records collection. It’s best practice to keep the handler small and generic, calling out to the business logic in a separate function or file:

const processEvent = require('my-custom-logic')

// A Node.js Lambda handler
exports.handler = async (event) => {

  // Capture the event – useful for creating mock events
  console.log(JSON.stringify(event, null, 2))

  // Handle each incoming S3 record in the event
  await Promise.all(
    event.Records.map(async (record) => {
      try {
        // Pass each record to the business logic handler
        await processEvent(record)
      } catch (err) {
        console.error('Handler error: ', err)
      }
    })
  )
}

This code example takes advantage of concurrent asynchronous executions available in Node.js but similar constructs are available in many other languages. This means that multiple objects are processed in parallel to minimize the overall function execution time.

Instead of handling and logging any errors within the function’s code, it’s also possible to use destinations for asynchronous invocations. You use an On failure condition to route the error to various potential targets, including another Lambda function or other AWS services. For complex applications or those handling large volumes, this provides greater control for managing events that fail processing.

During the development process, you can debug and test the S3-to-Lambda integration locally. First, capture a sample event during development to create a mock event for local testing. The sample applications in this series each use a test harness so the developer can test the handler on a local machine. The test harness invokes the handler locally, providing mock environment variables:

// Mock event
const event = require('./localTestEvent')

// Mock environment variables
process.env.AWS_REGION = 'us-east-1'
process.env.localTest = true
process.env.language = 'en'

// Lambda handler
const { handler } = require('./app')

const main = async () => {
  console.time('localTest')
  await handler(event)
  console.timeEnd('localTest')
}

main().catch(error => console.error(error))

Scaling up when more data arrives

The Lambda service scales up if S3 sends multiple events simultaneously. How this works depends on several factors. If the target Lambda function has sufficient concurrency available, and its active instances are already busy processing events, the Lambda service adds instances to handle the additional load.

Lambda scaling up as events queue grows

The function does not scale up if the reserved concurrency is set to 1 or the scaling capacity is fully consumed for a Region in your account. In this case, the events from S3 are queued internally until a Lambda instance is available for processing. You can request to increase the regional concurrency limit by submitting a request in the Support Center console. You may also intend to perform one-at-a-time processing by setting the reserved concurrency to 1.

One-at-a-time processing with Lambda
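
As a sketch in a SAM template (the function shown is a hypothetical example), one-at-a-time processing is a single property on the function definition:

  OneAtATimeFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: processor/
      Handler: app.handler
      Runtime: nodejs12.x
      # Limit this function to a single concurrent execution
      ReservedConcurrentExecutions: 1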

Generally, multiple instances of a function are invoked simultaneously when S3 receives multiple objects, to process the events as quickly as possible. It’s this rapid scaling and parallelization in both S3 and Lambda that make this pattern such a powerful core architecture for many applications.

Amazon SNS and Amazon SQS integrations

The native S3-to-Lambda integration provides a reliable way to invoke one function per prefix or suffix pattern per bucket. For example, you can invoke a function when object keys end in .pdf in a single bucket. This works well for the vast majority of use-cases, but you may want to invoke multiple Lambda functions per S3 event.

In this case, S3 can publish notifications to SNS, where events are delivered to a range of targets. These include Lambda functions, SQS queues, HTTP endpoints, email, text messages, and push notifications. SNS provides fan-out capability, enabling one event to be delivered to multiple destinations, such as Lambda functions or webhooks.
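
As a sketch, this is how the notification might look in the bucket definition, assuming an SNS topic resource named UploadsTopic defined elsewhere in the same template (S3 also requires a topic policy granting it permission to publish, omitted here for brevity):

  SourceBucket:
    Type: AWS::S3::Bucket
    Properties:
      NotificationConfiguration:
        TopicConfigurations:
          # Publish all object-created events to the SNS topic
          - Event: s3:ObjectCreated:*
            Topic: !Ref UploadsTopic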

In busy applications, the volume of S3 events may be too large for a downstream system, such as a non-serverless service. In this case, you can also use an SQS queue as a notification target. After events are published to a queue, they can be consumed by Lambda functions and other services. The queue acts as a buffer and can help smooth out traffic for systems consuming these events. See the DynamoDB importer repository for an example.

Uploading data to S3 in upstream applications

You may have upstream services in your architecture that generate the data stored in S3. Some upstream workloads have spiky usage patterns and large numbers of users, like web or mobile applications. You may increase the performance and throughput of these workloads by uploading directly to S3. This avoids proxying binary data through an API Gateway endpoint or web server.

For example, for a mobile application uploading user photos, S3 and Lambda can handle the upload process for large numbers of users:

  1. The upstream process, in this case a mobile client, requests a presigned URL from an API Gateway endpoint.
  2. This invokes a Lambda function that requests a presigned URL for the S3 bucket, and returns it in the API response.
  3. The mobile client sends the data directly to the presigned S3 URL using HTTPS POST. The upload is managed directly by S3.

This simple pattern can be a scalable and cost-effective way to upload large binary data into your applications. After the object successfully uploads, the S3 put event can then asynchronously invoke downstream workflows.
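
As a minimal sketch of step 2, assuming the AWS SDK for JavaScript v2, an UploadBucket environment variable, and a hypothetical key naming scheme. This generates a PUT-style presigned URL; a presigned POST is an alternative approach:

const AWS = require('aws-sdk')
const s3 = new AWS.S3()

exports.handler = async () => {
  // Hypothetical key naming scheme for uploaded photos
  const Key = `uploads/${Date.now()}.jpg`

  // Generate a presigned URL that is valid for 5 minutes
  const uploadURL = await s3.getSignedUrlPromise('putObject', {
    Bucket: process.env.UploadBucket,
    Key,
    Expires: 300,
    ContentType: 'image/jpeg'
  })

  // The client uploads the object directly to S3 using this URL
  return {
    statusCode: 200,
    body: JSON.stringify({ uploadURL, Key })
  }
}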

Visit this repository to see an example of a serverless S3 uploader application. You can also see a walkthrough of this process in this YouTube video.

Developing larger applications

As you develop larger serverless applications, it often becomes more practical to split applications into multiple services and repositories for separate teams. Often, individual services must integrate with existing S3 buckets rather than creating them in the application templates. You may also have to integrate a single service with multiple S3 buckets.

In Decoupling larger applications with Amazon EventBridge, I show how you can decouple services within an application using an event bus. This pattern helps separate the producers and consumers of events in your workload. This can make each service more independent and more resilient to changes elsewhere in the application.

This example demonstrates how the document repository solution can be refactored into several smaller applications that communicate using events. This uses Amazon EventBridge as the event router coordinating the flow. Each application contains a SAM template that defines the EventBridge rule to filter for events, and publishes data back to the event bus after processing is complete.

One of the major benefits of using an event-based architecture is that development teams retain flexibility even as the application grows. It allows developers to separate AWS resources, like S3 buckets and DynamoDB tables, from compute resources, like Lambda functions. This decoupling can simplify the deployment process, help avoid building monoliths, and reduce the cognitive load of developing in large applications.

Conclusion

S3 and Lambda are two highly scalable AWS services that can be powerful when combined in serverless applications. In this post, I summarize many of the patterns shown across this series. I explain the integration pattern and the scaling behavior, and how you can use mock events for local testing and development. You can also use SNS and SQS in some applications for fan-out and buffering of events.

Upstream applications can upload data directly to S3 to achieve greater scalability by avoiding proxies. For larger applications, I show how using an event-based architecture modeled around EventBridge can help decouple application services. This can promote service independence, and help maintain flexibility as applications grow.

To learn more about the S3-to-Lambda architecture pattern, watch the YouTube video series, or explore the articles listed at the top of this post.

Best practices for organizing larger serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/best-practices-for-organizing-larger-serverless-applications/

Well-designed serverless applications are decoupled, stateless, and use minimal code. As projects grow, a goal for development managers is to maintain the simplicity of design and low-code implementation. This blog post provides recommendations for designing and managing code repositories in larger serverless projects, and best practices for deploying releases of production systems.

Organizing your code repositories

Many serverless applications begin as monolithic applications. This can occur either because a simple application has grown more complex over time, or because developers are following existing development practices. A monolithic application is represented by a single AWS Lambda function performing multiple tasks, and a mono-repo is a single repository containing the entire application logic.

Monoliths work well for the simplest serverless applications that perform single-purpose functions. These are small applications such as cron jobs, data processing tasks, and some asynchronous processes. As those applications evolve into workflows or develop new features, it becomes important to refactor the code into smaller services.

Using frameworks such as the AWS Serverless Application Model (SAM) or the Serverless Framework can make it easier to group common pieces of functionality into smaller services. Each of these can have a separate code repository. For SAM, the template.yaml file contains all the resources and function definitions needed for an application. Consequently, breaking an application into microservices with separate templates is a simple way to split repos and resource groups.

Separate templates for microservices

In the smallest unit of a serverless application, it’s also possible to create one repository per function. If these functions are independent and do not share other AWS resources, this may be appropriate. Helper functions and simple event processing code are examples of candidates for this kind of repo structure.

In most cases, it makes sense to create repos around groups of functions and resources that define a microservice. In an ecommerce example, “Payment processing” is a microservice with multiple smaller related functions that share common resources.

As with any software, the repo design depends upon the use-case and the structure of development teams. One large repo makes it harder for developer teams to work on different features, and to test and deploy independently. Having too many repos can create duplicated code and difficulty in sharing resources across repos. Finding the balance for your project is an important step in designing your application architecture.

Using AWS services instead of code libraries

AWS services are important building blocks for your serverless applications. These can frequently provide greater scale, performance, and reliability than bundled code packages with similar functionality.

For example, many web applications that are migrated to Lambda use web frameworks like Flask (for Python) or Express (for Node.js). Both packages support routing and separate user contexts that are well suited if the application is running on a web server. Using these packages in Lambda functions results in architectures like this:

Web servers in Lambda functions

In this case, Amazon API Gateway proxies all requests to the Lambda function to handle routing. As the application develops more routes, the Lambda function grows in size and deployments of new versions replace the entire function. It becomes harder for multiple developers to work on the same project in this context.

This approach is generally unnecessary, and it’s often better to take advantage of the native routing functionality available in API Gateway. In many cases, the web framework is not needed in the Lambda function, where it only increases the size of the deployment package. API Gateway is also capable of validating parameters, reducing the need for checking parameters with custom code. It can also provide protection against unauthorized access, and a range of other features more suited to be handled at the service level. When using API Gateway this way, the new architecture looks like this:

Using API Gateway for routing

Additionally, the Lambda functions consist of less code and fewer package dependencies. This makes testing easier and reduces the need to maintain code library versions. Different developers in a team can work on separate routing functions independently, and it becomes simpler to reuse code in future projects. You can configure routes in API Gateway in the application’s SAM template:

Resources:
  GetProducts:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: getProducts/
      Handler: app.handler
      Runtime: nodejs12.x
      Events:
        GetProductsAPI:
          Type: Api 
          Properties:
            Path: /getProducts
            Method: get

Similarly, you should usually avoid performing workflow orchestrations within Lambda functions. These are sections of code that call out to other services and functions, and perform subsequent actions based on successful execution or failure.

Lambda functions with embedded workflow orchestrations

These workflows quickly become fragile and difficult to modify for new requirements. They can also cause idling in the Lambda function, meaning that the function is waiting for return values from external sources, increasing the cost of execution.

Often, a better approach is to use AWS Step Functions, which can represent complex workflows as JSON definitions in the application’s SAM template. This service reduces the amount of custom code required, and enables long-lived workflows that minimize idling in Lambda functions. It also manages in-flight executions as workflows are upgraded. The example above, rearchitected with a Step Functions workflow, looks like this:

Using Step Functions for orchestration
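
As a hedged sketch, assuming SAM’s AWS::Serverless::StateMachine resource (all resource and state names here are hypothetical), a simple two-step workflow can be defined inline in the template:

  OrderWorkflow:
    Type: AWS::Serverless::StateMachine
    Properties:
      Definition:
        StartAt: ProcessPayment
        States:
          ProcessPayment:
            Type: Task
            Resource: !GetAtt ProcessPaymentFunction.Arn
            Next: SendConfirmation
          SendConfirmation:
            Type: Task
            Resource: !GetAtt SendConfirmationFunction.Arn
            End: true
      Policies:
        - LambdaInvokePolicy:
            FunctionName: !Ref ProcessPaymentFunction
        - LambdaInvokePolicy:
            FunctionName: !Ref SendConfirmationFunction

Because the workflow lives in the template rather than in function code, you can change the sequence of steps without touching the business logic in each function.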

Using multiple AWS accounts for development teams

There are many ways to deploy serverless applications to production. As applications grow and become more important to your business, development managers generally want to improve the robustness of the deployment process. You have a number of options within AWS for managing the development and deployment of serverless applications.

First, it is highly recommended to use more than one AWS account. Using AWS Organizations, you can centrally manage the billing, compliance, and security of these accounts. You can attach policies to groups of accounts to avoid custom scripts and manual processes. One simple approach is to provide each developer with an AWS account, and then use separate accounts for a beta deployment stage and production:

Multiple AWS accounts in a deployment pipeline

The developer accounts can contain copies of production resources and provide the developer with admin-level permissions to these resources. Each developer has their own set of limits for the account, so their usage does not impact your production environment. Individual developers can deploy CloudFormation stacks and SAM templates into these accounts with minimal risk to production assets.

This approach allows developers to test Lambda functions locally on their development machines against live cloud resources in their individual accounts. It can help create a robust unit testing process, and developers can then push code to a repository like AWS CodeCommit when ready.

By integrating with AWS Secrets Manager, you can store different sets of secrets in each environment and eliminate any need for credentials stored in code. As code is promoted from developer account through to the beta and production accounts, the correct set of credentials is automatically used. You do not need to share environment-level credentials with individual developers.
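
As a minimal sketch, assuming the AWS SDK for JavaScript v2 and a hypothetical secret named app/db-credentials:

const AWS = require('aws-sdk')
const secretsManager = new AWS.SecretsManager()

// Load environment-specific credentials at runtime instead of storing them in code
async function getDbCredentials () {
  const result = await secretsManager.getSecretValue({
    SecretId: 'app/db-credentials'
  }).promise()
  return JSON.parse(result.SecretString)
}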

It’s also possible to implement a CI/CD process to start build pipelines when code is deployed. To deploy a sample application using a multi-account deployment flow, follow this serverless CI/CD tutorial.

Managing feature releases in serverless applications

As you implement CI/CD pipelines for your production serverless applications, it is best practice to favor safe deployments over entire application upgrades. Unlike traditional software deployments, serverless applications are a combination of custom code in Lambda functions and AWS service configurations.

A feature release may consist of a version change in a Lambda function. It may have a different endpoint in API Gateway, or use a new resource such as a DynamoDB table. Access to the deployed feature may be controlled via user configuration and feature toggles, depending upon the application. AWS SAM has AWS CodeDeploy built-in, which allows you to configure canary deployments in the YAML configuration:

Resources:
 GetProducts:
   Type: AWS::Serverless::Function
   Properties:
     CodeUri: getProducts/
     Handler: app.handler
     Runtime: nodejs12.x

     AutoPublishAlias: live

     DeploymentPreference:
       Type: Canary10Percent10Minutes 
       Alarms:
         # A list of alarms that you want to monitor
         - !Ref AliasErrorMetricGreaterThanZeroAlarm
         - !Ref LatestVersionErrorMetricGreaterThanZeroAlarm
       Hooks:
         # Validation Lambda functions run before/after traffic shifting
         PreTraffic: !Ref PreTrafficLambdaFunction
         PostTraffic: !Ref PostTrafficLambdaFunction

CodeDeploy automatically creates aliases pointing to the old and new versions of a function. The canary deployment enables you to gradually shift traffic from the old to the new alias as you become confident that the new version is working as expected, or to roll back the update if needed. You can also set PreTraffic and PostTraffic hooks to invoke Lambda functions before and after traffic shifting.
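
As a sketch of a PreTraffic hook in Node.js, assuming the AWS SDK v2 (the validation logic is a placeholder for your own tests):

const AWS = require('aws-sdk')
const codedeploy = new AWS.CodeDeploy()

exports.handler = async (event) => {
  // Run checks against the new function version here
  const testsPassed = true

  // Report the result back to CodeDeploy to continue or stop the deployment
  await codedeploy.putLifecycleEventHookExecutionStatus({
    deploymentId: event.DeploymentId,
    lifecycleEventHookExecutionId: event.LifecycleEventHookExecutionId,
    status: testsPassed ? 'Succeeded' : 'Failed'
  }).promise()
}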

Conclusion

As any software application grows in size, it’s important for development managers to organize code repositories and manage releases. There are established patterns in serverless to help manage larger applications. Generally, it’s best to avoid monolithic functions and mono-repos, and you should scope repositories to either the microservice or function level.

Well-designed serverless applications use custom code in Lambda functions to connect with managed services. It’s important to identify libraries and packages that can be replaced with services to minimize the deployment size and simplify the code base. This is especially true in applications that have been migrated from server-based environments.

Using AWS Organizations, you manage groups of accounts to enable your developers to have their own AWS accounts for development. This enables engineers to clone production assets and test against the AWS Cloud when writing and debugging code. You can use a CI/CD pipeline to push code through a beta environment to production, while safeguarding secrets using Secrets Manager. You can also use CodeDeploy to manage canary deployments easily.

To learn more about deploying Lambda functions with SAM and CodeDeploy, follow the steps in this tutorial.

Using dynamic Amazon S3 event handling with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-dynamic-amazon-s3-event-handling-with-amazon-eventbridge/

A common pattern in serverless applications is to invoke a Lambda function in response to an event from Amazon S3. For example, you could use this pattern for automating document translation, transcribing audio files, or staging data imports. You can configure this integration in many places, including the AWS Management Console, the AWS CLI, or the AWS Serverless Application Model (SAM).

If you need to fan out notifications, or hold messages in a queue, you can also route S3 events to Amazon SNS or Amazon SQS. These standard notification mechanisms work well for most applications, and are simple to implement. However, for more complex notification patterns, you can use Amazon EventBridge to route events dynamically. This blog post explores advanced use-cases and how to implement these in your serverless applications.

S3 to EventBridge, using CloudTrail.

To set up the example applications, visit the GitHub repo and follow the instructions in the README.md file. The code uses SAM templates, enabling you to deploy the applications in your own AWS account. This walkthrough creates resources covered in the AWS Free Tier but you may incur cost if you test with large amounts of data.

Integrating S3 events with Lambda via EventBridge

EventBridge consumes S3 events via AWS CloudTrail. A single trail can log events for one or more S3 buckets, and you can configure which data events are recorded. It’s best practice to store CloudTrail log files in a separate S3 bucket. Once this is configured, EventBridge can then receive any event logged in the trail.

The first example in the GitHub repo shows how this can be configured in a SAM template. The application comprises an S3 bucket, a Lambda EventConsumer function, and other required resources. First, the template defines the two buckets:

Resources: 
  SourceBucket: 
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "TheSourceBucket"

  LoggingBucket: 
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "TheLoggingBucket"

Next, an S3 bucket policy grants permissions for CloudTrail to write files to the logging bucket:

  BucketPolicy: 
    Type: AWS::S3::BucketPolicy
    Properties: 
      Bucket: 
        Ref: LoggingBucket
      PolicyDocument: 
        Version: "2012-10-17"
        Statement: 
          - 
            Sid: "AWSCloudTrailAclCheck"
            Effect: "Allow"
            Principal: 
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:GetBucketAcl"
            Resource: 
              !Sub |-
                arn:aws:s3:::${LoggingBucket}
          - 
            Sid: "AWSCloudTrailWrite"
            Effect: "Allow"
            Principal: 
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:PutObject"
            Resource:
              !Sub |-
                arn:aws:s3:::${LoggingBucket}/AWSLogs/${AWS::AccountId}/*
            Condition: 
              StringEquals:
                s3:x-amz-acl: "bucket-owner-full-control"

The template configures the trail and sets the logging bucket. It defines event selectors, which identify the specific events for logging:

  myTrail: 
    Type: AWS::CloudTrail::Trail
    DependsOn: 
      - BucketPolicy
    Properties: 
      TrailName: "MyTrailName"
      S3BucketName: 
        Ref: LoggingBucket
      IsLogging: true
      IsMultiRegionTrail: false
      EventSelectors:
        - DataResources:
          - Type: AWS::S3::Object
            Values:
              - !Sub |-
                arn:aws:s3:::${SourceBucket}/
      IncludeGlobalServiceEvents: false

The SAM template configures a target Lambda function for receiving the events:

  EventConsumerFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: eventConsumer/
      Handler: app.handler
      Runtime: nodejs12.x

Finally, it defines a rule that sets the event pattern and targets. It also grants permission to EventBridge to invoke the Lambda function:

  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      State: "ENABLED"
      EventPattern: 
        source: 
          - "aws.s3"
        detail: 
          eventName: 
            - "PutObject"
          requestParameters:
            bucketName:
              - !Ref SourceBucket

      Targets: 
        - 
          Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction"
              - "Arn"
          Id: "EventConsumerFunctionTarget"

  PermissionForEventsToInvokeLambda: 
    Type: AWS::Lambda::Permission
    Properties: 
      FunctionName: 
        Ref: "EventConsumerFunction"
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: 
        Fn::GetAtt: 
          - "EventRule"
          - "Arn"

To deploy this application, follow the instructions in the GitHub repo’s README.md file. To test, upload any file to the Source Bucket. This invokes the Lambda function via the EventBridge event, and logs out the event details. Open the CloudWatch Logs console for the deployed Lambda function to view the output.
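
The eventConsumer function can be as simple as logging the incoming event (a sketch, assuming Node.js):

// Log the entire EventBridge event, including the CloudTrail detail object
exports.handler = async (event) => {
  console.log(JSON.stringify(event, null, 2))
}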

The event pattern in this example matches on any PutObject event in the Source Bucket. You can also match on any attribute, or combination of attributes, in an S3 event. This makes it possible to identify events by source IP address, object size, time range, or principalId (the user causing the event). With access to the entire S3 event, this enables more granularity on matching events before invoking the target Lambda function.
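
For example, a hypothetical rule pattern that only matches PutObject events from a specific source IP address:

      EventPattern:
        source:
          - "aws.s3"
        detail:
          eventName:
            - "PutObject"
          requestParameters:
            bucketName:
              - !Ref SourceBucket
          sourceIPAddress:
            - "203.0.113.10"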

Consuming events from existing S3 buckets

When deploying S3 and Lambda integrations in SAM templates, you cannot use existing buckets managed outside of the CloudFormation stack. Frequently, it’s useful to deploy serverless applications that integrate with existing S3 buckets. Using the S3-to-EventBridge integration, you can create new applications that receive events from existing buckets.

Consuming events from existing S3 buckets

The second example in the GitHub repo shows how to configure a new application for an existing bucket. This template takes the existing S3 bucket name as a parameter, and generates the CloudTrail trail, EventBridge rule, and required permissions.

Follow this example’s README.md file to deploy the application. To test, upload any file into the existing S3 bucket you selected. This invokes the eventConsumer logging function deployed in the template.

Invoking a single Lambda function from multiple S3 buckets

Because EventBridge decouples the producers and consumers of the events, it is also easier to introduce multiple producers. In the third example, the SAM template creates three buckets that invoke the same EventConsumer Lambda function:

Invoking Lambda from multiple S3 buckets

The MultiBucketName parameter is used to create the three buckets with a number appended to the name. First, the CloudTrail EventSelector includes the three buckets in the trail:

  # The CloudTrail trail 
  myTrail: 
    Type: AWS::CloudTrail::Trail
    DependsOn: 
      - BucketPolicy
    Properties: 
      TrailName: "myTrail"
      S3BucketName: 
        Ref: LoggingBucket
      IsLogging: true
      IsMultiRegionTrail: false
      EventSelectors:
        - DataResources:
          - Type: AWS::S3::Object
            Values:
              - !Sub 'arn:aws:s3:::${MultiBucketName}-1/'
              - !Sub 'arn:aws:s3:::${MultiBucketName}-2/'
              - !Sub 'arn:aws:s3:::${MultiBucketName}-3/'
      IncludeGlobalServiceEvents: false

Next, the EventRule includes the three bucket names in the event pattern, so events from any of these buckets can now trigger the rule:

  # EventBridge rule - invokes EventConsumerFunction 
  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      State: "ENABLED"
      EventPattern: 
        source: 
          - "aws.s3"
        detail: 
          eventName: 
            - "PutObject"
          requestParameters:
            bucketName:
              - !Sub '${MultiBucketName}-1'
              - !Sub '${MultiBucketName}-2'
              - !Sub '${MultiBucketName}-3'

It’s also possible to use content-based filtering in event patterns to match dynamically on bucket names. For example, if you have multiple buckets with the prefix myCompanySales, you can create an event pattern to match all of these buckets:

      EventPattern: 
        source: 
          - "aws.s3"
        detail: 
          eventName: 
            - "PutObject"
          requestParameters:
            bucketName:
              - "prefix": "myCompanySales" 

This enables your application to consume events from new buckets created after the application is deployed. With content-based filtering, you can create search patterns that allow greater flexibility in matching events.

Multiple buckets with multiple Lambda functions

In the standard S3 and Lambda integration, each notification configured on a bucket must use a distinct prefix and suffix pattern. This means that two Lambda functions cannot both be triggered by PutObject events for the same file type or prefix. When you need to invoke multiple functions with the same or overlapping prefixes or suffixes, the EventBridge integration can handle this.

EventBridge allows up to five targets per rule, so you can specify up to five separate Lambda functions to receive the event. All five functions are invoked in parallel when the event pattern matches. To use this, add the targets in the rule – no change to the event pattern is required.

In the fourth example, the SAM template configures three buckets and three Lambda functions, all subscribing to the same event pattern.

Multiple buckets with multiple Lambda subscribers

This template takes the existing S3 bucket name as a parameter, and generates the CloudTrail trail, EventBridge rule, and required permissions. The key change to the template is in the EventRule, where now more than one target is defined:

      Targets: 
        - Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction1"
              - "Arn"
          Id: "EventConsumerFunctionTarget1"
        - Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction2"
              - "Arn"
          Id: "EventConsumerFunctionTarget2"
        - Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction3"
              - "Arn"
          Id: "EventConsumerFunctionTarget3"

This approach enables more complex routing of S3 events to Lambda targets. It allows events from multiple S3 buckets with overlapping prefixes and suffixes in object names. It also enables you to route those events to multiple Lambda functions simultaneously.

Conclusion

The standard S3 to Lambda integration enables developers to deploy code that responds to bucket- or object-based events. You can also use SNS or SQS as targets for fanning out or buffering messages from S3. Using Amazon EventBridge, you can employ even more sophisticated routing and filtering of events between S3 and Lambda.

In this blog post, I show how to deploy a basic integration using a SAM template with a single bucket and single Lambda function. I cover how to use existing S3 buckets in your new application deployments, and use EventBridge content filtering in rules to dynamically match bucket events.

Finally, in complex serverless applications, I show how EventBridge completely decouples the producers and consumers. This makes it easy to route events from multiple S3 buckets to multiple Lambda functions. When combined with attribute matching across the entire S3 event object, this allows much more granularity in identifying events before invoking Lambda functions.

To learn more about using decoupled, event-driven architectures in your serverless applications, visit the Amazon EventBridge Learning Path.