Tag Archives: AWS SDK for JavaScript in Node.js

Implementing idempotent AWS Lambda functions with Powertools for AWS Lambda (TypeScript)

Post Syndicated from Pascal Vogel original https://aws.amazon.com/blogs/compute/implementing-idempotent-aws-lambda-functions-with-powertools-for-aws-lambda-typescript/

This post is written by Alexander Schüren, Sr Specialist SA, Powertools.

One of the design principles of AWS Lambda is to “develop for retries and failures”. If your function fails, the Lambda service will retry and invoke your function again with the same event payload. Therefore, when your function performs tasks such as processing orders or making reservations, it is necessary for your Lambda function to handle requests idempotently to avoid duplicate payment or order processing, which can result in a poor customer experience.

This article explains what idempotency is and how to make your Lambda functions idempotent using the idempotency utility for Powertools for AWS Lambda (TypeScript). The Powertools idempotency utility for TypeScript was co-developed with Vanguard and is now generally available.

Understanding idempotency

Idempotency is the property of an operation that can be applied multiple times without changing the result beyond the initial execution. You can safely run an idempotent operation multiple times without side effects, such as duplicate records or data inconsistencies. This is especially relevant for payment and order processing or third-party API integrations.

There are key concepts to consider when implementing idempotency in AWS Lambda. For each invocation, you specify which subset of the event payload you want to use to identify an idempotent request. This is called the idempotency key. This key can be a single field such as transactionId, a combination of multiple fields such as customerId and requestId, or the entire event payload.

Because timestamps, dates, and other generated values within the payload affect the idempotency key, we recommend that you define specific fields rather than using the entire event payload.

By evaluating the idempotency key, you can then decide if the function needs to run again or send an existing response to the client. To do this, you need to store the following information for each request in a persistence layer (such as Amazon DynamoDB):

  • Status: IN_PROGRESS, EXPIRED, COMPLETE
  • Response data: the response to send back to the client instead of executing the function again
  • Expiration timestamp: when the idempotency record becomes invalid for reuse
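
For illustration, a stored record might look like the following hypothetical DynamoDB item. The attribute names match the defaults described later in the Persistence layer options section; the key and values are invented for this example:

{
  "id": "createPaymentHandler#5eb63bbbe01eeed093cb22bb8f5acdc3",
  "status": "COMPLETE",
  "expiration": 1690383600,
  "data": "{\"paymentId\":\"1234\",\"message\":\"success\",\"statusCode\":200}"
}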

The following diagram shows a successful request flow for this idempotency scenario:

Request flow for idempotent Lambda function

When you invoke a Lambda function with a particular event for the first time, it stores a record with a unique idempotency key tied to an event payload in the persistence layer.

The function then executes its code and updates the record in the persistence layer with the function response. For subsequent invocations with the same payload, the idempotency key is looked up in the persistence layer. If it exists, the function returns the stored response to the client instead of executing again. This prevents the business logic from running multiple times, making the function idempotent.

There are more edge cases to be mindful of, such as when the idempotency record has expired, or handling of failures between the client, the Lambda function, and the persistence layer. The Powertools for AWS Lambda (TypeScript) documentation covers all request flows in detail.

Idempotency with Powertools for AWS Lambda (TypeScript)

Powertools for AWS Lambda, available in Python, Java, .NET, and TypeScript, provides utilities for Lambda functions to ease the adoption of best practices and to reduce the amount of code needed to perform recurring tasks. In particular, it provides a module to handle idempotency.

This post shows examples using the TypeScript version of Powertools. To get started with the Powertools idempotency module, you must install the library and configure it within your build process. For more details, follow the Powertools for AWS Lambda documentation.

Getting started

Powertools for AWS Lambda (TypeScript) is modular, meaning you can install the idempotency utility independently from the Logger, Tracing, Metrics, or other packages. Install the idempotency utility library and the AWS SDK v3 client for DynamoDB in your project using npm:

npm i @aws-lambda-powertools/idempotency @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb

Before getting started, you need to create a persistent storage layer where the idempotency utility can store its state. Your Lambda function's AWS Identity and Access Management (IAM) role must have dynamodb:GetItem, dynamodb:PutItem, dynamodb:UpdateItem, and dynamodb:DeleteItem permissions.

Currently, DynamoDB is the only supported persistent storage layer, so you’ll need to create a table first. Use the AWS Cloud Development Kit (CDK), AWS CloudFormation, AWS Serverless Application Model (SAM) or any Infrastructure as Code tool of your choice that supports DynamoDB resources.
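
As a minimal sketch, a CDK (TypeScript, CDK v2) stack for such a table could look like the following. The table name matches the examples below, and the partition key id and TTL attribute expiration follow the default attribute names described later in this post:

import { Stack, StackProps } from 'aws-cdk-lib';
import { AttributeType, BillingMode, Table } from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';

export class IdempotencyStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Partition key "id" and TTL attribute "expiration" match the default
    // attribute names used by the Powertools persistence layer in this post.
    new Table(this, 'IdempotencyTable', {
      tableName: 'IdempotencyTable',
      partitionKey: { name: 'id', type: AttributeType.STRING },
      timeToLiveAttribute: 'expiration',
      billingMode: BillingMode.PAY_PER_REQUEST,
    });
  }
}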

The following sections illustrate how to instrument your Lambda function code to make it idempotent using a wrapper function or using middy middleware.

Using the function wrapper

Assuming you have created a DynamoDB table with the name IdempotencyTable, create a persistence layer in your Lambda function code:

import { makeIdempotent } from "@aws-lambda-powertools/idempotency";
import { DynamoDBPersistenceLayer } from "@aws-lambda-powertools/idempotency/dynamodb";

const persistenceStore = new DynamoDBPersistenceLayer({
  tableName: "IdempotencyTable",
});

Now, apply the makeIdempotent function wrapper to your Lambda function handler to make it idempotent and use the previously configured persistence store.

import { makeIdempotent } from '@aws-lambda-powertools/idempotency';
import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';
import type { Context } from 'aws-lambda';
import type { Request, Response, SubscriptionResult } from './types';

export const handler = makeIdempotent(
  async (event: Request, _context: Context): Promise<Response> => {
    try {
      const payment = … // create payment
	  
      return {
        paymentId: payment.id,
        message: 'success',
        statusCode: 200,
      };

    } catch (error) {
      throw new Error('Error creating payment');
    }
  },
  {
    persistenceStore,
  }
);

The function processes the incoming event to create a payment and return the paymentId, message, and status back to the client. Making the Lambda function handler idempotent ensures that payments are only processed once, despite multiple Lambda invocations with the same event payload. You can also apply the makeIdempotent function wrapper to any other function outside of your handler.
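
For example, instead of wrapping the whole handler, you could wrap only the payment logic. The following is a hedged sketch; createPayment, its input type, and the returned values are hypothetical:

import { makeIdempotent } from '@aws-lambda-powertools/idempotency';
import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';

const persistenceStore = new DynamoDBPersistenceLayer({
  tableName: 'IdempotencyTable',
});

// Hypothetical inner function: its input is hashed to derive the idempotency
// key, so repeated calls with the same order return the stored result instead
// of charging the customer twice.
const createPayment = makeIdempotent(
  async (order: { customerId: string; amount: number }) => {
    // … call the payment provider here
    return { paymentId: 'hypothetical-id', status: 'created' };
  },
  { persistenceStore }
);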

Use the following type definitions for this example by adding a types.ts file to your source folder:

type Request = {
  user: string;
  productId: string;
};

type Response = {
  [key: string]: unknown;
};

type SubscriptionResult = {
  id: string;
  productId: string;
};

Using middy middleware

If you are using middy middleware, Powertools provides makeHandlerIdempotent middleware to make your Lambda function handler idempotent:

import { makeHandlerIdempotent } from '@aws-lambda-powertools/idempotency/middleware';
import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';
import middy from '@middy/core';
import type { Context } from 'aws-lambda';
import type { Request, Response, SubscriptionResult } from './types';

const persistenceStore = new DynamoDBPersistenceLayer({
  tableName: 'IdempotencyTable',
});

export const handler = middy(
  async (event: Request, _context: Context): Promise<Response> => {
    try {
      const payment = … // create payment object
	  
      return {
        paymentId: payment.id,
        message: 'success',
        statusCode: 200,
      };
    } catch (error) {
      throw new Error('Error creating payment');
    }
  }
).use(
  makeHandlerIdempotent({
    persistenceStore,
  })
);

Configuration options

The Powertools idempotency utility comes with several configuration options to tailor the idempotency behavior to your use case. This section highlights the most common configurations. You can find all available customization options in the AWS Powertools for Lambda (TypeScript) documentation.

Persistence layer options

When you create a DynamoDBPersistenceLayer object, only the tableName attribute is required. Powertools expects the table to have a partition key named id and writes the other attributes using default names.

You can change these default values if needed by passing the options parameter:

import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';

const persistenceStore = new DynamoDBPersistenceLayer({
  tableName: 'idempotencyTableName',
  keyAttr: 'idempotencyKey', // default: id
  expiryAttr: 'expiresAt', // default: expiration
  inProgressExpiryAttr: 'inProgressExpiresAt', // default: in_progress_expiration
  statusAttr: 'currentStatus', // default: status
  dataAttr: 'resultData', // default: data
  validationKeyAttr: 'validationKey', // default: validation
});

Using a subset of the event payload

When you configure idempotency for your Lambda function handler, Powertools will use the entire event payload for idempotency handling by hashing the object.

However, events from AWS services such as Amazon API Gateway or Amazon Simple Queue Service (Amazon SQS) often have generated fields, such as timestamp or requestId. This results in Powertools treating each event payload as unique.

To prevent that, create an IdempotencyConfig and configure which part of the payload should be hashed for the idempotency logic.

Create the IdempotencyConfig and set eventKeyJmesPath to a key within your event payload:

import { IdempotencyConfig } from '@aws-lambda-powertools/idempotency';

// Extract the idempotency key from the request headers
const config = new IdempotencyConfig({
  eventKeyJmesPath: 'headers."X-Idempotency-Key"',
});

Use the X-Idempotency-Key header for your idempotency key. Subsequent invocations with the same header value will be idempotent.

You can then add the configuration to the makeIdempotent function wrapper from the previous example:

export const handler = makeIdempotent(
  async (event: Request, _context: Context): Promise<Response> => {
    try {
      const payment = … // create payment

      return {
        paymentId: payment.id,
        message: 'success',
        statusCode: 200,
      };
    } catch (error) {
      throw new Error('Error creating payment');
    }
  },
  {
    persistenceStore,
    config
  }
);

The event payload should contain X-Idempotency-Key in the headers, so Powertools can use this field to handle idempotency:

{
  "version": "2.0",
  "routeKey": "ANY /createpayment",
  "rawPath": "/createpayment",
  "rawQueryString": "",
  "headers": {
    "Header1": "value1",
    "X-Idempotency-Key": "abcdefg"
  },
  "requestContext": {
    "accountId": "123456789012",
    "apiId": "api-id",
    "domainName": "id.execute-api.us-east-1.amazonaws.com",
    "domainPrefix": "id",
    "http": {
      "method": "POST",
      "path": "/createpayment",
      "protocol": "HTTP/1.1",
      "sourceIp": "ip",
      "userAgent": "agent"
    },
    "requestId": "id",
    "routeKey": "ANY /createpayment",
    "stage": "$default",
    "time": "10/Feb/2021:13:40:43 +0000",
    "timeEpoch": 1612964443723
  },
  "body": "{\"user\":\"xyz\",\"productId\":\"123456789\"}",
  "isBase64Encoded": false
}

There are other configuration options you can apply, such as payload validation, expiration duration, local caching, and others. See the Powertools for AWS Lambda (TypeScript) documentation for more information.
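
As a hedged sketch of what such a configuration could look like (the option names follow the documented IdempotencyConfig settings and should be verified against the version you install):

import { IdempotencyConfig } from '@aws-lambda-powertools/idempotency';

const config = new IdempotencyConfig({
  eventKeyJmesPath: 'headers."X-Idempotency-Key"',
  expiresAfterSeconds: 60 * 60, // keep idempotency records for one hour
  useLocalCache: true, // serve repeated requests from an in-memory cache
});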

Customizing the AWS SDK configuration

The DynamoDBPersistenceLayer is built-in and allows you to store the idempotency data for all your requests. Under the hood, Powertools uses the AWS SDK for JavaScript v3. Change the SDK configuration by passing a clientConfig object.

The following sample sets the region to eu-west-1:

import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';

const persistenceStore = new DynamoDBPersistenceLayer({
  tableName: 'IdempotencyTable',
  clientConfig: {
    region: 'eu-west-1',
  },
});

If you are using your own client, you can pass it to the persistence layer:

import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';

const ddbClient = new DynamoDBClient({ region: 'eu-west-1' });

const dynamoDBPersistenceLayer = new DynamoDBPersistenceLayer({
  tableName: 'IdempotencyTable',
  awsSdkV3Client: ddbClient,
});

Conclusion

Making your Lambda functions idempotent can be a challenge and, if not done correctly, can lead to duplicate data, inconsistencies, and a bad customer experience. This post shows how to use Powertools for AWS Lambda (TypeScript) to process your critical transactions only once when using AWS Lambda.

For more details on the Powertools idempotency feature and its configuration options, see the full documentation.

For more serverless learning resources, visit Serverless Land.

Simplifying serverless best practices with AWS Lambda Powertools for TypeScript

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/simplifying-serverless-best-practices-with-aws-lambda-powertools-for-typescript/

This blog post is written by Sara Gerion, Senior Solutions Architect.

Development teams must have a shared understanding of the workloads they own and their expected behaviors to deliver business value fast and with confidence. The AWS Well-Architected Framework and its Serverless Lens provide architectural best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the AWS Cloud.

Developers should design and configure their workloads to emit information about their internal state and current status. This allows engineering teams to ask arbitrary questions about the health of their systems at any time. For example, emitting metrics, logs, and traces with useful contextual information enables situational awareness and allows developers to filter and select only what they need.

Following such practices reduces the number of bugs, accelerates remediation, and shortens the path to production. These practices help mitigate deployment risks, offer more accurate production-readiness assessments, and enable more informed decisions about deploying systems and changes.

AWS Lambda Powertools for TypeScript

AWS Lambda Powertools provides a suite of utilities for AWS Lambda functions to ease the adoption of serverless best practices. AWS Hero Yan Cui's initial implementation of DAZN Lambda Powertools inspired this idea.

Following the community’s adoption of AWS Lambda Powertools for Python and AWS Lambda Powertools for Java, we are excited to announce the general availability of the AWS Lambda Powertools for TypeScript.

AWS Lambda Powertools for TypeScript provides a suite of utilities for Node.js runtimes, which you can use in both JavaScript and TypeScript code bases. The library follows a modular approach similar to the AWS SDK v3 for JavaScript. Each utility is installed as a standalone npm package.

Today, the library is ready for production use with three observability features: distributed tracing (Tracer), structured logging (Logger), and asynchronous business and application metrics (Metrics).

You can instrument your code with Powertools in three different ways:

  • Manually. It provides the most granular control. It’s the most verbose approach, with the added benefit of no additional dependency and no refactoring to TypeScript Classes.
  • Middy middleware. It is the best choice if your existing code base relies on the Middy middleware engine. Powertools offers compatible Middy middleware to make this integration seamless.
  • Method decorator. Use TypeScript method decorators if you prefer writing your business logic using TypeScript Classes. If you aren’t using Classes, this requires the most significant refactoring.

The examples in this blog post use the Middy approach. To follow the examples, ensure that middy is installed:

npm i @middy/core

Logger

Logger provides an opinionated logger with output structured as JSON. Its key features include:

  • Capturing key fields from the Lambda context and cold starts, and structuring logging output as JSON.
  • Logging Lambda invocation events when instructed (disabled by default).
  • Printing all the logs only for a percentage of invocations via log sampling (disabled by default).
  • Appending additional keys to structured logs at any point in time.
  • Providing a custom log formatter (Bring Your Own Formatter) to output logs in a structure compatible with your organization’s Logging RFC.

To install, run:

npm install @aws-lambda-powertools/logger

Usage example:

import { Logger, injectLambdaContext } from '@aws-lambda-powertools/logger';
import middy from '@middy/core';

const logger = new Logger({
  logLevel: 'INFO',
  serviceName: 'shopping-cart-api',
});

const lambdaHandler = async (): Promise<void> => {
  logger.info('This is an INFO log with some context');
};

export const handler = middy(lambdaHandler)
  .use(injectLambdaContext(logger));

In Amazon CloudWatch, the structured log emitted by your application looks like:

{
  "cold_start": true,
  "function_arn": "arn:aws:lambda:eu-west-1:123456789012:function:shopping-cart-api-lambda-prod-eu-west-1",
  "function_memory_size": 128,
  "function_request_id": "c6af9ac6-7b61-11e6-9a41-93e812345678",
  "function_name": "shopping-cart-api-lambda-prod-eu-west-1",
  "level": "INFO",
  "message": "This is an INFO log with some context",
  "service": "shopping-cart-api",
  "timestamp": "2021-12-12T21:21:08.921Z",
  "xray_trace_id": "abcdef123456abcdef123456abcdef123456"
}

Logs generated by Powertools can also be ingested and analyzed by any third-party SaaS vendor that supports JSON.
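
One of the features listed above, appending additional keys at any point in time, is useful for carrying business context through your logs; the demo application later in this post uses it via appendKeys. A minimal sketch with a hypothetical orderId:

import { Logger } from '@aws-lambda-powertools/logger';

const logger = new Logger({ serviceName: 'shopping-cart-api' });

// Every log emitted after this call includes the orderId field
logger.appendKeys({ orderId: 'order-4711' });
logger.info('Order state persisted');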

Tracer

Tracer is an opinionated, thin wrapper for the AWS X-Ray SDK for Node.js.

Its key features include:

  • Auto-capturing cold start and service name as annotations, and responses or full exceptions as metadata.
  • Automatically tracing HTTP(S) clients and generating segments for each request.
  • Supporting tracing functions via decorators, middleware, and manual instrumentation.
  • Supporting tracing AWS SDK v2 and v3 via AWS X-Ray SDK for Node.js.
  • Auto-disabling tracing when not running in the Lambda environment.

To install, run:

npm install @aws-lambda-powertools/tracer

Usage example:

import { Tracer, captureLambdaHandler } from '@aws-lambda-powertools/tracer';
import middy from '@middy/core';

const tracer = new Tracer({
  serviceName: 'shopping-cart-api'
});

const lambdaHandler = async (): Promise<void> => {
  /* ... Something happens ... */
};

export const handler = middy(lambdaHandler)
  .use(captureLambdaHandler(tracer));

AWS X-Ray segments and subsegments emitted by Powertools

Example service map generated with Powertools
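
Beyond the middleware, you can instrument AWS SDK clients and add your own annotations and metadata. The following is a hedged sketch building on the previous example; the client, annotation, and metadata values are hypothetical:

import { Tracer, captureLambdaHandler } from '@aws-lambda-powertools/tracer';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import middy from '@middy/core';

const tracer = new Tracer({ serviceName: 'shopping-cart-api' });

// Wrap an AWS SDK v3 client so each of its calls appears as a subsegment
const ddbClient = tracer.captureAWSv3Client(new DynamoDBClient({}));

const lambdaHandler = async (): Promise<void> => {
  // Annotations are indexed by X-Ray and can be used to filter traces
  tracer.putAnnotation('paymentStatus', 'SUCCESS');
  // Metadata is not indexed, but adds debugging context to the trace
  tracer.putMetadata('paymentResponse', { id: 'hypothetical-id' });
};

export const handler = middy(lambdaHandler).use(captureLambdaHandler(tracer));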

Metrics

The Metrics utility creates custom metrics asynchronously by logging them to standard output following the Amazon CloudWatch Embedded Metric Format (EMF). These metrics can be visualized through CloudWatch dashboards or used to trigger alerts.

Its key features include:

  • Aggregating up to 100 metrics using a single CloudWatch EMF object (large JSON blob).
  • Validating your metrics against common metric definition mistakes (for example, metric unit, values, max dimensions, max metrics).
  • Metrics are created asynchronously by the CloudWatch service. You do not need any custom stacks, and there is no impact to Lambda function latency.
  • Creating a one-off metric with different dimensions.

To install, run:

npm install @aws-lambda-powertools/metrics

Usage example:

import { Metrics, MetricUnits, logMetrics } from '@aws-lambda-powertools/metrics';
import middy from '@middy/core';

const metrics = new Metrics({
  namespace: 'serverlessAirline',
  serviceName: 'orders'
});

const lambdaHandler = async (): Promise<void> => {
  metrics.addMetric('successfulBooking', MetricUnits.Count, 1);
};

export const handler = middy(lambdaHandler)
  .use(logMetrics(metrics));

In CloudWatch, the custom metric emitted by your application looks like:

{
  "successfulBooking": 1.0,
  "_aws": {
    "Timestamp": 1592234975665,
    "CloudWatchMetrics": [
      {
        "Namespace": "serverlessAirline",
        "Dimensions": [
          [
            "service"
          ]
        ],
        "Metrics": [
          {
            "Name": "successfulBooking",
            "Unit": "Count"
          }
        ]
      }
    ]
  },
  "service": "orders"
}
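
To create a one-off metric with its own dimensions, separate from the metrics aggregated for the invocation, you can use a single-metric instance. The following is a hedged sketch with hypothetical metric and dimension names:

import { Metrics, MetricUnits, logMetrics } from '@aws-lambda-powertools/metrics';
import middy from '@middy/core';

const metrics = new Metrics({
  namespace: 'serverlessAirline',
  serviceName: 'orders'
});

const lambdaHandler = async (): Promise<void> => {
  // Hypothetical one-off metric with its own dimension, flushed on its own
  const bookingMetric = metrics.singleMetric();
  bookingMetric.addDimension('environment', 'prod');
  bookingMetric.addMetric('failedBooking', MetricUnits.Count, 1);
};

export const handler = middy(lambdaHandler).use(logMetrics(metrics));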

Serverless TypeScript demo application

The Serverless TypeScript Demo shows how to use Lambda Powertools for TypeScript. You can find instructions on how to deploy and load test this application in the repository.

Serverless TypeScript Demo architecture

The code for the Get Products Lambda function shows how to use the utilities. The function is instrumented with Logger, Metrics and Tracer to emit observability data.

// src/api/get-products.ts
import { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";
import { DynamoDbStore } from "../store/dynamodb/dynamodb-store";
import { ProductStore } from "../store/product-store";
import { logger, tracer, metrics } from "../powertools/utilities";
import middy from "@middy/core";
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer';
import { injectLambdaContext } from '@aws-lambda-powertools/logger';
import { logMetrics, MetricUnits } from '@aws-lambda-powertools/metrics';

const store: ProductStore = new DynamoDbStore();
const lambdaHandler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {

  logger.appendKeys({
    resource_path: event.requestContext.resourcePath
  });

  try {
    const result = await store.getProducts();

    logger.info('Products retrieved', { details: { products: result } });
    metrics.addMetric('productsRetrieved', MetricUnits.Count, 1);

    return {
      statusCode: 200,
      headers: { "content-type": "application/json" },
      body: `{"products":${JSON.stringify(result)}}`,
    };
  } catch (error) {
      logger.error('Unexpected error occurred while trying to retrieve products', error as Error);

      return {
        statusCode: 500,
        headers: { "content-type": "application/json" },
        body: JSON.stringify(error),
      };
  }
};

const handler = middy(lambdaHandler)
    .use(captureLambdaHandler(tracer))
    .use(logMetrics(metrics, { captureColdStartMetric: true }))
    .use(injectLambdaContext(logger, { clearState: true, logEvent: true }));

export {
  handler
};

The Logger utility adds useful context to the application logs. Structuring your logs as JSON allows you to search on your structured data using Amazon CloudWatch Logs Insights. This allows you to filter out the information you don’t need.

For example, use the following query to search for any errors for the serverless-typescript-demo service.

fields resource_path, message, timestamp
| filter service = 'serverless-typescript-demo'
| filter level = 'ERROR'
| sort @timestamp desc
| limit 20

CloudWatch Logs Insights showing errors for the serverless-typescript-demo service.

The Tracer utility adds custom annotations and metadata during the function invocation, which it sends to AWS X-Ray. Annotations allow you to search for and filter traces by business or application contextual information such as product ID, or cold start.

You can see the duration of the putProduct method and the ColdStart and Service annotations attached to the Lambda handler function.

putProduct trace view

The Metrics utility simplifies the creation of complex high-cardinality application data. Including structured data along with your metrics allows you to search or perform additional analysis when needed.

In this example, you can see how many times per second a product is created, deleted, or queried. You could configure alarms based on the metrics.

Metrics view

Code examples

You can use Powertools with many Infrastructure as Code or deployment tools. The project contains source code and supporting files for serverless applications that you can deploy with the AWS Cloud Development Kit (AWS CDK) or AWS Serverless Application Model (AWS SAM).

The AWS CDK lets you build reliable and scalable applications in the cloud with the expressive power of a programming language, including TypeScript. The AWS SAM CLI is a tool that makes it easier to create and manage serverless applications.

You can use the sample applications provided in the GitHub repository to understand how to use the library quickly and experiment in your own AWS environment.

Conclusion

AWS Lambda Powertools for TypeScript can help simplify, accelerate, and scale the adoption of serverless best practices within your team and across your organization.

The library implements best practices recommended as part of the AWS Well-Architected Framework, without you needing to write much custom code.

Since the library relieves the operational burden needed to implement these functionalities, you can focus on the features that matter the most, shortening the Software Development Life Cycle and reducing the Time To Market.

The library helps both individual developers and engineering teams to standardize their organizational best practices. Utilities are designed to be incrementally adoptable for customers at any stage of their serverless journey, from startup to enterprise.

To get started with AWS Lambda Powertools for TypeScript, see the official documentation. For more serverless learning resources, visit Serverless Land.

Building serverless applications with streaming data: Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-2/

Part 1 introduces the Alleycat application that allows bike racers to compete with each other virtually on home exercise bikes. I explain the application’s functionality, how to deploy to your AWS account, and provide an architectural review.

This series is about building serverless solutions in streaming data workloads. These are traditionally challenging to build, since data can be streamed from thousands or even millions of devices continuously.

In the example scenario, there are 40,000 users and up to 1,000 competitors may race at any given time. The workload must continuously ingest and buffer this data, then process and analyze the information to provide analytics and leaderboard content for the frontend application.

In this post, I focus on data ingestion. I compare the two different methods used in Alleycat, and discuss other approaches available. This post refers to Amazon Kinesis Data Streams, the AWS SDK, and AWS IoT Core in the solutions.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur costs.

Using AWS IoT Core to ingest streaming data

AWS IoT Core enables publish-subscribe capabilities for large numbers of client applications. Clients can send data to the backend using the AWS IoT Device SDK, which uses the MQTT standard for IoT messaging. After processing, the backend can publish aggregation and status messages back to the frontend via AWS IoT Core. This service fans out the messages to clients using topics.

When using this approach, note the Quality of Service (QoS) options available. By default, the SDK uses QoS level 0, which means the device does not confirm the message is received. This is intended for workloads that can lose messages occasionally without impacting performance. In Alleycat, if performance metrics are sometimes lost, this is unlikely to impact the overall end user experience.

For workloads requiring higher reliability, use QoS level 1, which causes the SDK to resend the message until an acknowledgement is received. While there is no additional charge for using QoS level 1, it generally increases the number of messages, which increases the overall cost. You are not charged for the PUBACK acknowledgement message – for more details, read more about AWS IoT Core pricing.

Frontend

In this scenario, the Alleycat frontend application is running on a physical exercise bike. The user selects a racer ID and exercise class and chooses Start Race to join the current virtual race for that class.

Start race UI

Every second, the frontend sends a message containing the cadence and resistance metrics and the current second in the race for the local racer. This message is created as a JSON object in the Home.vue component and sent to the ‘alleycat-publish’ topic:

      const message = {
        uuid: uuidv4(),
        event: this.event,
        deviceTimestamp: Date.now(),
        second: this.currentSecond,
        raceId: RACE_ID,
        name: this.racer.name,
        racerId: this.racer.id,
        classId: this.selectedClassId,
        cadence: this.racer.getCurrentCadence(),
        resistance: this.racer.getCurrentResistance()
      }

The IoT.vue component contains the logic for this integration and uses the AWS IoT Device SDK to send and receive messages. On startup, the frontend connects to AWS IoT Core and publishes the messages using an MQTT client:

    bus.$on('publish', (data) => {
      console.log('Publish: ', data)
      mqttClient.publish(topics.publish, JSON.stringify(data))
    })

The SDK automatically attempts to retry in the event of a network disconnection and exposes an error handler to allow custom logic if other errors occur.
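
The following is a hedged sketch of how the frontend could register these handlers and opt into QoS level 1 for more reliable delivery; the event names and publish options follow the MQTT.js-style interface exposed by the AWS IoT Device SDK and should be checked against the SDK version in use:

// Hedged sketch: register connection lifecycle handlers and request QoS 1
mqttClient.on('connect', () => console.log('Connected to AWS IoT Core'))
mqttClient.on('reconnect', () => console.log('Reconnecting to AWS IoT Core'))
mqttClient.on('error', (err) => console.error('MQTT error', err))

bus.$on('publish', (data) => {
  // Passing { qos: 1 } asks the broker to acknowledge delivery with a PUBACK
  mqttClient.publish(topics.publish, JSON.stringify(data), { qos: 1 })
})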

Backend

The resources used in the backend are defined using the AWS Serverless Application Model (AWS SAM) and configured in the core setup templates:

Reference architecture

Messages are published to topics in AWS IoT Core, which act as channels of interest. The message broker uses topic names and topic filters to route messages between publishers and subscribers. Incoming messages are routed using rules. Alleycat’s IoT rule routes all incoming messages to a Kinesis stream:

  IotTopicRule:
    Type: AWS::IoT::TopicRule
    Properties:
      RuleName: 'alleycatIngest'
      TopicRulePayload:
        RuleDisabled: 'false'
        Sql: "SELECT * FROM 'alleycat-publish'"
        Actions:
        - Kinesis:
            StreamName: 'alleycat'
            PartitionKey: "${timestamp()}"
            RoleArn: !GetAtt IoTKinesisRole.Arn

  IoTKinesisRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - iot.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: IoTKinesisPutPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: 'kinesis:PutRecord'
                Resource: !GetAtt KinesisStream.Arn

Using the AWS::IoT::TopicRule resource, you can optionally define an error action. This allows you to store messages in a durable location, such as an Amazon S3 bucket, if an error occurs. Errors can occur if a rule does not have permission to access a destination or throttling occurs in a target.

Rules can route matching messages to up to 10 targets. For debugging purposes, you can also enable Amazon CloudWatch Logs, which can help troubleshoot failed message deliveries. The AWS IoT Core Message Broker allows up to 20,000 publish requests per second – if you need a higher limit for your workload, submit a request to AWS Support.

Using the AWS SDK to ingest streaming data

The Alleycat frontend creates traffic for a single user but there is also a simulator application that can generate messages for up to 1,000 riders. Instead of routing messages using an MQTT client, the simulator uses the AWS SDK to put messages directly into the Kinesis data stream.

The SDK provides a service interface object for Kinesis and two API methods for putting messages into streams: putRecord and putRecords. The first option accepts only a single message but the second enables batching of up to 500 messages per request. This is the preferred option for adding multiple messages, compared with calling putRecord multiple times.

The putRecords API takes the messages as a Records array within its parameters:

const params = {
  StreamName: 'alley-cat',
  Records: [
    {
      "Data": "{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hayden\",\"racerId\":0,\"classId\":1,\"cadence\":79.8,\"resistance\":79}",
      "PartitionKey": "1620824038331"
    },
    {
      "Data": "{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hubert\",\"racerId\":1,\"classId\":1,\"cadence\":60.4,\"resistance\":60.6}",
      "PartitionKey": "1620824038331"
    }
  ]
}

The SDK automatically base64 encodes the Data attribute, which in this case is the JSON string output from JSON.stringify. In the JavaScript SDK, the putRecords API can return a promise, allowing the code to await the operation:

const result = await kinesis.putRecords(params).promise()
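
Because each putRecords call accepts at most 500 records, a producer with a larger backlog must split it into batches. The following is a hedged sketch of this chunking logic, assuming the same kinesis service interface object and stream name as the example above:

// Hedged sketch: split a backlog of messages into batches of up to 500 records
const MAX_BATCH_SIZE = 500

const putInBatches = async (messages) => {
  for (let i = 0; i < messages.length; i += MAX_BATCH_SIZE) {
    const Records = messages.slice(i, i + MAX_BATCH_SIZE).map((msg) => ({
      Data: JSON.stringify(msg),
      PartitionKey: `${msg.deviceTimestamp}`
    }))
    const result = await kinesis.putRecords({ StreamName: 'alley-cat', Records }).promise()
    // FailedRecordCount shows how many records in the batch need to be retried
    console.log('Failed records in batch:', result.FailedRecordCount)
  }
}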

Shards and partition keys

Kinesis data streams consist of one or more shards, which are sequences of data records with a fixed capacity. Each shard can support up to 1,000 records per second for writes, up to a maximum total data write rate of 1 MB per second. The total capacity of a stream is the sum of the capacities of its shards.

When you send messages to a stream, the partitionKey attribute determines which shard it is routed to. The example application configures a Kinesis data stream with a single shard so the partitionKey attribute has no effect – all messages are routed to the same shard. However, many production applications have more than one shard and use the partitionKey to assign messages to shards.

The partitionKey is hashed by the Kinesis service to route to a shard. This diagram shows how partitionKey values from data producers are hashed by an MD5 function and mapped to individual shards:

MD5 hash process

While you cannot designate a specific shard ID in a message, you can influence the assignment depending on your choice of partitionKey:

  • Random: Using a randomized value results in a random hash, so messages are spread across different shards. This effectively load balances messages across all available shards.
  • Time-based: A timestamp value may cause groups of messages to be sent to a single shard if they arrive at the same time, since identical timestamps produce identical hashes.
  • Application-specific: If Alleycat used the classId as the partitionKey, racers in each class would always be routed to the same shard. This could be useful for downstream aggregation logic but would limit the message capacity per classId.

Optimizing capacity in a shard

Each shard can ingest data at a rate of 1 MB per second or 1,000 records per second, whichever limit is reached first. Since the payload maximum is 1MB, this could equate to one 1MB message per second. If the payload is larger, you must divide it into smaller pieces to avoid an error. For 1,000 messages, each payload must be under 1 KB on average to fit within the allowed capacity.

The combination of the two payload limits can result in different capacity profiles for a shard:

Capacity profiles in a shard

  1. The data payloads are evenly sized and use the 1 MB per second capacity.
  2. Data payload sizes vary, so the number of messages that can be packed into 1 MB varies per second.
  3. There are a large number of very small messages, consuming all 1,000 records per second. However, the total data capacity used is significantly less than 1 MB.

In the Alleycat application, the average payload size is around 170 bytes. When producing 1,000 messages a second, the workload is only using about 20% of the 1 MB per second limit. Since PUT payload size is a factor in Kinesis pricing, messages that are much smaller than 25 KB are less cost-efficient. Compare these two messaging patterns for the Alleycat application:

Producer message patterns

  1. In this default mode, a smaller message is published once per second. This reduces overall latency but results in higher overall messaging cost.
  2. The client application batches outgoing messages and sends to Kinesis every 5 seconds. This results in lower cost and better packing of messages, but introduces additional latency.

There is a tradeoff between cost and latency when optimizing a shard's capacity, and the decision depends upon the needs of your workload. If the client buffers messages, this adds latency on the client side. This is acceptable in many workloads that collect metrics for archival or asynchronous reporting purposes. However, for low-latency applications like Alleycat, it provides a better experience for the application user to send messages as soon as they are available.

Conclusion

This post focuses on ingesting data into Kinesis Data Streams. I explain the two approaches used by the Alleycat frontend and the simulator application and highlight other approaches that you can use. I show how messages are routed to shards using partition keys. Finally, I explore additional factors to consider when ingesting data, to improve efficiency and reduce cost.

Part 3 covers using Amazon Kinesis Data Firehose for transforming, aggregating, and loading streaming data into data stores. This is used to provide the historical, second-by-second leaderboard for the frontend application.

For more serverless learning resources, visit Serverless Land.

Using Lambda layers to simplify your development process

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-lambda-layers-to-simplify-your-development-process/

Serverless developers frequently import libraries and dependencies into their AWS Lambda functions. While you can zip these dependencies as part of the build and deployment process, in many cases it’s easier to use layers instead. In this post, I explain how layers work, and how you can build and include layers in your own applications.

This blog post references the Happy Path application, which shows how to build a flexible backend to a photo-processing web application. To learn more, refer to Using serverless backends to iterate quickly on web apps – part 1. The code in this post is available in this GitHub repo.

Overview of Lambda layers

A Lambda layer is an archive containing additional code, such as libraries, dependencies, or even custom runtimes. When you include a layer in a function, the contents are extracted to the /opt directory in the execution environment. You can include up to five layers per function, which count towards the standard Lambda deployment size limits.

Layers are deployed as immutable versions, and the version number increments each time you publish a new layer. When you include a layer in a function, you specify the layer version you want to use. Layers are automatically set as private, but they can be shared with other AWS accounts, or shared publicly. Permissions only apply to a single version of a layer.

Using layers can make it faster to deploy applications with the AWS Serverless Application Model (AWS SAM) or the Serverless framework. Moving runtime dependencies from your function code to a layer can help reduce the overall size of the archive uploaded during a deployment.

Creating a layer containing the AWS SDK

The AWS SDK allows you to interact programmatically with AWS services using one of the supported runtimes. The Lambda service includes the AWS SDK so you can use it without explicitly including it in your deployment package.

However, there is no guarantee of the version provided in the execution environment. The SDK is upgraded frequently to support new AWS services and features. As a result, the version may change at any time. You can see the current version used by Lambda by declaring an instance of the SDK and logging out the version method:

Logging out the version method
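
A minimal sketch of such a function, assuming the AWS SDK for JavaScript v2 that the Node.js runtime provides:

// Logs the version of the AWS SDK bundled with the Lambda runtime
const AWS = require('aws-sdk')

exports.handler = async () => {
  console.log('AWS SDK version:', AWS.VERSION)
}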

For production workloads, it’s best practice to lock the version of the AWS SDK used in your functions. You can achieve this by including the SDK with your code package. Once you include this library, your code always uses the version in the deployment package and not the version included in the Lambda service.

A serverless application may consist of many functions, which all use a common SDK version. Instead of bundling the SDK with each function deployment, you can create a layer containing the SDK. The effect of this is to reduce the size of the uploaded archive, which makes your deployments faster.

To create an AWS SDK layer:

  1. First, clone this blog post’s GitHub repo. From a terminal window, execute:
    git clone https://github.com/aws-samples/aws-lambda-layers-aws-sam-examples
    cd ./aws-sdk-layer
  2. This directory contains an AWS SAM template and Node.js package.json file. Install the package.json contents:
    npm install
  3. Create the layer directory defined in the AWS SAM template and the nodejs directory required by Lambda. Next, move the node_modules directory:
    mkdir -p ./layer/nodejs
    mv ./node_modules ./layer/nodejs
  4. Next, deploy the AWS SAM template to create the layer:
    sam deploy --guided
  5. For the Stack name, enter “aws-sdk-layer”. Enter your preferred AWS Region and accept the other defaults.
  6. After the deployment completes, the new Lambda layer is available to use. Run this command to see the available layers:
    aws lambda list-layers
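
The template in the repository defines the layer as an AWS::Serverless::LayerVersion resource. The following is a hedged sketch of such a template; the resource name, description, and runtime are assumptions:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  AwsSdkLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      LayerName: aws-sdk-layer
      Description: Pinned version of the AWS SDK for JavaScript
      ContentUri: ./layer/
      CompatibleRuntimes:
        - nodejs12.x
      RetentionPolicy: Retain

Outputs:
  LayerArn:
    Value: !Ref AwsSdkLayer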

After adding a layer to a function, you can use console.log to log out the AWS SDK version. This shows that the function is now using the SDK version in the layer instead of the version provided by the Lambda service:

Use the SDK layer instead of the bundled layer

Creating layers with OS-specific binaries

Many code libraries include binaries that are operating-system specific. When you build packages on your local development machine, by default the binaries for that operating system are used. These may not be the right binaries for Lambda, which runs on Amazon Linux. If you are not using a compatible operating system, you must ensure you include Linux binaries in the layer.

The simplest way to package these libraries correctly is to use AWS Cloud9. This is an IDE in the AWS Cloud, which runs on Amazon EC2. After creating an environment, you can clone a git repository directly to the local storage of the instance, and run the necessary build scripts.

The Happy Path application resizes images using the Sharp npm library. This library uses libvips, which is written in C, so the compilation is operating system-specific. By creating a layer containing this library, it simplifies the packaging and deployment of the consuming Lambda function.

To create a Sharp layer using AWS Cloud9:

  1. Navigate to the AWS Cloud9 console.
  2. Choose Create environment.
  3. Enter the name “My IDE” and choose Next step.
  4. Accept all the defaults and choose Next step.
  5. Review the settings and choose Create environment.
  6. In the terminal panel, enter:
    git clone https://github.com/aws-samples/aws-lambda-layers-aws-sam-examples
    cd ./aws-lambda-layers-aws-sam-examples/sharp-layer
    npm install
  7. Next, create the layer directory defined in the AWS SAM template and the nodejs directory required by Lambda, then move the node_modules directory:
    mkdir -p ./layer/nodejs
    mv ./node_modules ./layer/nodejs
  8. Next, deploy the AWS SAM template to create the layer:
    sam deploy --guided
  9. For the Stack name, enter “sharp-layer”. Enter your preferred AWS Region and accept the other defaults. After the deployment completes, the new Lambda layer is available to use.

In some runtimes, you can specify a local set of packages for development, and another set for production. For example, in Node.js, the package.json file allows you to specify two sections for dependencies. If your development machine uses a different operating system to Lambda, and therefore uses different binaries, you can use package.json to resolve this. In the Happy Path Resizer function, which uses the Sharp layer, the package.json refers to a local binary for development.

Adding development dependencies to package.json
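
A hypothetical package.json for such a function keeps sharp as a development dependency, so local testing resolves the binary for your own operating system while the deployed function uses the Linux binary from the layer; the version shown is a placeholder:

{
  "name": "resizer",
  "dependencies": {},
  "devDependencies": {
    "sharp": "^0.28.0"
  }
}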

AWS SAM defines Lambda functions with the AWS::Serverless::Function resource. Layers are defined as a property of functions, as a list of layer ARNs including the version:

  MyLambdaFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: myFunction/
      Handler: app.handler
      MemorySize: 128
      Layers:
        - !Ref SharpLayerARN

Sharing a layer

Layers are private to your account by default but you can optionally share with other AWS accounts or make a layer public. You cannot share layers via the AWS Management Console but instead use the AWS CLI.

To share a layer, use add-layer-version-permission, specifying the layer name, version, AWS Region, and principal:

aws lambda add-layer-version-permission \
  --layer-name node-sharp \
  --principal '*' \
  --action lambda:GetLayerVersion \
  --version-number 3 \
  --statement-id public \
  --region us-east-1

In the principal parameter, specify an individual account ID or use an asterisk to make the layer public. The CLI responds with a RevisionId containing the current revision of the policy:

add-layer-version output

You can check the permissions associated with a layer version by calling get-layer-version-policy with the layer name and version:

aws lambda get-layer-version-policy \
  --layer-name node-sharp \
  --version-number 3 \
  --region us-east-1

get-layer-version-policy output

Similarly, you can delete permissions associated with a layer version by calling remove-layer-version-permission with the layer name, statement ID, and version:

aws lambda remove-layer-version-permission \
  --layer-name node-sharp \
  --statement-id public \
  --version-number 3

Once the permissions are removed, calling get-layer-version-policy results in an error:

Error invoking after removal

Conclusion

Lambda layers provide a convenient and effective way to package code libraries for sharing with Lambda functions in your account. Using layers can help reduce the size of uploaded archives and make it faster to deploy your code.

Layers can contain packages using OS-specific binaries, providing a convenient way to distribute these to developers. While layers are private by default, you can share with other accounts or make a layer public. Layers are published as immutable versions, and deleting a layer has no effect on deployed Lambda functions already using that layer.

To learn more about using Lambda layers, visit the documentation, or see how layers are used in the Happy Path web application.