All posts by James Beswick

Python 3.11 runtime now available in AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/python-3-11-runtime-now-available-in-aws-lambda/

This post is written by Ramesh Mathikumar, Senior DevOps Consultant and Francesco Vergona, Solutions Architect.

AWS Lambda now supports Python 3.11 as both a managed runtime and a container base image. Python 3.11 contains significant performance enhancements over Python 3.10. Features like reduced startup time, streamlined stack frames, and the CPython specializing adaptive interpreter help many workloads run faster and cost less, thanks to Lambda’s per-millisecond billing model. With this release, Python developers can now take advantage of new features and improvements introduced in Python 3.11 when creating serverless applications on Lambda.

You can use Python 3.11 with Powertools for AWS Lambda (Python), a developer toolkit to implement serverless best practices and increase developer velocity. Powertools includes proven libraries to support common patterns such as observability, parameter store integration, idempotency, batch processing, feature flags, and more. Learn more about Powertools for AWS Lambda (Python) in the documentation.
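
As an illustration, here is a minimal sketch of a Python 3.11 handler that uses the Powertools Logger for structured logging. It assumes the aws_lambda_powertools package is available in your deployment package or via a Lambda layer, and the service name is purely illustrative:

from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger(service="orders")  # emits structured JSON log entries

@logger.inject_lambda_context  # adds Lambda context (request ID, cold start flag) to each log entry
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    logger.info("Received event")
    return {"statusCode": 200}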

You can also use Python 3.11 with Lambda@Edge, allowing you to customize low-latency content delivered through Amazon CloudFront.

Python is a popular language for building serverless applications. The Python 3.11 release includes both performance improvements and new language features. For customers who deploy their Lambda functions using container images, the base image for Python 3.11 also includes changes that make managing installed packages easier.

This blog post reviews these changes in turn, followed by an overview of how you can get started with Python 3.11 in Lambda.

Performance improvements

Optimizations to CPython introduced in Python 3.11 bring significant performance enhancements, making it an average of 25% faster than Python 3.10, based on community benchmark tests using the Python Performance Benchmark Suite.

This release focuses on two key areas:

  • Faster startup: core modules essential for Python are now “frozen,” with statically allocated code objects, resulting in a 10–15% faster interpreter startup relative to Python 3.10.
  • Faster function execution: improvements include streamlined frame creation, inlined Python function calls to reduce C stack usage, and a specializing adaptive interpreter that specializes “hot code” (code that is executed multiple times) and reduces overhead during execution.

These optimizations can improve performance by 10-60% depending on the workload. In the context of a Lambda function execution, this results in performance improvements for both “cold start” and “warm start” invocations.

In addition to these core CPython improvements, Python 3.11 also provides performance gains in other areas. For example:

  • String formatting with printf-style % codes is now as fast as f-string expressions (see the benchmark sketch after this list).
  • Integer division is around 20% faster on x86-64 for certain scenarios.
  • Operations like sum() and list resizing have seen notable speed enhancements.
  • Dictionaries save memory by not storing hash values when keys are Unicode objects.
  • Improvements to asyncio.DatagramProtocol introduce significantly faster large file transfers over UDP.
  • Math functions, statistics functions, and unicodedata.normalize() also benefit from substantial speed improvements.

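One way to sanity-check the string formatting claim for your own workload is a small benchmark with the standard library’s timeit module; this is only a sketch, and the exact numbers depend on your Python build and inputs:

import timeit

value = 12345.6789

printf_style = timeit.timeit(lambda: "value: %.2f" % value, number=1_000_000)
f_string = timeit.timeit(lambda: f"value: {value:.2f}", number=1_000_000)

print(f"printf-style: {printf_style:.3f}s, f-string: {f_string:.3f}s")
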
Language features

Thanks to its simplicity, readability, and extensive community support, Python is a popular language for building serverless applications. The Python 3.11 release includes several new language features, including:

  • Variadic generics (PEP 646): Python 3.11 introduces TypeVarTuple, enabling parameterization with an arbitrary number of types.
  • Marking individual TypedDict items as required or not-required (PEP 655): The introduction of Required and NotRequired in TypedDict allows for explicit marking of individual item requirements, eliminating the need for inheritance.
  • Self type (PEP 673): The Self annotation simplifies annotating methods that return an instance of their own class, similar to the TypeVar-based approach in PEP 484.
  • Arbitrary literal string type (PEP 675): The LiteralString annotation allows a function parameter to accept any literal string type, including strings created from literals.
  • Data class transforms (PEP 681): The @dataclass_transform() decorator allows libraries that provide dataclass-like behavior, such as generated __init__ methods, to be understood by static type checkers.

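The following short sketch shows a few of these typing features together (Required and NotRequired from PEP 655, Self from PEP 673, and LiteralString from PEP 675); the class and method names are illustrative only:

from typing import LiteralString, NotRequired, Required, Self, TypedDict

class OrderEvent(TypedDict):
    order_id: Required[str]   # must always be present (PEP 655)
    note: NotRequired[str]    # may be omitted

class QueryBuilder:
    def __init__(self) -> None:
        self.clauses: list[str] = []

    def where(self, clause: LiteralString) -> Self:  # PEP 673 and PEP 675
        self.clauses.append(clause)
        return self
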
For the full list of Python 3.11 changes, see the Python 3.11 release notes.

Change in pre-installed modules location and search path

Previously, Lambda base container images for Python included the /var/runtime directory before the /var/lang/lib/python3.x directory in the search path. This meant that packages in /var/runtime were loaded in preference to packages installed via pip into /var/lang/lib/python3.x. Because the AWS SDK for Python (boto3/botocore) was pre-installed into /var/runtime, this made it harder for customers using the base container images to upgrade the SDK version.

With the Python 3.11 runtime, the AWS SDK and its dependencies are now pre-installed into the /var/lang/lib/python3.11 directory, and the search path has been modified so this directory has precedence over /var/runtime. This change means customers who build and deploy Lambda functions using the Python 3.11 base container image can now override the SDK simply by running pip install on a newer version. This change also enables pip to verify and track that the pre-installed SDK and its dependencies are compatible with any customer-installed packages.

This is the default sys.path before Python 3.11 (where X.Y is the Python major.minor version):

  • /var/task/: User Function
  • /opt/python/lib/pythonX.Y/site-packages/: User Layer
  • /opt/python/: User Layer
  • /var/runtime/: Pre-installed modules
  • /var/lang/lib/pythonX.Y/site-packages/: Default pip install location

Here is the default sys.path starting from Python 3.11:

  • /var/task/: User Function
  • /opt/python/lib/pythonX.Y/site-packages/: User Layer
  • /opt/python/: User Layer
  • /var/lang/lib/pythonX.Y/site-packages/: Pre-installed modules and default pip install location
  • /var/runtime/: No pre-installed modules
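
To confirm which copy of the SDK your function loads, and to see the search order in effect, you can log this information from a handler. A minimal sketch:

import sys
import botocore

def lambda_handler(event, context):
    # On Python 3.11, botocore should resolve from /var/lang/lib/python3.11/site-packages
    # unless a layer or a package you installed appears earlier in sys.path.
    print("botocore", botocore.__version__, "loaded from", botocore.__file__)
    print("sys.path:", sys.path)
    return {"statusCode": 200}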

Using Python 3.11 in Lambda

AWS Management Console

To use the Python 3.11 runtime to develop your Lambda functions, specify a runtime parameter value of Python 3.11 when creating or updating a function. The Python 3.11 runtime is now available in the Runtime dropdown on the Create function page.

Create function

To update an existing Lambda function to Python 3.11, navigate to the function in the Lambda console, then choose Edit in the Runtime settings panel. The new version of Python is available in the Runtime dropdown:

Edit function

AWS Lambda – Container Image

Change the Python base image version by modifying FROM statement in the Dockerfile:

FROM public.ecr.aws/lambda/python:3.11
# Copy function code
COPY lambda_handler.py ${LAMBDA_TASK_ROOT}

To learn more, refer to the usage tab on building functions as container images.

AWS Serverless Application Model (AWS SAM)

In AWS SAM, set the Runtime attribute to python3.11 to use this version.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Simple Lambda Function
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Description: My Python Lambda Function
      CodeUri: my_function/
      Handler: lambda_function.lambda_handler
      Runtime: python3.11

AWS SAM supports generating this template with Python 3.11 out of the box for new serverless applications using the sam init command. Refer to the AWS SAM documentation for more details.

AWS Cloud Development Kit (AWS CDK)

In the AWS CDK, set the runtime attribute to Runtime.PYTHON_3_11 to use this version. In Python:

from constructs import Construct
from aws_cdk import (App, Stack, aws_lambda as _lambda)

class SampleLambdaStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        base_lambda = _lambda.Function(
            self, 'SampleLambda',
            handler='lambda_handler.handler',
            runtime=_lambda.Runtime.PYTHON_3_11,
            code=_lambda.Code.from_asset('lambda'),
        )

In TypeScript:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda'
import * as path from 'path';
import { Construct } from 'constructs';

export class CdkStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The code that defines your stack goes here

    // The python3.11 enabled Lambda Function
    const lambdaFunction = new lambda.Function(this, 'python311LambdaFunction', {
      runtime: lambda.Runtime.PYTHON_3_11,
      memorySize: 512,
      code: lambda.Code.fromAsset(path.join(__dirname, '/../lambda')),
      handler: 'lambda_handler.handler'
    })
  }
}

Conclusion

You can build and deploy functions using Python 3.11 using the AWS Management Console, AWS CLI, AWS SDK, AWS SAM, AWS CDK, or your choice of Infrastructure as Code (IaC). You can also use the Python 3.11 container base image if you prefer to build and deploy your functions using container images.

We are excited to bring Python 3.11 runtime support to Lambda and empower developers to build more efficient, powerful, and scalable serverless applications. Try the Python 3.11 runtime in Lambda today to benefit from the improved performance and new language features.
For more serverless learning resources, visit Serverless Land.

Migrating AWS Lambda functions from the Go1.x runtime to the custom runtime on Amazon Linux 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/migrating-aws-lambda-functions-from-the-go1-x-runtime-to-the-custom-runtime-on-amazon-linux-2/

This post is written by Micah Walter, Senior Solutions Architect, Yanko Bolanos, Senior Solutions Architect, and Ramesh Mathikumar, Senior DevOps Consultant.

This blog post describes our plans to improve performance and streamline the user experience for customers writing AWS Lambda functions using Go.

Today, customers using Go with Lambda can either use the go1.x runtime, or use the provided.al2 runtime. Going forward, we plan to deprecate the go1.x runtime in line with the end-of-life of Amazon Linux 1, currently scheduled for December 31, 2023.

Customers using the go1.x runtime should migrate their functions to the provided.al2 runtime to continue to benefit from the latest runtime updates and security patches. Customers who deploy Go functions as container images using the go1.x base image should similarly migrate to the provided.al2 base image.

Using the provided.al2 runtime offers several benefits over the go1.x runtime. First, it supports running Lambda functions on AWS Graviton2 processors, offering up to 34% better price-performance compared to functions running on x86_64 processors. Second, it offers a streamlined implementation with a smaller deployment package and faster function invoke path. Finally, this change aligns Go with other languages that also compile to native code such as Rust or C++, which also run on the provided.al2 runtime.

This migration does not require any code changes. The only changes relate to how you build your deployment package and configure your function. This blog post outlines the steps required to update your build scripts and tooling to use the provided.al2 runtime for your Go functions.

There is a difference in Lambda billing between the go1.x runtime and the provided.al2 runtime. With the go1.x runtime, Lambda does not bill for time spent during function initialization (cold start), whereas with the provided.al2 runtime Lambda includes function initialization time in the billed function duration. Since Go functions typically initialize very quickly, and since Lambda reduces the number of initializations by re-using function execution environments for multiple function invokes, in practice the difference in your Lambda bill should be very small.

Compiling for the provided.al2 runtime

In order to run a compiled Go application on Lambda, you must compile your code for Linux. While the go1.x runtime allows you to use any executable name, the provided.al2 runtime requires you to use bootstrap as the executable name. On macOS and Linux, here’s the simplest form of the build command:

GOARCH=amd64 GOOS=linux go build -o bootstrap main.go

This build command creates a Go binary file called bootstrap compatible with the x86_64 instruction set for Lambda. To compile for AWS Graviton processors, set GOARCH=arm64 in the preceding command.

The final step is to compress this binary into a ZIP file deployment package, ready to deploy to Lambda:

zip myFunction.zip bootstrap

For users compiling on Windows, Go supports compiling for Linux without using a Linux virtual machine or build container. However, Lambda uses POSIX file permissions, which must be set correctly. Lambda provides a helper tool which builds a deployment package that is valid for Lambda—see the Lambda documentation for details. Existing users of this tool should update to the latest version to make sure their build scripts are up-to-date.

Removing the RPC dependency

The go1.x runtime uses two processes within the Lambda execution environment to route requests to your handler function. The first process, which is included in the runtime, retrieves function invocation requests from the Lambda runtime API, and uses RPC to pass the invoke to the second process. This second process runs the executable which you deploy, and comprises the aws-lambda-go package and your function code. The aws-lambda-go package receives the RPC request and executes your function.

The following runtime architecture diagram for the go1.x runtime shows the runtime client process calling the runtime API to retrieve a function invocation and using RPC to call a separate process containing the function code.

Execution environment

Go functions deployed to the provided.al2 runtime use a simpler, single-process architecture. When building the Go executable, you include the same aws-lambda-go package as before. However, in this case the aws-lambda-go package acts as the runtime client, retrieving invocation requests from the runtime API directly.

The following runtime architecture diagram shows a Go function running on the provided.al2 runtime. A single process retrieves the function invocation from the runtime API and executes the function code.

Go running on provided.al2

Removing the additional process and RPC hop streamlines the function execution path, resulting in faster invokes. You can also remove the RPC component from the aws-lambda-go package, giving a smaller binary size and faster code loading during cold starts. To remove the RPC dependency, add the lambda.norpc tag to your build command:

GOARCH=amd64 GOOS=linux go build -tags lambda.norpc -o bootstrap main.go

Creating a new Lambda function

Once your deployment package is ready, you can create a new Lambda function that uses the provided.al2 runtime in the Lambda console:

Creating in the Lambda console

Migrating existing functions

If you have existing Lambda functions that use the go1.x runtime, you can migrate these functions by following these steps:

  1. Recompile your binary using the preceding commands, making sure to name your binary bootstrap.
  2. If you are using the same instruction set architecture, open the runtime settings and switch the runtime to “Provide your own bootstrap on Amazon Linux 2”.
  3. Upload the new version of your binary as a zip file.
    Edit runtime settings

Note: The handler value is not used by the provided.al2 runtime or by the aws-lambda-go library, and may be set to any value. We recommend setting the value to bootstrap to help with migrating between go1.x and provided.al2.

To switch the instruction set architecture to Graviton (arm64), save your changes, and then re-open the runtime settings to make the architecture change.

Migrating Go1.x Lambda container images

Lambda allows you to run your Go code as a container image. Customers that are using the go1.x base image for Lambda containers must migrate to the provided.al2 base image. The Lambda documentation includes instructions on how to build and deploy Go functions using the provided.al2 base image.

The following Dockerfile uses a two-stage build to avoid unnecessary layers and files in your final image. The first stage of the process builds the application. This stage installs Go, downloads the dependencies for the code, and compiles the binary. The second stage copies the executable to a new container without the dependencies of the build process.

  1. Create a Dockerfile in your project directory:
    FROM public.ecr.aws/lambda/provided:al2 as build
    # install compiler
    RUN yum install -y golang
    RUN go env -w GOPROXY=direct
    # cache dependencies
    ADD go.mod go.sum ./
    RUN go mod download
    # build
    ADD . .
    RUN go build -tags lambda.norpc -o /main
    # copy artifacts to a clean image
    FROM public.ecr.aws/lambda/provided:al2
    COPY --from=build /main /main
    ENTRYPOINT [ "/main" ]           
    
  2. Build your Docker image with the Docker build command:
    docker build -t hello-world .
  3. Authenticate the Docker CLI to your Amazon ECR registry:
    aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com 
  4. Tag and push your image to the Amazon ECR registry:
    docker tag hello-world:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/hello-world:latest
    
    docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/hello-world:latest

You can now create or update your Go Lambda function to use the new container image.

Changes to tooling

To migrate your functions from the go1.x runtime to the provided.al2 runtime, you must make configuration changes to your build scripts or CI/CD configurations. Here are some common examples.

Makefiles and build scripts

If you use Makefiles or custom build scripts to build Go functions, you must modify them to ensure the executable file is named bootstrap when deploying to the provided.al2 runtime.

Here is an example Makefile which compiles the main.go file into an executable called bootstrap in the bin folder. It also creates a zip file, which you can deploy to Lambda using the console or via the AWS CLI.

GOARCH=arm64 GOOS=linux go build -tags lambda.norpc -o ./bin/bootstrap
(cd bin && zip -FS bootstrap.zip bootstrap)

CloudFormation

If you deploy your Lambda functions using AWS CloudFormation templates, change the Handler and Runtime settings under Properties:

Resources:
  Function:
    Type: AWS::Serverless::Function
    Properties:
      Handler: bootstrap
      Runtime: provided.al2
      ... # Other required properties

AWS Serverless Application Model

If you use the AWS Serverless Application Model (AWS SAM) to build and deploy your Go functions, make the same changes to the Handler and Runtime settings as for CloudFormation. You must also add the BuildMethod:

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Metadata:
      BuildMethod: go1.x 
    Properties:
      CodeUri: hello-world/ # folder where your main program resides
      Handler: bootstrap
      Runtime: provided.al2
      Architectures:
        - x86_64

Cloud Development Kit (CDK)

If you use the AWS Cloud Development Kit (AWS CDK), you can compile your Go executable and place it under a folder in your project. Next, specify the location by using awslambda.Code_FromAsset, and AWS CDK packages the binary into a zip file and uploads it.

// Go CDK
awslambda.NewFunction(stack, jsii.String("HelloHandler"), &awslambda.FunctionProps{
    Code:         awslambda.Code_FromAsset(jsii.String("lambda"), nil), //folder where bootstrap executable is located
    Runtime:      awslambda.Runtime_PROVIDED_AL2(),
    Handler:      jsii.String("bootstrap"), // Handler named bootstrap
    Architecture: awslambda.Architecture_ARM_64(),
})

Taking this further, AWS CDK can perform build commands as part of your AWS CDK build process by using the native AWS CDK bundling functionality. With the bundling parameter, AWS CDK can perform steps before staging the files in the cloud assembly. Instead of placing the binary file, place the Go code in a folder and use the Bundling option to compile the code in a Docker container.

This example uses the golang:1.20.1 Docker image. After compilation, AWS CDK creates a zip file with the binary and creates the Lambda function:

// Go CDK
awslambda.NewFunction(stack, jsii.String("HelloHandler"), &awslambda.FunctionProps{
        Code: awslambda.Code_FromAsset(jsii.String("go-lambda"), &awss3assets.AssetOptions{
                Bundling: &awscdk.BundlingOptions{
                        Image: awscdk.DockerImage_FromRegistry(jsii.String("golang:1.20.1")),
                        Command: &[]*string{
                                jsii.String("bash"),
                                jsii.String("-c"),
                                jsii.String("GOCACHE=/tmp go mod tidy && GOCACHE=/tmp GOARCH=arm64 GOOS=linux go build -tags lambda.norpc -o /asset-output/bootstrap"),
                        },
                },
        }),
        Runtime:      awslambda.Runtime_PROVIDED_AL2(),
        Handler:      jsii.String("bootstrap"),
        Architecture: awslambda.Architecture_ARM_64(),
})

Conclusion

Lambda is deprecating the go1.x runtime in line with Amazon Linux 1 end-of-life, scheduled for December 31, 2023. Customers using Go with Lambda should migrate their functions to the provided.al2 runtime. Benefits include support for AWS Graviton2 processors with better price-performance, and a streamlined invoke path with faster performance.

For more serverless learning resources, visit Serverless Land.

Decoupling event publishing with Amazon EventBridge Pipes

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/decoupling-event-publishing-with-amazon-eventbridge-pipes/

This post is written by Gregor Hohpe, Sr. Principal Evangelist, Serverless.

Event-driven architectures (EDAs) help decouple applications or application components from one another. Consuming events makes a component less dependent on the sender’s location or implementation details. Because events are self-contained, they can be consumed asynchronously, which allows sender and receiver to follow independent timing considerations. Decoupling through events can improve development agility and operational resilience when building fine-grained distributed applications, which is the preferred style for serverless applications.

Many AWS services publish events through built-in mechanisms to support building event-driven architectures with a minimal amount of custom coding. Modern applications built on top of those services can also send and consume events based on their specific business logic. AWS application integration services like Amazon EventBridge or Amazon SNS, a managed publish-subscribe service, filter those events and route them to the intended destination, providing additional decoupling between event producer and consumer.

Publishing events

Custom applications that act as event producers often use the AWS SDK library, which is available for 12 programming languages, to send an event message. The application code constructs the event as a local data structure and specifies where to send it, for example to an EventBridge event bus.

The application code required to send an event to EventBridge is straightforward and only requires a few lines of code, as shown in this (simplified) helper method that publishes an order event generated by the application:

import { EventBridgeClient, PutEventsCommand } from "@aws-sdk/client-eventbridge";

async function sendEvent(event) {
    // eventBusName is typically supplied via configuration, such as an environment variable
    const ebClient = new EventBridgeClient();
    const params = { Entries: [ {
          Detail: JSON.stringify(event),
          DetailType: "newOrder",
          Source: "orders",
          EventBusName: eventBusName
        } ] };
    return await ebClient.send(new PutEventsCommand(params));
}

An application most likely calls such a method in the context of another action, for example when persisting a received order to a data store. The code that performs those tasks might look as follows:

const order = { "orderId": "1234", "Items": ["One", "Two", "Three"] }
await writeToDb(order);
await sendEvent(order);

The code populates an order object with multiple line items (in reality, this would be based on data entered by a user or received via an API call), writes it to a database (via another helper method whose implementation isn’t shown), and then sends it to an EventBridge bus via the preceding method.

Code causes coupling

Although this code is not complex, it has drawbacks from an architectural perspective:

  • It interweaves application logic with the solution’s topology because the destination of the event, both in terms of the service (EventBridge versus SNS, for example) and the instance (the event bus name in this case), is defined inside the application’s source code. If the event destination changes, you must change the source code, or at least know which string constant is passed to the function via an environment variable. Both aspects work against the EDA principle of minimizing coupling between components because changes in the communication structure propagate into the producer’s source code.
  • Sending the event after updating the database is brittle because it lacks transactional guarantees across both steps. You must implement error handling and retry logic to handle cases where sending the event fails, or even undo the database update. Writing such code can be tedious and error-prone.
  • Code is a liability. After all, that’s where bugs come from. In a real-life example, a helper method similar to the preceding code erroneously swapped day and month on the event date, which led to a challenging debugging cycle. Boilerplate code to send events is therefore best avoided.

Performing event routing inside EventBridge can lessen the first concern. You could reconfigure EventBridge’s rules and targets to route events with a specified type and source to a different destination, provided you keep the event bus name stable. However, the other issues would remain.

Serverless: Less infrastructure, less code

AWS serverless integration services can alleviate the need to write custom application code to publish events altogether.

A key benefit of serverless applications is that you can let the AWS Cloud do the undifferentiated heavy lifting for you. Traditionally, we associate serverless with provisioning, scaling, and operating compute infrastructure so that developers can focus on writing code that generates business value.

Serverless application integration services can also take care of application-level tasks for you, including publishing events. Most applications store data in AWS data stores like Amazon Simple Storage Service (S3) or Amazon DynamoDB, which can automatically emit events whenever an update takes place, without any application code.

EventBridge Pipes: Events without application code

EventBridge Pipes allows you to create point-to-point integrations between event producers and consumers with optional transformation, filtering, and enrichment steps. Serverless integration services combined with cloud automation allow “point-to-point” integrations to be more easily managed than in the past, which makes them a great fit for this use case.

This example takes advantage of EventBridge Pipes’ ability to fetch events actively from sources like DynamoDB Streams. DynamoDB Streams captures a time-ordered sequence of item-level modifications in any DynamoDB table and stores this information in a log for up to 24 hours. EventBridge Pipes picks up events from that log and pushes them to one of over 14 event targets, including an EventBridge bus, SNS, SQS, or API Destinations. It also accommodates batch sizes, timeouts, and rate limiting where needed.

EventBridge Pipes example

The integration through EventBridge Pipes can replace the custom application code that sends the event, including any retry or error logic. Only the following code remains:

const order = { "orderId": "1234", "Items": ["One", "Two", "Three"] }
await writeToDb(order);

Automation as code

EventBridge Pipes can be configured from the CLI, the AWS Management Console, or from automation code using AWS CloudFormation or AWS CDK. By using AWS CDK, you can use the same programming language that you use to write your application logic to also write your automation code.

For example, the following CDK snippet configures an EventBridge Pipe to read events from a DynamoDB Stream attached to the Orders table and passes them to an EventBridge event bus.

This code references the DynamoDB table via the ordersTable variable that would be set when the table is created:

const pipe = new CfnPipe(this, 'pipe', {
  roleArn: pipeRole.roleArn,
  source: ordersTable.tableStreamArn!,
  sourceParameters: {
    dynamoDbStreamParameters: {
      startingPosition: 'LATEST'
    },
  },
  target: ordersEventBus.eventBusArn,
  targetParameters: {
    eventBridgeEventBusParameters: {
      detailType: 'order-details',
      source: 'my-source',
    },
  },
}); 

The automation code cleanly defines the dependency between the DynamoDB table and the event destination, independent of application logic.

Decoupling with data transformation

Coupling is not limited to event sources and destinations. A source’s data format can determine the event format and require downstream changes in case the data format or the data source change. EventBridge Pipes can also alleviate that consideration.

Events emitted from the DynamoDB Stream use the native, marshaled DynamoDB format that includes type information, such as an “S” for strings or “L” for lists.

For example, the order event in the DynamoDB stream from this example looks as follows. Some fields are omitted for readability:

{
  "detail": {
    "eventID": "be1234f372dd12552323a2a3362f6bd2",
    "eventName": "INSERT",
    "eventSource": "aws:dynamodb",
    "dynamodb": {
      "Keys": { "orderId": { "S": "ABCDE" } },
      "NewImage": { 
        "orderId": { "S": "ABCDE" },
        "items": {
            "L": [ { "S": "One" }, { "S": "Two" }, { "S": "Three" } ]
        }
      }
    }
  }
} 

This format is not well suited for downstream processing because it would unnecessarily couple event consumers to the fact that this event originated from a DynamoDB Stream. EventBridge Pipes can convert this event into a more easily consumable format. The transformation is specified via an inputTemplate parameter using JSONPath expressions. EventBridge Pipes’ support for list processing with wildcards is a perfect fit for this scenario.

In this example, add the following transformation template to the target parameters of the preceding CDK code (the asterisk character matches a complete list of elements):

targetParameters: {
  inputTemplate: '{"orderId": <$.dynamodb.NewImage.orderId.S>,' + 
                 '"items": <$.dynamodb.NewImage.items.L[*].S>}'
}

This transformation formats the event published by EventBridge Pipes like a regular business event, decoupling any event consumer from the fact that it originated from a DynamoDB table:

{
  "time": "2023-06-01T06:18:10Z",
  "detail": {
    "orderId": "ABCDE",
    "items": ["One", "Two", "Three" ]
  }
}

Conclusion

When building event-driven applications, consider whether you can replace application code with serverless integration services to improve the resilience of your application and provide a clean separation between application logic and system dependencies.

EventBridge Pipes can be a helpful feature in these situations, for example to capture and publish events based on DynamoDB table updates.

Learn more about EventBridge Pipes at https://aws.amazon.com/eventbridge/pipes/ and discover additional ways to reduce serverless application code at https://serverlessland.com/refactoring-serverless. For a complete code example, see https://github.com/aws-samples/aws-refactoring-to-serverless.

Understanding AWS Lambda’s invoke throttling limits

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/

This post is written by Archana Srikanta, Principal Engineer, AWS Lambda.

When you call AWS Lambda’s Invoke API, a series of throttle limits are evaluated to decide if your call is let through or throttled with a 429 “Too Many Requests” exception. This blog post explains the most common invoke throttle limits and the relationship between them, so you can better understand scaling workloads on Lambda.

Overview

Invoke call flow

The throttle limits exist to protect the following components of Lambda’s internal service architecture, and your workload, from noisy neighbors:

  • Execution environment: An execution environment is a Firecracker microVM where your function code runs. A given execution environment only hosts one invocation at a time, but it can be reused for subsequent invocations of the same function version.
  • Invoke data plane: These are a series of internal web services that, on an invoke, select (or create) a sandbox and route your request to it. This is also responsible for enforcing the throttle limits.

When you make an Invoke API call, it transits through some or all of the Invoke Data Plane services, before reaching an execution environment where your function code is downloaded and executed.

There are three distinct but related throttle limits which together decide if your invoke request is accepted by the data plane or throttled.

Concurrency

Concurrent means “existing, happening, or done at the same time”. Accordingly, the Lambda concurrency limit is a limit on the simultaneous in-flight invocations allowed at any given time. It is not a rate or transactions per second (TPS) limit in and of itself, but instead a limit on how many invocations can be inflight at the same time. This documentation visually explains the concept of concurrency.

Under the hood, the concurrency limit roughly translates to a limit on the maximum number of execution environments (and thus Firecracker microVMs) that your account can claim at any given point in time. Lambda runs a fleet of multi-tenant bare metal instances, on which Firecracker microVMs are carved out to serve as execution environments for your functions. AWS constantly monitors and scales this fleet based on incoming demand and shares the available capacity fairly among customers.

The concurrency limit helps protect Lambda from a single customer exhausting all the available capacity and causing a denial of service to other customers.

Transactions per second (TPS)

Customers often ask how their concurrency limit translates to TPS. The answer depends on how long your function invocations last.

Relation between concurrency and TPS

The diagram above considers three cases, each with a different function invocation duration, but a fixed concurrency limit of 1000. In the first case, invocations have a constant duration of 1 second. This means you can initiate 1000 invokes and claim all 1000 execution environments permitted by your concurrency limit. These execution environments remain busy for the entire second, and you cannot start any more invokes in that second because your concurrency limit prevents you from claiming any more execution environments. So, the TPS you can achieve with a concurrency limit of 1000 and a function duration of 1 second is 1000 TPS.

In case 2, the invocation duration is halved to 500ms, with the same concurrency limit of 1000. You can initiate 1000 concurrent invokes at the start of the second as before. These invokes keep the execution environments busy for the first half of the second. Once finished, you can start an additional 1000 invokes against the same execution environments while still being within your concurrency limit. So, by halving the function duration, you doubled your TPS to 2000.

Similarly, in case 3, if your function duration is 100ms, you can initiate 10 rounds of 1000 invokes each in a second, achieving a TPS of 10K.

Codifying this as an equation, the TPS you can achieve given a concurrency limit is:

TPS = concurrency / function duration in seconds

Taken to an extreme, for a function duration of only 1ms and at a concurrency limit of 1000 (the default limit), an account can drive an invoke TPS of one million. Every additional unit of concurrency granted via a limit increase therefore implicitly grants an additional 1000 TPS. The high TPS doesn’t require any additional execution environments (Firecracker microVMs), so it’s not problematic from a fleet capacity perspective. However, driving over a million TPS from a single account puts stress on the Invoke Data Plane services. They must be protected from noisy neighbor impact as well so that all customers have a fair share of the services’ bandwidth. A concurrency limit alone isn’t sufficient to protect against this – the TPS limit provides this protection.

As of this writing, the invoke TPS is capped at 10 times your concurrency. Added to the previous equation:

TPS = min( 10 x concurrency, concurrency / function duration in seconds)

The concurrency factor is common across both terms in the min function, so the key comparison is:

min(10, 1 / function duration in seconds)

If the function duration is exactly 100ms (or 1/10th of a second), both terms in the min function are equal. If the function duration is over 100ms, the second term is lower and TPS is limited as per concurrency/function duration. If the function duration is under 100ms, the first term is lower and TPS is limited as per 10 x concurrency.

Limits for functions less than 100ms

To summarize, the TPS limit exists to protect the Invoke Data Plane from the high churn of short-lived invocations, for which the concurrency limit alone affords too high of a TPS. If you drive short invocations of under 100ms, your throughput is capped as though the function duration is 100ms (at 10 x concurrency) as shown in the diagram above. This implies that short lived invocations may be TPS limited, rather than concurrency limited. However, if your function duration is over 100ms you can effectively ignore the 10 x concurrency TPS limit and calculate your available TPS as concurrency/function duration.
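
This relationship is easy to codify. The following sketch, with illustrative numbers, computes the effective TPS cap from a concurrency limit and an average function duration:

def max_tps(concurrency_limit: int, avg_duration_seconds: float) -> float:
    # TPS = min(10 x concurrency, concurrency / function duration in seconds)
    return min(10 * concurrency_limit, concurrency_limit / avg_duration_seconds)

print(max_tps(1000, 1.0))   # 1000.0  -> bound by concurrency
print(max_tps(1000, 0.5))   # 2000.0  -> bound by concurrency
print(max_tps(1000, 0.05))  # 10000.0 -> capped by the 10 x concurrency TPS limit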

Burst

The third throttle limit is the burst limit. Lambda does not keep execution environments provisioned for your entire concurrency limit at all times. That would be wasteful, especially if usage peaks are transient, as is the case with many workloads. Instead, the service spins up execution environments just-in-time as the invoke arrives, if one doesn’t already exist. Once an execution environment is spun up, it remains “warm” for some period of time and is available to host subsequent invocations of the same function version.

However, if an invoke doesn’t find a warm execution environment, it experiences a “cold start” while we provision a new execution environment. Cold starts involve certain additional operations over and above the warm invoke path, such as downloading your code or container and initializing your application within the execution environment. These initialization operations are typically computationally heavy and so have a lower throughput compared to the warm invoke path. If there are sudden and steep spikes in the number of cold starts, it can put pressure on the invoke services that handle these cold start operations, and also cause undesirable side effects for your application such as increased latencies, reduced cache efficiency and increased fan out on downstream dependencies. The burst limit exists to protect against such surges of cold starts, especially for accounts that have a high concurrency limit. It ensures that the climb up to a high concurrency limit is gradual so as to smooth out the number of cold starts in a burst.

The algorithm used to enforce the burst limit is the Token Bucket rate-limiting algorithm. Consider a bucket that holds tokens. The bucket has a maximum capacity of B tokens (burst). The bucket starts full. Each time you send an invoke request that requires an additional unit of concurrency, it costs a token from the bucket. If the token exists, you are granted the additional concurrency and the token is removed from the bucket. The bucket is refilled at a constant rate of r tokens per minute (rate) until it reaches its maximum capacity.

Token bucket

What this means is that the rate of climb of concurrency is limited to r tokens per minute. Even though the algorithm allows you to collect up to B tokens and burst, you must wait for the bucket to refill before you can burst again, effectively limiting your average rate to r per minute.
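
The following minimal sketch models this token bucket behavior; the burst and refill values are illustrative and not the actual Lambda limits:

class TokenBucket:
    def __init__(self, burst: int, refill_per_minute: int):
        self.capacity = burst              # B: maximum burst
        self.tokens = burst                # the bucket starts full
        self.refill = refill_per_minute    # r: refill rate

    def tick_minute(self) -> None:
        self.tokens = min(self.capacity, self.tokens + self.refill)

    def claim(self, units: int) -> bool:
        # Each additional unit of concurrency costs one token
        if units <= self.tokens:
            self.tokens -= units
            return True
        return False  # burst throttled

bucket = TokenBucket(burst=1000, refill_per_minute=500)
print(bucket.claim(1000))  # True: the full burst headroom is available
print(bucket.claim(1))     # False: the bucket is empty, wait for the refill
bucket.tick_minute()
print(bucket.claim(400))   # True: 500 tokens were refilled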

Concurrency burst limit chart

The chart above shows the burst limit in action with a maximum concurrency limit of 3000, a maximum burst (B) of 1000, and a refill rate (r) of 500/minute. The token bucket starts full with 1000 tokens, so the available burst headroom also starts at 1000.

There is a burst of activity between minute one and two, which consumes all tokens in the bucket and claims all 1000 concurrent execution environments allowed by the burst limit. At this point the bucket is empty and any attempt to claim additional concurrent execution environments is burst throttled, even though the maximum concurrency has not yet been reached.

The token bucket and the burst headroom are replenished at minutes two and three with 500 tokens each minute to bring it back up to its maximum capacity of 1000. At minute four, there is no additional refill because the bucket is at maximum capacity. Between minutes four and five, there is a second burst activity which empties the bucket again and claims an additional 1000 execution environments, bringing the total number of active execution environments to 2000.

The bucket continues to replenish at a rate of 500/minute at minutes five and six. At this point, sufficient tokens have been accumulated to cover the entire concurrency limit of 3000, and so the bucket isn’t refilled anymore even when you have the third burst activity at minute seven. At minute ten, when all the usage ramps down, the available burst headroom slowly stair steps back down to the maximum initial burst of 1K.

The actual numbers for maximum burst and refill rate vary by Region and are subject to change. Visit the Lambda burst limits page for specific values.

It is important to distinguish that the burst limit isn’t a rate limit on the invoke itself, but a rate limit on how quickly concurrency can rise. However, since invoke TPS is a function of concurrency, it also clamps how quickly TPS can rise (a rate limit for a rate limit). The following chart shows how the TPS burst headroom follows a similar stair step pattern as the concurrency burst headroom, only with a multiplier.

Burst limits chart

Conclusion

This blog explains three key throttle limits applied on Lambda invokes: the concurrency limit, TPS limit and burst limit. It outlines the relationship between these limits and how each one protects the system and your workload from noisy neighbors. Equipped with this knowledge you can better interpret any 429 throttling exceptions you may receive while scaling your applications on Lambda. For more information on getting started with Lambda visit the Developer Guide.

For more serverless learning resources, visit Serverless Land.

Ruby 3.2 runtime now available in AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/ruby-3-2-runtime-now-available-in-aws-lambda/

This post is written by Praveen Koorse, Senior Solutions Architect, AWS.

AWS Lambda now supports the Ruby 3.2 runtime. With this release, Ruby developers can take advantage of new features and improvements introduced in Ruby 3 when creating serverless applications on Lambda. Use this runtime today by specifying a runtime parameter value of ruby3.2 when creating or updating Lambda functions.

Ruby 3.2 adds many features and performance improvements, including anonymous arguments passing improvements, ‘endless’ methods, Regexp improvements, a new Data class, support for pattern-matching in Time and MatchData, and support for ‘find pattern’ in pattern matching.

Our testing shows Ruby 3.2 cold starts are marginally slower than Ruby 2.7 for a trivial ‘hello world’ function. However, for many real-world workloads, the improved execution performance of Ruby 3.2 results in similar or better performance overall.

Existing Lambda customers using the Ruby 2.7 runtime should migrate to the Ruby 3.2 runtime as soon as possible. Even though community support for Ruby 2.7 has ended, Lambda has extended support for the Ruby 2.7 runtime until December 7, 2023 to provide existing Ruby customers with time to transition to Ruby 3.2. Functions using Ruby 2.7 continue to be eligible for technical support and Lambda will continue to apply OS security updates to the runtime until this date.

Anonymous arguments passing improvements

Ruby 3.2 has introduced improvements to how you can pass anonymous arguments, making it easier and cleaner to work with keyword arguments in code. Previously, you could pass anonymous keyword arguments to a method by using the delegation syntax (...) or by using Module#ruby2_keywords and delegating *args, &block. This approach was not intuitive and lacked clarity when working with multiple arguments.

def foo(...)
  target(...)
end

ruby2_keywords def foo(*args, &block)
  target(*args, &block)
end

Now, if a method declaration includes anonymous positional or keyword arguments, those can be passed to the next method as arguments themselves. The same advantages of anonymous block forwarding apply to rest and keyword rest argument forwarding.

def keywords(**) # accept keyword arguments
  foo(**) # pass them to the next method
end

def positional(*) # accept positional arguments
  bar(*) # pass to the next method
end

def positional_keywords(*, **) # same as ...
  foobar(*, **)
end

Endless methods

Ruby 3 introduced endless methods that enable developers to define methods of exactly one statement with a syntax def method() = statement. The syntax doesn’t need an end and allows methods to be defined as short-cut one liners, making the creation of basic utility methods easier and helping developers write clean code and improve readability and maintainability of code.

def dbg_args(a, b=1, c:, d: 6, &block) = puts("Args passed: #{[a, b, c, d, block.call]}")
dbg_args(0, c: 5) { 7 }
# Prints: Args passed: [0, 1, 5, 6, 7]

def square(x) = x**2
square(100) 
# => 10000

Regexp improvements

Regular expression matching can take an unexpectedly long time. If your code attempts to match a possibly inefficient Regexp against an untrusted input, an attacker may exploit it for efficient denial of service (so-called regular expression DoS, or ReDoS).

Ruby 3.2 introduces two improvements to mitigate this.

The first is an improved Regexp matching algorithm using a cache-based approach that allows most Regexp matching to be completed in linear time, thus significantly improving overall match performance.

As the prior optimization cannot be applied to some regular expressions, such as those using advanced features or working with a huge fixed number of repetitions, a Regexp timeout improvement now allows you to specify a timeout for Regexp operations as a fallback measure.

There are two APIs to set timeout:

  • Regexp.timeout=: This is a global configuration and applies to all Regexps in the process.
  • timeout keyword for Regexp.new: This allows you to specify a different timeout setting for individual Regexps. If used, it takes precedence over the global configuration.

Regexp.timeout = 2.0 # Global configuration set to two seconds
/^x*y?x*()\1$/ =~ "x" * 45000 + "a"
#=> Regexp::TimeoutError is raised in two seconds

my_long_rexp = Regexp.new('^x*y?x*()\1$', timeout: 4)
my_long_rexp =~ "x" * 45000 + "a"
# Regexp::TimeoutError is raised in four seconds

Data class

Ruby 3.2 introduces a new core class – ‘Data’ – to represent simple value-alike objects. The class is similar to Struct and partially shares the implementation, but is intended to be immutable with a more modern interface.

Once a Data class is defined using Data.define, both positional and keyword arguments can be used while constructing objects.

def handler(event:, context:) 
  employee = Data.define(:firstname, :lastname, :empid, :department)

  emp1 = employee.new('John', 'Doe', 12345, 'Sales')
  emp2 = employee.new(firstname: 'Alice', lastname: 'Doe', empid: 12346, department: 'Marketing')

  # Alternative form to construct an object
  emp3 = employee['Jack', 'Frost', 12354, 'Tech']
  emp4 = employee[firstname: 'Emma', lastname: 'Frost', empid: 12453, department: 'HR']
end

Pattern matching improvements

Pattern matching is a feature allowing deep matching of structured values: checking the structure and binding the matched parts to local variables.

Ruby 3.2 introduces ‘Find pattern’, allowing you to check if the given object has any elements that match a pattern. This pattern is similar to the Array pattern, except that it finds the first match in a given object containing multiple elements.

For example, previously, if you used the following in a Lambda function:

person = {name: "John", children: [{name: "Mark", age: 12}, {name: "Butler", age: 9}], siblings: [{name: "Mary", age: 31}, {name: "Conrad", age: 38}] }

case person
in {name: "John", children: [{name: "Mark", age: age}]}
  p age
end

This wouldn’t match because it does not search for the element in the ‘children’ array. It only matches for an array with a single element named “Mark”.

To find an element in an array with multiple elements, use ‘find pattern’:

case person
in {name: "John", children: [*, {name: "Mark", age: age}, *]}
  p age
end

As part of an effort to make core classes more pattern matching friendly, Ruby 3.2 also introduces the ability to deconstruct keys in Time and MatchData, allowing their use in case/in statements for pattern matching.

# `deconstruct_keys(nil)` provides all available keys:
timestamp = Time.now.deconstruct_keys(nil)

# Usage in pattern-matching:
case timestamp
in year: ...2022
  puts "Last year!"
in year: 2022, month: 1..3
  puts "Last year's first quarter"
in year: 2023, month:, day:
  puts "#{day} of #{month}th month!"
end

Standard Date and DateTime classes also have similar implementations for key deconstruction:

require 'date'
Date.today.deconstruct_keys(nil)
#=> {:year=>2023, :month=>1, :day=>15, :yday=>15, :wday=>0}
DateTime.now.deconstruct_keys(nil)
# => {:year=>2023, :month=>1, :day=>15, :yday=>15, :wday=>0, :hour=>17, :min=>19, :sec=>15, :sec_fraction=>(478525469/500000000), :zone=>"+02:00"}

Pattern matching with MatchData result deconstruction:

case db_connection_string.match(%r{postgres://(\w+):(\w+)@(.+)})
in 'admin', password, server
  # do connection with admin rights
in 'devuser', _, 'dev-server.us-east-1.rds.amazonaws.com'
  # connect to dev server
in user, password, server
  # do regular connection
end

YJIT – Yet Another Ruby JIT

YJIT, a lightweight, minimalistic Ruby JIT compiler built inside CRuby, is now an official part of the Ruby 3.2 runtime. It provides significantly higher performance, but also uses more memory than the Ruby interpreter and is generally suited for Ruby on Rails workloads.

By default, YJIT is not enabled in the Lambda Ruby 3.2 runtime. You can enable it for specific functions by setting the RUBY_YJIT_ENABLE environment variable to 1. Once enabled, you can verify it by printing the result of the RubyVM::YJIT.enabled? method.

puts(RubyVM::YJIT.enabled?())
# => true

Using Ruby 3.2 in Lambda

AWS Cloud Development Kit (AWS CDK):

In AWS CDK, set the runtime attribute to Runtime.RUBY_3_2 when creating the function to use this version. In TypeScript:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';

export class MyCdkStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new lambda.Function(this, 'Ruby32Lambda', {
      runtime: lambda.Runtime.RUBY_3_2,       // execution environment
      handler: 'test.handler',                // file is "test", function is "handler"
      code: lambda.Code.fromAsset('lambda'),  // code loaded from the "lambda" directory
    });
  }
}

AWS Management Console

In the Lambda console, specify a runtime parameter value of Ruby 3.2 when creating or updating a function. The Ruby 3.2 runtime is now available in the Runtime dropdown in the Create function page.

Create function

To update an existing Lambda function to Ruby 3.2, navigate to the function in the Lambda console, then choose Edit in the Runtime settings panel. The new version of Ruby is available in the Runtime dropdown:

Edit runtime settings

AWS Serverless Application Model

In the AWS Serverless Application Model (AWS SAM), set the Runtime attribute to ruby3.2 to use this version in your application deployments:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: AWS Lambda Ruby 3.2 example

Resources:
  Ruby32Lambda:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: Ruby32Lambda
      Description: Lambda function that uses the Ruby 3.2 runtime
      Handler: function.handler
      Runtime: ruby3.2
      CodeUri: src/

Conclusion

Get started building with Ruby 3.2 today by making necessary changes for compatibility with Ruby 3.2, and specifying a runtime parameter value of ruby3.2 when creating or updating your Lambda functions. You can read about the Ruby programming model in the Lambda documentation to learn more about writing functions in Ruby 3.2.

For more serverless learning resources, visit Serverless Land.

Implementing custom domain names for Amazon API Gateway private endpoints using a reverse proxy

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/implementing-custom-domain-names-for-amazon-api-gateway-private-endpoints-using-a-reverse-proxy/

This post is written by Heeki Park, Principal Solutions Architect, Sachin Doshi, Senior Application Architect, and Jason Enderle, Senior Solutions Architect.

Amazon API Gateway enables developers to create private REST APIs that can only be accessed from within a Virtual Private Cloud (VPC). Traffic to the private API is transmitted over secure connections and stays within the AWS network and specifically within the customer’s VPC, protecting it from the public internet. This approach can be used to address a customer’s regulatory or security requirements by ensuring the confidentiality of the transmitted traffic. This makes private API Gateway endpoints suitable for publishing internal APIs, such as those used by microservices and data APIs.

In microservice architectures, teams often build and manage components in separate AWS accounts and prefer to access those private API endpoints using company-specific custom domain names. Custom domain names serve as an alias for a hostname and path to your API. This makes it easier for clients to connect using an easy-to-remember vanity URL and also maintains a stable URL in case the underlying API endpoint URL changes. Custom domain names can also improve the organization of APIs according to their functions within the enterprise. For example, the standard API Gateway URL format: “https://api-id.execute-api.region.amazonaws.com/stage” can be transformed into “https://api.private.example.com/myservice”.

Overview

This blog post builds on documentation that covers frontend invokes of private endpoints and backend integration patterns, as well as two previously published blog posts.

The first blog post helps you consume private API endpoints from API Gateway using a VPC-enabled Lambda function and a container-based application using mTLS. The second post helps you implement private backend integrations to your API microservices that are deployed using AWS Fargate or Amazon EC2. This post extends these, enabling you to simplify access to your API endpoints by implementing custom domain names for private endpoints using an NGINX reverse proxy.

This solution uses NGINX because it acts as a high-performance intermediary, enabling the efficient forwarding of traffic within a private network. A configuration mapping file associates your custom domain with the corresponding private endpoint across AWS accounts. This configuration mapping file can then be source controlled and used for governed deployments into your lower and production environments.

The following diagram illustrates the interactions between the components and the path for an API request. In this use case, a shared services account (Account A) is responsible for centrally managing the mappings of custom domains and creating an AWS PrivateLink connection to private API endpoints in provider accounts (Account B and Account C).

Architecture

  1. A request to the API is made using a private custom domain from within a VPC or another device that is able to route to the VPC. For example, the request might use the domain https://api.private.example.com.
  2. An alias record in an Amazon Route 53 private hosted zone resolves to the fully qualified domain name of the private Elastic Load Balancing (ELB) load balancer. The ELB can be configured as either a Network Load Balancer (NLB) or an Application Load Balancer (ALB).
  3. The ELB uses an AWS Certificate Manager (ACM) certificate to terminate TLS (Transport Layer Security) for the corresponding custom private domain.
  4. The ELB listener forwards requests to an associated ELB target group, which in turn forwards the request to an Amazon Elastic Container Service (Amazon ECS) task running on AWS Fargate.
  5. The Fargate service hosts an NGINX-based container that acts as a reverse proxy to the private API endpoints in one or more provider accounts. The Fargate service is configured to scale automatically based on CPU utilization.
  6. The Fargate task forwards traffic to the appropriate private endpoints in provider Account B or Account C through a PrivateLink VPC endpoint.
  7. The API Gateway resource policy limits access to the private endpoints based on the specific VPC endpoint, HTTP verbs, and the source domain used to request the API.

The solution passes headers from upstream calls, such as authentication, content type, or custom data headers, unmodified to the private endpoints in the provider accounts (Account B and Account C).

Prerequisites

To use custom domain names, you need two components: a TLS certificate and a DNS alias. This example uses ACM for managing the TLS certificate and Route 53 for creating the DNS alias.

ACM offers various options for integrating a TLS certificate, such as:

  1. Importing an existing TLS certificate into ACM.
  2. Requesting a TLS certificate in ACM with Email-based validation.
  3. Requesting a TLS certificate in ACM with DNS-based validation.
  4. Requesting a TLS certificate from ACM using your Organization’s Private CA in AWS.

The following diagram illustrates the benefits and drawbacks associated with each option.

Benefits and drawbacks

This solution uses DNS-based validation (option #3) to request TLS certificates from ACM. It is assumed that a public hosted zone with a registered root domain (such as example.com) is already deployed in the target account. The solution then uses ACM to validate ownership of the domain names specified in the configuration mapping file during deployment.

With a deployed public hosted zone, private child domains (such as api.private.example.com) can be deployed using DNS validation, which enables infrastructure as code (IaC) deployment to automate certificate validation during deployment of the solution. Additionally, DNS-based validation automatically renews the ACM certificate before it expires.

This solution requires the presence of specific VPC endpoints, namely execute-api, logs, ecr.dkr, ecr.api, and the Amazon S3 gateway endpoint, in the shared services account (Account A). Enabling private DNS on the execute-api VPC endpoint is optional and is not a requirement of the solution. Some customers may choose not to enable private DNS on the execute-api VPC endpoint, as this allows applications within the VPC to reach the private API endpoints through the NGINX reverse proxy while still resolving public API Gateway endpoints.

Deploying the example

You can use the example AWS Cloud Development Kit (CDK) or Terraform code available on GitHub to deploy this pattern in your own account.

This solution uses a YAML-based configuration mapping file to add, update, or delete a mapping between a custom domain and a private API endpoint. During deployment, the automated infrastructure as code (IaC) script parses the provided YAML file and does the following:

  • Creates an NGINX configuration file.
  • Applies the NGINX configuration file to the standard NGINX container image.
  • Parses the mapping file and creates the necessary Route 53 private hosted zones in Account A.
  • Creates wildcard-based SSL certificates (such as *.example.com) in Account A. ACM validates these certificates using the respective public hosted zone (such as example.com) and attaches them to the ELB listener. By default, an ELB listener supports up to 25 SSL certificates. Wildcards are used to secure an unlimited number of subdomains, making it easier to manage and scale multiple subdomains.

Description of mapping file fields

  • CUSTOM_DOMAIN_URL (required) – the desired custom URL for the private API, for example api.private.example.com.
  • PRIVATE_API_URL (required) – the execution URL of the targeted private endpoint, for example https://a1b2c3d4e5.execute-api.us-east-1.amazonaws.com/dev/path1/path2.
  • VERBS (optional) – used to create the API Gateway resource policies. Provide one or more verbs as a comma-separated list, such as ["GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"]. If this property is not provided, all verbs are permitted.
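
To illustrate how the mapping file drives the generated NGINX configuration, here is a hypothetical Python sketch that reads the YAML mappings and renders one server block per custom domain. It assumes the PyYAML package is available; the actual IaC scripts in the GitHub repository may structure this differently, and the NGINX directives are simplified.

import yaml

SERVER_BLOCK = """
server {{
    listen 8080;
    server_name {custom_domain};
    location / {{
        proxy_pass {private_api_url};
    }}
}}
"""

def render_nginx_config(mapping_file: str) -> str:
    # Each mapping entry contains CUSTOM_DOMAIN_URL and PRIVATE_API_URL
    with open(mapping_file) as f:
        mappings = yaml.safe_load(f)
    return "\n".join(
        SERVER_BLOCK.format(
            custom_domain=entry["CUSTOM_DOMAIN_URL"],
            private_api_url=entry["PRIVATE_API_URL"],
        )
        for entry in mappings
    )

if __name__ == "__main__":
    print(render_nginx_config("mapping.yaml"))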

Using API Gateway resource policies for private endpoints

To allow access to private endpoints from your VPCs or from VPCs in another account, you must implement a resource policy. Resource policies can be used to restrict access based on specific criteria such as VPC endpoints, API paths, and API verbs. To enable this functionality, follow these steps:

  • Complete the infrastructure as code (IaC) deployment.
  • Create or update an API Gateway resource policy in the provider accounts (such as Account B and Account C). This policy should include the VPC endpoint ID from the shared services account (Account A).
  • Deploy your API to apply the changes in provider accounts (such as Account B and Account C).

To update the API Gateway resource policy with code, refer to the documentation and code examples in the GitHub repository.
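
As a hypothetical illustration of that step, the following boto3 sketch attaches a resource policy to a private REST API in a provider account so that only the shared services VPC endpoint can invoke it, then redeploys the API. The API ID, VPC endpoint ID, and stage name are placeholders, and the examples in the GitHub repository may differ.

import json
import boto3

API_ID = "a1b2c3d4e5"                               # private API in the provider account (placeholder)
SHARED_VPC_ENDPOINT_ID = "vpce-0123456789abcdef0"   # VPC endpoint in the shared services account (placeholder)

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "execute-api:Invoke",
            "Resource": "execute-api:/*",
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "execute-api:Invoke",
            "Resource": "execute-api:/*",
            "Condition": {"StringNotEquals": {"aws:SourceVpce": SHARED_VPC_ENDPOINT_ID}},
        },
    ],
}

apigw = boto3.client("apigateway")
apigw.update_rest_api(
    restApiId=API_ID,
    patchOperations=[{"op": "replace", "path": "/policy", "value": json.dumps(policy)}],
)

# Redeploy the API so the updated resource policy takes effect
apigw.create_deployment(restApiId=API_ID, stageName="dev")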

Deploying updates to the mapping file

To add, update, or delete a mapping between your custom domain and private endpoint, you can update the mapping file and then rerun the deployment using the same steps as before.

Deploying subsequent updates to the mapping file using the existing infrastructure as code pipeline reduces the risk of human error, adds traceability, prevents configuration drift, and allows the deployment process to follow your existing DevOps and governance processes.

For example, you could store the configuration mapping file in a separate source control repository and commit each change to that repository. Each change could then trigger a deployment process, which would then check the configuration changes and conduct the appropriate deployment. If required, you could introduce gates to enforce either manual checks or a ticketing process to ensure that change control processes are enforced.

Understanding cost of the solution

Most of the services mentioned in this solution are billed according to usage, which is determined by the number of requests made.

However, there are a few services that incur hourly or monthly costs. These include monthly fees for Route 53 hosted zones, hourly charges for VPC endpoints and Elastic Load Balancing, and the hourly cost of running the NGINX reverse proxy on Fargate. To estimate the cost of these options for your specific workload, you can use the AWS Pricing Calculator. Here is an example outlining the approximate cost associated with the architecture implemented in this solution.

Conclusion

This blog post demonstrates a solution that allows customers to utilize their private endpoints securely with API Gateway across AWS accounts and within a VPC network by using a reverse proxy with a custom domain name. The solution offers a simplified approach to manage the mapping between private endpoints with API Gateway and custom domain names, ensuring seamless connectivity and security.

For more serverless learning resources, visit Serverless Land.

Automating stopping and starting Amazon MWAA environments to reduce cost

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/automating-stopping-and-starting-amazon-mwaa-environments-to-reduce-cost/

This post was written by Uma Ramadoss, Specialist Integration Services, and Chandan Rupakheti, Solutions Architect.

This blog post shows how you can save cost by automating the stopping and starting of an Amazon Managed Workflows for Apache Airflow (Amazon MWAA) environment. It describes how you can retain the data stored in a metadata database and presents an automated solution you can use in your AWS account.

Customers run end-to-end data pipelines at scale with Amazon MWAA. It is a common best practice to run non-production environments for development and testing. A non-production environment often does not need to run throughout the day due to factors such as the working hours of the development team. As there is no automatic way to stop an MWAA environment when it is not in use, and deleting an environment causes metadata loss, customers often run it continuously and pay the full cost.

Overview

Amazon MWAA has a distributed architecture with multiple components, such as the scheduler, workers, web server, queue, and database. Customers build data pipelines as Directed Acyclic Graphs (DAGs) and run them in Amazon MWAA. The DAGs use variables and connections from the Amazon MWAA metadata database. The history of DAG runs and related data is stored in the same metadata database. The database also stores other information, such as user roles and permissions.

When you delete the Amazon MWAA environment, all the components including the database are deleted so that you do not incur any cost. As this normal deletion results in loss of metadata, you need a customized solution to back up the data and to automate the deletion and recreation.

The sample application deletes and recreates your MWAA environment at a scheduled interval defined by you using Amazon EventBridge Scheduler. It exports all metadata into an Amazon S3 bucket before deletion and imports the metadata back to the environment after creation. As this is a managed database and you cannot access the database outside the Amazon MWAA environment, it uses DAGs to import and export the data. The entire process is orchestrated using AWS Step Functions.

Deployment architecture

The sample application is in a GitHub repository. Use the instructions in the readme to deploy the application.

Sample architecture

The sample application deploys the following resources:

  1. A Step Functions state machine to orchestrate the steps needed to delete the MWAA environment.
  2. A Step Functions state machine to orchestrate the steps needed to recreate the MWAA environment.
  3. EventBridge Scheduler rules to trigger the state machines at the scheduled time.
  4. An S3 bucket to store metadata database backup and environment details.
  5. Two DAG files uploaded to the source S3 bucket configured with the MWAA environment. The export DAG exports metadata from the MWAA metadata database to the backup S3 bucket, and the import DAG restores the metadata from the backup S3 bucket to the newly created MWAA environment (a simplified export sketch follows this list).
  6. AWS Lambda functions for triggering the DAGs using the MWAA CLI API.
  7. A Step Functions state machine to wait for the long-running MWAA creation and deletion process.
  8. Amazon EventBridge rule to notify on state machine failures.
  9. Amazon Simple Notification Service (Amazon SNS) topic as a target to the EventBridge rule for failure notifications.
  10. An interface VPC endpoint for AWS Step Functions, for MWAA environments deployed in private network mode.
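
The following is a simplified, hypothetical sketch of an export DAG. The DAGs shipped with the sample application back up the full set of metadata tables; this example only copies Airflow variables to the backup bucket to illustrate the approach, and the bucket name is a placeholder.

import json
from datetime import datetime

import boto3
from airflow import DAG
from airflow.models import Variable
from airflow.operators.python import PythonOperator
from airflow.settings import Session

BACKUP_BUCKET = "mwaa-metadata-backup"  # placeholder bucket name

def export_variables(**_):
    # Read all Airflow variables from the metadata database and write them to S3
    session = Session()
    variables = {v.key: v.val for v in session.query(Variable).all()}
    boto3.client("s3").put_object(
        Bucket=BACKUP_BUCKET,
        Key="backup/variables.json",
        Body=json.dumps(variables),
    )

with DAG(
    dag_id="export_metadata",
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,  # triggered on demand by the stop workflow
    catchup=False,
) as dag:
    PythonOperator(task_id="export_variables", python_callable=export_variables)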

Stop workflow

At a scheduled time, Amazon EventBridge Scheduler triggers a Step Functions state machine to stop the MWAA environment. The state machine performs the following actions:

Stop workflow

  1. Fetch the Amazon MWAA environment details, such as Airflow configurations, the IAM execution role, logging configurations, and VPC details.
  2. If the environment is not in the “AVAILABLE” status, the workflow fails by branching to the “Pausing unsuccessful” state.
  3. Otherwise, it continues the normal workflow and stores the environment details in an S3 bucket so that the start workflow can recreate the environment with this data.
  4. Trigger an MWAA DAG using an AWS Lambda function to export metadata to the Amazon S3 bucket (a sketch of this CLI invocation follows this list). This step uses the Step Functions wait-for-callback (task token) integration.
  5. Resume the workflow when the task token is returned by the MWAA DAG.
  6. Delete the Amazon MWAA environment.
  7. Wait to confirm the deletion.
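
The following is a hedged sketch of how a Lambda function could trigger the export DAG through the MWAA CLI API. The environment and DAG names are placeholders, and the sample application additionally passes a Step Functions task token so the DAG can call back when the export completes.

import base64
import json
import os
import urllib.request

import boto3

MWAA_ENV_NAME = os.environ.get("MWAA_ENV_NAME", "my-mwaa-environment")  # placeholder

def handler(event, context):
    # Request a short-lived CLI token for the MWAA web server
    mwaa = boto3.client("mwaa")
    token = mwaa.create_cli_token(Name=MWAA_ENV_NAME)

    request = urllib.request.Request(
        url=f"https://{token['WebServerHostname']}/aws_mwaa/cli",
        data="dags trigger export_metadata".encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token['CliToken']}",
            "Content-Type": "text/plain",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())

    # The CLI API returns base64-encoded stdout and stderr
    return {
        "stdout": base64.b64decode(body["stdout"]).decode("utf-8"),
        "stderr": base64.b64decode(body["stderr"]).decode("utf-8"),
    }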

Start workflow

At a scheduled time, EventBridge Scheduler triggers the Step Functions state machine to recreate the MWAA environment. The steps in the state machine perform the following actions:

Start workflow

  1. Retrieve the environment details stored in the Amazon S3 bucket by the stop workflow.
  2. Create an MWAA environment with the same configuration as the original.
  3. Trigger an MWAA DAG through the Lambda function to restore the metadata from the S3 bucket to the newly created environment.

Cost savings

Consider a small MWAA environment in us-east-2 with a minimum of one worker, a maximum of one worker, and 1GB data storage. At the time of this writing, the monthly cost of the environment is $357.80. Let’s assume you use this environment between 6 am and 6 pm on weekdays.

The schedule in the env file of the sample application looks like:

MWAA_PAUSE_CRON_SCHEDULE='0 18 ? * MON-FRI *'
MWAA_RESUME_CRON_SCHEDULE='30 5 ? * MON-FRI *'

As MWAA environment creation takes between 20 and 30 minutes, MWAA_RESUME_CRON_SCHEDULE is set to 5:30 am so that the environment is available again by 6 am.
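
If you manage the schedules directly with the AWS SDK instead of the provided IaC templates, creating the pause schedule could look like the following hypothetical boto3 sketch. The state machine ARN and IAM role ARN are placeholders.

import boto3

scheduler = boto3.client("scheduler")

scheduler.create_schedule(
    Name="mwaa-pause-schedule",
    ScheduleExpression="cron(0 18 ? * MON-FRI *)",   # pause at 6 pm on weekdays
    FlexibleTimeWindow={"Mode": "OFF"},
    Target={
        "Arn": "arn:aws:states:us-east-2:123456789012:stateMachine:StopMwaaEnvironment",
        "RoleArn": "arn:aws:iam::123456789012:role/scheduler-invoke-stepfunctions",
    },
)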

Assuming 21 weekdays per month, the monthly cost of the environment is $123.48, approximately 65% less than running the environment continuously:

  • 21 weekdays * 12 hours * 0.49 USD per hour = $123.48

Additional considerations

The sample application only restores data at rest. Though the deletion process pauses all the DAGs before making the backup, it cannot stop running tasks or in-flight messages in the queue. It also does not back up tasks that are not in a completed state. This can result in a loss of task history for tasks that were running during the backup.

Over time, the metadata grows in size, which can increase latency in query performance. You can use a DAG as shown in the example to clean up the database regularly.

Avoid setting the catchup_by_default configuration flag to true in the environment settings or in the DAG definition unless it is required. The catchup feature runs all the DAG runs that were missed for any data interval. When the environment is created again, if the flag is true, it catches up on the missed DAG runs and can overload the environment.

Conclusion

Automating the deletion and recreation of Amazon MWAA environments is a powerful way to optimize cost and manage resources efficiently. By following the steps outlined in this blog post, you can ensure that your MWAA environment is deleted and recreated without losing any metadata or configuration. This allows you to deploy new code changes and updates more quickly and easily, without having to manually configure your environment each time.

The potential cost savings of running your MWAA environment for only 12 hours on weekdays are significant. The example shows how you can save up to 65% of your monthly costs by choosing this option. This makes it an attractive solution for organizations that are looking to reduce cost while maintaining a high level of performance.

Visit the samples repository to learn more about Amazon MWAA. It contains a wide variety of examples and templates that you can use to build your own applications.

For more serverless learning resources, visit Serverless Land.

Extending a serverless, event-driven architecture to existing container workloads

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/extending-a-serverless-event-driven-architecture-to-existing-container-workloads/

This post is written by Dhiraj Mahapatro, Principal Specialist SA, and Sascha Moellering, Principal Specialist SA, and Emily Shea, WW Lead, Integration Services.

Many serverless services are a natural fit for event-driven architectures (EDA), as events invoke them and only run when there is an event to process. When building in the cloud, many services emit events by default and have built-in features for managing events. This combination allows customers to build event-driven architectures easier and faster than ever before.

The insurance claims processing sample application in this blog series uses event-driven architecture principles and serverless services like AWS Lambda, AWS Step Functions, Amazon API Gateway, Amazon EventBridge, and Amazon SQS.

When building an event-driven architecture, it’s likely that you have existing services to integrate with the new architecture, ideally without needing to make significant refactoring changes to those services. As services communicate via events, extending applications to new and existing microservices is a key benefit of building with EDA. You can write those microservices in different programming languages or run them on different compute options.

This blog post walks through a scenario of integrating an existing, containerized service (a settlement service) to the serverless, event-driven insurance claims processing application described in this blog post.

Overview of sample event-driven architecture

The sample application uses a front-end to sign up a new user and allow the user to upload images of their car and driver’s license. Once signed up, they can file a claim and upload images of their damaged car. Previously, the application did not integrate with a settlement service for completing the claims and settlement process.

In this scenario, the settlement service is a brownfield application that runs Spring Boot 3 on Amazon ECS with AWS Fargate. AWS Fargate is a serverless, pay-as-you-go compute engine that lets you focus on building container applications without managing servers.

The Spring Boot application exposes a REST endpoint, which accepts a POST request. It applies settlement business logic and creates a settlement record in the database for a car insurance claim. Your goal is to make settlement work with the new EDA application that is designed for claims processing without re-architecting or rewriting. Customer, claims, fraud, document, and notification are the other domains that are shown as blue-colored boxes in the following diagram:

Reference architecture

Project structure

The application uses AWS Cloud Development Kit (CDK) to build the stack. With CDK, you get the flexibility to create modular and reusable constructs imperatively using your language of choice. The sample application uses TypeScript for CDK.

The following project structure enables you to build different bounded contexts. Event-driven architecture relies on the choreography of events between domains. The object-oriented programming (OOP) model of CDK helps you provision the infrastructure to separate the domain concerns while loosely coupling them via events.

You break the higher-level CDK constructs down into these corresponding domains:

Comparing domains

Application and infrastructure code are present in each domain. This project structure creates a seamless way to add new domains like settlement with its application and infrastructure code without affecting other areas of the business.

With the preceding structure, you can use the settlement-service.ts CDK construct inside claims-processing-stack.ts:

const settlementService = new SettlementService(this, "SettlementService", {
  bus,
});

The only information the SettlementService construct needs to work is the EventBridge custom event bus resource that is created in claims-processing-stack.ts.

To run the sample application, follow the setup steps in the sample application’s README file.

Existing container workload

The settlement domain provides a REST service to the rest of the organization. A Docker containerized Spring Boot application runs on Amazon ECS with AWS Fargate. The following sequence diagram shows the synchronous request-response flow from an external REST client to the service:

Settlement service

  1. An external REST client makes a POST /settlement call via an HTTP API in front of an internal Application Load Balancer (ALB).
  2. SettlementController.java delegates to SettlementService.java.
  3. SettlementService applies business logic and calls SettlementRepository for data persistence.
  4. SettlementRepository persists the item in the Settlement DynamoDB table.

A request to the HTTP API endpoint looks like:

curl --location <settlement-api-endpoint-from-cloudformation-output> \
--header 'Content-Type: application/json' \
--data '{
  "customerId": "06987bc1-1234-1234-1234-2637edab1e57",
  "claimId": "60ccfe05-1234-1234-1234-a4c1ee6fcc29",
  "color": "green",
  "damage": "bumper_dent"
}'

The response from the API call is:

API response

You can learn more about optimizing Spring Boot applications on AWS Fargate here.

Extending container workload for events

To integrate the settlement service, you must update the service to receive and emit events asynchronously. The core logic of the settlement service remains the same. When you file a claim, upload damaged car images, and the application detects no document fraud, the settlement domain subscribes to the Fraud.Not.Detected event and applies its business logic. The settlement service then emits an event after applying the business logic.

The following sequence diagram shows a new interface in settlement to work with EDA. The settlement service subscribes to events that a producer emits. Here, the event producer is the fraud service that puts an event in an EventBridge custom event bus.

Sequence diagram

  1. Producer emits Fraud.Not.Detected event to EventBridge custom event bus.
  2. EventBridge evaluates the rules provided by the settlement domain and sends the event payload to the target SQS queue.
  3. SubscriberService.java polls for new messages in the SQS queue.
  4. On receiving a message, it transforms the message body into an input object that is accepted by SettlementService.
  5. It then delegates the call to SettlementService, similar to how SettlementController works in the REST implementation.
  6. SettlementService applies the business logic. The flow is the same as steps 7 to 10 of the REST use case.
  7. On receiving the response from the SettlementService, the SubscriberService transforms the response to publish an event back to the event bus with the event type as Settlement.Finalized.

The rest of the architecture consumes this Settlement.Finalized event.

Using EventBridge schema registry and discovery

Schema enforces a contract between a producer and a consumer. A consumer expects the exact structure of the event payload every time an event arrives. EventBridge provides schema registry and discovery to maintain this contract. The consumer (the settlement service) can download the code bindings and use them in the source code.

Enable schema discovery in EventBridge before downloading the code bindings and using them in your repository. The code bindings provide a marshaller that unmarshals the incoming event from the SQS queue to a plain old Java object (POJO) FraudNotDetected.java. You download the code bindings using your IDE of choice. AWS Toolkit for IntelliJ makes it convenient to download and use them.

Download code bindings

The final architecture for the settlement service with REST and event-driven architecture looks like:

Final architecture

Transition to become fully event-driven

With the new capability to handle events, the Spring Boot application now supports both the REST endpoint and the event-driven architecture by running the same business logic through different interfaces. In this example scenario, as the event-driven architecture matures and the rest of the organization adopts it, the need for the POST endpoint to save a settlement may diminish. In the future, you can deprecate the endpoint and fully rely on polling messages from the SQS queue.

You start with using an ALB and Fargate service CDK ECS pattern:

const loadBalancedFargateService = new ecs_patterns.ApplicationLoadBalancedFargateService(
  this,
  "settlement-service",
  {
    cluster: cluster,
    taskImageOptions: {
      image: ecs.ContainerImage.fromDockerImageAsset(asset),
      environment: {
        "DYNAMODB_TABLE_NAME": this.table.tableName
      },
      containerPort: 8080,
      logDriver: new ecs.AwsLogDriver({
        streamPrefix: "settlement-service",
        mode: ecs.AwsLogDriverMode.NON_BLOCKING,
        logRetention: RetentionDays.FIVE_DAYS,
      })
    },
    memoryLimitMiB: 2048,
    cpu: 1024,
    publicLoadBalancer: true,
    desiredCount: 2,
    listenerPort: 8080
  });

To adapt to EDA, you update the resources to retrofit the SQS queue to receive messages and EventBridge to put events. Add new environment variables to the ApplicationLoadBalancedFargateService resource:

environment: {
  "SQS_ENDPOINT_URL": queue.queueUrl,
  "EVENTBUS_NAME": props.bus.eventBusName,
  "DYNAMODB_TABLE_NAME": this.table.tableName
}

Grant the Fargate task permission to put events in the custom event bus and consume messages from the SQS queue:

props.bus.grantPutEventsTo(loadBalancedFargateService.taskDefinition.taskRole);
queue.grantConsumeMessages(loadBalancedFargateService.taskDefinition.taskRole);

When you transition the settlement service to become fully event-driven, you do not need the HTTP API endpoint and ALB anymore, as SQS is the source of events.

A better alternative is to use the QueueProcessingFargateService ECS pattern for the Fargate service. The pattern provides auto scaling based on the number of visible messages in the SQS queue, in addition to CPU utilization. In the following example, you can also add two capacity provider strategies while setting up the Fargate service: FARGATE_SPOT and FARGATE. This means that for every task run using FARGATE, two tasks use FARGATE_SPOT. This can help optimize cost.

const queueProcessingFargateService = new ecs_patterns.QueueProcessingFargateService(this, 'Service', {
  cluster,
  memoryLimitMiB: 1024,
  cpu: 512,
  queue: queue,
  image: ecs.ContainerImage.fromDockerImageAsset(asset),
  desiredTaskCount: 2,
  minScalingCapacity: 1,
  maxScalingCapacity: 5,
  maxHealthyPercent: 200,
  minHealthyPercent: 66,
  environment: {
    "SQS_ENDPOINT_URL": queueUrl,
    "EVENTBUS_NAME": props?.bus.eventBusName,
    "DYNAMODB_TABLE_NAME": tableName
  },
  capacityProviderStrategies: [
    {
      capacityProvider: 'FARGATE_SPOT',
      weight: 2,
    },
    {
      capacityProvider: 'FARGATE',
      weight: 1,
    },
  ],
});

This pattern abstracts the automatic scaling behavior of the Fargate service based on the queue depth.

Running the application

To test the application, follow How to use the Application after the initial setup. Once complete, you see that the browser receives a Settlement.Finalized event:

{
  "version": "0",
  "id": "e2a9c866-cb5b-728c-ce18-3b17477fa5ff",
  "detail-type": "Settlement.Finalized",
  "source": "settlement.service",
  "account": "123456789",
  "time": "2023-04-09T23:20:44Z",
  "region": "us-east-2",
  "resources": [],
  "detail": {
    "settlementId": "377d788b-9922-402a-a56c-c8460e34e36d",
    "customerId": "67cac76c-40b1-4d63-a8b5-ad20f6e2e6b9",
    "claimId": "b1192ba0-de7e-450f-ac13-991613c48041",
    "settlementMessage": "Based on our analysis on the damage of your car per claim id b1192ba0-de7e-450f-ac13-991613c48041, your out-of-pocket expense will be $100.00."
  }
}

Cleaning up

The stack creates a custom VPC and other related resources. Be sure to clean up resources after usage to avoid the ongoing cost of running these services. To clean up the infrastructure, follow the clean-up steps shown in the sample application.

Conclusion

The blog explains a way to integrate existing container workload running on AWS Fargate with a new event-driven architecture. You use EventBridge to decouple different services from each other that are built using different compute technologies, languages, and frameworks. Using AWS CDK, you gain the modularity of building services decoupled from each other.

This blog shows an evolutionary architecture that allows you to modernize existing container workloads with minimal changes that still give you the additional benefits of building with serverless and EDA on AWS.

The major difference between the event-driven approach and the REST approach is that the producer is unblocked once it emits an event. The settlement domain, which subscribes to that event, is loosely coupled from the producer. The business functionality remains intact, and no significant refactoring or re-architecting effort is required. With these agility gains, you may get to market faster.

The sample application shows the implementation details and steps to set up, run, and clean up the application. The app uses ECS Fargate for a domain service, but the approach is not limited to Fargate. You can bring container-based applications running on Amazon EKS into an event-driven architecture in a similar way.

Learn more about event-driven architecture on Serverless Land.

Patterns for building an API to upload files to Amazon S3

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/patterns-for-building-an-api-to-upload-files-to-amazon-s3/

This blog is written by Thomas Moore, Senior Solutions Architect and Josh Hart, Senior Solutions Architect.

Applications often require a way for users to upload files. The traditional approach is to use an SFTP service (such as the AWS Transfer Family), but this requires specific clients and management of SSH credentials. Modern applications instead need a way to upload to Amazon S3 via HTTPS. Typical file upload use cases include:

  • Sharing datasets between businesses as a direct replacement for traditional FTP workflows.
  • Uploading telemetry and logs from IoT devices and mobile applications.
  • Uploading media such as videos and images.
  • Submitting scanned documents and PDFs.

If you have control over the application that sends the uploads, then you can integrate with the AWS SDK from within the browser with a framework such as AWS Amplify. To learn more, read Allowing external users to securely and directly upload files to Amazon S3.

Often you must provide end users direct access to upload files via an endpoint. You could build a bespoke service for this purpose, but this results in more code to build, maintain, and secure.

This post explores three different approaches to securely upload content to an Amazon S3 bucket via HTTPS without the need to build a dedicated API or client application.

Using Amazon API Gateway as a direct proxy

The simplest option is to use API Gateway to proxy an S3 bucket. This allows you to expose S3 objects as REST APIs without additional infrastructure. Configuring an S3 integration in API Gateway also allows you to manage authentication, authorization, caching, and rate limiting more easily.

This pattern allows you to implement an authorizer at the API Gateway level and requires no changes to the client application or caller. The limitation with this approach is that API Gateway has a maximum request payload size of 10 MB. For step-by-step instructions to implement this pattern, see this knowledge center article.

This is an example implementation (you can deploy this from Serverless Land):

Using Amazon API Gateway as a direct proxy

Using API Gateway with presigned URLs

The second pattern uses S3 presigned URLs, which allow you to grant access to S3 objects for a specific period, after which the URL expires. This time-bound access helps prevent unauthorized access to S3 objects and provides an additional layer of security.

Presigned URLs can also be used to control access to specific versions or ranges of bytes within an object. This granularity allows you to fine-tune access permissions for different users or applications, and ensures that only authorized parties have access to the required data.

This avoids the 10 MB limit of API Gateway because the API is only used to generate the presigned URL, which is then used by the caller to upload directly to S3. Presigned URLs are straightforward to generate and use programmatically, but this pattern requires the client to make two separate requests: one to generate the URL and one to upload the object. To learn more, read Uploading to Amazon S3 directly from a web or mobile application.
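
A minimal sketch of the Lambda function behind API Gateway that generates the presigned URL could look like the following. The bucket name, key prefix, and expiry are assumptions for illustration.

import json
import os
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = os.environ.get("UPLOAD_BUCKET", "my-upload-bucket")  # placeholder

def handler(event, context):
    # Generate a unique object key and a time-limited URL for a direct PUT to S3
    object_key = f"uploads/{uuid.uuid4()}"
    url = s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": BUCKET, "Key": object_key},
        ExpiresIn=300,  # URL is valid for five minutes
    )
    return {
        "statusCode": 200,
        "body": json.dumps({"uploadUrl": url, "key": object_key}),
    }

The caller then issues an HTTP PUT of the file body directly to the returned uploadUrl, so the object never passes through API Gateway.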

Using API Gateway with presigned URLs

This pattern is limited by the 5GB maximum request size of the S3 PutObject API call. One way to work around this limit is to use S3 multipart uploads. This requires the client to split the payload into multiple segments and send a separate request for each part.

This adds some complexity to the client and is used by libraries such as AWS Amplify that abstract away the multipart upload implementation. This allows you to upload objects up to 5TB in size. For more details, see uploading large objects to Amazon S3 using multipart upload and transfer acceleration.
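
The server-side portion of that workaround could look like the following hedged sketch, which starts a multipart upload and returns a presigned URL for each part. Part sizing, error handling, and authentication are omitted, and the bucket name is a placeholder.

import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # placeholder

def start_multipart_upload(key: str, part_count: int) -> dict:
    # Create the multipart upload and presign one URL per part
    upload = s3.create_multipart_upload(Bucket=BUCKET, Key=key)
    part_urls = [
        s3.generate_presigned_url(
            ClientMethod="upload_part",
            Params={
                "Bucket": BUCKET,
                "Key": key,
                "UploadId": upload["UploadId"],
                "PartNumber": part_number,
            },
            ExpiresIn=3600,
        )
        for part_number in range(1, part_count + 1)
    ]
    return {"uploadId": upload["UploadId"], "partUrls": part_urls}

def complete_multipart_upload(key: str, upload_id: str, parts: list) -> None:
    # parts is a list of {"ETag": ..., "PartNumber": ...} values returned by each part upload
    s3.complete_multipart_upload(
        Bucket=BUCKET, Key=key, UploadId=upload_id, MultipartUpload={"Parts": parts}
    )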

An example of this pattern is available on Serverless Land.

Using Amazon CloudFront with Lambda@Edge

The final pattern leverages Amazon CloudFront instead of API Gateway. CloudFront is primarily a content delivery network (CDN) that caches and delivers content from an S3 bucket or other origin. However, CloudFront can also be used to upload data to an S3 bucket. Without any additional configuration, this would essentially make the S3 bucket publicly writable. To secure the solution so that only authenticated users can upload objects, you can use a Lambda@Edge function to verify the users’ permissions.

The maximum size of the object that you can upload with this pattern is 5GB. If you need to upload files larger than 5GB, then you must use multipart uploads. To implement this, deploy the example Serverless Land pattern:

Using Amazon CloudFront with Lambda@Edge

This pattern uses an origin access identity (OAI) to limit access to the S3 bucket to only come from CloudFront. The default OAI has s3:GetObject permission, which is changed to s3:PutObject to explicitly allow uploads and prevent read operations:

{
    "Version": "2008-10-17",
    "Id": "PolicyForCloudFrontPrivateContent",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity <origin access identity ID>"
            },
            "Action": [
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3::: DOC-EXAMPLE-BUCKET/*"
        }
    ]
}

As CloudFront is not used to cache content, the managed cache policy is set to CachingDisabled.

There are multiple options for implementing the authorization in the Lambda@Edge function. The sample repository uses an Amazon Cognito authorizer that validates a JSON Web Token (JWT) sent as an HTTP authorization header.

Using a JWT is secure as it implies this token is dynamically vended by an Identity Provider, such as Amazon Cognito. This does mean that the caller needs a mechanism to obtain this JWT token. You are in control of this authorizer function, and the exact implementation depends on your use-case. You could instead use an API Key or integrate with an alternate identity provider such as Auth0 or Okta.

Lambda@Edge functions do not currently support environment variables. This means that the configuration parameters are dynamically resolved at runtime. In the example code, AWS Systems Manager Parameter Store is used to store the Amazon Cognito user pool ID and app client ID that is required for the token verification. For more details on how to choose where to store your configuration parameters, see Choosing the right solution for AWS Lambda external parameters.

To verify the JWT token, the example code uses the aws-jwt-verify package. This supports JWTs issued by Amazon Cognito and third-party identity providers.

The Serverless Land pattern uses an Amazon Cognito identity provider to do authentication in the Lambda@Edge function. This code snippet shows an example using a pre-shared key for basic authorization:

def lambda_handler(event, context):

    print(event)

    # The viewer request event contains the incoming request from the client
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # CloudFront lowercases header names and stores each header as a list of key/value pairs
    if 'authorization' not in headers or not headers['authorization']:
        return unauthorized()

    if headers['authorization'][0]['value'] == 'my-secret-key':
        # Returning the request object allows CloudFront to continue processing the upload
        return request

    return unauthorized()

def unauthorized():
    response = {
        'status': '401',
        'statusDescription': 'Unauthorized',
        'body': 'Unauthorized'
    }
    return response

The Lambda function is associated with the CloudFront distribution by creating a Lambda trigger. The CloudFront event is set to Viewer request, meaning the function is invoked before CloudFront forwards the PUT request from the client.

Add trigger

The solution can be tested with an API testing client, such as Postman. In Postman, issue a PUT request to https://<your-cloudfront-domain>/<object-name> with a binary payload as the body. You receive a 401 Unauthorized response.

Postman response

Next, add the Authorization header with a valid token and submit the request again. For more details on how to obtain a JWT from Amazon Cognito, see the README in the repository. Now the request works and you receive a 200 OK message.

To troubleshoot, the Lambda function logs to Amazon CloudWatch Logs. For Lambda@Edge functions, look for the logs in the Region closest to the request, and not the same Region as the function.

The Lambda@Edge function in this example performs basic authorization. It validates the user has access to the requested resource. You can perform any custom authorization action here. For example, in a multi-tenant environment, you could restrict the prefix so that specific tenants only have permission to write to their own prefix, and validate the requested object name in the function.

Additionally, you could implement controls traditionally performed by API Gateway, such as throttling by tenant or user. Another use for the function is to validate the file type. If users can only upload images, you could validate the content-length to ensure the images are a certain size and the file extension is correct.
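
As a hypothetical illustration, the following helper extends the Lambda@Edge handler to restrict writes to a tenant-specific prefix and reject oversized or unexpected file types. In practice, the tenant identifier would come from the validated JWT rather than being passed in directly.

MAX_SIZE_BYTES = 5 * 1024 * 1024  # assume images must be under 5 MB

def validate_request(request, tenant_id):
    uri = request["uri"]          # for example: /tenant-a/image.jpg
    headers = request["headers"]

    # Tenants may only write to their own prefix
    if not uri.startswith(f"/{tenant_id}/"):
        return False

    # Reject uploads larger than the allowed size
    content_length = headers.get("content-length", [{"value": "0"}])[0]["value"]
    if int(content_length) > MAX_SIZE_BYTES:
        return False

    # Only allow image file extensions
    return uri.lower().endswith((".jpg", ".jpeg", ".png"))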

Conclusion

Which option you choose depends on your use case. This table summarizes the patterns discussed in this blog post:

| | API Gateway as a proxy | Presigned URLs with API Gateway | CloudFront with Lambda@Edge |
| Max Object Size | 10 MB | 5 GB (5 TB with multipart upload) | 5 GB |
| Client Complexity | Single HTTP Request | Multiple HTTP Requests | Single HTTP Request |
| Authorization Options | Amazon Cognito, IAM, Lambda Authorizer | Amazon Cognito, IAM, Lambda Authorizer | Lambda@Edge |
| Throttling | API Key throttling | API Key throttling | Custom throttling |

Each of the available methods has its strengths and weaknesses and the choice of which one to use depends on your specific needs. The maximum object size supported by S3 is 5 TB, regardless of which method you use to upload objects. Additionally, some methods have more complex configuration that requires more technical expertise. Considering these factors with your specific use-case can help you make an informed decision on the best API option for uploading to S3.

For more serverless learning resources, visit Serverless Land.

Building private serverless APIs with AWS Lambda and Amazon VPC Lattice

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-private-serverless-apis-with-aws-lambda-and-amazon-vpc-lattice/

This post was written by Josh Kahn, Tech Leader, Serverless.

Amazon VPC Lattice is a new, generally available application networking service that simplifies connectivity between services. Builders can connect, secure, and monitor services on instances, containers, or serverless compute in a simplified and consistent manner.

VPC Lattice supports AWS Lambda functions as both a target and a consumer of services. This blog post explores how to incorporate VPC Lattice into your serverless workloads to simplify private access to HTTP-based APIs built with Lambda.

Overview

VPC Lattice is an application networking service that enables discovery and connectivity of services across VPCs and AWS accounts. VPC Lattice includes features that allow builders to define policies for network access, traffic management, and monitoring. It also supports custom domain names for private endpoints.

VPC Lattice is composed of several key components:

  • Service network – a logical grouping mechanism for a collection of services on which you can apply common policies. Associate one or more VPCs to allow access from services in the VPC to the service network.
  • Service – a unit of software that fulfills a specific task or function. Services using VPC Lattice can run on instances, containers, or serverless compute. This post focuses on services built with Lambda functions.
  • Target group – in a serverless application, a Lambda function that performs business logic in response to a request. Routing rules within the service route requests to the appropriate target group.
  • Auth policy – an AWS Identity and Access Management (IAM) resource policy that can be associated with a service network and a service that defines access to those services.

VPC Lattice enables connectivity across VPC and account boundaries, while alleviating the complexity of the underlying networking. It supports HTTP/HTTPS and gRPC protocols, though gRPC is not currently applicable for Lambda target groups.

VPC Lattice and Lambda

Lambda is one of the options to build VPC Lattice services. The AWS Lambda console supports VPC Lattice as a trigger, similar to previously existing triggers such as Amazon API Gateway and Amazon EventBridge. You can also connect VPC Lattice as an event source using infrastructure as code, such as AWS CloudFormation and Terraform.

To configure VPC Lattice as a trigger for a Lambda function in the Console, navigate to the desired function and select the Configuration tab. Select the Triggers menu on the left and then choose Add trigger.

The trigger configuration wizard allows you to define a new VPC Lattice service provided by the Lambda function or to add to an existing service. When adding to an existing service, the wizard allows configuration of path-based routing that sends requests to the target group that includes the function. Path-based and other routing mechanisms available from VPC Lattice are useful in migration scenarios.

This example shows creating a new service. Provide a unique name for the service and select the desired VPC Lattice service network. If you have not created a service network, follow the link to create a new service network in the VPC Console (to create a new service network, read the VPC Lattice documentation).

The listener configuration allows you to configure the protocol and port on which the service is accessible. HTTPS (port 443) is the default configuration, though you can also configure the listener for HTTP (port 80). Note that configuring the listener for HTTP does not change the behavior of Lambda: it is still invoked by VPC Lattice over an HTTPS endpoint, but the service endpoint is available as HTTP. Choose Add to complete setup.

In addition to configuring the VPC Lattice service and target group, the Lambda wizard also adds a resource policy to the function that allows the VPC Lattice target group to invoke the function.

Add trigger

VPC Lattice integration

When a client sends a request to a VPC Lattice service backed by a Lambda target group, VPC Lattice synchronously invokes the target Lambda function. During a synchronous invocation, the client waits for the result of the function and all retry handling is performed by the client. VPC Lattice has an idle timeout of one minute and connection timeout of ten minutes to both the client and target.

The event payload received by the Lambda function when invoked by VPC Lattice is similar to the following example. Note that base64 encoding is dependent on the content type.

{
    "body": "{ "\userId\": 1234, \"orderId\": \"5C71D3EB-3B8A-457B-961D\" }",
    "headers": {
        "accept": "application/json, text/plain, */*",
        "content-length": "156",
        "user-agent": "axios/1.3.4",
        "host": "myvpclattice-service-xxxx.xxxx.vpc-lattice-svcs.us-east-2.on.aws",
        "x-forwarded-for": "10.0.129.151"
    },
    "is_base64_encoded": false,
    "method": "PUT",
    "query_string_parameters": {
        "action": "add"
    },
    "raw_path": "/points?action=add"
}

The response payload returned by the Lambda function includes a status code, headers, base64 encoding, and an optional body as shown in the following example. A response payload that does not meet the required specification results in an error. To return binary content, you must set isBase64Encoded to true.

{
    "isBase64Encoded": false,
    "statusCode": 200,
    "statusDescription": "200 OK",
    "headers": {
        "Set-Cookie": "cookies",
        "Content-Type": "application/json"
    },
    "body": "Hello from Lambda (optional)"
}

For more details on the integration between VPC Lattice and Lambda, visit the Lambda documentation.
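
A minimal Lambda handler that consumes the event fields shown above and returns a response in the required format could look like this sketch.

import base64
import json

def handler(event, context):
    # Decode the body if VPC Lattice marked it as base64 encoded
    body = event.get("body", "")
    if event.get("is_base64_encoded"):
        body = base64.b64decode(body).decode("utf-8")

    payload = json.loads(body) if body else {}
    action = (event.get("query_string_parameters") or {}).get("action")

    # Return the status code, headers, and body that VPC Lattice expects
    return {
        "isBase64Encoded": False,
        "statusCode": 200,
        "statusDescription": "200 OK",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"action": action, "received": payload}),
    }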

Calling VPC Lattice services from Lambda

VPC Lattice services support connectivity over HTTP/HTTPS and gRPC protocols as well as open access or authorization using IAM. To call a VPC Lattice service, the Lambda function must be attached to a VPC that is associated with a VPC Lattice service network.

While a function that calls a VPC Lattice service must be associated with an appropriate VPC, a Lambda function that is part of a Lattice service target group does not need to be attached to a VPC. Remember that Lambda functions are always invoked via an AWS endpoint with access controlled by AWS IAM.

Calls to a VPC Lattice service are similar to sending a request to other HTTP/HTTPS services. VPC Lattice allows builders to define an optional auth policy to enforce authentication and perform context-specific authorization and implement network-level controls with security groups. Callers of the service must meet networking and authorization requirements to access the service. VPC Lattice blocks traffic if it does not explicitly meet all conditions before your function is invoked.

A Lambda function that calls a VPC Lattice service must have explicit permission to invoke that service, unless the auth type for the service is NONE. You provide that permission through a policy attached to the Lambda function’s execution role, for example:

{
    "Action": "vpc-lattice-svcs:Invoke",
    "Resource": "arn:aws:vpc-lattice:us-east-2:123456789012:service/svc-123abc/*",
    "Effect": "Allow"
}

If the auth policy associated with your service network or service requires authenticated requests, any requests made to that service must contain a valid request signature computed using Signature Version 4 (SigV4). An example of computing a SigV4 signature can be found in the VPC Lattice documentation. VPC Lattice does not support payload signing at this time. In TypeScript, you can sign a request using the AWS SDK and Axios library as follows:

import { SignatureV4 } from "@aws-sdk/signature-v4";
import { Sha256 } from "@aws-crypto/sha256-js";
import axios from "axios";

const endpointUrl = new URL(VPC_LATTICE_SERVICE_ENDPOINT);
const sigv4 = new SignatureV4({
    service: "vpc-lattice-svcs",
    region: process.env.AWS_REGION!,
    credentials: {
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
        sessionToken: process.env.AWS_SESSION_TOKEN
    },
    sha256: Sha256
});

const signedRequest = await sigv4.sign({
    method: "PUT",
    hostname: endpointUrl.host,
    path: endpointUrl.pathname,
    protocol: endpointUrl.protocol,
    headers: {
        'Content-Type': 'application/json',
        host: endpointUrl.hostname,
        // Include following header as VPC Lattice does not support signed payloads
        "x-amz-content-sha256": "UNSIGNED-PAYLOAD"
    }
});

const { data } = await axios({
    ...signedRequest,
    data: {
        // some data
    },
    url: VPC_LATTICE_SERVICE_ENDPOINT
});
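
If your caller is a Python Lambda function, the same signing flow can be sketched with botocore's SigV4 signer, as shown below. The service endpoint is a placeholder, and the payload is left unsigned because VPC Lattice does not support signed payloads.

import os

import boto3
import urllib3
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

ENDPOINT = "https://myservice-xxxx.xxxx.vpc-lattice-svcs.us-east-2.on.aws/points?action=add"  # placeholder

def call_lattice_service(payload: bytes) -> urllib3.HTTPResponse:
    # Build and sign the request with SigV4 for the vpc-lattice-svcs service
    request = AWSRequest(
        method="PUT",
        url=ENDPOINT,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "x-amz-content-sha256": "UNSIGNED-PAYLOAD",
        },
    )
    credentials = boto3.Session().get_credentials()
    SigV4Auth(credentials, "vpc-lattice-svcs", os.environ["AWS_REGION"]).add_auth(request)

    # Send the signed request, reusing the signed headers
    return urllib3.PoolManager().request(
        "PUT", ENDPOINT, body=payload, headers=dict(request.headers)
    )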

VPC Lattice provides several layers of security controls, including network-level and auth policies, that allow (or deny) access from a client to your service. These controls can be implemented at the service network, applying those controls across all services in the network.

Connecting to any VPC Lattice service

VPC Lattice supports services built using Amazon EKS and Amazon EC2 in addition to Lambda. Calling services built using these other compute options looks exactly the same to the caller as the preceding sample. VPC Lattice provides an endpoint that abstracts how the service itself is actually implemented.

A Lambda function configured to access resources in a VPC can potentially access VPC Lattice services that are part of the service network associated with that VPC. IAM permissions, the auth policy associated with the service, and security groups may also impact whether the function can invoke the service (see VPC Lattice documentation for details on securing your services).

Services deployed to an Amazon EKS cluster can also invoke Lambda functions exposed as VPC Lattice services using native Kubernetes semantics. They can use either the VPC Lattice-generated domain name or a configured custom domain name to invoke the Lambda function instead of API Gateway or an Application Load Balancer (ALB). Refer to this blog post on the AWS Container Blog for details on how an Amazon EKS service invokes a VPC Lattice service with access control enabled.

Building private serverless APIs

With the launch of VPC Lattice, AWS now offers several options to build serverless APIs accessible only within your customer VPC. These options include API Gateway, ALB, and VPC Lattice. Each of these services offers a unique set of features and trade-offs that may make one a better fit for your workload than others.

Private APIs with API Gateway provide a rich set of features, including throttling, caching, and API keys. API Gateway also offers many authorization and routing options. Detailed networking and DNS knowledge may be required in complex environments. Both network-level and resource policy controls are available to control access, and the OpenAPI specification allows schema sharing.

Application Load Balancer provides flexibility and a rich set of routing options, including to a variety of targets. ALB also can offer a static IP address via AWS Global Accelerator. Detailed networking knowledge is required to configure cross-VPC/account connectivity. ALB relies on network-level controls.

Service networks in VPC Lattice simplify access to services on EC2, EKS, and Lambda across VPCs and accounts without requiring detailed knowledge of networking and DNS. VPC Lattice provides a centralized means of managing access control and guardrails for service-to-service communication. VPC Lattice also readily supports custom domain names and routing features (path, method, header) that enable customers to build complex private APIs without the complexity of managing networking. VPC Lattice can be used to provide east-west interservice communication in combination with API Gateway and AWS AppSync to provide public endpoints for your services.

Conclusion

We’re excited about the simplified connectivity now available with VPC Lattice. Builders can focus on creating customer value and differentiated features instead of complex networking in much the same way that Lambda allows you to focus on writing code. If you are interested in learning more about VPC Lattice, we recommend the VPC Lattice User Guide.

To learn more about serverless, visit Serverless Land for a wide array of reusable patterns, tutorials, and learning materials.

Understanding techniques to reduce AWS Lambda costs in serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/understanding-techniques-to-reduce-aws-lambda-costs-in-serverless-applications/

This post is written by Josh Kahn, Tech Leader, Serverless, and Chloe Jeon, Senior PMT-ES, Lambda.

Serverless applications can lower the total cost of ownership (TCO) when compared to a server-based cloud execution model because they effectively shift operational responsibilities, such as managing servers, to a cloud provider. Deloitte research on serverless TCO with Fortune 100 clients across industries shows that serverless applications can offer up to 57% cost savings compared with server-based solutions.

Serverless applications can offer lower costs in:

  • Initial development: Serverless enables builders to be more agile, deliver features more rapidly, and focus on differentiated business logic.
  • Ongoing maintenance and infrastructure: Serverless shifts operational burden to AWS. Ongoing maintenance tasks, including patching and operating system updates, are handled by AWS.

This post focuses on options available to reduce direct AWS costs when building serverless applications. AWS Lambda is often the compute layer in these workloads and may comprise a meaningful portion of the overall cost.

To help optimize your Lambda-related costs, this post discusses some of the most commonly used cost optimization techniques with an emphasis on configuration changes over code updates. This post is intended for architects and developers new to building with serverless.

Building with serverless makes both experimentation and iterative improvement easier. All of the techniques described here can be applied before application development, or after you have deployed your application to production. The techniques are ordered roughly by applicability: the first can apply to any workload, while the last applies to a smaller number of workloads.

Right-sizing your Lambda functions

Lambda uses a pay-per-use cost model that is driven by three metrics:

  • Memory configuration: allocated memory from 128MB to 10,240MB. CPU and other resources available to the function are allocated proportionally to memory.
  • Function duration: the time your function runs, measured in milliseconds.
  • Number of invocations: the number of times your function runs.

Over-provisioning memory is one of the primary drivers of increased Lambda cost. This is particularly acute among builders new to Lambda who are used to provisioning hosts running multiple processes. Lambda scales such that each execution environment of a function only handles one request at a time. Each execution environment has access to fixed resources (memory, CPU, storage) to complete work on the request.

By right-sizing the memory configuration, the function has the resources to complete its work and you are paying for only the needed resources. While you also have direct control of function duration, this is a less effective cost optimization to implement. The engineering costs to create a few milliseconds of savings may outweigh the cost savings. Depending on the workload, the number of times your function is invoked may be outside your control. The next section discusses a technique to reduce the number of invocations for some types of workloads.

Memory configuration is accessible via the AWS Management Console or your favorite infrastructure as code (IaC) option. The memory configuration setting defines allocated memory, not memory used by your function. Right-sizing memory is an adjustment that can reduce the cost (or increase the performance) of your function. However, lowering the function memory may not always result in cost savings. Lowering function memory means lowering the available CPU for the Lambda function, which could increase the function duration, resulting in either no cost savings or higher cost. It is important to identify the optimal memory configuration for cost savings while preserving performance.

AWS offers two approaches to right-sizing memory allocation: AWS Lambda Power Tuning and AWS Compute Optimizer.

AWS Lambda Power Tuning is an open-source tool that can be used to empirically find the optimal memory configuration for your function by trading off cost against execution time. The tool runs multiple concurrent versions of your function against mock input data at different memory allocations. The result is a chart that can help you find the “sweet spot” between cost and duration/performance. Depending on the workload, you may prioritize one over the other. AWS Lambda Power Tuning is a good choice for new functions and can also help select between the two instruction set architectures offered by Lambda.

AWS Power Tuning Tool

AWS Compute Optimizer uses machine learning to recommend an optimal memory configuration based on historical data. Compute Optimizer requires that your function be invoked at least 50 times over the trailing 14 days to provide a recommendation based on past utilization, so is most effective once your function is in production.

Both Lambda Power Tuning and Compute Optimizer help derive the right-sized memory allocation for your function. Use this value to update the configuration of your function using the AWS Management Console or IaC.

This post includes AWS Serverless Application Model (AWS SAM) sample code throughout to demonstrate how to implement optimizations. You can also use AWS Cloud Development Kit (AWS CDK), Terraform, Serverless Framework, and other IaC tools to implement the same changes.

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: nodejs18.x
    Handler: app.handler
    MemorySize: 1024   # Set memory configuration to optimize for cost or performance

Setting a realistic function timeout

Lambda functions are configured with a maximum time that each invocation can run, up to 15 minutes. Setting an appropriate timeout can be beneficial in containing costs of your Lambda-based application. Unhandled exceptions, blocking actions (for example, opening a network connection), slow dependencies, and other conditions can lead to longer-running functions or functions that run until the configured timeout. Proper timeouts are the best protection against both slow and erroneous code: beyond a certain point, the work the function is performing, and the per-millisecond cost of that work, is wasted.

Our recommendation is to set a timeout of less than 29 seconds for all synchronous invocations, or those in which the caller is waiting for a response. Longer timeouts are appropriate for asynchronous invocations, but consider timeouts longer than 1 minute to be an exception that requires review and testing.
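
In AWS SAM, the timeout is a single property on the function resource. As a minimal sketch (the 10-second value is illustrative, not a recommendation for every workload):

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: nodejs18.x
    Handler: app.handler
    Timeout: 10   # Maximum invocation duration in seconds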

Using Graviton

Lambda offers two instruction set architectures in most AWS Regions: x86 and arm64.

Choosing Graviton can save money in two ways. First, your functions may run more efficiently due to the Graviton2 architecture. Second, you may pay less for the time that they run. Lambda functions powered by Graviton2 are designed to deliver up to 19 percent better performance at 20 percent lower cost. Consider starting with Graviton when developing new Lambda functions, particularly those that do not require natively compiled binaries.

If your function relies on native compiled binaries or is packaged as a container image, you must rebuild to move between arm64 and x86. Lambda layers may also include dependencies targeted for one architecture or the other. We encourage you to review dependencies and test your function before changing the architecture. The AWS Lambda Power Tuning tool also allows you to compare the price and performance of arm64 and x86 at different memory settings.

You can modify the architecture configuration of your function in the console or your IaC of choice. For example, in AWS SAM:

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Architectures:
      - arm64   # Set architecture to use Graviton2
    Runtime: nodejs18.x
    Handler: app.handler

Filtering incoming events

Lambda is integrated with over 200 event sources, including Amazon SQS, Amazon Kinesis Data Streams, Amazon DynamoDB Streams, Amazon Managed Streaming for Apache Kafka, and Amazon MQ. The Lambda service integrates with these event sources to retrieve messages and invokes your function as needed to process those messages.

When working with one of these event sources, builders can configure filters to limit the events sent to your function. This technique can greatly reduce the number of times your function is invoked depending on the number of events and specificity of your filters. When not using event filtering, the function must be invoked to first determine if an event should be processed before performing the actual work. Event filtering alleviates the need to perform this upfront check while reducing the number of invocations.

For example, you may only want a function to run when orders of over $200 are found in a message on a Kinesis data stream. You can configure an event filtering pattern using the console or IaC in a manner similar to memory configuration.

To implement the Kinesis stream filter using AWS SAM:

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: nodejs18.x
    Handler: app.handler
    Events:
      OrdersStream:   # Logical name for this event source mapping
        Type: Kinesis
        Properties:
          StartingPosition: LATEST
          Stream: "arn:aws:kinesis:us-east-1:0123456789012:stream/orders"
          FilterCriteria:
            Filters:
              - Pattern: '{ "data" : { "order" : { "value" : [{ "numeric": [">", 200] }] } } }'

If an event satisfies one of the event filters associated with the event source, Lambda sends the event to your function for processing. Otherwise, Lambda discards the event and treats it as successfully processed, without invoking the function.

If you are building or running a Lambda function that is invoked by one of the previously mentioned event sources, it’s recommended that you review the filtering options available. This technique requires no code changes to your Lambda function: even if the function performs a preprocessing check, that check still completes successfully once filtering is implemented.

To learn more, read Filtering event sources for AWS Lambda functions.

Avoiding recursion

You may be familiar with the programming concept of a recursive function: a function or routine that calls itself. Though rare, customers sometimes unintentionally build recursion into their architecture, so that a Lambda function continuously invokes itself.

The most common recursive pattern is between Lambda and Amazon S3. Actions in an S3 bucket can trigger a Lambda function, and recursion can occur when that Lambda function writes back to the same bucket.

Consider a use case in which a Lambda function is used to generate a thumbnail of user-submitted images. You configure the bucket to trigger the thumbnail generation function when a new object is put in the bucket. What happens if the Lambda function writes the thumbnail to the same bucket? The process starts anew and the Lambda function then runs on the thumbnail image itself. This is recursion and can lead to an infinite loop condition.

While there are multiple ways to protect against this condition, it’s best practice to use a second S3 bucket to store thumbnails. This approach minimizes changes to the architecture because you do not need to change the notification settings or the primary S3 bucket. To learn about other approaches, read Avoiding recursive invocation with Amazon S3 and AWS Lambda.

If you do encounter recursion in your architecture, set Lambda reserved concurrency to zero to stop the function from running. Allow minutes to hours before lifting the reserved concurrency cap: because S3 events are asynchronous invocations with automatic retries, you may continue to see recursion until you resolve the issue or all events have expired.
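
If you manage functions from the command line, this emergency stop can be applied with the AWS CLI; a sketch with a placeholder function name:

aws lambda put-function-concurrency \
  --function-name thumbnail-generator \
  --reserved-concurrent-executions 0

# Once the issue is resolved, remove the cap:
aws lambda delete-function-concurrency --function-name thumbnail-generator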

Conclusion

Lambda offers a number of techniques that you can use to minimize infrastructure costs whether you are just getting started with Lambda or have numerous functions already deployed in production. When combined with the lower costs of initial development and ongoing maintenance, serverless can offer a low total cost of ownership. Get hands-on with these techniques and more with the Serverless Optimization Workshop.

To learn more about serverless architectures, find reusable patterns, and keep up-to-date, visit Serverless Land.

Python 3.10 runtime now available in AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/python-3-10-runtime-now-available-in-aws-lambda/

This post is written by Suresh Poopandi, Senior Solutions Architect, Global Life Sciences.

AWS Lambda now supports Python 3.10 as both a managed runtime and container base image. With this release, Python developers can now take advantage of new features and improvements introduced in Python 3.10 when creating serverless applications on Lambda.

Enhancements in Python 3.10 include structural pattern matching, improved error messages, and performance enhancements. This post outlines some of the benefits of Python 3.10 and how to use this version in your Lambda functions.

AWS has also published a preview Lambda container base image for Python 3.11. Customers can use this image to get an early look at Python 3.11 support in Lambda. This image is subject to change and should not be used for production workloads. To provide feedback on this image, and for future updates on Python 3.11 support, see https://github.com/aws/aws-lambda-base-images/issues/62.

What’s new in Python 3.10

Thanks to its simplicity, readability, and extensive community support, Python is a popular language for building serverless applications. The Python 3.10 release includes several new features, such as:

  • Structural pattern matching (PEP 634): Structural pattern matching is one of the most significant additions to Python 3.10. With structural pattern matching, developers can use patterns to match against data structures such as lists, tuples, and dictionaries and run code based on the match. This feature enables developers to write code that processes complex data structures more easily and can improve code readability and maintainability (see the short sketch after this list).
  • Parenthesized context managers (BPO-12782): Python 3.10 introduces a new syntax for parenthesized context managers, making it easier to read and write code that uses the “with” statement. This feature simplifies managing resources such as file handles or database connections, ensuring they are released correctly.
  • Writing union types as X | Y (PEP 604): Python 3.10 allows writing union types as X | Y instead of the previous versions’ syntax of typing Union[X, Y]. Union types represent a value that can be one of several types. This change does not affect the functionality of the code and is backward-compatible, so code written with the previous syntax will still work. The new syntax aims to reduce boilerplate code, and improve readability and maintainability of Python code by providing a more concise and intuitive syntax for union types.
  • User-defined type guards (PEP 647): User-defined type guards allow developers to define their own type guards to handle custom data types or to refine the types of built-in types. Developers can define their own functions that perform more complex type checks as user-defined typed guards. This feature improves Python code readability, maintainability, and correctness, especially in projects with complex data structures or custom data types.
  • Improved error messages: Python 3.10 has improved error messages, providing developers with more information about the source of the error and suggesting possible solutions. This helps developers identify and fix issues more quickly. The improved error messages in Python 3.10 include more context about the error, such as the line number and location where the error occurred, as well as the exact nature of the error. Additionally, Python 3.10 error messages now provide more helpful information about how to fix the error, such as suggestions for correct syntax or usage.
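
The following minimal Python 3.10 sketch (the order data here is hypothetical, not taken from a Lambda sample) illustrates structural pattern matching together with the new X | Y union syntax:

def describe(order: dict | None) -> str:
    # Match on the shape of the incoming data structure
    match order:
        case {"status": "shipped", "items": [first, *_]}:
            return f"Shipped, first item: {first}"
        case {"status": "pending"}:
            return "Still pending"
        case None:
            return "No order supplied"
        case _:
            return "Unknown order shape"

print(describe({"status": "shipped", "items": ["book", "pen"]}))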

Performance improvements

The faster PEP 590 vectorcall calling convention allows for quicker and more efficient Python function calls, particularly those that take multiple arguments. The specific built-in functions that benefit from this optimization include map(), filter(), reversed(), bool(), and float(). According to the Python 3.10 release notes, using the vectorcall calling convention improved these built-in functions’ performance by a factor of 1.26x.

When a function is defined with annotations, these are stored in a dictionary that maps the parameter names to their respective annotations. In previous versions of Python, this dictionary was created immediately when the function was defined. However, in Python 3.10, this dictionary is created only when the annotations are accessed, which can happen when the function is called. By delaying the creation of the annotation dictionary until it is needed, Python can avoid the overhead of creating and initializing the dictionary during function definition. This can result in a significant reduction in CPU time, as the dictionary creation can be a time-consuming operation, particularly for functions with many parameters or complex annotations.

In Python 3.10, the LOAD_ATTR instruction, which is responsible for loading attributes from objects in the code, has been improved with a new mechanism called the “per opcode cache”. This mechanism works by storing frequently accessed attributes in a cache specific to each LOAD_ATTR instruction, which reduces the need for repeated attribute lookups. As a result of this improvement, according to Python 3.10 release notes, the LOAD_ATTR instruction is now approximately 36% faster when accessing regular attributes and 44% faster when accessing attributes defined using the slots mechanism.

In Python, the str(), bytes(), and bytearray() constructors are used to create new instances of these types from existing data or values. Based on the results of the performance tests conducted as part of BPO-41334, the str(), bytes(), and bytearray() constructors are around 30–40% faster for small objects.

Lambda functions developed with Python that read and process gzip-compressed files can gain a performance improvement. Adding _BlocksOutputBuffer to the bz2/lzma/zlib modules eliminated the overhead of resizing the bz2/lzma buffers and prevented an excessive memory footprint for the zlib buffer. According to the Python 3.10 release notes, bz2 decompression is now 1.09x faster, lzma decompression 1.20x faster, and GzipFile reads 1.11x faster.

Using Python 3.10 in Lambda

AWS Management Console

To use the Python 3.10 runtime to develop your Lambda functions, specify a runtime parameter value of python3.10 when creating or updating a function. The Python 3.10 version is now available in the Runtime dropdown on the Create function page.

Lambda create function page

To update an existing Lambda function to Python 3.10, navigate to the function in the Lambda console, then choose Edit in the Runtime settings panel. The new version of Python is available in the Runtime dropdown:

Edit runtime settings
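
You can also make the same change without the console; for example, with the AWS CLI (the function name is a placeholder):

aws lambda update-function-configuration \
  --function-name my-function \
  --runtime python3.10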

AWS Serverless Application Model (AWS SAM)

In AWS SAM, set the Runtime attribute to python3.10 to use this version.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Simple Lambda Function

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Description: My Python Lambda Function
      CodeUri: my_function/
      Handler: lambda_function.lambda_handler
      Runtime: python3.10

AWS SAM supports the generation of this template with Python 3.10 out of the box for new serverless applications using the sam init command. Refer to the AWS SAM documentation here.

AWS Cloud Development Kit (AWS CDK)

In the AWS CDK, set the runtime attribute to Runtime.PYTHON_3_10 to use this version. In Python:

from constructs import Construct
from aws_cdk import (
    App, Stack,
    aws_lambda as _lambda
)


class SampleLambdaStack(Stack):

    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        base_lambda = _lambda.Function(self, 'SampleLambda',
                                       handler='lambda_handler.handler',
                                       runtime=_lambda.Runtime.PYTHON_3_10,
                                       code=_lambda.Code.from_asset('lambda'))

In TypeScript:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda'
import * as path from 'path';
import { Construct } from 'constructs';

export class CdkStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The code that defines your stack goes here

    // The python3.10 enabled Lambda Function
    const lambdaFunction = new lambda.Function(this, 'python310LambdaFunction', {
      runtime: lambda.Runtime.PYTHON_3_10,
      memorySize: 512,
      code: lambda.Code.fromAsset(path.join(__dirname, '/../lambda')),
      handler: 'lambda_handler.handler'
    })
  }
}

AWS Lambda – Container Image

Change the Python base image version by modifying the FROM statement in the Dockerfile:

FROM public.ecr.aws/lambda/python:3.10

# Copy function code
COPY lambda_handler.py ${LAMBDA_TASK_ROOT}

To learn more, refer to the usage tab on building functions as container images.

Conclusion

You can build and deploy functions using Python 3.10 with the AWS Management Console, AWS CLI, AWS SDKs, AWS SAM, AWS CDK, or your choice of infrastructure as code (IaC). You can also use the Python 3.10 container base image if you prefer to build and deploy your functions using container images.

We are excited to bring Python 3.10 runtime support to Lambda and empower developers to build more efficient, powerful, and scalable serverless applications. Try the Python 3.10 runtime in Lambda today to experience the benefits of this updated language version and take advantage of the improved performance.

For more serverless learning resources, visit Serverless Land.

Optimizing AWS Lambda extensions in C# and Rust

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/optimizing-aws-lambda-extensions-in-c-and-rust/

This post is written by Siarhei Kazhura, Senior Specialist Solutions Architect, Serverless.

Customers use AWS Lambda extensions to integrate monitoring, observability, security, and governance tools with their Lambda functions. AWS, along with AWS Lambda Ready Partners such as Datadog, Dynatrace, and New Relic, provides ready-to-run extensions. You can also develop your own extensions to address your specific needs.

External Lambda extensions are designed as a companion process running in the same execution environment as the function code. That means the Lambda function shares resources like memory, CPU, and disk I/O with the extension. Improperly designed extensions can result in performance degradation and extra cost.

This post shows how to measure the impact an extension has on the function performance using key performance metrics on an Amazon CloudWatch dashboard.

This post focuses on Lambda extensions written in C# and Rust. It shows the benefits of choosing to write Lambda extensions in Rust. Also, it explains how you can optimize a Lambda extension written in C# to deliver three times better performance. The solution can be adapted to the programming language of your choice.

Overview

A C# Lambda function (running on .NET 6) called HeaderCounter is used as a baseline. The function counts the number of headers in a request and returns the number in the response. A static delay of 500 ms is inserted in the function code to simulate extra computation. The function has the minimum memory setting (128 MB), which magnifies the impact that the extension has on performance.

A load test is performed via a curl command that issues 5,000 requests (with 250 requests running concurrently) against a public Amazon API Gateway endpoint backed by the Lambda function. A CloudWatch dashboard, named lambda-performance-dashboard, displays performance metrics for the function.

Lambda performance dashboard

Metrics captured by the dashboard:

  1. The Maximum Duration and Average Duration metrics allow you to assess the impact the extension has on the function execution duration.
  2. The PostRuntimeExtensionsDuration metric measures the extra time that the extension takes after the function invocation.
  3. The Average Memory Used and Memory Allocated metrics allow you to assess the impact the extension has on the function memory consumption.
  4. The Cold Start Duration and Cold Starts metrics allow you to assess the impact the extension has on the function cold start.

Running the extensions

There are a few differences between how the extensions written in C# and Rust are run.

The extension written in Rust is published as an executable. The advantage of an executable is that it is compiled to native code and is ready to run. The extension is environment agnostic, so it can run alongside a Lambda function written in another runtime.

The disadvantage of an executable is the size. Extensions are served as Lambda layers, and the size of the extension counts towards the deployment package size. The maximum unzipped deployment package size for Lambda is 250 MB.

The extension written in C# is published as a dynamic-link library (DLL). The DLL contains Common Intermediate Language (CIL), which must be converted to native code via a just-in-time (JIT) compiler. The .NET runtime must be present for the extension to run. The dotnet command runs the DLL in the example provided with the solution.

Blank extension


Three instances of the HeaderCounter function are deployed:

  1. The first instance, available via a no-extension endpoint, has no extensions.
  2. The second instance, available via a dotnet-extension endpoint, is instrumented with a blank extension written in C#. The extension does not provide any extra functionality, except logging the event received to CloudWatch.
  3. The third instance, available via a rust-extension endpoint, is instrumented with a blank extension written in Rust. The extension does not provide any extra functionality, except logging the event received to CloudWatch.

Dashboard results

The dashboard shows that the extensions add minimal overhead to the Lambda function. The extension written in C# adds more overhead in the higher percentile metrics, such as the Maximum Cold Start Duration and Maximum Duration.

EventCollector extension


Three instances of the HeaderCounter function are deployed:

  1. The first instance, available via a no-extension endpoint, has no extensions.
  2. The second instance, available via a dotnet-extension endpoint, is instrumented with an EventCollector extension written in C#. The extension is pushing all the extension invocation events to Amazon S3.
  3. The third instance, available via a rust-extension endpoint, is instrumented with an EventCollector extension written in Rust. The extension is pushing all the extension invocation events to S3.

Performance dashboard

The Rust extension adds little overhead in terms of the Duration, number of Cold Starts, and Average PostRuntimeExtensionDuration metrics. Yet there is a clear performance degradation for the function that is instrumented with an extension written in C#. Average Duration jumped almost three times, and the Maximum Duration is now around six times higher.

The function is now consuming almost all the memory allocated. CPU, networking, and storage for Lambda functions are allocated based on the amount of memory selected. Currently, the memory is set to 128 MB, the lowest setting possible. Constrained resources influence the performance of the function.

Performance dashboard

Increasing the memory to 512 MB and re-running the load test improves the performance. Maximum Duration is now 721 ms (including the static 500 ms delay).

For the C# function, the Average Duration is now only 59 ms longer than the baseline. The Average PostRuntimeExtensionDuration is at 36.9 ms (compared with 584 ms previously). This performance gain is due to the memory increase without any code changes.

You can also use the AWS Lambda Power Tuning tool to determine the optimal memory setting for a Lambda function.

Garbage collection

Unlike C#, Rust is not a garbage collected language. Garbage collection (GC) is a process of managing the allocation and release of memory for an application. This process can be resource intensive, and can affect higher percentile metrics. The impact of GC is visible with the blank extension’s and EventCollector extension’s metrics.

Rust uses ownership and borrowing features, allowing for safe memory release without relying on GC. This makes Rust a good runtime choice for tools like Lambda extensions.

EventCollector native AOT extension

Native ahead-of-time (Native AOT) compilation (available in .NET 7 and .NET 8), allows for the extensions written in C# to be delivered as executables, similar to the extensions written in Rust.

Native AOT does not use a JIT compiler. The application is compiled into a self-contained (all the resources that it needs are encapsulated) executable. The executable runs in the target environment (for example, Linux x64) that is specified at compilation time.
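
As a minimal sketch, enabling Native AOT is typically a publish-time setting in the C# project file; the exact values below (target framework, runtime identifier) are assumptions rather than the sample repository's configuration:

<PropertyGroup>
  <TargetFramework>net7.0</TargetFramework>
  <!-- Compile ahead of time into a self-contained native executable -->
  <PublishAot>true</PublishAot>
  <!-- Target the Lambda execution environment -->
  <RuntimeIdentifier>linux-x64</RuntimeIdentifier>
</PropertyGroup>

The extension is then published with a command such as dotnet publish -c Release -r linux-x64.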

These are the results of compiling the .NET extension using Native AOT and re-running the performance test (with function memory set to 128 MB):

Performance dashboard

For the C# extension, Average Duration is now close to the baseline (compared to three times the baseline as a DLL). Average PostRuntimeExtensionDuration is now 0.77 ms (compared with 584 ms as a DLL). The C# extension also outperforms the Rust extension for the Maximum PostRuntimeExtensionDuration metric: 297 ms versus 497 ms.

Overall, the Rust extension still has better Average/Maximum Duration, Average/Maximum Cold Start Duration, and Memory Consumption. The Lambda function with the C# extension still uses almost all the allocated memory.

Another metric to consider is the binary size. The Rust extension compiles into a 12.3 MB binary, while the C# extension compiles into a 36.4 MB binary.

Example walkthroughs

To follow the example walkthrough, visit the GitHub repository. The walkthrough explains:

  1. The prerequisites required.
  2. A detailed solution deployment walkthrough.
  3. The cleanup process.
  4. Cost considerations.

Conclusion

This post demonstrates techniques for running and profiling different types of Lambda extensions. It focuses on extensions written in C# and Rust, outlines the benefits of writing Lambda extensions in Rust, and shows how a Lambda extension written in C# can be improved to deliver better performance.

Start writing Lambda extensions with Rust by using the Runtime extensions for AWS Lambda crate. This is a part of a Rust runtime for AWS Lambda.

For more serverless learning resources, visit Serverless Land.

Using AWS Lambda SnapStart with infrastructure as code and CI/CD pipelines

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-aws-lambda-snapstart-with-infrastructure-as-code-and-ci-cd-pipelines/

This was written by Michele Ricciardi, Specialist Solutions Architect, DevAx.

AWS Lambda SnapStart enables customers to reduce cold starts by caching an encrypted snapshot of an initialized Lambda function, and reusing that snapshot for subsequent invocations.

This blog post describes in more detail how to enable SnapStart with different infrastructure as code tooling: AWS Serverless Application Model (AWS SAM), AWS CloudFormation, and Terraform. It shows what happens during a Lambda deployment when SnapStart is enabled, as well as best practices to apply in CI/CD pipelines.

SnapStart can significantly reduce cold start times for Lambda functions written in Java. The example described in the launch blog post reduces the cold start duration from over 6 seconds to less than 200 ms.

Using SnapStart

You activate the SnapStart feature through a Lambda configuration setting. However, there are additional steps to use the SnapStart feature.

Enable Lambda function versions

SnapStart works in conjunction with Lambda function versions. It initializes the function when you publish a new Lambda function version. Therefore, you must enable Lambda versions to use SnapStart. For more information, read the Lambda Developer Guide.

Optionally create a Lambda function alias

A Lambda function alias allows you to define a custom pointer, and control which Lambda version it is attached to. While this is optional, creating a Lambda function alias makes the management operations easier. You can change the Lambda version tied to a Lambda alias without modifying the applications using the Lambda function.

Modify the application or AWS service invoking the Lambda function

Once you have a SnapStart-ready Lambda version, you must change the mechanism that invokes the Lambda function to use a qualified Lambda ARN. This can either invoke the specific Lambda Version directly or invoke the Lambda alias, which points at the correct Lambda version.

If you do not modify the mechanism that invokes the Lambda function, or use an unqualified ARN, you cannot take advantage of the optimizations that SnapStart provides.
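
For example, when invoking directly with the AWS CLI, append the alias (or version number) as the qualifier; the names below are placeholders:

aws lambda invoke \
  --function-name my-function:SnapStart \
  response.json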

Using SnapStart with infrastructure as code

AWS CloudFormation

UnicornStockBroker is a sample Serverless application. This application is composed of an Amazon API Gateway endpoint, a Java Lambda function, and an Amazon DynamoDB table. Find the CloudFormation template without SnapStart in this GitHub repository.

To enable SnapStart:

  1. Enable Lambda Versioning by adding an AWS::Lambda::Version resource type:
      UnicornStockBrokerFunctionVersion1:
        Type: AWS::Lambda::Version
        Properties:
          FunctionName: !Ref 'UnicornStockBrokerFunction'
  2. To deploy a new Lambda function version with CloudFormation, change the resource name from UnicornStockBrokerFunctionVersion1 to UnicornStockBrokerFunctionVersion2.
  3. Create a Lambda alias using the AWS::Lambda::Alias resource type:
    UnicornStockBrokerFunctionAliasSnapStart:
      Type: AWS::Lambda::Alias
      Properties:
        Name: SnapStart
        FunctionName: !Ref 'UnicornStockBrokerFunction'
        FunctionVersion: !GetAtt 'UnicornStockBrokerFunctionVersion1.Version'
  4. Update API Gateway to invoke the Lambda function alias, instead of using an unqualified ARN. This ensures that API Gateway invokes the specified Lambda alias:
    ServerlessRestApi:
      Type: AWS::ApiGateway::RestApi
      Properties:
        Body:
          paths:
            /transactions:
              post:
                x-amazon-apigateway-integration:
                  [..]
                  uri: !Sub 'arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${UnicornStockBrokerFunctionAliasSnapStart}/invocations'
    
  5. Modify the IAM permissions to allow API Gateway to invoke the specified Lambda alias:
    UnicornStockBrokerFunctionUnicornStockBrokerEventPermissionProd:
      Type: AWS::Lambda::Permission
      Properties:
        [..]
        FunctionName: !Ref UnicornStockBrokerFunctionAliasSnapStart
    
  6. Enable SnapStart on the Lambda function:
    UnicornStockBrokerFunction:
      Type: AWS::Lambda::Function
      Properties:
        [..]
        SnapStart:
          ApplyOn: PublishedVersions

After these updates, the CloudFormation template with SnapStart enabled looks like this. You can deploy the sample application and take advantage of SnapStart.

AWS SAM

You can find the AWS SAM template without SnapStart here. To enable Lambda SnapStart:

  1. Enable Lambda versions and create a Lambda alias by adding an AutoPublishAlias property. AWS SAM automatically publishes a new Lambda function version for each new deployment and assigns the Lambda alias to the newly published version:
      UnicornStockBrokerFunction:
        Type: AWS::Serverless::Function
        Properties: 
          [..]
          AutoPublishAlias: SnapStart
    
  2. Enable SnapStart on the Lambda function:
    UnicornStockBrokerFunction:
      Type: AWS::Serverless::Function
      Properties:
        [..]
        SnapStart:
          ApplyOn: PublishedVersions
    

This example uses the AWS::Serverless::Function resource with the AutoPublishAlias property. AWS SAM implicitly creates an Amazon API Gateway with the correct permissions and configuration to invoke the Lambda function using the declared Lambda alias.

You can see the modified AWS SAM template here. You can deploy the sample application with AWS SAM and take advantage of SnapStart.

Terraform

You can find the Terraform template without SnapStart here. To enable Lambda SnapStart using Terraform by HashiCorp:

  1. Enable Lambda Versioning by adding a publish property to the aws_lambda_function resource:
    resource "aws_lambda_function" "UnicornStockBrokerFunction" {
      [...]
      publish = true
    }
    
  2. Create a Lambda alias by adding an aws_lambda_alias resource:
    resource "aws_lambda_alias" "UnicornStockBrokerFunction_SnapStartAlias" {
      name             = "SnapStart"
      description      = "Alias for SnapStart"
      function_name    = aws_lambda_function.UnicornStockBrokerFunction.function_name
      function_version = aws_lambda_function.UnicornStockBrokerFunction.version
    }
    
  3. Update API Gateway to invoke the Lambda function alias, instead of using an unqualified ARN:
    resource "aws_api_gateway_integration" "createproduct-lambda" {
      [...]
      uri = aws_lambda_alias.UnicornStockBrokerFunction_SnapStartAlias.invoke_arn
    }
    
  4. Add a qualifier to the aws_lambda_permission resource to allow API Gateway to invoke the specified Lambda alias:
    resource "aws_lambda_permission" "apigw-CreateProductHandler" {
      [...]
      qualifier = aws_lambda_alias.UnicornStockBrokerFunction_SnapStartAlias.name
    }
    
  5. Enable Lambda SnapStart by adding a snap_start argument to the aws_lambda_function resource:
    resource "aws_lambda_function" "UnicornStockBrokerFunction" {
      [...]
      snap_start {
        apply_on = "PublishedVersions"
      }
    }
    

You can see the modified Terraform template here. You can deploy the sample application with Terraform and take advantage of SnapStart.

How does the deployment change with SnapStart?

This section describes how the deployment changes after enabling Lambda SnapStart.

Before adding Lambda versions, Lambda alias and SnapStart, this is the process of publishing new code to a function:

Lambda deployment phases

After you enable Lambda versions, Lambda alias and Lambda SnapStart, there are more steps to the deployment:

Phases with versions, alias and SnapStart

During the deployment of a Lambda version, Lambda creates a new execution environment and initializes it with the new code. Once the code is initialized, Lambda takes a Snapshot of the initialized code. The deployment now takes longer than it did without SnapStart as it takes additional time to initialize the execution environment and to create the initialized function snapshot.

While the new Lambda version is being created, the Lambda alias remains unchanged and continues to route invocations to the previous Lambda version. The calling application is unaware of the deployment; therefore, the additional deployment time does not impact calling applications.

When the new function version is ready, you can update the Lambda alias to point at the new Lambda version (this is done automatically in the preceding IaC examples).

By using a Lambda alias, you don’t need to direct all the Lambda invocations to the latest version. You can use the Alias routing configuration to shift traffic gradually to the new Lambda version, giving you an easier way to test new Lambda versions gradually and rollback in case of issues with the latest version.
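
As a hedged AWS CLI sketch of such a weighted shift (the version numbers and weight are placeholders), the alias stays on the current version while 10% of traffic is routed to the new one:

aws lambda update-alias \
  --function-name my-function \
  --name SnapStart \
  --function-version 5 \
  --routing-config '{"AdditionalVersionWeights": {"6": 0.1}}'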

There is an additional failure mode with this deployment: the initialization of a new Lambda version can fail if there are errors in the code. For example, Lambda does not initialize your code if there is an unhandled exception during the initialization phase. You must handle this failure mode in the deployment cycle.

CI/CD considerations

Longer deployments

The first consideration is that enabling Lambda SnapStart increases the deployment time.

When your CI/CD pipeline deploys a new SnapStart-enabled Lambda version, Lambda initializes a new Lambda execution environment and takes a snapshot of both the memory and disk of the initialized code. This takes time and its latency would depend on the initialization time of your Lambda function, and the overall size of the snapshot.

This additional delay could be significant for CI/CD pipelines with a large number of Lambda functions. Therefore, if you have an application with multiple Lambda functions, enable Lambda SnapStart on the Lambda functions gradually, and adjust the CI/CD pipeline timeout accordingly.

Deployment failure modes

With Lambda SnapStart, there is an additional failure mode that you would need to handle in your CI/CD pipeline.

As described earlier, creating a new Lambda version can fail because of errors during initialization of the function code. This additional failure scenario can be managed in two ways:

  1. Add a testing step in your CI/CD pipeline.
    You can add a testing step to validate that the function would initialize successfully. This testing step could include running your Lambda function locally, with sam local, or running local tests using the Lambda Java Test library.
  2. Deploy the Lambda function to a test environment with SnapStart enabled.
    An additional check to detect issues early, is to deploy your Lambda function to a testing environment with SnapStart enabled. By deploying your Lambda functions with SnapStart in a testing environment, you can catch issues earlier in your development cycle, and run additional end-to-end testing and performance tests.

Conclusion

This blog post shows the steps needed to leverage Lambda SnapStart, with examples for AWS CloudFormation, AWS SAM, and Terraform.

When you use the feature, there are additional steps that Lambda performs during the deployment. Lambda initializes a new Lambda version with the new code and takes a snapshot. This operation takes time and may fail if the code initialization fails, therefore you may need to make adjustments to your CI/CD pipeline to handle those scenarios.

If you’d like to learn more about the Lambda SnapStart feature, read more on the AWS Lambda Developer Guide or visit our Java on AWS Lambda workshop. To read more about this and other features, visit Serverless Land.

Managing sessions of anonymous users in WebSocket API-based applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/managing-sessions-of-anonymous-users-in-websocket-api-based-applications/

This post is written by Alexey Paramonov, Solutions Architect, ISV.

This blog post demonstrates how to reconnect anonymous users to WebSocket API without losing their session context. You learn how to link WebSocket API connection IDs to the logical user, what to do when the connection fails, and how to store and recover session information when the user reconnects.

WebSocket APIs are common in modern interactive applications. For example, for watching stock prices, following live chat feeds, collaborating with others in productivity tools, or playing online games. Another example described in “Implementing reactive progress tracking for AWS Step Functions” blog post uses WebSocket APIs to send progress back to the client.

The backend uses the client’s connection ID to find the right client to send data to. But if the connection temporarily fails, the client reconnects with a new connection ID. If the backend has no mechanism to associate the new connection ID with the same client, the interaction context is lost and the user must start over again. In a live chat feed, that could mean skipping to the most recent message and losing some previous messages. And in the case of AWS Step Functions progress tracking, the user would lose progress updates.

Overview

The sample uses Amazon API Gateway WebSocket APIs. It uses AWS Lambda for WebSocket connection management and for mocking a teleprinter to generate a stateful stream of characters for testing purposes. There are two Amazon DynamoDB tables for storing connection IDs and user sessions, as well as an optional MediaWiki API for retrieving an article.

The following diagram outlines the architecture:

Reference architecture

  1. The browser generates a random user ID and stores it locally in the session storage. The client sends the user ID inside the Sec-WebSocket-Protocol header to WebSocket API.
  2. The default WebSocket API route OnConnect invokes OnConnect Lambda function and passes the connection ID and the user ID to it. OnConnect Lambda function determines if the user ID exists in the DynamoDB table and either creates a new item or updates the existing one.
  3. When the connection is open, the client sends the Read request to the Teleprinter Lambda function.
  4. The Teleprinter Lambda function downloads an article about WebSocket APIs from Wikipedia and stores it as a string in memory.
  5. The Teleprinter Lambda function checks if there is a previous state stored in the Sessions table in DynamoDB. If it is a new user, the Teleprinter Lambda function starts sending the article from the beginning character by character back to the client via WebSocket API. If it is a returning user, the Teleprinter Lambda function retrieves the last cursor position (the last successfully sent character) from the DynamoDB table and continues from there.
  6. The Teleprinter Lambda function sends 1 character every 100ms back to the client.
  7. The client receives the article and prints it out character by character as each character arrives.
  8. If the connection breaks, the WebSocket API calls the OnDisconnect Lambda function automatically. The function marks the connection ID for the given user ID as inactive.
  9. If the user does not return within 5 minutes, Amazon EventBridge scheduler invokes the OnDelete Lambda function, which deletes items with more than 5 minutes of inactivity from Connections and Sessions tables.

A teleprinter returns data one character at a time instead of a single response. When the Lambda function fetches an article, it feeds it character by character inside the for-loop with a delay of 100ms on every iteration. This demonstrates how the user can continue reading the article after reconnecting and not starting the feed all over again. The pattern could be useful for traffic limiting by slowing down user interactions with the backend.
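
A minimal sketch of that character loop (assuming a boto3 apigatewaymanagementapi client and an article string already in memory; the sample repository may structure this differently):

import time
import boto3

# The Management API client needs the WebSocket API endpoint URL,
# for example https://{api-id}.execute-api.{region}.amazonaws.com/{stage}
api_client = boto3.client(
    'apigatewaymanagementapi',
    endpoint_url='https://example.execute-api.us-east-1.amazonaws.com/prod'
)

def stream_article(connection_id, article, start_pos=0):
    # Send one character every 100 ms, starting from the saved cursor position
    for pos in range(start_pos, len(article)):
        api_client.post_to_connection(
            ConnectionId=connection_id,
            Data=bytes(article[pos], 'utf-8')
        )
        time.sleep(0.1)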

Understanding sample code functionality

When the user connects to the WebSocket API, the client generates a user ID and saves it in the browser’s session storage. The user ID is a random string. When the client opens a WebSocket connection, the frontend sends the user ID inside the Sec-WebSocket-Protocol header, which is standard for WebSocket APIs. When using the JavaScript WebSocket library, there is no need to specify the header manually. To initialize a new connection, use:

const newWebsocket = new WebSocket(wsUri, userId);

wsUri is the deployed WebSocket API URL and userId is a random string generated in the browser. The userId goes to the Sec-WebSocket-Protocol header because WebSocket APIs generally offer less flexibility in choosing headers compared to RESTful APIs. Another way to pass the user ID to the backend could be a query string parameter.

The backend receives the connection request with the user ID and connection ID. The WebSocket API generates the connection ID automatically. It’s important to include the Sec-WebSocket-Protocol in the backend response, otherwise the connection closes immediately:

    return {
        'statusCode': 200,
        'headers': {
            'Sec-WebSocket-Protocol': userId
        }
    }

The Lambda function stores this information in the DynamoDB Connections table:

    ddb.put_item(
        TableName=table_name,
        Item={
            'userId': {'S': userId},
            'connectionId': {'S': connection_id},
            'domainName': {'S': domain_name},
            'stage': {'S': stage},
            'active': {'S': 'True'}
        }
    )

The active attribute indicates if the connection is up. In the case of inactivity over a specified time limit, the OnDelete Lambda function deletes the item automatically. The Put operation comes in handy here because you don’t need to query the table first to check if the user already exists. If it is a new user, Put creates a new item. If it is a reconnection, Put updates the connectionId and sets active to True again.

The primary key for the Connections table is userId, which helps locate users faster as they reconnect. The application relies on a global secondary index (GSI) for locating connectionId. This is necessary for marking the connection inactive when WebSocket API calls OnDisconnect function automatically.

Now the application has connection management, you can retrieve session data. The Teleprinter Lambda function receives a connection ID from WebSocket API. You can query the GSI of Connections table and find if the user exists:

    def get_user_id(connection_id):
        response = ddb.query(
            TableName=connections_table_name,
            IndexName='connectionId-index',
            KeyConditionExpression='connectionId = :c',
            ExpressionAttributeValues={
                ':c': {'S': connection_id}
            }
        )

        items = response['Items']

        if len(items) == 1:
            return items[0]['userId']['S']

        return None

Another DynamoDB table called Sessions is used to check if the user has an existing session context. If yes, the Lambda function retrieves the cursor position and resumes teletyping. If it is a brand-new user, the Lambda function starts sending characters from the beginning. The function stores a new cursor position if the connection breaks. This makes it easier to detect stale connections and store the current cursor position in the Sessions table.

    try:
        api_client.post_to_connection(
            ConnectionId=connection_id,
            Data=bytes(ch, 'utf-8')
        )
    except api_client.exceptions.GoneException as e:
        print(f"Found stale connection, persisting state")
        store_cursor_position(user_id, pos)
        return {
            'statusCode': 410
        }

    def store_cursor_position(user_id, pos):
        ddb.put_item(
            TableName=sessions_table_name,
            Item={
                'userId': {'S': user_id},
                'cursorPosition': {'N': str(pos)}
            }
        )

After this, the Teleprinter Lambda function ends.

Serving authenticated users

The same mechanism also works for authenticated users. The main difference is it takes a given ID from a JWT token or other form of authentication and uses it instead of a randomly generated user ID. The backend relies on unambiguous user identification and may require only minor changes for handling authenticated users.
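
As a hedged sketch, the backend would then derive the user ID from a verified token claim rather than from a random string; verify_jwt below is a hypothetical helper, not part of the sample:

def get_authenticated_user_id(token):
    # verify_jwt is a hypothetical helper that validates the token signature
    # (for example with a library such as PyJWT) and returns its claims
    claims = verify_jwt(token)
    return claims['sub']   # a stable, unambiguous identifier for the user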

Deleting inactive users

The OnDisconnect function marks user connections as inactive and adds a timestamp to the 'lastSeen' attribute. If the user never reconnects, the application should purge inactive items from the Connections and Sessions tables. Depending on your operational requirements, there are two options.

Option 1: Using EventBridge Scheduler

The sample application uses EventBridge to schedule OnDelete function execution every 5 minutes. The following code shows how AWS SAM adds the scheduler to the OnDelete function:

  OnDeleteSchedulerFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.9
      CodeUri: handlers/onDelete/
      MemorySize: 128
      Environment:
        Variables:
          CONNECTIONS_TABLE_NAME: !Ref ConnectionsTable
          SESSIONS_TABLE_NAME: !Ref SessionsTable
      Policies:
      - DynamoDBCrudPolicy:
          TableName: !Ref ConnectionsTable
      - DynamoDBCrudPolicy:
          TableName: !Ref SessionsTable
      Events:
        Schedule:
          Type: ScheduleV2
          Properties:
            ScheduleExpression: rate(5 minutes)

The Events section of the function definition is where AWS SAM sets up the scheduled execution. Change values in ScheduleExpression to meet your scheduling requirements.

The OnDelete function relies on the GSI to find inactive user IDs. The following code snippet shows how to query connections with more than 5 minutes of inactivity:

    five_minutes_ago = int((datetime.now() - timedelta(minutes=5)).timestamp())

    stale_connection_items = table_connections.query(
        IndexName='lastSeen-index',
        KeyConditionExpression='active = :hk and lastSeen < :rk',
        ExpressionAttributeValues={
            ':hk': 'False',
            ':rk': five_minutes_ago
        }
    )

After that, the function iterates through the list of user IDs and deletes them from the Connections and Sessions tables:

    # remove inactive connections
    with table_connections.batch_writer() as batch:
        for item in inactive_users:
            batch.delete_item(Key={'userId': item['userId']})

    # remove inactive sessions
    with table_sessions.batch_writer() as batch:
        for item in inactive_users:
            batch.delete_item(Key={'userId': item['userId']})

The sample uses batch_writer() to avoid making a separate request for each user ID.

Option 2: Using DynamoDB TTL

DynamoDB provides a built-in mechanism for expiring items called Time to Live (TTL). This option is easier to implement: with TTL, there is no need for the EventBridge scheduler, the OnDelete Lambda function, or the additional GSI to track inactivity. To set up TTL, store an expiry timestamp (in epoch seconds) on each item, for example derived from the 'lastSeen' attribute. TTL deletes or archives the item once that time passes. When using AWS SAM or AWS CloudFormation templates, add TimeToLiveSpecification to your table definition.

The TTL typically deletes expired items within 48 hours. If your operational requirements demand faster and more predictable timing, use option 1. For example, if your application aggregates data while the user is offline, use option 1. But if your application stores a static session data, like cursor position used in this sample, option 2 can be an easier alternative.
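
A minimal CloudFormation sketch of enabling TTL on a table; the attribute name expiresAt is an assumption, and its value must be a future timestamp in epoch seconds:

SessionsTable:
  Type: AWS::DynamoDB::Table
  Properties:
    AttributeDefinitions:
      - AttributeName: userId
        AttributeType: S
    KeySchema:
      - AttributeName: userId
        KeyType: HASH
    BillingMode: PAY_PER_REQUEST
    TimeToLiveSpecification:
      AttributeName: expiresAt   # DynamoDB deletes the item after this time
      Enabled: true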

Deploying the sample

Prerequisites

Ensure you can manage AWS resources from your terminal.

  • AWS CLI and AWS SAM CLI installed.
  • You have an AWS account. If not, visit this page.
  • Your user has enough permissions to manage AWS resources.
  • Git is installed.
  • NPM is installed (only for local frontend deployment).

You can find the source code and README here.

The repository contains both the frontend and the backend code. To deploy the sample, follow this procedure:

  1. Open a terminal and clone the repository:
    git clone https://github.com/aws-samples/websocket-sessions-management.git
  2. Navigate to the root of the repository.
  3. Build and deploy the AWS SAM template:
    sam build && sam deploy --guided
  4. Note the value of WebSocketURL in the output. You need it later.

With the backend deployed, test it using the hosted frontend.

Testing the application


You can watch an animated demo here.

Notice that the app generated a random user ID on startup and shows it in the header.

  1. Paste the WebSocket URL into the text field. You can find the URL in the console output after you have successfully deployed your AWS SAM template. Alternatively, you can navigate to AWS Management Console (make sure you are in the right Region), select the API you have recently deployed, go to “Stages”, select the deployed stage and copy the “WebSocket URL” value.
  2. Choose Connect. The app opens a WebSocket connection.
  3. Choose Tele-read to start receiving the Wikipedia article character by character. New characters appear in the second half of the screen as they arrive.
  4. Choose Disconnect to close WebSocket connection. Reconnect again and choose Tele-read. Your session resumes from where you stopped.
  5. Choose Reset Identity. The app closes the WebSocket connection and changes the user ID. Choose Connect and Tele-read again and your character feed starts from the beginning.

Cleaning up

To clean up the resources, in the root directory of the repository run:

sam delete

This removes all resources provisioned by the template.yml file.

Conclusion

In this post, you learn how to keep track of user sessions when using WebSocket APIs and not lose the session context when the user reconnects. Apply learnings from this example to improve your user experience when using WebSocket APIs for web-frontend and mobile applications, where internet connections may be unstable.

For more serverless learning resources, visit Serverless Land.

Building serverless Java applications with the AWS SAM CLI

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-java-applications-with-the-aws-sam-cli/

This post was written by Mehmet Nuri Deveci, Sr. Software Development Engineer, Steven Cook, Sr. Solutions Architect, and Maximilian Schellhorn, Solutions Architect.

When using Java in the serverless environment, the AWS Serverless Application Model Command Line Interface (AWS SAM CLI) offers an easier way to build and deploy AWS Lambda functions. You can either use the default AWS SAM build mechanism or tailor the build behavior to your application needs.

Since Java offers a variety of plugins and tools for building your application, builders usually have custom requirements for their build setup. In addition, when targeting GraalVM or non-LTS versions of the JVM, the build behavior requires additional configuration to build a Lambda custom runtime.

This blog post provides an overview of the common ways to build Java applications for Lambda with the AWS SAM CLI. This allows you to make well-informed decisions based on your projects’ requirements. This post focuses on Apache Maven, however the same concepts apply for Gradle.

You can find the source code for these examples in the GitHub repo.

Overview

The following diagram provides an overview of the build and deployment process with AWS SAM CLI. The default behavior includes the following steps:

Architecture overview

  1. Define your infrastructure resources such as Lambda functions, Amazon DynamoDB Tables, Amazon S3 buckets, or an Amazon API Gateway endpoint in a template.yaml file.
  2. The CLI command “sam build” builds the application based on the chosen runtime and configuration within the template.
  3. The sam build command populates the .aws-sam/build folder with the built artifacts (for example, class files or jars).
  4. The sam deploy command uploads your template and function code from the .aws-sam/build folder and starts an AWS CloudFormation deployment.
  5. After a successful deployment, you can find the provisioned resources in your AWS account.

Using the default Java build mechanism in AWS SAM CLI

AWS SAM CLI supports building Serverless Java functions with Maven or Gradle. You must use one of the supported Java runtimes (java8, java8.al2, java11) and your function’s resource CodeUri property must point to a source folder.

Default Java build mechanism

AWS SAM CLI ships with default build mechanisms provided by the aws-lambda-builders project. It is therefore not required to package or build your application in advance and any customized build steps in pom.xml will not be used. For example, by default the AWS SAM Maven Lambda Builder triggers the following steps:

  1. mvn clean install to build the function.
  2. mvn dependency:copy-dependencies -DincludeScope=runtime -Dmdep.prependGroupId=true to prepare dependency jar files.
  3. Class files in target/classes and dependency jar archives in target/dependency are copied to the final build artifact location in .aws-sam/build/{ResourceLogicalId}.

Start the build process by running the following command from the directory where the template.yaml resides:

sam build

This results in the following outputs:

Outputs

The .aws-sam build folder contains the necessary classes and libraries to run your application. The transformed template.yaml file points to the build artifacts directory (instead of pointing to the original source directory).

Run the following command to deploy the resources to AWS:

sam deploy --guided

This zips the HelloWorldFunction directory in .aws-sam/build and uploads it to the Lambda service.

Building Uber-Jars with AWS SAM CLI

A popular way of building and packaging Java projects, especially when using frameworks such as Micronaut, Quarkus, and Spring Boot, is to create an Uber-jar or Fat-jar. This is a single jar file that contains the application class files together with all the dependency class files. This simplifies the deployment and management of the application artifact.

Frameworks typically provide a Maven or Gradle setup that produces an Uber-jar by default. For example, you can configure Uber-jar packaging with the Apache Maven Shade Plugin:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.4.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

The maven-shade-plugin modifies the package phase of the Maven build to produce the Uber-jar file in the target directory.

To build and package an Uber-jar with AWS SAM, you must customize the AWS SAM CLI build process. You can do this by using a Makefile to replace the default steps discussed earlier. Within the template.yaml, declare a Metadata resource attribute with a BuildMethod entry. The sam build process then looks for a Makefile within the CodeUri directory.

Makefile metadata
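
In the template, this metadata might look like the following sketch, where the logical ID matches the Makefile target shown next and the paths are illustrative:

HelloWorldFunctionUberJar:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: HelloWorldFunction/
    Handler: helloworld.App::handleRequest
    Runtime: java11
  Metadata:
    BuildMethod: makefile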

The Makefile is responsible for building the artifacts required by AWS SAM to deploy the Lambda function. AWS SAM runs the target in the Makefile. This is an example Makefile:

build-HelloWorldFunctionUberJar:
	mvn clean package
	mkdir -p $(ARTIFACTS_DIR)/lib
	cp ./target/HelloWorld*.jar $(ARTIFACTS_DIR)/lib/

The Makefile runs the Maven clean and package goals, which build the Uber-jar in the target directory via the Apache Maven Shade Plugin. Because the Lambda Java runtime loads jar files from the lib directory, the build steps copy the Uber-jar file to the $(ARTIFACTS_DIR)/lib directory.

To build the application, run:

sam build

This triggers the customized build step and creates the following output resources:

Output response

The deployment step is identical to the previous example.

Running the build process inside a container

AWS SAM provides a mechanism to run the application build process inside a Docker container. This provides the benefit of not requiring your build dependencies (such as Maven and Java) to be installed locally or in your CI/CD environment.

To run the build process inside a container, use the command line options --use-container and optionally --build-image. The following diagram outlines the modified build process with this option:

Modified build process with this option

  1. Similar to the previous examples, the directory with the application sources or the Makefile is referenced.
  2. To run the build process inside a Docker container, provide the command line option:
    sam build --use-container
  3. By default, AWS SAM uses images for container builds that are maintained in the GitHub repo aws/aws-sam-build-images. AWS SAM pulls the container image depending on the defined runtime. For this example, it pulls the public.ecr.aws/sam/build-java11 image, which has prerequisites such as Maven and Java11 pre-installed to build the application. In addition, the source code folder is mounted in the container and the Makefile target or default build mechanism is run within the container.
  4. The final artifacts are delivered to the .aws-sam/build directory on the local file system. If you are using a Makefile, you can copy the final artifact to the $ARTIFACTS_DIR.
  5. The sam deploy command uploads the template and function code from the .aws-sam/build directory and starts a CloudFormation deployment.
  6. After a successful deployment, you can find the provisioned resources in your AWS account.

To test the behavior, run the previous examples with the --use-container option. No additional changes are needed.

Using your own base build images for creating custom runtimes

When you are targeting a non-supported Lambda runtime, such as a non-LTS Java version or natively compiled GraalVM native images, you can create your own build image with the necessary dependencies installed.

For example, to build a native image with GraalVM, you must have GraalVM and the native image tool installed when building your application.

To create a custom image:

  1. Create a Dockerfile with the needed dependencies:
    #Use the official AWS SAM base image or Amazon Linux 2 as a starting point
    FROM public.ecr.aws/sam/build-java11:latest-x86_64
    
    #Install GraalVM dependencies
    ENV GRAAL_VERSION 22.2.0
    ENV GRAAL_FOLDERNAME graalvm-ce-java11-${GRAAL_VERSION}
    ENV GRAAL_FILENAME graalvm-ce-java11-linux-amd64-${GRAAL_VERSION}.tar.gz
    RUN curl -4 -L https://github.com/graalvm/graalvm-ce-builds/releases/download/vm-${GRAAL_VERSION}/${GRAAL_FILENAME} | tar -xvz
    RUN mv $GRAAL_FOLDERNAME /usr/lib/graalvm
    RUN rm -rf $GRAAL_FOLDERNAME
    
    #Install Native Image dependencies
    RUN /usr/lib/graalvm/bin/gu install native-image
    RUN ln -s /usr/lib/graalvm/bin/native-image /usr/bin/native-image
    RUN ln -s /usr/lib/maven/bin/mvn /usr/bin/mvn
    
    #Set GraalVM as default
    ENV JAVA_HOME /usr/lib/graalvm
    
  2. Build your image locally or upload it to a container registry:
    docker build . -t sam/custom-graal-image
  3. Use AWS SAM build with the build image argument to provide your custom image:
    sam build --use-container --build-image sam/custom-graal-image

You can find the source code and an example Dockerfile, Makefile, and pom.xml for GraalVM native images in the GitHub repo.

When you use the official AWS SAM build images as a base image, you have all the necessary tooling such as Maven, Java11 and the Lambda builders installed. If you want to use a more customized approach and use a different base image, you must install these dependencies.

For an example, check the GraalVM with Java 17 cookiecutter template. In addition, there are further components involved when building custom runtimes, which are outlined in “Build a custom Java runtime for AWS Lambda”.

To avoid providing the command line options on every build, include them in the samconfig.toml file:

[default.build.parameters]
use_container = true
build_image = ["public.ecr.aws/sam/build-java11:latest-x86_64"]

For additional information, refer to the official AWS SAM CLI documentation.

Deploying the application without building with AWS SAM

There might be scenarios where you do not want to rely on the build process offered by AWS SAM, for example when you have highly customized or established build processes, or advanced dependency caching or visibility requirements.

In this case, you can still use the AWS SAM CLI commands such as sam local and sam deploy. But you must point your CodeUri property directly to the pre-built artifact (instead of the source code directory):

HelloWorldFunctionSkipBuild:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: HelloWorldFunction/target/HelloWorld-1.0.jar
    Handler: helloworld.App::handleRequest

In this case, there is no need to use the sam build command, since the build logic is outside of AWS SAM CLI:

Build process

Here, the sam build command fails because it looks for a source folder to build the application. However, there might be cases where you have a mixed setup that includes some functions that point to pre-built artifacts and others that are built by AWS SAM CLI. In this scenario, you can mark those functions to explicitly skip the build process by adding the following SkipBuild flag in the Metadata section of your resource definition:

HelloWorldFunctionSkipBuild:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: HelloWorldFunction/target/HelloWorld-1.0.jar
    Handler: helloworld.App::handleRequest
    Runtime: java11
  Metadata:
    SkipBuild: True

Conclusion

This blog post shows how to build Java applications with the AWS SAM CLI. You learned about the default build mechanisms, how to customize the build behavior, and how to run the build process inside a container environment. Visit the GitHub repository for the example code templates referenced in the examples.

To learn more, dive deep into the AWS SAM documentation. For more serverless learning resources, visit Serverless Land.

Server-side rendering micro-frontends – UI composer and service discovery

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/server-side-rendering-micro-frontends-ui-composer-and-service-discovery/

This post is written by Luca Mezzalira, Principal Specialist Solutions Architect, Serverless.

The previous blog post describes the architecture for creating a server-side rendering micro-frontend in AWS. This and subsequent posts explain the different parts that compose this architecture in detail. The code for the example is available on an AWS Samples GitHub repository.

For context, this post covers the infrastructure related to the UI composer, and why you need an Amazon S3 bucket for storing static assets:

Architecture overview

The rest of the series explores the micro-frontends composition, how to design micro-frontends using serverless services, different caching and performance optimization strategies, and the organization structure implications associated with frontend distributed systems.

A user’s request journey

The best way to navigate through this distributed system is by simulating a user request that touches all the parts implemented in the architecture.

The application example shows a product details page of a hypothetical ecommerce platform:

Building micro-frontends

When a user selects an article from the catalog page, the DNS resolves the URL to an Amazon CloudFront distribution that is the reference CDN for this project.

The request is immediately fulfilled if the page is cached. Therefore, no additional logic is required from the cloud infrastructure, and the response is fast (less than the 500 ms shown in this example).

When the page is not available in the CloudFront points of presence (PoPs), the request is forwarded to the Application Load Balancer (ALB). It arrives at the AWS Fargate cluster where the UI Composer generates the page for fulfilling the request.

Using CloudFront in the architecture

CDNs are known for accelerating application delivery by caching static files in nearby PoPs. CloudFront can also accelerate uncacheable content such as dynamic APIs or personalized content.

With a network of over 450 points of presence, CloudFront terminates user TCP/TLS connections within 20-30 milliseconds on average. Traffic to origin servers is carried over the AWS global network instead of the public internet. This is a purpose-built, highly available, and low-latency private network built on a global, fully redundant, metro fiber backbone that is linked via terrestrial and trans-oceanic cables across the world. In addition to terminating connections close to users, CloudFront accelerates dynamic content thanks to modern internet protocols such as QUIC and TLS 1.3, and persistent TCP connections to the origin servers.

CloudFront also has security benefits, offering protection against infrastructure DDoS attacks. It integrates with AWS Web Application Firewall and AWS Shield Advanced, giving you controls to block application-level DDoS attacks. CloudFront also offers native security controls such as HTTP to HTTPS redirection, CORS management, geo-blocking, tokenization, and managing security response headers.

UI Composer application logic

When the request is not fulfilled by the CloudFront cache, it is routed to the Fargate cluster. Here, multiple tasks compute and serve the page requested.

This example uses Fastify, a fast web framework that is gaining popularity in the Node.js community. When the web server initializes, it loads external parameters and the template for composing a page.

const start = async () => {
  try {
    //load parameters
    MFElist = await init();
    //load catalog template
    catalogTemplate = await loadFromS3(MFElist.template, MFElist.templatesBucket)
    await fastify.listen({ port: PORT, host: '0.0.0.0' })
  } catch (err) {
    fastify.log.error(err)
    process.exit(1)
  }
}

To maintain team independence and avoid redeploying the UI composer for every application change, the HTML templates are loaded from an S3 bucket. All teams responsible for micro-frontends in the same page can position their micro-frontends into the right place of the HTML template and delegate the composition task to the UI composer.

In this demo, the initial parameters and the catalog template are retrieved once. However, in a real scenario, it’s more likely you retrieve the parameters at initialization and then at a regular cadence. The template might be loaded at runtime for every request, or refreshed by a background routine in a similar way to the initialization parameters.
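
The loadFromS3 helper is part of the sample repository. A simplified sketch using the AWS SDK for JavaScript v2, following the parameter order of the call above, could look like this:

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Retrieve an HTML template from S3 and return it as a string
const loadFromS3 = async (key, bucket) => {
  const response = await s3.getObject({ Bucket: bucket, Key: key }).promise();
  return response.Body.toString('utf-8');
};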

When the request reaches the product details route, the web application logic calls a transformTemplate function, passing the catalog template retrieved from the S3 bucket at server initialization. It returns a 200 response if the page is composed without any issues.

fastify.get('/productdetails', async(request, reply) => {
  try{
    const catalogDetailspage = await transformTemplate(catalogTemplate)
    responseStream(catalogDetailspage, 200, reply)
  } catch(err){
    console.log(err)
    throw new Error(err)
  }
})

The page composition is the key responsibility of the UI composer. There are several viable approaches for composing micro-frontends in a server-side rendering system, covered in the next post.

Micro-frontends discovery

To decouple workloads for multiple teams, you must use architectural patterns that support it. In a microservices architecture, a pattern that allows independent evolution of a service without coupling the DNS or IP to any microservice is the service discovery pattern.

In this example, AWS Systems Manager Parameter Store acts as a service registry. Every micro-frontend available in the workload registers itself once the infrastructure is provisioned.

In this way, the UI composer can look up the micro-frontend ID found inside the HTML template and retrieve the correct way to consume the micro-frontend API, for instance via an ARN or a remote HTTP URL.

AWS Systems Manager Parameter Store

Using ARNs instead of HTTP requests inside the workload network can help you reduce latency thanks to fewer network hops. Moreover, security is delegated to IAM policies, providing a robust security implementation.

The UI composer retrieves the micro-frontend endpoints at runtime before loading them into the HTML template. This is a simple yet powerful approach for maintaining the boundaries within your organization and allowing independent teams to evolve their architecture autonomously.
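
As an illustration, a lookup against Parameter Store with the AWS SDK for JavaScript v2 might look like the following sketch. The parameter naming scheme here is an assumption for this example; the sample repository defines the actual keys:

const AWS = require('aws-sdk');
const ssm = new AWS.SSM();

// Resolve a micro-frontend endpoint (an ARN or a remote HTTP URL) from the ID found in the HTML template
const getMicroFrontendEndpoint = async (mfeId) => {
  const result = await ssm.getParameter({ Name: `/mfe/${mfeId}/endpoint` }).promise();
  return result.Parameter.Value;
};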

Micro-frontends discovery evolution

Using Parameter Store as a service discovery system, you can deploy a new micro-frontend by adding a new key-value pair to the registry.

A more sophisticated option could be creating a service that acts as a registry and also shapes the traffic towards different micro-frontends versions using deployment strategies like canary releases or blue/green deployments.

You can start iteratively with a simple key-value store system and evolve the architecture with a more complex approach when the workload requires it, providing a robust way to roll out micro-frontend services in your system.

When this is in place, the release cadence of your micro-frontends is likely to increase. This is because developers often feel safer releasing to production without affecting the entire user base, and they can run tests alongside real traffic.

Performance considerations

This architecture uses Fargate for composing the micro-frontends instead of Lambda functions. This allows the application to take advantage of the incremental rendering offered by browsers, displaying the HTML page progressively before it’s completely returned.

Consider a scenario where a micro-frontend takes longer to render due to a downstream dependency or a faulty version deployed into production. Without the streaming capability, you must wait until all the micro-frontends responses arrive, buffer them in memory, compose the page and then send the final output to the browser.

Instead, by using the streaming API offered by Node.js frameworks, you can send a partial HTML page (for example, the head tag and subsequently the rest of the page), to be rendered by a browser.

Streaming also reduces server overhead, because the servers don’t have to buffer entire pages. By incrementally flushing data to browsers, servers keep memory pressure low, which lets them process more requests and save overhead costs.
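
As an illustration of this approach only, a handler could flush chunks through a Node.js PassThrough stream as soon as they are ready. The streamPage helper and the chunked page structure below are assumptions for this sketch, not the sample repository’s implementation:

const { PassThrough } = require('stream');

// Flush HTML to the browser as each chunk becomes available instead of buffering the whole page
const streamPage = (chunks, statusCode, reply) => {
  const stream = new PassThrough();
  reply.code(statusCode).type('text/html').send(stream);
  for (const chunk of chunks) {
    stream.write(chunk); // for example, the head tag first, then each micro-frontend fragment
  }
  stream.end();
};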

However, if your workload doesn’t require these capabilities, one or more Lambda functions might be suitable for your project as well, reducing the infrastructure management complexity you have to handle.

Conclusion

This post looks at how to use the UI Composer and micro-frontends discoverability. Once this part is developed, it won’t need to change regularly. This represents the foundation for building server-side rendering micro-frontends using HTML-over-the-wire. There might be other approaches to follow for other frameworks such as Next.js due to the architectural implementation of the framework itself.

The next post will cover how the UI composer includes micro-frontends output inside an HTML template.

For more serverless learning resources, visit Serverless Land.

Uploading large objects to Amazon S3 using multipart upload and transfer acceleration

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/uploading-large-objects-to-amazon-s3-using-multipart-upload-and-transfer-acceleration/

This post is written by Tam Baghdassarian, Cloud Application Architect, Rama Krishna Ramaseshu, Sr Cloud App Architect, and Anand Komandooru, Sr Cloud App Architect.

Web and mobile applications must often upload objects to the AWS Cloud. For example, services such as Amazon Photos allow users to upload photos and large video files from their browser or mobile application.

Amazon S3 is ideal for storing large objects, thanks to its 5-TB maximum object size and its support for reducing upload times via multipart upload and transfer acceleration.

Overview

Developers who must upload large files from their web and mobile applications to the cloud can face a number of challenges. Due to underlying TCP throughput limits, a single HTTP connection cannot use the full bandwidth available, resulting in slower upload times. Furthermore, network latency and quality can result in poor or inconsistent user experience.

Using S3 features such as presigned URLs and multipart upload, developers can securely increase throughput and minimize upload retries due to network errors. Additionally, developers can use transfer acceleration to reduce network latency and provide a consistent user experience to their web and mobile app users across the globe.

This post references a sample application consisting of a web frontend and a serverless backend application. It demonstrates the benefits of using S3’s multipart upload and transfer acceleration features.

Architecture overview

Solution overview:

  1. Web or mobile application (frontend) communicates with AWS Cloud (backend) through Amazon API Gateway to initiate and complete a multipart upload.
  2. AWS Lambda functions invoke S3 API calls on behalf of the web or mobile application.
  3. Web or mobile application uploads large objects to S3 using S3 transfer acceleration and presigned URLs.
  4. File uploads are received and acknowledged by the closest edge location to reduce latency.

Using S3 multipart upload to upload large objects

A multipart upload allows an application to upload a large object as a set of smaller parts uploaded in parallel. Upon completion, S3 combines the smaller pieces into the original larger object.

Breaking a large object upload into smaller pieces has a number of advantages. It can improve throughput by uploading a number of parts in parallel. It can also recover from a network error more quickly by only restarting the upload for the failed parts.

Multipart upload consists of:

  1. Initiate the multipart upload and obtain an upload id via the CreateMultipartUpload API call.
  2. Divide the large object into multiple parts, get a presigned URL for each part, and upload the parts of a large object in parallel via the UploadPart API call.
  3. Complete the upload by calling the CompleteMultipartUpload API call.

When used with presigned URLs, multipart upload allows an application to upload the large object using a secure, time-limited method without sharing private bucket credentials.

This Lambda function can initiate a multipart upload on behalf of a web or mobile application:

const multipartUpload = await s3.createMultipartUpload(multipartParams).promise()
return {
    statusCode: 200,
    body: JSON.stringify({
        fileId: multipartUpload.UploadId,
        fileKey: multipartUpload.Key,
      }),
    headers: {
      'Access-Control-Allow-Origin': '*'
    }
};

The UploadId is required for subsequent calls to upload each part and complete the upload.
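
As an illustration, a Lambda function that finalizes the upload could call CompleteMultipartUpload as follows. The variable names mirror the earlier snippets, and the parts array, containing each part’s ETag and PartNumber, is an assumption about what the client sends back after uploading all parts:

const completeParams = {
    Bucket: bucket_name,
    Key: fileKey,
    UploadId: fileId,
    MultipartUpload: {
        // each entry pairs a part number with the ETag returned by its UploadPart call
        Parts: parts.map((part) => ({ ETag: part.ETag, PartNumber: part.PartNumber })),
    },
};
const result = await s3.completeMultipartUpload(completeParams).promise();
return {
    statusCode: 200,
    body: JSON.stringify({ location: result.Location }),
    headers: {
      'Access-Control-Allow-Origin': '*'
    }
};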

Uploading objects securely using S3 presigned URLs

A web or mobile application requires write permission to upload objects to an S3 bucket. This is usually accomplished by granting access to the bucket and storing credentials within the application.

You can use presigned URLs to access S3 buckets securely without the need to share or store credentials in the calling application. In addition, presigned URLs are time-limited (the default is 15 minutes) to apply security best practices.

A web application calls an API resource that uses the S3 API calls to generate a time-limited presigned URL. The web application then uses the URL to upload an object to S3 within the allotted time, without having explicit write access to the S3 bucket. Once the presigned URL expires, it can no longer be used.

When combined with multipart upload, a presigned URL can be generated for each of the upload parts, allowing the web or mobile application to upload large objects.

This example demonstrates generating a set of presigned URLs for index number of parts:

    const multipartParams = {
        Bucket: bucket_name,
        Key: fileKey,
        UploadId: fileId,
    }
    const promises = []

    for (let index = 0; index < parts; index++) {
        promises.push(
            s3.getSignedUrlPromise("uploadPart", {
            ...multipartParams,
            PartNumber: index + 1,
            Expires: parseInt(url_expiration)
            }),
        )
    }
    const signedUrls = await Promise.all(promises)

Prior to calling getSignedUrlPromise, the client must obtain an UploadId via CreateMultipartUpload. Read Generating a presigned URL to share an object for more information.

Reducing latency by using transfer acceleration

By using S3 transfer acceleration, the application can take advantage of the globally distributed edge locations in Amazon CloudFront. When combined with multipart uploads, each part can be uploaded automatically to the edge location closest to the user, reducing the upload time.

Transfer acceleration must be enabled on the S3 bucket. The bucket can then be accessed using the endpoint bucketname.s3-accelerate.amazonaws.com, or bucketname.s3-accelerate.dualstack.amazonaws.com to connect over IPv6.

Use the speed comparison tool to test the benefits of the transfer acceleration from your location.

You can use transfer acceleration with multipart uploads and presigned URLs to allow a web or mobile application to upload large objects securely and efficiently.

As noted earlier, transfer acceleration must be enabled on the S3 bucket. This example creates an S3 bucket with transfer acceleration enabled using the CDK and TypeScript:

const s3Bucket = new s3.Bucket(this, "document-upload-bucket", {
      bucketName: "BUCKET-NAME",
      encryption: BucketEncryption.S3_MANAGED,
      enforceSSL: true,
      transferAcceleration: true,      
      removalPolicy: cdk.RemovalPolicy.DESTROY
    });

After activating transfer acceleration on the S3 bucket, the backend application can generate transfer acceleration-enabled presigned URLs by initializing the S3 SDK as follows:

s3 = new AWS.S3({useAccelerateEndpoint: true});

The web or mobile application then uses the presigned URLs to upload the file parts.
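
As a sketch only, uploading a single part from the browser with the Fetch API might look like the following; the chunking and retry logic in the sample application is more involved:

// Upload one part to its presigned URL and capture the ETag needed to complete the multipart upload.
// The bucket's CORS configuration must expose the ETag header for the browser to read it.
const uploadPart = async (signedUrl, partNumber, blobChunk) => {
  const response = await fetch(signedUrl, { method: 'PUT', body: blobChunk });
  return { PartNumber: partNumber, ETag: response.headers.get('ETag') };
};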

See S3 transfer acceleration for more information.

Deploying the test solution

To set up and run the tests outlined in this blog, you need:

  • An AWS account.
  • Install and configure the AWS CLI.
  • Install and bootstrap the AWS CDK.
  • Deploy the backend and frontend solution from the following git repository.
  • A sufficiently large test upload file of at least 100 MB.

To deploy the backend:

  1. Clone the repository to your local machine.
  2. From the backendv2 folder, install all dependencies by running:
    npm install
  3. Use CDK to deploy the backend to AWS:
    cdk deploy --context env="randnumber" --context whitelistip="xx.xx.xxx.xxx"

You can use an additional context variable called “urlExpiry” to set a specific expiration time on the S3 presigned URL. The default value is set at 300 seconds. A new S3 bucket with the name “document-upload-bucket-randnumber” is created for storing the uploaded objects, and the whitelistip value allows API Gateway access from this IP address only.
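
For example, to override the default expiration with an illustrative value of 600 seconds, deploy with:

cdk deploy --context env="randnumber" --context whitelistip="xx.xx.xxx.xxx" --context urlExpiry="600"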

Note the API Gateway endpoint URL for later.

To deploy the frontend:

  1. From the frontend folder, install the dependencies:
    npm install
  2. To launch the frontend application from the browser, run:
    npm run start

Testing the application

Testing the application

To test the application:

  1. Launch the user interface from the frontend folder:
    npm run start
  2. Enter the API Gateway address in the API URL textbox. Select the maximum size of each part of the upload (the minimum is 5 MB) and the number of parallel uploads. Use your available bandwidth, TCP window size, and retry time requirements to determine the optimal part size. Web browsers have a limit on the number of concurrent connections to the same server, so specifying a larger number of concurrent connections results in blocking on the web browser side.
  3. Decide if transfer acceleration should be used to further reduce latency.
  4. Choose a test upload file.
  5. Use the Monitor section to observe the total time to upload the test file.

Experiment with different values for part size, number of parallel uploads, use of transfer acceleration and the size of the test file to see the effects on total upload time. You can also use the developer tools for your browser to gain more insights.

Test results

The following tests have the following characteristics:

  • The S3 bucket is located in the US East Region.
  • The client’s average upload speed is 79 megabits per second.
  • The Firefox browser uploaded a file of 485 MB.

Test 1 – Single part upload without transfer acceleration

To create a baseline, the test file is uploaded without transfer acceleration and using only a single part. This simulates a large file upload without the benefits of multipart upload. The baseline result is 72 seconds.

Single part upload without transfer acceleration

Test 2 – Single upload with transfer acceleration

The next test measured upload time using transfer acceleration while still maintaining a single upload part with no multipart upload benefits. The result is 43 seconds (40% faster).

Single upload with transfer acceleration

Test 3 – Multipart upload without transfer acceleration

This test uses multipart upload by splitting the test file into 5-MB parts with a maximum of six parallel uploads. Transfer acceleration is disabled. The result is 45 seconds (38% faster).

Multipart upload without transfer acceleration

Test 4 – Multipart upload with transfer acceleration

For this test, the test file is uploaded by splitting the file into 5-MB parts with a maximum of six parallel uploads. Transfer acceleration is enabled for each upload. The result is 28 seconds (61% faster).

Multipart upload with transfer acceleration

The following chart summarizes the test results.

Multipart upload Transfer acceleration Upload time
No No 72s
Yes No 43s
No Yes 45s
Yes Yes 28s

Conclusion

This blog shows how web and mobile applications can upload large objects to Amazon S3 in a secure and efficient manner by using presigned URLs and multipart upload.

Developers can also use transfer acceleration to reduce latency and speed up object uploads. When combined with multipart upload, you can see upload time reduced by up to 61%.

Use the reference implementation to start incorporating multipart upload and S3 transfer acceleration in your web and mobile applications.

For more serverless learning resources, visit Serverless Land.

Introducing new asynchronous invocation metrics for AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-new-asynchronous-invocation-metrics-for-aws-lambda/

This post is written by Arthi Jaganathan, Principal SA, Serverless and Dhiraj Mahapatro, Principal SA, Serverless.

Today, AWS is announcing three new Amazon CloudWatch metrics for asynchronous AWS Lambda function invocations: AsyncEventsReceived, AsyncEventAge, and AsyncEventsDropped. These metrics provide visibility for asynchronous Lambda function invocations.

Previously, customers found it challenging to monitor the processing of asynchronous invocations. With these new metrics for asynchronous function invocations, you can identify the root cause of processing issues, such as throttling, concurrency limits, function errors, processing latency because of retries, or missing events, and take corrective action.

This blog and the sample application provide examples that highlight the usage of the new metrics.

Overview

Architecture overview

AWS services such as Amazon S3, Amazon SNS, and Amazon EventBridge invoke Lambda functions asynchronously. Lambda uses an internal queue to store events. A separate process reads events from the queue and sends them to the function.

By default, Lambda discards events from its event queue if the retry policy has exceeded the number of configured retries or the event has reached its maximum age. However, once discarded from the event queue, the event goes to the failure destination or dead-letter queue (DLQ), if configured.

This table summarizes the retry behavior. Visit the asynchronous Lambda invocations documentation to learn more:

Cause Retry behavior Override
Function errors (returned from code or runtime, such as timeouts) Retry twice Set retry attempts on the function between 0 and 2
Throttles (429) and system errors (5xx) Retry for a maximum of 6 hours Set the maximum age of the event on the function between 60 seconds and 6 hours
Zero reserved concurrency No retry N/A
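
You can apply these overrides on the function’s asynchronous invocation configuration. In an AWS SAM template, this might look like the following sketch; the values shown are illustrative:

HelloWorldFunction:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: hello_world/
    Handler: app.lambda_handler
    Runtime: python3.9
    EventInvokeConfig:
      MaximumEventAgeInSeconds: 3600
      MaximumRetryAttempts: 1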

What’s new

The AsyncEventsReceived metric is a measure of the number of events enqueued in Lambda’s internal queue. You can track the events sent from the client using custom CloudWatch metrics or extract them from logs using the Embedded Metric Format (EMF). If this metric is lower than the number of events you expect, it shows that the source did not emit events or that events did not arrive at the Lambda service, which can happen because of transient networking issues. Lambda does not emit this metric for retried events.

The AsyncEventAge metric is a measure of the difference between the time that an event is first enqueued in the internal queue and the time the Lambda service invokes the function. With retries, Lambda emits this metric every time it attempts to invoke the function with the event. An increasing value shows retries because of error or throttles. Customers can set alarms on this metric to alert on SLA breaches.
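
For example, an alarm on the maximum AsyncEventAge could be created with the AWS CLI as follows. The function name, the 60-second threshold (expressed in milliseconds), and the SNS topic are illustrative:

aws cloudwatch put-metric-alarm \
  --alarm-name lambda-async-event-age \
  --namespace AWS/Lambda \
  --metric-name AsyncEventAge \
  --dimensions Name=FunctionName,Value=<your function name> \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 1 \
  --threshold 60000 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions <your SNS topic ARN>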

The AsyncEventsDropped metric is a measure of the number of events dropped because of processing failure.

How to use the new async event metrics

This flowchart shows the way that you can combine the new metrics with existing metrics to troubleshoot problems with asynchronous processing:

Flowchart

Example application

You can deploy a sample Lambda function to show how to use the new metrics for troubleshooting.

To test the following scenarios, you must install the AWS SAM CLI and the AWS CLI, and have git available to clone the repository.

To set up the application:

  1. Set your AWS Region:
    export REGION=<your AWS region>
  2. Clone the GitHub repository:
    git clone https://github.com/aws-samples/lambda-async-metrics-sample.git
    cd lambda-async-metrics-sample
  3. Build and deploy the application. Provide lambda-async-metric as the stack name when prompted. Keep everything else as default values:
    sam build
    sam deploy --guided --region $REGION
  4. Save the name of the function in an environment variable:
    FUNCTION_NAME=$(aws cloudformation describe-stacks \
      --region $REGION \
      --stack-name lambda-async-metric \
      --query 'Stacks[0].Outputs[?OutputKey==`HelloWorldFunctionResourceName`].OutputValue' --output text)
    
  5. Invoke the function using AWS CLI:
    aws lambda invoke \
      --region $REGION \
      --function-name $FUNCTION_NAME \
      --invocation-type Event out_file.txt
    
  6. Choose All Metrics under Metrics in the left-hand panel in the CloudWatch console and search for “async”:
    All metrics
  7. Choose Lambda > By Function Name and choose AsyncEventsReceived for the function you created. Under “Graphed metrics”, change the statistic to Sum and “Period” to 1 minute. You see one record. If you don’t see the metric immediately, wait a few seconds and refresh.
    Graphed metrics

Scenarios

These scenarios show how you can use the three new metrics.

Scenario 1: Troubleshooting delays due to function error

Lambda retries processing the asynchronous invocation event a maximum of two times in case of a function error or exception. Lambda drops the event from its internal queue when the retries are exhausted.

To simulate a function error, throw an exception from the Lambda handler:

  1. Edit the function code in hello_world/app.py to raise an exception:
    def lambda_handler(event, context):
      print("Hello from AWS Lambda")
      raise Exception("Lambda function throwing exception")
    
  2. Build and deploy:
    sam build && sam deploy --region $REGION
  3. Invoke the function:
    aws lambda invoke \
      --region $REGION \
      --function-name $FUNCTION_NAME \
      --invocation-type Event out_file.txt
    

It is best practice to alert on function errors using the Errors metric, and to use the new metrics to get better insights into retry behavior, such as the interval between retries. For example, if a function errors because a downstream system is overwhelmed, you can use the AsyncEventAge and ConcurrentExecutions metrics.

If you receive an alert for a function error, you see data points for AsyncEventsDropped (1 for this scenario). Overlaying the Errors and Throttles metrics confirms that a function error caused this.

Overlaying metrics to see errors

There are two retries before the Lambda service drops the event. No throttling confirms the function error. Next, you can confirm that the AsyncEventAge is increasing. Lambda publishes this metric every time it polls from the event queue and sends it to the function. This creates multiple data points for the metric.

You can duplicate the metric to see both statistics on a single graph. Here, the two lines overlap because there is only one data point published in each 1-minute interval.

Duplicating the metrics

The event spent 37ms in the internal queue before the first invoke attempt. Lambda’s first retry happens after 63.5 seconds. The second and final retry happens after 189.6 seconds.

Scenario 2: Troubleshooting delays because of concurrency limits

In case of throttling or system errors, Lambda retries invoking the event up to the configured MaximumEventAgeInSeconds (the maximum is 6 hours). To simulate the throttling error without hitting the account concurrency limit, you can:

  • Set the function reserved concurrency to 1
  • Introduce a 90-second sleep in the function code to simulate a lengthy function execution.

The Lambda service throttles new invocations while the first request is in progress. You invoke the function in quick succession from the command line to simulate throttling and observe the retry behavior:

  1. Set the function reserved concurrency to 1 by updating the AWS SAM template.
    HelloWorldFunction:
      Type: AWS::Serverless::Function
      Properties:
        CodeUri: hello_world/
        Handler: app.lambda_handler
        Runtime: python3.9
        Timeout: 100
        MemorySize: 128
        Architectures:
        - x86_64
        ReservedConcurrentExecutions: 1
    
  2. Edit the function code in hello_world/app.py to introduce a 90-second sleep. Replace existing code with the following in app.py:
    import time
    
    def lambda_handler(event, context):
      time.sleep(90)
      print("Hello from AWS Lambda")
    
  3. Build and deploy:
    sam build && sam deploy --region $REGION
  4. Invoke the function twice in succession from the command line:
    for i in {1..2}; do aws lambda invoke \
      --region $REGION \
      --function-name $FUNCTION_NAME \
      --invocation-type Event out_file.txt; done

In a real-world use case, missing the processing SLA should trigger the troubleshooting workflow. Start with AsyncEventsReceived to confirm the events enqueued by the Lambda service (2 in this scenario). Then look for dropped events using the AsyncEventsDropped metric.

Dropped events using AsyncEventsDropped metric

AsyncEventAge verifies delays in processing. As mentioned in the previous section, there can be multiple data points for this metric in a one-minute interval. You can duplicate the metric to compare minimum and maximum values.

Duplicate the metric to compare minimum and maximum values

There are 2 data points in the first minute. The event age increases to 31 seconds during this period.

Event age increases to 31 seconds

There is only one data point for the metric in the remaining one-minute intervals, so the lines overlap. The event age increases to 59,153ms (~59 seconds) in one interval and then to 130,315ms (~130 seconds) in the next one-minute interval. Since the function sleeps for 90 seconds, this explains why the final retry happens around 2 minutes after the function first received the event.

Checking function throttling, this screenshot confirms throttling six times in the first minute (07:12 UTC timestamp) and once in the subsequent minute (07:13 UTC timestamp).

Graph showing throttling

This is because of the back-off behavior of Lambda’s internal queue. The data for AsyncEventAge shows that there is only one throttle in the second interval. Lambda delivers the event during the next one-minute interval after spending around 2 minutes in the internal queue.

Overlaying the ConcurrentExecutions and AsyncEventsReceived metrics provides more information. You see receipt of two events, but concurrency stayed at 1. This results in one event being throttled:

Overlaying the ConcurrentExecutions and AsyncEventsReceived

There are multiple ways to resolve throttling errors. You can optimize the function to run faster, or increase the function or account concurrency limits.

The sample application covers other scenarios such as troubleshooting dropped events on event expiration, and troubleshooting dropped events when a function’s reserved concurrency is set to zero.

Cleaning up

Use the following command and follow the prompts to clean up the resources:

sam delete lambda-async-metric --region $REGION

Conclusion

Using these new CloudWatch metrics, you can gain visibility into the processing of Lambda asynchronous invocations. This blog explained the new metrics AsyncEventsReceived, AsyncEventAge, and AsyncEventsDropped and how to use them to troubleshoot issues. With these new metrics, you can track the asynchronous invocation requests sent to Lambda functions, monitor any delays in processing, and take corrective action if required.

The Lambda service sends these new metrics to CloudWatch at no cost to you. However, charges apply for CloudWatch Metric Streams and CloudWatch Alarms. See CloudWatch pricing for more information.

For more serverless learning resources, visit Serverless Land.

Securing CI/CD pipelines with AWS SAM Pipelines and OIDC

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/securing-ci-cd-pipelines-with-aws-sam-pipelines-and-oidc/

This post is written by Rahman Syed, Sr. Solutions Architect, State & Local Government and Brian Zambrano, Sr. Specialist Solutions Architect, Serverless.

Developers of serverless applications use the AWS Serverless Application Model (AWS SAM) CLI to generate continuous integration and deployment (CI/CD) pipelines. In October 2022, AWS released OpenID Connect (OIDC) support for AWS SAM Pipelines. This improves your security posture by creating integrations that use short-lived credentials from your CI/CD provider.

OIDC is an authentication layer based on open standards that makes it easier for a client and an identity provider to exchange information. CI/CD tools like GitHub, GitLab, and Bitbucket provide support for OIDC, which ensures that you can integrate with AWS for secure deployments.

This blog post shows how to create a GitHub Actions workflow that securely integrates with AWS using GitHub as an identity provider.

Securing CI/CD systems that interact with AWS

AWS SAM Pipelines is a feature of AWS SAM CLI that generates CI/CD pipeline configurations for six CI/CD systems. These include AWS CodePipeline, Jenkins, GitHub Actions, GitLab CI/CD, and Bitbucket. You can get started with these AWS-curated pipeline definitions or create your own to support your organization’s standards.

CI/CD pipelines hosted outside of AWS require credentials to deploy to your AWS environment. One way of integrating is to use an AWS Identity and Access Management (IAM) user, which requires that you store the access key and secret access key within your CI/CD provider. Long-term access keys remain valid unless you revoke them, unlike temporary security credentials that are valid for shorter periods of time.

It is a best practice to use temporary, scoped security credentials generated by AWS Security Token Service (AWS STS) to reduce your risk if credentials are exposed. Temporary tokens are generated dynamically as opposed to being stored. Because they expire after minutes or hours, temporary tokens limit the duration of any potential compromise. A token scoped with least privilege limits permissions to a set of resources and prevents wider access within your environment. AWS SAM Pipelines supports short-term credentials with three OIDC providers: GitHub, GitLab, and Bitbucket.

This post shows how AWS SAM Pipelines can integrate GitHub Actions with your AWS environment using these short-term, scoped credentials powered by the OIDC open standard. It uses a two-stage pipeline, representing a development and production environment.

Architecture overview

This example uses GitHub as the identity provider. When the dev task in the GitHub Actions workflow attempts to assume the dev pipeline execution role in the AWS account, IAM validates that the supplied OIDC token originates from a trusted source. Configuration in IAM allows role assumption from specified GitHub repositories and branches. AWS SAM Pipelines performs the initial heavy lifting of configuring both GitHub Actions and IAM using the principle of least privilege.
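
For illustration, a trust policy for a pipeline execution role that trusts GitHub’s OIDC provider typically follows this pattern. The exact policy generated by AWS SAM Pipelines may differ, and the account ID, organization, repository, and branch below are placeholders:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::111122223333:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:YOUR-ORG/sam-app:ref:refs/heads/main"
        }
      }
    }
  ]
}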

Prerequisites

  1. AWS SAM CLI, version 1.60.0 or higher
  2. GitHub account: You must have the required permissions to configure GitHub projects and create pipelines.
  3. Create a new GitHub repository, using the name “sam-app”.

Creating a new serverless application

To create a new serverless application:

  1. Create a new AWS SAM application locally:
    sam init --name sam-app --runtime python3.9 --app-template hello-world --no-tracing
  2. Initialize a git repository:
    cd sam-app
    git init -b main
    git add .
    git commit -m "Creating a new SAM application"
  3. Push the new repository to GitHub:
    git remote add origin <REMOTE_URL> # e.g. https://github.com/YOURUSER/sam-app.git
    git push -u origin main

GitHub offers multiple authentication mechanisms. Regardless of how you authenticate, ensure you have the “workflow” scope. GitHub Actions only allows changes to your pipeline when you push with credentials that have this scope attached.

Creating application deployment targets

Once the AWS SAM application is hosted in a GitHub repository, you can create CI/CD resources in AWS that support two deployment stages for the serverless application environment. This is a one-time operation.

Step 1: Creating the pipeline for the first stage.

Run the command for the first stage, answering the interactive questions:

sam pipeline bootstrap --stage dev

When prompted to choose a “user permissions provider”, make sure to select OpenID Connect (OIDC). In the next question, select GitHub Actions as the OIDC provider. These selections result in additional prompts for information that later result in a least privilege integration with GitHub Actions.

The following screenshot shows the interaction with AWS SAM CLI (some values may appear differently for you):

Interaction with AWS SAM CLI

Step 2: Create deployment resources for the second stage.

Run the following command and answer the interactive questions:

sam pipeline bootstrap --stage prod

With these commands, AWS SAM CLI bootstraps the AWS resources that the GitHub Actions workflow later uses to deploy the two stages of the serverless application. This includes Amazon S3 buckets for artifacts and logs, and IAM roles for deployments. AWS SAM CLI also creates the IAM identity provider to establish GitHub Actions as a trusted OIDC provider.

The following screenshot shows these resources from within the AWS CloudFormation console. These resources do not represent a serverless application; they are the AWS resources that a GitHub Actions workflow needs to perform deployments. The aws-sam-cli-managed-dev-pipeline-resources stack creates an IAM OIDC identity provider used to establish trust between your AWS account and GitHub.

Stack resources

Generating and deploying a GitHub Actions workflow

The final step in creating a CI/CD pipeline is to connect the GitHub source repository and the two deployment targets in a GitHub Actions workflow.

To generate a pipeline configuration with AWS SAM Pipelines, run the following command and answer interactive questions:

sam pipeline init

The following screenshot shows the interaction with AWS SAM CLI (some values may appear differently for you):

Interaction with AWS SAM CLI

AWS SAM CLI has created a local file named pipeline.yaml which is the GitHub Actions workflow definition. Inspect the pipeline.yaml file to see how the GitHub Actions workflow deploys within your AWS account:

Pipeline.yaml contents

In this example task, GitHub Actions initiates an Action named configure-aws-credentials that uses OIDC as the method for assuming an AWS IAM role for deployment activity. The credentials are valid for 3600 seconds (one hour).
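
The relevant step in pipeline.yaml resembles the following sketch. The role ARN is a placeholder, and the generated file may pin a different version of the action:

- name: Assume the dev pipeline execution role
  uses: aws-actions/configure-aws-credentials@v2
  with:
    aws-region: us-east-1
    role-to-assume: arn:aws:iam::111122223333:role/<dev pipeline execution role>
    role-session-name: dev-deployment
    role-duration-seconds: 3600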

To deploy the GitHub Actions workflow, commit the new file and push to GitHub:

git add .
git commit -m "Creating a CI/CD Pipeline"
git push origin main

Once GitHub receives this commit, the repository creates a new GitHub Actions Workflow, as defined by the new pipeline.yaml configuration file.

Inspecting the GitHub Actions workflow

1. Navigate to the GitHub repository’s Actions view to see the first workflow run in progress.

First workflow run in progress.

2. Choosing the workflow run, you can see details about the deployment.

Details about the deployment

3. Once the deploy-testing step starts, open the CloudFormation console to see the sam-app-dev stack deploying.

Stack deploying

4. The GitHub Actions Pipeline eventually reaches the deploy-prod step, which deploys the production environment of your AWS SAM application. At the end of the Pipeline run, you have two AWS SAM applications in your account deployed by CloudFormation via GitHub Actions. Every change pushed to the GitHub repository now triggers your new multi-stage CI/CD pipeline.

New multi-stage CI/CD pipeline

You have successfully created a CI/CD pipeline for a system located outside of AWS that can deploy to your AWS environment without the use of long-lived credentials.

Cleanup

To clean up your AWS-based resources, run the following AWS SAM CLI commands, answering “y” to all questions:

sam delete --stack-name sam-app-prod
sam delete --stack-name sam-app-dev
sam delete --stack-name aws-sam-cli-managed-dev-pipeline-resources
sam delete --stack-name aws-sam-cli-managed-prod-pipeline-resources

You may also return to GitHub and delete the repository you created.

Conclusion

AWS SAM Pipeline support for OIDC is a new feature of AWS SAM CLI that simplifies the integration of CI/CD pipelines hosted outside of AWS. Using short-term credentials and scoping AWS actions to specific pipeline tasks reduces risk for your organization. This post shows you how to get started with AWS SAM Pipelines to create a GitHub Actions-based CI/CD pipeline with two deployment stages.

The Complete AWS SAM Workshop provides you with hands-on experience for many AWS SAM features, including CI/CD with GitHub Actions.

Watch guided video tutorials to learn how to create deployment pipelines for GitHub Actions, GitLab CI/CD, and Jenkins.

For more learning resources, visit https://serverlessland.com/explore/sam-pipelines.